Geo vs GIS, a long lasting theme

Is there a difference between Geo and GIS? Many people ask me this question, and when they do not ask, I tend to tell them anyway that there is a clear difference. Last week I had such a discussion again, and it kept me thinking. The main reason for this blog post is to put my ideas on virtual paper, so that I can refer to them in the future. Of course I know that this kind of subject can also start a discussion. If that is the case, please feel free to add your comments.

The Definition

Let's start at the beginning, with the definition of GIS, since that is the source of much confusion. I believe that for decades all was clear: we had GIS software that processed spatial data or geo data. But then came online mapping tools and navigation systems. These systems work with geo information, so under the above description they would be GIS. But are they really?

All this depends on our definition of what Geographical Information Systems are. If we look at it from an information system perspective, GIS covers a wide range of techniques. For example, a definition of Information Systems (IS) as found in dictionaries on the web reads:

information system: The entire infrastructure, organization, personnel, and components that collect, process, store, transmit, display, disseminate, and act on information. [from The Free Dictionary]

Other descriptions are similar: they all mention a mix of technical and human resources that in combination are able to process data. In this way we can talk about “the GIS department”, “the GIS software”, and even “the GIS data”. Under this definition our navigation systems, departments and teams, and online mapping tools like Bing and Google Maps are all GIS. As I hinted before, this definition is too wide in my opinion.

In the strictest definition one can say: GIS is the software, the toolbox. Geo is the information that the GIS needs. Geo is the model and GIS works with it. I do realize that this definition leaves the human resource side and the special hardware of our field out of the discussion. And this is exactly the point where the complexity starts.

Let me summarize briefly the different definitions of GIS.

GIS is:

  1. The entire infrastructure, organization, personnel, and components that work with spatial data.
  2. The software and the spatial data.
  3. The software.

Much of the discussion depends on what you choose as a definition.

The Discussion

It should be mentioned that a GIS is nothing without the spatial data that can be processed with it. But is that data part of the GIS? I often ask a question like: “Does the word processor also contain the texts you are about to write when you unpack the box?” These texts are to the word processor what the geo data is to the GIS.

So I make a clear distinction between Geo and GIS. The consequence of this distinction is that for me the only possible definition is the third one. And I realize that this conflicts with the general definition of Information Systems.

Let me even go one step further (especially if we would like to keep the information system definition) and propose to make more use of the term Spatial Information System when it comes to the infrastructure, organization, personnel, and components. In that way we can reserve the term GIS for the software and Geo (or Spatial) for the data. Combinations with other fields like Spatial Intelligence, but also the place of Remote Sensing, may become more usable in this way.

I wonder what others are thinking…

Learning path

This week I concluded the last two courses of this autumn semester. Although I have always promoted “lifelong learning”, I had not really taken courses in recent years besides some hands-on training in my field. After finishing my studies I started to teach, and preparing those courses always kept me up to date. But this autumn I started taking classes again.

MOOCs

This autumn I discovered the Massive Open Online Course. In total I have finished four courses, three at Coursera and one at the Knight Center for Journalism in the Americas. At the start it was a bit strange. Would I still be able to study, do homework on a regular basis, and pass the quizzes? After two weeks this fear was completely gone. Classes consisting of a number of videos of about 12 to 15 minutes each (in total about 1 hour per course each week) work well. The weekly assignments were sometimes multiple choice on the course material, but there were also assignments with a lot of maths, drawing, building, and designing. I even managed to build a prototype of a juicer!

design assignment

In the end I did four courses, and although the subjects were all different I see a very nice learning path. Let me tell you which courses I took:

  • Model Thinking, Coursera, Scott E. Page, University of Michigan
    “Why do models make us better thinkers? Models help us to better organize information – to make sense of that fire hose or hairball of data (choose your metaphor) available on the Internet. Models improve our abilities to make accurate forecasts. They help us make better decisions and adopt more effective strategies. They even can improve our ability to design institutions and procedures.”
  • An Introduction to Operations Management, Coursera, Christian Terwiesch, University of Pennsylvania
    “In short, you will learn how to analyze business processes and how to improve them.”
  • Design: Creation of Artifacts in Society, Coursera, Karl T. Ulrich, University of Pennsylvania
    “The course marries theory and practice, as both are valuable in improving design performance. Lectures and readings will lay out the fundamental concepts that underpin design as a human activity.”
  • Introduction to Infographics and Data Visualization, Knight Center, Alberto Cairo, University of Miami’s School of Communication
    “How to work with graphics to communicate and analyze data.”

Although the courses all look very different, I found a very nice learning path in them. There were moments when ideas from one course came up while I was struggling with another. For example, when I needed to make a presentation on business processes in the Operations Management course, I chose to create an infographic to present the data and the outcomes. The Model Thinking course, on the other hand, helped a lot with organizing data and with further exploring a way of thinking.

In my professional work I can make use of all four courses, although that was not the goal at first. Another thing is that I have used a number of new tools to make and organize my course notes. Among these are Evernote and Tableau Public, tools that I had not used before and that have proven very valuable. For other tools that I have been using for a long time, like Freemind and Inkscape, I have found new ways of applying them.

Plans for 2013

For 2013 I have signed up for new courses: Computing for Data Analysis, Game Theory, and Creative Programming for Digital Media & Mobile Apps. Besides that, I have started to work on my teaching materials to create my on-line course, an introduction to GIS and geospatial data. For this course I have started on Udemy, a platform for on-line courses in a wide range of subjects.

So… On-line courses and lifelong learning, it will continue.

This week’s assignment: data manipulation

This week’s assignment in the MOOC on infographics and data visualization by Alberto Cairo is about maps. From his forthcoming book we have to read the fourth chapter on Cartography for Journalists, or as the chapter title reads: Thematic Maps, Statistics and Cartography Meet. Like his earlier book, The Functional Art, this chapter is a well written piece with many great visuals.1 Alberto Cairo describes thematic maps as “the purest and most successful form of information graphics”, and I certainly do not disagree. And the assignment? That is to use data from the US Bureau of Labor Statistics to show unemployment in the US, in the way The Guardian’s Data Blog did in its story about unemployment in the US: http://www.guardian.co.uk/news/datablog/interactive/2011/sep/08/us-unemployment-obama-jobs-speech-state-map. But with more functionality and depth.

With Tableau, TileMill, or even with ESRI ArcGIS online this task is not that big. The data is easily accessible and well-organized. But this is exactly where standard mapping differs from making infographics. I am quite sure that our teacher does not want the standard map. In the first assignment I created a map and Alberto commented: “However, it doesn’t improve the original as much as it could. The reason is that you are forcing me to click on each country to get the data, rather than giving me the opportunity to explore the data in different ways, such as creating rankings, comparisons between countries, etc.”. So no standard map this time, but something that will focus on the exploration.

I decided to focus on two of the questions that are raised in the assignment:

  • What kind of graphs or maps would you need to tell a compelling story based on this?
  • How would you give context to the data?

If we are talking about unemployment during Obama’s first term, there were already some nice infographics on this subject in the run-up to the elections.

My approach: Geo Tagging

The latest data available is from September 2012, and looking at that data you immediately see the big differences between states. Take Montana, Wyoming, North and South Dakota, and Nebraska. From my times being there I know the views of large (or even more than large) plains, the emptiness. The number of people per square kilometer must be very low. On the other hand there is the giant peak in California, and there are the higher (but not peak) values for states along the east coast, for states with big cities like Chicago and Detroit, and for states like Texas and Florida. My first impression of the dataset is that it is not normalized for population.

In order to map the data and to show the relation between population and unemployment, the data must be geo-tagged. Since the data is ordered by state and the state codes are given, this is not a difficult task. MaxMind offers a nice table of states with their latitude and longitude. By combining this table with the given data set I now at least have point data that is georeferenced, so each state can be mapped as a point at its centroid.
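The geo-tagging step described above is essentially a join on the state code. A minimal sketch in Python, assuming two small CSV extracts: the column names, the unemployment numbers, and the rounded coordinates are all placeholders made up for illustration, not the actual BLS or MaxMind values.

```python
import csv
import io

# Hypothetical unemployment extract keyed by state code.
# The numbers are placeholders, not real BLS figures.
UNEMPLOYMENT_CSV = """state,unemployed
CA,1000
WY,10
XX,5
"""

# Hypothetical centroid table in the spirit of a state latitude/longitude list.
# The coordinates are rough illustrative values only.
CENTROIDS_CSV = """state,latitude,longitude
CA,36.17,-119.75
WY,42.75,-107.30
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def geo_tag(records, centroids):
    """Attach latitude and longitude to each record by matching on state code."""
    lookup = {
        row["state"]: (float(row["latitude"]), float(row["longitude"]))
        for row in centroids
    }
    tagged = []
    for rec in records:
        coords = lookup.get(rec["state"])
        if coords is None:
            continue  # no centroid known for this code, so it cannot be mapped
        tagged.append({
            "state": rec["state"],
            "unemployed": int(rec["unemployed"]),
            "lat": coords[0],
            "lon": coords[1],
        })
    return tagged


points = geo_tag(read_rows(UNEMPLOYMENT_CSV), read_rows(CENTROIDS_CSV))
for point in points:
    print(point)
```

Records without a matching centroid (the made-up code XX here) are simply dropped; in a real data set you would probably want to log them instead, so that no state disappears from the map unnoticed.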

Then I started to play with Tableau Public. Within the map option of the software there are several settings and preloaded datasets: population, population by race, and occupations.

What strikes me is that population density and racial composition both seem to have an effect on the figures. States with many big cities seem to have a higher unemployment rate. So a first step would be to map the data against the population density.

The Census Bureau Data

The United States Census Bureau has a nice dataset for the census of April 2010. Although the census data covers the full population, including people who are too young to work or who are retired, the figures already change with the first quick layout. Rhode Island, which at first was a small dot, now suddenly becomes one of the largest. California, on the other hand, which seemed to be the state with an incredibly high unemployment figure, has now become an average player. All in all we see that the differences become smaller when we look at the percentage of the total population that is unemployed.
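The effect described above, where a state with a huge absolute figure becomes an average player once population is taken into account, can be shown with a few lines of Python. This is only a sketch: the two states and their numbers are invented placeholders, not Census Bureau or BLS data.

```python
def unemployment_share(unemployed, population):
    """Return the unemployed as a percentage of the total population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * unemployed / population


# Hypothetical (unemployed, total population) pairs for two made-up states.
states = {
    "Big State": (1_000_000, 20_000_000),
    "Small State": (60_000, 1_000_000),
}

shares = {name: unemployment_share(u, p) for name, (u, p) in states.items()}

# Ranking by absolute count versus ranking by share of the population.
ranking_by_count = sorted(states, key=lambda name: states[name][0], reverse=True)
ranking_by_share = sorted(shares, key=shares.get, reverse=True)

print(ranking_by_count)
print(ranking_by_share)
```

With these placeholder numbers the large state leads on absolute count, while the small state leads once the count is divided by population, which is the same reversal seen with California and Rhode Island.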

Conclusion and the final result

So “to tell a compelling story” I made an interactive infographic in which the map is a main element on the page. From this map you can click on a state and see the unemployment figures for 10 points in time: at the start of Obama’s first term, and then each year, until just before his re-election. The context is that data can be interpreted in multiple ways, even though everyone knows the facts. By leaving out specific details, data manipulation becomes a term with a double meaning.

PS. Did I tell you that registration for the second course, starting in January, is open? You can subscribe here: Knight Center for Journalism in the Americas’s Distance Learning program.

1 In the first version this paragraph read: “From his book we have to read the fourth chapter on Cartography for Journalists, or as the chapter title reads: Thematic Maps, Statistics and Cartography Meet. The book is well written with many great visuals and the same is true in this chapter.”

Happy GISDay

A week ago (on Thursday) someone tweeted “happy postgisday”, and yes, it was the day after GISDay. This yearly event is, as is stated on the GISDay website: “The annual salute to geospatial technology and its power to transform and better our lives”. Looking at the event map as published on the website I was amazed by two things, firstly the wide spread of events all over the globe, secondly that there was no event planned in The Netherlands, although there exists a good and well organized GIS community. And I did not organize any event either.

My thought was: What could I have organized to bring GIS to a wider audience? The following themes come to my mind:

The application of GIS in areas where you do not expect it.

Not so long ago, about a decade or so, GIS mainly took place in the drawing room. Networks were no longer designed and maintained on large drawing boards with pen and paper. In the GIS era these drawing boards were replaced by digitizer boards and large monitors, and the blueprints were replaced by bits and bytes. With all this development, answering questions about the assets became easier. Examples of questions you can answer in this context are: What is the current state of our network? What type of asset was most prone to failure over the last period? Which customers should be informed about the upcoming repair work?

As said in an earlier blog post on this subject, the main shift came when navigation systems became more and more of a commodity. Nowadays GIS is no longer limited to the drawing room. We see GIS in many different contexts and industries, in places where you would not expect it. Telling this story may be my first presentation.

Your safety monitored with GIS

The second story is about geo data and boundaries. In the European context INSPIRE is becoming more and more mature. INSPIRE is the initiative that should create an infrastructure to make geo information and spatial data more accessible. When you cross a border (and in this case I am not even talking about country borders), it may well be that the data you find on the other side of the border is not directly usable. This can cause problems, for example when a river gets polluted and we want to take steps to prevent the pollution from getting into the drinking water supply chain. It is best to have data that can be easily exchanged between different organizations.

Different local governments store their data in different ways, for example because of the GIS software they use. The main result is that if we want a full overview of the available data, we should first create a common language. Not only should we store the spatial data in a common way, it must also be findable across the different borders. So the labels on the data and the datasets, the metadata, must be standardized too. In the last years we have seen a fast growth of so-called geo-portals; in the future these will be the entrance to the European data. They are a wonderful way to show a larger audience how spatial data, and the systems storing and analyzing this data, work together in monitoring safety.

The past analyzed with GIS

A growing theme in historical studies is the application of GIS to study spatio-temporal processes: mapping differences between two or more time periods, and showing where changes occurred. In the last decades I have published a number of these studies, for example on detecting changes in the urban landscape (how a city developed). But there is so much more that can be done on this subject. In the book “Past Time, Past Place” Anne Knowles collected a number of very good examples of how GIS can be applied in history. This book was published in 2002, and since then there has been a lot of new development. For example, GIS has become more accessible and more of a commodity in the historical sciences.

If we apply GIS to history we also come to the subject of storytelling. With the historical datasets that we have available we can tell a story that may have been hidden before. This story can make the past more interactive, however odd this may sound. We can show the development of a town, starting from a little village on a sand ridge, and how, based on the written deeds we find in the archives, the village grew over time. For example, we can show the map and how more and more streets and houses appear. In addition, we can link the deeds on which we base our findings to the different plots on the map.

Next year… GISDay

Next year on GISDay (Wednesday, November 20, 2013) I would like to show small projects on these three examples, mainly to introduce GIS to a wider audience. In the meantime I will post examples here.

Data, Information, Knowledge … and Wisdom

Last week the MOOC on Infographics and Data Visualization at the Knight Center for Journalism in the Americas started, and I am one of the 2000 lucky students. About 8 years ago my former employer dropped a book on my desk. It was mouth-watering, and I finished it in one go. The book dealt with data processing and information visualization in a way that I, as a computer scientist and art historian, could understand very well. This book by Edward Tufte has been a source of inspiration for many lectures and thoughts while working with, in my case, the presentation of geographical data.

One of the main reasons for me to start the course is that I see the importance of data visualization. I am neither a journalist nor a professional designer, yet I want to visualize my data analysis. Take, for example, the data about assets in a geo information system, presented in the way Stephen Few describes it: meaningfully decoded data in which the nature of the data as well as the relationships between the different objects are clear. I want to be able to present this data in a way that is understandable for people outside geo informatics.

Data information knowledge model

Last week, in his first lecture on information visualization, Alberto Cairo introduced us to the concepts of the data information knowledge model and to how to analyze the ongoing stream of infographics that are produced. One of the references he makes is to a chapter on data visualization by Stephen Few. Few writes: “The goal is to translate abstract information into visual representations that can be easily, efficiently, accurately, and meaningfully decoded.” In the information processing for Geographical Information Systems we often have the same goal.

One of the future directions that Few mentions in his text is: “The integration of geo-spatial and network displays (such as node and link diagrams) with other forms of display for seamless interaction and simultaneous use.” And that is exactly why that book landed on my desk years ago. I believe that the integration mentioned above is a very obvious one, but we should be careful. Geo data looks very “sexy”, and we see that many designers of infographics tend to use maps as a background, or, when a location is given, to map the data to that location. That brings me to one of Stephen Few’s questions: “Is it obvious how people should use the information?”

In last week’s discussion of a map provided by Alberto Cairo, the instructor of the course, the use of the map as a background played an important role in many people’s responses. Information on Internet use for several countries had to be presented. Some of the responses argued that everyone knows where specific countries are on the map, so why try to map a chart to a location? On the level of countries or continents I can understand that argument, but in many cases we work with data at a much more local level. When it comes to statistics on your assets the map is an excellent carrier of information.

Location data

Probably more than 90% of the data in a geo-database has nothing to do with the map in a direct way. It does not contain X, Y, or Z coordinates by itself, but is linked to other data tables that do have the location connection. For example, a postal code can link customer data to a specific area. In this way we can enrich the data. In the last few years a number of companies have shown us wonderful examples of how to do this. The result is that much of the data that is available in information systems can now be linked in one way or another to a specific location.

If you have a shopping card from the local supermarket, data is collected on the products you buy. Different queries can be run on this raw data, for example on price ranges or on the type of products. None of this data has a location component; it is products, prices, and quantities. Based on this data we can already run wonderful statistics. But we add extra value to the data set when we combine these statistics with the postal code of the consumer. Suddenly we start to see patterns, for example in the age categories in a certain area of town.
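To make the supermarket example a bit more concrete, here is a small Python sketch of how purchase data without any location component gets one through the customer's postal code. The customer identifiers, postal codes, products, and amounts are all hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup table: customer id -> postal code.
# This is the only place where location enters the data.
customers = {"c1": "1011", "c2": "1011", "c3": "3511"}

# Hypothetical purchases: (customer id, product, amount spent).
# Note that there are no coordinates or addresses in this table.
purchases = [
    ("c1", "bread", 2.50),
    ("c2", "milk", 1.00),
    ("c3", "bread", 2.00),
    ("c1", "milk", 1.00),
]


def spend_by_postal_code(purchases, customers):
    """Sum the amount spent per postal code by joining on customer id."""
    totals = defaultdict(float)
    for customer_id, _product, amount in purchases:
        postal_code = customers.get(customer_id)
        if postal_code is not None:  # skip purchases of unknown customers
            totals[postal_code] += amount
    return dict(totals)


totals = spend_by_postal_code(purchases, customers)
print(totals)
```

The postal code itself can then be linked to an area on the map, in the same way as the state codes and centroids in a state-level data set.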

The next step, and here I refer back to the future direction, is to change the information that we get from the different queries into knowledge. Visualizations based on the above example may have added value. But this added value can only be achieved when the data is easily and efficiently available, plus easy to read and interpret.

This is exactly where we can learn from the designers that work in the newspaper offices. This is, besides the fun, a reason for me to take the course on infographics and data visualization.

Infographics, GIS, and Mapping

We see a growing use of infographics to present complex information in a quick and efficient way. They are used more and more often, not only in the sciences but also in journalism. A recent announcement of an on-line course by Alberto Cairo, the author of the book “The Functional Art”, already drew over 1700 subscriptions within a few days. The rapid growth of the number of submissions on websites like visual.ly and easel.ly shows that more and more people are finding their way into this field too.

Dashboards with infographics, or more

Maps and mapping are an important group within infographics. But when we look at the many manuals on websites on how to create infographics, there is actually no real list of tools for this type of presentation. I look at infographics from my field of geo-information, where we are often confronted with questions on how to present large amounts of complex data, for example by customers who want a quick and clear insight into the current state of their assets.

What many companies would like to have is Business Intelligence-like dashboards, but with the ability to ask questions in a GIS way, and then to present the results in an infographic manner. Is that too much to ask? It should not be, and in the last years I have seen many very nice examples of presenting location data in a way that comes close.

GIS Software

ESRI, the giant in Geographical Information Systems (GIS), offers us many ways to present location-linked data. With some basic knowledge and the great geo data collection that comes with the software you can quickly produce some basic maps. You can even do the statistics behind the maps within the software. But let me stress that I used the word “basic” twice on purpose in the previous lines! GIS and dealing with geo data is a complex discipline. Besides, the software is rather expensive if you only want to use it every now and then. Apart from ESRI there are alternatives that offer ways to store, analyze, and present your geo data. Examples are MapInfo, Bentley Map, and GeoMedia.

An open source alternative is QGIS, short for Quantum GIS. QGIS is a very good tool when you only create maps every now and then. Compared to ESRI ArcGIS it is more difficult to install, although much has improved recently. A page you should definitely see when it comes to learning the basic GIS operations is “how to perform basic GIS operations in QGIS in the most straightforward way”. It has some neat tricks. When it comes to the data there are also many open source inventories.

And after you have worked (or struggled) with the tools for a while you will probably find out that we are still far from the dashboard that I mentioned above.

Mapmaking

One thing that I want to mention is that map making is a profession too. Mapping is more than just plotting your data on a map that you have found on the Internet, or in a template that comes with your office suite. But do not get me wrong, the first rule is that you pick the base layer that suits the job. Over the years I have been to enough great conferences that primarily focused on how to present geo data to know that you can make big mistakes there too. Unfortunately that is what I see happening in more and more infographics. Although the learning curve is steep, GIS can be of great use here.

Let me conclude by telling you: there is a long way to go, but there is definitely a great link between GIS and infographics. More on both, and especially on how to map your data, is to come.

Continued in part 2

Location everywhere

For some time already I have wanted to write about indoor positioning, since it is going to be a future direction for a number of fields, including GIS. I had collected articles and done my research. And then yesterday, by following another post (on big data, food, and visualization), I ended up at a (Dutch) post from Numrush: “Indoor navigation system Wifarer announces first customer” [My translations, EK]. Author Johan Voets states in his post exactly what I wanted to tell you in my blog: “Indoor navigation. It sounds a bit unnatural, but it is definitely a fast growing market.”

In my earlier blogs on the fast developing GIS market I already indicated that mobile devices, such as smart phones, offer great possibilities. In a post from envisioningtech, location awareness is mentioned in the context of new sensors. The article uses the term “planned spontaneity”, where your system takes decisions based on earlier experiences and on a certain context. And yes, this context does include location too.

The four elements

In another recent study Latitude mentions the 4 I’s, four elements “that will continue to play a significant role in our experiences with narrative-based media”: immersion, interactivity, integration, and impact. To cite their report even further: “Immersion and interactivity primarily help an audience to go deeper into a story, while integration and impact are about bringing a story out of the screen, into our actual lives.”

Location based services can play a major role in experiencing the 4 I’s. What if we can offer an extra experience based on the current location? From my background as an art historian and travel guide I think I can say something about this storytelling effect too. People want to walk through a town or a museum, and as a guide you need to point out the particularities of a certain object or view. Applications that do so already exist for “open air” situations. And many museums also offer you the possibility of an online guided tour. Over the last years I have seen many of these wonderful initiatives.

But in museums we still see people typing coded numbers into devices in order to receive the stories and the context. An Indoor Geo Database will include many Points Of Interest, and a smart algorithm will select the right combination of these POIs for the current context. There are many stories to tell, and based on your interest I can show you the same museum in a number of different ways.

“will people even indoors use the smartphone to navigate?”

Voets ends his post with “The question is: will people even indoors use the smart phone to navigate?”. My answer is clear: yes, they will. And indoor positioning is not only to be used in a museum or cultural context. What if I could walk through a department store where my smart phone shows me the latest gadgets and offers, based on my recent online searches? Or maybe the system could combine earlier experiences and show me, in real life, something that I was looking for a month ago.

Does this sound scary, or do you see the new possibilities? Like I said before, I see new and serious applications of this technology in many different fields.