Last week the MOOC on Infographics and Data Visualization at the Knight Center for Journalism in the Americas started, and I am one of the 2000 lucky students. About 8 years ago my former employer dropped a book on my desk. It was mouth-watering, and I finished it in one go. The book dealt with data processing and information visualization in a way that I, as a computer scientist and art historian, could understand very well. That book, by Edward Tufte, has been a source of inspiration for many lectures and thoughts in my work on, in my case, the presentation of geographical data.
One of the main reasons for me to take the course is that I see the importance of data visualization. I am neither a journalist nor a professional designer, yet I want to visualize my data analysis: for example, the data about assets in a geo-information system, presented the way Stephen Few describes it, as meaningfully decoded data in which the nature of the data as well as the relationships between the different objects are clear. I want to be able to present this data in a way that is understandable for people outside geo-informatics.
Data information knowledge model
Last week, in his first lecture on information visualization, Alberto Cairo introduced us to the concepts of the data information knowledge model and to ways of analyzing the ongoing stream of infographics being produced. One of his references is a chapter on data visualization by Stephen Few. Few writes: “The goal is to translate abstract information into visual representations that can be easily, efficiently, accurately, and meaningfully decoded.” In information processing for Geographical Information Systems we often have the same goals.
One of the future directions that Few mentions in his text is: “The integration of geo-spatial and network displays (such as node and link diagrams) with other forms of display for seamless interaction and simultaneous use.” And that is exactly why, years ago, that book landed on my desk. I believe that the integration mentioned above is a very obvious one, but we should be careful. Geo data looks very “sexy”, and we see that many designers of infographics tend to use maps as a background or, when a location is given, to map the data to that location. That brings me to one of Stephen Few's questions: “Is it obvious how people should use the information?”
In last week's discussion of a map provided by Alberto Cairo, the instructor of the course, the use of the map as a background played an important role in many people’s responses. Information on Internet use for several countries had to be presented. Some of the responses argued that everyone knows where specific countries are on the map, so why try to map a chart to a location? At the level of countries or continents I can understand that argument, but in many cases we work with data on a smaller scale. When it comes to statistics on your assets, the map is an excellent carrier of information.
Probably more than 90% of the data in a geo-database has nothing to do with the map in a direct way. It does not contain X, Y, or Z coordinates itself, but it is linked to other data tables that do have the location connection. For example, a postal code can link customer data to a specific area. In this way we can enrich the data. Over the last few years a number of companies have shown us wonderful examples of how to do this. The result is that much of the data available in information systems can now be linked, in one way or another, to a specific location.
If you have a shopping card from the local supermarket, data is collected on the products you buy. Different queries can be run on this raw data, for example on price ranges or on the type of products. None of this data has a location component; it is products, prices, and quantities. And we can, based on this data, run wonderful statistics. We can add extra value to this data set when we combine these statistics with the postal code of the consumer. Suddenly we start to see patterns, for example when it comes to age categories in a certain area of town.
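To make the supermarket example a bit more concrete, here is a minimal sketch in Python of how such a link could work. Everything in it is made up for illustration: the purchase records, the customer table, the postal codes, the area names, and the function name `spending_per_area` are all my own assumptions, not data or code from a real system.

```python
from collections import defaultdict

# Hypothetical purchase records from a shopping card.
# Note that they carry no location: only customer, product, price, quantity.
purchases = [
    {"customer_id": 1, "product": "milk", "price": 1.5, "quantity": 2},
    {"customer_id": 2, "product": "bread", "price": 2.0, "quantity": 1},
    {"customer_id": 3, "product": "coffee", "price": 4.0, "quantity": 3},
    {"customer_id": 4, "product": "apples", "price": 0.5, "quantity": 4},
]

# Hypothetical customer table: the postal code is the link to a location.
customers = {1: "3511", 2: "3511", 3: "3581"}

# Hypothetical lookup from postal code to an area of town.
postal_areas = {"3511": "City centre", "3581": "East"}


def spending_per_area(purchases, customers, postal_areas):
    """Sum the amount spent per area by joining purchases to areas via postal code."""
    totals = defaultdict(float)
    for purchase in purchases:
        postal_code = customers.get(purchase["customer_id"])
        # Customers or postal codes we cannot place end up under "unknown".
        area = postal_areas.get(postal_code, "unknown")
        totals[area] += purchase["price"] * purchase["quantity"]
    return dict(totals)


if __name__ == "__main__":
    for area, total in sorted(spending_per_area(purchases, customers, postal_areas).items()):
        print(area, total)
```

The purchases themselves never change. Only the lookup through the postal code adds the location, and only then can the totals be drawn on a map of the town.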
The next step, and here I refer back to the future direction, is to turn the information that we get from the different queries into knowledge. Visualizations based on the above example may have added value. But this added value can only be achieved when the data is easily and efficiently available, and easy to read and interpret.
This is exactly where we can learn from the designers who work in the newspaper offices. Besides the fun, this is a reason for me to take the course on infographics and data visualization.