Final Assignment: Baller Gets 8 Years Extra

My final assignment for the MOOC on Infographics and Data Visualization is finished. Last week I collected all the data; this week was about filling in the details and the design. And it has proven to be hard. Collecting data and presenting it is one thing, but an infographic should have just that little bit extra. Plus it should have the storytelling part. All in all, I struggled.

The general story, as I wrote in my blog last week, is that Statistics Netherlands posted a new dataset on life expectancy and income classes. A nice subject with a number of good and juicy details. The data is from last year, but the institute has been collecting this kind of data for a long time. This allowed me to do something extra with the data, which is to show how life expectancy has changed since the late nineteenth century.

In the end my infographic has four images:

  • Life Expectancy for Men, for the year 2011 in the age range of 0 to 80.
  • Life Expectancy for Women, same as above
  • A detail showing one of the findings based on the analysis of the dataset
  • A historical overview of Life Expectancy in 25 year blocks from 1875 to 2000

I first prepared my datasets in Gnumeric. The data needed some cleaning: not all columns were needed, and I wanted to separate the data for male and female. I also struggled for a while with the income classes. The dataset has a subdivision into five classes: lower, lower middle, middle, upper middle, and upper class. All other datasets from Statistics Netherlands have ten different classes, based on income, but they are not linearly divided. It was quite some work to get them to match. In the end I used a histogram function based on standard deviation and a normal distribution. The book “Making History Count” by Feinstein and Thomas, which I once used in my teaching, was a great help in this.
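For readers who want to try this kind of class matching outside a spreadsheet, here is a minimal sketch in Python. It is not a copy of what I did in Gnumeric. It assumes incomes are roughly normally distributed and puts the class breaks at ±0.5 and ±1.5 standard deviations from the mean, which is one common convention and my assumption here. The function names are made up for this illustration.

```python
from statistics import NormalDist

# Class names as listed in the post; the break points are an assumption.
CLASSES = ["lower", "lower middle", "middle", "upper middle", "upper"]
BREAKS_IN_SD = [-1.5, -0.5, 0.5, 1.5]


def class_boundaries(mean, sd):
    """Return the four income values that separate the five classes."""
    return [mean + k * sd for k in BREAKS_IN_SD]


def expected_shares(mean, sd):
    """Share of a normally distributed population in each class."""
    dist = NormalDist(mu=mean, sigma=sd)
    cuts = [0.0] + [dist.cdf(b) for b in class_boundaries(mean, sd)] + [1.0]
    return [hi - lo for lo, hi in zip(cuts, cuts[1:])]


def classify(income, mean, sd):
    """Name the class an income falls into."""
    # zip stops after the four boundaries, so the last class is the fallback.
    for name, upper in zip(CLASSES, class_boundaries(mean, sd)):
        if income < upper:
            return name
    return CLASSES[-1]
```

With the shares from `expected_shares` in hand, the ten income classes of another dataset can be assigned to the five classes by comparing cumulative percentages, which is roughly what a histogram function does for you.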

After importing the data into Tableau Public I started thinking about the presentation. I wanted to show the whole range, from the life expectancy of a newborn child to that of an 80-year-old. And I wanted to show the five different classes. After first making a line diagram, I ended up changing my data points into rings, hoping that the overlapping data for the classes would be more visible that way. I have partly succeeded in this.


What is not immediately visible is that for a newborn child it is good to be born into an upper-class family, but after the age of 55 the middle class and upper middle class do better. Therefore I have included a detail showing this conclusion.
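A finding like this can also be checked in a few lines of code instead of by eye. The sketch below looks for the first age at which one class overtakes another. The function name is hypothetical, and the numbers are made-up placeholders, not the Statistics Netherlands figures.

```python
def first_crossover(ages, series_a, series_b):
    """Return the first age at which series_b exceeds series_a, or None."""
    for age, a, b in zip(ages, series_a, series_b):
        if b > a:
            return age
    return None


# Made-up placeholder values for illustration only, NOT the real dataset.
ages = [0, 25, 55, 65, 80]
upper = [82.0, 58.0, 29.0, 20.0, 8.0]
middle = [80.0, 57.0, 29.5, 20.5, 8.5]

crossover_age = first_crossover(ages, upper, middle)
```

Running the same function on the real columns for each pair of classes would show where the lines in the chart actually cross.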


Finally I created six time series showing similar data ranges, this time not for the different income groups but for male and female. I chose six moments in time, 25 years apart, from 1875 to 2000.


I exported all the graphs to PDF, and then the big play with the presentation started. For this I used Inkscape, which is comparable to Illustrator but open source. One of the tricks here is to play with the layers, making subdivisions between the different groups such as graphs, images, text, and background. Then it is a matter of putting guides to work and preparing blocks. Since my first version was still dull, I decided to make a second version with some images in the historical series.

And the result? I am happy with it, but there is certainly room for much improvement. My goal was to learn more about infographics in general. I succeeded in that: I can now at least see what needs to be done.


For all those reading this blog, and wondering… Alberto Cairo and the Knight Center for Journalism in the Americas start a new course in January 2013!

Infographics, GIS, and Mapping, the story continues (or part 2)

In my earlier blog post on the use of maps and mapping within infographics I focused mainly on the use of GIS applications that can create maps as one of their outputs. There I mentioned, among others, ESRI ArcGIS and QGIS as examples of proprietary and open source software. The main use of these GIS software packages is (and I realize this may sound odd) not mapping. Although I do not want to get into the almost endless discussion of what GIS is, the presentation possibilities of GIS are only one part of the software. Besides the GIS applications there is a group of mapping software whose main purpose is the visualization of geo data and its geographical features. At this point I also want to stress that the world is not as black-and-white as I present it here. TileMill and Tableau are examples of tools from this second group.

The free tools list

For some time now Sharon Machlis has maintained a very useful table of free tools for data visualization and analysis at the ComputerWorld website. One of the categories in this list is GIS and Mapping. ESRI is not in this list because it is not free, but QGIS is mentioned as open source GIS software. From the start I have found this list a great resource, even though it is not complete and not always up to date. Two free tools that definitely should be mentioned are TileMill and Tableau Public.

TileMill from Mapbox is, as they say themselves, “a desktop application for creating beautiful web maps”. It therefore has very limited to no functionality for manipulating the data. It just takes the data from one of the many geo data formats and presents it as a map. One of the main strengths of TileMill is its style language, CartoCSS, which indeed looks very much like the CSS we know from web design. For the geo readers: compare it to the OpenLayers StyleMap. TileMill is also powerful because it separates content, data, and style. (As with XML, I would nearly write Semantics, Structure, and Style here.)

Another tool that has been mentioned more and more often recently is Tableau Software. The freely available Tableau Public has a simple feature to display data that has a geo location added. Like TileMill it is presentation software, and it actually comes close to the dashboard I mentioned in my earlier blog post. The lack of a style language makes it look a little less flexible, and when it comes to the data (for example the postal codes) it is very much focused on the big countries. The available database connections in the free version are also limited. But for basic maps that present the highlights of a dataset without needing the fine grain? It is excellent.

My favorite setup

What I like about both tools is that they can read geo data that has been prepared with GIS software like ESRI ArcGIS and QGIS: through shapefiles (both), by reading directly from the PostGIS database (QGIS-TileMill), or by linking to the local WMS server (Tableau). That immediately describes my favorite setup for the mapping part of my infographics: doing the querying and preparation of my dataset in the GIS software, creating the right and balanced tables, and then presenting it with TileMill or Tableau.
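Since shapefiles are the hand-over format between the GIS software and the presentation tool, it can be handy to check quickly what is in one before loading it. The sketch below reads the fixed 100-byte header of a .shp file using only the Python standard library. The function name and the file name in the comment are hypothetical, and the sketch only inspects the header; it does not read the geometries.

```python
import struct

SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def read_shp_header(data):
    """Parse the 100-byte main file header of a .shp file given as bytes."""
    if len(data) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    # File code and file length are big-endian; the rest is little-endian.
    (file_code,) = struct.unpack(">i", data[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile (file code is not 9994)")
    (length_words,) = struct.unpack(">i", data[24:28])
    version, shape_type = struct.unpack("<ii", data[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", data[36:68])
    return {
        "bytes": length_words * 2,  # length is stored in 16-bit words
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_type, str(shape_type)),
        "bbox": (xmin, ymin, xmax, ymax),
    }


# Usage, assuming a file called "regions.shp" exists (hypothetical name):
# with open("regions.shp", "rb") as f:
#     print(read_shp_header(f.read(100)))
```

The bounding box is a quick sanity check that the data is in the coordinate system you expect before you start styling it.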

The only hurdle is that Sharon Machlis marks all GIS software in the table mentioned above with skill level 4: “specialized knowledge in a field”.