My final assignment for the MOOC on Infographics and Data Visualization is finished. Last week I collected all the data; this week was about filling in the details and the design. And it proved to be hard. Collecting data and presenting it is one thing, but an infographic should have that little bit extra, and it should tell a story as well. All in all, I struggled.
The general story, as I wrote in last week's blog post, is that Statistics Netherlands published a new dataset on life expectancy and income classes. A nice subject with a number of good and juicy details. The data is from last year, but the institute has been collecting this kind of data for a long time. This allowed me to do something extra with the data: show how life expectancy has changed since the late 19th century.
In the end my infographic has four images:
- Life expectancy for men in 2011, for ages 0 to 80
- Life expectancy for women, same as above
- A detail showing one of the findings based on the analysis of the dataset
- A historical overview of Life Expectancy in 25 year blocks from 1875 to 2000
I first prepared my datasets in Gnumeric. The data needed some cleaning: not all columns were needed, and I wanted to separate the data for men and women. I also struggled for a while with the income classes. The dataset is subdivided into five classes: lower, lower middle, middle, upper middle, and upper class. All other datasets from Statistics Netherlands have ten different classes, based on income, but these are not linearly divided. It was quite some work to get them to match. In the end I used a histogram function based on the standard deviation and a normal distribution. The book “Making History Count” by Feinstein and Thomas, which I once used in my teaching, has been a great help in this.
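For readers who want to try something similar outside a spreadsheet, here is a minimal Python sketch of one way to cut values into five classes using the mean and standard deviation. It is not my exact Gnumeric procedure: the cut points at 0.5 and 1.5 standard deviations from the mean, the function names `five_class_edges` and `classify`, and the input values are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import mean, pstdev

LABELS = ["lower", "lower middle", "middle", "upper middle", "upper"]


def five_class_edges(values):
    """Return four cut points at mean -1.5, -0.5, +0.5 and +1.5 standard deviations.

    The choice of these multiples is an assumption for this sketch.
    """
    m = mean(values)
    s = pstdev(values)  # population standard deviation
    return [m - 1.5 * s, m - 0.5 * s, m + 0.5 * s, m + 1.5 * s]


def classify(value, edges):
    """Map a value to one of the five labels using the cut points."""
    return LABELS[bisect_right(edges, value)]


# Hypothetical scores for ten income groups (arbitrary units, not real data).
scores = list(range(1, 11))
edges = five_class_edges(scores)
for score in scores:
    print(score, classify(score, edges))
```

With roughly bell-shaped data, most groups end up in the three middle classes and only the extremes land in "lower" and "upper", which is the behaviour I was after when matching ten classes to five.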
After importing the data into Tableau Public I started thinking about the presentation. I wanted to show the whole range, from the life expectancy of a newborn child to that of an 80-year-old. And I wanted to show the five different classes. I started with a line diagram, but ended up changing my data points into rings, hoping that the overlapping data for the classes would be more visible that way. I have partly succeeded in this.
What is not immediately visible is that for a newborn child it is good to be born into an upper-class family, but after the age of 55 the middle class and upper middle class do better. Therefore I have included a detail showing this conclusion.
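A crossover like this can also be found programmatically instead of by eye. The sketch below is only an illustration: the helper name `first_crossover` is hypothetical, and the numbers are made-up placeholders, not the Statistics Netherlands figures.

```python
def first_crossover(ages, series_a, series_b):
    """Return the first age at which series_b exceeds series_a, or None if it never does."""
    for age, a, b in zip(ages, series_a, series_b):
        if b > a:
            return age
    return None


# Made-up placeholder values for remaining life expectancy (years).
# These are NOT the real figures from the dataset.
ages = [0, 25, 55, 65, 80]
upper_class = [82.0, 57.5, 28.6, 19.8, 8.9]
middle_class = [80.5, 56.4, 28.6, 20.1, 9.2]

print(first_crossover(ages, upper_class, middle_class))
```

Running the same function on the real columns for each pair of classes would show where, if anywhere, one class overtakes another.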
Finally, I created six time series showing similar data ranges, this time not for the different income groups but for men and women. I chose six moments in time, 25 years apart, from 1875 to 2000.
I exported all the graphs to PDF, and then the big job of laying out the presentation started. For this I used Inkscape, which is comparable to Illustrator but open source. One of the tricks here is to work with layers, making separate layers for the different groups such as graphs, images, text, and background. I also set up guides and then prepared the blocks along them. Since my first version was still dull, I decided to make a second version with some images in the historical series.
And the result? I am happy with it, but there is certainly room for much improvement. My goal was to learn more about infographics in general. I succeeded in that: I can now at least see what needs to be done.
For all those reading this blog, and wondering… Alberto Cairo and the Knight Center for Journalism in the Americas start a new course in January 2013!