Final Assignment: Baller Gets 8 Years Extra

My final assignment for the MOOC on Infographics and Data Visualization is finished. Last week I collected all the data; this week was about filling in the details and the design. And it has proven to be hard. Collecting data and presenting it is one thing, but an infographic should have just that little bit extra. Plus it should have the storytelling part. All in all, I struggled.

The general story, as I wrote in last week's blog post, is that Statistics Netherlands posted a new dataset on life expectancy and income classes. A nice subject with a number of good and juicy details. The data is from last year, but the institute has been collecting this kind of data for a long time. This allowed me to do something extra with the data, which is to show how life expectancy has changed over the last centuries.

In the end my infographic has 4 images:

  • Life Expectancy for Men, for the year 2011 in the age range of 0 to 80.
  • Life Expectancy for Women, same as above
  • A detail showing one of the findings based on the analysis of the dataset
  • A historical overview of Life Expectancy in 25 year blocks from 1875 to 2000

I first prepared my data sets in Gnumeric. The data needed some cleaning: not all columns were needed and I wanted to separate the data for male and female. I also struggled for a while with the income classes. The dataset has a subdivision into 5 classes: lower, lower middle, middle, upper middle, and upper class. All other datasets from Statistics Netherlands have ten different classes, based on income, but they are not linearly divided. It was quite some work to get them to match. Finally I used a histogram function based on standard deviation and a normal distribution. The book “Making History Count” by Feinstein and Thomas, which I once used in my teaching, has been a great help in this.
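To give an idea of what such a standard-deviation-based grouping can look like, here is a minimal Python sketch. It is not the exact Gnumeric procedure I used. It assumes class breaks at the mean ± 0.5 and ± 1.5 standard deviations, and the ten income values are made up purely for illustration; they are not figures from Statistics Netherlands.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# The five class names used in the life expectancy dataset.
LABELS = ["lower", "lower middle", "middle", "upper middle", "upper"]


def sd_class_edges(values):
    """Return four class breaks at mean -1.5, -0.5, +0.5 and +1.5 standard deviations.

    The choice of these multipliers is an assumption for this sketch.
    """
    m = mean(values)
    s = pstdev(values)
    return [m - 1.5 * s, m - 0.5 * s, m + 0.5 * s, m + 1.5 * s]


def classify(value, edges):
    """Map a value to one of the five labels using the class breaks."""
    return LABELS[bisect_right(edges, value)]


# Hypothetical average incomes for ten income groups (illustration only).
decile_incomes = [10, 14, 17, 20, 23, 26, 30, 35, 42, 60]

edges = sd_class_edges(decile_incomes)
classes = [classify(v, edges) for v in decile_incomes]

for income, label in zip(decile_incomes, classes):
    print(income, label)
```

With skewed values like these, the lowest class can stay empty, which is exactly the kind of mismatch that made the matching so much work.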

After importing the data into Tableau Public I started thinking about the presentation. I wanted to show the whole range, from the life expectancy of a newborn child to that of an 80-year-old. And I wanted to show the five different classes. After first making a line diagram I ended up changing my data points into rings, hoping that the overlapping data for the classes would be more visible that way. I have partly succeeded in this.

Male

What is not immediately visible is that for a newborn child it is good to be born into an upper class family, but after the age of 55 the middle class and upper middle class do better. Therefore I have included a detail showing this conclusion.

Detail_1

Finally I created 6 time series showing similar data ranges, this time not for the different income groups but for male and female. I chose 6 different moments in time, each 25 years apart, from 1875 until 2000.

Over_the_Ages
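As a quick check on the historical series, the six moments in time can be generated in a few lines of Python. This is just a small sketch; the function name `snapshot_years` is my own invention.

```python
def snapshot_years(start=1875, end=2000, step=25):
    """Return the years from start to end (inclusive), spaced step years apart."""
    return list(range(start, end + 1, step))


years = snapshot_years()
print(years)       # [1875, 1900, 1925, 1950, 1975, 2000]
print(len(years))  # 6
```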

I exported all the graphs to PDF and then the big play with the presentation started. For this I used Inkscape, which is comparable to Illustrator but open source. One of the tricks here is to play with the layers, making subdivisions between the different groups such as graphs, images, text, and background. Then set guides to work with and prepare blocks. Since my first version was still dull, I decided to make a second version with some images in the historical series.

And the result? I am happy with it, but there is certainly room for much improvement. My goal was to learn more about infographics in general. I succeeded in that; at least I can now see what needs to be done.

FinalAssignment_EK2012

http://www.elwink.nl/infographics/FinalAssignment.pdf

For all those reading this blog, and wondering… Alberto Cairo and the Knight Center for Journalism in the Americas start a new course in January 2013!

Thinking about data visualization

Earlier this week we received the final assignment for the MOOC on infographics and data visualization. Alberto does not spare his students, writing: “This time, I am giving you the freedom to do whatever you want.” My first reaction was a slight jubilation: everything is possible. Then we got 7 steps, starting with making a headline and gathering the data and ending with getting the results back. While commuting to work I saw a small headline in one of the free newspapers: “Well-off people live longer in good health”.

Statistics Netherlands (in Dutch: CBS) collects and processes data “in order to publish statistics to be used in practice”. Their website has a nice series of interactive infographics, and years ago they were already one of the first to introduce a web mapping interface on their StatLine website. I have used their data for many of my GIS classes. So a very useful source of wonderful data. But let me return to the assignment and the first step: getting a headline.

life expectancy based on data from cbs.nl

A Simple Headline … but Tease the Information

This week someone in my Twitter lists posted a tweet about a webinar: How to Write Headlines for the Web. After watching the webinar I understood that my headline could make or break my great story. A good story without a catchy headline will be read less on the web. The headline should contain big numbers and they should be easily digestible. Wow… as a non-journalist this is quite a challenge. And according to Alberto Cairo I need to “Try to find a focus, a headline.” The webinar uses one of my favorite techniques: free association, keeping the main question in the back of my mind: what is the story about? If not just a blank sheet of paper, I often use a mind-mapping tool (in my case the fabulous FreeMind) for this process.

The general scope given by Statistics Netherlands is: “Men and women from high-income households on average live about 8 and 7 years longer respectively than their counterparts in low-income households.” Far too long for a headline. In Dutch we have a proverb “Riches alone make no man happy” (or “Money isn’t everything”). This is what my associations led to, leaving me with a number of keywords: riches, happiness, long life, income, and 7 and 8 years. What about “How to earn an 8-year longer life?” Is it simple enough? There is another great tool that I love to browse: the Urban Dictionary. “Wealthy” has many hits: moneybags, ballers. So… “Baller gets 8 years extra”?

Gather the data … Combinations and Context

The next step is to think about the story I want to tell. In this case I will focus on the Dutch data first. My mind wanders on: it would be nice to get data for another country. There must also be some historical data on this subject. The Dutch Economic-Historical Archive (NEHA) has this kind of data, and Statistics Netherlands also has data going back to 1899 in its historical series. While we were talking about the subject over dinner, my son came up with a fact from his history class: the average life expectancy of a worker in Manchester during the industrial revolution was very low (an average of 17).

Another idea may be to link the data to lifestyle. There is lots of data on that subject too. On the other hand it is more difficult, and the context may be a lot harder to give. Statistics Netherlands also mentions good health and good mental health. These may be subjects to include, but not as a main subject for the assignment for now.

So the plan…

What is the story I want to tell? There is enough historical data available. I want to tell the story of the rich, the poor, and the middle class at different moments in time: the turn of the 20th century, the 1930s crisis, the post-war period, the late 1970s when many patterns changed, and now the 21st century. This approach will tell many stories. It will tell about prosperity, the working class, history, and the many, many social elements that make a culture.

My story will be about culture and people, based on historical statistics. Now the next step is to think about the form.