This week's assignment: data manipulation

This week's assignment in the MOOC on infographics and data visualization by Alberto Cairo is about maps. From his forthcoming book we have to read the fourth chapter on Cartography for Journalists, or as the chapter title reads: Thematic Maps, Statistics and Cartography Meet. Like his earlier book, The Functional Art, this chapter is a well-written piece with many great visuals.1 Alberto Cairo describes thematic maps as “the purest and most successful form of information graphics”, and I certainly do not disagree. And the assignment? It is to use data from the US Bureau of Labor Statistics to show unemployment in the US, much as The Guardian’s Data Blog did in its story about unemployment in the US: http://www.guardian.co.uk/news/datablog/interactive/2011/sep/08/us-unemployment-obama-jobs-speech-state-map. But with more functionality and depth.

With Tableau, TileMill, or even ESRI ArcGIS Online this is not a big task. The data is easily accessible and well organized. But this is exactly where standard mapping differs from making infographics. I am quite sure that our teacher does not want the standard map. In the first assignment I created a map and Alberto commented: “However, it doesn’t improve the original as much as it could. The reason is that you are forcing me to click on each country to get the data, rather than giving me the opportunity to explore the data in different ways, such as creating rankings, comparisons between countries, etc.”. So no standard map this time, but something that focuses on exploration.

I decided to focus on two of the questions that are raised in the assignment:

  • What kind of graphs or maps would you need to tell a compelling story based on this?
  • How would you give context to the data?

If we are talking about unemployment during Obama's first term, some nice infographics on this subject already appeared in the run-up to the elections.

My approach: Geo Tagging

The latest data available is from September 2012, and looking at that data you immediately see the big differences between states. Take Montana, Wyoming, North and South Dakota, and Nebraska. From my time there I know the views of large (or even more than large) plains, the emptiness. The number of people per square kilometer must be very low. On the other hand there is the giant peak in California, and there are the higher (but not peak) values for states along the East Coast, for big cities like Chicago and Detroit, and for states like Texas and Florida. My first impression of the dataset is that the figures are not normalized for population.

In order to map the data, and to show the relation between population and unemployment, the data must be geo-tagged. Since the data is ordered by state and the state codes are given, this is not a difficult task. MaxMind offers a nice table of states with their latitude and longitude. By combining this table with the given data set I now at least have point data that is georeferenced, so each state can be mapped as a point at its centroid.
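The join described above could be sketched roughly as follows. This is only a minimal illustration, not the workflow actually used for the assignment: the rows, the counts, and the coordinates are hypothetical placeholders (not BLS or MaxMind values), and the function name `geotag` is invented for this example.

```python
# Hypothetical sample rows; a real file would have one row per state.
unemployment = [
    {"state": "CA", "unemployed": 1000},
    {"state": "RI", "unemployed": 50},
    {"state": "XX", "unemployed": 5},  # a code with no matching coordinates
]

# Rough, illustrative centre points keyed by state code (not MaxMind's values).
centroids = {
    "CA": (37.0, -120.0),
    "RI": (41.7, -71.5),
}


def geotag(rows, coords):
    """Attach latitude/longitude to each row by looking up its state code.

    Returns the tagged rows plus a list of state codes that had no match,
    so unmatched rows are reported instead of silently dropped.
    """
    tagged, missing = [], []
    for row in rows:
        point = coords.get(row["state"])
        if point is None:
            missing.append(row["state"])
            continue
        lat, lon = point
        tagged.append({**row, "lat": lat, "lon": lon})
    return tagged, missing


tagged, missing = geotag(unemployment, centroids)
```

Keeping a separate list of unmatched codes makes it easy to spot rows (territories, typos in codes) that would otherwise vanish from the map.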

Then I started to play with Tableau Public. The map option of the software comes with several settings and preloaded datasets: population, population by race, occupations.

What strikes me is that population density and racial composition both seem to affect the figures. States with many big cities seem to have a higher unemployment rate. So a first step would be to map the data against the population density.

The Census Bureau Data

The United States Census Bureau has a nice dataset from the census of April 2010. The census data covers the full population, including people who are too young to work and those who are retired, yet the picture already changes with the first quick layout. Rhode Island, at first a small dot, suddenly becomes one of the largest. California, on the other hand, which seemed to be the state with an incredibly high unemployment rate, has now become an average player. All in all, the differences become smaller when we look at the percentage of the total population that is unemployed.
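The normalization behind this shift is a simple division of the unemployed count by the total population. The sketch below uses made-up numbers and invented state labels purely to show how a ranking can flip once counts become shares; it does not reproduce the actual BLS or Census figures.

```python
def share_of_population(unemployed, population):
    """Return the unemployed count as a percentage of total population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * unemployed / population


# Hypothetical figures: the large state has far more unemployed people
# in absolute terms, but a smaller share of its population.
states = {
    "Large state": {"unemployed": 2_000_000, "population": 40_000_000},
    "Small state": {"unemployed": 60_000, "population": 1_000_000},
}

shares = {name: share_of_population(**row) for name, row in states.items()}

# Rank states from highest to lowest share.
ranking = sorted(shares, key=shares.get, reverse=True)
```

With these placeholder values the small state ends up on top of the ranking, which is the same kind of reversal described for Rhode Island and California.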

Conclusion and the final result

So “to tell a compelling story” I made an interactive infographic where the map is a main element on the page. From this map you can click on a state and see the unemployment figures for 10 points in time: at the start of Obama's first term, and then each year, until just before his re-election. The context is that data is open to multiple interpretations, even though everyone knows the facts. By leaving out specific details, data manipulation becomes a term with a double meaning.

P.S. Did I tell you that registration for the second course, starting in January, is open? You can sign up here: Knight Center for Journalism in the Americas’s Distance Learning program.

1 In the first version this paragraph read: “From his book we have to read the fourth chapter on Cartography for Journalists, or as the chapter title reads: Thematic Maps, Statistics and Cartography Meet. The book is well written with many great visuals and the same is true in this chapter.”

Location everywhere

For some time I have wanted to write about indoor positioning, since it is going to be a future direction for a number of fields, including GIS. I had collected articles and done my research. Then yesterday, by following another post (on big data, food, and visualization), I ended up at a (Dutch) post from Numrush: “Indoor navigation system Wifarer announces first customer” [My translations, EK]. Author Johan Voets states in his post exactly what I wanted to tell you in my blog: “Indoor navigation. It sounds a bit unnatural, but it is definitely a fast growing market.”

In my earlier blogs on the fast-developing GIS market I already indicated that mobile devices, such as smartphones, offer great possibilities. A post from envisioningtech mentions location awareness in the context of new sensors. The article uses the term “planned spontaneity” for the situation where your system, based on earlier experiences, makes decisions in a certain context. And yes, this context includes location too.

The four elements

In another recent study Latitude mentions the 4 I’s, four elements that, in their words, “will continue to play a significant role in our experiences with narrative-based media”: immersion, interactivity, integration, and impact. To cite their report further: “Immersion and interactivity primarily help an audience to go deeper into a story, while integration and impact are about bringing a story out of the screen, into our actual lives.”

Location-based services can play a major role in experiencing the 4 I’s. What if we can offer an extra experience based on the current location? From my background as an art historian and travel guide I think I can say something about this storytelling effect too. People want to walk around a town or a museum, and as a guide you need to point out the particularities of a certain object or view. Applications that do so already exist for “open air” situations. Many museums also offer an online guided tour. Over the last years I have seen many of these wonderful initiatives.

But in museums we still see people typing coded numbers into devices in order to receive the stories and the context. The Indoor Geo Database will include many Points Of Interest (POIs), and our smart algorithm will select the right combination of these POIs for the current context. There are many stories to tell, and based on your interests I can show you the same museum in a number of different ways.
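To make the idea of selecting POIs for a context a little more concrete, here is a very small sketch. It assumes each POI has an indoor x/y position in metres and a set of topic tags, and it simply keeps nearby POIs that match the visitor's interests. The data, the field names, and the function `select_pois` are all hypothetical; a real selection algorithm would weigh far more context than this.

```python
import math

# Hypothetical points of interest with indoor coordinates (metres) and tags.
POIS = [
    {"name": "Portrait A", "x": 2.0, "y": 3.0, "tags": {"painting", "portrait"}},
    {"name": "Sculpture B", "x": 10.0, "y": 1.0, "tags": {"sculpture"}},
    {"name": "Landscape C", "x": 4.0, "y": 3.0, "tags": {"painting", "landscape"}},
]


def select_pois(pois, position, interests, radius):
    """Return names of POIs within `radius` metres of `position` that share
    at least one tag with `interests`, ordered from nearest to farthest."""
    px, py = position
    hits = []
    for poi in pois:
        distance = math.hypot(poi["x"] - px, poi["y"] - py)
        if distance <= radius and poi["tags"] & interests:
            hits.append((distance, poi["name"]))
    return [name for _, name in sorted(hits)]


# A visitor interested in paintings, standing near the first portrait.
tour = select_pois(POIS, position=(2.5, 3.0), interests={"painting"}, radius=5.0)
```

Calling the same function with a different set of interests gives a different tour through the same rooms, which is the point of showing one museum in several ways.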

“will people even indoors use the smartphone to navigate?”

Voets ends his post with “The question is: will people even indoors use the smart phone to navigate?”. My answer is clear: yes, they will. And indoor positioning is not only for museums or other cultural contexts. What if I could walk through a department store while my smartphone shows me the latest gadgets and offers, based on my recent online searches? Or maybe the system could combine earlier experiences and show me, in real life, something I was looking for a month ago.

Does this sound scary, or do you see the new possibilities? As I said before, I see new and serious applications of this technology in many different fields.