Tweet #200

On October 30th I posted my 100th tweet, and now, two months later, on December 26th, I have reached my 200th. When I reached tweet number 100 I posted a Wordle diagram with the subjects of my last 100 posts; now at number 200 I will do the same.

What have I done in the last months, and what have been my areas of interest? I believe that Twitter is a good way to find out. So I created the Wordle again, with the 150 most used words.
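For readers who want to build a similar word list themselves, here is a minimal Python sketch. It is only an illustration and not how Wordle works internally: the function name `top_words`, the tiny stop-word list, and the rule of skipping tweets that start with RT or MT are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be longer.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "on", "is",
             "with", "at", "my", "i"}

def top_words(tweets, n=150):
    """Return the n most used words as (word, count) pairs."""
    counts = Counter()
    for tweet in tweets:
        text = tweet.strip()
        # Assumption: retweets (RT) and modified tweets (MT) are left out.
        if re.match(r"^(RT|MT)\b", text):
            continue
        # Drop links so that URL fragments are not counted as words.
        text = re.sub(r"https?://\S+", " ", text)
        # Keep lowercase words of at least two letters.
        words = re.findall(r"[a-z][a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)
```

Calling `top_words(my_tweets)` on a list of tweet texts gives the words and their counts, which can then be pasted into a word cloud tool.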

tweet1-200_noRTMT

It is clear from the diagram that GIS, MOOC, dataviz, and infographics have been the main subjects. Alberto Cairo, the teacher of the MOOC, whom I have retweeted a lot, is also very prominent. A term that is missing is web design, and pizza, which was more prominent last time because of my food blogs, did not make it into the 150 most used words. On the other hand, a term that is special for this time of year and did get listed is “Santa”, thanks to the wonderful NORAD Santa tracker and the Google Maps interface.

All in all, a good overview, and now on my way to tweet #300.

Geo vs GIS, a long-lasting theme

Is there a difference between Geo and GIS? Many people ask me this question, and when they do not ask I tend to tell them anyway that there is a clear difference. Last week I had such a discussion again, and it kept me thinking. The main reason for this blog post is to put my ideas on virtual paper, so that I can refer to them in the future. Of course I know that this kind of subject can also start a discussion. If that is the case, please feel free to add your comments.

The Definition

Let's start at the beginning, with the definition of GIS, since that is the source of much confusion. I believe that for decades all was clear: we had GIS software that processed spatial data or geo data. But then came online mapping tools and navigation systems. These systems work with geo information, so by the above description they would be GIS. But are they really?

All this depends on our definition of what Geographical Information Systems are. If we look at it from an information system perspective, GIS covers a wide range of techniques. For example, when we take a definition of Information Systems (IS) as found in dictionaries on the web, we read:

information system: The entire infrastructure, organization, personnel, and components that collect, process, store, transmit, display, disseminate, and act on information. [from The Free Dictionary]

Other descriptions are similar: they all mention a mix of technical and human resources that in combination are able to process data. In this way we can talk about “the GIS department”, “the GIS software”, and even “the GIS data”. Under this definition our navigation systems, departments and teams, and online mapping tools like Bing and Google Maps are all GIS. As I stated before, this definition is too wide in my opinion.

In the strictest definition one can say: GIS is the software, the toolbox. Geo is the information that the GIS needs. Geo is the model and GIS works with it. I do realize that this definition leaves the human resource side and the special hardware of our field out of the discussion. And this is exactly the point where the complexity starts.

Let me briefly summarize the different definitions of GIS.

GIS is:

  1. The entire infrastructure, organization, personnel, and components that work with spatial data.
  2. The software and the spatial data.
  3. The software.

Much of the discussion depends on what you choose as a definition.

The Discussion

It should be mentioned that a GIS is nothing without the spatial data that can be processed with it. But is that data part of the GIS? I often ask a question like: “Does the word processor also contain the texts you are about to write when you unpack the box?” These texts are to the word processor what the geo data is to the GIS.

So I make a clear distinction between Geo and GIS. The consequence of this distinction is that for me the only possible definition is the third one. And I realize that this conflicts with the general definition of Information Systems.

Let me even go one step further, especially if we would like to keep the information system definition, and propose to make more use of the term Spatial Information System for the infrastructure, organization, personnel, and components. In that way we can reserve the term GIS for the software and Geo (or Spatial) for the data. Combinations with other fields like Spatial Intelligence, and also the place of Remote Sensing, may become easier to describe in this way.

I wonder what others are thinking…

Learning path

This week I concluded the last two courses of this autumn semester. Although I have always promoted “lifelong learning”, I had not really taken any courses in recent years besides some hands-on training in my field. After finishing my studies I started to teach, and preparing those courses always kept me up to date. But this autumn I started taking classes again.

MOOCs

This autumn I discovered the Massive Open Online Course. In total I have finished four courses, three at Coursera and one at the Knight Center for Journalism in the Americas. At the start it was a bit strange. Would I still be able to study, do homework on a regular basis, pass the quizzes? After two weeks this fear was completely gone. Classes delivered as a number of videos of about 12 to 15 minutes each work well (in total about 1 hour per course each week). The weekly assignments were sometimes multiple-choice questions on the course material, but there were also assignments with a lot of maths, drawing, building, and designing. I even managed to build a prototype of a juicer!

design assignment

In the end I did four courses, and although the subjects were all different I see a very nice learning path in them. Let me tell you which courses I took:

  • Model Thinking, Coursera, Scott E. Page, University of Michigan
    “Why do models make us better thinkers? Models help us to better organize information – to make sense of that fire hose or hairball of data (choose your metaphor) available on the Internet. Models improve our abilities to make accurate forecasts. They help us make better decisions and adopt more effective strategies. They even can improve our ability to design institutions and procedures.”
  • An Introduction to Operations Management, Coursera, Christian Terwiesch, University of Pennsylvania
    “In short, you will learn how to analyze business processes and how to improve them.”
  • Design: Creation of Artifacts in Society, Coursera, Karl T. Ulrich, University of Pennsylvania
    “The course marries theory and practice, as both are valuable in improving design performance. Lectures and readings will lay out the fundamental concepts that underpin design as a human activity.”
  • Introduction to Infographics and Data Visualization, Knight Center, Alberto Cairo, University of Miami’s School of Communication
    “How to work with graphics to communicate and analyze data.”

Although the courses all look very different, I found a very nice learning path in them. There were moments when ideas from one course came up while I was struggling with another. For example, when I needed to make a presentation on business processes for the Operations Management course, I chose to create an infographic to present the data and the outcomes. The Model Thinking course, on the other hand, helped a lot with organizing data and with exploring a new way of thinking.

In my professional work I can make use of all four courses, although that was not the goal at first. Another thing is that I have used a number of new tools to make and organize my course notes. Among these are Evernote and Tableau Public, tools that I had not used before and that have proven very valuable. For other tools that I have been using for a long time, like Freemind and Inkscape, I have found new ways of applying them.

Plans for 2013

For 2013 I have signed up for new courses: Computing for Data Analysis, Game Theory, and Creative Programming for Digital Media & Mobile Apps. Besides that, I have started to work on my teaching materials to create my own on-line course, an introduction to GIS and geospatial data. For this course I have started on Udemy, a platform for on-line courses on a wide range of subjects.

So… On-line courses and lifelong learning, it will continue.

Final Assignment: Baller Gets 8 Year Extra

My final assignment for the MOOC on Infographics and Data Visualization is finished. Last week I collected all the data; this week it was about filling in the details and the design. And it proved to be hard. Collecting data and presenting it is one thing, but an infographic should have just that something extra. Plus it should have the storytelling part. All in all I struggled.

The general story, as I wrote in my blog last week, is that Statistics Netherlands posted a new dataset on life expectancy and income classes. A nice subject with a number of good and juicy details. It is the data from last year, but this institute has been collecting this kind of data for a long time. This allowed me to do something extra with the data, which is to show how life expectancy has changed over the last centuries.

In the end my infographic has four images.

  • Life Expectancy for Men, for the year 2011 in the age range of 0 to 80.
  • Life Expectancy for Women, same as above
  • A detail showing one of the findings based on the analysis of the dataset
  • A historical overview of Life Expectancy in 25 year blocks from 1875 to 2000

I first prepared my data sets in Gnumeric. The data needed some cleaning: not all columns were needed, and I wanted to separate the data for male and female. I also struggled for a while with the income classes. The dataset is subdivided into five classes: lower, lower middle, middle, upper middle, and upper class. All other datasets from Statistics Netherlands have ten different classes, based on income, but these are not linearly divided. It was quite some work to get them to match. Finally I used a histogram function based on standard deviation and a normal distribution. The book “Making History Count” by Feinstein and Thomas, which I once used in my teaching, was a great help in this.

After importing the data into Tableau Public I started thinking about the presentation. I wanted to show the whole range, from the life expectancy of a newborn child to that of an 80-year-old. And I wanted to show the five different classes. After first making a line diagram I ended up changing my data points into rings, hoping that in that way the overlapping data for the classes would be more visible. I have partly succeeded in this.
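The exact procedure is not spelled out here, so the following Python sketch only illustrates the general idea of binning values into five classes by their distance from the mean in standard deviations. The cut points (-1.5, -0.5, 0.5, and 1.5 standard deviations) and the function name `income_class` are hypothetical choices for this example, not the values used for the infographic.

```python
# Class labels as named in the dataset, from lowest to highest income.
LABELS = ["lower", "lower middle", "middle", "upper middle", "upper"]

# Hypothetical cut points, in standard deviations from the mean.
CUTS = [-1.5, -0.5, 0.5, 1.5]

def income_class(value, mu, sigma):
    """Assign a value to one of five classes based on its z-score."""
    z = (value - mu) / sigma
    # Walk the cut points from low to high; the first one above z decides.
    for cut, label in zip(CUTS, LABELS):
        if z < cut:
            return label
    # Anything at or above the highest cut point is the top class.
    return LABELS[-1]
```

With a mean of 100 and a standard deviation of 10, a value of 90 sits one standard deviation below the mean and ends up in the lower middle class.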

Male

What is not immediately visible is that for a newborn child it is good to be born into an upper-class family, but after the age of 55 we see that the middle class and upper middle class do better. Therefore I have included a detail showing this conclusion.

Detail_1

Finally I created 6 time series showing similar data ranges, this time not for the different income groups, but for male and female. I chose 6 different moments in time, each 25 years apart, from 1875 until 2000.

Over_the_Ages

I exported all the graphs to PDF, and then the real play with the presentation started. For this I used Inkscape, which is comparable to Illustrator, but Open Source. One of the tricks here is to play with the layers, making subdivisions between the different groups like graphs, images, text, and background. Another is to set up guides first and then prepare the blocks. Since my first version was still dull, I decided to make a second version with some images in the historical series.

And the result? I am happy with it, but there is certainly room for much improvement. My goal was to learn more about infographics in general. I succeeded in that: at least I can now see what needs to be done.

FinalAssignment_EK2012

http://www.elwink.nl/infographics/FinalAssignment.pdf

For all those reading this blog, and wondering… Alberto Cairo and the Knight Center for Journalism in the Americas are starting a new course in January 2013!