Dataviz MOOC Review: Brace yourself, enjoy the ride

That's it, the MOOC on infographics and data visualization by Alberto Cairo is done, over, finished, completed. Phew. It was more work than I expected. It says four to six hours per week, but for me it was more around ten to twelve; less at the beginning and more towards the end. It was not easy to manage alongside a job and a family, a situation I shared with many other students and the teacher himself. Several students posted their work mentioning their difficulties in meeting deadlines and class requirements. That's why six weeks were really enough and, after going to bed at 4 am to finish my final assignment (the good old days), I am still recuperating. Still, I was stimulated and motivated enough to make it to the finish line. I hope we'll find out how many of the 2000 students made it.

Several people have already provided their review (here, here, Knight Center, and Cairo himself). Here's mine.

The good

The engagement of the teacher. It was impressive to see him everywhere making comments on a majority of discussions and projects, at least in the first few weeks. It was also motivating because it gave credence to the discussion forums. I was impressed by his engagement on Twitter and his generosity in shining a light on the students' blogs and entries. In fact, it motivated me to start this blog.

The Functional Art is a great book that will complement those of Stephen Few and Edward Tufte. I especially like the interviews at the end to give a perspective on the industry.

The journalistic angle. I had never seen this in my other readings, all of which were focused on more technical aspects of data visualization. The insistence on finding and showing a narrative has changed the way I do data visualization for the better.

The exercises. I'm glad we were pushed to produce something within relatively short deadlines. It is one thing to criticize, but quite another to deliver. I leave the class with something to show and that's something.

The network of students. It was one of my objectives when joining this class to find fellow data visualizers and this goal has been achieved in large part thanks to the forums and the regular interactions. I hope to stay in touch and continue to follow a few of the students.

The duration. Six weeks was good - long enough to get immersed, but short enough that it didn't become too much of a burden.

The price. It's amazing that we can get such quality education for free. Where's the catch?

The bad

The limited feedback by participants. It greatly improved in the last week, but earlier I saw a lot of feedback limited to something like "good job, I like your colors". It sometimes looked like someone was just trying to tick a box for class participation. The initial 500-word requirement that was waived might have been too high, but there should be a threshold. We can benefit a lot from the feedback of our fellow students, but also from a careful consideration of their work. In fact, this experience convinced me further that teamwork can improve this kind of output.

The ugly

The platform. A few years ago, it would have been good enough, but nowadays we are used to much more responsive and intuitive user interfaces and it is hard to adjust one's expectations to a rather antiquated one. I could barely find my way around in the first week and it never got comfortable. There were features buried so deep that they might as well have been hidden. For instance, who else realized that we could establish "contacts" with other students, à la Facebook? Even I had a hard time remembering where my list was (it's under Messages). The inbox format is most bizarre, with messages taking a sixth of the screen width. I also wish that the images could be seen inline, even if hosted elsewhere. I am not sure it is worth trying to fix this platform and suspect it would be better to move to an entirely new one.

I have many more high points than low points. My experience was very positive, both as a MOOC and for the knowledge gained. I would do it again in a heartbeat and in fact, I'll now be on the lookout for MOOCs and other data visualization training.

tl;dr: take it, brace yourself.

MOOC Weeks 5-6: UK Aid to India

For our last assignment of the MOOC, Alberto Cairo decided to give us enough rope to hang ourselves: "do whatever you want". I proceeded to swiftly spend half the allocated time deciding on a topic. Returning to aid, the subject of week 3, was a natural fit and I knew the data would be available. After considering a few generic variations on the themes "where does aid come from" and "where does aid go", I realized I needed an angle. The recent announcement by the UK that they are cutting their aid to India seemed intriguing enough and called for some data. Then, I set as my goal to create one of these long, vertical infographics, but without resorting to some of the misleading and unhelpful techniques that plague too many of them. Let's recap some of the lessons of the first four weeks.

  1. Look for a story in the data.
  2. Convey a narrative.
  3. Use good copy to draw the reader in.
  4. Combine several graphs.
  5. Present the same data in different ways.
  6. Use the appropriate graph for the data.
  7. Pick the color scheme carefully.
  8. Label and include legends.

Here is the result.

UK Aid to India. Francis Gagnon

The story is that it is a big deal that the UK will cut its aid to India and there are many ways to understand the causes and consequences. It is a delicate topic and I did not want to turn the infographic into an editorial. It is rather designed to help the reader think about the issue and maybe open a few new perspectives, especially since some of the actors have strong opinions about this shift.

It starts by showing the reader how important this decision is: India is a top recipient of UK aid. Then it goes into a comparison of the two countries, to reflect on their relative economic health. This leads into an exploration of poverty in India and finally an opening towards the other potential beneficiaries of this change, placing this policy decision in a larger context. The sources are also an important aspect of an infographic and I wanted to provide them in a clear way to support the credibility of the data above.

This has taken much longer than anticipated. Dataviz nerds, look for a making-of in the coming days.

MOOC Week 4: Unemployment in the US 1960-2012

US unemployment data has been visualized over and over and over in the last year because of the election and recession. Not only is the topic important but the US Bureau of Labor Statistics makes the data easily available. It was not surprising that Alberto Cairo chose it for our week 4 exercise of the MOOC, but it also meant that the bar was already set high. The data set, coming through the Guardian Data Blog, was comprehensive if not diversified. My experience in the prior weeks led me to conclude that I should not spend too much time exploring the data and should focus instead on executing a neat infographic. This turned out to be the wrong conclusion.

After a quick survey, the data seemed to lack depth: providing four years of unemployment data immediately raises the question of what happened before those four years, especially since an economic meltdown had happened in the months prior. So I decided that my interactive graph would show 52 years of data (and it is available). I envisioned a basic visualization with multiple graphs that would allow the user to explore all the data and cut it in multiple ways - by state and by date. Here's how it looks. I chose a grey color theme to keep the focus on the data-related colors, but it seems a bit dull, although the culprit might be the fonts and design more than the colors.

Unemployment US FG

Data linked to population calls for a cartogram because it is not based on territory, contrary to weather, agriculture and climate, for instance. It also seemed natural to offer the option of comparing any set of states, so I took my inspiration from the Google Data Explorer. The states are hidden in a drawer on the right and I'm not quite sure that users would notice this.

Unemployment US FG2

The closest I came to providing a narrative is in creating ready-made periods, on the top right. In the example below, the user has selected the results of the 2012 election and the lines, bars and states are colored according to each party. I'm not sure how it would really work on the map since I would end up with two color shadings, making it hard to compare levels of unemployment across parties.

Unemployment US FG5

It seemed interesting to see the influence of the political system on the economic performance. It is unclear if presidents or even Congress can have an impact in the short run, but it is something that partisans constantly bring up. I thought that the users would like to know what happened while their party was in power.

Unemployment US FG3

Two variations in the next version. The most obvious is the circles representing jobs created and lost in each state. While circles are not very precise, I find it an interesting way to show four data points at once: losses, gains, differential and overall size. Also, it is unusual and eye-catching. Some users might need a minute to understand it, but it pays off.
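
As a rough illustration of how one circle could carry those four values, here is a minimal Python sketch. The function name, the field names and the sample numbers are all hypothetical. It assumes that "overall size" means gains plus losses and that the radius is scaled so that the area, rather than the radius, is proportional to the value. That is a common convention, not necessarily what I did in the graph.

```python
import math


def circle_spec(gains, losses, scale=1.0):
    """Return the four values one circle could encode, plus area-scaled radii.

    Hypothetical helper: 'turnover' (gains + losses) stands in for the
    "overall size" of the circle, which is an assumption.
    """
    def radius(value):
        # Area = pi * r^2, so r = sqrt(area / pi); the area is proportional to the value.
        return math.sqrt(scale * abs(value) / math.pi)

    return {
        "gains": gains,
        "losses": losses,
        "differential": gains - losses,
        "turnover": gains + losses,
        "gain_radius": radius(gains),
        "loss_radius": radius(losses),
    }


# Made-up numbers for a single state.
spec = circle_spec(gains=120_000, losses=90_000)
```

Scaling the radius directly instead of the area would exaggerate the differences between states, which is part of why circles are not very precise.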

The second variation is in the bottom graph where the data is now relative to the average national rate. It is less radically different than I expected. In fact, I'm not quite sure that it adds enough insight to justify the interactivity.
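
The transformation behind that bottom graph can be sketched in a few lines of Python. This is only an illustration: the function name, the state labels and the rates are made up, and it assumes "relative" means a simple difference in percentage points rather than a ratio.

```python
def relative_to_national(state_rates, national_rate):
    """Return each state's unemployment rate minus the national rate,
    in percentage points (assumed definition of "relative")."""
    return {
        state: round(rate - national_rate, 2)
        for state, rate in state_rates.items()
    }


# Made-up sample values for one month, in percent.
sample_rates = {"State A": 9.5, "State B": 6.0, "State C": 8.0}
relative = relative_to_national(sample_rates, national_rate=8.0)
```

With a difference, every line is simply shifted by the same national value at each date, which may explain why the result looks less different than I expected.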

Unemployment US FG4

I went heavy on the interactivity and customization but I mostly missed the story. I did not look for one, hence I did not find one. My graph comes across as a source of raw data for people who are interested in economic data, but it does not draw in anyone new. Yet another lesson learned from this MOOC.