Concentric Circles on a Map, version 3

After my attempt at improving the concentric circles, Stephen Few was kind enough to provide more feedback: he still doesn't like them.

Your experiments with concentric circles are interesting and it's clear that you're having fun exploring this, but the new version doesn't seem to work any better than the first, even though you've eliminated the one annoying illusion. We're still left trying to compare areas, and even though inner circles make this slightly easier, the comparison still requires too much effort and time. Also, to my eyes the patterns formed by the concentric circles are hard to look at--similar to targets on a gun range--which make me a bit dizzy. I appreciate your efforts to find a better solution, but I doubt that concentric circles will prove useful.

The concentric circles don't make me dizzy, but I agree with the core of the arguments. If the goal was to equal bricks in ease-of-reading, then yes, it is a failed attempt. But if the goal was to improve on plain shapes and colors to display quantities on a map, then it seems like a fair addition to the data visualization arsenal. Circles exist and are regularly used on maps, and this is a suggestion to make them a little more precise.

I made a few more tweaks to the concentric circles. Here is version 3. I am starting to think that they should have been called "circular gridlines" ever since I replaced the colored circumferences with a colored area.

circles v3

The smallest circle has disappeared because the spacing was not constant: the small circle represented 1 unit; the next circle, 5 units; and the third, 10 units (1-5-10...). Now the interval is constant at 5 units (0-5-10-15). The result makes clear that the area grows much faster than the radius. I double and triple checked my numbers, but it seems that the inner circle is really the same area as each of the two rings. Truly, these areas are counter-intuitive.
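The arithmetic can be checked in a few lines. This is a minimal sketch in Python, assuming that one unit of data maps to one unit of area and that the gridlines sit at areas of 5, 10 and 15 units; the function names are illustrative only.

```python
import math

def radius_for_area(area):
    """Radius of a circle with the given area (area = pi * r^2)."""
    return math.sqrt(area / math.pi)

def ring_areas(thresholds):
    """Return the gridline radii and the area of each band between gridlines.

    The first band is the inner disc; the others are rings.
    """
    radii = [radius_for_area(a) for a in thresholds]
    areas = []
    previous_radius = 0.0
    for r in radii:
        areas.append(math.pi * (r ** 2 - previous_radius ** 2))
        previous_radius = r
    return radii, areas

radii, areas = ring_areas([5, 10, 15])
for r, a in zip(radii, areas):
    print(f"gridline radius {r:.3f}, area of this band {a:.3f}")
```

Each band comes out at 5 units of area, while the outer radius is only about 1.73 times the inner one (the square root of 3) even though the total area has tripled.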

I now use white gridlines with the vague hope that they will be less dizzying, if that is truly a problem. The downside is that we can't see when the value reaches a multiple of 5 units, like with the faint grey gridline, only when it exceeds it. That's why I have a third row in the example above, to show more than one gridline at work. One is not limited to intervals of 5, of course, and a different interval would certainly work better in some cases.
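To make the trade-off concrete, here is a small sketch under the assumption that a white gridline only shows once the colored area extends past it. The function name and the sample values are hypothetical.

```python
import math

def visible_gridlines(value, interval=5):
    """Count the white gridlines that show inside a colored circle.

    A gridline at k * interval is visible only when the value strictly
    exceeds it, so a value exactly on a multiple shows one line fewer.
    """
    if value <= 0:
        return 0
    return math.ceil(value / interval) - 1

for value in (4, 5, 6, 10, 11):
    print(value, visible_gridlines(value))
```

With an interval of 5, the values 4 and 5 show the same number of gridlines, and so do 6 and 10, which is exactly the ambiguity described above.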

Time to put version 3 on a map and compare with the plain circles. Click for full size.

Source of original map: http://commons.wikimedia.org/wiki/File:US_state_outline_map.png

So, which one seems clearer? For testing purposes, compare Texas and Louisiana, Washington and Oregon, or Oklahoma and New Mexico. In these cases, the circular gridlines help me establish which one is larger, something I can't quite do with the plain circles.

Stephen Few has very high standards, which I respect and wish I could meet, and he wrote that he will not endorse a method that uses area to encode data. Still, this is not about getting his coveted approval, but contributing to and engaging with the larger data visualization community. I would be interested to hear what you think and to see the result if some of you ever test the concentric circles on a real project.

Note: This is a screenshot of the original poll. It is no longer active now that we have moved to a new web platform.

polldaddy.png

Concentric Circles on a Map, version 2.0

In response to Stephen Few's challenge to do better than his bricks to display quantities on a map, I proposed the concentric circles. He shot down version 1.0:

Thanks for exploring the possibilities of concentric circles. Unfortunately, as you've seen, when there are more than three or four concentric circles, we cannot perceive the quantities by subitizing; we must attentively count them, which is very difficult to do because they are close to one another and hard to differentiate. Even with attention, it is very difficult to see the difference between a set of seven vs. a set of eight, and so on. Also, notice that sets of closely packed concentric circles beyond a small number create an annoying visual illusion of partially overlapping circles at the four cardinal positions (top, bottom, left, and right). You can see this especially in your map example. Even though this doesn't work, it was definitely worthwhile to make the attempt. Thanks for the contribution.

This is fair criticism, so I went back to the drawing board. My goal is not to match or exceed the bricks, which I think do a fine job on the preattentive side, but rather to improve on the circles to convey quantities on a map. These were my challenges.

  • Get rid of the optical illusion.
  • Preserve the capacity to overlap.
  • Make the quantities easier to perceive.

Here are the concentric circles version 2.0.

Image

The color is now on the area instead of the stroke, and there is a circular gridline every 5 units.

This design is less busy and does not create the optical illusion of version 1 at smaller sizes and lower resolutions.

circles 1 and 2

The concentric circles can still overlap and preserve their shape.

circles 2 overlap

And the circular gridline allows us to see when certain thresholds are crossed on the circle, something that is not possible on a plain circle.

circles threshold

It is not possible to interpolate precisely between gridlines. Columns and bar charts suffer from a similar problem, but they hold two advantages. The first is that they generally have a gridline that exceeds the length of the longest column or bar.

columns

The concentric circles 2.0 could do the same thing.

circles outer gridline

I don't want to discard this solution entirely, but I am concerned that we will perceive the outer limit more than the colored area and overestimate the size of the circles. The cost seems to outweigh the benefit.

The second advantage of the bar is that the distance between the gridlines is constant. In a circle, as is well known, the distance between the circumferences of concentric circles whose areas increase by equal intervals gets smaller as the area grows. It is unlikely that people will adjust their perception of the distance and scale between each circular gridline.
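How quickly the spacing shrinks can be worked out directly. The sketch below assumes gridlines at equal area steps of 5 units, with the radius derived from area = pi * r^2; the function name and parameters are illustrative.

```python
import math

def gridline_gaps(interval, count):
    """Distance between successive circular gridlines placed at equal area steps.

    The first gap runs from the center to the first gridline.
    """
    radii = [math.sqrt(k * interval / math.pi) for k in range(count + 1)]
    return [outer - inner for inner, outer in zip(radii, radii[1:])]

gaps = gridline_gaps(interval=5, count=4)
for k, gap in enumerate(gaps, start=1):
    print(f"gap inside gridline {k}: {gap:.3f}")
```

The second gap is already less than half of the first, and each one after that is smaller again, whereas the gridlines of a column chart would stay evenly spaced.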

column and circle

I am not sure how much of a problem this is, considering that we are not aiming for the precision of a table, but rather for a visual method that allows a fair approximation. Still, the approximation is likely inferior to that of the column and bar charts.

The contribution of the concentric circles is that they make this confusing property of areas visible, while the plain circles do not.

circumference and area

Enough parading, time to put the concentric circles to work on a map. Click for real size.

concentric circles map

Compare with the plain circles.

circles map

So, is it easier to visually estimate quantities with the concentric circles? The slight difference between Arizona and California seems more visible with the concentric circles, and easier to perceive than with version 1.0. The difference between Oklahoma and Louisiana, at least to my eye, is perceptible with the concentric circles, but barely with the plain circles.

Click below to see some other experiments that I discarded or am keeping for later versions.

circles experiments

The first one was inspired by a suggestion from Taimur Sajid on Twitter and it put me on the scent for version 2.0 (thanks!). I find it too busy though, and prone to the optical illusion and the difficulty of counting rings. The gradient was too hard to test at different sizes (!). The other three are still too busy. There are still two options that hold potential, however.

circles by shade

The shades replace or complement the circular gridlines. I ran out of time today to test them. I am concerned, however, that the shading will interfere with the transparency when circles overlap. Still, it could either reinforce the visual encoding for quantities, or simplify the design in the case of the one without strokes.

Finally, I used many of the default settings of my software. There's more to try with different colors both for the area and for the strokes to make the concentric circles clearer. I look forward to the discussion, hoping to see more people weigh in because there is much to gain from a clearer depiction of quantities on a map or with areas in general.

Diving with a view

Part II of my observations from the World Bank Data Dive on poverty and corruption. It might start with the data, but for me the fun is in the analysis, especially visual. I had in fact joined the group fighting corruption because they seemed the most likely to need data exploration and visualization.

Board approvals per month

Below is the result of a long day's worth of work, more or less. I wish I had a graph shining the light of integrity on collusion, coercion or some other evil, but no. Slowed down by data issues, we did not make it that far. I can't say that I'm satisfied with any graph I've done over the week-end but then again, I've done them.

The first one happened while I was idly playing with the project data. By déformation professionnelle, I looked at the number of projects that the Board of Directors had approved per month. With July at the top, it is clear that there is a rush to approve more projects towards the end of the fiscal year, in June.

Is it possible that more cases of corruption happen in projects approved in May and June because the staff takes less time to conduct the due diligence? This question opened the Pandora's box of linking the debarment data and the project data. If we were to find project characteristics that lead to a higher likelihood of corruption, it could orient the preventive work of the integrity team. It was too much to resist and became our undoing as we spent hours trying to recreate that link, leaving available data sets unused.

Trend share WB board approvals

While the true wizards were working on said link, I continued to explore the project data visually. My original graph showed cumulative approvals for 66 years. What if this bunching is an old problem and the Board now approves a constant number of projects per month? I needed a trend.

I'm afraid this is my best effort of the week-end. About 800 data points are visible, with a clear enough message: the trend has worsened over the decades and the Board approves a growing share of projects towards the end of the fiscal year. The months with a larger share have gotten an increasing share, and vice versa. Since the mid-1980s, the share has reached 30% regularly in June. This is about 3.6 times as much as would be expected from an equal distribution per month (1/12 ≈ 8.33%). This finding confirmed that it was still worth exploring the impact of this share of approvals on the due diligence of individual projects. Unfortunately, the data materialized too late and the link was never explored.
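For anyone who wants to redo the comparison, here is a minimal sketch of the share calculation. It assumes a plain list of Board approval dates for one year; the function name is hypothetical and the sample dates are made up for illustration, not taken from the World Bank data.

```python
from collections import Counter
from datetime import date

def monthly_shares(approval_dates):
    """Share of approvals falling in each calendar month (1 to 12).

    Expects a non-empty list of dates.
    """
    counts = Counter(d.month for d in approval_dates)
    total = sum(counts.values())
    return {month: counts.get(month, 0) / total for month in range(1, 13)}

# Made-up dates for illustration only, not World Bank data.
sample = [date(1990, 6, 5), date(1990, 6, 20), date(1990, 6, 28),
          date(1990, 3, 14), date(1990, 5, 30), date(1990, 11, 2),
          date(1990, 12, 9), date(1990, 1, 17), date(1990, 6, 11),
          date(1990, 9, 3)]

shares = monthly_shares(sample)
uniform = 1 / 12  # about 8.33% if approvals were spread evenly
print(f"June share: {shares[6]:.0%}, {shares[6] / uniform:.1f}x the even share")
```

Running the same function once per year and plotting the twelve shares over time would give a trend comparable to the one in the graph, with the even share of 1/12 as the reference line.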

We did get an original data set though: the historical list of firms and individuals debarred by the World Bank. I'm afraid I did nothing worth sharing with it. A few bar graphs showing the number of firms and the average number of days of debarment per country. No corruption-fighting histogram in there, no revolutionary radar graph.

In lieu, here are two of the most interesting visualizations I've seen. The first one is a network diagram of the bidders on World Bank contracts built by Nick Violi with data that he scraped himself (wow). It draws no conclusion, but it makes me curious. What are these clusters? I don't even know what the colors mean, but I'd like to know why some clusters are all yellow, some are mostly blue and some are mixed. G11 is an interesting node, as it bids on few things but then bids across two clusters. What kind of company can it be? This is the kind of exploratory visualization that makes me want to dive into the data.

Credit: Nick Violi @nvioli

The second is from a team exploring UNDP's resource allocation. In a scatter plot, it compares the overhead with the expenses of, apparently, hundreds of projects. It might look like a Caribbean hurricane to you, but to me the resulting distribution of the data is surprisingly elegant. The two measures have expenses in common, which accounts for the slope pattern. The horizontal cut-off at 1.0 is due to budget limits (or so one hopes). The color overlay provides a nice analytical tool, suggesting to the reader where to look and how to interpret the data. There are a few startling findings already. A surprising number of projects have spent 2-3 times as much on overhead as on operations. Despite the high number of outliers, there is a strong concentration of projects around the target of spending 100% of budget and keeping the overhead low, which suggests good planning and lean implementation.

World Bank DataDive UNDP Capacity & Performance

This graph would benefit from some graphic design flair. The overlay text should be readable and aligned everywhere. The overlay colors could be more visible and helpful. I'd be curious to experiment with empty circles instead of semi-transparent ones. The vertical text could be made horizontal. The light grey frame could be removed.

Knowing the conditions in which these graphs were produced, I wouldn't take the data for granted, nor draw any hard conclusions. But they might inspire a few in-depth analyses. Have a look at a few more on this Tumblr.

Thank you but mostly congratulations to the organizers at the World Bank and DataKind. For an event so open, it is impressive how purposeful it felt. A special thanks to the two data ambassadors of our group, Sisi Wei and Taimur Sajid. I hope that the World Bank, UNDP and other organizers and participants will benefit from the event. I know I did.