The BBC has an article about the massiveness of Facebook—at least in the United States. They have taken the data and spent time to do a little bit of visualisation. It is worth a look; the design is not perfect but acceptable in a broad sense.
The Washington Post has released an in-depth article, or series of articles, about the intelligence community of the United States and its growth since 11 September 2001. There are several visualisations of data and relationships between government agencies and companies along with a video introduction and, well, a traditional written article or two.
Overall, the piece is quite interesting to look through—although I have not yet had the time to do just that. Some of the visualisations appear a bit thin. But, that may be just because I have not yet had time to play with them enough to draw out any particular insights.
What is nice, however, is again having visualisations supporting editorial content in such a fashion.
The election has come and gone yet very little is resolved; the UK now has a hung parliament. Labour, the Tories, and the Lib Dems are now left to negotiate on the details of forming a coalition government, wherein two parties formally agree to cooperate in governing the country, or a minority government, wherein the Tories try to govern with the most seats but less than a majority. Or does Labour try to work with the Lib Dems and achieve something of a minority coalition government? The one certain thing about the election is that we now have loads of electoral data that wants to be visualised.
A few things at the top: as an American, despite my following of British politics, I am, well, an American. I am more familiar with the American system and so some of what follows may be inaccurate. If so, please do speak up. I should very much like to understand an electoral system that may now change entirely.
I wanted to point out a couple of sites real quick and some advantages and disadvantages thereof. Most of these were likely around before the election; however, I have been a tad too busy with work and some other things to provide any commentary until now.
Auntie. The Beeb. The BBC. They have done a pretty good job at playing with four variables and the results. Are pie charts great? No. Not at all. However, they naturally limit us to 100% whereas bar charts displaying share are not necessarily as limiting—understanding that, yes, such things can be coded into the system.
Another interesting thing about the BBC’s electoral map is their cartographic decision to represent each constituency as a hexagon instead of overlaying the constituencies over a political map. This actually makes quite a lot of sense, however, if one considers that British constituencies are supposed to be rather equal in terms of population, not geographic area. And so while a traditional map will portray vast swathes of Tory blue and Lib Dem yellow, Labour counters in holding numerous visually insignificant constituencies in the inner cities of the UK.
Does the BBC need to represent each Commons seat as a square and arrange them to cross the majority line? Most likely not. However, it does keep with the idea of displaying each constituency as the boxes are placed next to the hexagons.
All in all, I think the BBC’s piece is quite effective. I do miss seeing the actual geography of the UK. But I understand how it is less useful in displaying the outcome of one’s playing with the electoral swing. Useful, but not necessarily needed, is the provision of several historical elections as comparisons to one’s playing.
The Guardian is next, in no particular order. Their swingometer is a bit less interesting than the BBC’s. Certainly in some senses it makes more sense: any bi-directional swing, while easier to grasp, ignores the complexities in having the Liberal Democrats as a viable third party and thus third axis. The circular swingometer attempts to rectify that. However, what the BBC does with their pie chart version is delve into the politics of the regional and fourth party candidates. For example, the Greens won a single constituency in southern England. In a hung parliament a single vote may be the difference between passing and defeating a bill. The BBC accounts for this while the Guardian does not.
What is particularly interesting about their calculator, however, is the ability to track individual seats and watch as one’s changes affect that particular constituency. As I play with the calculator, I can watch as Brighton Pavilion, where the Green party candidate won, changes from Labour to Conservative. However, nowhere in my exercises have I managed to switch the seat to a fourth party candidate. The BBC solves this by not allowing one to select particular constituencies; one can only guess which seats they are looking at.
Also interesting about the Guardian’s version is their provision of different data displays. The default is a proportional representation, with each seat equating to a single square. However, they also allow one to view the results on a natural geographic level and strictly in terms of number of results and how close said results are to the magic number of 326. Additionally, the map allows you to filter for only that region of particular interest to you. If I only wish to look at, for example, the West Midlands, I can look at just the West Midlands without being distracted by additional regions. (The West Midlands provides another interesting example of being unable to factor in the role of fourth party players as Wyre Forest switched to the Conservatives, a result I cannot here duplicate.)
Overall, I really like how the Guardian provides different ways of viewing the data and the ability to track those changes to a particular constituency, even across the changes in data views. However, the Guardian is lacking at least in the ability to address the role of independents and regional parties. Perhaps this is due to a level of difficulty in predicting results at that level of granularity; something that is wholly understandable. However, that the BBC does just that is unfortunate for the rest of the Guardian’s piece because the rest of it is so nice. Even aesthetically, I find the Guardian’s to be appealing.
Next is the Sun. This, admittedly, is not so much a calculator but more a results map. And as such, it is effective in its simplicity. There is no messing about with swing or such—again probably because it is simply filling in constituencies by result. However, where the Sun’s piece fails is that to see any result, one needs to click a specific region. When selecting the UK, one can only see the outlines of the various regions of the UK. There appears to be no way of seeing UK-wide results.
Additionally, the data is presented strictly on a natural geography. This has the deficits as outlined above. And while the Guardian does present the results in such a fashion, it is not the only fashion in which data can be presented. Further, to see any results for a particular constituency, one must click all the way through the map before seeing data. None of this helps one access the actual data. And while one could say that the results are less important than showing the victor, one still needs to click into a specific region to see a victor thereby requiring a click whereas the other pieces provide results at an instant view.
Aesthetically, while both the BBC and Guardian favour a lighter, more open space, the Sun’s piece feels trapped in a claustrophobic space surrounded by dark advertisements and flush against menus and heavy-handed navigation. All in all, I must confess that the Sun’s piece strikes me as an underwhelming piece that is less than wholly successful. It could have been made at least more successful had I not needed to navigate into a particular region to mouseover a constituency.
The last piece I am going to look at is that from the Times. While there appears to be no way of playing with possible outcomes, the Times provides interesting ways of slicing the data in a more narrative structure. In terms of the map, the display suffers from being viewable only as the natural geography of the United Kingdom without being able to even toggle to a proportional view.
The additional data is displayed nicely in a side panel. I have to say that from an aesthetic standpoint, the Times’ mini site for the election results is my favourite. The black banner and main navigation sit well against the light colours used for the remainder of the piece. The serifed typeface for the numbers fits well with the newspaper feel and the black and serif combined work well to recall No. 10, Downing Street. A very nice touch and design decision.
As noted, the display fails in that it only shows the data in a natural geographic sense. Now, the site overall provides links to news coverage of the event; these are accessible through a dropdown menu in the black banner. But, when clicked, these stories alter the map and highlight the particular constituencies in question. This approach provides a nice touch on straight data visualisation in linking the data to the editorial content of the newspaper. Which seats were taken or lost by independents? On a broad and filled-in map of the United Kingdom, I may not be able to know. But by clicking on that story, the map filters appropriately and I can click each constituency and get the story.
And so while the data visualisation is not necessarily on par with that of the BBC and the Guardian, the tie-in with the editorial emphasis—in my mind—makes up for the lack of detail in data visualisation. Data is wonderful, however, the narrative is what helps us make sense of what is otherwise just numbers and figures.
That editorial link and the subtle design decision to link the minisite to the sort of 10 Downing Street aesthetic make the Times version my favourite and the best designed experience. Besides the lack of detail in the data visualisation aspect, the only other drawback is perhaps the load time for each change in display.
Sunday night, the US House of Representatives passed a bill that you may not have heard about. The bill goes towards addressing universal healthcare coverage for US citizens. As I said, you may not have heard of it…
The bill was passed largely along partisan lines with about 30 conservative Democrats joining the conservative Republicans in voting against the legislation. This morning, the GOP unveiled a new website called Fire Nancy Pelosi that seeks to capitalise on the anger against—and perhaps even hatred for—providing healthcare to all Americans by collecting donations to capture 40 House seats in the forthcoming mid-term election. The website uses a map of the United States to show “who wants to fire Nancy Pelosi most”. According to the map, states are ranked by donation totals.
Without attempting to talk about the politics, the problem with the map is that it is attempting to equate the state’s supposed anger against Speaker Pelosi and healthcare for Americans with the sum of donations per state. As of the time when I captured my screen, the interesting visual is that many traditional red states are blue and purple, e.g. West Virginia and North Dakota, whereas many traditional blue states are purple and red, e.g. California and Illinois. The problem, however, is that one state may be able to provide more donations than another.
The most obvious difference is in terms of population. Without access to the data I cannot state facts about exact donation totals. However, the map does break down the states into deciles and so I have quickly pulled from Wikipedia some rankings on population (from 2009) and income per capita (from 2000). What is quite clear is that the states donating the least are among the states with the smallest population. Six of the Bottom 10 donating states are from the ten smallest states. Conversely, eight of the GOP’s Top 10 Donating States are among the ten largest states in the country by population.
If you compare income per capita, I find the message a bit more confusing, but still quite interesting. I do not claim to be a statistician and an analytic review of the numbers is a bit outside my area of expertise. However, for the Top 10, not a single state is ranked 40 or below in terms of income. And only one state is ranked in the 30s. One Top 10 Donating State is also found in the top ten by income per capita and a total of five of the Top 10 Donating States are in the top twenty by income per capita. Among the Bottom 10 Donating States we also find five states from the top twenty states by income per capita. However, we also find four of the last twenty states by income.
What strikes me is that the Top 10 Donating States have a larger population base from which to draw donations and, loosely, earn more per capita and thus, perhaps presumptuously, have more disposable income for contributions. The Bottom 10 Donating States have among the smallest populations and while some are seemingly quite wealthy, a significant number are among the least well-off in terms of income per capita.
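To make the point about population concrete, here is a minimal sketch of how one might normalise donation totals before ranking states. The figures are hypothetical placeholders (I do not have the real donation data), and the names `states` and `per_capita` are my own inventions for illustration.

```python
# Hypothetical placeholder figures, NOT real donation or population data.
states = {
    "State A": {"donations": 500_000.0, "population": 20_000_000},
    "State B": {"donations": 50_000.0, "population": 1_000_000},
}


def per_capita(donations: float, population: int) -> float:
    """Return donations per resident."""
    if population <= 0:
        raise ValueError("population must be positive")
    return donations / population


# Ranking by raw totals, which is what the map appears to do.
by_total = sorted(states, key=lambda s: states[s]["donations"], reverse=True)

# Ranking by donations per resident, which removes the population advantage.
by_capita = sorted(states, key=lambda s: per_capita(**states[s]), reverse=True)

print("By total:     ", by_total)
print("By per capita:", by_capita)
```

With these made-up numbers the large state leads on raw totals while the small state leads per resident, which is exactly the kind of reversal a map of raw sums cannot show.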
And none of this critique discusses how the Top 10 Donating States use a bright and vivid red to draw attention whereas the Bottom 10 use a fairly dark and almost dull blue to push forward the bright red.
This weekend was pretty busy. We had another earthquake in Latin America—if one includes Haiti as part of Latin America—and the closing of the Olympics. Both have prompted some information graphics that are worth noting and comparing. I am going to leave the New York Times’ explanation of the Chilean earthquake to another post and instead focus here on the Olympics.
I wanted to look at three different visualisations of the Olympics, chiefly centred on the always popular medal count.
First we have CNN, which dedicates an entire special coverage site to the Winter Olympics. The site has the 2.0-esque feel with different boxes providing the user with different types of material: text-based stories, video, access to an interactive map, and a medal count. The map is what first strikes me because of its warning of reds and oranges and yellows. When I clicked to access the map, however, I felt disappointed in what appeared. And then I wonder why I am being warned about the US and Canada.
Generally speaking, a lot of the world’s landmass did not participate. And of those that did, not a lot won any medals. The vast emptiness of the grey map does a disservice to those much smaller areas of the world, particularly in northern and eastern Europe, that did win but are difficult to see. And, personally speaking, as a fan of Antarctica, I was disappointed to see they neither contributed athletes nor won any medals.
What does work, however, is the idea of highlighting those nations that competed. Perhaps not everybody knows that not every nation competes. The map could have been better executed or even a more stylised visualisation of percentages of regions that competed. Seven nations on the African continent, by my quick count, did compete, while three or four European countries did not. The point is important to make. The visualisation, however, does not support it as well as it could.
The medal count is also quite interesting from CNN. The special coverage site maintains a snapshot of the leader board that links to the Sports Illustrated medal count. From Sports Illustrated, we are provided a simple table-driven display of who won what with little bar charts to highlight the total medal count. By clicking on a country you see the historical details of the country’s performance, again presented as a table with a bar chart for the total medals won in each year. What I find limiting, however, is that the countries are ranked only by total medals. If someone is more interested in the best gold-medal-winning countries, one has to work to find the data. A simple ability to sort by medal type would be a valuable addition.
An interesting situation arises, however, when looking at the historical figures. I am no expert on the Olympics. I did not watch any of them this year. Nor have I really watched them in the past. However, I do know that sports are added and subtracted. So, by clicking on Germany, for example, we see six medals won in 1936 versus the 36 in 2002. In 1936, however, I count only 51 medals total. In 2002, I count 234 medals. 12% compared to 15%. It does not seem so drastic an increase when put into context.
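The share arithmetic is simple enough to sketch in a few lines of Python. The counts are the ones I quoted above; the function name `medal_share` is just my own label for illustration.

```python
def medal_share(country_medals: int, total_medals: int) -> float:
    """Return a country's medals as a percentage of all medals counted."""
    if total_medals <= 0:
        raise ValueError("total_medals must be positive")
    return 100.0 * country_medals / total_medals


# Counts quoted in the text for Germany: 6 of 51 in 1936, 36 of 234 in 2002.
share_1936 = medal_share(6, 51)
share_2002 = medal_share(36, 234)

print(f"1936 share: {share_1936:.0f}%")  # 12%
print(f"2002 share: {share_2002:.0f}%")  # 15%
```

The raw count grows sixfold between the two Games, but the share of all medals moves by only about three percentage points.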
The BBC also has its own special coverage of the Olympics and has its own charts and tables to support the medal count. The main table they chose to use is interesting to me because it is particularly dark compared to the white and grey aesthetic of CNN and the New York Times (below). It certainly stands out. However, when I say ‘it’ I am referring not to the medal count but to the horizontal bands of dark greys and blacks. Between the dark colours, the small type for the medals, and the large flags to identify the countries, I find the medal count last. And then I find it difficult to sort by anything other than the default. But the default is not the total medal count as with CNN. Instead, it is the gold medal count. An interesting choice.
The chart the BBC provides for total medal winnings is also interesting. You can compare across countries, years, and medals. Some of the interface makes it a bit tricky. For several minutes I was trying to figure out which years were which at the outset only to finally realise that I was looking at all years. Perhaps if even all the years were a light weight and then the year selected made bold it would be easier. Or a more noticeable shift in colour.
And then I started trying to discern between countries and was left with three-letter abbreviations for each country. And I could not. Perhaps my inability to figure out which country was which stems from my being an American. But I do like to think that I am pretty good at identifying countries. I thought if I moused over one of the stacked bars the abbreviation would reveal its true meaning to me. But it did not. And so I am left looking at stacks of medals for countries I cannot identify.
The stacked bars are interesting for they are not ideal in comparing medal counts within each country—but the mouseover state provides the required detail. Switching to the country comparison makes it a tad easier, but one must still filter medal by medal. Then aesthetically I do not care for the polished, faux-three-dimensional appearance. Nor do I think most of the bars need to be as wide as they are.
Overall, I like where the BBC was trying to go with lots of cross-comparing and filtering to break down medal counts and historical performance. However, aesthetically, I find too many elements large, bulky and distracting. From large flags and small type to black–grey bands and thick stacked bars, the interface is a tad too cluttered to really allow what the BBC did to show.
Lastly, I want to look at the New York Times. Like CNN, the Times has created a map of the medal-winning countries in the Olympics, both current and historical. Unlike CNN’s map, however, the map by the Times is not based strictly on geography and each country is represented by a ‘bubble’ whose size corresponds to the number of medals won. I am not particularly keen on bubbles for displaying values. However, by eliminating those parts of the world not participating, the user can instead focus on the results. And those results, while not on a map, are spatially arranged to indicate their regional groupings. For example, the European countries are grouped together and coloured differently from those in North America. Switching tabs quickly reorganises the bubbles into rows ordered by numbers of medals won irrespective of geography. These are interesting design decisions and while I have reservations about using bubbles, the exact figures are provided as one moves his or her mouse over the bubble.
For the medal count, we see another table. The numbers are cleanly presented with small flags to the left, and without the bars of Sports Illustrated. Like both CNN and the BBC, one can dive deeper into a nation’s particular results by clicking on that nation. Yet, the experience of clicking into a nation is much smoother with the Times than with the others. No additional pages load and I am not watching my screen jump about to and fro. Instead, the additional information loads in a panel to the right of my click. It is a quick and clean transition that does not leave me frustrated by jumping browser windows.
Yet, the additional information loaded is quite text heavy. And with the exception of the headers for each sport, nothing in particular stands out. Not the sport, not the medal won, not the athlete who won the medal. While the quiet type is aesthetically pleasing, it lacks punch. Instead of reading the words gold, silver or bronze—since we are looking at the medals—perhaps just their icons could be used. Perhaps the athlete is in bold. I would just like something to which my eyes can be drawn in the table.
The New York Times effort is brilliant work. Clean and simple design brings out the information—even if such information is presented in the forms of bubbles and such. Tables are rather clean and easy to read. I think their effort a success. Not perfect, as I find the more detailed table insufficient in terms of visual distinction, but still a success.
All of the efforts are significant and worthy of mention. Each has some flaws and each has some strengths. And admittedly, each is more complex and detailed than I have described and commented upon here. But I have to stop somewhere.
The BBC has posted an article addressing the causes for the horrible death toll in the Haitian earthquake last month. Charts and data-driven graphics supplement the text and provide a parallel, though not synchronous, visual story.
I applaud an intensive use of graphics, especially data-driven graphics, to better relate a story. Perhaps especially because not everybody can learn something simply by reading it. Many among us are instead visual learners, and by incorporating graphics into stories, be they online news articles, printed magazine articles, or interactive experiences, we can reach a larger audience and hopefully inform, impact, and influence said audience.
The graphics for the BBC article do some things well and some things not so well. Firstly, as a series, the graphics are varied well enough to not be overly repetitive. We have two sets of bar charts, three sets of area-of-circle charts (I presume), and what Nathan Yau at FlowingData called the everything chart. And if we look at the placement of the charts within the article (B for bar, A for area, and E for everything) we see A–B–E–A–B–A. The placement of the graphic types within the structure for the visual component of the article is sufficiently varied not to bore the audience.
Secondly, the graphics are well done through the restrained use of colour. Each graphic is composed of tints of grey and red. Grey is largely used as the base colour for the graphic while red or its tints accent the key elements of the visual argument. For example, when comparing the number of people dead among the affected people, a single square is coloured red. In the larger areas of grey for China and Italy, that single square or single person, while not lost, is not necessarily readily important. Instead, it serves as a link across the countries for when people look at Haiti and that red square stands out among fourteen other grey squares.
On a few other fronts, however, the graphics stumble where, if a bit more attention had been paid, they could have really strode to the front of the pack.
From the perspective of representing data, I have three qualms with the graphics. The first is somewhat minor—for it may have well been intentional. Although one of the underlying points of the article is the sheer difference in scale of the impact, the Italian earthquake is often difficult to visually compare in detail next to the Chinese and Haitian earthquakes. Why? Because the Italian numbers are visually insignificant. Could this have been resolved in a different fashion? Yes. However, as aforementioned that could have been an intentional design decision and I only point it out as something worth considering.
Secondly, and more importantly, I have concerns about the area-of-the-circle charts. Chief among those concerns is that nowhere is it specified that we are looking at area instead of, as is sometimes the case, the radius of the circle. This error is significant and I shall illustrate below. If we take, for the sake of example, a circle of radius 5 units and compare it to a circle of radius 10 units, we have a difference of 5 radius units. Circle 2 is, by one measure, twice the size of Circle 1.
And while some compare radii of circles, others instead compare areas of circles, as I assume but cannot be certain of in this instance. However, if we take 5 units and 10 units as the radii of Circles 1 and 2 and compare their areas, we arrive at a difference of four times the size.
This is not a problem specific to this article, although it could have been partially solved by stating area comparison or providing a scale. Unfortunately, this problem appears with an unsettling degree of frequency. But the problem is also compounded because people are not the best judges of area. It is far easier for an individual to compare two bars and see one as twice the size than it is to compare the areas of two circles and see the same difference.
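The two readings of the same pair of circles can be checked with a short Python sketch. The helper names `circle_area` and `radius_for_value` are my own, and the second helper assumes the designer wants area, not radius, to carry the value.

```python
import math


def circle_area(radius: float) -> float:
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2


def radius_for_value(value: float, scale: float = 1.0) -> float:
    """Radius that makes a circle's AREA proportional to value (assumed goal)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return scale * math.sqrt(value)


r1, r2 = 5.0, 10.0  # Circle 1 and Circle 2 from the example above

radius_ratio = r2 / r1                          # reading the chart by radius
area_ratio = circle_area(r2) / circle_area(r1)  # reading the chart by area

print(f"By radius, Circle 2 is {radius_ratio:.0f}x Circle 1")
print(f"By area, Circle 2 is {area_ratio:.0f}x Circle 1")

# To show a value twice as large by area, the radius grows by sqrt(2), not 2.
growth = radius_for_value(2.0) / radius_for_value(1.0)
print(f"Radius growth for a doubled value: {growth:.3f}")
```

A reader who assumes radius sees a doubling where a reader who assumes area sees a quadrupling, which is why an unlabelled circle chart is ambiguous.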
The problem of comparing length to length times width for a datapoint has the fair counterpart of visual repetition aforementioned. One of the three circle charts could easily be replaced by a bar chart. The two others could also be replaced, albeit with a little more work. However, the article would then be visually boring for it would be replete with bar charts.
And so the problem of comparing areas is one to be handled carefully. This piece has some potentially serious problems in comparing data. However, the quick and easy fixes would make the piece less visually interesting and thus, perhaps, cause the reader to lose interest. And with interest lost, the reader clicks away and the article fails to make an impact. On a case-by-case basis, the designer must often make the decision between fidelity to data versus capturing the reader’s attention. And here, I think the pendulum swung just a bit too far to capture the reader’s attention.
My third major point of contention, however, concerns the data itself and an incongruity in the graphics. Understandably, the earthquake in Haiti is a recent event. And indeed, the rebuilding and thus the cost is a current and ongoing event. Therefore, nobody should expect any institution to have anything but cost estimates. The final graphic makes note of the fact that we have no hard figures and that all figures are estimates.
Yet, the graphic provides two visual estimates hinting at two substantially different figures. The straight cost is represented on a bar chart that states “estimated at several billion” but visually depicts $40 billion. The cost as relative to GDP includes an asterisk for a footnote that shows the cost as but $2 billion.
Using the estimates for an early comparison of the financial cost of the earthquake is perfectly valid so long as the numbers are noted as estimates, as they are here. However, one should use the same estimate consistently across the piece. Here, the different estimates can be used to imply that the straight cost is nearly 15 times more than Italy’s cost. But when the cost is compared relative to the economy, Haiti’s straight cost is $500 million less than Italy’s. If anything, the bar chart should either have had no bar for Haiti or have made use of the dashed red line at the $2 billion mark.
And so while I certainly have reservations about the graphics as a whole, I support the decision to use data-driven graphics to support the story and the argument.
I have measured out my life with coffee spoons…
—T.S. Eliot, The Love Song of J. Alfred Prufrock
Modern life in the Western world revolves around data that then becomes misinformation, disinformation, or, more rarely, information. In theory, we use this information to inform our decision-making process and then live fitter, happier lives. (Please hold all comments about how well the theory of Communism is working in the Soviet Union until the end.) However, for most of us, gleaning any kind of information from row after row after column after column of data is too laborious and too time consuming to be worth our time. And so most of us have turned to an emerging field that is known by many names but is perhaps best described as data visualisation.
Data can come in many different forms. It can be the gross domestic product of the United States. It can be the preferred term for carbonated beverages across the state of Pennsylvania. (Although I think we can all agree it is soda, not pop.) Data can be the route of Philadelphia’s regional rail lines. (And how late they happen to be on any given day.) Data can even be the price paid for a cup of tea in a bookstore.
And with all these different forms, one can have perhaps an even greater number of forms in which that data is visualised—either alone or in comparison to other datapoints. One can look at the GDP of the US as part of a bar chart against the GDPs of China and India. One can look at a map of Pennsylvania and see the barbaric lexicon near Pittsburgh in their preference for pop. One can look at a non-geographic map of SEPTA’s regional rail lines and all the stops of the Main Line. (And the lines coloured not by their destination but their on-time-ness.) One can even see the price of a cup of tea printed on a piece of paper in an itemised list. And each of those forms and the many, many others has inherent benefits and drawbacks. Each can be used appropriately. Or not. Some look good. Others not.
An interesting dividing line in this nascent field concerns the efficiency of any visualisation in communicating the data. Some statisticians argue for stripped down charts and tables with relatively little care put towards the aesthetics of the visualisation. On the other side, some designers care more for what many non-designers call the ‘prettification’ of data. That is to say, placing care towards the presentation and legibility of the visualisation perhaps at the expense of clearly communicating all the data. So where do I stand? Well damn it, Jim, I’m a designer, not a statistician. While I do believe in maintaining a level of fidelity to the data, that data must still be clearly communicated to reach the designated audience. The piece in question must still grab the attention of the audience enough to get them to delve into the layers of data.
The ultimate goal of this blog is to examine data visualisations, information graphics, charts, whatevers and break them down to see what may have been done well and not so well—all from my aforenoted perspective. Naturally, some may disagree and I wholly encourage dissent from the party line. My party line. If you find something interesting, please send it my way.
Modern life in the Western world revolves around data. Data that we create each and every day. We are aware of some of the data we create while we remain ignorant of some other data. We create it through every click of the mouse. Every swipe of a card. Every change of a channel. Every mobile phone dialed. Each and every day our lives are broken down into rows and columns. We may no longer measure our lives in coffee spoons, but we measure them nonetheless.