Where’s My (State) Stimulus?

Here’s an interesting post from FiveThirtyEight. The article explores where different states have spent their pandemic relief funding from the federal government. The nearly $2 trillion relief package included a $350 billion block grant given to the states, to do with as they saw fit. After all, every state has different needs and priorities. Huzzah for federalism. But where has that money been going?

Enter the bubbles.

I mean bubbles need water distribution systems, right?

This decision to use a bubble chart fascinates me. We know that people are not great at differentiating between areas. That’s why bars, dots, and lines remain the most effective forms of visually communicating differences in quantities. And as with the piece we looked at on Monday, we don’t have a legend that informs us how big the circles are relative to the dollar values they represent.

And I mention that part because what I often find is that with these types of charts, designers simply say the width of the circle represents, in this case, the dollar value. But the problem is we don’t see just the diameter of the circle, we actually see the area. And if you recall your basic maths, the area of a circle = πr². Double the width and the area quadruples. In other words, the designer is showing you far more than the value you want to see and it distorts the relationship. I am not saying that is what is happening here, but that’s because we do not have a legend to confirm that for us.
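To put rough numbers on that distortion, here is a minimal Python sketch. The two values, the maximum radius, and the function names are all made up for illustration; nothing here comes from the FiveThirtyEight data or describes how their chart was actually built.

```python
import math


def radius_width_scaled(value, max_value, max_radius):
    """Radius when the circle's width (diameter) is proportional to the value."""
    return max_radius * value / max_value


def radius_area_scaled(value, max_value, max_radius):
    """Radius when the circle's area is proportional to the value."""
    return max_radius * math.sqrt(value / max_value)


def circle_area(radius):
    """Area of a circle, pi * r^2."""
    return math.pi * radius ** 2


# Hypothetical figures: one allocation is exactly twice the other.
small, large = 1.0, 2.0
max_radius = 10.0  # radius given to the largest value, in arbitrary units

# How many times bigger the larger bubble looks (by area) under each approach.
width_ratio = (circle_area(radius_width_scaled(large, large, max_radius))
               / circle_area(radius_width_scaled(small, large, max_radius)))
area_ratio = (circle_area(radius_area_scaled(large, large, max_radius))
              / circle_area(radius_area_scaled(small, large, max_radius)))

print(f"width-scaled: the larger bubble has {width_ratio:.1f}x the area")
print(f"area-scaled:  the larger bubble has {area_ratio:.1f}x the area")
```

With width scaling, a value twice as large is drawn with four times the area. With area scaling the bubble is twice as large, which is what the reader assumes they are seeing.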

This sort of piece would also be helped by limited duty interactivity. Because, as a Pennsylvanian, I am curious to see where the Commonwealth is choosing to spend its share of the relief funds. But there is no way at present to dive into the data. Of course, if Pennsylvania is not part of the overall story (and it’s not) then an inline graphic need not show the Keystone State. In these kinds of stories, however, I often enjoy an interactive piece at the end wherein I can explore the breadth and depth of the data.

So if we accept that a larger interactive piece is off the table, could the graphic have been redesigned to show more of the state level data with more labelling? A tree map would be an improvement over the bubbles because judging the length and height of a rectangle is easier than judging the size of a circle, though it still presents the area problem. What a tree map allows is inherent grouping, so one could group by either spending category or by state.

I would bet that a smart series of bar charts could work really well here. It would require some clever grouping and probably colouring, but a well structured set of bars could capture both the states and categories and could be grouped by either.

Overall a fascinating idea, but I’m left just wanting a little more from the execution.

Credit for the piece goes to Elena Mejia.

Water, Water Everywhere Nor Any Drop to Drink Part II

Yesterday we looked at the New York Times coverage of some water stress climate data and how some US cities fit within the context of the world’s largest cities. Well today we look at how the Washington Post covered the same data set. This time, however, they took a more domestic-centred approach and focused on the US, but at the state level.

Still no reason to move to the Southwest

Both pieces start with a map to anchor the piece. However, whereas the Times began with a world map, the Post uses a map of the United States. And instead of highlighting particular cities, it labels states mentioned in the following article.

Interestingly, whereas the Times piece showed areas of No Data, including sections of the desert southwest, here the Post appears to be labelling those areas as “arid area”. We also see two different approaches to handling the data display and the bin ranges. Whereas the Times used a continuous gradient, the Post opts for a discrete gradient, with sharply defined edges from one bin to the next. Of course, a close examination of the Times map shows how they used a continuous gradient in the legend, but a discrete application. The discrete application makes it far easier to compare areas directly. With a continuous gradient, areas with relatively close values are, almost by definition, harder to tell apart.
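The difference between the two approaches comes down to whether a value is first sorted into a bin before it gets a colour. As a minimal Python sketch of the discrete approach: the break points and bin labels below are hypothetical, chosen only for illustration, and are not taken from either the Post or the Times graphic.

```python
from bisect import bisect_right

# Hypothetical bin edges for a stress score between 0 and 1 (an assumption).
BREAKS = [0.1, 0.2, 0.4, 0.8]
# One label per bin: always one more label than there are break points.
LABELS = ["low", "low-medium", "medium-high", "high", "extremely high"]


def classify(score):
    """Return the label of the discrete bin that a score falls into."""
    return LABELS[bisect_right(BREAKS, score)]


# Two close values on either side of an edge get visibly different bins,
# while two distant values inside one bin are drawn identically.
for score in (0.41, 0.79, 0.81):
    print(score, classify(score))
```

That is the trade-off of binning. Neighbouring areas in different bins become easy to tell apart, but any variation within a bin disappears.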

The next biggest distinguishing characteristic is that the Post’s approach is not interactive. Instead, we have only static graphics. But more importantly, the Post opts for a state-level approach. The second graphic looks at the water stress level, but then plots it against daily per capita water use.

California is pretty outlying

My question is from the data side. Whence does the water use data come? It is not exactly specified. Nor does the graphic provide any axis limits for either the x- or the y-axis. What this graphic did make me curious about, however, was the cause of the high water consumption. How much consumption is due to water-intensive agricultural purposes? That might be a better use of the colour dimension of the graphic than tying it to the water stress levels.

The third graphic looks at the international dimension of the dataset, which is where the Times started.

China and India are really big

Here we have an interesting use of area to size population. In the second graphic, each state is sized by population. Here, we have countries sized by population as well. Except the note at the bottom of the graphic states that neither China nor India is sized to scale. And that makes sense since both countries have over a billion people. But if the graphic is trying to use size in the one dimension, it should be consistent and make China and India enormous. If anything, it would show the scale of the problem of being high stress countries with enormous populations.

I also like how this graphic, while static in nature, breaks each country into a regional classification based upon the continent where the country is located.

Overall this, like the Times piece, is a solid graphic with a few little flaws. But the fascinating bit is how the same dataset can create two stories with two different foci. One with an international flavour like that of the Times, and one with a domestic flavour like that of the Post.

Credit for the piece goes to Bonnie Berkowitz and Adrian Blanco.

Post-Brexit Trading

Off of yesterday’s piece looking at the potential slowdown in British economic growth post-Brexit, I wanted to look at a piece from the Economist exploring the state of the UK’s current trade deals.

Still loathe the use of bubbles though…

I understand what is going on, with the size of the bubbles relating to British exports and the colour to the depth of the free trade deal, i.e. how complex, thorough, and wide-ranging. But the grouping by quadrant?

With trade, geographical proximity is a factor. Things that come from farther cost more because fuel, labour time, &c. One of the advantages the UK currently has is the presence of a massive market on its doorstep with which it already has tariff- and customs-less trade—the European Union.

Consequently, could the graphic somehow incorporate the element of distance? The problem would be how to account for routes, modes of transport, time—how long does a lorry have to queue at the border, for example. Alas, I do not have a great answer.

Regardless of my concepts, this piece does show how the most valuable trade partners already enjoy the deepest and largest trade deals, all through the European Union. And so the UK will need to work to replicate those deals with all of these various countries.

Credit for the piece goes to the Economist Data Team.

An Ailing Graphic on the Healthcare Labour Force

I know I have said it before, but I like the increasing number of graphics-led articles published by Politico. Many policy and politics stories are driven—or should be driven—by data. But, myself included, we cannot hit it out of the park at every plate appearance. And that is what we have from Politico today, actually last week.

The graphic focuses on the healthcare industry and its need for a larger labour force in coming years as the baby boomers continue to age and start to retire. If their own doctors retire along with them, who will be their new doctors?

But there are two components of the graphic on which I want to focus. The first is the projection of the number of registered nurses (RNs) in 2024 compared to a 2014 baseline.

We need more. Just more.

The story focuses on the future condition, but that colour is set to the lighter green thus drawing the reader’s eyes to the 2014 data point. Flipping those two colours would shift the focus of the chart to the 2024 timeframe, which would better match the text above.

Then we have the design decision to include a line chart for the growth rate, presumably total, for each category of RN from 2014 to 2024. The problem is that the chart itself does not sit on any baseline. While I do not care for the dual axis chart, that format at least keeps an axis legend on the right side of the chart. (You still have the problem of implying certain things based on what scale you choose to use relative to the first data series.) Here, because there are no chart lines associated with the growth data, I wonder if a table below the x-axis labels would be more efficient? Home health care, a very small category, will have the highest growth (a small change from a small base will beat the same small change or even slightly bigger changes from a far larger base) but the eye has the furthest to travel to reach the 61% number from the top of the bars or the labelling.
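The small-base effect in that parenthetical is simple arithmetic, sketched here in Python. Only the 61% figure comes from the graphic. The job counts are invented round numbers for illustration, not Politico’s data.

```python
def percent_growth(start, end):
    """Percentage change from start to end."""
    return (end - start) * 100 / start


# Hypothetical small category: 61 new jobs on a base of 100.
small_base = percent_growth(100, 161)
# Hypothetical large category: 100 new jobs on a base of 2,000.
large_base = percent_growth(2000, 2100)

print(f"small category: {small_base:.0f}% growth from 61 new jobs")
print(f"large category: {large_base:.0f}% growth from 100 new jobs")
```

The larger category adds more jobs in absolute terms, yet its growth rate is a fraction of the smaller one’s. That is why reading the growth figures next to the bars matters.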

The other component I wanted to discuss is the scatter plot that compares the number of jobs to their average salary.

Bursting these bubbles…

But this is a bubble chart, not a scatter plot, and so we have a third variable encoded in the size of the dot/bubble. The first thing I looked for was a scale for the size of the circles. What magnitude is the RN circle vs. the Personal Care Aides circle? There is none, but unfortunately that seems to be a common practice with bubble charts. But after failing to find that, I noticed that the circles decrease in size from right to left. That was when I looked to the legend and saw the y-axis in numbers of jobs and the x-axis in average salary. But then the circles are sized in proportion to the average salary of each profession relative to the others. In other words, the circles are basically re-plotting the x-axis. The physical therapist circle should be roughly twice as large, by area, as the vocational nurses circle. But we can also just see that from the x-axis coordinates. The bubble chart-ness of the chart is unnecessary and the data could be told more clearly by stripping that away and making a straight-up scatter plot where all the circles are sized the same.

Credit for the piece goes to Christina Animashaun.

Trump’s Wall

Another day, another story about the administration to cover with data-driven graphics. We are approaching Trump’s 100th day in office, traditionally the first point at which we examine the impact of the new president. And well, beyond appointing a Supreme Court justice, it is hard to find a lot of things President Trump has actually done. But on his 99th day, he will also need to approve a Congressional bill to fund the government, or else the government shuts down on his 100th day. Not exactly the look of a successful head of state and government.

Why do I bring this up? Well, one of the many things that may or may not make it into the bill is funding for Trump’s wall that Mexico will pay for, but at an undetermined later date, because he wants to get started building the wall early, but late because he promised to start on Day 1.

Several weeks ago the Wall Street Journal published a fantastic piece on the current wall bordering Mexico. It examines the current state of fencing and whether parts of the border are fenced or not. It turns out a large portion is not. But, the piece goes on to explain just why large sections are not.

The wall today

You should read the full piece for a better understanding. Because while the president says building the wall will cost $10 billion or less, real estimates place the costs at double that. Plus there would be lawsuits because, spoiler: significant sections of the border wall would cross private property, national parks, and Native American reservations. Also the southern border crosses varied terrain from rivers to deserts to mountains, some lengths of which are really difficult to build walls upon.

But the part that I really like about the piece is this scatter plot that examines the portion of the border fenced vs. the number of apprehensions. It does a brilliant job of highlighting the section of the border that would benefit most significantly from fencing, i.e. a sector with minimal fencing and a high number of apprehensions: the Rio Grande Valley.

Where would more fencing make more of a difference

And to make that point clear, the designers did a great job of annotating the plot to help the reader understand the plot’s meaning. As some of my readers will recall, I am not a huge fan of bubble plots. But here there is some value. The biggest bubbles are all in the lower portion of fenced sectors. Consequently, one can see that those rather well-fenced sectors would see diminished returns by completing the wall. A more economical approach would be to target a sector that has low mileage of fencing, but also a high number of apprehensions—a big circle in the lower right of the chart. And that Rio Grande Valley sector sits right there.

Overall, a fantastic piece by the Wall Street Journal.

Credit for the piece goes to Stephanie Stamm, Renée Rigdon, and Dudley Althaus.

Hans Rosling Has Died

It’s easy to miss the news these days. But as a designer who does a lot of work—and writes a blog about—data visualisation and information design, I was fortunate to catch the word that Hans Rosling died. You might know him best from his TED talks, but I became familiar with him through his Gapminder project.

Mind the gap, please.

Do I agree with the design decisions? Of course not, just ask anyone who has asked me anything about bubble charts. But that is not the point. He and others laid the groundwork for myself and those newer to the field to work on the presentation of data, and its integration into analysis.

Unfortunately his death comes at a time when the field of data visualisation comes under threat. Not from the Chinese stealing our jobs, or robots doing them better for cheaper, but from those who assail the veracity of data and fact itself.

It’s easy to joke about alternative facts and alternative data (I do it on an almost daily basis now). But Rosling knew that accepting facts, even if unpleasant or challenging to your view on things, was critical to public discourse. To quote from Claire Provost of the Guardian, who interviewed him in 2013:

“Rosling stood for the exact opposite – the idea we can have debates about what could or should be done, but that facts and an open mind are needed before informed discussions can begin.”

Hans Rosling, dead at the age of 68.

Credit for the piece goes to Hans Rosling.

Basketball Finals

So the basketball finals begin tonight with the Cleveland Cavaliers taking on the Golden State Warriors. This is also the part of the post where I fully admit I know almost nothing about basketball. I did, however, catch this so-labelled infographic from ESPN contrasting the two teams.

Point differential

What I appreciate about this piece is that ESPN labelled it an infographic. And while the data might be at times light, this is more a data-rich experience than most infographics these days. Additionally the design degrades fairly nicely as your browser reduces in size.

The chart formats themselves are not too over-the-top (that seemed like a decent basketball pun when I typed it out) with bar, line, and scatter plots. Player illustrations accent the piece, but do not convey information as data-encoded variables. I quibble with the rounded bar charts for the section on each team’s construction, but the section itself is fascinating.

I might not know most of the metrics’ definitions, but I did not mind reading through the piece.

Go Red Sox.

Credit for the piece goes to Luke Knox and Cun Shi.

Growth of Inland Cities

Some of the nation’s fastest growing cities are inland, away from the coast where housing prices are high. To support an article about the demographic shift, the New York Times created this map. Circle size represents growth over a six-year period while the colour of the bubble represents housing prices.

Fastest growing cities

Credit for the piece goes to the New York Times graphics department.

Coal vs. the Great Barrier Reef

Your humble author is away this week. But the Great Barrier Reef in Australia is still here. For now. The Guardian takes a look at the growing threat to the World Heritage site from the coal industry in Queensland, Australia. The author takes you through the narrative in a chapter format, using charts and maps to illustrate the points in the brief bit of text. A really nice job altogether.

Major ports and their volume

Credit for the piece goes to Nick Evershed.

The Curse(s) of the CEOs

It’s Friday, so we should try to take things a bit lighter. For me that usually means knocking back a drink or two and a swear-y exultation about it being the end of the work week. But, it turns out, I’m just trying to emulate our captains of industry. Bloomberg has gone through company conference calls and tabulated the number of swear words used and charted the results. And for fun, you can read some of the excerpts.

They'll swear by it

Credit for the piece goes to David Ingold, Keith Collins, and Jeff Green.