Discontinuous Lead Bars

Last week the Guardian published an article about drinking water pollution across the United States. Overall, it was a nicely done piece and the graphics within segmented the longer text into discrete sections. Each unit looks similar:


The left focuses on a definition and provides contextual information. It includes small illustrations of the mechanisms by which the pollutant enters the water system. To the right is a chart showing the levels of the contamination detected in the 120 tests the Guardian (and its partner Consumer Reports) conducted.

In almost all of the charts, we see the maximum depicted on the y-axis. And the bars are coloured if that observation station exceeds the health and safety limits. (The limit is represented by the dotted line.)

But towards the end of the piece we get to lead, a particularly problematic pollutant. There is no safe level of lead contamination. But how the piece handles the lead chart leaves a bit to be desired.

But how bad is it, really?

The first thing is colour, but that’s okay. Everything is red, but again, there is no safe level of lead so everything is over the limit. But look at the y-axis. That little black line at the top indicates a discontinuity in the lines, in other words the values for those three observations are literally off the chart.

But does that work?

First, this kind of thing happens all the time. If you ever have to work with data on either China or India, you’ll often find those two nations, due to their sheer demographic size, skew datasets that involve people. But in these kinds of situations, how do we handle off-the-charts data points?

There is value in including those points. It can show how extreme an outlier those observations truly are. In other words, it can help with data transparency, i.e. you’re not trying to hide data points that don’t fit the narrative with which you’re working.

In this piece, it’s never explicitly stated what the largest value in the data set is, but I interpret it as being 5.8. So what happens if we make a quick chart showing a value of 6 (because it’s easier than 5.8)? I added a blue bar to distinguish it from the rest of the chart.

It’s pretty bad.

You can see that including the data point drastically changes how the chart looks. The number falls well outside the graphic, but it also shows just how dangerously high that one observation truly is.

But if you say, well yeah, but that falls outside the box allowed by the webpage, you’re correct. There are ways it could be handled to sit outside the “box”, but that would require some extra clever bits. And this isn’t a print layout where it’s much easier to play with placement. So what happens when we resize that graphic to fit within its container?

And resized

You can see that all the other bars become quite small. And this is probably why the designers chose to break the chart in the first place. But as we’ve established, in doing so they’ve minimised the danger of those few off-the-charts sites as well as left off context that shows how for the vast majority of sites, the situation is not nearly as dire, though, again, no lead is good lead.
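To put rough numbers on that shrinkage, here is a minimal sketch of the linear scaling a bar chart performs. The readings and the 200-pixel container height are my own assumptions for illustration, not values from the Guardian's data or layout; only 0.5 and 6 echo figures mentioned in this post.

```python
def bar_heights(values, axis_max, chart_height_px):
    """Scale each value linearly to a pixel height for the given axis maximum."""
    return [round(v / axis_max * chart_height_px, 1) for v in values]

# Hypothetical readings in ppb; only 0.5 and 6 echo figures mentioned in the post.
readings = [0.1, 0.25, 0.5, 6.0]
CHART_HEIGHT_PX = 200  # assumed container height, not taken from the Guardian's layout

# Let the axis run all the way up to the outlier.
full_axis = bar_heights(readings, axis_max=6.0, chart_height_px=CHART_HEIGHT_PX)
for value, height in zip(readings, full_axis):
    print(f"{value} ppb -> {height} px")
```

Under these assumptions a 0.5 ppb site gets fewer than 17 of the 200 pixels, and anything smaller is barely visible, which is the trade-off the designers were likely trying to avoid.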

What else could have been done? If maintaining the height of the less affected bars was paramount, the designers had a few other options they could have used. First, you could exclude those observations and perhaps put a line below the 118 text that says “for three sites, the data was off the charts and we’ve excluded them from the set below.”

I have used that approach in the past, but I use it with great reluctance. You are removing important outliers from the data set and the set is not complete without them. After all, if you are looking to use this data set to inform a policy choice such as, which communities should receive emergency funding to reduce lead levels, I’d want to start with the city in blue. Sure, I would like everyone to get money, but we’d have to prioritise resources.

I think the best compromise here would have actually been a small tweak to the original. Above the three bars that are broken (or perhaps to the right with some labelling), label the discontinuous data points to provide clearer context to the vast majority of the sites, which are below 0.5 ppb.

As easy as ABC

This preserves the ability to easily compare the lower level observations, but provides important context of where they sit within the overall data set by maintaining the upper limits of the worst offenders.
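As a rough sketch of that tweak, the logic amounts to clipping any bar that exceeds the axis maximum and attaching a label with its true value. The function name, the readings, and the 0.5 ppb axis maximum are all hypothetical choices of mine; I do not know how the Guardian built its chart.

```python
def clip_and_label(values, axis_max):
    """Return (drawn_height, label) pairs for a chart with a broken y-axis.

    Values above axis_max are drawn at axis_max and labelled with their
    true value; all other bars are drawn as-is with no label.
    """
    bars = []
    for v in values:
        if v > axis_max:
            # Off-the-chart bar: draw it to the top and state the real number.
            bars.append((axis_max, f"{v:g} ppb"))
        else:
            bars.append((v, None))
    return bars

# Hypothetical readings in ppb, with an assumed axis maximum of 0.5.
bars = clip_and_label([0.1, 0.4, 6.0], axis_max=0.5)
for height, label in bars:
    print(height, label or "")
```

The small bars keep their full height for comparison, while the label carries the information that the broken bar alone would hide.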

Credit for the piece goes to the Guardian’s graphics department.

Merging of the States

Dorian now speeds away from Newfoundland and into the North Atlantic. We looked at its historic intensity last week. But during that week, with all the talk of maps and Alabama, I noted to myself a map from the BBC that showed the forecast path.

Did New Jersey eat Delaware?

But note the state borders. New Jersey and Delaware have merged. Is it Delawarsey? And what about Maryland, Virginia, and the District of Columbia? Compare that to this map from the Guardian.

Here the states are intact

What we have are intact states. But, and it might be difficult to see at this scale, the problem may be that it appears the BBC map is using sea borders. I wonder if the Delaware Bay, which isn’t a land border, is a reason for the lack of a boundary between the two states. Similarly, is the Potomac River and its estuary the reason for a lack of a border between Virginia, Maryland, and DC?

I appreciate that land shape boundary files are easy, but they sometimes can mislead users as to actual land borders.

Credit for these pieces goes to the BBC graphics department and the Guardian graphics department.

Urban Boom Towns

Today we look at a piece from the Guardian about the blossoming of some cities from, essentially, out of nowhere. Think of how there is really no reason for Las Vegas or Phoenix to exist: cities of hundreds of thousands situated smack in the middle of the desert. But most of these new growth cities, cities from scratch as the Guardian calls them, are sprouting in Africa and Asia.

The piece uses two pretty straightforward graphics to show the scale of the growth problem.

A lot of urban area growth is yet to come.

I don’t love the area chart, but even for all its flaws, it is still massively obvious just how much Africa will contribute to population growth in the coming decades. And the line chart, which I find far more effective despite its borderline spaghetti-ness, shows just how much of that growth will likely be urban in nature.

But the stars of the piece, which you will need to click over to the original article to enjoy, are the motion graphics. They capture year-by-year the satellite views showing how the cities have grown from almost nothing. This is a screen capture of Ordos, China. But go back a couple of years and it’s almost an empty desert.

Check this out from decades ago and you’ll see nothing.

Credit for the piece goes to Antonio Voce and Nick Van Mead.

The Rent Is Too Bloody High

This is a graphic from the Guardian that sort of mystified me at first. The article it supports details how the rising rents across England are hurting the rural youth so much that they elect to stay in their small towns instead of moving to the big city.

But all those segments?

The first thing I noticed is that there really is no description of the data. We have a chart looking at something from 1997 and comparing it to 2018. The title is more of a sentence describing the first pair of bars. And from that title we can infer that these bars are income changes for the specified move, e.g. Sunderland to York, for the specified year. But a casual reader might not pick up on that casual description.

Then we have the issue of the bars themselves. What sort of range are we looking at? What is the min? The max? That too is implied by the data presented in the bars. Well, technically not the bars, but in the numbers at the end of each bar. I will spare you the usual rant about numbers in graphics defeating the purpose of graphics and organisation vs. visual relationship. Instead, the numbers here are essential because we can use them to suss out the scale of the grey bars. After looking at a few bars, we can tell that the white lines separating the grey boxes are most likely 10% increments. And from that we can gather the minimum is about -40% and the maximum 100%. But instead of making the reader work to figure this out, would not some min/max labels at the bottom of the chart be far clearer?

And then there is the issue of the grey boxes/bars themselves. Why are they there in the first place? If the dataset were more about an unmet value, say reservoirs in towns were only at x% of capacity, the grey bars could relate the overall capacity and the coloured bars the actual values. But here, income is not a capacity or similar type of value. It could expand well beyond the 100% or decline beyond the -40%. These bars imply the values are trapped within these ranges. I would instead drop the grey bars entirely and let the coloured bars exist on their own.

Overall this is a confusing graphic for a fascinating article. I wish the graphic had been a little bit clearer.

Credit for the piece goes to the Guardian’s graphics department.

Regional Power Plays

One of the things we missed covering last week whilst I was on holiday? The dust up in the Gulf of Oman, located near the Strait of Hormuz, where two foreign ships were attacked by mines or other explosive devices. The United States blames Iran and, of course, Iran denies it. The thing is, an inordinate amount of oil flows through the Strait, connecting the petroleum-driven economies of the West to the instability in the Middle East. Thankfully we have a graphic from the Guardian to explain just what is going on there.

Not shown: the US, the EU, China, and Russia

The above is a screenshot from the article, one of several graphics. There is a stacked bar chart showing the total volume of oil in transit, and the Strait’s share of it. Spoiler: it’s significant. We all know how I feel about stacked bars: not the biggest fan.

There are, of course, locator maps showing the locations of the attacked ships. We also have some photographs showing the damage inflicted upon the tankers, as well as some evidence of what the US claims are Iranian activities. (Side note: isn’t it great that when the US really wants the world to trust its intelligence agencies the White House has been doing nothing but trashing said intelligence agencies?)

The above, however, is a simple map showing the political fault line in the Middle East. It gets to the heart of the potential conflict here being not a US vs. Iran war, but a Saudi Arabia vs. Iran war. After all, relations between the Saudis and the Trumps have warmed significantly since the Obama administration. And not shown in the map is the role of Israel, which, again, has seen a significant warming in relations between Trump and Netanyahu, and which has also been quietly supporting Saudi Arabia in its undeclared war against Iran, to date fought only with proxies, most notably in Yemen.

In other words, the Middle East is a complicated and complex tinder box, built next to a few nuclear reactors, all of which just happen to sit atop vast reserves of oil and natural gas. So the best thing to do? Clearly start exploding things.

Credit for the piece goes to the Guardian graphics department.

The Entire United States Pt 2

Yesterday I wrote about the failure in a Politico piece to include Alaska and Hawaii in a graphic depicting the “entire” United States. After I had posted it, I recalled an article I read in the Guardian that looked at the shape of the United States, using the term “logo map”. It compared what many would consider the logo map to the actual map of the United States.

Still no New Zealand…

I warn you, it is a long read. But it was worth it to try and reframe the idea of what does the United States look like?

Credit for the piece goes to the Guardian graphics department.

Another Week, Another Brexit Vote

Yet again, we are poised to watch the British House of Commons this week as it votes on several key pieces of Brexit-related legislation. In short, MPs are set to vote on Prime Minister Theresa May’s Brexit deal. Again. Basically the same one that MPs rejected by a historic margin last month. The question is will they vote against it again?

Thankfully the Guardian put together a graphic explaining what will happen now as a flow chart.

So many votes…

So get ready for a week of fascinating votes.

Credit for the piece goes to the Guardian graphics department.

Arctic Chill

The Arctic air mass that has frozen the Midwest continues to spread and so today will be a tad chilly in Philadelphia. Yesterday, however, the Guardian had a piece that used data from NASA to show how the air masses over the Northern Hemisphere have been disturbed by unusually warm air.

The Arctic plunge.

One theory of how this all works is that the reduced polar sea ice means water absorbs summer heat instead of being locked in the ice. But then that heat is basically released come winter. (I’m oversimplifying this.) That warms the air, which disturbs the polar vortex. As the Guardian then explains, the destabilised air mass can wobble and spill some of its frigid air down into the lower latitudes. (It takes a little while because the polar vortex is in the upper atmosphere and the air needs to sink to the ground.)

Point is, bundle up and stay warm.

Credit for the piece goes to the Guardian graphics department.

Asteroids on the Moon

I hope everybody enjoyed their holiday. But, before we dive back into the meatier topics of the news, I wanted to share this serpentine graphic from the Guardian I discovered last week. Functionally it is a timeline charting the size of 96 known large asteroid impact craters on the Moon, between 80°S and 80°N.

Impacts on the Moon

The biggest question I have is whether the wrapping layout is necessary. I would prefer a simpler and more straightforward, well, straight timeline, but I can imagine space constraints forcing the graphic into this box, whether for the digital version, the likely print version, or both.

The transparencies help to give a sense of density to the strikes, especially in the later years. And the orange ones highlight important or well-known craters like Tycho.

I do wonder, however, if the designer could have added a line at the 290 million years point. Since the graphic’s title calls that year out in particular, it might help the audience more quickly grasp the graphic’s…impact. In theory, the reader can more or less figure it out from the highlighting of the Ohm impact crater that is listed as 291 million years old. But a small grey line like those for the 250 million year increments could have been a nice little touch.

Overall, however, it’s nice to see a compact and helpful space graphic.

Credit for the piece goes to the Guardian graphics team.

What’s Next, Brexit?

A no confidence vote on Theresa May’s government, that’s what.

For those not familiar with parliamentary democracies, basically a no confidence vote is held when a substantial number of members of parliament have just that, no confidence, in the government of the day. The legislative body then votes and if the government wins, the government stays in power. If the government loses, typically, though not always, a new election is held to create a distribution of seats—it’s thought—that will yield a government that can hold the confidence. (There really is not an analogy for this in the US government that I can think of.)

To be fair, nobody really expects May’s government to collapse this afternoon. The Tories and the Democratic Unionist Party (a small Northern Irish party supporting the government) do not want to hold new elections, nor do they want to give the Leader of the Opposition, Jeremy Corbyn, the chance to form his own government, as much as they might despise May and her Brexit deal. So in all likelihood May survives by a dozen or so votes. On the other hand, the result yesterday was surprising in its scale, so could twenty or so of the 118 Tories who voted no then vote against May now? Possibly.

So then what next? Thankfully the Guardian put together two calendars showing just what happens and, crucially, in the context of how much time remains until the UK crashes out of the EU.

In case she wins, as we expect.

It still doesn’t leave a lot of time to figure out what to do.

If she loses, which is possible, but unlikely.

The UK would have even less time in this scenario.

The key thing to note is that the election campaign would eat up most of the time left and leave the UK very little time to do anything but ask the EU for an extension.

These are two small, but really nicely done graphics.

Credit for the piece goes to the Guardian’s graphics department.