Warmer, Wetter Winters in the UK

I remember hearing and reading stories as a child about the Thames in London freezing over and hosting winter festivals. Of course most of that happened during what we call the Little Ice Age, a period of below average temperatures during the 15th through the early 19th century.

But those days are over.

The UK’s Meteorological Office, the Met Office for short, released some analysis of the impact of climate change on winter temperatures in the United Kingdom. And if, like me, you’re more partial to winter than summer, the news is…not great.

Winter warming

Broadly speaking, winters will become warmer and wetter, i.e. less snowy and more rainy. Meanwhile summers will become hotter and drier. Farewell, frost festivals.

But let’s talk about the graphic. Broadly, it works. We see two maps with a unidirectional stepped gradient of six bins. And most importantly those bins are consistent between the maps, allowing for the user to compare regions for the same temperatures: like for like.

But there are a couple of things I would probably do a bit differently. Let’s start with colour. And for once we’re not dealing with the colour of the BBC weather map. Instead, we have shades of blue for the data, but all sitting atop an even lighter blue that represents the waters around the UK and Ireland. I don’t think that blue is really necessary. A white background would allow for the warmest shade of blue, +4ºC, to be even lighter. That would allow greater contrast throughout the spectrum.

Secondly, note the use of thick black lines to delineate the sub-national regions of the UK whilst the border of the Republic of Ireland is done in a light grey. What if that were reversed? If the political border between the UK and Ireland were black and the sub-national region borders were light grey—or white—we would see greater contrast with less visual disruption. Lines of lighter intensity would allow the eye to better focus on the colours of the map.

Then we reach an interesting discussion about how to display the data. If the purpose of the map is to show “coldness”, this map does it just fine. For my American audience unfamiliar with Celsius, 4ºC is about 39ºF, and many of you would definitely say that’s cold. (I wouldn’t, because like many of my readers, I spent eight winters in Chicago.)

The article touches upon the loss of snowy winters. And by and large, snowy winters require temperatures below the freezing point, 0ºC. So what if the map used a bidirectional, divergent stepped gradient? Say temperatures above freezing were represented in shades of a different colour, like red, whilst those below freezing remained in blue. You could then easily see which regions of the UK would have their lowest temperatures fail to fall below freezing.
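For what it’s worth, that divergent, stepped approach is easy to prototype. A minimal sketch in Python with matplotlib, using made-up bin edges and temperatures, not the Met Office’s actual data:

```python
from matplotlib import colormaps
from matplotlib.colors import BoundaryNorm

# Hypothetical bin edges in ºC; 0 is the freezing-point split
bounds = [-8, -6, -4, -2, 0, 2, 4]
cmap = colormaps["RdBu_r"]           # divergent: blue below freezing, red above
norm = BoundaryNorm(bounds, cmap.N)  # assign each temperature to one of six bins

# Hypothetical lowest annual temperatures for a few regions
temps = {"Highlands": -6.5, "Midlands": -2.3, "Cornwall": 1.8}
colours = {region: cmap(norm(t)) for region, t in temps.items()}
```

Any region whose lowest temperature sits above 0ºC lands in a red bin, which answers the fails-to-fall-below-freezing question at a glance.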

Another way of looking at the data is through the lens of absolute vs. change. This graphic compares the lowest annual temperature. But what if we instead had only one map, coloured by the change in temperature? Then you could see which regions are being the most (or least) impacted.

If the data were isolated to specific and discrete geographic units, you could take it a step further and then compare temperature change to the baseline temperatures and create a simple scatterplot for the various regions. You could create a plot showing cold areas getting warmer, and those remaining stable.
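Sketching that scatterplot is straightforward too. Here is a rough version with hypothetical regions and invented numbers, just to show the shape of the plot:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Hypothetical (baseline lowest temperature ºC, projected warming ºC) per region
regions = {
    "Scottish Highlands": (-8.2, 3.1),
    "Pennines": (-6.0, 2.4),
    "Midlands": (-4.1, 1.9),
    "Cornwall": (-1.5, 0.6),
}

fig, ax = plt.subplots()
for name, (baseline, change) in regions.items():
    ax.scatter(baseline, change)
    ax.annotate(name, (baseline, change), fontsize=8)
ax.set_xlabel("Baseline lowest annual temperature (ºC)")
ax.set_ylabel("Projected change (ºC)")
fig.savefig("uk_warming_scatter.png")
```

Cold regions warming fastest would cluster towards the upper left; stable regions would sit along the bottom.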

That said, this is still a really nice piece. Just a couple little tweaks could really improve it.

Credit for the piece goes to the UK Met Office.

Biden’s Biggest Pyramids

Yesterday we looked at an article from the Inquirer about the 2020 election and how Biden won because of increased margins in the suburbs. Specifically we looked at an interactive scatter plot.

Today I want to talk a bit about another interactive graphic from the same article. This one is a map, but instead of the usual choropleth—a form the article uses in a few other graphics—here we’re looking at three-dimensional pyramids.

All the pyramids, built by aliens?

Yesterday we talked about the explorative vs. narrative concept. Here we can see something a bit more narrative in the annotations included in the graphic. These, however, are only a partial win. They call out the greatest shifts, which are indeed mentioned in the text. But then in another paragraph the author writes about Bensalem and its rightward swing. There is no callout of Bensalem on the map.

But the biggest things here, pun intended, are those pyramids. Unlike the choropleth maps used elsewhere in the article, the first thing this map fails to communicate is scale. We know the colour means a county’s net shift was either Democratic or Republican. But what about the magnitude? A big pyramid likely means a big shift, but is that big shift hundreds of votes? Thousands of votes? How many thousands? There’s no way to tell.

Secondly, when we are looking at rural parts of Bucks, Chester, and Montgomery Counties, the pyramids are fine. They remain small and contained within their municipality boundaries. Intuitively this makes sense. Broadly speaking, population decreases the further you move from the urban core. (Unless there’s a secondary city, e.g. Minneapolis has St. Paul.) But nearer the city, we have more population, and we have geographically smaller municipalities. Compare Colwyn, Delaware County to Springfield, Bucks County. Tiny vs. huge.

In choropleth maps we face this problem all the time. Look at a classic election map at the county level from 2016.

Way back when…

You can see that there is a lot more red on that map. But Hillary Clinton won the popular vote by more than 3,000,000 votes. (No, I won’t rehash the Electoral College here and now.) More people are crowded into smaller counties than there are in those big, expansive red counties with far, far fewer people.

And that pattern holds true in the Philadelphia region. But instead of using the colour fill of an area as above, this map from the Inquirer uses pyramids. And we face the same problem: we see lots of pyramids in a small space. And the problem with the pyramids is that they overlap each other.

At a glance, you cannot see one pyramid behind another. At least in the choropleth, we see a tiny field of colour, but that colour is not hidden behind another.

Additionally, the way this is constructed, what happens if a municipality saw only a small net shift? The pyramid’s height will be minimal. But to determine the direction of the shift we need to see the colour, and if the area under the line creating the pyramid is small, we may be unable to see the colour. Again, compare that to a choropleth, where there would at least be a difference between, say, a light blue and a light red. (Though you could also bin the small differences into a single neutral bin collecting all small shifts, be they one way or the other.)
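That neutral-bin idea is simple to express in code. A sketch, with a hypothetical one-point threshold and made-up municipal shifts:

```python
def shift_bin(net_shift, neutral=1.0):
    """Bin a net shift in percentage points (+ = Democratic) into a category,
    collecting all small shifts into a single neutral bin."""
    if abs(net_shift) <= neutral:
        return "little change"
    return "Democratic shift" if net_shift > 0 else "Republican shift"

# Hypothetical net shifts for a few municipalities
shifts = {"Colwyn": 0.4, "Bensalem": -3.2, "Lower Merion": 6.8}
bins = {town: shift_bin(s) for town, s in shifts.items()}
```

A choropleth coloured by these three categories would never leave a reader squinting at a tiny pyramid to work out the direction of a shift.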

I really think that a more straightforward choropleth would more clearly show the net shifts here. And even then, we would still need a legend.

The article overall, though, is quite strong and a great read on the electoral dynamics of the Philadelphia region a month ago.

Credit for the piece goes to John Duchneskie.

Biden Won the Burbs

The thing with election results is that we don’t have the final numbers for a little while after Election Day. And that’s normal.

There are a few things I want to look at in the coming weeks and months once my schedule eases up a bit. But for now, we can use this nice piece from the Philadelphia Inquirer to look at a story close to home: the vote in the Philadelphia suburbs.

It’s all happening in the yellow.

I’ve already looked at some analysis like this for Wisconsin and I shared it on my social. But there I looked at the easy, county-level results. What the Inquirer did above is break down the Pennsylvania collar counties of Philadelphia, i.e. the suburbs, into municipality level results. It then plotted them 2020 vs. 2016 and the results were—as you can guess since we know the result—Biden beat Trump.

What this chart does well is colour the municipalities that Biden flipped yellow. It’s a great choice from a colour standpoint. As the third of the primary colours, with both blue and red already well represented, it easily contrasts with the Biden- and Trump-won towns and cities of the region. The colour is a bit “darker” than a full-on, bright yellow, but that’s because the designers recognised it needed to stand out on a white field.

Let’s face it, yellow is a great colour to use, but it’s difficult because it’s so light and sometimes difficult to see. Add just the faintest bit of black to your mix, especially if you’re using paints, and voila, it works pretty well. So here the designer did a great job recognising that issue with using yellow. Though you can still see the challenge, because even though it is a bit darker, look at how easy it is to read the text in the blue and the red. Now compare that to the yellow. So if you’re going to use yellow, you want to be careful how and when you do.

The other design decision here comes down to what I call the explorative vs. the narrative. Now, I don’t think explorative is a word—and the red squiggle agrees—but it pairs nicely with narrative. And I’ve been talking about this a lot in my field the last several weeks, especially offline. (In the non-blog sense, because obviously all my work is done online these days. Oh, how I miss my old office.)

Explorative works present the user with a data set and then allow them to, in this case, mouse over or tap on dots and reveal additional layers of information, i.e. names and specific percentages. The idea is not to tell a specific story, but show an overall pattern. And if the piece is interactive, as this is, potentially allow the user to drill down and tease out their own stories.

Compare that to the narrative; my Wisconsin piece I referenced above is more in this category. Here the work takes you through a guided tour of the data. It labels specific data points, be they on trend or outliers, and is sometimes more explicit in its analysis. These can also be interactive—though my static image is not—and allow users to drill down, and, critically, away from the story to see dots of interest, for example.

This piece is more explorative. The scatter plot naturally divides the municipalities into those that voted for Biden or for Trump, and those that shifted toward or away from Trump relative to 2016. The labels here are actually redundant, but certainly helpful. I used the same approach in my Wisconsin graphic.

But in my Wisconsin graphic, I labelled specific counties of interest. If I had written an accompanying article, they would have been cited in the textual analysis so that the graphic and text complemented each other. But here in the Inquirer, it’s a bit of a missed opportunity in a sense.

The author mentions places like Upper Darby and Lower Merion and how they performed in 2020 vis-a-vis 2016. But it’s incumbent on the user to find those individual municipalities on the scatter plot. What if the designer had created a version where the towns of interest were labelled from the start? The narrative would have been buttressed by great visualisations that explicitly made the same point the author wrote about in the text. And that is a highly effective form of communication when you’re not just telling, but also showing your story or argument.

Overall it’s a great article with a lot to talk about. Because, spoiler, I’m going to be talking about it again tomorrow.

Credit for the piece goes to Jonathan Lai.

Choose Your Own FiveThirtyEight Adventure

In case you weren’t aware, the US election is in less than a week, five days. I had written a long list of issues on the ballot, but it kept getting longer and longer so I cut it. Suffice it to say, Americans are voting on a lot of issues this year. But a US presidential election is not like many other countries’ elections in that we use the Electoral College.

For my non-American readers, the Electoral College, very briefly, was created by the country’s founding fathers (Washington, Jefferson, Adams, Franklin, et al.) to do two things. One, restrict selection of the American president to a class of individuals who theoretically had a broader/deeper understanding of the issues—but who also had vested interests in the outcome. The founders did not intend for the American people to elect the president. The second feature of the Electoral College was to prevent the largest states from dominating smaller states in elections. Why else would Delaware and Rhode Island surrender their sovereignty to join the new United States if Virginia, Pennsylvania, and New York were making all the decisions? (The founders went a step further and added the infamous 3/5 clause, but that’s another post.)

So Americans don’t elect the president directly, and larger states like California, New York, and Texas have slightly less impact per person than smaller states like Wyoming, Vermont, and Delaware. Each state is allotted a number of Electoral College votes and the key is to reach 270. (Maybe another time I’ll get into the details of what happens in a 269–269 tie.) Many Americans are probably familiar with sites like 270 To Win, where you can determine the outcome of the election by saying who won each state. But, even though the US election is really 50 different state elections, common threads and themes run through all those states, and if one candidate or another wins one state, it makes winning or losing other states more or less likely. FiveThirtyEight released a piece that attempts to link those probabilities and help reveal how the decisions voters in one state make may reflect how other voters decide.
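You can get a feel for that linkage with a toy simulation. This is emphatically not FiveThirtyEight’s model; it just adds one shared national error term to some hypothetical state leads, which is enough to make one state’s result shift every other state’s odds:

```python
import random

random.seed(0)

# Hypothetical Biden leads (percentage points) and Electoral College votes
states = {"AZ": (1.0, 11), "FL": (0.5, 29), "PA": (2.0, 20), "WI": (2.5, 10)}

def simulate(n=20_000):
    """Return (overall win probability, win probability given an FL win)."""
    wins = fl_wins = wins_given_fl = 0
    for _ in range(n):
        national = random.gauss(0, 3)  # shared error term: correlates all states
        ev = 0
        fl = False
        for state, (lead, votes) in states.items():
            if lead + national + random.gauss(0, 2) > 0:
                ev += votes
                if state == "FL":
                    fl = True
        won = ev >= 36  # a majority of these 70 hypothetical votes
        wins += won
        if fl:
            fl_wins += 1
            wins_given_fl += won
    return wins / n, wins_given_fl / max(fl_wins, 1)

overall, given_fl = simulate()
```

Because the error is shared, conditioning on a Florida win pulls every other state’s odds up with it, which is exactly the behaviour the FiveThirtyEight piece lets you explore.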

The interface is fairly straightforward—I’m looking at this on a desktop, though it does work on mobile—with a bunch of choices at the top and a choropleth map below. There we have a continuous divergent gradient, meaning the states aren’t grouped into like bins; instead we see incredibly subtle differences between similar states. (I should also point out that Maine and Nebraska are the two exceptions to my above description of the Electoral College. They divide their votes by congressional district: whoever wins a district gets that Electoral College vote, and then the state’s overall winner receives the remaining two votes.)

Below that we have a bar chart showing each state, its more or less likely winner, and the 270 threshold. Below that, we have what I’ve read and heard described as a ball plot. It represents runs of the simulation. As of Thursday morning, the current FiveThirtyEight model says Trump has an 11-in-100 chance of winning and Biden, conversely, an 89-in-100 chance.

But what happens when we start determining the winners of states?

Well, for my non-American readers, this election will feature a large number of voters casting their ballots early. (I voted early by mail, and dropped my ballot off at the county election office.) That’s not normal. And I cannot emphasise this next point enough. We may not know who wins the election Tuesday night or by the time Americans wake up on Wednesday. (Assuming they’re not like me, staying up until Alaska and Hawaii close their polls. Pro-tip: there’s a potentially competitive Senate race in Alaska, though it’s definitely leaning Republican.)

But, some states vote early and/or by mail every year and have built the infrastructure to count those votes, or the vast majority of them, on or even before Election Day. Three battleground states are in that group: Arizona, Florida, and North Carolina. We could well know the result in those states by midnight on Election Day—though Florida is probably going to Florida.

So what happens with this FiveThirtyEight model if we determine the winners of those three states? All three voted for Trump in 2016, so let’s say he wins them again next week.

We see that the states we’ve decided are now outlined in black. The remainder of the states have seen their colours change as their odds reflect the set electoral choice of our three states. We also now have a reset button that appears only once we’ve modified the map. I also like FiveyFox, the site’s new mascot, who provides a succinct, plain-language summary of what the user is looking at. At the bottom we see what the model projects if Arizona, Florida, and North Carolina vote for Trump. And in that scenario, Trump wins in 58 out of 100 elections, Biden in only 41. Still, it’s a fairly competitive election.

So what happens if by midnight we have results from those three states that Biden has managed to flip them? And as of Thursday morning, he’s leading very narrowly in the opinion polls.

Well, the interface hasn’t really changed. Though I should add below this screenshot there is a button to copy the link to this outcome to your clipboard if, like me, you want to share it with the world or my readers.

As to the results, if Biden wins those three states, Trump has less than a 1-in-100 chance of winning and Biden a greater than 99-in-100.

This is a really strong piece from FiveThirtyEight and it does a great job to show how states are subtly linked in terms of their likelihood to vote one way or the other.

Credit for the piece goes to Ryan Best, Jay Boice, Aaron Bycoffe and Nate Silver.

Covid Migration

Yep, Covid-19 remains a thing. About a month or so ago, an article in City Lab (now owned by Bloomberg) looked at the data to see if there was any truth in the notion that people are fleeing urban areas. Spoiler: they’re not, except in a few places. The entire article is well worth a read, as it looks at what is actually happening in migration and why some cities like New York and San Francisco are outliers.

But I want to look at some of the graphics going on inside the article, because those are what struck me more than the content itself. Let’s start with this map titled “Change in Moves”, which examines “the percentage drop in moves between March 11 and June 30 compared to last year”.

Conventionally, what would we expect from this kind of choropleth map? We have a sequential stepped gradient headed in one direction, from dark to light. Presumably we are looking at one metric, change in movement, in one direction: the drop, or negative.

But look at that legend. Note the presence of the positive 4—there is an entire positive range within this stepped gradient. Conventionally we would expect to see some kind of red-equals-drop, blue-equals-gain split at the zero point. Others might create a grey bin covering a negative-one-to-positive-one, slight-to-no-change set of states. Here, though, we don’t have that. Nor do we even get a natural split; instead the dark bin gives way to a slightly less dark bin at positive four, so everything from just under +4 down through −16 sits in the darker bin.

Look at the language, too, because that’s where it becomes potentially more confusing. If the choropleth largely focuses on the “percentage drop” and has negative numbers, a negative of a negative would be…a positive. A −25% drop in Texas could easily be misread thanks to the double negative. Compare Texas to Nebraska, which had a 2% drop. Does that mean Nebraska actually declined by 2%, or does it mean it rose by 2%?

A clean up in the data definition to, say, “Percentage change in moves from…” could clear up a lot of this ambiguity. Changing the colour scheme from a single gradient to a divergent one, with a split around zero (perhaps with a bin for little-to-no change), would make it clearer which states were in the positive and which were in the negative.
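That divergent binning amounts to a simple lookup. A sketch with hypothetical bin edges, including a little-to-no-change band around zero:

```python
from bisect import bisect_right

# Hypothetical divergent bin edges for percentage change in moves,
# split at zero with a little-to-no-change band from -1 to +1
edges = [-16, -8, -1, 1, 8, 16]
labels = ["big drop", "drop", "little change", "gain", "big gain"]

def classify(pct_change):
    """Assign a percentage change to a divergent bin.
    Values outside the edges clamp to the end bins."""
    i = bisect_right(edges, pct_change) - 1
    return labels[max(0, min(i, len(labels) - 1))]
```

With bins like these, a state in the positive range could never share a colour with one that dropped 16%, and the legend would read unambiguously as change rather than as a double-negative drop.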

The article continues with another peculiar choice in its bar charts when it explores the data on specific cities.

Here we see the destinations of people moving out of San Francisco, using, as a note explains, requests for quotes as a proxy for the numbers of actual moves. What interests me here is the minimalist take on the bar charts. Note the absence of an axis, which leaves the bars almost groundless for comparison, except that the designer attached data labels to the ends of the bars.

Normally data labels are redundant. The point of a visualisation is to visualise the comparison of data sets. If hyper precise differences to the decimal point are required, tables often are a better choice. But here, there are no axis labels to inform the user as to what the length of a bar means.

It’s a peculiar design decision. If we think of labelling as data ink, is this a more efficient use with data labels than just axis labels? I would venture to say no. You would probably have five axis labels (0–4) and then a line to connect them. That’s probably less ink/pixels than the data labels here. I prefer axis lines to help guide the user from labels up (in this case) through the bars. Maybe the axis lines make for more data ink than the labels? It’s hard to say.

Regardless, this is a peculiar decision. Though, I should note it’s eminently more defensible than the choropleth map, which needs a rethink in both design and language.

Credit for the piece goes to Marie Patino.

Positioning Is Important

Yesterday Pew Research released the results of a survey of how people around the world view select countries. The Washington Post covered it in an article and created some graphics to support the text. The text, of course, held no big surprise: the rest of the world views the United States poorly compared to just several years ago and, in particular, President Trump is a leader in whom the world has no confidence.

But that’s not what I want to talk about. Instead, I want to address a design element in the one of their graphics. (But you should go ahead and read about the survey results.)

The issue here is the positioning of the labels for each bar, representing a world leader. At the very top of the graphic, things are in a good way. We have Merkel with a small space beneath that text then another label, “No confidence, 19 percent”, and then a connecting line to a dot to the blue bar. We then have a small space and the label Macron, meaning we have moved on and are on the next world leader.

But what if the reader sees the title and starts towards the bottom? They want to see the leaders in whom the world has no confidence. Now look at the bottom of the chart and the positioning of the labels for Trump, and above him, Xi, Putin, and maybe even Johnson. Because the “No confidence, x percent” labels have moved further to the right, there is an enormous space between the leader’s name and their coloured bar. Visually, this creates a link between the leader’s name and the preceding bar. For example, Trump appears to have a no confidence value of 78 with an unlabelled bar chart beneath him.

I suggest that there are two easy fixes to better link the labels to the data. The first is to move the leaders’ labels down, once the “No confidence” label has moved sufficiently far to the right. Like so.

The leader is now very clearly attached to his or her data with little confusion.

My second option is to fix the “No confidence” labels permanently to the left of the chart so as not to create that visual space in the first place, like so.

Here, after seeing the first option, I wonder if there is enough visual space at all between the leaders. But, this is only a quick Photoshop exercise. If I wanted to really tweak this, I would consider putting the data point or number in bold to the right of the label. That would eliminate an entire line of type that could be repurposed as a visual buffer between leaders.

I think either option would be preferable because of increased clarity for the reader.

Credit for the piece goes to the Washington Post graphics department.

Double Your Hurricanes, Double Your Fun

In a first, the Gulf of Mexico basin has two active hurricanes simultaneously. Unfortunately, they are both likely to strike somewhere along the Louisiana coastline within approximately 36 hours of each other. Fortunately, neither is as strong as Katrina, the storm that made a mess of things several years ago now.

Over the last few weeks I have been trying to start the week with my Covid datagraphics, but I figured we could skip those today and instead run with this piece from the Washington Post. It tracks the forecast path and forecast impact of tropical storm force winds for both storms.

The forecast path above is straightforward. The dotted line represents the forecast path. The coloured area represents the probability of that area receiving tropical storm force winds. Unsurprisingly, the present locations of both storms have the highest probabilities.

Now compare that to the standard National Weather Service graphic, below. They produce one per storm and I cannot find one of the combined threat. So I chose Laura, the one likely to strike mid-week and not the one likely to strike later today.

The first and most notable difference here is the use of colour. The ocean here is represented in blue compared to the colourless water of the Post version. The colour draws attention to the bodies of water, when the attention should be more focused on the forecast path of the storm. But, since there needs to be a clear delineation between land and water, the Post uses a light grey to ground the user in the map (pun intended).

The biggest difference is what the coloured forecast areas mean. In the Post’s versions, it is the probability of tropical force winds. But, in the National Weather Service version, the white area actually is the “cone”, or the envelope or range of potential forecast paths. The Post shows one forecast path, but the NWS shows the full range and so for Laura that means really anywhere from central Louisiana to eastern Texas. A storm that impacts eastern Texas, for example, could have tropical storm force winds far from the centre and into the Galveston area.

Of course every year the discussion is about how people misinterpret the NWS version as the cone of impact, when that is so clearly not the case. But then we see the Post version and it might reinforce that misconception. Though, it’s also not the Post’s responsibility to make the NWS graphic clearer. The Post clearly prioritised displaying a single forecast track instead of a range along with the areas of probabilities for tropical storm force winds.

I would personally prefer a hybrid sort of approach.

But I also wanted to touch briefly on a separate graphic in the Post version, the forecast arrival times.

This projects when tropical storm force winds will begin to impact particular areas. Notably, the area of probability of tropical storm force winds does not change. Instead, the dotted-line projections for the paths of the storms are replaced by lines roughly perpendicular to those paths. These lines show when the tropical storm winds are forecast to begin. It’s another update on the National Weather Service design, offered below.

Again, we only see one storm per graphic here, and this is only for Laura, not Marco. But this is also probably the most analogous to what we see in the Post version. Here, the black outline represents the light pink area on the Post map, the area with at least a 5% chance of receiving tropical storm force winds. The NWS version, however, does not provide any further forecast probabilities.

The Post’s version is also a design improvement: the NWS blue, while not as dark as the heavy black lines, still draws unnecessary attention to itself. Would even a very pale blue be an improvement? Almost certainly.

In one sense, I prefer the Post’s version. It’s more direct, and the information is more clearly presented. But I find it severely lacking in one key detail: the forecast cone. Even yesterday, the forecast cone had Laura moving in a range both north and south of the island of Cuba from its position west of Puerto Rico. Twenty-four hours later, we now know it’s on the southern track, and that has a massive impact on future forecast tracks.

Being east or west of landfall can mean dramatically different impacts in terms of winds, storm surge, and rainfall. And the Post’s version, while clear about one forecast track, obscures the very real possibility that the range of impacts can shift dramatically in just the course of one day.

I think the Post does a better job of the tropical storm force wind forecast probabilities. In an ideal world, they would take that approach to the forecast paths. Maybe not showing the full spaghetti-like approach of all the storm models, but a percentage likelihood of the storm taking one particular track over another.

Credit for the Post pieces goes to the Washington Post graphics department.

Credit for the National Weather Service graphics goes to the National Weather Service.

Flood Stages of the Schuylkill

Hurricane Isaias ran up the East Coast of the United States then the Hudson River Valley before entering Canada. Before it left the US, however, it dumped some record-setting amounts of rain in Philadelphia and across the region. And in times of heavy rains, the lower-lying areas of the city (and suburbs like Upper Darby and Downingtown to mention a few) face inundation from swollen rivers and creeks. And in the city itself, the neighbourhood of Eastwick is partially built upon a floodplain. So staying atop river levels is important and the National Weather Service has been doing that for years.

The National Weather Service graphic above is from this very morning and represents the water level of the Schuylkill River (the historical Philadelphia was sited between two rivers, the more commonly known Delaware and its tributary the Schuylkill), which receives water from the suburbs to the north and west of the city, the area hardest hit by Isaias’ rainfall.

The chart looks at the recent as well as the forecast stages of the river. Not surprisingly, the arrival of Isaias accounts for the sudden rise in the blue line. But there is a lot going on here: yellows, reds, and purples; some kind of NOAA logo behind the chart; labels sitting directly on lines; and type that is pixellated and difficult to read.

But it does do a nice job of showing the differences in observations and forecast points in time. By that I mean, a normal line chart has an equal distribution of observations along its length. There is an equal space between the weeks or the months or the years. But in instances like this, observations may not be continuous—imagine a flood destroying a sensor—or here that the forecasts are not as frequently produced as observations. And so these are all called out by the dots on the lines we see.

This is the chart I am accustomed to seeing. But then last night, reading about the damage, I came across this graphic (screenshot also from this morning, to compare to the above) from the Philadelphia Inquirer.

It takes the same data and presents it in a cleaner, clearer fashion. The flood stages are far easier to read. Gone are the NOAA logo and the unnecessary vertical gridlines. The type is far more legible and the palette less jarring, putting the data series front and centre.

In general, this is a tremendous improvement for the legibility of the chart. I would probably use a different colour for the record flood stage line, or given their use of solid lines for the axis maybe make it dotted. But that’s a small quibble.

The only real issue here is what happens to the time. Compare the frequent observations of the past in the original, every half hour or so, to the six-hourly forecast dots (the blue versus the purple). In the Inquirer version, those spaces between forecast points disappear and become the same as the half-hour increments.

To be fair, the axis labelling implies this, as the label goes from August 4 to 5 and then jumps all the way to 7, but it is not as intuitive as it could be. Here I would recommend following the National Weather Service’s fashion of adjusting for the time gap. It would probably mean some kind of design tweak to emphasise that the earlier observations arrive every half hour or so, versus the six-hourly forecasts. The NWS did this through dots. One could use a dotted line, or some other design treatment.
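One easy way to keep those gaps honest is to plot against real timestamps, so six-hourly forecast points naturally sit farther apart than half-hourly observations. A rough sketch with invented stage readings, not the actual Schuylkill data:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
from datetime import datetime
import matplotlib.pyplot as plt

# Hypothetical half-hourly observations and six-hourly forecasts (stage in feet)
observed = [(datetime(2020, 8, 4, 8, 0), 6.1),
            (datetime(2020, 8, 4, 8, 30), 7.4),
            (datetime(2020, 8, 4, 9, 0), 9.2)]
forecast = [(datetime(2020, 8, 4, 12, 0), 12.0),
            (datetime(2020, 8, 4, 18, 0), 10.5),
            (datetime(2020, 8, 5, 0, 0), 8.0)]

fig, ax = plt.subplots()
ax.plot(*zip(*observed), marker="o", label="Observed")
ax.plot(*zip(*forecast), marker="o", linestyle="--", label="Forecast")
ax.set_ylabel("River stage (ft)")
ax.legend()
fig.savefig("hydrograph_sketch.png")
```

Because the x-axis is a true datetime axis, the chart preserves the irregular spacing on its own, with the dashed line doing the same job as the NWS’s dots.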

This missing time is the only thing really holding back this piece from the Inquirer from standing out as a great update of the traditional National Weather Service hydrograph chart.

Credit for the National Weather Service piece goes to the National Weather Service.

Credit for the Inquirer piece goes to Dominique DeMoe.

Axis Lines in Charts

The British election campaign is wrapping up as it heads towards the general election on Thursday. I haven’t covered it much here, but this piece from the BBC has been at the back of my mind. And not so much for the content, but strictly the design.

In terms of content, the article stems from a question asked in a debate about income levels and where they fall relative to the rest of the population. A man rejected a Labour party proposal for an increase in taxes on those earning more than £80,000 per annum, saying that as someone who earned more than that amount he was “not even in the top 5%, not even the top 50”.

The BBC looked at the data and found that the man was certainly within the top 50% and likely in the top 5%, as the top 5% earn more than £75,300 per annum. Here in the States, many Americans cannot place their incomes within the actual spread of incomes. The income gap here is severe and growing. But I want to look at the charts the BBC made to illustrate its points.

The most important is this line chart, which shows the income level and how it fits among the percentages of the population.

Are things lining up? It’s tough to say.

I am often in favour of minimal axis lines and labelling. Too many labels and explicit data points begin to subtract from the visual representation or comparison of the data. If you need to be able to reference a specific data point for a specific point on the curve, you need a table, not a chart.

However, there is utility in having some guideposts as to what income levels fit into what ranges. And so I am left to wonder, why not add some axis lines? Here I took the original graphic file and drew some grey lines.

Better…

Of course, I prefer the dotted or dashed line approach. The difference in line style provides some additional contrast to the plotted series. And in this case, where the series is a thin but coloured line, the interruptions in the solidity of the axis lines makes it easier to distinguish them from the data.

Better still.
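The dashed-guide approach can be mocked up in matplotlib with a few lines: dashed grey gridlines drawn beneath a thin coloured data series, so the broken strokes never compete with the solid line. The income curve here is invented, not the BBC's data.

```python
# Sketch: dashed grey gridlines behind a thin coloured data line.
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

incomes = list(range(0, 180001, 10000))                    # hypothetical levels
percentiles = [100 * (1 - 0.99985 ** x) for x in incomes]  # hypothetical curve

fig, ax = plt.subplots()
ax.plot(incomes, percentiles, color="tab:red", linewidth=1)

# Dashed guides: the interruptions in the strokes keep them visually
# distinct from the solid data series.
ax.grid(True, color="grey", linestyle="--", linewidth=0.5)
ax.set_axisbelow(True)  # draw the grid beneath the data, not over it
ax.set_xlabel("Annual income (£)")
ax.set_ylabel("Share of population below (%)")
```

Setting `set_axisbelow(True)` matters: without it the guides render on top of the series, which is exactly the visual disruption we are trying to avoid.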

But the article also has another chart, a bar chart, that looks at average weekly incomes across different regions of the United Kingdom. (Not surprisingly, London has the highest average.) Like the line chart, this bar chart does not use any axis lines. But what makes this one even more difficult is that the solid black line we could use in the line charts above to mark the maximum at 180,000 is not there. Instead we simply have a string of numbers at the bottom for which we need to guess where they fall.

Here we don’t even have a solid line to take us out to 700.

If we assume that the 700 value is at the centre of the text, we can draw some dotted grey lines atop the existing graphic. And now quite clearly we can get a better sense of which regions fall in which ranges of income.

We could have also tried the solid line approach.

But we still have this mess of black digits at the bottom of the graphic. And after 50, the numbers begin to run into each other. It is implied that we are looking at increments of 50, but a little more spacing would have helped. Or, we could simply keep the values at the hundreds and, if necessary, not label the lines at the 50s. Like so.

Much easier to read
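The hundreds-only labelling can be sketched with matplotlib's tick locators: major ticks (labelled) every 100, minor ticks (unlabelled guides) every 50. The regions and figures below are an invented subset, not the BBC's numbers.

```python
# Sketch: label only the hundreds while still ruling guides at the 50s.
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

regions = ["London", "South East", "Scotland", "North East"]  # hypothetical
weekly = [727, 620, 578, 540]                                 # hypothetical £/wk

fig, ax = plt.subplots()
ax.barh(regions, weekly, color="tab:blue")

ax.xaxis.set_major_locator(MultipleLocator(100))  # labelled: 0, 100, 200, ...
ax.xaxis.set_minor_locator(MultipleLocator(50))   # unlabelled guides at the 50s
ax.grid(True, axis="x", which="both", color="grey", linestyle=":")
ax.set_axisbelow(True)
```

Only the major ticks receive text labels by default, so the crowded run of digits at the 50s disappears while their gridlines remain.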

The last bit I would redo in the bar chart is the order of the regions. Unless there is some particular reason for ordering these regions as they are—you could partly argue they are from north to south, but then Scotland would be at the top of the list—they appear an arbitrary lot. I would have sorted them maybe from greatest to least or vice versa. But that bit was outside my ability to do this morning.
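The sorting itself is a one-liner before plotting, again with invented figures standing in for the BBC's data:

```python
# Sketch: order the regions from greatest to least average weekly income
# before plotting, so the bars read as a ranking.
regions = {"North East": 540, "London": 727,
           "Scotland": 578, "South East": 620}  # hypothetical £/week

# Sort descending by value so the longest bar sits at the top of the chart.
ordered = sorted(regions.items(), key=lambda kv: kv[1], reverse=True)
labels, values = zip(*ordered)
print(labels)  # London first, North East last
```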

So in short, while you don’t want to overcrowd a chart with axis lines and labelling, you still need a few to make it easier for the user to make those visual comparisons.

Credit for the original pieces goes to the BBC graphics department.