On a Line. Or Not.

Two weeks ago I was reading an article from the BBC that fact checked some of President Biden’s claims about the economy. I noted the other day, in a post about axis lines and their use in graphics, that axis lines help ground the user in making comparisons between bars, lines, or whatever, and the minimum/maximum/intervals of the data set.

I was reading the article and first came upon this graphic. It’s nothing crazy and shows job growth in the aggregate for the first three months of a presidential administration. A pretty neat comparison in the combination of the data. I like.

Pay attention to what you see here. There will be a quiz.

I don’t like the lack of grid lines for the axis, however. But, okay, none to be found.

I keep reading the article. And then a couple of paragraphs later I come upon this graphic. It looks at the monthly figures and uses a benchmark line, the red dotted one, to break out those after January 2021 when Biden took office.

Spot the differences.

But do you notice anything?

The lines for the y-axis are back!

The article had a third graphic that also included axis lines.

I don’t have a lot to say about these graphics in particular, but the most important thing is to try and be consistent. I understand the need to experiment with styles as a brand evolves. Swap out the colours, change the styles of the lines, try a new typeface. (Except for the blue, we are seeing different colours and typefaces here, but that’s not what I want to write about.)

First, I don’t know if these are necessarily style experiments. I suspect not, but let’s be charitable for the sake of argument. I would refrain from experimenting within a single article. In other words, use the lines or don’t, but be consistent within the article.

For the record, I think they should use the lines.

Another point I want to make is with the third graphic. You’ll note that, like I said above, it does use axis lines. But that’s not what I want to mention.

At least we have lines.

Instead I want to look at the labelling on the axes. Let’s start with the y-axis, the percentage change in GDP on the previous quarter. At the top of the chart we have 30%. As I’ve said before, you can see in the Trump administration, the bar for the initial Covid-19 rebound rises above the 30% line. It’s not excessive, I can buy it if you’re selling it.

But let’s go down below the 0-line. Just prior to the rebound we had the crash. Similarly, this extends just below the -30% line. But here we have a big space and then a heavy black line below that -30% line. It looks like the bottom line should be -40%, but scanning over to the left, there is no label. So what’s going on?

First, that heavy black line, why does it appear the same as the baseline or zero-growth line? The axis lines, by comparison, are thin and grey. You use a heavier, darker line to signify the breaking point or division between, in this case, positive and negative growth. Theoretically, you don’t need the two different colours for positive and negative growth, because the direction of the bar above/below that black line encodes that value. By making the bottom line the same style as the baseline, you conflate the meaning of the two lines, especially since there is no labelling for the bottom line to tell you what the line means.

Second, the heaviness of the line draws visual attention to it and away from the baseline, especially since the bottom line has the white space above it from the -30% line. Consider here the necessity of this line. For the 30% line that sets the maximum value of the y-axis, we have the blue bar rising above the line and the administration labels sit nicely above that line. There is no reason the x-axis labels could not exist in a similar fashion below the -30% line. If anything, this is an inconsistency within the one chart, let alone the one graphic.

Third, is it -40%? I contend the line isn’t necessary and that if the blue bar pokes above the 30% line, the orange bar should poke below the -30% line. But, if the designer wants to use a line below the -30% line, it should be labelled.

Finally, look at the x-axis. This is more of a minor quibble, but while we’re here…. Look at the intervals of the years. 2012, 2014, 2016, every two years. Good, makes sense. 2018. 20…21? Suddenly we jump from every two years to a three-year interval. I understand it to a point, after all, who doesn’t want to forget 2020? But in all seriousness, the chart ends at 2021 and you cannot divide that evenly. So what is a designer to do? If this chart had less space on the x-axis and the years were more compressed in terms of their spacing, I probably wouldn’t bring this up.

However, we have space here. If we kept to a two-year interval system, I would introduce the labels as 2012, but then contract them with an apostrophe after that point. For example, 2014 becomes ’14. By doing that, you should be able to fit the two-year intervals in the space as well as the ending year of the data set.
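The labelling scheme described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how the BBC builds its charts: the function name year_tick_labels and its parameters are hypothetical, and the 2012 to 2021 range is taken from the chart as described.

```python
def year_tick_labels(start, end, step=2):
    """Return x-axis labels: the first year in full, later years contracted.

    Hypothetical helper for illustration. It keeps a regular interval and
    always appends the final year of the data set, even if that year falls
    off the interval.
    """
    years = list(range(start, end + 1, step))
    if years[-1] != end:
        years.append(end)  # always label the last year of the data

    labels = []
    for index, year in enumerate(years):
        if index == 0:
            labels.append(str(year))  # full year anchors the series
        else:
            # Apostrophe plus two-digit year, e.g. 2014 becomes ’14
            labels.append("\u2019" + f"{year % 100:02d}")
    return labels


print(year_tick_labels(2012, 2021))
```

For 2012 through 2021 this yields six short labels: the full 2012, then ’14, ’16, ’18, ’20, and the closing ’21. The two-year rhythm survives and the end of the data set is still marked.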

Overall, I have to say that this piece shocked me. The lack of attention to detail, the inconsistency, the clumsiness of the design and presentation. I would expect this from a lesser organisation than the BBC, which for years had been doing solid, quality work.

The first chart is conceptually solid. If Biden spoke about job creation in the first three months of the administration vs. his predecessor, aggregate the data and show it that way. But the presentation throughout this piece does that story a disservice. I wish I knew what was going on.

Credit for the piece goes to the BBC graphics department.

Covid Update: 23 May

Last week I wrote about how we were seeing new cases continuing to rapidly decline. This week we can say cases are still declining, but perhaps a bit less rapidly than earlier.

New case curves for PA, NJ, DE, VA, & IL.

The charts above show that slowdown in the tail at the right of the chart. First, a point to note: Delaware reported that several hundred cases had not been entered into their database, and so we saw a one-time spike midweek. But note that after the spike, the numbers continue to trend down. In other words, the rapid decline was probably a bit less rapid than we saw, but it was still a decline.
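To see why a one-time backlog dump shows up as a temporary bump rather than a change in trend, here is a minimal sketch of a trailing seven-day average. The daily counts are made up for illustration and are not Delaware’s actual figures, and the function name trailing_average is my own.

```python
def trailing_average(values, window=7):
    """Average each day with up to window - 1 preceding days."""
    averages = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Hypothetical data: a steady decline over 20 days
daily = [200 - 5 * day for day in range(20)]

# The same series with a one-time backlog of 300 cases reported on day 10
with_backlog = daily.copy()
with_backlog[9] += 300

clean_avg = trailing_average(daily)
bumped_avg = trailing_average(with_backlog)

for day, (clean, bumped) in enumerate(zip(clean_avg, bumped_avg), start=1):
    print(f"day {day:2d}: without backlog {clean:6.1f}, with backlog {bumped:6.1f}")
```

In this sketch the backlog lifts the average by 300/7 for exactly seven days, and then the two series match again. The underlying decline never stops; the average just sits higher while the spike is inside the window.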

Pennsylvania’s chart has a problem of your author’s own design. Now that I’m fully vaccinated I was able to leave the flat this weekend, and the Pennsylvania data wasn’t ready by the time I left on Saturday. By Sunday it was, and so the 2500 new cases are probably split somehow between those two days, which the seven-day average accounts for. This points to a broader question for which I do not yet have an answer: as life increasingly returns to normal, how much longer will I continue to update these charts?

I started these graphics as a way for myself to track the spread of the virus in my home state and the state where I still have a large number of friends. At the time, there were few if any visualisations out there doing this. Now most media outlets have them and my work at home led to a similar project at work. The reason I continued to make these was you, my readers here and in other places where I post this work. Your comments, messages, texts, and emails made it clear you valued the work. I know there are still many people left to be fully vaccinated, nearly half the population, and due to bias, some of the people most likely to follow these posts are those most likely to get vaccinated as early as possible. But please let me know, readers, if you’re still getting value out of these graphics.

But back to the data, in two of the remaining three states, Virginia and Illinois, we saw numbers continue to decline. New Jersey, however, shows a tail with a slight uptick in the seven-day average of new cases. This will be something I follow closely this coming week.

Deaths finally appear to be dropping.

Death curves for PA, NJ, DE, VA, & IL.

Not by large numbers, no, but in Virginia and Illinois we saw declines of 5 deaths per day. Pennsylvania was even greater with a decline of 7. We are still above rates we saw last summer, but it does appear that finally we have hit the inflection point we have been waiting for the last several weeks.

Finally we have vaccinations. These charts look at the cumulative number of people fully vaccinated.

Fully vaccinated curves for PA, VA, & IL.

And there the numbers keep going up, and that’s good. Then again, a cumulative total can only go up. If you look closely at the right tail of the curve, you begin to see it flattening out as the rate of daily vaccinations begins to drop. Unfortunately we’re well below levels we think we’d need for herd immunity. But, to try and look at the positive, we’re almost halfway there and that is certainly playing a role, as we can see with the rapid decline in numbers of new cases. We need to keep trying to get more people vaccinated.

Credit for the piece is mine.

Some Data on Deaths in Gaza and Israel

I’ve seen an uptick in traffic to the blog the last few days, specifically my older content on the Middle East. I don’t exactly have the bandwidth to track the conflict between Israel and Gaza in addition to Covid-19 and my other projects. But as we approached the ten-day mark since Hamas first fired rockets into Israel, I wanted to get a sense of the death toll and so here we are.

The biggest thing to note is that we should take all this data with a grain of salt. For example, the Israeli Defence Force will likely talk up the effectiveness of its Iron Dome air defence system and downplay total civilian deaths. Conversely, Hamas will likely talk up civilian deaths while not detailing at all the deaths of its fighters. And when it comes to deaths in Gaza, it’s not clear what share of those reported by civilian authorities, i.e. the hospital systems, are militant fighters vs. civilians.

Not at all covered by any of this is a discussion of the opportunity costs involved, particularly when it comes to Israeli air strikes. For example, if a Gaza household contains a known Hamas fighter, one can certainly regret an Israeli drone strike that kills the fighter and his non-combatant son whilst in a field. But that strike may be a better outcome than striking the fighter’s home and killing not just him and his son, but also his wife, daughters, and the rest of his family.

Credit for the piece is mine.

Covid Update: 16 May

Last week I wrote about how new cases in the five states we cover (Pennsylvania, New Jersey, Delaware, Virginia, and Illinois) were falling and falling rapidly. And this week that pattern continues to hold.

New case curves for PA, NJ, DE, VA, & IL.

If we look at the Sunday-to-Sunday numbers, daily new cases were down in all five states. If we look at the seven-day averages, cases are down in all states. Pennsylvania and Illinois are now down below 2000 new cases per day, Virginia is just over 500 per day, New Jersey is below 400, and Delaware is over 100. These are all levels we last saw last autumn. In other words, we’re not quite back to summer levels of low transmission, but this time next month, I wouldn’t be surprised if we were.

Deaths remain stubbornly resistant to falling.

Death curves for PA, NJ, DE, VA, & IL.

In fact, if we compare the Sunday-to-Sunday numbers we see that the numbers yesterday were largely the same as last Sunday, except in Pennsylvania where they were up significantly. The seven-day average?

Here’s where it gets interesting, because deaths are up slightly. Not by much. For example, Illinois was at 29.1 deaths per day last Sunday. This Sunday? 30.9. Illinois isn’t alone. Pennsylvania, Delaware, and Virginia all have reported slight upticks in their death rates.

But the biggest concern is the continuing slowdown in vaccinations. We’re perhaps halfway to the point of herd immunity in the three states we track. All three are between 37% and 38%. The thing to track this coming week will be if the rate continues to slow.

Total full vaccination curves for PA, VA, & IL.

Credit for the piece is mine.

The May Jobs Report

Last Friday, the government released the labour statistics from April and they showed a weaker rebound in employment than many had forecasted. When I opened the door Saturday morning, I got to see the numbers above the fold on the front page of the New York Times.

Welcome to the weekend

What I enjoyed about this layout was that the graphic occupied half the above the fold space. And because the designers laid the page out using a six-column grid, we can see just how they did it: the graphic is itself laid out in the column widths of the page. That allows the leftmost column of the page to run an unrelated story whilst the jobs numbers occupy 5/6 of the page’s columns.

If we look at the graphic in more detail, the designers made a few interesting decisions here.

Jobs in detail

First, last week I discussed a piece from the Times wherein they did not use axis labels to ground the dataset for the reader. Here we have axis labels back, and the reader can judge where intervening data points fall between the two. For attention to detail, note that under Retail, Education and health, and Business and professional services, the “illion” in -2 Million was removed so as not to interfere with legibility of the graphic, because of bars being otherwise in the way.

My issue with the axis labels? I have mentioned in the past that I don’t think a designer always needs to put the maximum axis line in place, especially when the data point darts just above or below the line. We see this often here, for example Construction and Manufacturing both handle it this way for their minimums. This works for me.

But for the column above Construction, i.e. State and local government and Education and health, we enter the space where I think the graphic needs those axis lines. For Education and health, it’s pretty simple, the red losses column looks much closer to a -3 million value than a -2 million value. But how close? We cannot tell without an axis line.

And then under State and local government we have the trickier issue. But I think that’s also precisely why this could use some axis lines. First, almost all the columns fall below the -1 million line. This isn’t the case of just one or two columns, it’s all but two of them. Second, these columns are all fairly well down below the -1 million axis line. These aren’t just a bit over, most are somewhere between half to two-thirds beyond. But they are also not nearly as close to -2 million as the Education and health columns were to -3 million.

So why would I opt to have an axis line for State and local governments? The designers chose this group to add the legend “Gain in April”. That could neatly tuck into the space between the columns and the axis line.

Overall it’s a solid piece, but it needs a few tweaks to improve its legibility and take it over the line.

Credit for the piece goes to Ella Koeze and Bill Marsh.

Covid Update: 9 May

Last week I wrote about how, for new cases, we had seen a few consecutive days of increasing cases. Were we witnessing an aberration, a one-off “well, that was weird”? Or was this the beginning of a trend towards increasing new cases?

A week later and we have our answer. Just a one-off.

New cases curves for PA, NJ, DE, VA, & IL.

If we focus on just the seven-day average, in just one week the numbers in New Jersey have fallen by half. In Pennsylvania and Virginia, it’s by one quarter. Illinois is a little less than that, as is Delaware. Across the board, numbers are falling and falling quickly.

Deaths curves for PA, NJ, DE, VA, & IL.

When we move to deaths, we’re beginning to see an improvement. As the lagging indicator, we would expect these to begin to drop a few weeks after new cases begin to drop. We have begun to see what might be the peaks of deaths in a few states.

Full vaccination curves for PA, NJ, DE, VA, & IL.

Over this coming week, I’ll be closely watching these numbers to see if we can finally begin to say authoritatively that deaths are in decline.

Vaccinations drive all of this. And we continue to see the total number of fully vaccinated people climbing in Pennsylvania, Virginia, and Illinois. But, that rate is slowing down. Most likely we are entering a phase where those eager for their shots have largely received them. Now begins the challenge of vaccinating those who might lack easy access or have reservations.

But to be clear, we need those people to become fully vaccinated before we can truly begin to return to normal. Whatever normal is. It’s hard to remember anymore.

Credit for the piece is mine.

Off the Axis

Two Fridays ago, I opened the door and found my copy of the New York Times with a nice graphic above the fold. This followed the announcement from the White House of aggressive targets to reduce greenhouse gas emissions.

In general, I love seeing charts and graphics above the fold. As an added bonus, this set looked at climate data.

Need to see more downward trending lines.

But there are a few things worth pointing out.

First, from a data side, this chart is a little misleading. Without a doubt, carbon dioxide represents the greatest share of greenhouse gasses; according to the US Environmental Protection Agency (EPA), it was 76% in 2010. Methane contributes the next largest share at 16%. But the labelling should be a little clearer here. Or, perhaps lead with a small chart showing CO2’s share of greenhouse gasses and from there, take a look at the largest CO2 emitters per person.

Second, where are the axis labels?

I will probably have more on this at a later date, but neither the bar chart nor the line charts have axis labels. Now the designers did choose to label the beginning value for the lines and the bars, but this does not account for the minimums or maximums. (It also assumes that the bottom of the lines is zero.)

For example, we can see that China began 1990 with emissions at 3.4 billion metric tons. The annotation makes clear that China’s aggregate emissions surpassed those of the US in 2004. But where do they peak? What about developing countries?

If I pull out a ruler and draw some lines I can roughly make some height comparisons. But, an easier way would be simply to throw some dotted lines across the width of the page, or each line chart.

This piece takes a big swing at presenting the challenge of reducing emissions, but it fails to provide the reader with the proper—and I think necessary—context.

Credit for the piece goes to Nadja Popovich and Bill Marsh.

Covid-19: A Global Update

I’ve been trying to limit the amount of Covid-19 visualisations I’ve been covering. But on Sunday this image landed at my front door, above the fold on page 1 of the New York Times. And it dovetails nicely with our story about the pandemic’s impact on Pennsylvania, New Jersey, Delaware, Virginia, and Illinois.

Some not so great looking numbers across the globe.

For most of 2020, the United States was one of the worst hit countries as the pandemic raged out of control. Since January 2021, however, the United States has slowly been coming to grips with the virus and the pandemic. Its rate is now solidly middle of the pack—no longer is America first.

And if you compare the chart at the bottom to those that I’ve been producing, you can clearly see how our five states have really gotten this most recent wave under control to the point of declining rates of new cases.

However, you’ve probably heard the horror stories from India and Brazil where things are not so great. It’s countries like those that account for the continual increase in new cases at a global level.

Credit for the piece goes to Lazaro Gamio, Bill Marsh, and Alexandria Symonds.

Covid Update: 2 May

I didn’t write a post last Monday, but this Monday I am. A few things may have changed in the Covid situation. The most important is that we may have finally seen the peak of this current wave’s surge of new cases.

For the last few weeks we’ve seen cases rising in the five states. Only New Jersey of late had shown a return to declining cases. About the middle of the week before last, we began to see those numbers decline. And so in this past week we did begin to see cases decline in all five states.

New case curves for PA, NJ, DE, VA, & IL.

The thing to watch this week will be that at the very end of last week, new cases ticked up slightly for two or three days in a number of states. It could be an aberrant one-off, but with full vaccinations still well below herd immunity and cases still at high levels, it isn’t difficult to imagine a scenario where the virus begins to surge once again.

Deaths on the other hand, they continue to climb. We aren’t seeing massive increases, instead these are largely marginal. But they are increasing all the same.

Death curves for PA, NJ, DE, VA, & IL.

Encouragingly, if cases can continue to decline, deaths will begin to fall. As a lagging indicator, they will be the last metric we see decline. Consequently, it’s a question of when, not if, deaths begin to decline. On Saturday, we did see a small decline in deaths, but one day before the weekend is insufficient to determine whether or not we’ve seen the inflection point, after which deaths would fall.

Vaccinations remain a broad set of positive news. All three states are now reporting just over 30% of their populations as fully vaccinated. However, the rate of vaccination has begun to slow.

Total vaccination curves for PA, NJ, DE, VA, & IL.

And that worries me and the professionals, because we are still far from herd immunity. Until we reach that level, the virus can easily spread among unvaccinated populations. The charts above don’t show the decline, as they look only at the total, cumulative effect. But the charts that I see make quite clear the decline over the last week or two.

Moral of that story is, if you haven’t been vaccinated yet, please register to do so or visit a location that allows walk-up vaccinations.

Arrowheads

I don’t know if this is a trend, but I’ve now seen a few graphics appearing using arrows to show the direction or trend of the data. This graphic in an article by Bloomberg prompted me to talk about this piece.

I should add, after rereading my draft, that I’m not clear who made this graphic. I assume that it was the Bloomberg graphics team, because it appears in Bloomberg and all the data is presented to recreate the chart. But, it could also be a chart made by someone at Goldman Sachs that credits Bloomberg as a source and then someone at Bloomberg got hold of a copy. And a graphic made for a news/media outlet will typically be of a different quality or level of polish than one made perhaps by and for analysts. (Not that I think there should be said differences, as it does a disservice to internal users, but I digress from a digression.)

All the things going on in this chart.

The arrow here appears above the peak quarter, i.e. the second of 2021, for both the Goldman Sachs Economics forecast and the consensus forecast. But what does it really add? First, it adds “ink”, in this case pixels. Here, every pixel consumes our attention and there is a finite number of available pixels within the space of this graphic.

When I work with authors or subject matter experts, I often find myself asking them “what’s the most important thing to communicate?” or something along those lines. If the person answers with a long laundry list, I remind them that if everything is important, nothing is important. If everything is set in bold, all caps text, what will look most important is the rare bit of text set in regular, lower-case letters.

In the above graphic, there are so many things screaming for my attention, it’s difficult to say which is the most important. First, I’m fairly certain that “US QoQ annualised GDP growth” could move to the graphic subhead or data definition. Allow the graphic’s data container to contain, well, data. Second, the data series labels can be moved outside the data container. The labels here have an inherent problem in that the Goldman Sachs Economics numbers are in blue, and that blue text has less visual weight than the black text of the Consensus label. Consequently, the Goldman Sachs Economics label recedes into the background and becomes lost, not what you want from your legend.

Third, I don’t believe the data labels here add anything to the chart. They function as sparkly distractions from the visual trend, which should be the most important aspect of a visual chart.

Finally, we get to the arrow, the impetus for this post. First, I should note that it is not clear what growth it shows. The fact the line is black makes me think it reflects the Consensus forecast whereas a blue line would represent the Goldman Sachs forecast. But it could also be the average of the two or even a more general “here’s the general shape”. The problem is that the shape matters. If you look at the slope of the actual forecasts, you see a sharp increase to the peak followed by a slower, more gradual taper. The arrow in the original graphic shows a decelerating curve that is shallower in the lead up to the peak and that is not what is forecast to happen.

Now we get to the issue I mentioned at the top, the extraneous labelling and data ink wasted. If we look at the chart as is, but remove the arrow, we see this.

Immediately to the right of the peak, we have some blue data labels and then just a bit to the right of that, but sitting vertically above the label, we have the bold blue text labelling the data series. But further to the upper right we have a dark and bold block of text that draws the eye away from the peak and into the corner. It draws the eye away from the very element of the shape the peak needs to be a peak, the trough in the wave. Consequently, it makes sense with the eye being drawn up and to the right that the designers threw an arrow in above the peak to show how, no, actually your eye needs to go down and to the right.

But what happens if we then strip out the data series labelling? Do we still need the arrow? Let’s take a look.

I would argue that no, we do not. And so let’s strip the arrow out of the picture and take a look.

Here the shape of the curve is clear, a sharp rise and then a gradual taper to the right. No arrow needed to show the contour. In other words, the additional labelling wastes our attention, which then forces us to add an arrow to see what we needed to see in the first place, which wastes our attention further.

There are a number of other things I take issue with in this chart: the black outlines of the blue rectangles, the tick marks on the x-axis, the solid border of the container, the lack of axis lines. But the arrow points to this graphic’s central problem, a poorly thought out labelling structure.

So because the chart provides all the data, I took a quick stab at how I would chart it using my own styles. I gave myself a 3:2 ratio, less space than the original graphic had. This is where I landed. I would prefer the legend below the chart labelling, but it felt cramped in the space. And with so few data points along the x-axis, the chart doesn’t need a ton of horizontal space and so I repurposed some of it to create a vertical legend space.

I mixed typefaces only because my default does not have a proper small capitals and I wanted to use small capitals to reduce and balance out the weight of the exhibit label in the graphic title.

I could still tweak the spacing between the bars and perhaps the treatment of the years below the quarters could use some additional work, but the main point here is that the shape of the curve is clear. I need no arrow to tell the user that there is a peak and that after the peak the line goes down. The white space around the bars and the line does that for me.

Credit for the piece goes to either the Bloomberg graphics department or the Goldman Sachs graphics department. Not sure.