datagraphic – Coffee Spoons

Datagraphics as Marketing Materials

I spent the last two weeks out of town, and my post for the Friday before didn’t happen because there was a fire at my building—I and my unit are fine—that knocked out internet for about 24 hours. But now I have returned.

One of the things I did was visit the city of Pittsburgh in western Pennsylvania. There I discovered the city has a World War II era submarine, the USS Requin, a Tench-class submarine that launched at the end of the war and saw no active combat. She was later preserved and arrived at the Carnegie Science Centre in Pittsburgh where she serves as a museum ship.

As I waited for the self-guided tour to begin, I spotted a small poster with some big numbers. Naturally I investigated and found it to be a marketing piece by PPG, a Pittsburgh-based paint and coatings company. The poster detailed the work that went into the preservation of the submarine’s exterior using PPG’s own paints and coatings.

We can see the large numbers clearly and to the piece’s credit the hierarchy works. What are we talking about? Three paints applied to the submarine in these quantities in this amount of time. The only factette not totally relevant is how many tourists annually visit the submarine.

Design wise, the poster does a nice job of dividing up its space into an attention-grabbing upper-half. After all, it grabbed my attention. The lower half then subdivides into three columns that speak to the aforementioned subjects. The last column then divides again into halves.

As marketing design goes, it’s not the most offensive. For example, we don’t have the gallon buckets sized or scaled differently. The designers used a restrained palette and kept a consistent typographic treatment.

Admittedly, I was a bit disappointed because I had thought it would be some facts or data about the submarine itself. But for what the piece is, I thought it did a nice job.

Credit for the piece goes to PPG’s graphics department.

Substandard Housing in Philadelphia

I took a holiday yesterday and headed down the street to the Philadelphia City Archives, which houses some of the oldest documents dating back to the founding of the colony. But I was there primarily to try and find deeds and property information for my ancestors as part of my genealogy work.

When I walked into the building—the archives moved a few years ago from an older building in University City into this new facility—an interactive exhibit confronted me immediately. Now I did not take the time to really investigate the exhibit, because I anticipated spending the entire day there and wanted to maximise my time.

But there was this one graphic that felt appropriate to share here on Coffeespoons.

Philadelphia’s population crested in the 1950 census; it would then decline continually until the 2010 census.

Like a lot of statistical graphics from the mid-20th century we have a single-colour piece because colour printing costs money. It makes use of a stacked bar chart to highlight the share of housing in the city that can be classified as substandard, i.e. dilapidated or without access to a private bath.

The designer chose to separate the nonwhite from the white population on different sides of the date labels, though the scale remains the same. I wonder what would have happened if the nonwhite bars sat immediately below the white bars within each year. That would allow for a more direct comparison of the absolute numbers of housing units.

That would then free up space for a smaller chart dedicated to a comparison of the percentages that are otherwise written as small labels. Because both the absolutes and percentages are important parts of the story here.

The white housing stock increased and the number of substandard units decreased in an absolute sense, leading to a strong decline in percentages.

But with nonwhite housing, the number of substandard units slightly increased, but with larger growth in the sheer number of nonwhite housing units overall, that shrank the overall percentage.

Put it all together and you have significant improvements in white housing, though in an absolute sense there still remain more substandard units for whites than nonwhites. Conversely, we don’t see the same improvements in housing for nonwhites. Rather the improvement from 45% to 35% is due more to the increase in housing units overall. You could therefore argue that nonwhite housing did not improve nearly as much as white housing between 1940 and 1950. Though we need to underline that and say there was indeed improvement.
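To make that arithmetic concrete, here is a minimal Python sketch of how a share can fall even while the count behind it rises. The unit counts are hypothetical, picked only so the shares land on the 45% and 35% mentioned above; they are not read off the chart, and `substandard_share` is just an illustrative helper name.

```python
def substandard_share(substandard_units: float, total_units: float) -> float:
    """Return the substandard share of a housing stock as a percentage."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return 100.0 * substandard_units / total_units


# Hypothetical counts (thousands of units), NOT taken from the chart.
# They are chosen only so the shares match the 45% and 35% in the post.
nonwhite_1940 = substandard_share(substandard_units=45, total_units=100)
nonwhite_1950 = substandard_share(substandard_units=49, total_units=140)

print(f"1940: {nonwhite_1940:.0f}%  1950: {nonwhite_1950:.0f}%")
```

In this made-up case the number of substandard units goes up from 45 to 49, yet the share drops ten points because the total grew faster. That is why I want both the absolutes and the percentages on the page.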

Anyway, I then went inside and spent several hours looking through deed abstracts. Not sure if those will make it into a post here, but I did have an idea for one over a pint at lunch afterwards.

Credit for the datagraphic goes to some graphics person for some government department.

Credit for the exhibit goes to Talia Greene.

Yep, Still Hotter

Like I said yesterday, I wanted to compare cities, surprise, Philadelphia vs. Chicago. And so with some extra time I was able to finish this graphic that took the data from Climate Central to compare the two cities.

What you can see below is that Philadelphia has seen more significant temperature growth in both summer highs and summer lows. And, importantly, the increase in low temperatures, i.e. nighttime, has been greater than that of daytime highs. That means that we have less of an opportunity to cool down after a hot summer day, adding stress to the system.

Chicago on the other hand has seen less overall growth, though it’s still present. And there too we see the same pattern of greater increases in low, i.e. nighttime, temperatures than of daytime highs.

It’s remarkable to think that the flat where I lived seven of my eight years in Chicago had no air conditioning unit in the bedroom, only in the living room. It was, of course, an older concrete building from the 1960s/70s when, as the chart above shows, nighttime temperatures didn’t really require air conditioning.

But like I said yesterday, I’m just glad I’ve been able to crank the air conditioning the last several days.

Credit for the piece is mine.

Hot and Not So Hot Graphics

Thankfully today’s forecast calls for cooler temperatures. Your author is not a fan of hot weather, which means being outside in summer is…less than ideal. It also means that the air conditioner runs frequently and on high for a few months. (Conversely, I can probably count on one hand the number of times I turned on the heat this winter.)

The problem is, the two biggest contributors to US carbon emissions? Heating/cooling and transport. In other words, heating your home in the winter, cooling it in the summer, and then driving your non-electric vehicle.

After the recent heatwave in New England, the Boston Globe examined the impact of the heatwave on the environment. The article led with the claim it used four charts to do so. I quibble with that distinction because this is a screenshot of the second graphic.

I mean, it’s not prose text. Rather, we have three factettes paired with illustrations. At the top of this post, I mentioned the impact of transport for a reason. In an ideal world, in order to get carbon emissions under control one of the changes we would need to see is getting people out of their personal automobiles and into mass transit. Subways and light rail are far cleaner and can actually be cheaper for households than car ownership. And so we should be encouraging their use and building more of them.

Look above and you’ll see an icon of a subway car. Except it’s not. The graphic/factette is actually talking about rail cars full of coal that transport fuel from mine to generating station. Those look more like this, from James St. James via Wikimedia Commons.

Small, subtle details matter. And so I’d propose a new icon that tries to capture the industrial coal train, ideally something that I spent more than five minutes on.

But it breaks the linkage between passenger train and coal train, which is not ideal for the purposes of an article highlighting the environmental impacts of US households.

That all said, the article did a really good job with the other graphics it used. My favourite was this chart, decidedly not a combination chart.

It looks at the correlation between high temperatures and energy usage. But, instead of lazily throwing the temperatures atop the bars, the designers more carefully placed them below the energy usage chart. The top chart should look familiar to those who have been following my Covid-19 charts, a daily number that then has the rolling seven-day average plotted above it to smooth out any one-day quirks. The designer then chose to highlight the heatwave in red.
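For anyone curious how that smoothing works, here is a small Python sketch of a seven-day rolling average. I am assuming a trailing window (each day averaged with the six before it); the article does not say whether the Globe used a trailing or a centred window, and the daily values below are made up.

```python
def rolling_mean(values, window=7):
    """Trailing rolling mean; None until a full window is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just slid out of the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


# Made-up daily readings with a one-day spike on the fourth day.
daily = [10, 12, 11, 30, 12, 10, 13, 11, 12]
smoothed = rolling_mean(daily)
print(smoothed)
```

The spike of 30 still counts, but it gets spread across seven days, so the smoothed line does not jump the way the daily bars do.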

For temperatures, I like the overall approach. But I wonder if a more nuanced approach could have taken the graph a step farther to excellent. Presently we have a single red line representing daily average high temperature. But in the plot above we use red to indicate the heat wave of early June, five consecutive days of temperatures in excess of 90°F. What if that line were black or grey or some neutral colour, and then only the heatwave was coloured in red? It would more clearly link the two together. And it avoids the trap of red implying heat, when you need to only go back to late May when the East Coast had early-spring-like temperatures near 50°F, decidedly not red on a temperature scale.

Overall, though, it’s refreshing to see a thoughtful approach taken here instead of the usual slapdash throw one chart atop the other.

And the rest of the article uses restrained, smart graphics as well. Bar charts and small multiples to capture air pollution and EMS calls. You should read the full article for the insights and the feedback loops we have.

After all, it’s not that the heating/cooling is itself the problem, especially since the removal of CFCs since the Montreal Protocol in 1987 that banned those pesky chemicals that harm the ozone layer—remember when that was the big environmental issue in the 1990s? The issue is how we generate the electricity that powers the heating/cooling systems—and if you want to use electric cars, whence comes their electric charge—as if we’re using coal plants, that just exacerbates the problem. But if we use carbon-less plants, e.g. nuclear, solar, or wind, we’re not generating carbon emissions.

Credit for the piece goes to John Hancock.

Inflating Areas

One trend people have begun to follow lately is that of rising prices for consumer goods. If you have shopped recently for things, you may have noticed that you have been paying more than you were just a few weeks ago. We call this inflation. The Bureau of Labour Statistics (BLS) tracks this for a whole range of goods. We call this the consumer price index (CPI).

Prices can vary wildly for some goods, most notably food and energy. For those of my readers who drive, recall how quickly petrol/gasoline prices can change. Because of that volatility, the Bureau of Labour Statistics strips out food and energy prices and the inflation that excludes food and energy is what we call Core CPI.

Lately, we have been seeing an increase in prices and inflation is on the rise. To an extent, this is not surprising. The pandemic disrupted supply chains and wiped out supplies and stores of goods. But with many people working remotely, many now have pent up savings they want to spend. But with low supply and high demand, basic economics suggests rising prices. As supplies increase in the coming months, however, the rise in prices will begin to cool off. In other words, most economists are not yet concerned and expect this spike in inflation to be passing in nature. But not everyone agrees.

Last week, the Washington Post had an article examining the cause of inflation for a number of industries. To do so, it used some charts looking at prices over the past two years. This screenshot is from the used car section.

I want to focus on the design of this graphic, though, not the content. The designers’ goal appears to be contrasting the inflation over the last year to that of the last two years. Easy peasy. Red represents one-year inflation and blue two-year.

Typically when you see a chart that looks like this, an area or filled line chart, the coloured area reflects the total value of the thing being measured. You can also use the colour to make positive/negative values clearer. In this case, neither of those things is happening.

Because the blue, for example, starts at the beginning of the time series and at the bottom of the chart, it looks like an enormous amount of consistent blue growth. And when the line runs into May 2020, we begin to see what appears as a stacked area chart, with the blue area increasing at the expense of the red.

Another way of reading it could be that the 29.7% and 29.3% increases equal the shaded areas, but that’s also problematic. If the shaded area locked to the baseline like you’ll see in a moment, I could maybe see that working, but at this point it just leaves me confused.

Now you can use the area fill to make it clear when a line rises above or dips below the baseline, in this case 0%. And I took that approach when I reimagined the chart as seen below.

What we do here is we set the bottom of the area fill to the baseline. Consequently, where the chart is filled above 0 we have positive inflation, and where it falls below the 0 line we have negative inflation, or deflation.
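As a rough sketch of that idea in Python: clamp the series against the baseline so that one copy holds only the part above 0 and the other only the part below. Those two lists are what you would hand to whatever charting tool you use as the top and bottom of the fills. The function name and the rates are hypothetical, not the data from the article.

```python
def split_at_baseline(values, baseline=0.0):
    """Split a series into the parts above and below a baseline.

    Each returned list has the same length as the input. Where the series
    is on the other side of the baseline, the list holds the baseline
    itself, so a fill between the list and the baseline has zero height.
    """
    above = [max(value, baseline) for value in values]
    below = [min(value, baseline) for value in values]
    return above, below


# Hypothetical yearly inflation rates in percent, for illustration only.
rates = [-1.5, -0.4, 0.0, 2.1, 5.0]
above, below = split_at_baseline(rates)
print(above)
print(below)
```

Because both fills are locked to the baseline, the shaded area can only ever mean one thing: how far the value sits from 0%, and on which side.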

We need to note here that the text in the original article talks about the monthly change in inflation, e.g. that used car prices have increased by 7.3% last month. That, however, is not what the chart looks at. Instead, the chart shows the change yearly, in other words, prices now vs last May. To an extent, the 29.7% increase is not terribly surprising given how terrible the recession was.
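The difference between the two measures is easy to show in a few lines of Python. This is only a sketch: the index values are invented (they are not BLS data), and `pct_change` is a helper I made up for the example. A lag of 1 gives the month-over-month change the text talks about; a lag of 12 gives the year-over-year change the chart plots.

```python
def pct_change(series, lag):
    """Percentage change of each value versus the value `lag` steps earlier."""
    if lag <= 0:
        raise ValueError("lag must be positive")
    out = []
    for i, value in enumerate(series):
        if i < lag:
            out.append(None)  # no earlier value to compare against
        else:
            previous = series[i - lag]
            out.append(100.0 * (value - previous) / previous)
    return out


# Invented monthly price index covering 13 months, for illustration only.
index = [100, 100, 101, 101, 102, 103, 104, 106, 108, 110, 113, 120, 129]

monthly = pct_change(index, lag=1)   # this month vs last month
yearly = pct_change(index, lag=12)   # this month vs the same month last year

print(f"monthly: {monthly[-1]:.1f}%  yearly: {yearly[-1]:.1f}%")
```

Same series, same final month, two very different numbers. A reader who carries the monthly figure from the text into a yearly chart will be comparing apples to oranges.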

Ultimately, I don’t see the value in the filled blue and red areas of the chart because I am left more confused. Does the reader need to see how far back one year and two years are from May 2021? Don’t the date labels do that sufficiently well?

This is just a weird article that left me scratching my head at the graphics. But read the text, it’s super informative about the content. I just wish a bit more work went into the graphics. There are some nice illustrations beginning each section, but I kind of feel that more time was spent on the illustrations than the charts.

Credit for the piece goes to Abha Bhattarai and Alyssa Fowers.

On a Line. Or Not.

Two weeks ago I was reading an article in the BBC that fact checked some of President Biden’s claims about the economy. Now, I wrote a post the other day about axis lines and their use in graphics. Axis lines help ground the user in making comparisons between bars, lines, or whatever, and the minimum/maximum/intervals of the data set.

I was reading the article and first came upon this graphic. It’s nothing crazy and shows job growth in the aggregate for the first three months of a presidential administration. A pretty neat comparison in the combination of the data. I like.

Pay attention to what you see here. There will be a quiz.

I don’t like the lack of grid lines for the axis, however. But, okay, none to be found.

I keep reading the article. And then a couple of paragraphs later I come upon this graphic. It looks at the monthly figures and uses a benchmark line, the red dotted one, to break out those after January 2021 when Biden took office.

But do you notice anything?

The lines for the y-axis are back!

The article had a third graphic that also included axis lines.

I don’t have a lot to say about these graphics in particular, but the most important thing is to try and be consistent. I understand the need to experiment with styles as a brand evolves. Swap out the colours, change the styles of the lines, try a new typeface. (Except for the blue, we are seeing different colours and typefaces here, but that’s not what I want to write about.)

First, I don’t know if these are necessarily style experiments. I suspect not, but let’s be charitable for the sake of argument. I would refrain from experimenting within a single article. In other words, use the lines or don’t, but be consistent within the article.

For the record, I think they should use the lines.

Another point I want to make is with the third graphic. You’ll note that, like I said above, it does use axis lines. But that’s not what I want to mention.

Instead I want to look at the labelling on the axes. Let’s start with the y-axis, the percentage change in GDP on the previous quarter. At the top of the chart we have 30%. As I’ve said before, you can see in the Trump administration, the bar for the initial Covid-19 rebound rises above the 30% line. It’s not excessive, I can buy it if you’re selling it.

But let’s go down below the 0-line. Just prior to the rebound we had the crash. Similarly, this extends just below the -30% line. But here we have a big space and then a heavy black line below that -30% line. It looks like the bottom line should be -40%, but scanning over to the left and there is no label. So what’s going on?

First, that heavy black line, why does it appear the same as the baseline or zero-growth line? The axis lines, by comparison, are thin and grey. You use a heavier, darker line to signify the breaking point or division between, in this case, positive and negative growth. Theoretically, you don’t need the two different colours for positive and negative growth, because the direction of the bar above/below that black line encodes that value. By making the bottom line the same style as the baseline, you conflate the meaning of the two lines, especially since there is no labelling for the bottom line to tell you what the line means.

Second, the heaviness of the line draws visual attention to it and away from the baseline, especially since the bottom line has the white space above it from the -30% line. Consider here the necessity of this line. For the 30% line that sets the maximum value of the y-axis, we have the blue bar rising above the line and the administration labels sit nicely above that line. There is no reason the x-axis labels could not exist in a similar fashion below the -30% line. If anything, this is an inconsistency within the one chart, let alone the one graphic.

Third, is it -40%? I contend the line isn’t necessary and that if the blue bar pokes above the 30% line, the orange bar should poke below the -30% line. But, if the designer wants to use a line below the -30% line, it should be labelled.

Finally, look at the x-axis. This is more of a minor quibble, but while we’re here…. Look at the intervals of the years. 2012, 2014, 2016, every two years. Good, makes sense. 2018. 20…21? Suddenly we jump from every two years to a three-year interval. I understand it to a point, after all, who doesn’t want to forget 2020. But in all seriousness, the chart ends at 2021 and you cannot divide that evenly. So what is a designer to do? If this chart had less space on the x-axis and the years were more compressed in terms of their spacing, I probably wouldn’t bring this up.

However, we have space here. If we kept to a two-year interval system, I would introduce the labels as 2012, but then contract them with an apostrophe after that point. For example, 2014 becomes ’14. By doing that, you should be able to fit the two-year intervals in the space as well as the ending year of the data set.
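Here is how that labelling rule might look as a short Python sketch. It is my own illustration, not anything the BBC uses: keep the two-year interval, always add the final year of the data, write the first label in full and contract the rest with an apostrophe.

```python
def year_tick_labels(start, end, step=2):
    """Return (years, labels) for an x-axis running from start to end.

    Ticks fall every `step` years, the final year is always included,
    the first label is written in full and the rest are contracted.
    """
    years = list(range(start, end + 1, step))
    if years[-1] != end:
        years.append(end)  # make sure the data's last year gets a tick
    labels = []
    for i, year in enumerate(years):
        if i == 0:
            labels.append(str(year))
        else:
            # \u2019 is the typographic apostrophe, as in ’14.
            labels.append("\u2019" + f"{year % 100:02d}")
    return years, labels


years, labels = year_tick_labels(2012, 2021)
print(years)
print(labels)
```

The contracted labels are roughly half the width of the full years, which is what should leave room for both 2020 and the closing 2021 without breaking the two-year rhythm.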

Overall, I have to say that this piece shocked me. The lack of attention to detail, the inconsistency, the clumsiness of the design and presentation. I would expect this from a lesser organisation than the BBC, which for years had been doing solid, quality work.

The first chart is conceptually solid. If Biden spoke about job creation in the first three months of the administration vs. his predecessor, aggregate the data and show it that way. But the presentation throughout this piece does that story a disservice. I wish I knew what was going on.

Credit for the piece goes to the BBC graphics department.

Israel’s Palestine Trilemma

In what feels like forever ago, I wrote about the trilemma facing the British government as it related to Brexit. Brexit presented Westminster with three choices, of which they could only make two as all three were, together, impossible. Once made, those two choices determined the outcome of Brexit. For better or worse, Prime Minister Boris Johnson made that decision.

We can apply the same trilemma system to Israel in relation to the circumstances of Israel and Palestine. I will skip the long history lesson here. Israel faces some tough decisions. I will also skip the critique of Israeli government policy over the last few decades that brought us to this point. Because here is where we are.

Israel needs to balance three things: the importance of being a representative democracy, of being a Jewish state, and of security control of Gaza and the West Bank for the security of Israel. Here is how that looks.

If Israel wants to remain an ethnically Jewish state—I’m going to also skip the discourse about Jewishness as an ethnicity, though I will point to Judaism as an ethnic religion as opposed to the other Abrahamic universal religions of Christianity and Islam—and it wants to retain security control over Palestine, i.e. the Gaza Strip and the West Bank, you have what we have today.

If Israel wants to remain an ethnically Jewish state and it wants to be a representative democracy, you get the Two-State Solution. In that scenario, Palestine, again conceived as Gaza and the West Bank, becomes a fully-fledged independent and sovereign state. Israel remains Jewish and Palestine becomes Arab. But, Israel loses the ability to police and militarily control Gaza and the West Bank, instead relying on its newfound partners in the Palestinian Authority or whatever becomes the executive government of Palestine. This has long been the goal of Middle East peace plans, but over the last decade or so you hear Two-State Solution less and less frequently.

Finally, if Israel wants to be a representative democracy, in which case both Jewish citizens and Arab–Israelis and Palestinians all have the right to full political representation without reservations, e.g. the loyalty oath, and it wants to maintain security control over Gaza and the West Bank, you get something I don’t hear often discussed outside foreign policy circles: a non-Jewish, multi-ethnic Israel. Today Arab–Israelis and Palestinians nearly—if not already—outnumber Jewish Israelis. In a representative democracy, it would be near impossible to maintain an ethnically Jewish state in a country where the Jewish population is in the minority. Consequently, Israel would almost certainly cease being a Jewish state.

One can tinker around the edges, e.g. what are the borders of a Two-State Solution West Bank, but broadly the policy choices above determine the three outcomes.

The outstanding question remains, what future does Israel want?

Credit for the piece is mine.

The May Jobs Report

Last Friday, the government released the labour statistics from April and they showed a weaker rebound in employment than many had forecasted. When I opened the door Saturday morning, I got to see the numbers above the fold on the front page of the New York Times.

What I enjoyed about this layout was that the graphic occupied half the above the fold space. But, because the designers laid the page out using a six-column grid, we can see just how they did it. Because this graphic is itself laid out in the column widths of the page itself. That allows the leftmost column of the page to run an unrelated story whilst the jobs numbers occupy 5/6 of the page’s columns.

If we look at the graphic in more detail, the designers made a few interesting decisions here.

First, last week I discussed a piece from the Times wherein they did not use axis labels to ground the dataset for the reader. Here we have axis labels back, and the reader can judge where intervening data points fall between the two. For attention to detail, note that under Retail, Education and health, and Business and professional services, the “illion” in -2 Million was removed so as not to interfere with legibility of the graphic, because of bars being otherwise in the way.

My issue with the axis labels? I have mentioned in the past that I don’t think a designer always needs to put the maximum axis line in place, especially when the data point darts just above or below the line. We see this often here, for example Construction and Manufacturing both handle it this way for their minimums. This works for me.

But for the column above Construction, i.e. State and local government and Education and health, we enter the space where I think the graphic needs those axis lines. For Education and health, it’s pretty simple, the red losses column looks much closer to a -3 million value than a -2 million value. But how close? We cannot tell without an axis line.

And then under State and local government we have the trickier issue. But I think that’s also precisely why this could use some axis lines. First, almost all the columns fall below the -1 million line. This isn’t the case of just one or two columns, it’s all but two of them. Second, these columns are all fairly well down below the -1 million axis line. These aren’t just a bit over, most are somewhere between half to two-thirds beyond. But they are also not nearly as close to -2 million as the Education and health columns were to -3 million.

So why would I opt to have an axis line for State and local governments? The designers chose this group to add the legend “Gain in April”. That could neatly tuck into the space between the columns and the axis line.

Overall it’s a solid piece, but it needs a few tweaks to improve its legibility and take it over the line.

Credit for the piece goes to Ella Koeze and Bill Marsh.

Off the Axis

Two Fridays ago, I opened the door and found my copy of the New York Times with a nice graphic above the fold. This followed the announcement from the White House of aggressive targets to reduce greenhouse gas emissions.

In general, I love seeing charts and graphics above the fold. As an added bonus, this set looked at climate data.

Need to see more downward trending lines.

But there are a few things worth pointing out.

First from a data side, this chart is a little misleading. Without a doubt, carbon dioxide represents the greatest share of greenhouse gasses; according to the US Environmental Protection Agency (EPA) it was 76% in 2010. Methane contributes the next largest share at 16%. But the labelling should be a little clearer here. Or, perhaps lead with a small chart showing CO2’s share of greenhouse gasses and from there, take a look at the largest CO2 emitters per person.

Second, where are the axis labels?

I will probably have more on this at a later date, but neither the bar chart nor the line charts have axis labels. Now the designers did choose to label the beginning value for the lines and the bars, but this does not account for the minimums or maximums. (It also assumes that the bottom of the lines is zero.)

For example, we can see that China began 1990 with emissions at 3.4 billion metric tons. The annotation makes clear that China’s aggregate emissions surpassed those of the US in 2004. But where do they peak? What about developing countries?

If I pull out a ruler and draw some lines I can roughly make some height comparisons. But, an easier way would be simply to throw some dotted lines across the width of the page, or each line chart.

This piece takes a big swing at presenting the challenge of reducing emissions, but it fails to provide the reader with the proper—and I think necessary—context.

Credit for the piece goes to Nadja Popovich and Bill Marsh.

Covid-19: A Global Update

I’ve been trying to limit the number of Covid-19 visualisations I’ve been covering. But on Sunday this image landed at my front door, above the fold on page 1 of the New York Times. And it dovetails nicely with our story about the pandemic’s impact on Pennsylvania, New Jersey, Delaware, Virginia, and Illinois.

Some not so great looking numbers across the globe.

For most of 2020, the United States was one of the worst hit countries as the pandemic raged out of control. Since January 2021, however, the United States has slowly been coming to grips with the virus and the pandemic. Its rate is now solidly middle of the pack—no longer is America first.

And if you compare the chart at the bottom to those that I’ve been producing, you can clearly see how our five states have really gotten this most recent wave under control to the point of declining rates of new cases.

However, you’ve probably heard the horror stories from India and Brazil where things are not so great. It’s countries like those that account for the continual increase in new cases at a global level.

Credit for the piece goes to Lazaro Gamio, Bill Marsh, and Alexandria Symonds.