Sometimes in the course of my work I stumble across graphics and work that I previously missed. In this case I was seeking a post about one of my favourite infographics, but it turned out I’ve never posted about it and so I will have to rectify that someday. However in my searching, I came upon an article from the New York Times last year where they wrote about research from MIT that compared the carbon dioxide emissions—bad for the environment and climate—per mile to the average monthly cost of a wide range of 2021 vehicles. The important distinction here is that average monthly cost is not the sticker price of a vehicle, but rather the sticker price plus lifetime operating costs. (For their analysis, the authors assumed a 15-year lifespan and 13,000 miles driven per year.)
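For anyone who wants to see how that kind of figure hangs together, here is a minimal Python sketch. It only assumes what is stated above: sticker price plus lifetime operating costs, spread over a 15-year lifespan at 13,000 miles per year. The function names and the dollar amounts are hypothetical, and the MIT researchers' actual model may well account for more than this.

```python
def lifetime_miles(miles_per_year: int = 13_000, lifespan_years: int = 15) -> int:
    """Total miles driven over the assumed lifespan of the vehicle."""
    return miles_per_year * lifespan_years


def average_monthly_cost(sticker_price: float,
                         lifetime_operating_cost: float,
                         lifespan_years: int = 15) -> float:
    """Sticker price plus lifetime operating costs, averaged per month.

    A simplification: no financing, discounting, or resale value.
    """
    months = lifespan_years * 12
    return (sticker_price + lifetime_operating_cost) / months


# Hypothetical vehicle, not taken from the study's data.
example = average_monthly_cost(sticker_price=30_000, lifetime_operating_cost=42_000)
print(lifetime_miles(), example)
```

The point of the sketch is simply that a cheap sticker price can be swamped by 180 months of running costs, and vice versa.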
Why is this so important? It’s pretty simple, really. In the United States, vehicle emissions are the largest source of carbon emissions. And the vast majority of that is due to passenger vehicles. If we as a society want to get serious about reducing our carbon footprint, the biggest changes we need to make are reducing our amount of driving, moving more people into mass transit, or switching out people’s gas-powered vehicles for electric vehicles.
The New York Times turned their work into a really nice static datagraphic. It is static, so there is no real interactivity if you want to compare your vehicle to others. However, the designers did choose some popular models and identified some of the key outliers.
The designers group the cars, represented by dots, into colour fields. These do a good job of showing how there is overlap between the different types of vehicles. Not all hybrid and plug-in vehicles are cheaper or even less CO2 emitting than some gas-powered vehicles, typically your smaller compacts and hatchbacks. Each colour field is linked to a textual annotation that also functions as a legend.
That alone is very helpful in understanding the differences, subtle and not-so-much, between the types of vehicles. Later on in the article the designers also used a scatter plot of a narrower set of data to compare a select set of vehicles.
Here we can see that one cannot simply assume that all electric vehicles are cheaper long-term than their gas-powered compatriots. The Nissan Altima, whilst emitting more CO2, compares favourably with the Tesla Model 3 in both the long-term cost and the upfront sticker price.
Despite finding this article a year and a half late, we can tie this to current events in that President Biden’s climate bill creates tax credits for electric vehicles. While the bill is perhaps not as significant as many would like, it is remarkable for still being a lot of money devoted to reducing our emissions. And when it comes to electric vehicles, one of the key components is the creation of tax credits. These would help mitigate those upfront sticker costs of electric vehicles. Because whilst they may generally be cheaper in the long-run, you still need to put up more money than their conventionally-powered alternatives either as lump sums or down payments. And with interest rates rising, what you need to cover via an auto loan will become more expensive.
Overall this is a really nice piece. Should I ever need to buy another vehicle, I would love to see this as a resource available to the general public. Unfortunately it only compares 2021 vehicles. And it does make me wonder where my 2005 vehicle compares. Probably not too terribly favourably.
I had something else for today, but this morning I opened the door and found my morning paper. Nothing terribly special. No massive headline. No large front-page graphic. See what I mean?
But then as I bent down to pick it up, I spotted a little tree map. But it turned out it wasn’t a tree map. It was a rectangle, largely, but it was actually a county map of Kansas. It was so small it fit within a single column.
The map showed those counties that had a majority vote in favour of keeping abortion rights. And then those counties that also voted for Trump in 2020 were outlined in orange—a good colour pairing. Turned out a number of counties did.
Without wading into the politics of it, because that’s a separate article, this was a great little map. It didn’t need to be crazy complicated or even large.
Credit for the piece goes to the New York Times graphics department.
For my readers in the northern hemisphere, which is the vast majority of you, we are in the middle of meteorological summer, the dog days. And whilst my UK and Europe readers continue to bake under temperatures greater than 40ºC (104ºF), the northeast United States and Philadelphia in particular is looking at a heatwave starting today that’s forecast to peak at a temperature of 38ºC (100ºF) this weekend and a heat index reaching 41ºC (106ºF) tomorrow.
Not cool.
Yesterday we examined a completely different topic, property tax increases in Philadelphia, but we contrasted that work with a heat index map from the New York Times. With the heatwave beginning this afternoon, however, it seemed apropos to revisit that contrasting article.
It begins with the map that we looked at yesterday. Of course yesterday was Tuesday. Today is Wednesday, and so you can already compare these two maps to see how and where the heat has shifted. Spoiler: the Southeast and Midwest.
It does so with a nice simple three-colour unidirectional spectrum from a light yellow to a burnt orange. And you can see the orange spreading up from the Gulf Coast and along the southeastern Atlantic Seaboard.
For those not familiar, the heat index is basically what the air “feels like” taking into consideration the actual temperature and the relative humidity in the air. Humans cool themselves via perspiration and when the air is excessively humid our ability to perspire decreases and thus the body begins to run hotter. Warmer temperatures allow the atmosphere to increase the amount of moisture it can contain and you can see all that Gulf and subtropical moisture carrying itself into the hot air moving up from the south.
Very not cool.
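For the curious, the “feels like” idea described above can be approximated in code. The sketch below uses the Rothfusz regression that the US National Weather Service publishes for the heat index, working in degrees Fahrenheit and percent relative humidity. I am assuming nothing about how the Times computed its map; the function names are my own, the regression is only meant for hot, humid conditions (roughly a heat index of 80ºF and up), and the NWS applies further adjustments at the extremes that are left out here.

```python
def c_to_f(temp_c: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return temp_c * 9 / 5 + 32


def heat_index_f(temp_f: float, rel_humidity_pct: float) -> float:
    """Approximate heat index in ºF using the Rothfusz regression.

    temp_f is the air temperature in ºF; rel_humidity_pct is 0-100.
    The NWS low/high humidity adjustments are deliberately omitted.
    """
    t, rh = temp_f, rel_humidity_pct
    return (-42.379
            + 2.04901523 * t
            + 10.14333127 * rh
            - 0.22475541 * t * rh
            - 0.00683783 * t * t
            - 0.05481717 * rh * rh
            + 0.00122874 * t * t * rh
            + 0.00085282 * t * rh * rh
            - 0.00000199 * t * t * rh * rh)


# Same hypothetical air temperature, two humidity levels.
print(round(heat_index_f(90, 50), 1), round(heat_index_f(90, 35), 1))
```

At the same air temperature, the more humid case comes out several degrees hotter, which is the whole reason the heat index and the thermometer can move in different directions.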
The piece also offers a look at the forecast for the heat index, showing the next six days. These small multiples allow the reader to see the geographic progression of the heat. Whereas today will be particularly bad for parts of the Midwest in southern Illinois and Indiana, tomorrow will see the worst for the Eastern Seaboard. Luckily the heat index retreats a bit, though as I noted above, the temperatures will continue to rise until Sunday, meaning higher temperatures, but lower relative humidity. For Philadelphia in particular we are talking about 50% relative humidity tomorrow and only 35% on Sunday. That makes a big difference.
Overall this is a great piece despite the content.
Personally, I just can’t wait until summer.
Credit for the piece goes to Matthew Bloch, Lazaro Gamio, Zach Levitt, Eleanor Lutz, and John-Michael Murphy.
The other day I was reading an article about the coming property tax rises in Philadelphia. After three years (has anything happened in those three years?) the city has reassessed properties and rates are scheduled to go up. In some neighbourhoods by significant amounts. I went down the related story link rabbit hole and wound up on a Philadelphia Inquirer article I had missed from early May that included a map of just where those increases were largest. The map itself was nothing crazy.
We have a choropleth with city zip codes coloured by the percentage increase. I was thrown for a bit of a loop as I immediately perceived the red representing lower values and green higher values, the standard green to red palette. But given that higher values are “bad”, I can live with red representing bad and sitting at the top of the spectrum.
I filed it away to review later, but when I returned I visited on my mobile phone. And what I saw broadly looked the same, but there was a disconcerting difference. Take a look at the legend.
You can see that instead of running vertically like it did on the desktop, now the legend runs horizontally across the bottom. In and of itself, that’s not the issue. Though I do wonder if this particular legend could have still worked in roughly the same spot/alignment given the geographic shape of Philadelphia along the Delaware River.
Rather look at the order. We go from the higher, positive values on the left to the negative, lower values on the right. When you read the legend, this creates some odd jumps. For example, we move from “+32% to +49%” then to “+15% to +31%”. We would normally say something to the point of the increase bins moving from “+15% to +31%” then to “+32% to +49%”. In other words, the legend itself is a continuum.
The fix for this would be to simply flip the running order of the legend. Put the lower values on the left and then step up to the right. For a quick comparison, I visited the New York Times website and pulled up the first graphic I could find that looked like a choropleth. Here we have a map of the dangerous temperatures across the United States.
Note how here the New York Times also runs their legend horizontally below the graphic. But instead of running high-to-low like in the Inquirer, the Times runs low-to-high, making for a more natural and intuitive legend.
This kind of simple ordering change would make the Inquirer’s map that much better.
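The flip described above is trivial to express in code, which is part of why it is frustrating when it gets missed. Here is a minimal Python sketch that sorts legend bins by their lower bound so they read low-to-high, left to right. The two positive bins are the ones quoted from the Inquirer legend; the third bin and all of the names are hypothetical stand-ins, since I am not reproducing the full legend here.

```python
from typing import NamedTuple


class LegendBin(NamedTuple):
    low: int   # lower bound of the bin, in percent
    high: int  # upper bound of the bin, in percent


def label(b: LegendBin) -> str:
    """Format a bin with explicit signs, e.g. '+15% to +31%'."""
    return f"{b.low:+d}% to {b.high:+d}%"


def legend_left_to_right(bins: list[LegendBin]) -> list[str]:
    """Order bins from lowest to highest so the legend reads as a continuum."""
    return [label(b) for b in sorted(bins, key=lambda b: b.low)]


# High-to-low order, as on the mobile legend; the last bin is hypothetical.
mobile_order = [LegendBin(32, 49), LegendBin(15, 31), LegendBin(-10, 14)]
print(legend_left_to_right(mobile_order))
```

Sorting on the lower bound rather than on the label text matters, because the strings themselves would sort alphabetically and put the signs in the wrong order.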
Credit for the Inquirer piece goes to Kasturi Pananjady and John Duchneskie.
Credit for the Times piece goes to Matthew Bloch, Lazaro Gamio, Zach Levitt, Eleanor Lutz, and John-Michael Murphy.
Editor’s note: I was having some technical issues last week. This was supposed to post last week.
Editor’s note two: This was supposed to go up on Monday. Still didn’t. Third time’s the charm?
Yesterday I wrote about a piece from the New York Times that arrived on my doorstep Saturday morning. Well a few mornings earlier I opened the door and found this front page: a map of the western United States highlighting the state of New Mexico.
Unlike the graphic we looked at yesterday, this graphic stretched down the page and below the fold, not by much, but still notably. The maps are good and the green–red spectrum passes the colour blind test. How the designer chose to highlight New Mexico is subtle, but well done. As the temperature and precipitation push towards the extreme, the colours intensify and call attention to those areas.
Also unlike the graphic we looked at yesterday, this piece contained some additional graphics on the inside pages.
These are also nicely done. Starting with the line chart at the bottom of the page, we can contrast this to some of the charts we looked at yesterday.
Here the designer used axis lines and scales to clearly indicate the scale of New Mexico’s wildfire problem. Not only can you see that the number of fires detected has spiked far above the number in previous years back to 2003, but the speed at which they have occurred is also noticeably faster than in most years. The designer also chose to highlight the year in question and then add secondary importance to two other bad years, 2011 and 2012.
The other graphics are also maps like on the front page. The first was a locator map that pointed out where the fires in question occurred. Including one isn’t much of a surprise, but what this does really nicely is show the scale of these fires. They are not an insignificant amount of area in the state.
Finally we have the main graphic of the piece, which is a map of the spread of the Calf Canyon and Hermits Peak fire, which was two separate fires until they merged into one. The article does a good job explaining how part of the fire was actually intentionally set as part of a controlled burn. It just became a bit uncontrolled shortly thereafter.
This reminded me of a piece I wrote about last autumn when the volcano erupted on La Palma. In that I looked at an article from the BBC covering the spread of the lava as it headed towards the coast. In that case darker colours indicated the earlier time periods. Here the Times reversed that and used the darker reds to indicate more recent fire activity.
Overall the article does a really nice job showing just what kind of problems New Mexico faces not just now from today’s environmental conditions, but also in the future from the effects of climate change.
Credit for the piece goes to Guilbert Gates, Nadja Popovich, and Tim Wallace.
Friday the Bureau of Labor Statistics published the data on the jobs facet of the American economy. Saturday morning I woke up and found the latest New York Times visualisation of said jobs report waiting for me at my door. The graphic sat above the fold and visually led the morning paper.
We have a fairly simple piece here, in a good way. Two sections comprise the graphic. The first uses a stacked bar chart to detail the months wherein the US economy lost jobs during the previous two and a half years. We can take a closer look in this second photo that I took.
Here we can see the stacked bars pile up with the most recent bars to the right. Some of the larger bars have labels stating the number of jobs either lost (top) or gained (bottom). I’m not normally a fan of stacked bar charts, because they don’t allow a reader to easily discern like-for-like changes. In this instance, the goal is to show how close all the little bits have come towards making up the three negative bars. Where I take issue is that I would prefer the designers used some sort of scale to indicate even a rough sense of how many jobs the various bars represent.
That issue crops up again to a slightly lesser degree with the bottom set of graphics. These compare the growth of hourly earnings and inflation both from February 2020. During the first few months of the pandemic and its recession, you can see earnings for those most directly impacted by shutdowns drop. But there is no negative scale accompanying the positive scale and that makes it difficult to determine just how far earnings fell for those in, say, leisure and hospitality.
The second part of the graphic works overall, however it’s just some of the finer design details that are missing and take away from the graphic’s overall effectiveness.
This all fits part of a larger trend in data visualisation that I’ve been noticing the last few months. Fewer charts seem to be using axes and scales. It’s not a good thing for the field. Maybe some other day I’ll write some things about it.
For this piece, though, we have an overall solid effort. Some different design decisions could have made the piece clearer and more effective, but it still does the job.
Credit for the piece goes to Ben Casselman, Ella Koeze, and Bill Marsh.
Yesterday I focused on the big graphic from the New York Times that crossed the full spread of the front/back page. But the graphic was merely the lead graphic for a larger piece. I linked to the online version of the article, but for this post I’m going to stick with the print edition. The article consists of a full-page open then an entire interior spread, all in limited colour. The remainder of the extensive coverage consists of photo essays and interviews that understandably attempt to humanise the data points, after all, each dot from yesterday represented one individual, solitary, human being. That is an important element of a story like this and other national and international tragedies, but we also need to focus on the data and not let the emotion of the story overwhelm our rational and logical analysis.
From a data visualisation standpoint the first page begins simply enough with a long timeline of the Covid-19 pandemic charting the number of absolute deaths each day. As we looked at yesterday, the absolute deaths tell part of the story. But if we were to have looked at the number of absolute cases in conjunction with the deaths, we could also see how the virus has thus far evolved to be more transmissible but less lethal. Here the number of daily deaths from Omicron surpassed Delta, but fell short of the winter peak in early 2021. But the number of cases exploded with Omicron, making its mortality rate lower. In other words, far more people were getting sick, but far fewer were dying.
An interesting note is that if you take a look at the online version, there the designers chose a more stylised approach to presenting the data.
Here they kept the dot approach and simply stacked and reordered the dots. However, I presume for aesthetic reasons, they kept the stacked dots loose and dropped all the axis lines because it does make for a nice transition from the map to this chart. But they also dropped all headings and descriptors that tell the reader just what they are looking at. These decisions make the chart far less useful as a tool to tell the data-driven element of the story.
There are three annotations that label the number of deaths in New York, the Northeast, and the rest of the United States. But what does the chart say? When are the endpoints for those annotations? And then you can compare the scale of the y-axis of this chart and compare it to the printed version above. A more dramatic scale leads to a more dramatic narrative.
This sort of visual style of flash and fancy transitions over the clear communication of the data is why I find the print piece more compelling and more trustworthy. I find the online version still useful, but far more lacking in terms of information design.
The interior spread is where this article shines.
From an editorial design standpoint, the symmetry works very well here. It’s a clear presentation and the white space around the graphic blocks lets that content shine as it should in this type of story. Collectively these pieces do a great job telling the story of the pandemic thus far across the nation. The graphics do not need a lot of colour and make do with sparse flash. Annotations call the reader’s attention to salient points and outliers.
From a content standpoint, I would be particularly curious if we have robust data for deaths by education level. Earlier this year I recall reading news about a study that said education best correlated to Covid cases, and I would be curious to see if that held true for deaths. Of course these charts do a great job of showing just how effective the vaccines were and remain. They are the best preventative measure we have available to us.
Here I disagree with the design decision of how to break down the states into regions. The Census Bureau breaks down the United States into four regions using the same names as in the graphic above. However, if you look closely at the inset map, you will see that Delaware, Maryland, and West Virginia in particular are included as part of the Northeast. (I cannot tell if the District of Columbia is included as part of the Northeast or South.)
Now compare that to the Census Bureau’s definition:
If you ask me to include Delaware and Maryland as part of the Northeast, well, if you’re selling it, I’ll buy it. After all, just because the Census Bureau defines the United States this way does not mean the New York Times has to. Both states are connected to the Northeast Corridor via Amtrak and I-95 and are plugged into the Megalopolis economy. Maybe the Potomac should be the demarcation between Northeast and South. But I struggle to understand West Virginia. Before you go and connect it to the Northeast, I would argue that West Virginia has far more in common with the Midwest geographically, economically, and culturally.
More critically, given this issue, it strikes me as a serious problem when the online version of the chart—with the aforementioned issues—does not even include the little inset to highlight this at best unusual regional definition.
And so while I have reservations about the data—how would the data have looked if the states were realigned?—the design of the line charts overall is good.
Again, I am talking about the print version, not that online graphic. I would argue that the above screenshot is barely even a chart and more “data art” or an illustration of data. Consider here, for example, that for the South we have that muted slate blue for the dots, but the spacing and density of the dots leads to areas of lighter slate and darker slate. But a lighter slate means more space between stacked dots and darker slate means a more compact design. A lighter colour therefore pushes the “edge” of the line further up the y-axis and artificially inflates its value, not that we can understand what that value is as the “chart” lacks any sort of y-axis.
Finally the print piece has a set of small multiples breaking down deaths by income in the three largest American cities: New York, Los Angeles, and Chicago. These are just great little charts showing the correlation between income and death from Covid, organised by Zip code.
But this also serves as a stark reminder of just how much better the print piece is over the online version. Because if we take a look at a screenshot from the online article, we have a graphic that addresses all the issues I pointed out earlier.
I am left to wonder why the reader of the online version does not have access to this clearer and more accurate representation of the data throughout the piece.
To me this article is a great example of when the print piece far exceeds that of the online version. Content-wise this is a great story that needed to be told this weekend, but design wise we see a significant gap in quality from print to online. Suffice it to say that on Sunday I was very glad I received the print version.
Credit for the piece goes to Sarah Almukhtar, Amy Harmon, Danielle Ivory, Lauren Leatherby, Albert Sun, and Jeremy White.
This past weekend the United States surpassed one million deaths due to Covid-19. To put that in other terms, imagine the entire city of San Jose, California simply dead. Or just a little bit more than the entire city of Austin, Texas. Estimates place the number of those infected at about 80 million. Back of the envelope maths puts that fatality rate at 1.25%. That’s certainly lower than earlier versions of the virus, which has evolved to be more transmissible, but thankfully less lethal than its original form.
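That back-of-the-envelope figure is easy to reproduce. A minimal Python sketch, using only the two round numbers above (one million deaths, roughly 80 million infections); the function name is my own, and this is a crude ratio rather than a proper epidemiological estimate.

```python
def crude_fatality_rate(deaths: int, infections: int) -> float:
    """Deaths divided by estimated infections, as a fraction."""
    return deaths / infections


rate = crude_fatality_rate(deaths=1_000_000, infections=80_000_000)
print(f"{rate:.2%}")
```

Because the infection count is itself an estimate, the result is only as good as that denominator.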
Sunday morning I opened the door to my flat and found the Sunday edition of the New York Times waiting for me with a sobering graphic not just above the fold, nor across the front page. No, the graphic—a map where each dot represents one Covid-19 death—wrapped around the entire paper.
You don’t need to do much more here. Black and white colour sets the tone simply enough. Of course, a bit more critically, these maps mask one of the big issues with the geographic spread of not just this virus but many other things: relatively few people live west of the Mississippi River.
Enormous swathes of the plains and Rocky Mountains have but few farmers and ranchers living there. Most of the nation’s populous cities are along the coast, particularly the East Coast, or along rivers or somewhat arbitrary transport hubs. You can see those because this map does not actually plot the locations of individual deaths, but rather fills county borders with dots to represent the deaths that occurred within those limits. That’s why, particularly west of the Mississippi, you see square-shaped concentrations of deaths.
A choropleth map that explores deaths per capita, that is after adjusting for population, shows a different story. (This screenshot comes from the New York Times’ data centre for Covid-19.)
The story here is literally less black and white as here we see colours in yellows to deep burnt crimsons. Whilst the big map yesterday morning concentrated deaths in the Northeast, West Coast, and around Chicago we see here that, relative to the counties’ populations, those same areas fared much better than counties in the plains, Midwest, and Deep South.
A quick scan of the Northeast and Mid-Atlantic states shows that only one county, Juniata in Pennsylvania, fell into the two worst deaths per capita bins—the deeper reds. Juniata County sits squarely in the middle of Pennsyltucky or Trumpsylvania, where Covid countermeasures were not terribly popular. No other county in the region shares that deep red.
Look to the southeast and south, however, and you see lots of deep and burnt crimsons dotting the landscape. This doesn’t mean people didn’t die in the Northeast, because of course they did. Rather, a greater percentage of the population died elsewhere when, as the policies enacted by the Northeast and West Coast show, they didn’t need to.
After all, injecting bleach was never a good idea.
I will try to get to my weekly Covid-19 post tomorrow, but today I want to take a brief look at a graphic from the New York Times that sat above the fold outside my door yesterday morning. And those who have been following the blog know that I love print graphics above the fold.
Of the six-column layout, you can see that this graphic gets three, in other words half-a-page width, and the accompanying column of text for the article brings this to nearly 2/3 the front page.
When we look more closely at the graphic, you can see it consists of two separate parts, a scatter plot and a line chart. And that’s where it begins to fall apart for me.
The scatter plot uses colour to indicate the vote share that went to Trump. My issue with this is that the colour isn’t necessary. If you look at the top for the x-axis labelling, you will see that the axis represents that same data. If, however, the designer chose to use colour to show the range of the state vote, well that’s what the axis labelling should be for…except there is none.
If the scatter plot used proper x-axis labels, you could easily read the range on either side of the political spectrum, and colour would no longer be necessary. I don’t entirely understand the lack of labelling here, because on the y-axis the scatter plot does use labelling.
On a side note, I would probably have added a US unvaccination rate for a benchmark, to see which states are above and below the US average.
Now if we look at the second part of the graphic, the line chart, we do see labelling for the axis here. But what I’m not fond of here is that the line for counties with large Trump shares significantly exceeds the maximum range of the chart. And then for the 0.5 deaths per 100,000 line, the dots mysteriously end short of the end of the chart. It’s not as if the line would have overlapped with the data series. And even if it did, that’s the point of an axis line, so the user can know when the data has exceeded an interval.
I really wanted to like this piece, because it is a graphic above the fold. But the more I looked at it in detail, the more issues I found with the graphic. A couple of tweaks, however, would quickly bring it up to speed.
Today I want to highlight a print article from the New York Times I received about two weeks ago. It’s been sitting in a pile of print pieces I want to sit down, photograph, and then write up. But as we begin to return to normal, I need my second dining room chair back because at some point I’ll have guests over.
The article in question examined the rates of Covid-19 vaccination across the United States. And on the front page, above the fold no less, we can compare the vaccination rates for Covid-19 to those of the 2019–2020 flu and if you unfold it to its full-length glory we can add in the 2009–2010 H1N1 swine flu outbreak.
First thing I want to address is the obvious. Look at those colours. Who loves a green-to-red scale on a choropleth? Not this guy. They are a pretty bad choice because of green-to-red colour blindness. (There are two different types as well as other types of colour blindness, but I’m simplifying here.) But here’s what happens when I pull the photo into Photoshop and test for it. (This is a screenshot, because I’m not aware of a means of exporting a proof image.)
You can still see the difference between the reds and greens. That’s good. And it’s because colour is complicated. In red-green colour blindness, the issue is sensitivity to picking up reds and greens. (Again, oversimplifying for the sake of a blog post.) Between those two colours in the spectrum we have yellow. To the other side of green we have blue.
So if a designer needs to use a red-green colour scheme (and any designer who has worked in data visualisation will undoubtedly have had a client asking for the map/chart/whatever to be in red and green), there’s a trick to making it work.
I don’t know if this is true, but growing up, I learned that green was the one colour the human eye evolved to distinguish the most. Now for a print piece like this, you are working in what we call CMYK space (cyan, magenta, yellow, and black). Red is a mixture of magenta and yellow. Green is a mixture of cyan and yellow. If you remember your school days, it’s similar to, but not the same as, mixing your primary colours. So if you need to make red and green work, what can you do? First, you can subtract a bit of yellow from your green, because that exists between red and green. But then, and this is why CMYK is different from your primary school primary colours, we can adjust the amount of magenta. Magenta is not a “pure” red; instead it’s kind of purplish, and that means it has some blue in it. Adding a little bit of magenta does add “red” into the green, but it also adds more blue to the blue present in the cyan. Now you can spend quite a bit of time tweaking these colours, but very quickly I can get these two options.
Great, you can still see them as both red and green. Your client is probably happy and probably accepts this greenish-blue as green, because we have that ability to distinguish so many types of green. But what about those with red-green colour blindness? Again, I can’t quite do a straight export, so the best is a screenshot, but we can compare those two options like so.
You can probably still tweak the green, but by going for that simple tweak, you can make the client happy—even though it’s still just better to avoid the red and green altogether—and still make the graphic work.
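To make the ink-mixing trick a little more concrete, here is a rough Python sketch. It uses the naive, profile-free CMYK-to-RGB conversion, which ignores real inks and paper entirely, so treat the output as illustrative only. The specific ink percentages are hypothetical; they are not the values I used in the swatches above.

```python
def cmyk_to_rgb(c: float, m: float, y: float, k: float) -> tuple[int, int, int]:
    """Naive CMYK (each 0.0-1.0) to 8-bit RGB, with no colour profile."""
    r = round(255 * (1 - c) * (1 - k))
    g = round(255 * (1 - m) * (1 - k))
    b = round(255 * (1 - y) * (1 - k))
    return r, g, b


# Textbook mixes: red is magenta plus yellow, green is cyan plus yellow.
red = cmyk_to_rgb(0.0, 1.0, 1.0, 0.0)
plain_green = cmyk_to_rgb(1.0, 0.0, 1.0, 0.0)

# Hypothetical tweak: pull some yellow out and add a touch of magenta,
# which nudges the green towards blue.
tweaked_green = cmyk_to_rgb(1.0, 0.2, 0.6, 0.0)

print(red, plain_green, tweaked_green)
```

In this simplified model, removing yellow raises the blue channel and adding magenta lowers the green channel, which is the shift towards a bluish green described above. A real print workflow would run through proper colour profiles, and you would still want to check the result with a colour blindness proof.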
There’s a bit more to say about the rest of the article, which has some additional graphics inside. But that’ll have to wait for another day. As will clearing down the pile of print pieces to share, because that keeps on growing.
Credit for the piece goes to Lazaro Gamio and Amy Schoenfield Walker.