For those who don’t know, China currently engages in ethnocide, or cultural genocide, in its western province of Xinjiang, a province with a majority of its population being Uighurs, a Turkic Muslim people. Ethnocide is a term I prefer over genocide, as genocide more commonly refers to practices like those in Nazi Germany or 1990s Rwanda and Bosnia wherein people are systematically executed and murdered. Ethnocide leaves a people alive but aims to destroy and extinguish their culture, ultimately replacing it with that of another. In this case, Beijing’s policy is to strip the Uighurs of their Muslim culture and identity and replace it with loyalty to China and the Chinese Communist Party.
The BBC have just published what they call the Xinjiang Police Files, files and data hacked off of Chinese government servers and then handed over to a US-based expert on Xinjiang and the atrocities there. That person then handed copies to the BBC, which has verified much of the content.
There is not much by way of data visualisation or information design, but the story is worth mentioning because maybe over one million people are being forcibly detained and “re-educated” by Beijing. One of the articles about the files, however, does have a small graphic of one of the “re-education camps”, i.e. prison, and details its design and the facilities therein.
Political liberalism and pluralism are messy. Often it means we hear and listen to things with which we disagree, sometimes vehemently. Freedom of speech, expression, and religion can make us feel uncomfortable, hurt our feelings, and even sick to our stomachs. But that is also the price of our liberty to speak, express, and pray ourselves. Because we only need to look to China to see what happens when a society or a government decides what is or isn’t acceptable speech (peaceful protests against the government), expression (growing out a beard), or religion (praying in a mosque). An authoritarian regime, an anti-liberal regime, will attempt to stifle, silence, and ultimately imprison those who go against the (Chinese Communist) party line.
1984 rings a little more true each year.
Credit for the piece goes to John Sudworth and the BBC’s Visual Journalism Team.
Last week the Washington Post published a nice long-form article about the troubles facing the Colorado River in the American and Mexican west. The Colorado is the river dammed by the Hoover and Glen Canyon Dams. It’s what flows through the Grand Canyon and provides water to the thirsty residents of the desert southwest.
But the river no longer reaches the ocean at the Gulf of California.
Why? Part drought, part population growth, and part economic activity. The article does a great job of exploring the issue and it does so through the occasional use of information graphics. This screenshot captures the storage levels of the two main reservoirs, Lake Mead and Lake Powell, created by the Hoover and Glen Canyon Dams, respectively. You may have heard of these recently because the water shortages presently affecting the region have brought the reservoirs to some of their lowest levels in years. And that means people have been finding all sorts of things.
But the graphic does a nice job of showing just how low things have gotten of late. Naturally I am curious what the data looks like on a longer timeline. Construction of Hoover Dam, of course, began during the administration of Herbert Hoover but was completed during the Franklin Roosevelt administration, which also renamed it Boulder Dam, though Congress reversed that change in 1947. Lake Powell came along three decades later and so the timelines would not be exactly the same, but I am curious all the same.
The overall article makes sparse use of the graphics and they occupy much less space in the design than the numerous accompanying photographs. The balance in terms of content works, but I would have preferred the charts and maps a bit larger.
Contrast this to what we explored last week in a New York Times piece, specifically the online version. There we saw graphics with no headers, data descriptors, axis labels, &c. Here we see the Washington Post was able to create a captivating piece but treat the data and information, and the reader, with respect. There are fewer graphics in this piece, but the way they were handled puts this leaps and bounds above the online version we looked at last week.
Credit for the piece goes to a lot of people, but the graphics specifically to John Muyskens. The rest of the credits go to the author Karin Brulliard and then just copying and pasting from the page: Editing by Amanda Erickson and Olivier Laurent. Photography by Matt McClain. Video by Erin Patrick O’Connor and Jesús Salazar. Video editing by Jesse Mesner-Hage and Zoeann Murphy. Graphics by John Muyskens. Graphics editing by Monica Ulmanu. Design and development by Leo Dominguez. Design editing by Matthew Callahan and Joe Moore. Copy editing by Susan Stanford. Additional editing by Ann Gerhart.
to be overturned by the Supreme Court, as seems likely, states have been busy passing laws to both restrict and expand abortion access. This article from FiveThirtyEight describes the statutory activity with the use of a small multiple graphic I’ve screenshot below.
Each little map represents an action that states could have taken recently, for example in the first we have states banning abortion before 13 weeks, i.e. a nearly total ban on abortion. It uses dots, for this map orange, to indicate legislative acts to that effect. But if states have passed multiple legislative acts, e.g. South Dakota when it comes to banning specific types or reasons for abortion, multiple dots are used.
I generally like this, but would have liked to have seen an overview map either at the beginning or end that would put all the states together in context. Dot placement, especially for states like Kentucky, would be tricky, but it would go a way to show how complex and convoluted the issue has become at the state level.
Last month, a 2-year-old shot and killed his 4-year-old sister whilst they sat in a car at a petrol station in Chester, Pennsylvania, a city just south of Philadelphia.
Not surprisingly some people began to look at the data around kid-involved shootings. One such person was Christopher Ingraham, who explored the data and showed how shootings by children are up 50% since the pandemic. He used two graphics, one a bar chart and another a choropleth map.
The map shows where kid-involved shootings have occurred. Now what’s curious about this kind of a map is that the designer points out that toddler incidents are concentrated around the Southeast and Midwest. And that appears to be true, but some of the standouts like Ohio and Florida—not to mention Texas—are some of the most populated states in the country. More people would theoretically mean more deaths.
So if we go back to the original data and then grab a 2020 US Census estimate for the under-18 population of each state, I can run some back of the envelope maths and we can take a look at how many under-18 deaths there had been per 100,000 under-18 year-olds. And that map begins to look a little bit different.
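The back of the envelope maths here is simply a count divided by a population and scaled to 100,000. A minimal sketch of that normalisation follows; the state names and figures are hypothetical placeholders, not the real counts or Census estimates behind my map.

```python
def rate_per_100k(count, population):
    """Return the number of events per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical placeholder figures, NOT the data used for the map.
states = {
    "State A": {"deaths": 40, "under_18": 4_000_000},  # populous, more deaths
    "State B": {"deaths": 6, "under_18": 200_000},     # small, fewer deaths
}

rates = {
    name: rate_per_100k(s["deaths"], s["under_18"])
    for name, s in states.items()
}

# Rank states by rate rather than by raw count.
ranked = sorted(rates, key=rates.get, reverse=True)

for name in ranked:
    print(f"{name}: {rates[name]:.1f} per 100,000")
```

In this made-up case the smaller state has far fewer deaths in absolute terms but three times the rate, which is exactly the kind of shift a per capita map can reveal.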
If anything we see the pattern a bit more clearly. The problem persists in the Southeast, but it’s more concentrated in what I would call the Deep South. The problem states in the Midwest fade a bit to a lower rate. Some of the more obvious outliers here become Alaska and Maine.
As the original author points out, some of these numbers likely owe to lax gun regulation in terms of safe storage and trigger locks. I wonder if the numbers in Alaska and Maine could be due to the more rural nature of the states, but then we don’t see similar rates of kid deaths in places like Wyoming, Montana, and Idaho.
Credit for the original piece goes to Christopher Ingraham.
Yesterday I focused on the big graphic from the New York Times that crossed the full spread of the front/back page. But the graphic was merely the lead graphic for a larger piece. I linked to the online version of the article, but for this post I’m going to stick with the print edition. The article consists of a full-page open then an entire interior spread, all in limited colour. The remainder of the extensive coverage consists of photo essays and interviews that understandably attempt to humanise the data points, after all, each dot from yesterday represented one individual, solitary, human being. That is an important element of a story like this and other national and international tragedies, but we also need to focus on the data and not let the emotion of the story overwhelm our rational and logical analysis.
From a data visualisation standpoint the first page begins simply enough with a long timeline of the Covid-19 pandemic charting the number of absolute deaths each day. As we looked at yesterday, the absolute deaths tell part of the story. But if we were to have looked at the number of absolute cases in conjunction with the deaths, we could also see how the virus has thus far evolved to be more transmissible but less lethal. Here the number of daily deaths from Omicron surpassed Delta, but fell short of the winter peak in early 2021. But the number of cases exploded with Omicron, making its mortality rate lower. In other words, far more people were getting sick, but proportionally far fewer were dying.
An interesting note is that if you take a look at the online version, there the designers chose a more stylised approach to presenting the data.
Here they kept the dot approach and simply stacked and reordered the dots. However, I presume for aesthetic reasons, they kept the stacked dots loose and dropped all the axis lines because it does make for a nice transition from the map to this chart. But they also dropped all headings and descriptors that tell the reader just what they are looking at. These decisions make the chart far less useful as a tool to tell the data-driven element of the story.
There are three annotations that label the number of deaths in New York, the Northeast, and the rest of the United States. But what does the chart say? When are the endpoints for those annotations? And then you can compare the scale of the y-axis of this chart and compare it to the printed version above. A more dramatic scale leads to a more dramatic narrative.
This sort of visual style, which favours flash and fancy transitions over the clear communication of the data, is why I find the print piece more compelling and more trustworthy. I find the online version still useful, but far more lacking and wanting in terms of information design.
The interior spread is where this article shines.
From an editorial design standpoint, the symmetry works very well here. It’s a clear presentation and the white space around the graphic blocks lets that content shine as it should in this type of story. Collectively these pieces do a great job telling the story of the pandemic thus far across the nation. The graphics do not need a lot of colour and make do with sparse flash. Annotations call the reader’s attention to salient points and outliers.
From a content standpoint, I would be particularly curious if we have robust data for deaths by education level. Earlier this year I recall reading news about a study that said education best correlated to Covid cases, and I would be curious to see if that held true for deaths. Of course these charts do a great job of showing just how effective the vaccines were and remain. They are the best preventative measure we have available to us.
Here I disagree with the design decision of how to break down the states into regions. The Census Bureau breaks down the United States into four regions using the same names as in the graphic above. However, if you look closely at the inset map, you will see that Delaware, Maryland, and West Virginia in particular are included as part of the Northeast. (I cannot tell if the District of Columbia is included as part of the Northeast or South.)
Now compare that to the Census Bureau’s definition:
If you ask me to include Delaware and Maryland as part of the Northeast, well, if you’re selling it, I’ll buy it. After all, just because the Census Bureau defines the United States this way does not mean the New York Times has to. Both are connected to the Northeast Corridor via Amtrak and I-95 and are plugged into the Megalopolis economy. Maybe the Potomac should be the demarcation between Northeast and South. But I struggle to understand West Virginia. Before you go and connect it to the Northeast, I would argue that West Virginia has far more in common with the Midwest geographically, economically, and culturally.
More critically, given this issue, it strikes me as a serious problem when the online version of the chart—with the aforementioned issues—does not even include the little inset to highlight this at best unusual regional definition.
And so while I have reservations about the data—how would the data have looked if the states were realigned?—the design of the line charts overall is good.
Again, I am talking about the print version, not that online graphic. I would argue that the above screenshot is barely even a chart and more “data art” or an illustration of data. Consider here, for example, that for the South we have that muted slate blue for the dots, but the spacing and density of the dots leads to areas of lighter slate and darker slate. But a lighter slate means more space between stacked dots and darker slate means a more compact design. A lighter colour therefore pushes the “edge” of the line further up the y-axis and artificially inflates its value, not that we can understand what that value is as the “chart” lacks any sort of y-axis.
Finally the print piece has a set of small multiples breaking down deaths by income in the three largest American cities: New York, Los Angeles, and Chicago. These are just great little charts showing the correlation between income and death from Covid, organised by Zip code.
But this also serves as a stark reminder of just how much better the print piece is over the online version. Because if we take a look at a screenshot from the online article, we have a graphic that addresses all the issues I pointed out earlier.
I am left to wonder why the reader of the online version does not have access to this clearer and more accurate representation of the data throughout the piece.
To me this article is a great example of when the print piece far exceeds the online version. Content-wise this is a great story that needed to be told this weekend, but design-wise we see a significant gap in quality from print to online. Suffice it to say that on Sunday I was very glad I received the print version.
Credit for the piece goes to Sarah Almukhtar, Amy Harmon, Danielle Ivory, Lauren Leatherby, Albert Sun, and Jeremy White.
This past weekend the United States surpassed one million deaths due to Covid-19. To put that in other terms, imagine the entire city of San Jose, California simply dead. Or just a little bit more than the entire city of Austin, Texas. Estimates place the number of those infected at about 80 million. Back of the envelope maths puts that fatality rate at 1.25%. That’s certainly lower than earlier versions of the virus, which has evolved to be more transmissible, but thankfully less lethal than its original form.
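For anyone who wants to check the arithmetic, here is a minimal sketch using the two round figures above. It is a crude ratio of deaths to estimated infections, not a proper epidemiological estimate, and the function name is my own.

```python
def crude_fatality_rate(deaths, infections):
    """Return deaths as a percentage of estimated infections."""
    if infections <= 0:
        raise ValueError("infections must be positive")
    return deaths * 100 / infections


US_DEATHS = 1_000_000               # the milestone described in the post
US_INFECTIONS_ESTIMATE = 80_000_000  # the rough estimate cited in the post

rate = crude_fatality_rate(US_DEATHS, US_INFECTIONS_ESTIMATE)
print(f"{rate:.2f}%")  # prints "1.25%"
```

The result is only as good as the infection estimate, which almost certainly undercounts mild and untested cases.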
Sunday morning I opened the door to my flat and found the Sunday edition of the New York Times waiting for me with a sobering graphic not just above the fold, nor across the front page. No, the graphic—a map where each dot represents one Covid-19 death—wrapped around the entire paper.
You don’t need to do much more here. Black and white colour sets the tone simply enough. Of course, a bit more critically, these maps mask one of the big issues with the geographic spread of not just this virus but many other things: relatively few people live west of the Mississippi River.
Enormous swathes of the plains and Rocky Mountains have but few farmers and ranchers living there. Most of the nation’s populous cities are along the coast, particularly the East Coast, or along rivers or somewhat arbitrary transport hubs. You can see those because this map does not actually plot the locations of individual deaths, but rather fills county borders with dots to represent the deaths that occurred within those limits. That’s why, particularly west of the Mississippi, you see square-shaped concentrations of deaths.
A choropleth map that explores deaths per capita, that is after adjusting for population, shows a different story. (This screenshot comes from the New York Times’ data centre for Covid-19.)
The story here is literally less black and white as here we see colours in yellows to deep burnt crimsons. Whilst the big map yesterday morning concentrated deaths in the Northeast, West Coast, and around Chicago, we see here that, relative to the counties’ populations, those same areas fared much better than counties in the plains, Midwest, and Deep South.
A quick scan of the Northeast and Mid-Atlantic states shows that only one county, Juniata in Pennsylvania, fell into the two worst deaths per capita bins—the deeper reds. Juniata County sits squarely in the middle of Pennsyltucky or Trumpsylvania, where Covid countermeasures were not terribly popular. No other county in the region shares that deep red.
Look to the southeast and south, however, and you see lots of deep and burnt crimsons dotting the landscape. This doesn’t mean people didn’t die in the Northeast, because of course they did. Rather, a greater percentage of the population died elsewhere when, as the policies enacted by the Northeast and West Coast show, they didn’t need to.
After all, injecting bleach was never a good idea.
Two years ago I posted about how the Event Horizon Telescope Collaboration managed to take the first photograph of a black hole, in particular a supermassive black hole at the centre of the M87 galaxy, one of those galaxies far, far away that we see at a long time ago.
This morning, the same group of scientists released the first photograph of Sagittarius A*, the supermassive black hole at the centre of our very own Milky Way Galaxy. The BBC article I read this morning included the photo of the black hole, which you should definitely check out because of its importance in the history of astronomy. But, for our purposes here on Coffeespoons, I wanted to look at the diagram the designers at the BBC made to explain the photograph.
The designer used some simple white lines with a thicker stroke for the axis and defining features and a thinner line to point to elements of the photo. In particular I like the dotted line for the black hole, because there is no real way to photograph the hole itself since it consumes all the light we would need to image it. Instead, we photograph the “black hole” at the centre of the accretion disk, all the super heated gas and matter slowly swirling around and collapsing into the singularity. We also get two axes to show the size of the ring and that of the black hole itself. The ring measures a diameter of about 63 million kilometres. The distance from the Sun to Mercury, the closest planet to our Sun, is 58 million kilometres.
Well done, science. Well done.
Credit for the piece goes to the graphics team at the BBC.
Earlier this week I read an article in the Philadelphia Inquirer about the political prospects of some of the candidates for the open US Senate seat for Pennsylvania, for which I and many others will be voting come November. But before I get to vote on a candidate, members of the political parties first get to choose whom they want on the ballot. (In Pennsylvania, independent voters like myself are ineligible to vote in party primaries.)
This year the Republican Party has several candidates running and one of them you may have heard of: Dr. Oz. Yeah, the one from television. And while he is indeed the front runner, he is not in front by much as the article explains. Indeed, the race largely had been a two-person contest between Oz and David McCormick until recently when Kathy Barnette pulled just about even with the two.
In fact, according to a recent poll the three candidates are all statistically tied in that they all fall within the margin of error for victory. And that brings us to the graphic from the article.
Conceptually this is a pretty simple bar chart with the bar representing the share of the support of those polled. But I wanted to point out how the designer chose to represent the margin of error via hatched shading to both sides of the ends of the red bar.
In some cases the hatch job does not work for me, particularly with those smaller candidates where the bar goes negative. I would have grave reservations about the vote should any candidate win a negative share of the vote. 0% perhaps, but negative? No. I also don’t think the grey hatching works as well over the grey bar in particular and to a lesser degree the red.
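To make the point about negative shares concrete, here is a minimal sketch of how one might compute each candidate's range and stop it at zero. The candidate names, shares, and margin of error are all hypothetical placeholders, not figures from the poll in the article, and simply clipping the range is a simplification rather than a statistically rigorous interval.

```python
def interval(share, margin):
    """Return (low, high) for a polled share, clipped to the 0-100% range."""
    low = max(0.0, share - margin)
    high = min(100.0, share + margin)
    return low, high


def overlaps(a, b):
    """True when two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical poll numbers, in percent. NOT the article's data.
MARGIN = 3.0
poll = {
    "Candidate A": 22.0,
    "Candidate B": 20.0,
    "Candidate C": 19.0,
    "Candidate D": 2.0,  # share smaller than the margin of error
}

intervals = {name: interval(share, MARGIN) for name, share in poll.items()}

# Anyone whose range overlaps the leader's is, loosely, "statistically tied".
leader = max(poll, key=poll.get)
tied = [name for name in poll if overlaps(intervals[name], intervals[leader])]

for name, (low, high) in intervals.items():
    print(f"{name}: {low:.1f}% to {high:.1f}%")
print("Tied with the leader:", ", ".join(tied))
```

With these made-up numbers the smallest candidate's range runs from 0% to 5% instead of dipping to -1%, and the top three all overlap with the leader.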
I have often thought that these sorts of charts should use some kind of box plot approach. So this morning I took the chart above and reworked it.
Overall, however, I really like this designer’s approach. We should not fear subtlety and nuance, and margins of error are just that. After all, we need not go back too far in time to remember a certain candidate who thought she had a presidential election locked up when really her opponent was within the margin of error.
Everyone knows inflation is a thing. If not, when was the last time you went shopping? Last week the Boston Globe looked specifically at children’s shoes. I don’t have kids, but I can imagine how a rapidly growing miniature human requires numerous pairs of shoes and frequently. The article explores some of the factors going into the high price of shoes and uses, not very surprisingly, some line charts to show prices for components and the final product over time. But the piece also contains a few bar charts and that’s what I’d like to briefly discuss today, starting with the screenshot below.
What we see here is a list of countries and the share of production for select inputs (leather, rubber, and textiles) in 2020. At the top we have a button that allows the user to toggle between the inputs and a little movement of the bars provides the transition. The length of the bar encodes the country in question’s market share for the selected material.
We also have all this colour, but what is it doing? What data point does the colour encode? Initially I thought perhaps geographic regions, but then you have the US and Mexico, or Italy and Russia, or Argentina and Brazil, all pairs of countries in the same geographic regions and yet all coloured differently. Colour encodes nothing and thus becomes a visual distraction that adds confusion.
Then we have the white spaces between the bars. The gap between bars is there because the country labels attach to the top of the bars. But, especially for the top of the chart, the labels are small and the gap is at just the right height such that the white spaces become white bars competing with the coloured bars for visual attention.
The spaces and the colours muddy the picture of what the data is trying to show. How do we know this? Because later in the article we get this chart.
This works much better. The focus is on the bars, the labelling is clear, almost nothing else competes visually with the data. I have a few quibbles with this design as well, but it’s certainly an improvement over the earlier screenshot we discussed. (I should note that this graphic, as it does here, also comes after the earlier graphic.)
My biggest issue is that when I first look at the piece, I want to see it sorted, say greatest to least. In other words, Furniture and bedding sits at the top with its 15.8% increase, year-on-year, and then Alcoholic beverages last at 3.7%. The issue here, however, is that we are not necessarily looking at goods at the same hierarchical level.
The top of the list is pretty easy to consider: food, new vehicles, alcoholic beverages, shelter, furniture and bedding, and appliances. We can look at all those together. But then we have All apparel. And then immediately after that we have Men’s, Women’s, Boys’, Girls’, and Infants’ and toddlers’ apparel. In other words, we are now looking at a subset of All apparel. All apparel is at the same level as Food or Shelter, but Men’s apparel is not.
At that point we would need to differentiate between the two, whilst also grouping them together, because the values for those different sub-apparel groups make up the aggregate value for All apparel. And showing them all next to Food is not an apples-to-apples comparison.
If I were to sort these, I would sort from greatest to least by the parent group and then immediately beneath the parent I would display the children. To differentiate between parent-level and children-level, I would probably make the bars shorter in the vertical and then address the different levels typographically with the labels, maybe with smaller type or by putting the children in italic.
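The ordering I describe can be sketched in a few lines. This is only an illustration of the sorting logic: the Furniture and bedding and Alcoholic beverages values come from the chart, while the apparel values are hypothetical placeholders and the field names are my own.

```python
# Each row names its parent group, or None for a top-level category.
rows = [
    {"label": "Alcoholic beverages", "parent": None, "change": 3.7},
    {"label": "All apparel", "parent": None, "change": 6.0},  # hypothetical
    {"label": "Men's apparel", "parent": "All apparel", "change": 5.0},  # hypothetical
    {"label": "Infants' and toddlers' apparel", "parent": "All apparel",
     "change": 10.0},  # hypothetical
    {"label": "Furniture and bedding", "parent": None, "change": 15.8},
]


def by_change(row):
    return row["change"]


def sort_grouped(rows):
    """Sort parents greatest to least, each followed by its sorted children."""
    parents = sorted(
        (r for r in rows if r["parent"] is None), key=by_change, reverse=True
    )
    ordered = []
    for parent in parents:
        ordered.append((0, parent))  # level 0 marks a parent row
        children = sorted(
            (r for r in rows if r["parent"] == parent["label"]),
            key=by_change,
            reverse=True,
        )
        ordered.extend((1, child) for child in children)  # level 1 marks a child
    return ordered


ordered = sort_grouped(rows)
labels = [row["label"] for _, row in ordered]

for level, row in ordered:
    indent = "    " * level  # stand-in for the typographic distinction
    print(f"{indent}{row['label']}: {row['change']}%")
```

The level value carried with each row is what a chart could then use to draw thinner bars or italic labels for the children.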
Finally, again, whilst this is a massive improvement over the earlier graphic, I’d make one more addition, an addition that would also help the first graphic. As we are talking about inflation year-on-year, we can see how much greater costs are from Furniture and bedding to Alcoholic beverages and that very much is part of the story. But what is the inflation rate overall?
According to the Bureau of Labor Statistics, inflation over that period was 8.5%. In other words, a number of the categories above actually saw price increases less than the average inflation rate (that’s good) even though they were probably higher than increases had been prior to the pandemic (that’s bad). But, more importantly for this story, with the addition of a benchmark line running vertically at 8.5%, we could see how almost all apparel and footwear child-level line items were below the inflation rate. But the children and infant level items far exceeded that benchmark line, hence the point of the article. I made a quick edit to the screenshot to show how that could work in theory.
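The benchmark comparison itself is simple subtraction. A minimal sketch, using the 8.5% figure and the two category values quoted in this post, plus one hypothetical placeholder row to show the edge case:

```python
BENCHMARK = 8.5  # overall year-on-year inflation cited above, in percent

categories = {
    "Furniture and bedding": 15.8,  # from the chart
    "Alcoholic beverages": 3.7,     # from the chart
    "Hypothetical category": 8.5,   # placeholder sitting exactly on the line
}


def versus_benchmark(change, benchmark=BENCHMARK):
    """Return the gap in percentage points and which side of the line it is on."""
    gap = round(change - benchmark, 1)
    if gap > 0:
        return gap, "above"
    if gap < 0:
        return gap, "below"
    return gap, "at"


comparison = {name: versus_benchmark(value) for name, value in categories.items()}

for name, (gap, side) in comparison.items():
    print(f"{name}: {gap:+.1f} points, {side} the benchmark")
```

Furniture and bedding lands 7.3 points above the line and Alcoholic beverages 4.8 points below it, which is the comparison the vertical line makes visible at a glance.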
Overall, an interesting article worth reading, but it contained one graphic in need of some additional work and then a second that, with a few improvements, would have been a better fit for the article’s story.