The Observation Table

We made it to the end of yet another week. Before the weekend begins for most of my audience—though for my UK readers, enjoy the extended bank holiday and God save the Queen—I wanted to take a look at a graphic from xkcd that shows one can use different types of scopes to make different types of observations.

All the scopes.

I’m constantly thinking about getting a record player. But if I do, maybe I’ll just start calling it my radiogyroscope.

Credit for the piece goes to Randall Munroe.

How Accurate Is Punxsutawney Phil?

For those unfamiliar with Groundhog Day (the event, not the film, because as it happens your author has never seen the film): since 1887 in the town of Punxsutawney, Pennsylvania (60 miles east-northeast of Pittsburgh), a groundhog named Phil has risen from his slumber, climbed out of his burrow, and gone to see if he could see his shadow. Phil prognosticates upon the continuance of winter, whether we receive six more weeks of winter or an early spring, based upon the appearance of his shadow.

But as any meteorological fan will tell you, a groundhog’s shadow does not exactly compete with the latest computer modelling running on servers and supercomputers. And so we are left with the all-important question: how accurate is Phil?

Thankfully the National Oceanic and Atmospheric Administration (NOAA) published an article several years ago that they continue to update. And their latest update includes 2021 data.

Not exactly an accurate depiction of Phil.

I am loath to be super critical of this piece, because, again, relying upon a groundhog for long-term weather forecasting is…for the birds (the best I could do). But critiques of information design are largely what this blog is for.

Conceptually, dividing up the piece between a long-term, i.e. since 1887, and a shorter-term, i.e. since 2012, makes sense. The long-term focuses more on how Phil split out his forecasts—clearly Phil likes winter. I dislike the use of the dark blue here for the years for which we have no forecast data. I would have opted for a neutral colour, say grey, or something that is visibly less impactful than the two light colours (blue and yellow) that represent winter and spring.

Whilst I don’t love the icons used in the pie chart, they do make sense because the designers repeat them within the table. If they’re selling the icon use, I’ll buy it. That said, I wonder if using those icons more purposefully could have been more impactful? What would have happened if they had used a timeline and each year was represented by an icon of a snowflake or a sun? What about if we simply had icons grouped in blocks of ten or twenty?

The table I actually enjoy. I would tweak some of the design elements, for example the green check marks almost fade into the light blue sky. A darker green would have worked well there. But, conceptually this makes a lot of sense. Run each prognostication and compare it with temperature deviation for February and March (as a proxy for “winter” or “spring”) and then assess whether Phil was correct.
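To make the table's logic concrete, here is a minimal sketch of that comparison. It is my own illustration, not NOAA's method: the records are invented, and the rule that a warmer-than-average February and March mean counts as an early spring is an assumption.

```python
# Hypothetical records: (label, Phil's call, Feb deviation, Mar deviation), in degrees.
# These numbers are invented for illustration; they are not NOAA's figures.
records = [
    ("Year 1", "winter", -1.2, -0.4),
    ("Year 2", "spring", 2.1, 0.9),
    ("Year 3", "winter", 1.5, 2.0),
    ("Year 4", "spring", -0.8, -1.1),
]


def observed_season(feb_dev, mar_dev):
    """Assumed rule: a warmer-than-average Feb-Mar mean counts as an early spring."""
    return "spring" if (feb_dev + mar_dev) / 2 > 0 else "winter"


def accuracy(rows):
    """Share of years in which Phil's call matched the observed season."""
    hits = sum(1 for _, call, feb, mar in rows if call == observed_season(feb, mar))
    return hits / len(rows)


print(f"Phil was right {accuracy(records):.0%} of the time")
```

Changing the rule inside observed_season, say by treating "slightly above" as a separate case, is exactly the kind of definitional choice that would shift Phil's score.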

I would like to know more about what a slightly above or below measurement means compared to above or below. And I would like to know more about the impact of climate change upon these measurements. For example, was Phil’s accuracy higher in the first half of the 20th century? The end of the 19th?

Finally, the overall article makes a point about how difficult it would be for a single groundhog in western Pennsylvania to determine weather for the entire United States let alone its various regions. But what about Pennsylvania? Northern Appalachia? I would be curious about a more regionally-specific analysis of Phil’s prognostication prowess.

Credit for the piece goes to the NOAA graphics department.

How the Globe’s Writers Voted

Yesterday we looked at a piece by the Boston Globe that mapped out all of David Ortiz’s home runs. We did that because he has just been voted into baseball’s Hall of Fame. But to be voted in means there must be votes, and a few weeks after the deadline the Globe posted an article about how that publication’s eligible voters, well, voted.

The graphic here was a simple table. But as I’ll always say, tables aren’t an inherently bad or easy-way-out form of data visualisation. They are great at organising information in such a way that you can quickly find or reference specific data points. For example, let’s say you wanted to find out whether or not a specific writer voted for a specific ballplayer.

Just don’t ask me for whom I would have voted…

Simple red check marks represent those players for whom the Globe’s eligible staff voted. I really like some of the columns on the left that provide context on the vote. For the unfamiliar, players can only remain on the list for up to ten years. And so for the first four, this was their last year of eligibility. None made the cut. Then there’s a column for the total number of votes made by the Globe’s staff. Following that is more context, the share of votes received in 2021. Here the magic number is 75% to be elected. Conversely, if you do not make 5% you drop off the following year. Almost all of those on their first-year ballot failed to reach that threshold.
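Those ballot rules are simple enough to sketch in a few lines. This is only an illustration of the thresholds described above: the function name is hypothetical, and I am assuming that exactly 75% elects a player and exactly 5% keeps him on the ballot.

```python
ELECTION_THRESHOLD = 0.75  # share of the vote needed for election
SURVIVAL_THRESHOLD = 0.05  # below this share a player drops off the ballot
MAX_YEARS = 10             # a player can stay on the ballot for up to ten years


def ballot_outcome(vote_share, year_on_ballot):
    """Classify a player's result for one year of voting (hypothetical helper)."""
    if vote_share >= ELECTION_THRESHOLD:
        return "elected"
    if vote_share < SURVIVAL_THRESHOLD:
        return "dropped: under 5%"
    if year_on_ballot >= MAX_YEARS:
        return "dropped: eligibility expired"
    return "back on the ballot next year"


# Invented vote shares, not the actual results.
print(ballot_outcome(0.78, 1))
print(ballot_outcome(0.66, 10))
print(ballot_outcome(0.03, 1))
```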

The only potential drawback to this table is that by the time you reach the end of the table, there are few check marks to create implicit rules or lines that guide you from writer to player. David Ortiz’s placement helps because his six check marks (remarkably, not all Globe writers voted for him) ground you for the only person below him (alphabetically) to receive a vote. And we need that because otherwise quickly linking Alex Rodriguez to Alex Speier would be difficult.

Finally below the table we have jump links to each writer’s writings about their selections. And if you’ll allow a brief screenshot of that…

Still don’t ask me

We have a nicely designed section here. Designers delineated each author’s section with red arrows that evoke the red stitching on a baseball. It’s a nice design touch. Then each author receives a headline and a small call out box inside which are the players (and their headshots) for whom the author voted. An initial dropped capital (drop cap), here a big red M, grabs the reader’s attention and draws them into the author’s own words.

Overall this was a solidly designed piece. I really enjoyed it. And for those who don’t follow the sport, the table is also an indicator of how divisive the voting can be. Even the Globe’s writers couldn’t unanimously agree on voting for David Ortiz.

Credit for the piece goes to Daigo Fujiwara and Ryan Huddle.

What’s in a Corporate Name?

Last Thursday I wrote about the Wagner Group, an off-the-books semi-private army the Kremlin uses to wage war where plausible deniability is desired. During that piece I mentioned Blackwater, one of the more infamous American private security contractor firms.

The day before I had seen a tweet, this tweet, where Samantha Stokes created a matrix to help people remember just what Blackwater did, as compared to Blackstone.

Bridgewater buying Bridgestone whose tires were shot out by Blackwater bought by Blackrock.

Credit for the piece goes to Samantha Stokes.

2020 Census Apportionment

Every ten years the United States conducts a census of the entire population living within the United States. My genealogy self uses the federal census as the backbone of my research. But that’s not what it’s really there for. No, it exists to count the people to apportion representation at the federal level (among other reasons).

The founding fathers did not intend for the United States to be a true democracy. They feared the tyranny of mob rule, of which majority populations are capable, and so each level of the government served as a check on the other. The census-counted people elected their representatives for the House, but their senators were chosen by their respective state legislatures. But I digress, because this post is about a piece in the New York Times examining the new census apportionment results.

I received my copy of the Times two Tuesdays ago, so these are photos of the print piece instead of the digital, online editions. The paper landed at my front door with a nice cartogram above the fold.

A cartogram exploded.

Each state consists of squares, each representing one congressional district. This is the first place where I have an issue with the graphic, admittedly a minor one. First we need to look at the graphic’s header, “States That Will Gain or Lose Seats in the Next Congress” and then look at the graphic. It’s unclear to me if the squares therefore represent the states today with their numbers of districts, or if we are looking at a reapportioned map. Up in Montana, I know that we are moving from one at-large seat to two seats, and so I can resolve that this is the new apportionment. But I am left wondering if a quick phrase or sentence that declares these represent the 2022 election apportionment and not those of this past decade would be clearer?

Or if you want a graphic treatment, you could have kept all the states grey, but used an unfilled square in those states, like Pennsylvania and Illinois, losing seats, and then a filled square in the states adding seats.

Inside the paper, the article continued and we had a few more graphics. The above graphic served as the foundation for a second graphic that charted the changing number of seats since 1910, when the number of seats was fixed.

Timeline of gains and losses

I really like this graphic. My issue here is more with my mobile that took the picture. Some of these states appear quite light, and they are on the printed page. However, they are not quite as light as these photos make them out to be. That said, could they be darker? Probably. Even in print, the dark grey “no change” instances jump out instead of perhaps falling to the background.

The remaining few graphics are far more straightforward. One isn’t even a graphic, technically.

First we have two maps.

Good old primary colours.

Nothing particularly remarkable here. The colours make a lot of sense, with red representing Republicans and blue Democrats. Yellow represents independent commissions and grey is only one state, Pennsylvania, where the legislature is controlled by Republicans and the governorship by Democrats.

Finally we have a table with the raw numbers.

Tables are great for organising information. Do you have a state you’re most curious about, Illinois for example? If so, you can quickly scan down the state column to find the row and then over to the column of interest. What tables don’t allow you to do is quickly identify any visual patterns. Here the designers chose to shade the cells based on positive/negative changes, but that’s not highlighting a pattern.

Overall, this was a really strong piece from the Times. With just a few language tweaks on the front page, this would be superb.

Credit for the piece goes to Weiyi Cai and the New York Times graphics department.

Quantifying Part of the Opioid Crisis

Two weeks ago the Washington Post published a fascinating article detailing the prescription painkiller market in the United States. The Drug Enforcement Administration made the database available to the public and the Post created graphics to explore the top-line data. But the Post then went further and provided a tool allowing users to explore the data for their own home counties.

The top-line data visualisation is what you would expect: choropleth maps showing the prescription and death rates. This article is a great example of when maps tell stories. Here you can clearly see that the areas hit hardest by the crisis were in Appalachia. Though that is not to say other states were not ravaged by the crisis.

There are some clear geographic patterns to see here

For me, however, the true gem in this piece is the tool allowing you the user to find information on your county. Because the data is granular down to county-level information on things like pill shipments from manufacturer to distributor, we can see which pharmacies were receiving the most pills. And, crucially, which manufacturers were flooding the markets. For this screenshot I looked at Philadelphia, though I only moved here in 2016, well after the date range for this data set.

It could be worse

You can clearly see, however, the designers chose simple bar charts to show the top-five. I don’t know if the exact numbers are helpful next to the bars. Visually, it becomes a quick mess of greys, blacks, and burgundies. A quieter approach may have allowed the bars to really shine while leaving the numbers, seemingly down to the tens, for tables. I also cannot figure out why, typographically, the pharmacies are listed in all capitals.

But because I lived in Chicago for most of the crisis, here is the screenshot for Cook County. Of course, for those not from Chicago, it should be pointed out that Chicago is only a portion of Cook County; there are other small towns there. And some of Chicago is within DuPage County. But, still, this is pretty close.

Better numbers than Philly

On an unrelated note, the bar charts here do a nice job of showing the market concentration or market power of particular companies. Compare the dominance of Walgreens as a distributor in Cook County with that of McKesson in Philadelphia. Though that same chart also shows how corporate structures can obscure information. I was never far from a big Walgreens sign in Chicago, but I have never seen a McKesson Corporation logo flying outside a pharmacy here in Philadelphia.

Lastly, the neat thing about this tool is that the user can opt to download an image of the top-five chart. I am not sure how useful that bit is. But as a designer, I do like having that functionality available. This is for Pennsylvania as a whole.

For Pennsylvania, state-wide

Credit for the piece goes to Armand Emamdjomeh, Kevin Schaul, Jake Crump and Chris Alcantara.

The World Grows On and On

I mentioned this at this time last year, but I used to make a lot of datagraphics about GDP growth. The format here has not changed and so there is nothing new to look at there. But the content is still interesting. And the accompanying Economist article makes the point that high growth rates are not always what they seem. After all, Syria’s high growth rate is because its base is so small.

The 2019 GDP growth forecasts

Credit for the piece goes to the Economist Data Team.

Rating the Foods

For my American audience, Happy Thanksgiving. Coffeespoons will be on holiday for the remainder of the week. But don’t worry, we’ll be back. For my non-American audience, we basically celebrate a tale of the Pilgrims feasting with Native Americans after a successful harvest.

Today’s graphic is really just a series of tables. I think I missed this back in 2016 because, surprise, I had just moved to Philadelphia and was still settling into things, including running Coffeespoons. Anyway, FiveThirtyEight published an article trying to discover the most popular dishes. This is just a sampling, a screenshot of the meats. But you should go check it out to see if your favourite dishes made the cut.

Where’s the beef?

Mine did not. I am not a big fan of turkey and am doing a pork roast tomorrow. I guess I could go with the ham in a pinch though.

Credit for the piece goes to Walt Hickey.

Tracking the Charges and Convictions

In case you missed it somehow, the President of the United States, the Leader of the Free World, is now also an unnamed, unindicted criminal co-conspirator in a federal campaign election law case in New York to which his co-conspirator pled guilty.

And you thought Obama’s tan suit was bad.

The guilty plea by Michael Cohen and the eight convictions of Paul Manafort are all part of a growing scandal surrounding the White House. Thankfully the New York Times published a piece highlighting the results of the various trials. In short, the former National Security Advisor has pled guilty, as has a former campaign advisor, a former deputy campaign manager/transition leader/early administration staffer, and another campaign advisor. Throw in yesterday’s news and this table will get longer.

How much longer will the table get?

Credit for the piece goes to the New York Times graphics department.

Going Over (But Actually Under)

Late last week I was explaining to someone in the pub why the World Cup matches are played beyond their 90-minute booking. For those among you who do not know, basically the referees add up all the stoppage time, i.e. when play stops for things like injuries or people dilly-dallying, and then tack that on to the end of the match.

But it turns out that after I explained this, FiveThirtyEight published an article exploring just how accurate this stoppage time was compared to the amount of stopped time. Spoiler: not very.

In design terms, the big takeaway was the dataset of recorded minutes of actual play in all the matches theretofore. It captured everything but the activity totals where they broke down stoppage time into categories, e.g. injuries, video review, free kicks, &c. (How those broke out across an average game is a later graphic.)

Through 27 June

The setup is straightforward: a table organises the data for every match. The little spark chart in the centre of the table is a nice touch that shows how much of the 90 minutes the ball was actually in play. The right side of the table might be a bit too crowded, and I probably would have given a bit more space particularly between the expected and actual stoppage times. On the whole, however, the table does its job in organising the data very well.
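For the curious, the two figures the table leans on are easy to compute. The sketch below is my own, with an invented match and hypothetical field names; it is not FiveThirtyEight's data or code.

```python
REGULATION_MINUTES = 90


def in_play_share(minutes_in_play):
    """Fraction of the 90 regulation minutes in which the ball was in play."""
    return minutes_in_play / REGULATION_MINUTES


def stoppage_shortfall(expected_added, actual_added):
    """Minutes of stopped play that the referee did not add back."""
    return expected_added - actual_added


# One invented match for illustration only.
match = {"in_play": 54.0, "expected_added": 13.0, "actual_added": 7.0}

share = in_play_share(match["in_play"])
gap = stoppage_shortfall(match["expected_added"], match["actual_added"])
print(f"Ball in play for {share:.0%} of regulation; {gap:.0f} minutes never added back")
```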

Now I just wonder how this would apply to a baseball or American football broadcast…

Credit for the piece goes to David Bunnell.