dot plot – Page 2 – Coffee Spoons

Mask Up

Well, we made it to Friday. But, if you’ve been following me on the social, you’ll know that Covid is beginning to spread once again in Pennsylvania, New Jersey, Delaware, Virginia, and Illinois. I live in a tower block and I can say that many of my neighbours are no longer wearing masks indoors. Yet mask-wearing is the easiest defence we have against the spread of the coronavirus. So let’s take a look at the most effective types of masks, thankfully charted by xkcd.

Credit for the piece goes to Randall Munroe.

The Covid Recession’s Continuing Impact on Youth

Earlier this week, some of the work my team does was published. We produced a one-page summary of a far larger and more comprehensive (relative to the scope of the summary) survey of consumers during the Covid Recession. I will spare you the details of recreating existing templates from scratch and the design decisions that went into that bit (neither insignificant nor unsubstantial) and rather focus on the one graphic we designed.

The broad thrust of the summary is that while overall we are beginning to see some job recovery, the recovery is uneven and, in fact, those below the age of 36 are getting hit pretty hard (my words, not the authors’). While in some industries the young are recovering in good numbers, in other industries, those with a larger share of the youth population, young people are still losing jobs. We then broke those top-line numbers out by industry in the graphic below, captured by screenshot.

How different age groups in different industries are faring in the recession.

There are a couple of things from a design side to discuss. We had about two or three days from when we started the project to develop some ideas and then execute and produce the summary. And as I noted above, that also included quite a bit of time in emulating existing documents and building ourselves a new template should we need to do something similar in the future.

But for that graphic in particular, there’s one thing I wanted to highlight: the lack of values on the axes. The challenge here was that the data displayed is people not working. And when we compared this time period (Wave 3) to the earlier waves, we were looking for declines. And so if we were going to say that the 36+ are gaining construction jobs, that would be a -2% value, and the youth would be at about a -13% increase. If you are doing a bit of a double-take at a negative increase, so did the team. Ultimately, we used the data to generate the chart, but then opted for qualitative labelling on the axes. The labels simply point out that in one direction youth are either gaining or losing jobs, and the same for the 36+. To reinforce this idea, we also added some descriptors in the far corner of each quadrant that said whether the age groups were gaining or losing jobs.
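The sign flip can be sketched in a few lines of Python. This is only an illustration, not the code the team used: the function name and the label wording are hypothetical, and the two values are the approximate construction figures quoted above.

```python
def job_direction(change_in_not_working: float) -> str:
    """Turn a change in the share of people NOT working into a plain label.

    A negative change means fewer people are out of work, i.e. jobs gained.
    """
    if change_in_not_working < 0:
        return "gaining jobs"
    if change_in_not_working > 0:
        return "losing jobs"
    return "no change"


# Approximate construction values quoted in the post.
construction = {"36+": -2.0, "under 36": -13.0}

labels = {group: job_direction(value) for group, value in construction.items()}
for group, label in labels.items():
    print(f"{group}: {label}")
```

The reader only ever sees the label, which is the same trade we made on the chart: the numbers drive the position of each dot, but the axes speak in plain words.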

Despite the unusual design decisions I took in the graphic, I’m really proud of this piece especially given its tight turnaround. It shows in almost real-time how fractured the recovery—is this a recovery?—is at this point.

Credit for the piece goes to the team on this, Tom Akana, Kate Gamble, Natalie Spingler, and myself.

Red Sox Starting Rotation: A Dumpster Fire in a Dumpster Fire Year

Baseball for the Red Sox starts on Friday. Am I glad baseball is back? Yes?

I love the sport and will be glad that it’s back on the air to give me something to watch. But the way it’s being done boggles the mind. Here today I don’t want to get into the Covid, health, and labour relations aspect of the game. But, as the title suggests, I want to look at a graphic that looks at just how bad the Red Sox could be this (shortened) year. And over at FiveThirtyEight, they created a model to evaluate teams’ starting rotations on an ongoing basis.

The Red Sox are just bad. — Look at the Red Sox, one of the worst in baseball.

Form-wise, this isn’t too different from what we looked at yesterday. It’s a dot plot with the dots representing individual pitchers. The size of the dots represents their number of total starts. This is an important metric in their model, but as we all know size is a difficult attribute for people to compare and I’m not entirely convinced it’s working here. Some dots are clearly smaller than others, but for most it’s difficult for me to clearly tell.

Colour is just tied to the colour of the teams. Necessary? Not at all. Because the teams are not compared on the same plot, they could all be the same colour. If, however, an eventual addition were made that plotted the day’s matchups on one line, then colour would be very much appropriate.

I like the subtle addition of “Better” at the top of the plots to help the user understand the constructed metric. Otherwise the numbers are just that, numbers that don’t mean anything.

Overall a solid piece. And it does a great job of showing just how awful the Red Sox starting rotation is going to be. Because I know who Nate Eovaldi is. And I’ve heard of Martin Perez. Ryan Weber I only know through his largely pitching in relief last year. And after that? Well, not on this graphic, but we have Eduardo Rodriguez who had corona and, while he has recovered, nobody knows how that will impact people in sports. There’s somebody named Hall who I have never heard of. Then we have Brian Johnson, a root-for-the-guy story of beating the odds to reach the Major Leagues but who has been inconsistent. Then…it is literally a list of relief pitchers.

We dumped the salary of Mookie Betts and David Price and all we got was basically a tee-shirt saying “We still need a pitcher or three”.

Credit for the piece goes to Jay Boice.

Consumer Payment Methods During the Corona Times

Okay, so we’re going to post some more of my work today, but it’s not about cases and deaths. Instead, I took some data produced by my colleagues and thought that it could do with a small transformation from a table into a chart. The original table can be found in their report on consumer payment options during the Covid-19 pandemic.

After setting the kettle on for some tea this morning I started on their Table 1. Thirty minutes later and a cup of Irish Breakfast consumed, I had transformed it into this:

Obviously I changed the language/title a little bit. But the original was too long and didn’t fit. Also this is my blog, so my rules. The visualisation improves upon the table in a number of ways, but tables do have their place. Tables are great for organising information. Find a column header and a row header and you can get any specific data point. But, if you want to make a comparison between two data points or several of them, a chart is the way to go. Now, you may lose some precision. For example, do I know to the decimal point or to the tenths even what one of those dots represents? Nope. But at a glance, can I see which dots are below the overall respondents? Yep. It’s abundantly clear that those earning less than $40,000 per year have a greater availability of debit cards than the other groups shown.

And after all, I couldn’t have made this graphic without that table.

Full disclosure, as alluded to above, I work at the Federal Reserve Bank of Philadelphia. But I had nothing to do with the data, report, or presentation thereof.

Credit for the graphic is mine. Credit for the data goes to the folks over at the Consumer Finance Institute.

How Mass Shootings Have Changed

A few weeks ago here in the United States, we had the mass shootings in El Paso, Texas and Dayton, Ohio. The Washington Post put together a piece looking at how mass shootings have changed since 1966. And unfortunately one of the key takeaways is that since 1999 they are far too common.

The biggest graphic from the article is its timeline.

It captures the total number of people killed per event. But, it also breaks down the shootings by admittedly arbitrary time periods. Here it looks at three distinct ones. The first begins at the beginning of the dataset: 1966. The second begins with Columbine High School in 1999, when two high school teenagers killed 13 people, 12 fellow students and a teacher. Then the third begins with the killing of 9 worshippers in an African Methodist Episcopal church in Charleston, South Carolina.

Within each time period, the peaks become more extreme, and they occur more frequently. The beige boxes do a good job of calling out just how frequently they occur. And then the annotations call out the unfortunate historic events where record numbers of people were killed.

The above is a screenshot of a digital presentation. However, I hope the print piece did a full-page printing of the timeline and showed the entire timeline in sequence. Here, the timeline is chopped up into two separate lines. I like how the thin grey rule breaks the second from the third segment. But the reader loses the vertical comparison of the bars in the first segment to those in the second and third.

Later on in the graphic, the article uses a dot plot to examine the age of the mass shooters. There it could have perhaps used smaller dots that did not feature as much overlap. Or a histogram could have been useful, as an infrequently used type of chart.

Lastly it uses small multiples of line charts to show the change in frequency of particular types of locations.

Overall it’s a solid piece. But the timeline is its jewel. Unfortunately, I will end up talking about similar graphics about mass shootings far too soon in the future.

Credit for the piece goes to Bonnie Berkowitz, Adrian Blanco, Brittany Renee Mayes, Klara Auerbach, and Danielle Rindler.

Water, Water Everywhere Nor Any Drop to Drink Part II

Yesterday we looked at the New York Times coverage of some water stress climate data and how some US cities fit within the context of the world’s largest cities. Well today we look at how the Washington Post covered the same data set. This time, however, they took a more domestic-centred approach and focused on the US, but at the state level.

Still no reason to move to the Southwest

Both pieces start with a map to anchor the piece. However, whereas the Times began with a world map, the Post uses a map of the United States. And instead of highlighting particular cities, it labels states mentioned in the following article.

Interestingly, whereas the Times piece showed areas of No Data, including sections of the desert southwest, here the Post appears to be labelling those areas as “arid area”. We also see two different approaches to handling the data display and the bin ranges. Whereas the Times used a continuous gradient, the Post opts for a discrete one, with sharply defined edges from one bin to the next. Of course, a close examination of the Times map shows that they used a continuous gradient in the legend, but a discrete application on the map itself. The discrete application makes it far easier to compare areas directly. With a continuous gradient it is, by definition, harder to distinguish between areas with relatively close values.
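The difference between a continuous ramp and discrete bins can be sketched as follows. The bin edges, the labels, and the 0 to 1 stress score are all invented for illustration; neither newspaper’s actual thresholds are used here.

```python
from bisect import bisect_right

# Hypothetical bin edges and labels on a hypothetical 0-1 stress score.
EDGES = [0.2, 0.4, 0.6, 0.8]
LABELS = ["Low", "Low-medium", "Medium-high", "High", "Extremely high"]


def discrete_bin(score: float) -> str:
    """Assign a score to one of a handful of sharply separated bins."""
    return LABELS[bisect_right(EDGES, score)]


def continuous_shade(score: float) -> float:
    """Map a score straight onto a 0-1 colour ramp, clamped to the ends."""
    return min(max(score, 0.0), 1.0)


a, b = 0.58, 0.62  # two neighbouring areas with close scores
print(discrete_bin(a), "vs", discrete_bin(b))
print(continuous_shade(a), "vs", continuous_shade(b))
```

With bins, the two neighbours land in visibly different classes. On the continuous ramp they differ by a sliver of shade that the eye will struggle to separate.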

The next biggest distinguishing characteristic is that the Post’s approach is not interactive. Instead, we have only static graphics. But more importantly, the Post opts for a state-level approach. The second graphic looks at the water stress level, but then plots it against daily per capita water use.

My question is from the data side. Whence does the water use data come? It is not exactly specified. Nor does the graphic provide any axis limits for either the x- or the y-axis. What this graphic did make me curious about, however, was the cause of the high water consumption. How much consumption is due to water-intensive agricultural purposes? That might be a better use of the colour dimension of the graphic than tying it to the water stress levels.

The third graphic looks at the international dimension of the dataset, which is where the Times started.

Here we have an interesting use of area to size population. In the second graphic, each state is sized by population. Here, we have countries sized by population as well. Except, the note at the bottom of the graphic notes that neither China nor India is sized to scale. And that makes sense since both countries have over a billion people. But, if the graphic is trying to use size in the one dimension, it should be consistent and make China and India enormous. If anything, it would show the scale of the problem of being high stress countries with enormous populations.

I also like how this graphic, while static in nature, breaks each country into a regional classification based upon the continent where the country is located.

Overall this, like the Times piece, is a solid graphic with a few little flaws. But the fascinating bit is how the same dataset can create two stories with two different foci. One with an international flavour like that of the Times, and one with a domestic flavour like that of the Post.
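To get a feel for how enormous a to-scale China or India would be, here is a rough sketch of area-proportional sizing. The function name and both populations are illustrative round numbers, not figures taken from the Post’s graphic.

```python
import math


def radius_for(population: float, reference_population: float,
               reference_radius: float = 1.0) -> float:
    """Radius of a circle whose AREA is proportional to population."""
    return reference_radius * math.sqrt(population / reference_population)


# Round, illustrative populations; not figures from the Post's graphic.
small_country = 10_000_000
billion_country = 1_000_000_000

r = radius_for(billion_country, reference_population=small_country)
print(f"radius ratio: {r:.1f}x, area ratio: {r ** 2:.0f}x")
```

Under these assumptions a billion-person country needs a hundred times the area of a ten-million-person one, though only ten times the radius. That is big, but it is exactly the kind of big the graphic could have used to make its point.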

Credit for the piece goes to Bonnie Berkowitz and Adrian Blanco.

Water, Water Everywhere Nor Any Drop to Drink

Most of Earth’s surface is covered by water. But, as any of you who have swallowed seawater can attest, it is not exactly drinkable. Instead, mankind evolved to drink freshwater. And as some new data suggests, that might not be as plentiful in the future because some areas are already under extreme stress. Yesterday the New York Times published an article looking at the findings.

More reasons for me not to move to the desert southwest

The piece leads with a large map showing the degree of water stress across the globe. It uses a fairly standard yellow to red spectrum, but note the division of the labels. The High range dwarfs that of the Low, but instead of continuing on, the Extremely High range then shrinks. Unfortunately, the article does not go into the methodology behind that decision and it makes me wonder why the difference in bin sizes.

Of course, any big map makes one wonder about their own local condition. How stressed is Philadelphia, for example? Thankfully, the designers kept that in mind and created an interactive dot plot that marks where each large city falls according to the established bins.

At this scale, it is difficult to find a particular city. I would have liked a quick text search ability to find Philadelphia. Instead, I had to open the source code and search the text there for Philadelphia. But more curiously, I am not certain the graphic shows what the subheading says.

To understand what a third of major urban areas is, we would need to know the total number of said cities. If we knew that, a small number adjacent to the categorisation could be used to create a quick sum. Or a separate graphic showing the breakdown strictly by number of cities could also work. Because seeing where each city falls is both interesting and valuable, especially given how the shown cities are mentioned in the text—it just doesn’t fit the subheading.
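The quick sum I have in mind could look something like the sketch below. The city names and their bins are made up for the example, since I do not have the Times’ underlying list to hand.

```python
from collections import Counter

# Hypothetical (city, stress bin) pairs; not the Times' data.
cities = [
    ("City A", "Extremely high"),
    ("City B", "High"),
    ("City C", "Extremely high"),
    ("City D", "Low"),
    ("City E", "Medium"),
    ("City F", "High"),
]

counts = Counter(stress_bin for _, stress_bin in cities)
total = len(cities)

shares = {stress_bin: n / total for stress_bin, n in counts.items()}
for stress_bin, n in counts.most_common():
    print(f"{stress_bin}: {n} of {total} cities ({shares[stress_bin]:.0%})")
```

A count and a share next to each category label is all it would take for the reader to check a claim like “a third” against the dots.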

But, for those of you from Chicago, I included my former home as a different screenshot. Though I didn’t need to search the source code, because I just happened across it scrolling through the article.

It helps having Lake Michigan right there

Credit for the piece goes to Somini Sengupta and Weiyi Cai.

The Tory Leadership Race: The Favourite and All the Also Rans

This piece was published Monday, so it’s one round out of date, but it still holds true. It looks at the betting odds of each of the candidates looking to enter No. 10 Downing Street. And yeah, it’s going to be Boris.

That’s a pretty sizable gap

The thing that strikes me as odd about this piece, however, is the size of the circles. Why are they larger for Boris Johnson and Rory Stewart? It cannot be proportional to their odds of victory or else Boris’ head would be…even bigger. Is that even possible? Maybe it relates to their predicted placement of first and second, the two of which go to the broader Tory party for a vote. It’s really unclear and deserves some explanation.

The graphic also includes a standard line chart. It falls down because of spaghettification in that all those also rans have about the same odds, i.e. slim, to beat Boris.

Perhaps the most interesting thing to follow is who will be the other person on the ballot. But then who remembers Andrea Leadsom was the runner up to Theresa May?

Credit for the piece goes to the Economist graphics department.

Studying Will Be the Death of Me

At least in certain fields. Happy Thursday all. For me, however, it’s more of a Friday. I am on holiday the next several days, so until I resume posting mid-next week, I will leave you with an xkcd graphic that looks at how what you study can kill you. I think all my economist colleagues are safe.

Where’s design though?

Credit for the piece goes to Randall Munroe.

How Does the UK View Their Political Parties?

The United Kingdom crashes out of the European Union on Friday. That means there is no deal to safeguard continuity of trading arrangements, healthcare, air traffic control, security and intelligence deals, &c. Oh, and it will likely wreck the economy. No big deal, Theresa. But what do UK voters think about their leading political parties in this climate? Thankfully Politico is starting to collect some survey data from areas of marginal constituencies, what Americans might call battleground districts, ahead of the eventual next election.

And it turns out the Tories aren’t doing well. Though it’s not like Labour is performing any better, because polling indicates the public sees Corbyn as an even worse leader than Theresa May. But this post is more to talk about the visualisation of the results.

Of course I naturally wonder about the perception of the smaller parties like the Liberal Democrats or Change UK (the Independent Group).

The graphics above are a screenshot where blue represents the Conservatives (Tories) and red Labour. The key thing about these results is that the questions were framed around a 0–10 scale. But look at the axes. Everything looks nice and evenly spread, until you realise the maximum on the y-axis is only six. The minimum is two. It gives the wrong impression that things are spread out neatly around the midpoint, which here appears to be four. But what happens if you plot it on a full axis? Well, the awfulness of the parties becomes more readily apparent.

Labour might be scoring around a five on Health, but its score is pretty miserable in these other two categories. And don’t worry, the article has more. But this quick reimagination goes to show you how important placing an axis’ minimum and maximum values can be.
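A little arithmetic shows how much the axis limits move a point. This is a minimal sketch: the helper function is my own, and the score of five is only my rough reading of Labour’s Health result from the chart.

```python
def axis_position(value: float, axis_min: float, axis_max: float) -> float:
    """Fraction of the axis height at which a value is drawn (0 = bottom)."""
    return (value - axis_min) / (axis_max - axis_min)


score = 5.0  # roughly Labour's Health score, as read off the chart

truncated = axis_position(score, axis_min=2, axis_max=6)   # axis as published
full = axis_position(score, axis_min=0, axis_max=10)       # full 0-10 scale

print(f"truncated axis: {truncated:.0%} of the way up")
print(f"full axis: {full:.0%} of the way up")
```

On the two-to-six axis, a five sits three-quarters of the way up and looks strong. On the full zero-to-ten scale the same five sits dead centre.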

Credit for the piece goes to the Politico graphics department.