Datagraphic – Page 21 – Coffee Spoons

The Super Short European Super League

Sunday night, news broke that a number of European football clubs were creating a rogue league, the European Super League. My British and European readers—and Americans who follow football—will know the names of Manchester United, Liverpool, AC Milan, Juventus, Real Madrid, and the others.

To put this in perspective for my American readers, imagine the Yankees, Dodgers, Red Sox, Astros, Padres, Mets, Cardinals, Phillies, Angels, and Nationals saying that they were leaving Major League Baseball to go and form their own new baseball league. That they were doing so to “save the sport”. But in so doing, they also guarantee they all make the playoffs every year.

My frequent readers and those who know me will know I’m a fan of the Boston Red Sox. I should point out that John Henry owns both the Red Sox and Liverpool through his company, Fenway Sports Group.

Of course, the analogy doesn’t quite hold up, because there are some significant differences between American sports and European football. Relegation is a big one. Personally, I wish American sports had some way of using relegation to incentivise teams to not intentionally suck.

The basic premise of relegation. Take English football. You have four levels of play and in theory any team can exist in any level. Each year, the worst teams move from their current level down one whilst the best teams move up. And for the top level, the top teams get to compete in lucrative European-wide matches. That is a bit simplistic, but imagine that at the end of last year, the Pirates, Rangers, Tigers, and Red Sox became AAA minor league teams and the four best AAA minor league teams became MLB teams. MLB teams would theoretically try to do everything they could to stay in the MLB and not drop to AAA, because that would mean a loss of money. After all, the Yankees would no longer be heading to Fenway nor the White Sox to Detroit. Would seeing the Detroit Tigers play the Woo Sox really be worth the ticket prices you pay at Comerica Park?

But that’s not how American sports work. And so a few American owners, namely those of Manchester United, Arsenal, and Liverpool, want to ensure a steady stream of money. By creating their own league where their teams cannot be relegated, they guarantee that revenue stream.

In other words, this is all about the owners of these Super League teams making even more money.

Because, during the last year, teams have been hurting without fans in attendance. And that gets us to why I can write this up. Because the BBC in an article about this new league addressed the fact that most of these teams are heavily in debt.

This graphic, however, is a bit misleading. Look at Liverpool. There is no available data for how much financial debt the club holds. So why is it placed between Chelsea and Manchester City? It could well have more debt than Tottenham. Liverpool should really be left off this chart and included in the note, because its placement suggests that it has little debt, when that may well not be the case. This is a really misleading graphic when it comes to how Liverpool fits with the other 11 clubs.

From a design standpoint, I’m also not clear on why the x-axis line extends beyond the labels for £-200m and £600m.

I’m not going to touch all the data labels. That’s for another piece I’ve been working on off and on for a little while now.

At this point I should point out that I was going to post this article later, but in the last 18 hours or so the whole thing has fallen apart as the English teams, followed by the others, have been dropping out under immense pressure from the sport and their fans. To bring back my analogy above, imagine MLB retaliating and saying that if those teams created their own league, the players would not be allowed to play in any other matches and the teams would be locked out from all other competitive baseball games. It’s a mess.

Credit for the piece goes to the BBC graphics department.

Politicising Vaccinations

Yesterday I wrote my usual weekly piece about the progress of the Covid-19 pandemic in the five states I cover. At the end I discussed the progress of vaccinations and how Pennsylvania, Virginia, and Illinois all sit around 25% fully vaccinated. Of course, I leave my write-up at that. But not everyone does.

This past weekend, the New York Times published an article looking at the correlation between Biden–Trump support and rates of vaccination. Perhaps I should not be surprised this kind of piece exists, let alone the premise.

From a design standpoint, the piece makes use of a number of different formats: bars, lines, choropleth maps, and scatter plots. I want to talk about the last of those in this piece. The article begins with two side by side scatter plots, this being the first.

Hesitancy rates compared to the election results

The header ends in an ellipsis, but that makes sense because the next graphic, which I’ll get to shortly, continues the sentence. But let’s look at the rest of the plot.

Starting with the x-axis, we have a fairly simple plot here: votes for the candidates. But note that there is no scale. The header provides the necessary definition of being a share of the vote, but the lack of minimum and maximum makes an accurate assessment a bit tricky. We can’t even be certain that the scales are consistent. If you recall our choropleth maps from the other day, the scale of the orange was inconsistent with the scale of the blue-greys. Though, given this is produced by the Times, I would give them the benefit of the doubt.

Furthermore, we have five different colours. I presume that the darkest blues and reds represent the greatest share. But without a scale let alone a legend, it’s difficult to say for certain. The grey is presumably in the mixed/nearly even bin, again similar to what I described in the first post about choropleths from my recent string.

Finally, if we look at the y-axis, we see a few interesting decisions. The first? The placement of the axis labels. Typically we would see the labelling on the outside of the plot, but here, it’s all aligned on the inside of the plot. Intriguingly, the designers took care for the placement—or have their paragraph/character styles well set—as the text interrupts the axis and grid lines, i.e. the text does not interfere with the grey lines.

The second? Wyoming. I don’t always think that every single chart needs to have all the outliers within the bounds of the plot. I’ve definitely taken the same approach and so I won’t criticise it, but I wonder what the chart would have looked like if the maximum had been 35% and the grid lines were set at intervals of 5%. The tradeoff is likely increased difficulty in labelling the dots. And that too is a decision I’ve made.

Third, the lack of a zero. I feel fairly comfortable assuming the bottom of the y-axis is zero. But I would have gone ahead and labelled it all the same, especially because of how the minimum value for the axis is handled in the next graphic.

Speaking of, moving on to the second graphic we can see the ellipsis completes the sentence.

Vaccination rates compared to the election results

We otherwise run into similar issues. Again, there is a lack of labelling on the x-axis. This makes it difficult to assess whether we are looking at the same scale. I am fairly certain we are, because when I overlap the graphics I can see that the two extremes, Wyoming and Vermont, look to exist on the same places on the axis.

We also still see the same issues for the y-axis. This time the axis represents vaccination rates. I wish this graphic made a little clearer the distinction between partial and full vaccination rates. Partial is good, but full vaccination is what really matters. And while this chart shows Pennsylvania, for example, at over 40% vaccinated, that’s misleading. Full vaccination is 15 points lower, at about 25%. And that’s the number that needs to be up in the 75% range for herd immunity.

But back to the labelling, here the minimum value, 20%, is labelled. I can’t really understand the rationale for labelling the one chart but not the other. It’s clearly not a spacing issue.

I have some concerns about the numbers chosen for the minimum and maximum values of the y-axis. However, towards the middle of the article, this basic construct is used to build a small multiples matrix looking at all 50 states and their rates of vaccination. More on that in a moment.

My last point about this graphic is on the super picky side. Look at the letter g in “of residents given”. It gets clipped. You can still largely read it as a g, but I noticed it. Not sure why it’s happening, though.

So that small multiples graphic I mentioned, well, see below.

Note how these use an expanded version of the larger chart. The y-minimum appears to be 0%, but again, it would be very helpful if that were labelled.

Also for the x-axis in all the charts, I’m not sure every one needs the Biden–Trump label. After all, not every chart has the 0–60% range labelled, but the beginning of each row makes that clear.

In the super picky, I wish that final row were aligned with the four above it. I find it super distracting, but that’s probably just me.

Overall, this is a strong piece that makes good use of a number of the standard data visualisation forms. But I wish the graphics were a bit tighter, which would make them just a little clearer.

Credit for the piece goes to Danielle Ivory, Lauren Leatherby and Robert Gebeloff.

Covid Update: 18 April

Last week I wrote about how we may have been beginning to see divergent patterns in new cases, i.e. how New Jersey in particular had seen its new case numbers falling whilst other states continued with increasing case counts.

One week later, that may still broadly hold true.

Emphasis on may.

New case curves for PA, NJ, DE, VA, & IL.

If we look at the new charts, we can see that broadly, New Jersey did continue its downward trend as Pennsylvania and Delaware experienced significant rises in new cases. Virginia remained fairly stable, but with a slight trend towards increasing numbers of new cases.

But New Jersey and now Illinois present some interesting trends to watch this coming week. Illinois reminds me of New Jersey in that despite rising numbers most of last week, the last few days (and of course the weekend) saw numbers lower than preceding days. You can see from the slightest of dips at the tail of the line the trend has flipped direction. Will the direction hold, however, once we start receiving weekday reporting figures starting Tuesday?

Back to New Jersey, though. The downward trend continued most of the week. But, the last several days could portend a reversal of sorts. For most of the last week, the state saw daily new case numbers increasing day after day. But the trend line, as it should, remained heading downwards. Until just a few days ago. If you look at the tail of the line there, you will see a slight uptick. This too will be something to watch in the coming week.

Deaths also need careful attention this week.

Last week I asked the question, will deaths follow rising cases? After a week of data, the answer is unmistakably yes. However, unlike new cases, the increases are largely of a marginal number. Look closely at the ends of the lines for Pennsylvania, New Jersey, Delaware, and Illinois and you will see last week’s shallow rise continued.

Virginia bucked the trend with decreasing numbers of deaths. And of course marginal increases could easily give way to marginal decreases. Now I try not to mention too many daily numbers in these posts because I take the weekly view, but I will be closely following Pennsylvania this week. For the last several weeks, the Commonwealth regularly reported deaths on Sunday and Monday in the single digits. Yesterday Harrisburg reported 40. Is this a one-day surge of reports? Is the state resuming reporting more deaths at the weekend? Or does it portend something worse, a more significant rise in the number of deaths?

Vaccinations continue apace. Although, I would expect to see some slowdown as the Johnson & Johnson vaccine pause ripples out across the vaccination programme.

Fully vaccinated curves for PA, NJ, DE, VA, & IL.

For now though we continue to see increasing numbers. Indeed, the three states I track have now all reached or should reach today 25% of their population as fully vaccinated.

One, that is good news.

But, two, this is just the beginning.

Last week in some tense questioning about when we can expect resumption of “normal”, Dr. Fauci provided a figure of 10,000 new cases per day across the US. (Currently we are about at 60,000 or so.) Vaccines will impede the transmission as they become ever more widely administered and fully implemented—remember that a first dose of a two-dose regimen does not mean you should be heading out and socialising.

At present, we have Pennsylvania averaging 5,000 new cases per day. In other words, Pennsylvania alone represents half of Dr. Fauci’s target. We are clearly far from that reopening level.

What I will be curious about in the coming weeks though is that interplay between new cases and vaccinations. If Illinois does begin to see a downward trend in new cases this week, how much of it is due to the state being 25% fully vaccinated?

That’s a complex question to answer, but at some point, increasing vaccinations will force new cases to reach an inflection point. First they will begin to bend downward, increasing more slowly instead of exponentially. Then with even more vaccinations a second point will be reached at which this new surge begins to finally turn and new cases drop.

The question is when.
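To make those two points a little more concrete, here is a minimal sketch of how one might look for them in a smoothed series of daily new cases. The function names and the sample numbers are my own invention for illustration, not real data from any of the states, and a real seven-day average would be noisier than this.

```python
def first_differences(values):
    """Day-over-day change: positive means the series is still rising."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def find_turning_points(cases):
    """Return (slowdown_index, peak_index) as positions in `cases`.

    slowdown_index: first day where cases are still rising, but by less
                    than the day before (the curve starts to bend).
    peak_index:     first day after which cases actually fall.
    Either value is None if the series never gets there.
    """
    growth = first_differences(cases)
    slowdown = None
    peak = None
    for i in range(1, len(growth)):
        if slowdown is None and 0 < growth[i] < growth[i - 1]:
            slowdown = i
        if peak is None and growth[i - 1] > 0 and growth[i] < 0:
            peak = i
    return slowdown, peak


# Invented numbers: fast growth, then slowing growth, then decline.
sample_cases = [100, 200, 400, 800, 1100, 1300, 1400, 1350, 1200]
slowdown_day, peak_day = find_turning_points(sample_cases)
```

In this toy series the bend and the turn are several days apart, which is the gap the paragraphs above describe: growth slows first, and only later do cases actually drop.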

Credit for the piece is mine.

Party Time Post-Vaccine

If all goes according to plan, your author today will receive his first dose of the Covid-19 vaccine, the Pfizer variety for the curious. As such, it feels appropriate to share this recent piece from xkcd.

Also looks like some funky bar chord notation.

All joking aside, it should be said, as this graphic illustrates, that just because you receive your first dose doesn’t mean you should be out socialising and seeing people later that night.

You are not fully vaccinated until two weeks after your second dose, or the first if you received Johnson & Johnson. And so while I may be receiving my first dose this afternoon, it is going to be close to a month and a half before I’m able to leave my household unit and socialise with others. Probably three weeks for my second dose and then another two weeks for the vaccine to fully take effect.

Doesn’t mean I won’t be counting the days, though.

Credit for the piece goes to Randall Munroe.

Choropleths…Again

Admittedly, I was trying to find a data set for a piece, but couldn’t find one. So instead for today’s post I’ll turn to something that’s been sitting in my bookmarks for a little while now. It’s a choropleth map from the US Census Bureau looking at population change between the censuses.

The reason I have it bookmarked is for the apportionment map, but I will save apportionment for another post because, well, it’s complicated. But map colours are a thing we’ve been discussing of late and we can extend that conversation here.

What I find interesting about this map is how they used a very dark blue-grey colour for their positive growth and an orange that is a fair bit brighter for negative growth, or population loss. And because of that difference in brightness, the orange really jumps out at you.

To be fair, that’s ideal if you’re trying to talk about where state populations are shrinking, because it focuses attention on declines. But, if you’re trying to present a more neutral position, like this seems to be, that colour choice might not be ideal.

Another issue is that if you look at the legend it simply says loss for that orange. But, look above and you’ll see four bins clearly delimited by ranges of percents for the positive growth. If we are trying to present a more neutral story, the use of the orange places it visually somewhere near the top of that blue-grey spectrum.

If you look at the percentages, however, Michigan’s population decline was 0.6% and Puerto Rico’s 2.2%. If this map used a legend that treated positive and negative growth equally, you would place that one state and one should-be state in a presumably light orange. The scale of their negative growth is equal to something like Ohio, which is in the lightest blue-grey available.

Consequently, this map is a little bit misleading when it comes to negative growth.
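As a rough illustration of what treating positive and negative growth equally could look like, here is a small sketch where the sign picks the hue and the size of the change picks the shade. The break points are hypothetical, since I am not reproducing the Census Bureau's actual bin ranges here, and the function name is mine.

```python
import bisect

# Hypothetical magnitude breaks in percent. These are NOT the Census
# map's real bins; they only stand in for "some shared set of ranges".
MAGNITUDE_EDGES = [5.0, 10.0, 15.0]


def symmetric_bin(change_percent):
    """Return (hue, shade), where shade 0 is the lightest step of that hue."""
    hue = "orange" if change_percent < 0 else "blue-grey"
    # The same breaks apply to losses and gains, so equal magnitudes
    # always land on the same shade step.
    shade = bisect.bisect_right(MAGNITUDE_EDGES, abs(change_percent))
    return hue, shade


michigan = symmetric_bin(-0.6)
puerto_rico = symmetric_bin(-2.2)
```

Under these assumed breaks both Michigan and Puerto Rico fall into the lightest orange step, the mirror of the lightest blue-grey, rather than into a single undifferentiated "loss" colour.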

Credit for the piece goes to the Census Bureau graphics team.

Building Back Better Boston Transit

The alliteration failed at that last word, but it gets the point across. No matter how you may want to define infrastructure, the term always includes transit. In the Boston Globe, an opinion piece proposed how the city and region of Boston could improve upon the city’s mass transit options.

And they made a map.

The map is an interesting one. It uses thick purple lines to indicate the commuter rail branches, not the metro/subway lines. The outside of those lines then encodes the suggested improvements. An orange outline indicates where tracks should be electrified, as Boston still uses diesel engines for some of its commuter rail transit. But the problem is that the dark purple dominates the graphic. If, however, the purple were entirely replaced by an orange line, it would be clearer that the Providence Line needs electrification. (It’s actually already electrified, as that’s the same line Amtrak uses, but Boston’s transit service still uses diesel engines on the line.)

Similarly, the key to indicate upgraded tracks and signals is a blue line of similar “colour” to the purple. That makes it hard to distinguish between the two, especially when next to the green inline option, representing increased speeds.

The key flaw? A long-time wish for Boston transit lovers (or haters). Note how the system is divided in two: the two main hubs, South Station and North Station, do not connect. Connecting the two will require billions of dollars. But the benefits can be tremendous.

Philadelphia, for example, for decades had two rail hubs: Broad Street Station across from City Hall and Reading Terminal several blocks east along Market Street. Reading Terminal was the terminus for the Reading Railroad and Broad Street Station for the Pennsy, or Pennsylvania Railroad. In 1930, Broad Street Station was replaced by an underground station, today’s Suburban Station. But it would not be until 1984 when rail tunnels would finally be opened linking the western/southern Pennsylvania Railroad lines to the northern lines of Reading. But today you can take a train from a southwest suburb to the far northern suburbs without changing trains because of that connection.

Credit for the piece goes to TransitMatters.

Choropleths and Colours Part 2

Last Thursday I wrote about the use of colour in a choropleth map from the Philadelphia Inquirer. Then on Sunday morning, I opened the door to collect the paper and saw a choropleth above the fold for the New York Times. I’ll admit my post was a bit lengthy—I’ve never been one described as short of words—but the key point was how in the Inquirer piece the designer opted to use a blue-to-red palette for what appeared to be a data set whose numbers ran in one direction. The bins described the number of weeks a house remained on the market, in other words, it could only go up as there are no negative weeks.

Compare that to this graphic from the Times.

Here we are not looking at the Philadelphia housing market, but rather the spread of the UK/Kent variant of SARS-CoV-2, the virus that causes COVID-19. (In the states we call it the UK variant, but obviously in the UK they don’t call it the UK variant, they call it the Kent variant from the county in the UK where it first emerged.)

Specifically, the map looks at the share (percent) of the variant, technically named B.1.1.7, in the tests reported for each country. The Inquirer map had six bins, this Times map has five. The Inquirer, as I noted above, went from less than one week to over five weeks. This map divides 100% into five 20-percent bins.
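The five equal bins are simple enough to sketch. This is only my reading of the legend: the function name is made up, and I am assuming a boundary value such as exactly 20% falls into the higher bin, which the map itself does not tell us.

```python
def share_bin(share_percent, bin_count=5):
    """Return the 0-based bin index for a share between 0 and 100 percent."""
    if not 0 <= share_percent <= 100:
        raise ValueError("share must be between 0 and 100")
    width = 100 / bin_count
    # 100% belongs in the top bin rather than opening an extra one.
    return min(int(share_percent // width), bin_count - 1)
```

Because every bin is the same width and the values only run in one direction, a single hue that deepens with the bin index is all the legend needs.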

Unlike the Inquirer map, however, this one keeps to one “colour”. Last week I explained why you’ll see one colour mean yellow to red like we see here.

This map makes better use of colour. It intuitively depicts increasing…virus share, if that’s a phrase, by a deepening red. The equivalent from last week’s map would have, say, 0–40% in different shades of blue. That doesn’t make any sense by default. You could create some kind of benchmark—though off the top of my head none come to mind—where you might want to split the legend into two directions, but in this default setting, one colour headed in one direction makes significant sense.

Separately, the map makes a lot of sense here, because it shows a geographic spread of the variant, rippling outward from the UK. The first significant impacts register in the countries across the Channel and the North Sea. But within four months, the variant can be found in significant percentages across the continent.

Credit for the piece goes to Josh Holder, Allison McCann, Benjamin Mueller, and Bill Marsh.

Covid Update: 11 April

This time last week I wrote about how we should not be surprised at rising levels of coronavirus in the states of Pennsylvania, New Jersey, Delaware, Virginia, and Illinois. After all, our elected officials reopened economies despite data saying they should do otherwise. On top of that, people have been engaging in reckless behaviour and seemingly abandoning the very behaviours that had been leading to declining rates. With those two failures, our last hope is that vaccines will come quickly and be widely taken by the public.

A week hence.

Well, we are beginning to see some divergent patterns, especially with new cases.

Last week there was some evidence that New Jersey might be bucking the trend and headed downwards after weeks of rising new cases. And now that appears to be a more sustained trend as the line for the Garden State’s seven-day average clearly began heading in the right direction this past week.

That’s the good news. The bad news is that we continue to see rising numbers of new cases in Pennsylvania, Delaware, and Illinois. Although if we want to try and find the positives in the bad, we can see that Delaware’s upward trend remains fairly shallow. Illinois, while steeper, is rising from a lower base as the Land of Lincoln managed to reach low, summer levels of new case spread earlier this year. And in Pennsylvania, there is a bend in the curve, an inflection point, that could indicate growth in the number of new cases is slowing. We still need to see it turn negative, but slowing growth is better than increasing growth.

Virginia splits the difference between those sets. It remains at an elevated level of new case transmission, but the upward tick we saw—unlike the other states—was not followed by a general surge in new cases. The little rise we did see, in fact seems to have perhaps shifted back downward.

One of the big questions in this current wave of new cases is will deaths rise? We are seeing increasing numbers of new cases and hospitalisations, but will deaths follow? The hope is that we have vaccinated enough of the most vulnerable populations to prevent them from suffering the most serious of results.

So far so good. While death rates remain slightly elevated over summer levels, we do not yet see any signs of rising numbers of deaths. The only possible exception is Virginia, where deaths bottomed out after the state added delayed death certificates from the holidays, but have risen in recent days.

Finally we have vaccinations. Here is the best news at which we can look. We can now say that at least 20% of the populations of Pennsylvania, Virginia, and Illinois are fully vaccinated. To be clear, that is still a long way from herd immunity levels, but that’s 20 percentage points more than we had four months ago.

Total full vaccination curves for PA, VA, & IL.

One big outstanding question is how much, if at all, can vaccinated people spread coronavirus? This is why all of us, even those who have been vaccinated, need to continue to wear masks and socially distance. But at some point, and I don’t know when, these increasing levels of full vaccination should begin to flatten the new case curves. Could that be what’s flattening the curves in New Jersey, Virginia, and Pennsylvania? It’s too early to say, but one can hope.

Credit for the piece is mine.

But What About Pluto?

Damn you Neil deGrasse Tyson (but not really though)!

Because, you know, he advocated for de-planet-fying Pluto back in the oughts.

Which I mention because of this post from xkcd, which corrects common images of planets in the solar system accounting for their population.

Credit for the piece goes to Randall Munroe.

Choropleths and Colours

In many cities throughout the United States, real estate represents a hot commodity. It’s not difficult to understand why: as I have covered before, Americans are saving a bit more. Coupled with stay-at-home orders in a pandemic, spending that cash on a home down payment makes a lot of sense for a lot of people. But with little new construction, it’s a seller’s market.

The Philadelphia Inquirer covers that angle for the Philadelphia region and in the article, it includes a map looking at time to sell a house. And it’s that interactive map I want to look at briefly this morning.

Primarily I want to discuss the colours, as you can gather from this post’s title. We have six bins here, each indicating an amount of time in one-week intervals. So far so good. Now to the colours, we have red for homes that sell in one week or less and blue for homes that sell in five weeks or more.

Blue to red is a pretty standard choice. You will often see it in maps where you have positive growth to negative growth or something similar. I’ve used it myself on Coffeespoons a number of times, like in this map of population growth at the county level here in Pennsylvania.

In those scenarios, however, note how you have positive values and negative values. The change in colour (hue) encodes the change in numerical value, i.e. positive vs. negative. We then encode the values within that positive or negative range with lighter/darker blues and reds. Most often the darker the blue or red, the greater the value toward the end of the spectrum. For example, in Pennsylvania, the dark blue meant population growth greater than 8% and red meant population declines in excess of 8%.

As an aside you’ll note that there are no dark blue counties in that map and that’s by design. By keeping the legend symmetrical in terms of its minimum and maximum values, we can show how no counties experienced rapid population growth whilst several declined rapidly. If dark blue had meant greater than 4% growth, that angle of the story would have been absent from the map.

Back to our choropleth discussion, however. How does that fit with this map of selling times for homes in the Philadelphia region?

Note first that five weeks is a positive value. But so is one week or less. The use of the red-blue split here is not immediately intuitive. If this map were about the change or growth in how long homes sell, certainly you could see positive and negative rates and those would make sense in red and blue.

The second part to understand about a traditional red-blue choropleth is that at some point you have to switch from red to blue, a mid-point if you will. If you are talking positive/negative like in my Pennsylvania map, zero makes a whole lot of sense. Anything above zero, blue, anything below zero red.

Sometimes, you will see a third colour, maybe a grey or a purple, between that red and blue. That encodes a fuzzier split between positive and negative. Say you want to give a margin of 1%, i.e. any geographic area that has growth between +1% and -1%. That intrinsically means the bin is both positive and negative at the same time, so a neutral colour like grey or a blend of the two colours, a purple in the case of red and blue, makes a whole lot of sense.

Here we have nothing like that. Instead we jump from a light yellow two-to-three weeks to a light blue three-to-four weeks.

What about that yellow? In a spectrum of dark blue to light blue, you will see lighter blues than darker blues. But in a red spectrum, that light red becomes pinkish or salmonish depending on the exact type of red you use. (Conversation for another day.) Personal preferences will often push clients to ask a designer to “use less pink” in their maps. I can’t tell you the number of times I’ve heard that.

If that comes up, designers will often keep their blue side of the legend from the dark to light—no complaints there, or at least I’ve never heard any. But for the red side, they’ll switch to using hue or type of colour instead of dark to light red.

Not all colours are as dark as others. Blue and red can be pretty dark. Yellow, however, is a fairly light colour. Imagine if you converted the colours to greyscale, you’ll have very dark greys for blue and red, but yellow will be consistently far lighter than the other two.
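If you want to check that greyscale claim yourself, a quick sketch follows. It uses the common Rec. 601 luma weights, which is only one of several ways to convert colour to grey, and the pure primary colours here are my stand-ins rather than the Inquirer's actual palette.

```python
def grey_level(red, green, blue):
    """Approximate greyscale value (0-255) using the Rec. 601 luma weights."""
    return round(0.299 * red + 0.587 * green + 0.114 * blue)


# Stand-in colours (assumption): fully saturated screen primaries.
PURE_BLUE = (0, 0, 255)
PURE_RED = (255, 0, 0)
PURE_YELLOW = (255, 255, 0)

greys = {
    "blue": grey_level(*PURE_BLUE),
    "red": grey_level(*PURE_RED),
    "yellow": grey_level(*PURE_YELLOW),
}
```

With these weights blue and red come out as dark greys while yellow sits close to white, which is why yellow can stand in for the light end of a red spectrum.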

The designer can use the light yellow as the light red. But to link the yellow to red, they need to move through the hues or colours between the two. There’s a whole conversation here about colour theory and pigment and light absorption vs. pixels and light emission, but let’s go back to the colours you learned in primary school (pigment and light absorption). Take your colour wheel and what sits between red and yellow? Orange.

And so if a client objects to a light pink, you’ll see a pseudo dark-to-light red spectrum that uses a dark red, a medium orange, and a light yellow. Just like we see here in this Inquirer map.

Back to the two-to-three week and three-to-four week switch, though. What’s the deal? This is my sticking point with the graphic. I am looking for the explanation of why the sudden break in colour here, but I don’t see any obvious one.

Why would you use this colour scheme where blue and red diverge around a non-zero value? Let’s say the average home in the region sells in three weeks, any of the zip codes in red are selling faster than average, hot markets, and those taking longer than average are in blue, cold markets. Maybe it’s the current average, however. What if it were the average last year? Or the national average? These all serve as benchmarks for the presented data and provide valuable context to understand the market.
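A benchmark-based split like that could be sketched as follows. The three-week default is just the hypothetical average from the paragraph above, not a figure from the Inquirer's data, and the function name is mine.

```python
def diverging_side(weeks_to_sell, benchmark_weeks=3.0):
    """Classify a zip code relative to a benchmark rather than to zero.

    Returns (hue, distance), where distance is how many weeks the zip
    code sits from the benchmark and would drive how dark the shade is.
    """
    difference = weeks_to_sell - benchmark_weeks
    if difference < 0:
        return "red", abs(difference)  # selling faster than the benchmark: hot
    if difference > 0:
        return "blue", difference      # selling slower than the benchmark: cold
    return "neutral", 0.0
```

Swapping in last year's average or the national average only means changing `benchmark_weeks`, but whichever one is used, the legend would need to say so for the colour break to make sense to a reader.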

Unfortunately it’s not clear what, if any, benchmarks the divergence point in this map reflects. And if there is no reason to change colours mid-legend, with only six bins, a designer could find a single colour, a blue or purple for example, and then provide five additional lighter/darker shades of that to indicate increasing/decreasing levels of speed at which homes sell.
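For the single-colour alternative, here is one simple way to generate six shades of one hue by blending from white toward a base colour. The dark blue is an arbitrary choice of mine, and a straight blend in RGB is a shortcut: it will not give perceptually even steps the way a carefully designed palette would.

```python
def single_hue_ramp(base_rgb, steps=6):
    """Blend from a pale tint to the full base colour in equal steps."""
    ramp = []
    for i in range(1, steps + 1):
        weight = i / steps  # 1/steps for the palest bin, 1.0 for the darkest
        ramp.append(
            tuple(round(255 + (channel - 255) * weight) for channel in base_rgb)
        )
    return ramp


def to_hex(rgb):
    """Format an (r, g, b) tuple as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Arbitrary dark blue as the darkest bin (assumption, not the Inquirer's colour).
BASE_BLUE = (8, 48, 107)
six_blues = [to_hex(shade) for shade in single_hue_ramp(BASE_BLUE)]
```

Each of the six bins then gets the next shade along the list, so the map reads in one direction from lightest to darkest.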

Overall, I left this piece a wee bit confused. The general trend of regional differences in how quickly homes are selling? I get that. But because there’s a non-logical break between red and blue here—or at least one I fail to see in the graphic—this map would work almost as well if each bin were a separate colour entirely, using ROYGBIV as a base for example.

Credit for the piece goes to John Duchneskie.