After working pretty much non-stop all spring and summer, your humble author finally took a few days off; throw in a bank holiday and you are looking at a five-day weekend. But because this is 2020, travelling was out of the question, and so instead I hunkered down to finish writing and designing an article I have been working on for the last several weeks, if not a few months.
The main write-up (it is a lengthy-ish read, so you may want to brew a cup of tea) is over at my data projects site. This is the first project I have really written about for that site since spring/summer 2016. Some of my longer-term readers may recall that the penultimate piece there, which I wrote about Pennsyltucky, was inspired by work I did here at Coffeespoons.
To an extent, so is this piece. I wrote about Trumpsylvania, the political realignment of the state of Pennsylvania. 2016 and the state’s vote for Donald Trump was less an aberration than many think. It was the near-end result of a decades-long transformation of the state’s political geography. And so I looked at the data underlying the shift and how and where it occurred.
And originally, I had a slightly different conclusion as to how this related to Pennsylvania in the upcoming 2020 election. But, the whole 2020 thing made me shift my thinking slightly. But you’ll have to read the whole thing to understand what I’m talking about. I will leave you with one of the graphics I made for the piece. It looks at who won each county in the state, but also whether or not the candidate was able to flip the county. In other words, was Clinton able to flip a Republican county? Was Trump able to flip a Democratic county?
Let me know what you think.
And of course, many, many thanks to all the people who suffered my ideas, thoughts, and early drafts over the last several weeks. And even more thanks to those who edited it. Any and all mistakes or errors in the piece are all mine and not theirs.
This is from a social media post I made a few days ago, but I think it may be of some relevance or interest to my Coffeespoons followers. With 30+ days to go until the general election, I was curious to see how the landscape has changed for the two parties since 2016.
Well, this project has driven me to a related, but slightly different project that has been consuming my non-work time. Hopefully I will have more on that in the coming days. Without further ado, the post:
Pennsylvania will likely be one of the more critical battleground states in this year’s election. In 2016, then-candidate Trump won the state by less than one percentage point. But four years is a long time and I was curious to see how things have changed.
In the first chart we see counties won by Trump on the right and those won by Clinton on the left. The further from the centre, the greater the candidate’s margin of victory over the other. The top half plots registered Republicans’ margin over Democrats as a percentage of all registered voters in the county (including independents and third parties) and the bottom half does the same for Democrats. The closer a county sits to the centre, the more competitive it is; the further away, the less so.
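For anyone who wants to reproduce the vertical axis, here is a minimal sketch of the registration margin as described above. The function name and the example figures are made up for illustration; they are not the actual county data behind the chart.

```python
def registration_margin(republicans: int, democrats: int, total_registered: int) -> float:
    """Republican margin over Democrats, in percentage points of ALL
    registered voters (independents and third parties included).
    Positive means a Republican lead, negative a Democratic lead."""
    if total_registered <= 0:
        raise ValueError("total_registered must be positive")
    return 100.0 * (republicans - democrats) / total_registered


# Hypothetical county: 45,000 Republicans, 40,000 Democrats, and
# 15,000 others, for 100,000 registered voters in total.
print(registration_margin(45_000, 40_000, 100_000))  # 5.0
```

Dividing by all registered voters, rather than just the two-party total, means a county with many independents will sit closer to the centre even when one party clearly outnumbers the other.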
Trump’s key to victory was the white, working class voter clustered in the west and the northeast of the state–old mining and steel towns. There Democrats normally counted on organised labour support as registered Democrats. That all but collapsed in 2016. The bottom right shows a number of nominally Democratic counties Trump won, whereas Clinton only picked up one Republican county, Chester.
But what are PA’s battlegrounds?
In the second chart we ignore places like Philly and Fulton County and zoom in on more competitive counties within 20 point margins. Polls presently point to a Biden lead of about 5 points in PA. If every dot moved left by 5 points (it doesn’t really work like that), we only see Erie and Northampton with potential to flip.
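The "move every dot left by 5 points" thought experiment is what is usually called a uniform swing. A rough sketch of it, with placeholder county names and invented margins rather than the real 2016 results, might look like this:

```python
def counties_that_flip(trump_margins: dict, swing: float) -> list:
    """Return the counties Trump won whose margin turns negative after
    shifting every county toward the Democrats by `swing` points.
    Margins are Trump minus Clinton, in percentage points."""
    flipped = []
    for county, margin in trump_margins.items():
        shifted = margin - swing
        if margin > 0 and shifted < 0:
            flipped.append(county)
    return flipped


# Invented margins for illustration only.
example = {"County A": 2.0, "County B": 4.5, "County C": 12.0, "County D": -8.0}
print(counties_that_flip(example, swing=5.0))  # ['County A', 'County B']
```

As noted, real swings are never uniform, so this is best read as a quick screen for which counties are even in range, not as a forecast.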
But Trump’s realignment of politics is accelerating (more on this another day) a realignment of PA’s political geography.
In the fourth chart, neither Erie nor Northampton show any real movement via party registration back to Democrats. Erie may flip, but Northampton’s likely a stretch. Places like Cumberland and Lancaster counties are too solidly Republican to flip this year. Instead Trump is more likely to flip counties like Monroe and Lehigh red, even if he loses the state.
Because, not shown, the key to a Biden victory will be running up the margins in Philly & Pittsburgh, and to a lesser extent Philly’s four collar counties, including Chester, which appears to be rapidly shifting in Democrats’ favour.
Earlier this week, some of the work my team does was published. We produced a one-page summary of a far larger and more comprehensive (relative to the scope of the summary) survey of consumers during the Covid Recession. I will spare you the details of recreating existing templates from scratch and the design decisions that went into that bit, neither insignificant nor unsubstantial, and rather focus on the one graphic we designed.
The broad thrust of the summary is that while overall we are beginning to see some job recovery, the recovery is uneven and, in fact, those below the age of 36 are getting hit pretty hard (my words, not the authors’). While in some industries the young are recovering in good numbers, in other industries, those with a larger share of the youth population, young people are still losing jobs. We then broke those top-line numbers out by industry in the graphic below, captured by screenshot.
There are a couple of things from a design side to discuss. We had about two or three days from when we started the project to develop some ideas and then execute and produce the summary. And as I noted above, that also included quite a bit of time in emulating existing documents and building ourselves a new template should we need to do something similar in the future.
But for that graphic in particular, there’s one thing I wanted to highlight: the lack of values on the axis. The challenge here was that the data displayed is people not working. And when we compared this time period (Wave 3) to the earlier waves, we were looking for declines. And so if we were going to say that the 36+ are gaining construction jobs, that would be a -2% value, and the youth would be at about a -13% increase. If you are doing a bit of a double-take at a negative increase, so did the team. Ultimately, we used the data to generate the chart, but then opted for qualitative labelling on the axes. The labels simply indicate that in one direction youth are gaining jobs and in the other losing them, and the same for the 36+. To reinforce this idea, we also added some descriptors in the far corner of each quadrant that said whether the age groups were gaining or losing jobs.
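To make the sign convention concrete, here is a small sketch of how a change in the share of people not working could be turned into the kind of qualitative label described above. The function names and the exact label wording are hypothetical, not what appears in the published graphic; the two values are the construction figures quoted above.

```python
def job_direction(change_not_working: float) -> str:
    """Translate a change in people NOT working into plain language.
    A negative change means fewer people out of work, i.e. jobs gained."""
    if change_not_working < 0:
        return "gaining jobs"
    if change_not_working > 0:
        return "losing jobs"
    return "no change"


def quadrant_label(youth_change: float, older_change: float) -> str:
    """Describe the quadrant a point falls in (wording is illustrative)."""
    return f"Under 36 {job_direction(youth_change)}, 36+ {job_direction(older_change)}"


# Construction, per the text: about -13 for youth and -2 for the 36+.
print(quadrant_label(-13.0, -2.0))  # Under 36 gaining jobs, 36+ gaining jobs
```

The point of the labels is exactly this translation: the reader never has to parse a "negative increase", only which direction means recovery.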
Despite the unusual design decisions I took in the graphic, I’m really proud of this piece especially given its tight turnaround. It shows in almost real-time how fractured the recovery—is this a recovery?—is at this point.
Credit for the piece goes to the team on this, Tom Akana, Kate Gamble, Natalie Spingler, and myself.
This past weekend I continued looking at the spread of COVID-19 across the United States. But in addition to my usual maps of Pennsylvania, New Jersey, Delaware, Virginia, and Illinois, I also looked at the number of cases across the United States adjusted for population. I then looked at the five aforementioned states in terms of new cases to see if the curve is flattening. Finally, I looked at the number of hospital beds per 1000 people vs the number of cases per 1000 people.
The latter in particular I wanted to be an examination of hospitalisation rates vs ICU beds, which are a small fraction of total hospital beds. But as I could not find that data, I made do with overall cases and overall beds.
So first let’s look at the cases across the U.S. What you can see is that whilst New York and New Jersey do have some of the worst of the impact, Washington is still not great and Louisiana and Michigan are also suffering.
And then when we look at the states by their cases per 1000 people and their hospital beds per 1000 people, we see that the states often claimed to be overwhelmed (New York, New Jersey, and Washington) are all well over, or near, the blue line, which indicates an equal number of beds and cases per 1000 people. It is important to remember, however, that not all beds are the type needed for COVID-19 victims, who often require the more fully kitted out ICU beds. Additionally, not all cases are severe enough to warrant hospitalisation.
Then from the broader national view, we can look at the states of interest. Here, those of you who have been following my social media posts will see fewer dark purples in these maps. That’s because I have adopted a new palette that has sacrificed granularity at the lower end of the scale and added it at the top, a particular need in New Jersey and the Philadelphia and Chicago metro areas. And finally we look at the daily new cases to see if that curve is flattening.
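The comparison against the blue line boils down to two per-capita rates. A minimal sketch, using invented numbers instead of any state's actual figures, could be:

```python
def per_thousand(count: float, population: float) -> float:
    """Express a raw count as a rate per 1000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * count / population


def above_parity_line(cases: float, beds: float, population: float) -> bool:
    """True when cases per 1000 exceed hospital beds per 1000, i.e. the
    state sits on the 'more cases than beds' side of the equality line."""
    return per_thousand(cases, population) > per_thousand(beds, population)


# Invented state: 1,000,000 residents, 4,000 cases, 2,500 beds.
print(per_thousand(4_000, 1_000_000))              # 4.0
print(per_thousand(2_500, 1_000_000))              # 2.5
print(above_parity_line(4_000, 2_500, 1_000_000))  # True
```

Because both rates share the same denominator, the check is equivalent to comparing raw cases with raw beds; the per-1000 framing matters for plotting states of very different sizes on one chart.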
Pennsylvania now has almost every county infected. But unlike Illinois, which has a similar infection rate but more unaffected counties, Pennsylvania has fewer cases in its big city, Philadelphia, and has more cases in the smaller cities and towns.
New Jersey is just a disaster. Deaths are now reported in every county, so I can probably remove those orange outlines. The only potential good news is that new cases for the second day in a row were fewer than the day before. It could be a blip. But it could also be a signal that the peak of infection has arrived or is nearing. That said, hospitalisations and deaths are lagging indicators and could take two weeks to follow the positive test results. So even in the best-case scenario that this is a peak, New Jersey is far from out of the woods.
Delaware is the smallest state I look at—and one of the smallest in the union overall—but its cases are worryingly increasing rapidly, although like every state I examine in detail it had fewer new cases Sunday than Saturday.
Virginia is in a better spot overall than the other four states. You can see that in the national map above. And most of Virginia’s cases are concentrated in the DC and Richmond areas as well as the cities along the peninsulas jutting into the Chesapeake.
Illinois is, as noted above, similar to Pennsylvania in terms of infections. In terms of deaths, however, it is doubling Pennsylvania’s numbers. And most of its cases are located in and around Chicago. Big chunks of downstate Illinois are unaffected or lightly affected compared to the Commonwealth.
Finally, as I noted in New Jersey, could these lower numbers Sunday than Saturday be meaningful? Possibly. But in all five states? Highly unlikely. Regardless, we can look at the number of daily new cases and see if that curve of infection is flattening. We should wait several days before beginning to make that assessment. But one can hope.
All of this is to say that things are bad and likely will continue to get worse. But I will keep looking at the data daily and presenting it to the public to keep them informed.
Yesterday we looked at the expansion of city footprints by sprawl, in modern years largely thanks to the automobile. Today, I want to go back to another article I’ve been saving for a wee bit. This one comes from the Economist, though it dates only back to the beginning of October.
This article looks at the different ways a city can achieve density. Usually one thinks of soaring skyscrapers, but there are other paths. For those interested, the article is a short read and I won’t cover it here. But for the sake of the graphic below, there are three basic paths: coverage, height, and crowding. Or to put it in other terms, how much of the city is covered by homes, how tall those homes go, and how many people fit into each home.
I really like this graphic. It does a great job of using small multiples to compare and contrast three cities that exemplify the different paths. Notably, it keeps each city footprint at the same scale, making it easier to see things such as why Hong Kong builds skyward. Because it has little land. (It is, after all, an island and the tip of a peninsula.)
One area where I wish the graphic had kept to the small multiples is its display of Minneapolis. There, the scale shifts (note the lines for 5 km below vs. Minneapolis’ 10 km). I think I understand why, because the sprawling city would not have fit within the confines of the graphic, but that would have also hammered home the point of sprawl.
I should also point out that the article begins with a graphic I chose not to screenshot, but that I also really enjoy. It uses small multiples to compare cities’ density over time, running population on the x-axis and people per hectare on the y-axis. It is not a perfect graphic (it uses, I think, unnecessary arrowheads at the end of the line), but scatter plots over time are, I think, an underused way to show how two variables (ideally related) have moved in tandem over time.
Overall, this is a strong piece from the Economist.
Credit for the piece goes to the Economist graphics department.
For all my American readers, I hope you all enjoyed your Labour Day holiday. For the rest of you, today is just a Tuesday. Unless you live in the Bahamas, in which case today is just another nightmarish day as Hurricane Dorian continues his assault on the islands.
The storm will be one for the record books when all is said and done, and not just because the damage is likely to be catastrophic once people can finally emerge and examine what remains. The storm, by several metrics, is one of the most powerful in the Atlantic since we started recording data on hurricanes. If we look at pressure and sustained wind speeds, i.e. not wind gusts, Sam Lillo has plotted the path of Dorian through those metrics and found it sitting scarily in the lower-right corner of this plot.
The graphic does a couple of nice things here. I like the use of colour to indicate the total number of observations in that area. Clearly, we see a lot more of the weaker, higher pressure storms. Hence the dark blue in the upper-left. But then against that we have the star of the graphic, and my favourite part of the plot: the plot over time of Dorian’s progress and intensification as a storm. The final green dot indicates the point of the last observation when the graphic was made.
Overall this is a simple and solid piece that shows in the available historical context just how powerful Dorian is. Unfortunately that correlates with likely heavy damage to the Bahamas.
Credit for the piece I presume goes to Sam Lillo, though with the Twitter one can never be entirely certain.
Yesterday we looked at the New York Times coverage of some water stress climate data and how some US cities fit within the context of the world’s largest cities. Well today we look at how the Washington Post covered the same data set. This time, however, they took a more domestic-centred approach and focused on the US, but at the state level.
Both pieces start with a map to anchor the piece. However, whereas the Times began with a world map, the Post uses a map of the United States. And instead of highlighting particular cities, it labels states mentioned in the following article.
Interestingly, whereas the Times piece showed areas of No Data, including sections of the desert southwest, here the Post appears to be labelling those areas as “arid area”. We also see two different approaches to handling the data display and the bin ranges. Whereas the Times used a continuous gradient, the Post opts for a discrete gradient, with sharply defined edges from one bin to the next. Of course, a close examination of the Times map shows that they used a continuous gradient in the legend, but a discrete application on the map. The discrete application makes it far easier to compare areas directly. With a continuous gradient, it is by nature harder to distinguish between areas with relatively close values.
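To illustrate what a discrete application means in practice, here is a small sketch that assigns a value to one of a handful of bins. The 0 to 100 scale, the bin edges, and the labels are all assumptions made up for this example; they are not the thresholds used by either the Times or the Post.

```python
from bisect import bisect_right

# Hypothetical bin edges on an assumed 0-100 stress score.
BIN_EDGES = [20, 40, 60, 80]
BIN_LABELS = ["low", "low-medium", "medium-high", "high", "extremely high"]


def stress_bin(value: float) -> str:
    """Return the label of the bin a value falls into.
    A value equal to an edge goes into the higher bin."""
    return BIN_LABELS[bisect_right(BIN_EDGES, value)]


print(stress_bin(39))  # low-medium
print(stress_bin(41))  # medium-high
print(stress_bin(59))  # medium-high
```

With bins, 41 and 59 get exactly the same colour while 39 and 41 do not, which is the trade-off: areas are easy to compare across the map, at the cost of hard edges between nearly identical values.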
The next biggest distinguishing characteristic is that the Post’s approach is not interactive. Instead, we have only static graphics. But more importantly, the Post opts for a state-level approach. The second graphic looks at the water stress level, but then plots it against daily per capita water use.
My question is from the data side. Whence does the water use data come? It is not exactly specified. Nor does the graphic provide any axis limits for either the x- or the y-axis. What this graphic did make me curious about, however, was the cause of the high water consumption. How much consumption is due to water-intensive agricultural purposes? That might be a better use of the colour dimension of the graphic than tying it to the water stress levels.
The third graphic looks at the international dimension of the dataset, which is where the Times started.
Here we have an interesting use of area to size population. In the second graphic, each state is sized by population. Here, we have countries sized by population as well. Except, the note at the bottom of the graphic notes that neither China nor India is sized to scale. And that makes sense since both countries have over a billion people. But, if the graphic is trying to use size in the one dimension, it should be consistent and make China and India enormous. If anything, it would show the scale of the problem of being high stress countries with enormous populations.
I also like how this graphic, while static in nature, groups each country into a regional classification based upon the continent where the country is located.
Overall this, like the Times piece, is a solid graphic with a few little flaws. But the fascinating bit is how the same dataset can create two stories with two different foci. One with an international flavour like that of the Times, and one with a domestic flavour like that of the Post.
Credit for the piece goes to Bonnie Berkowitz and Adrian Blanco.
Earlier this month the Economist published an article that looked at a different way of measuring the economic output of North Korea. The state is so secretive that the publicly available data we all rely on for almost every country is not available. Nor would we necessarily believe their figures. So we have to rely on other measures to estimate the North Korean economy.
The article is about how luminosity, i.e. the lights on seen from space at night, can be used as a proxy for economic activity in the reclusive state.
The article is a fascinating read and uses a scatter plot to show the correlation between luminosity and GDP per capita, and then how that translates to North Korea, comparing it to older models.
Credit for the piece goes to the Economist graphics department.
In science news, we turn to graphics about planets and things. Specifically we are talking about exoplanets, i.e. planets that exist outside our solar system. Keep in mind that we have only been able to detect exoplanets since the 1990s. Prior to then, how rare was our system with all our planets? It could have been very rare. Now we know, probably not so much.
But, in all of that discovery, we are missing entire types of planets. This article published by Forbes does a nice job explaining why. One of the key types of planets that we have been unable to discover heretofore is the intermediately distant giant planet. Think the Jupiters and Saturns of our system. Prior to now we could detect massive Jupiter-like planets orbiting super near to their distant stars. Or, we could detect super massive planets orbiting very far away. The in-betweeners? Not so much.
The above screenshot does a good job of showing where new detection methods have allowed scientists to begin to fill in the gaps. It shows that there is an enormous gap in what we have discovered, and how those planets have been discovered. And the article does a nice job explaining how the science works, in that only now are our longer periods of observation helping to resolve certain issues.
From a design standpoint, this isn’t a super complicated graphic. It does rely upon a logarithmic scale, which isn’t common outside scientific or academic papers. But this graphic comes from that environment, so it makes a lot of sense. The article is full of graphics from third-party sources, but I found this the most informative because of that very gap it highlights and how the new work (the stars) begins to fill it in.
Credit for the screenshotted piece goes to E. L. Rickman et al.
Yesterday we looked at how China and the European Union are planning their tariff/trade war retaliation to target Trump voters. Today let’s take a look at how those voters are doing, as this article from Bloomberg does.
The article is not terribly complicated. We have four choropleth maps at the county level. Two of the maps isolate Trump-won counties and the other two are Clinton-won. For each candidate we have a GDP growth and an employment growth map.
In the Trump-won maps, the Clinton-won counties are white, and vice versa. Naturally, because the Democratic vote is greatest in the large cities, which, especially on the East Coast, are in tiny counties, you see a lot less colour in the Clinton maps.
Design wise, I should point out the obvious that green-to-red maps are not usually ideal. But the designers did a nice job of tweaking these specific colours so that when tested, these burnt oranges and green-blues do provide contrast.
But I am really curious to see this data plotted out in a scatter plot. Of course the big counties in the desert southwest are noticeable. But what about Philadelphia County? Cook County? Kings County? A scatter plot would make them equally tiny dots. Well, hopefully not tiny. But then when you compare GDP growth and employment growth and benchmark them against the US average, we might see some interesting patterns emerge that are otherwise masked behind the hugeness of western counties.
But lastly. And always. Where. Are. Alaska. And. Hawaii? (Of course the hugeness problem is of a different scale in Alaska. Its county equivalents are larger than some states combined.)
Credit for the piece goes to the Bloomberg graphics department.