Almost a month ago I wrote about how the Pennsylvania Supreme Court was considering a case involving the state’s heavily gerrymandered US congressional districts, which some have called among the worst in the nation. About a week later the Pennsylvania Supreme Court decided that the map is in fact so gerrymandered it violates the Pennsylvania Constitution. It ordered the Republican-controlled legislature to create a new, non-gerrymandered map that would have to be approved by the Democratic governor. I did not write up the Pennsylvania Republicans’ subsequent appeal to the US Supreme Court (no graphics for that story). Justice Alito rejected that appeal, and with only days to spare the state legislature created a new map and submitted it on Friday.
The problem, according to the governor and outside analysts, is that the map is just as gerrymandered as the previous one. Consequently, yesterday the governor rejected the new map and so now the Pennsylvania Supreme Court, working with outside experts in political redistricting, will create a new congressional map for Pennsylvania. Hopefully before May when the state has its first primaries.
But just how do we know that the new map, despite looking different, was just as gerrymandered? Well, the Washington Post plotted the 2016 election margins for the existing districts, using precinct data, against the margins for the legislature’s proposed 2018 districts overlaid atop those same precincts. What did they get? Almost identical results. The districts are no longer Goofy Kicking Donald Duck-esque, but they exhibit the same Republican bias of the previous map.
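For the curious, the idea behind that comparison can be sketched in a few lines of Python. This is only my rough illustration, not the Post's actual method or data: the precinct vote counts, the district assignments, and the `district_margins` helper are all made up for the example.

```python
from collections import defaultdict

def district_margins(precincts, assignment):
    """Sum precinct votes by district; return the Republican margin in points."""
    totals = defaultdict(lambda: [0, 0])  # district -> [Republican, Democratic]
    for precinct_id, (rep, dem) in precincts.items():
        district = assignment[precinct_id]
        totals[district][0] += rep
        totals[district][1] += dem
    return {d: 100.0 * (r - dm) / (r + dm) for d, (r, dm) in totals.items()}

# Hypothetical precinct results: id -> (Republican votes, Democratic votes)
precincts = {"p1": (600, 400), "p2": (300, 700), "p3": (550, 450), "p4": (450, 550)}

# Two hypothetical maps assigning the same precincts to districts
old_map = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
new_map = {"p1": "A", "p2": "B", "p3": "A", "p4": "B"}

old = district_margins(precincts, old_map)
new = district_margins(precincts, new_map)
```

The votes never change, only the lines drawn around them. Plot `old` against `new` for each district and any point sitting on the diagonal is a district whose margin did not move.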
For the purposes of design, I probably would have dropped the “PA-” labels, as they are redundant since the whole plot examines Pennsylvania congressional districts. I think that change, perhaps with a marker and maybe a line of no-change, would go a bit further in more clearly showing how the ultimately rejected map was nearly identical to its previous incarnation.
Credit for the map borders goes to the Pennsylvania state legislature, the version here to the Washington Post Wonkblog. All Wonkblog for the scatterplot.
Off of yesterday’s piece looking at the potential slowdown in British economic growth post-Brexit, I wanted to look at a piece from the Economist exploring the state of the UK’s current trade deals.
I understand what is going on, with the size of the bubbles relating to British exports and the colour to the depth of the free trade deal, i.e. how complex, thorough, and wide-ranging. But the grouping by quadrant?
With trade, geographical proximity is a factor. Things that come from farther cost more because fuel, labour time, &c. One of the advantages the UK currently has is the presence of a massive market on its doorstep with which it already has tariff- and customs-less trade—the European Union.
Consequently, could the graphic somehow incorporate the element of distance? The problem would be how to account for routes, modes of transport, time—how long does a lorry have to queue at the border, for example. Alas, I do not have a great answer.
Regardless of my concepts, this piece does show how the most valuable trade partners already enjoy the deepest and largest trade deals, all through the European Union. And so the UK will need to work to replicate those deals with all of these various countries.
Credit for the piece goes to the Economist Data Team.
Baseball season begins next week. For different teams it starts different days, but for the Red Sox at least, pitchers and catchers report to Spring Training on Tuesday. But the Red Sox, along with many other teams throughout baseball, have holes in their roster. Why? Arguably because nearly 100 free agents remain unsigned.
I do not intend to go into the different theories as to why, but this has been a remarkably slow offseason. How do we know? Well, using MLB Trade Rumours’ listing of the top-50 free agents this year, and the signings reported on Baseball Reference, we can look at the upper and middle, or maybe upper-middle, tiers of free agency.
Kind of messy to look at with all the player labels, but we can see here the projected contracts, in both length and total value, along with the contracts players signed, if they have. And for context we can see how those contracts compare to the Qualifying Offer (QO). What’s that? Complicated baseball stuff that is meant to ensure teams that lose stars or highly valuable players are compensated, especially since they might come from smaller market teams that cannot afford superstar prices. The QO is meant to help competitiveness in the sport. How does it do that? Let’s just say complicated baseball stuff. We should also point out that some players, most notably the Yankees’ Masahiro Tanaka, were expected to opt out of their contracts and try the free market. Tanaka did not, which is why his projection was so far off.
So is it true that free agency is moving or has moved slowly? Consider that approximately 100 free agents remain unsigned as of late Thursday night (please no big signings tomorrow morning) and that of the top 50, 22 of them remain unsigned. And if we take the QO as a proxy for the best players in the game, and add in two players who were exempt because baseball stuff, we can say that 8 of the 11 best players remain unsigned. Though, in fairness to ownership, three of those players are reportedly sitting on multi-year offers in the nine-figure range.
But if players are unsigned, does that mean they are competing for lower value contracts? Possibly. If we use MLB Trade Rumours’ projected contracts, because in years past they have proven smart at these things, we can see that for the 28 who have signed, it’s a roughly even split in terms of the number of players who have signed for more or less than their projection. Sometimes however, non-monetary factors come into play. Two notable free agents, Todd Frazier and Addison Reed, both reportedly signed lesser value contracts to play closer to a specified geography, in Frazier’s case the Northeast and in Reed’s the Midwest.
But the telling part in that graphic is not necessarily the vertical movement, i.e. dollars, but the horizontal movement. (Though we should call out the cases of Carlos Santana and Tyler Chatwood, signed by the Phillies and Cubs respectively, who did far better than projected.) Consider that a team might not have a lot of money to spend and so might extend a contract over additional years, offering job security to a player. Or in a bidding war, the length of the contract might be what leads a player to pick one team over another. In those cases we would expect to see more left-to-right movement. So far we have only had one player, Lorenzo Cain, who signed for more years than expected. Most players who have signed for less have also signed for fewer years. Note the cluster of right-to-left, or shorter-than-expected, contracts in the lower tiers versus the small, vertical-only cluster in the same section for those signing greater than projected contracts.
Lastly, are these trends hitting any specific positional type of player? Well maybe. Ignoring the market for catchers because of how small the pool was (though the case of Jonathan Lucroy as the unsigned catcher is fascinating), we can see that the market has really been there for relief pitchers, as few of the top-50 remain on the market. Starting pitchers and outfielders, while quite a few are still on the market, have generally done better than projected. But infielders lag behind, with numerous players unsigned, and most of those who have signed did so for less than projected.
But at the same time, I would fully expect that once these higher level free agents come off the board—while one would think they would certainly be signed, who knows in such a weird offseason as this—the unsigned middle and lower tiers will quickly follow suit.
Of course none of this touches upon age. (Largely because of a lack of time on my part.) Though, in most cases, getting to free agency in and of itself makes a player older by definition the way baseball’s pre-arbitration and arbitration salary periods work. (Again, more baseball stuff but suffice it to say your first several years you play for peanuts and crackerjacks.)
Hopefully by this afternoon—Friday that is—some of these players will have signed. After all, baseball starts next week. If we are lucky this post will be outdated, at least in terms of the dataset, come Monday. Regardless, it has been a fascinating albeit boring baseball offseason.
Credit for the data goes to MLB Trade Rumours and Baseball Reference.
In the United Kingdom, the month of January has been less than stellar for the National Health Service, the NHS, as surgeries have been cancelled or delayed, patients have been left waiting in corridors, and there has been a shortage of staff to cope with higher-than-usual demand.
But another problem is the shortage of hospital beds, which compounds problems elsewhere in hospitals and health services. The Guardian did a nice job last week of capturing the state of bed capacity in some hospitals. Overall, the piece uses line charts and scatter plots to tell the story, but this screenshot in particular is a lovely small multiples set that shows how even with surge capacity, the beds in orange, many hospitals are running at near 100% capacity.
Well, the data speaks for itself. I wanted to use this screenshot, however, to show you the story because I think it does a fantastic job. Without having to read the article, the image encapsulates what is to come in the article.
That said, there are a few other scatter plots worth checking out if the topic is of interest. And the explanation of the data makes all the more sense.
But I really loved the impact of that homepage.
Credit for the piece goes to Max Fisher and Josh Keller.
I’ve worked on a few scatter plots of late and so this piece from the Economist grabbed my attention. It examines the correlation between unemployment rates and inflation rates. Broadly speaking, the theory, called the Phillips Curve, has been that low unemployment rates lead to high inflation rates. But the United States has had low unemployment rates now for a few years, and yet inflation is around that ideal 2% realm.
The graphic does a nice job of showing three data series all in one plot. Normally, I would argue for splitting the chart into three smaller plots, a la the small multiples. But here, the data aligns just well enough that the overlapping is minimal. And smart colour choices mean that each data range appears clearly separate from the rest. A nice thoughtful addition is the annotations to the time period are set in the same colour as the dots themselves.
My only two quibbles: One, I would probably increase the height of the chart to better show the trend line. I find that for scatter plots, a more squarish profile works better than the long rectangle. Second, I would consider adding a zero line to the x-axis to show 0% cyclical unemployment. But that might also not be terribly useful, because you can see how the curve should move regardless of that natural line. Overall, though, a really well done chart.
Full disclosure: the Economist article cites a paper from the Philadelphia Fed Research Department, which employs me.
Credit for the piece goes to the Economist Data Team.
When I lived in Chicago, people back East would always ask if I was worried about murder and gun crime in Chicago. My reply was always, “no, not really”. Why? Because I lived in generally safe neighbourhoods. But on that topic, the second most numerous question/comment was always, why are the strict gun laws in Chicago not preventing these crimes? More often than not the question had more to do with saying gun control laws were ineffective.
But in Chicago, it seemed to me to be fairly common knowledge that most of the guns people used to commit crimes were, in fact, not purchased in Illinois. Rather, criminals imported them from neighbouring states that had far looser regulations on firearms.
They bring back more than just cheese from Wisconsin…
I am not the biggest fan of the maps that they use, although I understand why. Most Americans would probably not be able to name the states bordering Illinois, California, or Maryland (the latter two being the other states examined this way), and the maps help ground readers in that geographically important context. But, thankfully, the designers opted for another graphic further down in the article that explores the data set with a more nuanced approach. Surprise, surprise, it’s not that simple of an issue.
I know I have said it before, but I like the increasing number of graphics-led articles published by Politico. Many policy and politics stories are driven—or should be driven—by data. But, myself included, we cannot hit it out of the park at every plate appearance. And that is what we have from Politico today, actually last week.
The graphic focuses on the healthcare industry and its need for a larger labour force in coming years as the baby boomers continue to age and start to retire. If their own doctors retire along with them, who will be their new doctors?
But there are two components of the graphic on which I want to focus. The first is the projection of the number of registered nurses (RNs) in 2024 compared to a 2014 baseline.
The story focuses on the future condition, but that colour is set to the lighter green thus drawing the reader’s eyes to the 2014 data point. Flipping those two colours would shift the focus of the chart to the 2024 timeframe, which would better match the text above.
Then we have the design decision to include a line chart for the growth rate, presumably total, for each category of RN from 2014 to 2024. The problem is that the chart itself does not sit on any baseline. While I do not care for the dual axis chart, that format at least keeps an axis legend on the right side of the chart. (You still have the problem of implying certain things based on what scale you choose to use relative to the first data series.) Here, because there are no chart lines associated with the growth data, I wonder if a table below the x-axis labels would be more efficient? Home health care, a very small category, will have the highest growth (a small change from a small base will beat the same small change or even slightly bigger changes from a far larger base) but the eye has the furthest to travel to reach the 61% number from the top of the bars or the labelling.
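To make that small-base point concrete, here is a quick bit of arithmetic in Python. Only the 61% figure comes from the graphic; the job counts below are hypothetical numbers I picked for illustration, and `pct_growth` is just a helper for this example.

```python
def pct_growth(base, added):
    """Percentage growth when `added` jobs are added to a starting `base`."""
    return 100.0 * added / base

# Hypothetical small category: 6,100 new jobs on a base of 10,000
small_category = pct_growth(10_000, 6_100)

# Hypothetical large category: 100,000 new jobs on a base of 1,000,000
large_category = pct_growth(1_000_000, 100_000)
```

In this made-up case the large category adds more than sixteen times as many jobs, yet its growth rate is 10% against 61% for the small one.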
The other component I wanted to discuss is the scatter plot that compares the number of jobs to their average salary.
But this is a bubble chart, not a scatter plot, and so we have a third variable encoded in the size of the dot/bubble. The first thing I looked for was a scale for the size of the circles. What magnitude is the RN circle vs. the Personal Care Aides circle? There is none, but unfortunately that seems to be a common practice with bubble charts. But after failing to find that, I noticed that the circles decrease in size from right to left. That was when I looked to the legend and saw the y-axis in numbers of jobs and the x-axis in average salary. But then the circles are sized in proportion to the average salary of each profession to the other. In other words, the circles are basically re-plotting the x-axis. The physical therapist circle should be roughly twice as large, by area, as the vocational nurses. But we can also just see that by the x-axis coordinates. The bubble chart-ness of the chart is unnecessary and the data could be told more clearly by stripping that away and making a straight-up scatter plot where all the circles are sized the same.
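As an aside, sizing by area rather than by radius is where bubble charts often go wrong. A minimal sketch of the arithmetic, assuming we want circle area proportional to the value; the function name and the reference radius of 10 are mine, not anything from the Politico piece.

```python
import math

def radius_for_value(value, ref_value, ref_radius):
    """Radius that makes circle area proportional to value.

    Area scales with radius squared, so radius scales with the square root.
    """
    return ref_radius * math.sqrt(value / ref_value)

# A value twice the reference, with the reference drawn at radius 10 (assumed)
r = radius_for_value(2.0, 1.0, 10.0)

# Ratio of the two circle areas
area_ratio = (r / 10.0) ** 2
```

Doubling the value should increase the radius by about 41%, not 100%. Doubling the radius would quadruple the area and exaggerate the difference.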
Credit for the piece goes to Christina Animashaun.
This has been a busy week. I am working on a small piece on the Red Sox managers in the free agency period (I thought it would be ready yesterday, but not so much), but news continues to happen outside of the baseball world. One of the biggest stories, at least in the US, would have to be the speech by Senator Flake of Arizona, who announced he would not seek re-election in 2018.
So cue the politically-themed graphics. Today’s piece comes from the Washington Post. The graphic itself is not terribly complex as it is a scatter plot comparing the liberal/conservativeness of senators with how their respective states voted in 2016.
But what the piece does really well is weave a narrative through the chart. Scrolling down the page locks the graphic in place while the text changes to provide new context. And then different dots are highlighted or called out.
It proves that not all the best graphics need to be terribly complex.
Credit for the piece goes to Kevin Schaul and Kevin Uhrmacher.
Last week I covered a lot of Red Sox data. And your feedback has been fantastic. I think you can look forward to more visualisation of sportsball data. But since this is not a sports blog, let us dive back into some other topics. Like today’s piece on economic growth.
It comes from the Economist and explores the development history of national economies relative to that of the United States. The point of the chart was to illustrate what the researchers determined was the middle income trap, a space in which countries develop and become semi-rich, but then can never quite escape.
The Economist makes the point that the definition of middle income matters. The range is enormous and one statistic says that it could take 48 years to graduate at a healthy rate of economic growth. I wonder if this bit, however, could also have been charted. The show don’t tell mantra works well here for setting up the problem, but a chart or two showing that exact range could have supplemented the text and perhaps made it more digestible.
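For a sense of why a wide range takes decades to cross, the compound-growth arithmetic is simple enough to sketch. I am assuming a constant annual growth rate here, which real economies never manage; `years_to_grow` and the inputs are my own illustration, not the Economist's thresholds or its calculation.

```python
import math

def years_to_grow(start, target, annual_rate):
    """Years for income to go from start to target at a constant annual rate."""
    return math.log(target / start) / math.log(1.0 + annual_rate)

# Hypothetical: how long does it take income to double at 7% a year?
doubling_years = years_to_grow(1.0, 2.0, 0.07)
```

Doubling at 7% a year takes a little over a decade. When the top of a range is many multiples of the bottom, the time needed stretches to decades even at healthy rates.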
Credit for the piece goes to the Economist Data Team.