Choose Your Own FiveThirtyEight Adventure

In case you weren’t aware, the US election is less than a week away: five days, to be exact. I had written a long list of issues on the ballot, but it kept getting longer and longer, so I cut it. Suffice it to say, Americans are voting on a lot of issues this year. But a US presidential election is unlike many other countries’ elections in that we use the Electoral College.

For my non-American readers, the Electoral College, very briefly, was created by the country’s founding fathers (Washington, Jefferson, Adams, Franklin, et al.) to do two things. The first was to restrict selection of the American president to a class of individuals who theoretically had a broader/deeper understanding of the issues, but who also had vested interests in the outcome. The founders did not intend for the American people to elect the president. The second was to prevent the largest states from dominating smaller states in elections. Why would Delaware and Rhode Island surrender their sovereignty to join the new United States if Virginia, Pennsylvania, and New York made all the decisions? (The founders went a step further and added the infamous 3/5 clause, but that’s another post.)

So Americans don’t elect the president directly, and voters in larger states like California, New York, and Texas have slightly less impact than voters in smaller states like Wyoming, Vermont, and Delaware. Each state is allotted a number of Electoral College votes and the key is to reach 270. (Maybe another time I’ll get into the details of what happens in a 269–269 tie.) Many Americans are probably familiar with sites like 270 To Win, where you can determine the outcome of the election by saying who won each state. But even though the US election is really 50 different state elections, common threads and themes run through all those states. If one candidate or another wins one state, it makes winning or losing other states more or less likely. FiveThirtyEight released a piece that attempts to link those probabilities and help reveal how the decisions voters make in one state may reflect how other voters decide.

The interface is fairly straightforward (I’m looking at this on a desktop, though it does work on mobile), with a bunch of choices at the top and a choropleth map below. The map uses a continuous diverging gradient, meaning the states aren’t grouped into like bins; instead we get incredibly subtle differences between similar states. (I should also point out that Maine and Nebraska are the two exceptions to my above description of the Electoral College. They divide their votes by congressional district: whoever wins a district gets that district’s Electoral College vote, and the overall state winner receives the remaining two votes.)
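For the curious, the Maine and Nebraska rule described above can be sketched in a few lines of Python. This is only an illustration: the function name and the candidate labels are my own placeholders, not anything from the FiveThirtyEight piece.

```python
from collections import Counter


def allocate_by_district(statewide_winner, district_winners):
    """Split a state's electoral votes the Maine/Nebraska way.

    One vote goes to the winner of each congressional district and
    the remaining two votes go to the statewide winner.
    """
    votes = Counter(district_winners)  # one vote per district won
    votes[statewide_winner] += 2       # two votes for the statewide winner
    return dict(votes)


# Hypothetical result for a state with two congressional districts.
print(allocate_by_district("Candidate A", ["Candidate A", "Candidate B"]))
# prints {'Candidate A': 3, 'Candidate B': 1}
```

Every other state simply hands all of its votes to the statewide winner, which is why these two need the special handling on the map.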

Below that we have a bar chart showing each state, its more or less likely winner, and the 270 threshold. Below that, we have what I’ve read/heard described as a ball plot. It represents runs of the simulation. As of Thursday morning, the current FiveThirtyEight model says Trump has an 11-in-100 chance of winning and Biden, conversely, an 89-in-100 chance.

But what happens when we start determining the winners of states?

Well, for my non-American readers, this election will feature a large number of voters casting their ballots early. (I voted early by mail, and dropped my ballot off at the county election office.) That’s not normal. And I cannot emphasise this next point enough. We may not know who wins the election Tuesday night or by the time Americans wake up on Wednesday. (Assuming they’re not like me and up until Alaska and Hawaii close their polls. Pro-tip, there’s a potentially competitive Senate race in Alaska, though it’s definitely leaning Republican.)

But, some states vote early and/or by mail every year and have built the infrastructure to count those votes, or the vast majority of them, on or even before Election Day. Three battleground states are in that group: Arizona, Florida, and North Carolina. We could well know the result in those states by midnight on Election Day—though Florida is probably going to Florida.

So what happens with this FiveThirtyEight model if we determine the winners of those three states? All three voted for Trump in 2016, so let’s say he wins them again next week.

We see that the states we’ve decided are now outlined in black. The remainder of the states have seen their colours change as their odds reflect the set electoral choice of our three states. We also now have a reset button that appears only once we’ve modified the map. I’m also thinking that I like Fivey Fox, the site’s new mascot. He provides a succinct, plain language summary of what the user is looking at. At the bottom we see what the model projects if Arizona, Florida, and North Carolina vote for Trump. And in that scenario, Trump wins in 58 out of 100 elections, Biden in only 41. Still, it’s a fairly competitive election.

So what happens if, by midnight, the results from those three states show that Biden has managed to flip them? As of Thursday morning, he’s leading very narrowly in the opinion polls.

Well, the interface hasn’t really changed. Though I should add below this screenshot there is a button to copy the link to this outcome to your clipboard if, like me, you want to share it with the world or my readers.

As to the results, if Biden wins those three states, Trump has less than a 1-in-100 chance of winning and Biden a greater than 99-in-100.
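To make the idea of linked states a bit more concrete, here is a minimal Python sketch of how fixing a few state winners can change the overall odds. I should stress this is not FiveThirtyEight’s actual model. The four states, their electoral votes, their baseline margins, and the error sizes are all invented for illustration. The only idea borrowed from the piece is that you keep just the simulation runs that match the states you have decided and then count the winners among those.

```python
import random

# Hypothetical states: name -> (electoral votes, baseline Biden margin in points).
# These numbers are invented for illustration, not real forecast data.
STATES = {
    "State A": (11, 2.0),
    "State B": (29, 1.0),
    "State C": (15, 1.5),
    "State D": (20, 5.0),
}
# Majority of the toy electoral votes (38 of 75).
THRESHOLD = sum(ev for ev, _ in STATES.values()) // 2 + 1


def simulate(n_runs, seed=0):
    """Return a list of runs, each mapping state -> winning candidate."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        national_error = rng.gauss(0, 3)  # shared error is what links the states
        run = {}
        for state, (_, margin) in STATES.items():
            local_error = rng.gauss(0, 3)  # state-specific noise
            result = margin + national_error + local_error
            run[state] = "Biden" if result > 0 else "Trump"
        runs.append(run)
    return runs


def electoral_votes(run, candidate):
    """Total electoral votes a candidate wins in one run."""
    return sum(STATES[s][0] for s, winner in run.items() if winner == candidate)


def conditional_win_share(runs, fixed, candidate):
    """Share of runs the candidate wins among runs matching the fixed winners."""
    kept = [r for r in runs if all(r[s] == w for s, w in fixed.items())]
    if not kept:
        return None  # no simulated run matches the chosen scenario
    wins = sum(1 for r in kept if electoral_votes(r, candidate) >= THRESHOLD)
    return wins / len(kept)


runs = simulate(10_000)
print(conditional_win_share(runs, {}, "Trump"))
print(conditional_win_share(runs, {"State A": "Trump", "State B": "Trump"}, "Trump"))
```

Because every state shares the same national error in a given run, a run where Trump wins State A is also more likely to be a run where he wins the others, which is roughly why deciding a few states moves the odds everywhere else.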

This is a really strong piece from FiveThirtyEight and it does a great job of showing how states are subtly linked in terms of their likelihood to vote one way or the other.

Credit for the piece goes to Ryan Best, Jay Boice, Aaron Bycoffe and Nate Silver.

Cheesesteaks and Politics

For those unaware, Pennsylvania matters in the 2020 election. And it has mattered for years as a perennial swing state. There are of course the visits to steel mill cities like Pittsburgh, deindustrialised places like Johnstown, and unions love visits to places in Lackawanna and Luzerne. (You can read more about Pennsylvania as a swing state in my latest analysis here.)

But I want to focus on visits to Philadelphia. Because they inevitably involve the candidate consuming a cheesesteak. The Economist’s sister magazine, 1843, recently published an article on this very subject. And the whole thing is worth a read.

How have I managed to find this relevant to a blog about data visualisation? Well, they included a recipe to help people understand just what goes into the traditional Philadelphia dish.

Personally, I have to confess, I’ve never been a huge fan. But I’ll take provolone over whiz any day.

Credit for the piece goes to Jake Read.

Trumpsylvania

After working pretty much non-stop all spring and summer, your humble author finally took a few days off. Throw in a bank holiday and you are looking at a five-day weekend. But, because this is 2020, travelling was out of the question, and so instead I hunkered down to finish writing/designing an article I have been working on for the last several weeks/few months.

The main write-up (it is a lengthy-ish read, so you may want to brew a cup of tea) is over at my data projects site. This is the first project I have really written about for that site since spring/summer 2016. Some of my longer-term readers may recall that the penultimate piece there, which I wrote about Pennsyltucky, was inspired by work I did here at Coffeespoons.

To an extent, so is this piece. I wrote about Trumpsylvania, the political realignment of the state of Pennsylvania. 2016 and the state’s vote for Donald Trump was less an aberration than many think. It was the near-end result of a decades-long transformation of the state’s political geography. And so I looked at the data underlying the shift and how and where it occurred.

Originally, I had a slightly different conclusion as to how this related to Pennsylvania in the upcoming 2020 election. But the whole 2020 thing made me shift my thinking slightly. You’ll have to read the whole thing to understand what I’m talking about. I will leave you with one of the graphics I made for the piece. It looks at who won each county in the state, but also whether or not the candidate was able to flip the county. In other words, was Clinton able to flip a Republican county? Was Trump able to flip a Democratic county?

Who won what? Who flipped what?

Let me know what you think.

And of course, many, many thanks to all the people who suffered my ideas, thoughts, and early drafts over the last several weeks. And even more thanks to those who edited it. Any and all mistakes or errors in the piece are all mine and not theirs.

Credit for the piece is mine.

Parties in Pennsylvania

This is from a social media post I made a few days ago, but I think it may be of some relevance/interest to my Coffeespoons followers. I was curious to see how, at 30+ days from the general election, the landscape has changed for the two parties since 2016.

Well, this project has driven me to a related, but slightly different project that has been consuming my non-work time. Hopefully I will have more on that in the coming days. Without further ado, the post:

Pennsylvania will likely be one of the more critical battleground swing states in this year’s election. In 2016, then candidate Trump won the state by less than one percentage point. But four years is a long time and I was curious to see how things have changed.

In the first chart, we see counties won by Trump on the right and those won by Clinton on the left. The further from the centre, the greater the candidate’s margin of victory over the other. The top half plots registered Republicans’ margin over Democrats as a percentage of all registered voters in the county (including independents and third party) and the bottom half does the same for Democrats. The closer to the centre, the more competitive the county; the further away, the less so.
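For anyone wanting to reproduce the vertical axis, the registration margin described above works out to something like the following Python sketch. The function name and the sample numbers are placeholders of mine, not real county data.

```python
def registration_margin(republicans, democrats, others):
    """Republican margin over Democrats, in percentage points of all
    registered voters (independents and third parties included).

    A positive value means a Republican registration edge; a negative
    value means a Democratic one.
    """
    total = republicans + democrats + others
    if total == 0:
        raise ValueError("county has no registered voters")
    return 100 * (republicans - democrats) / total


# Hypothetical county: 50 Republicans, 30 Democrats, 20 others.
print(registration_margin(50, 30, 20))  # prints 20.0
```

Dividing by all registered voters, rather than just the two major parties, keeps counties with lots of independents from looking more lopsided than they are.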

Trump’s key to victory was the white, working class voter clustered in the west and the northeast of the state, in old mining and steel towns. There Democrats normally counted on organised labour support as registered Democrats. That all but collapsed in 2016. The bottom right shows a number of nominally Democratic counties Trump won, whereas Clinton only picked up one Republican county, Chester.

But what are PA’s battlegrounds?

In the second chart we ignore places like Philly and Fulton County and zoom in on the more competitive counties, those with margins within 20 points. Polls presently point to a Biden lead of about 5 points in PA. If every dot moved left by 5 points (it doesn’t really work like that), we would see only Erie and Northampton with potential to flip.
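That shift-every-dot exercise is a uniform swing, and it can be sketched in Python as below. The county names and margins are invented placeholders, not the real Pennsylvania figures, and as noted, real elections don’t move this evenly.

```python
def apply_uniform_swing(margins, swing):
    """Shift every county's Trump margin left by `swing` points.

    `margins` maps county -> Trump margin in points (positive = Trump lead).
    """
    return {county: margin - swing for county, margin in margins.items()}


def flipped(before, after):
    """Counties whose leader changes between the two sets of margins."""
    return sorted(c for c in before if (before[c] > 0) != (after[c] > 0))


# Hypothetical 2016 margins, for illustration only.
margins_2016 = {"County W": -8.0, "County X": 2.0, "County Y": 4.5, "County Z": 12.0}
shifted = apply_uniform_swing(margins_2016, 5.0)
print(flipped(margins_2016, shifted))  # prints ['County X', 'County Y']
```

Only counties with a margin smaller than the swing change hands, which is why so few dots end up with any potential to flip.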

But Trump’s realignment of politics is accelerating (more on this another day) a realignment of PA’s political geography.

In the fourth chart, neither Erie nor Northampton shows any real movement via party registration back to Democrats. Erie may flip, but Northampton’s likely a stretch. Places like Cumberland and Lancaster counties are too solidly Republican to flip this year. Instead Trump is more likely to flip counties like Monroe and Lehigh red, even if he loses the state.

Because, not shown, the key to a Biden victory will be running up the margins in Philly & Pittsburgh, and to a lesser extent Philly’s four collar counties, including Chester, which appears to be rapidly shifting in Democrats’ favour.

Credit for the piece is mine.

Positioning Is Important

Yesterday Pew Research released the results of a survey of how the rest of the world views select countries throughout the world. The Washington Post covered it in an article and created some graphics to support the text. The text, of course, was no big surprise in that the rest of the world views the United States poorly compared to just several years ago and that, in particular, President Trump is a leader in whom the world has no confidence.

But that’s not what I want to talk about. Instead, I want to address a design element in one of their graphics. (But you should go ahead and read about the survey results.)

The issue here is the positioning of the labels for each bar, representing a world leader. At the very top of the graphic, things are in a good way. We have Merkel with a small space beneath that text then another label, “No confidence, 19 percent”, and then a connecting line to a dot to the blue bar. We then have a small space and the label Macron, meaning we have moved on and are on the next world leader.

But what if the reader sees the title and starts towards the bottom? They want to see the leaders in whom the world has no confidence. Now look at the bottom of the chart and the positioning of the labels for Trump, and above him, Xi, Putin, and maybe even Johnson. Because the “No confidence, x percent” labels have moved further to the right, there is an enormous space between the leader’s name and their coloured bar. Visually, this creates a link between the leader’s name and the preceding bar. For example, Trump appears to have a no confidence value of 78 with an unlabelled bar chart beneath him.

I suggest that there are two easy fixes to better link the labels to the data. The first is to move the leaders’ labels down, once the “No confidence” label has moved sufficiently far to the right. Like so.

The leader is now very clearly attached to his or her data with little confusion.

My second option is to fix the “No confidence” labels permanently to the left of the chart so as not to create that visual space in the first place, like so.

Here, after seeing the first option, I wonder if there is enough visual space at all between the leaders. But, this is only a quick Photoshop exercise. If I wanted to really tweak this, I would consider putting the data point or number in bold to the right of the label. That would eliminate an entire line of type that could be repurposed as a visual buffer between leaders.

I think either option would be preferable because of increased clarity for the reader.

Credit for the piece goes to the Washington Post graphics department.

Axis Lines in Charts

The British election campaign is wrapping up as it heads towards the general election on Thursday. I haven’t covered it much here, but this piece from the BBC has been at the back of my mind. And not so much for the content, but strictly the design.

In terms of content, the article stems from a question asked in a debate about income levels and where they fall relative to the rest of the population. A man rejected a Labour party proposal for an increase in taxes on those earning more than £80,000 per annum, saying that as someone who earned more than that amount he was “not even in the top 5%, not even the top 50”.

The BBC looked at the data and found that the man was certainly within the top 50% and likely in the top 5%, as that group earns more than £75,300 per annum. Here in the States, many Americans cannot place their incomes within the actual spreads of income. The income gap here is severe and growing. But, I want to look at the charts the BBC made to illustrate its points.

The most important is this line chart, which shows the income level and how it fits among the percentages of the population.

Are things lining up? It's tough to say.

I am often in favour of minimal axis lines and labelling. Too many labels and explicit data points begin to subtract from the visual representation or comparison of the data. If you need to be able to reference a specific data point for a specific point on the curve, you need a table, not a chart.

However, there is utility in having some guideposts as to what income levels fit into what ranges. And so I am left to wonder, why not add some axis lines? Here I took the original graphic file and drew some grey lines.

Better…

Of course, I prefer the dotted or dashed line approach. The difference in line style provides some additional contrast to the plotted series. And in this case, where the series is a thin but coloured line, the interruptions in the solidity of the axis lines make it easier to distinguish them from the data.

Better still.

But the article also has another chart, a bar chart, that looks at average weekly incomes across different regions of the United Kingdom. (Not surprisingly, London has the highest average.) Like the line chart, this bar chart does not use any axis lines. But what makes this one even more difficult is that the solid black line that we can use in the line charts above to plot out the maximum for 180,000 is not there. Instead we simply have a string of numbers at the bottom for which we need to guess where they fall.

Here we don’t even have a solid line to take us out to 700.

If we assume that the 700 value is at the centre of the text, we can draw some dotted grey lines atop the existing graphic. And now quite clearly we can get a better sense of which regions fall in which ranges of income.

We could have also tried the solid line approach.

But we still have this mess of black digits at the bottom of the graphic. And after 50, the numbers begin to run into each other. It is implied that we are looking at increments of 50, but a little more spacing would have helped. Or, we could simply keep the values at the hundreds and, if necessary, not label the lines at the 50s. Like so.

Much easier to read

The last bit I would redo in the bar chart is the order of the regions. Unless there is some particular reason for ordering these regions as they are—you could partly argue they are from north to south, but then Scotland would be at the top of the list—they appear an arbitrary lot. I would have sorted them maybe from greatest to least or vice versa. But that bit was outside my ability to do this morning.
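Had I had the underlying numbers to hand, the reordering itself would be trivial. Here is a rough Python sketch, with made-up region names and values standing in for the BBC’s data.

```python
def sort_regions(weekly_income, descending=True):
    """Return (region, income) pairs ordered by income rather than by
    whatever order they arrived in."""
    return sorted(weekly_income.items(), key=lambda item: item[1],
                  reverse=descending)


# Hypothetical weekly incomes, for illustration only.
incomes = {"Region A": 520, "Region B": 690, "Region C": 560}
for region, income in sort_regions(incomes):
    print(f"{region}: {income}")
```

Plotting the bars in that order lets the reader compare neighbouring regions directly instead of hunting up and down the chart.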

So in short, while you don’t want to overcrowd a chart with axis lines and labelling, you still need a few to make it easier for the user to make those visual comparisons.

Credit for the original pieces goes to the BBC graphics department.

From Order to Chaos?

A few weeks ago we said farewell to John Bercow as Speaker of the House (UK). Whilst I covered the election for the new speaker, I missed the opportunity to post this piece from the BBC. It looked at Bercow’s time in office from a data perspective.

The piece did not look at him per se, but that era for the House of Commons. The graphic below was a look at what constituted debates in the chamber using words in speeches as a proxy. Shockingly, Brexit has consumed the House over the last few years.

At least climate change has also ticked upwards?

I love the graphic, as it uses small multiples and fixes the axes for each row and column. It is clean, clear, and concise—just what a graphic should be.

And the rest of the piece makes smart use of graphical forms. Mostly. There are smart line charts with background shading and some bar charts, and the only questionable one is where it uses emoji handclaps to represent instances of people clapping in the chamber, not traditionally a thing that happens.

Content wise it also nailed a few important things, chiefly Bercow’s penchant for big words. The piece did not, however, cover his amazing sense of sartorial style vis-a-vis neckties.

Overall a solid piece with which to begin the weekend.

Credit for the piece goes to Ed Lowther & Will Dahlgreen.

Casual Fails?

In a recent Washington Post piece, I came across a graphic style that I am not sure I can embrace. The article looked at the political trifecta at state levels, i.e. single political party control over the government (executive, lower legislative chamber, and upper legislative chamber). As a side note, I do like how they excluded Nebraska because of its unicameral legislature. It’s also theoretically non-partisan (though everybody knows who belongs to which party, so you could argue it’s as partisan as any other legislature).

At the outset, the piece uses a really nice stacked bar chart. It shows how control over the levers of state government has ebbed and flowed.

You can pretty easily spot the recent political eras by the big shifts in power.

It also uses little black lines with almost cartoonish arrowheads to point to particular years. The annotations are themselves important to the context—pointing out the various swing years. But from an aesthetic standpoint, I have to wonder if the casualness of the marks detracts from the seriousness of the content.

Sometimes the whimsical works. Pie charts about pizza pies or pie toppings can be whimsical. A graphic about political control over government is a different subject matter. Bloomberg used to tackle annotations with a subtler and more serious, but still rounded, curve type of approach. Notably, however, Bloomberg at that time went for an against-the-grain, design-forward first, stoic-business-serious second approach.

Then we get to a choropleth map. It shows the current state of control for each state.

X marks the spot?

However, here the indicator for recent party switches is a set of x’s. These have the same casual approach as the arrows above. But in this case, a careful examination of the x’s indicates they are not unique, as they would be if a person had drawn each one with a pen tool. Instead these come from a pre-determined set, as the x’s share the exact same shape, stroke lengths and directions.

In years past we probably would have seen the indicator represented by an outline of the state border or a pattern cross-hatching. After all, with the purple being lighter than the blue, the x’s appear more clearly against purple states than blue. I have to admit I did not see New Jersey at first.

Of course, in an ideal world, a box map would probably be clearer still. But the curious part is that the very next map does a great job of focusing the user’s attention on the datapoint that matters: states set for potential changes next November.

Pennsylvania is among the states…

Here the states of little interest are greyed out. The designers use colour to display the current status of the potential trifecta states. And so I am left curious why the designers did not choose to take a similar approach with the remaining graphics in the piece.

Overall, I should say the piece is strong. The graphics generally work very well. My quibbles are with the aesthetic stylings, which seem out of place for a straight news article. Something like this could work for an opinion piece or for a different subject matter. But for politics it just struck a loud dissonant chord when I first read the piece.

Credit for the piece goes to Kate Rabinowitz and Ashlyn Still.

The Shifting Suburbs

Last week we looked at the revenge of the flyover states, the idea that smaller cities in swing states are trending Republican and defeating the growing Democratic majority in big cities. This week I want to take a look at something from a few weeks back, a piece from CityLab about the elections in Virginia, Kentucky, and Mississippi.

There’s nothing radical in this piece. Instead, it’s some solid uses of line charts and bar charts (though I still don’t generally love them stacked). The big flashy graphic was this, a map of Virginia’s state legislative districts, but mapped not by party but by population density.

Democrats now control a majority of these seats.

It classified districts by how urban, suburban, or rural (or parts thereof) each district was. Of course the premise of the article is that the suburbs are becoming increasingly Democratic and rural areas increasingly Republican.

But it all goes to show that 2020 is going to be a very polarised year.

Credit for the piece goes to David Montgomery.