Datagraphic – Page 27 – Coffee Spoons

Can Texas Go Blue on Tuesday?

One story I’m following on Tuesday night is Texas. The state’s early voting—still with Monday to go—has surpassed the state’s total 2016 vote. Polling suggests that early votes lean Biden due to President Trump’s insistence that his supporters vote in person on Election Day as he lies about the integrity of early and mail-in voting.

The Texas Tribune looked at what we know about that turnout and what it may portend for Tuesday’s results. And, to be honest, we don’t—and won’t—really know until the votes are counted. They put together a great piece that divided Texas counties into four groups (their terminology): big blue cities, fast-changing counties, solidly red territory, and border counties. They then looked at the growth in registered voters in those counties from 2016, and looked at how they voted in the 2016 presidential election (Hillary Clinton vs. Donald Trump) and the 2018 US Senate election (Beto O’Rourke vs. Ted Cruz).

The piece uses the above stacked bar chart to show that the largest share of Texas’ 1.8 million new registered voters belongs to the big blue cities. The second largest group is the competitive suburbs in the fast-changing counties. The third largest, though quite close to second, is the solidly red territory. The border counties, still important for the margins, rank a distant fourth.

I’m not normally a fan of stacked bar charts, because they do not allow for great comparisons of the constituent elements. For example, try comparing any of those solidly red territory counties to one another. But here, the value is more in the stacked set as a group rather than the decomposition of the set, because you can see how the big blue cities have, as a group, a greater number of those 1.8 million new voters.

Those fast-changing counties include a lot of the suburbs for Texas’ largest cities. And those are areas where, across the country, Republicans are losing voters by the tens of thousands to the Democrats. As battlegrounds, these presented a challenge, because as swing counties, they split their votes between Clinton and Cruz and Trump and Beto. And so the designers chose purple to represent them in the stacked design. I think it’s a solid choice and works really well here.

But in terms of the story, I’ll add that in 2016, Trump won Texas by 807,000 votes. Texas added 1,800,000 new voters since then. And turnout before Election Day is already greater than it was in 2016.
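To put those two numbers side by side, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that every new registrant votes, that they split only between the two major parties, and that everyone else votes exactly as in 2016. None of that is a forecast, and the function names are hypothetical.

```python
# The two figures quoted in the post; everything else is illustrative.
TRUMP_2016_MARGIN = 807_000   # Trump's 2016 winning margin in Texas
NEW_REGISTERED = 1_800_000    # voters registered since 2016


def net_share_needed(margin: int, new_voters: int) -> float:
    """Net share of new voters (Democratic minus Republican) that offsets the margin."""
    return margin / new_voters


def split_needed(margin: int, new_voters: int) -> float:
    """Democratic share of a two-party split of new voters that offsets the margin.

    Assumes all new voters turn out and everyone else repeats their 2016 vote.
    """
    return 0.5 + net_share_needed(margin, new_voters) / 2


print(f"net share needed: {net_share_needed(TRUMP_2016_MARGIN, NEW_REGISTERED):.1%}")
print(f"two-party split needed: {split_needed(TRUMP_2016_MARGIN, NEW_REGISTERED):.1%}")
```

Under those assumptions the new voters alone would need to break roughly 72 to 28 for the Democrat, which is why the turnout of existing voters matters just as much as the new registrations.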

It’s still a state likely to go for Trump on Tuesday. But, if Biden has a good night, it’s not inconceivable that Texas flips. FiveThirtyEight’s polling average has Trump with only a 1.2 point lead.

Credit for the piece goes to Mandi Cai, Darla Cameron and Anna Novak.

Red Shift, Blue Shift

Last night I published a graphic on Instagram that I think people may find helpful if they try to follow Election Day results on Tuesday. I wanted to explain the concept of a red shift or blue shift. (I’ve also seen it described as states having a red mirage or a blue mirage.)

For my non-American readers, it’s important to understand that while this is a national election, the United States’ federal system means that each state runs its own election with its own rules and they can vary some state to state. For example, early or mail-in voting can vary significantly from state to state with some states allowing it only in emergencies (and some of those this cycle will not allow people to cite COVID-19 as an emergency).

Another factor for everyone to consider is that polling indicates President Trump’s fraudulent messaging about, well, voting fraud has shifted a normally split use of early/mail-in voting to a Democratic advantage. In other words, Democrats are far more likely to vote early, either in person or by post. Republicans are far more likely to vote on Election Day.

Combine those two factors and we get Red Shift vs. Blue Shift.

Some states allow election officials to begin counting their early votes prior to Election Day. Other states forbid counting until Election Day morning, or in some cases until after the polls close.

In states where early votes can be counted—the swing states Arizona, Florida, and North Carolina are among this group—it is possible that when the polls close, or shortly thereafter, we will see an instant and enormous lead for Joe Biden. But, as the states begin to count in-person day-of votes, which again favour Republicans, Trump may begin to eat into those margins. The question will be, can Trump’s numbers eat in so much that when the final counts are complete, he can overtake those Biden numbers? This is the Red Shift.

Conversely we have the Blue Shift. In these states—swing states like Georgia, Michigan, Pennsylvania, Texas, and Wisconsin are in this group—election officials cannot begin to count early votes either until the morning or when the polls close. In these states we may see the in-person day-of votes, largely expected to be for Republicans, run up to high totals fairly quickly. At that time, Trump may have a significant lead. Then when officials pivot to counting the early votes, Biden will begin to eat into those margins. And again, the question will be, can Biden eat into those margins sufficiently to shift the outcome after all the votes are counted?
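The two shifts are the same final result seen through a different counting order. A minimal sketch, with vote totals that are entirely made up to show the effect:

```python
def running_margins(batches):
    """Return the Democratic-minus-Republican margin after each counted batch.

    Each batch is a (label, dem_votes, rep_votes) tuple.
    """
    dem = rep = 0
    margins = []
    for label, d, r in batches:
        dem += d
        rep += r
        margins.append((label, dem - rep))
    return margins


# Hypothetical totals: early votes lean Democratic, day-of votes lean Republican.
early = ("early/mail", 600, 400)
election_day = ("election day", 420, 580)

early_first = running_margins([early, election_day])  # early votes reported first
day_first = running_margins([election_day, early])    # day-of votes reported first

print(early_first)  # big early Democratic lead that shrinks: the Red Shift
print(day_first)    # early Republican lead that erodes: the Blue Shift
```

Both orderings finish on the same margin. Only the path to it differs, which is the thing to keep in mind when watching partial returns.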

Be prepared to hear about these scenarios Tuesday night.

Credit for the piece is mine.

Choose Your Own FiveThirtyEight Adventure

In case you weren’t aware, the US election is in less than a week, five days. I had written a long list of issues on the ballot, but it kept getting longer and longer so I cut it. Suffice it to say, Americans are voting on a lot of issues this year. But a US presidential election is not like many other countries’ elections in that we use the Electoral College.

For my non-American readers, the Electoral College, very briefly, was created by the country’s founding fathers (Washington, Jefferson, Adams, Franklin, et al.) to do two things. One, restrict selection of the American president to a class of individuals who theoretically had a broader/deeper understanding of the issues—but who also had vested interests in the outcome. The founders did not intend for the American people to elect the president. The second feature of the Electoral College was to prevent the largest states from dominating smaller states in elections. Why else would Delaware and Rhode Island surrender their sovereignty to join the new United States if Virginia, Pennsylvania, and New York make all the decisions? (The founders went a step further and added the infamous 3/5 clause, but that’s another post.)

So Americans don’t elect the president directly and larger states like California, New York, and Texas, have slightly less impact than smaller states like Wyoming, Vermont, and Delaware. Each state is allotted a number of Electoral College votes and the key is to reach 270. (Maybe another time I’ll get into the details of what happens in a 269–269 tie.) Many Americans are probably familiar with sites like 270 To Win, where you can determine the outcome of the election by saying who won each state. But, even though the US election is really 50 different state elections, common threads and themes run through all those states and if one candidate or another wins one state, it makes winning or losing other states more or less likely. FiveThirtyEight released a piece that attempts to link those probabilities and help reveal how decisions voters in one state make may reflect on how other voters decide.

The interface is fairly straightforward (I’m looking at this on a desktop, though it does work on mobile), with a bunch of choices at the top and a choropleth map below. There we have a continuous divergent gradient, meaning the states aren’t grouped into like bins but instead show incredibly subtle differences between similar states. (I should also point out that Maine and Nebraska are the two exceptions to my above description of the Electoral College. They divide their votes by congressional district: whoever wins a district gets that district’s Electoral College vote, and the overall state winner receives the remaining two votes.)
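The Maine and Nebraska rule is simple enough to sketch in a few lines. The candidate labels "A" and "B" and the function name are placeholders, not anything from the FiveThirtyEight piece.

```python
from collections import Counter


def allocate_by_district(district_winners, statewide_winner):
    """Electoral votes under the district method used by Maine and Nebraska.

    district_winners: one winner per congressional district.
    statewide_winner: winner of the statewide popular vote.
    """
    votes = Counter(district_winners)  # one vote per congressional district
    votes[statewide_winner] += 2       # two at-large votes for the statewide winner
    return votes


# A hypothetical two-district state, like Maine: A wins statewide and one district.
print(allocate_by_district(["A", "B"], "A"))
```

In a winner-take-all state, candidate A would take all four votes. Here the split is three to one.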

Below that we have a bar chart, showing each state, its more or less likely winner, and the 270 threshold. Below that, we have what I’ve read/heard described as a ball plot. It represents runs of the simulation. As of Thursday morning, the current FiveThirtyEight model says Trump has an 11-in-100 chance of winning and Biden, conversely, an 89-in-100 chance.

But what happens when we start determining the winners of states?

Well, for my non-American readers, this election will feature a large number of voters casting their ballots early. (I voted early by mail, and dropped my ballot off at the county election office.) That’s not normal. And I cannot emphasise this next point enough. We may not know who wins the election Tuesday night or by the time Americans wake up on Wednesday. (Assuming they’re not like me and up until Alaska and Hawaii close their polls. Pro-tip, there’s a potentially competitive Senate race in Alaska, though it’s definitely leaning Republican.)

But, some states vote early and/or by mail every year and have built the infrastructure to count those votes, or the vast majority of them, on or even before Election Day. Three battleground states are in that group: Arizona, Florida, and North Carolina. We could well know the result in those states by midnight on Election Day—though Florida is probably going to Florida.

So what happens with this FiveThirtyEight model if we determine the winners of those three states? All three voted for Trump in 2016, so let’s say he wins them again next week.

We see that the states we’ve decided are now outlined in black. The remainder of the states have seen their colours change as their odds reflect the set electoral choice of our three states. We also now have a reset button that appears only once we’ve modified the map. I’m also thinking that I like FiveyFox, the site’s new mascot? He provides a succinct, plain language summary of what the user is looking at. At the bottom we see what the model projects if Arizona, Florida, and North Carolina vote for Trump. And in that scenario, Trump wins in 58 out of 100 elections, Biden in only 41. Still, it’s a fairly competitive election.

So what happens if by midnight we have results from those three states showing that Biden has managed to flip them? And as of Thursday morning, he’s leading very narrowly in the opinion polls.

Well, the interface hasn’t really changed. Though I should add below this screenshot there is a button to copy the link to this outcome to your clipboard if, like me, you want to share it with the world or my readers.

As to the results, if Biden wins those three states, Trump has less than a 1-in-100 chance of winning and Biden a greater than 99-in-100.
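One simple way to think about how locking in states changes the odds is to keep only the simulation runs that match your picks and recount the winners. I don’t know how FiveThirtyEight actually implements its interactive, so treat this as a sketch of the general idea. The data structure, function name, and the five toy runs are all hypothetical.

```python
def conditional_win_share(simulations, fixed, candidate):
    """Share of runs won by `candidate` among runs consistent with `fixed`.

    simulations: list of dicts with a "states" mapping (state -> winner)
                 and an overall "winner".
    fixed:       mapping of state -> winner that the user has locked in.
    """
    matching = [
        run for run in simulations
        if all(run["states"].get(state) == who for state, who in fixed.items())
    ]
    if not matching:
        return None  # no simulated run fits the chosen scenario
    return sum(run["winner"] == candidate for run in matching) / len(matching)


# Five made-up runs, only to exercise the function.
runs = [
    {"states": {"AZ": "Trump", "FL": "Trump", "NC": "Trump"}, "winner": "Trump"},
    {"states": {"AZ": "Trump", "FL": "Trump", "NC": "Trump"}, "winner": "Biden"},
    {"states": {"AZ": "Biden", "FL": "Biden", "NC": "Biden"}, "winner": "Biden"},
    {"states": {"AZ": "Biden", "FL": "Trump", "NC": "Biden"}, "winner": "Biden"},
    {"states": {"AZ": "Biden", "FL": "Biden", "NC": "Biden"}, "winner": "Biden"},
]

trump_sweep = {"AZ": "Trump", "FL": "Trump", "NC": "Trump"}
print(conditional_win_share(runs, {}, "Trump"))           # no states locked in
print(conditional_win_share(runs, trump_sweep, "Trump"))  # Trump holds all three
```

Because the runs that survive the filter share whatever national conditions produced those state results, the odds everywhere else move with them. That is the linkage between states that the piece is showing.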

This is a really strong piece from FiveThirtyEight and it does a great job to show how states are subtly linked in terms of their likelihood to vote one way or the other.

Credit for the piece goes to Ryan Best, Jay Boice, Aaron Bycoffe and Nate Silver.

Where Are the Votes?

I’m not working for a good chunk of the next few days. But, I did want to share with my readers an analysis of Pennsylvania’s missing votes. Broadly, Trump needs to win the Commonwealth of Pennsylvania next week—yes, the US election is now one week away. Though, Pennsylvania allows mail-in ballots postmarked on Election Day to arrive within a few days and still be counted. So we may not have final tallies for the state until the weekend or Monday after Election Day.

Pennsylvania, of course, narrowly voted for Donald Trump over Hillary Clinton in 2016 with 44,000+ votes making the difference. In 2020, polling has consistently placed Joe Biden above Donald Trump by 5+ points. But, can Trump again pull off an upset victory?

I argue that yes, he can. And fairly easily too. (If you want to see why I think Pennsylvania is really Trumpsylvania, I recommend checking out my longer, more in-depth analysis.) So where would the votes come from? I mapped the 2016 difference between votes cast and registered voters, i.e. people who could have voted, but did not for whatever reason. I then coloured the map by the county’s winner in 2016. Red counties voted for Trump by more than 10 points and blue for Clinton by more than 10 points. The purple counties are those that were competitive, plus or minus 10 points for either candidate.
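For anyone who wants to reproduce the colouring, the rule boils down to a margin threshold. A minimal sketch, with made-up vote shares and hypothetical function names:

```python
def classify_county(trump_pct, clinton_pct, threshold=10.0):
    """Colour a county by its 2016 margin, in percentage points.

    More than `threshold` points for Trump is red, more than `threshold`
    points for Clinton is blue, and anything in between is purple.
    """
    margin = trump_pct - clinton_pct
    if margin > threshold:
        return "red"
    if margin < -threshold:
        return "blue"
    return "purple"


def missing_votes(registered, votes_cast):
    """Registered voters who did not cast a ballot."""
    return registered - votes_cast


# Illustrative shares only, not real county results.
print(classify_county(57.0, 37.0))  # a 20-point Trump margin
print(classify_county(48.0, 47.0))  # a competitive county
print(missing_votes(1_000, 700))    # hypothetical registration and turnout counts
```

A county sitting exactly on the 10-point line lands in purple here, which matches the "more than 10 points" wording above.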

In the purple counties, both candidates will want to drive out as many voters as possible. But in the blue counties, Biden has reliably Democratic votes and in red Trump has reliably Republican votes. So why on Monday did Trump visit Allentown, Lititz, and Martinsburg? Because that’s where those votes are.

Allentown, in Lehigh County, is competitive. In fact, neighbouring Northampton Co. will be a key swing county next week and one I will be following closely as the returns come in. But Lititz, Lancaster Co., and Martinsburg, Blair Co., are in reliably red counties. (Though in my Trumpsylvania piece I argue Lancaster Co. is undergoing a transition to a competitive, albeit lean Republican county.)

In Lancaster Co., which went to Trump by nearly 20 percentage points in 2016, there were still just short of 100,000 voters who didn’t vote in 2016. Not all of those voters would have voted for Trump, but for sake of argument, just say 50% would have. That makes just short of 50,000 potential Trump votes—more than Trump’s entire state margin.

Blair Co. is in the Pennsyltucky region of the state, relatively rural, but in Blair’s case, its county seat Altoona is the state’s 10th largest city. While the total number of votes—and the total number of non-voting voters—are smaller than in Lancaster Co., add up all the available votes and it’s a large number.

If you add up all those red counties’ missing votes, you get a total of just shy of 840,000 missing votes. Far more than enough to drastically swing the Commonwealth to Trump in 2020.

Of course, Biden’s counting on driving out turnout in Philadelphia and Pittsburgh and their suburbs, along with other cities in the state, like Allentown, Scranton, Harrisburg, and Erie. In those blue counties, there were 927,000 missing votes, so the potential for a Biden win is also there.
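The arithmetic behind those claims, using the rounded figures quoted above. The 50% share is the same for-sake-of-argument assumption as in the Lancaster example, not an estimate, and the function names are hypothetical.

```python
# Rounded figures from the post ("44,000+", "just short of 100,000", and so on).
STATE_MARGIN_2016 = 44_000    # Trump's 2016 statewide margin
LANCASTER_MISSING = 100_000   # registered Lancaster Co. voters who did not vote
RED_MISSING = 840_000         # missing votes across all red counties
BLUE_MISSING = 927_000        # missing votes across all blue counties


def potential_votes(missing: int, share: float) -> int:
    """Votes a candidate gains if `share` of the missing voters turn out for them."""
    return round(missing * share)


def share_to_match(margin: int, missing: int) -> float:
    """Fraction of a pool of missing votes that equals a given margin in size."""
    return margin / missing


lancaster = potential_votes(LANCASTER_MISSING, 0.5)  # the post's 50% assumption
print(lancaster, lancaster > STATE_MARGIN_2016)
print(f"{share_to_match(STATE_MARGIN_2016, RED_MISSING):.1%} of red-county missing votes")
print(f"{share_to_match(STATE_MARGIN_2016, BLUE_MISSING):.1%} of blue-county missing votes")
```

Either pool is about twenty times the size of the 2016 margin, so a small change in who shows up on either side is enough to decide the state.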

But, if Democratic voters again fail to vote, as they did in 2016, Trump has plenty of potential votes to pick up across the state.

Credit for the piece is mine.

Options

We made it to Friday, everyone. We also had a debate last night. The last debate in the 2020 cycle.

Anyway, this piece from Indexed seemed appropriate to start the weekend.

Credit for the piece goes to Jessica Hagy.

Covid Migration

Yep, Covid-19 remains a thing. About a month or so ago, an article in City Lab (now owned by Bloomberg) looked at the data to see if there was any truth in the notion that people are fleeing urban areas. Spoiler: they’re not, except in a few places. The entire article is well worth a read, as it looks at what is actually happening in migration and why some cities like New York and San Francisco are outliers.

But I want to look at some of the graphics going on inside the article, because those are what struck me more than the content itself. Let’s start with this map titled “Change in Moves”, which examines “the percentage drop in moves between March 11 and June 30 compared to last year”.

Conventionally, what would we expect from this kind of choropleth map? We have a sequential stepped gradient headed in one direction, from dark to light. Presumably we are looking at one metric, change in movement, in one direction, the drop or negative.

But look at that legend. Note the presence of the positive 4: there is an entire positive range within this stepped gradient. Conventionally we would expect to see some kind of red equals drop, blue equals gain split at the zero point. Others might create a grey bin to cover a negative one to positive one slight-to-no change set of states. Here, though, we don’t have that. Nor do we even get a natural split. Instead, the dark bin goes to a slightly less dark bin at positive four, so everything less than four through -16 is in the darker bin.

Look at the language, too, because that’s where it becomes potentially more confusing. If the choropleth largely focuses on the “percentage drop” and has negative numbers, a negative of a negative would be…a positive. With that double negative, a -25% drop in Texas could easily be misread as a rise. Compare Texas to Nebraska, which had a 2% drop. Does that mean Nebraska actually declined by 2%, or does it mean it rose by 2%?

A clean up in the data definition to, say, “Percentage change in moves from…” could clear up a lot of this ambiguity. Changing the colour scheme from a single gradient to a divergent one, with a split around zero (perhaps with a bin for little-to-no change), would make it clearer which states were in the positive and which were in the negative.
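The kind of binning I have in mind can be sketched as follows: a neutral bin around zero, then stepped bins in each direction. The bin edges, labels, and sample values are illustrative assumptions, not the article’s data.

```python
import bisect


def diverging_bin(change_pct, neutral=1.0, edges=(8.0, 16.0)):
    """Assign a signed percentage change to a bin on a diverging scale.

    Values within +/- `neutral` share one grey "neutral" bin. Everything else
    gets a direction plus a level: level 1 is the mildest, and each edge in
    `edges` that the magnitude exceeds bumps the level up by one.
    """
    if abs(change_pct) <= neutral:
        return "neutral"
    direction = "gain" if change_pct > 0 else "drop"
    level = bisect.bisect_left(edges, abs(change_pct)) + 1
    return f"{direction}-{level}"


# Sample values, with the sign stated explicitly as a change rather than a "drop".
for change in (-25.0, -2.0, 0.5, 2.0):
    print(change, diverging_bin(change))
```

With the sign carried by the data and the direction carried by the hue, a reader no longer has to work out whether a negative drop is really a rise.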

The article continues with another peculiar choice in its bar charts when it explores the data on specific cities.

Here we see the destinations of people moving out of San Francisco, using, as a note explains, requests for quotes as a proxy for the numbers of actual moves. What interests me here is the minimalist take on the bar charts. Note the absence of an axis, which leaves the bars almost groundless for comparison, except that the designer attached data labels to the ends of the bars.

Normally data labels are redundant. The point of a visualisation is to visualise the comparison of data sets. If hyper precise differences to the decimal point are required, tables often are a better choice. But here, there are no axis labels to inform the user as to what the length of a bar means.

It’s a peculiar design decision. If we think of labelling as data ink, is this a more efficient use with data labels than just axis labels? I would venture to say no. You would probably have five axis labels (0–4) and then a line to connect them. That’s probably less ink/pixels than the data labels here. I prefer axis lines to help guide the user from labels up (in this case) through the bars. Maybe the axis lines make for more data ink than the labels? It’s hard to say.

Regardless, this is a peculiar decision. Though, I should note it’s eminently more defensible than the choropleth map, which needs a rethink in both design and language.

Credit for the piece goes to Marie Patino.

Covid-19 Update: 19 October

I took a holiday yesterday. To be honest, I’ll be taking a lot of short holidays as the year winds down on account of not taking any the first three quarters of the year. So expect quite a few quiet Mondays and Fridays in the next few months.

But back to Covid-19. I won’t have a lot to say in this weekly update, because I didn’t write anything last night when I made these. Suffice it to say that things are bad and getting worse. Although, things could also be much worse. And by that I mean, while we are seeing dramatic rises in new cases, we are not yet seeing the rises in deaths that accompanied similar rises in March and April.

Although it should be said that while still low, deaths have been rising. The most easily seen instance of that is in Illinois, below. You can see deaths are slowly rising and the state is approaching 50 deaths per day. While that is still off from the peaks of 100+ earlier this year, that’s still too many people.

New case curves for PA, NJ, DE, VA, and IL.

Death curves for PA, NJ, DE, VA, and IL.

Credit for the pieces is mine.

Mask Up

Well, we made it to Friday. But, if you’ve been following me on the social, you’ll know that Covid is beginning to spread once again in Pennsylvania, New Jersey, Delaware, Virginia, and Illinois. I live in a tower block and I can say that many of my neighbours are no longer wearing masks indoors. Yet mask-wearing is the easiest defence we have against the spread of the coronavirus. So let’s take a look at the most effective types of masks, thankfully charted by xkcd.

Credit for the piece goes to Randall Munroe.

Cheesesteaks and Politics

For those unaware, Pennsylvania matters in the 2020 election. And it has mattered for years as a perennial swing state. There are of course the visits to steel mill cities like Pittsburgh, deindustrialised places like Johnstown, and unions love visits to places in Lackawanna and Luzerne. (You can read more about Pennsylvania as a swing state in my latest analysis here.)

But I want to focus on visits to Philadelphia. Because they inevitably involve the candidate consuming a cheesesteak. The Economist’s sister magazine, 1843, recently published an article on this very subject. And the whole thing is worth a read.

How have I managed to find this relevant to a blog about data visualisation? Well, they included a recipe to help people understand just what goes into the traditional Philadelphia dish.

Personally, I always have to confess, I’ve never been a huge fan. But, I’ll take provolone over whiz any day.

Credit for the piece goes to Jake Read.

Trumpsylvania

After working pretty much non-stop all spring and summer, your humble author finally took a few days off. Throw in a bank holiday and you are looking at a five-day weekend. But, because this is 2020, travelling was out of the question and so instead I hunkered down to finish writing/designing an article I have been working on for the last several weeks/few months.

The main write-up—it is a lengthy-ish read so you may want to brew a cup of tea—is over at my data projects site. This is the first project I have really written about for that since spring/summer 2016. Some of my longer-listening readers may recall that the penultimate piece there I wrote about Pennsyltucky was inspired by work I did here at Coffeespoons.

To an extent, so is this piece. I wrote about Trumpsylvania, the political realignment of the state of Pennsylvania. 2016 and the state’s vote for Donald Trump was less an aberration than many think. It was the near-end result of a decades-long transformation of the state’s political geography. And so I looked at the data underlying the shift and how and where it occurred.

And originally, I had a slightly different conclusion as to how this related to Pennsylvania in the upcoming 2020 election. But, the whole 2020 thing made me shift my thinking slightly. But you’ll have to read the whole thing to understand what I’m talking about. I will leave you with one of the graphics I made for the piece. It looks at who won each county in the state, but also whether or not the candidate was able to flip the county. In other words, was Clinton able to flip a Republican county? Was Trump able to flip a Democratic county?

Let me know what you think.

And of course, many, many thanks to all the people who suffered my ideas, thoughts, and early drafts over the last several weeks. And even more thanks to those who edited it. Any and all mistakes or errors in the piece are all mine and not theirs.

Credit for the piece is mine.