Biden’s English Ancestry Revisited

Last week I posted about an article in the BBC on the English ancestry of American president Joe Biden. These types of articles are a bit pro forma: a famous person gets an article about their personal ancestry with a family tree attached. Interestingly, this article had no tree, just the timeline I mentioned and a graphic as part of an aside on the declining self-identification as English-American.

And that, normally, is it. Perhaps the article comes out with a few revisions upon the famous person’s marriage, birth of children, and more rarely death, but that is it. Yesterday, however, the BBC posted a follow-up article about an English family claiming kinship with Joe Biden. And this article did include a family tree of sorts.

With some interesting spacing here…

This isn’t a family tree in the traditional sense. I would argue it’s the sort of chart genealogists would use to highlight two parties’ relationship to their most recent common ancestor (MRCA). But this chart does something odd: it spaces out the generations inconsistently, and so Joe Biden appears at the bottom, aligned with the grandchildren of Paul Harris, the man at the centre of the story.

If you compare the height/length of the lines linking the different generations you can see the lines on Biden’s side of the graphic are very long compared to those on the Harris’ side. This isn’t technically incorrect, but it muddies the water when it comes to understanding the generational differences. So I revisited the design below.

Now with more even spacing…

Here I dropped the photographs because, primarily, I don’t have access to them. But they also eat up valuable real estate and aren’t necessary to communicate the relationships. I kept the same distance between generations, which does a better job showing the relationship between Joe Biden and Paul Harris, who appear to be actual fifth cousins. Joe is clearly at a different level than that of Paul’s grandchildren.

I added some context by labelling the generational relationship. At the top we have William and James Biden, assuming they are brothers, listed as siblings. The next level down are first cousins, then second, &c. Beyond Paul, however, we have two additional generations that are removed from the same relationship level. This is where the confusing “once-removed” or “twice-removed” comes into play. One way to think of it is as the number of steps you need to take from, say, Paul’s grandchildren, to get to a common generational level. In their case that is two levels, hence the grandchildren are fifth cousins to Joe Biden, twice removed.
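For those who like to see the counting spelled out, here is a minimal Python sketch of the steps described above. The function name and the convention of numbering generations down from the common ancestor are assumptions made for illustration; they are not taken from the BBC chart.

```python
ORDINALS = ["first", "second", "third", "fourth", "fifth",
            "sixth", "seventh", "eighth", "ninth", "tenth"]
REMOVED = {1: "once", 2: "twice"}


def cousin_relationship(gen_a: int, gen_b: int) -> str:
    """Describe how two people are related.

    gen_a and gen_b count generations below the most recent common
    ancestor: 1 = child, 2 = grandchild, and so on.
    """
    if gen_a < 1 or gen_b < 1:
        raise ValueError("generations must be 1 or greater")

    degree = min(gen_a, gen_b) - 1   # 0 = sibling level, 1 = first cousins
    removed = abs(gen_a - gen_b)     # steps needed to reach a common level

    if degree == 0:
        if removed == 0:
            return "siblings"
        return f"aunt/uncle line, {removed} generation(s) apart"

    # Crude fallback for very distant cousins beyond the named ordinals.
    ordinal = ORDINALS[degree - 1] if degree <= len(ORDINALS) else f"{degree}th"
    label = f"{ordinal} cousins"
    if removed:
        times = REMOVED.get(removed, f"{removed} times")
        label += f", {times} removed"
    return label


# Assuming William and James Biden were brothers (generation 1),
# Joe Biden and Paul Harris both sit at generation 6,
# and Paul's grandchildren sit at generation 8.
print(cousin_relationship(6, 6))
print(cousin_relationship(6, 8))
```

The degree comes from whoever is closer to the common ancestor, and the "removed" part is simply the gap in generations between the two people.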

These types of charts are great to show narrow relationships. Because, if we assume that up until recently each of the generations depicted above had four or five children, that tree would be unwieldy at best to show the relationship between Paul’s family and Joe Biden. If you ever find yourself working on your family ancestry or history and need to show someone how you are related, this type of chart is a great tool.

Credit for the original goes to the BBC graphics department

Credit for my remake is mine.

Biden’s English Ancestry

We all know Joe Biden as the Irish American president. And that’s no malarkey. But go back far enough in your family tree and you may find some interesting ancestry and ethnic origins, and that’s no different with Joe Biden. Keep in mind that our number of ancestors doubles every generation. You have four grandparents, and many of us met most of them. But you had eight great-grandparents. How many of those did you know? And you had 16 great-great-grandparents, and you likely didn’t know any of them personally. It becomes pretty easy for an ethnic line to sneak into your ancestry.
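The doubling is easy to check with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the count is of slots in the tree, since the same person can fill more than one slot when cousins marry.

```python
def ancestors_in_generation(n: int) -> int:
    """Number of ancestor slots n generations back (1 = parents)."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    return 2 ** n


def nominal_share_per_ancestor(n: int) -> float:
    """Share of the tree that one ancestor n generations back represents."""
    return 1 / ancestors_in_generation(n)


labels = ["parents", "grandparents", "great-grandparents",
          "great-great-grandparents"]
for n, label in enumerate(labels, start=1):
    count = ancestors_in_generation(n)
    share = nominal_share_per_ancestor(n)
    print(f"{label}: {count} ancestors, each {share:.2%} of the tree")
```

By the great-great-grandparent level a single ancestor is one sixteenth of the tree on paper, which is how one line can slip by unnoticed.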

And in Biden’s case it may well be English. Although sneaking in is probably a stretch, as this BBC article points out, because his patrilineal line, i.e. his father’s father’s father’s, &c., is likely English. Of course back in the day the Irish and the English mixing would have been unconscionable, at least as my grandmother would have described it. And so it’s easy to see how the exact origins of family lines are quietly forgotten. But that’s why we have genealogists.

The article eschews the traditional family tree graphic and instead uses only two charts. The first is a simple timeline of Biden’s direct ancestors.

Biden’s patrilineal timeline

No, it’s no family tree, but timelines are a critical tool used by genealogists because at its core, genealogy is all about time and place. And a timeline has got one of those two facets covered.

Timelines help visualise stories in chronological order. I cannot tell you the number of family trees I have seen where people who create trees casually simply copy and paste data without scrutiny. Children born well after the deaths of parents are common. Or children born to parents in their 50s or 60s—perhaps not strictly impossible, but certainly highly irregular. And so to see Biden’s ancestors plotted out chronologically is a common graphic for those who do any work in genealogy, which my regular readers know is my hobby.
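The kind of scrutiny described above can be partly automated. Below is a minimal sketch, assuming each person has only a birth year and an optional death year. The `Person` class, the age thresholds, and the example people are all invented for illustration and do not come from any real tree or genealogy software.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    name: str
    born: int
    died: Optional[int] = None


def check_parent_child(parent: Person, child: Person,
                       min_age: int = 13, max_age: int = 50) -> List[str]:
    """Return a list of chronological red flags for one parent-child link."""
    issues = []
    # Allow one year of slack: a father can die shortly before a child
    # is born, and year-only dates are imprecise.
    if parent.died is not None and child.born > parent.died + 1:
        issues.append(
            f"{child.name} born {child.born}, after {parent.name} "
            f"died in {parent.died}")
    age = child.born - parent.born
    if age < min_age:
        issues.append(
            f"{parent.name} would have been {age} when {child.name} was born")
    elif age > max_age:
        issues.append(
            f"{parent.name} would have been {age} when {child.name} "
            f"was born; not impossible, but worth checking")
    return issues


# Invented people, purely for illustration.
mother = Person("Mary Example", born=1800, died=1840)
son = Person("John Example", born=1825, died=1890)
daughter = Person("Ann Example", born=1855)

print(check_parent_child(mother, son))
for issue in check_parent_child(mother, daughter):
    print(issue)
```

The thresholds are deliberately loose. The point is to flag links for a second look, not to declare them wrong.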

That alone would make the article worth sharing. Because, I enjoyed that graphic. I probably would have created a separate line for the birthplace of each individual, but I quibble.

However, we have another graphic that’s not so great. And once again with the BBC I’m talking about axis lines.

American ethnic origins

Here we have a chart looking at US ancestry as claimed in the US censuses of 1980 and 2000. But we do not have any vertical lines making it easy for readers to accurately compare the lengths of the various bars. Twice lately I’ve posted about axis lines and the BBC. Third time’s the charm?

We can also look at using these not as bars, but as line charts as I did in this re-imagining to the right.

First, we no longer need two distinct colours, though you could argue the English line should be a highlight or call out colour given its role in the article. Instead each line receives a label at the right and only the English line crosses any other, but given their point-to-point slope, it’s not confusing like a line chart with all years between 1980 and 2000 could be.

Secondly, the slope here of the line reinforces the idea of falling population numbers. The bar chart also shows this, but through a leftward movement in bars. The bar option certainly works and there’s nothing wrong with it, but these lines offer a more intuitive concept of falling numbers.

I also added some clarification to the data definition. These lines represent the number of people who reported at least one ethnic ancestry—at the time US census respondents could enter up to two. For myself, as an example, I could have entered Irish and Carpatho-Rusyn. But my own small sliver of English ancestry would have been left off the list.

Ultimately, the declining numbers of responses along with some reporting on self-identification points to the disappearing concepts of “Irish American” or “English American” as many increasingly see themselves as simply White Americans. But that’s a story for another day.

In the meantime, we have Joe Biden, the Irish American president, with a small bit of English ancestry. For those interested in the genealogy, the article also includes some nice photos of baptismal and marriage records. It’s an interesting read, though I’m hungry for more as it’s a very light duty pass.

Credit for the BBC pieces goes to the BBC graphics department.

Credit for my reimagination is mine.

On a Line. Or Not.

Two weeks ago I was reading an article in the BBC that fact checked some of President Biden’s claims about the economy. The other day I posted about axis lines and their use in graphics. Axis lines help ground the user in making comparisons between bars, lines, or whatever, and in reading the minimum, maximum, and intervals of the data set.

I was reading the article and first came upon this graphic. It’s nothing crazy and shows job growth in the aggregate for the first three months of a presidential administration. A pretty neat comparison in the combination of the data. I like.

Pay attention to what you see here. There will be a quiz.

I don’t like the lack of grid lines for the axis, however. But, okay, none to be found.

I keep reading the article. And then a couple of paragraphs later I come upon this graphic. It looks at the monthly figures and uses a benchmark line, the red dotted one, to break out those after January 2021 when Biden took office.

Spot the differences.

But do you notice anything?

The lines for the y-axis are back!

The article had a third graphic that also included axis lines.

I don’t have a lot to say about these graphics in particular, but the most important thing is to try and be consistent. I understand the need to experiment with styles as a brand evolves. Swap out the colours, change the styles of the lines, try a new typeface. (Except for the blue, we are seeing different colours and typefaces here, but that’s not what I want to write about.)

First, I don’t know if these are necessarily style experiments. I suspect not, but let’s be charitable for the sake of argument. I would refrain from experimenting within a single article. In other words, use the lines or don’t, but be consistent within the article.

For the record, I think they should use the lines.

Another point I want to make is with the third graphic. You’ll note that, like I said above, it does use axis lines. But that’s not what I want to mention.

At least we have lines.

Instead I want to look at the labelling on the axes. Let’s start with the y-axis, the percentage change in GDP on the previous quarter. The top of the chart we have 30%. As I’ve said before, you can see in the Trump administration, the bar for the initial Covid-19 rebound rises above the 30% line. It’s not excessive, I can buy it if you’re selling it.

But let’s go down below the 0-line. Just prior to the rebound we had the crash. Similarly, this extends just below the -30% line. But here we have a big space and then a heavy black line below that -30% line. It looks like the bottom line should be -40%, but scanning over to the left and there is no label. So what’s going on?

First, that heavy black line, why does it appear the same as the baseline or zero-growth line? The axis lines, by comparison, are thin and grey. You use a heavier, darker line to signify the breaking point or division between, in this case, positive and negative growth. Theoretically, you don’t need the two different colours for positive and negative growth, because the direction of the bar above/below that black line encodes that value. By making the bottom line the same style as the baseline, you conflate the meaning of the two lines, especially since there is no labelling for the bottom line to tell you what the line means.

Second, the heaviness of the line draws visual attention to it and away from the baseline, especially since the bottom line has the white space above it from the -30% line. Consider here the necessity of this line. For the 30% line that sets the maximum value of the y-axis, we have the blue bar rising above the line and the administration labels sit nicely above that line. There is no reason the x-axis labels could not exist in a similar fashion below the -30% line. If anything, this is an inconsistency within the one chart, let alone the one graphic.

Third, is it -40%? I contend the line isn’t necessary and that if the blue bar pokes above the 30% line, the orange bar should poke below the -30% line. But, if the designer wants to use a line below the -30% line, it should be labelled.

Finally, look at the x-axis. This is more of a minor quibble, but while we’re here…. Look at the intervals of the years. 2012, 2014, 2016, every two years. Good, makes sense. 2018. 20…21? Suddenly we jump from every two years to a three-year interval. I understand it to a point, after all, who doesn’t want to forget 2020. But in all seriousness, the chart ends at 2021 and you cannot divide that evenly. So what is a designer to do? If this chart had less space on the x-axis and the years were more compressed in terms of their spacing, I probably wouldn’t bring this up.

However, we have space here. If we kept to a two-year interval system, I would introduce the labels as 2012, but then contract them with an apostrophe after that point. For example, 2014 becomes ’14. By doing that, you should be able to fit the two-year intervals in the space as well as the ending year of the data set.
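As a rough illustration of that labelling scheme, here is a small Python sketch that builds the tick positions and their labels. The function name is hypothetical and this is not how the BBC produces its charts; it only shows the idea of keeping the two-year rhythm, contracting later years, and still labelling the final year.

```python
from typing import List, Tuple

APOSTROPHE = "\u2019"  # typographic apostrophe, as in ’14


def year_tick_labels(start: int, end: int, step: int = 2) -> List[Tuple[int, str]]:
    """Return (year, label) pairs for an x-axis.

    The first year is written in full, later years are contracted,
    and the last year of the data is always included.
    """
    if end < start:
        raise ValueError("end must not be before start")
    years = list(range(start, end + 1, step))
    if years[-1] != end:
        years.append(end)  # keep the final data year even if off-interval
    labels = [str(years[0])]
    labels += [f"{APOSTROPHE}{year % 100:02d}" for year in years[1:]]
    return list(zip(years, labels))


for year, label in year_tick_labels(2012, 2021):
    print(year, label)
```

For 2012 through 2021 this keeps ticks at 2012, 2014, 2016, 2018, and 2020, then adds 2021 at the end.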

Overall, I have to say that this piece shocked me. The lack of attention to detail, the inconsistency, the clumsiness of the design and presentation. I would expect this from a lesser organisation than the BBC, which for years had been doing solid, quality work.

The first chart is conceptually solid. If Biden spoke about job creation in the first three months of the administration vs. his predecessor, aggregate the data and show it that way. But the presentation throughout this piece does that story a disservice. I wish I knew what was going on.

Credit for the piece goes to the BBC graphics department.

What Is Infrastructure?

This morning I read a piece in Politico Playbook that broke down President Biden’s $2.25 trillion proposal for infrastructure spending, something the United States is generally regarded as sorely needing. $2.25 trillion is a lot of money and it’s a fair question to ask whether all that money is really money for infrastructure.

Because, it turns out, it’s not.

Please, sir, may I have more train money?

That isn’t to say money spent on job retraining or home care services wouldn’t be money well spent. Rather, it’s just not infrastructure.

But politics and the English language is a topic for another day. Oh wait, somebody already did write about that.

Credit for the piece is mine.

Biden’s Cabinet

Note: I wanted this to go up on Inauguration Day, but I had some server issues last week. And while I got everything back for Friday and Monday, I didn’t want to wait too long to post this. You’ll note at the end that I have questions about General Austin and whether he could be confirmed as Defence Secretary. Spoiler: He was.

Today is Inauguration Day and at noon, President Trump returns to being a citizen and Joe Biden assumes the office of the presidency. He comes to office with arguably the most diverse cabinet in American history supporting him and his agenda.

CNN took a look at that diversity with this piece, which uses an interactive, animated stacked bar chart.

The proposed cabinet vs. the US ethnic breakdown

I took a screenshot of the ethnic/racial diversity. At the top, each bar represents one member of cabinet, whom you can reveal by mousing over the bar. Below is a stacked bar chart showing the racial makeup of the United States. You can see how the cabinet does resemble, and in some cases exceeds, the diversity of the broader United States.

One thing to note, however, is that we see 26 members of Cabinet. Some of those are the heads of the big executive departments like Treasury and Defence. But I’m not certain everyone is technically a cabinet-level position, e.g. Cecilia Rouse, Chair of the Council of Economic Advisers. It could be that the position is being elevated to cabinet level like John Kerry’s role as climate envoy. And if I just missed the press announcement, that’s on me. But that could affect the overall numbers.

Regardless, the nominated cabinet is more diverse than the previous two administrations as the CNN piece also shows.

The proposed cabinet vs. the preceding inaugural cabinets

I should point out that an incoming administration usually has a few of its national security positions already confirmed or confirmed on the first day, e.g. Defence and State. However, the Republican Senate, obsessed with the lie of a fraudulent election, has only just begun the confirmation process. In fact, as of late last night, only Avril Haines has been confirmed by the Senate (84–10) for Director of National Intelligence.

Furthermore, almost every administration has one or two nominations that fail to pass the Senate. George W Bush had Linda Chavez, Barack Obama had Tom Daschle, and Donald Trump had Andrew Puzder, just to give one from each of the last three administrations.

With a 50–50 Senate, I would expect there to be a few nominees who fail to make it over the line. Austin could be one: there appears to be some bipartisan agreement that we ought not nominate recent military officials as civilian heads of said military. Another to keep an eye out for is Neera Tanden. She riles conservatives and angers Bernie Sanders supporters, so whether the Senate will confirm her as Director of the Office of Management and Budget remains an open question in my mind.

Credit for the piece goes to Priya Krishnakumar, Catherine E. Shoichet, Janie Boschma and Kenneth Uzquiano.

Can Texas Go Blue on Tuesday?

One story I’m following on Tuesday night is Texas. The state’s early voting—still with Monday to go—has surpassed the state’s total 2016 vote. Polling suggests that early votes lean Biden due to President Trump’s insistence that his supporters vote in person on Election Day as he lies about the integrity of early and mail-in voting.

The Texas Tribune looked at what we know about that turnout and what it may portend for Tuesday’s results. And, to be honest, we don’t—and won’t—really know until the votes are counted. They put together a great piece that divided Texas counties into four groups (their terminology): big blue cities, fast-changing counties, solidly red territory, and border counties. They then looked at the growth in registered voters in those counties from 2016, and looked at how they voted in the 2016 presidential election (Hillary Clinton vs. Donald Trump) and the 2018 US Senate election (Beto O’Rourke vs. Ted Cruz).

The piece uses the above stacked bar chart to show that the largest share of Texas’ 1.8 million new registered voters belongs to the big blue cities. The second largest group is the competitive suburbs in the fast-changing counties. The third largest, though quite close to second, was the solidly red territory. The border counties, still important for the margins, rank a distant fourth.

I’m not normally a fan of stacked bar charts, because they do not allow for great comparisons of the constituent elements. For example, try comparing any of those solidly red territory counties to one another. But here, the value is more in the stacked set as a group rather than the decomposition of the set, because you can see how the big blue cities have, as a group, a greater number of those 1.8 million new voters.

Those fast-changing counties include a lot of the suburbs for Texas’ largest cities. And those are areas where, across the country, Republicans are losing voters by the tens of thousands to the Democrats. As battlegrounds, these presented a challenge, because as swing counties, they split their votes between Clinton and Cruz and Trump and Beto. And so the designers chose purple to represent them in the stacked design. I think it’s a solid choice and works really well here.

But in terms of the story, I’ll add that in 2016, Trump won Texas by 807,000 votes. Texas added 1,800,000 new voters since then. And turnout before Election Day is already greater than it was in 2016.
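To put those two numbers side by side, here is a back-of-the-envelope sketch in Python. It rests on loud simplifying assumptions that are not in the Texas Tribune piece: every new registrant votes, each picks one of only two candidates, and everyone else votes exactly as in 2016. The function name is invented for illustration.

```python
def break_even_share(margin: int, new_voters: int) -> float:
    """Share of new voters the trailing side needs to erase a margin.

    Solves s * N - (1 - s) * N = margin for s, where N is the number
    of new voters. A result above 1.0 means the margin cannot be
    erased by new voters alone.
    """
    if new_voters <= 0:
        raise ValueError("new_voters must be positive")
    return (1 + margin / new_voters) / 2


trump_margin_2016 = 807_000      # from the post
new_registered_voters = 1_800_000  # from the post

share = break_even_share(trump_margin_2016, new_registered_voters)
print(f"Break-even share of new voters: {share:.1%}")
```

Under those assumptions the break-even share comes out a little above 72%, which is a steep ask and fits with calling Texas possible but still a reach.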

It’s still a state likely to go for Trump on Tuesday. But, if Biden has a good night, it’s not inconceivable that Texas flips. FiveThirtyEight’s polling average has Trump with only a 1.2 point lead.

Credit for the piece goes to Mandi Cai, Darla Cameron and Anna Novak.

Red Shift, Blue Shift

Last night I published a graphic on Instagram that I think people may find helpful if they try to follow Election Day results on Tuesday. I wanted to explain the concept of a red shift or blue shift. (I’ve also seen it described as states having a red mirage or a blue mirage.)

For my non-American readers, it’s important to understand that while this is a national election, the United States’ federal system means that each state runs its own election with its own rules and they can vary some state to state. For example, early or mail-in voting can vary significantly from state to state with some states allowing it only in emergencies (and some of those this cycle will not allow people to cite COVID-19 as an emergency).

Another factor for everyone to consider is that polling indicates President Trump’s fraudulent messaging about, well, voting fraud has shifted a normally split use of early/mail-in voting to a Democratic advantage. In other words, Democrats are far more likely to vote early, either in person or by post. Republicans are far more likely to vote on Election Day.

Combine those two factors and we get Red Shift vs. Blue Shift.

Some states allow election officials to begin counting their early votes prior to Election Day. Other states forbid counting until Election Day morning, or in some cases until after the polls close.

In states where early votes can be counted—the swing states Arizona, Florida, and North Carolina are among this group—it is possible that when the polls close, or shortly thereafter, we will see an instant and enormous lead for Joe Biden. But, as the states begin to count in-person day-of votes, which again favour Republicans, Trump may begin to eat into those margins. The question will be, can Trump’s numbers eat in so much that when the final counts are complete, he can overtake those Biden numbers? This is the Red Shift.

Conversely we have the Blue Shift. In these states—swing states like Georgia, Michigan, Pennsylvania, Texas, and Wisconsin are in this group—election officials cannot begin to count early votes either until the morning or when the polls close. In these states we may see the in-person day-of votes, largely expected to be for Republicans, run up to high totals fairly quickly. At that time, Trump may have a significant lead. Then when officials pivot to counting the early votes, Biden will begin to eat into those margins. And again, the question will be, can Biden eat into those margins sufficiently to shift the outcome after all the votes are counted?
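A toy example may help show why the counting order matters even though the final result does not change. The vote totals below are invented for illustration and do not describe any real state.

```python
from typing import List, Tuple


def running_margins(batches: List[Tuple[int, int]]) -> List[int]:
    """Cumulative Biden-minus-Trump margin after each counted batch.

    Each batch is (biden_votes, trump_votes), listed in counting order.
    """
    margins = []
    biden_total = 0
    trump_total = 0
    for biden_votes, trump_votes in batches:
        biden_total += biden_votes
        trump_total += trump_votes
        margins.append(biden_total - trump_total)
    return margins


# Hypothetical batches: early votes lean Biden, day-of votes lean Trump.
early = (650, 350)
election_day = (400, 600)

# Early votes counted first: a big Biden lead that shrinks (Red Shift).
print(running_margins([early, election_day]))

# Day-of votes counted first: a Trump lead that erodes (Blue Shift).
print(running_margins([election_day, early]))
```

Both orders end on the same final margin. Only the path there differs, which is the whole point of the mirage description.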

Be prepared to hear about these scenarios Tuesday night.

Credit for the piece is mine.

Choose Your Own FiveThirtyEight Adventure

In case you weren’t aware, the US election is in less than a week, five days. I had written a long list of issues on the ballot, but it kept getting longer and longer so I cut it. Suffice it to say, Americans are voting on a lot of issues this year. But a US presidential election is not like many other countries’ elections in that we use the Electoral College.

For my non-American readers, the Electoral College, very briefly, was created by the country’s founding fathers (Washington, Jefferson, Adams, Franklin, et al.) to do two things. One, restrict selection of the American president to a class of individuals who theoretically had a broader/deeper understanding of the issues—but who also had vested interests in the outcome. The founders did not intend for the American people to elect the president. The second feature of the Electoral College was to prevent the largest states from dominating smaller states in elections. Why else would Delaware and Rhode Island surrender their sovereignty to join the new United States if Virginia, Pennsylvania, and New York make all the decisions? (The founders went a step further and added the infamous 3/5 clause, but that’s another post.)

So Americans don’t elect the president directly and larger states like California, New York, and Texas, have slightly less impact than smaller states like Wyoming, Vermont, and Delaware. Each state is allotted a number of Electoral College votes and the key is to reach 270. (Maybe another time I’ll get into the details of what happens in a 269–269 tie.) Many Americans are probably familiar with sites like 270 To Win, where you can determine the outcome of the election by saying who won each state. But, even though the US election is really 50 different state elections, common threads and themes run through all those states and if one candidate or another wins one state, it makes winning or losing other states more or less likely. FiveThirtyEight released a piece that attempts to link those probabilities and help reveal how decisions voters in one state make may reflect on how other voters decide.

The interface is fairly straightforward—I’m looking at this on a desktop, though it does work on mobile—with a bunch of choices at the top and a choropleth map below. There we have a continually divergent gradient, meaning the states aren’t grouped into like bins but we have incredibly subtle differences between similar states. (I should also point out that Maine and Nebraska are the two exceptions to my above description of the Electoral College. They divide their votes by congressional district: whoever wins the district gets that Electoral College vote and then the state overall winner receives the remaining two votes.)
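The Maine and Nebraska rule in that parenthetical is simple enough to sketch in Python. The function name and the candidate labels "A" and "B" are placeholders for illustration, not real results.

```python
from collections import Counter
from typing import List


def split_electoral_votes(statewide_winner: str,
                          district_winners: List[str]) -> Counter:
    """Allocate votes the way Maine and Nebraska do.

    One vote goes to the winner of each congressional district,
    and two more go to the statewide winner.
    """
    votes = Counter(district_winners)
    votes[statewide_winner] += 2
    return votes


# Hypothetical two-district state: A wins statewide and one district,
# B wins the other district.
print(split_electoral_votes("A", ["A", "B"]))
```

In every other state the statewide winner simply takes all of the state's votes.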

Below that we have a bar chart, showing each state, its more/less likely winner state and the 270 threshold. Below that, we have what I’ve read/heard described as a ball plot. It represents runs of the simulation. As of Thursday morning, the current FiveThirtyEight model says Trump has an 11-in-100 chance of winning and Biden, conversely, an 89-in-100 chance.

But what happens when we start determining the winners of states?

Well, for my non-American readers, this election will feature a large number of voters casting their ballots early. (I voted early by mail, and dropped my ballot off at the county election office.) That’s not normal. And I cannot emphasise this next point enough. We may not know who wins the election Tuesday night or by the time Americans wake up on Wednesday. (Assuming they’re not like me and up until Alaska and Hawaii close their polls. Pro-tip, there’s a potentially competitive Senate race in Alaska, though it’s definitely leaning Republican.)

But, some states vote early and/or by mail every year and have built the infrastructure to count those votes, or the vast majority of them, on or even before Election Day. Three battleground states are in that group: Arizona, Florida, and North Carolina. We could well know the result in those states by midnight on Election Day—though Florida is probably going to Florida.

So what happens with this FiveThirtyEight model if we determine the winners of those three states? All three voted for Trump in 2016, so let’s say he wins them again next week.

We see that the states we’ve decided are now outlined in black. The remainder of the states have seen their colours change as their odds reflect the set electoral choice of our three states. We also now have a reset button that appears only once we’ve modified the map. I’m also thinking that I like FiveyFox, the site’s new mascot? He provides a succinct, plain language summary of what the user is looking at. At the bottom we see what the model projects if Arizona, Florida, and North Carolina vote for Trump. And in that scenario, Trump wins in 58 out of 100 elections, Biden in only 41. Still, it’s a fairly competitive election.
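One plausible way to get odds like these is to keep only the simulation runs that match the states you have decided and then count winners among what is left. I don't know that FiveThirtyEight implements it this way, so treat the Python below as a sketch of the general idea only. The data structure, function name, and the five toy runs are all invented.

```python
from typing import Dict, List, Optional

Simulation = Dict[str, object]  # {"states": {state: winner}, "winner": name}


def conditional_win_share(simulations: List[Simulation],
                          fixed: Dict[str, str],
                          candidate: str) -> Optional[float]:
    """Share of matching simulations that the candidate wins.

    `fixed` maps a state to the candidate assumed to win it.
    Returns None when no simulation matches the fixed states.
    """
    matching = [
        sim for sim in simulations
        if all(sim["states"].get(state) == winner
               for state, winner in fixed.items())
    ]
    if not matching:
        return None
    wins = sum(1 for sim in matching if sim["winner"] == candidate)
    return wins / len(matching)


# Five toy runs with made-up outcomes.
sims = [
    {"states": {"AZ": "Trump", "FL": "Trump"}, "winner": "Trump"},
    {"states": {"AZ": "Trump", "FL": "Trump"}, "winner": "Biden"},
    {"states": {"AZ": "Biden", "FL": "Biden"}, "winner": "Biden"},
    {"states": {"AZ": "Biden", "FL": "Trump"}, "winner": "Biden"},
    {"states": {"AZ": "Trump", "FL": "Trump"}, "winner": "Trump"},
]

print(conditional_win_share(sims, {}, "Trump"))
print(conditional_win_share(sims, {"AZ": "Trump", "FL": "Trump"}, "Trump"))
```

Fixing states shrinks the pool of runs, so the odds among the remaining runs can move sharply, just as they do on the map.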

So what happens if by midnight the results from those three states show that Biden has managed to flip them? As of Thursday morning, he’s leading very narrowly in the opinion polls.

Well, the interface hasn’t really changed. Though I should add below this screenshot there is a button to copy the link to this outcome to your clipboard if, like me, you want to share it with the world or my readers.

As to the results, if Biden wins those three states, Trump has less than a 1-in-100 chance of winning and Biden a greater than 99-in-100.

This is a really strong piece from FiveThirtyEight and it does a great job of showing how states are subtly linked in terms of their likelihood to vote one way or the other.

Credit for the piece goes to Ryan Best, Jay Boice, Aaron Bycoffe and Nate Silver.

Where Are the Votes?

I’m not working for a good chunk of the next few days. But, I did want to share with my readers an analysis of Pennsylvania’s missing votes. Broadly, Trump needs to win the Commonwealth of Pennsylvania next week—yes, the US election is now one week away. Though, Pennsylvania allows mail-in ballots postmarked on Election Day to arrive within a few days and still be counted. So we may not have final tallies for the state until the weekend or Monday after Election Day.

Pennsylvania, of course, narrowly voted for Donald Trump over Hillary Clinton in 2016 with 44,000+ votes making the difference. In 2020, polling has consistently placed Joe Biden above Donald Trump by 5+ points. But, can Trump again pull off an upset victory?

I argue that yes, he can. And fairly easily too. (If you want to see why I think Pennsylvania is really Trumpsylvania, I recommend checking out my longer, more in-depth analysis.) So where would the votes come from? I mapped the 2016 difference between votes cast and registered voters, i.e. people who could have voted, but did not for whatever reason. I then coloured the map by the county’s winner in 2016. Red counties voted for Trump by more than 10 points and blue for Clinton by more than 10 points. The purple counties are those that were competitive, plus or minus 10 points for either candidate.

In the purple counties, both candidates will want to drive out as many voters as possible. But in the blue counties, Biden has reliably Democratic votes and in red Trump has reliably Republican votes. So why on Monday did Trump visit Allentown, Lititz, and Martinsburg? Because that’s where those votes are.

Allentown, in Lehigh County, is competitive. In fact, neighbouring Northampton Co. will be a key swing county next week and one I will be following closely as the returns come in. But Lititz, Lancaster Co., and Martinsburg, Blair Co., are in reliably red counties. (Though in my Trumpsylvania piece I argue Lancaster Co. is undergoing a transition to a competitive, albeit lean Republican county.)

In Lancaster Co., which went to Trump by nearly 20 percentage points in 2016, there were still just short of 100,000 voters who didn’t vote in 2016. Not all of those voters would have voted for Trump, but for sake of argument, just say 50% would have. That makes just short of 50,000 potential Trump votes—more than Trump’s entire state margin.

Blair Co. is in the Pennsyltucky region of the state, relatively rural, but in Blair’s case, its county seat Altoona is the state’s 10th largest city. While the total number of votes—and the total number of non-voting voters—are smaller than in Lancaster Co., add up all the available votes and it’s a large number.

If you add up all those red counties’ missing votes, you get a total of just shy of 840,000 missing votes. Far more than enough to drastically swing the Commonwealth to Trump in 2020.

Of course, Biden’s counting on driving out turnout in Philadelphia and Pittsburgh and their suburbs, along with other cities in the state, like Allentown, Scranton, Harrisburg, and Erie. In those blue counties, there were 927,000 missing votes, so the potential for a Biden win is also there.
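The arithmetic behind the last few paragraphs can be laid out in a short Python sketch. The figures are the rounded ones quoted above, the 50% split is the same for-sake-of-argument assumption, and the function names are invented for illustration.

```python
def potential_votes(non_voters: int, assumed_share: float) -> float:
    """Votes a candidate gains if a given share of non-voters turn out for him."""
    return non_voters * assumed_share


def share_of_pool_needed(margin: int, pool: int) -> float:
    """Fraction of a pool of missing votes needed to equal a margin."""
    if pool <= 0:
        raise ValueError("pool must be positive")
    return margin / pool


STATE_MARGIN_2016 = 44_000        # Trump's approximate statewide margin
LANCASTER_NON_VOTERS = 100_000    # "just short of" this figure
RED_COUNTY_MISSING = 840_000
BLUE_COUNTY_MISSING = 927_000

lancaster = potential_votes(LANCASTER_NON_VOTERS, 0.5)
print(f"Lancaster Co. potential Trump votes: {lancaster:,.0f}")
print(f"Exceeds 2016 state margin: {lancaster > STATE_MARGIN_2016}")

red_share = share_of_pool_needed(STATE_MARGIN_2016, RED_COUNTY_MISSING)
blue_share = share_of_pool_needed(STATE_MARGIN_2016, BLUE_COUNTY_MISSING)
print(f"Share of red-county missing votes equal to the margin: {red_share:.1%}")
print(f"Share of blue-county missing votes equal to the margin: {blue_share:.1%}")
```

Either side needs only about one in twenty of its missing votes to match the 2016 margin, which is why turnout in safe counties matters so much.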

But, if Democratic voters again don’t vote, as in 2016, Trump has plenty of potential votes to pick up across the state.

Credit for the piece is mine.