Datagraphic – Page 25 – Coffee Spoons

The Earth Is Actually Quite Smooth

At scale. Not quite as smooth as a billiard ball, as is often claimed. But still, with the majority of the Earth’s surface covered by water, even the highest mountains, Everest and K2, make for only tiny differences in height relative to the Earth’s size.

But that did not stop xkcd from making a scale model of Earth.

Credit for the piece goes to Randall Munroe.

Covid-19 Update: 18 January

Last week we saw that in the weeks after Christmas, new cases and deaths rebounded in the five states of Pennsylvania, New Jersey, Delaware, Virginia, and Illinois. The question was how bad would things continue to get? Would these rebounds sustain themselves?

A week later we can see a glimmer of good news in that with new cases, these rebounds appear to have crested and are now ebbing back down. At least in four states.

In Virginia, unfortunately, we see that new cases continue to climb with a new record of nearly 10,000 cases reported late last week. More broadly, this is the dilemma that confronts the United States. We have states like Pennsylvania, Delaware, and Illinois where we are bringing the virus back to heel. But in other states like Virginia, things continue to get worse.

New Jersey is somewhere in the middle. It appears to just possibly be cresting with its average actually ticking higher the last few days despite falling daily new cases. We will need to see how the Garden State plays out over the course of this week.

When we look at deaths, we continue to see the grim numbers pile up.

Deaths, of course, lag new cases by 2–4 weeks, sometimes as many as six or longer. In most of our five states, the average rate of deaths appears to be cresting or peaking. In Pennsylvania the curve may have peaked. In Delaware, we have seen a plateau and in Illinois we see the best news of a resumed decline.

In both New Jersey and Virginia, however, we see deaths continuing to climb, and in some cases by significant amounts.

If cases really have peaked in some of these states over the last week, we may expect deaths to continue to rise over the course of this week before beginning to fall again.

I also want to add two new graphics today. I have been trying to figure out how to cover the vaccination programme of the five states. Unfortunately, they do not all report the same data in the same way.

The graphic that perhaps makes the most sense is the one that looks the emptiest at the moment.

In order to resume “normal” lives, we need to achieve herd immunity. When we reach that level, the virus starves of new hosts and dies out. Broadly speaking, we have two ways of achieving herd immunity.

Option 1, let the virus run rampant and take its course through the population. The benefit is that society remains open and people can return to cafes, pubs, shops, and museums. The cost is that millions get sick and hundreds of thousands die. Sadly, this is the route taken by Sweden and, unofficially, the United States.

Option 2, vaccinate the population. The benefit here is that millions do not get sick and hundreds of thousands do not die. The cost is that in order to wait for a vaccine and vaccination we would need to close cafes, pubs, shops, and museums.

The reality is that we chose something between the two. In the initial months, after we (belatedly) recognised the threat of the virus, we shut down our economies and stayed home. We chose option 2. You can see in the state charts above how that quickly helped us curb the spread of new infections.

Unfortunately, then the Trump administration chose to follow option 1 and encouraged states to “reopen” their economies. And because we never got the virus fully under control, we sowed the seeds for the explosive growth this autumn and winter.

But the vaccines are now here and the best bet is to vaccinate the population. How many people do we need to vaccinate? The exact number depends upon the infectiousness of the virus. Measles, one of the most infectious viruses out there, requires near 100% vaccination rates to achieve herd immunity. Thankfully, this coronavirus is not as infectious as measles. Early estimates placed the range at 60–70%. But lately, some epidemiologists have indicated the true number may be higher. Dr. Fauci of the National Institutes of Health (NIH) has said the true number is likely 70–85%.

This is why the new strains of the coronavirus we have identified in South Africa and the United Kingdom worry folks. Both appear to be more transmissible than earlier strains. Neither strain appears to be more lethal in its own right (although more cases means more people will die), but this increased infectiousness could mean we need an even higher level of herd immunity, which means more vaccinations. And we’re already seeing anti-vaccination sentiment rising to somewhere in the range of 15–20%, just about the share of unvaccinated people we could perhaps tolerate at the higher herd immunity range.
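To make the link between infectiousness and the herd immunity threshold concrete, here is a minimal sketch using the textbook approximation that the threshold is 1 − 1/R0, where R0 is the average number of people one infected person goes on to infect. The function name and the R0 values are my own illustrative assumptions, not estimates for this coronavirus or its new strains.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Textbook approximation: share of the population that must be immune, 1 - 1/R0."""
    if r0 <= 1:
        return 0.0  # with R0 at or below 1, an outbreak dies out on its own
    return 1 - 1 / r0


# Illustrative R0 values only (assumptions, not estimates for any real strain).
for r0 in (2.5, 3.0, 4.0, 6.0):
    print(f"R0 = {r0:>4}: threshold of roughly {herd_immunity_threshold(r0):.0%}")
```

The takeaway is simply that the threshold climbs as R0 climbs, which is why a more transmissible strain pushes the required vaccination level higher.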

So what about the chart?

As we begin vaccinations, some states are reporting the numbers of people in their state that have been fully vaccinated against the coronavirus. I plot those numbers here. Pennsylvania, Virginia, and Illinois do so. Unfortunately, neither New Jersey nor Delaware does. I only have one data point recorded for Virginia and Illinois, and so they are not plotted yet, but both fall below the level of Pennsylvania, which has reported 0.50% of its population fully vaccinated. I have added a bar to show the range of estimated herd immunity we need.

And that gets us to the second new chart, the number of total doses administered per day.

Functionally this resembles the usual two charts. We track the number of doses administered daily and then plot their seven-day average to smooth out any day-to-day blips. Of course this means almost the opposite of those two charts as we are tracking the progress of people who will be immune from the virus.
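For anyone curious how the smoothing works, here is a minimal sketch of a seven-day average. I am assuming a trailing window (each day averaged with the six days before it); a centred window would be an equally reasonable choice. The dose counts are made up purely for illustration and are not any state's actual data.

```python
def seven_day_average(daily_counts):
    """Trailing seven-day mean; the first six days average over however many days exist."""
    averages = []
    for i in range(len(daily_counts)):
        window = daily_counts[max(0, i - 6): i + 1]
        averages.append(sum(window) / len(window))
    return averages


# Made-up daily dose counts with a weekend dip, purely for illustration.
doses = [1200, 1500, 1400, 1600, 1700, 600, 400, 1800, 1900]
smoothed = seven_day_average(doses)

for day, (raw, avg) in enumerate(zip(doses, smoothed), start=1):
    print(f"day {day}: reported {raw}, seven-day average {avg:.0f}")
```

The weekend dip in the raw numbers barely dents the averaged line, which is exactly the day-to-day blip the average is meant to smooth out.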

The catch is that with the current vaccines we need two shots for a full course of treatment and not all states break the data down with that level of granularity. Again, we are looking at Delaware and New Jersey as they provide only the total number of doses administered. Now that’s still helpful, but it doesn’t give us the most accurate picture of what is happening with vaccinations.

But in order to make things comparable across five states, I have decided to use that broader, total doses administered metric for Pennsylvania, Virginia, and Illinois. (Virginia and Illinois provide another headache in that they report the daily number of people fully vaccinated, but do not break down the number of full vaccination doses.)

So what is this second chart showing us?

Well, we are seeing a slow, nearly steady growth in the number of vaccines administered. The problem is that we need to see steep, nearly exponential line charts here if we want to have any hope of returning to “normal” anytime soon. Reporting tells us that the federal government’s approach to the logistics of vaccine distribution has been…not great. (Although at this point, perhaps that should not surprise us.)

Until we see these second charts begin to show more exponential growth, the first charts of the number of people fully vaccinated will be far below that herd immunity threshold we need to see.

Covering the vaccines in addition to the virus is a bit more work, but I’m going to try and cover them both over the next several months as I have with the outbreak itself.

Credit for the piece is mine.

Dove vs. Hawk

Earlier, I saw these two graphics floating around the Twitter. They each come from a major financial institution and attempt to place the voting (and non-voting) members of the Federal Open Market Committee (FOMC) on a spectrum of doves to hawks or slightly less dovish. The FOMC, part of the Federal Reserve system, sets interest rates for the US economy. Now, I’m being super simplistic here, but it’s broadly true. I should add, full disclosure, I presently work for the Federal Reserve Bank of Philadelphia.

The first graphic is from JPMorgan and plots in one-colour all the voting and non-voting members on a single axis from very dovish to somewhat less dovish. Thin black lines point to evenly spaced points on the axis and people are listed at each interval.

It’s a fairly simple approach, but effective. Nothing revolutionary here. What I find a bit odd is the line underneath the centre tick. What prompts that group to have what I’ll call a summary bar? Is it because Jay Powell, the chair of the Federal Reserve, is placed within that group? It’s a bit unclear.

Now keep in mind the classifications here, very dovish and somewhat less dovish, as we compare JPMorgan’s graphic to that of Bank of America.

The first thing that strikes me is the use of colour. Here we have a fairly straightforward divergent spectrum of red to blue. Along with other design elements, like typographic scale and contrast for the header, subhead, and labels, this piece strikes me as better designed and more polished.

But I still have questions.

Here we have dovish to hawkish. At the hawkish extreme, we have Esther George of Kansas City and Robert Kaplan of Dallas. In JPMorgan’s chart, both are grouped together as somewhat less dovish. But with Bank of America, they are decidedly hawkish. (Although with nine intervals, the Bank of America graphic has a bit more granularity than JPMorgan’s.)

So the biggest question, unfortunately left unanswered by each graphic, is what defines hawkish and somewhat less dovish? Just by words, they sound not at all alike. But both companies clearly place both individuals at the same end of the spectrum.

Part of the issue stems from the divergence point between red and blue. For most spectra of this type, that would be the demarcation between a committee member who is a dove or a hawk. But we have no similar separation for JPMorgan.

There is, however, one design element for Bank of America’s piece that I really like. My explanation of the FOMC at the top was a bit simplistic. Not every regional Federal Reserve president gets to vote every year. They rotate each year except for New York. These presidents get to vote alongside those on the Board of Governors.

In the graphic, note that everybody above the axis label is a member of the Board, i.e. they get to vote every year until their term expires. Below the axis we have the rotation schedule. Each line represents a bank president who can vote in a particular year. For example, the Philadelphia president, Patrick Harker, was a voting member on the committee in 2020, but falls off in 2021 and will not return until 2023. The Bank of America graphic captures this for each president very well.

I am a bit confused as to why some members, i.e. Kaplan and John Williams of New York, appear to sit between lines. I am unaware of any reasons why they would be between years.

Overall, I prefer the Bank of America piece. It more clearly presents the rotation element of the voting members of the FOMC. Yes, it has colours, but I’m confused as to why the demarcation between doves and hawks happens where it does. And why JPMorgan doesn’t describe anyone as a hawk. So while I prefer it, I think it could still use some additional information or context to make it clearer to readers.

Credit for the JPMorgan piece goes to a designer at JPMorgan.

Credit for the Bank of America piece goes to a Bank of America Global Research designer.

2020 Election Results…

Via xkcd.

It’s Friday and we’ve made it all the way to the end of the week. A little while back xkcd posted about the 2020 US election, showing where the votes for both candidates are approximately located.

This isn’t quite funny like I normally might post on a Friday, but it felt appropriate after the week we had with the impeachment.

Credit for the piece goes to Randall Munroe.

Impeachment 2: The Insurrection

Like many Americans I closely followed the outcome of yesterday’s historic vote by the House of Representatives to impeach President Trump for his incitement of an insurrection at the US Capitol in a failed coup attempt to overturn the 2020 election.

Words I still never thought I’d write describing an American election.

So at the end of the vote, I created this first graphic to capture the bipartisan nature of the impeachment. Ten Republicans broke ranks and voted with the Democrats. Keep in mind that in the first impeachment vote, in December 2019, zero Republicans did the same. Justin Amash had by then resigned from the Republican Party and sat as an independent.

But I was also interested in how “courageous” these votes could be seen to be. Trump remains immensely popular with his base despite his attempt to overthrow the US government and keep himself in power. Did the Republicans who supported impeachment sit in districts won by Biden?

The answer? Not really. Two did: congressmen from New York and California. But a look at the other eight reveals they represent Trump-supporting districts.

To be fair, there are probably three tiers of seats in that group. Liz Cheney, the No. 3 Republican in the House, is in her own Trump-supporting seat as Wyoming’s at large representative. But four other Republicans have seats where Trump won by more than 10 points.

Three more Republicans are in seats I’d label competitive, but lean Republican.

Clearly the argument can be made that for most of these Republicans, it was not a politically safe choice to vote for impeachment. House seats will be redistricted this year for the 2022 midterms, but I’ll be curious to see how these Republicans fare in those redistricting proceedings and then in the ultimate elections thereafter.

Credit for the piece is mine.

Trump’s White Wall

I meant to publish this yesterday, but this piece also offers a reminder that the hardest part of a data-driven story is usually finding the data. I was unable to find a single source of data for all the numbers I needed by the time I switched on for work. And so this had to wait until last night when I found what I needed.

And of course upon waking up this morning I found a few new articles with the data and more recent figures.

Since 2016, Trump has made building a great, big, beautiful wall on the US-Mexican border his signature policy. Of course, most illegal immigrants cross the border legally at checkpoints and normal ports of entry. A significant number are people who overstay the limits on their visas. So the efficacy of a great, big, beautiful wall is really not that great.

He also claimed that he would make Mexico pay for it.

So as he prepares to leave office, Trump this week is going on something of a victory tour and touting his administration’s successes. The first stop? Alamo, Texas to highlight his wall.

Let’s look at that wall and how much the administration has accomplished.

For context, the US border with Mexico is nearly 2000 miles long. As of 18 December, the administration had built 452 miles, less than a quarter of the border’s total length.

Crucially, most of that construction merely replaced sections of existing wall and fence scheduled for replacement. The amount of new wall built, as of 18 December, totals about 40 miles.

The cost of that 452 miles? More than $15 billion.

How much has Mexico paid? $0.
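For those who want to check the arithmetic behind those figures, here is a quick sketch using only the numbers quoted above. The 2,000-mile border length is the rounded figure from the text, and $15 billion is a floor ("more than"), so treat the results as rough bounds rather than precise values. The variable names are my own.

```python
BORDER_MILES = 2000   # "nearly 2000 miles", rounded up, so the share built is slightly understated
BUILT_MILES = 452     # total built as of 18 December
NEW_MILES = 40        # "about 40 miles" of wall where none existed before
COST_USD = 15e9       # "more than $15 billion", so the per-mile cost is a lower bound

share_of_border = BUILT_MILES / BORDER_MILES
share_new = NEW_MILES / BUILT_MILES
cost_per_mile = COST_USD / BUILT_MILES

print(f"Share of border with construction: {share_of_border:.1%}")
print(f"Share of construction that is new wall: {share_new:.1%}")
print(f"Cost per mile built: at least ${cost_per_mile / 1e6:.1f} million")
```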

Credit for the piece is mine.

Covid-19 Update: 10 January

The last time we checked in on Covid-19 in the states of Pennsylvania, New Jersey, Delaware, Virginia, and Illinois, things were peaking across the five states. As I said then:

If you look at the very tippy tip top of the curves in the other four states, we might just be seeing an inflection point.

And in the month since, my highly scientific term of “tippy point” appears to have been correct. New cases did begin to drop and by the start of the Christmas holiday we began to see real improvement. I should point out that deaths continued to rise, however, but we should expect that because deaths lag new cases by sometimes as many as four to six weeks.

So how are things now, a month on?

The new case curves for PA, NJ, DE, VA, & IL.

Well as you can see with new cases, not great and getting worse. Pennsylvania, New Jersey, Delaware, and Illinois all bottomed out prior to the holidays, and since then have been rising. It speaks to a surge in new cases likely caused by gatherings centred on the holidays.

The good news—if you can call it that—is that in Pennsylvania and Illinois, whilst cases rebound, they have not yet reached their mid-December peak in Pennsylvania and mid-November peak in Illinois. It’s worth pointing out that Chicago and separately Illinois instituted lockdowns earlier than the other four states prior to the holidays. That may account for the more dramatic reduction in those states.

The bad news is that in New Jersey and Delaware, the rebounds have now surpassed the peaks we saw in mid-December and cases continue to climb with new daily records pointing towards escalation of new cases in those states.

But the really bad news is in Virginia. The inflection point was there (note the little mini “W” at the top of the chart), but new cases declined neither for long enough nor by enough for there to be any real holiday decline. Instead, at best we could say the numbers paused for two weeks before resuming their upward trend.

How about deaths?

Again, fairly grim news here. A month ago we were talking about rising rates of deaths in all but Illinois. And in fact, Illinois is the only state where the death rate is significantly lower than what it was in mid-December.

In New Jersey and Virginia, we see two states where the rising death rate perhaps slowed, but it never really entered into decline. Pennsylvania and Delaware offer perhaps static death rates. In fact, Pennsylvania just yesterday surpassed its mid-December peak level.

But keep in mind that deaths lag new cases by somewhere between two and four weeks, sometimes longer. What this means is that with new cases now rebounding and in fact surpassing their peaks from a month ago, we can expect that the end of January and beginning of February could be particularly deadly.
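The date arithmetic behind that expectation is simple enough to sketch. I am assuming the year is 2021 (this post follows the December 2020 update) and using the two-to-four-week lag stated above; longer lags would push the window later. The variable names are my own.

```python
from datetime import date, timedelta

# Assumption: the year is 2021, inferred from this post following the December 2020 update.
cases_date = date(2021, 1, 10)

# Lag of two to four weeks, as stated in the text; longer lags are possible.
earliest_deaths = cases_date + timedelta(weeks=2)
latest_deaths = cases_date + timedelta(weeks=4)

print(earliest_deaths.isoformat(), "to", latest_deaths.isoformat())
```

That window runs from late January into early February, which is where the "end of January and beginning of February" estimate comes from.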

The situation is dire in the United States and things are going to get worse before they get better.

Credit for the piece is mine.

Needle Time

Yesterday was maybe the last election day for the 2020 US General Election. (There are still a few US House seats yet to be called, most notably a contested race in upstate New York.) These were a pair of runoff elections in Georgia for the state’s two US Senate seats (one for a full, six-year term, the other to finish out the final two years of a retiring senator).

I spent most of the night eating pizza and tracking results. One thing that I keep tabs on (in the sense of open tabs in the browser) is the New York Times needle forecast. It has its problems, but I wanted to highlight something I think was new last night. Or, if it wasn’t, I didn’t notice it back in November.

Below the needle was a simple table of results.

In the past, the needle was a bit opaque and it consumed data and spat out forecasts without users having a sense of what was driving those forecasts. Back in November, there were a few instances where states published incorrect data—that they later fixed—and when the needle consumed it, the needle forecast incorrect results.

But now we have a clear record of what data the forecast consumed in the table below the needles. It’s fairly straightforward as tables go. But tables don’t have to be sexy to be clear and effective.

The table lists the time when the data was added, the number of votes added, the type of vote added, and then the actual data vs. what was expected. And ultimately how that changed the needle. This goes a long way towards data transparency.

Simple colour use, bright blues and reds, shows when the result/data favoured the Republican or the Democrat. Thin, light strokes instead of heavy black lines for rows and columns place the visual emphasis on the data. And smaller type for the timestamp places the less important data at a lower level of importance.

It’s just very well done.

Credit for the piece goes to Michael Andre, Aliza Aufrichtig, Matthew Bloch, Andrew Chavez, Nate Cohn, Matthew Conlen, Annie Daniel, Asmaa Elkeurti, Andrew Fischer, Will Houp, Josh Katz, Aaron Krolik, Jasmine C. Lee, Rebecca Lieberman, Jaymin Patel, Charlie Smart, Ben Smithgall, Umi Syam, Miles Watkins and Isaac White.

Difficult Descendancy Charts

The holiday break is over as your author has burned up all his remaining time for 2020 and so now we’re back to work. And that means attempting to return to a more frequent and regular posting schedule for Coffeespoons.

I wanted to start with the death of Diego Maradona, a legendary Argentinian footballer. He died in November of a heart attack and left behind a complicated inheritance situation. To help explain the situation, the BBC created what in genealogy we call a descendancy chart. You typically use a descendancy chart to show the children, and sometimes grandchildren, of a person. (You can also attach people above the person of interest and show the person’s ancestral families.)

This is an example of a descendancy chart from my research into an unrelated family.

You can see Samuel Miller married Sabra Clark and had at least nine children with her. And I followed one of them, another Samuel, who married Elizabeth Woodruff and they had four children. In this version, you can also see Samuel the elder’s parents and siblings.

But Diego presents a complicated situation. He was married and had two children, then divorced. That’s not terribly uncommon. But he then went on to have potentially eight children with potentially five different women. (I say potentially because some of the claims are still working their way through the courts via paternity tests.)

The above type of chart works well with one couple. In my own family, I have at least one ancestor who had potentially two husbands (the second marriage has not yet been confirmed, but she definitely had children with two different men). And when we use this chart type to look at my ancestor’s descendants, you can see it becomes tricky.

Her children’s fathers can be placed to either side and then the children flow out from that. But whereas in the first chart we could see all nine children in one glance, Mary Remington had four and we only see two in this same view.

So how do you deal with one person who has six total relationships that have offspring?

The BBC opted for a vertical chart that uses colour to link the couples. Diego and his ex-wife receive a red line, and that link moves vertically down from Diego with the two daughters shown as descendants on the right.

Each subsequent relationship with offspring receives its own colour and continues to move vertically down the page, linking the mother on the left to the children on the right.

What I find interesting is the inconsistency within the chart, however. At the end, with the unidentified women, we have two instances of multiple children. Santiago Lara and Magali Gil, for example, descend from one stem. But note at the top how Diego’s two daughters Gianinna and Dalma each receive their own stem. Is there a reason for combining the two children from one unidentified mother into one branch?

And why the vertical format? You can see in my two examples, we are looking at a horizontal format. It works well when I am working on my desktop. The format is less useful on a mobile. I wonder if the BBC knows from their analytics that most people access their content like this via mobile phone and created a graphic that best uses that tall but narrow proportion. Because the proportions do not work well when the article is viewed on a desktop.

The vertical descendancy chart here is an intriguing solution to show descendants from multiple partners in a single mobile screen display. I am not sure how useful it would be as a new form, because I am not certain of how many times we would run into issues of children from six partners, but it could be worth exploring.

Credit for the images from my examples goes to the designers at Ancestry.com.

Credit for the BBC graphic goes to the graphics department of the BBC.

Covid-19 Update: 13 December

So as we begin to head into winter, where are we at with the spread of Covid-19 in the five states of Pennsylvania, New Jersey, Delaware, Virginia, and Illinois?

Nowhere good. Let’s take a look.

New cases curves for PA, NJ, DE, VA, & IL.

If you recall where we were at last week, also not great but better, cases had resumed rising post-Thanksgiving across the board. The data from yesterday indicates that cases have continued to rise everywhere but Illinois, which initiated a lockdown earlier than the other states we cover.

But Philadelphia did eventually institute a lockdown and eventually the rest of the Commonwealth followed, and similar measures—none of course as significant as those from the spring—were enacted in other states.

If you look at the very tippy tip top of the curves in the other four states, we might just be seeing an inflection point. That is, the curve of new cases could be slowing from its near exponential rate of increase. We should expect the numbers released today to be lower than average. Consequently we will want to see the numbers beginning Tuesday through the end of the week to see whether this slowdown is real or a blip.

Regardless of whether or not new cases numbers are slowing down, we have to contend with rising numbers of deaths. Deaths of course lag new cases by weeks, sometimes as many as 4–8. So if we hypothetically hit peak new cases today, we would expect the number of deaths to continue rising and then peak perhaps sometime in mid- to late-January.

So where are we with deaths today? Also nowhere good. Let’s take a look.

In all five states with the potential exception of Illinois, new deaths continue to rise. Pennsylvania, worryingly, will likely surpass the peak death rate it saw in the spring if current trends continue. I would expect that sometime likely this week.

Illinois remains the one state where we might be seeing some good news. As I just mentioned above, deaths lag new cases by several weeks. And several weeks ago we appear to have peaked there in terms of new cases. It’s possible that we are beginning to or have already seen peak deaths in Illinois and that the next several weeks could be a gradual decline as the state gets its outbreak under control.

In the other four states, if we were to hypothetically peak with new cases this week, again, we would likely see these orange lines continue heading upwards for several weeks to come. And in that case, we’d almost certainly pass the peak death rates of the spring in Pennsylvania, Delaware, and Virginia. New Jersey might be the exception to that, however. And that would be largely due to the fact that so many deaths there happened so early in the pandemic before we had identified the best ways to save lives.

I suspect that the data coming out this week will be important to inform us whether or not we have crested or begun to crest this latest wave of infections.

Credit for the piece is mine.