Big Bar Chart Better

Today isn’t a Friday, but I want to take a quick look at something that made me laugh aloud (literally LOL) and cringe at the same time.

Not surprisingly it has to do with Trump and data/facts.

This all stems from an interview Axios’ Jonathan Swan conducted with President Trump on 28 July and that was released yesterday. I haven’t watched the interview in its entirety, but I’ve seen some excerpts. Including this gem.

It’s eerily reminiscent of a British show called The Thick of It written by Armando Iannucci or, probably more accurately, an interview out of one of his earlier works with Chris Morris, On the Hour or The Day Today. He later went on to create Veep for American audiences, loosely based on or inspired by The Thick of It, but I found it a weak substitute for the original. But I digress.

In that clip, the President talks about how he looks at the number of deaths as a share of cases, the case fatality rate, whilst Swan is discussing deaths as a share of total population, deaths per capita. Now the former is not a great data point to use, especially in the middle of the pandemic, because we’re not certain what the actual denominator, the true number of cases, is. I’ve discussed this before in some of my “this is not the flu” posts where the case fatality rate, sometimes loosely called simply the mortality rate, was in the 3–5% range.
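To make the difference between the two metrics concrete, here is a minimal Python sketch. The figures are made-up round numbers for a hypothetical country, not data from the interview or from Our World in Data, and the function names are my own.

```python
def case_fatality_rate(deaths: int, confirmed_cases: int) -> float:
    """Deaths as a share of confirmed cases."""
    return deaths / confirmed_cases


def deaths_per_100k(deaths: int, population: int) -> float:
    """Deaths as a share of total population, scaled to 100,000 people."""
    return deaths / population * 100_000


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
deaths = 1_500
confirmed_cases = 50_000
population = 10_000_000

cfr = case_fatality_rate(deaths, confirmed_cases)
per_capita = deaths_per_100k(deaths, population)

# Suppose more testing finds twice as many (mostly mild) cases.
# The deaths have not changed, but the case fatality rate halves.
cfr_more_testing = case_fatality_rate(deaths, confirmed_cases * 2)

print(f"Case fatality rate: {cfr:.1%}")
print(f"Deaths per 100,000 people: {per_capita:.1f}")
print(f"Case fatality rate with double the confirmed cases: {cfr_more_testing:.1%}")
```

The population is a known quantity, so deaths per capita does not move when testing changes. The case count depends on how many people you test, and that is the uncertain denominator.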

Regardless of whether or not one should use the metric, here is how the President visualised that data.

2+2=5

Four big and beautiful bar charts. The best charts.

The President says of the United States, “Look, we’re last. Meaning we’re first. We have the best. Take a look again, it’s cases [it’s actually still the case mortality rate]. And we have cases because of the testing.”

The problem is that, one, it’s the wrong metric. Two, the idea that testing creates cases is…insane. Three, the United States is last only in that big set of bar charts. And why is every country a different colour? In the same data series, they should all be the same, unless you’re encoding a variable such as, say, region via colour. But with four data points, a bar chart taking up the entirety of a US-letter-sized sheet of paper is grossly inefficient.

But that’s not even the full picture. Because if you look at a more robust data set, this one from Our World in Data, we get a better sense of where the United States sits.

2+2=4

Still not the highest on the chart, true. But even in this set, Norway (of not a shithole fame), India, South Korea, New Zealand, South Africa, and Congo all rank lower. The United States is far from last. And for those wondering, yes, I took the data from the same date as the interview.

There’s another clip within that clip I linked to earlier that deals with South Korea’s numbers and how the President says we “don’t know that”. And this is the bigger problem. We all know that data can be manipulated. But if we cannot agree that the data is real, we cannot have a framework for a real discourse on how to solve very real problems.

As someone who works with data to communicate information or stories on a near daily basis, this is just frightening. It’s as if you say to me, the sky is a beautiful shade of blue today without a cloud in the sky and I reply, no, I think it’s a foreboding sky with those heavy clouds of green with red polka dots. At that point we cannot even have a discussion about the weather.

And it’s only Tuesday.

Credit for the Trump graphic goes to somebody in the White House I assume.

Credit for the complete graphics goes to Our World in Data.

Covid-19 Update

As I mentioned last week, I am going to try using my blog here for the weekly update on the five states people have asked me to explore. And for the second week in a row, we are basically seeing numbers down compared to previous days. But given that numbers are generally lower on the weekends, that is not terribly surprising.

The real question is by Friday, will these numbers have rebounded?

The Covid-19 curves for PA, NJ, DE, VA, and IL
The Covid-19 death curves for PA, NJ, DE, VA, and IL.

Credit for these graphics is mine.

Habitable Zones Around Masses of Light and Heat

But those masses are campfires.

It’s Friday, everyone, and we’ve made it to the end of the week. And with the successful launch of Perseverance yesterday, this post from xkcd made a lot of sense. For those that don’t enjoy astronomy, basically stars have habitable zones, sometimes called Goldilocks zones, where orbiting planets would likely be neither too hot nor too cold for liquid water to exist on their surface. And since life as we presently know it requires water, it makes sense that these zones are where we focus our attention in studies of exoplanets.

Just generally not a fan of s’mores over here though.

Credit for the piece goes to Randall Munroe.

What Will the Next Recovery Look Like?

Earlier this morning, the Bureau of Economic Analysis released its US 2nd quarter GDP figures and the news…isn’t great. On an annualised basis, we saw -32.9% growth. That’s pretty bad. Like Great Depression level bad. I’ve posted on the social media how bad this current recession is and how nobody in the workforce today lived through the Great Depression, working or not, to really relate to the numbers we are seeing.
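For those wondering what “annualised” means here, the sketch below shows the arithmetic under the usual compounding convention: take the quarter-on-quarter change and assume it repeats for four quarters. The function names are my own, and the only input is the -32.9% figure above; the quarterly change is derived from it, not quoted from the release.

```python
def annualise(quarterly_change: float) -> float:
    """Compound a quarter-on-quarter change over four quarters."""
    return (1 + quarterly_change) ** 4 - 1


def de_annualise(annual_rate: float) -> float:
    """Recover the quarter-on-quarter change implied by an annualised rate."""
    return (1 + annual_rate) ** 0.25 - 1


annualised = -0.329  # the annualised figure quoted above
quarterly = de_annualise(annualised)

print(f"Implied change in the quarter itself: {quarterly:.1%}")
print(f"Compounded back over four quarters: {annualise(quarterly):.1%}")
```

In other words, the economy did not shrink by a third in three months. The implied drop within the quarter is a bit under a tenth, which is still historically awful.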

But that’s all today. The sun will come out tomorrow. (And scorch the Earth as climate change renders certain parts of the globe uninhabitable to mankind. But we’ll get to those posts in later weeks.) And when it does come out, eventually, what will the recovery look like? I’ve seen a few mentions recently in the media of a V-shaped recovery. What is this mysterious V-shape?

A long time ago, in a galaxy far away. Or during the last recession in Chicago, I worked with some really smart people in some of my professional projects and we covered the exact same question. There are a couple of key “shapes” to an economic recovery. And when we say recovery, we mean just to return to pre-recession peak levels of growth. Anything above that is an expansion. That’s what we want to get back to.

What kind of shape will the recovery take?
Who knew typographers loved economics?

The V-shape we hear a lot about is a sharp recovery after the economy bottoms out (the trough). Broadly speaking, if a recession has to last two consecutive quarters (it doesn’t, but that’s a pretty common definition so let’s stick with it), then in a V-shape, we are talking about a recovery one or two quarters later.

Similar to the V is the W-shape, where things start to improve rapidly, but some kind of shock hits the economic system and things go back negative once again before finally picking up quickly. It’s not hard to imagine something going horribly wrong with the Covid-19 pandemic to be just that external shock that could push the economy back down again.

Similar still is the U-shape. Here, after hitting rock bottom, growth isn’t quite as quick to pick up as we linger in the depths of the valley of recession. But after a bit of time, we again see a rapid recovery to pre-recession levels of growth.

These are all pretty short-term recoveries, the W being a little bit longer because of its two sharp downturns. But they are nothing compared to what’s also possible.

First we have the L-shape. Here, after hitting bottom, things start to recover right away. But that recovery is slow and takes a long time. Growth remains slower than average, creeping up to average, and then still takes its time to reach pre-recession levels. Is something like this possible? Well, if vaccines fail and if some countries still can’t get their act together (cough, US, cough), the willingness of consumers to go out, eat, drink, buy things, travel, and generally make merry could be suppressed for a long time. So it’s certainly not out of the question.

And then lastly we have the UUUU-shape. Though you could probably add or subtract a U or two. This features more drawn out stays at the bottom of the valley with quick and sharp upticks in growth. But those spurts of growth, never reaching pre-recession levels, also collapse quickly back into declines, though also never really reaching the same depths as earlier. Essentially, the recovery faces multiple setbacks knocking the economy back down as it sputters to life. As with the L-shape, it’s also not hard to imagine a world where a country that hasn’t managed to contain its outbreak struggles to get back on its feet.

What do you think? Are we at rock bottom? Did I miss a recovery type?

Credit for the graphic is mine.

All the Little Spacecraft

Early tomorrow morning, weather permitting, NASA’s Perseverance rover will blast off from Cape Canaveral on a six-plus month trip to Mars. There, hopefully it will land successfully and join all the rovers that have come before.

And so this piece from the New York Times feels appropriate. It’s a great illustration of all the spacecraft we have sent into space, including the active and inactive, with some notable exceptions.

What spacecraft are in orbit of Earth and headed to Mars.

I really like how it pays attention not just to the planets and their satellites (like the Moon), but also the comets, asteroids, and even the Lagrange points. And it does this all with small illustrations of the spacecraft.

Credit for the piece goes to Jonathan Corum.

African Descent in African Americans

A study published last week explores the long-lasting impact of the Atlantic triangle trade of slaves on the genetic makeup of present day African Americans. Genetic genealogy can break down many of what we genealogists call brick walls, where gaps in paper records and official documentation prevent researchers from moving any further back in time. In American research, slavery and its lack of records identifying specific individuals by name, birth, and place of origin prevents many descendants from tracing their ancestry beyond the 1860s or 1850s.

But DNA doesn’t lie. And by comparing the source populations of present day African countries to the DNA of present day Americans (and others living in the Western hemisphere), we can glean a bit more insight into at least the rough places of origin for individuals’ ancestors. And so the BBC, which wrote an article about the study, created this map to show the average amount of African ancestry in people today.

Average amount of African genetic ancestry in present day populations of African descent

There is a lot to unpack from the study, and for those interested, you should read the full article. But what this graphic shows is that there is significant variation in the amount of African descent in African-[insert country here] ethnic groups. African-Brazilians, on average, have somewhere between 10–35% African DNA, whereas in Mexico that figure falls to 0–10%, but in parts of the United States it climbs to 70–95%.

In a critique of the graphic itself, when I look at some of the data tables, I’m not sure the map’s borders are the best fit. For example, the data says “northern states” for the United States, but the map clearly shows outlines for individual states like New York, Pennsylvania, and New Jersey. In this case, a more accurate approach would be to lump those states into a single shape that doesn’t break down into the constituent polities. Otherwise, as in this case, it implies the value for that particular state falls within the range, when the data itself does not—and cannot because of the way the study was designed—support that conclusion.

Credit for the piece goes to the BBC graphics department.

Sunday’s Covid Numbers

I do not want this blog to become a permanent Covid-19 data site. So in my push to resume posting last week, I tried to keep from posting the numbers and instead focused on discussing how the data is displayed.

But I hear from quite a few people via comments, DMs, emails, and text messages that they find the graphics I produce helpful. So on the blog, I’m going to try posting just one set of graphics per week. Will it always be Monday? I don’t know. On the one hand, new week, new data. But on the other, weekend numbers tend to be lower than the rest of the week and could make it seem like, yay, the numbers are starting to go down, especially if you only come to my blog and only see this data once a week.
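That weekend dip is the reason the charts pair the daily numbers with a rolling average. Here is a minimal sketch of the idea in Python. The seven-day window is an assumption on my part for this example, and the daily counts are invented to mimic a weekday and weekend pattern, not real figures for any state.

```python
from collections import deque


def rolling_average(values, window=7):
    """Trailing mean of the last `window` values; None until the window fills."""
    recent = deque(maxlen=window)  # the oldest value drops off automatically
    averages = []
    for value in values:
        recent.append(value)
        if len(recent) == window:
            averages.append(sum(recent) / window)
        else:
            averages.append(None)
    return averages


# Hypothetical daily new cases for two weeks, Monday through Sunday.
daily_cases = [900, 950, 1000, 980, 1020, 600, 500,
               910, 940, 990, 1000, 1010, 620, 510]

averages = rolling_average(daily_cases)

print(f"Second Sunday, raw count: {daily_cases[-1]}")
print(f"First Sunday, 7-day average: {averages[6]:.1f}")
print(f"Second Sunday, 7-day average: {averages[-1]:.1f}")
```

Looked at alone, the second Sunday's raw count is roughly half of what was reported a few days earlier. The average, which always contains exactly one of each weekday, barely moves and in this made-up data ticks slightly up.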

Daily cases and their rolling average for Pennsylvania, New Jersey, Delaware, Virginia, and Illinois
Daily new cases
Daily new deaths and their rolling average for Pennsylvania, New Jersey, Delaware, Virginia, and Illinois.
Daily new deaths

So yeah, we’ll see how this goes. And I’ll try to keep Tuesday–Friday to discussing the world of data visualisation, although in these days, a good chunk of it will likely revolve around Covid.

Credit for these graphics is mine.

Corona’s Moment in Your Life

Earlier this week I was on the social medias when I came across a graphic some people were sharing that was meant to be inspirational. It had a giant circle and then a small black pixel that represented “this moment”. Of course, how you define the moment is entirely subjective.

But it made me wonder, if we looked at the coronavirus Covid-19 pandemic as a moment in our lives, how big of a moment is it? Well, I went to the CDC to get a sense of the average life expectancy of an American and then I got the fraction of that lifespan that is the last six months. And, well, take a look.

Corona in your lifespan
A not so insignificant span of time
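For the curious, the arithmetic behind a graphic like this is short. The sketch below uses an assumed life expectancy of roughly 78.6 years, which is a round stand-in and not necessarily the exact CDC value used for the graphic, and a hypothetical 100 by 100 pixel grid to stand in for a lifetime.

```python
# Assumption: US life expectancy at birth of roughly 78.6 years.
# This is a stand-in value, not necessarily the figure behind the graphic.
LIFE_EXPECTANCY_YEARS = 78.6
PANDEMIC_YEARS = 0.5  # the last six months

share_of_life = PANDEMIC_YEARS / LIFE_EXPECTANCY_YEARS

# Hypothetical canvas: if a whole lifetime were a 100 x 100 grid of pixels,
# how many of those pixels would the pandemic cover so far?
GRID_PIXELS = 100 * 100
pandemic_pixels = share_of_life * GRID_PIXELS

print(f"Share of an average lifespan: {share_of_life:.2%}")
print(f"Pixels out of {GRID_PIXELS:,}: {pandemic_pixels:.0f}")
```

Under those assumptions six months is a bit more than half a percent of a lifetime, or a few dozen pixels on that grid and not one.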

As you can see, the Covid-19 pandemic is more than just a pixel. It’s a significant moment, and of course the pandemic is ongoing. There are new concerns that the 2020 Olympics, now postponed to 2021, may not happen in 2021.

That dot represents graduations, weddings, funerals, birthdays, anniversaries, holidays, opportunities for education, career advancement, life goals all delayed or in some cases missed and never to return.

And while the rest of the world shows some signs of improvement, for my American audience, things are going from bad to worse.

So Happy Friday, everyone.

The Vaxx Path

Today we look at a wee graphic from the BBC examining the current state of Covid-19 vaccines. None have been approved, but 163 are on the path to approval.

The vaxx path

This falls into the category of not everything has to be super complex. Each vaccine is shown as a discrete unit, a small square. For me in this instance this works better than a bar chart showing the total number per each phase. It highlights how each vaccine is a distinct unit and that it can move from one section down to the next. (Although I suppose if it fails a phase it can also be removed entirely.)

And if you want another reason why a nationalist, isolationist foreign policy that bashes foreign countries is not great…none of the Phase 3 candidates, closest to approval, are from an American company or institution.

Credit for the piece goes to the BBC graphics department.

Red Sox Starting Rotation: A Dumpster Fire in a Dumpster Fire Year

Baseball for the Red Sox starts on Friday. Am I glad baseball is back? Yes?

I love the sport and will be glad that it’s back on the air to give me something to watch. But the way it’s being done boggles the mind. Here today I don’t want to get into the Covid, health, and labour relations aspect of the game. But, as the title suggests, I want to look at a graphic that looks at just how bad the Red Sox could be this (shortened) year. And over at FiveThirtyEight, they created a model to evaluate teams’ starting rotations on an ongoing basis.

The Red Sox are just bad.
Look at the Red Sox, one of the worst in baseball.

Form wise, this isn’t too different from what we looked at yesterday. It’s a dot plot with the dots representing individual pitchers. The size of the dots represents their number of total starts. This is an important metric in their model, but as we all know size is a difficult attribute for people to compare and I’m not entirely convinced it’s working here. Some dots are clearly smaller than others, but for most it’s difficult for me to clearly tell.

Colour is just tied to the colour of the teams. Necessary? Not at all. Because the teams are not compared on the same plot, they could all be the same colour. If, however, an eventual addition were made that plotted the day’s matchups on one line, then colour would be very much appropriate.

I like the subtle addition of “Better” at the top of the plots to help the user understand the constructed metric. Otherwise the numbers are just that, numbers that don’t mean anything.

Overall a solid piece. And it does a great job of showing just how awful the Red Sox starting rotation is going to be. Because I know who Nate Eovaldi is. And I’ve heard of Martin Perez. Ryan Weber I only know through his largely pitching in relief last year. And after that? Well, not on this graphic, but we have Eduardo Rodriguez who had corona and, while he has recovered, nobody knows how that will impact people in sports. There’s somebody named Hall who I have never heard of. Then we have Brian Johnson, a root-for-the-guy story of beating the odds to reach the Major Leagues but who has been inconsistent. Then…it is literally a list of relief pitchers.

We dumped the salary of Mookie Betts and David Price and all we got was basically a tee-shirt saying “We still need a pitcher or three”.

Credit for the piece goes to Jay Boice.