Rating Scale

This week is almost over and so instead of a graphic about unemployment numbers, let’s look at a piece from xkcd that provides us all with a new rating scale.

Because, let’s be honest, we all at some point are going to need to rate 2020 come December. And while we still have almost five months remaining, what are you thinking?

Credit for the piece goes to Randall Munroe.

Flood Stages of the Schuylkill

Hurricane Isaias ran up the East Coast of the United States then the Hudson River Valley before entering Canada. Before it left the US, however, it dumped some record-setting amounts of rain in Philadelphia and across the region. And in times of heavy rains, the lower-lying areas of the city (and suburbs like Upper Darby and Downingtown to mention a few) face inundation from swollen rivers and creeks. And in the city itself, the neighbourhood of Eastwick is partially built upon a floodplain. So staying atop river levels is important and the National Weather Service has been doing that for years.

The National Weather Service graphic above is from this very morning and represents the water level of the Schuylkill River (the historical Philadelphia was sited between two rivers, the more commonly known Delaware and its tributary the Schuylkill), which receives water from the suburbs to the north and west of the city, the area hardest hit by Isaias’ rainfall.

The chart looks at the recent as well as the forecast stages of the river. Not surprisingly, the arrival of Isaias accounts for the sudden rise in the blue line. But there is a lot going on here: yellows, reds, and purples; some kind of NOAA logo behind the chart; labels sitting directly on lines; and some type that is pixellated and difficult to read.

But it does do a nice job of showing the differences in observations and forecast points in time. By that I mean, a normal line chart has an equal distribution of observations along its length. There is an equal space between the weeks or the months or the years. But in instances like this, observations may not be continuous—imagine a flood destroying a sensor—or here that the forecasts are not as frequently produced as observations. And so these are all called out by the dots on the lines we see.

This is the chart I am accustomed to seeing. But then last night, reading about the damage I came across this graphic (screenshot also from this morning to compare to above) from the Philadelphia Inquirer.

It takes the same data and presents it in a cleaner, clearer fashion. The flood stages are far easier to read. Gone are the NOAA logo and the unnecessary vertical gridlines. The type is far more legible, and the less jarring palette puts the data series front and centre.

In general, this is a tremendous improvement for the legibility of the chart. I would probably use a different colour for the record flood stage line, or given their use of solid lines for the axis maybe make it dotted. But that’s a small quibble.

The only real issue here is what happens to the time? Compare the frequent observations in the past in the original, every half hour or so, to the six hourly dots (the blue versus the purple). In the Inquirer version, those spaces between forecast points disappear and become the same as the half-hour increments.

To be fair, the axis labelling implies this as the label goes from August 4 to 5 and then jumps all the way to 7, but it is not as intuitive as it could be. Here I would recommend following the National Weather Service’s fashion of adjusting for the time gap. It would probably mean some kind of design tweak to emphasise that the observations earlier than now are observed every half hour or so, versus the six-hour forecasts. The NWS did this through dots. One could use a dotted line, or some other design treatment.
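To make the missing time concrete, here is a minimal sketch of the two ways a chart can place the same points along the horizontal axis. The timestamps are hypothetical, not the actual gauge readings, and the function names are mine rather than anything the NWS or the Inquirer uses.

```python
from datetime import datetime, timedelta


def even_positions(timestamps):
    """Place every point one step apart, ignoring the real gaps in time."""
    return [float(i) for i in range(len(timestamps))]


def time_positions(timestamps):
    """Place every point by the hours elapsed since the first timestamp."""
    start = timestamps[0]
    return [(t - start).total_seconds() / 3600 for t in timestamps]


# Hypothetical series: four half-hourly observations, then three six-hourly forecasts.
start = datetime(2020, 8, 5, 6, 0)
observed = [start + timedelta(minutes=30 * i) for i in range(4)]
forecast = [observed[-1] + timedelta(hours=6 * i) for i in range(1, 4)]
series = observed + forecast

print(even_positions(series))  # every gap is drawn one unit wide
print(time_positions(series))  # forecast gaps are drawn twelve times wider
```

With even spacing, a six-hour forecast step takes up the same width as a half-hour observation step, which is roughly what the cleaner chart does. Spacing by elapsed time keeps the gap visible.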

This missing time is the only thing really holding back this piece from the Inquirer from standing out as a great update of the traditional National Weather Service hydrograph chart.

Credit for the National Weather Service piece goes to the National Weather Service.

Credit for the Inquirer piece goes to Dominique DeMoe.

The Covid Recession’s Continuing Impact on Youth

Earlier this week, some of the work my team does was published. We produced a one-page summary of a far larger and more comprehensive (relative to the scope of the summary) survey of consumers during the Covid Recession. I will spare you the details of recreating existing templates from scratch and the design decisions that went into that bit, neither insignificant nor unsubstantial, and rather focus on the one graphic we designed.

The broad thrust of the summary is that while overall we are beginning to see some job recovery, the recovery is uneven and, in fact, those below the age of 36 are getting hit pretty hard (my words, not the authors’). While in some industries the young are recovering in good numbers, in other industries, those with a larger share of the youth population, young people are still losing jobs. We then broke those top line numbers out by industry in the graphic below, captured by screenshot.

How different age groups in different industries are faring in the recession.

There are a couple of things from a design side to discuss. We had about two or three days from when we started the project to develop some ideas and then execute and produce the summary. And as I noted above, that also included quite a bit of time in emulating existing documents and building ourselves a new template should we need to do something similar in the future.

But for that graphic in particular, there’s one thing I wanted to highlight: the lack of values on the axis. The challenge here was that the data displayed is people not working. And when we compared this time period (Wave 3) to the earlier waves, we were looking for declines. And so if we were going to say that the 36+ are gaining construction jobs, that would be a -2% value, and the youth would be at about a -13% increase. If you are doing a bit of a double-take at a negative increase, so did the team. Ultimately, we used the data to generate the chart, but then opted for qualitative labelling on the axes. They simply point out that in one direction, youth are either gaining or losing jobs, and the same for the 36+. To reinforce this idea, we also added some descriptors in the far corner of each quadrant that said whether the age groups were gaining or losing jobs.
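As a rough sketch of that labelling logic, and not the process we actually used in production, the sign of the change in people not working can be turned into a plain-language descriptor. The function names and the exact wording of the labels are assumptions for illustration. The -13 and -2 are the construction figures mentioned above.

```python
def direction(change_in_not_working):
    """Translate a change in the share of people not working into words.

    A negative change means fewer people are not working, so jobs are being gained.
    """
    if change_in_not_working < 0:
        return "gaining jobs"
    if change_in_not_working > 0:
        return "losing jobs"
    return "no change"


def quadrant_label(youth_change, older_change):
    """Build a descriptor for one quadrant of the chart (hypothetical wording)."""
    return f"Under 36 {direction(youth_change)}, 36+ {direction(older_change)}"


# Construction, per the figures above: youth at about -13, the 36+ at about -2.
print(quadrant_label(-13, -2))
```

The reader never has to parse a negative increase. The numbers position the dot and the words carry the meaning.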

Despite the unusual design decisions I took in the graphic, I’m really proud of this piece especially given its tight turnaround. It shows in almost real-time how fractured the recovery—is this a recovery?—is at this point.

Credit for the piece goes to the team on this, Tom Akana, Kate Gamble, Natalie Spingler, and myself.

Big Bar Chart Better

Today isn’t a Friday, but I want to take a quick look at something that made me laugh aloud—literally LOL—whilst simultaneously cringe.

Not surprisingly it has to do with Trump and data/facts.

This all stems from an interview Axios’ Jonathan Swan conducted with President Trump on 28 July and that was released yesterday. I haven’t watched the interview in its entirety, but I’ve seen some excerpts. Including this gem.

It’s eerily reminiscent of a British show called The Thick of It, written by Armando Iannucci, or probably more accurately an interview out of one of his earlier works with Chris Morris, On the Hour or The Day Today. He later went on to create Veep for American audiences, based loosely on or inspired by The Thick of It, but I found it a weak substitute for the original. But I digress.

In that clip, the President talks about how he looks at the number of deaths as a share of cases, the case fatality rate, whilst Swan is discussing deaths as a share of total population, deaths per capita. Now the former is not a great data point to use, especially in the middle of the pandemic, because we’re not certain what the actual denominator, the true number of cases, is. I’ve discussed this before in some of my “this is not the flu” posts where the case fatality rate, sometimes more commonly called simply the mortality rate, was in the 3–5% range.
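The two metrics in that exchange share a numerator and differ only in their denominators. A minimal sketch, using entirely hypothetical figures rather than any country’s real data, shows why that matters.

```python
def case_fatality_rate(deaths, confirmed_cases):
    """Deaths as a share of confirmed cases."""
    return deaths / confirmed_cases


def deaths_per_100k(deaths, population):
    """Deaths as a share of total population, scaled to 100,000 people."""
    return deaths * 100_000 / population


# Hypothetical figures for illustration only.
deaths = 1_500
confirmed_cases = 50_000
population = 10_000_000

cfr = case_fatality_rate(deaths, confirmed_cases)
per_capita = deaths_per_100k(deaths, population)

# Suppose more testing finds twice as many cases while deaths stay the same.
cfr_more_testing = case_fatality_rate(deaths, confirmed_cases * 2)

print(f"Case fatality rate: {cfr:.1%}")
print(f"Deaths per 100,000 people: {per_capita:.1f}")
print(f"Case fatality rate with doubled confirmed cases: {cfr_more_testing:.1%}")
```

In this sketch, finding more cases halves the case fatality rate without a single life being saved, whilst the per capita figure does not move.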

Regardless of whether or not one should use the metric, here is how the President visualised that data.

2+2=5

Four big and beautiful bar charts. The best charts.

The President claims of the United States, “Look, we’re last. Meaning we’re first. We have the best. Take a look again, it’s cases [it’s actually still the case mortality rate]. And we have cases because of the testing.”

The problem is that, one, it’s the wrong metric. Two, the idea that testing creates cases is…insane. Three, the United States is last in that big set of bar charts. Why is every country a different colour? In the same data series, they should all be the same, unless you’re encoding a variable such as, say, region via colour. But with four data points, a bar chart taking up the entirety of a US-letter sized paper is grossly inefficient.

But that’s not even the full picture. Because if you look at a more robust data set, this one from Our World in Data, we get a better sense of where the United States sits.

2+2=4

Still not the highest on the chart, true. But even in this set, Norway (of not a shithole fame), India, South Korea, New Zealand, South Africa, and Congo all rank lower. The United States is far from last. And for those wondering, yes, I took the data from the same date as the interview.

There’s another clip within that clip I linked to earlier that deals with South Korea’s numbers and how the President says we “don’t know that”. And this is the bigger problem. We all know that data can be manipulated. But if we cannot agree that the data is real, we cannot have a framework for a real discourse on how to solve very real problems.

As someone who works with data to communicate information or stories on a near daily basis, this is just frightening. It’s as if you say to me, the sky is a beautiful shade of blue today without a cloud in the sky and I reply, no, I think it’s a foreboding sky with those heavy clouds of green with red polka dots. At that point we cannot even have a discussion about the weather.

And it’s only Tuesday.

Credit for the Trump graphic goes to somebody in the White House I assume.

Credit for the complete graphics goes to Our World in Data.

Covid-19 Update

As I mentioned last week, I am going to try using my blog here for the weekly update on the five states people have asked me to explore. And for the second week in a row, we are basically seeing numbers down compared to previous days. But given that numbers are generally lower on the weekends, that is not terribly surprising.

The real question is by Friday, will these numbers have rebounded?

The Covid-19 curves for PA, NJ, DE, VA, and IL
The Covid-19 death curves for PA, NJ, DE, VA, and IL.

Credit for these graphics is mine.

Habitable Zones Around Masses of Light and Heat

But those masses are campfires.

It’s Friday, everyone, and we’ve made it to the end of the week. And with the successful launch of Perseverance yesterday, this post from xkcd made a lot of sense. For those that don’t enjoy astronomy, stars have habitable zones, sometimes called Goldilocks zones, where orbiting planets would likely be neither too hot nor too cold for liquid water to exist on their surface. And since life as we presently know it requires water, it makes sense that these zones are where we focus our attention in studies of exoplanets.

Just generally not a fan of s’mores over here though.

Credit for the piece goes to Randall Munroe.

What Will the Next Recovery Look Like?

Earlier this morning, the Bureau of Economic Analysis released its US 2nd quarter GDP figures and the news…isn’t great. On an annualised basis, we saw -32.9% growth. That’s pretty bad. Like Great Depression level bad. I’ve posted on social media about how bad this current recession is and how nobody in the workforce today lived through the Great Depression, working or not, to really relate to the numbers we are seeing.
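A quick note on that annualised figure. An annualised rate takes the change in a single quarter and compounds it as if it continued for four quarters. A minimal sketch of the arithmetic, assuming that standard compounding convention, works backwards from -32.9% to the change in the quarter itself.

```python
def annualise(quarterly_change):
    """Compound one quarter's change over four quarters."""
    return (1 + quarterly_change) ** 4 - 1


def deannualise(annual_rate):
    """Recover the single-quarter change implied by an annualised rate."""
    return (1 + annual_rate) ** 0.25 - 1


quarterly = deannualise(-0.329)
print(f"Implied change in the quarter itself: {quarterly:.1%}")
print(f"Compounded back over four quarters: {annualise(quarterly):.1%}")
```

So the economy did not shrink by a third in three months. The quarter itself was closer to a tenth, which is still pretty bad.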

But that’s all today. The sun will come out tomorrow. (And scorch the Earth as climate change renders certain parts of the globe uninhabitable to mankind. But we’ll get to those posts in later weeks.) And when it does come out, eventually, what will the recovery look like? I’ve seen a few mentions recently in the media of a V-shaped recovery. What is this mysterious V-shape?

A long time ago, in a galaxy far away. Or during the last recession in Chicago, I worked with some really smart people in some of my professional projects and we covered the exact same question. There are a couple key “shapes” to an economic recovery. And when we say recovery, we mean just to return to pre-recession peak levels of growth. Anything above that is an expansion. That’s what we want to get back to.

What kind of shape will the recovery take?
Who knew typographers loved economics?

The V-shape we hear a lot about is a sharp recovery after the economy bottoms out (the trough). Broadly speaking, if a recession has to last two consecutive quarters (it doesn’t, but that’s a pretty common definition so let’s stick with it), then in a V-shape, we are talking about a recovery one or two quarters later.

Similar to the V is the W-shape, where things start to improve rapidly, but some kind of shock hits the economic system and things go negative once again before finally picking up quickly. It’s not hard to imagine something going horribly wrong with the Covid-19 pandemic being just that external shock that could push the economy back down again.

Similar still is the U-shape. Here, after hitting rock bottom, growth isn’t quite as quick to pick up as we linger in the depths of the valley of recession. But after a bit of time, we again see a rapid recovery to pre-recession levels of growth.

These are all pretty short-term recoveries, the W being a little bit longer because of its two sharp downturns. But they are nothing compared to what’s also possible.

First we have the L-shape. Here, after hitting bottom, things start to recover, but that recovery is slow and takes a long time. Growth remains slower than average, creeping up to average, and then still takes its time to reach pre-recession levels. Is something like this possible? Well, if vaccines fail and if some countries still can’t get their act together (cough, US, cough), the willingness of consumers to go out, eat, drink, buy things, travel, and generally make merry could be suppressed for a long time. So it’s certainly not out of the question.

And then lastly we have the UUUU-shape. Though you could probably add or subtract a U or two. This features more drawn out stays at the bottom of the valley with quick and sharp upticks in growth. But those growths, never reaching pre-recession levels, also collapse quickly back into declines, though also never really reaching the same depths as earlier. Essentially, the recovery faces multiple setbacks knocking the economy back down as it sputters to life. As with the L-shape, it’s also not hard to imagine a world where a country hasn’t managed to contain its outbreak struggling to get back on its feet.

What do you think? Are we at rock bottom? Did I miss a recovery type?

Credit for the graphic is mine.

All the Little Spacecraft

Early tomorrow morning, weather permitting, NASA’s Perseverance rover will blast off from Cape Canaveral on a six-plus month trip to Mars. There, hopefully it will land successfully and join all the rovers that have come before.

And so this piece from the New York Times feels appropriate. It’s a great illustration of all the spacecraft we have sent into space, including the active and inactive, with some notable exceptions.

What spacecraft are in orbit of Earth and headed to Mars.

I really like how it pays attention not just to the planets and their satellites (like the Moon), but also the comets, asteroids, and even the Lagrange points. And it does this all with small illustrations of the spacecraft.

Credit for the piece goes to Jonathan Corum.

African Descent in African Americans

A study published last week explores the long-lasting impact of the Atlantic triangle trade of slaves on the genetic makeup of present day African Americans. Genetic genealogy can break down many of what we genealogists call brick walls, where paper records and official documentation prevent researchers from moving any further back in time. In American research, slavery and its lack of records identifying specific individuals by name, birth, and place of origin prevents many descendants from tracing their ancestry beyond the 1860s or 50s.

But DNA doesn’t lie. And by comparing the source populations of present day African countries to the DNA of present day Americans (and others living in the Western hemisphere), we can glean a bit more insight into at least the rough places of origin for individuals’ ancestors. And so the BBC, which wrote an article about the study, created this map to show the average amount of African ancestry in people today.

Average amount of African genetic ancestry in present day populations of African descent

There is a lot to unpack from the study, and for those interested, you should read the full article. But what this graphic shows is that there is significant variation in the amount of African descent in African-[insert country here] ethnic groups. African-Brazilians, on average, have somewhere between 10–35% African DNA, whereas in Mexico that figure falls to 0–10%, but in parts of the United States it climbs upwards of 70–95%.

In a critique of the graphic itself, when I look at some of the data tables, I’m not sure the map’s borders are the best fit. For example, the data says “northern states” for the United States, but the map clearly shows outlines for individual states like New York, Pennsylvania, and New Jersey. In this case, a more accurate approach would be to lump those states into a single shape that doesn’t break down into the constituent polities. Otherwise, as in this case, it implies the value for that particular state falls within the range, when the data itself does not—and cannot because of the way the study was designed—support that conclusion.

Credit for the piece goes to the BBC graphics department.

Sunday’s Covid Numbers

I do not want this blog to become a permanent Covid-19 data site. So in my push to resume posting last week, I tried to keep from posting the numbers and instead focused on discussing how the data is displayed.

But I hear from quite a few people via comments, DMs, emails, and text messages that they find the graphics I produce helpful. So on the blog, I’m going to try posting just one set of graphics per week. Will it always be Monday? I don’t know. On the one hand, new week, new data. But on the other, weekend numbers tend to be lower than the rest of the week and could make it seem like, yay, the numbers are starting to go down especially if you only come to my blog and only see this data once a week.

Daily cases and their rolling average for Pennsylvania, New Jersey, Delaware, Virginia, and Illinois
Daily new cases
Daily new deaths and their rolling average for Pennsylvania, New Jersey, Delaware, Virginia, and Illinois.
Daily new deaths
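For anyone curious how a rolling average takes the edge off that weekend dip, here is a minimal sketch. The seven-day trailing window is an assumption for illustration, and the daily counts are hypothetical, not any state’s actual numbers.

```python
def rolling_mean(values, window=7):
    """Trailing mean over `window` days; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out


# Hypothetical daily new cases, Monday to Sunday for two weeks, with weekend dips.
daily = [100, 110, 120, 115, 125, 60, 50,
         105, 112, 118, 121, 130, 65, 55]

smoothed = rolling_mean(daily)
for day, (raw, avg) in enumerate(zip(daily, smoothed), start=1):
    shown = "n/a" if avg is None else f"{avg:.1f}"
    print(f"Day {day:2d}: raw {raw:3d}, 7-day average {shown}")
```

Because every window of seven days contains exactly one weekend, the low Saturday and Sunday counts stop dragging the line down on their own.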

So yeah, we’ll see how this goes. And I’ll try to keep Tuesday–Friday to discussing the world of data visualisation, although in these days, a good chunk of it will likely revolve around Covid.

Credit for these graphics is mine.