What Is Infrastructure?

This morning I read a piece in Politico Playbook that broke down President Biden’s $2.25 trillion proposal for infrastructure spending, something the United States is generally regarded as sorely needing. $2.25 trillion is a lot of money, and it’s a fair question to ask whether all that money is really money for infrastructure.

Because, it turns out, it’s not.

Please, sir, may I have more train money?

That isn’t to say money spent on job retraining or home care services wouldn’t be money well spent. Rather, it’s just not infrastructure.

But politics and the English language is a topic for another day. Oh wait, somebody already did write about that.

Credit for the piece is mine.

Discontinuous Lead Bars

Last week the Guardian published an article about drinking water pollution across the United States. Overall, it was a nicely done piece and the graphics within segmented the longer text into discrete sections. Each unit looks similar:


The left focuses on a definition and provides contextual information. It includes small illustrations of the mechanisms by which the pollutant enters the water system. To the right is a chart showing the levels of the contamination detected in the 120 tests the Guardian (and its partner Consumer Reports) conducted.

In almost all of the charts, we see the maximum depicted on the y-axis. And the bars are coloured if that observation station exceeds the health and safety limits. (The limit is represented by the dotted line.)

But towards the end of the piece we get to lead, a particularly problematic pollutant. There is no safe level of lead contamination. But how the piece handles the lead chart leaves a bit to be desired.

But how bad is it, really?

The first thing is colour, but that’s okay. Everything is red, but again, there is no safe level of lead so everything is over the limit. But look at the y-axis. That little black line at the top indicates a discontinuity in the bars. In other words, the values for those three observations are literally off the chart.

But does that work?

First, this kind of thing happens all the time. If you ever have to work with data on either China or India, you’ll often find those two nations, due to their sheer demographic size, skew datasets that involve people. But in these kinds of situations, how do we handle off-the-charts data points?

There is a value to including those points. It can show how extreme of an outlier those observations truly are. In other words, it can help with data transparency, i.e. you’re not trying to hide data points that don’t fit the narrative with which you’re working.

In this piece, it’s never explicitly stated what the largest value in the data set is, but I interpret it as being 5.8. So what happens if we make a quick chart showing a value of 6 (because it’s easier than 5.8)? I added a blue bar to distinguish it from the rest of the chart.

It’s pretty bad.

You can see that including the data point drastically changes how the chart looks. The number falls well outside the graphic, but it also shows just how dangerously high that one observation truly is.

But if you say, well yeah, but that falls outside the box allowed by the webpage, you’re correct. There are ways it could be handled to sit outside the “box”, but that would require some extra clever bits. And this isn’t a print layout where it’s much easier to play with placement. So what happens when we resize that graphic to fit within its container?

And resized

You can see that all the other bars become quite small. And this is probably why the designers chose to break the chart in the first place. But as we’ve established, in doing so they’ve minimised the danger of those few off-the-charts sites as well as left off context that shows how, for the vast majority of sites, the situation is not nearly as dire, though, again, no lead is good lead.
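To put rough numbers on that trade-off, here is a minimal Python sketch. The readings and the 300-pixel chart height are invented for illustration (only the stand-in value of 6 comes from my quick chart above), and `bar_heights` is a hypothetical helper, not anything the Guardian used.

```python
def bar_heights(values, axis_max, chart_px):
    """Scale values to pixel heights on a linear axis, capping bars at the chart top."""
    return [min(v, axis_max) / axis_max * chart_px for v in values]


# Invented readings in ppb; 6.0 stands in for the largest observation.
readings = [0.1, 0.25, 0.4, 6.0]
CHART_PX = 300  # assumed height of the chart container

broken = bar_heights(readings, axis_max=0.5, chart_px=CHART_PX)  # axis broken at 0.5
full = bar_heights(readings, axis_max=6.0, chart_px=CHART_PX)    # axis runs to the outlier

for value, b, f in zip(readings, broken, full):
    print(f"{value:>5} ppb  broken axis: {b:4.0f}px  full axis: {f:4.0f}px")
```

With these made-up numbers, a 0.4 ppb site stands 240 pixels tall on the broken axis but only about 20 pixels once the axis stretches to fit the outlier.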

What else could have been done? If maintaining the height of the less affected bars was paramount, the designers had a few other options they could have used. First, you could exclude those observations and perhaps put a line below the 118 text that says “for three sites, the data was off the charts and we’ve excluded them from the set below.”

I have used that approach in the past, but I use it with great reluctance. You are removing important outliers from the data set and the set is not complete without them. After all, if you are looking to use this data set to inform a policy choice such as, which communities should receive emergency funding to reduce lead levels, I’d want to start with the city in blue. Sure, I would like everyone to get money, but we’d have to prioritise resources.

I think the best compromise here would have actually been a small tweak to the original. Above the three bars that are broken (or perhaps to the right with some labelling), label the discontinuous data points to provide clearer context to the vast majority of the sites, which are below 0.5 ppb.

As easy as ABC

This preserves the ability to easily compare the lower level observations, but provides important context of where they sit within the overall data set by maintaining the upper limits of the worst offenders.
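A rough sketch of that labelling tweak in Python, purely as an illustration: the readings are invented apart from my 5.8 interpretation of the maximum, the 0.5 ppb axis cap is an assumption, and `clip_with_labels` is a hypothetical helper.

```python
def clip_with_labels(values, axis_max):
    """Return (drawn_value, label) pairs; the label is None unless the bar is broken."""
    result = []
    for v in values:
        if v > axis_max:
            # Draw the bar only to the top of the axis, but keep the true value as a label.
            result.append((axis_max, f"{v} ppb"))
        else:
            result.append((v, None))
    return result


readings = [0.1, 0.4, 5.8]  # invented, except 5.8 as my reading of the maximum
for drawn, label in clip_with_labels(readings, axis_max=0.5):
    print(drawn, label or "")
```

The low bars keep their full height for comparison, and the broken ones carry their real value alongside them.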

Credit for the piece goes to the Guardian’s graphics department.

Covid Update: 4 April

Last week I wrote about how the inevitable rise in new Covid-19 cases was occurring in Pennsylvania, New Jersey, Delaware, Virginia, and Illinois. Now, one, in the last week, we saw no evidence of states preparing to reinforce their public health and safety restrictions. And two, whilst we have no data on people not following guidelines, anecdotally a large group of people threw a party in my building’s common amenities space so it does seem like people are feeling less inclined to wear masks, socially distance, and isolate to their own households.

Those two conditions, of course, do not help reduce the case count. Instead they add to it. So it should come as no surprise that Covid-19 continues to rapidly spread in our five states, though some are doing worse than others.

New case curves for PA, NJ, DE, VA, & IL.

New Jersey and Pennsylvania arguably performed the worst. If we look at the peak-to-trough decline from early winter’s surge to late winter’s nadir, we can see that New Jersey has reached 40% of that peak. Pennsylvania enjoyed a better decline and so has a larger gap, but is still nearing 20% of its previous peak.
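For anyone who wants to reproduce that kind of comparison, a minimal sketch follows. The series is invented (it is not the New Jersey or Pennsylvania data), and both function names are my own hypothetical helpers.

```python
def share_of_peak(series):
    """Latest value as a fraction of the highest value in the series."""
    return series[-1] / max(series)


def rebound_from_trough(series):
    """Fraction of the peak-to-trough decline that has been given back since the trough."""
    peak_idx = series.index(max(series))
    peak = series[peak_idx]
    trough = min(series[peak_idx:])  # lowest point at or after the peak
    if peak == trough:
        return 0.0  # no decline yet, so nothing to rebound from
    return (series[-1] - trough) / (peak - trough)


# Invented seven-day averages: surge, decline, then a fresh rise.
averages = [1000, 6000, 2000, 2400]
print(f"share of peak: {share_of_peak(averages):.0%}")
print(f"rebound from trough: {rebound_from_trough(averages):.0%}")
```

The two measures answer different questions: the first says how close we are to the old peak, the second how much of the hard-won decline has been undone.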

Illinois is also remarkable, again not in a good way, as its peak-to-trough fall was even greater than Pennsylvania’s, yet it’s also now clearly rising. The Land of Lincoln did manage to reach late summer levels of new cases (good). But those are now rising (bad). Delaware too is seeing a rise, albeit at a slower rate than its two tristate neighbours.

Only Virginia’s rise remains slight, barely discernible in the chart.

Deaths, while not exactly bad news, aren’t exactly good news either. Last week I mentioned how they had stalled out and stopped declining. That is better than rising death rates, but the level of deaths per day is still higher than we saw last summer. In other words, things could be significantly better even in pandemic terms.

Death curves for PA, NJ, DE, VA, & IL.

Last week? Deaths continued to stubbornly persist at those elevated levels. We remain vigilant, looking for any indication that deaths will follow the rates of new cases and hospitalisations and begin to climb.

The hope, of course, is that we have vaccinated enough of the most at risk populations to prevent a surge in deaths. But, we just don’t know yet. The only good news is that vaccinations continue to progress.

Vaccination curves for PA, VA, & IL.

Illinois has surpassed 18% of its population being fully vaccinated. Virginia is not far behind at 17.75%. Pennsylvania, because of the bifurcated nature of its data reporting, remains unclear. It sits at 17.8% fully vaccinated, but Philadelphia has not posted updated data since late Thursday. It’s likely that the Commonwealth has joined Illinois in surpassing 18%, but it’s not fully certain.

Also this past week, the CDC updated its guidance for the fully vaccinated, saying that it was safe for them to travel. I take some issue with this, primarily on the messaging front.

First, we need to be clear about what fully vaccinated means. It means two weeks after your final dose. For Johnson & Johnson’s vaccine, that means two weeks after your shot as you only receive one. For both Pfizer and Moderna, you are only fully vaccinated two weeks after your second shot—not before. And keep in mind with Pfizer you need to wait three weeks between first and second dose. With Moderna it’s four weeks. In other words, with J&J you need to wait two weeks after your first (and only) shot before you can begin to follow the loosened guidelines. If you receive Pfizer’s, you need to wait five weeks from your first shot, assuming you do receive your second three weeks later, and with Moderna it’s six weeks, again assuming the recommended four week gap.
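The arithmetic above can be sketched in a few lines of Python. The dose gaps and the two-week wait are the ones described in the paragraph; the first-shot date is a made-up example and `fully_vaccinated_on` is a hypothetical helper.

```python
from datetime import date, timedelta

# Weeks between first and second dose, per the text; J&J is a single shot.
DOSE_GAP_WEEKS = {"J&J": 0, "Pfizer": 3, "Moderna": 4}
WAIT_AFTER_FINAL_DOSE_WEEKS = 2


def fully_vaccinated_on(vaccine, first_shot):
    """Earliest date someone counts as fully vaccinated, assuming doses arrive on schedule."""
    weeks = DOSE_GAP_WEEKS[vaccine] + WAIT_AFTER_FINAL_DOSE_WEEKS
    return first_shot + timedelta(weeks=weeks)


first_shot = date(2021, 4, 5)  # hypothetical first-shot date
for vaccine in DOSE_GAP_WEEKS:
    print(vaccine, fully_vaccinated_on(vaccine, first_shot))
```

If the second dose slips past the recommended gap, the date moves out by the same amount.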

The problem is that only about 20% of the US population is fully vaccinated. And with the virus spreading at high rates and at high levels, it poses a significant risk as the newer, more lethal, and more infectious variants could take root in the United States and overwhelm the healthcare systems of the 50 states. We do not yet know if fully vaccinated people can spread the virus if they do become infected.

I think the advice should have remained to refrain from all but essential travel until we reached a high percentage of fully vaccinated folks. I ballparked earlier this week something like 2/3 the estimated amount of full vaccinations required for herd immunity (est. at 75%). In other words, keeping restrictions on travel until at least 50% of the US becomes fully vaccinated.

We remain several weeks away from that milestone, unfortunately. I understand the desire/urge people have to get out and do things and enjoy spring after a year of isolation. Sadly, if winter was the darkest/hardest part of the pandemic, I think that makes spring and early summer the most challenging. Because we see progress, we see the light at the end of the tunnel, and it coincides with warmer weather and we want nothing more than to get out and do things and see people. But that is the last thing we need to be doing at this point.

I’ve often described the vaccination as the marshmallow test. In a study, scientists presented kids with a marshmallow. They could eat the marshmallow immediately, but if they waited 15 minutes, unsupervised, they could then have an additional marshmallow. We are all just grabbing that first marshmallow whilst the promise of a more normal summer is ours if we can wait just 15 minutes.

Credit for the piece is mine.

Covid Update: 29 March

Two weeks ago I wrote about how new cases in the states of Pennsylvania, New Jersey, Delaware, Virginia, and Illinois were stalling out, i.e. no longer declining. Additionally, with the exception of Illinois, they were stalling at rates far higher than what we saw last summer. I wrote:

This means that the environment is ripe for a new surge of cases if people stop following social distancing and begin resuming indoor activities with other people. Sadly, both those things appear to be occurring throughout the US.

Two weeks later, the inevitable occurred.

New cases are now rising in all five states. I wrote about the flat tails of the curves for the seven-day averages. A quick look at the chart shows those have swung upwards, in some cases sharply.
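For reference, a seven-day average of the kind plotted in these charts can be computed with a sketch like the one below. The daily counts are invented, and `seven_day_average` is a hypothetical helper rather than the exact code behind my charts.

```python
def seven_day_average(daily):
    """Trailing seven-day mean; the first six days have no average yet."""
    return [sum(daily[i - 6 : i + 1]) / 7 for i in range(6, len(daily))]


# Invented daily new-case counts, including a weekend dip in reporting.
daily_cases = [900, 950, 1000, 980, 1020, 600, 550, 1100, 1150]
print(seven_day_average(daily_cases))
```

Averaging over a full week smooths out the lower weekend reporting, which is why the tail of the averaged line is a better signal than any single day.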

New case curves in PA, NJ, DE, VA, & IL.

Two weeks ago I referenced Europe as a cautionary tale. Governments there eased up on their restrictions, cases surged, and then as hospitalisations rose, governments had to reimpose restrictions and effect new lockdowns. Europe has typically been 3–4 weeks ahead of us throughout the pandemic. So now that we are seeing rising cases, absolutely none of this should be surprising.

The evidence has been in our faces for weeks, plus we have the European example to look at. Reopening makes no sense until we can get case numbers lower, especially with new more virulent and lethal strains of coronavirus now circulating.

Deaths too have been trending the wrong way over the last few weeks.

Death curves for PA, NJ, DE, VA, & IL.

We have seen the curves largely bottom out. And if you look closely, these bottoms are higher than the rates we saw last summer, in some cases more than 3× as much. This flattening occurred just a few weeks after cases began to flatten. The question becomes, will they rise in a few weeks’ time? Or have we vaccinated enough of our most vulnerable populations?

That’s the real wildcard.

Right now, we have only fully vaccinated about 15% of the populations of Pennsylvania, Virginia, and Illinois.

Vaccination curves for PA, VA, & IL.

Is that enough to prevent hospitalisations and deaths in what looks like it will be a fourth wave?

Credit for the piece is mine.

Covid Update: 14 March

Last week I wrote about how our progress in dealing with Covid-19 was stagnating. To put it simply, this past week did not get any better on that front.

New case curves for PA, NJ, DE, VA, & IL.

In Pennsylvania, Delaware, and Illinois we see that the flattened tail I described last week, well, remained a flattened tail. In Delaware, we see more movement, but the average of the average, if you will, is flat over the last two weeks. And in New Jersey, where I mentioned some signs of rising numbers, we see a clearly rising number of new cases over the last week. Only in Virginia are numbers heading down, and those are shallowing out.

The problem here is that in Pennsylvania and Delaware, the new case rate, whilst flat, is well above the summer rate of low transmission. This means that the environment is ripe for a new surge of cases if people stop following social distancing and begin resuming indoor activities with other people. Sadly, both those things appear to be occurring throughout the US.

In Europe we see a cautionary tale. They too saw their holiday peaks decline and the national governments began easing restrictions on their populations. Within the last several days, however, new cases have begun to surge. Italy has gone so far as to announce a new lockdown. Other governments are considering the same.

If the United States cannot resume pushing its numbers of new cases down, it could well follow Europe into a new wave of outbreaks that would threaten lockdowns and push back our eventual return to normalcy.

None of this would be an issue if vaccinations were nearing herd immunity levels. However, in the states we cover, nowhere is above 12% fully vaccinated.

Vaccination curves for PA, VA, & IL.

Pennsylvania now lags behind the other two states. But at least the Commonwealth is over 10% fully vaccinated.

And of course, the problem under this dire scenario is that deaths could rise once again, though at this point the most vulnerable are in the middle of being vaccinated. Indeed, if we look at the last week, we see the good news for the week, that deaths are headed down in all five states.

Death curves for PA, NJ, DE, VA, & IL.

Previously, Virginia had been working through a backlog of death records, but those appear now cleared. We are not quite back to summer-level lows, but we are steadily approaching them.

The big question this week will be what happens to those new case numbers. Today’s data, Monday, will likely show lower numbers because of lower testing on the weekend. But starting Tuesday, what do we see over the course of the next five days?

Credit for the piece is mine.

Making America Save Again

For years, one issue with the American economy had been that we did not save enough. It’s understandable, as it’s hard to keep up with the image of the carefree American without profligate spending. But that’s also not great long-term. But thanks to Covid-19, we’ve now swung to the other side of the spectrum: Americans may be saving too much.

Saying that sounds callous to the devastation the pandemic has wrought upon large swathes of the economy. But it’s true in the aggregate as this New York Times piece explains. In particular, the authors highlight one example. Consider a corporate CEO who earned a $100,000 bonus for keeping the company he runs afloat during the recession. He adds $100k to the aggregate American income. But at a restaurant shuttered by the pandemic, owners lay off a hostess, a server, a bartender, and a dishwasher, each earning $25,000. Their collective lost income is $100,000 and so balances out that one CEO. And as CEOs are more able to work remotely than servers, it’s not hard to see how the upper-income earning cohorts of the economy have done well. In human terms, four unemployed service-industry workers is terrible. But statistically, it’s a wash. Once we understand that, it makes the piece sensible.
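The Times example reduces to a few lines of arithmetic. This small Python sketch uses only the figures quoted above; the variable names are my own.

```python
# Income changes from the example: one bonus, four lost salaries.
income_changes = {
    "CEO bonus": 100_000,
    "hostess": -25_000,
    "server": -25_000,
    "bartender": -25_000,
    "dishwasher": -25_000,
}

net_change = sum(income_changes.values())
people_worse_off = sum(1 for change in income_changes.values() if change < 0)

print(f"net change in aggregate income: ${net_change:,}")
print(f"people who lost their income: {people_worse_off}")
```

The aggregate hides the distribution: the total does not move, but four of the five people in it are worse off.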

It uses decomposition charts, basically stacked bar charts broken apart, to show what constitutes the two sides of the American household budget: earning and spending. I’ve taken a screenshot of the spending side of the ledger.

This is the aggregate, I’d be curious how this relates to you, my readers.

We see that starting from the baseline, the solid line, American households spent more money this year on durable goods. A dotted line then carries that adjusted baseline to the right for the next component of the ledger: nondurable goods. We spent more on those too, so the baseline moves up. The designers annotated the graphic, adding descriptions of what each bar represents in a casual, lighthearted tone. I’ve definitely been cooking for myself a lot more.

Here I wish we had some more traditional charting elements, e.g. axis lines and labels. Now this piece is published under the Upshot, a more conversational and less formal brand than the Times as a whole. That probably explains the casual annotations. But I think some basic axis labels, e.g. spending more vs. spending less, could add some context without the need for the annotations.

Where the piece might lose people is what happens after durable goods. Americans stopped spending on services, a decline of over half a trillion dollars. That’s a lot of money. And so the adjusted baseline shifts to well below where we started. Add on savings from things like interest rates (Jay Powell is the chair of the Federal Reserve, for whose Philadelphia bank I work, in full disclosure) and Americans have spent more than half a trillion dollars less. And as the article explains, we’ve also saved an enormous amount, to the tune of $1 trillion. Add it together and you’ve got America saving $1.5 trillion in 2020.
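The moving baseline in a decomposition chart is just a running total. Here is a minimal sketch of that logic; the magnitudes are placeholders I invented so the signs match the description above (they are not the figures from the Times piece), and `running_baselines` is a hypothetical helper.

```python
def running_baselines(components):
    """Return (name, start, end) for each step, where each bar starts at the prior bar's end."""
    baseline = 0.0
    steps = []
    for name, change in components:
        steps.append((name, baseline, baseline + change))
        baseline += change
    return steps


# Placeholder changes in billions of dollars: signs follow the text, sizes are invented.
spending = [
    ("durable goods", 60),
    ("nondurable goods", 40),
    ("services", -560),
    ("interest and other", -80),
]

for name, start, end in running_baselines(spending):
    print(f"{name:<20} from {start:>7.0f} to {end:>7.0f}")
```

Each component picks up where the previous one left off, which is what the dotted line carrying the adjusted baseline to the right is doing in the graphic.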

That money has to go somewhere. And you can see where some of it went when you look at surging prices in GameStop. Longer term, when the pandemic begins to end, we are going to have a pent up demand from people who have had their lives on hold for a year or more. And if there is insufficient supply for whatever’s in demand, prices will rise and we could see a sharp jump in inflation. But that’s a post for another day.

Back to this graphic, as a statistical graphic, it works. But without axis labels and data definitions, barely so. However, I think it’s meant to be more casual and illustrative than data-driven. If I look at this piece through that lens, I do think it works.

Credit for the piece goes to Neil Irwin and Weiyi Cai.

Covid Update: 7 March

Last week I wrote about some signals indicating a potential stagnation in terms of declining numbers of new cases. I also wrote about some potential signs of reversals, or increasing numbers of new cases.

This week, what we saw signs of came to pass.

New case curves for PA, NJ, DE, VA, & IL.

At the tail ends of each chart, you can see that the last week was broadly stagnant. In Pennsylvania and Illinois the seven-day average was itself remarkably flat. Delaware is now where it was this time last week; a slight rise in new cases was met with an equal magnitude decline.

In reversals, we have New Jersey. New case numbers there increased throughout the week. With lower weekend data, those numbers have fallen slightly.

Only in Virginia did we see good numbers in new cases. Numbers there fell over the last week, though notably at a slower pace than in previous weeks.

Deaths presented broadly good news. Last week we had mixed signals with increasing numbers in Delaware and Virginia. We knew the increase in Virginia was due to the state processing a backlog of death certificates with Covid.

Death curves for PA, NJ, DE, VA, & IL.

But in the last few days, those numbers have also fallen though the state reports it is still processing the backlog. And in Delaware, the daily number of deaths has also fallen again. I think it’s too early to say this peak has crested, but it could well be.

And in the other states, we continue to see slowly falling numbers of deaths. There are some potential signs of that bottoming or stalling out in Illinois, but we’ll have to see how this week pans out.

Finally, the best news we had over the course of last week was with vaccinations.

Vaccination curves for VA & IL.

Last week I mentioned that we can see the lines moving upwards as we approach 10% fully vaccinated in Pennsylvania, Virginia, and Illinois.

This week, well let’s start here: as I’ve pointed out in the past, Pennsylvania does not have a centralised reporting system. Most notably the state reports figures for all but Philadelphia county (coterminous with the city). The city reports its own figures. I aggregate the two. But for the last several days, the Philadelphia data site has been broken, so we don’t know the progress of vaccinations in the city. And as the largest city/county in the state, Philadelphia is an enormous part of figuring out the statewide numbers.
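To show why one broken site stalls the whole figure, here is a rough sketch of the aggregation. All the numbers are invented, and `statewide_share` is a hypothetical helper rather than my actual workflow.

```python
def statewide_share(state_excl_philadelphia, philadelphia, population):
    """Fully vaccinated share of the state, or None when the city's figure is unavailable."""
    if philadelphia is None:
        return None  # without the city's count the statewide total is unknown
    return (state_excl_philadelphia + philadelphia) / population


# Invented counts, for illustration only.
print(statewide_share(900_000, 100_000, 12_800_000))  # both sources reporting
print(statewide_share(900_000, None, 12_800_000))     # city data site broken
```

Treating the missing city figure as zero would understate the statewide share, so the sketch returns nothing rather than a misleading number.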

So looking only at Virginia and Illinois, the numbers look good. Virginia is at nearly 9.5%. Illinois is on 8.92%.

But we really need Philadelphia to get its act together.

Credit for the piece is mine.

Lead Pie

This past weekend, I read an article in Politico discussing parents’ outrage over levels of lead and other toxic metals in baby food. The story focuses on a Congressional report into the matter, but that ties back into an EPA study from 2017 that investigated lead contamination. Specifically the article’s author notes “a chart that was buried in supplemental material”. Buried chart? Well I went off to investigate.

And I found all the charts. But I wanted to focus on one. I am not entirely clear what it means: Percent contribution by pathway adjusted for bioavailability of each media for NHEXAS Region 5 study. I get that it’s looking at channels of intake, but it’s unclear if this is lead or some other contaminant. Is this for all people? Or a sub-section of the population as other charts in that supplemental material pack are?

So I made a graphic where I compared the original to two alternate versions.

Now, the editorial focus of the article is on baby food, which is not the apparent focus of the study (unless it is couched in academic/technical terms). But what’s worth noting is that the pale yellow recedes into the background as the burgundy dominates the graphic.

If graphics are done well, they should show clear visual relationships; they do not need to label specific datapoints unless through a progressive disclosure of information. But if you are going to label everything, I would want to make certain that in the case of that same burgundy slice, we have sufficient contrast to read the 17% label.

Pie charts are not good at allowing people to compare slices. So the pie chart as the format here is not a great place to start, but as you can see in my Option 2, if you are going to choose a pie chart form, there are ways of making it more legible. Namely, do not make it three-dimensional.

Here the foreground receives prominence over the background, which may be receding and visually shrinking into the background. And as the point of a chart is to make visual comparisons, if we cannot compare like for like, it’s not ideal.

Also, we have the thickness of the pie chart. That vertical height adds yellow to the slice of the pie we see in front. Casually, that makes the yellow slice appear even larger than it already is from the three-dimensional foreshortening.

Option 2 presents this as a stripped down pie chart. Make it flat. I used one colour, purple, in several tints. I used the 100% tint to highlight the dietary intake channel, because of the Politico article’s focus.

But really, Option 1 is the improvement here. Comparing the smaller slices is easier here as the eye simply moves vertically down the graphic. We are also able to add axis lines that provide a context for where those values fall, between 0 and 10 for Water intake, and just over 10 for Air. Somewhere between 15 and 20 for Soil and dust ingestion.

Finally, that legend. We don’t want the reader to have to strain to identify what slice is what. Why is the legend in a box? Why is it so far away from the pie? In both my options I closely and visually link the labels to the slices/bars they represent. That makes it easier for the reader to know what they are looking at when they are looking at it.

The moral of the story, people: don’t use three-dimensional pie charts.

Credit for the original version goes to the EPA. Credit for the alternate versions is mine.

Covid Update: 28 February

Last week we saw some positive trends with respect to new Covid-19 cases in the Pennsylvania, New Jersey, Delaware, Virginia, and Illinois area. What did we see this week? Curiously, we saw stagnating figures and, in some instances, slight reversals.

New case curves in PA, NJ, DE, VA, & IL.

This stagnation can be seen by the small flattenings at the end of the lines for Pennsylvania, Illinois, and Virginia. And if you look at Delaware and New Jersey, you can see the reversals as little upward hooks.

I do not think this means we will be returning to the levels we saw earlier this winter. In fact, if you look a little ways back in Delaware and a bit further back in both Pennsylvania and Illinois you can see a similar pattern. Slight reversals appear as jagged little outcrops on the slope. New cases do indeed climb for a week or so—probably isolated to specific geographies within those states tied to outbreak clusters, but that’s pure speculation on my part.

These reversals, therefore, are something we should pay attention to this week when the weekday data resumes on Tuesday. But I am not worrying about this breaking the overall trend of falling numbers of new cases.

Deaths, on the other hand, while still a bit mixed, are broadly positive. Last week we were in a similar position as we are with new cases this week. In particular, we were looking at increasing numbers in both Delaware and Virginia while the other three states saw slowly falling numbers.

Death curves for PA, NJ, DE, VA, & IL.

In Delaware we have the numbers down a bit, but the longer term trend remains generally up. I will be watching this closely this week. Virginia, however, has an easier, and maybe better, explanation. During the course of this past week, Virginia stated that it’s processing death certificates from the post-holiday surge in deaths.

This means the state under-reported deaths earlier this year and so the curve should have actually been significantly higher. But the positive news in that is that the deaths we are seeing now happened in the past, so deaths today are far lower than are being reported.

And with vaccinations we continue to have good news. The lines below are clearly off the baseline now as the three states we track move towards 10% fully vaccinated.

Vaccination curves for PA, VA, & IL.

It’s not all perfect, as the rate in Pennsylvania appears to have slowed slightly. This after vaccine administrators mistakenly used second doses for first doses. Now the state has to play catch-up.

But in Virginia and Illinois, we continue to see increasing rates. You can see this as the curve is beginning to gradually slope more and more upward instead of the shallow angle we saw for the last few weeks.

Like with new cases, which, while positive, still have a ways to go before we get to summer-like levels that would allow us to head out and socialise, vaccinations have a long way to go.

And importantly, just because someone is vaccinated doesn’t mean society should reopen just for those lucky to get their doses early. We need to wait—or should wait—for higher levels of vaccination before reopening.

Credit for the piece is mine.

Covid Update: 20 February

Another week, another snowstorm in the Northeast. This winter has been far busier than last, when Philadelphia saw no snow. Unfortunately, whilst people like me enjoy seeing the snow, it’s hampering testing and vaccination.

Last week we saw some middling signs of improvement, but perhaps partially exaggerated by the closures caused by the storm. When we look back at the last week, despite the impact of a storm later in the week, it’s been a categorically positive week with respect to new cases.

New case curves for PA, NJ, DE, VA, & IL.

After the plateaus of the week before, most notably in the straight line in Pennsylvania, this week we saw the line for the seven-day average resume a sharp trajectory down. That isn’t to say we aren’t seeing any slowdown in that reduction of new cases. Illinois best fits that, but we can see slight flattening of the downward curve also in Delaware and New Jersey. In Illinois’ case, that is still welcome as the state approaches early autumn levels of new case rates. In the remaining states, we still have a little ways to go before we reach those levels.

Deaths, on the other hand, remain a mixed bag of results. Last week we talked about a much improved picture from the week before with Delaware and Virginia in particular exhibiting significantly decreased rates.

Death curves for PA, NJ, DE, VA, & IL.

This week we saw some reversal of fortune in those two same states. In Delaware, the numbers of deaths have ticked back upwards and the seven-day average has made up about a third of the gains we saw. In Virginia, the upward swing can be largely—though not entirely—attributed to a one-day spike in numbers.

Whilst the other three states continued to see gradual improvements, the question over the coming week will be what trends emerge within Delaware and Virginia. Do the deaths increase and the situation worsen? Or will the increases prove a temporary aberration followed by a return to decreasing numbers of new deaths?

Finally with vaccines

The vaccination curves for PA, NJ, DE, VA, & IL.

The story to follow in Pennsylvania will be how distribution sites mistakenly administered second doses as first. 60,000 people awaiting their second dose will now have to wait—though still within the recommended window—for their second dose whilst 50,000 people will now have to wait for their first dose.

Otherwise, we continue to see an uptick in vaccinations. Last week we saw states make significant gains in their fully vaccinated populations. Virginia had passed 4% and Pennsylvania was about to hit the same milestone. This week begins with Virginia at nearly 5.5% and Pennsylvania almost at 5%, sitting on 4.77%. We need to keep in mind that this excludes any new vaccinations from Philadelphia, which doesn’t report vaccination data at the weekend. Illinois is now the lagging state at 4.29%.

Credit for the piece is mine.