Big Beer

A few weeks back, a good friend of mine sent me this graphic from Statista that detailed the global beer industry. It showed how many of the world’s biggest brands are, in fact, owned by just a few of the biggest companies. This isn’t exactly news to either my friend or me, because we both worked in market research in our past lives, but I wanted to talk about this particular chart.

Not included, your home brew

At first glance we have a tree map, where the area of each “squarified” shape represents, usually, the share of the total. In this case, the share of global beer production in millions of hectolitres. Nothing too crazy there.

Next, colour often will represent another variable. For market share you might often see greens or blues to red that represent the recent historical growth or forecast future growth of that particular brand, company, or market. Here, however, is where the chart begins to break down. Colour does not appear to encode any meaningful data. It could have been used to encode data about region of origin for the parent company. Imagine blue represented European companies, red Asian, and yellow American. We would still have a similarly coloured map, sans purple and green.

But we also need to look at the data the chart communicates. We have the production in hectolitres, or the shape of the rectangle. But what about that little rectangle in the lower right corner? Is that supposed to be a different measurement or is it merely a label? Because if it’s a label, we need to compare it to the circles in the upper right. Those are labels, but they change in size whereas the rectangles change only in order to fit the number.

And what about those circles? They represent the share of total beer production. In other words the squares represent the number of hectolitres produced and the circles represent the share of hectolitres produced. Two sides of the same coin. Because we can plot this as a simple scatter plot and see that we’re really just looking at the same data.

Not the most interesting scatter plot I’ve ever seen…

We can see that there’s a pretty apparent connection between the volume of beer produced and the share of volume produced, as one would (hopefully) expect. The chart doesn’t really tell us too much other than that there are really three tiers in the Big Six of Breweries. AB InBev is in its own top tier and Heineken is a second separate tier. But Carlsberg and China Resources Snow Breweries are very competitive and then just behind them are Molson Coors and Tsingtao. But those could all be grouped into a third tier.
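
To make that connection concrete: a brewer's share is just its volume divided by the global total, so the two measures differ only by a constant factor. Here is a minimal sketch using made-up placeholder volumes rather than the figures from the graphic; the brewer names and `GLOBAL_TOTAL` are assumptions for illustration.

```python
import math

# Hypothetical placeholder volumes in millions of hectolitres,
# NOT the figures from the Statista graphic.
GLOBAL_TOTAL = 2000.0
volumes = {"Brewer A": 400.0, "Brewer B": 200.0, "Brewer C": 100.0}


def share_of_total(volume, total):
    """Return a volume as a percentage share of the total."""
    return volume / total * 100


shares = {name: share_of_total(vol, GLOBAL_TOTAL) for name, vol in volumes.items()}

# Share divided by volume is the same constant for every brewer,
# which is why the scatter plot collapses onto a straight line.
ratios = [shares[name] / volumes[name] for name in volumes]
is_straight_line = all(math.isclose(r, 100 / GLOBAL_TOTAL) for r in ratios)
```

Whatever the real volumes are, the ratio stays constant, so plotting one measure against the other adds no new information.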

Another way to look at this would be to disaggregate the scatter plot into two separate bar charts.

And now to the bars…

You can see the pattern in terms of the shapes of the bars and the resulting three tiers is broadly the same. You can also see how we don’t need colour to differentiate between any of these breweries, nor does the original graphic. We could layer on additional data and information, but the original designers opted not to do that.

But I find that the big glaring miss is that the article makes the point that, despite the boom in craft beer in recent years, American craft beer is still a very small fraction of global beer production. The text cites a figure that isn’t included in the graphic, probably because they come from two different sources. But if we could do a bit more research we could probably fit American craft breweries into the data set and we’d get a resultant chart like this.

A better bar…

This more clearly makes the point that American craft beer is a fraction of global beer production. But it still isn’t a great chart, because it’s looking at global beer production. Instead, I would want to be able to see the share of craft brewery production in the United States.

How has that changed over the last decade? How dominant are these six big beer companies in the American market? Has that share been falling or rising? Has it been stable?

Well, I went to the original source and pulled down the data table for the Top 40 brewers. I took the Top 15 in beer production, all above 1% share in 2020, and then plotted that against the change in their beer production from 2019 to 2020. I added a benchmark of global beer production—down nearly 5% in the pandemic year—and then coloured the dots by the region of origin. (San Miguel might not seem to fit in Asia by name, but it’s from the Philippines.)

Now I can use a good bar.
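
For anyone wanting to repeat the exercise, the steps are: rank brewers by 2020 production, keep the Top 15, and work out each one's year-over-year change. This is a rough sketch, assuming the data table has been loaded into a list of dictionaries. The field names (`name`, `prod_2019`, `prod_2020`) and the sample rows are placeholders of my own, not the source's.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


def top_brewers_with_change(rows, limit=15):
    """Rank rows by 2020 production, keep the top `limit`, add the 2019-2020 change."""
    ranked = sorted(rows, key=lambda r: r["prod_2020"], reverse=True)[:limit]
    return [
        {
            "name": r["name"],
            "prod_2020": r["prod_2020"],
            "change_pct": pct_change(r["prod_2019"], r["prod_2020"]),
        }
        for r in ranked
    ]


# Placeholder rows for illustration only, not real production figures.
sample = [
    {"name": "Brewer A", "prod_2019": 100.0, "prod_2020": 90.0},
    {"name": "Brewer B", "prod_2019": 50.0, "prod_2020": 55.0},
    {"name": "Brewer C", "prod_2019": 200.0, "prod_2020": 180.0},
]
result = top_brewers_with_change(sample, limit=2)
```

The global benchmark works the same way: feed total world production for the two years into `pct_change` and draw the result as a reference line.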

What mine does not do, because I couldn’t find a good (and convenient) source, is show which top brands belong to which parent companies. That’s probably buried in a report somewhere. But whilst market share data and analysis used to be my job, as I alluded to in the opening, it is no longer and I’ve got to get (virtually) to my day job.

Credit to the original goes to Felix Richter.

Credit for my take goes to me.

Rarely Shady in Philadelphia

After a rainy weekend in Philadelphia thanks to Hurricane Henri, we are bracing for another heat wave during the middle of this week. Of course when you swelter in the summer, you seek out shade. But as a recent article in the Philadelphia Inquirer pointed out, not all neighbourhoods have the same levels of tree cover, or canopy.

From a graphics standpoint, the article includes a really nice scatter plot that explores the relationship between coverage and median household income. It shows that income correlates best with lack of shade rather than race. But I want to focus on a screenshot of another set of graphics earlier on in the article.

On the other hand, pollen.

I enjoyed this graphic in particular. It starts with a “simple” map of tree coverage in Philadelphia and then overlays city zip codes atop that. Two zip codes in particular receive highlights with bolder and larger type.

Those two zip codes, presumably the minimum and maximum or otherwise broadly representative, then receive call outs directly below. Each includes an enlarged map and then the data points for tree cover, median income, and then Black/Latino percentage of the population.

I don’t think the median income needs to be in bar chart form here, especially given the bars do not line up so that you can easily compare the zip codes. The numbers would work well enough as factettes or perhaps a small dot plot with the zip codes highlighted could work instead.

Additionally, the data labels would be particularly redundant if a small scale were used instead. That would work especially well if the median income were moved to the lowest place in the table and the share charts were consolidated in one graphic. Conceptually, though, I enjoy the deep dive into those two zip codes.

Then I wanted to highlight some great design work on the maps. Note how in particular for Chestnut Hill, 19118, the outline of the zip code is largely in a thicker, black stroke than the rest of the map. At the upper right, however, you have two important roads that define the area and the black stroke breaks at those points so the roads can be clearly and well labelled. The other map does the same thing for two roads, but their breaks are shorter as the roads run perpendicular to the border.

Overall this was just a great piece to read and I thoroughly enjoyed the graphics.

Credit for the piece goes to John Duchneskie.

Olympic Recap/Retro

Every four years (or so) I have to confess that I think fondly back upon my former job, because I worked with a few wonderful colleagues of mine on some data about the Olympics. And the highlight was that we had a model to try and predict the number of medals won by the host country as we were curious about the idea of a host nation bump. In other words, do host countries witness an increase in their medal count relative to their performance in other Olympiads?

We concluded that host nations do see a slight bump in their total medal count and we then forecast that we expected Team GB (the team for Great Britain and Northern Ireland) to win a total of 65 medals. We reached 64 by the final day and it wasn’t until the women’s pentathlon when, in maybe the last event, Team GB won a silver medal bringing its total to 65, exactly in line with our forecast.

Probably the most Olympics I’ve ever watched.

Of course we also looked at the data for a number of other things, including if GDP per capita correlated to Olympic performance. We also looked at BMI and that did yield some interesting tidbits. But at the end of the day it was the medal forecast that thrilled me in the summer of 2012.

So yeah, today’s a shameless plug for some old work of mine. But I’m still proud of it two Olympiads later.

If you’d like to see some of the pieces, I have them in my portfolio.

Credit for the piece is mine.

Sunday Covid-19 Data

Another day, more cases of coronavirus and Covid-19. So let’s take a look at Sunday’s data as there were some interesting things going on.

First, let’s dispense with Virginia. The state is enhancing its reporting structure, and so they admit the data is likely an underestimate of the present situation in Virginia. So here’s Virginia, nothing really changed.

The situation in Virginia

Moving on, we have Pennsylvania. Here we are beginning to truly see the disparity between the cities in the southeast and southwest, namely Philadelphia and Pittsburgh, and the T, the shape sometimes used to describe Pennsyltucky. (Though it also includes cities like Harrisburg, the state capital.) The point is that the T of Pennsylvania has yet to suffer greatly from the outbreak. Of course, it’s also the part of the state least equipped to deal with a pandemic.

The situation in Pennsylvania

New Jersey is just bad. One can make the argument that South Jersey is hanging on. (Though I will touch on that later with an idea for today’s afterwork work.) Bergen County in the northeast is likely to surpass 10,000 cases on its own today. And that will put it above most states.

The situation in New Jersey

Delaware is tough because it sits as a small state next to several much larger ones. But, the numbers seem to indicate the outbreak is still worsening. Though in terms of geographic spread, there’s little to say other than that New Castle County, home to Wilmington, in the north is the heart of the state’s outbreak.

The situation in Delaware

Illinois is a fascinating state, because of how dissimilar it is compared to Pennsylvania, a state which has a similar number of people.

The situation in Illinois

The map shows that geographic spread still has a little way to go before reaching every county in the state. But the outbreak has been there longer than in Pennsylvania. And most of the darker purples are concentrated in the northeast, in Chicago and its collar counties. Compare that to Pennsylvania above where you will see dark purple scattered across the cities of its eastern third, e.g. Allentown and Scranton, and in the western parts near Pittsburgh. This too could be worth exploring in depth in the future.

Lastly I want to get to the cases curves charts. Here we look at the daily new cases in each state.

The curves, flattening or otherwise, of the five states.

And unfortunately Sunday’s numbers will impact the Virginia curve, but it overall looks as if the state is worsening. I would argue that Illinois, which appears to be bending towards a steadying condition, is likely in a weird weekly pattern where it appears to stabilise on weekends and then resumes reporting infections come Monday. Pennsylvania might well be flattening its curve. I would want to see a few more days’ worth of data before stating that more definitively. Let’s give it to Wednesday or Thursday.

And then in New Jersey we have a fascinating trend. The curve of increasing number of cases has clearly broken. But it also is not shrinking. Instead, it seems to be more of a plateau. And in that case, the outbreak in New Jersey is not getting worse, but it’s also not getting any better. At least not numerically. However, the goal of flattening the curve is to create a slower, more steady increase in case numbers to help hospitals cope with surge volumes. So good news?

Credit for the pieces is mine.

Wednesday’s Corona Update

As I said yesterday, since people are finding these updates helpful on the social media, I am going to repost the previous evening’s graphics I make on the Coronavirus Covid-19 outbreak here on Coffeespoons as well. So while today is Thursday, these are the numbers states provided yesterday, so it’s more of a Wednesday update.

But here I can start with the flatter curves graphic. The New Jersey numbers in particular look good—I mean they’re still bad. Of course we are just a few big breaches of quarantine and lapses in social distancing from reversing that progress.

Maybe some curve flattening?

State-wise, Pennsylvania continues to worsen. However, a close look at the slope of the line in the previous chart indicates that the steepness of the growth may be lessening. Deaths passed 300 and cases are now firmly entrenched on both sides of the state with the rural, less densely populated areas in the Ridge and Valley portion of the state seemingly hit not as hard.

The situation in Pennsylvania

Despite the potential flattening, New Jersey is just in a rough spot. The final bastions of low case numbers in South Jersey are slowly filling up as Cape May County passed the 100-case threshold.

The situation in New Jersey

Delaware continues to accelerate and is now past 1000 cases.

The situation in Delaware

Virginia continues to see cases spreading in the eastern, more populous portions of the state. And at 75 deaths, it’s nearing the 100-death threshold.

The situation in Virginia

Illinois is seeing deaths occur away from Chicago, in the St. Louis suburban counties and in and around the Springfield, Champaign, and Bloomington areas.

The situation in Illinois

Credit for the piece goes to me.

Tuesday’s Data on Covid-19

Here are the Tuesday figures for Pennsylvania, New Jersey, Delaware, Virginia, and Illinois. At the end is an updated version of the flattening curves chart as well. Given how valuable people have told me these graphics are, by text, email, and DM on social media, I might consider making these a regular staple here on my blog as well. I would probably slowly write about other graphics covering the outbreak as well.

Any feedback is welcome on how to make the graphics more useful to you, the public.

Pennsylvania has finally reached the point where the virus has infected at least one person in every county. Now, if we shift our attention a wee bit to the deaths, we can see those are still largely confined to the eastern third of the state.

The condition in Pennsylvania

New Jersey continues to suffer greatly. But a sharp increase in new cases could be a blip, or it could mean the curve isn’t flattening. We need more data to see a longer trend. Regardless, over 3000 more people were reported infected and over 200 more died.

The condition in New Jersey

Delaware worsened significantly. As a small state, it has a lower captive population. But it is rapidly approaching 1000 cases. In fact, I would not be surprised if that is the headline from Wednesday.

The condition in Delaware

Virginia also saw a significant uptick in cases. And most counties and independent cities in eastern Virginia now report cases. But the rural, mountainous counties in the west and southwest are not uniformly infected. At least not yet.

The condition in Virginia

Illinois saw some geographic spread, but again, compared to a state like Pennsylvania, the worst in Illinois is disproportionately concentrated in the Chicago metropolitan area.

The condition in Illinois

Lastly, the curves are not flattening in any of the states, except maybe New Jersey. But as I noted above, the higher daily cases there might be a blip.

The state of curves

Credit for the pieces goes to me.

Where’s My Corona? Another Round, Please

This past weekend I continued looking at the spread of COVID-19 across the United States. But in addition to my usual maps of Pennsylvania, New Jersey, Delaware, Virginia, and Illinois, I also looked at the number of cases across the United States adjusted for population. I then looked at the five aforementioned states in terms of new cases to see if the curve is flattening. Finally, I looked at the number of hospital beds per 1000 people vs the number of cases per 1000 people.

The latter in particular I wanted to be an examination of hospitalisation rates vs ICU beds, which are a small fraction of total hospital beds. But as I could not find that data, I made do with overall cases and overall beds.

So first let’s look at the cases across the U.S. What you can see is that whilst New York and New Jersey do have some of the worst of the impact, Washington is still not great and Louisiana and Michigan are also suffering.

The situation across the United States

And then when we look at the states by their cases per 1000 people and their hospital beds per 1000 people, we see that the states often claimed to be overwhelmed (New York, New Jersey, and Washington) are all either well over the blue line, which marks an equal number of beds and cases per 1000 people, or near it. It is important to remember, though, that not all beds are the type needed for COVID-19 victims, who often require the more fully kitted out ICU beds. Additionally, not all cases are severe enough to warrant hospitalisation.

Cases per 1k people vs hospital beds per 1k people
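
The comparison behind that chart is simple arithmetic: put both cases and beds on a per-1000-people basis, then check which side of the parity line a state falls on. This is a minimal sketch with invented numbers; the function names and the sample state are assumptions for illustration.

```python
def per_thousand(count, population):
    """Scale a raw count to a rate per 1000 people."""
    return count / population * 1000


def is_over_parity(cases, beds, population):
    """True when cases per 1000 exceed hospital beds per 1000."""
    return per_thousand(cases, population) > per_thousand(beds, population)


# Invented figures for a hypothetical state, for illustration only.
population = 1_000_000
cases_rate = per_thousand(3_000, population)  # roughly 3 cases per 1000 people
beds_rate = per_thousand(2_500, population)   # roughly 2.5 beds per 1000 people
over_line = is_over_parity(3_000, 2_500, population)
```

Since both rates share the same denominator, the check reduces to cases versus beds. The per-1000 scaling matters only for plotting states of different sizes on one chart.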

Then from the broader national view, we can look at the states of interest. Here, those of you who have been following my social media posts will see fewer dark purples in these maps. That’s because I have adopted a new palette that has sacrificed granularity at the lower end of the scale and added it at the top, a particular need in New Jersey and the Philadelphia and Chicago metro areas. And finally we look at the daily new cases to see if that curve is flattening.

Pennsylvania now has almost every county infected. But unlike Illinois, which has a similar infection rate but more unaffected counties, Pennsylvania has fewer cases in its big city, Philadelphia, and has more cases in the smaller cities and towns.

The situation in Pennsylvania

New Jersey is just a disaster. Deaths are now reported in every county, so I can probably remove those orange outlines. The only potential good news is that new cases for the second day in a row were fewer than the day before. It could be a blip. But it could also be a signal that the peak of infection has arrived or is nearing. That said, hospitalisations and deaths are lagging indicators and could take two weeks to follow the positive test results. So even in the best case scenario that this is a peak, New Jersey is far from out of the woods.

The situation in New Jersey

Delaware is the smallest state I look at, and one of the smallest in the union overall, but its cases are increasing at a worrying pace, although like every state I examine in detail it had fewer new cases Sunday than Saturday.

The situation in Delaware

Virginia is in a better spot overall than the other four states. You can see that in the national map above. And most of Virginia’s cases are concentrated in the DC and Richmond areas as well as the cities along the peninsulas jutting into the Chesapeake.

The situation in Virginia

Illinois is, as noted above, similar to Pennsylvania in terms of infections. In terms of deaths, however, it is doubling Pennsylvania’s numbers. And most of its cases are located in and around Chicago. Big chunks of downstate Illinois are unaffected or lightly affected compared to the Commonwealth.

The situation in Illinois

Finally, as I noted in New Jersey, could these lower numbers Sunday than Saturday be meaningful? Possibly. But in all five states? Highly unlikely. Regardless, we can look at the number of daily new cases and see if that curve of infection is flattening. We should wait several days before beginning to make that assessment. But one can hope.

The case for flattening curves

All of this is to say that things are bad and likely will continue to get worse. But I will keep looking at the data daily and presenting it to the public to keep them informed.

Credit for this piece is mine.

Another Friday, Another Corona with Lime Update

Today is yet another Friday in the pandemic. And so I wanted to just upload a few of the graphics I have been making for family, friends, and coworkers and posting on the Instagram and the Facebook. I did this two weeks ago as well, and if you compare those maps to these, you will see quite a stark difference. But on to today’s maps.

As a brief reminder, I am specifically looking at Pennsylvania, New Jersey, and Delaware—the tri-state region for my non-Philly followers—as well as Virginia and Illinois by the request of friends and former colleagues who live in those states. And then at the end I’ve been putting the tri-state region together to provide a fuller regional context.

Lastly, for today only, the Bureau of Labor Statistics published its jobs report about the number of job losses in March across the US. And…it wasn’t pretty.

Conditions in Pennsylvania

Conditions in New Jersey

Conditions in Delaware

Conditions in Virginia

Conditions in Illinois

Conditions in the tri-state region

Plus, the added bonus of the Bureau of Labor Statistics’ monthly jobs report. And spoiler, things aren’t so great out there.

Conditions in the national job market. Not great!

Credit for the work is mine.

 

Modelling the Impact of Not Sheltering in Place or Staying at Home

The administration botched the early stages of the COVID-19 pandemic. Only within the last two weeks have states begun enacting dramatic policies aimed at slowing the spread of the virus through their communities. But what policies the federal government has enacted are now threatened by an administration that prioritises the economy and market over the lives of the citizens it leads. The White House is discussing loosening all the policies of social distancing that health officials and scientists say are necessary to slow the spread of the virus.

This website from CovidActNow.org uses a model to predict the impact state by state of various policies on hospital overcrowding and ultimately deaths. The site opens with a map of the United States showing, broadly, what kind of response each state has followed (understanding things change rapidly these days).

The state of reactions in the United States

That also serves as the navigation for a deep dive into those models for that state. Here I have selected my home state of Pennsylvania. It borders New Jersey and New York, two states that revolve, at least in part, around New York City, rapidly becoming the epicentre of the US outbreak, supplanting Seattle and the Pacific Northwest. What would the state face if we allowed things to keep going blithely on? What would happen if we merely socially distance for three months? What if we shelter in place for three months? (Emphasis added by me to show this is a long-term problem.)

Potential outcomes for Pennsylvania

Turns out that things don’t work out that well if we don’t stay at home, stop travelling, stop socialising. A table below the line charts shows the user how badly things go for the state.

A table of potential outcomes

As you can see, for Pennsylvania, if we were to continue going on like normal, that would result in a death toll almost the size of the entire population of Pittsburgh. Imagine if the city of Pittsburgh were suddenly wiped off the state map. That’s the level we are talking about.

Just three months of just social distancing? Well now you’re talking about wiping out just the cities of Allentown and Scranton.

Sheltering in place for three months, statewide? Well, thankfully Pennsylvania has lots of towns around the size of 5000 to choose from. Imagine no more Paoli, or Tyrone. Or maybe a Collegeville or Kutztown. An Oxford or a Media. Pick one of those and wipe it from the map.

Fundamentally the choice comes down to, do you want to restart your economy or do you want to save lives? Saving lives will undoubtedly mean unemployment, shattered 401k plans, bankruptcies, mental health problems, and cities, towns, and industries devastated without a tax base to provide for the necessary services. But, saving those jobs and dollars will mean tens if not hundreds of thousands of deaths.

I don’t envy the state executive branches making these decisions.

Pennsylvania has chosen a middle road, if you will. It enacted a stay-at-home policy for seven counties: Allegheny (Pittsburgh); Philadelphia and its suburban counties of Bucks, Chester, Delaware, and Montgomery; and Monroe County. The rest of the state, primarily where the virus has yet to make any significant appearance or to appear to be spreading in the community, is not under the strictest of measures. This site’s model doesn’t account for a partial, statewide stay-at-home, but Pennsylvania’s choice is clearly a far superior one for people who prioritise lives over dollars.

Finally, to the people I have seen from my apartment gathering in parks, partying in outdoor spaces, that I can hear throwing house parties, please stop. If not for you, for the rest of us.

Credit for the piece goes to CovidActNow.org.

The Spread of COVID-19 in Select States

By now we have probably all seen the maps of state coverage of the COVID-19 outbreak. But state level maps only tell part of the story. Not all outbreaks are widespread within states. And so after some requests from family, friends, and colleagues, I’ve been attempting to compile county-level data from the state health departments where those family, friends, and colleagues live. Not surprisingly, most of these states cover the Philadelphia and Chicago metro areas, plus Virginia.

These are all images I have posted to Instagram. But the content tells a familiar story. The outbreaks in this early stage are all concentrated in and around the larger, interconnected cities. In Pennsylvania, that means clusters around the large cities of Philadelphia, Pittsburgh, and Harrisburg. In New Jersey they stretch along the Northeast Corridor between New York and Trenton (and along into Philadelphia) and then down into Delaware’s New Castle County, home to the city of Wilmington. And then in Virginia, we see small clusters in Northern Virginia in the DC metro area and also around Richmond and the Williamsburg area. Finally in Illinois we have a big cluster in and around Chicago, but also Springfield and the St. Louis area, whose eastern suburbs include Illinois communities like East St. Louis.

19 March county wide spread of COVID-19

19 March county wide spread of COVID-19

19 March county wide spread of COVID-19

19 March county wide spread of COVID-19

19 March county wide spread of COVID-19

I have also been taking a more detailed look at the spread in Pennsylvania, because I live there. And I want to see the rapidity with which the outbreak is growing in each county. And for that I moved from a choropleth to a small multiple matrix of line charts, all with the same fixed scale. And, well, it doesn’t look good for southeastern Pennsylvania.

County levels compared

Then last night I also compared the total number of cases in Pennsylvania, New Jersey, Delaware, and Virginia. Most interestingly, Pennsylvania and New Jersey’s outbreaks began just a day apart (at least so far as we know given the limited amount of testing in early March). And those two states have taken dramatically different directions. New Jersey has seen a steep curve, with cases doubling in less than two days, whereas Pennsylvania has been a bit more gradual, doubling in a little less than three.

State levels since early March
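
For reference, a doubling time can be estimated from two cumulative counts by assuming exponential growth between them: the elapsed days times ln 2, divided by the natural log of the ratio of the counts. This is a quick sketch of that generic formula, offered as an illustration rather than the exact method behind the chart. The function name and the counts are placeholders.

```python
import math


def doubling_time(start_count, end_count, days_elapsed):
    """Estimate days for cases to double, assuming exponential growth."""
    if start_count <= 0 or end_count <= start_count:
        raise ValueError("counts must be positive and increasing")
    return days_elapsed * math.log(2) / math.log(end_count / start_count)


# Placeholder counts: cases quadruple in four days, so they double every two.
example = doubling_time(100, 400, 4)
```

A smaller result means a steeper curve, which is the difference between the New Jersey and Pennsylvania lines.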

For those of you who want to continue following along, I will be looking at potential options this coming weekend whilst still recording the data for future graphics.

Credit for the pieces is mine.