The Sun’s Over the Yardarm Somewhere

It’s been a little while since my last post, and more on that will follow at a later date, but this weekend I glanced through the Pennsylvania Liquor Control Board’s annual report. For those unfamiliar with the Commonwealth’s…peculiar…alcohol laws, residents must purchase (with some exceptions) their wine and spirits at government-owned and -operated shops.

It’s as awful as it sounds. Compare that to my eight years in Chicago, where I could pick up a bottle of wine at a cheese shop at the end of the block for a quiet night in or a bottle of fine Scotch a few blocks from the office on Whisky Friday for that evening’s festivities. Here all your wine and spirits come from the state store.

And whilst it’s awful from a consumer/consumption standpoint, it makes for some interesting data, because we can largely use that one source to get a sense of the market for wine and spirits in the Commonwealth. That is to say, you don’t need to (really) worry about collecting data from hundreds of other large vendors. Consequently, at the end of the fiscal year you can get a glimpse into the wine and spirit landscape in Pennsylvania.

So what do we see this year?

A choropleth map of per 21+ capita sales of wine and spirits in Pennsylvania.

To start I chose to revisit a choropleth map I made in 2020, just before the pandemic kicked off in the United States. Broadly speaking, not much has changed. You can find the highest per 21+ year old capita value sales—henceforth I’ll simply refer to this as per capita—outside Philadelphia, Pittsburgh, and up in the northeast corner of the Commonwealth.

The great thing about per capita sales is that, by definition, they account for population. So this isn’t just a case of Philadelphia and Pittsburgh having the largest value sales because they are the two largest metropolitan areas, though they do in the aggregate as well. In fact, if we look at the northeast of the Commonwealth, in places like Wayne County, we see the second-highest per capita sales, just under top-ranked Montgomery County.

Wayne County’s population, at least of the legal drinking age, is flat comparing 2018 to 2022: 0.0% or just six people. However, sales over that same period are up 20.2% per person. That’s the 15th greatest increase out of 67 counties. What happened?
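For anyone who wants to check this kind of arithmetic, here is a minimal sketch of the two calculations in play: per capita sales and percent change. The function names and the input figures are placeholders for illustration, not values from the PLCB report.

```python
def per_capita_sales(total_sales: float, population_21_plus: int) -> float:
    """Value of sales divided by the number of residents aged 21 and over."""
    if population_21_plus <= 0:
        raise ValueError("population must be positive")
    return total_sales / population_21_plus


def percent_change(old: float, new: float) -> float:
    """Change from old to new, expressed as a percentage of old."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100


# Placeholder inputs for illustration only, not figures from the report.
sales_2018, pop_2018 = 1_000_000.0, 10_000
sales_2022, pop_2022 = 1_200_000.0, 10_000  # flat population, higher sales

change = percent_change(
    per_capita_sales(sales_2018, pop_2018),
    per_capita_sales(sales_2022, pop_2022),
)
print(f"Per capita sales changed by {change:.1f}%")
```

When the population is flat, as in this placeholder case, the per capita change simply tracks the change in total sales.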

A little thing called Covid-19. During the pandemic, significant numbers of higher-income people from New York and Philadelphia bought second properties in Wayne County and, surely, they brought some of that income and are now spending it on wine and high-priced spirits.

Wayne County stands out starkly on the map, but it does not look like a total outlier. Indeed, if you look at the highest growth rates for per capita sales from 2018 to 2022, you will find them all in the more rural parts of the Commonwealth. Furthermore, almost every county that has seen greater than 15% growth is a county whose drinking-age population has shrunk in the last five years.

Overall, however, the map looks broadly similar to how it did at the beginning of 2020. The top and centre of the Commonwealth have relatively low per capita sales, and this is Appalachia or Pennsyltucky as some call it. Broadly speaking, these are more rural counties and counties of lower income.

I spend a little bit of time out in Appalachia each year and have family roots out in the mountains. And my experience casts one shadow on the data. Personally, I prefer my cocktails, whiskies, and gins. But when I go out for a drink or two out west, I often settle for a pint or two. That part of the Commonwealth strikes me as more fond of beer than wine or spirits. And this dataset does not include beer. I have to wonder how the data would look if we included beer sales—though lower price-point session beers would still probably keep the per capita value sales on the lower end given the broad demographics of the region.

Finally, one last note on that second call out, Potter County having the lowest per capita sales at just under $42 per person. The number struck me as odd. The next lowest county, Fulton, sits nearly $30 more per person. Did I copy and paste the data incorrectly? Was there a glitch in the machine? Is the underlying data incorrect? I can’t say for certain about the third possibility, but I did some digging to try and get to the bottom of this curiosity.

First, you need to understand that Potter County is, by population, the 5th smallest with just over 16,000 total people living there. And as far as I can tell, it had just three stores at the beginning of 2022. But then, before the beginning of the new fiscal year, one of the three stores closed when an adjoining building collapsed. It was never rebuilt. And so perhaps 1/3 of the local population was forced to head out-of-county for wine and spirits. Compared to 2018, per capita sales in Potter County declined by 62%, and most of that is within the last year as the annual report lists the year-on-year decline as just under 54%.
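Percentage declines compound rather than add, so the two figures above imply a decline for the years before the store closed. A rough sketch of that arithmetic follows, treating "just under 54%" as 54%, which is an approximation; the variable names are only illustrative.

```python
# Figures quoted in the post, rounded.
total_decline = 0.62      # per capita sales, 2018 to 2022
last_year_decline = 0.54  # year-on-year, "just under 54%" treated as 54%

# Share of 2018 per capita sales still remaining at each point.
remaining_after_total = 1 - total_decline
remaining_after_last_year = 1 - last_year_decline

# remaining_after_total = remaining_before_last_year * remaining_after_last_year
remaining_before_last_year = remaining_after_total / remaining_after_last_year
earlier_decline = 1 - remaining_before_last_year

print(f"Implied decline before the final year: {earlier_decline:.1%}")
# prints "Implied decline before the final year: 17.4%"
```

Because both inputs are rounded, the result is only approximate, but it suggests the bulk of the fall did indeed come in the final year.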

In coming days and weeks I’ll be looking at the data a bit more to see what else it tells us. Stay tuned.

Credit for the piece is mine.

Political Hatch Jobs

Earlier this week I read an article in the Philadelphia Inquirer about the political prospects of some of the candidates for the open US Senate seat for Pennsylvania, for which I and many others will be voting come November. But before I get to vote on a candidate, members of the political parties first get to choose whom they want on the ballot. (In Pennsylvania, independent voters like myself are ineligible to vote in party primaries.)

This year the Republican Party has several candidates running and one of them you may have heard of: Dr. Oz. Yeah, the one from television. And while he is indeed the front runner, he is not in front by much as the article explains. Indeed, the race largely had been a two-person contest between Oz and David McCormick until recently when Kathy Barnette pulled just about even with the two.

In fact, according to a recent poll the three candidates are all statistically tied in that they all fall within the margin of error for victory. And that brings us to the graphic from the article.

It would be funny to see a candidate finish with negative vote share.

Conceptually this is a pretty simple bar chart with the bar representing the share of the support of those polled. But I wanted to point out how the designer chose to represent the margin of error via hatched shading to both sides of the ends of the red bar.

In some cases the hatch job does not work for me, particularly with those smaller candidates where the bar goes negative. I would have grave reservations about the vote should any candidate win a negative share of the vote. 0% perhaps, but negative? No. I also don’t think the grey hatching works as well over the grey bar in particular and to a lesser degree the red.
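The fix for the negative bars can be sketched in a few lines: clamp each end of the margin-of-error range to the 0 to 100 scale before drawing it. The function names and poll numbers below are hypothetical, and treating "statistically tied" as overlapping ranges is a simplification rather than a formal test.

```python
def support_interval(share: float, margin: float) -> tuple[float, float]:
    """Return the (low, high) range for a poll share, clamped to 0-100."""
    low = max(0.0, share - margin)
    high = min(100.0, share + margin)
    return low, high


def statistically_tied(share_a: float, share_b: float, margin: float) -> bool:
    """Simplified check: do the two clamped ranges overlap?"""
    low_a, high_a = support_interval(share_a, margin)
    low_b, high_b = support_interval(share_b, margin)
    return low_a <= high_b and low_b <= high_a


# Hypothetical numbers, not taken from the poll in the article.
print(support_interval(2.0, 5.0))          # low end clamps at 0.0, not -3.0
print(statistically_tied(20.0, 18.0, 5.0))
```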

I have often thought that these sorts of charts should use some kind of box plot approach. So this morning I took the chart above and reworked it.

Now with box plots.

Overall, however, I really like this designer’s approach. We should not fear subtlety and nuance, and margins of error are just that. After all, we need not go back too far in time to remember a certain candidate who thought she had a presidential election locked up when really her opponent was within the margin of error.

Credit for the piece goes to John Duchneskie.

How Accurate Is Punxsutawney Phil?

For those unfamiliar with Groundhog Day (the event, not the film, because as it happens your author has never seen the film), since 1887 in the town of Punxsutawney, Pennsylvania (60 miles east-northeast of Pittsburgh) a groundhog named Phil has risen from his slumber, climbed out of his burrow, and gone to see if he could see his shadow. Phil prognosticates upon the continuance of winter, whether we receive six more weeks of winter or an early spring, based upon the appearance of his shadow.

But as any meteorological fan will tell you, a groundhog’s shadow does not exactly compete with the latest computer modelling running on servers and supercomputers. And so we are left with the all important question: how accurate is Phil?

Thankfully the National Oceanic and Atmospheric Administration (NOAA) published an article several years ago that they continue to update. And their latest update includes 2021 data.

Not exactly an accurate depiction of Phil.

I am loath to be super critical of this piece, because, again, relying upon a groundhog for long-term weather forecasting is…for the birds (the best I could do). But critiques of information design are largely what this blog is for.

Conceptually, dividing up the piece between a long-term, i.e. since 1887, and a shorter-term, i.e. since 2012, makes sense. The long-term focuses more on how Phil split out his forecasts—clearly Phil likes winter. I dislike the use of the dark blue here for the years for which we have no forecast data. I would have opted for a neutral colour, say grey, or something that is visibly less impactful than the two light colours (blue and yellow) that represent winter and spring.

Whilst I don’t love the icons used in the pie chart, they do make sense because the designers repeat them within the table. If they’re selling the icon use, I’ll buy it. That said, I wonder if using those icons more purposefully could have been more impactful? What would have happened if they had used a timeline and each year was represented by an icon of a snowflake or a sun? What about if we simply had icons grouped in blocks of ten or twenty?

The table I actually enjoy. I would tweak some of the design elements, for example the green check marks almost fade into the light blue sky. A darker green would have worked well there. But, conceptually this makes a lot of sense. Run each prognostication and compare it with temperature deviation for February and March (as a proxy for “winter” or “spring”) and then assess whether Phil was correct.
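The logic of the table can be sketched roughly as follows. The scoring rule here, averaging the February and March deviations and calling anything above zero "spring", is my assumption and not necessarily how NOAA scores it. The records are made-up placeholders, not NOAA data.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str            # placeholder identifier for the year
    forecast: str         # "winter" or "spring"
    feb_deviation: float  # temperature deviation from the long-term average
    mar_deviation: float


def observed_outcome(p: Prediction) -> str:
    """Assumed rule: warmer than average across both months means spring."""
    mean_deviation = (p.feb_deviation + p.mar_deviation) / 2
    return "spring" if mean_deviation > 0 else "winter"


def accuracy(predictions: list[Prediction]) -> float:
    """Share of forecasts that match the observed outcome."""
    if not predictions:
        raise ValueError("need at least one prediction")
    correct = sum(1 for p in predictions if p.forecast == observed_outcome(p))
    return correct / len(predictions)


# Made-up records for illustration only.
sample = [
    Prediction("year A", "winter", -1.5, -0.5),  # colder, forecast correct
    Prediction("year B", "winter", 2.0, 1.0),    # warmer, forecast wrong
    Prediction("year C", "spring", 0.5, 1.5),    # warmer, forecast correct
    Prediction("year D", "winter", 1.0, 3.0),    # warmer, forecast wrong
]
print(f"Accuracy: {accuracy(sample):.0%}")
```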

I would like to know more about what a slightly above or below measurement means compared to above or below. And I would like to know more about the impact of climate change upon these measurements. For example, was Phil’s accuracy higher in the first half of the 20th century? The end of the 19th?

Finally, the overall article makes a point about how difficult it would be for a single groundhog in western Pennsylvania to determine weather for the entire United States let alone its various regions. But what about Pennsylvania? Northern Appalachia? I would be curious about a more regionally-specific analysis of Phil’s prognostication prowess.

Credit for the piece goes to the NOAA graphics department.

Where Are the Votes?

I’m not working for a good chunk of the next few days. But, I did want to share with my readers an analysis of Pennsylvania’s missing votes. Broadly, Trump needs to win the Commonwealth of Pennsylvania next week—yes, the US election is now one week away. Though, Pennsylvania allows mail-in ballots postmarked on Election Day to arrive within a few days and still be counted. So we may not have final tallies for the state until the weekend or Monday after Election Day.

Pennsylvania, of course, narrowly voted for Donald Trump over Hillary Clinton in 2016 with 44,000+ votes making the difference. In 2020, polling has consistently placed Joe Biden above Donald Trump by 5+ points. But, can Trump again pull off an upset victory?

I argue that yes, he can. And fairly easily too. (If you want to see why I think Pennsylvania is really Trumpsylvania, I recommend checking out my longer, more in-depth analysis.) So where would the votes come from? I mapped the 2016 difference between votes cast and registered voters, i.e. people who could have voted, but did not for whatever reason. I then coloured the map by the county’s winner in 2016. Red counties voted for Trump by more than 10 points and blue for Clinton by more than 10 points. The purple counties are those that were competitive, plus or minus 10 points for either candidate.
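The method behind the map can be sketched in a few lines: subtract votes cast from registered voters, then bucket each county by its 2016 margin. The county names and numbers below are invented placeholders, and the function names are illustrative only.

```python
def missing_votes(registered: int, votes_cast: int) -> int:
    """Registered voters who did not cast a ballot."""
    return registered - votes_cast


def county_colour(trump_pct: float, clinton_pct: float,
                  threshold: float = 10.0) -> str:
    """Red or blue if won by more than the threshold, otherwise purple."""
    margin = trump_pct - clinton_pct
    if margin > threshold:
        return "red"
    if margin < -threshold:
        return "blue"
    return "purple"


# Invented placeholder counties, not real 2016 results.
counties = {
    "County A": (100_000, 70_000, 60.0, 36.0),
    "County B": (200_000, 130_000, 30.0, 66.0),
    "County C": (80_000, 55_000, 45.0, 52.0),
}

totals = {"red": 0, "blue": 0, "purple": 0}
for registered, cast, trump_pct, clinton_pct in counties.values():
    totals[county_colour(trump_pct, clinton_pct)] += missing_votes(registered, cast)

print(totals)
```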

In the purple counties, both candidates will want to turn out as many voters as possible. But in the blue counties, Biden has reliably Democratic votes and in red Trump has reliably Republican votes. So why on Monday did Trump visit Allentown, Lititz, and Martinsburg? Because that’s where those votes are.

Allentown, in Lehigh County, is competitive. In fact, neighbouring Northampton Co. will be a key swing county next week and one I will be following closely as the returns come in. But Lititz, Lancaster Co., and Martinsburg, Blair Co., are in reliably red counties. (Though in my Trumpsylvania piece I argue Lancaster Co. is undergoing a transition to a competitive, albeit lean Republican county.)

In Lancaster Co., which went to Trump by nearly 20 percentage points in 2016, there were still just short of 100,000 voters who didn’t vote in 2016. Not all of those voters would have voted for Trump, but for sake of argument, just say 50% would have. That makes just short of 50,000 potential Trump votes—more than Trump’s entire state margin.

Blair Co. is in the Pennsyltucky region of the state, relatively rural, but in Blair’s case, its county seat Altoona is the state’s 10th largest city. While the total number of votes—and the total number of non-voting voters—are smaller than in Lancaster Co., add up all the available votes and it’s a large number.

If you add up all those red counties’ missing votes, you get a total of just shy of 840,000 missing votes. Far more than enough to drastically swing the Commonwealth to Trump in 2020.

Of course, Biden’s counting on driving up turnout in Philadelphia and Pittsburgh and their suburbs, along with other cities in the state, like Allentown, Scranton, Harrisburg, and Erie. In those blue counties, there were 927,000 missing votes, so the potential for a Biden win is also there.

But if Democratic voters again stay home as they did in 2016, Trump has plenty of potential votes to pick up across the state.

Credit for the piece is mine.

Cheesesteaks and Politics

For those unaware, Pennsylvania matters in the 2020 election. And it has mattered for years as a perennial swing state. There are of course the visits to steel mill cities like Pittsburgh and deindustrialised places like Johnstown, and unions love visits to places in Lackawanna and Luzerne. (You can read more about Pennsylvania as a swing state in my latest analysis here.)

But I want to focus on visits to Philadelphia. Because they inevitably involve the candidate consuming a cheesesteak. The Economist’s sister magazine, 1843, recently published an article on this very subject. And the whole thing is worth a read.

How have I managed to find this relevant to a blog about data visualisation? Well, they included a recipe to help people understand just what goes into the traditional Philadelphia dish.

Personally, I always have to confess, I’ve never been a huge fan. But, I’ll take provolone over whiz any day.

Credit for the piece goes to Jake Read.

Trumpsylvania

After working pretty much non-stop all spring and summer, your humble author finally took a few days off. Throw in a bank holiday and you are looking at a five-day weekend. But because this is 2020, travelling was out of the question, and so instead I hunkered down to finish writing and designing an article I have been working on for the last several weeks, if not a few months.

The main write-up (it is a lengthy-ish read, so you may want to brew a cup of tea) is over at my data projects site. This is the first project I have really written about for that site since spring/summer 2016. Some of my longer-term readers may recall that the penultimate piece there, which I wrote about Pennsyltucky, was inspired by work I did here at Coffeespoons.

To an extent, so is this piece. I wrote about Trumpsylvania, the political realignment of the state of Pennsylvania. 2016 and the state’s vote for Donald Trump was less an aberration than many think. It was the near-end result of a decades-long transformation of the state’s political geography. And so I looked at the data underlying the shift and how and where it occurred.

And originally, I had a slightly different conclusion as to how this related to Pennsylvania in the upcoming 2020 election. But, the whole 2020 thing made me shift my thinking slightly. But you’ll have to read the whole thing to understand what I’m talking about. I will leave you with one of the graphics I made for the piece. It looks at who won each county in the state, but also whether or not the candidate was able to flip the county. In other words, was Clinton able to flip a Republican county? Was Trump able to flip a Democratic county?

Who won what? Who flipped what?

Let me know what you think.

And of course, many, many thanks to all the people who suffered my ideas, thoughts, and early drafts over the last several weeks. And even more thanks to those who edited it. Any and all mistakes or errors in the piece are all mine and not theirs.

Credit for the piece is mine.

Parties in Pennsylvania

This is from a social media post I made a few days ago, but I think it may be of some relevance/interest to my Coffeespoons followers. I was curious to see, at 30+ days from the general election, how the landscape has changed for the two parties since 2016.

Well, this project has driven me to a related, but slightly different project that has been consuming my non-work time. Hopefully I will have more on that in the coming days. Without further ado, the post:

Pennsylvania will likely be one of the more critical battleground swing states in this year’s election. In 2016, then candidate Trump won the state by less than one percentage point. But four years is a long time and I was curious to see how things have changed.

In the first chart on the right we see counties won by Trump and on the left, Clinton. The further from the centre, the greater the candidate’s margin of victory over the other. The top half plots registered Republicans’ margin over Democrats as a percentage of all registered voters in the county (including independents and third party) and the bottom half does the same for Democrats. Closer to the centre, the more competitive, further away, less so.
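The registration measure on the vertical axis can be sketched as a one-line calculation: one party's registration lead over the other, as a percentage of all registered voters. The function name and the figures below are hypothetical.

```python
def registration_margin(republicans: int, democrats: int, others: int) -> float:
    """Republican lead over Democrats as a percentage of all registered voters.

    A positive result means more registered Republicans, a negative result
    means more registered Democrats.
    """
    total = republicans + democrats + others
    if total <= 0:
        raise ValueError("need at least one registered voter")
    return (republicans - democrats) / total * 100


# Hypothetical county, not real registration data.
print(f"{registration_margin(45_000, 35_000, 20_000):+.1f} points")
```

Including independents and third-party registrants in the denominator keeps the margin comparable between counties with very different shares of unaffiliated voters.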

Trump’s key to victory was the white, working class voter clustered in the west and the northeast of the state–old mining and steel towns. There Democrats normally counted on organised labour support as registered Democrats. That all but collapsed in 2016. The bottom right shows a number of nominally Democratic counties Trump won, whereas Clinton only picked up one Republican county, Chester.

But what are PA’s battlegrounds?

In the second chart we ignore places like Philly and Fulton County and zoom in on more competitive counties within 20 point margins. Polls presently point to a Biden lead of about 5 points in PA. If every dot moved left by 5 points (it doesn’t really work like that), we only see Erie and Northampton with potential to flip.
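The "move every dot left by 5 points" thought experiment can be sketched as a uniform swing. This assumes the 5 points come straight off each county's Trump-minus-Clinton margin, which, as noted, is not how real swings behave. The county names and margins are placeholders.

```python
def flips_under_uniform_swing(margins: dict[str, float], swing: float) -> list[str]:
    """Counties Trump won that would flip if every margin fell by `swing`.

    `margins` maps county name to the 2016 Trump-minus-Clinton margin in
    percentage points, so positive means Trump won the county.
    """
    return sorted(
        county
        for county, margin in margins.items()
        if margin > 0 and margin - swing < 0
    )


# Placeholder margins, not real 2016 results.
margins_2016 = {
    "County A": 2.0,
    "County B": 4.0,
    "County C": 12.0,
    "County D": -8.0,  # already won by Clinton, so it cannot flip to her
}
print(flips_under_uniform_swing(margins_2016, swing=5.0))
```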

But Trump’s realignment of politics is accelerating (more on this another day) a realignment of PA’s political geography.

In the fourth chart, neither Erie nor Northampton show any real movement via party registration back to Democrats. Erie may flip, but Northampton’s likely a stretch. Places like Cumberland and Lancaster counties are too solidly Republican to flip this year. Instead Trump is more likely to flip counties like Monroe and Lehigh red, even if he loses the state.

Because, not shown, the key to a Biden victory will be running up the margins in Philly & Pittsburgh, and to a lesser extent Philly’s four collar counties, including Chester, which appears to be rapidly shifting in Democrats’ favour.

Credit for the piece is mine.

Covid-19 Update: 28 September

Apologies for the lack of posting, work is pretty busy as we wrap some projects up. But here’s a look at the latest Covid data for Pennsylvania, New Jersey, Delaware, and Illinois. Normally we look at Virginia as well, but their site was down for maintenance and so there was no data to report.

When it comes to new cases, we have on the one hand places like New Jersey and Illinois, where new cases continue to rise. The rate is nowhere near as fast as it was in March and April, but the inclines are clearly there. Delaware has been up and down, but largely hovering around just shy of 100 new cases per day. Pennsylvania is a bit harder to tell because of some dramatic swings that have knocked the average around, but it does appear to be trending upward, though I’m not quite as confident in that as I am with New Jersey and Illinois.

New cases curve in PA, NJ, DE, and IL.

And then when we look at deaths, we generally have good news. Last week we were looking at Virginia and its working through a backlog of unreported deaths. That artificially inflated recent days, but also depressed deaths earlier in the pandemic. Beyond the Old Dominion, however, deaths have remained fairly low. Only in Pennsylvania and Illinois do they hover around 20 deaths per day from the virus.

Death curves in PA, NJ, DE, and IL.

Credit for the piece is mine.

Covid-19 Update: 21 September

Apologies for the lack of posting yesterday, but I wasn’t feeling well. I had some other things planned for today, but then some other things happened this weekend and then I took ill. But it’s still important to look at what’s going on with the pandemic, especially in the United States where it’s been disastrously handled by the White House.

As we approach 200,000 dead Americans, we still look at what’s going on in the tristate region alongside Virginia and Illinois. Specifically we compare last week’s post to this week’s post. Note that normally we look at Sunday data on Monday morning and today we’ll be looking at Monday data on a Tuesday. Both Sunday and Monday are reports from their preceding days, and so we are still looking at weekend reporting of figures. So we can expect them to be lower than workweek data.

New cases curves for PA, NJ, DE, VA, & IL.

If we compare the above chart to last week’s, we can see that Pennsylvania has decidedly reversed course. Whereas things had been headed down in terms of averages, I was worried about the days of daily new cases exceeding the average. Sure enough the average has caught up to the new cases and we’re seeing a rise in the average to levels not really seen since the summer.

New Jersey remains on the path of slowly increasing its numbers of new cases. Delaware looks to be heading back down after a small bump. We might be seeing the beginning of a decline in cases in Virginia, down from its long-running plateau of nearly 1000 new cases per day. And finally in Illinois, it’s not quite clear where things are headed at present. But for the one-day spike that raised the average, it seemed as if new cases had been in decline. The end of that decline, however, might have been an inflection point, as the average may be trending back upwards again.

Death curves in PA, NJ, DE, VA, & IL.

Then when we look at deaths, well we see no real significant change in four of the states. But last week, we were saying Virginia was at a good spot with its latest surge cycle coming to an end. Well now look at that spike and deaths that are higher now than they were in the spring. If you follow my daily posts on social media, you’ll know that there’s a reason for this.

For the last week Virginia has been working through a backlog of deaths that were not entered into its electronic database. And so these deaths happened over the last several months. Consequently the rise, if there even is one, is not nearly as high as shown. But it also means that the earlier peaks may have been far higher than reported at the time.

Credit for the pieces is mine.

Covid-19 Update: 13 September

Apologies for the lack of posting last week. I’m on deadline for, well, today. Plus I had some technical difficulties on the server side of the blog. But it’s a Monday, so we’re back with Covid updates for Pennsylvania, New Jersey, Delaware, Virginia, and Illinois.

New cases curves for PA, NJ, DE, VA, & IL

The good news, such as it is during a global pandemic, is that in Pennsylvania, Delaware, and Illinois, the seven-day average appears to be lower than this time last week or, especially in Delaware’s situation, about to break. For the First State, I’m looking at those days prior to the weekend below the average line that, in combination with the weekend, will likely begin to push that trend downward, especially if we keep seeing fewer and fewer cases this week.
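For readers curious how a seven-day average behaves, here is a minimal sketch. It assumes a trailing window that is simply shorter for the first few days, which may differ from how the charts were built. The daily counts are invented.

```python
def trailing_average(daily: list[float], window: int = 7) -> list[float]:
    """Average of each day and up to `window - 1` days before it."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    for i in range(len(daily)):
        start = max(0, i - window + 1)
        chunk = daily[start:i + 1]
        averages.append(sum(chunk) / len(chunk))
    return averages


def days_below_average(daily: list[float], window: int = 7) -> list[int]:
    """Indices of days whose count sits below that day's trailing average."""
    averages = trailing_average(daily, window)
    return [i for i, (count, avg) in enumerate(zip(daily, averages)) if count < avg]


# Invented daily new-case counts, not real state data.
cases = [100, 100, 100, 100, 100, 100, 100, 80, 70, 60]
print(trailing_average(cases)[-1])
print(days_below_average(cases))
```

A run of days below the line, like the last three in this invented series, is what drags the average downward over the following week.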

Unfortunately, some states like Virginia and New Jersey appear to be, not surging, but experiencing low and slow growth. Low and slow, while great for barbecue, is less than ideal during a pandemic. Granted, it’s better than the rapid infections we saw in March, April, and May, but it still means the virus is spreading in those communities.

Death curves in PA, NJ, DE, VA, & IL.

When we look at deaths from Covid-19 in these five states, the news is better. The only real significant level of deaths was in Virginia, but we can see that the latest little surge, which was at peak last week, has now all but abated, almost to a level not seen since the spring.

The other states remain low with, at most, deaths averaging about 20 per day. Again, not good, but better than hundreds per day.

Credit for the pieces is mine.