choropleth – Coffee Spoons

The Sun’s Over the Yardarm Somewhere

It’s been a little while since my last post, and more on that will follow at a later date, but this weekend I glanced through the Pennsylvania Liquor Control Board’s annual report. For those unfamiliar with the Commonwealth’s…peculiar…alcohol laws, residents must purchase (with some exceptions) their wine and spirits at government-owned and -operated shops.

It’s as awful as it sounds. Compare that to my eight years in Chicago, where I could pick up a bottle of wine at a cheese shop at the end of the block for a quiet night in or a bottle of fine Scotch a few blocks from the office on Whisky Friday for that evening’s festivities. Here all your wine and spirits come from the state store.

And whilst it’s awful from a consumer/consumption standpoint, it makes for some interesting data, because we can largely use that one source to get a sense of the market for wine and spirits in the Commonwealth. That is to say, you don’t need to (really) worry about collecting data from hundreds of other large vendors. Consequently, at the end of the fiscal year you can get a glimpse into the wine and spirit landscape in Pennsylvania.

So what do we see this year?

A choropleth map of per 21+ capita sales of wine and spirits in Pennsylvania.

To start I chose to revisit a choropleth map I made in 2020, just before the pandemic kicked off in the United States. Broadly speaking, not much has changed. You can find the highest per 21+ year old capita value sales—henceforth I’ll simply refer to this as per capita—outside Philadelphia, Pittsburgh, and up in the northeast corner of the Commonwealth.

The great thing about per capita sales is that, by definition, they account for population. So this isn’t simply a case of Philadelphia and Pittsburgh, the two largest metropolitan areas, having the largest value sales (though they do in the aggregate as well). In fact, if we look at the northeast of the Commonwealth in places like Wayne County we see the second highest per capita sales, just under the top-ranked in Montgomery County.
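For anyone who wants to reproduce the arithmetic, here is a minimal sketch of the per capita calculation and the percentage change I use when comparing years. The county figures are made-up placeholders, not numbers from the PLCB report, and the function names are my own for illustration.

```python
def per_capita(total_sales: float, adults_21_plus: int) -> float:
    """Dollar sales divided by the legal-drinking-age population."""
    return total_sales / adults_21_plus


def pct_change(old: float, new: float) -> float:
    """Percentage change from an old value to a new value."""
    return (new - old) / old * 100


# Hypothetical figures for an imaginary county, NOT taken from the PLCB report.
sales_2018, adults_2018 = 4_000_000.0, 40_000
sales_2022, adults_2022 = 4_800_000.0, 40_000

before = per_capita(sales_2018, adults_2018)
after = per_capita(sales_2022, adults_2022)
growth = pct_change(before, after)
print(f"${before:.2f} -> ${after:.2f} per person ({growth:+.1f}%)")
```

Because the population sits in the denominator, a county with flat population and rising sales shows growth per person, and a county with a shrinking population can show growth even if total sales barely move.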

Wayne County’s population, at least of the legal drinking age, is flat comparing 2018 to 2022: 0.0% or just six people. However, sales over that same period are up 20.2% per person. That’s the 15th greatest increase out of 67 counties. What happened?

A little thing called Covid-19. During the pandemic, significant numbers of higher-income people from New York and Philadelphia bought second properties in Wayne County and, surely, they brought some of that income and are now spending it on wine and high-priced spirits.

Wayne County stands out starkly on the map, but it does not look like a total outlier. Indeed, if you look at the highest growth rates for per capita sales from 2018 to 2022, you will find them all in the more rural parts of the Commonwealth. Furthermore, almost every county that has seen greater than 15% growth is one whose drinking-age population has shrunk in the last five years.

Overall, however, the map looks broadly similar to how it did at the beginning of 2020. The top and centre of the Commonwealth have relatively low per capita sales, and this is Appalachia or Pennsyltucky as some call it. Broadly speaking, these are more rural counties and counties of lower income.

I spend a little bit of time out in Appalachia each year and have family roots out in the mountains. And my experience casts one shadow on the data. Personally, I prefer my cocktails, whiskies, and gins. But when I go out for a drink or two out west, I often settle for a pint or two. That part of the Commonwealth strikes me as more fond of beer than wine or spirits. And this dataset does not include beer. I have to wonder how the data would look if we included beer sales—though lower price-point session beers would still probably keep the per capita value sales on the lower end given the broad demographics of the region.

Finally, one last note on that second call out, Potter County having the lowest per capita sales at just under $42 per person. The number struck me as odd. The next lowest county, Fulton, sits nearly $30 more per person. Did I copy and paste the data incorrectly? Was there a glitch in the machine? Is the underlying data incorrect? I can’t say for certain about the third possibility, but I did some digging to try and hit the bottom of this curiosity.

First, you need to understand that Potter County is, by population, the 5th smallest with just over 16,000 total people living there. And as far as I can tell, it had just three stores at the beginning of 2022. But then, before the beginning of the new fiscal year, one of the three stores closed when an adjoining building collapsed. It was never rebuilt. And so perhaps 1/3 of the local population was forced to head out-of-county for wine and spirits. Compared to 2018, per capita sales in Potter County declined by 62%, and most of that is within the last year as the annual report lists the year-on-year decline as just under 54%.
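To check that claim about the last year, here is a quick back-of-the-envelope sketch. It treats the two declines as compounding and uses the rounded percentages quoted above (62% and roughly 54%), so the result is only approximate.

```python
def implied_earlier_decline(total_decline: float, last_year_decline: float) -> float:
    """Decline that must have happened before the final year.

    Declines compound: remaining_total = remaining_earlier * remaining_last_year.
    """
    remaining_total = 1 - total_decline
    remaining_last_year = 1 - last_year_decline
    remaining_earlier = remaining_total / remaining_last_year
    return 1 - remaining_earlier


# Rounded inputs from the post: 62% since 2018, "just under 54%" year on year.
earlier = implied_earlier_decline(0.62, 0.54)
print(f"Implied decline before the final year: {earlier:.1%}")
```

With those rounded inputs, the decline in the years before the store closed works out to something under a fifth, which is why the final year accounts for the bulk of the drop.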

In coming days and weeks I’ll be looking at the data a bit more to see what else it tells us. Stay tuned.

Credit for the piece is mine.

Legendary Adjustments

The other day I was reading an article about the coming property tax rises in Philadelphia. After three years—has anything happened in those three years?—the city has reassessed properties and rates are scheduled to go up. In some neighbourhoods by significant amounts. I went down the related story link rabbit hole and wound up on a Philadelphia Inquirer article I had missed from early May that included a map of just where those increases were largest. The map itself was nothing crazy.

We have a choropleth with city zip codes coloured by the percentage increase. I was thrown for a bit of a loop as I immediately perceived the red representing lower values and green higher values, the standard green to red palette. But given that higher values are “bad”, I can live with red representing bad and sitting at the top of the spectrum.

I filed it away to review later, but when I returned I visited on my mobile phone. And what I saw broadly looked the same, but there was a disconcerting difference. Take a look at the legend.

You can see that instead of running vertically like it did on the desktop, now the legend runs horizontally across the bottom. In and of itself, that’s not the issue. Though I do wonder if this particular legend could have still worked in roughly the same spot/alignment given the geographic shape of Philadelphia along the Delaware River.

Rather, look at the order. We go from the higher, positive values on the left to the negative, lower values on the right. When you read the legend, this creates some odd jumps. For example, we move from “+32% to +49%” then to “+15% to +31%”. We would normally describe the increase bins as moving from “+15% to +31%” then to “+32% to +49%”. In other words, the legend itself is a continuum.

The fix for this would be to simply flip the running order of the legend. Put the lower values on the left and then step up to the right. For a quick comparison, I visited the New York Times website and pulled up the first graphic I could find that looked like a choropleth. Here we have a map of the dangerous temperatures across the United States.

Note how here the New York Times also runs their legend horizontally below the graphic. But instead of running high-to-low like in the Inquirer, the Times runs low-to-high, making for a more natural and intuitive legend.

This kind of simple ordering change would make the Inquirer’s map that much better.
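In code terms the fix is about as small as it sounds. Here is a rough sketch, assuming the legend bins are stored as (low, high) pairs; I have no idea how the Inquirer actually builds its legend. The two positive bins are the ones quoted above, while the negative bin is a made-up placeholder.

```python
# Each bin is (lower bound, upper bound) in percent. The two positive bins are
# quoted in the post; the negative bin is a hypothetical placeholder.
bins_as_published = [(32, 49), (15, 31), (-10, -1)]


def format_bin(low: int, high: int) -> str:
    """Label a bin with explicit signs, e.g. '+15% to +31%'."""
    return f"{low:+d}% to {high:+d}%"


def legend_labels(bins):
    """Return labels ordered low-to-high, ready to lay out left to right."""
    return [format_bin(low, high) for low, high in sorted(bins)]


print(legend_labels(bins_as_published))
```

Sorting on the lower bound keeps the legend a continuum whether it runs vertically on desktop or horizontally on mobile.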

Credit for the Inquirer piece goes to Kasturi Pananjady and John Duchneskie.

Credit for the Times piece goes to Matthew Bloch, Lazaro Gamio, Zach Levitt, Eleanor Lutz, and John-Michael Murphy.

Kids Do the Darnedest Things: But Really They Do

Remember how just last week I posted a graphic about the number of under-18 year olds killed by under-18 year olds? Well now we have an 18-year-old shooting up an elementary school killing 19 students and two teachers. Legally the alleged shooter, Salvador Ramos, is an adult given his age. But he was also a high-school student, reportedly more of a loner type. Legally an adult, perhaps, but I’d argue still more of a child. At least a young adult.

Well, as I noted above, here we are again, kids killing kids. With guns!

And it does look like it correlates with those states with more liberal gun laws, including Texas.

If you keep doing the same thing, but expect different results…

Credit for the piece is mine.

Kids Do the Darnedest Things: Shoot Other Kids

Last month, a 2-year old shot and killed his 4-year old sister whilst they sat in a car at a petrol station in Chester, Pennsylvania, a city just south of Philadelphia.

Not surprisingly some people began to look at the data around kid-involved shootings. One such person was Christopher Ingraham, who explored the data and showed how shootings by children are up 50% since the pandemic. He used two graphics, one a bar chart and another a choropleth map.

The map shows where kid-involved shootings have occurred. Now what’s curious about this kind of a map is that the designer points out that toddler incidents are concentrated around the Southeast and Midwest. And that appears to be true, but some of the standouts like Ohio and Florida—not to mention Texas—are some of the most populated states in the country. More people would theoretically mean more deaths.

So if we go back to the original data and then grab a 2020 US Census estimate for the under-18 population of each state, I can run some back of the envelope maths and we can take a look at how many under-18 deaths there had been per 100,000 under-18 year-olds. And that map begins to look a little bit different.
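The back of the envelope maths is nothing fancier than the sketch below. The two states and their counts are hypothetical placeholders, not figures from the original dataset or the Census estimate.

```python
def rate_per_100k(deaths: int, population: int) -> float:
    """Deaths per 100,000 people in the given population."""
    return deaths / population * 100_000


# Hypothetical (deaths, under-18 population) pairs, NOT real data.
states = {
    "Big State": (30, 6_000_000),
    "Small State": (5, 200_000),
}

rates = {name: rate_per_100k(d, pop) for name, (d, pop) in states.items()}
ranked = sorted(rates, key=rates.get, reverse=True)

for name in ranked:
    print(f"{name}: {rates[name]:.1f} per 100,000")
```

In this made-up example the big state has six times the raw deaths but a fifth of the rate, which is exactly the kind of reshuffling that changes how the map looks.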

If anything we see the pattern a bit more clearly. The problem persists in the Southeast, but it’s more concentrated in what I would call the Deep South. The problem states in the Midwest fade a bit to a lower rate. Some of the more obvious outliers here become Alaska and Maine.

As the original author points out, some of these numbers likely owe to lax gun regulation in terms of safe storage and trigger locks. I wonder if the numbers in Alaska and Maine could be due to the more rural nature of the states, but then we don’t see similar rates of kid deaths in places like Wyoming, Montana, and Idaho.

Credit for the original piece goes to Christopher Ingraham.

One Million Covid-19 Deaths

This past weekend the United States surpassed one million deaths due to Covid-19. To put that in other terms, imagine the entire city of San Jose, California simply dead. Or just a little bit more than the entire city of Austin, Texas. Estimates place the number of those infected at about 80 million. Back of the envelope maths puts that fatality rate at 1.25%. That’s certainly lower than earlier versions of the virus, which has evolved to be more transmissible, but thankfully less lethal than its original form.

Sunday morning I opened the door to my flat and found the Sunday edition of the New York Times waiting for me with a sobering graphic not just above the fold, nor across the front page. No, the graphic—a map where each dot represents one Covid-19 death—wrapped around the entire paper.

You don’t need to do much more here. Black and white colour sets the tone simply enough. Of course, a bit more critically, these maps mask one of the big issues with the geographic spread of not just this virus but many other things: relatively few people live west of the Mississippi River.

Enormous swathes of the plains and Rocky Mountains have but few farmers and ranchers living there. Most of the nation’s populous cities are along the coast, particularly the East Coast, or along rivers or somewhat arbitrary transport hubs. You can see those because this map does not actually plot the locations of individual deaths, but rather fills county borders with dots to represent the deaths that occurred within those limits. That’s why, particularly west of the Mississippi, you see square-shaped concentrations of deaths.

A choropleth map that explores deaths per capita, that is after adjusting for population, shows a different story. (This screenshot comes from the New York Times’ data centre for Covid-19.)

The story here is literally less black and white as here we see colours in yellows to deep burnt crimsons. Whilst the big map yesterday morning concentrated deaths in the Northeast, West Coast, and around Chicago we see here that, relative to the counties’ populations, those same areas fared much better than counties in the plains, Midwest, and Deep South.

A quick scan of the Northeast and Mid-Atlantic states shows that only one county, Juniata in Pennsylvania, fell into the two worst deaths per capita bins—the deeper reds. Juniata County sits squarely in the middle of Pennsyltucky or Trumpsylvania, where Covid countermeasures were not terribly popular. No other county in the region shares that deep red.

Look to the southeast and south, however, and you see lots of deep and burnt crimsons dotting the landscape. This doesn’t mean people didn’t die in the Northeast, because of course they did. Rather, a greater percentage of the population died elsewhere when, as the policies enacted by the Northeast and West Coast show, they didn’t need to.

After all, injecting bleach was never a good idea.

Credit for the piece goes to Jeremy White.

The Potential Impacts of Throwing Out Roe v Wade

Spoiler: they are significant.

Last night we had breaking news on two very big fronts. The first is that somebody inside the Supreme Court leaked an entire draft of the majority opinion, written by Justice Alito, to Politico. Leaks from inside the Supreme Court, whilst they do happen, are extremely rare. This alone is big news.

But let’s not bury the lede, the majority opinion is to throw out Roe v. Wade in its entirety. For those not familiar, perhaps especially those of you who read me from abroad, Roe v Wade is the name of a court case that went before the United States Supreme Court in 1971 and was decided in 1973. It established the woman’s right to an abortion as constitutionally protected, allowing states to enact some regulations to balance out the state’s role in concern for women’s public health and the health of the fetus as it nears birth. Regardless of how you feel about the issue—and people have very strong feelings about it—that’s largely been the law of the United States for half a century.

Until now.

To be fair, the draft opinion is just that, a draft. And the supposed 5-3 vote—Chief Justice Roberts is reportedly undecided, but against the wholesale overthrow of Roe—could well change. But let’s be real, it won’t. And even if Roberts votes against the majority he would only make the outcome 5-4. In other words, it looks like at some point this summer, probably June or July, tens of millions of American women will lose access to reproductive healthcare.

And to the point of this post, what will that mean for women?

This article by Grid runs down some of the numbers, starting with laying out the numbers on who chooses to have abortions. And then ultimately getting to this map that I screenshot.

That’s pretty long distances in the south…

The map shows how far women in a state would need to travel for an abortion with Roe active as law and without. I’ve used the toggle to show without. Women in the south in particular will need to travel quite far. The article further breaks out distances today with more granularity to paint the picture of “abortion deserts” where women have to travel sometimes well over 200 miles to have a safe, legal abortion.

I am certain that we will be returning to this topic frequently in coming months, unfortunately.

Credit for the piece goes to Alex Leeds Matthews.

Colours for Maps

Today we have an interesting little post, a choropleth map in a BBC article examining the changes occurring in the voting systems throughout the United States. Broadly speaking, we see two trends in the American political system when it comes to voting: make it easier because democracy; make it more restrictive because voter fraud/illegitimacy. The underlying issue, however, is that we have not seen any evidence of widespread or concerted efforts of voter fraud or problems with elections.

Think mail-in ballots are problematic? They’ve been used for decades without issues in many states. That doesn’t mean a new state couldn’t screw up the implementation of mail-in voting, but it’s a proven safe and valid system for elections.

Think there were issues of fraudulent voters? We had something like sixty cases brought before the courts and I believe in only one or two instances were the issues even remotely proven. The article cites some Associated Press (AP) reporting that identified only 500 cases of fraudulent votes. Out of over 14 million votes cast.

500 out of 14,000,000.

Anyway, the map in the article colours states by whether they have passed expansive or restrictive changes to voting. Naturally there are categories for no changes as well as when some expansive changes and some restrictive changes were both passed.

Normally I would expect to see a third colour for the overlap. Imagine we had red and blue, a blend of those colours like purple would often be a designer’s choice. Here, however, we have a hatched pattern with alternating stripes of orange and blue. You don’t see this done very often, and so I just wanted to highlight it.

I don’t know if this marks a new stylistic design direction by the BBC graphics department. Here I don’t necessarily love the pattern itself, because the colours make it difficult to read the text, though the designers outlined said text, so points for that.

But I’ll be curious to see if I, well, see more of this in coming weeks and months.

Credit for the piece goes to the BBC graphics department.

Hey Boo Boo

When I was in the Berkshires, one thing I noticed was signs about bears. Bear crossing. Don’t feed the bears. Be beary careful. Okay, not so much the last one. But it was nonetheless odd to a city dweller like myself where I just need to be wary of giant rats.

Less than a month later, I read an article in the Boston Globe about how the black bear population in Massachusetts is expanding from the western and central portions of the state to those in the east.

The graphic in the article actually comes from the Massachusetts Division of Fisheries and Wildlife, so credit goes to them, but it shows the existing range and the black bears’ new range.

I understand the inclusion of the highways in red, green, and black, but I wish they had some labelling, even something simple. In the article they mention a few highways, but my familiarity with the highway system in Massachusetts is not great. Also, because the designer used thin black lines to demarcate the towns, one could think that the black lines, especially out west, represent counties or other larger political geography units.

Credit for the piece goes to the Massachusetts Division of Fisheries and Wildlife.

Out with the New, In with the Old

After twenty years out of power, the Taliban in Afghanistan are back in power as the Afghan government collapsed spectacularly this past weekend. In most provinces and districts, government forces surrendered without firing a shot. And if you’re going to beat an army in the field, you generally need to, you know, fight if you expect to beat them.

I held off on posting anything about the Taliban takeover of Afghanistan simply because it happened so quickly. It was not even two months ago when they began their offensive. But whenever I started to prepare a post, things would be drastically different by the next morning.

And so this timeline graphic from the BBC does a good job of capturing the rapid collapse of the Afghan state. It starts in early July with a mixture of blue, orange, and red—we’ll come back to the colours a bit later. Blue represents the Afghan government, red the Taliban, and orange contested areas.

The graphic includes some controls at the bottom, a play/pause and forward/backward skip buttons. The geographic units are districts, sub-provincial level units that I would imagine are roughly analogous to US counties, but that’s supposition on my part. Additionally the map includes little markers for some of the country’s key cities. Finally in the lower right we have a little scorecard of sorts, showing how many of the nearly 400 districts were in the control of which group.

Skip forward five weeks and the situation could not be more different.

Almost all of Afghanistan is under the control of the Taliban. There’s not a whole lot else to say about that fact. The army largely surrendered without firing a shot. Though some special forces and commando units held out under siege, notably in Kandahar where a commando unit held the airport until after the government fell only to be evacuated to the still-US-held Hamid Karzai International Airport in Kabul.

My personal thoughts, well you can blame Biden and the US for a rushed US exodus that looks bad optically, but the American withdrawal plan, initiated by Trump let’s not forget, counted on the Afghan army actually fighting the Taliban and the government negotiating some kind of settlement with the Taliban. Neither happened. And so the end came far quicker than anyone thought possible.

But we’re here to talk graphics.

In general I like this. I prefer this district-level map to some of the similar province-level maps I have seen, because this gives a more granular view of the situation on the ground. Ideally I would have included a thicker line weight to denote the provinces, but again if it’s one or the other I’d opt for district-level data.

That said, I’d probably have used white lines instead of black. If you look in the east, especially south and east of Kabul, the geographically small areas begin to clump up into a mass of shapes made dark by the black outlines. That black is, of course, darker than the reds, blues, and yellows. If the designers had opted for white or even a light shade of grey, they would have enhanced the user’s ability to see the district-level data by dropping the borders to the back of the visual hierarchy.

Finally with colours, I’m not sure I understand the rationale behind the red, blue, yellow here. Let’s compare the BBC’s colour choice to that of the Economist. (Initially I was going to focus on the Economist’s graphics, but last minute change of plans.)

Another day, more losses for the government

Here we see a similar scheme: red for the Taliban, blue for the government. But notably the designers coloured the contested areas grey, not yellow. We also have more desaturated colours here, not the bright and vibrant reds, blues, and yellows of the BBC maps above.

First the grey vs. yellow. It depends on what the designers wanted to show. The grey moves the contested districts into the background, focusing the reader’s attention on the (dwindling) number of districts under government control. If the goal is to show where the fighting is occurring, i.e. the contest, the yellow works well as it draws the reader’s attention. But if the goal is to show which parts of the country the Taliban control and which parts the government, the grey works better. It’s a subtle difference, I know, but that’s why it would be important to know the designer’s goal.

I’ll also add that the Economist map here shows the provincial capitals and uses a darker, more saturated red dot to indicate if they’d fallen to the Taliban. Contrast that with the BBC’s simple black dots. We had a subtler story than “Taliban overruns country” in Afghanistan: the Taliban largely did hold the rural, less populated districts outside the major cities, but cities like the aforementioned Kandahar, Herat, and Mazar-i-Sharif held out a little bit longer, usually behind commando units or local militia. Personally I would have added a darker, more saturated blue dot for cities like Kabul, which, at the time of the Economist’s map, was not under threat.

Then we have the saturation element of the red and blue.

Should the reds be brighter, vibrant and attention grabbing or ought they be lighter and restrained, more muted? It’s actually a fairly complex answer and the answer is ultimately “it depends”. I know that’s the cheap way out, but let me explain in the context of these maps.

Choropleth maps like this, i.e. maps where a geographical unit is coloured based on some kind of data point, in this instance political/military control, are, broadly speaking, comprised of large shapes or blocks of colour. In other words, they are not dot plots or line charts where we have small or thin instances of colour.

Now, I’m certain that in the past you’ve seen a wall or a painting or an advert for something where the artist or designer used a large, vast area of a bright colour, so bright that it hurt your eyes to look at the area. I mean imagine if the walls in your room were painted that bright yellow colour of warning signs or taxis.

That same concept also applies to maps, data visualisation, and design. We use bright colours to draw attention, but ideally do so sparingly. Larger areas or fields of colours often warrant more muted colours, leaving any bright uses to highlight particular areas of attention or concern.

Imagine that the designers wanted to highlight a particular district in the maps above. The Economist’s map is better designed to handle that need: a district could have its red turned to 11, so to speak, to visually separate it from the other red districts. But with the BBC map, that option is largely off the table because the colours are already at 11.
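To make the “turned to 11” idea a bit more concrete, here is a small sketch using Python’s built-in colorsys module. The starting red and the 0.45 saturation are arbitrary choices of mine for illustration, not the actual values used by the Economist or the BBC.

```python
import colorsys


def set_saturation(rgb, saturation):
    """Return an RGB colour (0-1 floats) with its HLS saturation replaced."""
    hue, lightness, _ = colorsys.rgb_to_hls(*rgb)
    return colorsys.hls_to_rgb(hue, lightness, saturation)


def to_hex(rgb):
    """Format an RGB colour (0-1 floats) as a hex string."""
    return "#" + "".join(f"{round(channel * 255):02x}" for channel in rgb)


base_red = (1.0, 0.0, 0.0)

# Arbitrary illustrative values: a muted fill for most districts and a
# fully saturated fill reserved for the one district we want to stand out.
muted_fill = set_saturation(base_red, 0.45)
highlight_fill = set_saturation(base_red, 1.0)

print("muted:", to_hex(muted_fill))
print("highlight:", to_hex(highlight_fill))
```

If every district already uses the fully saturated red, there is no saturation left to spend on a highlight, which is the problem with starting at 11.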

Why do we have bright colours? Well over the years I’ve heard a number of reasons. Clients ask for graphics to be “exciting”, “flashy”, “make it sizzle” because colours like the Economist’s are “boring”, “not sexy”.

The point of good data visualisation, however, is not to make things sexy, exciting, or flashy. Rather the goal is clear communication. And a more restrained palette leaves more options for further clarification. The architect Mies van der Rohe famously said “less is more”. Just as there are different styles of architecture we have different styles of design. And personally my style is of the more restrained variety. Using less leaves room for more.

Note how the Economist’s map is able to layer labels and annotations atop the map. The more muted and desaturated reds, blues, and greys also allow for text and other artwork to layer atop the map but, crucially, still be legible. Imagine trying to read the same sorts of labels on the BBC map. It’s difficult to do, and you know that it is because the BBC designers needed to move the city labels off the map itself in order to make them legible.

Both sets of maps are strong in their own right. But the ultimate loser here is going to be the Afghan people. Though it is pretty clear that this was the ultimate result. There just wasn’t enough support in the broader country to prop up a Western style liberal democracy. Or else somebody would have fought for it.

Credit for the BBC piece goes to the BBC graphics department.

Credit for the Economist piece goes to the Economist graphics department.

I’ve Got the Seeing the Reds and Greens as One Blues

Today I want to highlight a print article from the New York Times I received about two weeks ago. It’s been sitting in a pile of print pieces I want to sit down, photograph, and then write up. But as we begin to return to normal, I need my second dining room chair back because at some point I’ll have guests over.

The article in question examined the rates of Covid-19 vaccination across the United States. And on the front page, above the fold no less, we can compare the vaccination rates for Covid-19 to those of the 2019–2020 flu and if you unfold it to its full-length glory we can add in the 2009–2010 H1N1 swine flu outbreak.

First thing I want to address is the obvious. Look at those colours. Who loves a green-to-red scale on a choropleth? Not this guy. They are a pretty bad choice because of green-to-red colour blindness. (There are two different types as well as other types of colour blindness, but I’m simplifying here.) But here’s what happens when I pull the photo into Photoshop and test for it. (This is a screenshot, because I’m not aware of a means of exporting a proof image.)

Reds and greens become yellows and greys.

You can still see the difference between the reds and greens. That’s good. And it’s because colour is complicated. In red-green colour blindness, the issue is sensitivity to picking up reds and greens. (Again, oversimplifying for the sake of a blog post.) Between those two colours in the spectrum we have yellow. To the other side of green we have blue.

So if a designer needs to use a red-green colour scheme (and any designer who has worked in data visualisation will undoubtedly have had a client asking for the map/chart/whatever to be in red and green), there’s a trick to making it work.

I don’t know if this is true, but growing up, I learned that green was the one colour the human eye evolved to distinguish the most. Now for a print piece like this, you are working in what we call CMYK space (cyan, magenta, yellow, and black). Red is a mixture of magenta and yellow. Green is a mixture of cyan and yellow. If you remember your school days, it’s similar to, but not the same as, mixing your primary colours. So if you need to make red and green work, what can you do? First, you can subtract a bit of yellow from your green, because that exists between red and green. But then, and this is why CMYK is different from your primary school primary colours, we can adjust the amount of magenta. Magenta is not a “pure” red; instead it’s kind of purplish, and that means it has some blue in it. Adding a little bit of magenta does add “red” into the green, but it also adds more blue to the blue present in the cyan. Now you can spend quite a bit of time tweaking these colours, but very quickly I can get these two options.

Great, you can still see them as both red and green. Your client is probably happy and probably accepts this greenish-blue as green, because we have that ability to distinguish so many types of green. But what about those with red-green colour blindness? Again, I can’t quite do a straight export, so the best is a screenshot, but we can compare those two options like so.

I can see the differences significantly more clearly here.

You can probably still tweak the green, but by going for that simple tweak, you can make the client happy—even though it’s still just better to avoid the red and green altogether—and still make the graphic work.
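If you want to see the yellow and magenta adjustment as numbers, here is a rough sketch. It uses the naive textbook CMYK-to-RGB conversion, which ignores ink and paper profiles, so it will not match what Photoshop or a press produces. The tweaked ink percentages are illustrative guesses of mine, not the values from my swatches.

```python
def cmyk_to_rgb(c, m, y, k=0.0):
    """Naive CMYK (0-1 floats) to 8-bit RGB; ignores ink and paper profiles."""
    return tuple(round(255 * (1 - ink) * (1 - k)) for ink in (c, m, y))


# Plain process colours: red is magenta plus yellow, green is cyan plus yellow.
red = cmyk_to_rgb(0.0, 1.0, 1.0)
plain_green = cmyk_to_rgb(1.0, 0.0, 1.0)

# Hypothetical tweak: pull some yellow out and add a touch of magenta.
tweaked_green = cmyk_to_rgb(1.0, 0.15, 0.6)

print("red:", red)
print("plain green:", plain_green)
print("tweaked green:", tweaked_green)
```

In this simplified model, removing yellow is what raises the blue channel, pushing the green toward the blue side of the spectrum and away from the yellow it shares with red.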

There’s a bit more to say about the rest of the article, which has some additional graphics inside. But that’ll have to wait for another day. As will clearing down the pile of print pieces to share, because that keeps on growing.

Credit for the piece goes to Lazaro Gamio and Amy Schoenfield Walker.