The .500 Red Sox

I initially made this datagraphic over the weekend, after watching the last few weeks of Boston Red Sox baseball wherein they continued to win a game, lose a game, resulting in an even .500 record.

When I started, the graphic I sketched looked very different as I had included timelines and highlighted key moments where key players went down for the year or the year-to-date. But after I added some context of the sport’s leading clubs’ games above or below .500, I realised most of those clubs were the ones my good friends and family followed.

Consequently I ditched my initial concept and opted to instead show how middling my Red Sox have been compared to the rest of them. And whilst this graphic may have a few more spaghetti lines than I’d typically prefer, it does show that squiggle of consistency in the middle that is the Red Sox 2024 season to date.

Of course, when I posted it, the Red Sox had just lost to the Yankees and I said I expected them to win one and lose one the rest of the weekend to stay at .500. So what happened? The Red Sox won both and are now two games over .500.

Baseball superstition thus requires I post more graphics about the .500 Red Sox to get them more games over .500.

Credit for the piece is mine.

My Irishness

Yesterday was Saint Patrick’s Day and those who have followed me at Coffeespoons—or more generally know me—are well aware that my background is predominantly Irish. Those same people probably also know of my keen interest in genealogy. And that’s what today’s post is all about.

Irish genealogy is difficult because of the lack of records and lack of record access. My struggle is often in connecting an ancestor to a specific place in Ireland, necessary for any work to identify baptism, marriage, or death records. Starting with my maternal lines, it’s easy to see how ancestors were from “Ireland”, but I’ve been able to place precious few into a specific geographic context.

Thomas Doyle is the only ancestor I can place into a specific parish, and he wasn’t the key person who allowed it. For those interested in genealogy, it’s always worthwhile to investigate siblings, cousins, aunts, uncles, and sometimes even friends and neighbours because they often can provide clues, as they did in the case of the Doyles.

Sometimes you also need to step outside and get lost in a cemetery. I took a drive one weekend before the pandemic to find the graves of John Hickey and his family. Until that point, I knew nothing about his origins or those of his wife. Luckily his gravestone went one step beyond Ireland and stated he was born in Queen’s County, now County Laois. But I’ve still found no evidence of where in Laois he was born and so tracking the rest of his family is difficult, perhaps impossible.

Furthermore, you can also see that I have little specific information about when these ancestors all arrived. None were present in the 1850 US Census, so we can reasonably work from a starting hypothesis that they arrived after 1850 and then when each had children documented born in the US—or the rarer occasion of a US marriage record—we can reasonably assume they arrived between 1850 and the child’s birth.

On my maternal side there is a lot of work to do, which belies all the effort put into just getting this far over the last decade plus. Contrast that to my paternal side.

Here I have more Irish ancestors to investigate and I’m fortunate that I have more of an American paper trail, which when stitched together allowed me to get snippets of counties of birth or marriage, which, with some helpfully uncommon names, allowed me to dial in on specific parishes and towns. In other cases, my Irish ancestors first settled in Canada or the United Kingdom, which have much better preserved records. And finally a few have had family histories written and documented elsewhere, which allowed me to check the paper trail and validate the work.

And obviously when dealing with people in the mid-19th century, we don’t have a lot of photography. I’m lucky to have found a website (no longer extant, rest in peace Geocities) that had photos of my ancestors, and a cousin over in Ireland who had a few photos sent by my ancestors to their relations, though we’re not sure how they’re related, another story for another day. As a result I can put two faces to 18 names of direct Irish immigrant ancestors.

And of course the thing of note for all these people is that grey bar in the middle of the timeline: the Great Famine. In a roughly seven-year period, over one million Irish died in Ireland and over one million more left Ireland for places like the UK, Canada, the United States, Australia, and New Zealand, among other places. It’s partly the reason for the massive Irish diaspora and why Saint Patrick’s Day is celebrated globally.

You can see some of my Irish ancestry is clearly unrelated, at least directly, to the Great Famine. But when you dig a bit deeper, you see the indirect connection. That John Barry who was an Irish stablekeeper who left Edinburgh for Philadelphia via Liverpool and New York, he was born to Irish parents in Cumberland, England—now Cumbria—who married there just after the end of the Great Famine and for whom there is no record prior to the Great Famine. In other words, they likely fled their home for fear of starvation and then in one generation their children all left England for America.

Irish genealogy is incredibly difficult, but it can also be incredibly rewarding. But you have to keep digging and digging for even sometimes the shallowest roots.

Credit for the piece is mine.

The Sun’s Over the Yardarm Somewhere

It’s been a little while since my last post, and more on that will follow at a later date, but this weekend I glanced through the Pennsylvania Liquor Control Board’s annual report. For those unfamiliar with the Commonwealth’s…peculiar…alcohol laws, residents must purchase (with some exceptions) their wine and spirits at government-owned and -operated shops.

It’s as awful as it sounds. Compare that to my eight years in Chicago, where I could pick up a bottle of wine at a cheese shop at the end of the block for a quiet night in or a bottle of fine Scotch a few blocks from the office on Whisky Friday for that evening’s festivities. Here all your wine and spirits come from the state store.

And whilst it’s awful from a consumer/consumption standpoint, it makes for some interesting data, because we can largely use that one source to get a sense of the market for wine and spirits in the Commonwealth. That is to say, you don’t need to (really) worry about collecting data from hundreds of other large vendors. Consequently, at the end of the fiscal year you can get a glimpse into the wine and spirit landscape in Pennsylvania.

So what do we see this year?

A choropleth map of per 21+ capita sales of wine and spirits in Pennsylvania.

To start I chose to revisit a choropleth map I made in 2020, just before the pandemic kicked off in the United States. Broadly speaking, not much has changed. You can find the highest per 21+ year old capita value sales—henceforth I’ll simply refer to this as per capita—outside Philadelphia, Pittsburgh, and up in the northeast corner of the Commonwealth.

The great thing about per capita sales is that, by definition, they account for population. So this isn’t just a case of Philadelphia and Pittsburgh having the largest value sales because they are the two largest metropolitan areas, though they do in the aggregate as well. In fact, if we look at the northeast of the Commonwealth in places like Wayne County we see the second highest per capita sales, just under the top-ranked in Montgomery County.
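For anyone who wants to see the arithmetic, here is a minimal sketch of a per capita calculation. The function name and the two counties' figures are made up for illustration and are not taken from the PLCB report; the only point is that a smaller county can outrank a larger one once you divide by the 21+ population.

```python
def per_capita_sales(total_sales_dollars: float, population_21_plus: int) -> float:
    """Dollar sales divided by the number of residents aged 21 or older."""
    if population_21_plus <= 0:
        raise ValueError("population must be positive")
    return total_sales_dollars / population_21_plus


# Hypothetical figures for two imaginary counties, NOT real PLCB data.
big_county = per_capita_sales(200_000_000, 1_000_000)
small_county = per_capita_sales(12_000_000, 40_000)

# The big county sells far more in total, but less per drinking-age resident.
print(f"Big county:   ${big_county:,.2f} per person")
print(f"Small county: ${small_county:,.2f} per person")
```

With these invented inputs the smaller county comes out ahead per person despite having a fraction of the total sales, which is the same effect that lets a place like Wayne County rank near the top of the map.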

Wayne County’s population, at least of the legal drinking age, is flat comparing 2018 to 2022: 0.0% or just six people. However, sales over that same period are up 20.2% per person. That’s the 15th greatest increase out of 67 counties. What happened?

A little thing called Covid-19. During the pandemic, significant numbers of higher-income people from New York and Philadelphia bought second properties in Wayne County and, surely, they brought some of that income and are now spending it on wine and high-priced spirits.

Wayne County stands out starkly on the map, but it does not look like a total outlier. Indeed, if you look at the highest growth rates for per capita sales from 2018 to 2022, you will find them all in the more rural parts of the Commonwealth. Furthermore, almost every county that has seen greater than 15% growth is a county whose drinking-age population has shrunk in the last five years.

Overall, however, the map looks broadly similar to how it did at the beginning of 2020. The top and centre of the Commonwealth have relatively low per capita sales, and this is Appalachia or Pennsyltucky as some call it. Broadly speaking, these are more rural counties and counties of lower income.

I spend a little bit of time out in Appalachia each year and have family roots out in the mountains. And my experience casts one shadow on the data. Personally, I prefer my cocktails, whiskies, and gins. But when I go out for a drink or two out west, I often settle for a pint or two. That part of the Commonwealth strikes me as more fond of beer than wine or spirits. And this dataset does not include beer. I have to wonder how the data would look if we included beer sales—though lower price-point session beers would still probably keep the per capita value sales on the lower end given the broad demographics of the region.

Finally, one last note on that second call out, Potter County having the lowest per capita sales at just under $42 per person. The number struck me as odd. The next lowest county, Fulton, sits nearly $30 more per person. Did I copy and paste the data incorrectly? Was there a glitch in the machine? Is the underlying data incorrect? I can’t say for certain about the third possibility, but I did some digging to try and get to the bottom of this curiosity.

First, you need to understand that Potter County is, by population, the 5th smallest with just over 16,000 total people living there. And as far as I can tell, it had just three stores at the beginning of 2022. But then, before the beginning of the new fiscal year, one of the three stores closed when an adjoining building collapsed. It was never rebuilt. And so perhaps 1/3 of the local population was forced to head out-of-county for wine and spirits. Compared to 2018, per capita sales in Potter County declined by 62%, and most of that is within the last year as the annual report lists the year-on-year decline as just under 54%.
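As a rough check on the claim that most of the decline happened in the last year, the two percentages can be untangled with a little compounding arithmetic. This is only a sketch: it assumes the report's year-on-year figure is measured on the same per capita basis as the 62% figure, and it rounds "just under 54%" to 54%. The function name is my own.

```python
def implied_earlier_change(total_change: float, latest_change: float) -> float:
    """Change over the earlier years implied by a total change and the final year's change.

    Changes are fractions, so -0.62 means a 62% decline. Because changes compound,
    (1 + total) = (1 + earlier) * (1 + latest).
    """
    return (1 + total_change) / (1 + latest_change) - 1


# Assumed inputs: 62% total decline since 2018, roughly 54% of it year-on-year.
earlier = implied_earlier_change(-0.62, -0.54)
print(f"Implied change before the final year: {earlier:.1%}")
```

Under those assumptions the years before the store closure account for a decline of roughly 17%, with the rest arriving in the final year.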

In coming days and weeks I’ll be looking at the data a bit more to see what else it tells us. Stay tuned.

Credit for the piece is mine.

Cavalcante Captured

Well, I’ve had to update this since I first wrote, but had not yet published, this article. Because this morning police captured Danelo Cavalcante, the murderer on the lam after escaping from Chester County Prison, with details to follow later today.

This story fascinates me because it understandably made headlines in Philadelphia, from which the prison is only perhaps 30–40 miles, but the national and even international coverage astonished me. Maybe not the initial article, but the days-long coverage certainly seemed excessive when we had much larger problems or notable events occurring throughout the world.

That brings me to this quick comparison of these two maps. The first is from the local paper, the Philadelphia Inquirer. It is a screenshot in two parts, the first the actual map and the second the accompanying timeline.

The Inquirer map
The timeline from the Inquirer

Then we have the BBC and their map of the story:

The BBC version

Both maps use light greys and neutral colours to ground the reader’s experience, his or her welcome to the world of southeastern Pennsylvania. The Inquirer uses a beige and a white focus for Chester County and the BBC omits county distinctions and uses white for rural and grey for built-up areas around Philadelphia.

Both maps use red numbers in their timeline sections to sequence the events, though the Inquirer’s is more extensive in its details and links the red events to red map markers.

The Inquirer leans heavily on local roads and highways with lines of varying width in white with thin outlines, whereas the BBC marks only significant roads as thin blue lines.

The Inquirer’s map adds a lot of geographical context, especially for an audience fastidiously following the situation. And the following makes sense given all the local closures and anxiety—though I’m of the opinion a significant bit of those closures and anxiety were unwarranted. But for a reader in London, Toronto, or Melbourne, does anyone really need to see Boot Road? Strasburg Road? Even Route 30? Or the Route 30 Bypass (at Route 100, hi, Mum)? Not really, and so the omission of many of the local roads makes sense.

I would keep the roads relevant to the story of the search or the capture, for example Routes 23 and 1, and places relevant, for example Longwood Gardens and South Coventry. Here the BBC perhaps goes too far in omitting any place labels aside from Philadelphia, which is itself borderline out of place.

What I like about the BBC’s map, however, is the use of the white vs. grey to denote rural vs. built-up areas, a contextual element the Inquirer lacks. Over the last two weeks I have heard from city folks here in Philadelphia, why can’t the cops capture Cavalcante in Chester County? Well, if you’ve ever driven around the area where he initially roamed, it’s an area replete with wooded hills and creeks and lots of not-so-dense rich people homes. We don’t yet know where he was finally captured, but in Phoenixville he was spotted on camera because it’s an actual borough (I’m pretty certain it’s incorporated) with a walkable downtown. It’s dense with people. And not surprisingly the number of spottings increased as he moved into a denser area.

The Inquirer’s map, however, doesn’t really capture that. It’s just some lines moving around a map with some labels. The BBC’s map, though imperfect because the giant red box obscures a lot of the initial search area, at least shows us how Cavalcante evaded capture in a white thus rural, less-dense area before being seen in a grey thus built-up dense area.

All-in-all, both are good enough. But I wish somebody had managed to combine both into one. Less road map than the Inquirer’s, but more context and grounding than the BBC’s.

Credit for the Inquirer piece goes to John Duchneskie.

Credit for the BBC piece goes to the BBC graphics department.

It’s Been a Little While, But I Haven’t Gone Very Far

I last posted to Coffeespoons a year ago. Well, I’m back. Sort of.

Over the last year, there has been a lot going on in my family and personal life. Suffice it to say that all’s now relatively well. But the last 12 months forced me to prioritise some things over other things, and a daily(ish) blog about information design and data visualisation did not quite make the cut. And over all that time I also picked up a few new interests and hobbies, the most significant being photography.

Nevertheless I still enjoy information design. So I’m back. Though I doubt I will be posting every workday. After all, that’s when I have to go through my photographs and the other things I work upon nowadays. But, I don’t want to completely neglect this blog.

To ease back into the process, I updated a county map of the United States I last updated at the end of 2019, before the pandemic struck.

Where I’ve been in dark purple and counties through which I’ve driven or taken the train in light purple.

But I can’t really say I’ve travelled that far away from Philadelphia over the last year. The only work trip was to Chicago and for holidays I’ve travelled north to the Berkshires and New England several times. I’ve also added Providence and crossed off Rhode Island from the states I’ve visited. Finally, I’ve spent some time working remotely from hotel rooms, allowing me to watch baseball in nearby Minor League ballparks, Salisbury, Maryland’s Arthur Perdue Stadium, among others.

What remains abundantly clear are the two major phases of my life to date. I was born and raised in the greater Delaware Valley (Philadelphia, southeastern Pennsylvania, and southern New Jersey) and lived eight years in the Midwest (Chicago). And what connects all the journeys I’ve made from those home bases, if you will, is the tenuous county-wide tether stretching along I-80 across Indiana and Ohio into I-76 in Pennsylvania.

Unfortunately I still haven’t made it beyond the United States yet post-pandemic—hopefully that will begin changing in 2024—and so I have no updates for that map.

I cannot quite say when the next post will be. I don’t think it will be 12 months. But will it be monthly? Weekly? I can’t quite say. I doubt I will return to daily posting, because as those who know me well know, that was an enormous amount of time I spent every week preparing, writing, and posting content. But I also know well that a regular update frequency is critical to a blog, so that’s a thing I will be thinking about as 2023 begins to fade into autumn and winter.

Stay tuned.

Credit for the piece is mine.

No Matter What You Say, I’m Still Me

As many long-time readers know, I was long ago bitten by the genealogy bug and that included me taking several DNA tests. The real value remains in the genetic matches, less so the ethnicity estimates. But the estimates are fun, I’ll give you that. Every so often the companies update their analysis of the DNA and you will see your ethnicity results change. I wrote about this last year. Well yesterday I received an e-mail that this year’s updates were released.

So you get another graphic.

The clearest change is that the Scottish bit has disappeared. How do you go from nearly 20% Scottish to 0%? Because population groups in the British Isles have mixed for centuries. When the Scottish colonised northern Ireland, they brought Scottish DNA with them. And as I am fairly certain that I have Irish ancestors from present-day Northern Ireland, it would make sense that my DNA could read as Scottish. But clearly with the latest analysis, Ancestry is able to better point to that bit as Irish instead of Scottish. And this shouldn’t surprise you or me, because those purple bars represent their confidence bands. I might have been 20% Scottish, but I also could have been reasonably 0% Scottish.

Contrast that to the Carpatho-Rusyn, identified here as Eastern European and Russian. That hovers around 20%, which makes sense because my maternal grandfather was 100% Carpatho-Rusyn—his mother was born in the old country, present-day Slovakia. We inherit 50% of our DNA from each of our parents, but because they also inherit 50%, we don’t necessarily inherit exactly 25% from our grandparents and 12.5% from our great-grandparents, &c.

But also note how the confidence band for my Carpatho-Rusyn side has narrowed considerably over the last three years. As Ancestry.com has collected more samples, they’re better able to identify that type of DNA as Carpatho-Rusyn.

Finally we have the trace results. Often these are misreads. A tiny bit of DNA may look like something else. Often these come and go each year with each update. But the Sweden and Denmark bit persisted this year with the exact same values. If I compare my matches, my paternal side almost always has some Swedish and Danish ethnicity, not so for my maternal side. And importantly, those matches have more. Remember, because of that inheritance my matches further up on my tree should have more DNA, and that holds true.

That leads me to believe this likely isn’t a misread, but rather is an indication that I probably have an ancestor who was from what today we call Sweden or Denmark. Could be. Maybe. But at 2%, assuming the DNA all came from one person, it’s probably a 4th to a 6th great-grandparent depending on how much I and my direct ancestors inherited.
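The halving logic behind that guess is easy to sketch. The snippet below prints only the average expected share from a single ancestor at each generation; actual inheritance varies around that average, which is why a range of generations is plausible. The function name is my own.

```python
def expected_share(generations_back: int) -> float:
    """Average fraction of DNA from one ancestor: 1 = parent, 2 = grandparent, and so on."""
    return 0.5 ** generations_back


# A (1st) great-grandparent is 3 generations back, so an nth great-grandparent is n + 2.
for n_great in range(3, 7):
    generations = n_great + 2
    print(f"{n_great}th great-grandparent: {expected_share(generations):.2%} on average")
```

On average the share drops below 2% between the 3rd and 4th great-grandparent, so a 2% reading from one person fits that range, or a slightly more distant ancestor if I and my direct ancestors happened to inherit more than the average.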

Clearly there’s more work to do.

Climate Conscientious and Cheaper Cars

Sometimes in the course of my work I stumble across graphics and work that I previously missed. In this case I was seeking a post about one of my favourite infographics, but it turned out I’ve never posted about it and so I will have to rectify that someday. However in my searching, I came upon an article from the New York Times last year where they wrote about research from MIT that compared the carbon dioxide emissions—bad for the environment and climate—per mile to the average monthly cost of a wide range of 2021 vehicles. The important distinction here is that average monthly cost is not the sticker price of a vehicle, but rather the sticker price plus lifetime operating costs. (For their analysis, the authors assumed a 15-year lifespan and 13,000 miles driven per year.)
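To make that distinction concrete, here is a simplified sketch of the idea, not the researchers' actual cost model, which may well include taxes, financing, or other terms I am not accounting for. The 15-year lifespan and 13,000 miles per year come from the article; the function name and the example vehicle's prices are hypothetical.

```python
LIFESPAN_YEARS = 15      # assumption stated in the article
MILES_PER_YEAR = 13_000  # assumption stated in the article


def average_monthly_cost(sticker_price: float, lifetime_operating_cost: float,
                         lifespan_years: int = LIFESPAN_YEARS) -> float:
    """Spread sticker price plus lifetime operating costs evenly over the vehicle's life."""
    months = lifespan_years * 12
    return (sticker_price + lifetime_operating_cost) / months


lifetime_miles = LIFESPAN_YEARS * MILES_PER_YEAR

# Hypothetical vehicle: these dollar figures are invented for illustration only.
monthly = average_monthly_cost(sticker_price=30_000, lifetime_operating_cost=24_000)
print(f"Lifetime miles assumed: {lifetime_miles:,}")
print(f"Average monthly cost:   ${monthly:,.2f}")
```

The takeaway is that a cheaper sticker price can be offset by higher fuel and maintenance costs over those 180 months, and vice versa.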

Why is this so important? It’s pretty simple, really. In the United States, vehicle emissions are the largest source of carbon emissions. And the vast majority of that is due to passenger vehicles. If we as a society want to get serious about reducing our carbon footprint, the biggest changes we need to make are reducing our amount of driving, moving more people into mass transit, or switching out people’s gas-powered vehicles for electric vehicles.

The New York Times turned their work into a really nice static datagraphic. It is static, so there is no real interactivity if you want to compare your vehicle to others. However, the designers did choose some popular models and identified some of the key outliers.

There are nice annotations here that double as a legend.

The designers group the cars, represented by dots, into colour fields. These do a good job of showing how there is overlap between the different types of vehicles. Not all hybrid and plug-in vehicles are cheaper or even less CO2 emitting than some gas-powered vehicles, typically your smaller compacts and hatchbacks. Each colour field is linked to a textual annotation that also functions as a legend.

That alone is very helpful in understanding the differences, subtle and not-so-much, between the types of vehicles. Later on in the article the designers also used a scatter plot of a narrower set of data to compare a select set of vehicles.

Oh, there’s your Tesla.

Here we can see that one cannot simply assume that all electric vehicles are cheaper long-term than their gas-powered compatriots. The Nissan Altima, whilst emitting more CO2, compares favourably with the Tesla Model 3 in both long-term cost and upfront sticker price.

Despite finding this article a year and a half late, we can tie this to current events in that President Biden’s climate bill creates tax credits for electric vehicles. While the bill is perhaps not as significant as many would like, it is remarkable for still being a lot of money devoted to reducing our emissions. And when it comes to electric vehicles, one of the key components is the creation of tax credits. These would help mitigate those upfront sticker costs of electric vehicles. Because whilst they may generally be cheaper in the long-run, you still need to put up more money than their conventionally-powered alternatives either as lump sums or down payments. And with interest rates rising, what you need to cover via an auto loan will become more expensive.

Overall this is a really nice piece. Should I ever need to buy another vehicle, I would love to see this as a resource available to the general public. Unfortunately it only compares 2021 vehicles. And it does make me wonder how my 2005 vehicle compares. Probably not too terribly favourably.

Credit for the piece goes to Veronica Penney.

Just Keep Grinding it Out

There are certain journalism outlets that I read that consistently do a good job with information design or at least are known for it. Now I try to keep my media diet fairly large and ideologically broad, but even so, some outlets feature higher-quality design than others. The New York Times, the Washington Post, and the Economist are usually top of my list, but you will also see the Wall Street Journal, Philadelphia Inquirer, Boston Globe, the Guardian, and the BBC. I also read more niche outlets for some of my interests, e.g. the Athletic for Red Sox and baseball. But these often don’t feature information design. Politico is one that I read for my political news fix. And when I was reading it whilst on holiday, I was surprised to find an article about the employment market with a really nice line chart.

The article examines the changing labour market where, for over a year now, bargaining power largely resided with employees. If employees wanted raises, benefits, perks, whatever, they could often leave their current employer if their requests weren’t met because another employer, desperate for staff, would likely meet their asks. However, as the economy cools, we would expect the labour market to slacken, leaving fewer openings available. That begins to reduce the bargaining power of employees as now employers can say “take it or leave it”, knowing that the offers they make to staff aren’t likely to be bettered by other employers who don’t have open positions or aren’t otherwise hiring.

Four graphics punctuate the article, detailing just that changeover. The full article is worth a read, but I wanted to take a look at one graphic that I think best captures the design decisions made.

That looks like an inflection point to me.

My screenshot above doesn’t capture the interactivity, but we will return to that in a moment. We see three data series: job openings, quits, and layoffs and discharges. The designer represented each with a primary colour, making clear distinctions between the three, and since all three are represented by thousands of units, they can be plotted together. That allows one to make easy comparisons across the three series at particular moments in time, e.g. the Covid recession. My only real quibble is with that recession bar. I probably would have used a neutral colour like a light grey instead of red, because the red appears visually linked to layoffs and discharges when they really are not.

Normally when we see an interactive line chart, we have a small legend above, sometimes below, the graphic. Here, however, the labelling for each line sits directly next to the line. This makes the display clearer for the reader who scans the data series and I’ve seen the approach often in print, but rarely for interactive work.

And when the reader mouses over the work, the highlight does a few nice things.

See what you want to see.

We can first see that the line with which the user is engaged becomes the focus: the remaining two lines recede into the background as they are greyed out. We also get a simple, but well designed text label above the cursor. Note how behind the text there is a thin white stroke that creates visual separation between the letters and the data line. And that cursor is a small grey circle surrounding the data point, allowing you to see said data point.

Take it all together and you have a very clear and very effective interactive line chart. It’s a job well done.

When I see good work from unexpected places it’s important to call it out and highlight it because it means some design director somewhere cares enough to try and improve their publication’s quality of communication. And in an era when many outlets suffer from disinvestment and cost-cutting staff reductions that leave fewer designers, editors, and photographers on staff it is easy to imagine design quality decreasing.

So credit for this piece goes to Eleanor Mueller.

Facebook’s for the Old Folks

We start this work week with something that the young people use, but in a different way than older people do, including elder millennials like myself: social media. Of course, as an elder millennial, I remember Facebook when it was The Facebook and when it expanded access to Penn State, which I attended for a single year.

Pew Research conducted a study of teenagers that revealed they use social media more than ever before, but that they use new (sort of) platforms more than the venerable paragon of the past: Facebook.

The Economist’s Data Team looked at the data and created this graphic showing the trends.

What do you use? How often?

We see stacked bar charts on the left and then a line chart on the right. The left-hand chart shows the frequency with which teenagers use various social media platforms. What I don’t understand is how someone uses a social media application “almost constantly”. But that’s probably why I’m an elder millennial.

Get off my lawn, you whippersnappers.

On the right we see the percentage of teenagers who have used an application at least once. The biggest winners? Applications primarily featuring images over text. The losers? Those that use words.

Now longtime readers know that I am not terribly fond of stacked bar charts, especially because they make comparisons between, in this case, social media platforms very difficult. And I feel like we have a story in the occasional use responses, but it’s tough teasing it out from this graphic.

On the right, well, this is one I enjoy. You can tell just how much the social media environment has evolved in the last 7–8 years because TikTok did not exist and YouTube was not thought of as a social media platform.

I wonder if different colours were truly needed for the line chart. The lines do not really overlap and there is sufficient separation that each line can be read cleanly. If the designers wanted to highlight the fall of Facebook or another story line, they could have used accent colours.

But overall a solid graphic.

Now to check my feeds.

Credit for the piece goes to the Economist’s Data Team.