2025 Red Sox Draft Breakdown

On Monday and Tuesday, Major League Baseball conducted its amateur player draft, wherein teams select American university and high school players. Teams then have two weeks to sign their picks and assign them. (Though many will not actually play this year.)

Two years ago the Red Sox installed Craig Breslow as their new chief baseball officer. He has since reorganised the Red Sox front office and cut a number of personnel, leading to further departures. Crucially for this context, a number of the scouts who identified key Red Sox players like Roman Anthony were either let go or left. The team then focused on analysts and models.

My questions have thus been focused on how this might change the Red Sox’ approach to the draft. A running joke in Sox circles has been how every year the Red Sox draft a high school shortstop from California. But this year, the Red Sox’ first pick was Kyson Witherspoon, a starting pitcher from Oklahoma.

The graphic above shows how Witherspoon was ranked by the media who cover this niche area of baseball: a consensus top-10 pick. And yet the Sox selected Witherspoon at no. 15 overall. This has been another trend of the Sox over the last several years, where other teams select lower-ranked players and leave higher-ranked players available to the Sox and other mid-round selectors. Similarly, fourth-round pick Anthony Eyanson, ranked roughly 40–65, remained on the board and so the Sox took him at no. 87.

As someone who follows the Sox system, I know they need quality pitching prospects, as they have very few with proven track records in the minors. Witherspoon and Eyanson provide that, at least the quality; the track records have yet to develop. Marcus Phillips, seemingly, presents more of a lottery ticket. His rankings spread so far, from 13 to 98, that it is clear there is no consensus on the type of talent the Sox took in him.

Godbout is a middle-infielder with a good hit tool, but light on the power. Clearly the Sox believe they can work with him to develop the power in the next few years. But all in all, three pitchers in the first four rounds.

Now, the additional context for the non-baseball fans amongst you who are still reading is this. Baseball’s draft does not work in the same way as those of, say, the NFL or the NBA. One, the draft is much deeper at 20 rounds. (In my lifetime it used to be as deep as 50.) Two, teams (usually) do not draft for need. I.e., unlike the NFL, where a team, say the Patriots, who needs a wide receiver might draft a wide receiver with their first pick, a team like the Red Sox who need, say, a catcher will not draft a catcher. A key reason why: it takes years for an MLB draftee to reach the majors, if he does so at all, whereas an NFL draftee likely plays for the Patriots the following year. In short, there is often a lag between the draft and the debut, unless you are the Los Angeles Angels. Thus you address your current positional needs via free agency or trades, not the draft. (Unless you are the Angels.) For the purposes of the draft, you therefore draft the “best player available” (BPA).

Some systems, however, are just better at doing different things. Some teams do a better job of developing pitchers, others of developing hitters. Some of developing certain traits of pitching or hitting. Some teams are just bad at it overall. The Sox have, of late, been very good at developing position players/hitters. They have been pretty not-so-great at developing pitching. Hence, when Breslow said he could improve their pitching pipeline, the Sox jumped at the chance to hire him. (It also helps everyone else they interviewed said no, and a number of candidates declined to even be interviewed.)

In part, the failure to develop pitching could be a failure to identify the correct player traits or characteristics. It could be the wrong methods and strategies, improper techniques and technologies. But, if we look at the recent history of Red Sox drafts, it could be, in part, also a consistent lack of drafting pitching. After all, the 26-man MLB team roster typically comprises 13 pitchers and 13 position players. (Technically it is a limit of 13 pitchers, but teams seem to generally max out their pitcher limit.)

You can see in my graphic above that, since the late 2000s, the Red Sox have, with few exceptions, not drafted more than 50% pitchers. This period of time coincides with the ascendance of the vaunted Sox position player development factory and the decline of the homegrown starter. (Again, the obligatory reminder that correlation is not causation.)

Nevertheless, in the last few years, we have seen the drafting of pitchers spike. In the first two years of the new Breslow regime, pitchers represent more than 70% of the team’s amateur draft picks. (There is also the international signing period where players from around the world can be signed within limits. This is how the Sox have acquired very talented players like Rafael Devers and Xander Bogaerts. I omitted this talent acquisition channel from the graphics.)

Consequently, when a team states its strategy is to draft the BPA, but over 70% of all players selected are pitchers, I wonder how one defines “best”. Are the Red Sox weighing pitching more heavily than hitting? Is this an attempt to address a long-standing asymmetry in talent? In the models teams like the Red Sox use, are pitchers worth, say, 1.5× more than hitters? I doubt we will ever know the answer, though the team maintains they draft the best player available.

Ultimately, it may matter very little for the Red Sox in the near-term. The sport’s best prospect, Roman Anthony, is just starting to man the outfield for the Sox. A consensus top-10 prospect, Marcelo Mayer, has also just debuted. A top-25 prospect, Kristian Campbell, debuted on Opening Day. Two second-year players round out the outfield in Ceddanne Rafaela and Wilyer Abreu. A rookie catcher is behind the plate. The Sox may not need serious high-end positional player talent in the next 3–5 years. (Though it certainly helps when trying to trade for other pieces.)

But a two-year lull in drafting high-end positional player talent, on top of the previous two years’ first-round draft picks, catcher Kyle Teel and outfielder Braden Montgomery, being traded for ace Garrett Crochet, means the Sox may well have a several-year gap in positional player matriculation to the majors. That might matter.

Baseball, unlike the NFL and the NBA, is a marathon, however. So perhaps this is all a tempest in a teapot. Let us check back in five years’ time and we can see whether this new draft strategy, if it is indeed a strategy, has cost the Red Sox anything.

Credit for the pieces is mine.

It’s Raining Drones

Last Friday the BBC published an article about the US’ resumption of supplying military assistance to Ukraine in its defence against Russia’s invasion. In that article, the author referenced the increased intensity of Russian drone and missile strikes on Ukraine over that week.

To show the intensity, the BBC included this graphic, which incorporates a heat map into a traditional calendar design. A thin white line separates each day and a thicker stroke separates the months.

The legend incorporates its own visualisation component, wherein the width of each bin shows the scale of the differences between the buckets. After all, there is a significant difference between a bucket spanning 25 strikes, say between 25 and 50, and one spanning 250 strikes, say between 250 and 500.

I really liked this graphic. It very clearly shows that increasing intensity, and annotations point out that the worst days for Ukraine were indeed in that last week. And in a nice bit of attention to detail, note how the arrows have a thin white stroke outlining them, helping create visual separation between the arrows and the calendar heatmap below.

Credit for the piece goes to the BBC graphics department.

A Warming Climate Floods All Rivers

Last weekend, the United States’ 4th of July holiday weekend, the remnants of a tropical system inundated a central Texas river valley with months’ worth of rain in just a few short hours. The result? The tragic loss of over 100 lives (and authorities are still searching for missing people).

Debate rages about why the casualties ran so high, and the gutting of the National Weather Service by the administration looms large, but the natural causes of the disaster are easier to identify. And the BBC did a great job covering those in a lengthy article with a number of helpful graphics.

I will start with this precipitation map, created with National Oceanic and Atmospheric Administration (NOAA) data.

A map of precipitation over central Texas.

I remain less than fully enthusiastic about continuous gradients for map colouration schemes; however, the extreme volume of rainfall during the weather event makes the location of the flooding obvious to all. Nonetheless the designers annotated the map, pointing out the river, the camp at the centre of the tragedy, and the county wherein most of the deaths occurred.

In short, more than 12 inches of rain fell in less than 24 hours. The article also uses a time lapse video to show the river’s flash flooding when it rose a number of feet in less than half an hour.

The article uses the captivating footage of the flash flooding as the lead graphic component. And I get it. The footage is shocking. And you want to get those sweet, sweet engagement clicks and views. But from the standpoint of the overall narrative structure of the piece, I wonder if starting with the result works best.

Rather, the extreme rainfall and geographic features of the river valley contributed at the most fundamental level and showcasing that information and data, such as in the above map, would be a better place to start. The endpoint or culmination of the contributing factors is the flash flooding and the annotated photo of flood water heights inside the cabins of the camp.

Overall I enjoyed the piece tremendously and walked away better informed. I had visited an area 80 miles east of the floods several years ago for a wedding. Coincidentally on the 4th I remarked to a different friend from the area now living in Philadelphia about the flatness and barrenness of the landscape between Austin and San Antonio. I had no idea that just to the west rivers cut through the elevated terrain that would together cause over a hundred deaths a few hours later.

Credit for the piece goes to the BBC, but the article listed a healthy number of contributors whom I shall paste here: Writing by Gary O’Donoghue in Kerr County, Texas, Matt Taylor of BBC Weather and Malu Cursino. Edited by Tom Geoghegan. Images: Reuters/Evan Garcia, Brandon Bell, Dustin Safranek/EPA/Shutterstock, Camp Mystic, Jim Vondruska, Ronaldo Schemidt/AFP and Getty.

Living Longer by the Generations

Last weekend was Easter—for both the Catholics and the Orthodox—and I visited the Appalachian ancestral home of the Carpatho–Rusyn side of my family. Before leaving town I drove up to the old cemetery on a hill overlooking the old church and the Juniata River to pay my respects to those who came before me and without whom I would not be here.

At the end of the four-hour drive back to Philadelphia, stuck in traffic on the Schuylkill Expressway because of course, I realised I had never really looked holistically at the causes of death of my direct ancestors. Earlier this week I spent some time putting that together and then, of course, I realised I wanted to see if I could find any patterns in the data. So of course I made a chart.

If we go back a couple of generations, you can see my ancestors lived to a median age in their mid-60s. But by the time of my grandparents that had increased to almost 80. Of course, the sample size is far smaller for grandparents than great-great-great-&c.-grandparents. Nonetheless, the general trend of the median line is upward.

A few exceptions pull those lines in both directions, however. Catherine Sexton died at the age of 35 from heart disease and James Scollon in the same generation died at 36 from typhoid fever. Additionally, that generation includes a few ancestors who remained in present-day Slovakia in what was one of the most impoverished areas of Europe. Not surprisingly they died in their 40s and 50s. If I exclude those people, the average shoots back up to about 70.

I also decided to colour the minimums and maximums by gender, because as you can see there is a broad pattern of longer-lived women and men who died young. I want to dig more into that aspect of the demographics at a later date to see if that trend holds. I suspect it would because that is the historical trend, but you never know.

Credit for the piece is mine.

Happy Liberation Day

Yesterday I created a map detailing the new tariff rates released by President Trump on Wednesday. I was inspired by the curious inclusion of several small territories with almost no trade with the United States, a few of which are uninhabited. What follows is the graphic and the accompanying text I wrote as I wrote it.

I say that only because some people have not entirely caught the…let’s say tone with which I wrote.


All hail the new tariffs. Very obviously, foreign governments will be paying us lots of cash money. Places like Lesotho, with its so-called high rates of poverty, AIDS, and under-development, are clearly just fronts for the rich. Because their tariffs on us are turning them into the richest, most luxurious places on Earth.

Now I don’t know for sure, but some people say the shithole places like Nambia are really cash cows. Nerds tell me places like Nambia don’t exist, but they’re just idiots looking in the wrong wardrobe. Genius-level intellects like me can easily find Nambia on a map.

There are some very bad ombres out there, and I’m looking at you, Señor Diego Garcia. Some say you’re a thug with bad tattoos whom we should disappear to a secret black site. But the nerds keep telling me you’re not a person, just an island. That you’re not an illegal alien, but a British island where no civilians live, just US soldiers on a secret military base. But we need that money to pay for all the tax cuts for the rich. So we’ll just make our troops there pay Señor Garcia’s tariffs until he stops being lazy and pays us.

Then I’m looking at places like Christmas Island. That Santa Claus is really a bad guy. I know some of you like him—I like him too; he was good to me when I was a child. But all he does is export toys and joys. And that needs to be taxed. So I need Christmas Island to give us all their very real Christmas money.

Finally, I’m looking at Heard Island and McDonald Islands who’re trying to hide near the Antarctic Circle with all the other bad guys and their fortresses of solitude and vaults of swimmable coins. Sure, those nerds keep telling me these islands are uninhabited. But Amber Heard and Ronald McDonald are real people, in league with the Hamburglar, stealing all our rightful American money. The nerds say the islands are only inhabited by penguins. So if you want to say that Amber and Ronald are really just penguins, then we’re going to get all our sweet tariff money from the so-called penguins. Some of whom are emperors. Can you believe that? Emperor penguins? Emperors are rich. So we need to liberate those penguin dollars from the penguin monarchy.

Credit for the piece is mine.

The Red Sox May Finally Have a Second Baseman

Last week was baseball’s opening day. And so on the socials I released my predictions for the season and then a look at the revolving door that has been the Red Sox and second base since 2017.

Back in 2017 we were in the 11th year of Dustin Pedroia being the Sox’ star second baseman. That summer, Manny Machado slid spikes up into second and ruined Pedroia’s knee. Pedroia had surgery and missed Opening Day 2018, then struggled to return. He played 105 games in 2017, then only three in 2018 and six in 2019. And thus began the instability. Here’s a list of the Opening Day second basemen since 2017.

  • 2018 Eduardo Nuñez
  • 2019 Eduardo Nuñez
  • 2020 José Peraza
  • 2021 Kiké Hernández
  • 2022 Trevor Story
  • 2023 Christian Arroyo
  • 2024 Enmanuel Valdez
  • 2025 Kristian Campbell

And, again, by comparison…

  • 2007 Dustin Pedroia
  • 2008 Dustin Pedroia
  • 2009 Dustin Pedroia
  • 2010 Dustin Pedroia
  • 2011 Dustin Pedroia
  • 2012 Dustin Pedroia
  • 2013 Dustin Pedroia
  • 2014 Dustin Pedroia
  • 2015 Dustin Pedroia
  • 2016 Dustin Pedroia
  • 2017 Dustin Pedroia

But not only is it a lack of stability, it is a lack of production. Wins Above Replacement (WAR) is a statistic that attempts to capture a player’s value relative to an “average” player or substitute. A below-replacement-level player is less than 0 WAR. A substitute is 0–2, a regular everyday player is 2–5, an All Star is 5–8, and an elite MVP level performance is 8+ WAR. And, spoiler, the Sox have not had a 5+ WAR second baseman since Pedroia’s final full season in 2016.
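For readers who think better in code, here is a minimal sketch of those tiers as a lookup. The function name `war_tier` and the choice to place a boundary value (exactly 2, 5, or 8) in the higher tier are my assumptions; the text above only gives the ranges.

```python
# WAR tiers as described in the text. Lower bounds are checked from the
# top down. Putting a boundary value in the higher tier is an assumption.
WAR_TIERS = [
    (8.0, "elite MVP level"),
    (5.0, "All Star"),
    (2.0, "regular everyday player"),
    (0.0, "substitute"),
]


def war_tier(war: float) -> str:
    """Return the descriptive tier for a single-season WAR value."""
    for lower_bound, label in WAR_TIERS:
        if war >= lower_bound:
            return label
    return "below replacement level"


# A 4.9 WAR season (the Kiké Hernández figure discussed later in this post)
# lands just short of the All Star tier.
print(war_tier(4.9))
print(war_tier(-0.3))
```

Under these assumptions a 4.9 WAR season classifies as a regular everyday player, which is why no season since 2016 clears the 5+ bar.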

Suffice it to say, the Sox have long had a need for a long-term second baseman. The graphics I created were meant to be two Instagram images in the same post, and so the axis labels and lines stretch across the artboards.

The graphic shows pretty clearly the turmoil at the keystone. The two outliers are Kiké Hernández in 2021 and Trevor Story in 2022. The latter is easily explained. Story was signed to be the backup plan in case shortstop Xander Bogaerts left after 2022. (Back in 2013 I made a graphic after a similar revolving door of shortstops in the eight years after the Red Sox traded Nomar Garciaparra. Then the question was, would a young rookie named Xander Bogaerts be the replacement for the beloved Nomah. Xander played 10 years for the Sox.)

Kiké, however, is a bit trickier to explain. WAR weights value by position. A second baseman is worth more than a leftfielder. But shortstops and centrefielders are worth more than second basemen. And Kiké played a lot more shortstop and centre than he did second base, which likely explains his 4.9 WAR that season.

And so now in 2025 we had yet another guy starting at second. His name? Kristian Campbell. I saw him a few times last year as he rocketed from A to AAA, the lowest to highest levels of minor league player development below the major league. I thought he looked good and so did the professionals, because he’s a consensus top-10 prospect in the sport.

Going into Monday’s matchup between Boston and Baltimore, Campbell is hitting 6 for 14 with one homer and two doubles, an on-base percentage of .500 and an OPS (on-base plus slugging, which weights extra base hits more heavily than singles) of 1.286. Spoiler: that’s very good.
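For those who want to check the arithmetic, here is a small sketch that rebuilds the OPS from the batting line above. Slugging is total bases divided by at-bats; I assume the three hits that were not the homer or the doubles were all singles (no triples), and I take the .500 on-base percentage as quoted rather than recomputing it, since the text does not give walks. The function name `slugging` is mine.

```python
def slugging(at_bats: int, hits: int, doubles: int = 0,
             triples: int = 0, home_runs: int = 0) -> float:
    """Slugging percentage: total bases divided by at-bats."""
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


# Batting line from the text: 6 for 14 with one homer and two doubles.
# Assumption: the remaining three hits are singles.
slg = slugging(at_bats=14, hits=6, doubles=2, home_runs=1)
obp = 0.500  # on-base percentage as quoted in the text
ops = obp + slg

print(f"SLG {slg:.3f}, OPS {ops:.3f}")
```

Eleven total bases in 14 at-bats gives a slugging percentage of .786, and adding the .500 on-base percentage reproduces the 1.286 OPS.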

Boston beat writers are reporting the Sox and Campbell’s agent are in talks for a long-term extension.

It looks like the Sox may have found their new long-term second baseman.

Credit for the piece is mine.

My Irish Heritage

This week began with Saint Patrick’s Day, a day that here in the States celebrates Ireland and Irish heritage. And I have an abundance of that. As we saw in a post earlier this year about some new genetic ancestry results, Ireland accounts for approximately 2/3 of my ancestry. But as many of my readers know, actual records-based genealogy is one of my big hobbies and so for this Saint Patrick’s Day, I decided to create a few graphics to capture all my current research on my family’s Irish heritage.

In the current political climate wherein we hyperfixate on immigration, I started with my ancestors’ immigration to North America.

My graphic features a timeline marking when certain ancestors arrived, with the massive caveat I do not know when all my Irish ancestors arrived. I separate the ancestors into paternal and maternal lines. My maternal lines are only half Irish, and unfortunately most of them offer little in terms of early records or origins and so the bulk of the graphic lands on my paternal lines.

I did sort out that two–four lines began in Canada and included them with orange dots. (The one couple married in Ireland shortly before setting sail for Canada. The other two lines married in Canada.) I also added a grey bar representing the length of the Great Famine. I suspect a number of my ancestors arrived during the famine based on the fact they begin to appear in the records around 1850, but sadly none of those records state specifically when they arrived; instead they just appear in the United States.

I also used filled vs. open dots to indicate whether or not I had primary source documents for arrivals. I.e., a passenger manifest, naturalisation papers, &c. that specifically detail immigration information weigh more heavily as evidence than, say, a census record wherein a respondent can say he or she arrived in such a year. (Spoiler, census records are not infallible.)

The overall takeaway: most of my Irish immigrant ancestors for whom I have information arrived in the middle of the 19th century, within a decade of the Great Famine.

The second graphic features even more difficult data to find. Whence did my ancestors come?

For those unfamiliar with Irish genealogy, finding the town or parish from which your ancestors hailed can be nigh impossible. To start, you need some kind of American-based record that gives you a clue as to where in Ireland to look—a county or city. From my experience, most records simply state places of birth as “Ireland”—not very helpful.

Then if you can get back to Ireland, the typical resource you might use in the United States, United Kingdom, and other countries is the census. And Ireland did record a census every ten years, beginning in 1821. Unfortunately 1861 and 1871 were destroyed shortly after the data was recorded. Then during World War I, the 1881 and 1891 censuses were pulped due to a paper shortage. Then in 1921, there was no census because of the whole Irish Civil War thing. Finally in 1922, during the Battle of Dublin in the whole Irish Civil War thing, the Public Records Office at the Four Courts, which held government records dating back hundreds of years as well as guns and ammunition, was blown up. And with the ammunition, so too were blown up the census records for 1821, 1831, 1841, and 1851. In short, genealogists only have access to census records for 1901 and 1911. (The 1926 census, organised post-Civil War, does not become public until 2027.)

Then you have the whole unavailability of Catholic Church records, which is another long discussion about the conflict between Protestants and Catholics in Ireland. (Just a minor thing in Irish history.)

There are some civil public records available and they begin in the mid-19th century, which in many cases is just a bit too late for genealogical purposes.

Suffice it to say, Irish genealogy can be tricky and in 15 years of researching it myself, I have only been able to find the origins of 10 Irish immigrant ancestors. For context, to the best of my knowledge I have 18 Irish immigrant ancestors. Thus that map is very empty.

The second map, of the United States and United Kingdom, is more complete because the records are more complete. It maps the residences of my Irish and Irish-American ancestors. Initially I attempted to link all the towns and cities with arrows to show the migration patterns. Alas, it quickly became a mess at such a small scale. That remains a project for another day.

My Irish heritage is a thing of which I am proud, and I am glad to say my genealogy hobby has allowed me to explore it much more deeply and richly than a green-dyed pint would allow.

Credit for the pieces is mine.

A Refreshed Look at My Ethnic Heritage

Late last week I received an update on my ethnic breakdown from My Heritage, a competitor of Ancestry.com and other genealogy/family history/genetic ancestry companies. For many years, the genealogical community had been waiting for this long-promised update. And it has finally arrived.

For my money, My Heritage’s older analysis, v0.95, did not align with my historical record research—something I have done for almost 15 years now. That DNA analysis painted me with an 85% heritage of Irish, Scottish, and Welsh. Because I have spent a decade and a half researching my ancestors, I know all of my second-great-grandparents, 16 total. 85% means 13–14 of them would be Irish, Scottish, or Welsh. However, four of them are Carpatho-Rusyns from present day eastern Slovakia. And nowhere in my research have I found any connection to the Baltic states or Finland.
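The arithmetic behind that mismatch is simple enough to sketch. This assumes each second-great-grandparent contributes an equal one-sixteenth share on paper; actual genetic inheritance varies from ancestor to ancestor, so treat the numbers as expectations rather than exact values. The variable names are mine.

```python
from fractions import Fraction

SECOND_GREAT_GRANDPARENTS = 16
# Assumption: each ancestor contributes an equal share on paper.
share_each = Fraction(1, SECOND_GREAT_GRANDPARENTS)

# How many of the 16 would an 85% estimate imply?
implied_ancestors = 0.85 * SECOND_GREAT_GRANDPARENTS

# Four Carpatho-Rusyn ancestors cap every other group at the remainder.
carpatho_rusyn = 4
max_other_share = float((SECOND_GREAT_GRANDPARENTS - carpatho_rusyn) * share_each)

print(f"Share per ancestor: {float(share_each):.2%}")
print(f"Ancestors implied by 85%: {implied_ancestors:.1f}")
print(f"Maximum share left for all other groups: {max_other_share:.0%}")
```

With four of sixteen ancestors accounted for as Carpatho-Rusyn, no other grouping can exceed 75% on paper, which is why an 85% Irish, Scottish, and Welsh estimate cannot square with the records.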

Compare that to the update.

Here we have a drastically reduced Irish component that, importantly, has been split from Scottish and Welsh, which now exists as its own genetic group. The East European group appears too low, but perhaps My Heritage identified some of my Slavic ancestry as Balkan, as there is a sizeable Carpatho-Rusyn community in Vojvodina, an autonomous province in Serbia. Maybe Germanic too? That would start to push it near to 20%.

I do have English ancestry (my Anglophilia must come from somewhere), though it is relatively small and I can trace it to the Medieval period. That includes more of the Norman elite than the Anglo-Saxon plebs and so seeing Breton register could be indicative of that Norman/Anglo-Saxon population mixture.

But how do My Heritage’s results compare to those provided by Ancestry and FamilyTreeDNA, two competitors whose services I have also used? And how do they compare to my actual historical document research?

My Heritage’s newest analysis certainly hits a lot better and is nearer to Ancestry, which aligns best with my research. I do have two questions for my second-great-grandparents. One surrounds Nathaniel Miller, one of whose grandparents (Eliza Garrotson) may not be English but rather Dutch from the Dutch colonisation of the Hudson River Valley in New York south of Albany.

The other question revolves around William Doyle. His mother is identified in the records variously as English and Irish. A family story on that side of the family also suggests one ancestor of English descent. And finally, a recently discovered marriage record for his parents details how his mother (Martha Atkins) was baptised and converted to Catholicism as an adult prior to her marriage. Not all Irish are Catholic, but the vast majority are and that would also suggest Martha was not Irish.

Taking those two questions into account, I have a small range of expected values for my English ancestry and a slightly larger one for my Irish and you can see those in the graphic.

When you compare that to the My Heritage results alongside the Ancestry and FamilyTreeDNA results you can see Ancestry aligns best with my research whereas FamilyTreeDNA aligns the least. My Heritage now falls squarely between the two. And so I consider their update a success. I think the company still has some work to do, but progress is progress.

Credit for the pieces is mine.

Imports, Tariffs, and Taxes, Oh My!

Apologies, all, for the lengthy delay in posting. I decided to take some time away from work-related things for a few months around the holidays and try to enjoy, well, the holidays. Moving forward, I intend to at least start posting about once per week. After all, the state of information design these days provides me a lot of potential critiques.

Let us start with the news du jour, the application of tariffs on China and the delayed imposition on both Canada and Mexico. Firstly, let us be very clear what a tariff is. A tariff is a tax paid by importers or consumers on goods sourced from outside the country. In this case, we are talking about Canadian, Mexican, and Chinese imports and the United States slapping tariffs on goods from those countries. Foreign governments do not pay money to the United States. Neither Canada, nor Mexico, nor China will pay money to the United States.

You will.

You should expect your shopping costs to increase, whether that is on the price of gasoline (imported from Canada), fast fashion apparel (from China), or avocados (from Mexico). On the more durable goods side, homes are built with Canadian lumber and your automobiles with parts sourced from across North America—the reason why we negotiated NAFTA back in the 1990s.

Now that we have established what tariffs are, why is the Trump administration imposing them? Ostensibly because border security and fentanyl. What those two issues have to do with trade policy and economics…I have no idea. But a few news outlets created graphics showing US imports from our top-five trading partners.

First I saw this graphic from the New York Times. It is a variation of a streamgraph and it needs some work.

A streamgraph type chart from the New York Times

To start, at any point along the timeline, can you roughly get a sense of what the value for any country is? No. Because there is no y-axis to provide a sense of scale. Perhaps these are the top import sources and these are their share of the total imports? Read the fine print and…no. These are the countries with a minimum of 2% share in 2024, which is approximately 75% of US imports.

This graphic fails at clearly communicating the share of imports. You need to somehow extrapolate from the y-height in 2024 given the three direct labels for Canada, Mexico, and China what the values are at any other point in time or for any other country.

Nevertheless, the chart does a few things nicely. It does highlight the three countries of importance to the story, using colours instead of greys. That focuses your attention on the story, whilst leaving other countries of importance still available for your review. Secondly, the nature of this chart ranks the greatest share as opposed to a straight stacked area chart.

Overall, for me the chart fails on a number of fronts. You could argue it looks pretty, though.

The aforementioned stacked area chart (also not a favourite of mine for this sort of comparison) forces the designer to choose a starting country and then stack the other countries atop it.

A stacked area chart from the BBC

What this chart does really well, especially compared to the previous New York Times example, is provide context for all countries across all time periods by the inclusion of the y-axis. Like the Times graphic it focuses attention on Canada, Mexico, and China with colour and uses grey to de-emphasise the other countries. You can see here how the Times’ decision to exclude all countries below 2% can skew the visual impact of the chart, though here all countries below Japan (everything but the top-five) are grouped as other.

Personally, I find the specific data labels for Canada, Mexico, and China distracting and redundant. The y-axis provides the necessary framework to visually estimate the share. If the reader needs a value to the precision level of tenths, a table may be a better option.

I could not find one of the charts I thought I had bookmarked and so in an image search I found a chart from one of my former employers on the same topic (though it uses value instead of share) and it is worth a quick critique.

A stacked area chart from Euromonitor International

Towards the end of my time there, I was creating templates for more wide-screen content. My fear from an information design and data visualisation standpoint, however, was the increased stretch in simple, low data-intensity graphics. This chart incorporates just 42 data points and yet it stretches across 1200 pixels on my screen with a height of 500.

Compare that to the previous BBC graphic, which is also 1200 pixels, but has a greater height of 825 pixels. Those two dimensions give ratios of 2.4 for Euromonitor International and 1.455 for the BBC. Neither is the naturally aesthetically pleasing golden ratio of 1.618, but at least the BBC version is close to Tufte’s recommended 1.5–1.6. The idea behind this is that the greater the ratio, the softer the slope of the line. This can make it more difficult to compare lines. A steeper slope can emphasise changes over time, especially in a line chart. You can roughly compare this by looking at the last few years of the longer time span in the BBC graphic to the entirety of this graphic. You can more easily see the change in the y-axis because you have more pixels in which to show the change.
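To make the ratio argument concrete, here is a minimal sketch that computes both aspect ratios and the on-screen slope of an identical change in each chart. It assumes the plot area fills the whole image (it ignores margins, titles, and legends), and the segment I use, a rise of 10% of the y-range over 10% of the x-range, is purely illustrative rather than taken from either chart. The function names are mine.

```python
def aspect_ratio(width_px: float, height_px: float) -> float:
    """Width divided by height."""
    return width_px / height_px


def on_screen_slope(dy_fraction: float, dx_fraction: float,
                    width_px: float, height_px: float) -> float:
    """Pixel rise over pixel run for a segment spanning the given
    fractions of the y-axis and x-axis ranges."""
    return (dy_fraction * height_px) / (dx_fraction * width_px)


# Image dimensions as measured in the text (width, height in pixels).
charts = {
    "Euromonitor International": (1200, 500),
    "BBC": (1200, 825),
}

for name, (width, height) in charts.items():
    ratio = aspect_ratio(width, height)
    # Hypothetical segment: same data change drawn in both charts.
    slope = on_screen_slope(0.10, 0.10, width, height)
    print(f"{name}: ratio {ratio:.3f}, on-screen slope {slope:.3f}")
```

The same change in the data draws a visibly steeper line in the taller BBC chart than in the wider Euromonitor International one, which is the softening effect described above.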

Finally we get to another New York Times graphic. This one, however, is a more traditional line chart.

A line chart from the New York Times

And for my money, this is the best. The data is presented most clearly and the chart is the most legible and digestible. The colours clearly focus your attention on Canada, Mexico, and China. The use of lines instead of stacked area allows the top importer to “rise” to the top. You can track the rapid rise of Chinese imports from the late 1990s through to the first Trump administration and the imposition of tariffs in 2018; note the significant drop in the line. In fact you can see the impact in Mexico becoming the United States’ top trading partner in recent years.

Over the years, if I had a dollar for every time I was told someone wanted a graphic made “sexier” or with more “sizzle” or made “flashier”, I would have…a bigger bank account. The issue is that “cooler” graphics do not always lead to clearer graphics. Graphics that communicate the data better. And the guiding principle of information design and data visualisation should be to make your graphics clear rather than cool.

Credit for the New York Times streamgraph goes to Karl Russell.

Credit for the BBC graphic goes to the BBC graphics department.

Credit for the Euromonitor International graphic goes to Justinas Liuima.

Credit for the New York Times line chart goes to the New York Times.

Racing to the Final Finish Line

Thoroughbred racing is big business. And Philadelphia’s Parx Casino owns a racing track that, according to a recent article in the Philadelphia Inquirer, has seen a number of horse deaths. The article includes a single graphic worth noting, a bar chart showing the thoroughbred death rate. The graphic contrasts rising deaths at Parx with a national trend of declining deaths.

Traditionally rate statistics are shown using dots or lines. The idea is that a bar represents counting stats, i.e. how many total horses died. I understand the coloured bars present a more visually compelling graphic on the page, and so I can buy that reason if you are selling it.

Labelling each datapoint with a grey text label above the bar, however, is unnecessary. The labels create sparkling, distracting grey baubles above the important blue bars. If you need the specificity to the hundredths degree, use a table. This graphic is also interactive. The mouseover state is where a specific number can be provided, adding an additional layer or level of depth in a progressive disclosure of information.

Credit for the piece goes to Dylan Purcell.