Predicting…the Known Stats?

I have been trying to post more regularly here on Coffeespoons, but now that baseball’s postseason is in full swing—pun fully intended—my free time is spent watching balls and strikes at all hours of the day. (Though, with the Wild Card round over and the move from four to two games per day, my time will likely expand as the week winds down. Sort of. More on that in a moment.)

What I have noticed on a few broadcasts, however, is the announcing team touting Google’s ability to forecast a player’s chances of getting on base. Most recently, on Sunday afternoon, my mates and I were watching the Phillies–Mets contest when either the broadcaster announced or a graphic popped up on screen claiming that Google predicts Francisco Lindor has a 34% chance of getting on base in the plate appearance.

That can be a useful nugget of knowledge. And wow, it is crazy that Google can predict Lindor’s chances of getting on base.

Except it is not.

Francisco Lindor’s on-base percentage (OBP) for the 2024 season was 0.344. In other words, in 34.4% of his plate appearances (PAs), Lindor got a hit, took a walk, or was hit by a pitch. Over an entire season’s sample of 689 PAs, Lindor got on base about 34% of the time. Maybe Google was taking some other factors into account, but that is just the most recent example I can recall.
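For anyone who wants to check the arithmetic, here is a minimal Python sketch. It assumes the standard OBP formula (hits, walks, and hit-by-pitches over at-bats, walks, hit-by-pitches, and sacrifice flies) and simply rounds a season OBP to a whole percentage. The function names and the sample stat line are my own inventions for illustration. I have no idea how Google actually produces its number.

```python
def obp(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard on-base percentage: times on base over qualifying chances."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return times_on_base / chances


def as_broadcast_percent(obp_value):
    """Round a season OBP to a whole-number percentage, as a graphic might."""
    return round(obp_value * 100)


# A made-up stat line, purely to show the formula at work (not a real player).
example = obp(hits=150, walks=50, hit_by_pitch=5, at_bats=550, sac_flies=5)
print(f"{example:.3f}")

# The season OBP quoted in this post for Lindor.
print(as_broadcast_percent(0.344))  # 34
```

If the "prediction" is nothing more than the season OBP rounded off, this is all the modelling it takes.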

I wish I could recall which batter first clued me in to this situation. I want to say it was a high-OBP guy, and for whatever reason I pulled my mobile out and opened the batter’s page on Baseball Reference, only to find the prediction matched his OBP exactly.

Then it happened again. And again. And again.

Baseball is the greatest sport. One reason I love it is that you can use data and information to describe it. Plan for it. Play it. And sometimes predict it. Sometimes that works. Sometimes, when it doesn’t, it breaks your heart.

Baseball has reams of data and, yes, that data can feed into newer and cooler algorithms and models for predicting outcomes. (Outcomes that surely have nothing to do with the flood of sports gambling available on mobile phones.) But to me, it seems a bit disingenuous to call a statistic that has largely moved out of the realm of baseball nerds into the common understanding of the sport—thanks, Moneyball—a company’s new predictive statistic when that statistic has existed forever.

Separately, as I alluded to earlier, I shall not be posting for the next few weeks. I have a weekday wedding to attend later in the week, and then I am headed out of town for a few weeks and intend to do very little digital stuff. Plus, by the time I return, baseball’s postseason shall likely be over.

But in the meantime, I am going to be heading out this afternoon to meet some mates as they cheer on their local squad, the Philadelphia Phillies, as they play the Mets. (No, the Red Sox, yet again, did not make the postseason.)

As the first batter, Kyle Schwarber, steps to the plate, I predict he will have a 37% chance of getting on base. And look, his OBP is 0.366.

Author: Brendan Barry

I am a graphic designer who focuses on information design. My day job? Well, they asked me not to say. But to be clear, this blog is something I do on my own time and does not represent the views of…my employers. I think what I can say is that given my interest in information design (be it in the shape of clear charts, maps, diagrams, or wayfinding systems), I am fortunate that my day job focuses on data visualisation. Outside of work, I try to stay busy with personal design work. Away from the world of design, I have become an amateur genealogist and family historian. You will sometimes see that area of work bleed into my posts.