Sports and Games

Well that was a week. Let’s try to stay on the lighter side this Friday. Several weeks ago I was debating with several people about the difference between a game and a sport. I decided that the best way to try and capture our conversation was with a Venn diagram.

So in the interest of furthering that conversation, I’ve digitised that sketch and am presenting it here for everyone else to see and, if they want, comment upon.

No war games here.

Hopefully this weekend and next week are a bit calmer.

Credit for the piece is mine.

How the Globe’s Writers Voted

Yesterday we looked at a piece by the Boston Globe that mapped out all of David Ortiz’s home runs. We did that because he has just been voted into baseball’s Hall of Fame. But to be voted in means there must be votes and a few weeks after the deadline, the Globe posted an article about how that publication’s eligible voters, well, voted.

The graphic here was a simple table. But as I’ll always say, tables aren’t an inherently bad or easy-way-out form of data visualisation. They are great at organising information in such a way that you can quickly find or reference specific data points. For example, let’s say you wanted to find out whether or not a specific writer voted for a specific ballplayer.

Just don’t ask me for whom I would have voted…

Simple red check marks represent those players for whom the Globe’s eligible staff voted. I really like some of the columns on the left that provide context on the vote. For the unfamiliar, players can only remain on the list for up to ten years. And so for the first four, this was their last year of eligibility. None made the cut. Then there’s a column for the total number of votes made by the Globe’s staff. Following that is more context, the share of votes received in 2021. Here the magic number is 75% to be elected. Conversely, if you do not make 5% you drop off the following year. Almost all of those on their first year ballot failed to reach that threshold.
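The ballot rules described above can be sketched in a few lines of Python. This is only a rough illustration: the function name `ballot_outcome` is made up, and it assumes that exactly 75% counts as elected and exactly 5% is enough to stay on the ballot, which the paragraph above does not spell out.

```python
def ballot_outcome(vote_share: float, years_on_ballot: int) -> str:
    """Classify a player's result from the share of ballots naming them.

    vote_share is a fraction between 0 and 1; years_on_ballot counts the
    current year, so 10 means the final year of eligibility.
    """
    elect = 0.75     # share needed for election, per the text above
    stay = 0.05      # minimum share to stay on the ballot
    max_years = 10   # players can remain for up to ten years

    if vote_share >= elect:
        return "elected"
    if vote_share < stay:
        return "dropped (under 5%)"
    if years_on_ballot >= max_years:
        return "dropped (eligibility expired)"
    return "returns next year"
```

For example, a hypothetical first-year player named on 3% of ballots would come back as "dropped (under 5%)", while one at 66% in a tenth year would come back as "dropped (eligibility expired)".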

The only potential drawback to this table is that by the time you reach the end of the table, there are few check marks to create implicit rules or lines that guide you from writer to player. David Ortiz’s placement helps because his six check marks (remarkably, not all Globe writers voted for him) ground you for the only person below him (alphabetically) to receive a vote. And we need that because otherwise quickly linking Alex Rodriguez to Alex Speier would be difficult.

Finally below the table we have jump links to each writer’s writings about their selections. And if you’ll allow a brief screenshot of that…

Still don’t ask me

We have a nicely designed section here. Designers delineated each author’s section with red arrows that evoke the red stitching on a baseball. It’s a nice design touch. Then each author receives a headline and a small call out box inside which are the players, and their headshots, for whom the author voted. An initial dropped capital (drop cap), here a big red M, grabs the reader’s attention and draws them into the author’s own words.

Overall this was a solidly designed piece. I really enjoyed it. And for those who don’t follow the sport, the table is also an indicator of how divisive the voting can be. Even the Globe’s writers couldn’t unanimously agree on voting for David Ortiz.

Credit for the piece goes to Daigo Fujiwara and Ryan Huddle.

558 Dingers

Yesterday baseball writers elected David Ortiz of the Boston Red Sox, better known as Big Papi, to the Baseball Hall of Fame. I was trying to work on a thing for yesterday, but ran out of time. While I will attempt to return to that later, for now I want to share a simple interactive graphic from the Boston Globe. As the blog title suggests, it’s about the 558 career home runs Ortiz hit between his time with the Twins and the Red Sox. He hit 541 of those during the regular season, tacking on 17 more in the post season including his famous 2013 ALCS grand slam against the Detroit Tigers. (The one where the cop’s arms are in the air alongside Torii Hunter’s legs.)

That’s a lot of runs

Now you can see that Ortiz was a left-handed pull hitter with that home run concentration to right field, especially those wrapped around Fenway’s (in)famous Pesky Pole.

But with the number of dots you see inside the grounds at Fenway, you can also see the one downside of a chart like this. The graphic maps home runs hit at all Major League ballparks onto Fenway’s layout, so some dots land inside the park. Not to mention the role that the Green Monster plays: a lot of those line drives that leave the yard when hit to right field simply bounce off the Monster in left for doubles or the dreaded long single. But in part that’s why Ortiz also had ridiculous season numbers for extra base hits: all those Green Monster doubles. (Conversely, how many popups a mile in the sky came down into the Green Monster seats?)

You access this interactive piece by scrolling through the experience as the Globe chose 12 home runs to represent Ortiz’s entire career. I’m fortunate enough to remember watching several of them on the television.

Big Papi was a force to be reckoned with and watching him hit was entertainment. I’m very excited to see him enter the Hall of Fame.

This summer? It’s his effing Hall.

Credit for the piece goes to John Hancock.

Those Are Some Heavy Balls

Unfortunately, I don’t subscribe to Business Insider, but I saw this graphic on the Twitter and felt the need to share it. Primarily because baseball will almost certainly stop at midnight when the owners of the teams will impose a lockout (as opposed to players going on strike). And with that baseball will be on hold until the two parties resolve their current labour issues.

And at present that seems like it could take quite some time.

So on the eve of the lockout Bradford William Davis tweeted a link to an article he wrote, alas no subscription as aforementioned, but he did share one of the graphics therein.

Those are a lot of blue balls…

We have a basic dot plot charting the weight of the centre of baseballs, sorted by the month of game from which they were pulled.

The designer made a few interesting choices here. First, typographically, we have a few decisions around the type. I would have loved to have seen a bit of editing or design to eliminate the widow at the end of the graphic’s subtitle, that bit that just says “(blue)”. Do the descriptors in parentheses even need to be there when the designer included a legend immediately below? I find that one word incredibly distracting.

On the other hand, the designer chose to use a thin white outline around the text on the plot. Normally I’d really like this choice, because it can reduce some of the issues around legibility when lines intersect text, especially when they are the same colour. Here, however, the backgrounds are not white. I would have tried, for the top, using that light blue instead of white as the stroke for the outside of the letters. And on the bottom I would have tried the light pink. That would probably achieve the presumed desired effect of reducing the visual interference unintentionally created by the white. I also would have moved the top label up so it didn’t overlay the top dot.

As far as the dot plot itself goes, that works fine. I wonder if some transparency in the dots would have emphasised how many dots sit atop each other. Or maybe they could have clustered, but when overlapping moved horizontally off the vertical axis.
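The second idea above, nudging overlapping dots sideways off the vertical axis, can be sketched roughly as follows. This is not how the Business Insider designer built the chart; the function name `jitter_offsets`, the bin width, and the step size are all illustrative assumptions.

```python
from collections import defaultdict


def jitter_offsets(values, bin_width=0.1, step=1.0):
    """Return one horizontal offset per value so near-identical dots fan out.

    Values are grouped into bins of bin_width (in the data's own units).
    Within a bin the offsets run 0, +step, -step, +2*step, -2*step, ...
    so each cluster stays centred on the vertical axis.
    """
    seen = defaultdict(int)  # how many dots have already landed in each bin
    offsets = []
    for value in values:
        key = round(value / bin_width)
        k = seen[key]
        seen[key] += 1
        if k == 0:
            offsets.append(0.0)  # first dot in a bin sits on the axis
            continue
        magnitude = (k + 1) // 2          # 1, 1, 2, 2, 3, 3, ...
        sign = 1 if k % 2 == 1 else -1    # alternate right, then left
        offsets.append(sign * magnitude * step)
    return offsets
```

Each offset would then be added to the dot's x position when plotting. Transparency, the first idea, needs no calculation at all: it is usually a single opacity setting in whatever charting tool is used.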

Overall this was a really nice graphic with which to end this half of the baseball off season. Hopefully the lockout doesn’t last too long.

Credit for the piece goes to Taylor Tyson.

Data Analysis and Baseball

First, a brief housekeeping thing for my regular readers. It is that time of year, as I alluded to last week, where I’ll be taking quite a bit of holiday. This week that includes yesterday and Friday, so no posts. After that, unless I have the entire week off—and I do on a few occasions—it’s looking like three days’ worth of posts, Monday through Wednesday. Then I’m enjoying a number of four day weekends.

But to start this week, we have Game 6 of the World Series tonight between the Atlanta Braves and the Houston Astros. That should be the Braves vs. the Red Sox, but whatever. If you want your bats to fall asleep, you deserve to lose. Anyways, rest in peace, RemDawg.

Yesterday the BBC posted an article about baseball, which is weird in the first place because baseball is far more an American sport, played in relatively few countries. Here’s looking at you Japanese gold medal for the sport earlier this year. Nevertheless I fully enjoyed having a baseball article on the BBC homepage. But beyond that, it also combined baseball with history and with data and its visualisation.

You might say they hit the sweet spot of the bat.

There really isn’t much in the way of graphics, because we’re talking about work from the 1910s. So I recommend reading the piece, it’s fascinating. Overall it describes how Hugh Fullerton, a sportswriter, determined that the 1919 White Sox had thrown the World Series.

Fullerton, long story short, loved baseball and he loved data. He went to games well before the era of Statcast and recorded everything from pitches to hits and locations of batted balls. He used this to create mathematical models that helped him forecast winners and losers. And he was often right.

For the purposes of our blog post, he explained in 1910 how his system of notations worked and what it allowed him to see in terms of how games were won and lost. Below we have this screen capture of the only relevant graphic for our purposes.

Grooves on the diamond

In it we see the areas where the batter is likely safe or out depending upon where the ball is hit. Along the first and third base foul lines we have thin strips of what all baseball fans fear: doubles or triples down the line. If you look closely you can see the dark lines become small blobs near home plate. We’ve all seen those little tappers off the end of the bat that die, effectively a bunt.

Then in the outfield we have the two power alleys in right- and left-centre. When your favourite power hitter hits a blast deep to the outfield for a home run, it’s usually in one of those two areas.

We also have some light grey lines, which are more where batted balls are going to get through the infielders. We are talking ground balls up the middle and between the middle infielders and the corners. Of course this was baseball in the early 20th century. And while, yes, shifting was a thing, it was nowhere near as prevalent. Consequently defenders were usually lined up in regular positions. These correspond to those defensive alignments.

Finally the vast majority of the infield is coloured another dark grey, representing how infielders can usually soak up any groundball and make the play.

The whole article is well worth the read, but I loved this graphic from 1910 that explains (unshifted) baseball in the 21st century.

Credit for the piece goes to Hugh Fullerton.

Low Expectations

Today the 2021 Major League Baseball season begins its playoffs. Tomorrow we get the Los Angeles Dodgers and the St. Louis Cardinals. Why the Dodgers, the team with the second-best record in all of baseball, need to play a one-game play-in is dumb, but a subject for perhaps another post. Tonight, however, is the American League (AL) Wildcard game and it features one of the best rivalries in baseball if not American sports: the Boston Red Sox vs. the New York Yankees.

Full disclosure, as many of you know, I’m a Sox fan and consider the Yankees the Evil Empire. But at the beginning of the year, the consensus around the sport was that the Yankees would win first place in their division and be followed by the Tampa Bay Rays or the Toronto Blue Jays. The Red Sox would place fourth and the lowly Baltimore Orioles fifth. The Red Sox, as the consensus went, were, after gutting their team of top-flight talent and a no-good, rotten, despicable 2020 showing, nowhere near ready to reach the playoffs. The Yankees were an unstoppable offensive juggernaut.

When the 2021 season ended Sunday night, as the dust around home plate settled, the Rays dominated the AL East to take first. But it was the Red Sox that finished second and the Yankees who took third. Whilst the two teams had the same record, in head-to-head match-ups the Red Sox won more games than the Yankees, 10–9. Not bad for a team that everyone thought couldn’t make the playoffs and would be in fourth place.

That got me thinking though, how wrong were our expectations? After doing some Googling to find individual reports and finding a Red Sox twitter account (@RedSoxStats) that captured as many preseason forecasts as he could, I was ready to make a chart. The caveat here is that we don’t have data for all beat writers, who cover the Red Sox exclusively or almost exclusively on a daily basis, or even national media writers, who cover the Red Sox along with the rest of the sport and its teams. For example, ESPN polled 37 of its writers, but all we know is that 0 of 37 expected the Red Sox to make the playoffs. I don’t have a single estimate for the number of wins, which obviously determines who gets into said playoffs, for those 37 forecasts. Others, like CBS Sports, broke down each of their five writers’ rankings for the division and all five had the Red Sox finishing fourth. But again, we don’t have numbers of wins. So in a sense, if we could get numbers from back in the winter and early spring, this chart would look even crazier with the Red Sox being even more outperform-ier than they do here.

Dirty water

We should also remember that during September, in the lead-up to the playoffs, the Red Sox were struggling with a Covid-19 outbreak that put nearly half their starting roster on the Injured List (IL). The Sox had the backups to the backups starting alongside the backups, some of whom then also went on the IL with Covid-19, leading to signings of players who, despite being integral to the September success, are not eligible to play in the playoffs due to when they signed. José Iglesias brought some 2013 magic to be sure. Earlier in the year, MLB would postpone games when significant numbers of players were unavailable, but the Red Sox, for whatever reason, had to play every game. And there were instances where players started the game, but their tests came back positive partway through and they had to be removed from the field in the middle of the game.

I’m not certain where I stand on how much managers influence the win-loss record in baseball. But if the Sox manager, Alex Cora, doesn’t at least get some nods for being manager of the year, I’ll be truly shocked.

The Red Sox are not a great team. This is not the 2018 behemoth, but rather an early rebuild for a hopefully competitive team in 2023. Their defence is not great. They lack depth in the rotation and the bullpen. I, for one, never doubted their offence—2020 surely had to have been a pandemic fluke. But I had serious questions about their starting rotation. Ultimately the rotation proved itself to be…adequate. And while they played through Covid-19 and kept their heads above water in September, the last few weeks were, at times, hard to watch. The Yankees swept them at Fenway, site of tonight’s game, just last weekend. Of late, the Yankees have been the better team. And all year long, the Red Sox played less competitively than I’d like against the other teams that made the playoffs.

I don’t expect them to win let alone make the World Series, but nobody expected them to be here anyway. Maybe they still have a few more surprises in them. After all, anything can happen in October baseball.

Credit for the piece is mine.

Sankey Shows Starters Sticking with Sticky Stuff

I spent way more time trying to craft that title than I’d like to admit. Headline writing is not easy.

Quick little piece today about Sankey diagrams. I love them. You often see them described as flow diagrams—this piece is in the article we’ll get to shortly—but they are more of a subset within a flow diagram. What sets Sankeys apart is their use of proportional strokes or widths of the directional arrows to indicate share of movement.

The graphic in question comes from an article about Major League Baseball’s (MLB’s) problem with “sticky stuff”. For the unfamiliar, sticky stuff is a broad term for foreign substances pitchers put on their fingers to provide better grip on the baseball. A better grip makes it easier to create movement like sliding and sinking in a pitch and therefore makes it harder for a hitter to hit it. Back when I was a wannabe pitcher, it was spitballs and scuff balls. Now professionals use things like Spider Tack. These are substances that allow you to put the ball in the palm of your hand, then turn your hand over to face the ground and not have the ball fall out of your hand.

So the graphic looks at starting pitchers and how their spin rate, the quantifiable measure impacted by sticky stuff, of their fastballs has changed since MLB instituted a ban on sticky stuff. (It had actually long been in place, see spitballs for example, but had rarely been enforced.)

Showing a small number of pitchers have managed to increase their fastball spin rates

This graphic explores how 223 pitchers saw their spin rates change in the first two months after the change in policy was announced and then in the nearly one month after that period.

Sankeys use proportional width not just to show movement from category to category but the important element of what share of which category moves to which category. For example, we can see a little less than half of starting pitchers saw their spin rates stay the same after the policy change and another almost equal group saw their spin rates decrease. That’s probably a sign they were using sticky stuff and stopped lest they get caught.

But we can then see of that group, maybe 1/6 then saw their spin rates increase again over the last month. That could be a sign that they have found a way to evade the ban. Though it could also be they’ve found new ways of gripping or throwing the baseball. Spin rate alone does not prove sticky stuff usage.

Similarly, we can see that in the group that maintained their spin rate, a small group has found a way to increase it. Finally, a small fraction of the original 223 saw their spin rates increase and a fraction of that group has seen their spin rates increase even further.
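The proportional widths that set a Sankey apart come from a simple tally: count how many items move between each pair of categories, then express each count as a share of its source category. A minimal sketch, assuming one (first period, second period) label pair per pitcher. The function name `sankey_links` and the toy data are invented for illustration and are not the Athletic's figures.

```python
from collections import Counter


def sankey_links(transitions):
    """Count flows between stages and express each as a share of its source.

    transitions is a list of (first_stage, second_stage) labels, one pair
    per pitcher. Returns {(source, target): (count, share_of_source)}.
    """
    link_counts = Counter(transitions)
    source_totals = Counter(source for source, _ in transitions)
    return {
        (source, target): (count, count / source_totals[source])
        for (source, target), count in link_counts.items()
    }


# Made-up toy data, NOT the numbers behind the graphic discussed here.
toy = (
    [("decreased", "decreased")] * 5
    + [("decreased", "increased")] * 1
    + [("same", "same")] * 4
    + [("same", "increased")] * 2
)
links = sankey_links(toy)
```

The count sets how thick each band is drawn, and the share is what lets a reader say something like "about one in six of that group went back up".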

This was just a really nice graphic to see in an article from the Athletic about sticky stuff and its potential return.

Credit for the piece goes to Max Bay.

Ranking the Red Sox Prospects

My regular readers will know that I am a fan of the Boston Red Sox, an American baseball team located in Boston, Massachusetts. I would consider myself a bit more involved than a casual fan in that I keep tabs on the team’s prospects.

For those unfamiliar with baseball, the sport works by keeping development pipelines of young talent fed through what we call a farm system. In essence a number of teams owned or contractually linked to the Major League team develop young players until they are ready to debut at the sport’s highest level.

Very few of the total number of players in the system will ever get called up to “the Show”. In fact, in the history of the sport only 20,000 men have reached that level. Most of the rest will peak somewhere in the Minor Leagues. Most that reach the Majors will have been at some point prospects. And so to keep tabs on your team’s prospects and farm system sets one apart, in my mind, from the casual fan who simply knows a few of the team’s star players and enjoys a hot dog and a pint of beer at the stadium a few times a summer.

Red Sox fans are fortunate to have a website dedicated to coverage of Boston’s farm system, SoxProspects.com. They rank the system’s Top 60 prospects using their own methodology and research and publish the list online for fans like myself to enjoy.

Last week they updated their rankings. Long story short, the pandemic has impacted baseball and the development of young players. Consequently, the rankings changed significantly. What I really wanted to see was a visualisation of all the changes. So I took it upon myself to do just that using their data.

Hopefully we get a good player or two out of this

Now, if you also happen to be a Red Sox fan, I highly recommend their site. It’s fantastic. Normally I would take the train up to Trenton and see the Portland affiliate when it played there, but the Trenton team no longer exists. I’m not sure when I’ll get to see a Red Sox minor league team again. But hopefully sometime soon, because there look to be some good players coming up.

So I’ll be looking forward to, hopefully, a good run of contending teams in the coming years.

Credit for the piece is mine.

Baseball’s Injury Problem

Last week, Ken Rosenthal of the Athletic wrote an article examining the recent spate of injuries in Major League Baseball. For those interested in the sport, the article is well worth the read. For the unfamiliar, baseball played only about 1/3 of the usual number of games last year due to Covid-19. This year, pitcher after pitcher seems to be falling prey to arm troubles. Position players are straining hamstrings, quads, and other muscles I’ve never heard of let alone used over the last year. And joking aside, therein is thought to be the problem.

And the evidence, in part, shows that we are seeing an increase in the numbers of injuries. But 2020 may not be as much of a problem as youngsters throwing baseballs near 100 mph. But I digress. The article contained a table detailing the numbers of injuries for certain body parts in the first month (April) of the season in both 2021 and 2019, the last comparable season due to Covid-19.

To be fair, the table was nice, but in the exhaustion of post-second dose shot last weekend, I sketched out some things and decided to turn it into a proper post.

Ouch.

Credit for the piece is mine.

Expansion Teams in Baseball

I was not planning on posting this today, because I was—am?—still working on it. But there was some baseball news last night that prompted me to export what I had to try and get this live.

For a little while now I’ve been wondering why a number of baseball stars, albeit in their later years, are still looking for employment. Some are pretty obvious in that they are facing legal troubles. Some may have high demands that ball clubs are not willing to meet. Some may have reasonable demands but the clubs are just being incredibly cheap. Or it may be none of those. Or some combination of those. But when you see some of the players some teams put on the field each night, you can’t tell me some of these free agents wouldn’t be better options.

Separately, I also tend to think baseball needs to expand and add some new clubs. But they won’t until the Oakland Athletics and Tampa Bay Rays resolve their stadium issues.

But what if…

Well a normal expansion would include two teams to keep an even balance. The new teams would likely use some kind of draft to select players from the rosters of other teams, with a certain number of players almost certainly protected. But what if we just used those unsigned ball players?

Anibal Sanchez is the guy messing this up. He’s been a free agent for some time now but is reportedly going to sign by the end of this week, perhaps today. So with him and everyone else, could we field two expansion teams?

Kinda, yeah.

First up, the Charlotte Piedmonters.

The Charlotte Piedmonters could also be looking for a new name.

Not a great team, nor would we expect it to be, as all the really good free agents have already been signed. But these former stars, award winners, and fan favourites may have just enough left in the tank to make for some competitive games if all goes well. My readers who happen to be fellow baseball fans will probably recognise most of these names, though I’ll admit a number of the relief pitchers are new to me. I can figure out basically everything but a centre fielder. But you could probably get somebody from an independent league or international league or just convert somebody.

I used projected Wins Above Replacement (WAR) to determine how good the players would be. For non-baseball fans, WAR is a value you can use to determine how good a player is relative to an average replacement player. Somebody with the value 0 to 1 is a scrub or bench player. Take any average ballplayer and sub them in and you wouldn’t know the difference. 2s and 3s are solid role playing guys, but not likely stars. Stars get into the picture around 4 and your best players are probably 5 to 6 or higher.
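The rule of thumb above can be written as a small lookup. This is a rough sketch, not an official scale: the cut-offs are one reading of the paragraph, the name `war_tier` is made up, and because the post does not describe the range between 1 and 2, that label is a placeholder assumption.

```python
from bisect import bisect_right

# Cut-offs are one reading of the rule of thumb above. The post leaves the
# 1-to-2 range undescribed, so that label is a placeholder assumption.
CUTOFFS = [1.0, 2.0, 4.0, 5.0]
LABELS = [
    "scrub or bench player",   # below 1
    "fringe (not described)",  # 1 up to 2
    "solid role player",       # 2 up to 4
    "star",                    # 4 up to 5
    "among the best",          # 5 and above
]


def war_tier(war: float) -> str:
    """Map a projected WAR value to a rough descriptive tier."""
    return LABELS[bisect_right(CUTOFFS, war)]
```

Calling `war_tier(3.2)` returns "solid role player", and `war_tier(0.4)` returns "scrub or bench player".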

In Charlotte, nobody has a WAR higher than Rick Porcello’s 1.4. In other words, he’s a better than average pitcher, but not by much. Tyler Flowers: a better than average catcher, but not by much. Homer Bailey: barely better than average starting pitcher. Everyone else, generally you could sub them out and not know the difference. But, crucially for our purposes, they are not below average players. Some of those are still on the market, but I didn’t assign them to Charlotte.

Now if Charlotte gets a team, so does Portland, Oregon: the Portland Lumberjacks.

Again, I’m open to name suggestions.

Here you can see Anibal Sanchez as the third man in the rotation. You can also see that the rotation here is the weakest part. For Charlotte you could get away with a bullpen game every five days. But two bullpen days? Well, take a look at the Boston Red Sox in 2020 and that pitching dumpster fire and you’ll see what having only two or three starters can do. (Though the relief starters they did use were all worse than the people on these lists, which just makes my point that there are talented if not star-level players available.)

Neither of these teams would be good. You can imagine a team like Charlotte getting beat almost every night in the AL East—except by Baltimore. The NL East might be a bit easier. And Portland in the NL West would be similarly a punching bag—except by Colorado probably. But dump either into the AL or NL Central and who knows.

Two teams is clearly a stretch. So what if we just made one? What if we brought back the Montreal Expos? Sure, it messes up the schedule, but we get to pick the best players from Charlotte and Portland.

No new name needed.

The result is a team that is significantly improved. That doesn’t mean very good. These Expos wouldn’t make the playoffs. But the rotation is full of guys who could be, at best, solid middle- to, more likely, back-end starters. The lineup, well, the lineup would still be mostly replacement level players, a.k.a. scrubs, with two exceptions. But with past track records, it’s not impossible to imagine a few of these players having a better than projected year.
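Merging two rosters into one by taking the best players at each position can be sketched as below. It assumes that "best" means highest projected WAR, which is the measure used throughout this post; the function name, the position codes, and the placeholder players are invented, and the actual selection may well have involved more judgement than a sort.

```python
def best_by_position(players, slots):
    """Pick the top players at each position by projected WAR.

    players is a list of (name, position, war) tuples pooled from every
    roster; slots maps a position to how many players to keep there.
    Returns {position: [names, best first]}.
    """
    chosen = {}
    for position, keep in slots.items():
        pool = [p for p in players if p[1] == position]
        pool.sort(key=lambda p: p[2], reverse=True)  # highest WAR first
        chosen[position] = [name for name, _, _ in pool[:keep]]
    return chosen


# Placeholder names and WAR values, purely hypothetical.
pooled = [
    ("Player A", "SP", 1.4),
    ("Player B", "SP", 0.9),
    ("Player C", "SP", 1.1),
    ("Player D", "C", 1.2),
    ("Player E", "C", 0.3),
]
roster = best_by_position(pooled, {"SP": 2, "C": 1})
```

With the placeholder data, the two starting pitcher slots go to Player A and Player C, and the catcher slot goes to Player D.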

On paper, they still wouldn’t be as good as the worst team in baseball (by WAR), the Pirates. But Pittsburgh also doesn’t have a centre fielder, so…

Anyway, I was going to try and do some more analysis beyond using WAR, but I wanted to get this out before Sanchez signed this week.

I also got to add Oliver Perez, who despite having a good year was released by Cleveland today. Boston needs a solid lefty reliever for the middle innings, and I hope they pick up Perez and option Josh Taylor down to Worcester.

Credit for the piece is mine.