World Cup Match Probabilities

The World Cup has had some impressive matches and some stunners. (And the two are not mutually exclusive.) But if you are like me and have to work during most of the broadcasts, how can you follow along? Well, thankfully FiveThirtyEight put together a nice statistical model that provides the probability of a team winning, or drawing, in real time.

Looking pretty good for Portugal this morning…

The design is fairly simple: a small table with the score and probability followed by a chart drawn as the match goes on. (Clearly I took this image at the half.)

I included a snippet of the table below to show the other work the FiveThirtyEight team put out there. You can explore the standings, the screenshot above, as well as the matches and then the brackets later in the competition.

The table makes nice use of the heat map approach to show who is likely to make each of the different stages of the competition. Like I said the other day, they are high on Brazil, because Brazil. But a little lower on Germany. But never count Germany out.

Shouldn’t Iran be in the top slot?

The only unclear thing to me in the table? The sorting mechanism. Group B, at least whilst the Portugal match is ongoing, should probably have Iran at the top. After all, as of writing, it is the only team in the group to have won a match. The only thing I can guess is that it has to do with an overall likelihood to advance to the next round. I highly doubt that Iran will defeat either Spain or Portugal. But as with many knockout-style championships, anything can happen in a single match sample size.

Credit for the piece goes to Jay Boice, Rachael Dottle, Andrei Scheinkman, Gus Wezerek, and Julia Wolfe.

A Wetter Midwest

Here in Philadelphia, I think yesterday was the first day it had not rained in over a week. Not that every day was a drenching storm, but at least showers passed through along with some downpours and definitely grey skies. But what about my old home, Chicago?

Well, FiveThirtyEight turned to a longer-term look and examined how over the century the amount of rainfall in the upper Midwest has been increasing. We are actually looking at the same places the Post looked at a few days ago. But instead of political maps, we have rainfall maps.

This one in particular is weird.

Water water everywhere

I get why they have the map, to show the geographic distribution of the rain gauges that collect the data. And those are site specific, not statewide. But did the designer have to choose area?

We know that area is a less than ideal way of allowing users to compare data points. And as I just noted, a choropleth, even at say the county level, is out of the question. But what about little squares? Or circles? Could colour have been used to encode the same data instead of size? And then we would likely have fewer overlapping triangles.
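To put a rough number on why area is hard to compare, here is a minimal sketch. It assumes the markers are equilateral triangles whose area is proportional to the value; I do not know the map's actual shapes or scale, and `side_for_area` and `area_per_unit` are names made up for this illustration.

```python
import math


def side_for_area(value, area_per_unit=1.0):
    """Side length of an equilateral triangle whose AREA encodes `value`.

    Assumption for illustration only: area = value * area_per_unit.
    Area of an equilateral triangle with side s is (sqrt(3) / 4) * s**2.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    area = value * area_per_unit
    return math.sqrt(4.0 * area / math.sqrt(3.0))


# A reading four times larger gets a triangle only twice as wide and tall,
# which is part of why readers tend to underestimate differences in area.
small = side_for_area(1.0)
large = side_for_area(4.0)
print(large / small)
```

Colour or position would not compress the difference this way, which is the trade-off the questions above are getting at.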

I suppose the argument is that the big triangles make a bigger visual impact. But they do so at the cost of comparable data points across the Midwest. Maybe the designer chose the area of triangles because there were too few gauges across the country. I am not sure, but for me the triangles are not quite on point.

That said, the graphics throughout the rest of the article are quite good, especially the opening scatterplots. They are not the sexiest of charts, but they clearly show a trend towards a wetter climate.

Credit for the piece goes to Ella Koeze.

Albert Pujols Isn’t Too Bad at That Baseball Thing

On Friday Albert Pujols joined the very elite club of baseball players who have managed 3000 hits in their career. Thankfully FiveThirtyEight covered it with a few graphics in an article that pointed out just how hard it is to do. Especially because, and I did not know this, Pujols did it in a not terribly common fashion. (Funny story, I had to explain this past weekend how Randy Johnson was a ridiculous pitcher, in the lots-of-strikeouts-and-also-exploded-a-bird way.)

My video game version of me would probably be on there if only those games lasted more than one season…

The piece uses a ternary plot, which we can also just call a triangle chart because it is, you know, in the shape of an equilateral triangle, to look at three components of Pujols’ hit skill.

There are different types of hitters in baseball. The guys who crush home runs all the time, the guys who hit singles all the time, guys who walk a lot. (Technically a walk is not a hit, but they are still getting on base.) There are fancy metrics that can be used to tease out just how much power is in a person’s game, and when you compare that to the batting average and to their walk rate, you can see clusters of players.
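For anyone wondering how three numbers end up as a single dot inside a triangle, here is a minimal sketch of the usual ternary mapping. It assumes the three components have already been put on comparable, non-negative scales; how FiveThirtyEight scaled power, average, and walks is not something I know, and the function name and corner layout are my own choices for illustration.

```python
import math


def ternary_to_xy(a, b, c):
    """Map three non-negative components to a point in an equilateral triangle.

    Corner layout (an arbitrary choice for this sketch):
      all `a` -> (0, 0), all `b` -> (1, 0), all `c` -> (0.5, sqrt(3) / 2).
    """
    total = a + b + c
    if min(a, b, c) < 0 or total <= 0:
        raise ValueError("components must be non-negative and not all zero")
    # Normalise so the three shares sum to 1; only the mix matters, not the size.
    b_share = b / total
    c_share = c / total
    x = b_share + 0.5 * c_share
    y = (math.sqrt(3.0) / 2.0) * c_share
    return x, y


# A perfectly balanced player lands in the middle of the triangle.
print(ternary_to_xy(1, 1, 1))
```

A player dominated by one component drifts towards that corner, which is why the clusters in the chart read as types of hitter.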

These kinds of charts can be difficult to read: what does it mean for a player to be in a certain area of the chart? But what the designer did really well here is label an example of the type of player. Ichiro, called out for being a singles machine, is notable because he just sort-of-retired last week. He also has something like another 1500 hits back in Japan. That guy can hit.

Credit for the piece goes to Neil Paine and Rachael Dottle.

English Premier League’s Lack of Premier-ness

This piece will make a ton of sense to my British and European readers, likely less so to those of you from the States. The English Premier League has been not so great at finishing well let alone winning in the Champions League.

Super briefly, English football—soccer—has a whole bunch of teams that play at different levels. Kind of like the US minor leagues, but without the affiliation of minor league teams to major league teams. That is, every team for itself. The Premier League is the top rung. (Every year, the worst teams in the Premier League are dropped into the minors and the very best from the minors move up into the Premier League.) This league includes the ones even Americans have heard of: Manchester, Arsenal, Chelsea. And maybe even Liverpool. Liverpool is playing today to make it into the Champions League finals.

(Full disclosure: I always say if I had to pick an English team to follow it would be Liverpool. Why? Because they are owned by Fenway Sports Group, the same group that owns the Boston Red Sox.)

The thing is that as well known as many of these teams are, they have not been faring well in the Champions League, which is like the Premier League but for all of European football. That is, the best teams from every top league in all of Europe compete for a European trophy. FiveThirtyEight explored some reasons why, but also included a nice graphic to showcase the relative failures of the Premier League teams.

Making it through the Champions League…

The chart makes nice use of grouped bar charts showing the number of teams from each league at each stage of the playoffs. The designers made good use of labelling, especially at the top to indicate to which country each league belongs. My only question would be whether these make sense from the top down, as they presently are, or if they would work better bottom up, in that the winning team has to climb its way to victory.

To be honest, I am not really sure which approach would work best. I think it might be even odds. Either way, Liverpool plays Roma later today.

Credit for the piece goes to Tim Wigmore.

Down on the Farms

Just a neat little piece today from FiveThirtyEight. They take a look at the potential impact of the Trump administration’s proposed tariffs on the farm vote in the United States. The screenshot of the table shows how the farm population compares to Trump’s margin of victory in 2016.

Farming clearly isn’t big in Alaska…

The three states at the top? The very same Pennsylvania, Wisconsin, and Michigan about which we hear so often. Yes, Pennsylvania does have large cities like Philadelphia and Pittsburgh, but agriculture is an important part of its economy. So if the tariffs or the reprisals to the tariffs have any significant impact on the livelihood of farmers, that could be enough, all things being equal, to flip those states.

About the design, I think the inclusion of the mini-bar chart helps tremendously. Tables are great for organising information, but scanning over and through cell after cell of black text can hide patterns. The visualisation of those patterns at the end of each row helps the user tremendously, by making it very clear why those three states were highlighted.

Credit for the piece goes to Rebecca Shimoni Stoil.

Deaths in America

Yesterday was murders in London and New York. Today, we have a nice article from FiveThirtyEight about deaths more broadly in America. If you recall, my point yesterday was that not all graphics need to be full column width. And this article takes that approach—some graphics are full width whereas others are not.

This screenshot shows a nice line chart. While the graphic sits in the full column, the actual chart is only about half the width of the graphic. I think the only thing that does not sit well with me is the alignment of the chart below the header. I probably would align the two, as the current placement creates an odd spacing to the left of the chart. But I applaud the restraint in not making the line charts full width, as that would mask the vertical change in the data set.

The screenshot is of the graphic’s full width. Note the lines only go a little over half the width.

Meanwhile, the article’s maps all sit in the full column. But my favourite graphic of the whole set sits at the very end of the piece. It examines respiratory deaths in a tabular format. But it makes a fantastic use of sparklines to show the trend leading towards the final number in the row.

Loving the sparklines…

Credit for the piece goes to Ella Koeze and Anna Maria Barry-Jester.

Baseball Is Back

Praise the (baseball) gods.

The 2018 season starts today with I think every team playing—the Red Sox open down in St. Petersburg against the Rays. So today’s post is on the light side as I could not find the awesomest baseball graphic. But FiveThirtyEight did at least preview the season and ran some projections. Naturally, I disagree with their projections. But I think finally this year the Yankees will be more of a threat to the Red Sox than they have been in years. The rivalry is back. (Though it never really went away in my mind.)

Switch numbers one and two and I think this might be okay…

The above is the screenshot for the American League East, because Boston. But, the rest of the AL is on that page as well. For those of you from my more National League-following cities like Philadelphia and Chicago, FiveThirtyEight also previewed the NL divisions here.

Credit for the piece goes to Neil Paine.

Basketball Tournament Locations

My apologies to those of you who are big fans of basketball. Since it is not really my sport of choice, I had no idea that March Madness started yesterday. Otherwise, I would have posted this earlier.

FiveThirtyEight analysed the locations of basketball conferences’ tournaments relative to the geographic centres of said conferences. As it turns out many conferences do not play near their centre. The article goes on to suggest cities near their conference centre.
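As a rough illustration of the idea, here is a minimal sketch of one way to find a geographic centre and measure how far a host city sits from it. The article's exact method is not something I know; this assumes an unweighted spherical centroid of member locations and great-circle (haversine) distance, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a standard approximation


def geographic_centre(points):
    """Unweighted spherical centroid of (lat, lon) pairs given in degrees."""
    if not points:
        raise ValueError("need at least one point")
    x = y = z = 0.0
    for lat, lon in points:
        phi, lam = math.radians(lat), math.radians(lon)
        # Convert each location to a unit vector and accumulate.
        x += math.cos(phi) * math.cos(lam)
        y += math.cos(phi) * math.sin(lam)
        z += math.sin(phi)
    n = len(points)
    x, y, z = x / n, y / n, z / n
    # Project the averaged vector back onto the sphere.
    lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1 = (math.radians(v) for v in a)
    lat2, lon2 = (math.radians(v) for v in b)
    h = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Made-up coordinates, purely to show the call pattern.
members = [(40.0, -75.0), (42.0, -88.0), (39.0, -84.0)]
host = (40.7, -74.0)
print(haversine_km(geographic_centre(members), host))
```

Weighting each school by enrolment or fan base would shift the centre, so the unweighted version is only the simplest reading of "geographic centre".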

All the locations…

As a geography nerd, I found the whole exercise fascinating. I only have a few peccadillos with this. First, I would have loved to see a follow-up map with their suggested locations plotted against the centre and the actual tournament location.

Secondly, the first three maps highlight how the tournaments are being played in New York. But in the first they label it as Brooklyn. Then the next two just use NYC. Since Brooklyn is part of New York, why use Brooklyn instead of NYC? Or instead of NYC, use Manhattan.

Again, my apologies for not realising this should have been up earlier this week. So for those of you who do participate in the bracket challenges, hopefully your brackets are still more or less intact. And best of luck with the rest of the tournament.

Credit for the piece goes to Neil Paine and Ella Koeze.

Baseball: The Bouncier Edition

Baseball is finally back as Spring Training continues to push through March, getting us closer to Opening Day. But one lingering question from last year remains: why the increase in power and home runs? While Major League Baseball (MLB) says there has been no change to the baseball, many think otherwise.

FiveThirtyEight published a piece looking at the insides of eight baseballs: four predating the power surge, which began after the 2015 All-Star Game, and three balls since, in addition to a newly manufactured and unused ball.

The piece uses a few graphics to showcase the differences, including this cutaway diagram highlighting the different layers of a baseball.

What’s inside a baseball?

But the real gem is the x-ray photography done to examine the balls without cutting into them. Thankfully for those of us unfamiliar with x-rays, the designers provided a legend showing the clearly different core densities in the balls.

Old balls vs. new balls

If you are interested in baseball, and in particular the increase in home runs, the whole article is worth the short read. And if you’re not, well, the x-ray views of baseballs are still pretty neat.

Credit for the piece goes to Rob Arthur and Tim Dix.

US Olympic Performance

I don’t know if you heard, but the Winter Olympics just concluded. I’m admittedly not a huge fan of the Winter Olympics, but that doesn’t mean I didn’t keep my eye on some of the stories coming out of the coverage. One that I liked was this piece from FiveThirtyEight.

US performance was lagging at this point

It was about halfway through the Olympics and the US was not doing terribly well. The chart does a great job of showing how various countries were performing, whether over or under their expected total medal winnings. It did this through a filled bar chart with a bar-specific benchmark line. It was a nice combination of a couple of different techniques to incorporate not just the usual above or below the trend, but also the actual amounts.

Credit for the piece goes to Gus Wezerek.