Can We Pop Our Political Bubbles?

It’s no secret that Americans, and likely Western communities more broadly, live in bubbles, political ones among them. And so I want to thank one of my mates for sending me the link to this opinion piece about political bubbles from the New York Times.

The piece is fairly short, but begins with an interactive piece that allows you to plot your address and examine whether or not you live in a political bubble. Using my flat in Philadelphia, the map shows lots of little blue dots, representing Democratic voters, near the marker for my address and comparatively few red dots for Republicans.

An island of blue in a sea of red.

If you then look a bit more broadly, you can see that by summing up the dots, my geographic bubble is largely a political bubble, as only 13% of my neighbours are Republicans. Not terribly surprising for a Democratic city.

A certain lack of diversity in political thought.

And while the piece does then zoom back out a wee bit, it tries to show me that I don’t live too far from a politically integrated bubble. Except in this case, it’s across a decent sized river and getting there isn’t the easiest thing in the world. I’m not headed to Gloucester anytime soon.

Things are better in Jersey?

These interactives serve the purpose of drawing the user into the article, which continues explaining some of the causes of this political segregation, both policy (redlining) and personal choice (lifestyle). The approach works, because it gives us the most relatable story in a large dataset: ourselves. We’re now emotionally or intellectually invested in the idea, in this case political bubbles, and want to learn all about it. Because the more you know…

The piece uses the same type of map to showcase the bubbles more broadly from the Bay Area to the plains of Wyoming. (No surprises in the nature of those political bubbles.) It wraps up by showing how politicians can use the geography of our political bubbles to create political geographies via gerrymandering that shore up their political careers by creating safe districts. The authors use a gerrymandered northeastern Ohio district that encompasses two cities, Cleveland and Akron, to make that point.

That’s in part why I’m in favour of apolitical, independent boundary commissions to create more competitive congressional districts. Personally, I would have been fascinated to see how Pennsylvania’s congressional districts, redrawn in 2018 by the Pennsylvania Supreme Court, after the court found the gerrymandered districts of 2011 unconstitutional, created political competition between parties instead of within parties. But I digress.

And then for kicks, I looked at how my flat in Chicago compared.

Less island of blue and sea of red, because a lake of blue water alters that geography.

Not surprisingly, my neighbourhood in Lakeview was another political bubble, though this one even more Democratic than my current one.

Lakeview is even more Democratic than Logan Square, Philly’s Logan Square that is.

But if I had wanted to move to an integrated political bubble, instead of Philadelphia, I could have moved to…Jefferson Park.

Because everyone can agree Polish food is good food.

Credit for the piece goes to Gus Wezerek, Ryan D. Enos and Jacob Brown.

Choropleths…Again

Admittedly, I was trying to find a data set for a piece, but couldn’t find one. So instead for today’s post I’ll turn to something that’s been sitting in my bookmarks for a little while now. It’s a choropleth map from the US Census Bureau looking at population change between the censuses.

Unequal growth

The reason I have it bookmarked is for the apportionment map, but I will save apportionment for another post because, well, it’s complicated. But map colours are a thing we’ve been discussing of late and we can extend that conversation here.

What I find interesting about this map is how they used a very dark blue-grey colour for their positive growth and an orange that is a fair bit brighter for negative growth, or population loss. And because of that difference in brightness, the orange really jumps out at you.

To be fair, that’s ideal if you’re trying to talk about where state populations are shrinking, because it focuses attention on declines. But, if you’re trying to present a more neutral position, like this seems to be, that colour choice might not be ideal.

Another issue is that if you look at the legend it simply says loss for that orange. But, look above and you’ll see four bins clearly delimited by ranges of percents for the positive growth. If we are trying to present a more neutral story, the use of the orange places it visually somewhere near the top of that blue-grey spectrum.

If you look at the percentages, however, Michigan’s population decline was 0.6% and Puerto Rico’s 2.2%. If this map used a legend that treated positive and negative growth equally, you would place that one state and one should-be state in a presumably light orange. In scale, their negative growth is comparable to the positive growth of something like Ohio, which is in the lightest blue-grey available.

Consequently, this map is a little bit misleading when it comes to negative growth.
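To make the point about a symmetric legend concrete, here is a minimal sketch in Python. The bin edges and colour names are my own placeholders, not the Census Bureau's actual legend breaks; the only thing the sketch tries to show is binning by the size of the change first and choosing the hue from the sign second.

```python
from bisect import bisect_right

# Hypothetical bin edges, in percent. These are placeholders for illustration,
# not the breaks used on the Census Bureau map.
EDGES = [5.0, 10.0, 15.0]

# Four bins per direction, mirroring the four bins the map shows for growth.
POSITIVE = ["lightest blue-grey", "light blue-grey", "dark blue-grey", "darkest blue-grey"]
NEGATIVE = ["lightest orange", "light orange", "dark orange", "darkest orange"]


def symmetric_bin(change_pct: float) -> str:
    """Pick the bin from the magnitude of the change, then the hue from its sign."""
    index = bisect_right(EDGES, abs(change_pct))
    palette = POSITIVE if change_pct >= 0 else NEGATIVE
    return palette[index]


# The two declines cited above both fall in the lightest bin under these edges.
for place, change in [("Michigan", -0.6), ("Puerto Rico", -2.2)]:
    print(place, symmetric_bin(change))
```

Under those assumed edges, both Michigan and Puerto Rico come out in the lightest orange, the mirror image of a state with similarly small growth.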

Credit for the piece goes to the Census Bureau graphics team.

The Armchair General…

Manager.

Of the New England Patriots.

As many of my long-term readers know, I am really only a one sport kind of guy. And that sport is baseball. American football, well, I’ve seen one match live and in person and it was…boring. But it’s a big deal in America. And this is the time of the year when teams begin signing free agents.

I happened to be reading the Boston Globe for news on the Red Sox, my team, when I saw a link to this interactive tool allowing users to build their own roster with free agent signings.

Go Pats

Conceptually, the piece is fairly simple. There is a filterable list of free agents, broken out by whether their forecast signing value falls into the high, middle, or low end of the range. Plus a draft pick.

I root for the Patriots. However, if you asked me to name a single player on last season’s roster, I could only name Cam Newton. Apparently he wasn’t great. I really and truly don’t follow the sport.

The piece displays the available free agents, along with those no longer available. (Though, the piece does offer you the option to go back to the beginning of free agent season and pretend reality didn’t happen.)

I have no idea who any of these people are.

I went through and began semi-randomly picking names. I’d heard of some of them, and others were blind choices. Once you’ve selected within the budget, you can choose a draft pick. They all appear in list format to the right with the ability to remove them via a small X button.

Nope, not a clue.

Once you’ve confirmed your choices you’re taken to a screen that reviews your selection. You are able to either tweet it to the world—which I did not do—or start over again. I would do that, but I wouldn’t do any better than how I just did.

I hope I did at least okay.

Overall, the piece felt intuitive and I never had any issues selecting my free agents. Of course, it would help if I knew anything about the sport. But that’s a user problem.

Credit for the piece goes to Ben Volin.

The Changing Colours of Rivers

No two rivers are the same, though they certainly can be similar. Rivers have their own ecosystems and when I was at school, I learned of the different classifications of rivers by the colour of their water: black, white, and clear. Broadly speaking, that just means the amount of sediment dissolved in the river’s water. Black colours appear when slow moving water has absorbed lots from its environment, think swamps. White waters resemble tea or coffee with added milk or cream. This happens when sediments enter and dissolve into the water. Clear water is just that: relatively clear and free of sediment.

But a team of scientists at the University of North Carolina at Chapel Hill (UNC Chapel Hill) recently released some work where they used shifts in blue to yellow and green to help classify rivers. Their classification differs, but broadly can point to a change from healthy (blue) to unhealthy (yellow and green). The novelty in their work, however, focuses on using satellite imagery to capture the colour of rivers and their evolution since the mid 1980s.

A look at the broader lower-48 of the United States

They published their findings as an interactive application driven primarily by a clickable map. Clearly not all rivers are available, but a large number are, and you can see some obvious patterns at a national scale—their work excludes Alaska and Hawaii. If blue represents healthy rivers, we see healthy rivers in New England and the Pacific Northwest with a host of green rivers in the Mid-Atlantic and Upper Midwest with yellow in the Mississippi basin and southeast.

I wanted to look at Pennsylvania a bit more specifically given my familiarity with the Commonwealth and zoomed in a bit on the map.

The colour of Pennsylvania’s rivers

You can see that using that above scale, Pennsylvania’s rivers are in an okay, not great, state. Some of the upper stretches of the Delaware and Susquehanna Rivers are coloured blue, but we mostly see a lot of green.

To the right of the map, the designers placed three smaller charts driven by the user’s selection of river. Let’s take a look at the Juniata River as an example—my grandfather grew up living alongside a tributary that emptied into the Frankstown Branch just a short walk from his house.

A look at the Schuylkill River south of the Fairmount Water Works

We can see that the chart on the upper right shows the colour shift over the decades for that observed section of the river. The legend tells us that this section of the river has shifted blue (gotten healthier), and the chart below it looks for any seasonal changes. Here the chart is grey, indicating the system lacks enough data for a clear trend. This examines the short-term changes we might see in a river based on seasonal effects like rainy season, dry season, and human-driven effects. Perhaps we pollute more in the spring and then use rivers recreationally in the summer.

Finally a distribution of the river section’s colour, all in wavelengths of light.

My biggest critique here would be the wavelengths. Users likely will not know the colour spectrum by wavelength, and adding some labels like blue, yellow, and green could go a long way to help users understand what they are looking at.

Overall, though, this is a really fascinating project.

Credit for the piece goes to John Gardner et al.

Biden’s Biggest Pyramids

Yesterday we looked at an article from the Inquirer about the 2020 election and how Biden won because of increased margins in the suburbs. Specifically we looked at an interactive scatter plot.

Today I want to talk a bit about another interactive graphic from the same article. This one is a map, but instead of the usual choropleth—a form the article uses in a few other graphics—here we’re looking at three-dimensional pyramids.

All the pyramids, built by aliens?

Yesterday we talked about the explorative vs. narrative concept. Here we can see something a bit more narrative in the annotations included in the graphic. These, however, are only a partial win. They call out the greatest shifts, which are indeed mentioned in the text. But then in another paragraph the author writes about Bensalem and its rightward swing, and there’s no callout of Bensalem on the map.

But the biggest things here, pun intended, are those pyramids. Unlike the choropleth maps used elsewhere in the article, the first thing this map fails to communicate is scale. We know the colour means a county’s net shift was either Democratic or Republican. But what about the magnitude? A big pyramid likely means a big shift, but is that big shift hundreds of votes? Thousands of votes? How many thousands? There’s no way to tell.

Secondly, when we are looking at rural parts of Bucks, Chester, and Montgomery Counties, the pyramids are fine. They remain small and contained within their municipality boundaries. Intuitively this makes sense. Broadly speaking, population decreases the further you move from the urban core. (Unless there’s a secondary city, e.g. Minneapolis has St. Paul.) But nearer the city, we have more population, and we have geographically smaller municipalities. Compare Colwyn, Delaware County to Springfield, Bucks County. Tiny vs. huge.

In choropleth maps we face this problem all the time. Look at a classic election map at the county level from 2016.

Way back when…

You can see that there is a lot more red on that map. But Hillary Clinton won the popular vote by nearly 3,000,000 votes. (No, I won’t rehash the Electoral College here and now.) More people are crowded into smaller counties than there are in those big, expansive red counties with far, far fewer people.

And that pattern holds true in the Philadelphia region. But instead of using the colour fill of an area as above, this map from the Inquirer uses pyramids. But we face the same problem, we see lots of pyramids in a small space. And the problem with the pyramids is that they overlap each other.

At a glance, you cannot see one pyramid behind another. At least in the choropleth, we see a tiny field of colour, but that colour is not hidden behind another.

Additionally, the way this is constructed, what happens if in a municipality there was a small net shift? The pyramid’s height will be minimal. But to determine the direction of the shift we need to see the colour, and if the area under the line creating the pyramid is small, we may be unable to see the colour. Again, compare that to a choropleth where there would at least be a difference between, say, a light blue and light red. (Though you could also bin the small differences into a single neutral bin collecting all small shifts, be they one way or the other.)

I really think that a more straightforward choropleth would more clearly show the net shifts here. And even then, we would still need a legend.

The article overall, though, is quite strong and a great read on the electoral dynamics of the Philadelphia region a month ago.

Credit for the piece goes to John Duchneskie.

Biden Won the Burbs

The thing with election results is that we don’t have the final numbers for a little while after Election Day. And that’s normal.

There are a few things I want to look at in the coming weeks and months once my schedule eases up a bit. But for now, we can use this nice piece from the Philadelphia Inquirer to look at a story close to home: the vote in the Philadelphia suburbs.

It’s all happening in the yellow.

I’ve already looked at some analysis like this for Wisconsin and I shared it on my social. But there I looked at the easy, county-level results. What the Inquirer did above is break down the Pennsylvania collar counties of Philadelphia, i.e. the suburbs, into municipality level results. It then plotted them 2020 vs. 2016 and the results were—as you can guess since we know the result—Biden beat Trump.

What this chart does well is colour the municipalities that Biden flipped yellow. It’s a great choice from a colour standpoint. As the third of the primaries, with both blue and red well represented, it easily contrasts with the Biden- and Trump-won towns and cities of the region. The colour is a bit “darker” than a full-on, bright yellow, but that’s because the designers recognised it needs to stand out on a white field.

Let’s face it, yellow is a great colour to use, but it’s difficult because it’s so light and sometimes difficult to see. Add just the faintest bit of black to your mix, especially if you’re using paints, and voila, it works pretty well. So here the designer did a great job recognising that issue with using yellow. Though you can still see the challenge, because even though it is a bit darker, look at how easy it is to read the text in the blue and the red. Now compare that to the yellow. So if you’re going to use yellow, you want to be careful how and when you do.

The other design decision here comes down to what I call the explorative vs. the narrative. Now, I don’t think explorative is a word, and the red squiggle agrees, but it pairs nicely with narrative. And I’ve been talking about this a lot in my field the last several weeks, especially offline. (In the non-blog sense, because obviously all my work is done online these days. Oh, how I miss my old office.)

Explorative works present the user with a data set and then allow them to, in this case, mouse over or tap on dots and reveal additional layers of information, i.e. names and specific percentages. The idea is not to tell a specific story, but show an overall pattern. And if the piece is interactive, as this is, potentially allow the user to drill down and tease out their own stories.

Compare that to the narrative; my Wisconsin piece I referenced above is more in this category. Here the work takes you through a guided tour of the data. It labels specific data points, be they on trend or outliers, and is sometimes more explicit in its analysis. These can also be interactive (though my static image is not) and allow users to drill down, and critically away, from the story to see dots of interest, for example.

This piece is more explorative. The scatter plot naturally divides the municipalities into those that voted for Biden, Trump, and then more or less than they voted for Trump in 2016. The labels here are actually redundant, but certainly helpful. I used the same approach in my Wisconsin graphic.

But in my Wisconsin graphic, I labelled specific counties of interest. If I had written an accompanying article, they would have been cited in the textual analysis so that the graphic and text complemented each other. But here in the Inquirer, it’s a bit of a missed opportunity in a sense.

The author mentions places like Upper Darby and Lower Merion and how they performed in 2020 vis-a-vis 2016. But it’s incumbent on the user to find those individual municipalities on the scatter plot. What if the designer had created a version where the towns of interest were labelled from the start? The narrative would have been buttressed by great visualisations that explicitly made the same point the author wrote about in the text. And that is a highly effective form of communication when you’re not just telling, but also showing your story or argument.

Overall it’s a great article with a lot to talk about. Because, spoiler, I’m going to be talking about it again tomorrow.

Credit for the piece goes to Jonathan Lai.

Choose Your Own FiveThirtyEight Adventure

In case you weren’t aware, the US election is in less than a week, five days. I had written a long list of issues on the ballot, but it kept getting longer and longer so I cut it. Suffice it to say, Americans are voting on a lot of issues this year. But a US presidential election is not like many other countries’ elections in that we use the Electoral College.

For my non-American readers, the Electoral College, very briefly, was created by the country’s founding fathers (Washington, Jefferson, Adams, Franklin, et al.) to do two things. One, restrict selection of the American president to a class of individuals who theoretically had a broader/deeper understanding of the issues—but who also had vested interests in the outcome. The founders did not intend for the American people to elect the president. The second feature of the Electoral College was to prevent the largest states from dominating smaller states in elections. Why else would Delaware and Rhode Island surrender their sovereignty to join the new United States if Virginia, Pennsylvania, and New York make all the decisions? (The founders went a step further and added the infamous 3/5 clause, but that’s another post.)

So Americans don’t elect the president directly and larger states like California, New York, and Texas, have slightly less impact than smaller states like Wyoming, Vermont, and Delaware. Each state is allotted a number of Electoral College votes and the key is to reach 270. (Maybe another time I’ll get into the details of what happens in a 269–269 tie.) Many Americans are probably familiar with sites like 270 To Win, where you can determine the outcome of the election by saying who won each state. But, even though the US election is really 50 different state elections, common threads and themes run through all those states and if one candidate or another wins one state, it makes winning or losing other states more or less likely. FiveThirtyEight released a piece that attempts to link those probabilities and help reveal how decisions voters in one state make may reflect on how other voters decide.

The interface is fairly straightforward—I’m looking at this on a desktop, though it does work on mobile—with a bunch of choices at the top and a choropleth map below. There we have a continually divergent gradient, meaning the states aren’t grouped into like bins but we have incredibly subtle differences between similar states. (I should also point out that Maine and Nebraska are the two exceptions to my above description of the Electoral College. They divide their votes by congressional district, whoever wins the district gets that Electoral College vote and then the state overall winner receives the remaining two votes.)
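The Maine and Nebraska rule in that parenthetical is simple enough to sketch in a few lines of Python. The function name and the candidate labels "A" and "B" are mine, purely for illustration, and the split shown is hypothetical rather than a forecast or a result.

```python
from collections import Counter


def allocate_by_district(district_winners, statewide_winner):
    """One Electoral College vote per congressional district, plus two for the statewide winner."""
    votes = Counter(district_winners)   # one vote for each district won
    votes[statewide_winner] += 2        # the remaining two votes go to the statewide winner
    return dict(votes)


# Hypothetical split in a two-district state such as Maine (four votes in total).
print(allocate_by_district(["A", "B"], statewide_winner="A"))
```

With two districts split between the candidates, the statewide winner ends up with three of the four votes and the other candidate with one.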

Below that we have a bar chart, showing each state, its more or less likely winner, and the 270 threshold. Below that, we have what I’ve read/heard described as a ball plot. It represents runs of the simulation. As of Thursday morning, the current FiveThirtyEight model says Trump has an 11-in-100 chance of winning and Biden, conversely, an 89-in-100 chance.

But what happens when we start determining the winners of states?

Well, for my non-American readers, this election will feature a large number of voters casting their ballots early. (I voted early by mail, and dropped my ballot off at the county election office.) That’s not normal. And I cannot emphasise this next point enough. We may not know who wins the election Tuesday night or by the time Americans wake up on Wednesday. (Assuming they’re not like me and up until Alaska and Hawaii close their polls. Pro-tip, there’s a potentially competitive Senate race in Alaska, though it’s definitely leaning Republican.)

But, some states vote early and/or by mail every year and have built the infrastructure to count those votes, or the vast majority of them, on or even before Election Day. Three battleground states are in that group: Arizona, Florida, and North Carolina. We could well know the result in those states by midnight on Election Day—though Florida is probably going to Florida.

So what happens with this FiveThirtyEight model if we determine the winners of those three states? All three voted for Trump in 2016, so let’s say he wins them again next week.

We see that the states we’ve decided are now outlined in black. The remainder of the states have seen their colours change as their odds reflect the set electoral choice of our three states. We also now have a reset button that appears only once we’ve modified the map. I’m also thinking that I like FiveyFox, the site’s new mascot? He provides a succinct, plain language summary of what the user is looking at. At the bottom we see what the model projects if Arizona, Florida, and North Carolina vote for Trump. And in that scenario, Trump wins in 58 out of 100 elections, Biden in only 41. Still, it’s a fairly competitive election.
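One way to think about how fixing a few states moves the overall odds is to keep only the simulated elections that match your picks and then count winners among what is left. The Python below is a toy sketch of that idea. It is not FiveThirtyEight's model or code, the ten simulated elections are made up by me, and the function name is hypothetical.

```python
# Ten made-up simulated elections: who carried three states and who won overall.
# Toy data for illustration only, not output from any real model.
SIMULATIONS = [
    {"AZ": "Trump", "FL": "Trump", "NC": "Trump", "winner": "Trump"},
    {"AZ": "Trump", "FL": "Trump", "NC": "Trump", "winner": "Trump"},
    {"AZ": "Trump", "FL": "Trump", "NC": "Trump", "winner": "Biden"},
    {"AZ": "Biden", "FL": "Trump", "NC": "Trump", "winner": "Biden"},
    {"AZ": "Biden", "FL": "Trump", "NC": "Biden", "winner": "Biden"},
    {"AZ": "Biden", "FL": "Biden", "NC": "Biden", "winner": "Biden"},
    {"AZ": "Biden", "FL": "Biden", "NC": "Biden", "winner": "Biden"},
    {"AZ": "Biden", "FL": "Biden", "NC": "Biden", "winner": "Biden"},
    {"AZ": "Trump", "FL": "Biden", "NC": "Biden", "winner": "Biden"},
    {"AZ": "Biden", "FL": "Biden", "NC": "Trump", "winner": "Biden"},
]


def win_share(simulations, candidate, fixed=None):
    """Share of simulations the candidate wins, among those matching the fixed state results."""
    fixed = fixed or {}
    kept = [
        sim for sim in simulations
        if all(sim[state] == who for state, who in fixed.items())
    ]
    if not kept:
        raise ValueError("no simulations match the chosen outcomes")
    return sum(sim["winner"] == candidate for sim in kept) / len(kept)


all_trump = {"AZ": "Trump", "FL": "Trump", "NC": "Trump"}
print(win_share(SIMULATIONS, "Trump"))                   # all ten simulations
print(win_share(SIMULATIONS, "Trump", fixed=all_trump))  # only those matching the three picks
```

In this toy set Trump wins 2 of the 10 simulations overall, but 2 of the 3 in which he carries all three states, which is the same kind of jump the interactive shows once you lock in those states.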

So what happens if by midnight we have results from those three states showing that Biden has managed to flip them? And as of Thursday morning, he’s leading very narrowly in the opinion polls.

Well, the interface hasn’t really changed. Though I should add below this screenshot there is a button to copy the link to this outcome to your clipboard if, like me, you want to share it with the world or my readers.

As to the results, if Biden wins those three states, Trump has less than a 1-in-100 chance of winning and Biden a greater than 99-in-100.

This is a really strong piece from FiveThirtyEight and it does a great job to show how states are subtly linked in terms of their likelihood to vote one way or the other.

Credit for the piece goes to Ryan Best, Jay Boice, Aaron Bycoffe and Nate Silver.

Double Your Hurricanes, Double Your Fun

In a first, the Gulf of Mexico basin has two active hurricanes simultaneously. Unfortunately, they are both likely to strike somewhere along the Louisiana coastline within approximately 36 hours of each other. Fortunately, neither is as strong as a storm named Katrina that caused a mess of things several years ago now.

Over the last few weeks I have been trying to start the week with my Covid datagraphics, but I figured we could skip those today and instead run with this piece from the Washington Post. It tracks the forecast path and forecast impact of tropical storm force winds for both storms.

The forecast path above is straightforward. The dotted line represents the forecast path. The coloured area represents the probability of that area receiving tropical storm force winds. Unsurprisingly the present locations of both storms have the greatest probabilities.

Now compare that to the standard National Weather Service graphic, below. They produce one per storm and I cannot find one of the combined threat. So I chose Laura, the one likely to strike mid-week and not the one likely to strike later today.

The first and most notable difference here is the use of colour. The ocean here is represented in blue compared to the colourless water of the Post version. The colour draws attention to the bodies of water, when the attention should be more focused on the forecast path of the storm. But, since there needs to be a clear delineation between land and water, the Post uses a light grey to ground the user in the map (pun intended).

The biggest difference is what the coloured forecast areas mean. In the Post’s versions, it is the probability of tropical storm force winds. But, in the National Weather Service version, the white area actually is the “cone”, or the envelope or range of potential forecast paths. The Post shows one forecast path, but the NWS shows the full range and so for Laura that means really anywhere from central Louisiana to eastern Texas. A storm that impacts eastern Texas, for example, could have tropical storm force winds far from the centre and into the Galveston area.

Of course every year the discussion is about how people misinterpret the NWS version as the cone of impact, when that is so clearly not the case. But then we see the Post version and it might reinforce that misconception. Though, it’s also not the Post’s responsibility to make the NWS graphic clearer. The Post clearly prioritised displaying a single forecast track instead of a range along with the areas of probabilities for tropical storm force winds.

I would personally prefer a hybrid sort of approach.

But I also wanted to touch briefly on a separate graphic in the Post version, the forecast arrival times.

This projects when tropical storm force winds will begin to impact particular areas. Notably, the areas of probability of tropical storm force winds do not change. Instead the dotted line projections for the paths of the storms are replaced by lines relatively perpendicular to those paths. These lines show when the tropical storm winds are forecast to begin. It’s also another updated design of the National Weather Service offering below.

Again, we only see one storm per graphic here and this is only for Laura, not Marco. But this is also probably the most analogous to what we see in the Post version. Here, the black outline represents the light pink area on the Post map, the area with at least a 5% chance of receiving tropical storm force winds. The NWS version, however, does not provide any further forecast probabilities.

The Post’s version is also an improved design, as the blue in the NWS version, while not as dark as the heavy black lines, still draws unnecessary attention to itself. Would even a very pale blue be an improvement? Almost certainly.

In one sense, I prefer the Post’s version. It’s more direct, and the information presented is more clearly presented. But, I find it severely lacking in one key detail: the forecast cone. Even yesterday, the forecast cone had Laura moving in a range both north and south of the island of Cuba from its position west of Puerto Rico. 24 hours later, we now know it’s on the southern track and that has massive impact on future forecast tracks.

Being east or west of landfall can mean dramatically different impacts in terms of winds, storm surge, and rainfall. And the Post’s version, while clear about one forecast track, obscures the very real possibility that the range of impacts can shift dramatically in just the course of one day.

I think the Post does a better job of the tropical storm force wind forecast probabilities. In an ideal world, they would take that approach to the forecast paths. Maybe not showing the full spaghetti-like approach of all the storm models, but a percentage likelihood of the storm taking one particular track over another.

Credit for the Post pieces goes to the Washington Post graphics department.

Credit for the National Weather Service graphics goes to the National Weather Service.

Red Sox Starting Rotation: A Dumpster Fire in a Dumpster Fire Year

Baseball for the Red Sox starts on Friday. Am I glad baseball is back? Yes?

I love the sport and will be glad that it’s back on the air to give me something to watch. But the way it’s being done boggles the mind. Here today I don’t want to get into the Covid, health, and labour relations aspect of the game. But, as the title suggests, I want to look at a graphic that looks at just how bad the Red Sox could be this (shortened) year. And over at FiveThirtyEight, they created a model to evaluate teams’ starting rotations on an ongoing basis.

The Red Sox are just bad.
Look at the Red Sox, one of the worst in baseball.

Form wise, this isn’t too different from what we looked at yesterday. It’s a dot plot with the dots representing individual pitchers. The size of the dots represents their number of total starts. This is an important metric in their model, but as we all know size is a difficult attribute for people to compare and I’m not entirely convinced it’s working here. Some dots are clearly smaller than others, but for most it’s difficult for me to clearly tell.

Colour is just tied to the colour of the teams. Necessary? Not at all. Because the teams are not compared on the same plot, they could all be the same colour. If, however, an eventual addition were made that plotted the day’s matchups on one line, then colour would be very much appropriate.

I like the subtle addition of “Better” at the top of the plots to help the user understand the constructed metric. Otherwise the numbers are just that, numbers that don’t mean anything.

Overall a solid piece. And it does a great job of showing just how awful the Red Sox starting rotation is going to be. Because I know who Nate Eovaldi is. And I’ve heard of Martin Perez. Ryan Weber I only know through his largely pitching in relief last year. And after that? Well, not on this graphic, but we have Eduardo Rodriguez who had corona and, while he has recovered, nobody knows how that will impact people in sports. There’s somebody named Hall who I have never heard of. Then we have Brian Johnson, a root-for-the-guy story of beating the odds to reach the Major Leagues but who has been inconsistent. Then…it is literally a list of relief pitchers.

We dumped the salary of Mookie Betts and David Price and all we got was basically a tee-shirt saying “We still need a pitcher or three”.

Credit for the piece goes to Jay Boice.

A Map of Unequal Comparisons

I’ve largely been busy creating and posting content on the Covid pandemic and its impact on the Pennsylvania, New Jersey, and Delaware tristate area along with, by request, both Virginia and Illinois, my former home. It leaves me very little time for blogging, and I really do not want this site to become a blog of my personal work. That’s why I have a portfolio and my data project sites, after all.

But in posting my Covid datagraphics, I’ve come across variations of this map with all sorts of meme-y, witty captions saying why Canada is doing so much better than the US, why Americans shouldn’t be allowed to travel to Canada, and now why the Blue Jays shouldn’t be allowed to host Major League Baseball games.

Wait just a minute, there…

Well, that map isn’t necessarily wrong, but it’s incredibly misleading.

First, the map comes from the fantastic Johns Hopkins work on Covid-19. (Full disclosure, that’s the data source I use at work to create my work work datagraphics: https://philadelphiafed.org/covid-19/covid-19-research/covid-19-cases-and-deaths#.) And their site has a larger and more comprehensive dashboard (still hate that term but it does have sticking power) of which the map is the focal point.

The numbers as of this posting.

You can see the map there in the centre and some tables to the left, some tables to the right, and even a micro table beneath thundering away at the map’s position. I could get into the overall design—maybe I will one of these days—but again, let’s look at that map.

The crux of the argument is that there are a lot of red dots in the United States and very few in Canada. But look at the table in the dashboard on the left. At the very bottom you see three small tabs, Admin 0, Admin 1, and Admin 2. Admin 0 contains all entities at the sovereign state level, e.g. US, Canada, Sweden, Brazil, &c. Admin 1 is the provincial/state level, e.g. Pennsylvania, Illinois, Ontario, Quebec, &c. Admin 2 is the sub-provincial/sub-state level, e.g. Philadelphia County, Cook County, Chester County, Lake County, &c.

Notice anything about my examples? Not all countries have provinces/states, but Canada certainly does. And then at Admin 2, the examples and indeed the data only have US counties and US data. Everything in Canada has been aggregated up to Admin 1. And that is the problem.

The second part to point out is the dot-ness of the map. And to be fair, this is part of a broader problem I have been seeing in data visualisation the last few months. Dots, circles, or markers imply specificity in location. The centre of that object, after all, has to fall on a specific geographic place, a latitude and longitude coordinate. It utterly fails to capture the dimensions and physical size of the geographic unit, which can be critical.

Because not all geographic units are of the same size. We all know Rhode Island as one of the smallest US states. Let’s compare that to Nunavut or Yukon in Canada, massive territories that spread across the Canadian Arctic. Rhode Island, according to Google, is 1,212 square miles. Nunavut? 808,200.

So now show both on a map with one dot each and Rhode Island’s will practically cover the state. And it will also be surrounded by and in close proximity to the states of Massachusetts and Connecticut. Nunavut, on the other hand, will be a small dot in a massive empty space on a map. But those dots are equal.
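To put a number on that mismatch, here is a quick back-of-the-envelope sketch. It takes the two area figures quoted above as given (both in the same unit) and assumes, purely for illustration, a dot drawn exactly as large as Rhode Island. The variable names are my own.

```python
# Area figures as quoted in the post, both in the same unit.
rhode_island_area = 1_212
nunavut_area = 808_200

# How many Rhode Islands fit into Nunavut.
ratio = nunavut_area / rhode_island_area

# Assumption for illustration: the dot covers exactly Rhode Island's area.
# The same dot then covers only this share of Nunavut.
dot_share_of_nunavut = rhode_island_area / nunavut_area

print(f"Nunavut is roughly {ratio:.0f} times the size of Rhode Island")
print(f"The same dot covers about {dot_share_of_nunavut:.2%} of Nunavut")
```

A dot that blankets one unit leaves well over 99 per cent of the other untouched, even though both dots carry the same visual weight.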

Now, combine that with the fact that the Hopkins map is showing data on the US county level. Every single county in the United States gets a red dot. By default, that means the US is covered with red dots. But there is no county-level equivalent data for Canada. Or for Mexico (also seen in the above graphic). And so given we’re only using dots to relate the data, we see wide swaths of empty space, untouched by red dots. And that’s just not true.

Yes, large parts of the Canadian Arctic are devoid of people, but not southern Ontario and Quebec, not the southwestern coast of British Columbia, not the Maritimes.

The Hopkins map should be showing geographic units at the same admin level. By that I mean that when on Admin 0, the map should reflect geographic units at the sovereign state level, allowing us to compare the US to Canada directly. And, assuming for this argument that we keep the dots despite their flaws, we would see only Admin 0 level data.

Admin 1 shows only provincial level data. Some countries will begin to disappear, because Hopkins does not have the data at that level. But in North America, we still can compare Pennsylvania and Illinois to Ontario and Quebec.

But then at Admin 2, we only see the numerous dots of the United States counties. It’s neither an accurate nor a helpful comparison to contrast Chester County or Will County to the entire province of Ontario and so the map should not allow it. Instead, as the above graphic shows, it creates misconceptions of the true state of the pandemic in the US and Canada.
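For the code-minded, here is a minimal sketch of what "same admin level" means in practice. The records and case counts are invented for illustration and are not Hopkins data. The function names are hypothetical and say nothing about how the Hopkins dashboard is actually built.

```python
from collections import defaultdict

# Hypothetical records: (admin0, admin1, admin2, cases). All numbers invented.
# Canada has no admin2 (county-level) entries, mirroring the situation above.
records = [
    ("US", "Pennsylvania", "Philadelphia County", 120),
    ("US", "Pennsylvania", "Chester County", 30),
    ("US", "Illinois", "Cook County", 200),
    ("US", "Illinois", "Will County", 40),
    ("Canada", "Ontario", None, 150),
    ("Canada", "Quebec", None, 180),
]

def units_at_level(records, level):
    """Sum cases per geographic unit at one admin level (0, 1, or 2).

    Records with no data at that level are left out, not silently
    compared against finer-grained units.
    """
    totals = defaultdict(int)
    for admin0, admin1, admin2, cases in records:
        key = (admin0, admin1, admin2)[: level + 1]
        if None in key:
            continue  # this record has no unit at the requested level
        totals[key] += cases
    return dict(totals)

def dots_per_country(units):
    """Count how many dots each country would get on the map."""
    counts = defaultdict(int)
    for key in units:
        counts[key[0]] += 1
    return dict(counts)

for level in (0, 1, 2):
    units = units_at_level(records, level)
    print(f"Admin {level}: {dots_per_country(units)}")
```

At Admin 0 and Admin 1 both countries get the same number of dots in this toy data. At Admin 2 only the US has any, which is the honest picture: Canada has no county-level data here, not no cases.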

Credit for the Hopkins dashboard goes to, well, Hopkins.