Choropleths and Colours

In many cities throughout the United States, real estate is a hot commodity. It’s not difficult to understand why: as I have covered before, Americans are saving a bit more. Coupled with stay-at-home orders in a pandemic, spending that cash on a home down payment makes a lot of sense for a lot of people. But with little new construction, it’s a seller’s market.

The Philadelphia Inquirer covers that angle for the Philadelphia region and in the article, it includes a map looking at time to sell a house. And it’s that interactive map I want to look at briefly this morning.

Red vs. blue

Primarily I want to discuss the colours, as you can gather from this post’s title. We have six bins here, each indicating an amount of time in one-week intervals. So far so good. Now to the colours, we have red for homes that sell in one week or less and blue for homes that sell in five weeks or more.

Blue to red is a pretty standard choice. You will often see it in maps where you have positive growth to negative growth or something similar. I’ve used it myself on Coffeespoons a number of times, like in this map of population growth at the county level here in Pennsylvania.

In those scenarios, however, note how you have positive values and negative values. The change in colour (hue) encodes the change in numerical value, i.e. positive vs. negative. We then encode the values within that positive or negative range with lighter/darker blues and reds. Most often the darker the blue or red, the greater the value toward the end of the spectrum. For example, in Pennsylvania, the dark blue meant population growth greater than 8% and red meant population declines in excess of 8%.

As an aside you’ll note that there are no dark blue counties in that map and that’s by design. By keeping the legend symmetrical in terms of its minimum and maximum values, we can show how no counties experienced rapid population growth whilst several declined rapidly. If dark blue had meant greater than 4% growth, that angle of the story would have been absent from the map.
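That symmetric legend can be sketched in a few lines of Python. This is only an illustration: the 8% outer limit comes from the Pennsylvania map, but the 4% inner break, the colour labels, the function names, and the county values are all hypothetical.

```python
from bisect import bisect_right

def symmetric_breaks(limit, inner):
    """Return bin edges mirrored around zero, e.g. [-8, -4, 0, 4, 8]."""
    positive = sorted(inner) + [limit]
    return [-edge for edge in reversed(positive)] + [0] + positive

def bin_index(value, breaks):
    """Index of the bin a value falls in (0 = below the lowest break)."""
    return bisect_right(breaks, value)

# Six bins, three per side, so the legend has the same reach in both directions.
breaks = symmetric_breaks(limit=8, inner=[4])
labels = ["dark red", "red", "light red", "light blue", "blue", "dark blue"]

# Hypothetical population changes in percent, NOT the real Pennsylvania data.
changes = {"A": -9.5, "B": -5.0, "C": -1.2, "D": 0.8, "E": 3.1, "F": 5.2}

used = {labels[bin_index(v, breaks)] for v in changes.values()}
unused = [label for label in labels if label not in used]
print(unused)  # legend entries that no county reaches
```

Because the top bin only starts above +8%, an empty dark blue entry in the legend is itself information: nothing grew as fast as the fastest decline.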

Back to our choropleth discussion, however. How does that fit with this map of selling times for homes in the Philadelphia region?

Note first that five weeks is a positive value. But so is one week or less. The use of the red-blue split here is not immediately intuitive. If this map were about the change or growth in how long homes take to sell, certainly you could see positive and negative rates and those would make sense in red and blue.

The second part to understand about a traditional red-blue choropleth is that at some point you have to switch from red to blue, a mid-point if you will. If you are talking positive/negative like in my Pennsylvania map, zero makes a whole lot of sense. Anything above zero, blue, anything below zero red.

Sometimes, you will see a third colour, maybe a grey or a purple, between that red and blue. That encodes a fuzzier split between positive and negative. Say you want to give a margin of 1%, i.e. any geographic area that has growth between +1% and -1%. That intrinsically means the bin is both positive and negative at the same time, so a neutral colour like grey or a blend of the two colours, a purple in the case of red and blue, makes a whole lot of sense.
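As a rough sketch of that neutral middle bin, assuming the 1% margin from the example above (the function name and the colour names are made up for illustration):

```python
def diverging_colour(change_pct, margin=1.0):
    """Classify a growth rate as red, blue, or a neutral grey near zero."""
    if abs(change_pct) <= margin:
        # The bin straddles zero, so it belongs to neither side.
        return "grey"
    return "blue" if change_pct > 0 else "red"

for value in (-3.0, -0.9, 0.4, 2.5):
    print(value, diverging_colour(value))
```

Anything within the margin lands in grey regardless of sign, which is exactly why the bin needs a colour that reads as neither red nor blue.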

Here we have nothing like that. Instead we jump from a light yellow two-to-three weeks to a light blue three-to-four weeks.

What about that yellow? In a spectrum of dark blue to light blue, you will see lighter blues than darker blues. But in a red spectrum, that light red becomes pinkish or salmonish depending on the exact type of red you use. (Conversation for another day.) Personal preferences will often push clients to ask a designer to “use less pink” in their maps. I can’t tell you the number of times I’ve heard that.

If that comes up, designers will often keep their blue side of the legend from the dark to light—no complaints there, or at least I’ve never heard any. But for the red side, they’ll switch to using hue or type of colour instead of dark to light red.

Not all colours are as dark as others. Blue and red can be pretty dark. Yellow, however, is a fairly light colour. If you converted the colours to greyscale, you would have very dark greys for blue and red, but yellow would be consistently far lighter than the other two.
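One way to check that greyscale intuition is to compute relative luminance, here with the standard sRGB linearisation and Rec. 709 weights. This is a sketch with one big assumption: it uses pure screen primaries, not the actual colours in the Inquirer map.

```python
def srgb_to_linear(channel):
    """Convert one 0-255 sRGB channel to linear light (0.0 to 1.0)."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    """Weighted sum of linear R, G and B; 0 is black, 1 is white."""
    r, g, b = (srgb_to_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

# Assumed pure primaries, for illustration only.
colours = {"red": (255, 0, 0), "blue": (0, 0, 255), "yellow": (255, 255, 0)}
for name, rgb in colours.items():
    print(name, round(relative_luminance(rgb), 4))
```

Green carries most of the weight in that sum, and yellow is full red plus full green, which is why it comes out so much lighter than either red or blue on their own.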

The designer can use the light yellow as the light red. But to link the yellow to red, they need to move through the hues or colours between the two. There’s a whole conversation here about colour theory and pigment and light absorption vs. pixels and light emission, but let’s go back to the colours you learned in primary school (pigment and light absorption). Take your colour wheel and what sits between red and yellow? Orange.

And so if a client objects to a light pink, you’ll see a pseudo dark-to-light red spectrum that uses a dark red, a medium orange, and a light yellow. Just like we see here in this Inquirer map.

Back to the two-to-three week and three-to-four week switch, though. What’s the deal? This is my sticking point with the graphic. I am looking for the explanation of why the sudden break in colour here, but I don’t see any obvious one.

Why would you use this colour scheme where blue and red diverge around a non-zero value? Let’s say the average home in the region sells in three weeks, any of the zip codes in red are selling faster than average, hot markets, and those taking longer than average are in blue, cold markets. Maybe it’s the current average, however. What if it were the average last year? Or the national average? These all serve as benchmarks for the presented data and provide valuable context to understand the market.

Unfortunately it’s not clear what, if any, benchmarks the divergence point in this map reflects. And if there is no reason to change colours mid-legend, with only six bins, a designer could find a single colour, a blue or purple for example, and then provide five additional lighter/darker shades of that to indicate increasing/decreasing levels of speed at which homes sell.
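A single-hue, six-shade legend like that is easy to rough out with Python’s standard `colorsys` module. This is a minimal sketch: the hue of 210 degrees (a blue), the saturation, and the lightness range are arbitrary assumptions, not values taken from any real map.

```python
import colorsys

def single_hue_ramp(hue_degrees, bins=6, lightest=0.90, darkest=0.25, saturation=0.6):
    """Return `bins` hex colours of one hue, ordered from light to dark."""
    step = (lightest - darkest) / (bins - 1)
    ramp = []
    for i in range(bins):
        lightness = lightest - i * step
        # colorsys takes hue as a 0-1 fraction and uses H, L, S argument order.
        r, g, b = colorsys.hls_to_rgb(hue_degrees / 360, lightness, saturation)
        ramp.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return ramp

ramp = single_hue_ramp(210)  # one shade per weekly bin
print(ramp)
```

Whether the darkest shade should mark the fastest or the slowest sales is an editorial choice; the point is only that six bins do not need two hues.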

Overall, I left this piece a wee bit confused. The general trend of regional differences in how quickly homes are selling? I get that. But because there’s a non-logical break between red and blue here—or at least one I fail to see in the graphic—this map would work almost as well if each bin were a separate colour entirely, using ROYGBIV as a base for example.

Credit for the piece goes to John Duchneskie.

Too Much Horsing Around

Last week the Philadelphia Inquirer published an investigation of the staggering number of horse deaths in Pennsylvania’s race track facilities. I found the article fascinating, but admittedly at a point or two I was a wee bit squeamish when the author described how horses essentially die. Then about halfway through the article I ran into the first of two graphics looking at the data.

Seeing red…

The first is pretty simple, a timeline of deaths over the course of one year, 2019. Overall it works: you can clearly see clusters of racing deaths, and that those clusters spread across the year. When I sat with the graphic for a moment, however, a few things began to stick out at me. The first was a distracting vibration in the background. Not the alternating beige and blue of the months, but if you look closely you’ll see tightly spaced lines within the colour fields: presumably the days of the month for aligning the deaths.

On a large enough graphic it makes all the sense to tick off sub-monthly increments, but in this space I would have probably opted to show only the months. Maybe weeks could have worked, as that approach may have reinforced the statistic about a horse dying every six days on average.

The second point is the black stroke or outline of each dot. Here the designer faces a challenging constraint. Essentially, the smaller the dot (or the symbol) the brighter the colour. In a rich, blood red colour you have a dark heavier colour. Compare that to say a stop sign that is bright red. It has a lighter feel. The blood red colour, in a given space, has let’s say an amount of black ink or pixels—I’m simplifying here—mixed in with the red. But in a large area, there’s enough red ink or pixels to still be clearly blood red. The stop sign red has no other colours but red. And in large areas, it can be an eye-stabbing amount of red—precisely why it’s likely so useful for, you know, stop signs.

But at the small scale of these very small dots, you still proportionally have the same amount of red and black ink, but with fewer and fewer amounts, the eye can begin to experience difficulty in truly reading the colour for what it is. For example, in an area of say 49 pixels (7×7), while the ratio of red to black may be consistent, you still only have a total of 49 pixels with which to convey “red” to the reader. Consequently, in smaller spaces, you may find that designers sometimes opt for brighter colours, a la stop sign red, than they would in larger fields of colour.

Here we have a nice use of brighter red, green, and yellow. (I will quickly add that the choice of red and green can be problematic for colour blindness, but I don’t want to revisit that here.) But to provide better separation between those small, circle sized fields of colour a border probably helps. A thin black line, or stroke, makes sense. But the black is darker than the colours themselves, thus it can draw more attention than the colour fill. And that begins to happen here. I wonder if a thin white stroke may have been less distracting and placed more emphasis on the fill colours.

As I said, overall a really nice, if sobering, graphic in an important but disturbing article. I think a few small tweaks could really bring the graphic over the finish line. Pun fully intended. Sorry, not sorry.

Credit for the pieces goes to John Duchneskie.

Farewell, Cardboard Cutouts

In 2020, baseball did not permit fans to attend regular season matches. (They changed this for the playoffs.) Instead, many stadiums opted for cardboard cutouts: fans often paid a fee and submitted a picture that the team printed on cardboard cutouts. Like so many things we will say about 2020, it was surreal.

But in Philadelphia at least, cardboard cutouts are out, and human fans are in. The state government in Harrisburg and the city government will allow 20% capacity at outdoor stadiums and 15% for indoor stadiums.

The Philadelphia Inquirer created a small graphic for its homepage to capture this news.

I cannot wait to safely attend a live match. C’mon, vaccines.

I intentionally included other site elements in the cropping to show how the graphic fits into the broader site. The extra white space around the image helps focus attention on the datagraphic over the numerous photographic elements for each article. Clicking on other tabs in the section brings up full-component-width graphics.

To the graphic itself.

Still can’t wait…

My guess would be this was a quick turnaround piece. There are a few things going on here. The first and most obvious one, the squares as spectators. Now I confess this confused me at first. I was not entirely certain what the coloured squares meant; they mean in-person attendees. Was this supposed to be an overall stadium? Or was it a representative seating section?

The quick turnaround becomes important, because this is probably how I would have first conceptualised the graphic. But, with more time, I may have attempted to incorporate the shape of the playing field, be it a baseball diamond or basketball court, or hockey rink—I know all the sports terms!—and surrounded them with shapes representing a certain number of spectators. Squares might not work in that case because of the curves. Circles? Hexagons? Regardless of the shape, the filling of occupied seats would be the same as here, but it would perhaps be clearer to some readers, i.e. me.

Second, we get to the table below the graphics. Here we have a subtle design decision. Note that here the designer greyed out the normal capacity figures. The new figures at that 20% and 15% rates are what appear in black bold text. My usual instinct is to use typographic weight, regular vs. bold, in these situations. But the grey here works equally well.

Third, and this also involves the table, we have the first game data. We talked about the comparison of the capacity and permitted attendance. But I wonder, did the date of the first game with fans need to be displayed in the same way as the permitted attendance? Because the news isn’t the dates of the first games (at least not as I read the news) but the numbers of attendees. And because of that, maybe I would have reduced the size of the type for the date of the first game. Or, conversely, set the type for the new attendance in a larger point size.

Overall, I enjoyed seeing this news presented visually, even if I was left confused.

Credit for the piece goes to John Duchneskie.

Biden’s Biggest Pyramids

Yesterday we looked at an article from the Inquirer about the 2020 election and how Biden won because of increased margins in the suburbs. Specifically we looked at an interactive scatter plot.

Today I want to talk a bit about another interactive graphic from the same article. This one is a map, but instead of the usual choropleth—a form the article uses in a few other graphics—here we’re looking at three-dimensional pyramids.

All the pyramids, built by aliens?

Yesterday we talked about the explorative vs. narrative concept. Here we can see something a bit more narrative in the annotations included in the graphic. These, however, are only a partial win. They call out the greatest shifts, which are indeed mentioned in the text. But then in another paragraph the author writes about Bensalem and its rightward swing, and there’s no callout of Bensalem on the map.

But the biggest things here, pun intended, are those pyramids. Unlike the choropleth maps used elsewhere in the article, the first thing this map fails to communicate is scale. We know the colour means a county’s net shift was either Democratic or Republican. But what about the magnitude? A big pyramid likely means a big shift, but is that big shift hundreds of votes? Thousands of votes? How many thousands? There’s no way to tell.

Secondly, when we are looking at rural parts of Bucks, Chester, and Montgomery Counties, the pyramids are fine. They remain small and contained within their municipality boundaries. Intuitively this makes sense. Broadly speaking, population decreases the further you move from the urban core. (Unless there’s a secondary city, e.g. Minneapolis has St. Paul.) But nearer the city, we have more population, and we have geographically smaller municipalities. Compare Colwyn, Delaware County to Springfield, Bucks County. Tiny vs. huge.

In choropleth maps we face this problem all the time. Look at a classic election map at the county level from 2016.

Way back when…

You can see that there is a lot more red on that map. But Hillary Clinton won the popular vote by nearly 2.9 million votes. (No, I won’t rehash the Electoral College here and now.) More people are crowded into smaller counties than there are in those big, expansive red counties with far, far fewer people.

And that pattern holds true in the Philadelphia region. But instead of using the colour fill of an area as above, this map from the Inquirer uses pyramids. But we face the same problem, we see lots of pyramids in a small space. And the problem with the pyramids is that they overlap each other.

At a glance, you cannot see one pyramid behind another. At least in the choropleth, we see a tiny field of colour, but that colour is not hidden behind another.

Additionally, the way this is constructed, what happens if in a municipality there was a small net shift? The pyramid’s height will be minimal. But to determine the direction of the shift we need to see the colour, and if the area under the line creating the pyramid is small, we may be unable to see the colour. Again, compare that to a choropleth where there would at least be a difference between, say, a light blue and light red. (Though you could also bin the small differences into a single neutral bin collecting all small shifts, be they one way or the other.)

I really think that a more straightforward choropleth would more clearly show the net shifts here. And even then, we would still need a legend.

The article overall, though, is quite strong and a great read on the electoral dynamics of the Philadelphia region a month ago.

Credit for the piece goes to John Duchneskie.

Biden Won the Burbs

The thing with election results is that we don’t have the final numbers for a little while after Election Day. And that’s normal.

There are a few things I want to look at in the coming weeks and months once my schedule eases up a bit. But for now, we can use this nice piece from the Philadelphia Inquirer to look at a story close to home: the vote in the Philadelphia suburbs.

It’s all happening in the yellow.

I’ve already looked at some analysis like this for Wisconsin and I shared it on my social. But there I looked at the easy, county-level results. What the Inquirer did above is break down the Pennsylvania collar counties of Philadelphia, i.e. the suburbs, into municipality level results. It then plotted them 2020 vs. 2016 and the results were—as you can guess since we know the result—Biden beat Trump.

What this chart does well is colour the municipalities that Biden flipped yellow. It’s a great choice from a colour standpoint. As the third of the primaries, with both blue and red well represented, it easily contrasts with the Biden- and Trump-won towns and cities of the region. The colour is a bit “darker” than a full-on, bright yellow, but that’s because the designers recognised it needs to stand out on a white field.

Let’s face it, yellow is a great colour to use, but it’s difficult because it’s so light and sometimes difficult to see. Add just the faintest bit of black to your mix, especially if you’re using paints, and voila, it works pretty well. So here the designer did a great job recognising that issue with using yellow. Though you can still see the challenge, because even though it is a bit darker, look at how easy it is to read the text in the blue and the red. Now compare that to the yellow. So if you’re going to use yellow, you want to be careful how and when you do.

The other design decision here comes down to what I call the explorative vs. the narrative. Now, I don’t think explorative is a word (and the red squiggle agrees), but it pairs nicely with narrative. And I’ve been talking about this a lot in my field the last several weeks, especially offline. (In the non-blog sense, because obviously all my work is done online these days. Oh, how I miss my old office.)

Explorative works present the user with a data set and then allow them to, in this case, mouse over or tap on dots and reveal additional layers of information, i.e. names and specific percentages. The idea is not to tell a specific story, but show an overall pattern. And if the piece is interactive, as this is, potentially allow the user to drill down and tease out their own stories.

Compare that to the narrative; my Wisconsin piece I referenced above is more in this category. Here the work takes you through a guided tour of the data. It labels specific data points, be they on trend or outliers, and is sometimes more explicit in its analysis. These can also be interactive, though my static image is not, and allow users to drill down, and critically away, from the story to see dots of interest, for example.

This piece is more explorative. The scatter plot naturally divides the municipalities into those that voted for Biden, Trump, and then more or less than they voted for Trump in 2016. The labels here are actually redundant, but certainly helpful. I used the same approach in my Wisconsin graphic.

But in my Wisconsin graphic, I labelled specific counties of interest. If I had written an accompanying article, they would have been cited in the textual analysis so that the graphic and text complemented each other. But here in the Inquirer, it’s a bit of a missed opportunity in a sense.

The author mentions places like Upper Darby and Lower Merion and how they performed in 2020 vis-a-vis 2016. But it’s incumbent on the user to find those individual municipalities on the scatter plot. What if the designer had created a version where the towns of interest were labelled from the start? The narrative would have been buttressed by great visualisations that explicitly made the same point the author wrote about in the text. And that is a highly effective form of communication when you’re not just telling, but also showing your story or argument.

Overall it’s a great article with a lot to talk about. Because, spoiler, I’m going to be talking about it again tomorrow.

Credit for the piece goes to Jonathan Lai.

The Size of the California Wildfires Compared to Philly

The West Coast is a different scale than the East Coast. After all, California alone is almost the size of New England and parts of the Mid-Atlantic combined. So when we take that enormous size into consideration, how big are these fires on an East Coast scale? It can be difficult to imagine.

Thankfully the Philadelphia Inquirer addressed the issue.

It’s a simple concept, but I love these kinds of graphics. The East Coast is dense and cities and towns are clustered closer together, because they were founded before personal automobiles were things. And so the August Complex fire in California would cover a significant portion of the Philadelphia metropolitan area, almost wiping it all off the map.

Credit for the piece goes to John Duchneskie.

PECO Outages Five Years Ago

Christmas time is a time when people receive gifts. Well this year was no different and I received a few. One, however, was in a box stuffed with old newspaper pages. And it turns out one of said pages had a graphic on it. So let us spend today looking at this little blast from the past.

The piece looks at PECO outages, PECO being the Philadelphia region’s main electricity supplier. The article is full page and is both headed and footed with photography; the graphic in which we are interested sits centre stage in the middle of the page.

Full page design.

Overall the graphic is fairly compact and works well at showing the distribution of the outages, which the bar chart below the choropleth shows was historically significant. (Despite my years in Chicago, I was somehow in the area for all but the storm written about and can confirm that they were, in fact, disruptive.)

Ice storms suck.

The choropleth works, but I question the colour scheme. The bins diverge at about 50%, which to my knowledge marks no special boundary other than “half”. If that yellow bin represented, say, the average number of outages per storm or the acceptable number of outages per storm, sure, I could buy it. Otherwise, this is really just degrees of severity along one particular axis. I would have either kept the bins all red or all blue and proceeded from a light of either to a dark of either.

I probably would have also dropped Philadelphia entirely from the map, but I can understand how it may be important to geographically anchor readers in the most populous county to orientate themselves to a story about suburbia.

Lastly, I have one data question. With power lines down during an ice storm, I would be curious to see less of the important roadways the map depicts and more of other variables. What about things like average temperature during the storm? Was the more urban and built-up Delaware County less susceptible because of an urban heat bubble preventing water from freezing? Or what about trees? Does the impact in the more rural areas have anything to do with increasing numbers of trees as one heads away from the city?

Those last data questions were definitely out of scope for the graphic, but I nevertheless remain curious. But then again, this piece is almost five years old. Just a look at how some graphical forms remain in use because of their solid ability to communicate data. Long live the bar chart. Long live the choropleth.

Credit for the piece goes to the Philadelphia Inquirer graphics department.

The Freedom of the Press

By now you may have heard that this Thursday media outlets across the United States, joined by some international outlets as well, have all published editorials about the importance of the freedom of the press and the dangers of the office of the President of the United States declaring unflattering but demonstrably true coverage “fake news”. And even more so, declaring journalists, especially those that are critical of the government, “enemies of the people”.

I have commented upon this in the past, so I will refrain from digressing too much, but the sort of open hostility towards objective reality from the president threatens the ability of a citizenry to engage in meaningful debates on public policy. Let us take the clearly controversial idea of gun control; it stirs passions on both sides of the debate. But, before we can have a debate on how much or how little to regulate guns we need to know the data on how many guns are out there, how many people own them, how many are used in crimes, in lethal crimes, are owned legally or illegally. That data, that verifiably true data exists. And it is upon those numbers we should be debating the best way to reduce the numbers of children massacred in American schools. But, this president and this administration, and certain elements of the citizenry refuse to acknowledge data and truth and instead invent their own. And in a world where 2+2=5, no longer 4, who is to say next that no, 2+2=6.

There are hundreds of editorials out there.

Read one from the Philadelphia Inquirer, the Chicago Tribune, the Guardian, and/or the New York Times.

But the one editorial board that started it is that of the Boston Globe. I was dreading how to tie this very important issue into my blog, which you all know tries to focus on data and design. As often as I stand upon my soap box, I try to keep this blog a little less soapy. Thankfully, the Globe incorporated data into their argument.

The end of their post concludes with a small interactive piece that presents survey data. It shows favourability and trustworthiness ratings for several media outlets broken out into their political leanings. The screenshot below is for the New York Times.

Clearly Republicans and Democrats view the Times differently

The design is simple and effective. The darker the red, the more people believe an outlet to be trustworthy and the more favourably they view it.

But before wrapping up today’s post, I also want to share another bit from that same Boston Globe editorial. As some of you may know, George Orwell’s 1984 is one of my favourite books of all time. I watched part of a rambling speech by the president a few weeks ago and was struck by how similar his line was to a theme in that novel. I am glad the Globe caught it as well.

Credit for this piece goes to the Boston Globe design staff.

Bus Transit in Philadelphia

I have lived in Philadelphia for almost ten months now and that time can be split into two different residences. For the first, I took the El to and from Centre City. For the second, I walk to and from work. I look for living spaces near transit lines. In Chicago I took the El for eight years to get home. But to get to work, I often used the 143 express bus. Personally, I prefer trains and subways to busses—faster, dedicated right-of-way, Amtrak even has WiFi. But, busses are an integral part of a dense city’s transit network. You can cram dozens of people into one vehicle and remove several cars from the road. Here in Philadelphia, however, as the Inquirer reports, bus ridership is down over the last two years at the same time as ride-hailing apps are growing in usage.

For those interested in urban planning and transit, the article is well worth the read. But let’s look at one of the graphics for the article.

Lots of red in Centre City

The map uses narrow lines for bus routes and the designer wisely chose to alternate between only two shades of a colour: high and low values of either growth (green) or decline (red). But, and this is where it might be tricky given the map, I would probably drop down all the greys in the map to be more of an even colour. And I would ditch the heavy black lines representing borders. They draw more attention and grab the eye first, well before the movement to the green and red lines.

And the piece did a good job with the Uber wait time map comparison as well. It uses the same colour pattern and map, small multiple style, and then you can see quite clearly the loss of the entire dark purple data bin. It is a simple, but very effective graphic. My favourite kind.

Still haven't used Uber yet. Unless you count the times I'm being put into one by a friend…

Anyway, from the data side, I would be really curious to see the breakout for trolleys versus busses—yes, folks, Philly still has several trolley lines. If only because, by looking at the map, those routes seem to be in the green and growing category. So as I complain to everyone here in Philly, Philly, build more subways (and trolleys). But, as the article shows, don’t forget about the bus network either.

Credit for the piece goes to the Inquirer graphics department.

Revisiting the End of the Shuttles

This is a post that goes back a little bit in time, but that I stumbled upon and found worth a post. Last summer the United States ended the Space Shuttle programme by retiring all of our orbiters. And of course this prompted many to attempt infographics about the history of bringing liberty and freedom to space.

Amidst the fond farewells, I missed this interactive piece from the Philadelphia Inquirer about the history and the future of Americans in space.

Interactive history

The interactive piece contains three separate sections. The first looks at the individual Americans who made it into space. The second compares the Space Shuttle to the Russian Soyuz craft that we now must use to get into space. The third looks at the future, and what we might use.

But, the Inquirer also had a print edition to worry about, and published a static version of the piece. Is it perhaps a bit cluttered? Yes, but the addition of the photographs and the annotations (even though the annotations are available as rollover conditions in the interactive piece) makes the print version more welcoming to explore and read at leisure. Additionally, the difference in scale of the three segments of the piece gives a clear importance to the individuals rather than to the technology. This distinction is lost in the interactive piece because each segment is the same size and receives the same scale of treatment.

Static shuttle

Credit for the interactive piece goes to Kevin Burkett and Rob Kandel. Credit for the print piece goes to Kevin Burkett.