Choropleths and Colours

In many cities throughout the United States, real estate represents a hot commodity. It’s not difficult to understand why: as we have covered before, Americans are saving a bit more. Coupled with stay-at-home orders in a pandemic, spending that cash on a home down payment makes a lot of sense for a lot of people. But with little new construction, it’s a seller’s market.

The Philadelphia Inquirer covers that angle for the Philadelphia region and in the article, it includes a map looking at time to sell a house. And it’s that interactive map I want to look at briefly this morning.

Red vs. blue

Primarily I want to discuss the colours, as you can gather from this post’s title. We have six bins here, each indicating an amount of time in one-week intervals. So far so good. Now to the colours, we have red for homes that sell in one week or less and blue for homes that sell in five weeks or more.

Blue to red is a pretty standard choice. You will often see it in maps where you have positive growth to negative growth or something similar. I’ve used it myself on Coffeespoons a number of times, like in this map of population growth at the county level here in Pennsylvania.

In those scenarios, however, note how you have positive values and negative values. The change in colour (hue) encodes the change in numerical value, i.e. positive vs. negative. We then encode the values within that positive or negative range with lighter/darker blues and reds. Most often the darker the blue or red, the greater the value toward the end of the spectrum. For example, in Pennsylvania, the dark blue meant population growth greater than 8% and red meant population declines in excess of 8%.

As an aside you’ll note that there are no dark blue counties in that map and that’s by design. By keeping the legend symmetrical in terms of its minimum and maximum values, we can show how no counties experienced rapid population growth whilst several declined rapidly. If dark blue had meant greater than 4% growth, that angle of the story would have been absent from the map.
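To make that symmetry point concrete, here is a minimal sketch of how mirrored bins can leave the top bin deliberately empty. The growth rates are hypothetical, not the real Pennsylvania figures, and the function names `symmetric_edges` and `count_per_bin` are my own for illustration.

```python
def symmetric_edges(values, bins_per_side):
    """Return bin edges mirrored around zero, sized by the largest magnitude."""
    limit = max(abs(v) for v in values)
    step = limit / bins_per_side
    negative = [-limit + i * step for i in range(bins_per_side)]
    positive = [i * step for i in range(1, bins_per_side + 1)]
    return negative + [0.0] + positive


def count_per_bin(values, edges):
    """Count values per bin; each bin is [low, high), the last includes high."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            in_bin = edges[i] <= v < edges[i + 1]
            if in_bin or (i == last and v == edges[-1]):
                counts[i] += 1
                break
    return counts


# Hypothetical growth rates in percent, not the real Pennsylvania figures.
growth = [-9.0, -8.5, -3.0, -0.5, 1.0, 2.5, 3.5]
edges = symmetric_edges(growth, 2)
counts = count_per_bin(growth, edges)
print(edges)   # [-9.0, -4.5, 0.0, 4.5, 9.0]
print(counts)  # [2, 2, 3, 0]
```

The empty final bin is the story: nothing grew as fast as the fastest decline. Had the positive side been scaled to its own maximum, that bin would have filled up and the asymmetry would have vanished from the legend.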

Back to our choropleth discussion, however. How does that fit with this map of selling times for homes in the Philadelphia region?

Note first that five weeks is a positive value. But so is one week or less. The use of the red-blue split here is not immediately intuitive. If this map were about the change or growth in how long homes take to sell, certainly you could see positive and negative rates and those would make sense in red and blue.

The second part to understand about a traditional red-blue choropleth is that at some point you have to switch from red to blue, a mid-point if you will. If you are talking positive/negative like in my Pennsylvania map, zero makes a whole lot of sense. Anything above zero, blue, anything below zero red.

Sometimes, you will see a third colour, maybe a grey or a purple, between that red and blue. That encodes a fuzzier split between positive and negative. Say you want to give a margin of 1%, i.e. any geographic area that has growth between +1% and -1%. That intrinsically means the bin is both positive and negative at the same time, so a neutral colour like grey or a blend of the two colours, a purple in the case of red and blue, makes a whole lot of sense.
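As a rough sketch of that logic, assuming a midpoint of zero and a margin of one percentage point: the function name `diverging_class` and the colour lookup are hypothetical, chosen only to illustrate the idea of a neutral band.

```python
def diverging_class(value, midpoint=0.0, margin=0.0):
    """Classify a value as 'below', 'neutral', or 'above' the midpoint.

    Anything within `margin` of the midpoint (inclusive) counts as neutral.
    """
    if abs(value - midpoint) <= margin:
        return "neutral"
    return "above" if value > midpoint else "below"


# Hypothetical colour assignment for a red-blue scheme with a grey middle.
COLOURS = {"below": "red", "neutral": "grey", "above": "blue"}

for growth in (-2.5, -0.4, 0.9, 3.0):
    group = diverging_class(growth, midpoint=0.0, margin=1.0)
    print(f"{growth:+.1f}% -> {group} ({COLOURS[group]})")
```

The `midpoint` parameter is the important bit. Zero is the natural choice for growth data, and any other value should be a benchmark the reader can identify.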

Here we have nothing like that. Instead we jump from a light yellow two-to-three weeks to a light blue three-to-four weeks.

What about that yellow? In a spectrum of dark blue to light blue, you will see lighter blues than darker blues. But in a red spectrum, that light red becomes pinkish or salmonish depending on the exact type of red you use. (Conversation for another day.) Personal preferences will often push clients to ask a designer to “use less pink” in their maps. I can’t tell you the number of times I’ve heard that.

If that comes up, designers will often keep their blue side of the legend from the dark to light—no complaints there, or at least I’ve never heard any. But for the red side, they’ll switch to using hue or type of colour instead of dark to light red.

Not all colours are as dark as others. Blue and red can be pretty dark. Yellow, however, is a fairly light colour. Imagine you converted the colours to greyscale: you would have very dark greys for blue and red, but yellow would be consistently far lighter than the other two.
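You can check that greyscale claim with a bit of arithmetic. The sketch below computes relative luminance using the standard sRGB formula. The pure red, blue, and yellow values are my assumption for illustration; they are not the colours in the Inquirer map.

```python
def srgb_to_linear(channel):
    """Convert one 0-255 sRGB channel to linear light (0.0-1.0)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an sRGB colour: 0.0 is black, 1.0 is white."""
    r, g, b = (srgb_to_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


# Pure primaries, assumed for illustration only.
SAMPLES = {
    "red": (255, 0, 0),
    "blue": (0, 0, 255),
    "yellow": (255, 255, 0),
}

for name, rgb in SAMPLES.items():
    print(f"{name:<6} {relative_luminance(rgb):.4f}")
```

Pure yellow lands near 0.93, pure red near 0.21, and pure blue near 0.07, which is why yellow can stand in for the light end of a red ramp.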

The designer can use the light yellow as the light red. But to link the yellow to red, they need to move through the hues or colours between the two. There’s a whole conversation here about colour theory and pigment and light absorption vs. pixels and light emission, but let’s go back to the colours you learned in primary school (pigment and light absorption). Take your colour wheel and what sits between red and yellow? Orange.

And so if a client objects to a light pink, you’ll see a pseudo dark-to-light red spectrum that uses a dark red, a medium orange, and a light yellow. Just like we see here in this Inquirer map.

Back to the two-to-three week and three-to-four week switch, though. What’s the deal? This is my sticking point with the graphic. I am looking for the explanation of why the sudden break in colour here, but I don’t see any obvious one.

Why would you use this colour scheme where blue and red diverge around a non-zero value? Let’s say the average home in the region sells in three weeks. Then any of the zip codes in red are selling faster than average, hot markets, and those taking longer than average are in blue, cold markets. Maybe it’s the current average, however. What if it were the average last year? Or the national average? These all serve as benchmarks for the presented data and provide valuable context to understand the market.

Unfortunately it’s not clear what, if any, benchmarks the divergence point in this map reflects. And if there is no reason to change colours mid-legend, with only six bins, a designer could find a single colour, a blue or purple for example, and then provide five additional lighter/darker shades of that to indicate increasing/decreasing levels of speed at which homes sell.
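For the single-colour alternative, a minimal sketch might look like the following. The light and dark blues are hypothetical endpoints I picked, and the bin labels are inferred from the description of the legend above rather than copied from the map. Straight-line interpolation in RGB is also a simplification; it is not perceptually uniform, so a real map would want properly tested shades.

```python
def sequential_ramp(light, dark, n):
    """Return n RGB colours evenly interpolated from light to dark."""
    ramp = []
    for i in range(n):
        t = i / (n - 1)  # 0.0 at the light end, 1.0 at the dark end
        ramp.append(tuple(round(l + (d - l) * t) for l, d in zip(light, dark)))
    return ramp


def to_hex(rgb):
    """Format an (r, g, b) tuple as a hex colour string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical endpoints; any single hue would do.
LIGHT_BLUE = (222, 235, 247)
DARK_BLUE = (8, 48, 107)

# Bin labels inferred from the legend description, slowest sales first.
BINS = ["5+ weeks", "4-5 weeks", "3-4 weeks", "2-3 weeks", "1-2 weeks", "1 week or less"]

shades = [to_hex(c) for c in sequential_ramp(LIGHT_BLUE, DARK_BLUE, len(BINS))]
for label, shade in zip(BINS, shades):
    print(f"{label:<15} {shade}")
```

Here the darkest shade goes to the fastest-selling bin, so darker reads as hotter. Reversing `BINS` would flip that, and either direction works so long as the legend says which.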

Overall, I left this piece a wee bit confused. The general trend of regional differences in how quickly homes are selling? I get that. But because there’s a non-logical break between red and blue here—or at least one I fail to see in the graphic—this map would work almost as well if each bin were a separate colour entirely, using ROYGBIV as a base for example.

Credit for the piece goes to John Duchneskie.

Farewell, Cardboard Cutouts

In 2020, baseball did not permit fans to attend regular season matches. (They changed this for the playoffs.) Instead, many stadiums opted for cardboard cutouts: fans often paid a fee and submitted a picture that the team printed on cardboard cutouts. Like so many things we will say about 2020, it was surreal.

But in Philadelphia at least, cardboard cutouts are out, and human fans are in. The state government in Harrisburg and the city government will allow 20% capacity at outdoor stadiums and 15% for indoor stadiums.
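The arithmetic behind those limits is simple enough to sketch. The 20% and 15% rates come from the announcement, while the venue capacities below are round hypothetical numbers, not the real figures for any Philadelphia stadium. Rounding down is also my assumption about how a fractional seat would be handled.

```python
OUTDOOR_PERCENT = 20  # permitted share of capacity at outdoor stadiums
INDOOR_PERCENT = 15   # permitted share of capacity at indoor stadiums


def permitted_attendance(capacity, outdoor):
    """Return the permitted attendance, rounded down to whole people."""
    percent = OUTDOOR_PERCENT if outdoor else INDOOR_PERCENT
    return capacity * percent // 100


# Hypothetical capacities for illustration only.
print(permitted_attendance(40_000, outdoor=True))   # 8000
print(permitted_attendance(20_000, outdoor=False))  # 3000
```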

The Philadelphia Inquirer created a small graphic for its homepage to capture this news.

I cannot wait to safely attend a live match. C’mon, vaccines.

I intentionally included other site elements in the cropping to show how the graphic fits into the broader site. The extra white space around the image helps focus attention on the datagraphic over the numerous photographic elements for each article. Clicking on other tabs in the section brings up full-component-width graphics.

To the graphic itself.

Still can’t wait…

My guess would be this was a quick turnaround piece. There are a few things going on here. The first and most obvious one, the squares as spectators. Now I confess this confused me at first. I was not entirely certain what the coloured squares meant; they mean in-person attendees. Was this supposed to be an overall stadium? Or was it a representative seating section?

The quick turnaround becomes important, because this is probably how I would have first conceptualised the graphic. But, with more time, I may have attempted to incorporate the shape of the playing field, be it a baseball diamond or basketball court, or hockey rink—I know all the sports terms!—and surrounded them with shapes representing a certain number of spectators. Squares might not work in that case because of the curves. Circles? Hexagons? Regardless of the shape, the filling of occupied seats would be the same as here, but it would perhaps be clearer to some readers, i.e. me.

Second, we get to the table below the graphics. Here we have a subtle design decision. Note that here the designer greyed out the normal capacity figures. The new figures at that 20% and 15% rates are what appear in black bold text. My usual instinct is to use typographic weight, regular vs. bold, in these situations. But the grey here works equally well.

Third, and this also involves the table, we have the first game data. We talked about the comparison of the capacity and permitted attendance. But I wonder, did the date of the first game with fans need to be displayed in the same way as the permitted attendance? Because the news isn’t the dates of the first games (at least not as I read the news) but the numbers of attendees. And because of that, maybe I would have reduced the size of the type for the date of the first game. Or, conversely, set the type for the new attendance in a larger point size.

Overall, I enjoyed seeing this news presented visually, even if I was left confused.

Credit for the piece goes to John Duchneskie.

Cheesesteaks and Politics

For those unaware, Pennsylvania matters in the 2020 election. And it has mattered for years as a perennial swing state. There are of course the visits to steel mill cities like Pittsburgh, deindustrialised places like Johnstown, and unions love visits to places in Lackawanna and Luzerne. (You can read more about Pennsylvania as a swing state in my latest analysis here.)

But I want to focus on visits to Philadelphia. Because they inevitably involve the candidate consuming a cheesesteak. The Economist’s sister magazine, 1843, recently published an article on this very subject. And the whole thing is worth a read.

How have I managed to find this relevant to a blog about data visualisation? Well, they included a recipe to help people understand just what goes into the traditional Philadelphia dish.

Personally, I always have to confess, I’ve never been a huge fan. But, I’ll take provolone over whiz any day.

Credit for the piece goes to Jake Read.

The Size of the California Wildfires Compared to Philly

The West Coast is a different scale than the East Coast. After all, California alone is almost the size of New England and parts of the Mid-Atlantic combined. So when we take that enormous size into consideration, how big are these fires on an East Coast scale? It can be difficult to imagine.

Thankfully the Philadelphia Inquirer addressed the issue.

It’s a simple concept, but I love these kinds of graphics. The East Coast is dense and cities and towns are clustered closer together, since they were founded before personal automobiles were things. And so the August Complex fire in California would cover a significant portion of the Philadelphia metropolitan area, almost wiping it all off the map.

Credit for the piece goes to John Duchneskie.

Baby, It’s Hot Outside

Those of you living on the East Coast, specifically the Mid-Atlantic, know that presently the weather is quite warm outside. As in levels of dangerous heat and humidity. Personally, your author has not left his flat in a few days now because it is so bad.

Alas, not everyone has access to air conditioning in his or her abode. Consequently, they need to look to public spaces with air conditioning. Usually that means libraries or public buildings. But here in Philadelphia, have people considered the subway?

Billy Penn investigated the temperatures in Philadelphia’s subsurface stations along the Broad Street and Market–Frankford Lines. (Philadelphia’s third and oft-forgot line, the Patco, was untested.) What they found is that temperatures in the stations were significantly below the temperatures above ground. The Market–Frankford stations, for example, were less than 100°F.

Just explore the rails…

Of course that misses the 2nd Street station in Old City, but otherwise picks up all the Market–Frankford stations situated underground.

Then there is the Broad Street Line.

More rail riding…

Here, I do have a question about why the line wasn’t investigated from north to south. It ran only as far north as Girard, stopping well short of north Philadelphia neighbourhoods, and then as far south as Snyder, missing both Oregon and Pattison (sorry, corporately branded AT&T) stations. The robustness of the dataset is a bit worrying.

The colours here too mean nothing. Instead blue is used for the blue-coloured Market–Frankford line and orange for the orange-coloured Broad Street line. (The Patco line would have been red.) Here was a missed opportunity to encode temperature data along the route.

Finally, if the sidewalk temperatures were measured at each station, I would want to see that data alongside and perhaps run some comparisons.

This is an interesting story, but some more exploration and visualisation of the data could have taken it to the next level.

Credit for the piece goes to Danya Henninger.

PECO Outages Five Years Ago

Christmas time is a time when people receive gifts. Well this year was no different and I received a few. One, however, was in a box stuffed with old newspaper pages. And it turns out one of said pages had a graphic on it. So let us spend today looking at this little blast from the past.

The piece looks at PECO outages, PECO being the Philadelphia region’s main electricity supplier. The article is full page and is both headed and footed with photography, the graphic in which we are interested sits centre stage in the middle of the page.

Full page design.

Overall the graphic is fairly compact and works well at showing the distribution of the outages, which the bar chart below the choropleth shows were historically significant. (Despite my years in Chicago, I was somehow in the area for all but the storm written about and can confirm that they were, in fact, disruptive.)

Ice storms suck.

The choropleth works, but I question the colour scheme. The bins diverge at about 50%, which to my knowledge marks no special boundary other than “half”. If that yellow bin represented, say, the average number of outages per storm or the acceptable number of outages per storm, sure, I could buy it. Otherwise, this is really just degrees of severity along one particular axis. I would have either kept the bins all red or all blue and proceeded from a light of either to a dark of either.

I probably would have also dropped Philadelphia entirely from the map, but I can understand how it may be important to geographically anchor readers in the most populous county to orientate themselves to a story about suburbia.

Lastly, I have one data question. With power lines down during an ice storm, I would be curious to see less of the important roadways the map depicts and more of other variables. What about things like average temperature during the storm? Was the more urban and built-up Delaware County less susceptible because of an urban heat bubble preventing water from freezing? Or what about trees? Does the impact in the more rural areas have anything to do with increasing numbers of trees as one heads away from the city?

Those last data questions were definitely out of scope for the graphic, but I nevertheless remain curious. But then again, this piece is almost five years old. Just a look at how some graphical forms remain in use because of their solid ability to communicate data. Long live the bar chart. Long live the choropleth.

Credit for the piece goes to the Philadelphia Inquirer graphics department.

Philly Rules

Yo. C’mon, bro. This jawn is getting tired. Just stop already.

If you did not catch it this week, the most important news was Donald Trump disinviting the Super Bowl champions Eagles to the White House to celebrate their victory over the Patriots. He then lied about Eagles players kneeling during the US anthem—no player did during the 2017 season. He then claimed that the Eagles abandoned their fans. Yeah, good luck convincing the city of that.

So naturally we have a Friday graphic for youse.

That’s 25,304.

Full disclosure: I root for the Patriots. But I mean, seriously, can’t youse guys do the math?

Tech Economies in the USA

Earlier this March the Washington Post published a piece looking at the twenty finalist contenders for the second Amazon headquarters. Specifically it explored how the cities rank in metrics that speak to a city’s technology and innovation economy.

That, while incredibly fascinating, is not noteworthy in and of itself. Though I will say the article’s online title is neatly presented, split half-and-half with the vertical graphic showing the cities ranked.

I really like how this title space received a special design.

But the point that was really neat was the interactivity that followed. Here you can see a dropdown from which the user selects a city of interest—surprise, surprise we are looking at Philadelphia. From that point on, the piece keeps the selected city highlighted in every graphic that follows.

Looking at Philly

Again, that is nothing truly surprising, but it is neat to see. What would have taken it to the next step is if each of those associated paragraphs were tailored to the specific city. Instead, they appear to be general paragraphs.

But overall, it does a really nice job of comparing the twenty cities—it’s actually fewer because both Washington and New York have multiple sites per metro area—across the different metrics.

The only part that left me scratching my head a bit was the colour choice. I am not certain that it needs the blue-green to yellow-green palette. Those colours seem defined by a city’s placement on the overall list and I am not convinced that the piece would not have still worked if they had been only a single colour, using another colour to define the selected city.

Credit for the piece goes to Darla Cameron and Jonathan O’Connell.

Bus Transit in Philadelphia

I have lived in Philadelphia for almost ten months now and that time can be split into two different residences. For the first, I took the El to and from Centre City. For the second, I walk to and from work. I look for living spaces near transit lines. In Chicago I took the El for eight years to get home. But to get to work, I often used the 143 express bus. Personally, I prefer trains and subways to busses—faster, dedicated right-of-way, Amtrak even has WiFi. But, busses are an integral part of a dense city’s transit network. You can cram dozens of people into one vehicle and remove several cars from the road. Here in Philadelphia, however, as the Inquirer reports, bus ridership is down over the last two years at the same time as ride-hailing apps are growing in usage.

For those interested in urban planning and transit, the article is well worth the read. But let’s look at one of the graphics for the article.

Lots of red in Centre City

The map uses narrow lines for bus routes and the designer wisely chose to alternate between only two shades of a colour: high and low values of either growth (green) or decline (red). But, and this is where it might be tricky given the map, I would probably tone down all the greys in the map to be more of an even colour. And I would ditch the heavy black lines representing borders. They draw more attention and grab the eye first, well before the movement to the green and red lines.

And the piece did a good job with the Uber wait time map comparison as well. It uses the same colour pattern and map, small multiple style, and then you can see quite clearly the loss of the entire dark purple data bin. It is a simple, but very effective graphic. My favourite kind.

Still haven’t used Uber yet. Unless you count the times I’m being put into one by a friend…

Anyway, from the data side, I would be really curious to see the breakout for trolleys versus busses—yes, folks, Philly still has several trolley lines. If only because, by looking at the map, those routes seem to be in the green and growing category. So as I complain to everyone here in Philly, Philly, build more subways (and trolleys). But, as the article shows, don’t forget about the bus network either.

Credit for the piece goes to the Inquirer graphics department.

Traffic Accidents in Philadelphia

I’m working on a set of stories and in the course of that research I came across this article from Philly.com exploring traffic accidents in Philadelphia.

Lots of red there…

The big draw for the piece is the heat map for Philadelphia. Of course at this scale the map is pretty much meaningless. Consequently you need to zoom in for any significant insights. This view is of the downtown part of the city and the western neighbourhoods.

A more granular view

As you can see there are obvious stretches of red. As a new resident of the city, I can tell you that you can connect the dots along a few key routes: I-76, I-676, and I-95. That and a few arterial streets.

Now while I do not love the colour palette, the form of the visualisation works. The same cannot be said for other parts of the piece. Yes, there are too many factettes. But…pie charts.

This is the bad kind of pie

From a design standpoint, first is the layout. The legend needs to be closer to the actual chart. Two, well, we all know my dislike of pie charts, in particular those with lots of data points, which this piece has. But that gets me to point three. Note that there are so many pieces the pie chart loops round its palette and begins recycling colours. Automotives and unicycles are the same blue. Yep, unicycles. (Also bi- and tricycles, but c’mon, I just want to picture an accident with a unicycle.)

If you are going to have so many data points in the pie chart, they should be encoded in different colours. Of course, with so many data points, it would be difficult to find so many distinguishable but also not garish colours. But when you get to that point, you might also be at the point where a pie chart is a bad form for the visualisation. If I had the time this morning I would create a quick bar chart to show how it would perform better, but I do not. Trust me, though, it would.
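In lieu of a proper redesign, here is a rough sketch of both problems in text form. Everything in it is hypothetical: the four-colour palette, the category names, and the counts are invented for illustration and are not the article’s data. The first function shows how cycling through a short palette hands two slices the same colour. The second prints a sorted bar chart that needs no colour at all.

```python
def assign_colours(categories, palette):
    """Cycle through the palette, repeating colours once it runs out."""
    return {c: palette[i % len(palette)] for i, c in enumerate(categories)}


def text_bar_chart(counts, width=40):
    """Return one line per category, largest first, bars scaled to `width`."""
    largest = max(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    lines = []
    for label, n in ordered:
        bar = "#" * max(1, round(n / largest * width))
        lines.append(f"{label:<12} {bar} {n}")
    return lines


# Hypothetical palette and counts, invented for illustration.
PALETTE = ["blue", "orange", "green", "red"]
COUNTS = {"Automobile": 500, "Truck": 120, "Bus": 60, "Motorcycle": 40, "Cycle": 25}

colours = assign_colours(list(COUNTS), PALETTE)
print(colours["Automobile"], colours["Cycle"])  # blue blue

for line in text_bar_chart(COUNTS):
    print(line)
```

With five categories and four colours, the fifth wraps back to the first. The bar chart sidesteps that because position and length carry the comparison, and the labels sit right next to the bars instead of in a distant legend.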

Credit for the piece goes to Michele Tranquilli.