The Evolution of the S&P 500

I found myself doing a bit of summer cleaning yesterday and I stumbled upon a few graphics of interest. This one comes from a September 2016 Wall Street Journal article about the changes in the S&P 500, an index of roughly 500 of the largest publicly traded American companies.

In terms of the page design, if it were not for that giant 1/6-page advert in the lower-right corner, this graphic could dominate the page visually. The bulk of it sits above the page’s fold, and the only other competing element is a headshot to the upper right. Regardless, it was clearly enough to grab my attention as I was going through some papers.

The overall page

As for the graphic itself, I probably would have done some things differently.


To start, are these actual tree maps? Or are they things attempting to look like tree maps? It is difficult to tell. In an actual tree map, the rectangles are not just arranged by convenience, as they appear to be here. Instead, they are ordered by descending (or perhaps occasionally ascending) area within groupings.

The groupings would have been particularly powerful here.  Imagine instead of disparate blue boxes for industrials and utilities in the latter two years that they were combined into a single box. In 2001, that box may have been larger than the orange financials. Then by 2016, you would have seen those boxes switch places—in both years well behind the green boxes of 2001 debuts. If instead the goal was to show the percentages, as it might be given each percentage is labelled, a straight bar chart would have sufficed.
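The ordering I have in mind can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the group names, sectors, and percentages are invented for illustration and are not the Journal's figures, and `order_for_treemap` is a hypothetical helper, not part of any charting library.

```python
from collections import defaultdict

# Hypothetical (group, sector, share of index in %) rows; NOT the Journal's data.
rows = [
    ("industrials+utilities", "Industrials", 10.0),
    ("industrials+utilities", "Utilities", 3.0),
    ("financials", "Financials", 12.0),
    ("other", "Everything else", 20.0),
]

def order_for_treemap(rows):
    """Order groups by total share, then sectors within each group, both descending."""
    totals = defaultdict(float)
    for group, _sector, share in rows:
        totals[group] += share
    # Sort key: largest group total first, then largest sector within the group.
    return sorted(rows, key=lambda r: (-totals[r[0]], -r[2]))

ordered = order_for_treemap(rows)
for group, sector, share in ordered:
    print(f"{group:24s}{sector:18s}{share:5.1f}")
```

With these made-up numbers, the combined industrials and utilities group (13) outranks financials (12), even though neither of its sectors does so alone. That is the kind of story the grouping would surface.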

I am not always a fan of circles for sizes, like those along the bottom. But the bigger problem I have here is the alignment of the labelling and the pseudo-tree maps. One of my first questions was “how big are these years?”. However, that was one of the last points displayed, and it is separated from the tree maps by the listing of the largest company in the index from that year. I would have kept the total market cap closer to the trees, perhaps used the whole length of line beneath the trees, and pushed the table labels into the rather large gap between 1976 and 2001.

Credit for the piece goes to the Wall Street Journal graphics department.

The Economic Impact of Hurricanes

Yesterday Hurricane Ophelia hit Ireland and the United Kingdom. Yes, the two islands get hit with ferocious storms from time to time, but rarely do they enjoy the hurricanes like we do on the eastern seaboards of Canada, Mexico, and the United States.

Earlier this hurricane season the US had to deal with Harvey, Irma, and Maria. And in early October the Wall Street Journal published a piece that looked at the economic impact of the first two of those hurricanes as exhibited in economic data.

Overall the piece does a nice job explaining how hurricanes impact different sectors of the economy, well, differently. And wouldn’t you know it that leisure and hospitality is the hardest hit? But then they put together this stacked bar chart showing the impact of the hurricanes on both Florida/Georgia and Texas for Irma and Harvey, respectively.

I just want a common baseline…

The problem is that the stacked bar chart does not allow us to examine each hurricane as a specific data set. Because the Harvey data set is first, we have the common baseline and can compare the lengths of the magenta-ish bars. But what about the blue sets for Irma? How large is natural resources and mining compared to professional and business services? It is incredibly difficult to tell because the two bars do not start at the same point. You must mentally move the bars to the same baseline and then hope your brain can accurately capture the length.

Instead, a split bar chart with each sector having two bars would have been preferable. Or, barring that, two plots under the same title. Then you could even sort the data sets and make it even easier to see which sectors were more important in the impacted areas.
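The baseline problem can be made concrete with a little arithmetic. The sketch below is purely illustrative: the values are invented, not the figures from the article, and the two helper functions are hypothetical names of my own. It computes where each Irma bar would start and end in a stacked layout versus a layout where every bar starts at zero.

```python
# Hypothetical values per sector; NOT the figures from the article.
sectors = [
    "Leisure and hospitality",
    "Natural resources and mining",
    "Professional and business services",
]
harvey = [4.0, 9.0, 6.0]
irma = [7.0, 1.0, 5.0]

def stacked_segments(first, second):
    """(start, end) of each second-series bar when stacked on top of the first."""
    return [(a, a + b) for a, b in zip(first, second)]

def grouped_segments(second):
    """(start, end) of each second-series bar when all bars start at zero."""
    return [(0.0, b) for b in second]

stacked = stacked_segments(harvey, irma)
grouped = grouped_segments(irma)

for name, s, g in zip(sectors, stacked, grouped):
    print(f"{name:36s} stacked starts at {s[0]:4.1f}, grouped starts at {g[0]:4.1f}")
```

In the stacked layout every Irma bar begins wherever the Harvey bar happened to end, so the reader has to subtract before comparing. With a shared zero baseline, the end of the bar is the value.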

Stacked bar charts work when you are trying to show total magnitude and the breakdowns are incidental to the point. But as soon as the comparison of the breakdowns becomes important, it’s time to make another chart.

Credit for the piece goes to Andrew Van Dam.

Harvey’s Rainfall Totals

Hurricane Harvey made landfall north of Corpus Christi, Texas, late Friday night. By Monday morning, Houston had flooded as nearly two feet of water fell upon the city, built on and around wetlands long ago paved over with concrete. Naturally the news has covered this story in depth all weekend. Even leading up to it, when I was still posting eclipse things, various outlets had projections and why-we-should-care graphics. But as the storm begins to move back into the Gulf, only to move back inland tomorrow, I wanted to compare some of the graphics I have been seeing.

Of course, not all graphics are the same, let alone cover the same things. So this morning we are looking at just the rainfall total maps of a few different outlets.

From the Washington Post, we have the following graphic.

The Post's rainfall graphic

The palette chosen performs well at quickly scaling up to the record level of rainfall, i.e. the 20+ inches realm, by quickly shifting from the green–blue palette into dark purples.

Then we have the Wall Street Journal’s graphic.

The Journal's graphic

Here we have a more familiar blue–red diverging spectrum, with the point of divergence set at 20 inches.

Lastly, we have the New York Times’ graphic. Though in this case, it’s not an exact like-for-like comparison. I could not find a graphic mapping total rainfall; instead, this one shows projected rainfall totals. But the design is for the same type of map, i.e. how much rain falls in a location.

The Times' graphic

The Post takes the closest approach to a true continuous-spectrum palette, where the shift from dry to drenched is gradual. It makes for a smoother, more blended-looking map. Somewhere around that 20-inch point, however, the palette shifts from the green-to-blue range to purple. It emphasises the record-hitting point, but otherwise the totals are presented as more fluid. Perhaps correctly, since rain does not neatly fall evenly into pixels.

By comparison, the Journal segments the rainfall totals into bins of blues. The scale is not even: the lighter blues each span two inches, the darker ones upwards of five. And then again, like the Post, the Journal separates 20+ inches as a different colour, here switching to reds.
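For the curious, uneven bins like these are easy to sketch in Python. The bin edges below are my own assumption, chosen only to mimic narrow bins at the low end, wider ones higher up, and a separate 20+ class. They are not read off the Journal's legend, and `rainfall_bin` is a hypothetical helper.

```python
from bisect import bisect_right

# Hypothetical bin edges in inches; NOT taken from the Journal's legend.
EDGES = [2, 4, 6, 10, 15, 20]
LABELS = ["0-2", "2-4", "4-6", "6-10", "10-15", "15-20", "20+"]

def rainfall_bin(inches):
    """Return the label of the bin a rainfall total falls into.

    A total equal to an edge goes into the higher bin, so exactly
    20 inches lands in the "20+" class.
    """
    return LABELS[bisect_right(EDGES, inches)]

for total in (1.0, 5.5, 12.0, 24.5):
    print(total, "->", rainfall_bin(total))
```

Everything from 10 up to 15 inches gets the same colour here, while 0 to 2 gets its own. That is the trade-off of segmenting: sharper bands on the map, at the cost of detail within the wider bins.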

Lastly the Times keeps to a simple segmented bin palette of all blues. 20+ inches is rendered as just a dark blue.

Each map has pluses, each has minuses. The Times map, for example, is simple and quick to understand. Southeastern Texas will be wet by the middle of next week. If your goal is only to communicate that point, well this map has done its job. It is worth noting, again, that this is a map of projections. Because the other thing missing from this map is the storm’s path. So if the goal were to showcase the rainfall along the storm’s path, well this graphic does not accomplish that nearly as well as the other two.

The Post and the Journal both show the track of the storm. The Journal takes it one step further and plots its projected course through Thursday. This helps us really see if not understand the east side problem of hurricanes. That is, the eastern quadrants of hurricanes typically experience the heaviest amounts of rain. And as the darker portions of the map all fall to the north and east of those lines, it reaffirms this for us.

So what really differentiates the two? The colour palette and its application. The Post’s palette is more natural as, again, rain does not fall neatly into bins and instead makes for blurred and messy totals across a map. Separating the heaviest rains into the purples, however, makes a lot of sense as that amount of rainfall, as we are seeing this morning, makes for a mess in Houston.

But the point of a graphic is to translate nature and the observed into a digestible and pointed statement of the observed. What should I learn? Why should I care? The Journal, like the Post, does a fantastic job of splitting out the 20+ inch totals by using a divergent palette. But instead of blending into that colour, the distinction is sharp. And then below that threshold, we get rainfall totals segmented into just a few bins. These help the reader see, also more starkly because of the selection of the specific blues, just where the bands of heavy rain will fall.

I do want to point out, however, that all of these maps occur in articles with lots of other fantastic graphics that visually explore lots of details about the story. And in particular, I want to highlight that the normal bit where I state the credits includes a lot of people. Creating a whole host of graphics to support a story takes a lot of work.

Credit for the Washington Post piece goes to Darla Cameron, Samuel Granados, Chris Alcantara, and Gabriel Florit.

Credit for the Wall Street Journal piece goes to Bradley Olson, Arian Campo-Flores, Miguel Bustillo, Dan Frosch, Erin Ailworth, Christopher M. Matthews, and Russell Gold.

Credit for the New York Times piece goes to Gregor Aisch, Sarah Almukhtar, Jeremy Ashkenas, Matthew Bloch, Joe Burgess, Audrey Carlsen, Ford Fessenden, Troy Griggs, K.K. Rebecca Lai, Jasmine C. Lee, Jugal K. Patel, Adam Pearce, Bedel Saget, Anjali Singhvi, Joe Ward, and Josh Williams.

The World’s Fighter Jets

As you know, I am a sucker for military-related things. So here we have a piece from the Wall Street Journal on the leading fighter jets of the world. If you have a bone to pick on which jets were included, please take that up with them and not me.

Of course, speed isn't everything…

The screenshot is from the end of an animation where they depict the maximum range and the relative speed of each aircraft against each other.

Credit for the piece goes to Andrew Barnett, Jason French, and Robert Wall.

Could Marine Le Pen Have Won?

Well not likely—it was going to be tough regardless.

Today’s piece is also from the Wall Street Journal and it was posted Saturday, the day before the election. It used a Sankey diagram to explore the support that Le Pen would have needed to draw from every candidate in the first round to get over the 50% mark in the second round.

Turns out she didn't get the maths

If anything this chart is not the story. The story is that the final count I saw put Macron not on 60%, but on just over 66%.

Turns out she couldn’t.

Credit for the piece goes to Stacy Meichtry and Jovi Juan.

Vive la France

Emmanuel Macron won the French presidential election yesterday. So guess what we have a graphic or two of this week? If you guessed Mongolian puppies, you were wrong.

Thursday afternoon the Wall Street Journal (they seem to really be upping their game of late) published an article breaking down the connection between support for Le Pen in the first round and unemployment. For me, the key to the article was the following graphic, which plots those two variables by department. The departments that she won, generally speaking, suffer higher unemployment.

Unemployment and Le Pen support

Colour coding relates to the winner of the department. I am not certain that sizing each department by its number of voters matters as much. But the annotation of particular departments, qualified as being limited to the French mainland (see my problem back in April about when France is more than France), flows through the several graphics in the piece.

This is a piece from the Thursday running up to Sunday’s vote. Tomorrow we will look at a piece from the day before the vote that looked at another key component of Macron’s win.

Credit for the piece goes to Martin Burch and Renée Rigdon.

Trump’s Wall

Another day, another story about the administration to cover with data-driven graphics. We are approaching Trump’s 100th day in office, traditionally the first point at which we examine the impact of the new president. And well, beyond appointing a Supreme Court justice, it is hard to find a lot of things President Trump has actually done. But on his 99th day, he will also need to approve a Congressional bill to fund the government, or else the government shuts down on his 100th day. Not exactly the look of a successful head of state and government.

Why do I bring this up? Well, one of the many things that may or may not make it into the bill is funding for Trump’s wall that Mexico will pay for, but at an undetermined later date, because he wants to get started building the wall early, but late because he promised to start on Day 1.

Several weeks ago the Wall Street Journal published a fantastic piece on the current wall bordering Mexico. It examines the current state of fencing and whether parts of the border are fenced or not. It turns out a large portion is not. But, the piece goes on to explain just why large sections are not.

The wall today

You should read the full piece for a better understanding. Because while the president says building the wall will cost $10 billion or less, real estimates place the costs at double that. Plus there would be lawsuits because, spoiler: significant sections of the border wall would cross private property, national parks, and Native American reservations. Also the southern border crosses varied terrain, from rivers to deserts to mountains, some lengths of which are really difficult to build walls upon.

But the part that I really like about the piece is this scatter plot that examines the portion of the border fenced vs. the number of apprehensions. It does a brilliant job of highlighting the section of the border that would benefit most significantly from fencing, i.e. a sector with minimal fencing and a high number of apprehensions: the Rio Grande Valley.

Where would more fencing make more of a difference

And to make that point clear, the designers did a great job of annotating the plot to help the reader understand the plot’s meaning. As some of my readers will recall, I am not a huge fan of bubble plots. But here there is some value. The biggest bubbles are all in the lower portion of fenced sectors. Consequently, one can see that those rather well-fenced sectors would see diminishing returns by completing the wall. A more economical approach would be to target a sector that has low mileage of fencing but also a high number of apprehensions: a big circle in the lower right of the chart. And that Rio Grande Valley sector sits right there.

Overall, a fantastic piece by the Wall Street Journal.

Credit for the piece goes to Stephanie Stamm, Renée Rigdon, and Dudley Althaus.


Get Ready Folks

Well have we got an interesting week this week. Friday begins Trump Time. So hold onto your Twitter accounts, folks. But before we get there, I wanted to do a short week of some data-driven graphics that take a look at the state of things.

Instead of what I had intended for today, let us take a look at a new post from the Wall Street Journal that examines GDP, inflation, industrial production, and the unemployment rate in advanced economies. At its most basic level, the graphics show how many of the 39 advanced economies have a value within a one-percentage-point range. The size of the dots indicates how many countries fall within the bin.
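The binning behind a chart like this can be sketched in a few lines of Python. To be clear, this is my own guess at the approach, not the Journal's method: the growth rates are invented, and `count_by_point_bin` is a hypothetical helper that groups values into one-percentage-point bins and counts how many fall into each.

```python
import math
from collections import Counter

# Hypothetical GDP growth rates in percent; NOT the data from the piece.
growth = [0.4, 1.2, 1.7, 1.9, 2.3, -0.6, 1.1]

def count_by_point_bin(values):
    """Count values per one-percentage-point bin, keyed by the bin's lower edge.

    math.floor keeps negative values in the right bin: -0.6 belongs
    to the bin from -1 up to 0, not to the bin starting at 0.
    """
    return Counter(math.floor(v) for v in values)

counts = count_by_point_bin(growth)
for lower in sorted(counts):
    print(f"{lower} to {lower + 1}: {counts[lower]} countries")
```

Each count would then drive the size of one dot. Note that nothing in this step produces a second variable that colour would need to encode.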

A look at advanced economies' GDPs

What keeps getting me, however, is the colour. Nowhere does the piece explain what the colour represents. Does it represent anything? I think it might only be used to show the ranges in the values, not the number of countries sharing said values. And if that is the case, it is a poor design decision.

My eye goes to the colour first before it goes to the dot density let alone the size of the dots. Like a Magic Eye, when I stare at the piece long enough, I begin to see the overall trend for each metric. But blink and the colours reassert their visual dominance.

I wonder what would happen if the graphic settled on a single colour? My instinct says that the patterns would become far clearer, because colour change would no longer be a visual pattern needing interpretation—even though it needs no interpretation from a data standpoint. By limiting the number of visual patterns, the piece would make the data stand out more clearly and make for clearer communication.

If an editor screams something like “It needz more colourz!!1!”, I would reserve four separate colours and then use one and only one for each of the four metrics.

That all said, what the piece does really well is explain segments of the data. In the above screenshot, you can clearly see and get the overall GDP story. But then from there you read down and get explanations or callouts of the overall to provide more context and information. The designer greys out the remainder of the dots and allows the colour to emphasise those countries in focus. A lightly transparent overlay allows for the background dots to remain faintly visible while the text can clearly be read.

All in all, I am not sure where I fall on this particular piece. It does some things well, others not so much. But either way, the piece does paint an interesting portrait of populism’s potential causes.

Credit for the piece goes to Andrew Van Dam.

Diversity in the 115th Congress

Well, we have arrived at 2017. We all know the big political story in the executive branch. But we also saw elections in the legislative branch. So how different will the 115th Congress look from the 114th? The Wall Street Journal took a look at that in an article.

Congressional diversity

The article’s graphic does a nice job showing the two different compositions. But if we are truly interested in the growth, we could use a line chart to better showcase the data. So what did I do last night? I made that chart. But as I was playing with the data I saw some numbers that stood out for me. So I compared the proportion of minorities in the original graphic to their proportion of the US national population, per Census Bureau data.

Redesigning the original graphic

The line charts, broken out into the House vs. the Senate and then into the two parties, do a really good job of showing how the growth is not equally distributed between the two parties. And the reverse of that is that it shows how one party has failed to diversify between the two congresses.

The 115th Congress might be more diverse than ever. But it has a long way to go.

Credit for the original piece goes to the Wall Street Journal graphics department.

Detroit’s Housing Market

A few weeks ago the Wall Street Journal published a graphic that I thought could use some work. I like line charts, and I think line charts with two or three lines that overlap can be legible. But when I see five lines in five colours in a small space…well not so much.

So I spent 45 minutes attempting to rework the graphic. Admittedly, I did not have source data, so I simply traced the lines as they appeared in the graphic. I kept the copy and dimensions and tried to work within those limitations. Clearly I am biased, but I think the work is now a little bit clearer. I also added for context the Great Recession, during which credit tightened, ergo more properties would likely have been purchased with cash. It’s all about the context.

The original:

The original graphic

And my take:

My take on it all

Credit for the original work goes to the Wall Street Journal graphics department.