America’s Crime Problem

During the pandemic, media reports of the rise of crime have inundated American households. Violent crimes, we are told, are at record highs. One wonders if society is on the verge of collapse.

But last night a few friends asked me to take a look at the data during the pandemic (2020–2021) and see what is actually going on out on the streets in a few big cities. Naturally I agreed and that’s why we have this post today. The first thing to understand, however, is that we do not have a federal-level database where we can cross compare crimes in cities using standardised definitions. The FBI used to produce such a thing, but in 2020 retired it in favour of a new system that, for reasons, local and state agencies have yet to fully embrace. Consequently, just when we need some real data, we have a notable lack of it.

At the very least we have national-level reporting on violent crimes and homicides, the latter of which is a subset of violent crimes, though these reports also depend on local and state agencies self-reporting to the FBI. I also wanted to look not just at whether crime is up of late, but at whether it is up over the last several years. I chose to go back 30 years, or a generation.

We can see one important trend here, that at a national level violent crimes are largely stable at a rate of 400 per 100,000 people. Homicides, however, have climbed by nearly a third. Violent crimes are not rising, but murders are.

My initial charge was to look at cities and violent crime. However, knowing that nationally violent crimes are largely stable, the issue of concern would be how the rise in murders is playing out on American city streets. With the caveat that we do not have a single database to review, I pulled data directly from the five cities of interest: Philadelphia, Chicago, New York, Washington, and Detroit.

I also considered that large cities will have more murders simply by dint of their larger populations. And so when I collected the data, I also tried to find the Census Bureau’s population estimates of the cities during the same time frame. Unfortunately the 2021 estimates are not yet available so I had to use the 2020 population estimates for my 2021 calculations.

First we can see that not all cities report data for the same time period. And for Detroit in particular that makes comparisons tricky. In fact only New York had data back to the beginning of the century. Regardless of the data set’s less than full robustness we can see that in all five cities homicides rose in 2020 and 2021.

Second, however, if we squint through that lack of full data, we see a trend at the city level that aligns with the national level. Homicides, tragically, are indeed up. However, in New York and Washington homicides are still below the data from near 2000 and at that time homicides already appear on a downward trajectory. I would bet that homicides were even higher during the 1990s and that the 2000s represented a long-run decline. In other words, whilst homicides are up, they are still below their peaks. A worrying trend, but far from “the sky is falling”.

That cannot quite be said for other cities. Let’s start with Detroit. Sadly we have too few years of data to draw any conclusion other than that homicides rose compared to the years preceding the pandemic.

That leaves us with Philadelphia and Chicago. Philadelphia has less data available and it’s harder to make a determination of what is happening. But we can say that since 2007, homicides have not been higher. If you look closely though, you can see how there does appear to be a downward trend at the beginning of the line. We do not have enough data like we do with New York and Washington, but I would bet homicides are up in Philadelphia, but still far short of what they were in the 1990s.

Chicago is the oddball. Yes, it saw a peak in homicides during the pandemic. But in 2016 the city didn’t miss the pandemic peak by much. In other words, homicides were staggeringly high in Chicago before the pandemic. If anything, we see a failure to combat high crime rates. But even before that spike in 2016, we see more of a valley floor in homicides. True, at the beginning of the century homicides appear to have trended down. But unlike the other cities here, homicides bottomed out at around 450 a year. I’m not so certain we had a persistent, long-run decline in Chicago with which to start.

And like I said above, we would expect larger populations to have more murders because there are more potential criminals and victims. When we equalise for population we see the same trends, as we would expect, because the city populations have been relatively stable over the last 20 years. Instead what we see is that relative to each other murders are more common in some cities and less so in others.

New York is a great example with nearly 500 murders last year, a number on par with Philadelphia. But New York has over 8 million inhabitants. Philadelphia has just 1.6 million. Consequently New York’s homicide rate is a surprisingly low 5.9 per 100,000 people. Philadelphia’s on the other hand? 35.6.

Philadelphia is near the top of that list, with Washington and Chicago having similar, albeit lower, rates at 31.7 and 30.1, respectively. But sadly Detroit surpasses them all and is in a league of its own: 47.5 in 2021.
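For anyone who wants to check the arithmetic behind a per-capita rate, here is a minimal sketch. The function name is my own and the inputs are deliberately round, illustrative numbers rather than the exact counts and Census estimates behind the charts, so the output will not match the published rates to the decimal.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Return the number of events per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100_000


# Round, illustrative inputs only (assumptions, not the chart data):
# the same number of homicides in a city of 8 million and one of 1.6 million.
big_city = rate_per_100k(500, 8_000_000)
small_city = rate_per_100k(500, 1_600_000)
print(f"{big_city:.2f} vs {small_city:.2f} per 100,000")
```

The same count produces a rate five times higher in the smaller city, which is the whole point of equalising for population.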

Credit for the pieces is mine.

Obfuscating Bars

On Friday, I mentioned in brief that the East Coast was preparing for a storm. One of the cities the storm impacted was Boston and naturally the Boston Globe covered the story. One aspect the paper covered? The snowfall amounts. They did so like this:

All the lack of information

This graphic fails to communicate the breadth and literal depth of the snow. We have two big reasons for that and they are both tied to perspective.

First we have a simple one: bars hiding other bars. I live in Greater Centre City, Philadelphia. That means lots of tall buildings. But if I look out my window, the tall buildings nearer me block my view of the buildings behind. That same approach holds true in this graphic. The tall red columns in southeastern Massachusetts block those of eastern and northeastern parts of the state and parts of New Hampshire as well. Even if we can still see the tops of the columns, we cannot see the bases and thus any real meaningful comparison is lost.

Second: distance. Pretty simple here as well. Later today, go outside. Look at things on your horizon. Note that those things, while perhaps tall such as a tree or a skyscraper, look relatively small compared to those things immediately around you. Same applies here. Bars of the same data, when at opposite ends of the map, will appear sized differently. Below I took the above screenshot and highlighted two observations that differed by only 0.5 inches of snow. But the box I had to draw, a rough proxy for the columns’ actual heights, is 44% larger.

These bars should be about the same.
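A rough way to see why distance distorts the bars is the textbook pinhole-camera relationship, where on-screen size is proportional to true size divided by distance from the viewer. This is only a sketch under that simplifying assumption. The distances below are hypothetical values I picked for illustration, not measurements of the Globe’s graphic, and I have no idea what projection their software actually uses.

```python
def apparent_height(true_height: float, distance: float,
                    focal_length: float = 1.0) -> float:
    """Pinhole projection: drawn size shrinks in proportion to distance."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return focal_length * true_height / distance


# Two bars encoding the SAME value, at hypothetical distances from the viewer.
near = apparent_height(10.0, distance=5.0)
far = apparent_height(10.0, distance=7.2)
print(f"The near bar is drawn {near / far - 1:.0%} taller than the far bar")
```

On a flat, two-dimensional map every mark sits at the same distance from the reader, so identical values are drawn identically.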

This map probably looks cool to some people with its three-dimensional perspective and bright colours on a dark grey map. But it fails where it matters most: clearly presenting the regional differences in accumulation of snowfall amounts.

Compare the above to this graphic from the Boston office of the National Weather Service (NWS).

No, it does not have the same cool factor. And some of the labelling design could use a bit of work. But the use of a flat, two-dimensional map allows us to more clearly compare the ranges of snowfall and get a truer sense of the geographic patterns in this weekend’s storm. And in doing so, we can see some of the subtleties, for example the red pockets of greater snowfall amounts amid the wider orange band.

Credit for the Globe piece goes to John Hancock.

Credit for the NWS piece goes to the graphics department of NWS Boston.

How the Globe’s Writers Voted

Yesterday we looked at a piece by the Boston Globe that mapped out all of David Ortiz’s home runs. We did that because he has just been voted into baseball’s Hall of Fame. But to be voted in means there must be votes and a few weeks after the deadline, the Globe posted an article about how that publication’s eligible voters, well, voted.

The graphic here was a simple table. But as I’ll always say, tables aren’t an inherently bad or easy-way-out form of data visualisation. They are great at organising information in such a way that you can quickly find or reference specific data points. For example, let’s say you wanted to find out whether or not a specific writer voted for a specific ballplayer.

Just don’t ask me for whom I would have voted…

Simple red check marks represent those players for whom the Globe’s eligible staff voted. I really like some of the columns on the left that provide context on the vote. For the unfamiliar, players can only remain on the list for up to ten years. And so for the first four, this was their last year of eligibility. None made the cut. Then there’s a column for the total number of votes made by the Globe’s staff. Following that is more context, the share of votes received in 2021. Here the magic number is 75% to be elected. Conversely, if you do not make 5% you drop off the following year. Almost all of those on their first year ballot failed to reach that threshold.

The only potential drawback to this table is that by the time you reach the end of the table, there are few check marks to create implicit rules or lines that guide you from writer to player. David Ortiz’s placement helps because his six check marks (remarkably, not all Globe writers voted for him) ground you for the only person below him (alphabetically) to receive a vote. And we need that because otherwise quickly linking Alex Rodriguez to Alex Speier would be difficult.

Finally below the table we have jump links to each writer’s writings about their selections. And if you’ll allow a brief screenshot of that…

Still don’t ask me

We have a nicely designed section here. Designers delineated each author’s section with red arrows that evoke the red stitching on a baseball. It’s a nice design touch. Then each author receives a headline and a small call out box inside which are the players, and their headshots, for whom the author voted. An initial dropped capital (drop cap), here a big red M, grabs the reader’s attention and draws them into the author’s own words.

Overall this was a solidly designed piece. I really enjoyed it. And for those who don’t follow the sport, the table is also an indicator of how divisive the voting can be. Even the Globe’s writers couldn’t unanimously agree on voting for David Ortiz.

Credit for the piece goes to Daigo Fujiwara and Ryan Huddle.

Slaveholders in the Halls of Congress

Taking a break from going through the old articles and things I’ve saved, let’s turn to an article from the Washington Post published earlier this week. As the title indicates, the Post’s article explores slaveholders in Congress. Many of us know that the vast majority of antebellum presidents at one point or another owned slaves. (Washington and Jefferson being the two most commonly cited in recent years.) But what about the other branches of government?

The article is a fascinating read about the prevalence of slaveholders in the legislative branch. For our purposes it uses a series of bar charts and maps to illustrate its point. Now, the piece isn’t truly interactive as it’s more of a scrolling narrative, but at several points in American history the article pauses to show the number of slaveholders in office during a particular Congress. The screenshot below is from the 1807 Congress.

That year is an interesting choice, not mentioned explicitly in the article, because the United States Constitution prohibited Congress from passing limits on the slave trade prior to 1808. But in 1807 Congress passed a law that banned the slave trade from 1 January 1808, the first day legally permitted by the Constitution.

Almost half of Congress in the early years had, at one point or another, owned slaves.

Graphic-wise, we have a set of bar charts representing the percentage and then a choropleth map showing each state’s number of slaveholders in Congress. As we will see in a moment, the map here is a bit too small to work. Can you really see Delaware, Rhode Island, and (to a lesser extent) New Jersey? Additionally, because of the continuous gradient it can be difficult to distinguish just how many slaveholders were present in each state. I wonder if a series of bins would have been more effective.
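To make the idea of bins concrete, here is a minimal sketch of how a count could be assigned to one of a handful of discrete classes instead of a position on a continuous gradient. The break points, labels, and state names are all hypothetical choices of mine, not anything taken from the Post’s data.

```python
from bisect import bisect_right

# Hypothetical classes: 0, 1-2, 3-5, 6-10, 11+.
# BREAKS holds the lower bound of every class after the first.
BREAKS = [1, 3, 6, 11]
LABELS = ["0", "1-2", "3-5", "6-10", "11+"]


def bin_label(count: int) -> str:
    """Return the class label for a count of slaveholders."""
    return LABELS[bisect_right(BREAKS, count)]


# Made-up counts, purely to show the mapping.
example_counts = {"State A": 0, "State B": 2, "State C": 7, "State D": 14}
for state, count in example_counts.items():
    print(state, count, bin_label(count))
```

With five classes the reader only has to tell five colours apart, rather than judge where a shade falls along a ramp.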

The decision to use actual numbers intrigues me as well. Ohio, for example, has few slaveholders in Congress based upon the map. But as a newly organised state, Ohio had only two senators and one congressman. That’s a small actual number, but 33% of its congressional delegation.
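The alternative to raw counts is a share of each state’s delegation. A minimal sketch, using the Ohio figures above (the 33% implies one slaveholder among three members); the function name is my own.

```python
def share_of_delegation(slaveholders: int, delegation_size: int) -> float:
    """Return slaveholders as a fraction of a state's members of Congress."""
    if delegation_size <= 0:
        raise ValueError("delegation_size must be positive")
    return slaveholders / delegation_size


# Ohio in the 1807 Congress: two senators plus one congressman.
ohio = share_of_delegation(1, 2 + 1)
print(f"{ohio:.0%}")
```

Mapping the share rather than the count would keep a small delegation like Ohio’s from fading into the lightest end of the scale.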

Overall though, the general pervasiveness of slaveholders warrants the use of a map to show that their geographic distribution was not limited to just the South.

Later on we have what I think is the best graphic of the article, a box map showing each state’s slaveholders over time.

How the trends changed over time over geography.

Within each state we can see the general trend, including the legacy of the Civil War and Reconstruction. The use of a light background allows white to represent pre-statehood periods for each state. And of course some states, notably Alaska and Hawaii, joined the United States well after this period.

But I also want to address one potential issue with the methodology of the article. One that it does briefly address, albeit tangentially. This data set looks at all people who at one point or another in their life held slaves. First, contextually, in the early years of the republic slavery was not uncommon throughout the world. Though by the aforementioned year of 1807 the institution appeared on its way out in the West. Sadly the cotton gin revolutionised the South’s cotton industry and reinvigorated the economic impetus for slavery. Thereafter slavery boomed. The banning of the slave trade shortly thereafter introduced scarcity into the slave market and then the South’s “peculiar institution” truly took root. That cotton boom may well explain how the initial decline in the prevalence of slaveholders in the first few Congresses reversed itself and then held steady through the early decades of the 19th century.

And that initial decline before a hardening of support for slavery is what I want to address. The data here looks only at people who at one point in their life held slaves. It’s not an accurate representation of current slaveholders in Congress at the time they served. It’s a subtle but important distinction. The most obvious result of this is how after the 1860s the graphics show members of Congress as slaveholders when this was not the case. They had in the past held slaves.

That is not to say that some of those members were not reluctant to give up their slaves and, in all likelihood, would have preferred to have kept them. And therefore those numbers are important to understand. But it undermines the count of people who eventually came to realise the error of their ways. The article addresses this briefly, recounting several anecdotes of people who later in life became abolitionists. I wonder though whether these people should count in this graphic as (so far as we can tell) their personal views changed so substantially as to be hardened against slavery.

I would be very curious to see these charts remade with a data set that accounts for contemporary ownership of slaves represented in Congress.

Regardless of the methodology issue, this is still a fascinating and important read.

Credit for the piece goes to Adrian Blanco, Leo Dominguez, and Julie Zauzmer Weil.

Graduate Degrees

Many of us know the debt that comes along with undergraduate degrees. Some of you may still be paying yours down. But what about graduate degrees? A recent article from the Wall Street Journal examined the discrepancies between debt incurred in 2015–16 and the income earned two years later.

The designers used dot plots for their comparisons, which narratively reveal themselves through a scrolling story. The author focuses on the differences between the University of Southern California and California State University, Long Beach. This screenshot captures the differences between the two in both debt and income.

Pretty divergent outcomes…

Some simple colour choices guide the reader through the article and their consistent use makes it easy for the reader to visually compare the schools.

From a content standpoint, these two series, income and debt, can be combined to create an income to debt ratio. Simply put, does the degree pay for itself?
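As a minimal sketch of that ratio: divide median income by median debt, and anything below 1 means graduates earn less in a year than they borrowed. The function name and the two sets of figures are hypothetical, chosen only for illustration and not taken from the Journal’s data set.

```python
def income_to_debt(median_income: float, median_debt: float) -> float:
    """Return median income divided by median debt for a degree programme."""
    if median_debt <= 0:
        raise ValueError("median_debt must be positive")
    return median_income / median_debt


# Hypothetical programmes, not real schools.
programme_a = income_to_debt(median_income=60_000, median_debt=40_000)
programme_b = income_to_debt(median_income=45_000, median_debt=90_000)
print(f"A: {programme_a:.2f}, B: {programme_b:.2f}")
```

A single ratio loses the absolute amounts, of course, which is one reason to keep the two dot plots alongside it.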

What’s really nice from a personal standpoint is that the end of the article features an exploratory tool that allows the user to search the data set for schools of interest. More than just that, they don’t limit that tool to just graduate degrees. You can search for undergraduate degrees as well.

Below the dot plot you also have a table that provides the exact data points, instead of cluttering up the visual design with that level of information. And when you search for a specific school through the filtering mechanism, you can see that school highlighted in the dot plot and brought to the top of the table.

Fortunately my alma mater is included in the data set.

Welp.

Unfortunately you can see that the data suggests that graduates with design and applied arts degrees earn less (as a median) than they spend to obtain the degree. That’s not ideal.

Overall this was a really nice, solid piece. And probably speaks to the discussions we need to have more broadly about post-secondary education in the United States. But that’s for another post.

Credit for the piece goes to James Benedict, Andrea Fuller, and Lindsay Huth.

Philadelphia’s Wild Winters

Winter is coming? Winter is here. At least meteorologically speaking, because winter in that definition lasts from December through February. But winters in Philadelphia can be a bit scattershot in terms of their weather. Yesterday the temperature hit 19ºC before a cold front passed through and knocked the overnight low down to 2ºC. A warm autumn or spring day to just above freezing in the span of a few hours.

But when we look more broadly, we can see that winters range just that much as well. And look the Philadelphia Inquirer did. Their article this morning looked at historical temperatures and snowfall and whilst I won’t share all the graphics, it used a number of dot plots to highlight the temperature ranges both in winter and yearly.

Yep, I still prefer winter to summer.

The screenshot above focuses attention on the range in January and July and you can see how the range between the minimum and maximum is greater in the winter than in the summer. Philadelphia may have days with summer temperatures in the winter, but we don’t have winter temperatures in summer. And I say that’s unfair. But c’est la vie.

Design wise there are a couple of things going on here that we should mention. The most obvious is the blue background. I don’t love it. Presently the blue dots that represent colder temperatures begin to recede into and blend into the background, especially around that 50ºF mark. If the background were white or even a light grey, we would be able to clearly see the full range of the temperatures without the optical illusion of a separation that occurs in those January temperature observations.

Less visible here is the snowfall. If you look just above the red dots representing the range of July temperatures, you can see a little white dot near the top of the screenshot. The article has a snowfall effect with little white dots “falling” down the page. I understand how the snowfall fits with the story about winter in Philadelphia. Whilst the snowfall is light enough to not be too distracting, I personally feel it’s a bit too cute for a piece that is data-driven.

The snowfall is also an odd choice because, as the article points out, Philadelphia winters do feature snowfall, but snow accounts for less than 1/3 of the days when precipitation falls, with rain and wintry mixes accounting for the vast majority.

Overall, I really like the piece as it dives into the meteorological data and tries to accurately paint a portrait of winters in Philadelphia.

And of course the article points out that the trend is pointing to even warmer winters due to climate change.

Credit for the piece goes to Aseem Shukla and Sam Morris.

The Terrible No Good Chart About Gas Prices

Saw this graphic on the Twitter the other day from the Democratic Congressional Campaign Committee (DCCC), or the D Triple C or D Trip C. The context was that earlier in the day Matt Yglesias posted a clearly tongue-in-cheek chart about how after signing the infrastructure bill, President Biden had single-handedly fixed inflation and gas prices were heading down.

Oh, the power to misuse FRED.

Of course, anyone with a brain knows this isn’t true. The President of the United States cannot control the price of petrol. Because, you know, market economy. The underlying problem of high demand and low supply was, of course, not solved by the infrastructure bill. But lots of people complain on the telly or the internets about Biden not doing more about inflation, but, you know, not really within the wheelhouse.

Anyway, this chart in particular does not bother me. Because Yglesias knows—and most of his audience knows—it is not meant to be taken seriously. It is really just a joke.

But emphasis most of his audience.

Because the DCCC later posted this graphic with the accompanying text “Thanks, Joe Biden”.

Oh boy.

Oh boy.

Clearly they didn’t get the memo about the original being a joke.

The entire scale of the chart is 4¢. I cannot even recall the last time I had to use the glyph ¢, we’re talking so small a scale. The change in the three-week period amounts to a decline of 2¢.

And now you get the joke of the post. Ask me my 2¢ about the chart…

Now look closely at that y-axis. You’ll also note that we are carrying it all the way out to the third decimal place. Now, it’s true that some petrol stations will have a wee little nine trailing just after the two digits to the right of the decimal. Sometimes you might see a 9/10. As was explained to me in school, that’s because people will buy something if it looks even a fraction of a cent cheaper. Think 99¢ (getting the use out of this glyph today) versus $1. Makes all the difference. So back when petrol was cheap (inflation stories come round and round), 0.899 looked better than 0.90. But now that it’s routinely well over a few dollars, that 9/10 is a laughable percentage of the total price.
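To put a number on “laughable”, here is a minimal sketch of the share of the pump price made up by that trailing 9/10 of a cent. The 0.899 price comes from the paragraph above; the 3.399 price is a hypothetical stand-in for “well over a few dollars”, not a quoted figure.

```python
def tenths_share(price_per_gallon: float) -> float:
    """Return the fraction of the price made up by the trailing 9/10 of a cent."""
    if price_per_gallon <= 0:
        raise ValueError("price_per_gallon must be positive")
    return 0.009 / price_per_gallon


cheap = tenths_share(0.899)   # the cheap-petrol price mentioned above
today = tenths_share(3.399)   # hypothetical present-day price
print(f"then: {cheap:.2%} of the price, now: {today:.2%}")
```

The fraction of a cent goes from roughly one percent of the price to roughly a quarter of one percent.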

So, yes, we do present petrol prices to three decimals in the environmental design space. But think to yourself, when have you ever repeated a price aloud to the third decimal place? You probably haven’t. And so this chart probably shouldn’t be using that granular a level of specificity.

The other underlying problem, jokes aside, is that the chart spends all that horizontal space looking at three data points. Three. If the data were showing the daily price, not the weekly average, we’d have 21 days’ worth of data, and that (scale notwithstanding) would be worthy of charting. My basic rule is that if it’s five or six data points, you can use a table unless there is a contextual or design reason to chart them. Say, for example, you’re doing a series of small multiples for a time series of objects in a category. For all but a few categories you have dozens of data points, but just a few have really spotty observations. In those cases, plot the three or four numbers. But in this case, just don’t.

Instead this kind of graphic is best presented as a factette, a big old number, preferably in a narrow or condensed width. Because a 2¢ decline over a three-week period is also not terribly newsworthy. (Unless your story is how prices haven’t changed much over the last three weeks.)

This also points to how the original chart misses the context of time. Granted, a lot can happen in three weeks, but a 2¢ shift is not massive. Give those three weeks their proper place in time, however, and you can see just how little movement that truly is. Cue my own quickly whipped up charts.

That’s more like it.

In the first chart you can begin to see how the change, during the course of the last nearly two years, is not significant. And in the second you can see that things really are not that bad compared to where they were back during the lead up to the Great Recession and then in the recovery that followed. (Aww, look at back in the early oughts when prices averaged just over $1/gallon. I can still remember filling up my minivan for prices like 99¢.)

If the designer wants to make a point that perhaps we’re reaching the peak prices during this time period, sure. Because a two-week decline in prices could well be the beginning of that. But, to show that you also need to show the context of the time before that.

But once again, the President of the United States cannot much affect the price of petrol short of releasing the strategic reserve, which as its name implies, is meant for strategic purposes in case of national emergency. And high consumer prices are not a strategic national emergency on the scale of, say, a crippling storm impacting the refineries in the Gulf or an earthquake destroying pipelines in Alaska or an invasion or stifling blockade of overseas imports.

At the end of the day, this was just a terrible, terrible chart. And I think it speaks to a degree of chart illiteracy that I see creeping up in society at large. Not that it wasn’t there in the past (get off my lawn, kids), but it seems ever more present these days. I don’t know if that’s because of the amplification effect of things like the Twitter or just a decline in education and critical thinking. But those are topics for another day.

This chart fails on so many levels. The concept is bad, i.e. neither Biden nor Trump nor their predecessors nor their successors—unless we adopt a planned economy, am I right, comrades?—can directly affect petrol prices. Prices are governed by larger market forces that boil down to supply and demand.

But also, the sheer design is bad. Don’t use a chart of three data points. Don’t stretch out the x-axis. Don’t use decimal points to a point where they’re unrecognisable.

In the meantime, charts like this? Don’t do them, kids.

Credit for the first original goes to FRED, whose chart Matt Yglesias used.

Credit for the second goes to the DCCC graphics department.

Oh, and because I used Federal Reserve data for the charts, and because I work there, I should add the views and opinions are my own and don’t represent those of my employer.

Those Are Some Heavy Balls

Unfortunately, I don’t subscribe to Business Insider, but I saw this graphic on the Twitter and felt the need to share it. Primarily because baseball will almost certainly stop at midnight when the owners of the teams will impose a lockout (as opposed to players going on strike). And with that baseball will be on hold until the two parties resolve their current labour issues.

And at present that seems like it could take quite some time.

So on the eve of the lockout Bradford William Davis tweeted a link to an article he wrote, alas no subscription as aforementioned, but he did share one of the graphics therein.

Those are a lot of blue balls…

We have a basic dot plot charting the weight of the centre of baseballs, sorted by the month of game from which they were pulled.

The designer made a few interesting choices here. First, typographically, we have a few decisions around the type. I would have loved to have seen a bit of editing or design to eliminate the widow at the end of the graphic’s subtitle, that bit that just says “(blue)”. Do the descriptors in parentheses even need to be there when the designer included a legend immediately below? I find that one word incredibly distracting.

On the other hand, the designer chose to use a thin white outline around the text on the plot. Normally I’d really like this choice, because it can reduce some of the issues around legibility when lines intersect text, especially when they are the same colour. Here, however, the backgrounds are not white. I would have tried, for the top, using that light blue instead of white as the stroke for the outside of the letters. And on the bottom I would have tried the light pink. That would probably achieve the presumed desired effect of reducing the visual interference unintentionally created by the white. I also would have moved the top label up so it didn’t overlay the top dot.

As far as the dot plot itself goes, that works fine. I wonder if some transparency in the dots would have emphasised how many dots sit atop each other. Or maybe the dots could have stayed clustered but, where they overlap, been nudged horizontally off the vertical axis.
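That second idea, nudging overlapping dots sideways, could be sketched roughly as follows. This is my own simplified take on it, not how the Insider graphic was built: values falling in the same narrow bin fan out symmetrically around the axis. The bin width, the step size, and the weights are all made-up numbers.

```python
from collections import defaultdict


def spread_overlaps(values, bin_width=0.1, step=1.0):
    """Return (x_offset, value) pairs; dots sharing a bin fan out around x = 0."""
    bins = defaultdict(list)
    for value in values:
        bins[round(value / bin_width)].append(value)

    placed = []
    for key in sorted(bins):
        group = bins[key]
        n = len(group)
        for i, value in enumerate(group):
            # Centre each group on the axis: one dot stays at 0,
            # three dots land at -step, 0, +step, and so on.
            placed.append(((i - (n - 1) / 2) * step, value))
    return placed


# Made-up weights: three near-identical values and one outlier.
for x_offset, weight in spread_overlaps([124.0, 124.02, 124.04, 126.5]):
    print(x_offset, weight)
```

Lone dots stay on the axis while crowded values widen into a row, so the density reads as width instead of hiding in a pile.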

Overall this was a really nice graphic with which to end this half of the baseball off season. Hopefully the lockout doesn’t last too long.

Credit for the piece goes to Taylor Tyson.

There’s Water in the Basement

If you didn’t know, climate change is real and it threatens much of our current way of life. I don’t go so far as to say it threatens the extinction of mankind, because there are nearly eight billion of us and to wipe out every living soul would be a tall order. But, it could wipe out parts of our history.

If you didn’t know, the city of Washington in the District of Columbia was built on a swamp. Except, actually, it wasn’t. Most of the city was built on higher ground along the riverbank of the Potomac. True, there are low-lying areas affected by the tides and high water, such as the National Mall, but places like the Capitol were purposefully placed on high ground.

And that gets us to this article in the Washington Post. It takes a look at the impact of rising waters and flash flooding on the National Mall, home to some of the preeminent American museums. The article uses a map to show just how the museums are threatened by extreme weather events that will only increase in frequency as climate change ramps up.

Note the Capitol and the White House will both be fine.

The designer used colour to denote museums by their risk of flooding, and sadly there are several. But as the article describes, there are few short-term fixes that we can undertake to mitigate the risk of damage to the collections.

Credit for the piece goes to Taylor Johnston.

Hey Boo Boo

When I was in the Berkshires, one thing I noticed was signs about bears. Bear crossing. Don’t feed the bears. Be beary careful. Okay, not so much the last one. But it was nonetheless odd to a city dweller like myself, who just needs to be wary of giant rats.

Less than a month later, I read an article in the Boston Globe about how the black bear population in Massachusetts is expanding from the western and central portions of the state to those in the east.

The graphic in the article actually comes from the Massachusetts Division of Fisheries and Wildlife, so credit goes to them, but it shows the existing range and the black bears’ new range.

I understand the inclusion of the highways in red, green, and black, but I wish they had some even simple labelling. In the article they mention a few highways, but my familiarity with the highway system in Massachusetts is not great. Also, because the designer used thin black lines to demarcate the towns, one could think that the black lines, especially out west, represent counties or other larger political geography units.

Credit for the piece goes to the Massachusetts Division of Fisheries and Wildlife.