As many of my long-term readers know, I am really only a one sport kind of guy. And that sport is baseball. American football, well, I’ve seen one match live and in person and it was…boring. But it’s a big deal in America. And this is the time of the year when teams begin signing free agents.
I happened to be reading the Boston Globe for news on the Red Sox, my team, when I saw a link to this interactive tool allowing users to build their own roster with free agent signings.
Go Pats
Conceptually, the piece is fairly simple. There is a filterable list of free agents, broken out by whether their forecast signing value falls into the high-, middle-, or low-end of the range. Plus a draft pick.
I root for the Patriots. However, if you asked me to name a single player on last season’s roster, I could only name Cam Newton. Apparently he wasn’t great. I really and truly don’t follow the sport.
The piece displays the available free agents, along with those no longer available. (Though, the piece does offer you the option to go back to the beginning of free agent season and pretend reality didn’t happen.)
I have no idea who any of these people are.
I went through and began semi-randomly picking names. I’d heard of some of them, and others were blind choices. Once you’ve selected within the budget, you can choose a draft pick. They all appear in list format to the right with the ability to remove them via a small X button.
Nope, not a clue.
Once you’ve confirmed your choices you’re taken to a screen that reviews your selection. You are able to either tweet it to the world—which I did not do—or start over again. I would do that, but I wouldn’t do any better than how I just did.
I hope I did at least okay.
Overall, the piece felt intuitive and I never had any issues selecting my free agents. Of course, it would help if I knew anything about the sport. But that’s a user problem.
Or just shake my hand, because today marks the second St. Patrick’s Day spent in isolation. I am lucky, of course, because two years ago I spent the holiday in Dublin. One of those bucket list kind of things. There I ran into a(n American) friend who was coincidentally in town. Then the next day I took the train to Cork to visit another friend. If you don’t count weddings, I think that was the last big trip I took.
Two years on, I am here in my flat alone on a holiday meant to be spent with family and friends. But in the last year, I made significant progress on my Irish genealogy. For part of that progress I took two additional DNA tests. So this St. Patrick’s Day seems like a good time to reflect on those tests.
For those that don’t know, I do a lot of genealogy work as a hobby. Primarily I focus on paper records, but DNA is an important piece of the puzzle. In a sense, it is the only record that cannot lie. It will reveal your biological connections to family that may have been otherwise lost. And it cannot be faked.
But that’s only true for your genetic matches. Those are the real power of taking a DNA test. I would bet, however, that most people initially take the tests for the ethnicity estimates. On a day like today, how Irish are you? How Irish am I?
That’s a lot of green.
Not surprisingly, I’m pretty Irish.
Of course, if you look at me, those Irish values do not quite equal each other. So what’s the deal? After all, the underlying DNA does not change from spit tube to cheek swab.
The first thing to know is that in one sense, ethnicity is, like so many things, a social construct. Super broadly, every individual is unique—except twins. Of course humans have spread across the globe and in that spread, certain regions have evolved incredibly slight differences between the populations. In addition to those genetic differences, the populations created civilisations and cultures. An ethnicity, in a sense, is a group of people who share that culture, civilisation, and genetic similarities vis-a-vis genetic differences across the world.
Importantly, within those groups, we still have differences. The Irish, for example, are known for freckles and red hair. But not all Irish have those traits. Instead, again super broadly, we say that for a group of people, a certain percentage will share a certain set of features. Consequently, within an ethnic group, you will still have variations and outliers. In some cases because generations ago a traveller from a different group entered the gene pool for some reason or another. And while the offspring might identify entirely with their new civilisation and culture, their genes don’t lie and a DNA test would reveal their traits from their ancestor’s foreign gene pool.
The second point to make is that Ireland is a fairly modern creation. Ireland did not exist as a sovereign state until 1922. Before then, the idea of Ireland existed. The country, however, did not. A better example would be German or Italian. Neither Germany nor Italy existed until the 1870s and 1860s, respectively. If you have “German” ancestors who arrived in Philadelphia in 1848, you don’t have German ancestors. You have ancestors from one of the various principalities or bishoprics comprising the German Confederation. Italy had the Venetian Republic, the Kingdom of the Two Sicilies, and many others. Being Irish, German, or Italian is thus a modern construct.
The third point is that identifying anyone as any of these ethnic groups requires a baseline for a comparison. To do that, you need a reference population in the area you are going to define as Ireland, Germany, or Italy. But humans have migrated throughout history. Ireland was conquered by the English. Germans…well, let’s just say Germans have a history with conquering parts of Europe. And so you can see exchanges of genetic information among populations pretty easily. And over time, those genetic populations evolve.
Take those three points and add them together in an admixture test and your results are really only good back about 500 years. And even then, you may find yourself belonging to something incredibly vague and all-encompassing because, especially as with France and Germany, there’s been too much mixture to get so granular as to fit ourselves within the borders of modern political states.
In the above results, you can see my “Irishness” varies from 63% to 75%. Though, as far as I know 21/32 (66%) of my 3xgreat-grandparents arrived from Ireland. That’s why I say I’m 2/3 Irish. But, genetically, I may be more or less because those 21 might have English or Scottish ancestors. Ancestry says I may be 18% Scottish, but whilst I have ancestors who lived in Scotland, I’m not aware of any ancestors born and raised for multiple generations in Scotland.
And then that’s just how Ancestry defines it. Compare that to my results from My Heritage. Because of the aforementioned difficulty in separating out certain population groups, they lump the Irish, Scottish, and Welsh together. Add my Ancestry Irish and Scottish together and I have 81%, not far from My Heritage’s 85% estimate. Then look at my results from Family Tree. They estimate me as 75% Irish, but add in the 10% Scandinavia and I’m up to 85%.
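For anyone who wants to check my sums, here is a minimal sketch of the arithmetic. The percentages are the ones quoted above; the 63% Irish figure for Ancestry is implied by the 81% total less the 18% Scottish, and the variable names are purely illustrative.

```python
# Ethnicity estimates quoted in the post, in percent.
# Assumption: Ancestry's Irish figure is 63, implied by 81 total minus 18 Scottish.
ancestry = {"Irish": 63, "Scottish": 18}
family_tree = {"Irish": 75, "Scandinavia": 10}
my_heritage_combined = 85  # Irish, Scottish, and Welsh lumped together

# Sum each company's estimates into one broad British Isles/North Sea bucket.
ancestry_combined = sum(ancestry.values())
family_tree_combined = sum(family_tree.values())

# The paper trail: 21 of 32 3x-great-grandparents arrived from Ireland.
paper_trail = 21 / 32 * 100

print(f"Ancestry combined:    {ancestry_combined}%")
print(f"Family Tree combined: {family_tree_combined}%")
print(f"My Heritage combined: {my_heritage_combined}%")
print(f"Paper trail Irish:    {paper_trail:.1f}%")
```

Once you group the categories that the companies split differently, the three estimates land within a few points of each other.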
That brings me to my last point about DNA tests. It’s probably fair to say that I’m something like 80–85% genetically from the British Isles/North Sea region. What about the other 15–20%?
You will often hear you receive half your DNA from each of your parents. And they get half from each of theirs and so on and so forth. I’ve had conversations with folks who take that to mean they get 25% from each grandparent and 12.5% from each great-grandparent et cetera. But that’s not quite true.
You do receive 50% of your DNA from your father and the other 50% from your mother. But that 50%, well that’s a sort of random sample from the share your parents received from their parents.
My maternal grandfather was 100% Carpatho-Rusyn. For generations, his ancestors lived, reproduced, and died in the Carpathian Mountains. If we received exactly half from each previous generation, I should expect 25% of my DNA from my grandfather. But Ancestry, which has the best representation of this small ethnic group, says it’s 17% (though they give it as a range of being between 2 and 27%). In other words, I’m missing eight percentage points.
And so if you take a DNA test and you know you have a great-great-grandparent of Irish descent, you may only see a small fraction in your results. If your connection to Ireland (or anywhere else) is even further back, the result becomes smaller still. In fact, beyond 5–7 generations back, you may not even inherit any genetic material from a specific ancestor in your family tree.
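To get a feel for that random sampling, here is a toy simulation. It is a rough sketch, not real genetics: it assumes the DNA a parent passes on arrives in a fixed number of equal-sized, independently inherited blocks (the `n_blocks=60` default is my own placeholder), whereas real recombination is messier. The function name `simulate_share` is hypothetical.

```python
import random


def simulate_share(generations, n_blocks=60, rng=random):
    """Share of your genome from ONE ancestor `generations` back (1 = parent).

    Toy model: the half of your genome that comes through the relevant parent
    is split into `n_blocks` equal blocks. Each block traces back to the chosen
    ancestor with probability 0.5 ** (generations - 1).
    """
    p = 0.5 ** (generations - 1)
    inherited = sum(1 for _ in range(n_blocks) if rng.random() < p)
    return 0.5 * inherited / n_blocks


rng = random.Random(2021)  # fixed seed so the run is repeatable

# One grandparent (two generations back): 25% on average, but it varies.
shares = [simulate_share(2, rng=rng) for _ in range(10_000)]
mean_share = sum(shares) / len(shares)
print(f"grandparent: mean {mean_share:.1%}, "
      f"min {min(shares):.1%}, max {max(shares):.1%}")

# Further back, the chance of inheriting nothing at all starts to grow.
for g in (5, 7, 9):
    trials = [simulate_share(g, rng=rng) for _ in range(10_000)]
    none_at_all = sum(1 for s in trials if s == 0) / len(trials)
    print(f"{g} generations back: {none_at_all:.1%} of runs inherit nothing")
```

The exact numbers depend entirely on the block count I assumed, so treat them as illustrative. The shape is the point: a parent is always exactly half, a grandparent hovers around a quarter, and distant ancestors can drop out entirely.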
But ultimately, for today, as I wrote in one of my very first posts here on Coffeespoons, back in 2010, on St. Patrick’s Day, we’re all at least a little bit Irish.
Hopefully next year we’ll be able to celebrate in person.
Yesterday I wrote about Covid-19 here in five states of the US. I mentioned how I am concerned about the levelling out of new cases in certain states, notably Pennsylvania and New Jersey. In Italy, the government issued a new round of lockdowns in an attempt to contain a new wave before it swamps their healthcare system.
At the end of that BBC article, they used a small multiples graphic showing the seven-day average in several European countries. Today is the 16th, and so the data is now a few days old, but the concept remains important.
New cases curves for several European countries.
From a design standpoint, we are seeing a few things here. First, each country’s line chart exists with its own scale. Unfortunately this makes comparing country-to-country nigh impossible. We know from the title that in the present these are the countries with the highest new case rates in Europe. But, how do these rates today compare to earlier peaks? Without axis lines or a baseline, it’s difficult to say.
Of course, the point could well be just to show how in places like Italy, France, Poland, &c. we are seeing an emergent surge of new cases since the holiday peak.
If that is the goal, I think this chart works well. However, if the goal is to provide more context of the state of the pandemic in these select countries, we need some additional context and information.
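For the curious, a seven-day average and a shared scale are both simple to compute. This is a minimal sketch under my own assumptions: the function names are hypothetical, the daily counts are made up for illustration, and I have no idea how the BBC actually builds its charts.

```python
def seven_day_average(daily, window=7):
    """Trailing mean over `window` days; the first window - 1 days get no value."""
    return [
        sum(daily[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(daily))
    ]


def shared_y_max(series_by_country):
    """One y-axis maximum for every panel, so heights compare across countries."""
    return max(max(series) for series in series_by_country.values())


# Made-up daily new cases, purely for illustration (not real data).
daily_cases = {
    "Country A": [100, 120, 90, 130, 150, 80, 70, 160, 170, 140],
    "Country B": [10, 12, 15, 11, 9, 14, 13, 18, 20, 22],
}

averages = {name: seven_day_average(values) for name, values in daily_cases.items()}
y_max = shared_y_max(averages)

for name, series in averages.items():
    print(name, [round(value, 1) for value in series], "plot on 0 to", round(y_max, 1))
```

Plotting every panel from zero to the same maximum makes small countries look flat, which is the trade-off of a shared scale. Independent scales show each country's shape instead, at the cost of country-to-country comparison.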
Credit for the piece goes to the BBC graphics department.
Last week I wrote about how our progress in dealing with Covid-19 was stagnating. To put it simply, this past week did not get any better on that front.
New case curves for PA, NJ, DE, VA, & IL.
In Pennsylvania, Delaware, and Illinois we see that the flattened tail I described last week, well, remained a flattened tail. In Delaware, we see more movement, but the average of the average, if you will, is flat over the last two weeks. And in New Jersey, where I mentioned some signs of rising numbers, we see a clearly rising number of new cases over the last week. Only in Virginia are numbers heading down, and those are shallowing out.
The problem here is that in Pennsylvania and Delaware, the new case rate, whilst flat, is well above the summer rate of low transmission. This means that the environment is ripe for a new surge of cases if people stop following social distancing and begin resuming indoor activities with other people. Sadly, both those things appear to be occurring throughout the US.
In Europe we see a cautionary tale. They too saw their holiday peaks decline and the national governments began easing restrictions on their populations. Within the last several days, however, new cases have begun to surge. Italy has gone so far as to announce a new lockdown. Other governments are considering the same.
If the United States cannot resume pushing its numbers of new cases down, it could well follow Europe into a new wave of outbreaks that would threaten lockdowns and push back our eventual return to normalcy.
None of this would be an issue if vaccinations were nearing herd immunity levels. However, in the states we cover, nowhere is above 12% fully vaccinated.
Vaccination curves for PA, VA, & IL.
Pennsylvania now lags behind the other two states. But at least the Commonwealth is over 10% fully vaccinated.
And of course, the problem under this dire scenario is that deaths could rise once again, though at this point the most vulnerable are in the middle of being vaccinated. Indeed, if we look at the last week, we see the good news for the week, that deaths are headed down in all five states.
Death curves for PA, NJ, DE, VA, & IL.
Previously, Virginia had been working through a backlog of death records, but those appear now cleared. We are not quite back to summer-level lows, but we are steadily approaching them.
The big question this week will be what happens to those new case numbers. Today’s data, Monday, will likely show lower numbers because of lower testing on the weekend. But starting Tuesday, what do we see over the course of the next five days?
Perseverance landed on Mars on 18 February, almost a month ago. The video and photography the rover has already sent back have been stunning. We all know she is the most capable rover yet landed on the Red Planet, but what we all want to know is how cute is Perseverance compared to her predecessors?
For years, one issue with the American economy had been that we did not save enough. It’s understandable, as it’s hard to keep up with the image of the carefree American without profligate spending. But that’s also not great long-term. But thanks to Covid-19, we’ve now swung to the other side of the spectrum: Americans may be saving too much.
Saying that sounds callous to the devastation the pandemic has wrought upon large swathes of the economy. But it’s true in the aggregate as this New York Times piece explains. In particular, the authors highlight one example. Consider a corporate CEO who earned a $100,000 bonus for keeping the company he runs afloat during the recession. He adds $100k to the aggregate American income. But at a restaurant shuttered by the pandemic, owners lay off a hostess, a server, a bartender, and a dishwasher, each earning $25,000. Their collective lost income is $100,000 and so balances out that one CEO. And as CEOs are more able to work remotely than servers, it’s not hard to see how the upper-income earning cohorts of the economy have done well. In human-terms, four unemployed service industry people is terrible. But statistically, it’s a wash. Once we understand that, it makes the piece sensible.
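The CEO example is simple enough to put into a few lines. This sketch uses only the figures from the example above; the variable names are my own.

```python
# Figures from the example: one bonus gained, four wages lost.
ceo_bonus = 100_000
laid_off_wages = {
    "hostess": 25_000,
    "server": 25_000,
    "bartender": 25_000,
    "dishwasher": 25_000,
}

# The aggregate only sees the net change in income.
net_change = ceo_bonus - sum(laid_off_wages.values())

# The human toll is invisible in that single number.
people_better_off = 1
people_worse_off = len(laid_off_wages)

print(f"Net change in aggregate income: ${net_change:,}")
print(f"Better off: {people_better_off}, worse off: {people_worse_off}")
```

The aggregate comes out at zero even though four of the five people involved lost their entire income, which is exactly why the aggregate alone can mislead.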
It uses decomposition charts, basically stacked bar charts broken apart, to show what constitutes the two sides of the American household budget: earning and spending. I’ve taken a screenshot of the spending side of the ledger.
This is the aggregate. I’d be curious how this relates to you, my readers.
We see that starting from the baseline, the solid line, American households spent more money this year on durable goods. A dotted line then carries that adjusted baseline to the right for the next component of the ledger: nondurable goods. We spent more on those too, so the baseline moves up. The designers annotated the graphic, adding descriptions of what each bar represents in a casual, lighthearted tone. I’ve definitely been cooking for myself a lot more.
Here I wish we had some more traditional charting elements, e.g. axis lines and labels. Now this piece is published under the Upshot, a more conversational and less formal brand than the Times as a whole. That probably explains the casual annotations. But I think some basic axis labels, e.g. spending more vs. spending less, could add some context without the need for the annotations.
Where the piece might lose people is what happens after durable goods. Americans stopped spending on services, a decline of over half a trillion dollars. That’s a lot of money. And so the adjusted baseline shifts to well below where we started. Add on savings from things like interest rates (Jay Powell is the chair of the Federal Reserve, for whose Philadelphia bank I work in full disclosure) and Americans have spent more than half a trillion dollars less. And as the article explains, we’ve also saved an enormous amount, to the tune of $1 trillion. Add it together and you’ve got America saving $1.5 trillion in 2020.
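The mechanics of a decomposition chart are easy to sketch: each bar starts where the previous one ended, and that carried-over level is the dotted adjusted baseline. In the sketch below, the function name `waterfall` is hypothetical and the dollar amounts are illustrative placeholders in billions, not the figures from the Times piece. I only kept the directions (up, up, then sharply down) consistent with the graphic.

```python
def waterfall(components, baseline=0.0):
    """Return (label, start, end) for each bar, plus the final level.

    Each bar begins at the running level left by the bars before it.
    """
    bars = []
    level = baseline
    for label, change in components:
        bars.append((label, level, level + change))
        level += change
    return bars, level


# Placeholder changes in spending, in billions of dollars (NOT real data).
spending_changes = [
    ("durable goods", 60),
    ("nondurable goods", 40),
    ("services", -550),
]

bars, final_level = waterfall(spending_changes)
for label, start, end in bars:
    direction = "more" if end > start else "less"
    print(f"{label:>16}: from {start:+.0f} to {end:+.0f} (spent {direction})")
print(f"Net change against the baseline: {final_level:+.0f}")
```

Laid out this way, a single large negative component can drag the running level well below the starting line, even after two positive bars.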
That money has to go somewhere. And you can see where some of it went when you look at surging prices in GameStop. Longer term, when the pandemic begins to end, we are going to have a pent up demand from people who have had their lives on hold for a year or more. And if there is insufficient supply for whatever’s in demand, prices will rise and we could see a sharp jump in inflation. But that’s a post for another day.
Back to this graphic, as a statistical graphic, it works. But without axis labels and data definitions, barely so. However, I think it’s meant to be more casual and illustrative than data-driven. If I look at this piece through that lens, I do think it works.
Credit for the piece goes to Neil Irwin and Weiyi Cai.
Last week I wrote about some signals indicating a potential stagnation in terms of declining numbers of new cases. I also wrote about some potential signs of reversals, or increasing numbers of new cases.
This week, what we saw signs of came to pass.
New case curves for PA, NJ, DE, VA, & IL.
At the tail ends of each chart, you can see that the last week was broadly stagnant. In Pennsylvania and Illinois the seven-day average was itself remarkably flat. Delaware is now where it was this time last week; a slight rise in new cases was met with an equal magnitude decline.
In reversals, we have New Jersey. New case numbers there increased throughout the week. With lower weekend data, those numbers have fallen slightly.
Only in Virginia did we see good numbers in new cases. Numbers there fell over the last week, though notably at a slower pace than in previous weeks.
Deaths presented broadly good news. Last week we had mixed signals with increasing numbers in Delaware and Virginia. We knew the increase in Virginia was due to the state processing a backlog of death certificates with Covid.
Death curves for PA, NJ, DE, VA, & IL.
But in the last few days, those numbers have also fallen though the state reports it is still processing the backlog. And in Delaware, the daily number of deaths has also fallen again. I think it’s too early to say this peak has crested, but it could well be.
And in the other states, we continue to see slowly falling numbers of deaths. There are some potential signs of that bottoming or stalling out in Illinois, but we’ll have to see how this week pans out.
Finally, the best news we had over the course of last week was with vaccinations.
Vaccination curves for VA & IL.
Last week I mentioned that we can see the lines moving upwards as we approach 10% fully vaccinated in Pennsylvania, Virginia, and Illinois.
This week, well let’s start here: as I’ve pointed out in the past, Pennsylvania does not have a centralised reporting system. Most notably the state reports figures for all but Philadelphia county (coterminous with the city). The city reports its own figures. I aggregate the two. But for the last several days, the Philadelphia data site has been broken, so we don’t know the progress of vaccinations in the city. And as the largest city/county in the state, Philadelphia is an enormous part of figuring out the statewide numbers.
So looking only at Virginia and Illinois, the numbers look good. Virginia is at nearly 9.5%. Illinois is on 8.92%.
But we really need Philadelphia to get its act together.
This week I’m on deadline for the magazine I produce. Technically, the files go out Monday, but I spend Monday double/triple-checking things and assembling all the packages I need, and so everything really needs to be done the day before. For this quarter, that’s today. Regardless, that means little sleep and craziness.
Over at Indexed, Jessica Hagy nailed how I feel this time every quarter with a simple scatter plot entitled “‘You Look Tired’.”
But I have an intern joining me for the summer, so huzzah for Q3.
In 2020, baseball did not permit fans to attend regular season matches. (They changed this for the playoffs.) Instead, many stadiums opted for cardboard cutouts: fans often paid a fee and submitted a picture that the team printed on cardboard cutouts. Like so many things we will say about 2020, it was surreal.
But in Philadelphia at least, cardboard cutouts are out, and human fans are in. The state government in Harrisburg and the city government will allow 20% capacity at outdoor stadiums and 15% for indoor stadiums.
The Philadelphia Inquirer created a small graphic for its homepage to capture this news.
I cannot wait to safely attend a live match. C’mon, vaccines.
I intentionally included other site elements in the cropping to show how the graphic fits into the broader site. The extra white space around the image helps focus attention on the datagraphic over the numerous photographic elements for each article. Clicking on other tabs in the section brings up full-component-width graphics.
To the graphic itself.
Still can’t wait…
My guess would be this was a quick turnaround piece. There are a few things going on here. The first and most obvious one, the squares as spectators. Now I confess this confused me at first. I was not entirely certain what the coloured squares meant; they mean in-person attendees. Was this supposed to be an overall stadium? Or was it a representative seating section?
The quick turnaround becomes important, because this is probably how I would have first conceptualised the graphic. But, with more time, I may have attempted to incorporate the shape of the playing field, be it a baseball diamond or basketball court, or hockey rink—I know all the sports terms!—and surrounded them with shapes representing a certain number of spectators. Squares might not work in that case because of the curves. Circles? Hexagons? Regardless of the shape, the filling of occupied seats would be the same as here, but it would perhaps be clearer to some readers, i.e. me.
Second, we get to the table below the graphics. Here we have a subtle design decision. Note that here the designer greyed out the normal capacity figures. The new figures at that 20% and 15% rates are what appear in black bold text. My usual instinct is to use typographic weight, regular vs. bold, in these situations. But the grey here works equally well.
Third, and this also involves the table, we have the first game data. We talked about the comparison of the capacity and permitted attendance. But I wonder, did the date of the first game with fans need to be displayed in the same way as the permitted attendance? Because the news isn’t the dates of the first games—at least not as I read the news—but the numbers of attendees. And because of that, maybe I would have reduced the size of the type for the date of the first game. Or, conversely, set the type for the new attendance in a larger point size.
Overall, I enjoyed seeing this news presented visually, even if I was left confused.