April 9th, 2020

Context is key for your data visualizations

You may have seen this graphic from The New York Times earlier this week:

It shows percent change in travel habits by county for the week of March 23, 2020, and at first glance suggests that people across much of the country have not changed their travel habits in response to the outbreak of COVID-19. It tells a true story, but perhaps not the whole story.

Let's zoom in to a particularly dark-red portion of the map:

Zoomed in on portions of West Texas

Darker red counties indicate fewer changes in travel behavior (i.e., nothing has changed in these areas). Some of the darker counties shown in this map?

  • Culberson County, 0.6 people per square mile
  • Reeves County, 5.2 people per square mile
  • Pecos County, 3.3 people per square mile
  • Terrell County, 0.4 people per square mile
  • Crockett County, 1.3 people per square mile

That's a grand total of 37,727 people over 16,385 square miles (or 2.3 people per square mile, based on 2010 population totals). You could feasibly drive hundreds of miles across all 5 of these counties and never see another soul. If that were the population density of the entire United States, we'd number about 8.7 million.

Contrast that with New York City, which has a population density of 27,751 people per square mile. You can't look out your window without seeing at least a few dozen people. At that density, these 5 counties would hold over 450 million people.
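For anyone who wants to check the arithmetic behind these comparisons, here is a minimal sketch in Python. The county and New York City figures are the ones quoted above; the total area of the United States (roughly 3.8 million square miles) is an assumption that does not appear in this post, and the function names are just illustrative.

```python
# Figures quoted in the post (2010 population totals).
FIVE_COUNTY_POPULATION = 37_727
FIVE_COUNTY_AREA_SQ_MI = 16_385
NYC_DENSITY_PER_SQ_MI = 27_751

# Assumption (not from the post): approximate total area of the United States.
US_AREA_SQ_MI = 3_800_000


def density(population, area_sq_mi):
    """People per square mile."""
    return population / area_sq_mi


def population_at_density(density_per_sq_mi, area_sq_mi):
    """How many people an area would hold at a given density."""
    return density_per_sq_mi * area_sq_mi


west_texas_density = density(FIVE_COUNTY_POPULATION, FIVE_COUNTY_AREA_SQ_MI)
us_at_west_texas_density = population_at_density(west_texas_density, US_AREA_SQ_MI)
counties_at_nyc_density = population_at_density(
    NYC_DENSITY_PER_SQ_MI, FIVE_COUNTY_AREA_SQ_MI
)

print(f"Five-county density: {west_texas_density:.1f} people per square mile")
print(f"US at that density: {us_at_west_texas_density / 1e6:.1f} million people")
print(f"Five counties at NYC density: {counties_at_nyc_density / 1e6:.0f} million people")
```

The exact United States figure shifts a little depending on whether you use total area or land area only, but the order of magnitude is the point.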

It's absolutely true that the inhabitants of these particular counties have, as of yet, not made any major changes to their travel habits. Does that mean that they are treating our present situation with reckless abandon? Perhaps not.

It's most likely the case that, given the extreme land-to-human ratio in these areas, the potential for community spread of the novel coronavirus is significantly lower than in areas with higher population density. Social distancing just has a slightly different meaning when you have a population density of 0.6 people per square mile. The baselines for the fine residents of Culberson County, Texas, and their more urban counterparts aren't close enough to produce a meaningful comparison.

Astute observers no doubt recognized this. In fact, many pointed it out on Twitter and other social media. Nevertheless, it's worth stating again: While there is nothing factually inaccurate about this graphic, without the added context of population density, it doesn't tell the whole story. 

How can you avoid this problem?

  • Account for it directly in your visualization (perhaps different colors could have been used to segment counties by population density)
  • Split your single visualization into two or more to better show the context by physically separating the different classes of your data
  • Acknowledge the deficiencies directly on your visualization, so if it gets shared your acknowledgement gets shared as well
  • Provide a written or oral accompaniment to your visualization (if it's in a report or presentation, for example)
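As a rough illustration of the first two suggestions, here is a small Python sketch that sorts places into population-density classes before anything is plotted. Each class could then get its own color scale or its own chart. The densities are the ones quoted in this post; the class cutoffs and labels are hypothetical, chosen only for the example, and would need to be tuned to your own data.

```python
# Densities (people per square mile) quoted in the post.
DENSITIES = {
    "Culberson County": 0.6,
    "Reeves County": 5.2,
    "Pecos County": 3.3,
    "Terrell County": 0.4,
    "Crockett County": 1.3,
    "New York City": 27_751,
}

# Hypothetical cutoffs: (exclusive upper bound, label). Not from the post.
DENSITY_CLASSES = [
    (1.0, "very sparse"),
    (10.0, "sparse"),
    (1_000.0, "moderate"),
    (float("inf"), "dense"),
]


def density_class(people_per_sq_mi):
    """Return the label of the first class whose upper bound exceeds the density."""
    for upper_bound, label in DENSITY_CLASSES:
        if people_per_sq_mi < upper_bound:
            return label
    raise ValueError("density must be a real number")


def group_by_density(densities):
    """Map each class label to the sorted list of places that fall into it."""
    groups = {}
    for place, value in densities.items():
        groups.setdefault(density_class(value), []).append(place)
    return {label: sorted(places) for label, places in groups.items()}


for label, places in group_by_density(DENSITIES).items():
    print(f"{label}: {', '.join(places)}")
```

With the groups in hand, a comparison of travel changes within a class is much closer to apples-to-apples than one that puts Terrell County and New York City on the same color scale.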