America’s Entire Understanding of the Pandemic Was Shaped by Messy Data
To understand any data set, you have to understand the way its information is compiled. That’s especially true for a patchwork data set such as the one composed of U.S. COVID-19 data, which is the product of 56 smaller systems belonging to each state and territory in the country.
In our year of working with COVID-19 data, we focused our attention on these systems and found that much of the information they produced reflected their individual structures. This reality runs parallel to the country’s biggest public-health-data challenge: The data pipelines that so deeply affected the pandemic’s trajectory were not given the decades of support, financial and otherwise, needed to perform well under pressure. Instead, a novel threat arrived, and the data response we saw was fragmented, unstandardized, and limited by the constraints of the reporting systems.
In this post, we’ll offer a summary of how states reported the five major COVID-19 metrics—tests, cases, deaths, hospitalizations, and recoveries—and a look at how reporting complexities shaped our understanding of the pandemic.
Tests
Before the COVID-19 pandemic, the CDC had never collected comprehensive national testing data for any infectious disease in the United States. But last March, as COVID-19 began to spread throughout the country, the number of tests conducted became the most crucial data point with which to understand the pandemic. Without it, we couldn’t understand whether or where low case counts were just an artifact of inadequate testing.
So, last April, the CDC partnered with the Association of Public Health Laboratories (APHL) to start the COVID-19 Electronic Laboratory Reporting program (CELR), which would eventually collect testing data from every state. While the federal from state health-department websites. Like the CDC, states had never collected data at the scale the pandemic demanded, and as a result, all testing data were incomplete and unstandardized.