NIEHS Dashboard Data Sources

tschuetz

GitHub Repository

“To empower additional modeling efforts, the complete time series of all daily PVI scores and data are available at https://github.com/COVID19PVI/data.”

12 Key Indicators

“[The authors] assembled U.S. county- and state-level datasets into 12 key indicators across four major domains: current infection rates (infection prevalence, rate of increase), baseline population concentration (daytime density/traffic, residential density), current interventions (social distancing, testing rates), and health and environmental vulnerabilities (susceptible populations, air pollution, age distribution, comorbidities, health disparities, and hospital beds).”

Three types of modeling

“Our modeling efforts directly address the discussion in [6], by contextualizing factors such as racial differences with corrections for socioeconomic factors, health resource allocation, and co-morbidities, plus highlighting place-based risks and resource deficits that might explain spatial distributions. Specifically, three types of modeling efforts were performed and are regularly updated. First, epidemiological modeling on cumulative case- and death-related outcomes provides insights into the epidemiology of the pandemic. Second, dynamic time-dependent modeling provides similar outcome estimates as national-level models, but with county-level resolution. Finally, a Bayesian machine learning approach provides data-driven, short-term forecasts.”

Blackness and PM 2.5

“With respect to factors affecting COVID-19 related mortality, we find that the proportion of Black residents and the PM2.5 index of small-particulate air pollution are the most significant predictors among those included, reinforcing conclusions from previous reports[7]. An increase of one percentage point of Black residents is associated with a 3.3% increase in the COVID-19 death rate. The effect of a 1 μg/m3 increase in PM2.5 is associated with an approximately 16% increase in the COVID-19 death rate, a value at the high end of a previously reported confidence interval from a report in late April 2020[7] when deaths had reached 38% of the current total.”
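A rough way to read these percentages is sketched below. It assumes the reported associations act multiplicatively, as they would in a log-linear rate model; the excerpt does not state the model form, so this is an illustration rather than the authors' calculation. The function name and the example inputs are hypothetical.

```python
def death_rate_multiplier(pct_increase_per_unit: float, units: float) -> float:
    """Multiplicative change in the death rate for `units` of increase.

    Assumption (not stated in the excerpt): each one-unit increase
    multiplies the rate by (1 + pct_increase_per_unit / 100).
    """
    return (1.0 + pct_increase_per_unit / 100.0) ** units


# Hypothetical comparisons, not figures from the paper:
# a county with 5 percentage points more Black residents,
# and a county with 2 ug/m3 more PM2.5.
black_share_effect = death_rate_multiplier(3.3, 5)
pm25_effect = death_rate_multiplier(16.0, 2)

print(f"5 percentage points: x{black_share_effect:.3f}")
print(f"2 ug/m3 PM2.5:       x{pm25_effect:.3f}")
```

Under this reading the effects compound rather than add, so two units of PM2.5 correspond to somewhat more than a 32% increase.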

Machine learning and prediction

“To accurately predict future cases and mortality, it is necessary to account for the fluid nature of the data. Accordingly, we developed a Bayesian spatiotemporal random-effects model that jointly describes the log-observed and log-death counts to build local forecasts. Log-observed cases for a given day are predicted using known covariates (e.g., population density, social distancing metrics), a spatiotemporal random-effect smoothing component, and the time-weighted average number of cases for these counts. This smoothed time-weighted average is related to a Euler approximation of a differential equation; it provides modeling flexibility while approximating potential mechanistic models of disease spread. The smoothed case estimates are used in a similar spatiotemporal model predicting future log-death counts based on a geometric mean estimate of the estimated number of observed cases for the previous seven days as well as the other data streams. The resulting county-level predictions and corresponding confidence intervals are shown (Fig. 1).”
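Two of the summary quantities named in this passage, a seven-day geometric mean and a time-weighted average, can be sketched in a few lines. This is not the authors' Bayesian model: the function names, the geometric decay weights, and the `decay` value are assumptions made for illustration, and the inputs are assumed to be strictly positive counts.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed on the log scale."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def trailing_geometric_mean(series, window=7):
    """Geometric mean of the last `window` entries (e.g. the previous seven days)."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    return geometric_mean(series[-window:])


def time_weighted_average(series, decay=0.8):
    """Weighted average in which older observations count less.

    Assumption: weights shrink geometrically with age (most recent = 1,
    the day before = decay, and so on). The excerpt does not give the
    weighting scheme actually used.
    """
    if not series:
        raise ValueError("need at least one value")
    recent_first = list(reversed(series))
    weights = [decay ** age for age in range(len(recent_first))]
    return sum(w * x for w, x in zip(weights, recent_first)) / sum(weights)


# Hypothetical daily case estimates for one county.
cases = [12, 15, 14, 20, 22, 25, 30, 28]
print(trailing_geometric_mean(cases))
print(time_weighted_average(cases))
```

Working on the log scale matches the passage's use of log counts: the geometric mean of the cases is the exponential of the arithmetic mean of the log-cases.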

Source: https://www.researchgate.net/publication/343642027_The_COVID-19_Pandemi…

US NIEHS Dashboard Creators and Curators

tschuetz

Skylar W. Marvel1, John S. House2, Matthew Wheeler2, Kuncheng Song1, Yihui Zhou1, Fred A. Wright1,3, Weihsueh A. Chiu4, Ivan Rusyn4, Alison Motsinger-Reif2*, David M. Reif1*

Affiliations:

1 Bioinformatics Research Center, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA.

2 Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA.

3 Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA

4 Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX 77845, USA.

US NIEHS Dashboard Types of Data

tschuetz

“Data sources in the current model (version 11.2.1) include the Social Vulnerability Index (SVI) of the Centers for Disease Control and Prevention (CDC) for emergency response and hazard mitigation planning (Horney et al. 2017), testing rates from the COVID Tracking Project (Atlantic Monthly Group 2020), social distancing metrics from mobile device data ( https://www.unacast.com/covid19/social-distancing-scoreboard), and dynamic measures of disease spread and case numbers ( https://usafacts.org/issues/coronavirus/). Methodological details concerning the integration of data streams—plus the complete, daily time series of all source data since February 2020 and resultant PVI scores—are maintained on the public Github project page (COVID19PVI 2020). Over this period, the PVI has been strongly associated with key vulnerability-related outcome metrics (by rank-correlation), with updates of its performance assessment posted with model updates alongside data at the Github project page (COVID19PVI 2020).”
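The "rank-correlation" mentioned here can be illustrated with a Spearman coefficient, which is the Pearson correlation of the ranks. The excerpt does not say which rank statistic the authors used, so Spearman is an assumption, and the county values below are made up.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical PVI scores and later death rates for five counties.
pvi_scores = [0.41, 0.55, 0.38, 0.62, 0.50]
death_rates = [12.0, 30.0, 9.0, 28.0, 19.0]
print(spearman(pvi_scores, death_rates))
```

Because only the ordering matters, a rank correlation asks whether higher-scoring counties tend to have worse outcomes, without assuming the relationship is linear.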

Source: https://ehp.niehs.nih.gov/doi/10.1289/EHP8690

US NIEHS Dashboard Motivations

tschuetz

Empowering local actors

“We present the PVI Dashboard as a dynamic container for contextualizing these disparities. It is a modular tool that will evolve to incorporate new data sources and analytics as they emerge (e.g., concurrent flu infections, school and business reopening statistics, heterogeneous public health practices). This flexibility positions it well as a resource for integrated prioritization of eventual vaccine distribution and monitoring its local impact. The PVI Dashboard can empower local and state officials to take informed action to combat the pandemic by communicating interactive, visual profiles of vulnerability atop an underlying statistical framework that enables the comparison of counties and the evaluation of the PVI’s component data.”

US NIEHS Dashboard Visualization

tschuetz

Built with toxicology knowledge

“The software used to generate PVI scores and profiles from these data is freely available at https://toxpi.org”

General visualization capabilities

“The interactive visualization within the PVI Dashboard is intended to communicate factors underlying vulnerability and empower community action [...] The visualization and quantification of county-level vulnerability indicators are displayed by a radar chart, where each of the 12 indicators comprises a “slice” of the overall PVI profile. On loading, the Dashboard displays the top 250 PVI profiles (by rank) for the current day. The data, PVI scores, and predictions are updated daily, and users can scroll through historical PVI and county outcome data. Individual profiles are an interactive map layer with numerous display options/filters that include sorting by overall score, filtering by combinations of slice scores, clustering by profile similarity (i.e. vulnerability “shape”), and searching for counties by name or state (Additional functionality is detailed in the Supplement). User selection of any county overlays the summary Scorecard and populates surrounding panels with county-specific information (Figure 1). The scrollable panels at left include plots of vulnerability drivers relative to the nation-wide distribution across all U.S. counties, with the location of the selected county delineated. The panels across the bottom of the Dashboard report cumulative county numbers of cases and deaths; timelines of cumulative cases, deaths, PVI score, and PVI rank; daily changes in cases and deaths for the most recent 14-day period (commonly used in reopening guidelines[6]); and predicted cases and deaths for a 7-day forecast horizon.”
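The idea of slices that combine into an overall score, which is then ranked, can be sketched as follows. This assumes each indicator is min-max scaled to [0, 1] and the slices are combined as a weighted average; the excerpt does not specify the scaling or the weights, and the indicator names, weights, and county values are hypothetical.

```python
def min_max_scale(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def profile_scores(indicators, weights):
    """Overall score per county as a weighted average of scaled slice values.

    indicators: dict mapping indicator name -> list of raw values (one per county)
    weights:    dict mapping indicator name -> non-negative slice weight
    """
    total_weight = sum(weights[name] for name in indicators)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    n_counties = len(next(iter(indicators.values())))
    scaled = {name: min_max_scale(vals) for name, vals in indicators.items()}
    return [
        sum(weights[name] * scaled[name][i] for name in indicators) / total_weight
        for i in range(n_counties)
    ]


def top_ranked(names, scores, k):
    """Names of the k highest-scoring counties, highest first."""
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical data: three counties, two of the twelve indicators.
counties = ["County A", "County B", "County C"]
indicators = {
    "air_pollution": [6.0, 9.5, 12.0],
    "hospital_beds_deficit": [0.2, 0.2, 0.7],
}
weights = {"air_pollution": 1.0, "hospital_beds_deficit": 1.0}

scores = profile_scores(indicators, weights)
print(top_ranked(counties, scores, k=2))
```

Keeping the scaled slice values alongside the total is what makes filtering by combinations of slice scores, or comparing profile "shape", possible in addition to sorting by overall score.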

Visualizing comparison and "peer counties"

“the multi-criteria filtering capabilities in the Dashboard were used to find a “peer county” for comparison.”

Source: https://ehp.niehs.nih.gov/doi/10.1289/EHP8690 and https://www.researchgate.net/publication/343642027_The_COVID-19_Pandemi…

Disaster Media Heuristic

tschuetz

The authors "define disaster media as a heuristic, or approach, that recognizes the ways “natural” and human-made disasters are communicated about, constructed, and variously exacerbated or relieved through media means. This heuristic is not simply a temporary model for problem solving but tries to account for ecological forces and material conditions" (my emphasis).

They close the article with three provocations:

1) All Media on Deck: the current moment of combo disaster (COVID and climate crisis) requires producing more public and open-access materials of various kinds, but also boosting media literacy. The authors acknowledge the conundrum of producing more media while being confronted with sustainability issues and the call for "no-carbon" media.

2) Relief and Media Production: a critical look at the kinds of assumptions that governments, NGOs, and industry bring to COVID-19 relief efforts (videos, websites, maps, algorithms...) -- what counts as relief, and for whom?

3) Focus on Social and Environmental Justice: "In moving forward, it will be crucial to approach disaster media as a domain in which structural reform agendas that interweave social and environmental justice can flourish."

Covid Visualizations

tschuetz

In the article, the authors address visualizations of COVID cases, including related satellite images of air pollution in Southern California and China (generated by NASA/ESA) as well as of mass graves in Iran.

First, they provide basic framing of how to critically read air pollution satellite imagery. Connections between COVID-19 measures and improvements in air pollution are not identifiable in a straightforward way.

"Figure 1a, for instance, uses bright magenta to indicate greater concentrations of nitrogen dioxide and light blue to signify cleaner air. However, such color choices can be misleading: there is no material correlation between nitrogen dioxide and the color magenta; and reduced traces of this chemical do not turn the sky a paler shade of blue. [...] color-coding selections imply, satellite images are not just scientific; they are cultural as well."

Second, they point out the paradoxical role of satellite imagery in accounting for the inequitable impact of COVID-19:

"satellite image, from a US satellite operator, locates pandemic “excesses” in an Iranian “elsewhere.” But this is an increasingly deceptive proposition, given that the United States has one of the highest COVID-19 per capita transmission and fatality rates in the world."

Third, they draw comparisons between the "hockey stick" visualization of global climate change and the various "curves" used to display COVID-19 developments:

"From a disaster media perspective, the film’s global warming graph depicts a dramatic climate shift, projects imminent catastrophe, and issues a world warning. Its circulation in global media culture for the past fifteen years potentially informs the ways people are engaging now with similar-looking charts of coronavirus death and illness. Historically, news media have relied on sensationalistic photos of human suffering to convey a sense of disaster, but in the age of big data and the current pandemic, numbers speak, and graphs and curves tend to dominate the mediascape. In both cases, scientific experts and publics must grapple with how these graphs make meaning, what datasets they rely upon, and how these media come to stand in for highly complex conditions."

Finally, they remark that COVID-19 visualizations are always incomplete, both because of lack of testing and withholding of data and because stories of, e.g., workers are missing. They reference the front page of the New York Times (May 24, 2020), which listed the names of nearly 1,000 people who had died from COVID as the US death toll approached 100,000.