NIEHS Dashboard Data Sources

tschuetz

GitHub Repository

“To empower additional modeling efforts, the complete time series of all daily PVI scores and data are available at https://github.com/COVID19PVI/data.”

12 Key Indicators

“[The authors] assembled U.S. county- and state-level datasets into 12 key indicators across four major domains: current infection rates (infection prevalence, rate of increase), baseline population concentration (daytime density/traffic, residential density), current interventions (social distancing, testing rates), and health and environmental vulnerabilities (susceptible populations, air pollution, age distribution, comorbidities, health disparities, and hospital beds).”

Three types of modeling

“Our modeling efforts directly address the discussion in [6], by contextualizing factors such as racial differences with corrections for socioeconomic factors, health resource allocation, and co-morbidities, plus highlighting place-based risks and resource deficits that might explain spatial distributions. Specifically, three types of modeling efforts were performed and are regularly updated. First, epidemiological modeling on cumulative case- and death-related outcomes provides insights into the epidemiology of the pandemic. Second, dynamic time-dependent modeling provides similar outcome estimates as national-level models, but with county-level resolution. Finally, a Bayesian machine learning approach provides data-driven, short-term forecasts.”

Blackness and PM 2.5

“With respect to factors affecting COVID-19 related mortality, we find that the proportion of Black residents and the PM2.5 index of small-particulate air pollution are the most significant predictors among those included, reinforcing conclusions from previous reports[7]. An increase of one percentage point of Black residents is associated with a 3.3% increase in the COVID-19 death rate. The effect of a 1 µg/m3 increase in PM2.5 is associated with an approximately 16% increase in the COVID-19 death rate, a value at the high end of a previously reported confidence interval from a report in late April 2020[7] when deaths had reached 38% of the current total.”
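To make the quoted percentages concrete, the sketch below shows how such per-unit associations would compound over larger differences. It assumes the effects are multiplicative (as in a log-linear model), which the excerpt does not state; the function name `rate_ratio` is hypothetical, and the numbers describe associations, not causal effects.

```python
def rate_ratio(pct_increase_per_unit: float, units: float) -> float:
    """Death-rate ratio implied by a per-unit percentage increase.

    Assumption: effects compound multiplicatively, as they would in a
    log-linear model. The quoted passage does not confirm the model form.
    """
    return (1.0 + pct_increase_per_unit / 100.0) ** units


# 3.3% per percentage point of Black residents, over a 5-point difference.
black_residents_ratio = rate_ratio(3.3, 5)

# Roughly 16% per unit of PM2.5 concentration, over a 2-unit difference.
pm25_ratio = rate_ratio(16.0, 2)

print(f"5-point difference in Black residents: x{black_residents_ratio:.3f}")
print(f"2-unit difference in PM2.5: x{pm25_ratio:.3f}")
```

Under this assumption, the per-unit percentages cannot simply be added: two units of PM2.5 imply a ratio of 1.16 squared, not 1.32.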

Machine learning and prediction

“To accurately predict future cases and mortality, it is necessary to account for the fluid nature of the data. Accordingly, we developed a Bayesian spatiotemporal random-effects model that jointly describes the log-observed and log-death counts to build local forecasts. Log-observed cases for a given day are predicted using known covariates (e.g., population density, social distancing metrics), a spatiotemporal random-effect smoothing component, and the time-weighted average number of cases for these counts. This smoothed time-weighted average is related to a Euler approximation of a differential equation; it provides modeling flexibility while approximating potential mechanistic models of disease spread. The smoothed case estimates are used in a similar spatiotemporal model predicting future log-death counts based on a geometric mean estimate of the estimated number of observed cases for the previous seven days as well as the other data streams. The resulting county-level predictions and corresponding confidence intervals are shown (Fig. 1).”

Source: https://www.researchgate.net/publication/343642027_The_COVID-19_Pandemi…
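The forecasting passage quoted above mentions two summary quantities: a time-weighted average of case counts and a geometric mean over the previous seven days. The sketch below illustrates both in isolation. It is not the authors' code; the geometric decay weights, the `decay` parameter, and the +1 offset for zero-count days are all assumptions chosen for illustration, since the excerpt does not specify them.

```python
import math


def geometric_mean_last_days(cases: list[float], window: int = 7) -> float:
    """Geometric mean of the most recent `window` daily counts.

    Assumption: add 1 before taking logs so that zero-count days do not
    produce log(0), then subtract 1 at the end.
    """
    if len(cases) < window:
        raise ValueError("need at least `window` days of data")
    recent = cases[-window:]
    mean_log = sum(math.log(c + 1.0) for c in recent) / window
    return math.exp(mean_log) - 1.0


def time_weighted_average(cases: list[float], decay: float = 0.8) -> float:
    """Weighted average in which recent days count more than older days.

    Assumption: weights shrink geometrically with age (weight = decay**age,
    where age 0 is the most recent day).
    """
    recent_first = list(reversed(cases))
    weights = [decay ** age for age in range(len(recent_first))]
    weighted_sum = sum(w * c for w, c in zip(weights, recent_first))
    return weighted_sum / sum(weights)


# Made-up daily case counts for one hypothetical county, oldest first.
daily_cases = [12, 15, 9, 20, 18, 25, 30, 28]
print(geometric_mean_last_days(daily_cases))
print(time_weighted_average(daily_cases))
```

Averaging on the log scale damps the influence of single-day spikes, which is one plausible reason to summarize a week of counts this way.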

US NIEHS Dashboard Creators and Curators

tschuetz

Skylar W. Marvel1, John S. House2, Matthew Wheeler2, Kuncheng Song1, Yihui Zhou1, Fred A. Wright1,3, Weihsueh A. Chiu4, Ivan Rusyn4, Alison Motsinger-Reif2*, David M. Reif1*

Affiliations:

1 Bioinformatics Research Center, Department of Biological Sciences, North Carolina State University, Raleigh, NC 27695, USA.

2 Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA.

3 Department of Statistics, North Carolina State University, Raleigh, NC 27695, USA

4 Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX 77845, USA.

US NIEHS Dashboard Types of Data

tschuetz

“Data sources in the current model (version 11.2.1) include the Social Vulnerability Index (SVI) of the Centers for Disease Control and Prevention (CDC) for emergency response and hazard mitigation planning (Horney et al. 2017), testing rates from the COVID Tracking Project (Atlantic Monthly Group 2020), social distancing metrics from mobile device data ( https://www.unacast.com/covid19/social-distancing-scoreboard), and dynamic measures of disease spread and case numbers ( https://usafacts.org/issues/coronavirus/). Methodological details concerning the integration of data streams—plus the complete, daily time series of all source data since February 2020 and resultant PVI scores—are maintained on the public Github project page (COVID19PVI 2020). Over this period, the PVI has been strongly associated with key vulnerability-related outcome metrics (by rank-correlation), with updates of its performance assessment posted with model updates alongside data at the Github project page (COVID19PVI 2020).”

Source: https://ehp.niehs.nih.gov/doi/10.1289/EHP8690
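The quoted passage says the PVI is assessed against outcome metrics "by rank-correlation". As a minimal illustration of what that involves, the sketch below computes a Spearman rank correlation from scratch: rank both variables (averaging ranks for ties), then take the Pearson correlation of the ranks. Whether the authors used Spearman's coefficient specifically is an assumption, and the example numbers are made up.

```python
import math


def average_ranks(values: list[float]) -> list[float]:
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up vulnerability scores and outcome values for five hypothetical counties.
scores = [0.41, 0.55, 0.38, 0.62, 0.47]
outcomes = [120.0, 180.0, 90.0, 300.0, 150.0]
print(spearman(scores, outcomes))
```

Because only the ordering matters, a rank correlation rewards a score for putting counties in the right order of severity even when the relationship to the outcome is not linear.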

US NIEHS Dashboard Motivations

tschuetz

Empowering local actors

“We present the PVI Dashboard as a dynamic container for contextualizing these disparities. It is a modular tool that will evolve to incorporate new data sources and analytics as they emerge (e.g., concurrent flu infections, school and business reopening statistics, heterogeneous public health practices). This flexibility positions it well as a resource for integrated prioritization of eventual vaccine distribution and monitoring its local impact. The PVI Dashboard can empower local and state officials to take informed action to combat the pandemic by communicating interactive, visual profiles of vulnerability atop an underlying statistical framework that enables the comparison of counties and the evaluation of the PVI’s component data.”

US NIEHS Dashboard Visualization

tschuetz

Built with toxicology knowledge

“The software used to generate PVI scores and profiles from these data is freely available at https://toxpi.org.”

General visualization capabilities

“The interactive visualization within the PVI Dashboard is intended to communicate factors underlying vulnerability and empower community action [...] The visualization and quantification of county-level vulnerability indicators are displayed by a radar chart, where each of the 12 indicators comprises a “slice” of the overall PVI profile. On loading, the Dashboard displays the top 250 PVI profiles (by rank) for the current day. The data, PVI scores, and predictions are updated daily, and users can scroll through historical PVI and county outcome data. Individual profiles are an interactive map layer with numerous display options/filters that include sorting by overall score, filtering by combinations of slice scores, clustering by profile similarity (i.e. vulnerability “shape”), and searching for counties by name or state (Additional functionality is detailed in the Supplement). User selection of any county overlays the summary Scorecard and populates surrounding panels with county-specific information (Figure 1). The scrollable panels at left include plots of vulnerability drivers relative to the nation-wide distribution across all U.S. counties, with the location of the selected county delineated. The panels across the bottom of the Dashboard report cumulative county numbers of cases and deaths; timelines of cumulative cases, deaths, PVI score, and PVI rank; daily changes in cases and deaths for the most recent 14-day period (commonly used in reopening guidelines[6]); and predicted cases and deaths for a 7-day forecast horizon.”
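The display options described above (sorting by overall score, showing the top profiles by rank, filtering by combinations of slice scores) can be sketched in a few lines. This is an illustration only, not the Dashboard's implementation: the `Profile` class, the slice names, the example values, and the unweighted mean used as the overall score are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """One county's profile: a name plus a score per indicator 'slice'."""
    county: str
    state: str
    slices: dict[str, float]

    @property
    def score(self) -> float:
        # Assumption: overall score is the unweighted mean of slice scores.
        # The real PVI may weight slices differently.
        return sum(self.slices.values()) / len(self.slices)


def top_profiles(profiles: list[Profile], n: int = 250) -> list[Profile]:
    """Return the n highest-scoring profiles, highest first."""
    return sorted(profiles, key=lambda p: p.score, reverse=True)[:n]


def filter_by_slices(profiles: list[Profile],
                     minimums: dict[str, float]) -> list[Profile]:
    """Keep profiles whose named slices all meet the given minimum scores."""
    return [
        p for p in profiles
        if all(p.slices.get(name, 0.0) >= floor for name, floor in minimums.items())
    ]


# Made-up profiles for three hypothetical counties.
profiles = [
    Profile("A", "X", {"air_pollution": 0.9, "hospital_beds": 0.5}),
    Profile("B", "X", {"air_pollution": 0.2, "hospital_beds": 0.4}),
    Profile("C", "Y", {"air_pollution": 0.6, "hospital_beds": 0.9}),
]

print([p.county for p in top_profiles(profiles, n=2)])
print([p.county for p in filter_by_slices(profiles, {"air_pollution": 0.5})])
```

Filtering on several slices at once is also how a "peer county" search could work in principle: fix thresholds on the slices of interest and compare the counties that remain.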

Visualizing comparison and "peer counties"

“the multi-criteria filtering capabilities in the Dashboard were used to find a “peer county” for comparison.”

Source: https://ehp.niehs.nih.gov/doi/10.1289/EHP8690 and https://www.researchgate.net/publication/343642027_The_COVID-19_Pandemi…

Bridging Gaps in Publicly Accessible Data

Carly.Rospert

How are Data Gaps Worked Around:

Sarnia and the area surrounding Chemical Valley have nine air monitoring stations that track air pollutants from the nearby petrochemical complex. Until 2017, data from only one of these stations (the one on Christina Street in downtown Sarnia) was publicly available. This created a gap in the accessibility of important data for Sarnia and nearby AFN residents. In September 2015, the Clean Air Sarnia and Area group launched as a "community advisory panel made up of representatives from the public, government, First Nations, and industry, who are dedicated to providing the community with a clear understanding of ambient air quality in the Sarnia area." The group works to improve air quality in Sarnia by making information about air quality publicly available and by making recommendations to relevant authorities. In 2018, the group launched the website https://reporting.cleanairsarniaandarea.com/ (also uploaded as an artifact), which allows the public to access data from the air quality monitoring stations and to understand how air quality compares to Ontario's standards. The site helps fill the gap in publicly available air quality data in Sarnia.

Standards Undercutting Safety

Carly.Rospert

This report from Ecojustice shows a decline in air pollution in the area around Chemical Valley compared to Ecojustice's first report, released in 2007, yet Sarnia industries continue "to release far more pollution, and in particular far more SO2, than comparable U.S. refineries." One contributor to the continued excessive emissions is Ontario's lagging air quality standards. The report notes that "Ontario’s AAQC and air quality standards are lagging behind current science on the health impacts of air pollutants, which may put the health of residents at risk." The report highlights pollutants for which Ontario's standard is above the national standard or for which Ontario has no standard at all. Additionally, Sarnia's benzene emissions are exempt from Ontario's health-based standard for this chemical and are instead regulated by "an industry technical-based standard," allowing benzene levels to be far higher than the health-based standard. Lagging standards, missing standards, and exemptions from regulation all undercut efforts to monitor and reduce emissions to a "safe" level, because what Ontario's standards treat as "safe" is out of line with what health-based and other standards consider "safe."
