Source of the Spilhaus data
The data is collected and visualized by researcher Spilhaus....
GitHub Repository
“To empower additional modeling efforts, the complete time series of all daily PVI scores and data are available at https://github.com/COVID19PVI/data.”
12 Key Indicators
“[The authors] assembled U.S. county- and state-level datasets into 12 key indicators across four major domains: current infection rates (infection prevalence, rate of increase), baseline population concentration (daytime density/traffic, residential density), current interventions (social distancing, testing rates), and health and environmental vulnerabilities (susceptible populations, air pollution, age distribution, comorbidities, health disparities, and hospital beds).”
Three types of modeling
“Our modeling efforts directly address the discussion in [6], by contextualizing factors such as racial differences with corrections for socioeconomic factors, health resource allocation, and co-morbidities, plus highlighting place-based risks and resource deficits that might explain spatial distributions. Specifically, three types of modeling efforts were performed and are regularly updated. First, epidemiological modeling on cumulative case- and death-related outcomes provides insights into the epidemiology of the pandemic. Second, dynamic time-dependent modeling provides similar outcome estimates as national-level models, but with county-level resolution. Finally, a Bayesian machine learning approach provides data-driven, short-term forecasts.”
Blackness and PM 2.5
“With respect to factors affecting COVID-19 related mortality, we find that the proportion of Black residents and the PM2.5 index of small-particulate air pollution are the most significant predictors among those included, reinforcing conclusions from previous reports[7]. An increase of one percentage point of Black residents is associated with a 3.3% increase in the COVID-19 death rate. The effect of a 1 µg/m3 increase in PM2.5 is associated with an approximately 16% increase in the COVID-19 death rate, a value at the high end of a previously reported confidence interval from a report in late April 2020[7] when deaths had reached 38% of the current total.”
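To get a feel for what these per-unit associations imply over larger differences, the sketch below compounds them multiplicatively. This assumes the percentages come from a log-linear (multiplicative) model, which the excerpt does not state, and the function name is hypothetical. It illustrates the arithmetic only and is not the authors' model.

```python
def relative_death_rate(pct_increase_per_unit: float, units: float) -> float:
    """Rate ratio for a `units` increase in a predictor.

    Assumption: each one-unit increase multiplies the death rate by
    (1 + pct/100), i.e. effects compound rather than add.
    """
    return (1.0 + pct_increase_per_unit / 100.0) ** units


# Figures quoted in the excerpt: 3.3% per percentage point of Black
# residents, and roughly 16% per one-unit increase in PM2.5.
black_10pt = relative_death_rate(3.3, 10)  # ten percentage points
pm25_2 = relative_death_rate(16.0, 2)      # two PM2.5 units

print(f"10 points more Black residents: x{black_10pt:.2f}")
print(f"2 units more PM2.5: x{pm25_2:.2f}")
```

Under this assumption, a ten-point difference in the share of Black residents corresponds to a death rate about 1.38 times higher, and a two-unit difference in PM2.5 to about 1.35 times higher.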
Machine learning and prediction
“To accurately predict future cases and mortality, it is necessary to account for the fluid nature of the data. Accordingly, we developed a Bayesian spatiotemporal random-effects model that jointly describes the log-observed and log-death counts to build local forecasts. Log-observed cases for a given day are predicted using known covariates (e.g., population density, social distancing metrics), a spatiotemporal random-effect smoothing component, and the time-weighted average number of cases for these counts. This smoothed time-weighted average is related to a Euler approximation of a differential equation; it provides modeling flexibility while approximating potential mechanistic models of disease spread. The smoothed case estimates are used in a similar spatiotemporal model predicting future log-death counts based on a geometric mean estimate of the estimated number of observed cases for the previous seven days as well as the other data streams. The resulting county-level predictions and corresponding confidence intervals are shown (Fig. 1).”
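Two of the summary quantities named in this passage, a time-weighted average of recent cases and a geometric mean over the previous seven days, can be sketched in a few lines. The exponential weighting scheme, the `decay` value, and the +1 offset for zero-count days are all assumptions made for illustration; the excerpt does not specify them, and this is not the authors' Bayesian model.

```python
import math


def geometric_mean_last_7(cases):
    """Geometric mean of the last seven daily counts.

    Assumption: a +1 offset is applied before taking logs so that
    zero-count days do not break the calculation, then removed again.
    """
    window = cases[-7:]
    mean_log = sum(math.log(c + 1) for c in window) / len(window)
    return math.exp(mean_log) - 1


def time_weighted_average(cases, decay=0.8):
    """Average with exponentially decaying weights (assumed scheme).

    The most recent day has weight 1, the day before `decay`,
    the day before that `decay**2`, and so on.
    """
    recent_first = list(reversed(cases))
    weights = [decay ** age for age in range(len(recent_first))]
    weighted_sum = sum(w * c for w, c in zip(weights, recent_first))
    return weighted_sum / sum(weights)


# Hypothetical daily case counts for one county, oldest first.
daily_cases = [12, 15, 9, 20, 25, 22, 30, 28]
print(geometric_mean_last_7(daily_cases))
print(time_weighted_average(daily_cases))
```

Working on the log scale, as the geometric mean does, matches the passage's use of log-observed and log-death counts and dampens the influence of single-day spikes.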
Source: https://www.researchgate.net/publication/343642027_The_COVID-19_Pandemi…
“Data sources in the current model (version 11.2.1) include the Social Vulnerability Index (SVI) of the Centers for Disease Control and Prevention (CDC) for emergency response and hazard mitigation planning (Horney et al. 2017), testing rates from the COVID Tracking Project (Atlantic Monthly Group 2020), social distancing metrics from mobile device data (https://www.unacast.com/covid19/social-distancing-scoreboard), and dynamic measures of disease spread and case numbers (https://usafacts.org/issues/coronavirus/). Methodological details concerning the integration of data streams—plus the complete, daily time series of all source data since February 2020 and resultant PVI scores—are maintained on the public Github project page (COVID19PVI 2020). Over this period, the PVI has been strongly associated with key vulnerability-related outcome metrics (by rank-correlation), with updates of its performance assessment posted with model updates alongside data at the Github project page (COVID19PVI 2020).”
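The excerpt reports association "by rank-correlation" without naming the statistic. Spearman's coefficient is one common choice and is assumed here. The sketch below computes it from scratch (average ranks for ties, then a Pearson correlation on the ranks); the PVI scores and outcome values are made-up illustrative numbers, not data from the repository.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical county PVI scores and a later outcome (deaths per 100k).
pvi_scores = [0.2, 0.5, 0.4, 0.9]
outcome = [1, 30, 8, 200]
print(spearman(pvi_scores, outcome))
```

Because only the ordering matters, a rank correlation rewards an index that sorts counties correctly even when the relationship to the outcome is far from linear, as in the made-up numbers above.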
https://www.researchgate.net/publication/351209404_PM25_Emissions_from_…
This study, set in South and Southeast Asia, finds that PM2.5 emissions from biomass burning have been seriously underestimated. Rice straw burning in particular was poorly accounted for, because the amount burned varies widely with harvest and burning practices. To correct this, the study builds its estimate from the bottom up, using a "subnational spatial database of rice-harvested area, region-specific fuel-loading factors, region, and burning-practice-specific emission and combustion factors, including literature-derived estimates of straw and stubble burned" (Lasko et al. 2021, 1).
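A bottom-up inventory of this kind is typically a product of the listed factors, summed over regions. The sketch below shows that arithmetic under stated assumptions: the formula (area × fuel load × fraction burned × combustion factor × emission factor) is the generic biomass-burning form rather than one quoted from the study, and every region name and number is a hypothetical placeholder.

```python
def pm25_emissions_tonnes(
    harvested_area_ha,
    fuel_load_t_per_ha,
    fraction_burned,
    combustion_factor,
    emission_factor_g_per_kg,
):
    """PM2.5 emitted (tonnes) from residue burning in one region.

    Dry matter burned (t) = area * fuel load * fraction burned * combustion factor.
    An emission factor in g/kg equals kg/t, so dividing by 1000 gives tonnes.
    """
    burned_t = (
        harvested_area_ha * fuel_load_t_per_ha * fraction_burned * combustion_factor
    )
    return burned_t * emission_factor_g_per_kg / 1000.0


# Hypothetical regions with region-specific factors (illustrative values only).
regions = {
    "region_a": dict(harvested_area_ha=1000, fuel_load_t_per_ha=5.0,
                     fraction_burned=0.5, combustion_factor=0.8,
                     emission_factor_g_per_kg=10.0),
    "region_b": dict(harvested_area_ha=2000, fuel_load_t_per_ha=4.0,
                     fraction_burned=0.25, combustion_factor=0.5,
                     emission_factor_g_per_kg=10.0),
}

total = sum(pm25_emissions_tonnes(**factors) for factors in regions.values())
print(f"Total PM2.5: {total:.1f} t")
```

Keeping the factors per region is what lets differences in harvest and burning practice show up in the total, instead of being averaged away by a single national factor.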
The CalEPA Regulated Site Portal (CalEPA RSP) does not generate any data of its own; it combines data from 7 state data sets and 2 federal data sets.
State Data Sets: Cal/OSHA, CERS, CIWQS, EnviroStor, GeoTracker, SMARTS, and SWIS
Federal Data Sets: EIS and TRI
This database uses a broad variety of data. Most of the data is collected by the EPA itself. Users are able to search for facilities regulated under the following systems:
When looking at individual facilities, the database provides detailed facility reports, enforcement case reports (civil and criminal), air pollutant reports, effluent charts, pollutant loading reports, effluent limit exceedances reports, CWA program area reports, permit limits reports, and other facility documents as available. The database provides easy ways to download and map the data. The database also allows users to narrow facilities searches using demographic data from EJScreen (also maintained by the EPA), the U.S. Census, and tribal land data.
Users can also look for information on federal administrative and judicial enforcement actions through an enforcement case search.
The California Open Data Portal hosts open data from multiple state agencies and aims to link all existing state portals so that California's open data sets can be searched easily from https://data.ca.gov. Open data, or public data, is collected through the state's routine business activities and published in a form that is simple to search, download, and combine with other data. It does not include private or confidential information about individuals.
The Student Health Index draws from data that is publicly available and up to date on a statewide level. Sources include the University of California San Francisco Health Atlas, the American Community Survey, the U.S. Census Bureau, the California Department of Education’s Downloadable Data Files site, and the CDC.
Detailed list of sources:
PLACES Project, CDC (available through the UCSF Health Atlas)
CalEnviroScreen (available through the UCSF Health Atlas)
Opportunity Atlas (available through the UCSF Health Atlas)
Health Resources and Services Administration (available through the UCSF Health Atlas)
American Community Survey (available through the UCSF Health Atlas)
California Department of Education’s Downloadable Data Files site
Kidsdata.org
The EMT disaster database is compiled from a wide variety of sources, including UN agencies, NGOs, insurance companies, research institutes, and press agencies. The compilation process prioritizes data from UN agencies, the International Federation of Red Cross and Red Crescent Societies, and government agencies. Entries are reviewed before consolidation, and this checking and incorporation of data is done daily. More routine data checking and management takes place monthly, with revisions made at the end of each year.