REBEL SCIENCE: A UNIVERSITY OF WYOMING CAPSTONE COURSE

"Once Upon a Time in the West"

Once Upon a Time in the West: COVID-19 Deaths are Under-reported in Wyoming; Less Female and Most Diverse Counties are at Risk.

12/9/2020

Annaliese Bronner | Jared Spencer | Liam Guille

ABSTRACT         
Rural regions of the United States present a unique challenge for infectious disease modeling because the geographic separation of resources produces numerous healthcare disparities. These disparities are especially relevant to the COVID-19 pandemic, where accurate collection of real-time epidemiological data is crucial for public health decision making. The impacts of the COVID-19 pandemic are also known to be unequally distributed and linked to community factors such as socioeconomic status, wage inequality, insurance coverage, and racial background. All of these concerns apply to the state of Wyoming, where nearly 70% of the population lives in a rural area. Considering both the potential under-reporting of COVID-19 mortalities and the susceptibility of Wyoming communities to the impacts of COVID-19 is paramount for effective decision making, and it prompted the creation of new statistical tools. This study uses a hybrid ARIMA-ETS-NNAR time series forecast and structural equation modelling to provide a county-by-county understanding of excess deaths and of susceptibility to COVID-19 based on various factors. Using these data, we found evidence of under-reporting of deaths at the county level, with particularly worrying results in Campbell, Natrona, and Sweetwater counties. Factors found to be significant for the prediction of COVID deaths included higher income inequality and a greater proportion of men in the population. Because of the low population of a number of Wyoming counties, it may be necessary to continue this analysis over a longer period of time to develop better statistical power for other factors. The findings of this study suggest a need to study the impacts of the COVID-19 pandemic on communities in Wyoming more robustly, with a specific emphasis on understanding how race and ethnicity may be affecting community susceptibility to COVID-19.
Further studies may be warranted that focus on deaths of residents of a given Wyoming county rather than occurrent deaths in the county, because patients may be transported across county lines for medical care. Additional topics of interest include how COVID-19 is impacting populations at a community level and whether proposed excess death models align with provider perceptions in these counties.

Picture
GINI Index estimation completed via integration over income brackets. GINI is a measure of income inequality on a scale of 0 to 1. The higher the number, the more unequal the income distribution in a given region. This analysis indicates that significant income inequality exists in Albany, Teton, and Johnson counties in Wyoming.
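The caption above mentions estimating the GINI index by integration over income brackets. The following is a minimal sketch of one way to do that, not the authors' code: it assumes each bracket is given as a hypothetical (household count, mean income) pair, builds the Lorenz curve from those pairs, and integrates it with the trapezoidal rule. The function name and the sample figures are illustrative assumptions.

```python
def gini_from_brackets(brackets):
    """Estimate a Gini index from (households, mean_income) pairs.

    Assumption: income is spread evenly within each bracket, so the
    result tends to understate the true inequality.
    """
    ordered = sorted(brackets, key=lambda b: b[1])  # poorest bracket first
    total_households = sum(n for n, _ in ordered)
    total_income = sum(n * mean for n, mean in ordered)
    if total_households <= 0 or total_income <= 0:
        raise ValueError("brackets must contain households and income")

    cum_pop = 0.0   # cumulative share of households
    cum_inc = 0.0   # cumulative share of income
    area = 0.0      # area under the Lorenz curve
    for n, mean in ordered:
        next_pop = cum_pop + n / total_households
        next_inc = cum_inc + n * mean / total_income
        # Trapezoid between consecutive points on the Lorenz curve.
        area += (next_pop - cum_pop) * (cum_inc + next_inc) / 2
        cum_pop, cum_inc = next_pop, next_inc

    # Gini = 1 - 2 * (area under the Lorenz curve).
    return 1 - 2 * area


# Hypothetical bracket data for one county (not taken from the study).
example_brackets = [(120, 7500), (200, 20000), (260, 42000), (90, 120000)]
example_gini = gini_from_brackets(example_brackets)
```

A value of 0 corresponds to every household holding the same income, and values closer to 1 correspond to income concentrated in the top brackets.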
Act 1: The Wild, Wild, (Sick) West
Introduction
Large disparities in public health outcomes exist between rural and urban regions throughout the United States and internationally. Nowhere have these disparities become more pronounced than in the impact of the COVID-19 pandemic, which poses both acute and long-term challenges for rural healthcare systems (1). These challenges are especially concerning for the State of Wyoming, where 70% of the population lives in a recognized rural area (2). Given the lack of healthcare resources allocated to rural counties, there is legitimate concern that the data may not robustly represent the true impact of COVID-19, creating “blind spots” for public health interventions (3). The factors that we postulated would have the largest impact on COVID-19 infection and mortality included socioeconomic status, income distribution, population density, and air quality. Some foundational literature exists, but it focuses on very narrow risk factors and often fails to account for confounding variables within a community, suggesting that other factors may better discriminate community susceptibility to COVID-19. In this study we hypothesize that the non-inclusion of indicators such as socioeconomic inequality is more detrimental than the individual risk factors themselves. Thus, there was a need not only to create a metric more specific to the state of Wyoming, but also to build a model that considers risk factors more inclusively to inform public health intervention.

Aims and Hypotheses
As part of this project, we formulated the following aims and hypotheses:
AIM 1
We will statistically estimate the under-representation of COVID-19 data in the State of Wyoming   

Hypothesis 1: Utilizing Poisson regressions trained with historical data, we will find evidence of under-representation of COVID-19 data in the state of Wyoming.

AIM 2
Utilizing Exploratory Factor Analysis and Multivariate Analysis, we aim to statistically quantify the susceptibility of communities that experience mortalities due to COVID-19 infections

Hypothesis 2: Utilizing Exploratory Factor Analysis and Multivariate Analysis, we will find evidence that the following factors correlate with varying degrees of prevalence and susceptibility to COVID-19 mortality: socioeconomic status, income distribution, population density, air quality, and GINI Index.

AIM 3
We aim to create a metric that estimates the extent of COVID-19 data underrepresentation in Wyoming communities, and the potential for harm due to this underrepresentation

Hypothesis 3: In creating a novel metric estimating the underrepresentation of COVID-19 data in Wyoming and the potential for harm due to susceptibility to COVID-19 infections, we will find that those areas with a higher prevalence of underrepresentation of data are more likely to be susceptible to COVID-19 mortality.
Picture
Concept map of Aims and Hypotheses. Credit: Jared Spencer.
Act 2: The Good, the Bad, and the Under-reported
or The Searchers for Significance.

Methods
While we initially expected to use Poisson regression analysis to estimate excess mortalities, new strategies were ultimately necessary due to the novelty of such statistical methods and the need for a method that could more robustly manage the fragmented mortality datasets present in Wyoming counties. Estimation of excess deaths was therefore achieved through a hybrid time series analysis combining ARIMA, ETS, and neural network models. Such modelling has been shown to be robust in epidemiological applications (4). All models were trained on occurrent death data in each Wyoming county for 2015-2019, and forecasts were created utilizing the forecastHybrid R package (5).
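To illustrate the idea of a hybrid forecast, here is a minimal sketch that combines the forecasts of several component models into one ensemble forecast. It is not the authors' R workflow: the equal weighting is an assumption rather than something this write-up states, the function name is hypothetical, and the component numbers are placeholders rather than fitted ARIMA, ETS, or NNAR output.

```python
def combine_forecasts(component_forecasts, weights=None):
    """Combine per-model forecasts into a single weighted ensemble forecast.

    component_forecasts: dict mapping a model name to a list of forecasts,
    one value per future period. weights: optional dict of model weights;
    equal weights are assumed when none are given.
    """
    names = list(component_forecasts)
    if not names:
        raise ValueError("at least one component forecast is required")
    horizon = len(component_forecasts[names[0]])
    if any(len(component_forecasts[name]) != horizon for name in names):
        raise ValueError("all component forecasts must cover the same periods")
    if weights is None:
        weights = {name: 1.0 for name in names}  # assumption: equal weights

    total_weight = sum(weights[name] for name in names)
    return [
        sum(weights[name] * component_forecasts[name][t] for name in names)
        / total_weight
        for t in range(horizon)
    ]


# Placeholder monthly death forecasts for one county (illustrative only).
placeholder_components = {
    "arima": [31.0, 29.5, 30.2],
    "ets": [30.0, 30.0, 30.0],
    "nnar": [32.5, 28.0, 31.1],
}
hybrid_forecast = combine_forecasts(placeholder_components)
```

Averaging across model families is a common way to reduce the influence of any single model's errors, which matters when county-level series are short and noisy.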

To estimate under-reported deaths from COVID-19, the deaths attributed to COVID-19 in the New York Times COVID-19 dataset (6) were subtracted from the observed occurrent deaths for the first 10 months of 2020, yielding a set of deaths that were "not attributed to COVID". The difference between this value and the mean predicted deaths provided an estimate of excess deaths and thus of potential under-reporting of COVID-19 mortalities by county.
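The arithmetic described above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function name and the monthly figures are hypothetical, and the sign convention (positive values mean more non-COVID deaths were observed than the forecast predicted) is an assumption.

```python
def excess_deaths(observed, covid_attributed, predicted):
    """Return per-period excess deaths not attributed to COVID-19.

    observed: occurrent deaths per period
    covid_attributed: deaths officially attributed to COVID-19 per period
    predicted: mean forecast deaths per period from the pre-2020 model
    Assumed convention: excess = (observed - covid_attributed) - predicted.
    """
    if not (len(observed) == len(covid_attributed) == len(predicted)):
        raise ValueError("all series must cover the same periods")
    not_attributed = [o - c for o, c in zip(observed, covid_attributed)]
    return [n - p for n, p in zip(not_attributed, predicted)]


# Hypothetical monthly figures for one county (not taken from the study).
monthly_excess = excess_deaths(
    observed=[52, 61, 58],
    covid_attributed=[0, 4, 7],
    predicted=[50, 49, 53],
)
total_excess = sum(monthly_excess)
```

Under this convention, a negative value means fewer deaths occurred than the forecast expected for that period.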
Picture
A summative diagram of methods utilized to estimate excess mortality in the state of Wyoming.
Methodologies for understanding the factors relevant to community COVID-19 susceptibility also evolved throughout the course of this project. Structural Equation Modelling (SEM) was determined to be more germane to the aims of the project than the originally proposed Exploratory Factor Analysis and Multivariate Analysis because of its ability to robustly capture the relationships among the diverse variables presented to the model. Structural equation modelling has also been used in various public health applications and may be seen as a reliable statistical technique for the purposes of this project (7).

SEM was performed utilizing publicly available information at the county level in the state of Wyoming. All SEM modelling was carried out utilizing the lavaan R package (8).
Act 3: Once Upon a Time in the Results
Picture
A line graph representing the cumulative excess deaths from January to October 2020. Most counties exhibit excess deaths in the range of 0-20, with some outliers in Campbell and Carbon counties suggesting modeling errors and potential confounding variables.

Picture
A heatmap illustrating the number of excess deaths per county in a given month. Notice how some counties exhibit similar temporal trends, suggesting reliability of the model.

Picture
A simple scatter plot representing the correlation coefficients and P-values of varying measures when examined via SEM. Median Age and Percent Non-White demographics within a county are positively correlated with higher numbers of COVID-19 deaths with statistical significance (P<.05). Other measures such as percentage of a population in poverty and GINI index hold some potential for correlational understanding, but ultimately lack statistical significance.

Act 4: Annie Get Your Conclusion (and Discussion)
Conclusion
In completing Hybrid Time Series Analysis and Structural Equation Modelling concerning COVID-19 in the state of Wyoming, we ultimately concluded that there is evidence of COVID-19 mortality under-reporting in the state, supporting Hypothesis 1. Most counties have experienced excess deaths in the range of 0-20, with some counties experiencing fewer deaths for the year than expected. Particularly worrying are excess deaths in Campbell, Sweetwater, and Natrona counties, which make up over 50% of the excess deaths estimated.

Concerning the SEM data, of those deaths that are reported, the most powerful predictors of COVID-19 mortalities at the county level appear to be median age and the percentage of the population that is non-white. Other factors such as air quality and population density were shown to be less predictive, indicating mixed results concerning Hypothesis 2. There is promise that the percentage of a population in poverty and GINI income inequality may have some correlational value; longer periods of observation may allow these factors to achieve statistical significance.

Discussion

Our conclusions indicate that many "blind spots" exist in the State of Wyoming in relation to the COVID-19 pandemic. As the pandemic continues to progress through the winter of 2020 into the spring of 2021, health officials and policymakers should be aware that official statistics of deaths from COVID-19 may under-represent the true toll of the pandemic on these communities. It is not unreasonable to expect that it will be many years until we understand the true impact of COVID-19 in total lives lost.

Furthermore, officials should be aware that those counties with the highest proportion of males and non-white populations should be prioritized when considering public health interventions. With the expected FDA approval of the Pfizer/BioNTech vaccine, these considerations are extremely relevant while vaccine resources are in short supply. Officials may also want to consider the percentage of these counties that is impoverished and their levels of income inequality when prioritizing county-level access to the vaccine.

This study has a number of limitations, not least of which is the small sample size present in a number of low-population counties in Wyoming. Such small samples have the potential to create conflicting data in time series forecasting. Thus, when interpreting forecast data, one must be mindful of the greater uncertainty present for these small counties. These small sample sizes also have outsized effects on SEM data, prompting the need for future research on larger timescales to ensure that current findings remain germane.

Furthermore, the study evaluated only the deaths occurring in these counties, not the county of residence of those individuals who died. Given the unique pathophysiology of COVID-19, there is the potential for an increased number of deaths occurring across county and state lines as critically ill patients in more rural counties are transported to major medical centers for treatment. Thus, future research that considers the county of residence for these deaths is necessary to account for these possible effects.

This study suggests the need to continue modelling the effects of the COVID-19 pandemic at the county level in Wyoming in new and diverse ways. Additional research should be considered for those counties that are estimated to be substantially under-reporting COVID-19 mortalities, to better understand why these discrepancies exist and to determine whether they may be present for other chronic conditions such as cancer, heart disease, and stroke.
Roll the Credits
The authors would like to thank Rachel Watson, Ella DeWolf, and Sierra Jech for their assistance in the Microbiology Capstone Course.
Thank you to Corina Davis at Wyoming Vital Statistics for providing the mortality data necessary to complete this project.
Further thanks to the team at the Wyoming Public Health Labs for their collaboration and willingness to provide assistance.

References:
  1. Lakhani, Hari Vishal, Sneha S. Pillai, Mishghan Zehra, Ishita Sharma, and Komal Sodhi. 2020. “Systematic Review of Clinical Insights into Novel Coronavirus (CoVID-19) Pandemic: Persisting Challenges in U.S. Rural Population.” International Journal of Environmental Research and Public Health 17 (12): 4279. https://doi.org/10.3390/ijerph17124279.
  2. “USDA ERS - Rural-Urban Continuum Codes.” n.d. Accessed September 5, 2020. https://www.ers.usda.gov/data-products/rural-urban-continuum-codes/.
  3. Cai, Guoshuai, Yohan Bossé, Feifei Xiao, Farrah Kheradmand, and Christopher I. Amos. 2020. “Tobacco Smoking Increases the Lung Gene Expression of ACE2, the Receptor of SARS-CoV-2.” American Journal of Respiratory and Critical Care Medicine 201 (12): 1557–59. https://doi.org/10.1164/rccm.202003-0693LE.
  4. Perone, Gaetano. 2020. “Comparison of ARIMA, ETS, NNAR and Hybrid Models to Forecast the Second Wave of COVID-19 Hospitalizations in Italy.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3716343.
  5. Shaub, David, and Peter Ellis. 2020. ForecastHybrid: Convenient Functions for Ensemble Time Series Forecasts (version 5.0.19). https://CRAN.R-project.org/package=forecastHybrid.
  6. Nytimes/Covid-19-Data. (2020) 2020. The New York Times. https://github.com/nytimes/covid-19-data.
  7. Amorim, Leila Denise Alves Ferreira, Rosemeire L. Fiaccone, Carlos Antônio S. T. Santos, Tereza Nadya dos Santos, Lia Terezinha L. P. de Moraes, Nelson F. Oliveira, Silvano O. Barbosa, et al. 2010. “Structural Equation Modeling in Epidemiology.” Cadernos de Saúde Pública 26 (12): 2251–62. https://doi.org/10.1590/S0102-311X2010001200004.
  8. Rosseel, Yves, Terrence D. Jorgensen, Nicholas Rockwood, Daniel Oberski, Jarrett Byrnes, Leonard Vanbrabant, Victoria Savalei, et al. 2020. Lavaan: Latent Variable Analysis (version 0.6-7). https://CRAN.R-project.org/package=lavaan.

    Our Objectives

    This project was a part of the 2020 Microbiology Capstone. Our team was dedicated to more robustly understanding the impacts of COVID-19 in the state of Wyoming in two ways:

    The first: To estimate the under-reporting of COVID-19 deaths in the state.

    The second: To understand what factors predict COVID-19 mortality in Wyoming communities.
