Journal:Epidemiological data challenges: Planning for a more robust future through data standards

From LIMSWiki
Revision as of 18:31, 27 April 2020 by Shawndouglas (talk | contribs) (Saving and adding more.)
Full article title Epidemiological data challenges: Planning for a more robust future through data standards
Journal Frontiers in Public Health
Author(s) Fairchild, Geoffrey; Tasseff, Byron; Khalsa, Hari; Generous, Nicholas; Daughton, Ashlynn R.; Velappan, Nileena; Priedhorsky, Reid; Deshpande, Alina
Author affiliation(s) Los Alamos National Laboratory
Primary contact Email: gfairchild at lanl dot gov
Editors Efird, Jimmy T.
Year published 2018
Volume and issue 6
Article # 336
DOI 10.3389/fpubh.2018.00336
ISSN 2296-2565
Distribution license Creative Commons Attribution 4.0 International
Website https://www.frontiersin.org/articles/10.3389/fpubh.2018.00336/full
Download https://www.frontiersin.org/articles/10.3389/fpubh.2018.00336/pdf (PDF)

Abstract

Accessible epidemiological data are of great value for emergency preparedness and response, understanding disease progression through a population, and building statistical and mechanistic disease models that enable forecasting. The status quo, however, renders acquiring and using such data difficult in practice. In many cases, a primary way of obtaining epidemiological data is through the internet, but the methods by which the data are presented to the public often differ drastically among institutions. As a result, there is a strong need for better data sharing practices. This paper identifies, in detail and with examples, the three key challenges one encounters when attempting to acquire and use epidemiological data: (1) interfaces, (2) data formatting, and (3) reporting. These challenges are used to provide suggestions and guidance for improvement as these systems evolve in the future. If these suggested data and interface recommendations were adhered to, epidemiological and public health analysis, modeling, and informatics work would be significantly streamlined, which can in turn yield better public health decision-making capabilities.

Keywords: data, computational epidemiology, public health, disease modeling, informatics, disease surveillance

Introduction

At the heart of disease surveillance and modeling are epidemiological data. These data are generally presented as a time series of cases, T, for a geographic region, G, and for a demographic, D. The type of cases presented may vary depending on the context. For example, T may be a time series of confirmed or suspected cases, or it might be hospitalizations or deaths; in some circumstances, it may be a summation of some combination of these (e.g., confirmed + suspected cases). G is most commonly a political boundary; it might be a country, state/province, county/district, city, or sub-city region, such as a postal code or United States (U.S.) Census Bureau census tract. Depending on the context, D may simply be the entire population of G, or it might be stratified by age, sex, race, education, or other relevant factors.
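
As an illustration only, the T, G, and D structure described above can be sketched as a small Python data type. The class names, field names, region name, and case counts below are hypothetical and invented for this example; they do not come from any particular surveillance system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass(frozen=True)
class Observation:
    """One point of the time series T: a case count for a reporting period."""
    period_start: date
    count: int


@dataclass
class CaseSeries:
    """A time series T for one region G and one demographic D."""
    region: str        # G, e.g. a country, state, county, or postal code
    demographic: str   # D, e.g. "all" or an age/sex stratum
    case_type: str     # what T counts, e.g. "confirmed", "suspected", "deaths"
    observations: List[Observation]

    def total(self) -> int:
        return sum(obs.count for obs in self.observations)


def combine(a: CaseSeries, b: CaseSeries, case_type: str) -> CaseSeries:
    """Sum two series period by period (e.g., confirmed + suspected)."""
    if (a.region, a.demographic) != (b.region, b.demographic):
        raise ValueError("series must share the same G and D")
    counts: Dict[date, int] = {}
    for obs in a.observations + b.observations:
        counts[obs.period_start] = counts.get(obs.period_start, 0) + obs.count
    merged = [Observation(day, n) for day, n in sorted(counts.items())]
    return CaseSeries(a.region, a.demographic, case_type, merged)


# Made-up weekly counts for a hypothetical region (not real data).
weeks = [date(2018, 1, 1), date(2018, 1, 8)]
confirmed = CaseSeries("Example County", "all", "confirmed",
                       [Observation(w, n) for w, n in zip(weeks, [12, 18])])
suspected = CaseSeries("Example County", "all", "suspected",
                       [Observation(w, n) for w, n in zip(weeks, [5, 7])])
combined = combine(confirmed, suspected, "confirmed+suspected")
print([obs.count for obs in combined.observations])
```

Keeping the case type alongside G and D makes it explicit whether a series holds confirmed cases, suspected cases, or a summation of the two.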

Epidemiological data have a variety of uses. From a public health perspective, they can be used to gain an understanding of population-level disease progression. This understanding can in turn be used to aid in decision-making and allocation of resources. Recent outbreaks like Ebola and Zika have demonstrated the value of accessible epidemiological data for emergency preparedness and the need for better data sharing.[1] These data may influence vaccine distribution[2], and hospitals can use them to anticipate surge capacity needs during an outbreak, allowing them to obtain extra temporary help if necessary.[3][4]

From a modeler's perspective, high-quality reference data (also commonly referred to as "ground truth data") are needed to enable prediction and forecasting.[5] These data can be used to parameterize compartmental models[6] as well as stochastic agent-based models[7][8][9][10][11], and they can also be used to train and validate machine learning and statistical models.[12][13][14][15][16][17][18][19]
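
To make the idea of parameterizing a compartmental model concrete, the following is a minimal sketch rather than the method of any cited work. It assumes a basic SIR model integrated with fixed Euler steps, treats the reference series as counts of currently infectious people, and picks the transmission rate by a simple grid search. The function names, parameter values, and the "observed" series (which is synthetic, generated by the model itself) are all assumptions made for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Euler-integrate a basic SIR model; return one (S, I, R) tuple per day."""
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    steps_per_day = round(1 / dt)
    daily = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        daily.append((s, i, r))
    return daily


def fit_beta(observed_infectious, candidates, gamma, s0, i0, r0):
    """Return the candidate beta with the lowest sum of squared errors."""
    days = len(observed_infectious) - 1

    def squared_error(beta):
        simulated = simulate_sir(beta, gamma, s0, i0, r0, days)
        return sum((state[1] - obs) ** 2
                   for state, obs in zip(simulated, observed_infectious))

    return min(candidates, key=squared_error)


# Synthetic "ground truth": generated by the model with beta = 0.5.
# Real reference data would be noisy and would not match any candidate exactly.
truth = simulate_sir(beta=0.5, gamma=0.2, s0=990, i0=10, r0=0, days=30)
observed = [infectious for _, infectious, _ in truth]

candidates = [k / 10 for k in range(1, 11)]  # 0.1, 0.2, ..., 1.0
best_beta = fit_beta(observed, candidates, gamma=0.2, s0=990, i0=10, r0=0)
print(best_beta)
```

A grid search is used here only because it is easy to read; in practice, a numerical optimizer or a Bayesian fitting procedure would be a more common choice.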


References

  1. Chretien, J.P.; Rivers, C.M.; Johansson, M.A. (2016). "Make Data Sharing Routine to Prepare for Public Health Emergencies". PLoS Medicine 13 (8): e1002109. doi:10.1371/journal.pmed.1002109. PMC PMC4987038. PMID 27529422. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4987038. 
  2. Centers for Disease Control and Prevention (2018). "Allocating and Targeting Pandemic Influenza Vaccine During an Influenza Pandemic". U.S. Department of Health and Human Services. https://asprtracie.hhs.gov/technical-resources/resource/2846/guidance-on-allocating-and-targeting-pandemic-influenza-vaccine. 
  3. Nap, R.E.; Andriessen, M.P.; Meessen, N.E. et al. (2007). "Pandemic influenza and hospital resources". Emerging Infectious Diseases 13 (11): 1714–9. doi:10.3201/eid1311.070103. PMC PMC3375786. PMID 18217556. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3375786. 
  4. Hota, S.; Fried, E.; Burry, L. et al. (2010). "Preparing your intensive care unit for the second wave of H1N1 and future surges". Critical Care Medicine 38 (4 Suppl.): e110–9. doi:10.1097/CCM.0b013e3181c66940. PMID 19935417. 
  5. Moran, K.R.; Fairchild, G.; Generous, N. et al. (2016). "Epidemic Forecasting is Messier Than Weather Forecasting: The Role of Human Behavior and Internet Data Streams in Epidemic Forecast". Journal of Infectious Diseases 214 (Suppl. 4): S404–S408. doi:10.1093/infdis/jiw375. PMC PMC5181546. PMID 28830111. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181546. 
  6. Hethcote, H.W. (2000). "The Mathematics of Infectious Diseases". SIAM Review 42 (4): 599–653. doi:10.1137/S0036144500371907. 
  7. Eubank, S.; Guclu, H.; Kumar, V.S. et al. (2004). "Modelling disease outbreaks in realistic urban social networks". Nature 429 (6988): 180–4. doi:10.1038/nature02541. PMID 15141212. 
  8. Bisset, K.R.; Chen, J.; Feng, X. et al. (2009). "EpiFast: A fast algorithm for large scale realistic epidemic simulations on distributed memory systems". Proceedings of the 23rd international conference on Supercomputing: 430–39. doi:10.1145/1542275.1542336. 
  9. Chao, D.L.; Halstead, S.B.; Halloran, M.E. et al. (2012). "Controlling dengue with vaccines in Thailand". PLoS Neglected Tropical Diseases 6 (10): e1876. doi:10.1371/journal.pntd.0001876. PMC PMC3493390. PMID 23145197. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3493390. 
  10. Grefenstette, J.J.; Brown, S.T.; Rosenfeld, R. et al. (2013). "FRED (a Framework for Reconstructing Epidemic Dynamics): An open-source software system for modeling infectious diseases and control strategies using census-based populations". BMC Public Health 13: 940. doi:10.1186/1471-2458-13-940. PMC PMC3852955. PMID 24103508. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3852955. 
  11. McMahon, B.H.; Manore, C.A.; Hyman, J.M. et al. (2014). "Coupling Vector-host Dynamics with Weather Geography and Mitigation Measures to Model Rift Valley Fever in Africa". Mathematical Modelling of Natural Phenomena 9 (2): 161–77. doi:10.1051/mmnp/20149211. PMC PMC4398965. PMID 25892858. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4398965. 
  12. Viboud, C.; Boëlle, P.Y.; Carrat, F. et al. (2003). "Prediction of the spread of influenza epidemics by the method of analogues". American Journal of Epidemiology 158 (10): 996–1006. doi:10.1093/aje/kwg239. PMID 14607808. 
  13. Polgreen, P.M.; Chen, Y.; Pennock, D.M. et al. (2008). "Using internet searches for influenza surveillance". Clinical Infectious Diseases 47 (11): 1443–8. doi:10.1086/593098. PMID 18954267. 
  14. Ginsberg, J.; Mohebbi, M.H.; Patel, R.S. et al. (2009). "Detecting influenza epidemics using search engine query data". Nature 457 (7232): 1012–4. doi:10.1038/nature07634. PMID 19020500. 
  15. Signorini, A.; Segre, A.M.; Polgreen, P.M. et al. (2011). "The use of Twitter to track levels of disease activity and public concern in the U.S. during the influenza A H1N1 pandemic". PLoS One 6 (5): e19467. doi:10.1371/journal.pone.0019467. PMC PMC3087759. PMID 21573238. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3087759. 
  16. Shaman, J.; Karspeck, A.; Yang, W. et al. (2013). "Real-time influenza forecasts during the 2012-2013 season". Nature Communications 4: 2837. doi:10.1038/ncomms3837. PMC PMC3873365. PMID 24302074. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3873365. 
  17. Generous, N.; Fairchild, G.; Deshpande, A. et al. (2014). "Global disease monitoring and forecasting with Wikipedia". PLoS Computational Biology 10 (11): e1003892. doi:10.1371/journal.pcbi.1003892. PMC PMC4231164. PMID 25392913. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4231164. 
  18. Hickmann, K.S.; Fairchild, G.; Priedhorsky, R. et al. (2015). "Forecasting the 2013-2014 influenza season using Wikipedia". PLoS Computational Biology 11 (5): e1004239. doi:10.1371/journal.pcbi.1004239. PMC PMC4431683. PMID 25974758. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4431683. 
  19. Fairchild, G.; Del Valle, S.Y.; De Silva, L. et al. (2015). "Eliciting Disease Data from Wikipedia Articles". Proceedings of the 2015 International AAAI Conference on Weblogs and Social Media: 26–33. PMC PMC5511739. PMID 28721308. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5511739. 

Notes

This presentation is faithful to the original, with only a few minor changes to formatting. In some cases important information was missing from the references, and that information was added.