Difference between revisions of "Journal:Epidemiological data challenges: Planning for a more robust future through data standards"

Revision as of 18:10, 27 April 2020

Full article title Epidemiological data challenges: Planning for a more robust future through data standards
Journal Frontiers in Public Health
Author(s) Fairchild, Geoffrey; Tasseff, Byron; Khalsa, Hari; Generous, Nicholas; Daughton, Ashlynn R.; Velappan, Nileena; Priedhorsky, Reid; Deshpande, Alina
Author affiliation(s) Los Alamos National Laboratory
Primary contact Email: gfairchild at lanl dot gov
Editors Efird, Jimmy T.
Year published 2018
Volume and issue 6
Article # 336
DOI 10.3389/fpubh.2018.00336
ISSN 2296-2565
Distribution license Creative Commons Attribution 4.0 International
Website https://www.frontiersin.org/articles/10.3389/fpubh.2018.00336/full
Download https://www.frontiersin.org/articles/10.3389/fpubh.2018.00336/pdf (PDF)

Abstract

Accessible epidemiological data are of great value for emergency preparedness and response, understanding disease progression through a population, and building statistical and mechanistic disease models that enable forecasting. The status quo, however, renders acquiring and using such data difficult in practice. In many cases, a primary way of obtaining epidemiological data is through the internet, but the methods by which the data are presented to the public often differ drastically among institutions. As a result, there is a strong need for better data sharing practices. This paper identifies, in detail and with examples, the three key challenges one encounters when attempting to acquire and use epidemiological data: (1) interfaces, (2) data formatting, and (3) reporting. These challenges are used to provide suggestions and guidance for improvement as these systems evolve in the future. If these suggested data and interface recommendations were adhered to, epidemiological and public health analysis, modeling, and informatics work would be significantly streamlined, which can in turn yield better public health decision-making capabilities.

Keywords: data, computational epidemiology, public health, disease modeling, informatics, disease surveillance

Introduction

At the heart of disease surveillance and modeling are epidemiological data. These data are generally presented as a time series of cases, T, for a geographic region, G, and for a demographic, D. The type of cases presented may vary depending on the context. For example, T may be a time series of confirmed or suspected cases, or it might be hospitalizations or deaths; in some circumstances, it may be a summation of some combination of these (e.g., confirmed + suspected cases). G is most commonly a political boundary; it might be a country, state/province, county/district, city, or sub-city region, such as a postal code or United States (U.S.) Census Bureau census tract. Depending on the context, D may simply be the entire population of G, or it might be stratified by age, sex, race, education, or other relevant factors.
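
As a rough illustration of the T, G, D description above, the following Python sketch shows one way such a series could be represented and how two case types could be summed. The class name, field names, region code, and counts are all hypothetical and chosen only for illustration; they are not drawn from the article or from any particular data source.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CaseSeries:
    """One time series T for a region G and a demographic D (illustrative schema)."""
    case_type: str               # e.g. "confirmed", "suspected", "deaths"
    region: str                  # G: hypothetical region identifier
    demographic: Optional[str]   # D: None stands for the entire population of G
    counts: Dict[date, int]      # T: observation date -> number of cases


def combine(series: List[CaseSeries], case_type: str) -> CaseSeries:
    """Sum several series that share the same G and D, e.g. confirmed + suspected."""
    first = series[0]
    for s in series:
        if (s.region, s.demographic) != (first.region, first.demographic):
            raise ValueError("series must share the same region and demographic")
    total: Dict[date, int] = {}
    for s in series:
        for day, n in s.counts.items():
            total[day] = total.get(day, 0) + n
    # Sort by date so the combined series reads chronologically.
    return CaseSeries(case_type, first.region, first.demographic,
                      dict(sorted(total.items())))


# Made-up weekly counts for a made-up region code.
confirmed = CaseSeries("confirmed", "US-NM", None,
                       {date(2018, 1, 1): 3, date(2018, 1, 8): 5})
suspected = CaseSeries("suspected", "US-NM", None,
                       {date(2018, 1, 1): 2, date(2018, 1, 8): 4})
combined = combine([confirmed, suspected], "confirmed+suspected")
```

Keeping the case type, region, and demographic explicit on every series is one way to avoid silently mixing series that are not comparable.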

Epidemiological data have a variety of uses. From a public health perspective, they can be used to gain an understanding of population-level disease progression. This understanding can in turn be used to aid in decision-making and allocation of resources. Recent outbreaks like Ebola and Zika have demonstrated the value of accessible epidemiological data for emergency preparedness and the need for better data sharing.[1] These data may influence vaccine distribution[2], and hospitals can use them to anticipate the surge capacity needed during an outbreak and to arrange extra temporary help if necessary.[3][4]

From a modeler's perspective, high-quality reference data (also commonly referred to as "ground truth data") are needed to enable prediction and forecasting.[5] These data can be used to parameterize compartmental models[6] as well as stochastic agent-based models[7][8][9][10][11], and they can also be used to train and validate machine learning and statistical models.

(12–19)
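
To make the idea of parameterizing a compartmental model from reference data concrete, here is a minimal Python sketch. It assumes a simple discrete-time SIR model and a coarse grid search over the transmission rate; both are simplifications chosen for illustration, not the methods used in the cited works. The "observed" series is synthetic, generated from a known parameter so that the fit can be checked, and all function names and values are hypothetical.

```python
from typing import List, Sequence


def simulate_sir(beta: float, gamma: float, population: int,
                 initial_infected: int, days: int) -> List[float]:
    """Return the number of infected individuals per day from a discrete-time SIR model."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    infected = [i]
    for _ in range(days - 1):
        new_infections = beta * s * i / population
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        infected.append(i)
    return infected


def fit_beta(observed: Sequence[float], gamma: float, population: int,
             candidates: Sequence[float]) -> float:
    """Pick the candidate beta whose simulated curve has the smallest squared error."""
    def sse(beta: float) -> float:
        sim = simulate_sir(beta, gamma, population, int(observed[0]), len(observed))
        return sum((a - b) ** 2 for a, b in zip(sim, observed))
    return min(candidates, key=sse)


# Synthetic stand-in for reference data, generated with a known beta of 0.4.
truth = simulate_sir(beta=0.4, gamma=0.2, population=10_000,
                     initial_infected=10, days=30)
candidates = [round(0.05 * k, 2) for k in range(1, 21)]  # 0.05, 0.10, ..., 1.00
best_beta = fit_beta(truth, gamma=0.2, population=10_000, candidates=candidates)
```

With real surveillance data the fit would not be exact, and the quality of the estimate would depend directly on how complete and consistent the reported time series is.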


References

  1. Chretien, J.P.; Rivers, C.M.; Johansson, M.A. (2016). "Make Data Sharing Routine to Prepare for Public Health Emergencies". PLoS Medicine 13 (8): e1002109. doi:10.1371/journal.pmed.1002109. PMC PMC4987038. PMID 27529422. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4987038. 
  2. Centers for Disease Control and Prevention (2018). "Allocating and Targeting Pandemic Influenza Vaccine During an Influenza Pandemic". U.S. Department of Health and Human Services. https://asprtracie.hhs.gov/technical-resources/resource/2846/guidance-on-allocating-and-targeting-pandemic-influenza-vaccine. 
  3. Nap, R.E.; Andriessen, M.P.; Meessen, N.E. et al. (2007). "Pandemic influenza and hospital resources". Emerging Infectious Diseases 13 (11): 1714–9. doi:10.3201/eid1311.070103. PMC PMC3375786. PMID 18217556. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3375786. 
  4. Hota, S.; Fried, E.; Burry, L. et al. (2010). "Preparing your intensive care unit for the second wave of H1N1 and future surges". Critical Care Medicine 38 (4 Suppl.): e110–9. doi:10.1097/CCM.0b013e3181c66940. PMID 19935417. 
  5. Moran, K.R.; Fairchild, G.; Generous, N. et al. (2016). "Epidemic Forecasting is Messier Than Weather Forecasting: The Role of Human Behavior and Internet Data Streams in Epidemic Forecast". Journal of Infectious Diseases 214 (Suppl. 4): S404–S408. doi:10.1093/infdis/jiw375. PMC PMC5181546. PMID 28830111. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5181546. 
  6. Hethcote, H.W. (2000). "The Mathematics of Infectious Diseases". SIAM Review 42 (4): 599–653. doi:10.1137/S0036144500371907. 
  7. Eubank, S.; Guclu, H.; Kumar, V.S. et al. (2004). "Modelling disease outbreaks in realistic urban social networks". Nature 429 (6988): 180–4. doi:10.1038/nature02541. PMID 15141212. 
  8. Bisset, K.R.; Chen, J.; Feng, X. et al. (2009). "EpiFast: A fast algorithm for large scale realistic epidemic simulations on distributed memory systems". Proceedings of the 23rd international conference on Supercomputing: 430–39. doi:10.1145/1542275.1542336. 
  9. Chao, D.L.; Halstead, S.B.; Halloran, M.E. et al. (2012). "Controlling dengue with vaccines in Thailand". PLoS Neglected Tropical Diseases 6 (10): e1876. doi:10.1371/journal.pntd.0001876. PMC PMC3493390. PMID 23145197. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3493390. 
  10. Grefenstette, J.J.; Brown, S.T.; Rosenfeld, R. et al. (2013). "FRED (a Framework for Reconstructing Epidemic Dynamics): An open-source software system for modeling infectious diseases and control strategies using census-based populations". BMC Public Health 13: 940. doi:10.1186/1471-2458-13-940. PMC PMC3852955. PMID 24103508. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3852955. 
  11. McMahon, B.H.; Manore, C.A.; Hyman, J.M. et al. (2014). "Coupling Vector-host Dynamics with Weather Geography and Mitigation Measures to Model Rift Valley Fever in Africa". Mathematical Modelling of Natural Phenomena 9 (2): 161–77. doi:10.1051/mmnp/20149211. PMC PMC4398965. PMID 25892858. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4398965. 

Notes

This presentation is faithful to the original, with only a few minor changes to formatting. In some cases important information was missing from the references, and that information was added.