Journal:Extending an open-source tool to measure data quality: Case report on Observational Health Data Science and Informatics (OHDSI)

From LIMSWiki
Revision as of 18:55, 10 August 2020 by Shawndouglas (talk | contribs) (Saving and adding more.)
Full article title Extending an open-source tool to measure data quality: Case report on Observational Health Data Science and Informatics (OHDSI)
Journal BMJ Health & Care Informatics
Author(s) Dixon, Brian E.; Wen, Chen; French, Tony; Williams, Jennifer L.; Duke, Jon D.; Grannis, Shaun J.
Author affiliation(s) Indiana University–Purdue University Indianapolis, Regenstrief Institute, Georgia Tech Research Institute
Primary contact Email: bedixon at regenstrief dot org
Year published 2020
Volume and issue 27 (1)
Article # e100054
DOI 10.1136/bmjhci-2019-100054
ISSN 2632-1009
Distribution license Creative Commons Attribution-NonCommercial 4.0 International
Website https://informatics.bmj.com/content/27/1/e100054
Download https://informatics.bmj.com/content/bmjhci/27/1/e100054.full.pdf (PDF)

Abstract

Introduction: As the health system seeks to leverage large-scale data to inform population outcomes, the informatics community is developing tools for analyzing these data. To support data quality assessment within such a tool, we extended the open-source software Observational Health Data Sciences and Informatics (OHDSI) to incorporate new functions useful for population health.

Methods: We developed and tested methods to measure the completeness, timeliness, and entropy of information. The new data quality methods were applied to over 100 million clinical messages received from emergency department information systems for use in public health syndromic surveillance systems.
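As a rough illustration of what these three measures can mean in practice, the following minimal Python sketch computes each one over a small list of field values. It is not the OHDSI implementation or the authors' code: the function names, the treatment of blank strings as missing, and the use of Shannon entropy in bits are all assumptions made for this example.

```python
import math
from collections import Counter
from datetime import datetime


def is_present(value):
    """Treat None and blank strings as missing (an assumption of this sketch)."""
    return value is not None and str(value).strip() != ""


def completeness(values):
    """Fraction of values that are present; 0.0 for an empty list."""
    if not values:
        return 0.0
    return sum(1 for v in values if is_present(v)) / len(values)


def shannon_entropy(values):
    """Shannon entropy, in bits, of the distribution of non-missing values."""
    observed = [v for v in values if is_present(v)]
    if not observed:
        return 0.0
    total = len(observed)
    counts = Counter(observed)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def timeliness_hours(event_time, received_time):
    """Delay, in hours, between an event and receipt of its message."""
    return (received_time - event_time).total_seconds() / 3600.0


# Hypothetical chief-complaint values from five messages.
complaints = ["fever", "cough", None, "fever", ""]
print(completeness(complaints))      # share of messages with a value
print(shannon_entropy(complaints))   # variability among the values present
print(timeliness_hours(datetime(2020, 1, 1, 8, 0),
                       datetime(2020, 1, 1, 14, 30)))
```

In this sketch, a field with low entropy carries little variation (for example, the same default value in nearly every message), which is one reason entropy can complement completeness when judging whether a field is informative.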

Discussion: While completeness and entropy methods were implemented by the OHDSI community, timeliness was not adopted as its context did not fit with the existing OHDSI domains. The case report examines the process and reasons for acceptance and rejection of ideas proposed to an open-source community like OHDSI.

Introduction

Observational research requires an information infrastructure that can gather, integrate, manage, analyze, and apply evidence to decision-making and operations in an enterprise. In healthcare, we currently seek to develop, implement, and operationalize learning health systems in which an expanding universe of electronic health data can be transformed into evidence through observational research and applied to clinical decisions and processes within health systems.[1][2]

Leveraging large-scale health data is challenging because clinical data generally derive from myriad smaller systems across diverse institutions and are captured for various intended uses through varying business processes. The result is variable data quality, which limits the utility of the data for decision-making and application. To ensure data are fit for use at both the granular patient level and the broader aggregate population level, it is important to assess, monitor, and improve data quality.[3][4]

A growing body of knowledge documents abundant data quality challenges in healthcare. Liaw et al. examined the completeness and accuracy of emergency department information system (EDIS) data for identifying patients with select chronic diseases (e.g., type 2 diabetes mellitus, cardiovascular disease, and chronic obstructive pulmonary disease). They found that information on the target diseases was missing from EDIS discharge summaries in 11%–20% of cases.[5] Furthermore, an audit confirmed just 61% of the diagnoses found in a query of the EDIS for the target conditions. Studies among integrated delivery networks and multiple provider organizations show similar results. A study of data from multiple laboratory information systems (LIS) transmitting electronic messages to public health departments found low completeness for a number of data elements critical to surveillance processes.[6]

Given poor data quality in health information systems, researchers as well as national organizations advocate for developing tools to enable standardized assessment, monitoring, and improvement of data quality.[3][4][7][8] For example, in the report from a National Science Foundation workshop on the learning health system, key research questions called for developing methods to curate data, compute fitness-for-use measures from the data themselves, and infer the strength of a data set based on its provenance.[9] Similar questions were posed by the National Academy of Medicine in its report on the role of observational studies in the learning health system.[10]


References

  1. Dixon, B.E.; Whipple, E.C.; Lajiness, J.M. et al. (2016). "Utilizing an integrated infrastructure for outcomes research: A systematic review". Health Information and Libraries Journal 33 (1): 7–32. doi:10.1111/hir.12127. PMID 26639793. 
  2. Institute of Medicine (2011). Digital Infrastructure for the Learning Health System: The Foundation for Continuous Improvement in Health and Health Care: Workshop Series Summary. National Academies Press. doi:10.17226/12912. ISBN 9780309225014. 
  3. Dixon, B.E.; Rosenman, M.; Xia, Y. et al. (2013). "A vision for the systematic monitoring and improvement of the quality of electronic health data". Studies in Health Technology and Informatics 192: 884–8. doi:10.3233/978-1-61499-289-9-884. PMID 23920685. 
  4. Weiskopf, N.G.; Bakken, S.; Hripcsak, G. et al. (2017). "A Data Quality Assessment Guideline for Electronic Health Record Data Reuse". EGEMS 5 (1): 14. doi:10.5334/egems.218. PMC5983018. PMID 29881734. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983018. 
  5. Liaw, S.-T.; Chen, H.-Y.; Maneze, D. et al. (2012). "Health reform: Is routinely collected electronic information fit for purpose?". Emergency Medicine Australasia 24 (1): 57–63. doi:10.1111/j.1742-6723.2011.01486.x. PMID 22313561. 
  6. Dixon, B.E.; Siegel, J.A.; Oemig, T.V. et al. (2013). "Electronic health information quality challenges and interventions to improve public health surveillance data and practice". Public Health Reports 128 (6): 546–53. doi:10.1177/003335491312800614. PMC3804098. PMID 24179266. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3804098. 
  7. Martin, E.G.; Law, J.; Ran, W. et al. (2017). "Evaluating the Quality and Usability of Open Data for Public Health Research: A Systematic Review of Data Offerings on 3 Open Data Platforms". Journal of Public Health Management and Practice 23 (4): e5–e13. doi:10.1097/PHH.0000000000000388. PMID 26910872. 
  8. Botts, N.; Bouhaddou, O.; Bennett, J. et al. (2014). "Data Quality and Interoperability Challenges for eHealth Exchange Participants: Observations from the Department of Veterans Affairs' Virtual Lifetime Electronic Record Health Pilot Phase". AMIA Annual Symposium Proceedings 2014: 307–14. PMC4419918. PMID 25954333. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4419918. 
  9. Friedman, C.; Rubin, J.; Brown, J. et al. (2015). "Toward a science of learning systems: a research agenda for the high-functioning Learning Health System". JAMIA 22 (1): 43–50. doi:10.1136/amiajnl-2014-002977. PMC4433378. PMID 25342177. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4433378. 
  10. Institute of Medicine (2013). Observational Studies in a Learning Health System: Workshop Summary. National Academies Press. doi:10.17226/18438. ISBN 9780309290845. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation and grammar.