
Full article title Learning health systems need to bridge the "two cultures" of clinical informatics and data science
Journal Journal of Innovation in Health Informatics
Author(s) Scott, Philip J.; Dunscombe, Rachel; Evans, David; Mukherjee, Mome; Wyatt, Jeremy C.
Author affiliation(s) University of Portsmouth, Salford Royal NHS Foundation Trust, British Computer Society, University of Edinburgh, University of Southampton
Primary contact Email: Philip dot scott at port dot ac dot uk
Year published 2018
Volume and issue 25(2)
Page(s) 126–31
DOI 10.14236/jhi.v25i2.1062
ISSN 1687-8035
Distribution license Creative Commons Attribution 4.0 International
Website https://www.hindawi.com/journals/abi/2018/4059018/
Download http://downloads.hindawi.com/journals/abi/2018/4059018.pdf (PDF)

Abstract

Background: United Kingdom (U.K.) health research policy and plans for population health management are predicated upon transformative knowledge discovery from operational "big data." Learning health systems require not only data but also feedback loops of knowledge into changed practice. This depends on knowledge management and application, which in turn depends upon effective system design and implementation. Biomedical informatics is the interdisciplinary field at the intersection of health science, social science, and information science and technology that spans this entire scope.

Issues: In the U.K., the separate worlds of health data science (bioinformatics, big data) and effective healthcare system design and implementation (clinical informatics, "digital health") have operated as "two cultures." Much National Health Service and social care data is of very poor quality. Substantial research funding is wasted on data cleansing or by producing very weak evidence. There is not yet a sufficiently powerful professional community or evidence base of best practice to influence the practitioner community or the digital health industry.

Recommendation: The U.K. needs increased clinical informatics research and education capacity and capability at much greater scale and ambition to be able to meet policy expectations, address the fundamental gaps in the discipline’s evidence base, and mitigate the absence of regulation. Independent evaluation of digital health interventions should be the norm, not the exception.

Conclusions: Policy makers and research funders need to acknowledge the existing gap between the two cultures and recognize that the full social and economic benefits of digital health and data science can only be realized by accepting the interdisciplinary nature of biomedical informatics and supporting a significant expansion of clinical informatics capacity and capability.

Keywords: big data, health informatics, bioinformatics, evidence-based practice, health policy, program evaluation, education, learning health systems

Introduction

The English novelist and physical chemist C.P. Snow famously characterized the gulf between what he called the "two cultures" of science and the humanities as a serious barrier to progress.[1] In our field, at least in the U.K., there appears to be an analogous gap between the policy and funding programs of data science (bioinformatics, "big data") and effective system design and implementation (clinical informatics, "digital health").

Data science in healthcare is subject to strong regulatory and ethical controls, minimum educational qualifications, well-established methodologies, mandatory professional accreditation, and evidence-based independent scrutiny. By contrast, digital health has minimal substantive regulation or ethical foundation, no specified educational requirements, weak methodologies, a contested evidence base, and negligible peer scrutiny. Yet the vision of big data is to base science on the data routinely produced by digital health systems.

This paper is focused on the U.K. context. We bring together experience from frontline National Health Service (NHS) clinical informatics and from epidemiological research to present the operational realities of health data quality and the implications for data science. We argue that to build a successful learning health system, data science and clinical informatics should be seen as two parts of the same discipline with a common mission. We commend the work in progress to bridge this cultural divide but propose that the U.K. needs to expand its clinical informatics research and education capacity and capability at much greater scale to address the substantial gaps in the evidence base and to realize the anticipated societal aims.

Routine clinical data is highly problematic

Data quality in frontline healthcare systems faces a dual challenge in our current environment. First is the lack of standard data sets and adoption of reference values, though work is progressing in this area.[2] The second is the lack of data quality due to unreliable adherence to processes[3] and poor system usability.[4] Experience with implementing clinical terminologies, including the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) and Logical Observation Identifiers Names and Codes (LOINC), shows that our historical environment and the complexity of these standards always cause long debate and significant implementation effort. So far, little progress has been made even by the "Global Digital Exemplars"[5] in implementing SNOMED CT in any depth. Furthermore, complexity is introduced when interoperating with other care settings such as social care and mental health. General practitioner (GP) data is far from consistent: different practices use different fields in different ways, and usage varies from clinician to clinician. Historically, systems have not forced users to standardize their recording or practice. This results in varying data quality between GP practices, which affects not just epidemiological studies but operational processes. Failure to enter accurate data into healthcare systems occurs for a number of reasons, including poor usability, overly complex systems, lack of data input logic to check errors, and poor business change leadership.
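The "data input logic to check errors" whose absence is noted above can be as simple as validating each coded entry against a local value set at the point of recording. The sketch below illustrates the idea; the value set, field names, and SNOMED CT-style identifiers are invented for illustration and do not represent any real system's content.

```python
# Hypothetical sketch: input-time validation of a coded diagnosis entry
# against a local value set. Codes shown are SNOMED CT-style identifiers
# used purely for illustration.

ALLOWED_DIAGNOSIS_CODES = {
    "73211009": "Diabetes mellitus",
    "195967001": "Asthma",
}

def validate_entry(code: str, free_text: str) -> list[str]:
    """Return a list of data-quality warnings for one coded entry."""
    warnings = []
    if code not in ALLOWED_DIAGNOSIS_CODES:
        warnings.append(f"Code {code!r} is not in the local value set")
    if not free_text.strip() and code not in ALLOWED_DIAGNOSIS_CODES:
        warnings.append("Neither a recognized code nor free text recorded")
    return warnings

print(validate_entry("73211009", ""))         # []
print(validate_entry("99999", "chest pain"))  # one warning
```

Even checks this basic, applied consistently at data entry rather than during later cleansing, would catch the class of errors the text attributes to missing input logic.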

Most epidemiological research with routine clinical data uses coded data rather than free text. Thus, there is an overreliance on the codes used during clinical consultations. A national evaluation of code usage in primary care in Scotland, taking allergy as an example, found that 50% of usage across over two million consultations over seven years came from eight codes used to report for a GP incentive program, that 95% of usage came from 10% of the 352 allergy codes (n = 36), and that 21% of codes were never used.[6] A systematic review found variations in the completeness (66%–96%) and correctness of morbidity recording across disease areas.[7] For instance, the quality of recording for diabetes is better than for asthma in primary care.
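The kind of concentration analysis reported above can be sketched in a few lines: given recorded codes from consultations and the full value set, measure how much usage falls on the most-used codes and how many codes are never used. The codes and counts here are invented, not the Scottish study's data.

```python
# Minimal sketch of a code-usage concentration analysis over consultation
# records. All codes and frequencies are invented for illustration.

from collections import Counter

recorded_codes = ["H33", "H33", "H33", "14L", "14L", "SN53", "H33"]
value_set = {"H33", "14L", "SN53", "ZV48", "8H2"}  # hypothetical allergy codes

usage = Counter(recorded_codes)
total = sum(usage.values())
top_code, top_count = usage.most_common(1)[0]
never_used = value_set - set(usage)

print(f"{top_code} accounts for {top_count / total:.0%} of usage")
print(f"{len(never_used)} of {len(value_set)} codes were never used")
```

On real extracts the same aggregation, run per practice, would also expose the between-practice variability in coding behavior discussed earlier.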

There are also changes in case definitions and diagnostic criteria across disease areas over time, which are seldom noted in the databases. A recent primary care study found that the choice of codes can make a difference to outcome measures; for example, the incidence rate was higher when non-diagnostic codes were used than when only diagnostic codes were used.[8] Because coding varies across GP practices, including practices with poor recording quality in the analysis made a significant difference to incidence rates and trends: rates were lower and trends decreased when those practices were included. This study highlights the effect of miscoding and misclassification. It also shows that when data are missing, they might not be missing at random. Furthermore, codes needed during a consultation may be unavailable, so the information is recorded in free text instead. All these salient features of coded data are often ignored when interrogating patient databases for research and can thus lead to erroneous conclusions. No amount of data cleansing can resolve the inherent discrepancies in coded data.
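The sensitivity of incidence estimates to practice inclusion can be illustrated by recomputing a rate with and without practices flagged as having poor recording quality. All practice figures and the quality flag below are invented; this is a sketch of the analysis pattern, not the cited study's method or data.

```python
# Hedged sketch: incidence rate with and without practices flagged for
# poor recording quality. All figures are invented for illustration.

practices = [
    # (practice_id, new_cases, person_years, good_recording_quality)
    ("P1", 120, 10_000, True),
    ("P2", 95, 9_000, True),
    ("P3", 30, 8_000, False),  # low rate may reflect under-recording
]

def incidence_per_1000(rows):
    cases = sum(r[1] for r in rows)
    person_years = sum(r[2] for r in rows)
    return 1000 * cases / person_years

all_rate = incidence_per_1000(practices)
good_rate = incidence_per_1000([r for r in practices if r[3]])
print(f"all practices: {all_rate:.1f} per 1,000 person-years")
print(f"good recording only: {good_rate:.1f} per 1,000 person-years")
```

In this toy data the estimate falls when the poorly recording practice is included, mirroring the direction of effect the study reports; the point is that the missingness is informative, not random.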

References

  1. Snow, C.P. (1959). The Two Cultures and the Scientific Revolution: The Rede Lecture. Cambridge University Press. pp. 58. 
  2. Scott, P.; Bentley, S.; Carpenter, I. et al. (2015). "Developing a conformance methodology for clinically-defined medical record headings: A preliminary report". European Journal for Biomedical Informatics 11 (2): en23–en30. https://www.ejbi.org/abstract/developing-a-conformance-methodology-for-clinicallydefinedrnmedical-record-headings-a-preliminary-report-3384.html. 
  3. Burnett, S.; Franklin, B.D.; Moorthy, K. et al. (2012). "How reliable are clinical systems in the UK NHS? A study of seven NHS organisations". BMJ Quality and Safety 21 (6): 466–72. doi:10.1136/bmjqs-2011-000442. PMC PMC3355340. PMID 22495099. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3355340. 
  4. Koppel, R. (2016). "The health information technology safety framework: Building great structures on vast voids". BMJ Quality and Safety 25 (4): 218–20. doi:10.1136/bmjqs-2015-004746. PMID 26584580. 
  5. "Global Digital Exemplars". NHS England. 2018. https://www.england.nhs.uk/digitaltechnology/connecteddigitalsystems/exemplars/. Retrieved 26 March 2018. 
  6. Mukherjee, M.; Wyatt, J.C.; Simpson, C.R.; Sheikh, A. (2016). "Usage of allergy codes in primary care electronic health records: A national evaluation in Scotland". Allergy 71 (11): 1594–1602. doi:10.1111/all.12928. PMID 27146325. 
  7. Jordan, K.; Porcheret, M.; Croft, P. (2004). "Quality of morbidity coding in general practice computerized medical records: A systematic review". Family Practice 21 (4): 396–412. doi:10.1093/fampra/cmh409. PMID 15249528. 
  8. Tate, A.R.; Dungey, S.; Glew, S. et al. (2017). "Quality of recording of diabetes in the UK: how does the GP's method of coding clinical data affect incidence estimates? Cross-sectional study using the CPRD database". BMJ Open 7 (1): e012905. doi:10.1136/bmjopen-2016-012905. PMC PMC5278252. PMID 28122831. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5278252. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. Grammar and punctuation were edited to American English, and in some cases additional context was added to the text when necessary. In some cases important information was missing from the references, and that information was added.