Public health informatics

Public health informatics has been defined as the systematic application of information and computer science and technology to public health practice, research, and learning. It is one of the sub-domains of biomedical informatics.

United States

In the United States, public health informatics is practiced by individuals in public health agencies at the federal and state levels and in the larger local health jurisdictions. Additionally, research and training in public health informatics takes place at a variety of academic institutions.

At the federal Centers for Disease Control and Prevention in Atlanta, Georgia, the Public Health Informatics and Technology Program Office (PHITPO) works to advance the state of information science and applies digital information technologies to aid in the detection and management of diseases and syndromes in individuals and populations. PHITPO has three sub-units: Informatics Practice, Policy and Coordination; Informatics Solutions and Operations; and Informatics Research and Development.

The bulk of the work of public health informatics in the United States, as with public health generally, takes place at the state and local level, in the state departments of health and the county or parish departments of health. At a state health department the activities may include: collection and storage of vital statistics (birth and death records); collection of reports of communicable disease cases from doctors, hospitals, and laboratories, used for infectious disease surveillance; display of infectious disease statistics and trends; collection of child immunization and lead screening information; daily collection and analysis of emergency room data to detect early evidence of biological threats; collection of hospital capacity information to allow for planning of responses in case of emergencies. Each of these activities presents its own information processing challenge.

Collection of public health data

Before the advent of the internet, public health data in the United States, like other healthcare and business data, were collected on paper forms and stored centrally at the relevant public health agency. If the data were to be computerized they required a distinct data entry process, were stored in the various file formats of the day and analyzed by mainframe computers using standard batch processing.

During the desktop computing era, the CDC provided DOS- and desktop-based systems such as TIMSS (tuberculosis) and STDMIS (sexually transmitted diseases), as well as Epi-Info for epidemiological investigations, among others.

Since the beginning of the World Wide Web, public health agencies with sufficient information technology resources have been transitioning to web-based collection of public health data and, more recently, to automated messaging of the same information. Between roughly 2000 and 2005 the Centers for Disease Control and Prevention, under its National Electronic Disease Surveillance System (NEDSS), built and provided free to states a comprehensive web- and message-based reporting system called the NEDSS Base System (NBS). Because funding is limited, and because isolated, fiefdom-style systems are considered unwise, only a few states and larger counties have built their own electronic disease surveillance systems, such as Pennsylvania's PA-NEDSS. These systems do not provide timely, full interstate notification services, which is said to result in higher disease rates than with the federal NEDSS product.

To promote interoperability, the CDC has encouraged the adoption in public health data exchange of several standard vocabularies and messaging formats from the health care world. The most prominent of these are: the Health Level 7 (HL7) standards for health care messaging; the LOINC system for encoding laboratory test and result information; and the Systematized Nomenclature of Medicine (SNOMED) vocabulary of health care concepts.
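To make the role of these standards concrete, the following is a minimal sketch of how a receiving system might pull a coded laboratory observation out of an HL7 version 2 style message, in which segments are separated by carriage returns, fields by "|", and components by "^". The sample message, the sender and receiver names, and the test and organism codes are all hypothetical placeholders rather than verified LOINC or SNOMED entries, and a production system would normally rely on a dedicated HL7 library instead of hand-written parsing.

```python
# Hypothetical laboratory result message in HL7 v2 pipe-delimited form.
# All identifiers and codes below are invented placeholders for illustration.
SAMPLE_MESSAGE = "\r".join([
    "MSH|^~\\&|LAB_SYS|EXAMPLE_LAB|ELR_RCV|EXAMPLE_DOH|20240101120000||ORU^R01|MSG0001|P|2.5.1",
    "PID|1||PATIENT-0001||DOE^JANE",
    "OBX|1|CWE|12345-6^Example culture test^LN||999999^Example organism^SCT||||||F",
])


def parse_segments(message):
    """Group the message's segments by their three-letter segment ID."""
    segments = {}
    for line in message.split("\r"):
        if not line:
            continue
        fields = line.split("|")
        segments.setdefault(fields[0], []).append(fields)
    return segments


def coded_element(field):
    """Split a coded field of the form code^text^coding-system."""
    parts = field.split("^")
    parts += [""] * (3 - len(parts))  # pad so short fields do not fail
    return {"code": parts[0], "text": parts[1], "system": parts[2]}


def extract_observations(message):
    """Return the test (OBX-3) and result (OBX-5) of every OBX segment."""
    observations = []
    for obx in parse_segments(message).get("OBX", []):
        observations.append({
            "test": coded_element(obx[3]),    # "LN" marks a LOINC-coded test
            "result": coded_element(obx[5]),  # "SCT" marks a SNOMED-coded value
        })
    return observations


if __name__ == "__main__":
    for observation in extract_observations(SAMPLE_MESSAGE):
        print(observation)
```

Because the test and the result each carry both a code and the name of the coding system, the receiver can interpret the message without knowing the sending laboratory's local terminology, which is the semantic side of interoperability.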

Since about 2005, the CDC has promoted the idea of the Public Health Information Network to facilitate the transmission of data from various partners in the health care industry and elsewhere (hospitals, clinical and environmental laboratories, doctors' practices, pharmacies) to local health agencies, then to state health agencies, and then to the CDC. At each stage the entity must be capable of receiving the data, storing it, aggregating it appropriately, and transmitting it to the next level. A typical example is infectious disease data: hospitals, labs, and doctors are legally required to report cases to local health agencies, local health agencies must report to their state public health department, and the states must report in aggregate form to the CDC. Among other uses, the CDC publishes the Morbidity and Mortality Weekly Report (MMWR) based on these data acquired systematically from across the United States.
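The receive, aggregate, and transmit steps at each level can be sketched as follows. This is an illustrative outline only: the record layout, the county names, and the function names are assumptions made for the example and do not describe any actual Public Health Information Network interface.

```python
from collections import Counter

# Hypothetical line-level case reports as received by local health agencies.
CASE_REPORTS = [
    {"county": "County A", "condition": "salmonellosis", "week": "2024-W01"},
    {"county": "County A", "condition": "salmonellosis", "week": "2024-W01"},
    {"county": "County B", "condition": "salmonellosis", "week": "2024-W01"},
    {"county": "County B", "condition": "pertussis", "week": "2024-W01"},
]


def aggregate_for_state(reports):
    """Local to state: count cases per (county, condition, week)."""
    return Counter((r["county"], r["condition"], r["week"]) for r in reports)


def aggregate_for_national(state_counts):
    """State to national: drop the county and sum per (condition, week)."""
    totals = Counter()
    for (_county, condition, week), count in state_counts.items():
        totals[(condition, week)] += count
    return totals


if __name__ == "__main__":
    state_counts = aggregate_for_state(CASE_REPORTS)
    for key, count in sorted(aggregate_for_national(state_counts).items()):
        print(key, count)
```

Each level passes on less detail than it received, so the aggregate that reaches the national level carries condition and week totals without any individual records.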

Major issues in the collection of public health data are: lack of awareness of the need to report data; lack of resources on the part of either the reporter or the collector; lack of interoperability of data interchange formats, which can be at the purely syntactic or at the semantic level; and variation in reporting requirements across the states, territories, and localities.

Storage of public health data

Storage of public health data raises the same data management issues found in other industries. As in those industries, the details of how these issues play out depend on the nature of the data being managed.

Due to the complexity and variability of public health data, like health care data generally, the issue of data modeling presents a particular challenge. While a generation ago flat data sets for statistical analysis were the norm, today's requirements of interoperability and integrated sets of data across the public health enterprise require more sophistication. The relational database is increasingly the norm in public health informatics. Designers and implementers of the many sets of data required for various public health purposes must find a workable balance between very complex and abstract data models such as HL7's Reference Information Model (RIM) or CDC's Public Health Logical Data Model, and simplistic, ad hoc models that untrained public health practitioners come up with and feel capable of working with.
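As an illustration of the middle ground between a flat data set and a highly abstract model, the sketch below defines a small relational schema using Python's built-in sqlite3 module. The table and column names are hypothetical and are not taken from the HL7 RIM or the CDC's Public Health Logical Data Model; the demonstration rows are invented.

```python
import sqlite3

# Hypothetical three-table schema: one patient may have many case reports,
# and one case report may have many laboratory results.
SCHEMA = """
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    birth_year INTEGER,
    county     TEXT NOT NULL
);
CREATE TABLE case_report (
    case_id     INTEGER PRIMARY KEY,
    patient_id  INTEGER NOT NULL REFERENCES patient(patient_id),
    condition   TEXT NOT NULL,
    report_date TEXT NOT NULL
);
CREATE TABLE lab_result (
    result_id   INTEGER PRIMARY KEY,
    case_id     INTEGER NOT NULL REFERENCES case_report(case_id),
    test_code   TEXT NOT NULL,
    result_text TEXT
);
"""


def build_demo_db():
    """Create an in-memory database populated with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO patient VALUES (?, ?, ?)",
        [(1, 1980, "County A"), (2, 1975, "County B")],
    )
    conn.executemany(
        "INSERT INTO case_report VALUES (?, ?, ?, ?)",
        [
            (10, 1, "salmonellosis", "2024-01-03"),
            (11, 1, "pertussis", "2024-02-10"),
            (12, 2, "salmonellosis", "2024-01-05"),
        ],
    )
    conn.executemany(
        "INSERT INTO lab_result VALUES (?, ?, ?, ?)",
        [(100, 10, "TEST-1", "positive"), (101, 12, "TEST-1", "positive")],
    )
    return conn


def cases_per_county(conn):
    """Join case reports to patients and count cases in each county."""
    return conn.execute(
        """
        SELECT p.county, COUNT(*)
        FROM case_report AS c
        JOIN patient AS p ON p.patient_id = c.patient_id
        GROUP BY p.county
        ORDER BY p.county
        """
    ).fetchall()


if __name__ == "__main__":
    print(cases_per_county(build_demo_db()))
```

Storing the patient once and linking reports and results to it avoids the repeated demographic columns of a flat file, while remaining simple enough for practitioners to query directly.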

Due to the variability of the incoming data to public health jurisdictions, data quality assurance is also a major issue.
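A simple form of quality assurance is to check each incoming record against a few rules before it is stored. The sketch below is one possible approach under assumed conventions: the field names, the required-field list, and the ISO date requirement are hypothetical choices for the example, not rules from any particular jurisdiction.

```python
from datetime import date

# Hypothetical minimum set of fields every case report must carry.
REQUIRED_FIELDS = ("condition", "county", "report_date")


def validate_report(report, today=None):
    """Return a list of problems found in one report (empty if none)."""
    today = today or date.today()
    problems = []

    # Rule 1: required fields must be present and non-blank.
    for field in REQUIRED_FIELDS:
        if not str(report.get(field) or "").strip():
            problems.append(f"missing {field}")

    # Rule 2: the report date must parse as an ISO date and not be in the future.
    raw_date = report.get("report_date")
    if raw_date:
        try:
            parsed = date.fromisoformat(raw_date)
        except (TypeError, ValueError):
            problems.append("report_date is not a valid ISO date")
        else:
            if parsed > today:
                problems.append("report_date is in the future")

    return problems


if __name__ == "__main__":
    sample = {"condition": "pertussis", "county": "", "report_date": "01/05/2024"}
    print(validate_report(sample))
```

Returning a list of problems, rather than rejecting a record at the first failure, lets the receiving agency send the reporter a complete correction request in one step.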

Analysis of public health data

Extracting usable public health information from the mass of data available requires the public health informaticist to become familiar with a range of analysis tools: business intelligence tools that produce routine or ad hoc reports, sophisticated statistical analysis tools such as SAS, SPSS, and PSPP, and Geographical Information Systems (GIS) that expose the geographical dimension of public health trends.
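As a small example of the kind of routine analysis involved, the sketch below flags days on which a daily count, such as emergency room visits for a given syndrome, rises well above its recent baseline. The rule used here (more than two standard deviations above the mean of the preceding seven days) is a deliberately simplified assumption for illustration and is not the algorithm of any specific surveillance system; the counts are invented.

```python
from statistics import mean, stdev


def flag_aberrations(daily_counts, window=7, threshold_sd=2.0):
    """Return indexes of days whose count exceeds the baseline threshold.

    The baseline for each day is the `window` days immediately before it,
    so the first `window` days are never flagged.
    """
    flagged = []
    for day in range(window, len(daily_counts)):
        baseline = daily_counts[day - window:day]
        limit = mean(baseline) + threshold_sd * stdev(baseline)
        if daily_counts[day] > limit:
            flagged.append(day)
    return flagged


if __name__ == "__main__":
    # Invented daily visit counts; the last day shows a sharp rise.
    counts = [10, 12, 11, 9, 10, 11, 12, 30]
    print(flag_aberrations(counts))
```

A flagged day is only a prompt for human review, since a spike may reflect a reporting artifact rather than a real increase in illness.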

Applications in health surveillance and epidemiology

SAPPHIRE (Situational Awareness and Preparedness for Public Health Incidences and Reasoning Engines) is a semantics-based health information system capable of tracking and evaluating situations and occurrences that may affect public health.
