Journal:Health care and cybersecurity: Bibliometric analysis of the literature
|Full article title||Health care and cybersecurity: Bibliometric analysis of the literature|
|Journal||Journal of Medical Internet Research|
|Author(s)||Jalali, Mohammad S.; Razak, Sabina; Gordon, William; Perakslis, Eric; Madnick, Stuart|
|Author affiliation(s)||Harvard Medical School, Massachusetts Institute of Technology, Brigham & Women’s Hospital, Partners Healthcare|
|Primary contact||Email: msjalali at mgh dot harvard dot edu|
|Volume and issue||21(2)|
|Distribution license||Creative Commons Attribution 4.0 International|
Background: Over the past decade, clinical care has become globally dependent on information technology. The cybersecurity of health care information systems is now an essential component of safe, reliable, and effective health care delivery.
Objective: The objective of this study was to provide an overview of the literature at the intersection of cybersecurity and health care delivery.
Methods: A comprehensive search was conducted using PubMed and Web of Science for English-language peer-reviewed articles. We carried out chronological analysis, domain clustering analysis, and text analysis of the included articles to generate a high-level concept map composed of specific words and the connections between them.
Results: Our final sample included 472 English-language journal articles. Our review results revealed that a majority of the articles were focused on technology. Technology–focused articles made up more than half of all the clusters, whereas managerial articles accounted for only 32 percent of all clusters. This finding suggests that nontechnological variables (human–based and organizational aspects, strategy, and management) may be understudied. In addition, software development security, business continuity, and disaster recovery planning each accounted for three percent of the studied articles. Our results also showed that publications on physical security account for only one percent of the literature, and research in this area is lacking. Cyber vulnerabilities are not all digital; many physical threats contribute to breaches and potentially affect the physical safety of patients.
Conclusions: Our results revealed an overall increase in research on cybersecurity and identified major gaps and opportunities for future work.
Keywords: bibliometric review, cybersecurity, health care, literature analysis, text mining
Cybersecurity is an increasingly critical aspect of health care information technology infrastructure. The rapid digitization of health care delivery, from electronic health records (EHR) and telehealth to mobile health (mHealth) and network-enabled medical devices, introduces risks related to cybersecurity vulnerabilities. These vulnerabilities are particularly worrisome because cyberattacks in a health care setting can result in the exposure of highly sensitive personal information or cause disruptions in clinical care. Cyberattacks may also affect the safety of patients, for example, by compromising the integrity of data or impairing medical device functionality. The WannaCry and NotPetya ransomware attacks and vulnerabilities in Medtronic Implantable Cardiac Device Programmers are recent examples that have resulted in impaired health care delivery capabilities.
Health care organizations are particularly vulnerable to cyber threats. Verizon’s 2018 Data Breach Investigations Report found that health care was the industry most affected by data breaches, accounting for 24 percent of all investigated breaches across all industries. Additionally, a report by the Ponemon Institute found that almost 90 percent of respondents (involved in health plans and health care clearinghouses, as well as health care providers with EHRs) had experienced a data breach in the past two years. Another survey of health care information security professionals revealed that over 75 percent of health care organizations had experienced a recent security incident. The causes are multifactorial, involving both technology and people, and human error and cultural factors play increasingly critical roles. Despite efforts to teach best-practice security behavior through training programs, recent surveys have revealed that one in five health care employees still writes down usernames and passwords on paper.
Given the increasing importance of cybersecurity for safe, effective, and reliable health care delivery, there is a need to provide an overview of the literature at the intersection of cybersecurity and health care. Recent systematic reviews synthesized insights from 31 articles on cyber threats in health care and aggregated strategies from 13 articles about responding to cyber incidents in health care organizations. In this study, we conduct a large bibliometric review of the literature and describe the current state of research on various aspects of cybersecurity in health care in order to not only understand current trends but also identify gaps and guide future research efforts toward improving the security of our health care systems.
Study eligibility criteria
A comprehensive search was conducted using PubMed and Web of Science (WoS) for English-language peer-reviewed articles. We identified search keywords by adopting terminologies in The National Initiative for Cybersecurity Careers and Studies and The British Standards Institution glossaries. The list of keywords used follows.
WoS (journal articles, all years):
“Health*” AND “Cybersecurity” OR “Cyber Security” OR “Cyber Attack*” OR “Cyber Crisis*” OR “Cyber Incident*” OR “Cyber Infrastructure*” OR “Cyber Operation*” OR “Cyber Risk*” OR “Cyber Threat*” OR “Cyberspace*” OR “Data Breach*” OR “Data Security*” OR “Firewall*” OR “Information Security*” OR “Information Systems Security*” OR “Information Technology Security*” OR “IT Security*” OR “Malware*” OR “Phishing*” OR “Ransomware*” OR “Security Incident*” OR “Information Assurance*”
PubMed (journal articles, all years, abstract availability):
“Cybersecurity” OR “Cyber Security” OR “Cyber Attack” OR “Cyber Crisis” OR “Cyber Incident” OR “Cyber Infrastructure” OR “Cyber Operation” OR “Cyber Risk” OR “Cyber Threat” OR “Cyberspace” OR “Data Breach” OR “Data Security” OR “Firewall” OR “Information Security” OR “Information Systems Security” OR “Information Technology Security” OR “IT Security” OR “Malware” OR “Phishing” OR “Ransomware” OR “Security Incident” OR “Information Assurance”
Keywords that widened the search results far beyond the scope were rejected. For example, “exploit” and “malicious” can be used in a cyber context, but they are more commonly used in unrelated contexts that add noise to the search. Such terms were not included because of their contribution to an overwhelming amount of irrelevant results.
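As an illustration only, the two keyword lists above can be assembled programmatically. The sketch below is not the authors' tooling: the function name `build_query` is hypothetical, the wildcard suffixes used in the WoS list are omitted for brevity, and the parentheses around the OR group are an assumption about the intended precedence rather than something the source states.

```python
# Keyword stems copied from the search strategy listed above.
TERMS = [
    "Cybersecurity", "Cyber Security", "Cyber Attack", "Cyber Crisis",
    "Cyber Incident", "Cyber Infrastructure", "Cyber Operation", "Cyber Risk",
    "Cyber Threat", "Cyberspace", "Data Breach", "Data Security", "Firewall",
    "Information Security", "Information Systems Security",
    "Information Technology Security", "IT Security", "Malware", "Phishing",
    "Ransomware", "Security Incident", "Information Assurance",
]


def build_query(terms, required=None):
    """Join quoted terms with OR; optionally AND the group with one required term."""
    group = " OR ".join(f'"{term}"' for term in terms)
    if required is None:
        return group
    # Assumption: the required term is meant to apply to the whole OR group.
    return f'"{required}" AND ({group})'


pubmed_query = build_query(TERMS)
wos_query = build_query(TERMS, required="Health*")
print(wos_query)
```

Keeping the terms in one list makes it easy to reuse the same vocabulary across databases and to record which candidate terms were rejected.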
We included articles published from the inception of PubMed in 1966 and WoS in 1900, all the way to September 2017. Articles were excluded if they did not clearly focus on cybersecurity or health care or if they were reviews or meta-analyses. Inclusion and exclusion criteria were formulated prior to the preliminary title and abstract screening. The eligibility criteria were intentionally nonspecific to obtain a complete picture of the existing relevant research. To increase our confidence in the inclusion criteria, we conducted an initial pilot screening of 100 articles.
Screening of titles and abstracts was conducted using the software package abstrackr. Full texts of the “maybe” articles were independently reviewed by two trained individuals to assess study eligibility. Disagreements about study inclusion were discussed until a consensus was reached. More details about our methodology are available in Multimedia Appendix 1.
Chronological clustering and trend analysis
We performed chronological analysis of the number of articles published per year and the number of authors per article. We topically clustered articles using 10 security domains created by the International Information Systems Security Certification Consortium to categorize each article (Multimedia Appendix 1). Each clustered article was further categorized as technological, managerial, legal, or interdisciplinary (if it fell into more than one category). Features of the included articles, such as the publishing journal and number of citations, were recorded.
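The categorization step can be sketched as a simple lookup. This is a minimal illustration, not the authors' procedure: the assignment of each domain to a high-level category is an assumption inferred from the cluster counts reported in the Results (seven technological, two managerial, one legal), and the names `DOMAIN_CATEGORY` and `categorize` are hypothetical.

```python
# Assumed mapping of the 10 security domains to high-level categories.
DOMAIN_CATEGORY = {
    "access control": "technological",
    "telecommunications and network security": "technological",
    "software development security": "technological",
    "cryptography": "technological",
    "security architecture and design": "technological",
    "operations security": "technological",
    "physical security": "technological",
    "information security governance and risk management": "managerial",
    "business continuity and disaster recovery planning": "managerial",
    "legal, regulations, investigations, and compliance": "legal",
}


def categorize(domains):
    """Return the high-level category for an article's assigned domains."""
    if not domains:
        raise ValueError("an article needs at least one domain")
    categories = {DOMAIN_CATEGORY[d] for d in domains}
    # Articles spanning several high-level categories count as interdisciplinary.
    if len(categories) > 1:
        return "interdisciplinary"
    return categories.pop()


print(categorize(["cryptography", "access control"]))
print(categorize(["physical security",
                  "legal, regulations, investigations, and compliance"]))
```

Note that an article assigned to two domains of the same category (for example, two technological domains) is not treated as interdisciplinary in this sketch.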
After analyzing all the titles and abstracts, we removed words with high frequencies that were common in research articles but were not specific to our subject (e.g., “paper,” “using,” and “results”). In addition, we merged the plural forms with singular forms of the same word and merged “healthcare” and “health care” into “healthcare.” Subsequently, we created word clouds to visualize the word frequencies in titles and abstracts over time. Word frequency is represented by color and size, with darker, larger words representing higher occurrence.
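The preprocessing described above can be approximated in a few lines. The sketch below is an assumption-laden illustration rather than the pipeline used in the study: the generic-word list contains only the three examples named in the text, the plural merge is a naive rule that folds a word ending in "s" or "es" into its stem only when that stem also occurs in the corpus, and the sample titles are invented.

```python
import re
from collections import Counter

# Only the three examples given in the text; the real list was longer.
GENERIC_WORDS = {"paper", "using", "results"}


def tokenize(text):
    """Lowercase, merge 'health care' into 'healthcare', and split into words."""
    text = text.lower().replace("health care", "healthcare")
    return re.findall(r"[a-z]+", text)


def word_frequencies(documents):
    """Count words after dropping generic terms and merging plural forms."""
    tokens = [t for doc in documents for t in tokenize(doc)]
    vocab = set(tokens)

    def singular(word):
        # Naive plural merge: use the stem only if it appears in the corpus.
        if len(word) > 3 and word.endswith("s"):
            for stem in (word[:-1], word[:-2]):
                if stem in vocab:
                    return stem
        return word

    return Counter(singular(t) for t in tokens if t not in GENERIC_WORDS)


# Invented sample titles, for illustration only.
docs = [
    "Data breaches in health care: results of a survey",
    "A healthcare data breach and network security",
    "Using networks securely in health care",
]
freq = word_frequencies(docs)
print(freq.most_common(5))
```

The resulting counts are the kind of input a word cloud needs; a production pipeline would more likely use a proper lemmatizer and a fuller stop-word list.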
We then assessed text titles and abstracts to generate a high-level concept map composed of specific words and the connections between them by using the software package Leximancer text analytics (version 4.5; Leximancer Pty Ltd, Brisbane, Australia). The software started with an unsupervised machine learning approach to extract a network of meaning from the data and developed a heat map that visually illustrated the end results. The method, underpinned by a naive Bayesian co-occurrence metric, considers how often two words co-occur as well as how often they occur apart. Heat maps consist of “themes” represented by bubbles and “concepts” represented by grey dots. Concepts can be equated to a list of similar terms coalescing into a monothematic idea, and themes are clusters of these concepts. The lines between dots suggest a strong connection between two concepts.
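To make the idea of weighing co-occurrence against separate occurrence concrete, the sketch below computes a Jaccard-style association score over text blocks. This is explicitly not Leximancer's algorithm, whose details are not given here; it is a simpler stand-in that shares the same intuition, and the function names and sample blocks are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(blocks):
    """Count, per word and per word pair, the number of blocks they appear in."""
    word_count = Counter()
    pair_count = Counter()
    for block in blocks:
        words = set(block)
        word_count.update(words)
        pair_count.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_count, pair_count


def association(a, b, word_count, pair_count):
    """Blocks containing both words divided by blocks containing either word."""
    together = pair_count[frozenset((a, b))]
    apart = word_count[a] + word_count[b] - 2 * together
    total = together + apart
    return together / total if total else 0.0


# Invented text blocks, each already reduced to its concept words.
blocks = [
    ["security", "encryption", "key"],
    ["security", "users"],
    ["security", "encryption"],
    ["privacy", "users"],
]
word_count, pair_count = cooccurrence_stats(blocks)
print(association("security", "encryption", word_count, pair_count))
print(association("security", "users", word_count, pair_count))
```

A pair that often appears together and rarely apart scores close to 1, which corresponds loosely to the strongly connected concepts drawn with lines on the concept map.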
The primary search on PubMed for papers containing terms pertaining to “cyber” yielded 1,480 articles, and the search on WoS yielded 810 articles. After removing 310 duplicates, the titles and abstracts of 1,980 articles were screened using the abstrackr software. Based on the inclusion criteria, 1,262 articles were excluded in the first screening, leaving 718 articles for full-text review. Full-text review removed further articles, yielding a final selection of 472 articles. Figure 1 presents the search method and results.
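The screening counts can be checked with simple arithmetic. In the sketch below, the input numbers come from the paragraph above; the number of articles removed at full-text review is not stated in the text and is derived here by subtraction.

```python
# Counts reported in the text.
pubmed_hits = 1480
wos_hits = 810
duplicates = 310
excluded_title_abstract = 1262
final_included = 472

# Derived quantities.
screened = pubmed_hits + wos_hits - duplicates
full_text_reviewed = screened - excluded_title_abstract
excluded_full_text = full_text_reviewed - final_included  # not stated in the text

print(screened, full_text_reviewed, excluded_full_text)
```

The derived values for screened and full-text-reviewed articles match the 1,980 and 718 reported above.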
Chronological clustering and trend analysis
Figure 2 presents the overall trend of all publications over time, from 1985 to September 2017; the first included article was published in 1979 but was excluded from the figure for better visualization. Figure 2 shows a steady increase in the number of articles published on cybersecurity in health care (Multimedia Appendix 1).
Figure 3 shows the distribution among the three high-level categories: technological, managerial, and legal (Multimedia Appendix 1). The seven technological clusters made up more than half of all clusters, the two managerial clusters represented 32 percent, and the legal cluster represented 18 percent of all clusters.
The orange-shaded portion within each cluster in Figure 3 represents interdisciplinary articles (spanning multiple high-level categories). Although the topic of physical security had the lowest number of publications (Figure 3), it was the most interdisciplinary cluster (six out of the seven articles [85.7 percent] identified as interdisciplinary). The topic of legal, regulations, investigations, and compliance was the second most interdisciplinary cluster (59.8 percent of the articles in this category were interdisciplinary), followed by operations security (52.9 percent), business continuity and disaster recovery planning (50 percent), information security governance and risk management (43.9 percent), and access control (30.6 percent). Although the topic of security architecture and design was the second most frequent cluster overall, only 22.2 percent of the articles were found to be interdisciplinary. The least interdisciplinary clusters were telecommunications and network security (18.9 percent), software development security (17.6 percent), and cryptography (four percent) (Multimedia Appendix 1).
We analyzed the publication trends over time in the 10 clusters (Figure 4). All clusters showed increased frequency, and some clusters such as security architecture and design, information security governance and risk management, and cryptography demonstrated particularly steep increases in frequency.
Overall, the 472 articles included were published in 239 unique journals. We ranked the journals according to the number of published articles and selected the journals with more than three articles, which resulted in a list of 15 journals (Table 1). According to the corresponding InCites Journal Citation Reports (JCR) categories, the top journals tended to focus on computer science, information systems, and medical informatics. The most popular JCR category, accounting for seven out of the 10 journals listed in JCR, was medical informatics. Six journals had a computer science category, specifically within information systems, interdisciplinary applications, or theory and methods. Five journals were from the health care sciences and services category. Only one of the top 15 journals was categorized as a biomedical engineering journal; one, as a math and computational biology journal; and one, as a radiology, nuclear medicine, and medical imaging journal.
Approximately 73 percent of the 239 journals had only published one article at the intersection of cybersecurity and health care. The high number and diversity of the journals included along with the low publication rate suggest that there is currently no major niche for medical practice readership at the intersection of cybersecurity and health care due to the cross-disciplinary nature of the field.
Characteristics of the most cited articles
Table 2 shows the most influential publications in the field of cybersecurity in health care, ranked by the number of citations as of September 2017. Six of the top 15 cited articles were published in five journals of the Institute of Electrical and Electronics Engineers. The clusters show a mix of article domains across the legal, managerial, and technological domains. The author-denoted keywords support this finding.
Of the total clusters of the top 15 articles, 38 percent belonged to the security architecture and design category. Cryptography was the next most popular cluster (17 percent), followed by legal, regulations, investigations, and compliance (13 percent) and access control (13 percent). Overall, 79 percent of the clusters were technological, 13 percent were legal, and eight percent were managerial. Additionally, 20 percent of the papers were interdisciplinary, with multiple clusters of distinct high-level categories. Notably, the list of most cited articles does not reflect the most recent articles, as citation of these articles is often significantly delayed.
The text-mining analysis identified specific trends in the article texts. The map produced from all titles and abstracts is shown in Figure 5. The thematic bubbles are ranked by relevance based on a heat-map color scheme: hot colors indicate more important themes, and cool colors indicate less important themes. The relative positions of the bubbles indicate the relationship between aggregated ideas, reflecting how closely they are related to each other. The sizes of the bubbles are only set to include their grey dots, and the size of each grey dot (a common word within the theme) indicates its relative frequency. The lines between these dots signify connectivity and association of concepts.
The overlay of grey-dot concepts onto thematic bubbles allows for more specific analysis of terms. Technological terms emerge as the main theme in Figure 5, including words like “encryption” and “software.” Concept words within these themes highlighted the following common elements of an organization’s information technology structure related to cybersecurity: “internet,” “network,” “applications,” “records,” “breaches,” “key,” and “electronic.” Managerial and legal terms were also identified as concepts (Figure 5). “Management” was a concept within the “information” theme. “Policies” and “process” were concepts in the “risk” theme and indicated the influence of risk analysis on the cybersecurity policies and procedures of organizations. “HIPAA” was a concept that stemmed from the “information” concept in the “important” theme.
The two central themes “security” and “information” included multiple, large grey-dot concepts that branched out into other thematic areas. There was an overlap between “security” and “encryption,” suggesting that encoding material is fundamental to security. An overlap between “security” and “users” could imply that user control is imperative to security.
For further analysis of word frequencies, the articles from 1985 to 2017 were split into four time periods: 1985-1993, 1994-2001, 2002-2009, and 2010-2017 (September). Multimedia Appendix 1 presents the word clouds within the four time periods. The size of the word represents the frequency of its occurrence. The term “privacy” increased in size in the last three time periods. “Internet” appeared in 1994-2001, around the time of the dot-com bubble. “Legal” was mentioned in 1985-1993, and “legislation” was found in 1994-2001. “HIPAA” appeared in 2002-2009 and again, although to a smaller extent, in 2010-2017.
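Splitting articles into the four periods amounts to binning by publication year. The sketch below uses the period boundaries given above; the function names and the sample records are hypothetical, and years outside 1985 to 2017 are simply left out, as with the 1979 article excluded from the trend figure.

```python
# Period boundaries as stated in the text (inclusive).
PERIODS = [(1985, 1993), (1994, 2001), (2002, 2009), (2010, 2017)]


def period_label(year):
    """Return the label of the period containing the year, or None if outside."""
    for start, end in PERIODS:
        if start <= year <= end:
            return f"{start}-{end}"
    return None


def group_by_period(articles):
    """Group (year, title) records by period, skipping out-of-range years."""
    groups = {f"{start}-{end}": [] for start, end in PERIODS}
    for year, title in articles:
        label = period_label(year)
        if label is not None:
            groups[label].append(title)
    return groups


# Invented records, for illustration only.
articles = [(1979, "a"), (1990, "b"), (2001, "c"), (2002, "d"), (2017, "e")]
groups = group_by_period(articles)
print({label: len(titles) for label, titles in groups.items()})
```

Word frequencies can then be computed separately for the titles and abstracts in each group to produce one word cloud per period.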
Maps of the four time periods were also created to identify trends over time (Figure 6). “Security” remained the most popular concept from 1985 to 2009, but was overtaken by “health care” from 2010 to 2017 (the most popular concept is indicated by the red bubble). The time period maps in Multimedia Appendix 1 provide further details.
This article provides an analysis of the literature at the intersection of cybersecurity and health care. In general, research in this area has been increasing over the past 20 years and is continually represented in a wide, distributed array of academic journals, reflecting the importance of cybersecurity. With the increase in cybersecurity attacks against hospitals and dependency of health care delivery on technology, we expect cybersecurity to continue to play a central role in health care delivery.
Despite the increase in research and attention to cybersecurity, there are persistent shortcomings in the research on cybersecurity. For example, our research suggests that the majority of the articles on cybersecurity focus on technology. In our domain-clustering analysis, technology–focused articles accounted for more than half of all the clusters, whereas managerial articles accounted for only 32 percent. Similarly, in our journal analysis, 58 articles included in the 15 most published journals were from computer science journals and 12 articles were from health-focused journals. Notably, 79 percent of the top 15 most cited paper clusters were technological. This focus on the technological aspects of cybersecurity suggests that nontechnological variables (human–based and organizational aspects, strategy, and management) may be understudied. Investment in technological tools should be the output of a robust cybersecurity strategy rather than the foundation. An overwhelming majority of cybersecurity incidents are caused or propagated by people, and technological solutions can mitigate this risk only to a limited extent.
We found discordance between the topics of the highly cited articles and the topical breakdown of our cluster analysis (these articles were published more than five years ago, implying that emergent threats are poorly captured). This finding suggests that articles on topics such as cryptography have significant traction, even though they are not widely present in the literature. On the other hand, only a few information security governance and compliance articles were frequently cited, despite accounting for a large portion of the literature.
Cybersecurity is most often examined with respect to privacy and compliance. Our results show that research on physical security is lacking, with only one percent of the literature categorized under physical security. Not all cyber vulnerabilities are digital. Many physical threats contribute to breaches, and these threats potentially affect the physical safety of patients. Software development security, business continuity, and disaster recovery planning each accounted for three percent of the studied articles. Further examination is needed on these topics, and our study suggests that incident recovery (critical to the success of recovery from incidents) is not a significant focus in the research community. Articles focusing on legality were the least represented. Moreover, federal cybersecurity guidance such as the publications of the National Institute of Standards and Technology was seldom observed in our text analysis. In addition, massive increases in cybersecurity spending did not drive proportional growth in the literature.
Our lexical analysis highlighted a separation of security processes and software terminology, with longer word distances between these themes. Additionally, the time period maps for 2002-2009 and 2010-2017 showed no overlap between the management and technological themes. More interdisciplinary research is needed to avoid gaps that arise from only analyzing managerial and technological security issues.
Unlike medical research, which is set up to openly benefit human lives, cybersecurity is based on the premise of an active adversary. The presence of this adversary may, unfortunately, drive a school of thought that knowledge, especially specific strategies and tactics, should not be shared openly, which impedes the growth and utility of research in this field.
Limitations and suggestions for future research
Our review was limited to journal articles indexed in PubMed and WoS. Information retrieval was limited to articles that included the terms of the search strategy in their titles or abstracts; articles that used different terminology were not retrieved. Additionally, we only included articles with cybersecurity at the core of the study.
Our review did not assess non-English language articles or documents other than journal articles (e.g., conference articles, white papers, or reports by governments or other organizations). A more comprehensive search could include these sources. Importantly, much of the work on cybersecurity and health care is operational and administrative, not academic. Information security professionals may not rely on academic literature as extensively as clinicians do when considering new diagnostics or therapeutics and may instead favor “on the job” experience and industry best practices. Additionally, information security research performed within the health care ecosystem may not be publishable due to security-related concerns such as exposing an internal vulnerability. Understanding the published literature in this space is an important starting point, and hospitals and patients will benefit from transparency in research, wherever possible.
Future reviews can focus on individual clusters that were reviewed in our study to provide a more in-depth analysis of the cluster. For instance, they could look specifically at business continuity and disaster recovery planning or software development security. Such a detailed focus can help synthesize research findings and provide best practices. Studies may also analyze the gap in managerial research and the implications of a narrow technological focus. Furthermore, such studies can focus on different settings in health care, such as inpatient and outpatient care, translational research, health and wellness environments, and integration of mobile devices and networked systems.
HIPAA: Health Insurance Portability and Accountability Act
IEEE: Institute of Electrical and Electronics Engineers
mHealth: mobile health
NIST: National Institute of Standards and Technology
PACS: picture archiving and communication system
WoS: Web of Science
Multimedia Appendix 1
Details of the methodology: PDF
Financial support for this study was provided by Cybersecurity at MIT Sloan (CAMS), also known as the Interdisciplinary Consortium for Improving Critical Infrastructure Cybersecurity.
Conflicts of interest
- Jalali, M.S.; Kaiser, J.P. (2018). "Cybersecurity in Hospitals: A Systematic, Organizational Perspective". Journal of Medical Internet Research 20 (5): e10059. doi:10.2196/10059. PMC PMC5996174. PMID 29807882.
- Gordon, W.J.; Fairhall, A.; Landman, A. (2017). "Threats to Information Security - Public Health Implications". New England Journal of Medicine 377 (8): 707–9. doi:10.1056/NEJMp1707212. PMID 28700269.
- Perakslis, E.D. (2014). "Cybersecurity in health care". New England Journal of Medicine 371 (5): 395–7. doi:10.1056/NEJMp1404358. PMID 25075831.
- Jarrett, M.P. (2017). "Cybersecurity-A Serious Patient Care Concern". JAMA 318 (14): 1319–20. doi:10.1001/jama.2017.11986. PMID 28973258.
- Kramer, D.B.; Fu, K. (2017). "Cybersecurity Concerns and Medical Devices: Lessons From a Pacemaker Advisory". JAMA 318 (21): 2077–78. doi:10.1001/jama.2017.15692. PMID 29049709.
- Furnell, S.; Emm, D. (2017). "The ABC of ransomware protection". Computer Fraud & Security 2017 (10): 5–11. doi:10.1016/S1361-3723(17)30089-1.
- Verizon (2018). "2018 Data Breach Investigations Report" (PDF). Verizon. https://enterprise.verizon.com/resources/reports/DBIR_2018_Report.pdf. Retrieved 01 September 2018.
- Ponemon Institute, LLC (May 2016). "Sixth Annual Benchmark Study on Privacy & Security of Healthcare Data" (PDF). Ponemon Institute, LLC. https://www.ponemon.org/local/upload/file/Sixth%20Annual%20Patient%20Privacy%20%26%20Data%20Security%20Report%20FINAL%206.pdf. Retrieved 09 April 2018.
- Healthcare Information and Management Systems Society (2018). "2018 HIMSS Cybersecurity Survey" (PDF). Healthcare Information and Management Systems Society. https://www.himss.org/sites/hde/files/d7/u132196/2018_HIMSS_Cybersecurity_Survey_Final_Report.pdf. Retrieved 30 July 2020.
- Madnick, S.; Jalali, M.S.; Siegel, M. et al. (2017). "Measuring Stakeholders’ Perceptions of Cybersecurity for Renewable Energy Systems". Proceedings from DARE 2016: Data Analytics for Renewable Energy Integration. Lecture Notes in Computer Science 10097: 67–77. doi:10.1007/978-3-319-50947-1_7.
- Jalali, M.S.; Siegel, M.; Madnick, S. (2019). "Decision-making and biases in cybersecurity capability development: Evidence from a simulation game experiment". The Journal of Strategic Information Systems 28 (1): 66–82. doi:10.1016/j.jsis.2018.09.003.
- "One in Five Health Employees Willing to Sell Confidential Data to Unauthorized Parties, Accenture Survey Finds". Accenture. 1 March 2018. https://newsroom.accenture.com/news/one-in-five-health-employees-willing-to-sell-confidential-data-to-unauthorized-parties-accenture-survey-finds.htm.
- Kruse, C.S.; Frederick, B.; Jacobson, T. et al. (2017). "Cybersecurity in healthcare: A systematic review of modern threats and trends". Technology and Health Care 25 (1): 1–10. doi:10.3233/THC-161263. PMID 27689562.
- Jalali, M.S.; Russell, B.; Razak, S. et al. (2019). "EARS to cyber incidents in health care". JAMIA 26 (1): 81–90. doi:10.1093/jamia/ocy148. PMID 30517701.
- National Initiative for Cybersecurity Careers and Studies. "Glossary". National Initiative for Cybersecurity Careers and Studies. https://niccs.us-cert.gov/about-niccs/glossary. Retrieved 31 December 2018.
- British Standards Institution. "Glossary of cyber security terms". British Standards Institution. https://www.bsigroup.com/en-GB/Cyber-Security/Glossary-of-cyber-security-terms/. Retrieved 31 July 2018.
- Wallace, B.C.; Small, K.; Brodley, C.E. (2012). "Deploying an interactive machine learning system in an evidence-based practice center: abstrackr". Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium: 819–24. doi:10.1145/2110363.2110464.
- Smith, A.E.; Humphreys, M.S. (2006). "Evaluation of unsupervised semantic mapping of natural language with Leximancer concept mapping". Behavior Research Methods 38: 262–79. doi:10.3758/BF03192778.
- Cheng, M. (2019). "A comparative automated content analysis approach on the review of the sharing economy discourse in tourism and hospitality". Current Issues in Tourism 22 (1): 35–49. doi:10.1080/13683500.2017.1361908.
- Clarivate Analytics. "Journal Citation Reports". Clarivate Analytics. https://clarivate.com/webofsciencegroup/solutions/journal-citation-reports/. Retrieved 29 October 2018.
- Health Care Industry Cybersecurity Task Force (June 2017). "Report on Improving Cybersecurity in the Health Care Industry" (PDF). Assistant Secretary for Preparedness and Response. https://www.phe.gov/Preparedness/planning/CyberTF/Documents/report2017.pdf.
- van Zadelhoff, M. (19 September 2016). "The biggest cybersecurity threats are inside your company". Harvard Business Review. https://hbr.org/2016/09/the-biggest-cybersecurity-threats-are-inside-your-company. Retrieved 04 February 2019.
- Gartner (7 December 2017). "Gartner Forecasts Worldwide Security Spending Will Reach $96 Billion in 2018, Up 8 Percent from 2017". Gartner. https://www.gartner.com/en/newsroom/press-releases/2017-12-07-gartner-forecasts-worldwide-security-spending-will-reach-96-billion-in-2018. Retrieved 29 October 2018.
- Ioannidis, J.P.; Greenland, S.; Hlatky, M.A. et al. (2014). "Increasing value and reducing waste in research design, conduct, and analysis". Lancet 383 (9912): 166–75. doi:10.1016/S0140-6736(13)62227-8. PMC PMC4697939. PMID 24411645.
This presentation is faithful to the original, with only a few minor changes to presentation, grammar, and punctuation. In some cases important information was missing from the references, and that information was added. The original cited an Accenture YouTube video for the claim regarding users writing their credentials down; for this version a more informative press release, which links to the video, was used.