Journal:Towards a risk catalog for data management plans

From LIMSWiki
Full article title: Towards a risk catalog for data management plans
Journal: International Journal of Digital Curation
Author(s): Weng, Franziska; Thoben, Stella
Author affiliation(s): Kiel University
Primary contact email: franziskaweng at web dot de
Year published: 2020
Volume and issue: 15(1)
Page(s): 18
DOI: 10.2218/ijdc.v15i1.697
ISSN: 1746-8256
Distribution license: Creative Commons Attribution 4.0 International
Website: http://www.ijdc.net/article/view/697
Download: http://www.ijdc.net/article/view/697/614 (PDF)

Abstract

Although data management and its careful planning are not new topics, there is little published research on risk mitigation in data management plans (DMPs). We consider it a problem that DMPs do not include a structured approach for the identification or mitigation of risks, because such an approach would instill confidence and trust in the data and its stewards, and foster the successful conduct of data-generating projects, which often are funded research projects. In this paper, we present a lightweight approach for identifying general risks in DMPs. We introduce an initial version of a generic risk catalog for funded research and similar projects. By analyzing a selection of 13 DMPs for projects from multiple disciplines published in the Research Ideas and Outcomes (RIO) journal, we demonstrate that our approach is applicable to DMPs and transferable to multiple institutional constellations. As a result, the effort for integrating risk management in data management planning can be reduced.

Keywords: data management plan, data management, risk management, risk assessment, information security

Introduction

University of New Mexico's William Michener describes a data management plan (DMP) as “a document that describes how you will treat your data during a project and what happens with the data after the project ends.”[1] The Digital Curation Centre's (DCC) Martin Donnelly notes that DMPs “serve to mitigate risks and help instill confidence and trust in the data and its stewards.”[2] Sarah Jones, also of the DCC, adds that “planning for the effective creation, management, and sharing of your data enables you to get the most out of your research.”[3] As such, a DMP should be created not only to obtain a grant but also to conduct the proposed project successfully.

According to ISO 31000[4], a risk is “an effect of uncertainty on objectives.” Data management plans should help to decrease the effects of uncertainty on project objectives. We consider it a problem that neither DMPs nor funders’ DMP evaluation schemes include a structured approach for the identification or mitigation of risks, since such an approach would foster the successful conduct of data-generating projects, which often are funded research projects. We believe our approach will help funders evaluate the risks of proposed projects and hence the risks of their investment options.

Data management maturity models like the Data Management Maturity (DMM) model[5] or the Enterprise Information Management (EIM) maturity model[6] are primarily designed for enterprises and may not be feasible for higher education institutions (HEIs). A rigid model for HEIs to coordinate support of data management and sharing across a diverse range of actors and processes to deliver the necessary technological and human infrastructures “cannot be prescribed since individual organizations and cultures occupy a spectrum of differences.”[7] Also, there is a potential conflict between organizational demands and scientific freedom. The Charter of Fundamental Rights of the E.U. includes scientific freedom as a constitutional right, and researchers may view the imposition of specific data management processes as a restriction of their scientific freedom. At the international level, UNESCO recommends that “each Member State should institute procedures adapted to its needs for ensuring that, in the performance of research and development, scientific researchers respect public accountability while at the same time enjoying the degree of autonomy appropriate to their task and to the advancement of science and technology.”[8]

We consider it important that researchers commit themselves to data management practices such as those of ISO 31000. However, ISO 31000 defines the risk management process as a feedback loop to be conducted in organizations.[4] Projects tend to have a much more limited scope than organizations with regard to funding and duration. Therefore, we regard the ISO 31000 risk management process as too time-consuming and of limited suitability for funded research and similar projects.

In this paper, we propose a lightweight approach for the identification of general risks in DMPs. We introduce an initial version of a generic risk catalog for funded research and similar projects. By analyzing a selection of 13 DMPs for projects from multiple disciplines published in the Research Ideas and Outcomes (RIO) journal[9][10][11][12][13][14][15][16][17][18][19][20][21], we demonstrate that our approach is applicable and transferable to multiple institutional constellations. As a result, the effort for integrating risk management in data management planning can be reduced.

Related work

Jones et al. developed a guide for HEIs “to help institutions understand the key aims and issues associated with planning and implementing research data management (RDM) services.”[7] In this guide, the authors mention data management risks for HEIs. They note that while the upfront costs for cheap storage of active data “may be only a fraction of those quoted by central services, the risks of data loss and security breaches are significantly higher, potentially leading to far greater costs in the long term.”[7] Additionally, there are “potential legal risks from using third-party services.”[7] However, data selection counters the risks of “reputational damage from exposing dirty, confidential, or undocumented data that has been retained long after the researchers who created it have left.”[7]

The OSCRP working group developed the Open Science Cyber Risk Profile (OSCRP), which “is designed to help principal investigators (PI) and their supporting information technology (IT) professionals assess cybersecurity risks related to open science projects.”[22] The working group proposes that principal investigators examine risks, consequences, and avenues of attack for each mission-critical science asset on an inventory list, where assets include devices, systems, data, personnel, workflows, and other kinds of resources.[22] We regard this as a very detailed alternative to our approach, although the FAIR Guiding Principles[23] and long-term preservation would need to be added to it.

In 2014, Ferreira et al.[24] “propose an analysis process for eScience projects using a data management plan and ISO 31000 in order to create a risk management plan that can complement the data management plan.” The authors describe an analytical process for creating a risk management plan and “present the previous process’ validation, based on the MetaGen-FRAME project.”[24] Within this validation Ferreira et al. also identify a project’s task-specific risks, e.g., “R6: Loss of metadata, denying the representation of the output information to the user via Taverna.”[24] This risk is tailored to the use of Taverna and hence may not be relevant for the majority of funded research and similar projects. There may be projects for which analyzing specific risks for all resources may be crucial. However, a detailed risk analysis may require a considerable amount of work.

Methods

We propose a lightweight approach that can serve as a starting point to include risk management in research data management planning. It doesn’t preclude detailed approaches like OSCRP[22] or ISO 31000.[4] Instead, our approach aims to reduce, and possibly avoid, the burden of a full risk management process such as that of ISO 31000. It is based on a pre-tailored and extensible general risk catalog (Table 1) to lessen the effort required for risk management. We derived part of this risk catalog from 29 interviews with researchers from multiple disciplines[a], which we conducted as part of project SynFo: Creating synergies on the operational level of research data management.[25] One goal of project SynFo was the development of a transferable approach to improve research data management in multiple organizational constellations. In generalized content from the interviews, we identified risks entailed by interfaces of information, e.g., between researchers and data subjects or between researchers and external service providers. For the development of our approach, we also consulted the catalogs for threats and measures from the supplement of the “IT-Grundschutz” catalogs[26] by the German Federal Office for Information Security (BSI), the FAIR Guiding Principles[23], and the report and action plan from the European Commission expert group on FAIR data.[27]
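To make the idea of a pre-tailored, extensible catalog more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the class names, risk identifiers, risk wording, and keywords are hypothetical, and the simple keyword match only stands in for the manual reading a reviewer would apply to a DMP.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Risk:
    """One catalog entry. Identifiers and wording are hypothetical."""
    risk_id: str
    description: str
    keywords: tuple[str, ...]  # terms suggesting the DMP addresses the risk


@dataclass
class RiskCatalog:
    risks: list[Risk] = field(default_factory=list)

    def extend(self, risk: Risk) -> None:
        """Add a project-specific risk to the general catalog."""
        if any(r.risk_id == risk.risk_id for r in self.risks):
            raise ValueError(f"duplicate risk id: {risk.risk_id}")
        self.risks.append(risk)

    def review(self, dmp_text: str) -> dict[str, bool]:
        """Map each risk id to whether any of its keywords occurs in the DMP."""
        text = dmp_text.lower()
        return {r.risk_id: any(k in text for k in r.keywords) for r in self.risks}


def unaddressed(result: dict[str, bool]) -> list[str]:
    """Return the ids of risks for which no keyword was found."""
    return [risk_id for risk_id, found in result.items() if not found]


# Illustrative entries only; they are not taken from Table 1.
catalog = RiskCatalog([
    Risk("R1", "Data loss", ("backup", "redundan")),
    Risk("R2", "Legal risks from third-party services",
         ("third-party", "service provider")),
])
catalog.extend(Risk("R3", "Exposure of confidential data",
                    ("anonymi", "access control")))

dmp = ("Raw data are stored on institutional servers with nightly backup. "
       "Interview transcripts are anonymized before sharing.")
result = catalog.review(dmp)
print(unaddressed(result))
```

In this sketch, a risk reported as unaddressed is only a prompt for the DMP author or reviewer to look more closely; it does not establish that the plan ignores the risk.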


Footnotes

  a. Geosciences (12), biology (5), humanities (5), social and behavioral sciences (4), computer science, systems engineering and electrical engineering (2), and medicine (1)

References

  1. Michener, W.K. (2015). "Ten Simple Rules for Creating a Good Data Management Plan". PLoS Computational Biology 11 (10): e1004525. doi:10.1371/journal.pcbi.1004525. 
  2. Donnelly, M. (2012). "Chapter 5: Data management plans and planning". In Pryor, G. Managing Research Data. Facet. pp. 83–104. doi:10.29085/9781856048910.006. ISBN 9781856048910.
  3. Jones, S. (2011). "How to Develop a Data Management and Sharing Plan". Digital Curation Centre. https://www.dcc.ac.uk/guidance/how-guides/develop-data-plan. Retrieved 19 November 2019. 
  4. "ISO 31000:2018 Risk management — Guidelines". International Organization for Standardization. February 2018. https://www.iso.org/standard/65694.html.
  5. "Data Management Maturity (DMM)". Information System Audit and Control Association, Inc. 2019. https://cmmiinstitute.com/data-management-maturity. Retrieved 22 November 2019. 
  6. Newman, D.; Logan, D. (23 December 2008). "Overview: Gartner Introduces the EIM Maturity Model". Gartner. https://www.gartner.com/en/documents/846312/overview-gartner-introduces-the-eim-maturity-model. 
  7. Jones, S.; Pryor, G.; Whyte, A. (25 March 2013). "How to Develop RDM Services - A guide for HEIs". Digital Curation Centre. https://www.dcc.ac.uk/guidance/how-guides/how-develop-rdm-services. Retrieved 19 November 2019.
  8. UNESCO (2017). "Records of the General Conference, 39th session, Paris, 30 October-14 November 2017, v. 1: Resolutions". p. 116. https://unesdoc.unesco.org/ark:/48223/pf0000260889.page=116. Retrieved 30 November 2019. 
  9. Canhos, D.A.L. (2017). "Data Management Plan: Brazil's Virtual Herbarium". RIO 3: e14675. doi:10.3897/rio.3.e14675. 
  10. Fey, J.; Anderson, S. (2016). "Boulder Creek Critical Zone Observatory Data Management Plan". RIO 2: e9419. doi:10.3897/rio.2.e9419.
  11. Fisher, J.; Nading, A.M. (2016). "A Political Ecology of Value: A Cohort-Based Ethnography of the Environmental Turn in Nicaraguan Urban Social Policy". RIO 2: e8720. doi:10.3897/rio.2.e8720. 
  12. Gatto, L. (2017). "Data Management Plan for a Biotechnology and Biological Sciences Research Council (BBSRC) Tools and Resources Development Fund (TRDF) Grant". RIO 3: e11624. doi:10.3897/rio.3.e11624. 
  13. McWhorter, J.; Wright, D.; Thomas, J. (2016). "Coastal Data Information Program (CDIP)". RIO 2: e8827. doi:10.3897/rio.2.e8827. 
  14. Neylon, C. (2017). "Data Management Plan: IDRC Data Sharing Pilot Project". RIO 3: e14672. doi:10.3897/rio.3.e14672. 
  15. Nichols, H.; Stolze, S. (2016). "Migration of legacy data to new media formats for long-time storage and maximum visibility: Modern pollen data from the Canadian Arctic (1972/1973)". RIO 2: e10269. doi:10.3897/rio.2.e10269. 
  16. Pannell, J.L. (2016). "Data Management Plan for PhD Thesis "Climatic Limitation of Alien Weeds in New Zealand: Enhancing Species Distribution Models with Field Data"". RIO 2: e10600. doi:10.3897/rio.2.e10600. 
  17. Traynor, C. (2017). "Data Management Plan: Empowering Indigenous Peoples and Knowledge Systems Related to Climate Change and Intellectual Property Rights". RIO 3: e15111. doi:10.3897/rio.3.e15111. 
  18. Wael, R. (2017). "Data Management Plan: HarassMap". RIO 3: e15133. doi:10.3897/rio.3.e15133. 
  19. White, E.P. (2016). "Data Management Plan for Moore Investigator in Data Driven Discovery Grant". RIO 2: e10708. doi:10.3897/rio.2.e10708. 
  20. Woolfrey, L. (2017). "Data Management Plan: Opening access to economic data to prevent tobacco related diseases in Africa". RIO 3: e14837. doi:10.3897/rio.3.e14837. 
  21. Xu, H.; Ishida, M.; Wang, M. (2016). "A Data Management Plan for Effects of particle size on physical and chemical properties of mine wastes". RIO 2: e11065. doi:10.3897/rio.2.e11065. 
  22. Peisert, S.; Welch, V.; Adams, A. et al. (2017). "Open Science Cyber Risk Profile (OSCRP)". IUScholar Works. http://hdl.handle.net/2022/21259. Retrieved 19 November 2019.
  23. Wilkinson, M.D.; Dumontier, M.; Aalbersberg, I.J. et al. (2016). "The FAIR Guiding Principles for scientific data management and stewardship". Scientific Data 3: 160018. doi:10.1038/sdata.2016.18. PMC PMC4792175. PMID 26978244. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4792175.
  24. Ferreira, F.; Coimbra, M.E.; Bairrão, R. et al. (2014). "Data Management in Metagenomics: A Risk Management Approach". International Journal of Digital Curation 9 (1): 41–56. doi:10.2218/ijdc.v9i1.299.
  25. University Computing Centre (Rechenzentrum) (2019). "SynFo - Creating synergies on the operational level of research data management". Kiel University. https://www.rz.uni-kiel.de/en/projects/synfo-creating-synergies-on-the-operational-level-of-research-data-management. 
  26. German Federal Office for Information Security (22 December 2016). "IT-Grundschutz-catalogues 15th version - 2015 (Draft)". Archived from the original on 28 January 2020. https://web.archive.org/web/20200128211607/https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Grundschutz/International/GSK_15_EL_EN_Draft.html. Retrieved 19 November 2019. 
  27. Collins, S.; Genova, F.; Harrower, N. (26 November 2018). "Turning FAIR into reality". European Commission. doi:10.2777/1524. https://op.europa.eu/en/publication-detail/-/publication/7769a148-f1f6-11e8-9982-01aa75ed71a1/language-en. 

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. Grammar was cleaned up for smoother reading. In some cases important information was missing from the references, and that information was added. The original article lists references in alphabetical order; this version lists them in order of appearance, by design.