Journal:Rethinking data sharing and human participant protection in social science research: Applications from the qualitative realm

From LIMSWiki
Full article title: Rethinking data sharing and human participant protection in social science research: Applications from the qualitative realm
Journal: Data Science Journal
Author(s): Kirilova, Dessi; Karcher, Sebastian
Author affiliation(s): Qualitative Data Repository
Primary contact: Email: skarcher at syr dot edu
Year published: 2017
Volume and issue: 16(1)
Page(s): 43
DOI: 10.5334/dsj-2017-043
ISSN: 1683-1470
Distribution license: Creative Commons Attribution 4.0 International
Website: https://datascience.codata.org/articles/10.5334/dsj-2017-043/
Download: https://datascience.codata.org/articles/10.5334/dsj-2017-043/galley/712/download/ (PDF)

Abstract

While data sharing is becoming increasingly common in quantitative social inquiry, qualitative data are rarely shared. One factor inhibiting data sharing is a concern about human participant protections and privacy. Protecting the confidentiality and safety of research participants is a concern for both quantitative and qualitative researchers, but it raises specific concerns within the epistemic context of qualitative research. Thus, the applicability of emerging protection models from the quantitative realm must be carefully evaluated for application to the qualitative realm. At the same time, qualitative scholars already employ a variety of strategies for human-participant protection implicitly or informally during the research process. In this practice paper, we assess available strategies for protecting human participants and how they can be deployed. We describe a spectrum of possible data management options, such as de-identification and applying access controls, including some already employed by the Qualitative Data Repository (QDR) in tandem with its pilot depositors. Throughout the discussion, we consider the tension between modifying data or restricting access to them, and retaining their analytic value. We argue that developing explicit guidelines for sharing qualitative data generated through interaction with humans will allow scholars to address privacy concerns and increase the secondary use of their data.

Keywords: qualitative data, data sharing, sensitive data, research ethics, data curation

Introduction

While data sharing is becoming increasingly common in quantitative social inquiry, qualitative data are still rarely shared. One of the major factors inhibiting data sharing is a concern about human participant protections and privacy. Protecting the confidentiality and safety of research participants is a consideration for both quantitative and qualitative researchers, but it raises specific worries within the epistemic context of qualitative research. Thus, the applicability of emerging protection models from the quantitative realm must be carefully evaluated for elements appropriate for the qualitative realm. At the same time, qualitative scholars already employ a variety of strategies for human-participant protection implicitly or informally during the research process, so part of the challenge is lessened if data repositories help researchers draw on their familiarity and comfort with these and enhance them in the process.

In this practice paper, we draw on our experiences working at the Qualitative Data Repository (QDR) to assess available approaches for protecting human participants and how they can be deployed in the qualitative realm in particular. We describe a spectrum of possible data management options that can be used individually or in combinations, such as de-identification and applying access controls. We also review some use-case applications by the repository in tandem with its pilot depositors. Throughout the discussion, we consider the tension between modifying data or restricting access to them, and retaining their analytic value. We argue that domain data professionals, cognizant of the needs of social scientific scholarly communities, can develop explicit but flexible guidelines for sharing qualitative data generated through interaction with humans that allow scholars to address privacy concerns throughout their work process. This, in turn, will make their collected data shareable and increase their secondary use for analytical or pedagogical purposes.

Impossible to share?

All those records had now been burned: Even before the controversy began, Goffman felt as though their ritual incineration was the only way she could protect her friend-informers from police scrutiny after her book was published.[1]

Until recently, sociologist Alice Goffman’s approach to protecting her research participants was the norm in qualitative social science, even with data far less sensitive than her ethnographic study of crime and policing in low-income communities in Philadelphia.[2] A lack of awareness about the need for and benefits of data sharing, limited practicable strategies for protecting participants, and structurally conservative institutional review boards (IRBs) all combined to dissuade researchers from even attempting to share their data. Even more fundamentally, because they do not think of the qualitative materials they collect as “data” with inherent value beyond their own study, many social scientists have remained oblivious to the developing technologies, practices, and scientific infrastructure that newly make legal and ethical sharing possible.

The tide is turning, however: open science and research transparency are becoming established as disciplinary norms, and funding agencies as well as journals are developing mandates for making articles, data, and software available to the scientific community and the public at large. Simultaneously, textual, audio, video and other types of qualitative data are becoming more immediately obtainable, and those collected in digital formats are increasingly easy to distribute. Each of these factors leads to an increased interest in managing and sharing qualitative data, but also raises concerns about how to openly share those involving human subjects both ethically and safely. The tensions between the broad vision of open access and the long-standing demand to protect the people whose information researchers use are important, but should not be declared irreconcilable. The most fruitful way forward is for institutions that fund data collection, that store data for sharing, and that publish academic work making knowledge claims on the basis of these data – in collaboration with the researchers themselves – to develop policies and procedures that are consistent with relevant legal and ethical obligations, ensuring the well-being and privacy of research participants.

The Qualitative Data Repository (QDR, www.qdr.org), hosted by Syracuse University, went online in 2014. It was established as a domain repository with the explicit mission to provide a home for qualitative and multi-method primary data, which might otherwise remain invisible in the social science research community. QDR serves this mission most directly by offering a user-friendly platform that enables researchers from around the world and across all social science disciplines to publish their data projects in a reliable digital venue and thus make them durably discoverable (via indexing and use of digital object identifiers or DOIs), citable (by suggesting an accurate and complete bibliographical record), intelligible to others (by providing narrative documentation and structured metadata) and, ideally, linked to the original researcher’s and others’ publications that use them (by using CrossRef/DataCite article-data linking).

More broadly, QDR’s staff – which includes the authors of this paper – has learned during these early years that its key role is to cultivate the repository’s intended user community. QDR has consequently been at the forefront of efforts to promote and support the sharing of qualitative social science data, not simply by providing technical infrastructure, but by working closely with individual data depositors to curate their qualitative data for preservation and reuse and by creating useful guidance materials that address the various stages of a research project (see https://qdr.syr.edu/guidance). When provided education in the basics of data management, social scientists become well-positioned to undertake their work with the goal of sharing in mind from the planning stages. Over the course of repository operations, we have found that the biggest impact we can have is to encourage qualitative researchers to start seeing what they do as “data collection” and its artifacts as stand-alone scholarly products that are publishable and deserving of intellectual recognition.

Qualitative data sharing works best when researchers are able to capitalize on their closeness to the human sources of their rich materials and on existing feelings of responsibility for and skills in protecting those sources. By giving researchers both credit for and control over their data work, we believe repositories can partner with them to advance the cause of safe and ethical data sharing. Drawing specific lessons from an initial set of pilot studies, each with its own challenges, we at QDR developed strategies to coach researchers about the options at their disposal to share even sensitive qualitative data. These strategies fit within the research data lifecycle, from planning through data publication.

Planning for data collection

The main insight throughout QDR’s pilot projects has been the advantage of early and thorough data management planning oriented towards the later sharing of data.[3] However, many standard approaches borrowed from quantitative research are difficult to apply directly to qualitative research. For example, as a general rule of thumb, QDR recommends that scholars do not collect identifying information where it is not substantively needed for the purpose of the study. However, the nature of qualitative interviews often produces a paper (or e-mail) trail to schedule the interview where direct identifiers (names, phone numbers, addresses) abound. Complicating the situation even further, researchers often build lasting relationships spanning multiple interviews with their participants, making such a strategy inapplicable. The objection to data sharing most commonly raised by qualitative researchers themselves combines this integral role they as individuals play in the research process and the very richness of the contextual material typically gathered.[4]

We propose to reconsider this “closeness” of the investigators, as we find that it makes them best positioned to undertake the necessary modifications to received strategies that can enable reasonable data sharing without introducing harm to the participants the researchers know so well. Instead of making the sharing and archiving of qualitative data particularly challenging, the embeddedness of researchers in their research site should be thought of as a resource, a deep foundation of knowledge of local circumstances and expectations. Thus a scholar would be able to decide in advance what might be the right secure location to store any contact information necessary for his or her ongoing interactions in the field: One example could be a notebook separate from the digital transcriptions of interviews that they keep with them at all times because of a fear that their rented apartment in the field can be accessed without their knowledge; another – a file encrypted on a memory stick, locked in a cabinet once back at their home institution, where negligence is a greater concern than unauthorized searches. Additionally, scholars who have decided on such basic data management rules in advance can use them to easily train any transcribers or other research assistants they work with in the chosen privacy protocols. Even more importantly, they can present a cogent argument during their IRB application process (i.e., before the rules are put in action) why a given option that does not involve destroying collected materials is the right choice for a given research project. Crucially, all of these downstream advantages can only be realized if the idea of sharing the data is pursued from the earliest project planning stages.
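The separation described above, with identifying information kept apart from the transcripts themselves, can be illustrated with a minimal Python sketch. It is not QDR's workflow or tooling: the function name `pseudonymize`, the "Participant N" labels, and the sample sentence are all hypothetical and chosen only for illustration. The sketch also handles only direct identifiers that the researcher lists explicitly; contextual details that could identify someone still require the researcher's own judgment.

```python
import re


def pseudonymize(transcript, names, prefix="Participant"):
    """Replace each listed name with a stable pseudonym.

    Returns the redacted text and the name-to-pseudonym key.
    Hypothetical helper for illustration only.
    """
    # Pseudonyms are assigned in the order the researcher lists the names.
    key = {name: f"{prefix} {i}" for i, name in enumerate(names, start=1)}
    if not key:
        return transcript, {}

    # Try longer names first so a full name is matched before a shorter
    # name it contains.
    alternatives = sorted(key, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in alternatives) + r")\b"
    )
    redacted = pattern.sub(lambda m: key[m.group(1)], transcript)
    return redacted, key


# Example with invented names: the redacted text can travel with the
# project files, while the key is stored separately.
redacted_text, name_key = pseudonymize(
    "Maria told Omar that Maria left.", ["Maria", "Omar"]
)
```

In a setup like this, only `redacted_text` would be handed to transcribers or prepared for deposit, while `name_key` would go to whatever secure location the researcher chose in advance, such as the encrypted file mentioned above.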

Another key aspect of qualitative data gathering concerns the informed consent procedure. As Bishop[5] notes, many qualitative researchers (often to accommodate what they think IRBs expect) use highly restrictive terms of consent, even where risks are minimal and research participants would not object to data sharing. Beyond requesting affirmative consent to share the collected data, researchers can and should tailor the details of their consent procedure to the locale and cultural context, and qualitative researchers can use their close interaction with participants to gain a better sense of the most appropriate form of consent. For example, we talked to one researcher studying former civil war combatants who found (somewhat to her surprise) that her interviewees were reassured by the detailed written consent forms she used. In other contexts, written consent might have the opposite effect. The guiding principle, however, applies to both scenarios: the researcher needs to make intentional choices and provide clear documentation of them. Even if the decision is for verbal, collective consent, for instance, justified by a traditional understanding of who in the group under study has the authority to grant it, this decision and its rationale will be recorded and presented as documentation alongside the actual transcripts of the group interviews (in full or further redacted, which should be another discrete option).

Those choices themselves should be based on a thoughtful and realistic assessment of both the probability and degree of risk of harm, as weighed against the benefits of the research itself and the sharing of the data.[6] This sort of “risk-benefit calculation” is quite familiar to IRBs from the biomedical research realm where they originated[7], and its logic remains broadly pertinent for social science work, both qualitative and quantitative.

Data collection

References

  1. Lewis-Kraus, G. (12 January 2016). "The Trials of Alice Goffman". The New York Times Magazine. https://www.nytimes.com/2016/01/17/magazine/the-trials-of-alice-goffman.html. 
  2. Goffman, A. (2014). On the Run: Fugitive Life in an American City. University of Chicago Press. doi:10.7208/chicago/9780226136851.001.0001. ISBN 9780226136851. 
  3. Karcher, S.; Kirilova, D.; Weber, N. (2016). "Beyond the matrix: Repository services for qualitative data". IFLA Journal 42 (4): 292–302. doi:10.1177/0340035216672870.
  4. Fink, A.S. (2000). "The Role of the Researcher in the Qualitative Research Process: A Potential Barrier to Archiving Qualitative Data". Forum: Qualitative Social Research 1 (3): 4. doi:10.17169/fqs-1.3.1021. 
  5. Bishop, L. (2009). "Ethical Sharing and Reuse of Qualitative Data". Australian Journal of Social Issues 44 (3): 255–272. doi:10.1002/j.1839-4655.2009.tb00145.x. 
  6. Van Den Eynden, V. (2008). "Sharing Research Data and Confidentiality: Restrictions Caused by Deficient Consent Forms". Research Ethics 4 (1): 37–38. doi:10.1177/174701610800400111. 
  7. Beauchamp, T.L. (2011). "Informed Consent: Its History, Meaning, and Present Challenges". Cambridge Quarterly of Healthcare Ethics 20 (4): 515–523. doi:10.1017/S0963180111000259.

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. The original article lists references alphabetically, but this version — by design — lists them in order of appearance. Footnotes have been changed from numbers to letters as citations are currently using numbers.