Difference between revisions of "User:Shawndouglas/sandbox/sublevel12"

From LIMSWiki
 
(44 intermediate revisions by the same user not shown)
Line 8: Line 8:
==Sandbox begins below==
==Sandbox begins below==
<div class="nonumtoc">__TOC__</div>
<div class="nonumtoc">__TOC__</div>
<div class="nonumtoc">__TOC__</div>
[[File:FAIRResourcesGraphic AustralianResearchDataCommons 2018.png|right|520px]]
[[File:Daily Operations in the Microbiology Lab Aboard USNS Comfort (49826560406).jpg|right|450px]]
'''Title''': ''What are the potential implications of the FAIR data principles to laboratory informatics applications?''
'''Title''': ''What are the key elements of a LIMS for medical microbiology?''


'''Author for citation''': Shawn E. Douglas
'''Author for citation''': Shawn E. Douglas
Line 16: Line 15:
'''License for content''': [https://creativecommons.org/licenses/by-sa/4.0/ Creative Commons Attribution-ShareAlike 4.0 International]
'''License for content''': [https://creativecommons.org/licenses/by-sa/4.0/ Creative Commons Attribution-ShareAlike 4.0 International]


'''Publication date''': April 2024
'''Publication date''': May 2024


==Introduction==
==Introduction==


This brief topical article will examine


This brief topical article will examine the informatics needs of the medical microbiology lab, including a base set of [[laboratory information management system]] (LIMS) or [[laboratory information system]] (LIS) functionality (i.e., system requirements) that is critical to fulfilling the information management and workflow requirements of this type of lab. (Going forward, for simplicity, this article will discuss these requirements largely in the scope of a LIMS; however, note that an LIS is equally viable here.) Additional unique requirements will also be briefly discussed.
==The "FAIR-ification" of research objects and software==
First discussed during a 2014 FORCE-11 workshop dedicated to "overcoming data discovery and reuse obstacles," the [[Journal:The FAIR Guiding Principles for scientific data management and stewardship|FAIR Guiding Principles]] were published by Wilkinson ''et al.'' in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and [[information]] of all shapes and formats) become more universally findable, accessible, interoperable and reusable (FAIR) by both machines and people.<ref name="WilkinsonTheFAIR16">{{Cite journal |last=Wilkinson |first=Mark D. |last2=Dumontier |first2=Michel |last3=Aalbersberg |first3=IJsbrand Jan |last4=Appleton |first4=Gabrielle |last5=Axton |first5=Myles |last6=Baak |first6=Arie |last7=Blomberg |first7=Niklas |last8=Boiten |first8=Jan-Willem |last9=da Silva Santos |first9=Luiz Bonino |last10=Bourne |first10=Philip E. |last11=Bouwman |first11=Jildau |date=2016-03-15 |title=The FAIR Guiding Principles for scientific data management and stewardship |url=https://www.nature.com/articles/sdata201618 |journal=Scientific Data |language=en |volume=3 |issue=1 |pages=160018 |doi=10.1038/sdata.2016.18 |issn=2052-4463 |pmc=PMC4792175 |pmid=26978244}}</ref> The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."<ref name="WilkinsonTheFAIR16" />


'''Note''': Any citation leading to a software vendor's site is not to be considered a recommendation for that vendor. The citation should however still stand as a representational example of what vendors are implementing in their systems.
Since 2016, other research stakeholders have taken to publishing their thoughts about how the FAIR principles apply to their fields of study and practice<ref name="NIHPubMedSearch">{{cite web |url=https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles |title=fair data principles |work=PubMed Search |publisher=National Institutes of Health, National Library of Medicine |accessdate=30 April 2024}}</ref>, including in ways beyond what perhaps was originally imagined by Wilkinson ''et al.'' For example, multiple authors have examined whether or not the software used in scientific endeavors itself can be considered a research object worth being developed and managed in tandem with the FAIR data principles.<ref>{{Cite journal |last=Hasselbring |first=Wilhelm |last2=Carr |first2=Leslie |last3=Hettrick |first3=Simon |last4=Packer |first4=Heather |last5=Tiropanis |first5=Thanassis |date=2020-02-25 |title=From FAIR research data toward FAIR and open research software |url=https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html |journal=it - Information Technology |language=en |volume=62 |issue=1 |pages=39–47 |doi=10.1515/itit-2019-0040 |issn=2196-7032}}</ref><ref name="GruenpeterFAIRPlus20">{{Cite web |last=Gruenpeter, M. |date=23 November 2020 |title=FAIR + Software: Decoding the principles |url=https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf |format=PDF |publisher=FAIRsFAIR “Fostering FAIR Data Practices In Europe” |accessdate=30 April 2024}}</ref><ref>{{Cite journal |last=Barker |first=Michelle |last2=Chue Hong |first2=Neil P. |last3=Katz |first3=Daniel S. 
|last4=Lamprecht |first4=Anna-Lena |last5=Martinez-Ortiz |first5=Carlos |last6=Psomopoulos |first6=Fotis |last7=Harrow |first7=Jennifer |last8=Castro |first8=Leyla Jael |last9=Gruenpeter |first9=Morane |last10=Martinez |first10=Paula Andrea |last11=Honeyman |first11=Tom |date=2022-10-14 |title=Introducing the FAIR Principles for research software |url=https://www.nature.com/articles/s41597-022-01710-x |journal=Scientific Data |language=en |volume=9 |issue=1 |pages=622 |doi=10.1038/s41597-022-01710-x |issn=2052-4463 |pmc=PMC9562067 |pmid=36241754}}</ref><ref>{{Cite journal |last=Patel |first=Bhavesh |last2=Soundarajan |first2=Sanjay |last3=Ménager |first3=Hervé |last4=Hu |first4=Zicheng |date=2023-08-23 |title=Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool |url=https://www.nature.com/articles/s41597-023-02463-x |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=557 |doi=10.1038/s41597-023-02463-x |issn=2052-4463 |pmc=PMC10447492 |pmid=37612312}}</ref><ref>{{Cite journal |last=Du |first=Xinsong |last2=Dastmalchi |first2=Farhad |last3=Ye |first3=Hao |last4=Garrett |first4=Timothy J. |last5=Diller |first5=Matthew A. |last6=Liu |first6=Mei |last7=Hogan |first7=William R. |last8=Brochhausen |first8=Mathias |last9=Lemas |first9=Dominick J. 
|date=2023-02-06 |title=Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software |url=https://link.springer.com/10.1007/s11306-023-01974-3 |journal=Metabolomics |language=en |volume=19 |issue=2 |pages=11 |doi=10.1007/s11306-023-01974-3 |issn=1573-3890}}</ref> Researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts, recognize that digital research objects go beyond data and information, and recognize "the specific nature of software" and not consider it "just data."<ref name="GruenpeterFAIRPlus20" /> The end result has been an application of FAIR's core concepts to research software that differs from their application to data: because research software is more than just data, making it FAIR requires more nuance and a different type of planning than making digital data and information FAIR.


==Base LIMS requirements for medical microbiology labs==
A 2019 survey by Europe's FAIRsFAIR found that researchers seeking and re-using relevant research software on the internet faced multiple challenges, including understanding and/or maintaining the necessary software environment and its dependencies, finding sufficient documentation, struggling with accessibility and licensing issues, having the time and skills to install and/or use the software, finding quality control of the source code lacking, and having an insufficient (or non-existent) software sustainability and management plan.<ref name="GruenpeterFAIRPlus20" /> These challenges highlight the importance of software to researchers and other stakeholders, and the role FAIR has in better ensuring such software is findable, interoperable, and reusable, which in turn better ensures researchers' software-driven research is repeatable (by the same research team, with the same experimental setup), reproducible (by a different research team, with the same experimental setup), and replicable (by a different research team, with a different experimental setup).<ref name="GruenpeterFAIRPlus20" />
Like other labs, medical microbiology labs increasingly require an informatics solution that meets all or most of its workflow requirements. These requirements are often driven by standardized test methods, in turn driven by regulations and accreditation requirements. This requires a pre-configured and future-configurable solution that enables medical microbiology personnel to quickly select and use standardized test methods and forms, and make the changes they need to those methods and forms if those changes make sense within the overall data structure of the LIMS.


What follows is a list of fundamental LIMS functionality important to most any medical microbiology laboratory, with a majority of that functionality found in many vendor software solutions.<ref name="RhoadsClin14">{{Cite journal |last=Rhoads |first=Daniel D. |last2=Sintchenko |first2=Vitali |last3=Rauch |first3=Carol A. |last4=Pantanowitz |first4=Liron |date=2014-10 |title=Clinical Microbiology Informatics |url=https://journals.asm.org/doi/10.1128/CMR.00049-14 |journal=Clinical Microbiology Reviews |language=en |volume=27 |issue=4 |pages=1025–1047 |doi=10.1128/CMR.00049-14 |issn=0893-8512 |pmc=PMC4187636 |pmid=25278581}}</ref><ref name="SlclabVideos23">{{cite web |url=https://slclab.com/en/videos-en.aspx |title=Video tutorials - Microbiology Results (Antibiogram) |publisher=SLCLAB Informática SL |date=2023 |accessdate=17 April 2024}}</ref><ref name="BGASoftMolec23">{{cite web |url=https://www.limsabc.com/molecular-id/ |title=Molecular ID LIS Solution |publisher=BGASoft, Inc |date=2023 |accessdate=17 April 2024}}</ref>
At this point, the topic of what "research software" represents must be addressed further, and, unsurprisingly, it's not straightforward. Ask 20 researchers what "research software" is, and you may get 20 different opinions. Some definitions can be more objectively viewed as too narrow, while others may be viewed as too broad, with some level of controversy inherent in any mutual discussion.<ref name="GruenpeterDefining21">{{Cite journal |last=Gruenpeter, Morane |last2=Katz, Daniel S. |last3=Lamprecht, Anna-Lena |last4=Honeyman, Tom |last5=Garijo, Daniel |last6=Struck, Alexander |last7=Niehues, Anna |last8=Martinez, Paula Andrea |last9=Castro, Leyla Jael |last10=Rabemanantsoa, Tovo |last11=Chue Hong, Neil P. |date=2021-09-13 |title=Defining Research Software: a controversial discussion |url=https://zenodo.org/record/5504016 |journal=Zenodo |doi=10.5281/zenodo.5504016}}</ref><ref name="JulichWhatIsRes24">{{cite web |url=https://www.fz-juelich.de/en/rse/about-rse/what-is-research-software |title=What is Research Software? |work=JuRSE, the Community of Practice for Research Software Engineering |publisher=Forschungszentrum Jülich |date=13 February 2024 |accessdate=30 April 2024}}</ref><ref name="vanNieuwpoortDefining24">{{Cite journal |last=van Nieuwpoort |first=Rob |last2=Katz |first2=Daniel S. |date=2023-03-14 |title=Defining the roles of research software |url=https://upstream.force11.org/defining-the-roles-of-research-software |language=en |doi=10.54900/9akm9y5-5ject5y}}</ref> In 2021, as part of the FAIRsFAIR initiative, Gruenpeter ''et al.'' made a good-faith effort to define "research software" with the feedback of multiple stakeholders. Their efforts resulted in this definition:


'''Test, sample, and result management'''
<blockquote>Research software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during, or with a clear research intent, should be considered "software [used] in research" and not research software. This differentiation may vary between disciplines. The minimal requirement for achieving computational reproducibility is that all the computational components (i.e., research software, software used in research, documentation, and hardware) used during the research are identified, described, and made accessible to the extent that is possible.</blockquote>


*Sample log-in and management, with support for unique IDs
Note that while the definition primarily recognizes software created during the research process, software created (whether by the research group, other open-source software developers outside the organization, or even commercial software developers) "for a research purpose" outside the actual research process is also recognized as research software. This notably can lead to disagreement about whether a proprietary, commercial spreadsheet or [[laboratory information management system]] (LIMS) offering that conducts analyses and visualizations of research data can genuinely be called research software, or simply classified as software used in research. van Nieuwpoort and Katz further elaborated on this concept, at least indirectly, by formally defining the roles of research software in 2023. Their definition of the various roles of research software—without using terms such as "open-source," "commercial," or "proprietary"—essentially further defined what research software is<ref name="vanNieuwpoortDefining24" />:
*Sample batching
*[[Barcode]] and RFID support
*End-to-end sample and inventory tracking
*Pre-defined and configurable industry-specific test and method management for a variety of physical, mechanical, and chemical analyses
*Pre-defined and configurable industry-specific workflows
*Configurable screens and data fields
*Specification management
*Test, sampling, instrument, etc. scheduling and assignment
*Test requesting
*Data import, export, and archiving
*Robust query tools
*Analytical tools, including [[data visualization]], statistical analysis, and [[data mining]] tools
*Document and image management
*Project management
*Facility and sampling site management
*Storage management and monitoring


'''Quality, security, and compliance'''
* Research software is a component of our instruments.
* Research software is the instrument.
* Research software analyzes research data.
* Research software presents research results.
* Research software assembles or integrates existing components into a working whole.
* Research software is infrastructure or an underlying tool.
* Research software facilitates distinctively research-oriented collaboration.


*[[Quality assurance]] / [[quality control]] mechanisms
When considering these definitions<ref name="GruenpeterDefining21" /><ref name="vanNieuwpoortDefining24" /> of research software and their adoption by other entities<ref name="F1000Open24">{{cite web |url=https://www.f1000.com/resources-for-researchers/open-research/open-source-software-code/ |title=Open source software and code |publisher=F1000 Research Ltd |date=2024 |accessdate=30 April 2024}}</ref>, it would appear that at least in part some [[laboratory informatics]] software—whether open-source or commercially proprietary—fills these roles in academic, military, and industry research laboratories of many types. In particular, [[electronic laboratory notebook]]s (ELNs) like open-source [[Jupyter Notebook]] or proprietary ELNs from commercial software developers fill the role of analyzing and visualizing research data, including developing molecular models for new promising research routes.<ref name="vanNieuwpoortDefining24" /> Even more advanced LIMS solutions that go beyond simply collating, auditing, securing, and reporting analytical results could conceivably fall under the umbrella of research software, particularly if many of the analytical, integration, and collaboration tools required in modern research facilities are included in the LIMS.
*Mechanisms for compliance with ISO/IEC 17025, ISO 9000, ASTM, A2LA, ANAB, and other requirements
*Result, method, protocol, batch, and material validation, review, and release
*Data validation
*Trend and control charting for statistical analysis and measurement of uncertainty
*User qualification, performance, and training management
*[[Audit trail]]s and [[chain of custody]] support
*Configurable and granular role-based security
*Configurable system access and use (i.e., authentication requirements, account usage rules, account locking, etc.)
*[[Electronic signature]] support
*Data [[encryption]] and secure communication protocols
*Archiving and [[Data retention|retention]] of data and information
*Configurable data [[backup]]s
*Status updates and alerts
*Incident and non-conformance notification, tracking, and management


'''Operations management and reporting'''
Ultimately, assuming that some laboratory informatics software can be considered research software and not just "software used in research," it's tough not to arrive at some deeper implications of research organizations' increasing need for FAIR data objects and software, particularly for laboratory informatics software and the developers of it.


*Configurable dashboards for monitoring, by material, process, facility, etc.
==Implications of the FAIR concept to laboratory informatics software==
*Customizable rich-text reporting, with multiple supported output formats
Non-relational Resource Description Framework (RDF) knowledge graph databases used in well-designed software help make research objects more FAIR.
*Custom and industry-specific reporting, including certificates of analysis (CoAs)
*Email integration
*Bi-directional instrument interfacing and data management
*Third-party software interfacing (e.g., [[scientific data management system]] [SDMS], other databases)
*Data import, export, and archiving
*Instrument calibration and maintenance tracking
*Inventory and material management
*Supplier/vendor/customer management
*Customer portal


==Specialty LIMS requirements==
- https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf
Some laboratory informatics software vendors are addressing medical microbiology laboratories' needs beyond the features of a basic all-purpose LIMS. A standard LIMS tailored for medical microbiology may already contribute to some of these wider organizational functions, as well as more advanced laboratory workflow requirements, but many may not, or may vary in what additional functionality they provide. In that regard, a medical microbiology LIMS vendor may also include specialized functionality that assists these labs. This includes the provision of:


*'''Derivative asset linking and tracking''': Unlike many other labs in the biomedical sciences, a medical microbiology lab will end up creating (e.g., via [[cell culture]]) multiple derivative assets from a single accessioned specimen. For example, a specimen suspected of polymicrobial infection may require derivative specimens representing "aerobic bacteria, anaerobic bacteria, mycobacteria, and/or fungi, and all of these need to be linked to the original accession number."<ref name="RhoadsClin14" /> As Rhoads ''et al.'' note, "properly handling the electronic information associated with a sample, such as tracking its derivatives, modifying descriptions of its derivatives, and linking its derivatives with their accession number, is a unique and essential aspect of the microbiology LIS."<ref name="RhoadsClin14" />
- https://biss.pensoft.net/article/37412/
*'''Support for notations on primary and derivative assets, as well as other entities''': Given the above about primary specimens and the culturing of derivatives, it's vital that careful note-taking is performed at the various stages of analysis and interpretation by microbiologists. This electronic note-taking—in the past performed on physical note cards<ref name="RhoadsClin14" />—in turn can improve quality and patient outcomes. As such, many informatics systems will provide note-taking functions at granular levels for a variety of entities in the system.<ref name="RhoadsClin14" />
 
*'''More robust, standardized, optimized, and automated result reporting''': Microbiology labs in particular have multiple requirements and methods for reporting analytical and interpretive results, compared to other clinical laboratory disciplines, which usually report in a largely quantitative way. The microbiology lab will need to report and interpret complex qualitative, semi-quantitative, and quantitative data and information, using long and repetitive text strings (e.g., ''Staphylococcus epidermidis'') in both preliminary and final results. Ensuring ease-of-use with keyboard shortcuts for long, repetitive text strings, while also ensuring succinct, standardized terminology and clear and accurate test results or interpretations is imperative. A LIMS can apply a more "synoptic" approach to reporting typically found with surgical pathology, supporting highly configurable layouts, the enforcement of standardized nomenclature, and appropriate result highlighting mechanisms to ensure more confident report interpretation by both microbiologists and treating physicians.<ref name="RhoadsClin14" />
- https://link.springer.com/article/10.1007/s40192-024-00348-4
*'''Robust support for interfacing with other systems''': While system and instrument interfacing is largely ''de facto'' required out of most any LIMS, this interfacing is essential to a majority of medical microbiology labs. Communication between it and any [[hospital information system]] (HIS), for example, must be unhindered and clear such that analytical and interpretive orders placed in the HIS make their way to the LIMS, and results from the lab are readily transferred back to the HIS in a standardized and readable format. This interfacing must be verified periodically to ensure high levels of quality and patient outcomes. As such, the microbiology LIMS must have robust interfacing support using standardized protocols that ensure clear and rapid bidirectional communication between other informatics systems and laboratory instruments.<ref name="RhoadsClin14" />
 
*'''Analytical and reporting support for susceptibility testing and antibiograms''': An antibiogram is a cumulative summary or "overall profile of [''in vitro''] susceptibility testing results for a specific microorganism to an array of antimicrobial drugs," often given in a tabular form.<ref name="UnivMNHowTo20">{{cite web |url=https://arsi.umn.edu/sites/arsi.umn.edu/files/2020-02/How_to_Use_a_Clinical_Antibiogram_26Feb2020_Final.pdf |format=PDF |title=How to Use a Clinical Antibiogram |author=Antimicrobial Resistance and Stewardship Initiative, University of Minnesota |date=February 2020 |accessdate=17 April 2024}}</ref> There are multiple approaches to antibiograms for a wide variety of susceptibility testing, common to microbiology labs.<ref>{{Cite journal |last=Gajic |first=Ina |last2=Kabic |first2=Jovana |last3=Kekic |first3=Dusan |last4=Jovicevic |first4=Milos |last5=Milenkovic |first5=Marina |last6=Mitic Culafic |first6=Dragana |last7=Trudic |first7=Anika |last8=Ranin |first8=Lazar |last9=Opavski |first9=Natasa |date=2022-03-23 |title=Antimicrobial Susceptibility Testing: A Comprehensive Review of Currently Used Methods |url=https://www.mdpi.com/2079-6382/11/4/427 |journal=Antibiotics |language=en |volume=11 |issue=4 |pages=427 |doi=10.3390/antibiotics11040427 |issn=2079-6382 |pmc=PMC9024665 |pmid=35453179}}</ref> While it's not common for a LIMS or LIS to have extensive data analysis capabilities, some may support the sometimes complex work of generating antibiograms.<ref name="SlclabVideos23" /><ref name="BGASoftMolec23" /> At a minimum, the LIMS should have robust reporting capabilities able to handle the nuances of susceptibility testing and antibiograms, particularly to the standard CLSI M39 ''Analysis and Presentation of Cumulative Antimicrobial Susceptibility Test Data''.<ref>{{Cite journal |last=Simner |first=Patricia J. |last2=Hindler |first2=Janet A. |last3=Bhowmick |first3=Tanaya |last4=Das |first4=Sanchita |last5=Johnson |first5=J. 
Kristie |last6=Lubers |first6=Brian V. |last7=Redell |first7=Mark A. |last8=Stelling |first8=John |last9=Erdman |first9=Sharon M. |date=2022-10-19 |editor-last=Humphries |editor-first=Romney M. |title=What’s New in Antibiograms? Updating CLSI M39 Guidance with Current Trends |url=https://journals.asm.org/doi/10.1128/jcm.02210-21 |journal=Journal of Clinical Microbiology |language=en |volume=60 |issue=10 |pages=e02210–21 |doi=10.1128/jcm.02210-21 |issn=0095-1137 |pmc=PMC9580356 |pmid=35916520}}</ref> Ideally, the LIMS can even run the analyses themselves, pulling data from the LIMS and HIS, but again this may not be common.
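As a rough illustration of the cumulative summary an antibiogram represents, the following minimal Python sketch tallies the percentage of susceptible isolates per organism and antimicrobial pairing. The record layout, the sample values, the function name <code>cumulative_antibiogram</code>, and the <code>min_isolates</code> threshold are all assumptions made for this example; they are not taken from any vendor's system, and the sketch does not attempt to implement the rules of CLSI M39 (such as how duplicate isolates are handled).

```python
from collections import defaultdict

# Hypothetical result records: (patient_id, organism, antimicrobial, interpretation).
# Interpretations use the susceptible/intermediate/resistant (S/I/R) categories.
RESULTS = [
    ("P1", "Escherichia coli", "ciprofloxacin", "S"),
    ("P2", "Escherichia coli", "ciprofloxacin", "R"),
    ("P3", "Escherichia coli", "ciprofloxacin", "S"),
    ("P4", "Escherichia coli", "ciprofloxacin", "S"),
    ("P1", "Escherichia coli", "ampicillin", "R"),
    ("P2", "Escherichia coli", "ampicillin", "R"),
    ("P3", "Escherichia coli", "ampicillin", "S"),
    ("P4", "Escherichia coli", "ampicillin", "I"),
    ("P5", "Staphylococcus epidermidis", "vancomycin", "S"),
]


def cumulative_antibiogram(results, min_isolates=1):
    """Return {(organism, antimicrobial): (n_tested, percent_susceptible)}.

    Pairings with fewer than `min_isolates` tested isolates are omitted;
    the threshold is a hypothetical parameter for this sketch.
    """
    tested = defaultdict(int)
    susceptible = defaultdict(int)
    for _patient, organism, drug, interpretation in results:
        key = (organism, drug)
        tested[key] += 1
        if interpretation == "S":
            susceptible[key] += 1

    table = {}
    for key, n in tested.items():
        if n >= min_isolates:
            table[key] = (n, round(100.0 * susceptible[key] / n, 1))
    return table


if __name__ == "__main__":
    # Print one row per organism/antimicrobial pairing, in sorted order.
    for (organism, drug), (n, pct) in sorted(cumulative_antibiogram(RESULTS).items()):
        print(f"{organism:28} {drug:15} n={n:<3} %S={pct}")
```

A production system would more likely pull these records from the LIMS and HIS and present the output in the tabular form described above; the sketch only shows the underlying arithmetic.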
- https://www.nature.com/articles/s41597-022-01352-z
 
- https://www.degruyter.com/document/doi/10.1515/jib-2018-0023/html
 
- https://arxiv.org/abs/2404.12935
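The derivative asset linking and granular note-taking described in the specialty LIMS requirements can be pictured as a simple tree of assets rooted at the accessioned specimen. The Python sketch below is only one possible illustration under stated assumptions: the <code>Asset</code> class, its field names, the ID suffix scheme, and the sample accession number are hypothetical and do not reflect how any particular LIMS or LIS stores this information.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Asset:
    """A specimen or one of its derivatives (hypothetical structure)."""
    asset_id: str
    description: str
    parent: Optional["Asset"] = field(default=None, repr=False)
    notes: List[str] = field(default_factory=list)
    children: List["Asset"] = field(default_factory=list, repr=False)

    def derive(self, suffix: str, description: str) -> "Asset":
        """Create a derivative asset linked back to this one."""
        child = Asset(f"{self.asset_id}-{suffix}", description, parent=self)
        self.children.append(child)
        return child

    def accession(self) -> str:
        """Walk up the parent links to the original accession number."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node.asset_id

    def descendants(self) -> Iterator["Asset"]:
        """Yield every derivative below this asset, depth first."""
        for child in self.children:
            yield child
            yield from child.descendants()


# One accessioned specimen with several derivative cultures (made-up values).
specimen = Asset("M24-000123", "wound swab")
aerobic = specimen.derive("AER", "aerobic bacterial culture")
anaerobic = specimen.derive("ANA", "anaerobic bacterial culture")
isolate = aerobic.derive("ISO1", "isolate from aerobic culture")

# Notes can be attached at any level of the tree.
isolate.notes.append("Gram-positive cocci in clusters")

print(isolate.accession())
print([a.asset_id for a in specimen.descendants()])
```

Keeping a parent link on every derivative means any culture or isolate can be traced back to its accession number, while notes stay attached to the specific asset they describe.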
 
 
==Resources==
*LIMS and FAIR: [[Journal:A roadmap for LIMS at NIST Material Measurement Laboratory]]
*ELNs and FAIR: [[Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation]]
*Biomedical software and FAIR: https://www.nature.com/articles/s41597-023-02463-x
*Making software workflows FAIR: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10538699/
*AWS and FAIR for healthcare and life sciences: https://aws.amazon.com/blogs/industries/implement-fair-scientific-data-principles-when-building-hcls-data-lakes/
*APIs and FAIR data: https://www.labguru.com/blog/fair-data-principles-and-apis
*Bioinformatics LIMS and FAIR: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425304/
*Labbit: https://labbit.com/fair-data-lims
 
* Extending FAIR to data graphics: https://www.nature.com/articles/s41597-022-01352-z
 
*Importance of metadata for FAIR data objects: https://riojournal.com/article/96075/
*Deep talk about metadata: [[Journal:Shared metadata for data-centric materials science]]
*More metadata, for findability: "While descriptive metadata may not be available, support for generalized CRUD operations requires essential structural and administrative metadata to be captured, stored, and made available for requestors. Metadata capture must be highly automated and reliable, both in terms of technical reliability and ensured metadata quality." [[Journal:Making data and workflows findable for machines]]
*More metadata, for reusability: "make recommendations for assigning identifiers and metadata that supports sample tracking, integration, and reuse. Our goal is to provide a practical approach to sample management, geared towards ecosystem scientists who contribute and reuse sample data." [[Journal:Sample identifiers and metadata to support data management and reuse in multidisciplinary ecosystem sciences]]
 
*"The principles should be considered during development of informatics systems to further promote data discovery and reuse. In Table 1, we have correlated the various BRICS functional components to the FAIR principles to illustrate the extent to which each of the components contributes towards the principles." [[Journal:Development of an informatics system for accelerating biomedical research]]
 
Restricted or personal information while still being FAIR
 
*[[Journal:FAIR Health Informatics: A health informatics framework for verifiable and explainable data analysis]]
*[[Journal:Restricted data management: The current practice and the future]]
 
*Linking databases of data that haven't seen proper "FAIR-ification" and metadata handling won't be as useful.
*Further discussion on data quality in the scope of FAIR: [[Journal:Towards a contextual approach to data quality]]


==Conclusion==
==Conclusion==
Line 96: Line 94:
==References==
==References==
{{Reflist|colwidth=30em}}
{{Reflist|colwidth=30em}}
 
<!---Place all category tags here-->
<!---
[[Category:LIMS Q&A articles (added in 2024)]]
[[Category:LIMS Q&A articles (all)]]
[[Category:LIMS Q&A articles on materials testing]] Place all category tags here-->

Latest revision as of 21:32, 30 April 2024

Sandbox begins below

FAIRResourcesGraphic AustralianResearchDataCommons 2018.png

Title: What are the potential implications of the FAIR data principles to laboratory informatics applications?

Author for citation: Shawn E. Douglas

License for content: Creative Commons Attribution-ShareAlike 4.0 International

Publication date: May 2024

Introduction

This brief topical article will examine

The "FAIR-ification" of research objects and software

First discussed during a 2014 FORCE-11 workshop dedicated to "overcoming data discovery and reuse obstacles," the FAIR Guiding Principles were published by Wilkinson et al. in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and information of all shapes and formats) become more universally findable, accessible, interoperable and reusable (FAIR) by both machines and people.[1] The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."[1]

Since 2016, other research stakeholders have taken to publishing their thoughts about how the FAIR principles apply to their fields of study and practice[2], including in ways beyond what perhaps was originally imagined by Wilkinson et al. For example, multiple authors have examined whether or not the software used in scientific endeavors itself can be considered a research object worth being developed and managed in tandem with the FAIR data principles.[3][4][5][6][7] Researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts, recognize that digital research objects go beyond data and information, and recognize "the specific nature of software" and not consider it "just data."[4] The end result has been an application of FAIR's core concepts to research software that differs from their application to data: because research software is more than just data, making it FAIR requires more nuance and a different type of planning than making digital data and information FAIR.

A 2019 survey by Europe's FAIRsFAIR found that researchers seeking and re-using relevant research software on the internet faced multiple challenges, including understanding and/or maintaining the necessary software environment and its dependencies, finding sufficient documentation, struggling with accessibility and licensing issues, having the time and skills to install and/or use the software, finding quality control of the source code lacking, and having an insufficient (or non-existent) software sustainability and management plan.[4] These challenges highlight the importance of software to researchers and other stakeholders, and the role FAIR has in better ensuring such software is findable, interoperable, and reusable, which in turn better ensures researchers' software-driven research is repeatable (by the same research team, with the same experimental setup), reproducible (by a different research team, with the same experimental setup), and replicable (by a different research team, with a different experimental setup).[4]

At this point, the topic of what "research software" represents must be addressed further, and, unsurprisingly, it's not straightforward. Ask 20 researchers what "research software" is, and you may get 20 different opinions. Some definitions can be more objectively viewed as too narrow, while others may be viewed as too broad, with some level of controversy inherent in any mutual discussion.[8][9][10] In 2021, as part of the FAIRsFAIR initiative, Gruenpeter et al. made a good-faith effort to define "research software" with the feedback of multiple stakeholders. Their efforts resulted in this definition:

Research software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during, or with a clear research intent, should be considered "software [used] in research" and not research software. This differentiation may vary between disciplines. The minimal requirement for achieving computational reproducibility is that all the computational components (i.e., research software, software used in research, documentation, and hardware) used during the research are identified, described, and made accessible to the extent that is possible.

Note that while the definition primarily recognizes software created during the research process, software created (whether by the research group, other open-source software developers outside the organization, or even commercial software developers) "for a research purpose" outside the actual research process is also recognized as research software. This notably can lead to disagreement about whether a proprietary, commercial spreadsheet or laboratory information management system (LIMS) offering that conducts analyses and visualizations of research data can genuinely be called research software, or simply classified as software used in research. van Nieuwpoort and Katz further elaborated on this concept, at least indirectly, by formally defining the roles of research software in 2023. Their definition of the various roles of research software—without using terms such as "open-source," "commercial," or "proprietary"—essentially further defined what research software is[10]:

  • Research software is a component of our instruments.
  • Research software is the instrument.
  • Research software analyzes research data.
  • Research software presents research results.
  • Research software assembles or integrates existing components into a working whole.
  • Research software is infrastructure or an underlying tool.
  • Research software facilitates distinctively research-oriented collaboration.

When considering these definitions[8][10] of research software and their adoption by other entities[11], it would appear that some laboratory informatics software (whether open-source or commercially proprietary) fills at least some of these roles in academic, military, and industry research laboratories of many types. In particular, electronic laboratory notebooks (ELNs) like open-source Jupyter Notebook or proprietary ELNs from commercial software developers fill the role of analyzing and visualizing research data, including developing molecular models for new promising research routes.[10] Even more advanced LIMS solutions that go beyond simply collating, auditing, securing, and reporting analytical results could conceivably fall under the umbrella of research software, particularly if many of the analytical, integration, and collaboration tools required in modern research facilities are included in the LIMS.

Ultimately, assuming that some laboratory informatics software can be considered research software and not just "software used in research," it's tough not to arrive at some deeper implications of research organizations' increasing need for FAIR data objects and software, particularly for laboratory informatics software and the developers of it.

Implications of the FAIR concept to laboratory informatics software

Non-relational Resource Description Framework (RDF) knowledge graph databases used in well-designed software help make research objects more FAIR.

- https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf

- https://biss.pensoft.net/article/37412/

- https://link.springer.com/article/10.1007/s40192-024-00348-4

- https://www.nature.com/articles/s41597-022-01352-z

- https://www.degruyter.com/document/doi/10.1515/jib-2018-0023/html

- https://arxiv.org/abs/2404.12935
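
As a rough illustration of the note above about RDF knowledge graphs, the following minimal sketch models a research object's metadata as subject-predicate-object triples and checks whether the metadata fields that support findability and reusability are present. It is only a sketch under stated assumptions: the example.org URIs, the sample record, the DOI-like identifier, and the helper names `match` and `missing_metadata` are all hypothetical, and a plain Python list stands in for a real RDF triple store. The predicate names are written in the style of Dublin Core and PROV vocabularies purely for readability.

```python
from typing import List, Optional, Tuple

# A triple is (subject, predicate, object), the basic unit of an RDF graph.
Triple = Tuple[str, str, str]

# Hypothetical research object and metadata, invented for illustration only.
DATASET = "https://example.org/dataset/42"
TRIPLES: List[Triple] = [
    (DATASET, "dcterms:identifier", "doi:10.0000/example.42"),
    (DATASET, "dcterms:title", "Example assay results"),
    (DATASET, "dcterms:license", "https://creativecommons.org/licenses/by/4.0/"),
    (DATASET, "dcterms:format", "text/csv"),
    (DATASET, "prov:wasGeneratedBy", "https://example.org/software/lims-export"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def missing_metadata(triples: List[Triple], subject: str,
                     required: List[str]) -> List[str]:
    """List the required predicates that the subject does not yet have."""
    present = {pred for (_, pred, _) in match(triples, s=subject)}
    return [pred for pred in required if pred not in present]


if __name__ == "__main__":
    # Which license applies to the dataset? (supports reusability)
    print(match(TRIPLES, s=DATASET, p="dcterms:license"))
    # Which assumed-required fields are still missing?
    print(missing_metadata(
        TRIPLES, DATASET,
        ["dcterms:identifier", "dcterms:license", "dcterms:creator"],
    ))
```

Because every statement about the object is stored in the same uniform shape, new kinds of metadata can be attached without changing a schema, which is one reason graph-based storage is often associated with FAIR-oriented design. A production system would more likely rely on an established RDF library and a SPARQL-capable store than on this toy pattern matcher.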


Resources

Restricted or personal information while still being FAIR

Conclusion

References

  1. Wilkinson, Mark D.; Dumontier, Michel; Aalbersberg, IJsbrand Jan; Appleton, Gabrielle; Axton, Myles; Baak, Arie; Blomberg, Niklas; Boiten, Jan-Willem et al. (15 March 2016). "The FAIR Guiding Principles for scientific data management and stewardship" (in en). Scientific Data 3 (1): 160018. doi:10.1038/sdata.2016.18. ISSN 2052-4463. PMC PMC4792175. PMID 26978244. https://www.nature.com/articles/sdata201618. 
  2. "fair data principles". PubMed Search. National Institutes of Health, National Library of Medicine. https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles. Retrieved 30 April 2024. 
  3. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html. 
  4. Gruenpeter, M. (23 November 2020). "FAIR + Software: Decoding the principles" (PDF). FAIRsFAIR “Fostering FAIR Data Practices In Europe”. https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf. Retrieved 30 April 2024. 
  5. Barker, Michelle; Chue Hong, Neil P.; Katz, Daniel S.; Lamprecht, Anna-Lena; Martinez-Ortiz, Carlos; Psomopoulos, Fotis; Harrow, Jennifer; Castro, Leyla Jael et al. (14 October 2022). "Introducing the FAIR Principles for research software" (in en). Scientific Data 9 (1): 622. doi:10.1038/s41597-022-01710-x. ISSN 2052-4463. PMC PMC9562067. PMID 36241754. https://www.nature.com/articles/s41597-022-01710-x. 
  6. Patel, Bhavesh; Soundarajan, Sanjay; Ménager, Hervé; Hu, Zicheng (23 August 2023). "Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool" (in en). Scientific Data 10 (1): 557. doi:10.1038/s41597-023-02463-x. ISSN 2052-4463. PMC PMC10447492. PMID 37612312. https://www.nature.com/articles/s41597-023-02463-x. 
  7. Du, Xinsong; Dastmalchi, Farhad; Ye, Hao; Garrett, Timothy J.; Diller, Matthew A.; Liu, Mei; Hogan, William R.; Brochhausen, Mathias et al. (6 February 2023). "Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software" (in en). Metabolomics 19 (2): 11. doi:10.1007/s11306-023-01974-3. ISSN 1573-3890. https://link.springer.com/10.1007/s11306-023-01974-3. 
  8. Gruenpeter, Morane; Katz, Daniel S.; Lamprecht, Anna-Lena; Honeyman, Tom; Garijo, Daniel; Struck, Alexander; Niehues, Anna; Martinez, Paula Andrea et al. (13 September 2021). "Defining Research Software: a controversial discussion". Zenodo. doi:10.5281/zenodo.5504016. https://zenodo.org/record/5504016. 
  9. "What is Research Software?". JuRSE, the Community of Practice for Research Software Engineering. Forschungszentrum Jülich. 13 February 2024. https://www.fz-juelich.de/en/rse/about-rse/what-is-research-software. Retrieved 30 April 2024. 
  10. van Nieuwpoort, Rob; Katz, Daniel S. (14 March 2023) (in en). Defining the roles of research software. doi:10.54900/9akm9y5-5ject5y. https://upstream.force11.org/defining-the-roles-of-research-software. 
  11. "Open source software and code". F1000 Research Ltd. 2024. https://www.f1000.com/resources-for-researchers/open-research/open-source-software-code/. Retrieved 30 April 2024.