Difference between revisions of "User:Shawndouglas/sandbox/sublevel13"

From LIMSWiki
Tag: Reverted
 
(34 intermediate revisions by the same user not shown)
Line 7: Line 7:


==Sandbox begins below==
<div class="nonumtoc">__TOC__</div>
[[File:|right|520px]]
'''Title''': ''Why are the FAIR data principles increasingly important to research laboratories and their software?''


==5. Taking the next step==
'''Author for citation''': Shawn E. Douglas
[[File:Possible features in web cooperation platforms like wikis.png|right|600px]]In section 3.4 of this guide, we briefly discussed how a user requirements specification (URS) fits into the process of purchasing [[laboratory informatics]] solutions for your [[ISO/IEC 17025]] [[laboratory]]. The URS has been viewed as a means for the purchaser to ensure their needs are satisfied by the functionality of the software. Traditionally, this has turned into a "wish list" for the purchaser, which, while somewhat practical, still lacks finesse. One common problem with this wishlist approach is the risk of "requirements creep," where more functionality than is truly necessary is desired, inevitably leading to a state where no vendor can meet all the wishlisted requirements. This makes selecting a solution even more difficult, particularly without significant prioritization skills.<ref name="AasemAnalysis10">{{cite journal |title=Analysis and optimization of software requirements prioritization techniques |author=Aasem, M.; Ramzan, M.; Jaffar, A. |journal=Proceedings from the 2010 International Conference on Information and Emerging Technologies |pages=1–6 |year=2010 |doi=10.1109/ICIET.2010.5625687}}</ref><ref name="Hirsch10Steps13">{{cite web |url=https://www.phase2technology.com/blog/successful-requirements-gathering |title=10 Steps To Successful Requirements Gathering |author=Hirsch, J. |publisher=Phase2 Technology, LLC |date=22 November 2013 |accessdate=20 January 2023}}</ref><ref name="BurrissSoftware07">{{cite web |url=http://sce2.umkc.edu/BIT/burrise/pl/requirements/ |archiveurl=https://web.archive.org/web/20190925003040/http://sce2.umkc.edu/BIT/burrise/pl/requirements/ |title=Requirements Specification |work=CS451R, University of Missouri–Kansas City |author=Burris, E. |publisher=University of Missouri–Kansas City |date=2007 |archivedate=25 September 2019 |accessdate=20 January 2023}}</ref>


Noting the potential problems with this wishlist approach, [[LII:LIMSpec 2022 R2|LIMSpec]]—a requirement specification document for laboratory informatics solutions—took a new approach and turned to standards and regulations that drive laboratories of all types, as well as the data they manage. LIMSpec was rebuilt based on [[ASTM E1578|ASTM E1578-18]] ''Standard Guide for Laboratory Informatics'', as well as dozens of other standards (including ISO/IEC 17025) and regulations, while still leaving room for a software buyer to add their own custom requirements for their industry or lab.  
'''License for content''': [https://creativecommons.org/licenses/by-sa/4.0/ Creative Commons Attribution-ShareAlike 4.0 International]


The rest of this chapter examines the research, documentation, and acquisition process that the ISO/IEC 17025 lab needing laboratory informatics solutions will want to go through, with an emphasis on the utility of a sound requirements specification. While LIMSpec is offered as a solid starting point, you don't strictly need to use LIMSpec to conduct this process; the information in this chapter can largely be applied with or without LIMSpec itself. However, as section 3.1.2 of Chapter 3 pointed out, LIMSpec was built with standards like ISO/IEC 17025 in mind, making it relevant to the lab using software to better conform to the standard.
'''Publication date''': May 2024


==Introduction==


===5.1 Conduct initial research into a specification document tailored to your lab's needs===
==The growing importance of the FAIR principles to research laboratories==
A specification is "a detailed precise presentation of something or of a plan or proposal for something."<ref name="MWSpec">{{cite web |url=https://www.merriam-webster.com/dictionary/specification |title=specification |work=Merriam-Webster |publisher=Merriam-Webster, Inc |accessdate=20 January 2023}}</ref> This concept of a specification as a presentation is critical to the laboratory seeking to find laboratory informatics software that fulfills their needs; they "present" their use case with the help of a requirements specification, and the vendor "presents" their ability (or inability) to comply through documentation and demonstration (more on that later). However, even the most seasoned of presenters at conferences and the like still require quality preparation before the presentation. This is where initial specification research comes into play for the lab.
The [[Journal:The FAIR Guiding Principles for scientific data management and stewardship|FAIR data principles]] were published by Wilkinson ''et al.'' in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and [[information]] of all shapes and formats) become more universally findable, accessible, interoperable, and reusable (FAIR) by both machines and people.<ref name="WilkinsonTheFAIR16">{{Cite journal |last=Wilkinson |first=Mark D. |last2=Dumontier |first2=Michel |last3=Aalbersberg |first3=IJsbrand Jan |last4=Appleton |first4=Gabrielle |last5=Axton |first5=Myles |last6=Baak |first6=Arie |last7=Blomberg |first7=Niklas |last8=Boiten |first8=Jan-Willem |last9=da Silva Santos |first9=Luiz Bonino |last10=Bourne |first10=Philip E. |last11=Bouwman |first11=Jildau |date=2016-03-15 |title=The FAIR Guiding Principles for scientific data management and stewardship |url=https://www.nature.com/articles/sdata201618 |journal=Scientific Data |language=en |volume=3 |issue=1 |pages=160018 |doi=10.1038/sdata.2016.18 |issn=2052-4463 |pmc=PMC4792175 |pmid=26978244}}</ref> The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... 
is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."<ref name="WilkinsonTheFAIR16" /> Since being published, other researchers have taken the somewhat broad set of principles and refined them to their own scientific disciplines, as well as to other types of research objects, including the research software being used by those researchers to generate research objects.<ref name="NIHPubMedSearch">{{cite web |url=https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles |title=fair data principles |work=PubMed Search |publisher=National Institutes of Health, National Library of Medicine |accessdate=30 April 2024}}</ref><ref name="HasselbringFromFAIR20">{{Cite journal |last=Hasselbring |first=Wilhelm |last2=Carr |first2=Leslie |last3=Hettrick |first3=Simon |last4=Packer |first4=Heather |last5=Tiropanis |first5=Thanassis |date=2020-02-25 |title=From FAIR research data toward FAIR and open research software |url=https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html |journal=it - Information Technology |language=en |volume=62 |issue=1 |pages=39–47 |doi=10.1515/itit-2019-0040 |issn=2196-7032}}</ref><ref name="GruenpeterFAIRPlus20">{{Cite web |last=Gruenpeter, M. |date=23 November 2020 |title=FAIR + Software: Decoding the principles |url=https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf |format=PDF |publisher=FAIRsFAIR “Fostering FAIR Data Practices In Europe” |accessdate=30 April 2024}}</ref><ref name=":0">{{Cite journal |last=Barker |first=Michelle |last2=Chue Hong |first2=Neil P. |last3=Katz |first3=Daniel S. 
|last4=Lamprecht |first4=Anna-Lena |last5=Martinez-Ortiz |first5=Carlos |last6=Psomopoulos |first6=Fotis |last7=Harrow |first7=Jennifer |last8=Castro |first8=Leyla Jael |last9=Gruenpeter |first9=Morane |last10=Martinez |first10=Paula Andrea |last11=Honeyman |first11=Tom |date=2022-10-14 |title=Introducing the FAIR Principles for research software |url=https://www.nature.com/articles/s41597-022-01710-x |journal=Scientific Data |language=en |volume=9 |issue=1 |pages=622 |doi=10.1038/s41597-022-01710-x |issn=2052-4463 |pmc=PMC9562067 |pmid=36241754}}</ref><ref name=":1">{{Cite journal |last=Patel |first=Bhavesh |last2=Soundarajan |first2=Sanjay |last3=Ménager |first3=Hervé |last4=Hu |first4=Zicheng |date=2023-08-23 |title=Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool |url=https://www.nature.com/articles/s41597-023-02463-x |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=557 |doi=10.1038/s41597-023-02463-x |issn=2052-4463 |pmc=PMC10447492 |pmid=37612312}}</ref><ref name=":2">{{Cite journal |last=Du |first=Xinsong |last2=Dastmalchi |first2=Farhad |last3=Ye |first3=Hao |last4=Garrett |first4=Timothy J. |last5=Diller |first5=Matthew A. |last6=Liu |first6=Mei |last7=Hogan |first7=William R. |last8=Brochhausen |first8=Mathias |last9=Lemas |first9=Dominick J. |date=2023-02-06 |title=Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software |url=https://link.springer.com/10.1007/s11306-023-01974-3 |journal=Metabolomics |language=en |volume=19 |issue=2 |pages=11 |doi=10.1007/s11306-023-01974-3 |issn=1573-3890}}</ref>


Your lab's requirements specification document will eventually be a critical component for effectively selecting a [[laboratory informatics]] solution. There are numerous ways to approach the overall development of such a document. But why re-invent the wheel when others have already gone down that road? Sure, you could search for examples of such documents on the internet and customize them to your needs, or you and your team could brainstorm how a laboratory informatics solution should help your ISO/IEC 17025 lab accomplish its goals. LIMSpec makes for one of the more thorough starting points to use, though you could also use other structured documents that have been developed by others. For the purposes of this guide, we'll look at LIMSpec.
But why are research laboratories increasingly pushing for more findable, accessible, interoperable, and reusable research objects and software? The short answer, as evidenced by the Wilkinson ''et al.'' quote above, is that greater innovation can be gained through improved knowledge discovery. The discovery process necessary for that greater innovation, whether through traditional research methods or [[artificial intelligence]] (AI)-driven methods, is enhanced when research objects and software are compatible with the core ideas of FAIR.<ref name="WilkinsonTheFAIR16" /><ref name="OlsenEmbracing23">{{cite web |url=https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness |title=Embracing FAIR Data on the Path to AI-Readiness |author=Olsen, C. |work=Pharma's Almanac |date=01 September 2023 |accessdate=03 May 2024}}</ref><ref name="HuertaFAIRForAI23">{{Cite journal |last=Huerta |first=E. A. |last2=Blaiszik |first2=Ben |last3=Brinson |first3=L. Catherine |last4=Bouchard |first4=Kristofer E. |last5=Diaz |first5=Daniel |last6=Doglioni |first6=Caterina |last7=Duarte |first7=Javier M. |last8=Emani |first8=Murali |last9=Foster |first9=Ian |last10=Fox |first10=Geoffrey |last11=Harris |first11=Philip |date=2023-07-26 |title=FAIR for AI: An interdisciplinary and international community building perspective |url=https://www.nature.com/articles/s41597-023-02298-6 |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=487 |doi=10.1038/s41597-023-02298-6 |issn=2052-4463 |pmc=PMC10372139 |pmid=37495591}}</ref>


The [[Book:LIMSpec 2022 R2|LIMSpec 2022]] document is divided into five distinct sections, with numerous subsections in each:
A slightly longer answer, suitable for a Q&A topic, requires looking at a few more details of the FAIR principles as applied to both research objects and research software. Research laboratories, whether located in an organization or contracted out as third parties, exist to innovate. That innovation can come in the form of discovering new materials that may or may not have a future application, developing a pharmaceutical to improve patient outcomes for a particular disease, or modifying (for some sort of improvement) an existing food or beverage recipe, among others. In academic research labs, this usually looks like knowledge advancement and the publishing of research results, whereas in industry research labs, this typically looks like more practical applications of research concepts to new or existing products or services. In both cases, research software was likely involved at some point, whether it be something like a researcher-developed [[bioinformatics]] application or a commercial vendor-developed [[electronic laboratory notebook]] (ELN).


* Primary Laboratory Workflow
===FAIR research objects===
** 1. Sample and experiment registration
Regarding research objects themselves, the FAIR principles essentially say "vast amounts of data and information in largely heterogeneous formats spread across disparate sources both electronic and paper make modern research workflows difficult, tedious, and at times impossible. Further, repeatability, reproducibility, and replicability of openly published or secure internal research results is at risk, giving less confidence to academic peers in the published research, or less confidence to critical stakeholders in the viability of a researched prototype." As such, research objects (which include not only their inherent data and information but also any [[metadata]] that describe features of that data and information) need to be<ref name="Rocca-SerraFAIRCook22">{{Cite book |last=Rocca-Serra, Philippe |last2=Sansone, Susanna-Assunta |last3=Gu, Wei |last4=Welter, Danielle |last5=Abbassi Daloii, Tooba |last6=Portell-Silva, Laura |date=2022-06-30 |title=D2.1 FAIR Cookbook |url=https://zenodo.org/record/6783564 |chapter=Introducing the FAIR Principles |doi=10.5281/ZENODO.6783564}}</ref>:
** 2. Sample management
** 3. Core laboratory testing and experiments
** 4. Results review and verification
** 5. Sample, experiment, and study approval and verification
** 6. Reporting
* Maintaining Laboratory Workflow and Operations
** 7. Document and records management
** 8. Resource management
** 9. Compliance management
** 10. Instrument and equipment management
** 11. Batch and lot management
** 12. Scheduled event management
** 13. Instrument data capture and control
** 14. Standard and reagent management
** 15. Inventory management
** 16. Investigation and quality management
* Specialty Laboratory Functions (minus non-relevant industries)
** 17. Production management
** 18. Statistical trending and control charts
** 19. Agriculture and food data management
** 20. Environmental data management
** 21. Forensic case and data management
** 22. Clinical and public health data management
** 23. Veterinary data management
** 24. Scientific data management
** 25. Health information technology
* Technology and Performance Improvements
** 26. Instrument data systems functions
** 27. Systems integration
** 28. Laboratory scheduling and capacity planning
** 29. Lean laboratory and continuous improvement
** 30. Artificial intelligence and smart systems
* Security and Integrity of Systems and Operations
** 31. Data integrity
** 32. Configuration management
** 33. System validation and commission
** 34. System administration
** 35. Cybersecurity
** 36. Information privacy


These sections and subsections should be able to address most any requirement you have for your system. Of course, if something isn't covered by LIMSpec, you can always add additional requirements.  
*''findable'', with globally unique and persistent identifiers, rich metadata that link to the identifier of the data described, and an ability to be indexed as an effectively searchable resource;
*''accessible'', being able to be retrieved (including metadata of data that is no longer available) by identifiers using secure standardized communication protocols that are open, free, and universally implementable with authentication and authorization mechanisms;
*''interoperable'', represented using formal, accessible, shared, and relevant language models and vocabularies that abide by FAIR principles, as well as with qualified linkage to other metadata; and
*''reusable'', being richly described by accurate and relevant metadata, released with a clear and accessible data usage license, associated with sufficiently detailed provenance information, and compliant with discipline-specific community standards.
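
As a rough illustration of how the four bullet points above might translate into a practical check, the following minimal Python sketch looks for missing or empty fields in a metadata record. The field names (<code>identifier</code>, <code>access_protocol</code>, <code>vocabulary</code>, and so on), the grouping of fields under each principle, and the example record are all assumptions made for illustration; they do not come from the FAIR principles, the FAIR Cookbook, or any formal FAIR assessment tool.

```python
from typing import Any

# Hypothetical field names grouped by FAIR principle. These are illustrative
# assumptions only and are not taken from any formal FAIR metric or schema.
FAIR_FIELDS: dict[str, tuple[str, ...]] = {
    "findable": ("identifier", "title", "keywords"),
    "accessible": ("access_url", "access_protocol"),
    "interoperable": ("vocabulary", "related_identifiers"),
    "reusable": ("license", "provenance", "community_standard"),
}


def fair_gaps(record: dict[str, Any]) -> dict[str, list[str]]:
    """Return, per principle, the illustrative fields that are missing or empty."""
    gaps: dict[str, list[str]] = {}
    for principle, fields in FAIR_FIELDS.items():
        # A field counts as a gap if it is absent or holds an empty value.
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# A made-up metadata record for a research object (all values are invented).
example_record = {
    "identifier": "doi:10.0000/example",
    "title": "Tensile strength measurements, batch 12",
    "keywords": ["materials science", "tensile strength"],
    "access_url": "https://repository.example.org/records/12",
    "access_protocol": "https",
    "vocabulary": "",
    "license": "CC-BY-4.0",
}

print(fair_gaps(example_record))
```

A check like this only confirms that certain fields are populated; it says nothing about whether an identifier is truly persistent or whether a vocabulary is appropriate for the discipline, which is where discipline-specific guidance comes in.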


During the initial research towards your URS, you won't need to include every requirement when you approach potential vendors. Most vendors appreciate a more inviting approach that doesn't overwhelm, at least initially. You will want to go with a limited yet practical set of requirements carefully chosen because they matter to you and your laboratory the most. In the case of a lab seeking a solution to help them better comply with ISO/IEC 17025, one might want to make requirements related to the standard the practical starting point. (See Chapter 3, section 3.1.2, Table 1 for the full list.) Wherever you choose to start, you'll likely want to wait until after participating in several software demonstrations before even considering your URS to be complete. (More on that in 5.3.1.) This naturally leads us to a discussion about the request for information (RFI) process.
All that talk of unique persistent identifiers, communication protocols, authentication mechanisms, language models (e.g., [[Ontology (information science)|ontology]] languages), standardized vocabularies, provenance information, and more could make one's head spin. And, to be fair, it has been challenging for research groups to adopt FAIR, with few widespread international efforts to translate the FAIR principles to broad research. The FAIR Cookbook represents one example of such an international collaborative effort, providing "a combination of guidance, technical, hands-on, background and review types to cover the operation steps of FAIR data management."<ref name="Rocca-SerraFAIRCook22-1">{{Cite book |last=Rocca-Serra, Philippe |last2=Sansone, Susanna-Assunta |last3=Gu, Wei |last4=Welter, Danielle |last5=Abbassi Daloii, Tooba |last6=Portell-Silva, Laura |date=2022-06-30 |title=D2.1 FAIR Cookbook |url=https://zenodo.org/record/6783564 |chapter=Introduction |doi=10.5281/ZENODO.6783564}}</ref> In fact, the Cookbook is illustrative of the challenges of implementing FAIR in research laboratories, particularly given the diverse array of vocabularies used across the wealth of scientific disciplines, such as [[biobanking]], [[biomedical engineering]], [[botany]], [[food science]], and [[materials science]]. The way a botanical research organization makes its research objects FAIR is going to require a different set of tools than that of a materials science research organization. But all of them will turn to [[Informatics (academic field)|informatics]] tools, data management plans, database tools, and more to not only massage existing research objects to be FAIR but also better ensure newly created research objects are FAIR as well.


===FAIR research software===
Discussion on research software and its FAIRness is more complicated. It is beyond the scope of this article to go into greater detail about the concepts surrounding FAIR research software, but a brief overview will be attempted. When the FAIR principles were first published, the framework was largely being applied to research objects. However, researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts. This led to recognizing that digital research objects go beyond data and information, and that there is a "specific nature of software" used in research; that research software should not be considered "just data."<ref name="GruenpeterFAIRPlus20" /> The end result has seen researchers begin to apply the core concepts of FAIR to research software, but slightly differently from research objects.<ref name="NIHPubMedSearch" /><ref name="HasselbringFromFAIR20" /><ref name="GruenpeterFAIRPlus20" /><ref name=":0" /><ref name=":1" /><ref name=":2" />


===5.2 Issue some of the specification as part of a request for information (RFI)===
Unsurprisingly, what researchers consider to be "research software" for purposes of FAIR has historically been interpreted numerous ways. Does the commercial spreadsheet software used to make calculations on research data deserve to be called research software in parallel with the lab-developed bioinformatics application used to generate that data? Given the difficulties of gaining a consensus definition of the term, a 2021 international initiative called FAIRsFAIR made a good-faith effort to define "research software" with the feedback of multiple stakeholders. The short version of their resulting definition is that "[r]esearch software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose."<ref name="GruenpeterDefining21">{{Cite journal |last=Gruenpeter, Morane |last2=Katz, Daniel S. |last3=Lamprecht, Anna-Lena |last4=Honeyman, Tom |last5=Garijo, Daniel |last6=Struck, Alexander |last7=Niehues, Anna |last8=Martinez, Paula Andrea |last9=Castro, Leyla Jael |last10=Rabemanantsoa, Tovo |last11=Chue Hong, Neil P. |date=2021-09-13 |title=Defining Research Software: a controversial discussion |url=https://zenodo.org/record/5504016 |journal=Zenodo |doi=10.5281/zenodo.5504016}}</ref> Of note is the last part, acknowledging that research software can be developed in the lab during the research process or developed beforehand by, for example, a commercial software developer with a strong purpose of being used for research. As such, Microsoft Excel may not be looked upon as research software, but an ELN or [[laboratory information management system]] (LIMS) thoughtfully developed with research activities in mind could be considered research software. More often than not, research software is going to be developed in-house. 
A growing push for the FAIRification of that software, as well as commercial research solutions, has seen the emergence of "research software engineering" as a domain of practice.<ref name="MoynihanTheHitch20">{{cite web |url=https://invenia.github.io/blog/2020/07/07/software-engineering/ |title=The Hitchhiker’s Guide to Research Software Engineering: From PhD to RSE |author=Moynihan, G. |work=Invenia Blog |publisher=Invenia Technical Computing Corporation |date=07 July 2020}}</ref><ref name="WoolstonWhySci22">{{Cite journal |last=Woolston |first=Chris |date=2022-05-31 |title=Why science needs more research software engineers |url=https://www.nature.com/articles/d41586-022-01516-2 |journal=Nature |language=en |pages=d41586–022–01516-2 |doi=10.1038/d41586-022-01516-2 |issn=0028-0836}}</ref> While in the past, broadly speaking, researchers often cobbled together research software with less of a focus on quality and reproducibility than on getting their research published, today's push for FAIR data and software by academic journals, institutions, and other researchers seeking to collaborate has placed a much greater focus on the concept of "better software, better research"<ref name="WoolstonWhySci22" /><ref name="CohenTheFour21">{{Cite journal |last=Cohen |first=Jeremy |last2=Katz |first2=Daniel S. |last3=Barker |first3=Michelle |last4=Chue Hong |first4=Neil |last5=Haines |first5=Robert |last6=Jay |first6=Caroline |date=2021-01 |title=The Four Pillars of Research Software Engineering |url=https://ieeexplore.ieee.org/document/8994167/ |journal=IEEE Software |volume=38 |issue=1 |pages=97–105 |doi=10.1109/MS.2020.2973362 |issn=0740-7459}}</ref>, with research software engineering efforts focusing on that concept as being vital to future research outcomes. 
Cohen ''et al.'' add that "ultimately, good research software can make the difference between valid, sustainable, reproducible research outputs and short-lived, potentially unreliable or erroneous outputs."<ref name="CohenTheFour21" />
In some cases—particularly if your organization is of significant size—it may make sense to issue a formal RFI or request for proposal (RFP) and have laboratory informatics vendors approach your ISO/IEC 17025 lab with how they can meet its needs. The RFI and RFP are traditional means towards soliciting bidding interest in an organization's project, containing the organization's specific requirements and vital questions that the bidder should be able to effectively answer. However, even if your organization chooses to skip the RFI or RFP process and do most of the investigative work of researching and approaching informatics vendors, turning to a key set of questions typically found in an RFI is extremely valuable towards your fact finding.


An RFI is an ideal means for learning more about a potential solution and how it can solve your problems, or for when you're not even sure how to solve your problem yet. However, the RFI should not be unduly long and tedious to complete for prospective vendors; it should be concise, direct, and honest. This means not only presenting a clear and humble vision of your own organization and its goals, but also asking just the right number of questions to allow potential vendors to demonstrate their expertise and provide a clearer picture of who they are. Some take a technical approach to an RFI, using dense language and complicated spreadsheets for fact finding. However, as previously noted, you will want to limit the specified requirements in your RFI to those carefully chosen (for example, ISO/IEC 17025-specific requirements) because they matter to you and your lab the most.<ref name="HolmesItsAMatch">{{cite web |url=https://allcloud.io/blog/its-a-match-how-to-run-a-good-rfi-rfp-or-rfq-and-find-the-right-partner/ |title=It's a Match: How to Run a Good RFI, RFP, or RFQ and Find the Right Partner |author=Holmes, T. |work=AllCloud Blog |accessdate=20 January 2023}}</ref>
Hasselbring ''et al.'' note that "it is essential [for academic research groups] to publish research software in addition to research data," to increase trust in the peer review system, build new research on top of existing research, and ensure greater reproducibility of any published results.<ref name="HasselbringFromFAIR20" /> They extend FAIR data principles to FAIR research software, noting that<ref name="HasselbringFromFAIR20" />:


Remember, an RFI is not meant to answer all of your questions. The RFI is meant as a means to help narrow down your search to a few quality candidates while learning more about each other.<ref name="HolmesItsAMatch" /> Once the pool of potential software vendors is narrowed down and you have participated in their demonstrations, you can then add more requirements to the original collection of critical requirements from the RFI to ensure those providers meet all or most of your needs. That said, be cognizant that there may be no vendor that can meet each and every need of your lab. Your lab will have to make important decisions about which requirements are non-negotiable and which are more flexible. The vendors you engage with may be able to provide realistic advice in this regard, based upon your lab's requirements and their past experience with labs. As such, those vendors with real-world experience meeting the needs of ISO/IEC 17025 laboratories may have a strong leg up on other vendors.
*''findable'' software acknowledges that "the first step in (re)using ... software is to find it";
*''accessible'' software acknowledges that once found, the researcher needs to know how to best access the software, recognizing authentication or authorization mechanisms may need to be in place;
*''interoperable'' software acknowledges that the software will need to eventually integrate with other research objects and software, demanding FAIR-driven methods and tools in the software's development; and
*''reusable'' software acknowledges that the software will need to not only produce research objects that can be reused, combined, and extended, but that the software itself should have metadata that helps make it retrievable and reusable.


Again, a comprehensive specifications document like LIMSpec makes for one possible source from which you can draw the requirements that are most critical to be addressed in an RFI. If you have zero experience developing an RFI, you may want to first seek out LIMSpec and various example RFIs on the internet, as well as some basic advice articles on the topic. Some websites may provide templates to examine for further details. Broadly speaking, if you're conducting a full RFI or RFP, you're going to lead with the standard components of an RFI or RFP, including:
The applicability of these principles is clear to academic research software developed in-house, with the concept of open science driving FAIR development and release of that software, including on platforms like GitHub.<ref name="HasselbringFromFAIR20" /> It's less clear for commercial developers making research software. The growing prevalence of FAIR data and software practices in research laboratories doesn't mean commercial developers are going to suddenly take an open-source approach to their code, and it doesn't mean academic and institutional research labs are going to give up the benefits of the open-source paradigm as applied to research software.<ref name="HasselbringFromFAIR20" /> However, both research software development paradigms stand to gain from the shift to more FAIR data and software.<ref name="MoynihanTheHitch20" /> Additionally, if commercial vendors of research software want to continue to competitively market relevant and sustainable research software to research labs, they frankly have little choice but to commit extra resources to learning about the application of FAIR principles to their offerings tailored to FAIR-abiding research labs.


* a table of contents;
===FAIRer research objects + better software = the potential for greater innovation===
* an honest introduction and overview of your organization, its goals and problems, and the services sought to solve them;
As stated at the beginning of this article, greater research innovation can be gained through improved knowledge discovery, which is enabled by FAIR research objects and software. The FAIR principles say that when data and software are created, managed, updated, and developed such that they are more findable, accessible, interoperable, and reusable, researchers and other stakeholders benefit. Published research results are more reputable, reproducible, and reusable, benefiting the overall research community. However, this extends beyond academic research. The provenance of industry research (e.g., as with the pharmaceutical industry) performed with the help of and documented within ELNs and other research management software is better maintained using FAIR principles. As a result, clinical and preclinical studies are more reproducible, ensuring proper funneling of research funding, limiting resource waste, and limiting potential suffering of research participants.<ref>{{Cite journal |last=Sahoo |first=Satya S. |last2=Valdez |first2=Joshua |last3=Kim |first3=Matthew |last4=Rueschman |first4=Michael |last5=Redline |first5=Susan |date=2019-01 |title=ProvCaRe: Characterizing scientific reproducibility of biomedical research studies using semantic provenance metadata |url=https://linkinghub.elsevier.com/retrieve/pii/S1386505618302697 |journal=International Journal of Medical Informatics |language=en |volume=121 |pages=10–18 |doi=10.1016/j.ijmedinf.2018.10.009 |pmc=PMC6343667 |pmid=30545485}}</ref> Finally, patients suffering from rare diseases may benefit from FAIRer data practices that help prevent the data silos of testing, medical device use, patient outcomes, treatment history, and clinical trial history data. 
If these types of data were made more FAIR, "new diagnostics, treatments, and health care policies to benefit patients" could be developed, at the same time empowering those patients to take their health care journey into their own hands.<ref>{{Cite journal |last=van Lin |first=Nawel |last2=Paliouras |first2=Georgios |last3=Vroom |first3=Elizabeth |last4=’t Hoen |first4=Peter A.C. |last5=Roos |first5=Marco |date=2021-11-02 |title=How Patient Organizations Can Drive FAIR Data Efforts to Facilitate Research and Health Care: A Report of the Virtual Second International Meeting on Duchenne Data Sharing, March 3, 2021 |url=https://www.medra.org/servlet/aliasResolver?alias=iospress&doi=10.3233/JND-210721 |journal=Journal of Neuromuscular Diseases |volume=8 |issue=6 |pages=1097–1108 |doi=10.3233/JND-210721 |pmc=PMC8673524 |pmid=34334415}}</ref> However, in all these cases, laboratories are involved, and their software's ability to effectively ensure FAIR research objects are created is vital. As such, the implications of FAIR research objects and software on modern research laboratories' operations are undoubtable. Greater innovation and improved patient outcomes are only part of the benefits to society.
* details on how the RFI or RFP evaluation process will be conducted;
* basis for award (if an RFP);
* the calendar schedule (including times) for related events;
* how to submit the document and any related questions about it, including response format; and
* your organization's background, business requirements, and current technical environment.


Being honest about your organization, its informatics requirements, and its current technical environment upfront in the RFI or RFP will also ensure that the time spent on the process is optimized for all involved parties. Before submitting any RFI, your lab will want to conduct thorough internal research ensuring everyone understands what the current technology and processes are, and how you all want to shape that with the introduction or updating of laboratory informatics systems. (If your lab has limited to no experience with adding [[Laboratory automation|automation]] and informatics elements to a laboratory, you may want to read through laboratory informatics veteran Joe Liscouski's [[LII:The Application of Informatics to Scientific Work: Laboratory Informatics for Newbies|''The Application of Informatics to Scientific Work: Laboratory Informatics for Newbies'']] for further insight.) You'll also want to answer critical questions such as "who will be responsible for maintaining the solution and its security?" and "how will our processes and procedures change with the introduction or updating of informatics systems?". These and other questions make up your business considerations, which should also address the:
==Conclusion==


* acquisition and long-term maintenance budget;
* diversity of laboratory services offered now and into the future;
* level of in-house knowledge and experience with informatics systems and automation;
* level of in-house, executive buy-in of informatics adoption; and
* need for additional vendor pre-planning.


One other note: make it clear in any issued RFI that it's strictly a request for information and not a guarantee to issue a contract with any respondent.
===5.3 Respond to or open dialogue with vendors===
If you went the route of the RFI, you hopefully received more than a few well-crafted responses. Your RFI presumably included a small but critical set of requirements that needed to be addressed, and the vendors who responded dutifully addressed those critical requirements. Even if you didn't send out an RFI, you at least did your own research about some of the big players in the laboratory informatics space, and you may have even opened an initial dialogue with a few of them. If all has gone well, you're now at the point where you've narrowed down the pool of vendors but still have a basket of them to continue dialogue with. (If you're not comfortably at this point after an RFI or engagements with multiple vendors, you may need to either reconsider the effectiveness of your RFI or engagements or enlist help from a knowledgeable and experienced consultant to help steer you back on-course.)
As dialogue continues with vendors, you'll have two key questions to address:
1. What do I want their [[laboratory information management system]] (LIMS) to do for me?
2. How does their solution fit into our previously discussed budget?
Regarding question one, you've already laid some of the groundwork for that with the help of your handful of critical requirements (and the associated research that went into developing them). Outside of those critical requirements, a laboratory informatics solution should also provide clearly definable benefits to how you operate your ISO/IEC 17025 laboratory. These expected benefits should tie in with your overall business mission and goals. Using a LIMS as an example, here are a few of the benefits a well-developed LIMS can provide to practically any laboratory. Whenever you go through the discovery process with a vendor, you'll be asking how their system provides these and other benefits through its functionality. A quality LIMS can provide<ref name="McLelland98">{{cite web |url=http://www.rsc.org/pdf/andiv/tech.pdf |archiveurl=https://web.archive.org/web/20131004232754/http://www.rsc.org/pdf/andiv/tech.pdf |format=PDF |title=What is a LIMS - a laboratory toy, or a critical IT component? |author=McLelland, A. |publisher=Royal Society of Chemistry |page=1 |date=1998 |archivedate=04 October 2013 |accessdate=20 January 2023}}</ref><ref name="SciCompRisksBens">{{cite journal |title=Industry Insights: Examining the Risks, Benefits and Trade-offs of Today’s LIMS |journal=Scientific Computing |author=Joyce, J.R. |issue=January/February 2010 |pages=15–23 |year=2010}}</ref>:
* increased accuracy: the minimization or elimination of transcription and other errors;
* streamlined processes: ensuring each process step in a protocol/method is completed in the proper order, with all requirements met, updating sample statuses automatically;
* automation: integration with instruments, allowing for automatic uploading of samples and returning of results;
* regulatory and standards compliance: functionality that aids with compliance, including reporting results to state and local authorities;
* data security: role-based, configurable, secure access to data, processes, reporting, etc.;
* flexible reporting: reporting tools that allow for the design and generation of certificates of analysis and other reports to lab- and regulation-based specs;
* instant data retrieval: query tools for finding data instantly according to any criteria (date range, test, product type, etc.); and
* configurability and cost-effectiveness: a user-configurable system (as opposed to hard-coded, requiring development for any modifications) that is flexible enough to adapt to rapid changes in test volume and type over time, without breaking the bank.
As for the second question, budgeting is always a tricky topic, both internally and when discussing it with vendors. We already mentioned in the previous section that the acquisition and long-term maintenance budget of your solution(s) must be addressed as part of your lab's business considerations. (And we already mentioned some cost considerations in 3.1.6; this discussion will add a few more points.) The fact that laboratory informatics systems like the LIMS come in all kinds of price ranges makes it difficult to judge if a given system, as priced, is appropriate for your lab and its budget. There are some basic cost realities associated with LIMS acquisition<ref name="CSolsHowMuch17">{{cite web |url=https://www.slideshare.net/CSolsInc/how-much-does-a-lims-cost-licensing-and-beyond-pittcon-2017-tech-talk |title=How Much Does a LIMS Cost? Licensing and Beyond |author=Rosenberg, H.J. |work=SlideShare |date=28 March 2017 |accessdate=20 January 2023}}</ref><ref name="CSolsSaving18">{{cite web |url=https://www.csolsinc.com/blog/saving-costs-with-lims/ |title=Saving Costs with LIMS |publisher=CSols, Inc |date=25 October 2018 |accessdate=20 January 2023}}</ref>, which will help you understand where the vendor price comes from and how it figures into your lab's budget (though some of these concepts may also apply to other informatics systems).
:1. Vendor pricing is generally based on how many users will be using the system. This can be measured in concurrent users (how many will be using the system at any one time) or named users (the number of total users who will ever use the system, by name). Additionally, laboratory informatics vendors increasingly offer the option of a [[Cloud computing|cloud-hosted]] subscription, which has the advantage of not requiring your own IT department and allows labs to defray cost over time, with little or no actual license fee. Think about your usage strategy and choose the pricing format that makes the most sense for you.
:2. Most costs are related to the work involved with installing, configuring, and migrating data to the system. Try to choose a solution that has what you need out of the box, as much as possible. The more customized or unique options you ask for up-front, the more it tends to cost, as extra items are a function of the time it takes developers to add them.
:3. "User-configurable" beats "vendor-configurable" on cost-effectiveness. Some vendors offer a free or low-cost option, but don't be fooled. They are in business to make money, and they are counting on the fact that you'll need to pay them to make things work, add necessary functionality, and provide support and training. If you can find a vendor who offers a genuinely user-configurable system, and whose manuals and other support materials are clearly helpful and available so that you can adjust things the way you want, when you want, then that will go a long way toward budget efficiency and longevity.
:4. Additional interfaces and reporting requirements cost money. If necessary, consider phasing in any additional instrument and software interfaces over time, as revenue eases cash flow. You can go live with your system operations more quickly, entering results manually until you can afford to interface your instruments one-by-one. This goes for reports as well; a simple reporting module that meets regulatory requirements will do. You can make your reports and other exportable documents more attractive later.
Ideally, your budget has room for roughly $40,000 to $80,000 at minimum (including setup, training, interfaces, etc.) for a quality, full-featured professional LIMS or LIS, with $300 to $900 per month (depending on number of users) for ongoing subscriptions. At around five concurrent users, the economics start to favor purchasing perpetual licenses rather than paying for a subscription. Purchased licenses will also entail ongoing annual or monthly costs (e.g., maintenance, support, and warranty for updates). Subscriptions (if available) are generally aimed at smaller labs. If you will be growing and scaling up, a subscription may be a great way to get started, but make sure you have the option to switch to perpetual licenses later.
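The subscription-versus-perpetual trade-off can be sketched as a simple break-even calculation. The following is only a rough illustration: the $300 and $900 monthly subscription fees come from the range above, while the $30,000 license cost and $400 monthly maintenance fee are hypothetical placeholders, not vendor figures. Setup, training, and interface costs are assumed to be the same under either option and are left out.

```python
def cumulative_subscription_cost(monthly_fee, months):
    """Total paid for a subscription after the given number of months."""
    return monthly_fee * months


def cumulative_perpetual_cost(license_cost, monthly_maintenance, months):
    """Total paid for a perpetual license plus ongoing maintenance."""
    return license_cost + monthly_maintenance * months


def break_even_month(monthly_fee, license_cost, monthly_maintenance, horizon_months=240):
    """Return the first month in which subscribing has cost at least as much
    as buying, or None if that never happens within the horizon."""
    for month in range(1, horizon_months + 1):
        subscription = cumulative_subscription_cost(monthly_fee, month)
        perpetual = cumulative_perpetual_cost(license_cost, monthly_maintenance, month)
        if subscription >= perpetual:
            return month
    return None


if __name__ == "__main__":
    # Hypothetical perpetual option: $30,000 license, $400/month maintenance.
    print(break_even_month(900, 30_000, 400))  # larger lab, higher subscription tier
    print(break_even_month(300, 30_000, 400))  # small lab, lowest subscription tier
```

Under these assumed figures, the higher subscription tier catches up with the perpetual option after 60 months, while the lowest tier never does, which is consistent with subscriptions being aimed at smaller labs. Your own break-even point will depend entirely on the quotes you receive.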
With much of this information in hand, you're likely ready to move on to finalizing the requirements specification and choosing a vendor, but not before you've sat through a few highly useful demonstrations.
====5.3.1 The value of demonstrations====
[[File:ForUM demo (2659615090).jpg|left|360px]]Participating in a demonstration of a laboratory informatics solution is an integral part of making your final decisions. The demo offers a unique and valuable opportunity to see firsthand how data and information are added, edited, deleted, tracked, and protected within the context of the application; you can ask about how a function works and see it right then and there. Equally, it is an excellent time to compare notes with the vendor, particularly in regard to the critical requirements that were addressed in your RFI (or through direct communication with the vendor). You can ask the vendor in real-time to answer questions about how a specific task is achieved, or how the system addresses ISO/IEC 17025 compliance needs, and the vendor can ask you about your lab's system and workflow requirements and how you best envision them being implemented in the system (e.g., does this interface seem intuitive?).
A demonstration is typically performed online, which is useful for a couple of reasons, COVID-19 notwithstanding. First, it means you can schedule and reschedule at your convenience, with little in the way of logistics to arrange. Second, the demonstration session is likely to be recorded (though be sure to verify this), so everyone is clear on what was promised and what wasn't, how processes were shown to work, etc. Additionally, you can later review parts you may have missed, forgotten, or not quite understood, and you can share it with others, who then also get a look at the proposed system in action.
Be careful about falling for the temptation of presenting a full URS or other specification document to the vendor during the demonstration. You'll want to wait until after participating in several software demonstrations to consider presenting your full specification document to the vendor, and that's assuming that you've grown enamored with their solution. By waiting to finalize your lab's requirements specification until after the demos, a common error is avoided: too often labs think the first thing they must do is create a requirements list, then sit back and let the informatics vendors tell them how they meet it. Remember that even though most labs thoroughly understand their processes, they likely don't have as strong a grasp on the informatics portion of their processes and workflows. Participating in a demo before finalizing your list of specified requirements (or having only a minimal yet flexible requirements list during the demo) is a great way to later crosscheck the software features you have seen demonstrated against your lab's processes and any initial requirements specification you've made.<ref name="HammerHowTo19">{{cite web |url=https://www.striven.com/blog/erp-software-demo |title=How to Get the Most Value from an ERP Software Demo |author=Hammer, S. |work=The Takeoff |date=27 June 2019 |accessdate=20 January 2023}}</ref> After all, how can you effectively require specific functions of your laboratory informatics software if you don't fully know what such a system is capable of? After the demonstrations, you may end up adding several requirements to your final specifications document, which you later pass on to your potential vendors of choice for final confirmation.
===5.4 Finalize the requirements specification and choose a vendor===
Now that the demonstrations have been conducted and more questions asked, you should be close to finalizing your requirement specification with one or more vendors. In fact, you may have taken LIMSpec, chosen a few critical requirements from it, added them to a few unique requirements of your own, and included them as part of an RFI or question and answer session with vendors. You then likely took those responses and added them to your wider overall specification, along with your own notes and observations from interacting with the vendor. This may have been repeated for several vendors and their offerings.
At this point, you're likely ready to either have those vendors complete the rest of the responses for their corresponding URS, or you may even be ready to narrow down your vendor selection. This all likely depends on what the initial fact finding revealed. How well did the vendors respond to your laboratory's unique set of needs? Were there critical areas that one vendor could address with their off-the-shelf solution but another vendor would have to address with custom coding? Did any of the vendors meet your budget expectations? Have you followed up on any references and customer experiences the vendors provided to you?
It may be that several vendors are appealing at this point, meaning it's time to have them respond to the rest of the URS. This makes not only for good due diligence, to better ensure most requirements can be met, but also a reviewable option for any "tie-breaker" you have between vendors. In reality, this tie-breaker scenario would rarely come up; more often, some other aspect of the software, company, or pricing will be a stronger limiter. However, you still want to get all those vendor responses, even if you've early on filtered your options down to one vendor.
Ultimately, your specification document may look similar to LIMSpec, or it may have a slightly different format. Many prospective buyers will develop a requirement specification in Microsoft Excel, but that has a few minor disadvantages. Regardless of format, you'll want to give plenty of space for vendors to submit a response to each requirement. (See the next section concerning a Microsoft Word version of LIMSpec.)
Additionally, remember that it is often the case that after the URS is completed and final questions asked, no single vendor can meet all your needs. Be ready for this possibility, whether it be a functionality requirement or a budget issue. Know ahead of time where your laboratory is willing to be flexible, and how much flex you have. After all of your lab's preparation, and with a little luck, you've found a vendor that fits the bill, even if a few minor compromises had to be made along the way.
===5.5 LIMSpec in Microsoft Word format===
Microsoft Excel is often used as a tool to document requirements specifications. However, one downside to Microsoft Excel is its inability to handle multiple hyperlinks in the same cell. If you've looked over the LIMSpec, you've likely noticed there are multiple hyperlinks to regulations, specifications, and guidance documents in the first column of the tables. Translating these wiki-based documents to Excel makes for a challenge when trying to maintain those hyperlinks. As they add value to not only your laboratory's requirements research but also to vendors' understanding of the sources for your requirements, it was decided the hyperlinks should be maintained in any portable version. As such, a Microsoft Word version was created.
You can download a copy of the Microsoft Word version of LIMSpec from LIMSwiki by going to [[File:LIMSpec 2022R2 v1.0.docx|this file page]], right-clicking the URL under the white box, and selecting "Save link as..." (Alternatively, you can just click the link, open the file, and then save it.) A compromise was made between keeping the hyperlinks in the first column readable and leaving enough room in the third column for a vendor to provide a response. This response space admittedly may be a limiting factor for vendors wanting to include screenshots. If this situation arises, you may encourage the vendor to select the entire first column and delete it, then widen the response column.
Note that this downloadable version of LIMSpec is released under the same licensing terms as this guide. Please see the first paragraph of the download for more details.


==References==
{{Reflist|colwidth=30em}}
<!---Place all category tags here-->

Latest revision as of 02:03, 8 May 2024

Sandbox begins below

Title: Why are the FAIR data principles increasingly important to research laboratories and their software?

Author for citation: Shawn E. Douglas

License for content: Creative Commons Attribution-ShareAlike 4.0 International

Publication date: May 2024

Introduction

The growing importance of the FAIR principles to research laboratories

The FAIR data principles were published by Wilkinson et al. in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and information of all shapes and formats) become more universally findable, accessible, interoperable, and reusable (FAIR) by both machines and people.[1] The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."[1] Since the principles were published, other researchers have taken the somewhat broad set of principles and refined them to their own scientific disciplines, as well as to other types of research objects, including the research software being used by those researchers to generate research objects.[2][3][4][5][6][7]

But why are research laboratories increasingly pushing for more findable, accessible, interoperable, and reusable research objects and software? The short answer, as evidenced by the Wilkinson et al. quote above, is that greater innovation can be gained through improved knowledge discovery. The discovery process necessary for that greater innovation (whether through traditional research methods or artificial intelligence (AI)-driven methods) is enhanced when research objects and software are compatible with the core ideas of FAIR.[1][8][9]

A slightly longer answer, suitable for a Q&A topic, requires looking at a few more details of the FAIR principles as applied to both research objects and research software. Research laboratories, whether located in an organization or contracted out as third parties, exist to innovate. That innovation can come in the form of discovering new materials that may or may not have a future application, developing a pharmaceutical to improve patient outcomes for a particular disease, or modifying (for some sort of improvement) an existing food or beverage recipe, among others. In academic research labs, this usually looks like knowledge advancement and the publishing of research results, whereas in industry research labs, this typically looks like more practical applications of research concepts to new or existing products or services. In both cases, research software was likely involved at some point, whether it be something like a researcher-developed bioinformatics application or a commercial vendor-developed electronic laboratory notebook (ELN).

FAIR research objects

Regarding research objects themselves, the FAIR principles essentially say "vast amounts of data and information in largely heterogeneous formats spread across disparate sources both electronic and paper make modern research workflows difficult, tedious, and at times impossible. Further, repeatability, reproducibility, and replicability of openly published or secure internal research results is at risk, giving less confidence to academic peers in the published research, or less confidence to critical stakeholders in the viability of a researched prototype." As such, research objects (which include not only their inherent data and information but also any metadata that describe features of that data and information) need to be[10]:

  • findable, with globally unique and persistent identifiers, rich metadata that link to the identifier of the data described, and an ability to be indexed as an effectively searchable resource;
  • accessible, being able to be retrieved (including metadata of data that is no longer available) by identifiers using secure standardized communication protocols that are open, free, and universally implementable with authentication and authorization mechanisms;
  • interoperable, represented using formal, accessible, shared, and relevant language models and vocabularies that abide by FAIR principles, as well as with qualified linkage to other metadata; and
  • reusable, being richly described by accurate and relevant metadata, released with a clear and accessible data usage license, associated with sufficiently detailed provenance information, and compliant with discipline-specific community standards.
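
As a rough illustration of what these four properties can mean for a single record, the following sketch checks a metadata dictionary for a handful of fields loosely mapped to each principle. The field names (`identifier`, `access_url`, `vocabulary`, `license`, `provenance`, and so on) and the example record are hypothetical; they are not taken from the FAIR principles or from any particular metadata standard, and a real assessment would follow the relevant discipline-specific community standards.

```python
# Hypothetical mapping of FAIR principles to metadata fields (illustrative only).
FAIR_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_url", "access_protocol"],
    "interoperable": ["format", "vocabulary"],
    "reusable": ["license", "provenance"],
}


def missing_fair_fields(record):
    """Return, per principle, the expected fields that are absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# A made-up research object description with a few gaps.
example_record = {
    "identifier": "doi:10.1234/example",  # placeholder, not a real DOI
    "title": "Tensile strength measurements, batch 7",
    "keywords": ["materials science", "tensile testing"],
    "access_url": "https://repository.example.org/records/7",
    "access_protocol": "https",
    "format": "text/csv",
    "license": "CC-BY-4.0",
}

if __name__ == "__main__":
    for principle, missing in missing_fair_fields(example_record).items():
        print(f"{principle}: missing {', '.join(missing)}")
```

Here the example record would be flagged for lacking a controlled vocabulary and provenance information. A check like this only confirms that fields are present; it says nothing about whether the metadata are accurate or rich enough to be useful.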

All that talk of unique persistent identifiers, communication protocols, authentication mechanisms, language models (e.g., ontology languages), standardized vocabularies, provenance information, and more could make one's head spin. And, to be fair, it has been challenging for research groups to adopt FAIR, with few widespread international efforts to translate the FAIR principles to broad research. The FAIR Cookbook represents one example of such an international collaborative effort, providing "a combination of guidance, technical, hands-on, background and review types to cover the operation steps of FAIR data management."[11] In fact, the Cookbook is illustrative of the challenges of implementing FAIR in research laboratories, particularly given the diverse array of vocabularies used across the wealth of scientific disciplines, such as biobanking, biomedical engineering, botany, food science, and materials science. Making a botanical research organization's research objects FAIR is going to require a different set of tools than doing so for a materials science research organization. But all of them will turn to informatics tools, data management plans, database tools, and more to not only massage existing research objects to be FAIR but also better ensure newly created research objects are FAIR as well.

FAIR research software

Discussion on research software and its FAIRness is more complicated. It is beyond the scope of this article to go into greater detail about the concepts surrounding FAIR research software, but a brief overview will be attempted. When the FAIR principles were first published, the framework was largely being applied to research objects. However, researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts. This led to recognizing that digital research objects go beyond data and information, and that there is a "specific nature of software" used in research; that research software should not be considered "just data."[4] The end result has seen researchers begin to apply the core concepts of FAIR to research software, but slightly differently from research objects.[2][3][4][5][6][7]

Unsurprisingly, what researchers consider to be "research software" for purposes of FAIR has historically been interpreted numerous ways. Does the commercial spreadsheet software used to perform calculations on research data deserve to be called research software in parallel with the lab-developed bioinformatics application used to generate that data? Given the difficulties of gaining a consensus definition of the term, a 2021 international initiative called FAIRsFAIR made a good-faith effort to define "research software" with the feedback of multiple stakeholders. The short version of their resulting definition is that, "[r]esearch software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose."[12] Of note is the last part, acknowledging that research software can be developed in the lab during the research process or developed beforehand by, for example, a commercial software developer with a strong purpose of being used for research. As such, Microsoft Excel may not be looked upon as research software, but an ELN or laboratory information management system (LIMS) thoughtfully developed with research activities in mind could be considered research software. More often than not, research software is going to be developed in-house.
A growing push for the FAIRification of that software, as well as commercial research solutions, has seen the emergence of "research software engineering" as a domain of practice.[13][14] While in the past, broadly speaking, researchers often cobbled together research software with less of a focus on quality and reproducibility and more on getting their research published, today's push for FAIR data and software by academic journals, institutions, and other researchers seeking to collaborate has placed a much greater focus on the concept of "better software, better research"[14][15], with research software engineering efforts focusing on that concept as being vital to future research outcomes. Cohen et al. add that "ultimately, good research software can make the difference between valid, sustainable, reproducible research outputs and short-lived, potentially unreliable or erroneous outputs."[15]

Hasselbring et al. note that "it is essential [for academic research groups] to publish research software in addition to research data," to increase trust in the peer review system, build new research on top of existing research, and ensure greater reproducibility of any published results.[3] They extend FAIR data principles to FAIR research software, noting that[3]:

  • findable software acknowledges that "the first step in (re)using ... software is to find it";
  • accessible software acknowledges that once found, the researcher needs to know how to best access the software, recognizing authentication or authorization mechanisms may need to be in place;
  • interoperable software acknowledges that the software will need to eventually integrate with other research objects and software, demanding FAIR-driven methods and tools in the software's development; and
  • reusable software acknowledges that the software will need to not only produce research objects that can be reused, combined, and extended, but that the software itself should have metadata that helps make it retrievable and reusable.
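
One way to picture "metadata that helps make [software] retrievable and reusable" is a small machine-readable description published alongside the code. The sketch below assembles such a description using property names from the CodeMeta vocabulary (`name`, `version`, `identifier`, `codeRepository`, `license`, `author`). CodeMeta is not mentioned by Hasselbring et al. in the passage above and is used here only as an assumed example format; the helper function `build_software_metadata` and every value passed to it are hypothetical.

```python
import json


def build_software_metadata(name, version, identifier, repository, license_url, authors):
    """Assemble a CodeMeta-style description of a piece of research software."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "identifier": identifier,      # persistent identifier, supporting findability
        "codeRepository": repository,  # where the code can be accessed
        "license": license_url,        # terms of reuse
        "author": [{"@type": "Person", "name": author} for author in authors],
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    metadata = build_software_metadata(
        name="example-sequence-aligner",
        version="1.2.0",
        identifier="https://doi.org/10.1234/example.software",
        repository="https://github.com/example-lab/example-sequence-aligner",
        license_url="https://spdx.org/licenses/MIT",
        authors=["A. Researcher", "B. Engineer"],
    )
    print(json.dumps(metadata, indent=2))
```

In practice, a record like this would typically be saved as a file in the software's repository so that indexing services and other researchers can read it without running the code.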

The applicability of these principles is clear to academic research software developed in-house, with the concept of open science driving FAIR development and release of that software, including on platforms like GitHub.[3] It's less clear for commercial developers making research software. The growing prevalence of FAIR data and software practices in research laboratories doesn't mean commercial developers are going to suddenly take an open-source approach to their code, and it doesn't mean academic and institutional research labs are going to give up the benefits of the open-source paradigm as applied to research software.[3] However, both research software development paradigms stand to gain from the shift to more FAIR data and software.[13] Additionally, if commercial vendors of research software want to continue to competitively market relevant and sustainable research software to research labs, they frankly have little choice but to commit extra resources to learning about the application of FAIR principles to their offerings tailored to FAIR-abiding research labs.

FAIRer research objects + better software = the potential for greater innovation

As stated at the beginning of this article, greater research innovation can be gained through improved knowledge discovery, which is enabled by FAIR research objects and software. The FAIR principles say that when data and software are created, managed, updated, and developed such that they are more findable, accessible, interoperable, and reusable, researchers and other stakeholders benefit. Published research results are more reputable, reproducible, and reusable, benefiting the overall research community. However, this extends beyond academic research. The provenance of industry research (e.g., in the pharmaceutical industry) performed with the help of and documented within ELNs and other research management software is better maintained using FAIR principles. As a result, clinical and preclinical studies are more reproducible, ensuring proper funneling of research funding, limiting resource waste, and limiting potential suffering of research participants.[16] Finally, patients suffering from rare diseases may benefit from FAIRer data practices that help prevent the data silos of testing, medical device use, patient outcomes, treatment history, and clinical trial history data. If these types of data were made more FAIR, "new diagnostics, treatments, and health care policies to benefit patients" could be developed, at the same time empowering those patients to take their health care journey into their own hands.[17] However, in all these cases, laboratories are involved, and their software's ability to effectively ensure FAIR research objects are created is vital. As such, the implications of FAIR research objects and software for modern research laboratories' operations are undeniable. Greater innovation and improved patient outcomes are only part of the benefits to society.

Conclusion

References

  1. Wilkinson, Mark D.; Dumontier, Michel; Aalbersberg, IJsbrand Jan; Appleton, Gabrielle; Axton, Myles; Baak, Arie; Blomberg, Niklas; Boiten, Jan-Willem et al. (15 March 2016). "The FAIR Guiding Principles for scientific data management and stewardship" (in en). Scientific Data 3 (1): 160018. doi:10.1038/sdata.2016.18. ISSN 2052-4463. PMC 4792175. PMID 26978244. https://www.nature.com/articles/sdata201618.
  2. "fair data principles". PubMed Search. National Institutes of Health, National Library of Medicine. https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles. Retrieved 30 April 2024.
  3. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html.
  4. Gruenpeter, M. (23 November 2020). "FAIR + Software: Decoding the principles" (PDF). FAIRsFAIR “Fostering FAIR Data Practices In Europe”. https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf. Retrieved 30 April 2024.
  5. Barker, Michelle; Chue Hong, Neil P.; Katz, Daniel S.; Lamprecht, Anna-Lena; Martinez-Ortiz, Carlos; Psomopoulos, Fotis; Harrow, Jennifer; Castro, Leyla Jael et al. (14 October 2022). "Introducing the FAIR Principles for research software" (in en). Scientific Data 9 (1): 622. doi:10.1038/s41597-022-01710-x. ISSN 2052-4463. PMC 9562067. PMID 36241754. https://www.nature.com/articles/s41597-022-01710-x.
  6. Patel, Bhavesh; Soundarajan, Sanjay; Ménager, Hervé; Hu, Zicheng (23 August 2023). "Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool" (in en). Scientific Data 10 (1): 557. doi:10.1038/s41597-023-02463-x. ISSN 2052-4463. PMC 10447492. PMID 37612312. https://www.nature.com/articles/s41597-023-02463-x.
  7. Du, Xinsong; Dastmalchi, Farhad; Ye, Hao; Garrett, Timothy J.; Diller, Matthew A.; Liu, Mei; Hogan, William R.; Brochhausen, Mathias et al. (6 February 2023). "Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software" (in en). Metabolomics 19 (2): 11. doi:10.1007/s11306-023-01974-3. ISSN 1573-3890. https://link.springer.com/10.1007/s11306-023-01974-3.
  8. Olsen, C. (1 September 2023). "Embracing FAIR Data on the Path to AI-Readiness". Pharma's Almanac. https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness. Retrieved 3 May 2024.
  9. Huerta, E. A.; Blaiszik, Ben; Brinson, L. Catherine; Bouchard, Kristofer E.; Diaz, Daniel; Doglioni, Caterina; Duarte, Javier M.; Emani, Murali et al. (26 July 2023). "FAIR for AI: An interdisciplinary and international community building perspective" (in en). Scientific Data 10 (1): 487. doi:10.1038/s41597-023-02298-6. ISSN 2052-4463. PMC 10372139. PMID 37495591. https://www.nature.com/articles/s41597-023-02298-6.
  10. Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Gu, Wei; Welter, Danielle; Abbassi Daloii, Tooba; Portell-Silva, Laura (30 June 2022). "Introducing the FAIR Principles". D2.1 FAIR Cookbook. doi:10.5281/ZENODO.6783564. https://zenodo.org/record/6783564.
  11. Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Gu, Wei; Welter, Danielle; Abbassi Daloii, Tooba; Portell-Silva, Laura (30 June 2022). "Introduction". D2.1 FAIR Cookbook. doi:10.5281/ZENODO.6783564. https://zenodo.org/record/6783564.
  12. Gruenpeter, Morane; Katz, Daniel S.; Lamprecht, Anna-Lena; Honeyman, Tom; Garijo, Daniel; Struck, Alexander; Niehues, Anna; Martinez, Paula Andrea et al. (13 September 2021). "Defining Research Software: a controversial discussion". Zenodo. doi:10.5281/zenodo.5504016. https://zenodo.org/record/5504016.
  13. Moynihan, G. (7 July 2020). "The Hitchhiker’s Guide to Research Software Engineering: From PhD to RSE". Invenia Blog. Invenia Technical Computing Corporation. https://invenia.github.io/blog/2020/07/07/software-engineering/.
  14. Woolston, Chris (31 May 2022). "Why science needs more research software engineers" (in en). Nature: d41586-022-01516-2. doi:10.1038/d41586-022-01516-2. ISSN 0028-0836. https://www.nature.com/articles/d41586-022-01516-2.
  15. Cohen, Jeremy; Katz, Daniel S.; Barker, Michelle; Chue Hong, Neil; Haines, Robert; Jay, Caroline (1 January 2021). "The Four Pillars of Research Software Engineering". IEEE Software 38 (1): 97–105. doi:10.1109/MS.2020.2973362. ISSN 0740-7459. https://ieeexplore.ieee.org/document/8994167/.
  16. Sahoo, Satya S.; Valdez, Joshua; Kim, Matthew; Rueschman, Michael; Redline, Susan (1 January 2019). "ProvCaRe: Characterizing scientific reproducibility of biomedical research studies using semantic provenance metadata" (in en). International Journal of Medical Informatics 121: 10–18. doi:10.1016/j.ijmedinf.2018.10.009. PMC 6343667. PMID 30545485. https://linkinghub.elsevier.com/retrieve/pii/S1386505618302697.
  17. van Lin, Nawel; Paliouras, Georgios; Vroom, Elizabeth; ’t Hoen, Peter A.C.; Roos, Marco (2 November 2021). "How Patient Organizations Can Drive FAIR Data Efforts to Facilitate Research and Health Care: A Report of the Virtual Second International Meeting on Duchenne Data Sharing, March 3, 2021". Journal of Neuromuscular Diseases 8 (6): 1097–1108. doi:10.3233/JND-210721. PMC 8673524. PMID 34334415. https://www.medra.org/servlet/aliasResolver?alias=iospress&doi=10.3233/JND-210721.