Difference between revisions of "User:Shawndouglas/sandbox/sublevel12"

From LIMSWiki
Tag: Reverted
 
(345 intermediate revisions by the same user not shown)
Line 7: Line 7:


==Sandbox begins below==
LIMSwiki provides [[:Category:LIMS vendors by industry|a useful directory]] of LIMS vendors serving specific industries. If a vendor clearly indicates what industries they serve on their website, they are categorized under those industries. Otherwise, they are placed under the "General" category. However, "ISO/IEC 17025" isn't considered an industry for purposes of categorization; as we learned in Chapter 2, the testing, calibration, and sampling activities covered by ISO/IEC 17025 occur in a wide variety of industry contexts. As such, using such a directory will require you to examine vendors serving your industry and then investigate further how their LIMS touches all those requirements listed in the previous chapter.
<div class="nonumtoc">__TOC__</div>
[[File:FAIRResourcesGraphic AustralianResearchDataCommons 2018.png|right|520px]]
'''Title''': ''What are the potential implications of the FAIR data principles to laboratory informatics applications?''


For purposes of this guide, we simply list all known LIMS vendors as a base starting point. This may have little use to you, so you may also wish to consult the previously mentioned categorized directory for further granularity. The LIMS vendors and consultants listed below are directly pulled from LIMSwiki's maintained tabular listings of these types of entities. The professional section addresses trade organizations, conferences, and more. The last section further explores LIMSpec, which has been discussed in prior chapters.
'''Author for citation''': Shawn E. Douglas


===4.1 LIMS vendors===
'''License for content''': [https://creativecommons.org/licenses/by-sa/4.0/ Creative Commons Attribution-ShareAlike 4.0 International]
NOTE: This listing represents all known active LIMS vendors. For a categorized listing of LIMS vendors by industry, see the [[:Category:LIMS vendors by industry|listing of vendors by industry]].


'''Publication date''': May 2024


{{All active LIMS vendors}}
==Introduction==


This brief topical article will examine


===4.2 Consultants===
==The "FAIR-ification" of research objects and software==
First discussed during a 2014 FORCE-11 workshop dedicated to "overcoming data discovery and reuse obstacles," the [[Journal:The FAIR Guiding Principles for scientific data management and stewardship|FAIR Guiding Principles]] were published by Wilkinson ''et al.'' in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and [[information]] of all shapes and formats) become more universally findable, accessible, interoperable and reusable (FAIR) by both machines and people.<ref name="WilkinsonTheFAIR16">{{Cite journal |last=Wilkinson |first=Mark D. |last2=Dumontier |first2=Michel |last3=Aalbersberg |first3=IJsbrand Jan |last4=Appleton |first4=Gabrielle |last5=Axton |first5=Myles |last6=Baak |first6=Arie |last7=Blomberg |first7=Niklas |last8=Boiten |first8=Jan-Willem |last9=da Silva Santos |first9=Luiz Bonino |last10=Bourne |first10=Philip E. |last11=Bouwman |first11=Jildau |date=2016-03-15 |title=The FAIR Guiding Principles for scientific data management and stewardship |url=https://www.nature.com/articles/sdata201618 |journal=Scientific Data |language=en |volume=3 |issue=1 |pages=160018 |doi=10.1038/sdata.2016.18 |issn=2052-4463 |pmc=PMC4792175 |pmid=26978244}}</ref> The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."<ref name="WilkinsonTheFAIR16" />


{{LIMS, LIS, and laboratory}}
Since 2016, other research stakeholders have taken to publishing their thoughts about how the FAIR principles apply to their fields of study and practice<ref name="NIHPubMedSearch">{{cite web |url=https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles |title=fair data principles |work=PubMed Search |publisher=National Institutes of Health, National Library of Medicine |accessdate=30 April 2024}}</ref>, including in ways beyond what perhaps was originally imagined by Wilkinson ''et al.'' For example, multiple authors have examined whether or not the software used in scientific endeavors itself can be considered a research object worth being developed and managed in tandem with the FAIR data principles.<ref>{{Cite journal |last=Hasselbring |first=Wilhelm |last2=Carr |first2=Leslie |last3=Hettrick |first3=Simon |last4=Packer |first4=Heather |last5=Tiropanis |first5=Thanassis |date=2020-02-25 |title=From FAIR research data toward FAIR and open research software |url=https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html |journal=it - Information Technology |language=en |volume=62 |issue=1 |pages=39–47 |doi=10.1515/itit-2019-0040 |issn=2196-7032}}</ref><ref name="GruenpeterFAIRPlus20">{{Cite web |last=Gruenpeter, M. |date=23 November 2020 |title=FAIR + Software: Decoding the principles |url=https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf |format=PDF |publisher=FAIRsFAIR “Fostering FAIR Data Practices In Europe” |accessdate=30 April 2024}}</ref><ref>{{Cite journal |last=Barker |first=Michelle |last2=Chue Hong |first2=Neil P. |last3=Katz |first3=Daniel S. 
|last4=Lamprecht |first4=Anna-Lena |last5=Martinez-Ortiz |first5=Carlos |last6=Psomopoulos |first6=Fotis |last7=Harrow |first7=Jennifer |last8=Castro |first8=Leyla Jael |last9=Gruenpeter |first9=Morane |last10=Martinez |first10=Paula Andrea |last11=Honeyman |first11=Tom |date=2022-10-14 |title=Introducing the FAIR Principles for research software |url=https://www.nature.com/articles/s41597-022-01710-x |journal=Scientific Data |language=en |volume=9 |issue=1 |pages=622 |doi=10.1038/s41597-022-01710-x |issn=2052-4463 |pmc=PMC9562067 |pmid=36241754}}</ref><ref>{{Cite journal |last=Patel |first=Bhavesh |last2=Soundarajan |first2=Sanjay |last3=Ménager |first3=Hervé |last4=Hu |first4=Zicheng |date=2023-08-23 |title=Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool |url=https://www.nature.com/articles/s41597-023-02463-x |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=557 |doi=10.1038/s41597-023-02463-x |issn=2052-4463 |pmc=PMC10447492 |pmid=37612312}}</ref><ref>{{Cite journal |last=Du |first=Xinsong |last2=Dastmalchi |first2=Farhad |last3=Ye |first3=Hao |last4=Garrett |first4=Timothy J. |last5=Diller |first5=Matthew A. |last6=Liu |first6=Mei |last7=Hogan |first7=William R. |last8=Brochhausen |first8=Mathias |last9=Lemas |first9=Dominick J. 
|date=2023-02-06 |title=Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software |url=https://link.springer.com/10.1007/s11306-023-01974-3 |journal=Metabolomics |language=en |volume=19 |issue=2 |pages=11 |doi=10.1007/s11306-023-01974-3 |issn=1573-3890}}</ref> Researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts, recognize that digital research objects go beyond data and information, and recognize "the specific nature of software" and not consider it "just data."<ref name="GruenpeterFAIRPlus20" /> The end result has been an application of FAIR's core concepts to research software that differs from their application to data: because research software is more than just data, making it FAIR requires more nuance and a different type of planning than making digital data and information FAIR.


A 2019 survey by Europe's FAIRsFAIR found that researchers seeking and re-using relevant research software on the internet faced multiple challenges, including understanding and/or maintaining the necessary software environment and its dependencies, finding sufficient documentation, struggling with accessibility and licensing issues, having the time and skills to install and/or use the software, finding quality control of the source code lacking, and having an insufficient (or non-existent) software sustainability and management plan.<ref name="GruenpeterFAIRPlus20" /> These challenges highlight the importance of software to researchers and other stakeholders, and the role FAIR has in better ensuring such software is findable, interoperable, and reusable, which in turn better ensures researchers' software-driven research is repeatable (by the same research team, with the same experimental setup), reproducible (by a different research team, with the same experimental setup), and replicable (by a different research team, with a different experimental setup).<ref name="GruenpeterFAIRPlus20" />


===4.3 Other resources===
At this point, the topic of what "research software" represents must be addressed further, and, unsurprisingly, it's not straightforward. Ask 20 researchers what "research software" is, and you may get 20 different opinions. Some definitions can be more objectively viewed as too narrow, while others may be viewed as too broad, with some level of controversy inherent in any mutual discussion.<ref name="GruenpeterDefining21">{{Cite journal |last=Gruenpeter, Morane |last2=Katz, Daniel S. |last3=Lamprecht, Anna-Lena |last4=Honeyman, Tom |last5=Garijo, Daniel |last6=Struck, Alexander |last7=Niehues, Anna |last8=Martinez, Paula Andrea |last9=Castro, Leyla Jael |last10=Rabemanantsoa, Tovo |last11=Chue Hong, Neil P. |date=2021-09-13 |title=Defining Research Software: a controversial discussion |url=https://zenodo.org/record/5504016 |journal=Zenodo |doi=10.5281/zenodo.5504016}}</ref><ref name="JulichWhatIsRes24">{{cite web |url=https://www.fz-juelich.de/en/rse/about-rse/what-is-research-software |title=What is Research Software? |work=JuRSE, the Community of Practice for Research Software Engineering |publisher=Forschungszentrum Jülich |date=13 February 2024 |accessdate=30 April 2024}}</ref><ref name="vanNieuwpoortDefining24">{{Cite journal |last=van Nieuwpoort |first=Rob |last2=Katz |first2=Daniel S. |date=2023-03-14 |title=Defining the roles of research software |url=https://upstream.force11.org/defining-the-roles-of-research-software |language=en |doi=10.54900/9akm9y5-5ject5y}}</ref> In 2021, as part of the FAIRsFAIR initiative, Gruenpeter ''et al.'' made a good-faith effort to define "research software" with the feedback of multiple stakeholders. Their efforts resulted in this definition:


====4.3.1 ISO/IEC 17025 accreditation bodies====
<blockquote>Research software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during, or with a clear research intent, should be considered "software [used] in research" and not research software. This differentiation may vary between disciplines. The minimal requirement for achieving computational reproducibility is that all the computational components (i.e., research software, software used in research, documentation, and hardware) used during the research are identified, described, and made accessible to the extent that is possible.</blockquote>
What follows is a set of examples of accreditation bodies handling ISO/IEC 17025 accreditation for laboratories around the world. For a more complete list of accreditation bodies, see the International Laboratory Accreditation Cooperation (ILAC) [https://ilac.org/signatory-search/ directory] of approved bodies.


*[https://www.accredia.it/en/accredited-services/ Accredia, the Italian Accreditation Body] (Italy)
Note that while the definition primarily recognizes software created during the research process, software created (whether by the research group, other open-source software developers outside the organization, or even commercial software developers) "for a research purpose" outside the actual research process is also recognized as research software. This notably can lead to disagreement about whether a proprietary, commercial spreadsheet or [[laboratory information management system]] (LIMS) offering that conducts analyses and visualizations of research data can genuinely be called research software, or simply classified as software used in research. van Nieuwpoort and Katz further elaborated on this concept, at least indirectly, by formally defining the roles of research software in 2023. Their definition of the various roles of research software—without using terms such as "open-source," "commercial," or "proprietary"—essentially further defined what research software is<ref name="vanNieuwpoortDefining24" />:
*[https://a2la.org/ American Association for Laboratory Accreditation (A2LA)] (U.S.)
*[https://www.aihaaccreditedlabs.org/lab-accreditation-programs American Industrial Hygiene Association (AIHA)] (U.S.)
*[https://anab.ansi.org/laboratory-accreditation/iso-iec-17025 ANSI National Accreditation Board (ANAB)] (U.S.)
*[http://www.boa.gov.vn/ Bureau of Accreditation (BoA)] (Vietnam)
*[https://www.cala.ca/accreditation/fields-of-testing/ Canadian Association for Laboratory Accreditation (CALA)] (Canada)
*[https://www.cofrac.fr/en/ Comité français d'accréditation (Cofrac)] (France)
*[https://www.dakks.de/en/testing-and-calibration-laboratories.html Deutsche Akkreditierungsstelle (DAkkS)] (Germany)
*[https://www.rva.nl/en/disciplines/ Dutch Accreditation Council (RVA)] (Netherlands)
*[https://egac.gov.eg/en/accreditation/ Egyptian Accreditation Council (EGAC)] (Egypt)
*[http://www.bata.gov.ba/Default.aspx?langTag=en-US Institute for Accreditation of B&H (BATA)] (Bosnia and Herzegovina)
*[https://www.ianz.govt.nz/ International Accreditation New Zealand (IANZ)] (New Zealand)
*[https://www.iasonline.org/services/ International Accreditation Service (IAS)] (U.S.)
*[https://www.inab.ie/inab-services/ Irish National Accreditation Board (INAB)] (Ireland)
*[http://kan.or.id/index.php/programs/sni-iso-iec-17025 Komite Akreditasi Nasional (KAN)] (Indonesia)
*[https://www.knab.go.kr/ Korea Laboratory Accreditation Scheme (KOLAS)] (South Korea)
*[https://nabl-india.org/ National Accreditation Board for Testing and Calibration Laboratories (NABL)] (India)
*[https://nata.com.au/accreditation/laboratory-accreditation-iso-iec-17025/ National Association of Testing Authorities (NATA)] (Australia)
*[https://nrc.canada.ca/en/certifications-evaluations-standards/calibration-laboratory-assessment-service/about-calibration-laboratory-assessment-service National Research Council of Canada (NRC)] (Canada)
*[https://www.nist.gov/nvlap National Voluntary Laboratory Accreditation Program (NVLAP)] (U.S.)
*[https://www.pjlabs.com/accreditation-programs/isoiec-17025 Perry Johnson Laboratory Accreditation (PJLA)] (U.S.)
*[https://www.en.aenor.com/certificacion/analisis-y-ensayos-de-laboratorio Spanish Association for Standardization and Certification (AENOR)] (Spain)
*[https://www.scc.ca/en/accreditation/programs/laboratories Standards Council of Canada (SCC)] (Canada)


====4.3.2 ISO/IEC 17025 reference and resource library====
*Research software is a component of our instruments.
*Research software is the instrument.
*Research software analyzes research data.
*Research software presents research results.
*Research software assembles or integrates existing components into a working whole.
*Research software is infrastructure or an underlying tool.
*Research software facilitates distinctively research-oriented collaboration.


'''Guides''':
When considering these definitions<ref name="GruenpeterDefining21" /><ref name="vanNieuwpoortDefining24" /> of research software and their adoption by other entities<ref name="F1000Open24">{{cite web |url=https://www.f1000.com/resources-for-researchers/open-research/open-source-software-code/ |title=Open source software and code |publisher=F1000 Research Ltd |date=2024 |accessdate=30 April 2024}}</ref>, it would appear that at least in part some [[laboratory informatics]] software—whether open-source or commercially proprietary—fills these roles in academic, military, and industry research laboratories of many types. In particular, [[electronic laboratory notebook]]s (ELNs) like open-source [[Jupyter Notebook]] or proprietary ELNs from commercial software developers fill the role of analyzing and visualizing research data, including developing molecular models for new promising research routes.<ref name="vanNieuwpoortDefining24" /> Even more advanced LIMS solutions that go beyond simply collating, auditing, securing, and reporting analytical results could conceivably fall under the umbrella of research software, particularly if many of the analytical, integration, and collaboration tools required in modern research facilities are included in the LIMS.


* {{cite web |url=https://www.aphl.org/aboutAPHL/publications/Documents/FS-2018Feb-ISO-IEC-Accreditation-Costs-Survey-Report.pdf |title=Laboratory Costs of ISO/IEC 17025 Accreditation: A 2017 Survey Report |format=PDF |author=Association of Public Health Laboratories |date=February 2018}}
Ultimately, assuming that some laboratory informatics software can be considered research software and not just "software used in research," it's tough not to arrive at some deeper implications of research organizations' increasing need for FAIR data objects and software, particularly for laboratory informatics software and the developers of it.
* {{cite web |url=https://www.limswiki.org/index.php/LII:FDA_Food_Safety_Modernization_Act_Final_Rule_on_Laboratory_Accreditation_for_Analyses_of_Foods:_Considerations_for_Labs_and_Informatics_Vendors |title=FDA Food Safety Modernization Act Final Rule on Laboratory Accreditation for Analyses of Foods: Considerations for Labs and Informatics Vendors |author=Douglas, S.E. |work=LIMSwiki.org |date=21 February 2022}}
* {{cite book |url=https://pasargadabzar.com/wp-content/uploads/2022/04/Implementing-ISOIEC-170252017-by-Bob-Mehta.pdf |format=PDF |title=Implementing ISO/IEC 17025:2017 |author=Mehta, B. |edition=2nd |publisher=ASQ Quality Press |year=2019 |isbn=9780873899802}}
* {{cite web |url=https://nata.com.au/files/2021/05/17025-2017-Gap-analysis.pdf |format=PDF |title=General Accreditation Guidance: ISO/IEC 17025:2017 Gap analysis |author=National Association of Testing Authorities |date=April 2018}}
* {{cite web |url=https://www.pjcinc.com/Downloads/ISOIEC17025_exov.pdf |format=PDF |title=ISO/IEC 17025:2017 Testing and Calibration Laboratories: An Executive Overview |author=Perry Johnson Consulting, Inc |date=January 2022}}
* {{cite web |url=http://www.ipac.pt/docs/publicdocs/requisitos/OGC001_GuiaAplicacao17025_v20181231_En_20191110.pdf |format=PDF |title=Guide for ISO/IEC 17025 Application |author=Portuguese Accreditation Institute |date=10 November 2019}}
* {{cite web |url=https://www.unido.org/sites/default/files/files/2020-06/Guide%20ISO%2017025-2017_online.pdf |format=PDF |title=Tested & Accepted: Implementing ISO/IEC 17025:2017 |author=Vehring, S. |publisher=United Nations Industrial Development Organization |date=June 2020}}


'''Journal articles''':
==Implications of the FAIR concept to laboratory informatics software==
===The global FAIR initiative affects, and even benefits, commercial laboratory informatics research software developers as much as it does academic and institutional ones===
To be clear, there is undoubtedly a difference in the software development approach of "homegrown" research software by academics and institutions, and the more streamlined and experienced approach of commercial software development houses as applied to research software. Moynihan of Invenia Technical Computing described the difference in software development approaches thusly in 2020, while discussing the concept of "research software engineering"<ref name="MoynihanTheHitch20">{{cite web |url=https://invenia.github.io/blog/2020/07/07/software-engineering/ |title=The Hitchhiker’s Guide to Research Software Engineering: From PhD to RSE |author=Moynihan, G. |work=Invenia Blog |publisher=Invenia Technical Computing Corporation |date=07 July 2020}}</ref>:


* {{Cite journal |last=Dror |first=Itiel E. |last2=Pierce |first2=Michal L. |date=2020-05 |title=ISO Standards Addressing Issues of Bias and Impartiality in Forensic Work |url=https://onlinelibrary.wiley.com/doi/10.1111/1556-4029.14265 |journal=Journal of Forensic Sciences |language=en |volume=65 |issue=3 |pages=800–808 |doi=10.1111/1556-4029.14265 |issn=0022-1198}}
<blockquote>Since the environment and incentives around building academic research software are very different to those of industry, the workflows around the former are, in general, not guided by the same engineering practices that are valued in the latter. That is to say: there is a difference between what is important in writing software for research, and for a user-focused software product. Academic research software prioritizes scientific correctness and flexibility to experiment above all else in pursuit of the researchers’ end product: published papers. Industry software, on the other hand, prioritizes maintainability, robustness, and testing, as the software (generally speaking) is the product. However, the two tracks share many common goals as well, such as catering to “users” [and] emphasizing performance and reproducibility, but most importantly both ventures are collaborative. Arguably then, both sets of principles are needed to write and maintain high-quality research software.</blockquote>
* {{Cite journal |last=Krismastuti |first=Fransiska Sri Herwahyu |last2=Habibie |first2=Muhammad Haekal |date=2022-12 |title=Complying with the resource requirements of ISO/IEC 17025:2017 in Indonesian calibration and testing laboratories: current challenges and future directions |url=https://link.springer.com/10.1007/s00769-022-01523-w |journal=Accreditation and Quality Assurance |language=en |volume=27 |issue=6 |pages=359–367 |doi=10.1007/s00769-022-01523-w |issn=0949-1775 |pmc=PMC9579603 |pmid=36275871}}
* {{Cite journal |last=Mandal |first=Goutam |last2=Ansari |first2=M. A. |last3=Aswal |first3=D. K. |date=2021-09 |title=Quality Management System at NPLI: Transition of ISO/IEC 17025 From 2005 to 2017 and Implementation of ISO 17034: 2016 |url=https://link.springer.com/10.1007/s12647-021-00490-w |journal=MAPAN |language=en |volume=36 |issue=3 |pages=657–668 |doi=10.1007/s12647-021-00490-w |issn=0970-3950 |pmc=PMC8308083}}
* {{Cite journal |last=Miguel |first=Anna |last2=Moreira |first2=Renata |last3=Oliveira |first3=André |date=2021 |title=ISO/IEC 17025: History and Introduction of Concepts |url=https://www.limswiki.org/index.php/Journal:ISO/IEC_17025:_History_and_introduction_of_concepts |journal=Química Nova |doi=10.21577/0100-4042.20170726}}
* {{Cite journal |last=Monteiro Bastos da Silva |first=Juliana |last2=Chaker |first2=Jade |last3=Martail |first3=Audrey |last4=Costa Moreira |first4=Josino |last5=David |first5=Arthur |last6=Le Bot |first6=Barbara |date=2021-01-26 |title=Improving Exposure Assessment Using Non-Targeted and Suspect Screening: The ISO/IEC 17025: 2017 Quality Standard as a Guideline |url=https://www.mdpi.com/2039-4713/11/1/1 |journal=Journal of Xenobiotics |language=en |volume=11 |issue=1 |pages=1–15 |doi=10.3390/jox11010001 |issn=2039-4713 |pmc=PMC7838891 |pmid=33530331}}
* {{Citation |last=Neves |first=Rodrigo S. |last2=Da Silva |first2=Daniel P. |last3=Galhardo |first3=Carlos E.C. |last4=Ferreira |first4=Erlon H.M. |last5=Trommer |first5=Rafael M. |last6=Damasceno |first6=Jailton C. |date=2017-02-22 |editor-last=Kounis |editor-first=Leo D. |title=Key Aspects for Implementing ISO/IEC 17025 Quality Management Systems at Materials Science Laboratories |url=http://www.intechopen.com/books/quality-control-and-assurance-an-ancient-greek-term-re-mastered/key-aspects-for-implementing-iso-iec-17025-quality-management-systems-at-materials-science-laborator |work=Quality Control and Assurance - An Ancient Greek Term Re-Mastered |language=en |publisher=InTech |doi=10.5772/66100 |isbn=978-953-51-2921-9}}
* {{Cite journal |last=Okezue |first=Mercy A. |last2=Adeyeye |first2=Mojisola C. |last3=Byrn |first3=Steve J. |last4=Abiola |first4=Victor O. |last5=Clase |first5=Kari L. |date=2020-12 |title=Impact of ISO/IEC 17025 laboratory accreditation in sub-Saharan Africa: a case study |url=https://bmchealthservres.biomedcentral.com/articles/10.1186/s12913-020-05934-8 |journal=BMC Health Services Research |language=en |volume=20 |issue=1 |pages=1065 |doi=10.1186/s12913-020-05934-8 |issn=1472-6963 |pmc=PMC7686690 |pmid=33228675}}
* {{cite journal |last=Pillai |first=Segaran |last2=Calvert |first2=Jennifer |last3=Fox |first3=Elizabeth |date=2022-11-03 |title=Practical considerations for laboratories: Implementing a holistic quality management system |url=https://www.frontiersin.org/articles/10.3389/fbioe.2022.1040103/full |journal=Frontiers in Bioengineering and Biotechnology |volume=10 |pages=1040103 |doi=10.3389/fbioe.2022.1040103 |issn=2296-4185 |pmc=PMC9670165 |pmid=36406233}}


'''Other useful documents''':
This brings us to our first point: applying small-scale, FAIR-driven academic research software engineering practices and elements to the larger development of commercial laboratory informatics software, and conversely applying commercial-scale development practices to small FAIR-focused academic and institutional research software engineering efforts, has the potential to better support all research laboratories using both independently developed and commercial research software.


* [[LII:LIMSpec 2022 R2|LIMSpec 2022 R2]]
The concept of the research software engineer (RSE) began to take full form in 2012, and since then universities and institutions of many types have formally developed their own RSE groups and academic programs.<ref name="WoolstonWhySci22">{{Cite journal |last=Woolston |first=Chris |date=2022-05-31 |title=Why science needs more research software engineers |url=https://www.nature.com/articles/d41586-022-01516-2 |journal=Nature |language=en |pages=d41586–022–01516-2 |doi=10.1038/d41586-022-01516-2 |issn=0028-0836}}</ref><ref name="KITRSE@KIT24">{{cite web |url=https://www.rse-community.kit.edu/index.php |title=RSE@KIT |publisher=Karlsruhe Institute of Technology |date=20 February 2024 |accessdate=01 May 2024}}</ref><ref name="PUPurdueCenter">{{cite web |url=https://www.rcac.purdue.edu/rse |title=Purdue Center for Research Software Engineering |publisher=Purdue University |date=2024 |accessdate=01 May 2024}}</ref> RSEs range from pure software developers with little knowledge of a given research discipline, to scientific researchers just beginning to learn how to develop software for their research project(s). While in the past, broadly speaking, researchers often cobbled together research software with less of a focus on quality and reproducibility and more on getting their research published, today's push for FAIR data and software by academic journals, institutions, and other researchers seeking to collaborate has placed a much greater focus on the concept of "better software, better research."<ref name="WoolstonWhySci22" /><ref name="CohenTheFour21">{{Cite journal |last=Cohen |first=Jeremy |last2=Katz |first2=Daniel S. 
|last3=Barker |first3=Michelle |last4=Chue Hong |first4=Neil |last5=Haines |first5=Robert |last6=Jay |first6=Caroline |date=2021-01 |title=The Four Pillars of Research Software Engineering |url=https://ieeexplore.ieee.org/document/8994167/ |journal=IEEE Software |volume=38 |issue=1 |pages=97–105 |doi=10.1109/MS.2020.2973362 |issn=0740-7459}}</ref> Elaborating on that concept, Cohen ''et al.'' add that "ultimately, good research software can make the difference between valid, sustainable, reproducible research outputs and short-lived, potentially unreliable or erroneous outputs."<ref name="CohenTheFour21" />
* [https://www.nist.gov/nist-quality-system NIST Quality System for Measurement Services]


The concept of [[software quality management]] (SQM) has traditionally not been lost on professional, commercial software development businesses. Good SQM practices have been less prevalent in homegrown research software development; however, the expanded adoption of FAIR data and FAIR software approaches has shifted the focus onto the repeatability, reproducibility, and interoperability of research results and data produced by more sustainable research software. The adoption of FAIR by academic and institutional research labs not only brings commercial SQM and other software development approaches into their workflow, but also gives commercial laboratory informatics software developers an opportunity to embrace many aspects of the FAIR approach to laboratory research practices, including lessons learned and development practices from the growing number of RSEs. This doesn't mean commercial developers are going to suddenly take an open-source approach to their code, and it doesn't mean academic and institutional research labs are going to give up the benefits of the open-source paradigm as applied to research software.<ref>{{Cite journal |last=Hasselbring |first=Wilhelm |last2=Carr |first2=Leslie |last3=Hettrick |first3=Simon |last4=Packer |first4=Heather |last5=Tiropanis |first5=Thanassis |date=2020-02-25 |title=From FAIR research data toward FAIR and open research software |url=https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html |journal=it - Information Technology |language=en |volume=62 |issue=1 |pages=39–47 |doi=10.1515/itit-2019-0040 |issn=2196-7032}}</ref> However, as Moynihan noted, both research software development paradigms stand to gain from the shift to more FAIR data and software. 
Additionally, if commercial laboratory informatics vendors want to continue to competitively market relevant and sustainable research software to research labs, they frankly have little choice but to commit extra resources to learning about the application of FAIR principles to their offerings tailored to those labs.


===4.4 LIMSpec===
===The focus on data types and metadata within the scope of FAIR is shifting how laboratory informatics software developers and RSEs make their research software and choose their database approaches===
[[File:LIMSpec.png|right]][[Book:LIMSpec 2022 R2|LIMSpec]] is an ever-evolving set of software user requirements specifications for [[laboratory informatics]] systems, especially the [[laboratory information management system]] (LIMS). The specification has grown significantly from its humble origins over a decade ago. Earlier versions of LIMSpec focused on a mix of both regulatory requirements and clients' "wishlist" features for a given system. The wishlist items haven't necessarily been ignored by developers, but they do in fact have to be prioritized by the potential buyer as "nice to have" or "essential to system operation," or something in between.<ref name="AasemAnalysis10">{{cite journal |title=Analysis and optimization of software requirements prioritization techniques |author=Aasem, M.; Ramzan, M.; Jaffar, A. |journal=Proceedings from the 2010 International Conference on Information and Emerging Technologies |pages=1–6 |year=2010 |doi=10.1109/ICIET.2010.5625687}}</ref><ref name="Hirsch10Steps13">{{cite web |url=https://www.phase2technology.com/blog/successful-requirements-gathering |title=10 Steps To Successful Requirements Gathering |author=Hirsch, J. |publisher=Phase2 Technology, LLC |date=22 November 2013 |accessdate=20 January 2023}}</ref><ref name="BurrissSoftware07">{{cite web |url=http://sce2.umkc.edu/BIT/burrise/pl/requirements/ |archiveurl=https://web.archive.org/web/20190724173601/http://sce2.umkc.edu/BIT/burrise/pl/requirements/ |title=Requirements Specification |work=CS451R, University of Missouri–Kansas City |author=Burris, E. |publisher=University of Missouri–Kansas City |date=2007 |archivedate=24 July 2019 |accessdate=20 January 2023}}</ref> This latest version is different, focusing strictly on a regulatory-, standards-, and guidance-based approach to building a specification document for laboratory informatics systems.  
Close to the core of any deep discussion of the FAIR data principles are the concepts of data models, data types, [[metadata]], and persistent unique identifiers (PIDs). Making research objects more findable, accessible, interoperable, and reusable is no easy task when data types and approaches to metadata assignment (if there even is such an approach) are widely differing and inconsistent. Metadata is a means for better storing and characterizing research objects for the purposes of ensuring provenance and reproducibility of those research objects.<ref name="GhiringhelliShared23">{{Cite journal |last=Ghiringhelli |first=Luca M. |last2=Baldauf |first2=Carsten |last3=Bereau |first3=Tristan |last4=Brockhauser |first4=Sandor |last5=Carbogno |first5=Christian |last6=Chamanara |first6=Javad |last7=Cozzini |first7=Stefano |last8=Curtarolo |first8=Stefano |last9=Draxl |first9=Claudia |last10=Dwaraknath |first10=Shyam |last11=Fekete |first11=Ádám |date=2023-09-14 |title=Shared metadata for data-centric materials science |url=https://www.nature.com/articles/s41597-023-02501-8 |journal=Scientific Data |language=en |volume=10 |issue=1 |pages=626 |doi=10.1038/s41597-023-02501-8 |issn=2052-4463 |pmc=PMC10502089 |pmid=37709811}}</ref><ref name="FirschenAgile22">{{Cite journal |last=Fitschen |first=Timm |last2=tom Wörden |first2=Henrik |last3=Schlemmer |first3=Alexander |last4=Spreckelsen |first4=Florian |last5=Hornung |first5=Daniel |date=2022-10-12 |title=Agile Research Data Management with FDOs using LinkAhead |url=https://riojournal.com/article/96075/ |journal=Research Ideas and Outcomes |volume=8 |pages=e96075 |doi=10.3897/rio.8.e96075 |issn=2367-7163}}</ref> This means as early as possible implementing a software-based approach that is FAIR-driven, capturing FAIR metadata using flexible domain-driven ontologies at the source and cleaning up old research objects that aren't FAIR-ready while also limiting hindrances to research processes as much as possible.<ref 
name="FirschenAgile22" /> And that approach must value the importance of metadata and PIDs. As Weigel ''et al.'' note in a discussion on making laboratory data and workflows more machine-findable: "Metadata capture must be highly automated and reliable, both in terms of technical reliability and ensured metadata quality. This requires an approach that may be very different from established procedures."<ref>{{Cite journal |last=Weigel |first=Tobias |last2=Schwardmann |first2=Ulrich |last3=Klump |first3=Jens |last4=Bendoukha |first4=Sofiane |last5=Quick |first5=Robert |date=2020-01 |title=Making Data and Workflows Findable for Machines |url=https://direct.mit.edu/dint/article/2/1-2/40-46/9994 |journal=Data Intelligence |language=en |volume=2 |issue=1-2 |pages=40–46 |doi=10.1162/dint_a_00026 |issn=2641-435X}}</ref> Enter non-relational RDF knowledge graph databases.


At its core, LIMSpec is rooted in [[ASTM E1578|ASTM E1578-18]] ''Standard Guide for Laboratory Informatics''. With the latest version released in 2018, the standard includes an updated Laboratory Informatics Functional Requirements checklist, which "covers functionality common to the various laboratory informatics systems discussed throughout [the] guide as well as requirements recommended as part of [the] guide." It goes on to state that the checklist "is an example of typical requirements that can be used to guide the purchase, upgrade, or development of a laboratory informatics system," though it is certainly "not meant to be exhaustive."
This brings us to our second point: given the importance of metadata and PIDs to FAIRifying research objects (and even research software), established, more traditional research software development methods using common relational databases may not be enough, even for commercial laboratory informatics software developers. Non-relational Resource Description Framework (RDF) knowledge graph databases used in FAIR-driven, well-designed laboratory informatics software help make research objects more FAIR for all research labs.  


LIMSpec borrows from that requirements checklist and then adds more to it from a wide variety of sources, including [[ISO/IEC 17025]]. An attempt has been made to find the most relevant regulations, standards, and guidance that shape how a compliant laboratory informatics system is developed and maintained. However, the LIMSpec should also definitely be considered a continual work in progress, with more to be added as new pertinent regulations, standards, and guidance are discovered.
Research objects can take many forms (i.e., data types), making the storage and management of those objects challenging, particularly in research settings with great diversity of data, as with materials research. Some have approached this challenge by combining different database and systems technologies that are best suited for each data type.<ref name="AggourSemantics24">{{Cite journal |last=Aggour |first=Kareem S. |last2=Kumar |first2=Vijay S. |last3=Gupta |first3=Vipul K. |last4=Gabaldon |first4=Alfredo |last5=Cuddihy |first5=Paul |last6=Mulwad |first6=Varish |date=2024-04-09 |title=Semantics-Enabled Data Federation: Bringing Materials Scientists Closer to FAIR Data |url=https://link.springer.com/10.1007/s40192-024-00348-4 |journal=Integrating Materials and Manufacturing Innovation |language=en |doi=10.1007/s40192-024-00348-4 |issn=2193-9764}}</ref> However, while query performance and storage footprint improve with this approach, data across the different storage mechanisms typically remain unlinked and non-compliant with FAIR principles. Here, either a full RDF knowledge graph database or a similar integration layer is required to make the research objects more interoperable and reusable, whether they are materials records or specimen data.<ref name="AggourSemantics24" /><ref name="GrobeFromData19">{{Cite journal |last=Grobe |first=Peter |last2=Baum |first2=Roman |last3=Bhatty |first3=Philipp |last4=Köhler |first4=Christian |last5=Meid |first5=Sandra |last6=Quast |first6=Björn |last7=Vogt |first7=Lars |date=2019-06-26 |title=From Data to Knowledge: A semantic knowledge graph application for curating specimen data |url=https://biss.pensoft.net/article/37412/ |journal=Biodiversity Information Science and Standards |language=en |volume=3 |pages=e37412 |doi=10.3897/biss.3.37412 |issn=2535-0897}}</ref>


If you've never worked with a user requirements specification document, the concept remains relatively simple to grasp. Merriam-Webster defines a "specification" as "a detailed precise presentation of something or of a plan or proposal for something."<ref name="MWSpec">{{cite web |url=https://www.merriam-webster.com/dictionary/specification |title=specification |work=Merriam-Webster |publisher=Merriam-Webster, Inc |accessdate=20 January 2023}}</ref> Within this organized "plan or proposal" are requirements. A requirement typically comes in the form of a statement that begins with "the system/user/vendor shall/should ..." and focuses on a provided service, reaction to input, or expected behavior in a given situation. The statement may be abstract (high-level), or it may be specific and detailed to a precise function. The statement may also be of a functional nature, describing functionality or services in detail, or of a non-functional nature, describing the constraints of a given functionality or service and how it's rendered.
It is beyond the scope of this Q&A article to discuss RDF knowledge graph databases at length. (For a deeper dive on this topic, see Rocca-Serra ''et al.'' and the FAIR Cookbook.<ref name="Rocca-SerraFAIRCook22">{{Cite journal |last=Rocca-Serra, Philippe |last2=Sansone, Susanna-Assunta |last3=Gu, Wei |last4=Welter, Danielle |last5=Abbassi Daloii, Tooba |last6=Portell-Silva, Laura |date=2022-06-30 |title=D2.1 FAIR Cookbook - FAIR and Knowledge graphs |url=https://zenodo.org/record/6783564 |journal=Zenodo |doi=10.5281/ZENODO.6783564}}</ref>) However, know that the primary strength of these databases to FAIRification of research objects is their ability to provide semantic transparency, making these objects more easily accessible, interoperable, and machine-readable.<ref name="AggourSemantics24" /> The resulting knowledge graphs, with their "subject-property-object" syntax and PIDs or uniform resource identifiers (URIs) helping to link data, metadata, ontology classes, and more, can be interpreted, searched, and linked by machines, and made human-readable, resulting in better research through derivation of new knowledge from the existing research objects. The end result is a representation of heterogeneous data and metadata that complies with the FAIR guiding principles.<ref name="AggourSemantics24" /><ref name="GrobeFromData19" /><ref name="Rocca-SerraFAIRCook22" /><ref name="TomlinsonRDF23">{{cite web |url=https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf |format=PDF |title=RDF Knowledge
Graph Databases: A Better Choice for Life Science Lab Software |author=Tomlinson, E. |publisher=Semaphore Solutions, Inc |date=28 July 2023 |accessdate=01 May 2024}}</ref><ref name="DeagenFAIRAnd22">{{Cite journal |last=Deagen |first=Michael E. |last2=McCusker |first2=Jamie P. |last3=Fateye |first3=Tolulomo |last4=Stouffer |first4=Samuel |last5=Brinson |first5=L. Cate |last6=McGuinness |first6=Deborah L. |last7=Schadler |first7=Linda S. |date=2022-05-27 |title=FAIR and Interactive Data Graphics from a Scientific Knowledge Graph |url=https://www.nature.com/articles/s41597-022-01352-z |journal=Scientific Data |language=en |volume=9 |issue=1 |pages=239 |doi=10.1038/s41597-022-01352-z |issn=2052-4463 |pmc=PMC9142568 |pmid=35624233}}</ref><ref>{{Cite journal |last=Brandizi |first=Marco |last2=Singh |first2=Ajit |last3=Rawlings |first3=Christopher |last4=Hassani-Pak |first4=Keywan |date=2018-09-25 |title=Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach |url=https://www.degruyter.com/document/doi/10.1515/jib-2018-0023/html |journal=Journal of Integrative Bioinformatics |language=en |volume=15 |issue=3 |pages=20180023 |doi=10.1515/jib-2018-0023 |issn=1613-4516 |pmc=PMC6340125 |pmid=30085931}}</ref> This concept can even be extended to ''post factum'' visualizations of the knowledge graph data<ref name="DeagenFAIRAnd22" />, as well as the FAIR management of computational laboratory [[workflow]]s.<ref>{{Cite journal |last=de Visser |first=Casper |last2=Johansson |first2=Lennart F. |last3=Kulkarni |first3=Purva |last4=Mei |first4=Hailiang |last5=Neerincx |first5=Pieter |last6=Joeri van der Velde |first6=K. |last7=Horvatovich |first7=Péter |last8=van Gool |first8=Alain J. |last9=Swertz |first9=Morris A. |last10=Hoen |first10=Peter A. C. ‘t |last11=Niehues |first11=Anna |date=2023-09-28 |editor-last=Palagi |editor-first=Patricia M. 
|title=Ten quick tips for building FAIR workflows |url=https://dx.plos.org/10.1371/journal.pcbi.1011369 |journal=PLOS Computational Biology |language=en |volume=19 |issue=9 |pages=e1011369 |doi=10.1371/journal.pcbi.1011369 |issn=1553-7358 |pmc=PMC10538699 |pmid=37768885}}</ref>


An example of a functional software requirement could be "the user shall be able to query either all of the initial set of databases or select a subset from it." This statement describes specific functionality the system should have. On the other hand, a non-functional requirement, for example, may state "the system's query tool shall conform to the ABC 123-2014 standard." The statement describes a constraint placed upon the system's query functionality. Once compiled, a set of requirements can serve not only to strengthen the software requirements specification, but the requirements set can also be used for bidding on a contract or serve as the basis for a specific contract that is being finalized.<ref name="MemonSoftware10">{{cite web |url=https://www.cs.umd.edu/~atif/Teaching/Spring2010/Slides/3.pdf |format=PDF |title=Software Requirements: Descriptions and specifications of a system |author=Memon, A. |publisher=University of Maryland |date=Spring 2010 |accessdate=20 January 2023}}</ref>
- https://labbit.com/resources/rdf-knowledge-graph-databases-a-better-choice-for-life-science-lab-software and https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf


Section 3.1.2 of Chapter 3 took those specific LIMSpec requirements linked to ISO/IEC 17025 and presented them as a means to demonstrate the critical functionality a LIMS should have in order for a [[laboratory]] to more easily comply with the standard. However, there's more to LIMSpec than just those requirements. The requirements span multiple areas of laboratory [[workflow]] and practice, from the core functions of a lab to its ancillary operations, specialty functions, technology and performance touch points, and data and security touch points. As of December 2022, LIMSpec is based on more than 130 regulations, standards, guidance documents, and more, proving to be a robust starting point for labs seeking to use a requirements specification to acquire a LIMS.
- https://biss.pensoft.net/article/37412/
 
- https://link.springer.com/article/10.1007/s40192-024-00348-4
 
- https://www.nature.com/articles/s41597-022-01352-z
 
 
 
 
===Applying FAIR-driven metadata schemes to laboratory informatics software development gives data a FAIRer chance at being ready for machine learning and artificial intelligence applications===
 
*By developing laboratory informatics software with a focus on FAIR-driven metadata schemes, data objects become not only more FAIR but also "clean" and machine-ready for advanced analytical uses such as machine learning and artificial intelligence, whether built into the laboratory informatics software or separate from it.
 
- https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness
 
- https://arxiv.org/abs/2404.05779
 
- https://www.nature.com/articles/s41597-023-02298-6
 
- https://repositories.lib.utexas.edu/items/a366780e-3d54-4aaa-8465-6da1e38ee38a
 
==Resources==
 
*LIMS and FAIR: [[Journal:A roadmap for LIMS at NIST Material Measurement Laboratory]]
*ELNs and FAIR: [[Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation]]
*LIMS+ELN and FAIR: https://datascience.codata.org/articles/10.5334/dsj-2023-044
*Biomedical software and FAIR: https://www.nature.com/articles/s41597-023-02463-x
*Making software workflows FAIR: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10538699/
*AWS and FAIR for healthcare and life sciences: https://aws.amazon.com/blogs/industries/implement-fair-scientific-data-principles-when-building-hcls-data-lakes/
*APIs and FAIR data: https://www.labguru.com/blog/fair-data-principles-and-apis
*Bioinformatics LIMS and FAIR: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8425304/
*Labbit: https://labbit.com/fair-data-lims and https://semaphoresolutions.com/applying-fair-principles-to-lab-data/
 
Zontal: https://8248491.fs1.hubspotusercontent-na1.net/hubfs/8248491/Marketing%20Material/White%20paper/FAIR%20Data/FAIR%20data%20-%20how%20data%20increases%20the%20value%20of%20biotechs.pdf
 
*Extending FAIR to data graphics: https://www.nature.com/articles/s41597-022-01352-z
 
 
*More metadata, for findability: "While descriptive metadata may not be available, support for generalized CRUD operations requires essential structural and administrative metadata to be captured, stored, and made available for requestors. Metadata capture must be highly automated and reliable, both in terms of technical reliability and ensured metadata quality." [[Journal:Making data and workflows findable for machines]]
*More metadata, for reusability: "make recommendations for assigning identifiers and metadata that supports sample tracking, integration, and reuse. Our goal is to provide a practical approach to sample management, geared towards ecosystem scientists who contribute and reuse sample data." [[Journal:Sample identifiers and metadata to support data management and reuse in multidisciplinary ecosystem sciences]]
 
*"The principles should be considered during development of informatics systems to further promote data discovery and reuse. In Table 1, we have correlated the various BRICS functional components to the FAIR principles to illustrate the extent to which each of the components contributes towards the principles." [[Journal:Development of an informatics system for accelerating biomedical research]]
 
Restricted or personal information while still being FAIR
 
*[[Journal:FAIR Health Informatics: A health informatics framework for verifiable and explainable data analysis]]
*[[Journal:Restricted data management: The current practice and the future]]
 
*Linking databases of data that haven't seen proper "FAIR-ification" and metadata handling won't be as useful.
*Further discussion on data quality in the scope of FAIR: [[Journal:Towards a contextual approach to data quality]]
 
 
On data integrity and FAIR: https://arkivum.com/what-is-fair-data-and-can-life-science-organisations-ensure-data-is-compliant-whilst-adhering-to-these-principles/
 
More: https://www.europeanpharmaceuticalreview.com/article/157371/implementing-the-fair-data-principles-is-now-a-critical-endeavour/
 
More: https://www.lexjansen.com/phuse/2019/sa/SA04.pdf
 
Knowledge graph stuff:
 
- https://arxiv.org/abs/2404.12935
 
- https://direct.mit.edu/dint/article/4/4/867/112737/FAIR-Versus-Open-Data-A-Comparison-of-Objectives
==Conclusion==


The next chapter discusses the user requirements specification, using LIMSpec as an example. You'll learn how to shape such a specification to your laboratory's needs, how to issue the specification as a request for information (RFI), and how to get the most out of it when getting decision-related information from vendors.


==References==
{{Reflist|colwidth=30em}}
<!---Place all category tags here-->

Latest revision as of 20:07, 1 May 2024

Sandbox begins below

FAIRResourcesGraphic AustralianResearchDataCommons 2018.png

Title: What are the potential implications of the FAIR data principles to laboratory informatics applications?

Author for citation: Shawn E. Douglas

License for content: Creative Commons Attribution-ShareAlike 4.0 International

Publication date: May 2024

Introduction

This brief topical article will examine

The "FAIR-ification" of research objects and software

First discussed during a 2014 FORCE-11 workshop dedicated to "overcoming data discovery and reuse obstacles," the FAIR Guiding Principles were published by Wilkinson et al. in 2016 as a stakeholder collaboration driven to see research "objects" (i.e., research data and information of all shapes and formats) become more universally findable, accessible, interoperable and reusable (FAIR) by both machines and people.[1] The authors released the FAIR principles while recognizing that "one of the grand challenges of data-intensive science ... is to improve knowledge discovery through assisting both humans and their computational agents in the discovery of, access to, and integration and analysis of task-appropriate scientific data and other scholarly digital objects."[1]

Since 2016, other research stakeholders have taken to publishing their thoughts about how the FAIR principles apply to their fields of study and practice[2], including in ways beyond what perhaps was originally imagined by Wilkinson et al. For example, multiple authors have examined whether or not the software used in scientific endeavors itself can be considered a research object worth being developed and managed in tandem with the FAIR data principles.[3][4][5][6][7] Researchers quickly recognized that any planning around updating processes and systems to make research objects more FAIR would have to be tailored to specific research contexts, recognize that digital research objects go beyond data and information, and recognize "the specific nature of software" and not consider it "just data."[4] The end result has been the application of FAIR's core concepts to research software, but in a different way than to data: because research software is more than just data, it requires more nuance and a different type of planning than applying FAIR to digital data and information.

A 2019 survey by Europe's FAIRsFAIR found that researchers seeking and re-using relevant research software on the internet faced multiple challenges, including understanding and/or maintaining the necessary software environment and its dependencies, finding sufficient documentation, struggling with accessibility and licensing issues, having the time and skills to install and/or use the software, finding quality control of the source code lacking, and having an insufficient (or non-existent) software sustainability and management plan.[4] These challenges highlight the importance of software to researchers and other stakeholders, and the role FAIR has in better ensuring such software is findable, interoperable, and reusable, which in turn better ensures researchers' software-driven research is repeatable (by the same research team, with the same experimental setup), reproducible (by a different research team, with the same experimental setup), and replicable (by a different research team, with a different experimental setup).[4]

At this point, the topic of what "research software" represents must be addressed further, and, unsurprisingly, it's not straightforward. Ask 20 researchers what "research software" is, and you may get 20 different opinions. Some definitions can be more objectively viewed as too narrow, while others may be viewed as too broad, with some level of controversy inherent in any mutual discussion.[8][9][10] In 2021, as part of the FAIRsFAIR initiative, Gruenpeter et al. made a good-faith effort to define "research software" with the feedback of multiple stakeholders. Their efforts resulted in this definition:

Research software includes source code files, algorithms, scripts, computational workflows, and executables that were created during the research process, or for a research purpose. Software components (e.g., operating systems, libraries, dependencies, packages, scripts, etc.) that are used for research but were not created during, or with a clear research intent, should be considered "software [used] in research" and not research software. This differentiation may vary between disciplines. The minimal requirement for achieving computational reproducibility is that all the computational components (i.e., research software, software used in research, documentation, and hardware) used during the research are identified, described, and made accessible to the extent that is possible.

Note that while the definition primarily recognizes software created during the research process, software created (whether by the research group, other open-source software developers outside the organization, or even commercial software developers) "for a research purpose" outside the actual research process is also recognized as research software. This notably can lead to disagreement about whether a proprietary, commercial spreadsheet or laboratory information management system (LIMS) offering that conducts analyses and visualizations of research data can genuinely be called research software, or simply classified as software used in research. van Nieuwpoort and Katz further elaborated on this concept, at least indirectly, by formally defining the roles of research software in 2023. Their definition of the various roles of research software—without using terms such as "open-source," "commercial," or "proprietary"—essentially further defined what research software is[10]:

  • Research software is a component of our instruments.
  • Research software is the instrument.
  • Research software analyzes research data.
  • Research software presents research results.
  • Research software assembles or integrates existing components into a working whole.
  • Research software is infrastructure or an underlying tool.
  • Research software facilitates distinctively research-oriented collaboration.

When considering these definitions[8][10] of research software and their adoption by other entities[11], it would appear that at least in part some laboratory informatics software—whether open-source or commercially proprietary—fills these roles in academic, military, and industry research laboratories of many types. In particular, electronic laboratory notebooks (ELNs) like open-source Jupyter Notebook or proprietary ELNs from commercial software developers fill the role of analyzing and visualizing research data, including developing molecular models for new promising research routes.[10] Even more advanced LIMS solutions that go beyond simply collating, auditing, securing, and reporting analytical results could conceivably fall under the umbrella of research software, particularly if many of the analytical, integration, and collaboration tools required in modern research facilities are included in the LIMS.

Ultimately, assuming that some laboratory informatics software can be considered research software and not just "software used in research," it's tough not to arrive at some deeper implications of research organizations' increasing need for FAIR data objects and software, particularly for laboratory informatics software and the developers of it.

Implications of the FAIR concept to laboratory informatics software

The global FAIR initiative affects, and even benefits, commercial laboratory informatics research software developers as much as it does academic and institutional ones

To be clear, there is undoubtedly a difference in the software development approach of "homegrown" research software by academics and institutions, and the more streamlined and experienced approach of commercial software development houses as applied to research software. Moynihan of Invenia Technical Computing described the difference in software development approaches thusly in 2020, while discussing the concept of "research software engineering"[12]:

Since the environment and incentives around building academic research software are very different to those of industry, the workflows around the former are, in general, not guided by the same engineering practices that are valued in the latter. That is to say: there is a difference between what is important in writing software for research, and for a user-focused software product. Academic research software prioritizes scientific correctness and flexibility to experiment above all else in pursuit of the researchers’ end product: published papers. Industry software, on the other hand, prioritizes maintainability, robustness, and testing, as the software (generally speaking) is the product. However, the two tracks share many common goals as well, such as catering to “users” [and] emphasizing performance and reproducibility, but most importantly both ventures are collaborative. Arguably then, both sets of principles are needed to write and maintain high-quality research software.

This brings us to our first point: the application of small-scale, FAIR-driven academic research software engineering practices and elements to the larger development of more commercial laboratory informatics software, and vice versa with the application of commercial-scale development practices to small FAIR-focused academic and institutional research software engineering efforts, has the potential to help better support all research laboratories using both independently-developed and commercial research software.

The concept of the research software engineer (RSE) began to take full form in 2012, and since then universities and institutions of many types have formally developed their own RSE groups and academic programs.[13][14][15] RSEs range from pure software developers with little knowledge of a given research discipline, to scientific researchers just beginning to learn how to develop software for their research project(s). While in the past, broadly speaking, researchers often cobbled together research software with less of a focus on quality and reproducibility and more on getting their research published, today's push for FAIR data and software by academic journals, institutions, and other researchers seeking to collaborate has placed a much greater focus on the concept of "better software, better research."[13][16] Elaborating on that concept, Cohen et al. add that "ultimately, good research software can make the difference between valid, sustainable, reproducible research outputs and short-lived, potentially unreliable or erroneous outputs."[16]

The concept of software quality management (SQM) has traditionally not been lost on professional, commercial software development businesses. Good SQM practices have been less prevalent in homegrown research software development; however, the expanded adoption of FAIR data and FAIR software approaches has shifted the focus onto the repeatability, reproducibility, and interoperability of research results and data produced by more sustainable research software. The adoption of FAIR by academic and institutional research labs not only brings commercial SQM and other software development approaches into their workflow, but also gives commercial laboratory informatics software developers an opportunity to embrace many aspects of the FAIR approach to laboratory research practices, including lessons learned and development practices from the growing number of RSEs. This doesn't mean commercial developers are going to suddenly take an open-source approach to their code, and it doesn't mean academic and institutional research labs are going to give up the benefits of the open-source paradigm as applied to research software.[17] However, as Moynihan noted, both research software development paradigms stand to gain from the shift to more FAIR data and software. Additionally, if commercial laboratory informatics vendors want to continue to competitively market relevant and sustainable research software to research labs, they frankly have little choice but to commit extra resources to learning about the application of FAIR principles to their offerings tailored to those labs.

The focus on data types and metadata within the scope of FAIR is shifting how laboratory informatics software developers and RSEs make their research software and choose their database approaches

Close to the core of any deep discussion of the FAIR data principles are the concepts of data models, data types, metadata, and persistent unique identifiers (PIDs). Making research objects more findable, accessible, interoperable, and reusable is no easy task when data types and approaches to metadata assignment (if there even is such an approach) are widely differing and inconsistent. Metadata is a means for better storing and characterizing research objects for the purposes of ensuring provenance and reproducibility of those research objects.[18][19] This means implementing, as early as possible, a FAIR-driven, software-based approach that captures FAIR metadata at the source using flexible domain-driven ontologies and cleans up old research objects that aren't FAIR-ready, all while limiting hindrances to research processes as much as possible.[19] And that approach must value the importance of metadata and PIDs. As Weigel et al. note in a discussion on making laboratory data and workflows more machine-findable: "Metadata capture must be highly automated and reliable, both in terms of technical reliability and ensured metadata quality. This requires an approach that may be very different from established procedures."[20] Enter non-relational RDF knowledge graph databases.

This brings us to our second point: given the importance of metadata and PIDs to FAIRifying research objects (and even research software), established, more traditional research software development methods using common relational databases may not be enough, even for commercial laboratory informatics software developers. Non-relational Resource Description Framework (RDF) knowledge graph databases used in FAIR-driven, well-designed laboratory informatics software help make research objects more FAIR for all research labs.

Research objects can take many forms (i.e., data types), making the storage and management of those objects challenging, particularly in research settings with great diversity of data, as with materials research. Some have approached this challenge by combining different database and systems technologies, each best suited to a particular data type.[21] However, while query performance and storage footprint improve with this approach, data across the different storage mechanisms typically remain unlinked and non-compliant with FAIR principles. Here, either a full RDF knowledge graph database or a similar integration layer is required to make the research objects more interoperable and reusable, whether they are materials records or specimen data.[21][22]

It is beyond the scope of this Q&A article to discuss RDF knowledge graph databases at length. (For a deeper dive on this topic, see Rocca-Serra et al. and the FAIR Cookbook.[23]) However, know that the primary strength of these databases for the FAIRification of research objects is their ability to provide semantic transparency, making these objects more easily accessible, interoperable, and machine-readable.[21] The resulting knowledge graphs, with their "subject-property-object" syntax and PIDs or uniform resource identifiers (URIs) helping to link data, metadata, ontology classes, and more, can be interpreted, searched, and linked by machines, and made human-readable, resulting in better research through derivation of new knowledge from the existing research objects. The end result is a representation of heterogeneous data and metadata that complies with the FAIR guiding principles.[21][22][23][24][25][26] This concept can even be extended to post factum visualizations of the knowledge graph data,[25] as well as the FAIR management of computational laboratory workflows.[27]
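To make the "subject-property-object" idea more concrete, the following Python sketch models a handful of statements as plain tuples and answers two questions by pattern matching. It is only a toy stand-in: a real system would use an RDF store queried with a language such as SPARQL. The `https://example.org/lab/` namespace, the property names, and the `match` helper are hypothetical and exist only for this illustration.

```python
# Hypothetical namespace and identifiers, used only for illustration.
EX = "https://example.org/lab/"

# Each statement is a (subject, property, object) triple; URIs act as identifiers.
triples = {
    (EX + "sample/S-0042", EX + "prop/measuredBy", EX + "instrument/UV-Vis-01"),
    (EX + "sample/S-0042", EX + "prop/hasMaterial", EX + "material/PMMA"),
    (EX + "sample/S-0043", EX + "prop/hasMaterial", EX + "material/PMMA"),
    (EX + "instrument/UV-Vis-01", EX + "prop/locatedIn", EX + "room/B12"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which samples are made of PMMA?
pmma_samples = [t[0] for t in match(triples, p=EX + "prop/hasMaterial", o=EX + "material/PMMA")]

# Follow a link across two statements: where was sample S-0042 measured?
rooms = [
    room
    for _, _, instrument in match(triples, s=EX + "sample/S-0042", p=EX + "prop/measuredBy")
    for _, _, room in match(triples, s=instrument, p=EX + "prop/locatedIn")
]
```

Because every subject, property, and object is a globally unique identifier, the second query can hop from a sample to its instrument to a room without any table joins being designed in advance, which is the kind of machine-actionable linking the knowledge graph approach is meant to provide.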

- https://labbit.com/resources/rdf-knowledge-graph-databases-a-better-choice-for-life-science-lab-software and https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf

- https://biss.pensoft.net/article/37412/

- https://link.springer.com/article/10.1007/s40192-024-00348-4

- https://www.nature.com/articles/s41597-022-01352-z



Applying FAIR-driven metadata schemes to laboratory informatics software development gives data a FAIRer chance at being ready for machine learning and artificial intelligence applications

  • Developing laboratory informatics software with a focus on FAIR-driven metadata schemes makes data objects not only more FAIR but also "clean" and machine-ready for advanced analytical uses such as machine learning and artificial intelligence, whether those capabilities are built into the laboratory informatics software or separate from it.
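One way to picture what "machine-ready" means here is the small Python sketch below, in which only records with sufficient metadata can be normalized into rows suitable for a downstream model. The record layout, the `to_training_rows` function, and the conversion table are assumptions invented for the example rather than features of any specific software.

```python
# Hypothetical records; field names and units are illustrative assumptions.
records = [
    {"pid": "urn:uuid:0001", "analyte": "glucose", "value": 5.4, "unit": "mmol/L"},
    {"pid": "urn:uuid:0002", "analyte": "glucose", "value": 97.0, "unit": "mg/dL"},
    {"pid": "urn:uuid:0003", "analyte": "glucose", "value": 6.1, "unit": None},
]

# Conversion factors to a common unit (mmol/L), keyed by (analyte, unit).
TO_MMOL_PER_L = {
    ("glucose", "mmol/L"): 1.0,
    ("glucose", "mg/dL"): 1 / 18.016,  # glucose molar mass is about 180.16 g/mol
}


def to_training_rows(recs):
    """Keep only records whose metadata allows unambiguous normalization."""
    rows, rejected = [], []
    for r in recs:
        factor = TO_MMOL_PER_L.get((r["analyte"], r["unit"]))
        if factor is None:
            rejected.append(r["pid"])  # unit missing or unknown: not machine-ready
            continue
        rows.append((r["pid"], round(r["value"] * factor, 2)))
    return rows, rejected


rows, rejected = to_training_rows(records)
```

In this sketch the third record is set aside because its unit is missing. Without that piece of metadata, a value of 6.1 cannot be safely compared with the others, which is the sort of ambiguity that FAIR-driven metadata schemes are intended to prevent before data reach an analytical pipeline.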

- https://www.pharmasalmanac.com/articles/embracing-fair-data-on-the-path-to-ai-readiness

- https://arxiv.org/abs/2404.05779

- https://www.nature.com/articles/s41597-023-02298-6

- https://repositories.lib.utexas.edu/items/a366780e-3d54-4aaa-8465-6da1e38ee38a

Resources

Zontal: https://8248491.fs1.hubspotusercontent-na1.net/hubfs/8248491/Marketing%20Material/White%20paper/FAIR%20Data/FAIR%20data%20-%20how%20data%20increases%20the%20value%20of%20biotechs.pdf


Restricted or personal information while still being FAIR


On data integrity and FAIR: https://arkivum.com/what-is-fair-data-and-can-life-science-organisations-ensure-data-is-compliant-whilst-adhering-to-these-principles/

More: https://www.europeanpharmaceuticalreview.com/article/157371/implementing-the-fair-data-principles-is-now-a-critical-endeavour/

More: https://www.lexjansen.com/phuse/2019/sa/SA04.pdf

Knowledge graph stuff:

- https://arxiv.org/abs/2404.12935

- https://direct.mit.edu/dint/article/4/4/867/112737/FAIR-Versus-Open-Data-A-Comparison-of-Objectives

Conclusion

References

  1. 1.0 1.1 Wilkinson, Mark D.; Dumontier, Michel; Aalbersberg, IJsbrand Jan; Appleton, Gabrielle; Axton, Myles; Baak, Arie; Blomberg, Niklas; Boiten, Jan-Willem et al. (15 March 2016). "The FAIR Guiding Principles for scientific data management and stewardship" (in en). Scientific Data 3 (1): 160018. doi:10.1038/sdata.2016.18. ISSN 2052-4463. PMC PMC4792175. PMID 26978244. https://www.nature.com/articles/sdata201618. 
  2. "fair data principles". PubMed Search. National Institutes of Health, National Library of Medicine. https://pubmed.ncbi.nlm.nih.gov/?term=fair+data+principles. Retrieved 30 April 2024. 
  3. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html. 
  4. 4.0 4.1 4.2 4.3 Gruenpeter, M. (23 November 2020). "FAIR + Software: Decoding the principles" (PDF). FAIRsFAIR “Fostering FAIR Data Practices In Europe”. https://www.fairsfair.eu/sites/default/files/FAIR%20%2B%20software.pdf. Retrieved 30 April 2024. 
  5. Barker, Michelle; Chue Hong, Neil P.; Katz, Daniel S.; Lamprecht, Anna-Lena; Martinez-Ortiz, Carlos; Psomopoulos, Fotis; Harrow, Jennifer; Castro, Leyla Jael et al. (14 October 2022). "Introducing the FAIR Principles for research software" (in en). Scientific Data 9 (1): 622. doi:10.1038/s41597-022-01710-x. ISSN 2052-4463. PMC PMC9562067. PMID 36241754. https://www.nature.com/articles/s41597-022-01710-x. 
  6. Patel, Bhavesh; Soundarajan, Sanjay; Ménager, Hervé; Hu, Zicheng (23 August 2023). "Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool" (in en). Scientific Data 10 (1): 557. doi:10.1038/s41597-023-02463-x. ISSN 2052-4463. PMC PMC10447492. PMID 37612312. https://www.nature.com/articles/s41597-023-02463-x. 
  7. Du, Xinsong; Dastmalchi, Farhad; Ye, Hao; Garrett, Timothy J.; Diller, Matthew A.; Liu, Mei; Hogan, William R.; Brochhausen, Mathias et al. (6 February 2023). "Evaluating LC-HRMS metabolomics data processing software using FAIR principles for research software" (in en). Metabolomics 19 (2): 11. doi:10.1007/s11306-023-01974-3. ISSN 1573-3890. https://link.springer.com/10.1007/s11306-023-01974-3. 
  8. 8.0 8.1 Gruenpeter, Morane; Katz, Daniel S.; Lamprecht, Anna-Lena; Honeyman, Tom; Garijo, Daniel; Struck, Alexander; Niehues, Anna; Martinez, Paula Andrea et al. (13 September 2021). "Defining Research Software: a controversial discussion". Zenodo. doi:10.5281/zenodo.5504016. https://zenodo.org/record/5504016. 
  9. "What is Research Software?". JuRSE, the Community of Practice for Research Software Engineering. Forschungszentrum Jülich. 13 February 2024. https://www.fz-juelich.de/en/rse/about-rse/what-is-research-software. Retrieved 30 April 2024. 
  10. 10.0 10.1 10.2 10.3 van Nieuwpoort, Rob; Katz, Daniel S. (14 March 2023) (in en). Defining the roles of research software. doi:10.54900/9akm9y5-5ject5y. https://upstream.force11.org/defining-the-roles-of-research-software. 
  11. "Open source software and code". F1000 Research Ltd. 2024. https://www.f1000.com/resources-for-researchers/open-research/open-source-software-code/. Retrieved 30 April 2024. 
  12. Moynihan, G. (7 July 2020). "The Hitchhiker’s Guide to Research Software Engineering: From PhD to RSE". Invenia Blog. Invenia Technical Computing Corporation. https://invenia.github.io/blog/2020/07/07/software-engineering/. 
  13. 13.0 13.1 Woolston, Chris (31 May 2022). "Why science needs more research software engineers" (in en). Nature: d41586-022-01516-2. doi:10.1038/d41586-022-01516-2. ISSN 0028-0836. https://www.nature.com/articles/d41586-022-01516-2. 
  14. "RSE@KIT". Karlsruhe Institute of Technology. 20 February 2024. https://www.rse-community.kit.edu/index.php. Retrieved 01 May 2024. 
  15. "Purdue Center for Research Software Engineering". Purdue University. 2024. https://www.rcac.purdue.edu/rse. Retrieved 01 May 2024. 
  16. 16.0 16.1 Cohen, Jeremy; Katz, Daniel S.; Barker, Michelle; Chue Hong, Neil; Haines, Robert; Jay, Caroline (1 January 2021). "The Four Pillars of Research Software Engineering". IEEE Software 38 (1): 97–105. doi:10.1109/MS.2020.2973362. ISSN 0740-7459. https://ieeexplore.ieee.org/document/8994167/. 
  17. Hasselbring, Wilhelm; Carr, Leslie; Hettrick, Simon; Packer, Heather; Tiropanis, Thanassis (25 February 2020). "From FAIR research data toward FAIR and open research software" (in en). it - Information Technology 62 (1): 39–47. doi:10.1515/itit-2019-0040. ISSN 2196-7032. https://www.degruyter.com/document/doi/10.1515/itit-2019-0040/html. 
  18. Ghiringhelli, Luca M.; Baldauf, Carsten; Bereau, Tristan; Brockhauser, Sandor; Carbogno, Christian; Chamanara, Javad; Cozzini, Stefano; Curtarolo, Stefano et al. (14 September 2023). "Shared metadata for data-centric materials science" (in en). Scientific Data 10 (1): 626. doi:10.1038/s41597-023-02501-8. ISSN 2052-4463. PMC PMC10502089. PMID 37709811. https://www.nature.com/articles/s41597-023-02501-8. 
  19. 19.0 19.1 Fitschen, Timm; tom Wörden, Henrik; Schlemmer, Alexander; Spreckelsen, Florian; Hornung, Daniel (12 October 2022). "Agile Research Data Management with FDOs using LinkAhead". Research Ideas and Outcomes 8: e96075. doi:10.3897/rio.8.e96075. ISSN 2367-7163. https://riojournal.com/article/96075/. 
  20. Weigel, Tobias; Schwardmann, Ulrich; Klump, Jens; Bendoukha, Sofiane; Quick, Robert (1 January 2020). "Making Data and Workflows Findable for Machines" (in en). Data Intelligence 2 (1-2): 40–46. doi:10.1162/dint_a_00026. ISSN 2641-435X. https://direct.mit.edu/dint/article/2/1-2/40-46/9994. 
  21. 21.0 21.1 21.2 21.3 Aggour, Kareem S.; Kumar, Vijay S.; Gupta, Vipul K.; Gabaldon, Alfredo; Cuddihy, Paul; Mulwad, Varish (9 April 2024). "Semantics-Enabled Data Federation: Bringing Materials Scientists Closer to FAIR Data" (in en). Integrating Materials and Manufacturing Innovation. doi:10.1007/s40192-024-00348-4. ISSN 2193-9764. https://link.springer.com/10.1007/s40192-024-00348-4. 
  22. 22.0 22.1 Grobe, Peter; Baum, Roman; Bhatty, Philipp; Köhler, Christian; Meid, Sandra; Quast, Björn; Vogt, Lars (26 June 2019). "From Data to Knowledge: A semantic knowledge graph application for curating specimen data" (in en). Biodiversity Information Science and Standards 3: e37412. doi:10.3897/biss.3.37412. ISSN 2535-0897. https://biss.pensoft.net/article/37412/. 
  23. 23.0 23.1 Rocca-Serra, Philippe; Sansone, Susanna-Assunta; Gu, Wei; Welter, Danielle; Abbassi Daloii, Tooba; Portell-Silva, Laura (30 June 2022). "D2.1 FAIR Cookbook - FAIR and Knowledge graphs". Zenodo. doi:10.5281/ZENODO.6783564. https://zenodo.org/record/6783564. 
  24. Tomlinson, E. (28 July 2023). "RDF Knowledge Graph Databases: A Better Choice for Life Science Lab Software" (PDF). Semaphore Solutions, Inc. https://21624527.fs1.hubspotusercontent-na1.net/hubfs/21624527/Resources/RDF%20Knowledge%20Graph%20Databases%20White%20Paper.pdf. Retrieved 01 May 2024. 
  25. 25.0 25.1 Deagen, Michael E.; McCusker, Jamie P.; Fateye, Tolulomo; Stouffer, Samuel; Brinson, L. Cate; McGuinness, Deborah L.; Schadler, Linda S. (27 May 2022). "FAIR and Interactive Data Graphics from a Scientific Knowledge Graph" (in en). Scientific Data 9 (1): 239. doi:10.1038/s41597-022-01352-z. ISSN 2052-4463. PMC PMC9142568. PMID 35624233. https://www.nature.com/articles/s41597-022-01352-z. 
  26. Brandizi, Marco; Singh, Ajit; Rawlings, Christopher; Hassani-Pak, Keywan (25 September 2018). "Towards FAIRer Biological Knowledge Networks Using a Hybrid Linked Data and Graph Database Approach" (in en). Journal of Integrative Bioinformatics 15 (3): 20180023. doi:10.1515/jib-2018-0023. ISSN 1613-4516. PMC PMC6340125. PMID 30085931. https://www.degruyter.com/document/doi/10.1515/jib-2018-0023/html. 
  27. de Visser, Casper; Johansson, Lennart F.; Kulkarni, Purva; Mei, Hailiang; Neerincx, Pieter; Joeri van der Velde, K.; Horvatovich, Péter; van Gool, Alain J. et al. (28 September 2023). Palagi, Patricia M., ed. "Ten quick tips for building FAIR workflows" (in en). PLOS Computational Biology 19 (9): e1011369. doi:10.1371/journal.pcbi.1011369. ISSN 1553-7358. PMC PMC10538699. PMID 37768885. https://dx.plos.org/10.1371/journal.pcbi.1011369.