Journal:A roadmap for LIMS at NIST Material Measurement Laboratory

Revision as of 00:03, 6 May 2022

Full article title: A roadmap for LIMS at NIST Material Measurement Laboratory
Author(s): Greene, Gretchen; Ragland, Jared; Trautt, Zachary; Lau, June; Plante, Raymond; Taillon, Joshua; Creuziger, Adam; Becker, Chandler; Bennett, Joseph; Blonder, Niksa; Borsuk, Lisa; Campbell, Carelyn; Friss, Adam; Hale, Lucas; Halter, Michael; Hanisch, Robert; Hardin, Gary; Levine, Lyle; Maragh, Samantha; Miller, Sierra; Muzny, Christopher; Newrock, Marcus; Perkins, John; Plant, Anne; Ravel, Bruce; Ross, David; Scott, John H.; Szakal, Chris; Tona, Alessandro; Vallone, Peter
Author affiliation(s): National Institute of Standards and Technology
Year published: 2022
Volume and issue: NIST Technical Note 2216
Page(s): i–iii, 1–17
DOI: 10.6028/NIST.TN.2216
Distribution license: Public domain
Website: https://www.nist.gov/publications/roadmap-lims-nist-material-measurement-laboratory
Download: https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=934610 (PDF)

Foreword

Over the past decade, emerging technology in laboratory and computational science has changed the landscape for research by accelerating the production, processing, and exchange of data. The NIST Material Measurement Laboratory community recognizes that to keep pace with the transformation of measurement science to a digital paradigm, it is essential to implement laboratory information management systems (LIMS). Effective introduction of LIMS early in the research life cycle provides direct support for planning and execution of experiments and accelerating research productivity. From this perspective, LIMS are not passive entities with isolated interaction, but rather key resources supporting collaboration, scientific integrity, and transfer of knowledge over time. They serve as a delivery system for organizational contributions to the broader federated data community, supporting both controlled and open access, determined by the sensitivity of the research.

The overall goal of a successful LIMS is to empower a research community by establishing common tools providing access to laboratory data resources. Modern LIMS should therefore provide several core functions and touchpoints:

  • Workflow management – A research workflow describes steps to be performed to derive results. These patterns serve as a prescription for LIMS to control the progression of data and associated services or tools. Automation of a workflow simplifies the transfer of information through defined interfaces from a network of systems.
  • Repository of data – Effective storage and retrieval of data (raw and derived)—including associated metadata, data products, calibration, software, logs, etc.—facilitates data discovery, processing, collaboration, and dissemination.
  • Creation of data products and tools – A LIMS should support storage and processing of raw data, leading to products which can be shared and consumed. Examples would include sample data, instrument-generated data, and algorithms generating defined outputs. Inclusion of data models provides context and structure, and machine learning (ML) integration may generate related data which could be combined into more comprehensive data models. Tools may include visualization, evaluation, and analysis packages offering users advanced capabilities for their research projects.
  • Organization of data for search and retrieval – Tools and interfaces give users access to sophisticated searches of data holdings and efficient mechanisms for data transfer in standardized formats. Searching should extend to domain or project-specific semantics, be coupled closely with related data, and go beyond individual research projects to include super-searches (e.g., use-case-driven interoperability between LIMS).
  • Long-lived, stable, and agile structures – LIMS require institutional and architectural sustainability for long baseline research and curatorship. Technology tends to change faster than the practical lifetime of research programs, so paths must exist for maintaining IT infrastructure and introducing faster and more complex processes.
  • Standards and best practices – LIMS benefit from standardization to support collaborations among research communities and make data workflows efficient and affordable. Community buy-in for standards and best practices is an essential part of LIMS, and organizational shared expertise naturally serves as a means for coordination and adaptation of standards.
  • User involvement – In all the core functions listed above, it is critical to involve the subject matter experts from the beginning. LIMS should establish a working team that explicitly includes representatives from the end user community.
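
As a minimal sketch of the workflow management function described above, a workflow can be modeled as an ordered list of named steps, each of which receives a record and passes an updated record to the next step. Every name here (`Step`, `run_workflow`, the three example steps, and the calibration factor) is hypothetical and chosen for illustration; none of it reflects an actual NIST LIMS implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Step:
    """One named stage of a research workflow (hypothetical structure)."""
    name: str
    action: Callable[[Record], Record]


def run_workflow(steps: List[Step], record: Record) -> Record:
    """Run the steps in order and note which steps were completed."""
    current = dict(record)
    history = []
    for step in steps:
        # Each step works on a copy so the caller's input is never mutated.
        current = step.action(dict(current))
        history.append(step.name)
    current["history"] = history
    return current


def acquire(record: Record) -> Record:
    record["raw_counts"] = [10, 12, 11]  # stand-in for instrument output
    return record


def calibrate(record: Record) -> Record:
    factor = 0.5  # illustrative calibration factor, not a real value
    record["calibrated"] = [c * factor for c in record["raw_counts"]]
    return record


def summarize(record: Record) -> Record:
    values = record["calibrated"]
    record["mean"] = sum(values) / len(values)
    return record


WORKFLOW = [
    Step("acquire", acquire),
    Step("calibrate", calibrate),
    Step("summarize", summarize),
]

if __name__ == "__main__":
    result = run_workflow(WORKFLOW, {"sample_id": "S-001"})
    print(result["history"], result["mean"])
```

Keeping the step list as data, separate from the runner, is one way to let the same runner serve different research workflows; a production system would also need error handling and persistent storage, which this sketch omits.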

Abstract

Instrumentation generates data faster and in greater quantity than ever before, and inter-laboratory research is in historic demand domestically and internationally to stimulate economic innovation. Strategic mission needs of the NIST Material Measurement Laboratory (MML) to support a wide array of research disciplines therefore compel our organization to adopt advanced strategies for research data management. Laboratory information management systems (LIMS) provide a framework for managing data from the outset of the research life cycle, delivering new capabilities for machine learning (ML), data analysis, collaboration, and dissemination. This roadmap describes our current understanding and strategy for adapting our research workflows for LIMS throughout MML by embracing the use of standards and best practices from data science communities. The NIST research data cyber-infrastructure complements these goals for MML by providing a secure environment to host LIMS solutions. Additionally, integration of scientific workflows requires ongoing collaboration to bridge organizational LIMS with external scientific communities. Thus, MML LIMS will evolve over time in synergy with the technology and experimental environments, delivering new science. LIMS will broaden our mission impact through adoption of the FAIR Data Principles.

Keywords: data, laboratory information management systems, experimental data, research data, research workflows

Introduction

Beginning in late 2019, MML initiated, as part of its strategic plan, the development of "next-generation" data and informatics, with a focus on LIMS as a key resource to support research data and science. This effort was complemented by initiatives for enhancing data management planning and data systems infrastructure. The first year’s groundwork established common needs for both individual researchers and teams to engage more readily with LIMS, with the goal of building greater capacity for interaction with and use of data. A vision for LIMS was written to convey the purpose of these collective efforts:

“Laboratory information management systems will provide MML scientists a practical means for repeatability, traceability, reproducibility, efficiency, and compliance of research, serving as a beacon to both intramural and extramural community stakeholders.”

The MML approach to implementing LIMS started with defining goals for specific division research laboratory projects and established a cross-divisional Community of Interest (COI) group for sharing solutions, services, practices, and challenges. Comprehensive LIMS solutions have been successfully implemented in several NIST laboratories. Several shared resources have successfully demonstrated use of LIMS components including repository platforms, a standard persistent identifier service, a centralized research data storage with networked data transfer nodes, and data transfer services, in addition to expertise in data modeling and semantics. These resources, along with community best practices, contribute to a basic LIMS architecture for research.

A system view and architecture model provide the foundation for planned future outcomes. In this roadmap, we define a set of research-oriented LIMS capabilities which serves to guide implementation along with components to deliver these capabilities. More detailed guidance for use of specific LIMS resources is available internally to NIST and where possible shared on external repository websites. It is also anticipated that LIMS implementation will provide an important contribution to the goals of NIST program areas such as artificial intelligence (AI), biosystems, chemical informatics, additive manufacturing, and the materials science areas which spearheaded early innovation in data systems through the Materials Genome Initiative.

Roadmap objectives

This roadmap provides a framework for manifesting the MML LIMS vision and outlines the key objectives highlighted by the MML LIMS COI project goals. These are grouped into near- and long-term objectives for MML LIMS prioritization. These objectives, along with broader community efforts, will strengthen the NIST data-as-an-asset [1] strategic approach to research. More comprehensive goals such as the development of a "digital twin" [2] will enable models to probe the measurement science space to further analyze the physics and gain understanding, leading to new science.

Near-term objectives include:

  • Establish a LIMS COI for MML (as of this writing, already in place)
  • Develop pilot LIMS solutions for targeted research workflows (as of this writing, several solutions already piloted and deployed for laboratory operations)
  • Design tiered LIMS architectures to support a range of research workflow implementations
  • Establish core infrastructure services to support LIMS
  • Develop data acquisition and experimental activity capture solutions
  • Deploy key functional support services such as a Handle.Net service (supports persistent identifiers) and data transfer service (see the later subsection on supporting services)
  • Prototype and exchange LIMS components as a basis for shared resources (e.g., repository platforms for experimental activity; instrumentation; samples; extract, transform, and load [ETL])
  • Establish best practices in digital object management to support standards and community practice
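
The extract, transform, and load (ETL) components mentioned in the list above can be illustrated with a small sketch. It assumes a made-up delimited instrument export and loads it into an in-memory SQLite table; the column names, units, and table layout are assumptions for illustration only and do not describe any NIST system.

```python
import csv
import io
import sqlite3
from typing import Dict, Iterable, Iterator, List, Tuple

# Made-up instrument export; column names and units are assumptions.
RAW_CSV = """sample_id,temperature_C,mass_mg
S-001,25.0,1500
S-002,30.5,2250
"""


def extract(text: str) -> Iterator[Dict[str, str]]:
    """Extract: read rows from delimited text as dictionaries."""
    return csv.DictReader(io.StringIO(text))


def transform(rows: Iterable[Dict[str, str]]) -> List[Tuple[str, float, float]]:
    """Transform: convert Celsius to kelvin and milligrams to grams."""
    records = []
    for row in rows:
        kelvin = float(row["temperature_C"]) + 273.15
        grams = float(row["mass_mg"]) / 1000.0
        records.append((row["sample_id"], kelvin, grams))
    return records


def load(conn: sqlite3.Connection, records: List[Tuple[str, float, float]]) -> int:
    """Load: store records in a table and return the resulting row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurements "
        "(sample_id TEXT PRIMARY KEY, temperature_K REAL, mass_g REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO measurements VALUES (?, ?, ?)", records
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    count = load(connection, transform(extract(RAW_CSV)))
    print(f"loaded {count} rows")
```

Separating the three stages into functions keeps each one replaceable, for example swapping the CSV reader for a vendor-specific extractor without touching the load step.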

Long-term objectives include:

  • Establish best practices for development and implementation of data models and semantics
  • Deploy prioritized LIMS end-to-end solutions (achieving multi-component level functionality)
  • Develop use-case-driven solutions for cross-laboratory LIMS interconnectivity
  • Provide on-demand system-level LIMS resources for research teams at NIST for rapid engagement
  • Develop automated workflow integration between LIMS and computational platforms (e.g., high-performance computing [HPC], ML, and analysis applications like SciServer)
  • Develop integration between LIMS and public data access systems
  • Establish methodology for building and applying "digital twin" models
  • Provide NIST leadership a strategy for adoption and implementation of LIMS to promote innovative data-driven science

Challenges

Within the MML LIMS COI exchange forum, lessons learned and shared by early adopters of LIMS highlighted several key challenges. These will be factored into data management plans for implementation and extended to the broader MML community, which relies on either the operation of LIMS or the end use of LIMS outputs.

One common challenge with instrumentation data is that instruments generate vendor-proprietary output formats. A repository for sharing data format exchange software tools is a good example of a solution that benefits LIMS by supporting the need to transform vendor data into more consumable open data formats for downstream analysis and computation. A prototype repository was created by the Office of Data and Informatics (ODI) with a few extractor tools, and efforts are underway to explore how this may achieve wider utility. Community repositories such as Bio-Formats (https://docs.openmicroscopy.org/bio-formats/6.7.0/index.html) and MaterialsIO (https://github.com/materials-data-facility/MaterialsIO) are examples of resources that provide tools for converting third-party data into open data models. These community-oriented solutions demonstrate how shared software can lower the barrier to adopting LIMS.
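
To make the idea of shared extractor tools concrete, the following is a minimal sketch of an extractor registry that converts a vendor-style text file into JSON. The `.vnd` format, the `register` decorator, and the instrument name are invented for this example; this is not the API of Bio-Formats, MaterialsIO, or the ODI prototype repository.

```python
import json
from typing import Callable, Dict

# Maps a file extension to the parser that understands it (hypothetical registry).
EXTRACTORS: Dict[str, Callable[[str], dict]] = {}


def register(extension: str):
    """Decorator that registers a parser for a file extension."""
    def decorator(func: Callable[[str], dict]) -> Callable[[str], dict]:
        EXTRACTORS[extension] = func
        return func
    return decorator


@register(".vnd")
def parse_vnd(text: str) -> dict:
    """Parse a made-up vendor format: '#key=value' header lines, then x,y rows."""
    metadata, data = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            key, _, value = line[1:].partition("=")
            metadata[key.strip()] = value.strip()
        else:
            x, y = line.split(",")
            data.append({"x": float(x), "y": float(y)})
    return {"metadata": metadata, "data": data}


def convert(filename: str, text: str) -> str:
    """Pick an extractor by file extension and return the content as JSON."""
    for extension, parser in EXTRACTORS.items():
        if filename.endswith(extension):
            return json.dumps(parser(text), indent=2, sort_keys=True)
    raise ValueError(f"no extractor registered for {filename!r}")


# Illustrative file content; the instrument name is invented.
SAMPLE = "#instrument=XRD-1\n#units=counts\n1.0,100\n2.0,140\n"

if __name__ == "__main__":
    print(convert("scan.vnd", SAMPLE))
```

A registry of this kind lets contributors add support for a new format by supplying one parser function, which is the sort of low-barrier sharing the paragraph above describes.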

Another challenge is finding appropriately skilled staff to engineer domain-specific LIMS workflows. This is a common barrier to LIMS prioritization for research organizations. Integrating data structures requires close collaboration between domain and data science subject matter experts (SMEs) for modeling and mapping of multiple source data to repository storage.

Integration or migration of legacy systems and bespoke tools with next-generation LIMS architecture presents another challenge, especially for those with limited resources supporting maintenance. Legacy systems commonly lack sustainability due to factors such as end of funding support or unavailable expertise.

Data provenance is commonly required for sample tracking and traceability across laboratory processes (e.g., sample transformations, generation of parts, or inter-laboratory sample exchange). The latter is a challenge for inter-laboratory studies because data management systems (including LIMS) are most often neither standardized nor normalized. Supporting common data exchange protocols and chain of custody workflows will be an ongoing design consideration for interoperability, including concepts such as data trust and integrity.
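
One way to picture a chain of custody record with built-in integrity checking is a hash-chained event log, sketched below. This is an illustrative technique under stated assumptions, not a method described in the roadmap: the field names, the `append_event` and `verify` helpers, and the sample and laboratory identifiers are all hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: Dict[str, str]) -> str:
    """Hash an entry body deterministically (keys sorted before encoding)."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_event(log: List[dict], sample_id: str, action: str, actor: str) -> dict:
    """Append a custody event that is linked to the hash of the previous event."""
    previous = log[-1]["hash"] if log else GENESIS
    body = {
        "sample_id": sample_id,
        "action": action,
        "actor": actor,
        "previous": previous,
    }
    entry = dict(body, hash=_digest(body))
    log.append(entry)
    return entry


def verify(log: List[dict]) -> bool:
    """Return True only if every entry is unmodified and correctly linked."""
    previous = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["previous"] != previous or _digest(body) != entry["hash"]:
            return False
        previous = entry["hash"]
    return True


if __name__ == "__main__":
    custody: List[dict] = []
    append_event(custody, "S-001", "received", "lab-A")
    append_event(custody, "S-001", "sectioned", "lab-A")
    append_event(custody, "S-001", "shipped", "lab-B")
    print("log intact:", verify(custody))
```

Because each entry embeds the hash of the one before it, editing any earlier event causes `verify` to fail, which gives laboratories exchanging samples a simple tamper-evidence check even when their LIMS differ.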

Another common challenge is operational security compliance for IT infrastructure. NIST adheres to CIS Controls (Critical Security Controls), and as LIMS architectures rely on networked systems, this translates to requirements for vigilant monitoring of service and platform deployments to ensure organizational security.

MML LIMS stakeholders

References

Notes

This document falls in the U.S. public domain and is republished courtesy of the National Institute of Standards and Technology. This presentation is faithful to the original, with only a few minor changes to presentation, spelling, and grammar.