Journal:Efficient sample tracking with OpenLabFramework

From LIMSWiki
Full article title: Efficient sample tracking with OpenLabFramework
Journal: Scientific Reports
Author(s): List, Markus; Schmidt, Steffen; Trojnar, Jakub; Thomas, Jochen; Thomassen, Mads; Kruse, Torben A.; Tan, Qihua; Baumbach, Jan; Mollenhauer, Jan
Author affiliation(s): University of Southern Denmark, io-consultants GmbH & Co. KG
Primary contact: Email via http://www.nature.com/articles/srep04278/email/correspondent/c1/new (requires login)
Year published: 2014
Volume and issue: 4
Page(s): 4278
DOI: 10.1038/srep04278
ISSN: 2045-2322
Distribution license: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported
Website: http://www.nature.com/articles/srep04278
Download: http://www.nature.com/articles/srep04278.pdf (PDF)

Abstract

The advance of new technologies in biomedical research has led to a dramatic growth in experimental throughput. Projects therefore steadily grow in size and involve a larger number of researchers. Spreadsheets traditionally used are thus no longer suitable for keeping track of the vast amounts of samples created and need to be replaced with state-of-the-art laboratory information management systems. Such systems have been developed in large numbers, but they are often limited to specific research domains and types of data. One domain so far neglected is the management of libraries of vector clones and genetically engineered cell lines. OpenLabFramework is a newly developed web-application for sample tracking, particularly laid out to fill this gap, but with an open architecture allowing it to be extended for other biological materials and functional data. Its sample tracking mechanism is fully customizable and aids productivity further through support for mobile devices and barcoded labels.

Introduction

With the development of high-throughput technologies, laboratory work has seen a paradigm shift from small projects involving single or few researchers towards large-scale projects involving several laboratories and often hundreds or thousands of samples. Sample management is therefore a growing issue, especially since most laboratories still attempt to keep track of their samples using spreadsheet tools. A high turn-over of academic staff coupled with maintenance of individual files that are often locked or outdated, as well as inconsistent nomenclature and labeling, can lead to tedious repetition of previously existing work. The significant amount of time that is often spent on locating samples would be better used for performing experiments. Moreover, expensive storage space is wasted, since samples are often not labeled properly and cannot be identified. Even if a label is given, it usually does not include a standardized minimal amount of information that allows unambiguous identification of the materials or the experiments they were derived from. Numerous commercial and open-source solutions have been developed in an attempt to overcome these problems.

Although solutions are offered by commercial companies like LabVantage, most academic laboratories find it difficult to afford the license costs, which usually rise with additional users and technical features. The focus of this paper is thus open-source systems.

As Table 1 shows, open-source laboratory information management systems (LIMS) are often customized towards specific types of biomaterials or research data, for instance genotyping[1][2][3], protein production[4][5], protein-protein-interaction[6], 2D gel electrophoresis[7], or protein crystallography[8] data. Some generic LIMS target specific laboratory tasks, such as sample management[3][9][10], laboratory work-flows and protocols[2][11][12][13], documentation, management of lab stocks, or clinical studies.[14] Further solutions exist for molecular genetics and the creation of vector libraries.[15] There is, however, no dedicated LIMS for the management of large vector construct and cell line libraries. At our Lundbeck Foundation Center of Excellence in Nanomedicine (NanoCAN) at the University of Southern Denmark in Odense, such large-scale libraries need to be handled efficiently (see Mollenhauer et al.[16] for a short overview of our work). This motivated us to develop a novel open-source LIMS platform: OpenLabFramework (OLF).

Table 1. Some examples of existing browser-based LIMS solutions. Corresponding project URLs can be found in Supplemental Table 1.
Project name | Ref. | Main purpose | Built with
MMP-LIMS | [1] | Genome mapping in maize | Java
AGL-LIMS | [2] | Genotyping work-flow | Java
SMS | [3] | Gene mutation screening & biobanking | Java
PiMS | [4] | Sample & experiment tracking for protein production | Java
ProteinTracker | [5] | Protein production & purification | Java
PARPs Database | [6] | Protein-protein interaction data and data-mining | Perl/Java
LIPAGE | [7] | 2D gel electrophoresis based proteomics | PHP
LISA | [8] | Protein crystallography | PHP
EnzymeTracker | [9] | Data analysis, sample management, spreadsheet functionality | PHP
FreeLIMS | - | Sample management, reports | Java
YourLabData | - | Sample tracking and lab notebook | -
Open-LIMS | - | Experimental work-flow, sample & document management | PHP
OpenFreezer | [10] | Sample management & tracking | PHP/Python
iLAP | [11] | Data management, analysis, experimental protocol design | Java
SIGLa | [12] | Customized experimental work-flows | Java
Bika | - | Whole lab work-flow for clinical studies | Python
MicroGen | [13] | Microarray information and work-flow | MS-Access
LabLog | - | Project documentation | Java
LabStoRe | - | Chemical lab stocks | PHP
SIMBioMS | [14] | Linking experimental, patient, and high-throughput data | PHP
MolabIS | [15] | Molecular genetics data | Perl

Results

Any LIMS that involves sample management on a large scale should fulfill a number of requirements, listed below as R1–R15. Existing open-source LIMS fulfill these requirements to varying degrees (Table 2).

Table 2. Comparison of requirements across browser-based LIMS solutions for sample management. Abbreviations: EnzymeTracker (ET), Free-LIMS (FL), SLIMS (SL), YourLabData (YL), Open-LIMS (OL), ProteinTracker (PT), AGL-LIMS (AL), SMS (SM), MolabIS (MI), SIMBioMS (SI), OpenFreezer (OF), and PiMS (PS). "O" depicts limited fulfilment. Sample tracking refers to the physical location of samples. Local deployment refers to a local installation not requiring a database installation. Cloud deployment refers to documented cases.
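Requirement | ID | ET | FL | SL | YL | OL | PT | AL | SM | MI | SI | OF | PS | OLF
Open-source | R1 | Y | Y | Y | N | Y | Y | Y | Y | Y | Y | Y | Y | Y
Modularity | R2 | N | N | N | N | Y | N | Y | N | Y | Y | Y | N | Y
Sample management | R3 | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Sample tracking | R4 | N | N | O | Y | N | N | N | Y | Y | N | Y | Y | Y
File management | R5 | N | N | N | Y | Y | N | Y | N | N | Y | N | N | Y
Reports | R6 | Y | Y | Y | N | N | Y | Y | Y | Y | N | N | N | Y
Multiple DBMS | R7 | N | N | N | N | N | N | Y | N | N | Y | N | Y | Y
Local deployment | R8 | N | N | N | N | N | N | N | N | Y | Y | N | N | Y
Cloud deployment | R9 | N | N | N | N | N | N | N | N | N | N | N | N | Y
Documentation | R10 | Y | N | Y | O | Y | Y | N | Y | Y | Y | Y | Y | Y
Barcodes | R11 | Y | N | N | N | N | N | N | Y | N | N | N | N | Y
Labels | R12 | N | N | N | N | N | N | N | Y | N | N | N | N | Y
Mobile devices | R13 | N | N | N | N | N | N | N | N | N | N | N | N | Y
Data analysis | R14 | Y | N | N | N | O | N | N | N | N | N | N | N | N
Audit-logging | R15 | Y | N | Y | N | N | N | N | O | N | N | N | N | N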

Implementation

A LIMS for an academic environment needs to be open-source (R1), in order to save costs and to allow for adaptation to the specific requirements of a given scientific field and laboratory. Since adaptation can be a difficult and time-consuming task, a LIMS that is modular and extensible by design (R2) would be most appropriate. Although difficult to assess for existing projects, a LIMS should be reliable and its implementation simple. Existing frameworks and software packages that are maintained and tested by a large community are often more reliable than individual solutions and should thus be incorporated.

Data handling

Dealing with a large number of samples in a library or biobank requires efficient mechanisms for sample management (R3) and physical sample tracking over several hierarchical levels (R4). Since related information and experimental results are usually stored in additional documents, a management system, where files can be linked to an arbitrary number of samples (R5), would be most useful. Another requirement is that raw data previously entered into the system can be exported to various file formats. This requirement is usually met through an integrated reporting mechanism (R6).

Flexibility in deployment

Academic laboratories are often part of an existing IT infrastructure, but support is in many cases limited, e.g. to a single database management system (DBMS), such as MySQL. LIMS deployment should thus be as flexible as possible and not be bound to a specific operating system or DBMS. While the first requirement is fulfilled by all LIMS considered in this paper, multiple database support remains an issue (R7). Furthermore, if a suitable server is not available, deployment locally (R8) or to a cloud service (R9) is advantageous.

User acceptance and excess value

Triplet et al. have identified approachability as a major hurdle in the acceptance of a LIMS.[9] Modern web-technologies like Ajax allow for a more responsive and intuitive user interface, which in turn improves the user experience and reduces the learning period. Another crucial requirement for the successful adoption of a LIMS is good documentation (R10). User acceptance can also be improved by offering an excess value over traditional spreadsheet tools, for instance by incorporating the use of barcodes (R11), label printing (R12), and mobile devices, such as smartphones (R13). A further advantage would be the incorporation of data analysis tools directly within the LIMS (R14).

Security

LIMS typically address security concerns by restricting access through secure user logins and different user roles. Security would also be enhanced by audit logging features (R15), where a version number is added to each database entry. Any change will then result in a copy of the entry with a new version number, so that accidentally overwritten entries can be restored.
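
The versioning scheme described for R15 can be illustrated with a minimal sketch. It is not taken from any of the systems compared here; the class names, fields, and in-memory storage are assumptions chosen only to show how copy-on-change with version numbers makes an overwritten entry recoverable.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Entry:
    """One immutable version of a database entry (hypothetical schema)."""
    entry_id: int
    version: int
    data: Dict[str, str]


class VersionedStore:
    """Keeps every version of every entry instead of overwriting in place."""

    def __init__(self) -> None:
        self._history: Dict[int, List[Entry]] = {}

    def create(self, entry_id: int, data: Dict[str, str]) -> Entry:
        entry = Entry(entry_id, 1, dict(data))
        self._history[entry_id] = [entry]
        return entry

    def latest(self, entry_id: int) -> Entry:
        return self._history[entry_id][-1]

    def update(self, entry_id: int, **changes: str) -> Entry:
        # A change never edits the old row; it appends a copy with version + 1.
        current = self.latest(entry_id)
        new = Entry(entry_id, current.version + 1, {**current.data, **changes})
        self._history[entry_id].append(new)
        return new

    def restore(self, entry_id: int, version: int) -> Entry:
        # Restoring re-applies an old version as a new copy; history stays intact.
        old = next(e for e in self._history[entry_id] if e.version == version)
        new = Entry(entry_id, self.latest(entry_id).version + 1, dict(old.data))
        self._history[entry_id].append(new)
        return new


store = VersionedStore()
store.create(1, {"name": "vector-A", "box": "B12"})
store.update(1, box="B13")        # accidental overwrite becomes version 2
restored = store.restore(1, 1)    # version 1 comes back as version 3
```

In a relational database the same idea would typically be a version column plus insert-only writes; the in-memory dictionary here merely keeps the sketch self-contained.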

OpenLabFramework

We present OpenLabFramework (OLF), a laboratory information management system (LIMS) primarily targeted at advanced sample and storage management in mid-sized laboratories with fewer than 50 users. It facilitates a seamless integration of virtual and real world storage handling by making use of mobile devices, which are carried by lab personnel anyway, in combination with cheap and fully integrated barcode labeling technology. In the following we shed light on how OLF fulfills the LIMS requirements that we have identified before (R1–R15). A brief comparison with existing open-source LIMS is given in Table 2.

Modularity and extensibility

OLF is published as open-source (R1) and, due to its modular structure, it can be adapted to different types of laboratory data and sample types. New functionality can also be added and integrated (R2). Various features are covered by the following modules.

GeneTracker: GeneTracker is intended to fulfill requirements specific to the hierarchical organization of genes, gene variants, vector constructs, and genetically engineered cell lines, thus helping to keep track of extensive sample libraries in the field of targeted genomics. The organization of these samples is further supported through OLF's built-in user and project management features.

Sample storage: The Storage module adds options for tracking and organizing samples in a customizable storage infrastructure (R3–4). This infrastructure is hierarchical, starting from buildings and rooms and ending in individual freezers and storage boxes. Interactive grids help the user to assess the content of a storage box at a glance. Together with GeneTracker, samples can be added or removed from storage in an intuitive manner, while providing an overview of remaining copies and related samples.

File uploads: The FileAttachments module allows users to up- and download arbitrary files, allowing for a better organization of their results and documents. Files are stored with a combination of timestamp and original file name to avoid conflicts arising from identical file names. Files are uploaded to a configurable folder on the server and not to the database itself. They can be linked to an arbitrary number of samples, so that other users can quickly obtain an overview of files relevant to a sample (R5).

Barcode and label support: The functionality of the Storage module is complemented by the Barcode module, with which a user can create and print barcode labels (R11–12). These can later be used to locate a sample in OLF by scanning the barcode using a USB-connected scanner or a mobile device (R13). The Barcode module currently requires a connected DYMO label printer but can be extended in the future to support other devices.
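
How the Storage and Barcode modules fit together can be sketched as a small data structure. This is an illustration under stated assumptions, not OLF's actual data model: the class names, the grid dimensions, and the use of the barcode string as the cell content are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class StorageNode:
    """One level of the hierarchy, e.g. building, room, or freezer."""
    name: str
    parent: Optional["StorageNode"] = None

    def path(self) -> str:
        # Walk up to the root and print the levels top-down.
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))


@dataclass
class StorageBox(StorageNode):
    """Leaf of the hierarchy: a grid of positions holding barcoded samples."""
    rows: int = 9
    cols: int = 9
    cells: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def place(self, row: int, col: int, barcode: str) -> None:
        if not (0 <= row < self.rows and 0 <= col < self.cols):
            raise ValueError("position outside the box")
        if (row, col) in self.cells:
            raise ValueError("position already occupied")
        self.cells[(row, col)] = barcode

    def grid(self) -> List[str]:
        # "X" marks an occupied position, "." an empty one.
        return [
            " ".join("X" if (r, c) in self.cells else "." for c in range(self.cols))
            for r in range(self.rows)
        ]


def locate(boxes: List[StorageBox], barcode: str) -> Optional[str]:
    """Return a human-readable location for a scanned barcode, if known."""
    for box in boxes:
        for (row, col), code in box.cells.items():
            if code == barcode:
                return f"{box.path()} [row {row + 1}, column {col + 1}]"
    return None


building = StorageNode("Building 1")
room = StorageNode("Room 2.04", building)
freezer = StorageNode("Freezer A", room)
box = StorageBox("Box 7", freezer, rows=2, cols=3)
box.place(0, 1, "CL-0042")
location = locate([box], "CL-0042")
```

The linear scan in `locate` is enough for a sketch; a real system would more likely index barcodes in the database so that a scan resolves in a single lookup.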

Reporting

Apache POI is utilized to export lists of samples to various file formats, including Excel (XLSX), Open Document Spreadsheets (ODS), PDF, and comma separated values (CSV). This feature is currently available for lists of genes, vector constructs, and cell lines. The storage hierarchy and individual boxes can also be exported to Excel spreadsheets (R6).
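
As a rough illustration of the simplest of these export formats, the sketch below writes a list of samples to CSV. OLF itself uses Apache POI on the Java side; this example swaps in Python's standard `csv` module, and the column names and sample record are invented for the purpose.

```python
import csv
import io
from typing import Dict, Iterable, List


def export_csv(samples: Iterable[Dict[str, str]], columns: List[str]) -> str:
    """Serialize the chosen columns of each sample as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",   # silently skip fields not selected for export
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(samples)
    return buffer.getvalue()


# Hypothetical sample record; the comma in the location forces quoting.
samples = [{"id": "VC-001", "gene": "TP53", "location": "Freezer A, Box 7", "owner": "ml"}]
report = export_csv(samples, ["id", "gene", "location"])
```

Fields containing the delimiter are quoted automatically, which is the main reason to prefer a CSV library over joining strings by hand.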

Flexibility

Grails applications are not bound to a specific database management system and will even work with non-SQL solutions, such as MongoDB (R7). OLF is compiled either as a WAR file, which is suitable for deployment on a large number of Java-based web containers, or as a locally executable JAR file, which comes packed with its own web container and file-based SQL solution (R8). It should be noted that OLF has only been tested thoroughly on Tomcat versions 6 and 7.

Cloud deployment

Grails also offers a plug-in for cloud deployment using the VMware Cloud Foundry service (http://www.cloudfoundry.com/) (R9). Apart from Cloud Foundry credentials and memory settings, no further configuration is needed. Upon deployment, Cloud Foundry automatically configures a suitable database to work with the application.

Mobile support

OLF utilizes the Spring Mobile Grails plug-in to distinguish mobile clients from desktop clients. If a mobile device is detected, a different view is shown that is tailored for the small-sized screen and touch-screen interaction (R13).
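
The general idea of switching views by client type can be sketched as follows. This is a deliberately simplified stand-in and does not reproduce the Spring Mobile detection logic: the keyword list, function names, and view paths are assumptions, and the only input considered is the User-Agent header.

```python
# Hypothetical substrings that commonly appear in mobile User-Agent headers.
MOBILE_HINTS = ("android", "iphone", "ipad", "ipod", "windows phone", "mobile")


def is_mobile(user_agent: str) -> bool:
    """Crude check: does the User-Agent contain any known mobile hint?"""
    ua = user_agent.lower()
    return any(hint in ua for hint in MOBILE_HINTS)


def select_view(user_agent: str, view: str) -> str:
    """Pick the template variant for the detected client type."""
    prefix = "mobile" if is_mobile(user_agent) else "desktop"
    return f"{prefix}/{view}"


phone = select_view("Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X)", "sample/show")
desktop = select_view("Mozilla/5.0 (Windows NT 6.1; WOW64)", "sample/show")
```

Keyword matching of this kind is easy to fool and needs upkeep as new devices appear, which is one reason to delegate detection to a maintained plug-in rather than hand-rolling it.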

User approachability and excess value

OLF offers a modern web-interface that is clearly organized and intuitive (Figure 1), and allows for responsive user interaction. The Compass-powered search engine allows users to locate required information quickly and conveniently. Users can also develop effective laboratory work-flows using the sample tracking feature together with barcode labels and mobile devices (Figure 2). OLF validates all user-entered data and will, where applicable, provide a list of viable options in the form of select boxes. In this way, OLF effectively avoids ambiguity and ensures consistency of sample data. Finally, OLF comes with online documentation that introduces the system to users, administrators, and software developers (R10).
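
The effect of validating input against fixed option lists can be shown with a short sketch. The field names and allowed values below are invented for illustration and are not OLF's actual constraints; the point is only that restricting a field to a closed list rules out spelling variants of the same value.

```python
from typing import Dict, List, Tuple

# Hypothetical fields and options; a real installation would define its own.
OPTIONS: Dict[str, Tuple[str, ...]] = {
    "organism": ("human", "mouse"),
    "vector_type": ("expression", "entry"),
}
REQUIRED = ("name", "organism")


def validate(record: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the record is valid."""
    errors = []
    for key in REQUIRED:
        if not record.get(key, "").strip():
            errors.append(f"{key}: value is required")
    for key, allowed in OPTIONS.items():
        value = record.get(key)
        if value and value not in allowed:
            errors.append(f"{key}: '{value}' is not one of {', '.join(allowed)}")
    return errors


ok = validate({"name": "TP53 clone", "organism": "human"})
bad = validate({"name": "", "organism": "Human"})
```

The same `OPTIONS` table could feed the select boxes in the user interface, so that the form and the server-side check cannot drift apart.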

References

  1. Sanchez-Villeda, H.; Schroeder, S.; Polacco, M. et al. (2003). "Development of an integrated laboratory information management system for the maize mapping project". Bioinformatics 19 (16): 2022–2030. doi:10.1093/bioinformatics/btg274. PMID 14594706.
  2. Jayashree, B.; Reddy, P.T.; Leeladevi, Y. et al. (2006). "Laboratory information management software for genotyping workflows: Applications in high throughput crop genotyping". BMC Bioinformatics 7: 383. doi:10.1186/1471-2105-7-383. PMC 1559653. PMID 16914063. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1559653.
  3. Voegele, C.; Alteyrac, L.; Caboux, E. et al. (2010). "A sample storage management system for biobanks". Bioinformatics 26 (21): 2798–2800. doi:10.1093/bioinformatics/btq502. PMID 20807837.
  4. Morris, C.; Pajon, A.; Griffiths, S.L. et al. (2011). "The Protein Information Management System (PiMS): A generic tool for any structural biology research laboratory". Acta Crystallographica Section D 67 (4): 249–260. doi:10.1107/S0907444911007943. PMC 3069740. PMID 21460443. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3069740.
  5. Ponko, S.C.; Bienvenue, D. (2012). "ProteinTracker: An application for managing protein production and purification". BMC Research Notes 5: 224. doi:10.1186/1756-0500-5-224. PMC 3436699. PMID 22574679. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3436699.
  6. Droit, A.; Hunter, J.M.; Rouleau, M. et al. (2007). "PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system". BMC Bioinformatics 8: 483. doi:10.1186/1471-2105-8-483. PMC 2266781. PMID 18093328. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2266781.
  7. Morisawa, H.; Hirota, M.; Toda, T. (2006). "Development of an open source laboratory information management system for 2-D gel electrophoresis-based proteomics workflow". BMC Bioinformatics 7: 430. doi:10.1186/1471-2105-7-430. PMC 1599757. PMID 17018156. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1599757.
  8. Haebel, P.W.; Arcus, V.L.; Baker, E.N.; Metcalf, P. (2001). "LISA: An intranet-based flexible database for protein crystallography project management". Acta Crystallographica Section D 57 (Pt 9): 1341–1343. doi:10.1107/S0907444901009295. PMID 11526339.
  9. Triplet, T.; Butler, G. (2012). "The EnzymeTracker: An open-source laboratory information management system for sample tracking". BMC Bioinformatics 13: 15. doi:10.1186/1471-2105-13-15. PMC 3353834. PMID 22280360. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3353834.
  10. Olhovsky, M.; Williton, K.; Dai, A.Y. et al. (2011). "OpenFreezer: A reagent information management software system". Nature Methods 8 (8): 612–613. doi:10.1038/nmeth.1658. PMID 21799493.
  11. Stocker, G.; Fischer, M.; Rieder, D. et al. (2009). "iLAP: a workflow-driven software for experimental protocol development, data acquisition and analysis". BMC Bioinformatics 10: 390. doi:10.1186/1471-2105-10-390. PMC 2789074. PMID 19941647. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2789074.
  12. Melo, A.; Faria-Campos, A.; DeLaat, D.M.; Keller, R.; Abreu, V.; Campos, S. (2010). "SIGLa: an adaptable LIMS for multiple laboratories". BMC Genomics 11 (Suppl 5): S8. doi:10.1186/1471-2164-11-S5-S8. PMC 3045801. PMID 21210974. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3045801.
  13. Burgarella, S.; Cattaneo, D.; Pinciroli, F.; Masseroli, M. (2005). "MicroGen: a MIAME compliant web system for microarray experiment information and workflow management". BMC Bioinformatics 6 (Suppl 4): S6. doi:10.1186/1471-2105-6-S4-S6. PMC 1866379. PMID 16351755. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1866379.
  14. Krestyaninova, M.; Zarins, A.; Viksna, J. et al. (2009). "A system for information management in biomedical studies – SIMBioMS". Bioinformatics 25 (20): 2768–2769. doi:10.1093/bioinformatics/btp420. PMC 2759553. PMID 19633095. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2759553.
  15. Truong, C.V.C.; Groeneveld, L.F.; Morgenstern, B.; Groeneveld, E. (2011). "MolabIS - An integrated information system for storing and managing molecular genetics data". BMC Bioinformatics 12: 425. doi:10.1186/1471-2105-12-425. PMC 3268772. PMID 22040322. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3268772.
  16. Mollenhauer, J.; Stamou, D.; Flyvbjerg, A. et al. (2010). "David versus Goliath". Nanomedicine 6 (4): 504–509. doi:10.1016/j.nano.2010.04.002. PMID 20417315.

Notes

This presentation is faithful to the original, with only a few minor changes to presentation. In some cases important information was missing from the references, and that information was added. In Table 2, checkmarks and Xs were replaced with Ys and Ns.