Journal:STATegra EMS: An experiment management system for complex next-generation omics experiments

From LIMSWiki
Revision as of 20:40, 14 March 2016 by Shawndouglas
Full article title: STATegra EMS: An experiment management system for complex next-generation omics experiments
Journal: BMC Systems Biology
Author(s): Hernández-de-Diego, R.; Boix-Chova, N.; Gómez-Cabrero, D.; Tegner, J.; Abugessaisa, I.; Conesa, A.
Author affiliation(s): Centro de Investigación Príncipe Felipe and the Karolinska Institute
Primary contact: None given
Year published: 2014
Volume and issue: 8(Suppl 2)
Page(s): S9
DOI: 10.1186/1752-0509-8-S2-S9
ISSN: 1752-0509
Distribution license: Creative Commons Attribution 2.0 Generic
Website: http://bmcsystbiol.biomedcentral.com/articles/10.1186/1752-0509-8-S2-S9
Download: http://bmcsystbiol.biomedcentral.com/track/pdf/10.1186/1752-0509-8-S2-S9 (PDF)

Abstract

High-throughput sequencing assays are now routinely used to study different aspects of genome organization. As decreasing costs and widespread availability of sequencing enable more laboratories to use sequencing assays in their research projects, the number of samples and replicates in these experiments can quickly grow to several dozen and thus require standardized annotation, storage and management of preprocessing steps. As a part of the STATegra project, we have developed an Experiment Management System (EMS) for high-throughput omics data that supports different types of sequencing-based assays such as RNA-seq, ChIP-seq, Methyl-seq, etc., as well as proteomics and metabolomics data. The STATegra EMS provides metadata annotation of experimental design, samples and processing pipelines, as well as storage of different types of data files, from raw data to ready-to-use measurements. The system has been developed to provide research laboratories with a freely available, integrated system that offers a simple and effective way to annotate experiments and track analysis procedures.

Background

The widespread availability of high-throughput sequencing techniques has substantially impacted genome research and reshaped the way we study genome function and structure. The rapidly decreasing costs of sequencing make these technologies affordable to small and medium-sized laboratories. Furthermore, the constant development of novel sequencing-based assays, named with the suffix -seq, expands the scope of cellular properties analyzable by high-throughput sequencing, with sequencing reads forming an underlying common data format. Today, virtually all nucleic acid omics methods traditionally based on microarrays have a -seq counterpart, and many more have been made available recently. As a consequence, it has become practical to run multiple sequencing-based experiments that measure different aspects of gene regulation and to combine these with non-sequencing omics technologies such as proteomics and metabolomics.[1][2][3][4][5] For example, the ENCODE project combined ten major types of sequencing-based assays to unravel the complexity of genome architecture.[6] Many records in the Sequence Read Archive (SRA) integrate multiple -seq technologies measured on the same samples, and a PubMed search for NGS plus proteomics or metabolomics returns over a hundred entries. Last but not least, one advantage of sequencing-based experiments is that they apply equally to well-annotated model organisms and to less-studied non-model organisms, since little or no a priori genome knowledge is required.

References

  1. Song, C.X.; Szulwach, K.E.; Dai, Q. et al. (2013). "Genome-wide profiling of 5-formylcytosine reveals its roles in epigenetic priming". Cell 153 (3): 678–691. doi:10.1016/j.cell.2013.04.001. PMC3657391. PMID 23602153. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3657391.
  2. Wei, G.; Abraham, B.J.; Yagi, R. et al. (2011). "Genome-wide analyses of transcription factor GATA3-mediated gene regulation in distinct T cell types". Immunity 35 (2): 299–311. doi:10.1016/j.immuni.2011.08.007. PMC3169184. PMID 21867929. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3169184.
  3. Schmid, N.; Pessi, G.; Deng, Y. et al. (2012). "The AHL- and BDSF-dependent quorum sensing systems control specific and overlapping sets of genes in Burkholderia cenocepacia H111". PLoS One 7 (11): e49966. doi:10.1371/journal.pone.0049966. PMC3502180. PMID 23185499. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3502180.
  4. Bordbar, A.; Mo, M.L.; Nakayasu, E.S. et al. (2012). "Model-driven multi-omic data analysis elucidates metabolic immunomodulators of macrophage activation". Molecular Systems Biology 8: 558. doi:10.1038/msb.2012.21. PMC3397418. PMID 22735334. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3397418.
  5. Baltz, A.G.; Munschauer, M.; Schwanhäusser, B. et al. (2012). "The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts". Molecular Cell 46 (5): 674–690. doi:10.1016/j.molcel.2012.05.021. PMID 22681889.
  6. ENCODE Project Consortium; Bernstein, B.E.; Birney, E. et al. (2012). "An integrated encyclopedia of DNA elements in the human genome". Nature 489 (7414): 57–74. doi:10.1038/nature11247. PMC3439153. PMID 22955616. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439153.

Notes

This version is faithful to the original, with only a few minor changes to presentation. Where important information was missing from the references, it was added.