Difference between revisions of "Template:Article of the week"

Revision as of 18:19, 7 November 2016

"Principles of metadata organization at the ENCODE data coordination center"

The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal. (Full article...)

Recently featured:

▪ Integrated systems for NGS data management and analysis: Open issues and available solutions

▪ Practical approaches for mining frequent patterns in molecular datasets

▪ Improving the creation and reporting of structured findings during digital pathology review

@@ Line 1: / Line 1: @@
-<div style="float: left; margin: 0.5em 0.9em 0.4em 0em;">[[File:Fig1 Bianchi FrontinGenetics2016 7.jpg|240px]]</div>
+<div style="float: left; margin: 0.5em 0.9em 0.4em 0em;">[[File:Fig1 Hong Database2016 2016.jpg|240px]]</div>
-'''"[[Journal:Integrated systems for NGS data management and analysis: Open issues and available solutions|Integrated systems for NGS data management and analysis: Open issues and available solutions]]"'''
+'''"[[Journal:Principles of metadata organization at the ENCODE data coordination center|Principles of metadata organization at the ENCODE data coordination center]]"'''
-Next-generation sequencing (NGS) technologies have deeply changed our understanding of cellular processes by delivering an astonishing amount of data at affordable prices; nowadays, many biology [[laboratory|laboratories]] have already accumulated a large number of sequenced samples. However, managing and analyzing these data poses new challenges, which may easily be underestimated by research groups devoid of IT and quantitative skills. In this perspective, we identify five issues that should be carefully addressed by research groups approaching NGS technologies. In particular, the five key issues to be considered concern: (1) adopting a [[laboratory information management system]] (LIMS) and safeguard the resulting raw data structure in downstream analyses; (2) monitoring the flow of the data and standardizing input and output directories and file names, even when multiple analysis protocols are used on the same data; (3) ensuring complete traceability of the analysis performed; (4) enabling non-experienced users to run analyses through a graphical user interface (GUI) acting as a front-end for the pipelines; (5) relying on standard metadata to annotate the datasets, and when possible using controlled vocabularies, ideally derived from biomedical ontologies. ('''[[Journal:Integrated systems for NGS data management and analysis: Open issues and available solutions|Full article...]]''')<br />
+The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal. ('''[[Journal:Principles of metadata organization at the ENCODE data coordination center|Full article...]]''')<br />
 <br />
 ''Recently featured'':
+: ▪ [[Journal:Integrated systems for NGS data management and analysis: Open issues and available solutions|Integrated systems for NGS data management and analysis: Open issues and available solutions]]
 : ▪ [[Journal:Practical approaches for mining frequent patterns in molecular datasets|Practical approaches for mining frequent patterns in molecular datasets]]
 : ▪ [[Journal:Improving the creation and reporting of structured findings during digital pathology review|Improving the creation and reporting of structured findings during digital pathology review]]
-: ▪ [[Journal:The challenges of data quality and data quality assessment in the big data era|The challenges of data quality and data quality assessment in the big data era]]
