Difference between revisions of "Template:Article of the week"

Revision as of 16:23, 31 October 2016

"Integrated systems for NGS data management and analysis: Open issues and available solutions"

Next-generation sequencing (NGS) technologies have deeply changed our understanding of cellular processes by delivering an astonishing amount of data at affordable prices; nowadays, many biology laboratories have already accumulated a large number of sequenced samples. However, managing and analyzing these data poses new challenges, which may easily be underestimated by research groups devoid of IT and quantitative skills. In this perspective, we identify five issues that should be carefully addressed by research groups approaching NGS technologies. In particular, the five key issues to be considered concern: (1) adopting a laboratory information management system (LIMS) and safeguard the resulting raw data structure in downstream analyses; (2) monitoring the flow of the data and standardizing input and output directories and file names, even when multiple analysis protocols are used on the same data; (3) ensuring complete traceability of the analysis performed; (4) enabling non-experienced users to run analyses through a graphical user interface (GUI) acting as a front-end for the pipelines; (5) relying on standard metadata to annotate the datasets, and when possible using controlled vocabularies, ideally derived from biomedical ontologies. (Full article...)

Recently featured:

▪ Practical approaches for mining frequent patterns in molecular datasets

▪ Improving the creation and reporting of structured findings during digital pathology review

▪ The challenges of data quality and data quality assessment in the big data era

@@ Line 1: / Line 1: @@
-<div style="float: left; margin: 0.5em 0.9em 0.4em 0em;">[[File:Fig1 Naulaerts BioAndBioInsights2016 10.png|240px]]</div>
+<div style="float: left; margin: 0.5em 0.9em 0.4em 0em;">[[File:Fig1 Bianchi FrontinGenetics2016 7.jpg|240px]]</div>
-'''"[[Journal:Practical approaches for mining frequent patterns in molecular datasets|Practical approaches for mining frequent patterns in molecular datasets]]"'''
+'''"[[Journal:Integrated systems for NGS data management and analysis: Open issues and available solutions|Integrated systems for NGS data management and analysis: Open issues and available solutions]]"'''
-Pattern detection is an inherent task in the analysis and interpretation of complex and continuously accumulating biological data. Numerous [[wikipedia:Sequential pattern mining|itemset mining]] algorithms have been developed in the last decade to efficiently detect specific pattern classes in data. Although many of these have proven their value for addressing bioinformatics problems, several factors still slow down promising algorithms from gaining popularity in the life science community. Many of these issues stem from the low user-friendliness of these tools and the complexity of their output, which is often large, static, and consequently hard to interpret. Here, we apply three software implementations on common [[bioinformatics]] problems and illustrate some of the advantages and disadvantages of each, as well as inherent pitfalls of biological data mining. Frequent itemset mining exists in many different flavors, and users should decide their software choice based on their research question, programming proficiency, and added value of extra features. ('''[[Journal:Practical approaches for mining frequent patterns in molecular datasets|Full article...]]''')<br />
+Next-generation sequencing (NGS) technologies have deeply changed our understanding of cellular processes by delivering an astonishing amount of data at affordable prices; nowadays, many biology [[laboratory|laboratories]] have already accumulated a large number of sequenced samples. However, managing and analyzing these data poses new challenges, which may easily be underestimated by research groups devoid of IT and quantitative skills. In this perspective, we identify five issues that should be carefully addressed by research groups approaching NGS technologies. In particular, the five key issues to be considered concern: (1) adopting a [[laboratory information management system]] (LIMS) and safeguard the resulting raw data structure in downstream analyses; (2) monitoring the flow of the data and standardizing input and output directories and file names, even when multiple analysis protocols are used on the same data; (3) ensuring complete traceability of the analysis performed; (4) enabling non-experienced users to run analyses through a graphical user interface (GUI) acting as a front-end for the pipelines; (5) relying on standard metadata to annotate the datasets, and when possible using controlled vocabularies, ideally derived from biomedical ontologies. ('''[[Journal:Integrated systems for NGS data management and analysis: Open issues and available solutions|Full article...]]''')<br />
 <br />
 ''Recently featured'':
+: ▪ [[Journal:Practical approaches for mining frequent patterns in molecular datasets|Practical approaches for mining frequent patterns in molecular datasets]]
 : ▪ [[Journal:Improving the creation and reporting of structured findings during digital pathology review|Improving the creation and reporting of structured findings during digital pathology review]]
 : ▪ [[Journal:The challenges of data quality and data quality assessment in the big data era|The challenges of data quality and data quality assessment in the big data era]]
-: ▪ [[Journal:Water, water, everywhere: Defining and assessing data sharing in academia|Water, water, everywhere: Defining and assessing data sharing in academia]]
