Difference between revisions of "Template:Article of the week"

From LIMSWiki
(Updated article of the week text)
Line 1:
<div style="float: left; margin: 0.5em 0.9em 0.4em 0em;">[[File:Fig2 Palmieri Molecules2019 24-19.png|240px]]</div>
+
<div style="float: left; margin: 0.5em 0.9em 0.4em 0em;">[[File:Fig1 Khare eGEMs2019 7-1.png|240px]]</div>
'''"[[Journal:Identification of Cannabis sativa L. (hemp) retailers by means of multivariate analysis of cannabinoids|Identification of Cannabis sativa L. (hemp) retailers by means of multivariate analysis of cannabinoids]]"'''
+
'''"[[Journal:Design and refinement of a data quality assessment workflow for a large pediatric research network|Design and refinement of a data quality assessment workflow for a large pediatric research network]]"'''
  
In this work, the concentration of nine [[wikipedia:Cannabinoid|cannabinoid]]s—six neutral cannabinoids (THC, CBD, CBC, CBG, CBN, and CBDV) and three acidic cannabinoids (THCA, CBGA, and CBDA)—was used to identify the Italian retailers of ''[[wikipedia:Cannabis sativa|Cannabis sativa]]'' L. ([[wikipedia:Hemp|hemp]]), reinforcing the idea that the practice of categorizing hemp samples only using THC and CBD is inadequate. A [[high-performance liquid chromatography]]–[[tandem mass spectrometry]] (HPLC-MS/MS) method was developed for screening and simultaneously analyzing the nine cannabinoids in 161 hemp samples sold by four retailers located in different Italian cities. The hemp samples dataset was analyzed by [[wikipedia:Univariate analysis|univariate]] and [[wikipedia:Multivariate analysis|multivariate analysis]], with the aim to identify the associated hemp retailers without using any other [[information]] on the hemp samples such as [[wikipedia:Cannabis strains|''Cannabis'' strains]], seeds, soil and cultivation characteristics, geographical origin, product storage, etc. The univariate analysis highlighted that the hemp samples could not be differentiated by using any of the nine cannabinoids analyzed. ('''[[Journal:Identification of Cannabis sativa L. (hemp) retailers by means of multivariate analysis of cannabinoids|Full article...]]''')<br />
+
Clinical data research networks (CDRNs) aggregate [[electronic health record]] (EHR) data from multiple hospitals to enable large-scale research. A critical operation toward building a CDRN is conducting continual evaluations to optimize [[wikipedia:Data quality|data quality]]. The key challenges include determining the assessment coverage on big datasets, handling data variability over time, and facilitating communication with data teams. This study presents the evolution of a systematic workflow for data quality assessment in CDRNs.
 +
 
 +
Using a specific CDRN as a use case, a workflow was iteratively developed and packaged into a toolkit. The resultant toolkit comprises 685 data quality checks to identify any data quality issues, procedures to reconcile with a history of known issues, and a contemporary GitHub-based reporting mechanism for organized tracking.
 +
 
 +
During the first two years of network development, the toolkit assisted in discovering over 800 data characteristics and resolving over 1400 programming errors. Longitudinal analysis indicated that the variability in time to resolution (15-day mean, 24-day IQR) is due to the underlying cause of the issue, perceived importance of the domain, and the complexity of assessment. ('''[[Journal:Design and refinement of a data quality assessment workflow for a large pediatric research network|Full article...]]''')<br />
 
<br />
 
 
''Recently featured'':
 
 +
: ▪ [[Journal:Identification of Cannabis sativa L. (hemp) retailers by means of multivariate analysis of cannabinoids|Identification of Cannabis sativa L. (hemp) retailers by means of multivariate analysis of cannabinoids]]
 
: ▪ [[Journal:Data sharing at scale: A heuristic for affirming data cultures|Data sharing at scale: A heuristic for affirming data cultures]]
 
 
: ▪ [[Journal:Design and evaluation of a LIS-based autoverification system for coagulation assays in a core clinical laboratory|Design and evaluation of a LIS-based autoverification system for coagulation assays in a core clinical laboratory]]
 
: ▪ [[Journal:CyberMaster: An expert system to guide the development of cybersecurity curricula|CyberMaster: An expert system to guide the development of cybersecurity curricula]]
 

Revision as of 16:40, 2 December 2019

Fig1 Khare eGEMs2019 7-1.png

"Design and refinement of a data quality assessment workflow for a large pediatric research network"

Clinical data research networks (CDRNs) aggregate electronic health record (EHR) data from multiple hospitals to enable large-scale research. A critical operation toward building a CDRN is conducting continual evaluations to optimize data quality. The key challenges include determining the assessment coverage on big datasets, handling data variability over time, and facilitating communication with data teams. This study presents the evolution of a systematic workflow for data quality assessment in CDRNs.

Using a specific CDRN as a use case, a workflow was iteratively developed and packaged into a toolkit. The resultant toolkit comprises 685 data quality checks to identify any data quality issues, procedures to reconcile with a history of known issues, and a contemporary GitHub-based reporting mechanism for organized tracking.

During the first two years of network development, the toolkit assisted in discovering over 800 data characteristics and resolving over 1400 programming errors. Longitudinal analysis indicated that the variability in time to resolution (15-day mean, 24-day IQR) is due to the underlying cause of the issue, perceived importance of the domain, and the complexity of assessment. (Full article...)

Recently featured:

Identification of Cannabis sativa L. (hemp) retailers by means of multivariate analysis of cannabinoids
Data sharing at scale: A heuristic for affirming data cultures
Design and evaluation of a LIS-based autoverification system for coagulation assays in a core clinical laboratory