DOI: 10.1101/482224
Nov 29, 2018
Paper

Using expert driven machine learning to enhance dynamic metabolomics data analysis.

BioRxiv: the Preprint Server for Biology
Charlie Beirnaert, Kris Laukens

Abstract

Data analysis for metabolomics is undergoing rapid progress thanks to the proliferation of novel tools and the standardization of existing workflows. However, as datasets and experiments continue to increase in size and complexity, standardized workflows are often not sufficient. In addition, as the ground truth for metabolomics experiments is intrinsically unknown, there is no way to critically evaluate the performance of tools. Here, we investigate the problem of dynamic multi-class metabolomics experiments using a simulated dataset with a known ground truth. We evaluate the performance of tinderesting, a new and intuitive tool that gathers expert knowledge for use in machine learning, and compare it to EDGE, a statistical method for sequence data. This paper presents three novel outcomes. First, we present a way to simulate dynamic metabolomics data with a known ground truth based on ordinary differential equations. This method is made available through the MetaboLouise R package. Second, we show that the EDGE tool, originally developed for genomics data analysis, is highly performant in analyzing dynamic case vs. control metabolomics data. Last, we introduce the tinderesting method to analyze more complex dynamic met...
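To make the idea of ODE-based simulation with a known ground truth concrete, here is a minimal sketch in Python. It is not the MetaboLouise model or its API. The linear pathway, the first-order kinetics, the forward Euler integrator, and the names `simulate` and `ground_truth` are all assumptions made for illustration only. A "case" group differs from the "control" group in a single rate constant, so which metabolites truly differ is known by construction.

```python
import random


def simulate(rates, influx, t_end=10.0, dt=0.01, sample_every=100,
             noise_sd=0.0, seed=0):
    """Simulate a hypothetical linear pathway M0 -> M1 -> ... -> out.

    Each metabolite is consumed with first-order kinetics (rate * conc),
    and the system is integrated with forward Euler. Returns one list of
    concentrations per sampled time point.
    """
    rng = random.Random(seed)
    n = len(rates)
    conc = [0.0] * n
    steps = int(round(t_end / dt))
    samples = []
    for step in range(1, steps + 1):
        # Fluxes are computed from the current state before any update.
        flux_out = [rates[i] * conc[i] for i in range(n)]
        for i in range(n):
            flux_in = influx if i == 0 else flux_out[i - 1]
            conc[i] += dt * (flux_in - flux_out[i])
        if step % sample_every == 0:
            # Optional Gaussian measurement noise, clipped at zero.
            samples.append([max(0.0, c + rng.gauss(0.0, noise_sd))
                            for c in conc])
    return samples


def ground_truth(control, case, tol=1e-6):
    """Flag metabolites whose noise-free trajectories differ between groups."""
    n = len(control[0])
    return [any(abs(c[i] - k[i]) > tol for c, k in zip(control, case))
            for i in range(n)]


# Assumed perturbation: the case group consumes M1 five times more slowly.
control = simulate([1.0, 0.5, 0.8], influx=1.0)
case = simulate([1.0, 0.1, 0.8], influx=1.0)
print(ground_truth(control, case))
```

Because the labels come from the noise-free trajectories, a method applied to noisy replicates (for example `noise_sd=0.1` with different seeds) can be scored against them, which is the kind of evaluation the abstract says is impossible on real data.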

