Summary of Study ST002132

This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR001350. The data can be accessed directly via its Project DOI: 10.21228/M86X36. This work is supported by NIH grant U2C-DK119886.

See: https://www.metabolomicsworkbench.org/about/howtocite.php

This study contains a large results data set that is not included in the mwTab file. It is available for download only via FTP as data file(s).

Study ID: ST002132
Study Title: Optimization of Imputation Strategies for High-Resolution Gas Chromatography-Mass Spectrometry (HR GC-MS) Metabolomics Data
Study Summary: Gas chromatography-coupled mass spectrometry (GC-MS) has been used in biomedical research to analyze volatile, non-polar, and polar metabolites in a wide array of sample types. Despite advances in technology, missing values are still common in metabolomics datasets and must be properly handled. We evaluated the performance of ten commonly used missing value imputation methods with metabolites analyzed on an HR GC-MS instrument. By introducing missing values into the complete (i.e., data without any missing values) NIST plasma dataset we demonstrate that Random Forest (RF), Glmnet Ridge Regression (GRR), and Bayesian Principal Component Analysis (BPCA) shared the lowest Root Mean Squared Error (RMSE) in technical replicate data. Further examination of these three methods in data from baboon plasma and liver samples demonstrated they all maintained high accuracy. Overall, our analysis suggests that any of the three imputation methods can be applied effectively to untargeted metabolomics datasets with high accuracy. However, it is important to note that imputation will alter the correlation structure of the dataset, and bias downstream regression coefficients and p-values.
Institute: Wake Forest School of Medicine
Last Name: Ampong
First Name: Isaac
Address: Center for Precision Medicine, Department of Internal Medicine, Section on Molecular Medicine, Wake Forest University, Winston-Salem, North Carolina, United States
Email: iampong@wakehealth.edu
Phone: 3367162091
Submit Date: 2022-04-01
Raw Data Available: Yes
Raw Data File Type(s): mzML
Analysis Type Detail: GC-MS
Release Date: 2022-04-27
Release Version: 1
https://dx.doi.org/10.21228/M86X36
ftp://www.metabolomicsworkbench.org/Studies/ application/zip
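
The evaluation design described in the study summary (introduce missing values into a complete dataset, impute them, then score the result by RMSE against the known true values) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the study: the function names, the toy intensity matrix, the choice of randomly placed missing values, and the use of column-mean imputation as a simple stand-in for the RF, GRR, and BPCA methods that the study actually compared.

```python
import math
import random


def mask_values(matrix, fraction, seed=0):
    """Return a copy of `matrix` with roughly `fraction` of its cells set to
    None, plus the list of (row, col) positions that were masked.
    Assumption: cells are masked completely at random."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    n_mask = round(fraction * len(cells))
    positions = rng.sample(cells, n_mask)
    masked = [list(row) for row in matrix]
    for i, j in positions:
        masked[i][j] = None
    return masked, positions


def impute_column_mean(matrix):
    """Fill each None with the mean of the observed values in its column.
    This is only a placeholder imputer, not one of the study's methods."""
    filled = [list(row) for row in matrix]
    for j in range(len(matrix[0])):
        observed = [row[j] for row in matrix if row[j] is not None]
        fill = sum(observed) / len(observed) if observed else 0.0
        for row in filled:
            if row[j] is None:
                row[j] = fill
    return filled


def rmse(truth, imputed, positions):
    """Root mean squared error, computed over the masked positions only."""
    if not positions:
        raise ValueError("no masked positions to score")
    squared = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in positions]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical complete matrix: rows are samples, columns are metabolites.
    complete = [
        [1.0, 10.0, 5.0],
        [2.0, 20.0, 5.5],
        [3.0, 30.0, 4.5],
        [4.0, 40.0, 5.0],
    ]
    masked, positions = mask_values(complete, fraction=0.25, seed=42)
    imputed = impute_column_mean(masked)
    print("RMSE on masked cells:", rmse(complete, imputed, positions))
```

Any other imputer can be swapped in for `impute_column_mean` as long as it returns a fully filled matrix, so several methods can be ranked on the same masked cells.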



Project:

Project ID: PR001350
Project DOI: 10.21228/M86X36
Project Title: Optimization of Imputation Strategies for High-Resolution Gas Chromatography-Mass Spectrometry (HR GC-MS) Metabolomics Data
Project Summary: Gas chromatography-coupled mass spectrometry (GC-MS) has been used in biomedical research to analyze volatile, non-polar, and polar metabolites in a wide array of sample types. Despite advances in technology, missing values are still common in metabolomics datasets and must be properly handled. We evaluated the performance of ten commonly used missing value imputation methods with metabolites analyzed on an HR GC-MS instrument. By introducing missing values into the complete (i.e., data without any missing values) NIST plasma dataset we demonstrate that Random Forest (RF), Glmnet Ridge Regression (GRR), and Bayesian Principal Component Analysis (BPCA) shared the lowest Root Mean Squared Error (RMSE) in technical replicate data. Further examination of these three methods in data from baboon plasma and liver samples demonstrated they all maintained high accuracy. Overall, our analysis suggests that any of the three imputation methods can be applied effectively to untargeted metabolomics datasets with high accuracy. However, it is important to note that imputation will alter the correlation structure of the dataset, and bias downstream regression coefficients and p-values.
Institute: Wake Forest School of Medicine
Department: Department of Internal Medicine
Laboratory: Olivier Lab
Last Name: Ampong
First Name: Isaac
Address: Center for Precision Medicine, Department of Internal Medicine, Section on Molecular Medicine, Wake Forest University, Winston-Salem, North Carolina, United States
Email: iampong@wakehealth.edu
Phone: 3367162091