Summary of Study ST002132

This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR001350. The data can be accessed directly via its Project DOI: 10.21228/M86X36. This work is supported by NIH grant U2C-DK119886.

See: https://www.metabolomicsworkbench.org/about/howtocite.php

This study contains a large results data set, which is not included in the mwTab file. It is available for download only via FTP as data file(s).

Study ID:ST002132
Study Title:Optimization of Imputation Strategies for High-Resolution Gas Chromatography-Mass Spectrometry (HR GC-MS) Metabolomics Data
Study Summary:Gas chromatography-coupled mass spectrometry (GC-MS) has been used in biomedical research to analyze volatile, non-polar, and polar metabolites in a wide array of sample types. Despite advances in technology, missing values are still common in metabolomics datasets and must be properly handled. We evaluated the performance of ten commonly used missing value imputation methods with metabolites analyzed on an HR GC-MS instrument. By introducing missing values into the complete (i.e., data without any missing values) NIST plasma dataset we demonstrate that Random Forest (RF), Glmnet Ridge Regression (GRR), and Bayesian Principal Component Analysis (BPCA) shared the lowest Root Mean Squared Error (RMSE) in technical replicate data. Further examination of these three methods in data from baboon plasma and liver samples demonstrated they all maintained high accuracy. Overall, our analysis suggests that any of the three imputation methods can be applied effectively to untargeted metabolomics datasets with high accuracy. However, it is important to note that imputation will alter the correlation structure of the dataset, and bias downstream regression coefficients and p-values.
Institute:Wake Forest School of Medicine
Last Name:Ampong
First Name:Isaac
Address:Center for Precision Medicine, Department of Internal Medicine, Section on Molecular Medicine, Wake Forest University, Winston-Salem, North Carolina, United States
Email:iampong@wakehealth.edu
Phone:3367162091
Submit Date:2022-04-01
Raw Data Available:Yes
Raw Data File Type(s):mzML
Analysis Type Detail:GC-MS
Release Date:2022-04-27
Release Version:1
https://dx.doi.org/10.21228/M86X36
ftp://www.metabolomicsworkbench.org/Studies/ application/zip
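
The evaluation design described in the study summary (start from a complete dataset, remove values, impute, then score with RMSE) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the data are synthetic, values are removed completely at random, and simple mean, median, and half-minimum fills stand in for the RF, GRR, and BPCA methods named in the summary. All function names are hypothetical.

```python
import math
import random
import statistics


def mask_values(matrix, fraction, rng):
    """Copy `matrix`, set a random `fraction` of cells to None, and
    return the copy together with the list of masked (row, col) cells."""
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    masked = rng.sample(cells, int(round(fraction * len(cells))))
    out = [list(row) for row in matrix]
    for i, j in masked:
        out[i][j] = None
    return out, masked


def impute_columns(matrix, fill):
    """Replace None in each column with fill(observed values in that column)."""
    out = [list(row) for row in matrix]
    for j in range(len(matrix[0])):
        observed = [row[j] for row in matrix if row[j] is not None]
        # Fallback of 0.0 only applies if an entire column was masked.
        value = fill(observed) if observed else 0.0
        for row in out:
            if row[j] is None:
                row[j] = value
    return out


def rmse(truth, imputed, positions):
    """Root mean squared error over the masked cells only."""
    squared = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in positions]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic "technical replicates": 20 samples x 5 metabolites, 5% noise.
rng = random.Random(0)
base = [100.0, 250.0, 40.0, 900.0, 15.0]
complete = [[b * (1 + rng.gauss(0, 0.05)) for b in base] for _ in range(20)]

with_gaps, masked = mask_values(complete, 0.10, rng)

# Stand-in imputers (assumption): not the RF/GRR/BPCA methods of the study.
methods = {
    "mean": statistics.mean,
    "median": statistics.median,
    "half-minimum": lambda values: min(values) / 2,
}
imputed = {name: impute_columns(with_gaps, fill) for name, fill in methods.items()}
scores = {name: rmse(complete, imputed[name], masked) for name in methods}

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.3f}")
```

Scoring only the masked cells keeps the comparison fair, because the untouched cells are identical under every method and would otherwise dilute the error.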

Project:

Project ID:PR001350
Project DOI:10.21228/M86X36
Project Title:Optimization of Imputation Strategies for High-Resolution Gas Chromatography-Mass Spectrometry (HR GC-MS) Metabolomics Data
Project Summary:Gas chromatography-coupled mass spectrometry (GC-MS) has been used in biomedical research to analyze volatile, non-polar, and polar metabolites in a wide array of sample types. Despite advances in technology, missing values are still common in metabolomics datasets and must be properly handled. We evaluated the performance of ten commonly used missing value imputation methods with metabolites analyzed on an HR GC-MS instrument. By introducing missing values into the complete (i.e., data without any missing values) NIST plasma dataset we demonstrate that Random Forest (RF), Glmnet Ridge Regression (GRR), and Bayesian Principal Component Analysis (BPCA) shared the lowest Root Mean Squared Error (RMSE) in technical replicate data. Further examination of these three methods in data from baboon plasma and liver samples demonstrated they all maintained high accuracy. Overall, our analysis suggests that any of the three imputation methods can be applied effectively to untargeted metabolomics datasets with high accuracy. However, it is important to note that imputation will alter the correlation structure of the dataset, and bias downstream regression coefficients and p-values.
Institute:Wake Forest School of Medicine
Department:Department of Internal Medicine
Laboratory:Olivier Lab
Last Name:Ampong
First Name:Isaac
Address:Center for Precision Medicine, Department of Internal Medicine, Section on Molecular Medicine, Wake Forest University, Winston-Salem, North Carolina, United States
Email:iampong@wakehealth.edu
Phone:3367162091
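
The caveat in the project summary, that imputation alters the correlation structure of a dataset, can be illustrated with a small sketch. The assumptions are loud ones: the two "metabolites" are synthetic, 30% of one of them is removed completely at random, and mean imputation is used as a simple stand-in. The size and direction of the effect for other imputation methods may differ.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)
n = 200

# Two synthetic, strongly correlated variables (illustrative only).
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 0.3) for xi in x]

# Remove 30% of y at random, then fill the gaps with the observed mean.
missing = set(rng.sample(range(n), 60))
observed_mean = sum(y[i] for i in range(n) if i not in missing) / (n - len(missing))
y_imputed = [observed_mean if i in missing else y[i] for i in range(n)]

r_before = pearson(x, y)
r_after = pearson(x, y_imputed)
print(f"correlation before imputation: {r_before:.3f}")
print(f"correlation after mean imputation: {r_after:.3f}")
```

Because every imputed value is the same constant, the filled cells carry no information about x, so the correlation is pulled toward zero. Any regression fitted to the imputed data inherits that distortion.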

Subject:

Subject ID:SU002217
Subject Type:Mammal
Subject Species:Papio hamadryas
Taxonomy ID:9557

Factors:

Subject type: Mammal; Subject species: Papio hamadryas

mb_sample_id  local_sample_id  type
SA204686      23               baboon liver
SA204687      22               baboon liver
SA204688      21               baboon liver
SA204689      24               baboon liver
SA204690      20               baboon liver
SA204691      26               baboon liver
SA204692      19               baboon liver
SA204693      27               baboon liver
SA204694      28               baboon liver
SA204695      29               baboon liver
SA204696      25               baboon liver
SA204697      11               baboon liver
SA204698      1                baboon liver
SA204699      2                baboon liver
SA204700      30               baboon liver
SA204701      3                baboon liver
SA204702      12               baboon liver
SA204703      13               baboon liver
SA204704      17               baboon liver
SA204705      16               baboon liver
SA204706      15               baboon liver
SA204707      14               baboon liver
SA204708      18               baboon liver
SA204709      49               baboon liver
SA204710      47               baboon liver
SA204711      46               baboon liver
SA204712      45               baboon liver
SA204713      44               baboon liver
SA204714      48               baboon liver
SA204715      50               baboon liver
SA204716      10               baboon liver
SA204717      53               baboon liver
SA204718      52               baboon liver
SA204719      51               baboon liver
SA204720      43               baboon liver
SA204721      42               baboon liver
SA204722      35               baboon liver
SA204723      34               baboon liver
SA204724      33               baboon liver
SA204725      32               baboon liver
SA204726      36               baboon liver
SA204727      37               baboon liver
SA204728      41               baboon liver
SA204729      40               baboon liver
SA204730      39               baboon liver
SA204731      38               baboon liver
SA204732      31               baboon liver
SA204733      14705            baboon plasma
SA204734      15400            baboon plasma
SA204735      15149            baboon plasma
SA204736      15099            baboon plasma
SA204737      15027            baboon plasma
SA204738      15432            baboon plasma
SA204739      15537            baboon plasma
SA204740      15727            baboon plasma
SA204741      15706            baboon plasma
SA204742      15671            baboon plasma
SA204743      15636            baboon plasma
SA204744      14722            baboon plasma
SA204745      14719            baboon plasma
SA204746      12818            baboon plasma
SA204747      12656            baboon plasma
SA204748      11887            baboon plasma
SA204749      11641            baboon plasma
SA204750      13029            baboon plasma
SA204751      13238            baboon plasma
SA204752      14438            baboon plasma
SA204753      13737            baboon plasma
SA204754      13669            baboon plasma
SA204755      15898            baboon plasma
SA204756      16172            baboon plasma
SA204757      30325            baboon plasma
SA204758      30226            baboon plasma
SA204759      27948            baboon plasma
SA204760      26702            baboon plasma
SA204761      30623            baboon plasma
SA204762      30628            baboon plasma
SA204763      DK63             baboon plasma
SA204764      BB36             baboon plasma
SA204765      AE06             baboon plasma
SA204766      26476            baboon plasma
SA204767      26392            baboon plasma
SA204768      17000            baboon plasma
SA204769      16772            baboon plasma
SA204770      16518            baboon plasma
SA204771      16215            baboon plasma
SA204772      17803            baboon plasma
SA204773      17883            baboon plasma
SA204774      20120            baboon plasma
SA204775      18565            baboon plasma
SA204776      18463            baboon plasma
SA204777      EF44             baboon plasma
SA204536      T2               Nistplasma
SA204537      T3               Nistplasma
SA204538      T4               Nistplasma
SA204539      T1               Nistplasma
SA204540      S9               Nistplasma
SA204541      S7               Nistplasma
SA204542      S8               Nistplasma
SA204543      T5               Nistplasma
Showing results 1 to 100 of 242 (page 1 of 3).

Collection:

Collection ID:CO002210
Collection Summary:The NIST plasma metabolomics dataset consisted of 150 replicate samples purchased from commercial vendors. The 12 batched datasets were pooled, aligned, and processed using the open-source software MS-DIAL (v4.6). The second dataset was generated from metabolic profiling of 45 baboon plasma samples collected from 35 females and 10 males, all in the age range of 6-23 years. All 45 plasma samples were analyzed using an untargeted EI-GC-MS approach as described above. The third dataset consists of another EI-GC-MS analysis of metabolites extracted from 47 liver biopsy samples collected from the same healthy adult baboons as the plasma, comprising 39 females and 8 males in the age range of 6-23 years.
Sample Type:Liver

Treatment:

Treatment ID:TR002229
Treatment Summary:For the baboon study, normal life course baboons were fed a control chow diet.

Sample Preparation:

Sampleprep ID:SP002223
Sampleprep Summary:15 μL of plasma or liver samples were subjected to sequential solvent extraction, once each with 1 mL of acetonitrile:isopropanol:water (3:3:2) and 500 μL of acetonitrile:water (1:1) mixtures at 4°C [14]. An internal standard, adonitol (2 μL from a 10 mg/mL stock), was added to each aliquot prior to the extraction. The extracts were dried under vacuum at 4°C prior to chemical derivatization (silylation reactions). Blank tubes without samples were treated in the same way as sample tubes and included to account for background noise and other sources of contamination. Samples and blanks were sequentially derivatized with methoxyamine hydrochloride (MeOX) and 1% TMCS in N-methyl-N-trimethylsilyl-trifluoroacetamide (MSTFA) or 1% TMCS containing N-(t-butyldimethylsilyl)-N-methyltrifluoroacetamide (MTBSTFA) as described elsewhere [15]. Briefly, the steps involved addition of 20 μL of MeOX (20 mg/mL) in pyridine and incubation at 55°C for 60 min, followed by trimethylsilylation at 60°C for 60 min after adding 80 μL MTBSTFA.
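
The reagent quantities implied by the volumes and concentrations in the sample preparation summary can be worked out with a short sketch. Two assumptions are made here and are not stated in the record: the 3:3:2 solvent ratio is read as volume parts of the 1 mL mixture, and the helper names are hypothetical.

```python
def mass_ug(volume_ul, conc_mg_per_ml):
    """Mass in micrograms delivered by a volume of stock solution.

    1 mg/mL equals 1 ug/uL, so the product is already in micrograms.
    """
    return volume_ul * conc_mg_per_ml


def split_by_ratio(total_ul, parts):
    """Split a total volume into component volumes by ratio parts."""
    total_parts = sum(parts.values())
    return {name: total_ul * p / total_parts for name, p in parts.items()}


# Internal standard: 2 uL of a 10 mg/mL adonitol stock per aliquot.
adonitol_ug = mass_ug(2, 10)

# Methoximation: 20 uL of 20 mg/mL MeOX in pyridine.
meox_ug = mass_ug(20, 20)

# First extraction solvent: 1 mL at 3:3:2 (assumed to be volume parts).
solvent = split_by_ratio(
    1000, {"acetonitrile": 3, "isopropanol": 3, "water": 2}
)

print(f"adonitol per aliquot: {adonitol_ug} ug")
print(f"MeOX per sample: {meox_ug} ug")
print(f"solvent volumes (uL): {solvent}")
```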

Combined analysis:

Analysis ID:AN003487
Analysis type:MS
Chromatography type:GC
Chromatography system:Thermo Trace 1310
Column:Thermo Scientific Trace GOLD TG-5SIL-MS
MS Type:EI
MS instrument type:QTRAP
MS instrument name:Thermo Q Exactive Orbitrap
Ion Mode:POSITIVE
Units:Normalized Peak abundances

Chromatography:

Chromatography ID:CH002574
Instrument Name:Thermo Trace 1310
Column Name:Thermo Scientific Trace GOLD TG-5SIL-MS
Chromatography Type:GC

MS:

MS ID:MS003248
Analysis ID:AN003487
Instrument Name:Thermo Q Exactive Orbitrap
Instrument Type:QTRAP
MS Type:EI
MS Comments:Data acquisition and instrument control were carried out using Xcalibur 4.3 and TraceFinder 4.1 software; MS-DIAL
Ion Mode:POSITIVE