Summary of Study ST001491

This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR001009. The data can be accessed directly via its Project DOI: 10.21228/M88H6T. This work is supported by NIH grant U2C-DK119886.

See: https://www.metabolomicsworkbench.org/about/howtocite.php

This study contains a large results data set that is not included in the mwTab file. The results are available for download only via FTP as data file(s).


Study ID	ST001491
Study Title	Global Urine Metabolic Profiling to Predict Gestational Age in Term and Preterm Pregnancies
Study Summary	Assessment of gestational age (GA) is key to providing optimal care during pregnancy. However, its accurate determination remains challenging in low- and middle-resource countries, where access to obstetric ultrasound is limited. Hence, there is an urgent need to develop clinical approaches that allow accurate and inexpensive estimation of GA. We investigated the ability of urinary metabolites to predict GA at time of collection in a diverse multi-site cohort (n = 99) using a broad-spectrum liquid chromatography coupled with mass spectrometry (LC-MS) platform. Our approach detected a myriad of steroid hormones and their derivatives, including estrogens, progesterones, corticosteroids and androgens, that were associated with pregnancy progression. We developed a model that predicted GA with high accuracy using the levels of three metabolites (rho = 0.87, RMSE = 1.58 weeks). These predictions were robust irrespective of whether the pregnancy went to term or ended prematurely. Overall, we demonstrate the feasibility of implementing urine collection for metabolomics analysis in large-scale multi-site studies and we report a predictive model of GA with potential clinical value.
Institute	Stanford University
Last Name	Contrepois
First Name	Kevin
Address	300 Pasteur Dr
Email	kcontrep@stanford.edu
Phone	6506664538
Submit Date	2020-09-27
Raw Data Available	Yes
Raw Data File Type(s)	raw(Thermo)
Analysis Type Detail	LC-MS
Release Date	2022-05-16
Release Version	1
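
The Study Summary reports model performance as rho and RMSE. As a minimal sketch of how such figures can be computed, the Python below implements RMSE and a rank correlation. It assumes that rho denotes Spearman's rank correlation, which the summary does not state, and the gestational ages are made-up values for illustration, not study data.

```python
import math

def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the ranks (assumed meaning of 'rho')."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical gestational ages in weeks (illustration only).
true_ga = [12.0, 20.0, 28.0, 36.0]
pred_ga = [13.0, 19.0, 30.0, 35.0]
print(rmse(true_ga, pred_ga), spearman_rho(true_ga, pred_ga))
```

RMSE is expressed in the units of the target (weeks here), while a rank correlation only measures whether predictions preserve the ordering of samples, so the two numbers answer different questions.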


Combined analysis:

Analysis ID	AN002470	AN002471	AN002472	AN002473
Analysis type	MS	MS	MS	MS
Chromatography type	HILIC	HILIC	Reversed phase	Reversed phase
Chromatography system	Thermo Dionex Ultimate 3000 RS	Thermo Dionex Ultimate 3000 RS	Thermo Dionex Ultimate 3000 RS	Thermo Dionex Ultimate 3000 RS
Column	SeQuant ZIC-HILIC (100 x 2.1mm,3.5um)	SeQuant ZIC-HILIC (100 x 2.1mm,3.5um)	Hypersil GOLD (150 x 2.1mm,1.9um)	Hypersil GOLD (150 x 2.1mm,1.9um)
MS Type	ESI	ESI	ESI	ESI
MS instrument type	Orbitrap	Orbitrap	Orbitrap	Orbitrap
MS instrument name	Thermo Q Exactive HF hybrid Orbitrap	Thermo Q Exactive HF hybrid Orbitrap	Thermo Q Exactive HF hybrid Orbitrap	Thermo Q Exactive HF hybrid Orbitrap
Ion Mode	POSITIVE	NEGATIVE	POSITIVE	NEGATIVE
Units	MS Counts	MS Counts	MS Counts	MS Counts

MS:

MS ID:	MS002290
Analysis ID:	AN002470
Instrument Name:	Thermo Q Exactive HF hybrid Orbitrap
Instrument Type:	Orbitrap
MS Type:	ESI
MS Comments:	Data processing. Data from each mode were independently analyzed using Progenesis QI software (v2.3) (Nonlinear Dynamics). Metabolic features detected in blanks, and those that did not show sufficient linearity upon dilution in QC samples (r < 0.6), were discarded. Only metabolic features present in > 2/3 of the samples were kept for further analysis. Inter- and intra-batch variations were corrected by applying locally estimated scatterplot smoothing (LOESS) regression to pooled samples injected repeatedly across the batches (span = 0.75). Data were acquired in four batches for HILIC and RPLC modes. Dilution effects were corrected using probabilistic quotient normalization (PQN) (Rosen Vollmar et al., 2019). Missing values were imputed by drawing from a random distribution of low values in the corresponding sample. Data from each mode were then merged, producing a dataset containing 6,630 metabolic features. Metabolite abundances were reported as spectral counts. Metabolic feature annotation. Peak annotation was first performed by matching experimental m/z, retention time and MS/MS spectra to an in-house library of analytical-grade standards. Remaining peaks were identified by matching experimental m/z and fragmentation spectra to publicly available databases including HMDB (http://www.hmdb.ca/), MoNA (http://mona.fiehnlab.ucdavis.edu/) and MassBank (http://www.massbank.jp/) using the R package ‘MetID’ (v0.2.0) (Shen et al., 2019). Briefly, metabolic feature tables from Progenesis QI were matched to fragmentation spectra with an m/z window of ± 15 ppm and a retention time window of ± 30 s (HILIC) or ± 20 s (RPLC). When multiple MS/MS spectra matched a single metabolic feature, all matched MS/MS spectra were used for the identification. Next, MS1 and MS2 pairs were searched against public databases and a similarity score was calculated using the forward dot-product algorithm, which considers both fragments and intensities (Stein and Scott, 1994). Metabolites were reported if the similarity score was above 0.4. Spectra from metabolic features of interest important in random forest models (see below) were further investigated manually to confirm identification.
Ion Mode:	POSITIVE
Capillary Temperature:	375C
Capillary Voltage:	3.4kV
Collision Energy:	25 & 35 NCE
Collision Gas:	N2
Dry Gas Temp:	310C
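
The MS Comments mention probabilistic quotient normalization (PQN) for correcting urine dilution. The following is a minimal sketch of the general technique, not the study's actual code. It assumes the reference spectrum is the per-feature median across all samples (pooled QC injections are another common choice), and the function name and toy intensities are hypothetical.

```python
from statistics import median

def pqn_normalize(samples):
    """Scale each sample by its median quotient against a reference spectrum.

    samples: list of per-sample intensity lists sharing one feature order.
    """
    n_features = len(samples[0])
    # Assumed reference: per-feature median across all samples.
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalized = []
    for s in samples:
        # Quotients are taken only where both values are positive.
        quotients = [s[j] / reference[j]
                     for j in range(n_features)
                     if reference[j] > 0 and s[j] > 0]
        factor = median(quotients)  # most probable dilution factor
        normalized.append([v / factor for v in s])
    return normalized

# Toy data: the second sample is the first one diluted two-fold.
urine = [[100.0, 200.0, 300.0],
         [50.0, 100.0, 150.0],
         [100.0, 200.0, 300.0]]
corrected = pqn_normalize(urine)
print(corrected[1])
```

Using the median quotient rather than the total signal makes the dilution estimate less sensitive to a few features that change strongly for biological reasons.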

MS ID:	MS002291
Analysis ID:	AN002471
Instrument Name:	Thermo Q Exactive HF hybrid Orbitrap
Instrument Type:	Orbitrap
MS Type:	ESI
MS Comments:	Same as for MS002290 (Analysis ID AN002470).
Ion Mode:	NEGATIVE
Capillary Temperature:	375C
Capillary Voltage:	3.4kV
Collision Energy:	25 & 35 NCE
Collision Gas:	N2
Dry Gas Temp:	310C
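
The annotation step in the MS Comments pairs each metabolic feature with MS/MS spectra inside an m/z window of ± 15 ppm and a retention time window of ± 30 s (HILIC) or ± 20 s (RPLC). A minimal sketch of that windowing logic follows. The dictionary keys, function names and example values are hypothetical and do not reflect the MetID package's API.

```python
MZ_TOLERANCE_PPM = 15.0
RT_TOLERANCE_S = {"HILIC": 30.0, "RPLC": 20.0}

def ppm_error(observed_mz, reference_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6

def matching_spectra(feature, spectra, mode):
    """Return every MS/MS spectrum whose precursor lies within both windows."""
    rt_tol = RT_TOLERANCE_S[mode]
    return [
        s for s in spectra
        if abs(ppm_error(s["precursor_mz"], feature["mz"])) <= MZ_TOLERANCE_PPM
        and abs(s["rt_s"] - feature["rt_s"]) <= rt_tol
    ]

# Hypothetical feature and spectra (illustration only).
feature = {"mz": 300.0000, "rt_s": 120.0}
spectra = [
    {"id": "A", "precursor_mz": 300.0030, "rt_s": 145.0},  # 10 ppm, 25 s
    {"id": "B", "precursor_mz": 300.0060, "rt_s": 121.0},  # 20 ppm, 1 s
    {"id": "C", "precursor_mz": 299.9990, "rt_s": 110.0},  # about -3.3 ppm, 10 s
]
hilic_ids = [s["id"] for s in matching_spectra(feature, spectra, "HILIC")]
rplc_ids = [s["id"] for s in matching_spectra(feature, spectra, "RPLC")]
print(hilic_ids, rplc_ids)
```

Returning all spectra inside the windows, rather than only the closest one, mirrors the statement that every matched MS/MS spectrum was used for identification.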

MS ID:	MS002292
Analysis ID:	AN002472
Instrument Name:	Thermo Q Exactive HF hybrid Orbitrap
Instrument Type:	Orbitrap
MS Type:	ESI
MS Comments:	Same as for MS002290 (Analysis ID AN002470).
Ion Mode:	POSITIVE
Capillary Temperature:	375C
Capillary Voltage:	3.4kV
Collision Energy:	25 & 50 NCE
Collision Gas:	N2
Dry Gas Temp:	310C

MS ID:	MS002293
Analysis ID:	AN002473
Instrument Name:	Thermo Q Exactive HF hybrid Orbitrap
Instrument Type:	Orbitrap
MS Type:	ESI
MS Comments:	Same as for MS002290 (Analysis ID AN002470).
Ion Mode:	NEGATIVE
Capillary Temperature:	375C
Capillary Voltage:	3.4kV
Collision Energy:	25 & 50 NCE
Collision Gas:	N2
Dry Gas Temp:	310C
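
The MS Comments describe scoring spectral matches with a dot-product algorithm and reporting metabolites whose similarity score exceeds 0.4. The sketch below shows a plain cosine (normalized dot-product) score between two fragment lists. It is an illustration only: it does not reproduce the forward variant or any intensity or m/z weighting used by MetID or by Stein and Scott (1994), and the 0.01 Da fragment tolerance and example peaks are assumptions.

```python
import math

REPORT_THRESHOLD = 0.4  # cutoff stated in the MS Comments

def cosine_score(query, library, tol=0.01):
    """Cosine similarity between two spectra given as (mz, intensity) lists.

    Each query fragment is paired with the closest unused library fragment
    within `tol` Da (assumed tolerance); unpaired fragments contribute zero.
    """
    used = set()
    dot = 0.0
    for q_mz, q_int in query:
        best, best_delta = None, tol
        for idx, (l_mz, _) in enumerate(library):
            delta = abs(l_mz - q_mz)
            if idx not in used and delta <= best_delta:
                best, best_delta = idx, delta
        if best is not None:
            used.add(best)
            dot += q_int * library[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_l = math.sqrt(sum(i * i for _, i in library))
    if norm_q == 0 or norm_l == 0:
        return 0.0
    return dot / (norm_q * norm_l)

# Hypothetical spectra sharing one of two fragments.
query = [(100.0, 3.0), (200.0, 4.0)]
library = [(100.0, 3.0), (300.0, 4.0)]
score = cosine_score(query, library)
print(score, score > REPORT_THRESHOLD)
```

In this toy case only the fragment at m/z 100 is shared, so the score falls below the 0.4 cutoff and the match would not be reported.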