Summary of Study ST001491
This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR001009. The data can be accessed directly via its Project DOI: 10.21228/M88H6T. This work is supported by NIH grant U2C-DK119886.
See: https://www.metabolomicsworkbench.org/about/howtocite.php
This study contains a large results data set that is not available in the mwTab file. It is only available for download via FTP as data file(s).
Study ID | ST001491 |
Study Title | Global Urine Metabolic Profiling to Predict Gestational Age in Term and Preterm Pregnancies |
Study Summary | Assessment of gestational age (GA) is key to providing optimal care during pregnancy. However, its accurate determination remains challenging in low- and middle-resource countries, where access to obstetric ultrasound is limited. Hence, there is an urgent need to develop clinical approaches that allow accurate and inexpensive estimation of GA. We investigated the ability of urinary metabolites to predict GA at time of collection in a diverse multi-site cohort (n = 99) using a broad-spectrum liquid chromatography coupled with mass spectrometry (LC-MS) platform. Our approach detected a myriad of steroid hormones and their derivatives, including estrogens, progesterones, corticosteroids and androgens, that were associated with pregnancy progression. We developed a model that predicted GA with high accuracy using the levels of three metabolites (rho = 0.87, RMSE = 1.58 weeks). These predictions were robust irrespective of whether the pregnancy went to term or ended prematurely. Overall, we demonstrate the feasibility of implementing urine collection for metabolomics analysis in large-scale multi-site studies and we report a predictive model of GA with potential clinical value. |
Institute | Stanford University |
Last Name | Contrepois |
First Name | Kevin |
Address | 300 Pasteur Dr |
Email | kcontrep@stanford.edu |
Phone | 6506664538 |
Submit Date | 2020-09-27 |
Raw Data Available | Yes |
Raw Data File Type(s) | raw(Thermo) |
Analysis Type Detail | LC-MS |
Release Date | 2022-05-16 |
Release Version | 1 |
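
The Study Summary above reports model performance as rho = 0.87 and RMSE = 1.58 weeks. The following is a minimal sketch of how such metrics can be computed from observed and predicted GA. It assumes that rho denotes a Spearman rank correlation, which this record does not state, and the example values are invented for illustration and are not study data.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical GA values in weeks (illustration only, not study data).
observed_ga = [8, 11, 13, 15, 17]
predicted_ga = [9.0, 10.5, 14.0, 14.5, 18.0]

print(round(spearman_rho(observed_ga, predicted_ga), 3))
print(round(rmse(observed_ga, predicted_ga), 3))
```

RMSE is expressed in the same unit as GA (weeks), which is why the summary can quote it directly as a prediction error.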
Project:
Project ID: | PR001009 |
Project DOI: | doi: 10.21228/M88H6T |
Project Title: | Untargeted urine metabolomics to predict gestational age in term and preterm pregnancies |
Project Summary: | Multi-site collection of urine early in pregnancy (8-19 weeks) and untargeted LC-MS metabolomics to predict gestational age in term and preterm pregnancies |
Institute: | Stanford University |
Department: | Genetics |
Last Name: | Contrepois |
First Name: | Kevin |
Address: | 300 Pasteur Dr, ALWAY bldg M302, STANFORD, California, 94305, USA |
Email: | kcontrep@stanford.edu |
Phone: | 6507239914 |
Subject:
Subject ID: | SU001565 |
Subject Type: | Human |
Subject Species: | Homo sapiens |
Taxonomy ID: | 9606 |
Gender: | Female |
Factors:
Subject type: Human; Subject species: Homo sapiens
mb_sample_id | local_sample_id | Site | GA_delivery | GA_sampling |
---|---|---|---|---|
SA125708 | 59 | Bangladesh | 24 | 17 |
SA125709 | 162 | Bangladesh | 24 | 17 |
SA125710 | 89 | Bangladesh | 29 | 13 |
SA125711 | 126 | Bangladesh | 29 | 13 |
SA125712 | 146 | Bangladesh | 29 | 15 |
SA125713 | 81 | Bangladesh | 29 | 15 |
SA125714 | 45 | Bangladesh | 29 | 15 |
SA125715 | 20 | Bangladesh | 29 | 15 |
SA125716 | 22 | Bangladesh | 30 | 13 |
SA125717 | 47 | Bangladesh | 30 | 13 |
SA125718 | 14 | Bangladesh | 31 | 11 |
SA125719 | 76 | Bangladesh | 31 | 11 |
SA125720 | 61 | Bangladesh | 31 | 15 |
SA125721 | 163 | Bangladesh | 31 | 15 |
SA125722 | 106 | Bangladesh | 31 | 16 |
SA125723 | 122 | Bangladesh | 31 | 16 |
SA125724 | 70 | Bangladesh | 31 | 8 |
SA125725 | 172 | Bangladesh | 31 | 8 |
SA125726 | 165 | Bangladesh | 33 | 17 |
SA125727 | 12 | Bangladesh | 33 | 17 |
SA125728 | 153 | Bangladesh | 41 | 11 |
SA125729 | 92 | Bangladesh | 41 | 11 |
SA125730 | 114 | Bangladesh | 41 | 13 |
SA125731 | 50 | Bangladesh | 41 | 13 |
SA125732 | 78 | Bangladesh | 41 | 13 |
SA125733 | 105 | Bangladesh | 41 | 13 |
SA125734 | 95 | Bangladesh | 41 | 15 |
SA125735 | 137 | Bangladesh | 41 | 15 |
SA125736 | 112 | Bangladesh | 41 | 16 |
SA125737 | 159 | Bangladesh | 41 | 16 |
SA125738 | 42 | Bangladesh | 41 | 16 |
SA125739 | 26 | Bangladesh | 41 | 16 |
SA125740 | 19 | Bangladesh | 41 | 16 |
SA125741 | 48 | Bangladesh | 41 | 16 |
SA125742 | 102 | Bangladesh | 41 | 17 |
SA125743 | 29 | Bangladesh | 41 | 17 |
SA125744 | 73 | Bangladesh | 41 | 18 |
SA125745 | 169 | Bangladesh | 41 | 18 |
SA125746 | 74 | Bangladesh | 41 | 8 |
SA125747 | 135 | Bangladesh | 41 | 8 |
SA125688 | 118 | Bangladesh_GAPPS | 29 | 12 |
SA125689 | 113 | Bangladesh_GAPPS | 32 | 11 |
SA125690 | 158 | Bangladesh_GAPPS | 33 | 12 |
SA125691 | 17 | Bangladesh_GAPPS | 33 | 13 |
SA125692 | 4 | Bangladesh_GAPPS | 33 | 13 |
SA125693 | 39 | Bangladesh_GAPPS | 33 | 15 |
SA125694 | 30 | Bangladesh_GAPPS | 34 | 11 |
SA125695 | 96 | Bangladesh_GAPPS | 35 | 11 |
SA125696 | 100 | Bangladesh_GAPPS | 36 | 11 |
SA125697 | 58 | Bangladesh_GAPPS | 36 | 11 |
SA125698 | 94 | Bangladesh_GAPPS | 39 | 13 |
SA125699 | 23 | Bangladesh_GAPPS | 39 | 19 |
SA125700 | 64 | Bangladesh_GAPPS | 40 | 11 |
SA125701 | 88 | Bangladesh_GAPPS | 40 | 11 |
SA125702 | 157 | Bangladesh_GAPPS | 40 | 12 |
SA125703 | 62 | Bangladesh_GAPPS | 40 | 12 |
SA125704 | 154 | Bangladesh_GAPPS | 40 | 12 |
SA125705 | 43 | Bangladesh_GAPPS | 40 | 12 |
SA125706 | 132 | Bangladesh_GAPPS | 40 | 12 |
SA125707 | 143 | Bangladesh_GAPPS | 40 | 12 |
SA125748 | 7 | Pakistan | 28 | 9 |
SA125749 | 140 | Pakistan | 28 | 9 |
SA125750 | 80 | Pakistan | 32 | 17 |
SA125751 | 67 | Pakistan | 32 | 17 |
SA125752 | 69 | Pakistan | 32 | 8 |
SA125753 | 55 | Pakistan | 32 | 8 |
SA125754 | 66 | Pakistan | 32 | 9 |
SA125755 | 155 | Pakistan | 32 | 9 |
SA125756 | 82 | Pakistan | 33 | 10 |
SA125757 | 160 | Pakistan | 33 | 10 |
SA125758 | 18 | Pakistan | 33 | 12 |
SA125759 | 33 | Pakistan | 33 | 12 |
SA125760 | 11 | Pakistan | 33 | 16 |
SA125761 | 147 | Pakistan | 33 | 16 |
SA125762 | 37 | Pakistan | 33 | 17 |
SA125763 | 15 | Pakistan | 33 | 17 |
SA125764 | 170 | Pakistan | 33 | 18 |
SA125765 | 161 | Pakistan | 33 | 18 |
SA125766 | 41 | Pakistan | 33 | 18 |
SA125767 | 109 | Pakistan | 33 | 18 |
SA125768 | 107 | Pakistan | 39 | 10 |
SA125769 | 52 | Pakistan | 39 | 10 |
SA125770 | 97 | Pakistan | 39 | 12 |
SA125771 | 119 | Pakistan | 39 | 12 |
SA125772 | 53 | Pakistan | 39 | 17 |
SA125773 | 31 | Pakistan | 39 | 17 |
SA125774 | 144 | Pakistan | 39 | 18 |
SA125775 | 72 | Pakistan | 39 | 18 |
SA125776 | 104 | Pakistan | 39 | 8 |
SA125777 | 167 | Pakistan | 39 | 8 |
SA125778 | 34 | Pakistan | 39 | 9 |
SA125779 | 120 | Pakistan | 39 | 9 |
SA125780 | 83 | Pakistan | 39 | 9 |
SA125781 | 166 | Pakistan | 39 | 9 |
SA125782 | 164 | Pakistan | 40 | 16 |
SA125783 | 9 | Pakistan | 40 | 16 |
SA125784 | 124 | Pakistan | 40 | 17 |
SA125785 | 134 | Pakistan | 40 | 17 |
SA125786 | 139 | Pakistan | 40 | 18 |
SA125787 | 127 | Pakistan | 40 | 18 |
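
The Factors table above can be loaded for downstream analysis with a few lines of code. This is a sketch under stated assumptions: the dictionary keys are illustrative names, only four rows are copied from the table to keep the example short, and the 37-week preterm cutoff is the conventional clinical definition rather than something stated in this record.

```python
from collections import Counter

# A few rows copied from the Factors table above
# (mb_sample_id | local_sample_id | Site | GA_delivery | GA_sampling).
ROWS = """\
SA125708 | 59 | Bangladesh | 24 | 17 |
SA125688 | 118 | Bangladesh_GAPPS | 29 | 12 |
SA125748 | 7 | Pakistan | 28 | 9 |
SA125787 | 127 | Pakistan | 40 | 18 |
"""

# Assumption: preterm means delivery before 37 weeks (not stated in the record).
PRETERM_CUTOFF_WEEKS = 37


def parse_rows(text):
    """Turn pipe-delimited factor rows into a list of dictionaries."""
    records = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 5:
            continue  # skip header separators or malformed lines
        mb_id, local_id, site, ga_delivery, ga_sampling = cells
        records.append({
            "mb_sample_id": mb_id,
            "local_sample_id": int(local_id),
            "site": site,
            "ga_delivery": int(ga_delivery),
            "ga_sampling": int(ga_sampling),
            "weeks_to_delivery": int(ga_delivery) - int(ga_sampling),
            "preterm": int(ga_delivery) < PRETERM_CUTOFF_WEEKS,
        })
    return records


records = parse_rows(ROWS)
print(Counter(r["site"] for r in records))
print([r["mb_sample_id"] for r in records if r["preterm"]])
```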
Collection:
Collection ID: | CO001560 |
Collection Summary: | The study comprises a single urine sample for each participant (n = 99) that was collected at a prenatal visit after ultrasound confirmed a gestation < 20 weeks. Ultrasound imaging was performed by trained sonologists in compliance with standard of care. All study sites employed a uniform method of GA assessment, urine collection and handling. Urine samples were aliquoted and frozen at -80°C within 2 hours. Deidentified urine aliquots were shipped on dry ice from each biorepository to Stanford University as a single batch and under continuous temperature monitoring. Urine samples from 20 healthy pregnancies, collected between 8 and 19 weeks of gestation at the Lucile Packard Children’s Hospital at Stanford University, served as the validation cohort. |
Sample Type: | Urine |
Storage Conditions: | -80℃ |
Treatment:
Treatment ID: | TR001580 |
Treatment Summary: | There was no treatment. |
Sample Preparation:
Sampleprep ID: | SP001573 |
Sampleprep Summary: | Urine aliquots were prepared and analyzed in a random order as previously described (Contrepois et al., 2015). Briefly, frozen urine samples were thawed on ice and centrifuged at 17,000g for 10 min at 4°C. Supernatants (25 µl) were then diluted 1:4 with 75% acetonitrile and 100% water for HILIC- and RPLC-MS experiments, respectively. Each sample was spiked-in with 15 analytical-grade internal standards (IS). Samples for HILIC-MS experiments were further centrifuged at 21,000g for 10 min at 4°C to precipitate proteins. |
Combined analysis:
Analysis ID | AN002470 | AN002471 | AN002472 | AN002473 |
---|---|---|---|---|
Analysis type | MS | MS | MS | MS |
Chromatography type | HILIC | HILIC | Reversed phase | Reversed phase |
Chromatography system | Thermo Dionex Ultimate 3000 RS | Thermo Dionex Ultimate 3000 RS | Thermo Dionex Ultimate 3000 RS | Thermo Dionex Ultimate 3000 RS |
Column | SeQuant ZIC-HILIC (100 x 2.1mm,3.5um) | SeQuant ZIC-HILIC (100 x 2.1mm,3.5um) | Hypersil GOLD (150 x 2.1mm,1.9um) | Hypersil GOLD (150 x 2.1mm,1.9um) |
MS Type | ESI | ESI | ESI | ESI |
MS instrument type | Orbitrap | Orbitrap | Orbitrap | Orbitrap |
MS instrument name | Thermo Q Exactive HF hybrid Orbitrap | Thermo Q Exactive HF hybrid Orbitrap | Thermo Q Exactive HF hybrid Orbitrap | Thermo Q Exactive HF hybrid Orbitrap |
Ion Mode | POSITIVE | NEGATIVE | POSITIVE | NEGATIVE |
Units | MS Counts | MS Counts | MS Counts | MS Counts |
Chromatography:
Chromatography ID: | CH001810 |
Chromatography Summary: | HILIC experiments were performed using a ZIC-HILIC column 2.1 x 100 mm, 3.5 µm, 200Å (Merck Millipore) and mobile phase solvents consisting of 10 mM ammonium acetate in 50/50 acetonitrile/water (A) and 10 mM ammonium acetate in 95/5 acetonitrile/water (B) (Contrepois et al., 2015). |
Instrument Name: | Thermo Dionex Ultimate 3000 RS |
Column Name: | SeQuant ZIC-HILIC (100 x 2.1mm,3.5um) |
Column Temperature: | 40°C |
Flow Rate: | 0.5 ml/min |
Solvent A: | 50% acetonitrile/50% water; 10 mM ammonium acetate |
Solvent B: | 95% acetonitrile/5% water; 10 mM ammonium acetate |
Chromatography Type: | HILIC |
Chromatography ID: | CH001811 |
Chromatography Summary: | RPLC experiments were performed using a Hypersil GOLD column 2.1 x 150 mm, 1.9 µm, 175Å (Thermo Scientific) and mobile phase solvents consisting of 0.06% acetic acid in water (A) and 0.06% acetic acid in methanol (B). (Contrepois et al., 2015) |
Chromatography Comments: | Hypersil GOLD column 2.1 x 150 mm, 1.9 µm, 175Å (Thermo Scientific) |
Instrument Name: | Thermo Dionex Ultimate 3000 RS |
Column Name: | Hypersil GOLD (150 x 2.1mm,1.9um) |
Column Temperature: | 60°C |
Flow Rate: | 0.6 ml/min |
Solvent A: | 100% water; 0.06% acetic acid |
Solvent B: | 100% methanol; 0.06% acetic acid |
Chromatography Type: | Reversed phase |
MS:
MS ID: | MS002290 |
Analysis ID: | AN002470 |
Instrument Name: | Thermo Q Exactive HF hybrid Orbitrap |
Instrument Type: | Orbitrap |
MS Type: | ESI |
MS Comments: | Data processing: Data from each mode were independently analyzed using Progenesis QI software (v2.3) (Nonlinear Dynamics). Metabolic features detected in blanks, and features that did not show sufficient linearity upon dilution in QC samples (r < 0.6), were discarded. Only metabolic features present in > 2/3 of the samples were kept for further analysis. Inter- and intra-batch variations were corrected by applying locally estimated scatterplot smoothing (LOESS) regression to pooled samples injected repeatedly throughout the batches (span = 0.75). Data were acquired in four batches for HILIC and RPLC modes. Dilution effects were corrected using probabilistic quotient normalization (PQN) (Rosen Vollmar et al., 2019). Missing values were imputed by drawing from a random distribution of low values in the corresponding sample. Data from each mode were then merged, producing a dataset containing 6,630 metabolic features. Metabolite abundances were reported as spectral counts. Metabolic feature annotation: Peak annotation was first performed by matching experimental m/z, retention time and MS/MS spectra to an in-house library of analytical-grade standards. Remaining peaks were identified by matching experimental m/z and fragmentation spectra to publicly available databases including HMDB (http://www.hmdb.ca/), MoNA (http://mona.fiehnlab.ucdavis.edu/) and MassBank (http://www.massbank.jp/) using the R package ‘MetID’ (v0.2.0) (Shen et al., 2019). Briefly, metabolic feature tables from Progenesis QI were matched to fragmentation spectra with an m/z window of ± 15 ppm and a retention time window of ± 30 s (HILIC) or ± 20 s (RPLC). When multiple MS/MS spectra matched a single metabolic feature, all matched MS/MS spectra were used for the identification. Next, MS1 and MS2 pairs were searched against public databases and a similarity score was calculated using the forward dot-product algorithm, which considers both fragments and intensities (Stein and Scott, 1994). Metabolites were reported if the similarity score was above 0.4. Spectra from metabolic features of interest that were important in random forest models (see below) were further investigated manually to confirm identification. |
Ion Mode: | POSITIVE |
Capillary Temperature: | 375C |
Capillary Voltage: | 3.4kV |
Collision Energy: | 25 & 35 NCE |
Collision Gas: | N2 |
Dry Gas Temp: | 310C |
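
The MS Comments for this analysis state that dilution effects were corrected using probabilistic quotient normalization (PQN). A minimal sketch of the general PQN idea follows. It assumes the reference profile is the feature-wise median across samples and skips any prior scaling step; the record does not describe these details, so this is not necessarily the procedure the authors ran, and the abundances are invented.

```python
from statistics import median


def pqn_normalize(samples):
    """Probabilistic quotient normalization.

    samples: list of per-sample abundance lists (same feature order).
    Returns (normalized_samples, dilution_factors).
    """
    n_features = len(samples[0])
    # Assumption: reference profile = median abundance of each feature.
    reference = [median(s[j] for s in samples) for j in range(n_features)]
    normalized, factors = [], []
    for sample in samples:
        # Quotient of each feature against the reference profile.
        quotients = [
            sample[j] / reference[j]
            for j in range(n_features)
            if reference[j] > 0 and sample[j] > 0
        ]
        # The median quotient estimates the sample's dilution factor.
        factor = median(quotients)
        factors.append(factor)
        normalized.append([value / factor for value in sample])
    return normalized, factors


# Hypothetical abundances: sample 2 is twice as concentrated as sample 1,
# sample 3 is half as concentrated and has one genuinely elevated feature.
example_samples = [
    [100.0, 200.0, 400.0, 50.0],
    [200.0, 400.0, 800.0, 100.0],
    [50.0, 100.0, 200.0, 75.0],
]

normalized, factors = pqn_normalize(example_samples)
print(factors)
print(normalized[2])
```

Because the factor is a median of quotients, a single feature that truly differs between samples (the last feature of sample 3 here) does not distort the estimated dilution.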
MS ID: | MS002291 |
Analysis ID: | AN002471 |
Instrument Name: | Thermo Q Exactive HF hybrid Orbitrap |
Instrument Type: | Orbitrap |
MS Type: | ESI |
MS Comments: | Same data processing and metabolic feature annotation as described in the MS Comments for MS002290 above. |
Ion Mode: | NEGATIVE |
Capillary Temperature: | 375C |
Capillary Voltage: | 3.4kV |
Collision Energy: | 25 & 35 NCE |
Collision Gas: | N2 |
Dry Gas Temp: | 310C |
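
The annotation step scores spectral matches with a dot-product similarity and reports metabolites scoring above 0.4. The sketch below shows a plain normalized dot product (cosine) between two centroided spectra. It is a simplification under stated assumptions: it omits the m/z and intensity weighting used in the Stein and Scott formulation, the 0.01 Da fragment tolerance is a hypothetical choice, and it may differ from what the MetID package computes.

```python
import math

SCORE_THRESHOLD = 0.4  # reporting threshold stated in the MS Comments


def cosine_score(query, library, tol_da=0.01):
    """Normalized dot product between two spectra.

    query, library: lists of (mz, intensity) tuples.
    tol_da: fragment m/z tolerance in Da (hypothetical value).
    """
    used = set()
    dot = 0.0
    for mz_q, int_q in query:
        # Pair the query peak with the closest unused library peak in tolerance.
        best, best_diff = None, tol_da
        for idx, (mz_l, _) in enumerate(library):
            diff = abs(mz_q - mz_l)
            if idx not in used and diff <= best_diff:
                best, best_diff = idx, diff
        if best is not None:
            used.add(best)
            dot += int_q * library[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_l = math.sqrt(sum(i * i for _, i in library))
    if norm_q == 0 or norm_l == 0:
        return 0.0
    return dot / (norm_q * norm_l)


# Hypothetical spectra: one shared fragment, one unmatched fragment each.
query_spectrum = [(100.000, 10.0), (150.000, 5.0)]
library_spectrum = [(100.001, 10.0), (200.000, 5.0)]

score = cosine_score(query_spectrum, library_spectrum)
print(round(score, 3), score > SCORE_THRESHOLD)
```

Unmatched peaks contribute to the norms but not to the dot product, so both missing and extra fragments lower the score.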
MS ID: | MS002292 |
Analysis ID: | AN002472 |
Instrument Name: | Thermo Q Exactive HF hybrid Orbitrap |
Instrument Type: | Orbitrap |
MS Type: | ESI |
MS Comments: | Same data processing and metabolic feature annotation as described in the MS Comments for MS002290 above. |
Ion Mode: | POSITIVE |
Capillary Temperature: | 375C |
Capillary Voltage: | 3.4kV |
Collision Energy: | 25 & 50 NCE |
Collision Gas: | N2 |
Dry Gas Temp: | 310C |
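
The MS Comments describe matching metabolic features to MS/MS spectra within ± 15 ppm in m/z and ± 30 s (HILIC) or ± 20 s (RPLC) in retention time. A minimal sketch of that window test is shown here; the function and variable names are illustrative and the m/z and retention time values are invented.

```python
PPM_TOLERANCE = 15.0                        # from the MS Comments
RT_WINDOW_S = {"HILIC": 30.0, "RPLC": 20.0}  # from the MS Comments


def ppm_error(observed_mz, reference_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def is_match(feature_mz, feature_rt_s, spectrum_mz, spectrum_rt_s, mode):
    """True if an MS/MS spectrum falls inside a feature's m/z and RT windows."""
    within_mz = abs(ppm_error(spectrum_mz, feature_mz)) <= PPM_TOLERANCE
    within_rt = abs(spectrum_rt_s - feature_rt_s) <= RT_WINDOW_S[mode]
    return within_mz and within_rt


# Hypothetical feature at m/z 300.0000 and 120 s; precursor 10 ppm away, 25 s later.
print(is_match(300.0, 120.0, 300.003, 145.0, "HILIC"))  # inside both windows
print(is_match(300.0, 120.0, 300.003, 145.0, "RPLC"))   # outside the 20 s window
```

A relative (ppm) tolerance scales with mass, so the absolute window widens for heavier ions, while the retention time window is an absolute number of seconds per chromatography mode.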
MS ID: | MS002293 |
Analysis ID: | AN002473 |
Instrument Name: | Thermo Q Exactive HF hybrid Orbitrap |
Instrument Type: | Orbitrap |
MS Type: | ESI |
MS Comments: | Same data processing and metabolic feature annotation as described in the MS Comments for MS002290 above. |
Ion Mode: | NEGATIVE |
Capillary Temperature: | 375C |
Capillary Voltage: | 3.4kV |
Collision Energy: | 25 & 50 NCE |
Collision Gas: | N2 |
Dry Gas Temp: | 310C |
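
The data-processing comments state that only metabolic features present in > 2/3 of the samples were kept. The sketch below illustrates that presence filter. It assumes missing detections are represented as None, and the feature names and values are invented for illustration.

```python
def keep_frequent_features(table):
    """Keep features detected in strictly more than 2/3 of the samples.

    table: dict mapping feature name -> list of abundances,
           with None marking a sample where the feature was not detected.
    """
    kept = {}
    for name, values in table.items():
        present = sum(v is not None for v in values)
        # Integer comparison avoids floating-point issues at exactly 2/3.
        if present * 3 > 2 * len(values):
            kept[name] = values
    return kept


# Hypothetical feature table with six samples.
feature_table = {
    "feat_A": [120.0, 98.0, 110.0, None, 105.0, 130.0],  # 5 of 6 present
    "feat_B": [40.0, None, 35.0, None, 50.0, 42.0],      # 4 of 6, exactly 2/3
    "feat_C": [None, None, 12.0, None, 9.0, None],       # 2 of 6 present
}

print(list(keep_frequent_features(feature_table)))
```

With a strict "greater than" comparison, a feature present in exactly two thirds of the samples is dropped, which is how the sketch reads the "> 2/3" wording.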