Summary of Study ST003587
This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR002217. The data can be accessed directly via its Project DOI: 10.21228/M82821. This work is supported by NIH grant U2C-DK119886.
See: https://www.metabolomicsworkbench.org/about/howtocite.php
This study contains a large results data set, which is not available in the mwTab file. It is available for download only via FTP as data file(s).
Study ID | ST003587 |
Study Title | Comparison of Machine Learning Models for Metabolomic-Based Clinical Prediction of Preterm Birth |
Study Summary | Machine learning (ML), with advancements in algorithms and computations, is seeing an increased presence in life science research. This study investigated several ML models' efficacy in predicting preterm birth using untargeted metabolomics from serum collected during the third trimester of gestation. Samples from 48 preterm and 102 term delivery mothers (1:2 ratio) from the All Our Families Cohort (Calgary, AB) were examined. Selected ML applications were used to examine the small-scale clinical dataset for both model performance and metabolite interpretation. Model performance was evaluated based on confusion matrices, receiver operating characteristic curves, and feature importance rankings. Conventional linear models, like Partial Least Squares Discriminant Analysis (PLS-DA) and linear logistic regression, showed moderate predictive potential with AUC-ROC around 0.60. Non-linear models, including Extreme Gradient Boosting (XGBoost) and artificial neural networks, had marginally improved predictive accuracy and strength. Resampling by bootstrapping was also examined. Among all ML models, bootstrap resampling enhanced XGBoost's performance the most, improving AUC-ROC (0.85, 95% CI: 0.574-0.995, p<0.001) for the best fitted model. Feature importance analysis by Shapley Additive Explanations consistently identified acylcarnitines and amino acid derivatives as significant metabolites. Findings underscored the complexity of modeling preterm birth prediction, suggesting a trial-and-error approach for model selection. |
Institute | University of Calgary |
Last Name | Han |
First Name | Ying Chieh |
Address | 2500 University Drive NW, Calgary, Alberta, T2N 1N4, Canada |
Email | yingchieh.han@ucalgary.ca |
Phone | 17783848168 |
Submit Date | 2024-11-08 |
Num Groups | 2 |
Total Subjects | 150 |
Num Females | 150 |
Raw Data Available | Yes |
Raw Data File Type(s) | mzML |
Analysis Type Detail | LC-MS |
Release Date | 2024-12-12 |
Release Version | 1 |
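The Study Summary reports an AUC-ROC with a 95% confidence interval in the context of bootstrap resampling. As an illustration only, the sketch below shows one common way such an interval can be estimated: a percentile bootstrap over (label, score) pairs. It is not taken from the study and may differ from the authors' procedure. The function names `auc_roc` and `bootstrap_auc_ci`, the parameter defaults, and the toy labels and scores are all assumptions made for this example.

```python
import random


def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower bound, upper bound) for AUC-ROC.

    Samples are drawn with replacement; resamples that contain only one
    class are skipped because AUC is undefined for them.
    """
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue
        stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_i = max(0, int((alpha / 2) * n_boot))
    hi_i = min(n_boot - 1, max(0, int((1 - alpha / 2) * n_boot) - 1))
    return auc_roc(labels, scores), stats[lo_i], stats[hi_i]


# Hypothetical toy data (1 = preterm, 0 = term); not values from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
point, lower, upper = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200)
print(f"AUC = {point:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With a cohort as small as the one described here, percentile intervals of this kind tend to be wide, which is consistent with the broad interval quoted in the summary.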
Project:
Project ID: | PR002217 |
Project DOI: | doi: 10.21228/M82821 |
Project Title: | Metabolomic-Based Clinical Assessment of Preterm Birth |
Project Summary: | Machine learning (ML), with advancements in algorithms and computations, is seeing an increased presence in life science research. This study investigated several ML models' efficacy in predicting preterm birth using untargeted metabolomics from serum collected during the third trimester of gestation. Samples from 48 preterm and 102 term delivery mothers (1:2 ratio) from the All Our Families Cohort (Calgary, AB) were examined. Selected ML applications were used to examine the small-scale clinical dataset for both model performance and metabolite interpretation. Model performance was evaluated based on confusion matrices, receiver operating characteristic curves, and feature importance rankings. Conventional linear models, like Partial Least Squares Discriminant Analysis (PLS-DA) and linear logistic regression, showed moderate predictive potential with AUC-ROC around 0.60. Non-linear models, including Extreme Gradient Boosting (XGBoost) and artificial neural networks, had marginally improved predictive accuracy and strength. Resampling by bootstrapping was also examined. Among all ML models, bootstrap resampling enhanced XGBoost's performance the most, improving AUC-ROC (0.85, 95% CI: 0.574-0.995, p<0.001) for the best fitted model. Feature importance analysis by Shapley Additive Explanations consistently identified acylcarnitines and amino acid derivatives as significant metabolites. Findings underscored the complexity of modeling preterm birth prediction, suggesting a trial-and-error approach for model selection. |
Institute: | University of Calgary |
Last Name: | Han |
First Name: | Ying Chieh |
Address: | 2500 University Drive NW, Calgary, Alberta, T2N 1N4, Canada |
Email: | yingchieh.han@ucalgary.ca |
Phone: | 17783848168 |
Funding Source: | NSERC |
Subject:
Subject ID: | SU003716 |
Subject Type: | Human |
Subject Species: | Homo sapiens |
Taxonomy ID: | 9606 |
Age Or Age Range: | 19-43 |
Weight Or Weight Range: | 44-116 |
Height Or Height Range: | 147-186 |
Gender: | Female |
Human Ethnicity: | Caucasian |
Species Group: | Mammals |
Factors:
Subject type: Human; Subject species: Homo sapiens
mb_sample_id | local_sample_id | Termstatus | Sample source |
---|---|---|---|
SA391403 | 815139 | Preterm Birth | Blood Serum |
SA391404 | 810506 | Preterm Birth | Blood Serum |
SA391405 | 818540 | Preterm Birth | Blood Serum |
SA391406 | 818520 | Preterm Birth | Blood Serum |
SA391407 | 818409 | Preterm Birth | Blood Serum |
SA391408 | 810526 | Preterm Birth | Blood Serum |
SA391409 | 818368 | Preterm Birth | Blood Serum |
SA391410 | 818323 | Preterm Birth | Blood Serum |
SA391411 | 818237 | Preterm Birth | Blood Serum |
SA391412 | 818224 | Preterm Birth | Blood Serum |
SA391413 | 815179 | Preterm Birth | Blood Serum |
SA391414 | 815161 | Preterm Birth | Blood Serum |
SA391415 | 815076 | Preterm Birth | Blood Serum |
SA391416 | 818577 | Preterm Birth | Blood Serum |
SA391417 | 810585 | Preterm Birth | Blood Serum |
SA391418 | 812580 | Preterm Birth | Blood Serum |
SA391419 | 812566 | Preterm Birth | Blood Serum |
SA391420 | 812523 | Preterm Birth | Blood Serum |
SA391421 | 812517 | Preterm Birth | Blood Serum |
SA391422 | 812478 | Preterm Birth | Blood Serum |
SA391423 | 812471 | Preterm Birth | Blood Serum |
SA391424 | 812462 | Preterm Birth | Blood Serum |
SA391425 | 812459 | Preterm Birth | Blood Serum |
SA391426 | 812285 | Preterm Birth | Blood Serum |
SA391427 | 812359 | Preterm Birth | Blood Serum |
SA391428 | 818575 | Preterm Birth | Blood Serum |
SA391429 | 818274 | Preterm Birth | Blood Serum |
SA391430 | 812342 | Preterm Birth | Blood Serum |
SA391431 | 818822 | Preterm Birth | Blood Serum |
SA391432 | 830909 | Preterm Birth | Blood Serum |
SA391433 | 830872 | Preterm Birth | Blood Serum |
SA391434 | 830850 | Preterm Birth | Blood Serum |
SA391435 | 830762 | Preterm Birth | Blood Serum |
SA391436 | 830687 | Preterm Birth | Blood Serum |
SA391437 | 810387 | Preterm Birth | Blood Serum |
SA391438 | 830640 | Preterm Birth | Blood Serum |
SA391439 | 818614 | Preterm Birth | Blood Serum |
SA391440 | 830560 | Preterm Birth | Blood Serum |
SA391441 | 830505 | Preterm Birth | Blood Serum |
SA391442 | 830408 | Preterm Birth | Blood Serum |
SA391443 | 830390 | Preterm Birth | Blood Serum |
SA391444 | 830635 | Preterm Birth | Blood Serum |
SA391445 | 818716 | Preterm Birth | Blood Serum |
SA391446 | 810453 | Preterm Birth | Blood Serum |
SA391447 | 818626 | Preterm Birth | Blood Serum |
SA391448 | 818781 | Preterm Birth | Blood Serum |
SA391449 | 818684 | Preterm Birth | Blood Serum |
SA391450 | 818732 | Preterm Birth | Blood Serum |
SA391451 | 818036 | Term Birth | Blood Serum |
SA391452 | 812455 | Term Birth | Blood Serum |
SA391453 | 812435 | Term Birth | Blood Serum |
SA391454 | 812431 | Term Birth | Blood Serum |
SA391455 | 812412 | Term Birth | Blood Serum |
SA391456 | 812404 | Term Birth | Blood Serum |
SA391457 | 812379 | Term Birth | Blood Serum |
SA391458 | 812397 | Term Birth | Blood Serum |
SA391459 | 812373 | Term Birth | Blood Serum |
SA391460 | 812369 | Term Birth | Blood Serum |
SA391461 | 818556 | Term Birth | Blood Serum |
SA391462 | 812352 | Term Birth | Blood Serum |
SA391463 | 812458 | Term Birth | Blood Serum |
SA391464 | 818706 | Term Birth | Blood Serum |
SA391465 | 818695 | Term Birth | Blood Serum |
SA391466 | 830757 | Term Birth | Blood Serum |
SA391467 | 818010 | Term Birth | Blood Serum |
SA391468 | 812479 | Term Birth | Blood Serum |
SA391469 | 812483 | Term Birth | Blood Serum |
SA391470 | 812498 | Term Birth | Blood Serum |
SA391471 | 818371 | Term Birth | Blood Serum |
SA391472 | 812550 | Term Birth | Blood Serum |
SA391473 | 818758 | Term Birth | Blood Serum |
SA391474 | 812350 | Term Birth | Blood Serum |
SA391475 | 815111 | Term Birth | Blood Serum |
SA391476 | 815124 | Term Birth | Blood Serum |
SA391477 | 818197 | Term Birth | Blood Serum |
SA391478 | 818789 | Term Birth | Blood Serum |
SA391479 | 818593 | Term Birth | Blood Serum |
SA391480 | 510369 | Term Birth | Blood Serum |
SA391481 | 812347 | Term Birth | Blood Serum |
SA391482 | 810499 | Term Birth | Blood Serum |
SA391483 | 810473 | Term Birth | Blood Serum |
SA391484 | 810482 | Term Birth | Blood Serum |
SA391485 | 810489 | Term Birth | Blood Serum |
SA391486 | 810492 | Term Birth | Blood Serum |
SA391487 | 810495 | Term Birth | Blood Serum |
SA391488 | 810497 | Term Birth | Blood Serum |
SA391489 | 810502 | Term Birth | Blood Serum |
SA391490 | 810467 | Term Birth | Blood Serum |
SA391491 | 810503 | Term Birth | Blood Serum |
SA391492 | 810509 | Term Birth | Blood Serum |
SA391493 | 810513 | Term Birth | Blood Serum |
SA391494 | 810514 | Term Birth | Blood Serum |
SA391495 | 810523 | Term Birth | Blood Serum |
SA391496 | 810537 | Term Birth | Blood Serum |
SA391497 | 810472 | Term Birth | Blood Serum |
SA391498 | 810466 | Term Birth | Blood Serum |
SA391499 | 810542 | Term Birth | Blood Serum |
SA391500 | 810388 | Term Birth | Blood Serum |
SA391501 | 515122 | Term Birth | Blood Serum |
SA391502 | 515123 | Term Birth | Blood Serum |
Collection:
Collection ID: | CO003709 |
Collection Summary: | Serum samples were collected during the third trimester, between 28 and 32 weeks of gestation. The collected serum was centrifuged and stored at -80℃ until the day of the sample assay. |
Sample Type: | Blood (serum) |
Collection Location: | Calgary |
Storage Conditions: | -80℃ |
Treatment:
Treatment ID: | TR003725 |
Treatment Summary: | The experimental group in this study consisted of pregnant women who later experienced preterm delivery. No additional treatment was implemented. |
Sample Preparation:
Sampleprep ID: | SP003723 |
Sampleprep Summary: | Samples were prepared following the procedure described in the manuscript “Maternal Acylcarnitine Disruption as a Potential Predictor of Preterm Birth in Primigravida”, published in Nutrients (2024), 16(5), 595. doi: 10.3390/nu16050595. |
Processing Storage Conditions: | Room temperature |
Extraction Method: | Protein precipitation with methanol |
Extract Storage: | -80℃ |
Sample Resuspension: | in 1:1 methanol:water |
Sample Derivatization: | no |
Sample Spiking: | no |
Combined analysis:
Analysis ID | AN005891 |
---|---|
Analysis type | MS |
Chromatography type | Reversed phase |
Chromatography system | Agilent QTOF 6545i |
Column | Waters ACQUITY UPLC HSS T3 (150 x 2.1 mm, 1.7 um) |
MS Type | ESI |
MS instrument type | QTOF |
MS instrument name | Agilent 6545 QTOF |
Ion Mode | POSITIVE |
Units | Peak Intensity |
Chromatography:
Chromatography ID: | CH004474 |
Chromatography Summary: | Reverse Phase Positive ESI method |
Instrument Name: | Agilent QTOF 6545i |
Column Name: | Waters ACQUITY UPLC HSS T3 (150 x 2.1 mm, 1.7 um) |
Column Temperature: | 40℃ |
Flow Gradient: | Initiated with 5% B for 1.5 min, then a linear gradient of B from 5% to 100% for 14 min, followed by 100% B for 3 min. The gradient returned to the 5% B starting condition at the 17 min mark and equilibrated for 2 min to conclude the run |
Flow Rate: | 0.4 mL/min |
Solvent A: | 100% Water; 0.1% Formic acid |
Solvent B: | 100% Acetonitrile; 0.1% Formic acid |
Chromatography Type: | Reversed phase |
MS:
MS ID: | MS005609 |
Analysis ID: | AN005891 |
Instrument Name: | Agilent 6545 QTOF |
Instrument Type: | QTOF |
MS Type: | ESI |
MS Comments: | MS spectra were acquired in positive ionization mode between 50 and 1200 m/z. Peaks were labeled using the XCMS web platform. Compound identities were determined by searching the m/z of identified peaks in the Human Metabolome Database for the most probable compound candidate, based on a tolerance threshold of 30 ppm. |
Ion Mode: | POSITIVE |
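The MS Comments describe matching observed m/z values to database candidates within a 30 ppm tolerance. The sketch below shows how a ppm mass error and the corresponding acceptance window are commonly computed. It is a minimal illustration, not the XCMS or Human Metabolome Database implementation; the function names and the example m/z values are hypothetical.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed m/z relative to a theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=30.0):
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def mz_window(theoretical_mz, tol_ppm=30.0):
    """Lower and upper m/z bounds accepted for a candidate at the given tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


# Hypothetical values chosen for round numbers; not peaks from this study.
candidate_mz = 200.0000
for observed in (200.0050, 200.0070):
    print(observed, round(ppm_error(observed, candidate_mz), 1),
          within_tolerance(observed, candidate_mz))
```

Because the tolerance is relative, the absolute window scales with mass: at the 30 ppm threshold it is about ±0.006 at m/z 200 and about ±0.036 at m/z 1200, the upper end of the scan range.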