Summary of project PR002217

This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench (https://www.metabolomicsworkbench.org), where it has been assigned Project ID PR002217. The data can be accessed directly via its Project DOI: 10.21228/M82821. This work is supported by NIH grant U2C-DK119886.

See: https://www.metabolomicsworkbench.org/about/howtocite.php

Project ID: PR002217
Project DOI: 10.21228/M82821
Project Title: Metabolomic-Based Clinical Assessment of Preterm Birth
Project Summary: Machine learning (ML), with advancements in algorithms and computation, is seeing an increased presence in life science research. This study investigated the efficacy of several ML models in predicting preterm birth using untargeted metabolomics from serum collected during the third trimester of gestation. Samples from 48 preterm and 102 term delivery mothers (1:2 ratio) from the All Our Families Cohort (Calgary, AB) were examined. Selected ML applications were used to examine the small-scale clinical dataset for both model performance and metabolite interpretation. Model performance was evaluated based on confusion matrices, receiver operating characteristic curves, and feature importance rankings. Conventional linear models, such as Partial Least Squares Discriminant Analysis (PLS-DA) and linear logistic regression, showed moderate predictive potential with AUC-ROC around 0.60. Non-linear models, including Extreme Gradient Boosting (XGBoost) and artificial neural networks, had marginally improved predictive accuracy and strength. Resampling by bootstrapping was also examined. Among all ML models, bootstrap resampling enhanced XGBoost's performance the most, raising AUC-ROC to 0.85 (95% CI: 0.574-0.995, p<0.001) for the best fitted model. Feature importance analysis by Shapley Additive Explanations (SHAP) consistently identified acylcarnitines and amino acid derivatives as significant metabolites. The findings underscore the complexity of modeling preterm birth prediction and suggest a trial-and-error approach for model selection.
Institute: University of Calgary
Last Name: Han
First Name: Ying Chieh
Address: 2500 University Drive NW, Calgary, Alberta, T2N 1N4, Canada
Email: yingchieh.han@ucalgary.ca
Phone: 17783848168
Funding Source: NSERC
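
The project summary reports an AUC-ROC together with a 95% confidence interval obtained under bootstrap resampling. The sketch below illustrates one common way such an interval can be computed, a percentile bootstrap over resampled (label, score) pairs. It is not the study's code: the function names `auc_roc` and `bootstrap_auc_ci` are hypothetical, the data is synthetic, and the study may have used a different bootstrap scheme.

```python
import random


def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point AUC, lower bound, upper bound) via a percentile bootstrap."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has only one class
        stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return auc_roc(labels, scores), lower, upper


# Synthetic example only (1 = preterm, 0 = term); not data from this project.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.8, 0.5, 0.45, 0.3, 0.3, 0.2, 0.2, 0.1]
point, lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC-ROC = {point:.3f}, 95% CI: {lower:.3f}-{upper:.3f}")
```

With small samples the percentile interval tends to be wide, which is consistent with the broad interval quoted in the summary.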

Summary of all studies in project PR002217

Study ID: ST003587
Study Title: Comparison of Machine Learning Models for Metabolomic-Based Clinical Prediction of Preterm Birth
Species: Homo sapiens
Institute: University of Calgary
Analysis (* : contains untargeted data): MS
Release Date: 2024-12-12
Version: 1
Samples: 150
Download (* : contains raw data): Uploaded data (1.2G)*