Summary of Study ST003995
This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR002500. The data can be accessed directly via its Project DOI: 10.21228/M8GV7H. This work is supported by NIH grant U2C-DK119886.
See: https://www.metabolomicsworkbench.org/about/howtocite.php
This study contains a large results data set that is not included in the mwTab file. It is available only as data file(s) for download via FTP.
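For programmatic access to the study metadata, the Metabolomics Workbench also offers a REST service. The sketch below only builds request URLs following the service's `context/input_item/input_value/output_item` pattern; the helper name `mw_rest_url` is hypothetical, and whether each output item returns data for this particular study has not been checked here.

```python
BASE_URL = "https://www.metabolomicsworkbench.org/rest"


def mw_rest_url(study_id: str, output_item: str = "summary") -> str:
    """Build a Metabolomics Workbench REST URL for a study-level query.

    `mw_rest_url` is an illustrative helper, not part of any official client.
    """
    # Study IDs on the Workbench look like "ST" followed by digits.
    if not (study_id.startswith("ST") and study_id[2:].isdigit()):
        raise ValueError(f"unexpected study ID: {study_id!r}")
    return f"{BASE_URL}/study/study_id/{study_id}/{output_item}"


# URLs for the study summary and for the sample factors of this study.
print(mw_rest_url("ST003995"))
print(mw_rest_url("ST003995", "factors"))
```

The returned URL can be passed to any HTTP client. The large results data set itself is still obtained via FTP, as noted above.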
| Study ID | ST003995 |
| Study Title | A machine learning framework to predict cancer metabolomics from gene expression data |
| Study Summary | Metabolomics provides a direct functional readout of a tumor’s physiology. Yet, it is lagging behind other omics technologies in facilitating disease monitoring and prognostication. This stems partly from the scarcity of large-scale metabolomic studies, but also the analytical complexities of detecting diverse metabolites with varying physicochemical properties and concentrations. To address this, we developed a machine learning framework using both tumor tissue and cell line samples across multiple cancer types that allows prediction of metabolomics from gene expression data. To validate our models we performed metabolomic analyses to detect metabolite levels in MCF7 (PI3K wild-type (WT) and E545K mutant (MUT)) and MCF10A isogenic cell lines (PI3K WT, E545K and H1047R MUT) for which coupled RNA-Seq data was available. Targeted profiling of 50 metabolites using UHPLC-MS showed that changes in a pool of metabolites between WT and MUT cell lines positively correlated with predictions of the machine learning framework. This work offers a scalable and efficient machine learning pipeline to determine metabolic from transcriptomic signatures, opening avenues to reconstruct and study the metabolic landscape of samples across novel and existing datasets lacking direct metabolomics measurements. |
| Institute | The Institute of Cancer Research London |
| Department | Cell and Molecular Biology |
| Laboratory | Signalling and Cancer Metabolism |
| Last Name | Poulogiannis |
| First Name | George |
| Address | 237 Fulham Road SW3 6JB LONDON |
| Email | george.poulogiannis@icr.ac.uk |
| Phone | +442071535347 |
| Submit Date | 2025-05-28 |
| Raw Data Available | Yes |
| Raw Data File Type(s) | mzML, d |
| Analysis Type Detail | LC-MS |
| Release Date | 2025-09-30 |
| Release Version | 1 |
Project:
| Project ID: | PR002500 |
| Project DOI: | doi: 10.21228/M8GV7H |
| Project Title: | A machine learning framework to predict cancer metabolomics from gene expression data |
| Project Summary: | Metabolomics provides a direct functional readout of a tumor’s physiology. Yet, it is lagging behind other omics technologies in facilitating disease monitoring and prognostication. This stems partly from the scarcity of large-scale metabolomic studies, but also the analytical complexities of detecting diverse metabolites with varying physicochemical properties and concentrations. To address this, we developed a machine learning framework using both tumor tissue and cell line samples across multiple cancer types that allows prediction of metabolomics from gene expression data. Two different model types were selected and trained for tissues and cell lines with their generalization capacity validated on independent cohorts, accurately predicting as high as 70-80% of tested metabolites. This work offers a scalable and efficient machine learning pipeline to determine metabolic from transcriptomic signatures, opening avenues to reconstruct and study the metabolic landscape of samples across novel and existing datasets lacking direct metabolomics measurements. |
| Institute: | The Institute of Cancer Research London |
| Department: | Cell and Molecular Biology |
| Laboratory: | Signalling and Cancer Metabolism |
| Last Name: | Poulogiannis |
| First Name: | George |
| Address: | 237 Fulham Road, LONDON, London, SW3 6JB, United Kingdom |
| Email: | george.poulogiannis@icr.ac.uk |
| Phone: | +442071535347 |
| Funding Source: | Work in the GP lab was supported by UK Research and Innovation (MR/W012030/1 and MC_PC_MR/X013715/1). |
Subject:
| Subject ID: | SU004132 |
| Subject Type: | Cultured cells |
| Subject Species: | Homo sapiens |
| Taxonomy ID: | 9606 |
Factors:
Subject type: Cultured cells; Subject species: Homo sapiens
| mb_sample_id | local_sample_id | Sample type | Sample source |
|---|---|---|---|
| SA461167 | MCF10A E545K5 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461168 | MCF10A H1047R5 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461169 | MCF10A WT2 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461170 | MCF10A H1047R3 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461171 | MCF10A H1047R2 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461172 | MCF10A H1047R1 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461173 | MCF10A WT1 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461174 | MCF10A E545K4 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461175 | MCF10A WT4 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461176 | MCF10A E545K3 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461177 | MCF10A WT3 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461178 | MCF10A H1047R4 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461179 | MCF10A WT5 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461180 | MCF10A E545K1 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461181 | MCF10A E545K2 | MCF10A normal breast epithelial cell line | Cultured cells |
| SA461182 | MCF7 WT1 | MCF7 breast cancer cell line | Cultured cells |
| SA461183 | MCF7 WT2 | MCF7 breast cancer cell line | Cultured cells |
| SA461184 | MCF7 WT4 | MCF7 breast cancer cell line | Cultured cells |
| SA461185 | MCF7 WT5 | MCF7 breast cancer cell line | Cultured cells |
| SA461186 | MCF7 MUT1 | MCF7 breast cancer cell line | Cultured cells |
| SA461187 | MCF7 MUT2 | MCF7 breast cancer cell line | Cultured cells |
| SA461188 | MCF7 MUT3 | MCF7 breast cancer cell line | Cultured cells |
| SA461189 | MCF7 MUT4 | MCF7 breast cancer cell line | Cultured cells |
| SA461190 | MCF7 MUT5 | MCF7 breast cancer cell line | Cultured cells |
| SA461191 | MCF7 WT3 | MCF7 breast cancer cell line | Cultured cells |
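The `local_sample_id` values in the table above follow a regular "cell line, genotype, replicate number" pattern, so group labels can be recovered from them directly. The following is a minimal sketch, assuming that pattern holds for every sample; the regular expression and the helper `parse_sample` are illustrative and not part of the submission.

```python
import re
from collections import Counter

# Assumed pattern: "<cell line> <genotype><replicate>", e.g. "MCF10A H1047R3".
SAMPLE_RE = re.compile(r"^(MCF10A|MCF7) (WT|MUT|E545K|H1047R)(\d+)$")


def parse_sample(local_sample_id):
    """Split a local_sample_id into (cell_line, genotype, replicate)."""
    match = SAMPLE_RE.match(local_sample_id)
    if match is None:
        raise ValueError(f"unrecognised sample ID: {local_sample_id!r}")
    cell_line, genotype, replicate = match.groups()
    return cell_line, genotype, int(replicate)


# The 25 local_sample_id values listed in the Factors table.
sample_ids = [
    "MCF10A E545K5", "MCF10A H1047R5", "MCF10A WT2", "MCF10A H1047R3",
    "MCF10A H1047R2", "MCF10A H1047R1", "MCF10A WT1", "MCF10A E545K4",
    "MCF10A WT4", "MCF10A E545K3", "MCF10A WT3", "MCF10A H1047R4",
    "MCF10A WT5", "MCF10A E545K1", "MCF10A E545K2",
    "MCF7 WT1", "MCF7 WT2", "MCF7 WT4", "MCF7 WT5", "MCF7 MUT1",
    "MCF7 MUT2", "MCF7 MUT3", "MCF7 MUT4", "MCF7 MUT5", "MCF7 WT3",
]

# Count replicates per (cell line, genotype) group.
group_sizes = Counter(parse_sample(s)[:2] for s in sample_ids)
for group, n in sorted(group_sizes.items()):
    print(group, n)
```

Grouping this way gives five replicates for each of the five cell line and genotype combinations.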
Collection:
| Collection ID: | CO004125 |
| Collection Summary: | Isogenic PI3K WT and their respective MUT (E545K and H1047R) MCF10A cells were cultured in DMEM/F-12 supplemented with 5% horse serum, 20 ng/mL epidermal growth factor (EGF), 100 ng/mL cholera toxin, 0.5 mg/mL hydrocortisone, 10 μg/mL insulin, 100 IU/mL penicillin and 100 μg/mL streptomycin (PEST). MCF7 isogenic cells were cultured in DMEM supplemented with 10% FBS, 100 IU/mL penicillin and 100 μg/mL streptomycin. Five million cells were seeded per 100 mm Petri dish and incubated for 24 hours in full media. The cells were washed once with ice-cold PBS, snap-frozen in liquid nitrogen and placed on ice. |
| Collection Protocol Filename: | ICR_GP_Protocol.pdf |
| Sample Type: | Cultured cells |
Treatment:
| Treatment ID: | TR004141 |
| Treatment Summary: | No treatment. |
Sample Preparation:
| Sampleprep ID: | SP004138 |
| Sampleprep Summary: | Metabolites were extracted with 500 μL of extraction buffer (methanol : acetonitrile : water, 40 : 40 : 20, pre-chilled at -20°C). The samples were then centrifuged for 10 min at +4°C, 10,000 RPM and the supernatant was transferred to screw-cap tubes for long-term storage at -80°C. Subsequently, 100 μL of the metabolite solution was mixed with 100 μL of acetonitrile, vortexed briefly, centrifuged for 10 min at +4°C, 10,000 RPM and finally transferred into LC-MS V-shaped vials for analysis. |
| Sampleprep Protocol Filename: | ICR_GP_Protocol.pdf |
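As a worked example of the volumes implied by the sample preparation summary, the sketch below splits the 500 μL of 40 : 40 : 20 extraction buffer into its components and estimates the solvent composition after the 1:1 dilution with acetonitrile. It assumes that volumes are additive and that the cell material contributes no volume; neither assumption is stated in the record, and the helper `split_by_ratio` is illustrative.

```python
def split_by_ratio(total_ul, ratio):
    """Split a total volume (in uL) according to volume-ratio parts."""
    parts = sum(ratio.values())
    return {name: total_ul * share / parts for name, share in ratio.items()}


# 500 uL of methanol : acetonitrile : water at 40 : 40 : 20.
buffer_ul = split_by_ratio(500, {"methanol": 40, "acetonitrile": 40, "water": 20})
print(buffer_ul)

# 100 uL of extract mixed with 100 uL of acetonitrile before injection.
extract_ul, acetonitrile_added_ul = 100, 100
final_ul = {name: extract_ul * vol / 500 for name, vol in buffer_ul.items()}
final_ul["acetonitrile"] += acetonitrile_added_ul

total_ul = extract_ul + acetonitrile_added_ul
final_percent = {name: 100 * vol / total_ul for name, vol in final_ul.items()}
print(final_percent)
```

Under these assumptions the buffer contains 200 μL methanol, 200 μL acetonitrile and 100 μL water, and the injected solution is roughly 20% methanol, 70% acetonitrile and 10% water, with the extract diluted two-fold.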
Chromatography:
| Chromatography ID: | CH004999 |
| Chromatography Summary: | Chromatography column: InfinityLab Poroshell 120 HILIC-z column (2.7 μm, 2.1 mm x 100 mm, PEEK-lined - Agilent: 675775-924). Solvent A: 10 mM ammonium acetate in water pH 9 supplemented with 2.5 μM InfinityLab Deactivator Additive; Solvent B: 10 mM ammonium acetate in acetonitrile/water 85:15 (V:V) pH 9 supplemented with 2.5 μM InfinityLab Deactivator Additive |
| Methods Filename: | ICR_GP_Protocol.pdf |
| Instrument Name: | Agilent 1290 Infinity II |
| Column Name: | Agilent InfinityLab Poroshell 120 HILIC-z (100 x 2.1mm, 2.7um) |
| Column Temperature: | 50 °C |
| Flow Gradient: | 0 minutes, 96%B; 2 minutes, 96%B; 5.5 minutes, 88%B; 8.5 minutes, 88%B; 9 minutes, 86%B; 14 minutes, 86%B; 17 minutes, 82%B; 23 minutes, 65%B; 24 minutes, 65%B; 24.5 minutes, 96%B; 26 minutes, 96%B and post-time of 3 minutes |
| Flow Rate: | 0.25 mL/minute |
| Solvent A: | 100% water; 10 mM ammonium acetate; 2.5 μM InfinityLab Deactivator Additive |
| Solvent B: | 85% acetonitrile/15% water; 10 mM ammonium acetate; 2.5 μM InfinityLab Deactivator Additive |
| Chromatography Type: | HILIC |
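The flow gradient above is given as a list of time points. To look up the mobile-phase composition at an arbitrary retention time, one option is to interpolate between those points. The sketch below assumes linear ramps between consecutive time points, which is common for LC pump programs but is not stated in this record; the function name `percent_b` is illustrative.

```python
from bisect import bisect_right

# (time in minutes, %B) pairs copied from the Flow Gradient field.
GRADIENT = [
    (0.0, 96.0), (2.0, 96.0), (5.5, 88.0), (8.5, 88.0), (9.0, 86.0),
    (14.0, 86.0), (17.0, 82.0), (23.0, 65.0), (24.0, 65.0),
    (24.5, 96.0), (26.0, 96.0),
]


def percent_b(t_min):
    """Return %B at time t_min, assuming linear ramps between set points."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError(f"time {t_min} min is outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:
        # Exactly at the final set point.
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (1.0, 3.75, 20.0, 26.0):
    b = percent_b(t)
    print(f"{t:5.2f} min: {b:.1f}% B, {100 - b:.1f}% A")
```

The 3-minute post-time is not included in the table, so times beyond 26 minutes raise an error rather than returning a guessed value.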
Analysis:
| Analysis ID: | AN006584 |
| Analysis Type: | MS |
| Analysis Protocol File: | ICR_GP_Protocol.pdf |
| Chromatography ID: | CH004999 |
| Num Factors: | 2 |
| Num Metabolites: | 51 |
| Units: | Normalized Area: Area/ug of protein |
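The reported unit is peak area divided by micrograms of protein. The sketch below shows that normalization and a log2 fold change between mutant and wild-type group means, in the spirit of the WT versus MUT comparison described in the study summary. All numbers are made up for illustration, and the function names are hypothetical; the authors' actual processing steps are described only in their protocol file.

```python
import math
from statistics import mean


def normalize_area(peak_area, protein_ug):
    """Normalized area = raw peak area / ug of protein."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return peak_area / protein_ug


def log2_fold_change(mut_values, wt_values):
    """log2 of the ratio of group means (MUT over WT)."""
    return math.log2(mean(mut_values) / mean(wt_values))


# Hypothetical (peak area, ug protein) pairs for one metabolite.
wt = [normalize_area(1.2e6, 150.0), normalize_area(1.6e6, 160.0)]
mut = [normalize_area(2.4e6, 150.0), normalize_area(3.2e6, 160.0)]

print(wt, mut)
print(log2_fold_change(mut, wt))
```

A positive value indicates a higher normalized level in the mutant group, and a negative value indicates a lower one.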