Summary of Study ST001269

This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR000854. The data can be accessed directly via its Project DOI: 10.21228/M8998T. This work is supported by NIH grant U2C-DK119886.

See: https://www.metabolomicsworkbench.org/about/howtocite.php

This study contains a large results data set, which is not available in the mwTab file. It is available for download only via FTP as data file(s).

Study ID:ST001269
Study Title:Exosomal lipids for classifying early and late stage non-small cell lung cancer
Study Type:Biomarker Discovery
Study Summary:Lung cancer is the leading cause of cancer deaths in the United States. Patients with early stage lung cancer have the best prognosis with surgical removal of the tumor, but the disease is often asymptomatic until advanced disease develops, and there are no effective blood-based screening methods for early detection of lung cancer in at-risk populations. We have explored the lipid profiles of blood plasma exosomes using ultra high-resolution Fourier transform mass spectrometry (UHR-FTMS) for early detection of the prevalent non-small cell lung cancers (NSCLC). Exosomes are nanovehicles released by various cells and tumor tissues to elicit important biofunctions such as immune modulation and tumor development. Plasma exosomal lipid profiles were acquired from 39 normal and 91 NSCLC subjects (44 early stage and 47 late stage). We have applied two multivariate statistical methods, Random Forest (RF) and Least Absolute Shrinkage and Selection Operator (LASSO), to classify the data. For the RF method, the Gini importance of the assigned lipids was calculated to select the 16 lipids with top importance. Using the LASSO method, 7 features were selected based on a grouped LASSO penalty. The Area Under the Receiver Operating Characteristic curve for early and late stage cancer versus normal subjects using the selected lipid features was 0.85 and 0.88 for RF and 0.79 and 0.77 for LASSO, respectively. These results show the value of RF and LASSO for metabolomics data-based biomarker development, which provide robust and independent classifiers with sparse data sets. Application of LASSO and Random Forests identifies lipid features that successfully distinguish early stage lung cancer patients from healthy individuals.
Institute:University of Kentucky
Department:Center for Environmental and Systems Biochemistry
Last Name:Thompson
First Name:Patrick
Address:789 South Limestone, Lexington, Kentucky, 40536, USA
Email:ptth222@uky.edu; rick.higashi@uky.edu
Phone:8592181027
Submit Date:2019-10-17
Total Subjects:95
Publications:https://doi.org/10.1016/j.aca.2018.02.051
Raw Data Available:Yes
Raw Data File Type(s):raw(Thermo)
Analysis Type Detail:MS(Dir. Inf.)
Release Date:2019-10-11
Release Version:1
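The Gini importance named in the summary is, in a random forest, built up from the impurity decrease of each split that uses a given feature. The following is a minimal sketch of that building block for a single split, not the authors' pipeline; the intensities, labels, and threshold are hypothetical values chosen for illustration.

```python
from collections import Counter
from typing import Sequence


def gini_impurity(labels: Sequence[str]) -> float:
    """Gini impurity 1 - sum(p_k^2) of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values: Sequence[float], labels: Sequence[str], threshold: float) -> float:
    """Impurity decrease obtained by splitting the samples at values <= threshold."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Hypothetical normalized intensities of one lipid feature in six samples.
intensities = [0.01, 0.02, 0.03, 0.10, 0.12, 0.15]
stages = ["Normal", "Normal", "Normal", "Early", "Early", "Early"]
decrease = gini_decrease(intensities, stages, threshold=0.05)
print(decrease)
```

A full random forest sums such decreases over every split on the feature and averages over trees; that aggregation is omitted here.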



Project:

Project ID:PR000854
Project DOI:10.21228/M8998T
Project Title:Exosomal lipids for classifying early and late stage non-small cell lung cancer
Project Type:Biomarker Discovery
Project Summary:Lung cancer is the leading cause of cancer deaths in the United States. Patients with early stage lung cancer have the best prognosis with surgical removal of the tumor, but the disease is often asymptomatic until advanced disease develops, and there are no effective blood-based screening methods for early detection of lung cancer in at-risk populations. We have explored the lipid profiles of blood plasma exosomes using ultra high-resolution Fourier transform mass spectrometry (UHR-FTMS) for early detection of the prevalent non-small cell lung cancers (NSCLC). Exosomes are nanovehicles released by various cells and tumor tissues to elicit important biofunctions such as immune modulation and tumor development. Plasma exosomal lipid profiles were acquired from 39 normal and 91 NSCLC subjects (44 early stage and 47 late stage). We have applied two multivariate statistical methods, Random Forest (RF) and Least Absolute Shrinkage and Selection Operator (LASSO), to classify the data. For the RF method, the Gini importance of the assigned lipids was calculated to select the 16 lipids with top importance. Using the LASSO method, 7 features were selected based on a grouped LASSO penalty. The Area Under the Receiver Operating Characteristic curve for early and late stage cancer versus normal subjects using the selected lipid features was 0.85 and 0.88 for RF and 0.79 and 0.77 for LASSO, respectively. These results show the value of RF and LASSO for metabolomics data-based biomarker development, which provide robust and independent classifiers with sparse data sets. Application of LASSO and Random Forests identifies lipid features that successfully distinguish early stage lung cancer patients from healthy individuals.
Institute:University of Kentucky
Department:Center for Environmental and Systems Biochemistry
Last Name:Thompson
First Name:Patrick
Address:789 South Limestone, Lexington, Kentucky, 40536, USA
Email:ptth222@uky.edu; rick.higashi@uky.edu
Phone:8592181027
Funding Source:NCI
Publications:https://doi.org/10.1016/j.aca.2018.02.051
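The classifier performance in the project summary is reported as the area under the ROC curve. As a hedged illustration of what that number measures, the sketch below computes AUC as the probability that a randomly chosen cancer sample scores higher than a randomly chosen normal sample. The function name and the scores are hypothetical and are not taken from the study.

```python
from typing import Sequence


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    if not pos_scores or not neg_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores (higher means more cancer-like).
cancer = [0.9, 0.8, 0.6, 0.4]
normal = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(cancer, normal)
print(auc)
```

The pairwise form is quadratic in the number of samples, which is acceptable at the scale of this study; a rank-based formulation gives the same value for larger sets.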

Subject:

Subject ID:SU001337
Subject Type:Human
Subject Species:Homo sapiens
Taxonomy ID:9606

Factors:

Subject type: Human; Subject species: Homo sapiens

mb_sample_id  local_sample_id     Cancer Stage
SA092138      17Dec15_34exo       Early
SA092139      17Dec15_2P58(19)    Early
SA092140      17Dec15_35exo       Early
SA092141      17Dec15_36exo       Early
SA092142      17Dec15_P193        Early
SA092143      17Dec12_P97(48)     Early
SA092144      17Dec12_8P76        Early
SA092145      17Dec11_10P66(25)   Early
SA092146      17Dec08_P145        Early
SA092147      17Dec11_P128        Early
SA092148      17Dec11_UK012       Early
SA092149      17Dec15_P92(46)     Early
SA092150      17Dec12_P83         Early
SA092151      17Dec18_17P64(23)   Early
SA092152      17Dec22_24exo       Early
SA092153      17Dec20_P123        Early
SA092154      17Dec22_5P73(31)    Early
SA092155      17Dec22_P81         Early
SA092156      17Dec22_P89         Early
SA092157      17Dec20_P103        Early
SA092158      17Dec19_P82         Early
SA092159      17Dec18_2P147       Early
SA092160      17Dec08_P134        Early
SA092161      17Dec18_50exo       Early
SA092162      17Dec18_P132(61)    Early
SA092163      17Dec19_21exo       Early
SA092164      17Dec18_10exo       Early
SA092165      17Dec18_UK009       Early
SA092166      17Dec08_P125        Early
SA092167      17Dec15_51exo       Late
SA092168      17Dec15_43exo       Late
SA092169      17Dec18_25exo       Late
SA092170      17Dec18_33exo       Late
SA092171      17Dec18_3exo        Late
SA092172      17Dec15_40exo       Late
SA092173      17Dec18_28exo       Late
SA092174      17Dec15_39exo       Late
SA092175      17Dec12_4exo        Late
SA092176      17Dec12_17exo       Late
SA092177      17Dec11_48exo       Late
SA092178      17Dec12_5exo        Late
SA092179      17Dec15_13exo       Late
SA092180      17Dec15_2exo        Late
SA092181      17Dec15_23exo       Late
SA092182      17Dec18_44exo       Late
SA092183      17Dec18_49exo       Late
SA092184      17Dec22_15exo       Late
SA092185      17Dec22_14exo       Late
SA092186      17Dec22_29exo       Late
SA092187      17Dec22_42exo       Late
SA092188      17Dec22_8exo        Late
SA092189      17Dec22_7exo        Late
SA092190      17Dec22_11exo       Late
SA092191      17Dec20_47exo       Late
SA092192      17Dec19_1exo        Late
SA092193      17Dec19_15P169(28)  Late
SA092194      17Dec19_32exo       Late
SA092195      17Dec19_41exo       Late
SA092196      17Dec20_45exo       Late
SA092197      17Dec20_30exo       Late
SA092198      17Dec18_6exo        Late
SA092199      17Dec11_27exo       Late
SA092200      17Dec08_26exo       Late
SA092201      17Dec11_22exo       Late
SA092202      17Dec08_9exo        Late
SA092203      17Dec08_19exo       Late
SA092204      17Dec11_18exo       Late
SA092205      17Dec08_P31         Late
SA092206      17Dec20_P109N28     Normal
SA092207      17Dec20_P95N19      Normal
SA092208      17Dec19_P37N12      Normal
SA092209      17Dec19_P192N49     Normal
SA092210      17Dec20_UK001N      Normal
SA092211      17Dec18_P28N5       Normal
SA092212      17Dec22_P32N7       Normal
SA092213      17Dec22_P91N17      Normal
SA092214      17Dec08_P90         Normal
SA092215      17Dec22_P35N10      Normal
SA092216      17Dec18_P162N42     Normal
SA092217      17Dec22_P209N50     Normal
SA092218      17Dec22_P135N37     Normal
SA092219      17Dec15_P36N11      Normal
SA092220      17Dec12_P108N27     Normal
SA092221      17Dec12_P94N18      Normal
SA092222      17Dec11_P99N21      Normal
SA092223      17Dec11_P88N15      Normal
SA092224      17Dec08_P98         Normal
SA092225      17Dec15_P104N24     Normal
SA092226      17Dec15_P126N33     Normal
SA092227      17Dec18_P101N22     Normal
SA092228      17Dec18_P127N34     Normal
SA092229      17Dec15_P29N6       Normal
SA092230      17Dec08_P110N29     Normal
SA092231      17Dec15_P164N44     Normal
SA092232      17Dec18_P136N38     Normal
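For readers who want to tally the factor table programmatically, the following is a minimal sketch that splits each row into its three fields and counts samples per cancer stage. It assumes, based only on the listing above, that every mb_sample_id is "SA" followed by six digits and that the stage is one of Early, Late, or Normal; the pattern tolerates rows with or without whitespace between fields. The names `ROW`, `Sample`, and `parse_row` are illustrative.

```python
import re
from collections import Counter
from typing import NamedTuple, Optional

# Assumed row layout: SA + 6 digits, local sample ID, then the stage label.
ROW = re.compile(r"^(SA\d{6})\s*(.+?)\s*(Early|Late|Normal)$")


class Sample(NamedTuple):
    mb_sample_id: str
    local_sample_id: str
    stage: str


def parse_row(line: str) -> Optional[Sample]:
    """Return the parsed sample, or None for headers and other non-data lines."""
    match = ROW.match(line.strip())
    return Sample(*match.groups()) if match else None


rows = [
    "mb_sample_id local_sample_id Cancer Stage",
    "SA09213817Dec15_34exoEarly",
    "SA092193 17Dec19_15P169(28) Late",
    "SA09221017Dec20_UK001NNormal",
]
samples = [s for s in map(parse_row, rows) if s is not None]
counts = Counter(s.stage for s in samples)
print(counts)
```

Applied to the full table, the same loop gives the per-stage sample counts for the 95 deposited samples.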

Collection:

Collection ID:CO001331
Collection Summary:Ten-mL samples of blood were drawn into a purple-top vacutainer containing K2-EDTA (Becton-Dickinson), inverted twice to ensure dissolution of the EDTA, and kept on ice immediately after the blood draw. The whole blood was separated into packed red cells, buffy coat, and plasma within 30 min of collection by centrifuging at 3500 g for 15 min at 4 °C in a swing-out rotor. All blood processing procedures were performed in a class II biosafety cabinet housed in a BSL category 2 laboratory. Plasma (0.7 mL) was aliquoted into 1.5 mL screw-cap vials, flash frozen in liquid N2, and stored at -80 °C until exosomal isolation. These collection and processing procedures were designed to minimize variations in plasma and exosome quality.
Sample Type:Blood (whole)
Storage Conditions:-80℃

Treatment:

Treatment ID:TR001352
Treatment Summary:No Treatment

Sample Preparation:

Sampleprep ID:SP001345
Sampleprep Summary:Exosomes were isolated from plasma by differential ultracentrifugation adapted from Refs. [47,48]. Cleared plasma (0.7 mL; see above) was placed in 5 × 41 mm polyallomer ultraclear ultracentrifuge tubes on ice and centrifuged for 1 h at 70,000 g at 4 °C in a SWTi55 swing-out rotor (Beckman). The supernatant was recentrifuged at 100,000 g for 1 h at 4 °C, and the pellet was drained, resuspended in 0.7 mL cold PBS, and recentrifuged at 100,000 g for 1 h at 4 °C. The washed exosomal pellets were resuspended in 100 µL nanopure water, vortexed for 30 s, and transferred to a fresh microcentrifuge tube. The ultracentrifuge tube was washed with another 100 µL of nanopure water and vortexed for 30 s, and the wash was transferred into the same microcentrifuge tube using the same pipet tip. The combined exosome suspensions were then lyophilized, except for a small portion that was used for characterization by particle size distribution analysis (see below). These nanoparticles are operationally defined as exosomes. The lyophilized EXO preparations were extracted for lipidic metabolites using a solvent partitioning method with CH3CN:H2O:CHCl3 (2:1.5:1, v/v) as described previously [49]. The resulting lipid extracts were vacuum-dried in a vacuum centrifuge (Eppendorf) and redissolved in 200 µL CHCl3:CH3OH (2:1) with 1 mM butylated hydroxytoluene; this solution was further diluted 1:20 in isopropanol/CH3OH/CHCl3 (4:2:1) with 20 mM ammonium formate for UHR-FTMS analysis.
Sampleprep Protocol Comments:A small fraction (<1%) of each exosome preparation was characterized by size distribution analysis using a Nanosight 300 (Malvern Instruments), which provided the distribution of the Stokes' radius (mean 60-66 nm) and the number density of the particles. A typical analysis is shown in Fig. S1. The method eliminates very small particles and provides a strongly peaked, narrow distribution at the expected size for exosomes (40-100 nm; observed mode of 60-65 nm for the main peaks in Figs. S1A and B).

Combined analysis:

Analysis ID:AN002109
Analysis type:MS
Chromatography type:None (Direct infusion)
Chromatography system:Thermo Orbitrap Fusion
Column:none
MS Type:ESI
MS instrument type:Orbitrap
MS instrument name:Thermo Fusion Orbitrap
Ion Mode:POSITIVE
Units:Ion Intensity

Chromatography:

Chromatography ID:CH001539
Instrument Name:Thermo Orbitrap Fusion
Column Name:none
Chromatography Type:None (Direct infusion)

MS:

MS ID:MS001960
Analysis ID:AN002109
Instrument Name:Thermo Fusion Orbitrap
Instrument Type:Orbitrap
MS Type:ESI
MS Comments:High sample throughput (16 min total cycle time per sample, <7 min for the MS1 portion) was achieved using the nanoelectrospray TriVersa NanoMate (Advion Biosciences, Ithaca, NY, USA) with 1.5 kV electrospray voltage and 0.4 psi head pressure. UHR-FTMS data were acquired on an Orbitrap Fusion Tribrid (Thermo Scientific, San Jose, CA, USA) set at a resolving power of 450,000 (at 200 m/z) for MS1 full scans, using 10 microscans per scan in the m/z range of 150-1,600 and achieving sub-ppm mass accuracy through <1200 m/z in positive mode. The AGC (Automatic Gain Control) target was set to 1e5 and the maximal injection time to 100 ms. During the MS1 run, the top 500 most intense monoisotopic precursor ions were isolated via the quadrupole using a 1 m/z isolation window, and HCD (Higher Energy Collisional Dissociation) at 25% collision energy was performed in positive mode for data-dependent MS2 at a resolving power of 120,000 (at 200 m/z) to obtain fragments for acyl chain assignment and neutral loss of specific head groups. The AGC target was set to 5e4 with a maximal injection time of 500 ms. MS2 does not distinguish the sn1 and sn2 acyl positions of glycerolipids, nor the position of unsaturations in acyl chains or acyl branching. A representative full scan MS and an example MS2 spectrum are shown in Fig. S2. The UHR-FTMS raw data were assigned by our (CESB) in-house software PREMISE (PRecalculated Exact Mass Isotopologue Search Engine), which compares UHR-FTMS m/z data against our metabolite m/z library (calculated with mass accuracy to the 5th decimal place) to discern all known lipid molecular formulae (MF) and their 13C isotopologues, including hypothetical lipids, while simultaneously taking into account all of the major adducts (here H+, Na+, K+ and NH4+) [50,51].
An in-house developed natural abundance (NA) correction algorithm [52,53] was applied to simultaneously examine the distribution of naturally occurring 13C isotopologues of the unlabeled lipids to help verify the assigned molecular formulae, and to eliminate non-monoisotopic 13C isotopologues from further analysis. For statistical classification, we used only high accuracy monoisotopic m/z values that mapped to lipid molecular formulae, and multiple adducts of each were tracked throughout to avoid redundancy. Below, such m/z values are referred to as "lipid features", and neither molecular formulae nor lipid names were directly used. The number of assigned lipid features in each sample varied from 1 to 70. After combining all samples into a master file, the data set had a total of 430 such lipid features. Prior to multivariate statistical analyses, MS1 peaks arising from solvent blanks and known contaminants were removed from the lipid feature lists. As absolute intensities vary from sample to sample, the lipid features must be normalized. The intensities of the lipid features in each sample were thus normalized to the summed intensities of all mass peaks that were non-zero in 20%, 50%, 75%, 97%, or 100% of all samples. This is equivalent to estimating the mole fraction of each lipid feature present, and therefore can be used for determining relative changes in composition. We found that normalization using the summed intensities of lipid features that were non-zero in 20% of all samples provided the best statistical outcome according to the ROC analysis.
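The normalization described above can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes "non-zero in 20% of all samples" means non-zero in at least that fraction of samples, and it assumes a simple in-memory table keyed by feature label. The names `normalize` and `min_fraction`, and the example intensities, are hypothetical.

```python
from typing import Dict, List

Table = Dict[str, List[float]]  # feature label -> intensity in each sample


def normalize(table: Table, min_fraction: float = 0.20) -> Table:
    """Divide each sample's intensities by the summed intensity of its reference features."""
    if not table:
        return {}
    n_samples = len(next(iter(table.values())))
    # Reference features: non-zero in at least min_fraction of samples (assumed reading).
    reference = [
        feature for feature, row in table.items()
        if sum(1 for x in row if x > 0) / n_samples >= min_fraction
    ]
    # Per-sample denominator: summed intensity over the reference features.
    totals = [sum(table[f][j] for f in reference) for j in range(n_samples)]
    # Samples with a zero denominator are left at 0.0 rather than dividing by zero.
    return {
        feature: [x / t if t > 0 else 0.0 for x, t in zip(row, totals)]
        for feature, row in table.items()
    }


# Hypothetical intensities for three features across four samples.
raw = {
    "760.5851": [100.0, 0.0, 50.0, 0.0],
    "782.5670": [300.0, 200.0, 0.0, 0.0],
    "496.3398": [0.0, 0.0, 0.0, 10.0],
}
normalized = normalize(raw, min_fraction=0.5)
print(normalized["760.5851"])
```

Changing `min_fraction` to 0.20, 0.50, 0.75, 0.97, or 1.0 reproduces the family of denominators compared in the text.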
Ion Mode:POSITIVE
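For positive-mode data, the MS comments name four adducts (protonated, sodiated, potassiated, and ammoniated) that the assignment software takes into account. The sketch below shows, under stated assumptions, how expected m/z values for those singly charged adducts follow from a neutral monoisotopic mass, and how a ppm error is computed. It is not PREMISE; the adduct mass shifts are standard atomic masses minus one electron mass, rounded to six decimals, and the example formula C42H82NO8P is a hypothetical input.

```python
from typing import Dict

# Mass shift (Da) added to a neutral molecule M by each singly charged adduct.
ADDUCT_SHIFT: Dict[str, float] = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033826,
    "[M+Na]+": 22.989221,
    "[M+K]+": 38.963158,
}


def adduct_mz(neutral_mass: float) -> Dict[str, float]:
    """Expected m/z of each singly charged adduct of a neutral monoisotopic mass."""
    return {name: neutral_mass + shift for name, shift in ADDUCT_SHIFT.items()}


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Hypothetical example: C42H82NO8P, monoisotopic mass 759.577805 Da.
expected = adduct_mz(759.577805)
for name, mz in expected.items():
    print(f"{name}: {mz:.5f}")
```

Matching an observed peak then amounts to checking whether `ppm_error` against any library entry falls inside the instrument's tolerance, which the text gives as sub-ppm below m/z 1200.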