Summary of Study ST001269

This data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR000854. The data can be accessed directly via its Project DOI: 10.21228/M8998T. This work is supported by NIH grant U2C-DK119886.

See: https://www.metabolomicsworkbench.org/about/howtocite.php

This study contains a large results data set, which is not included in the mwTab file. The results are available for download only via FTP as data file(s).

Study ID	ST001269
Study Title	Exosomal lipids for classifying early and late stage non-small cell lung cancer
Study Type	Biomarker Discovery
Study Summary	Lung cancer is the leading cause of cancer deaths in the United States. Patients with early stage lung cancer have the best prognosis with surgical removal of the tumor, but the disease is often asymptomatic until advanced disease develops, and there are no effective blood-based screening methods for early detection of lung cancer in at-risk populations. We have explored the lipid profiles of blood plasma exosomes using ultra high-resolution Fourier transform mass spectrometry (UHR-FTMS) for early detection of the prevalent non-small cell lung cancers (NSCLC). Exosomes are nanovehicles released by various cells and tumor tissues to elicit important biofunctions such as immune modulation and tumor development. Plasma exosomal lipid profiles were acquired from 39 normal and 91 NSCLC subjects (44 early stage and 47 late stage). We have applied two multivariate statistical methods, Random Forest (RF) and Least Absolute Shrinkage and Selection Operator (LASSO), to classify the data. For the RF method, the Gini importance of the assigned lipids was calculated to select 16 lipids with top importance. Using the LASSO method, 7 features were selected based on a grouped LASSO penalty. The Area Under the Receiver Operating Characteristic curve for early and late stage cancer versus normal subjects using the selected lipid features was 0.85 and 0.88 for RF and 0.79 and 0.77 for LASSO, respectively. These results show the value of RF and LASSO for metabolomics data-based biomarker development, which provide robust and independent classifiers with sparse data sets. Application of LASSO and Random Forests identifies lipid features that successfully distinguish early stage lung cancer patients from healthy individuals.
Institute	University of Kentucky
Department	Center for Environmental and Systems Biochemistry
Last Name	Thompson
First Name	Patrick
Address	789 South Limestone, Lexington, Kentucky, 40536, USA
Email	ptth222@uky.edu, rick.higashi@uky.edu
Phone	8592181027
Submit Date	2019-10-17
Total Subjects	95
Publications	https://doi.org/10.1016/j.aca.2018.02.051
Raw Data Available	Yes
Raw Data File Type(s)	raw(Thermo)
Analysis Type Detail	MS(Dir. Inf.)
Release Date	2019-10-11
Release Version	1

Project:

Project ID:	PR000854
Project DOI:	doi: 10.21228/M8998T
Project Title:	Exosomal lipids for classifying early and late stage non-small cell lung cancer
Project Type:	Biomarker Discovery
Project Summary:	Lung cancer is the leading cause of cancer deaths in the United States. Patients with early stage lung cancer have the best prognosis with surgical removal of the tumor, but the disease is often asymptomatic until advanced disease develops, and there are no effective blood-based screening methods for early detection of lung cancer in at-risk populations. We have explored the lipid profiles of blood plasma exosomes using ultra high-resolution Fourier transform mass spectrometry (UHR-FTMS) for early detection of the prevalent non-small cell lung cancers (NSCLC). Exosomes are nanovehicles released by various cells and tumor tissues to elicit important biofunctions such as immune modulation and tumor development. Plasma exosomal lipid profiles were acquired from 39 normal and 91 NSCLC subjects (44 early stage and 47 late stage). We have applied two multivariate statistical methods, Random Forest (RF) and Least Absolute Shrinkage and Selection Operator (LASSO), to classify the data. For the RF method, the Gini importance of the assigned lipids was calculated to select 16 lipids with top importance. Using the LASSO method, 7 features were selected based on a grouped LASSO penalty. The Area Under the Receiver Operating Characteristic curve for early and late stage cancer versus normal subjects using the selected lipid features was 0.85 and 0.88 for RF and 0.79 and 0.77 for LASSO, respectively. These results show the value of RF and LASSO for metabolomics data-based biomarker development, which provide robust and independent classifiers with sparse data sets. Application of LASSO and Random Forests identifies lipid features that successfully distinguish early stage lung cancer patients from healthy individuals.
Institute:	University of Kentucky
Department:	Center for Environmental and Systems Biochemistry
Last Name:	Thompson
First Name:	Patrick
Address:	789 South Limestone, Lexington, Kentucky, 40536, USA
Email:	ptth222@uky.edu; rick.higashi@uky.edu
Phone:	8592181027
Funding Source:	NCI
Publications:	https://doi.org/10.1016/j.aca.2018.02.051
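
The project summary reports classifier performance as the area under the ROC curve (AUC). As a minimal sketch of what that number measures, the Python below computes AUC as the fraction of (cancer, normal) score pairs in which the cancer sample receives the higher score, counting ties as one half. The function name and the scores are hypothetical illustrations, not values or code from this study, and the sketch does not reproduce the RF or LASSO models themselves.

```python
def roc_auc(scores_pos, scores_neg):
    """Return the AUC for two lists of classifier scores.

    AUC equals the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores (NOT data from this study).
cancer_scores = [0.9, 0.8, 0.6, 0.4]
normal_scores = [0.7, 0.3, 0.2, 0.1]

auc = roc_auc(cancer_scores, normal_scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the 0.77 to 0.88 values quoted in the summary.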

Subject:

Subject ID:	SU001337
Subject Type:	Human
Subject Species:	Homo sapiens
Taxonomy ID:	9606

Factors:

Subject type: Human; Subject species: Homo sapiens

mb_sample_id	local_sample_id	Cancer Stage
SA092138	17Dec15_34exo	Early
SA092139	17Dec15_2P58(19)	Early
SA092140	17Dec15_35exo	Early
SA092141	17Dec15_36exo	Early
SA092142	17Dec15_P193	Early
SA092143	17Dec12_P97(48)	Early
SA092144	17Dec12_8P76	Early
SA092145	17Dec11_10P66(25)	Early
SA092146	17Dec08_P145	Early
SA092147	17Dec11_P128	Early
SA092148	17Dec11_UK012	Early
SA092149	17Dec15_P92(46)	Early
SA092150	17Dec12_P83	Early
SA092151	17Dec18_17P64(23)	Early
SA092152	17Dec22_24exo	Early
SA092153	17Dec20_P123	Early
SA092154	17Dec22_5P73(31)	Early
SA092155	17Dec22_P81	Early
SA092156	17Dec22_P89	Early
SA092157	17Dec20_P103	Early
SA092158	17Dec19_P82	Early
SA092159	17Dec18_2P147	Early
SA092160	17Dec08_P134	Early
SA092161	17Dec18_50exo	Early
SA092162	17Dec18_P132(61)	Early
SA092163	17Dec19_21exo	Early
SA092164	17Dec18_10exo	Early
SA092165	17Dec18_UK009	Early
SA092166	17Dec08_P125	Early
SA092167	17Dec15_51exo	Late
SA092168	17Dec15_43exo	Late
SA092169	17Dec18_25exo	Late
SA092170	17Dec18_33exo	Late
SA092171	17Dec18_3exo	Late
SA092172	17Dec15_40exo	Late
SA092173	17Dec18_28exo	Late
SA092174	17Dec15_39exo	Late
SA092175	17Dec12_4exo	Late
SA092176	17Dec12_17exo	Late
SA092177	17Dec11_48exo	Late
SA092178	17Dec12_5exo	Late
SA092179	17Dec15_13exo	Late
SA092180	17Dec15_2exo	Late
SA092181	17Dec15_23exo	Late
SA092182	17Dec18_44exo	Late
SA092183	17Dec18_49exo	Late
SA092184	17Dec22_15exo	Late
SA092185	17Dec22_14exo	Late
SA092186	17Dec22_29exo	Late
SA092187	17Dec22_42exo	Late
SA092188	17Dec22_8exo	Late
SA092189	17Dec22_7exo	Late
SA092190	17Dec22_11exo	Late
SA092191	17Dec20_47exo	Late
SA092192	17Dec19_1exo	Late
SA092193	17Dec19_15P169(28)	Late
SA092194	17Dec19_32exo	Late
SA092195	17Dec19_41exo	Late
SA092196	17Dec20_45exo	Late
SA092197	17Dec20_30exo	Late
SA092198	17Dec18_6exo	Late
SA092199	17Dec11_27exo	Late
SA092200	17Dec08_26exo	Late
SA092201	17Dec11_22exo	Late
SA092202	17Dec08_9exo	Late
SA092203	17Dec08_19exo	Late
SA092204	17Dec11_18exo	Late
SA092205	17Dec08_P31	Late
SA092206	17Dec20_P109N28	Normal
SA092207	17Dec20_P95N19	Normal
SA092208	17Dec19_P37N12	Normal
SA092209	17Dec19_P192N49	Normal
SA092210	17Dec20_UK001N	Normal
SA092211	17Dec18_P28N5	Normal
SA092212	17Dec22_P32N7	Normal
SA092213	17Dec22_P91N17	Normal
SA092214	17Dec08_P90	Normal
SA092215	17Dec22_P35N10	Normal
SA092216	17Dec18_P162N42	Normal
SA092217	17Dec22_P209N50	Normal
SA092218	17Dec22_P135N37	Normal
SA092219	17Dec15_P36N11	Normal
SA092220	17Dec12_P108N27	Normal
SA092221	17Dec12_P94N18	Normal
SA092222	17Dec11_P99N21	Normal
SA092223	17Dec11_P88N15	Normal
SA092224	17Dec08_P98	Normal
SA092225	17Dec15_P104N24	Normal
SA092226	17Dec15_P126N33	Normal
SA092227	17Dec18_P101N22	Normal
SA092228	17Dec18_P127N34	Normal
SA092229	17Dec15_P29N6	Normal
SA092230	17Dec08_P110N29	Normal
SA092231	17Dec15_P164N44	Normal
SA092232	17Dec18_P136N38	Normal

Showing results 1 to 95 of 95
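
The factor table above is tab-separated, so the group sizes can be tallied directly. The following is a minimal sketch, assuming the table has been saved as text with its header row intact; only a handful of rows are copied here for illustration, and the function name is hypothetical.

```python
from collections import Counter

# A few rows copied from the factor table (tab-separated); the full
# table would be read from a file in the same layout.
FACTOR_ROWS = """mb_sample_id\tlocal_sample_id\tCancer Stage
SA092138\t17Dec15_34exo\tEarly
SA092139\t17Dec15_2P58(19)\tEarly
SA092167\t17Dec15_51exo\tLate
SA092206\t17Dec20_P109N28\tNormal
SA092207\t17Dec20_P95N19\tNormal
"""


def count_by_stage(table_text):
    """Count samples per value of the 'Cancer Stage' column."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    stage_col = header.index("Cancer Stage")
    return Counter(ln.split("\t")[stage_col] for ln in lines[1:])


stage_counts = count_by_stage(FACTOR_ROWS)
print(dict(stage_counts))
```

Applied to all 95 rows listed above, the same tally gives 29 Early, 39 Late, and 27 Normal samples, which is fewer than the 39 normal and 91 NSCLC subjects cited in the study summary.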


Collection:

Collection ID:	CO001331
Collection Summary:	Ten mL samples of blood were drawn into a purple top vacutainer containing K2-EDTA (Becton-Dickinson), inverted twice to ensure dissolution of the EDTA, and kept on ice immediately after blood draw. The whole blood was separated into packed red cells, buffy coat, and plasma within 30 min of collection by centrifuging at 3500 g for 15 min at 4 °C in a swing-out rotor. All blood processing procedures were performed in a class II biosafety cabinet housed in a BSL category 2 laboratory. Plasma (0.7 mL) was aliquoted into 1.5 mL screw cap vials, flash frozen in liquid N2, and stored at -80 °C until exosomal isolation. These collection and processing procedures were designed to minimize variations in plasma and exosome quality.
Sample Type:	Blood (whole)
Storage Conditions:	-80℃

Treatment:

Treatment ID:	TR001352
Treatment Summary:	No Treatment

Sample Preparation:

Sampleprep ID:	SP001345
Sampleprep Summary:	Exosomes were isolated from plasma by differential ultracentrifugation adapted from Refs. [47,48]. 0.7 mL cleared plasma (see above) were placed in 5 × 41 mm polyallomer ultraclear ultracentrifuge tubes on ice, and centrifuged for 1 h at 70,000 g at 4 °C in a SWTi55 swing-out rotor (Beckman). The supernatant was recentrifuged at 100,000 g for 1 h at 4 °C, and the pellet was drained and resuspended in 0.7 mL cold PBS, and recentrifuged at 100,000 g for 1 h at 4 °C. The washed exosomal pellets were resuspended in 100 µL nanopure water, vortexed for 30 s and transferred to a fresh microcentrifuge tube. The ultracentrifuge tube was washed with another 100 µL of nanopure water, vortexed for 30 s and the wash was transferred into the same microcentrifuge tube, using the same pipet tip. The combined exosome suspensions were then lyophilized except for a small portion that was used for characterization by particle size distribution analysis (see below). These nanoparticles are operationally defined as exosomes. The lyophilized EXO preparations were extracted for lipidic metabolites using a solvent partitioning method with CH3CN:H2O:CHCl3 (2:1.5:1, v/v) as described previously [49]. The resulting lipid extracts were vacuum-dried in a vacuum centrifuge (Eppendorf), redissolved in 200 µL CHCl3:CH3OH (2:1) with 1 mM butylated hydroxytoluene, which was further diluted 1:20 in isopropanol/CH3OH/CHCl3 (4:2:1) with 20 mM ammonium formate for UHR-FTMS analysis.
Sampleprep Protocol Comments:	A small fraction (<1%) of each exosome preparation was characterized by size distribution analysis using a Nanosight 300 (Malvern Instruments), which provided the distribution of the Stokes' radius (mean 60–66 nm) and the number density of the particles. A typical analysis is shown in Fig. S1. The method eliminates very small particles, and provides a strongly peaked, narrow distribution at the expected size for exosomes (40–100 nm, observed mode of 60–65 nm for the main peaks in Figs. S1A and B).

Combined analysis:

Analysis ID	AN002109
Analysis type	MS
Chromatography type	None (Direct infusion)
Chromatography system	Thermo Orbitrap Fusion
Column	none
MS Type	ESI
MS instrument type	Orbitrap
MS instrument name	Thermo Fusion Orbitrap
Ion Mode	POSITIVE
Units	Ion Intensity

Chromatography:

Chromatography ID:	CH001539
Instrument Name:	Thermo Orbitrap Fusion
Column Name:	none
Chromatography Type:	None (Direct infusion)

MS:

MS ID:	MS001960
Analysis ID:	AN002109
Instrument Name:	Thermo Fusion Orbitrap
Instrument Type:	Orbitrap
MS Type:	ESI
MS Comments:	High sample throughput (16 min total cycle time per sample, <7 min for MS1 portion) was achieved using the nanoelectrospray TriVersa NanoMate (Advion Biosciences, Ithaca, NY, USA) with 1.5 kV electrospray voltage and 0.4 psi head pressure. UHR-FTMS data were acquired from an Orbitrap Fusion Tribrid (Thermo Scientific, San Jose, CA, USA) set at a resolving power of 450,000 (at 200 m/z) for MS1 full scans using 10 microscans per scan in the m/z range of 150–1,600, achieving sub-ppm mass accuracy through <1200 m/z in positive mode. AGC (Automatic Gain Control) target was set to 1e5 and maximal injection time was set to 100 ms. During the MS1 run, the top 500 most intense monoisotopic precursor ions were isolated via quadrupole using a 1 m/z isolation window, and HCD (Higher Energy Collisional Dissociation) set at 25% collision energy was performed in positive mode for data-dependent MS2 at a resolving power of 120,000 (at 200 m/z) to obtain fragments for acyl chain assignment and neutral loss of specific head groups. The AGC target was set to 5e4 with maximal injection time of 500 ms. MS2 does not distinguish the sn1 and sn2 acyl positions of glycerolipids, nor the position of unsaturations in acyl chains and acyl branching. Representative full scan MS along with an example MS2 spectrum are shown in Fig. S2. The UHR-FTMS raw data were assigned by our (CESB) in-house software PREMISE (PRecalculated Exact Mass Isotopologue Search Engine) that compares UHR-FTMS m/z data against our metabolite m/z library (calculated with mass accuracy to the 5th decimal point) to discern all known lipid MF and their 13C isotopologues, including hypothetical lipids, while simultaneously taking into account all of the major adducts (here H+, Na+, K+ and NH4+) [50,51].
An in-house developed natural abundance (NA) correction algorithm [52,53] was applied to simultaneously examine the distribution of naturally occurring 13C isotopologues of the unlabeled lipids to help verify the assigned molecular formulae, and to eliminate non-monoisotopic 13C isotopologues from further analysis. For statistical classification, we used only high accuracy monoisotopic m/z values that mapped to lipid molecular formulae, and multiple adducts of each were tracked throughout to avoid redundancy. Below, such m/z values are referred to as “lipid features”, and neither molecular formulae nor lipid names were directly used. The number of assigned lipid features in each sample varied from 1 to 70. After combining all samples into a master file, the data set had a total of 430 such lipid features. Prior to multivariate statistical analyses, MS1 peaks arising from solvent blanks and known contaminants were removed from the lipid feature lists. As absolute intensities vary from sample to sample, the lipid features must be normalized. The intensities of the lipid features in each sample were thus normalized to the summed intensities of all mass peaks that were non-zero in 20%, 50%, 75%, 97%, 100% of all samples. This is equivalent to estimating the mole fraction of each lipid feature present, and therefore can be used for determining relative changes in composition. We found that normalization using the summed intensities of lipid features that were non-zero in 20% of all samples provided the best statistical outcome according to the ROC analysis.
Ion Mode:	POSITIVE
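
The normalization described in the MS comments can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes that "non-zero in 20% of all samples" means non-zero in at least that fraction of samples, it treats a missing feature as zero intensity, and the function name, data layout, and intensities are hypothetical.

```python
def normalize_samples(samples, min_presence=0.20):
    """Scale each sample's intensities by a per-sample reference sum.

    samples: dict of sample id -> dict of feature label -> intensity.
    The reference sum for a sample is the total intensity of the features
    that are non-zero in at least `min_presence` of all samples
    (assumed reading of the threshold described in the text).
    """
    n_samples = len(samples)
    features = {f for feats in samples.values() for f in feats}

    # Fraction of samples in which each feature has non-zero intensity.
    presence = {
        f: sum(1 for feats in samples.values() if feats.get(f, 0.0) > 0) / n_samples
        for f in features
    }
    reference = {f for f, frac in presence.items() if frac >= min_presence}

    normalized = {}
    for sample_id, feats in samples.items():
        total = sum(feats.get(f, 0.0) for f in reference)
        if total == 0:
            # No reference signal in this sample; leave it as all zeros.
            normalized[sample_id] = {f: 0.0 for f in feats}
        else:
            normalized[sample_id] = {f: v / total for f, v in feats.items()}
    return normalized


# Hypothetical intensities for three features in four samples.
toy = {
    "s1": {"A": 60.0, "B": 40.0, "C": 10.0},
    "s2": {"A": 50.0, "B": 50.0},
    "s3": {"A": 30.0},
    "s4": {"A": 10.0},
}

normalized = normalize_samples(toy, min_presence=0.5)
print(normalized["s1"])
```

With a 50% threshold in this toy case, feature C (present in one of four samples) is excluded from the reference sum but is still scaled by it, so the normalized values behave like approximate mole fractions relative to the commonly observed features.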