Purpose: No validated biomarkers have yet been established that can identify the subset of patients with lung adenocarcinoma who might benefit from chemotherapy. This study aimed to explore a potential biomarker model, based on metabolomics profiling, that is predictive of efficacy and survival outcomes after first-line pemetrexed plus platinum doublet chemotherapy.

Experimental Design: In total, 354 consecutive eligible patients were assigned to receive first-line chemotherapy of pemetrexed in combination with either cisplatin or carboplatin. Serum samples collected prospectively before initial treatment were used for metabolomics profiling by LC/MS-MS. Binary logistic regression analysis was carried out to establish discrimination models.

Results: A total of 251 cases were randomly assigned to the discovery set and the remaining 103 cases to the validation set. Seven metabolites, namely hypotaurine, uridine, dodecanoylcarnitine, choline, dimethylglycine, niacinamide, and l-palmitoylcarnitine, were identified as associated with chemotherapy response. On the basis of the seven-metabolite panel, a discriminant model according to logistic regression values g(z) was established, with an area under the receiver operating characteristic curve (AUC) of 0.912 (discovery set) and 0.909 (validation set) in differentiating progressive disease (PD) groups from disease control (DC) groups. The median progression-free survival (PFS) after chemotherapy in patients with g(z) ≤ 0.155 was significantly longer than that in those with g(z) > 0.155 (10.3 vs. 4.5 months, P < 0.001).

Conclusions: This study developed an effective and convenient discriminant model that can accurately predict the efficacy and survival outcomes of pemetrexed plus platinum doublet chemotherapy prior to treatment delivery. Clin Cancer Res; 24(9); 2100–9. ©2018 AACR.

This article is featured in Highlights of This Issue, p. 2027

Translational Relevance

In clinical practice, there are currently no validated biomarkers that can, prior to treatment, reliably indicate the intrinsically sensitive or resistant features of a nonsquamous NSCLC patient to pemetrexed plus platinum doublet chemotherapy. In this study, we investigated the metabolic characteristics of a large cohort of prechemotherapeutic serum samples (354 cases) and found tight associations between small metabolite subsets and the responses of patients to this cytotoxic drug combination. We developed an effective discriminant model employing a seven-metabolite panel (hypotaurine, uridine, dodecanoylcarnitine, choline, dimethylglycine, niacinamide, l-palmitoylcarnitine) that can predict the efficacy of this chemotherapeutic regimen, prior to treatment, with a sensitivity of 90.8% and specificity of 79.5%. We also identified three one-carbon metabolism-involved metabolites including choline, betaine, and DMG that are potentially associated with drug resistance to pemetrexed. This serum-based biomarker study can be easily applied in clinical practice and personalized treatment decisions.

Lung cancer is one of the most common fatal malignancies worldwide, causing over one million deaths annually (1). Pemetrexed plus platinum (cisplatin or carboplatin) doublet chemotherapy has become a standard first-line treatment for nonsquamous non–small cell lung cancer (NSCLC) patients who are not eligible for targeted therapies [e.g., tyrosine kinase inhibitors (TKIs) or immunotherapy (2, 3)]. However, the efficacy of this preferred chemotherapeutic regimen is very limited, with response rates of 30%–40% and progression-free survival (PFS) of about 4–6 months (3–5). Although a large number of promising studies have demonstrated that the expression levels of enzymes targeted by pemetrexed (6, 7) and of enzymes involved in nucleotide excision repair (NER) of DNA–platinum adducts (8) can be used to predict response to pemetrexed plus cisplatin treatment, these findings have not yet been translated from bench to bedside. The detection assays, tissue types, and antibodies used in previous studies, which led to inconsistent results, need to be standardized across clinical trials (9). In addition, large clinical trials are needed to further validate the predictive abilities of these biomarkers before they can be reliably used to guide treatment protocols (10). Unlike targeted therapies, which have specific gene aberrations or activated kinases as predictive biomarkers, such as mutations in EGFR or translocations of ALK (anaplastic lymphoma kinase; refs. 11, 12), cytotoxic drug-based chemotherapies do not have specific molecular targets. Beyond the cancer cell itself, pharmacokinetic and tumor microenvironmental issues are also known to contribute to the clinical failure of chemotherapy and apparent drug resistance (13). Thus, potential biomarkers that can address these issues relating to drug resistance and can predict the response of a patient to pemetrexed plus platinum doublet therapy are needed.

Metabolomics is an -omics technology that provides information that complements the genomic and proteomic profile of a sample (14). Metabolites are stable in serum and can be quantified, which presents an opportunity to establish a noninvasive approach for monitoring disease status and exploring biomarkers to predict the efficacy of anticancer therapies (15). Mapstone and colleagues (2014) discovered and validated a panel of ten lipid metabolites from plasma that predicts Alzheimer's disease within a 2- to 3-year timeframe with over 90% accuracy (16). Sreekumar and colleagues (2009) revealed the potential role of sarcosine in prostate cancer progression by profiling metabolites from a large cohort of clinical specimens (17).

We hypothesize that the capacity to respond to pemetrexed plus platinum doublet chemotherapy is a congenital feature that represents a specific metabolic phenotype, and can therefore be extrapolated from analysis of the metabolic profiles of prechemotherapeutic serum. To explore this hypothesis, we used LC/MS-MS to perform metabolomic profiling analysis (18) and identified potential metabolic biomarkers that can, prior to treatment, predict the efficacy of pemetrexed-platinum doublet therapy in patients with nonsquamous NSCLC.

Study design and patient population

This study was designed to explore serum metabolic biomarkers predictive of the efficacy of chemotherapy. Patients included in our study were assigned to receive first-line chemotherapy of pemetrexed (500 mg/m²) in combination with either cisplatin (75 mg/m²) or carboplatin (AUC = 5 mg/mL·min) intravenously on day 1 every 21 days for 4–6 cycles or until disease progression (3). The serum samples were prospectively collected before first-cycle treatment. Patients were randomly assigned to the discovery group (251 cases, 70.9%) or the validation group (the remaining 103 cases, 29.1%). This study was approved by the medical ethics committee of Peking University Cancer Hospital. All patients signed written informed consent to participate.

From September 2014 to December 2016, 389 patients consecutively diagnosed with inoperable stage IIIB or IV lung adenocarcinoma who received pemetrexed plus platinum (cisplatin or carboplatin) doublet as the first-line regimen were recruited into this study. Patients with symptoms associated with bacterial infection, such as fever, increased leucocytes and neutrophils, or inflammation indicated by lung imaging or microculture, were not included in our study to avoid any influence of bacterial infection on the serum metabolome. Finally, 354 serum samples were successfully used for metabolite profiling analysis. Reasons for omitting patients from analysis were unavailable pretreatment serum (24 cases, 6.2%), failure to complete two cycles of chemotherapy due to overt cytotoxicity (14 cases, 3.6%), or a switch to molecular targeted therapy due to EGFR mutation or ALK rearrangement (7 cases, 1.8%). No further histologic subtyping, such as the IASLC/ATS/ERS classification, was performed because of its inconsistent value for predicting chemotherapy response (19). The tumor responses to chemotherapy were evaluated per Response Evaluation Criteria in Solid Tumors, version 1.1 (RECIST v1.1).

Serum sample collection

Fasting peripheral blood (3.5 mL) was collected with a serum separator tube in the morning, one week before the first cycle of chemotherapy treatment. The blood was centrifuged within 30 minutes of venipuncture at 1,200 × g for 10 minutes at 4°C, and serum was collected and stored at −80°C.

Metabolite extraction

Serum samples were thawed on ice and vortexed thoroughly. To extract hydrophilic metabolites, 100 μL of homogeneous serum sample was mixed with 400 μL of methanol (prechilled to −80°C). The mixture was vortexed for 30 seconds and incubated for 6–8 hours at −80°C. After centrifugation at 12,000 × g at 4°C for 10 minutes, the supernatant (300 μL) was transferred to a fresh tube and lyophilized under vacuum. The dried samples were reconstituted in 80 μL of 80% methanol, vortexed for 30 seconds, and incubated at 4°C for 15 minutes. The samples were then centrifuged at 12,000 × g at 4°C for 20 minutes. Finally, 20 μL of supernatant was used for LC/MS-MS analysis.

For hydrophobic metabolite extraction, 400 μL of extraction buffer (chloroform/methanol, 2:1, v/v) was added to 100 μL of serum sample. The mixture was vortexed for 30 seconds and centrifuged at 10,000 × g for 10 minutes at room temperature. Subsequently, the lower organic-phase (200 μL) was collected into a fresh tube and lyophilized under vacuum. The dried samples were dissolved in 150 μL of dichloromethane/methanol (2:1, v/v), vortexed for 30 seconds, and then centrifuged at 12,000 × g for 15 minutes at room temperature. Finally, 20 μL of supernatant was used for LC/MS-MS analysis.

Metabolite profiling analysis

For untargeted metabolite screening, a Q Exactive orbitrap mass spectrometer (Thermo) was used. In positive mode, 95% and 50% acetonitrile was applied as mobile phase A and B, with 10 mmol/L ammonium formate and 0.1% formic acid in both phases. Atlantis HILIC Silica columns were used for separation with a column temperature of 35°C. Separation was initiated at 1% mobile phase B with a flow rate of 300 μL/minute. The gradient is shown in Supplementary Fig. S8A. In negative mode, mobile phase A and B were 95% and 50% acetonitrile with 10 mmol/L ammonium formate. However, the pH of A and B was adjusted to 9.0 using ammonium hydroxide. BEH Amide columns (2.1 mm × 100 mm, Waters) were utilized for LC separation in negative mode. The column temperature was 35°C. The gradient was started at 5% mobile phase B with a flow rate of 250 μL/minute. The gradient is shown in Supplementary Fig. S8B. The detailed mass spectrometer parameters are listed in Supplementary Fig. S8C.

Lipid analysis was performed on a Q Exactive orbitrap mass spectrometer (Thermo). Mobile phase A was prepared by dissolving 10 mmol/L ammonium acetate in 60% acetonitrile and mobile phase B was 10%/90% acetonitrile/isopropanol (v/v). XSelect CSH C18 columns (2.1 mm × 100 mm, Waters) were used for lipid analysis with column temperature of 45°C. The gradient was generated with flow rate of 250 μL/minute as shown in Supplementary Fig. S8D. The detailed mass spectrometer parameters are as shown in Supplementary Fig. S8E.

All mobile phases were freshly prepared to avoid bacterial contamination. For data quality assessment, pooled quality control (QC) samples were generated by mixing equal aliquots of representative subsets of subjects. Five injections of QC samples were analyzed at the beginning of the sample queue for column conditioning followed by one QC sample inserted for every ten samples. The sequence order of all the samples was randomized to avoid the interference from system instability.
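The injection scheme described above (conditioning QC injections, a randomized sample order, and a QC injection after every ten samples) can be sketched in a few lines of Python. This is only an illustration: the function name `build_run_queue`, the `"QC"` label, the fixed random seed, and the choice to place each QC after the tenth sample of a block are assumptions, not details reported in the study.

```python
import random


def build_run_queue(sample_ids, n_conditioning=5, qc_every=10, seed=0):
    """Return an injection order with pooled-QC injections interleaved.

    Assumed layout (hypothetical): n_conditioning QC injections first,
    then the samples in random order with one QC after every qc_every samples.
    """
    order = list(sample_ids)
    random.Random(seed).shuffle(order)  # randomize the sample sequence

    queue = ["QC"] * n_conditioning  # column-conditioning injections
    for position, sample in enumerate(order, start=1):
        queue.append(sample)
        if position % qc_every == 0:
            queue.append("QC")  # periodic QC to track instrument drift
    return queue


queue = build_run_queue([f"S{i:03d}" for i in range(1, 26)])
print(queue[:8])
```

Fixing the seed makes the randomized order reproducible, which helps when the acquisition queue has to be rebuilt or audited later.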

Data processing and statistical analysis

Peak extraction and alignment were performed using SIEVE software (Thermo). Features that were present in at least 80% of the samples of a group were retained (20). To acquire more reliable peaks, features were further verified with TraceFinder (Thermo) based on the m/z value and retention time. The relative ion intensity was normalized to the sum of the peak areas of a sample. Statistical analysis was performed using SPSS 23.0 software. Statistical significance was calculated using the Student t test as implemented. Survival was analyzed with Kaplan–Meier curves and compared with log-rank tests. All statistical tests were two-sided and considered statistically significant at P < 0.05, unless otherwise stated. Statistically significant features from both ESI+ and ESI− modes were imported into the SIMCA-P program (version 12.0, Umetrics) for multivariate analysis. Partial least squares discriminant analysis (PLS-DA) was applied with unit variance (UV) scaling (21). The normalized amount of each metabolite was graphed in a boxplot using GraphPad Prism version 5.0 (GraphPad Software). ROC curve analysis and binary logistic regression were applied to the serum data. Logistic regression with the best subset selection method was used for variable selection, in which the highest likelihood score indicated the best model for a given number of variables. Five-fold cross validation was used to validate models with different numbers of variables, and the model with the minimum mean squared error (MSE) was defined as the best one. The performance of the discriminant model was characterized by estimating the area under the ROC curve (AUC; ref. 22). The Youden index (J = sensitivity + specificity − 1), a commonly used measure of overall diagnostic effectiveness (23), was used in conjunction with ROC analysis.
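As a rough illustration of the first two preprocessing steps (the 80% presence rule and normalization to the summed peak area of each sample), a minimal Python sketch follows. The study performed these steps in vendor software, so the data layout (a dictionary mapping each feature to a list of per-sample intensities), the function names, and the reading of "present" as intensity greater than zero in at least one group are all assumptions.

```python
def filter_by_presence(table, groups, min_fraction=0.8):
    """Keep features detected in >= min_fraction of samples of at least one group.

    table:  {feature_name: [intensity per sample]}  (hypothetical layout)
    groups: [group label per sample], same order as the intensity lists
    """
    kept = {}
    for feature, values in table.items():
        for label in set(groups):
            in_group = [v for v, g in zip(values, groups) if g == label]
            detected = sum(1 for v in in_group if v > 0)
            if in_group and detected / len(in_group) >= min_fraction:
                kept[feature] = values
                break
    return kept


def normalize_to_total(table):
    """Divide each intensity by the summed intensity of its sample."""
    n_samples = len(next(iter(table.values())))
    totals = [sum(values[i] for values in table.values()) for i in range(n_samples)]
    return {
        feature: [v / t if t else 0.0 for v, t in zip(values, totals)]
        for feature, values in table.items()
    }


groups = ["DC"] * 5 + ["PD"] * 5
raw = {
    "feature_x": [9, 8, 7, 9, 0, 0, 0, 0, 0, 0],  # present in 4/5 DC samples
    "feature_y": [5, 4, 6, 0, 0, 3, 0, 0, 0, 0],  # present in 3/5 DC, 1/5 PD
}
print(sorted(filter_by_presence(raw, groups)))
```

Total-area normalization makes intensities comparable across injections, at the cost of coupling every feature to the sum of all others in that sample.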

Patient sets and baseline characteristics

In total, 354 serum samples were collected from patients with histologically confirmed lung adenocarcinoma in this study, including 251 cases as the discovery set for predictive model establishment and 103 cases as the validation set (Fig. 1). All the patients received pemetrexed plus platinum doublet [pemetrexed/cisplatin, (pem-cis), n = 224 and pemetrexed/carboplatin, (pem-carbo), n = 130] as a first-line therapy. Most patients achieved disease control, including those with a confirmed partial response (PR) or stable disease (SD; pem-cis, 183/224, 82% and pem-carbo, 95/130, 73%). The disease control rates, stratified by chemotherapeutic regimen (pem-cis vs. pem-carbo) are presented by patient set in Table 1. The clinical characteristics of age, gender, smoking status, stage of disease, and Eastern Cooperative Oncology Group (ECOG) performance status were comparable between the discovery and validation set.
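The disease control rates quoted above follow directly from the RECIST counts in Table 1. The short Python check below pools the two patient sets per regimen; the dictionary layout and function name are illustrative choices, while the counts are transcribed from the table.

```python
# RECIST counts transcribed from Table 1: (set, regimen) -> response counts.
RECIST_COUNTS = {
    ("discovery", "pem-cis"): {"PR": 51, "SD": 81, "PD": 29},
    ("discovery", "pem-carbo"): {"PR": 31, "SD": 35, "PD": 24},
    ("validation", "pem-cis"): {"PR": 22, "SD": 29, "PD": 12},
    ("validation", "pem-carbo"): {"PR": 14, "SD": 15, "PD": 11},
}


def disease_control(regimen):
    """Return (patients with PR or SD, all patients) pooled over both sets."""
    controlled = total = 0
    for (_, reg), counts in RECIST_COUNTS.items():
        if reg == regimen:
            controlled += counts["PR"] + counts["SD"]
            total += sum(counts.values())
    return controlled, total


for regimen in ("pem-cis", "pem-carbo"):
    dc, n = disease_control(regimen)
    print(f"{regimen}: {dc}/{n} = {100 * dc / n:.0f}%")
# pem-cis: 183/224 = 82%
# pem-carbo: 95/130 = 73%
```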

Figure 1.

Study profile. Among the 389 patients who were eligible for the study, 45 were excluded: 24 patients had no pretreatment serum available, and 21 patients did not complete two cycles of treatment, either because of overt side effects (n = 14) or because an EGFR or ALK gene aberration was detected during chemotherapy and treatment was switched to a tyrosine kinase inhibitor (TKI; n = 7). Finally, 354 serum samples were randomly assigned to the discovery group (251 cases, 70.9%) or the validation group (the remaining 103 cases, 29.1%). Pem-Cis, pemetrexed-cisplatin; pem-Carbo, pemetrexed-carboplatin; PR, partial response; SD, stable disease; PD, progressive disease. ECOG denotes Eastern Cooperative Oncology Group.

Table 1.

Patient sets and baseline characteristics

Characteristics of the patients (N = 354)a

Treatment regimen and RECIST resultb
Patient set | Regimen | PR | SD | PD
Discovery set (n = 251) | Pem-cis (n = 161, 64.1%) | 51 | 81 | 29
Discovery set (n = 251) | Pem-carbo (n = 90, 35.9%) | 31 | 35 | 24
Validation set (n = 103) | Pem-cis (n = 63, 61.2%) | 22 | 29 | 12
Validation set (n = 103) | Pem-carbo (n = 40, 38.8%) | 14 | 15 | 11

Baseline characteristics
Characteristic | Discovery set (n = 251) | Validation set (n = 103)
Age, y (P = 0.134) | |
 Median (range) | 59 (44–84) | 61 (37–83)
Sex, n (%) (P = 0.114) | |
 Male | 148 (60.0%) | 70 (68.0%)
 Female | 103 (40.0%) | 33 (32.0%)
Smoking status, n (%) (P = 0.164) | |
 Former smoker | 114 (45.4%) | 45 (43.7%)
 Never smoker | 137 (54.6%) | 58 (56.3%)
Disease stage, n (%) (P = 0.300) | |
 IIIB | 31 (12.4%) | 17 (16.5%)
 IV | 220 (87.6%) | 86 (83.5%)
ECOG performance status, n (%)c (P = 0.122) | |
 0 | 90 (35.9%) | 46 (44.6%)
 1 | 161 (64.1%) | 57 (55.4%)

aMedian age was compared between the two sets with the Wilcoxon two-sample test (P = 0.134). Proportions were compared with the χ2 test. There were no significant differences between the discovery set and validation set in sex (P = 0.114), smoking status (P = 0.164), disease stage (P = 0.300), or ECOG performance status (P = 0.122).

bPem-cis, pemetrexed-cisplatin; Pem-carbo, pemetrexed-carboplatin; PR, partial response; SD, stable disease; PD, progressive disease.

cThe Eastern Cooperative Oncology Group (ECOG) performance-status scores range from 0 to 5, with higher scores indicating increasing disability. A score of 0 indicates no symptoms and 1 mild symptoms.

Metabolomic profiling of pretreatment serum specimens

The 251 pretreatment serum specimens of the discovery set were subjected to nontargeted metabolomics analysis using LC/MS-MS. Pooled quality control (QC) samples, which clustered tightly in principal component analysis (PCA; Supplementary Fig. S1; ref. 24), verified the stability and repeatability of the sample analysis sequence. After peak alignment and removal of missing values, 1,373 electrospray ionization positive-mode (ESI+) features and 1,014 negative-mode (ESI−) features were obtained. Further evaluation of these metabolomic profiles revealed a total of 379 of 2,387 features (229 ESI+ and 150 ESI−) that differed significantly between the disease control (DC) group (including the PR and SD groups) and the progressive disease (PD) group (Wilcoxon P < 0.05). Finally, 90 features were selected because their accumulation trends (either up or down) were consistent with the clinical outcome of this chemotherapy from PR to SD to PD, where PR is the best response to the therapy and PD is the worst outcome. Most of these selected features (85/90) were from the hydrophilic sample extracts, whereas the 5 from the hydrophobic extracts were not considered further in this study. The DC (PR and SD) and PD groups exhibited different accumulation patterns of these 85 hydrophilic features (62 ESI+ and 23 ESI−; Fig. 2A), which were therefore used for the subsequent identification of potential metabolite biomarkers.
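The per-feature DC versus PD comparison can be illustrated with a plain-Python Wilcoxon rank-sum test. This is a simplified sketch rather than the procedure used in the study (which relied on statistical software): it uses the normal approximation without continuity or tie-variance corrections, and the function name `rank_sum_p` is hypothetical.

```python
import math


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation).

    Simplifications (assumptions of this sketch): tied values get average
    ranks, but no continuity correction or tie correction is applied.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one observation")

    combined = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * (n1 + n2)
    pos = 0
    while pos < len(combined):
        end = pos
        while end + 1 < len(combined) and combined[end + 1][0] == combined[pos][0]:
            end += 1
        average_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in range(pos, end + 1):
            ranks[combined[k][1]] = average_rank
        pos = end + 1

    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2  # Mann-Whitney U for group x
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


# Toy intensities for one feature (made-up numbers, not study data).
dc_group = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0, 0.8, 1.1]
pd_group = [1.9, 2.1, 1.7, 2.4, 2.0, 1.8, 2.2, 2.3]
print(rank_sum_p(dc_group, pd_group) < 0.05)
```

A feature would pass the screening step described above when its p-value falls below 0.05; with thousands of features tested, some of the retained features are expected to be false positives unless a multiple-testing correction is added.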

Figure 2.

Examination of the metabolic profiles of pretreatment serum specimens. A, Heatmap representation of unsupervised hierarchical clustering of the 85 significantly changed ion features (rows) grouped by sample type (columns). PR, partial response; SD, stable disease; PD, progressive disease. ESI+ and ESI− represent features detected in the positive and negative electrospray ionization modes, respectively. Shades of yellow and blue represent an increase and a decrease of an ion feature, respectively, relative to the average level in all the samples (see color scale). B, Scores plot of PLS-DA of samples in the discovery set based on the 85 ion features. Red, progressive disease (PD) samples; yellow, stable disease (SD) samples; green, partial response (PR) samples. C, Values for the variable importance in projection (VIP) of the 85 ion features. Each column represents one feature in the PLS-DA model (Fig. 2B). Error bars, SEM.

Identification of metabolites predictive of chemotherapeutic response

PLS-DA was used to determine the potential metabolite features associated with chemotherapeutic response. On the basis of the selected 85 features, the PLS-DA scores plot illustrated a clear separation between the DC and PD groups in the discovery set (Fig. 2B). Values for the variable importance in projection (VIP) were used to rank the importance of different features for specimen discrimination in the model (Fig. 2C). Thirty-two features with a VIP value greater than 1 were selected for further consideration (25), including discriminant model establishment and identification of chemical structure. Three parameters of each feature, namely accurate mass, retention time, and fragmentation mass spectrum, were analyzed and queried in the following databases: the Human Metabolome Database (HMDB, http://www.hmdb.ca/), Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/), METLIN (http://metlin.scripps.edu/), and MassBank (http://www.massbank.jp/). Ultimately, 11 metabolites were assigned structures, including eight with lower levels in the DC group compared with the PD group [hypotaurine, taurine, choline, betaine, dimethylglycine (DMG), uridine, dodecanoylcarnitine (C12:0-carnitine), and l-palmitoylcarnitine (C16:0-carnitine)], and three with elevated levels in the DC group (palmitic amide, imidazole-4-acetaldehyde, and niacinamide; Supplementary Table S1). These metabolites were unambiguously identified using tandem mass spectrometry (Supplementary Fig. S2A–S2H). The coefficients of variation (CV) of these metabolites in the QC samples were all below 20% (an accepted tolerance level for LC/MS-MS; ref. 26). The relative abundances of these metabolites in the DC group versus PD group are plotted in Fig. 3.
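The QC acceptance criterion mentioned above (CV below 20% across pooled QC injections) is straightforward to compute. The sketch below assumes the CV is the sample standard deviation divided by the mean, expressed as a percentage; the function names and example intensities are hypothetical.

```python
import statistics


def cv_percent(qc_intensities):
    """Coefficient of variation (%) of one metabolite across QC injections."""
    mean = statistics.mean(qc_intensities)
    if mean == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    # Assumption: sample standard deviation (n - 1 denominator).
    return 100.0 * statistics.stdev(qc_intensities) / mean


def passes_qc(qc_intensities, max_cv=20.0):
    """True when the metabolite meets the CV < 20% tolerance."""
    return cv_percent(qc_intensities) < max_cv


# Made-up QC peak areas for one metabolite (not study data).
example = [90.0, 100.0, 110.0]
print(round(cv_percent(example), 1), passes_qc(example))
```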

Figure 3.

Relative abundance of the 11 identified metabolites. The abundance of each metabolite was normalized to the mean level of all tested samples. The z-scores are shown as box and whisker plots. The bands inside the boxes represent the median abundance, and the ends of the whiskers represent the 5th and 95th percentiles. Data points outside the whiskers are plotted as dots (outliers). DC group samples are shown in blue, and PD group samples in red. The first eight metabolites were lower and the last three higher in the DC group than in the PD group.

Given that pemetrexed is an inhibitor that interferes with folate metabolism, we examined whether any of the structurally identified metabolites in our initial study are potentially associated with responses to pemetrexed monotherapy. To explore this idea, pretreatment serum specimens from 18 patients who received first-line pemetrexed monotherapy (including 6 PR, 6 SD, and 6 PD) were collected and tested in targeted analyses of the previously identified 11 metabolites. We found that the accumulation of three metabolites, choline, betaine, and DMG, changed significantly between the DC and PD specimens (Supplementary Fig. S3A and S3B).

Discriminant model establishment based on the discovery set

The optimal variable combination for each number of variables (from 1 to 11) was obtained according to the likelihood score by using a logistic regression model with the best subset selection method (Supplementary Table S2). For models with 7 to 11 variables, the likelihood scores were comparable (99.269–101.54) and superior to those of models with fewer variables. Furthermore, 5-fold cross validation was used to test the mean squared error (MSE) of the above 11 identified variable combination patterns. The seven-metabolite panel (hypotaurine, uridine, C12:0-carnitine, choline, dimethylglycine, niacinamide, and C16:0-carnitine) yielded the minimal MSE and was therefore defined as the best variable model (Supplementary Fig. S4). On this basis, the discriminant model was established according to the logistic regression values g(z) (Fig. 4B). The ROC analysis of the model using the seven-metabolite panel yielded an AUC of 0.9214 (0.939 for pem-cis and 0.896 for pem-carbo; Fig. 4A; Supplementary Fig. S5A and S5B).
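To give a sense of the search space behind best subset selection and of how a 5-fold split partitions the discovery set, the following Python sketch enumerates candidate metabolite subsets and builds fold indices. It does not fit any model; the helper names, the fixed seed, and the round-robin fold assignment are assumptions made for illustration.

```python
import random
from itertools import combinations

# The 11 structurally identified metabolites named in the text.
METABOLITES = [
    "hypotaurine", "taurine", "choline", "betaine", "DMG", "uridine",
    "C12:0-carnitine", "C16:0-carnitine", "palmitic amide",
    "imidazole-4-acetaldehyde", "niacinamide",
]


def subsets_of_size(k):
    """All candidate variable sets with exactly k metabolites."""
    return list(combinations(METABOLITES, k))


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


for size in range(1, len(METABOLITES) + 1):
    print(size, len(subsets_of_size(size)))

folds = k_fold_indices(251)  # 251 = size of the discovery set
print([len(fold) for fold in folds])
```

Best subset selection picks the top-scoring subset at each size, so only 11 of the 2,047 non-empty subsets go forward to cross validation; each of those would be refit on four folds and scored on the held-out fold.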

Figure 4.

Efficacy prediction based on the seven-metabolite panel. A and C, Plots of ROC results for distinguishing PD samples from the DC samples for the discovery set (A) and the validation set (C). The ROC curves were created by plotting the sensitivity (i.e., true positive rate) against 1 − specificity (i.e., false positive rate). The blue line in each plot represents the area under the curve (AUC). B and D, The logistic regression values g(z) for each patient of the discovery (B) and validation set (D). Logistic regression values g(z) were calculated with the formula $g(z) = 1/(1 + e^{-z})$, where $z = -10.312 + 6.81\times10^{-7}X_1 + 1.30\times10^{-7}X_2 + 1.55\times10^{-7}X_3 + 1.29\times10^{-9}X_4 - 3.10\times10^{-8}X_5 - 1.66\times10^{-7}X_6 - 4.04\times10^{-8}X_7$ ($X_1$, hypotaurine; $X_2$, uridine; $X_3$, C12:0-carnitine; $X_4$, choline; $X_5$, DMG; $X_6$, niacinamide; and $X_7$, C16:0-carnitine). E and F, Kaplan–Meier survival curves for progression-free survival (PFS; E) and overall survival (OS; F) time of patients classified by the discrimination model as DC cases and PD cases. Survival differences were evaluated by log-rank test.
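The formula in the Figure 4 legend can be evaluated with a few lines of Python. The intercept and coefficients below are transcribed from the legend; the dictionary keys, function names, and the numerically stable form of the logistic function are illustrative choices. The legend does not state the units of the inputs X1 to X7 (the small coefficients suggest raw peak areas, but that is an assumption), so the function is only meaningful for data on the same scale as the study's.

```python
import math

INTERCEPT = -10.312
# Coefficients transcribed from the Figure 4 legend (X1..X7).
COEFFICIENTS = {
    "hypotaurine": 6.81e-07,
    "uridine": 1.30e-07,
    "C12:0-carnitine": 1.55e-07,
    "choline": 1.29e-09,
    "DMG": -3.10e-08,
    "niacinamide": -1.66e-07,
    "C16:0-carnitine": -4.04e-08,
}


def linear_predictor(intensities):
    """z = intercept + sum(coefficient * intensity) over the seven metabolites."""
    return INTERCEPT + sum(
        coef * intensities[name] for name, coef in COEFFICIENTS.items()
    )


def g(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# With all intensities at zero, z equals the intercept and g(z) is close to 0.
baseline = {name: 0.0 for name in COEFFICIENTS}
print(linear_predictor(baseline), g(linear_predictor(baseline)))
```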

Validation of the discriminant model

To validate the accuracy of the discriminant model, untargeted metabolomic analysis was performed on the aforementioned validation sample set of 103 serum specimens. Accumulation trends for the seven metabolites in the validation set were similar to those in the discovery set (Supplementary Fig. S6). The discriminant model established from the discovery set data distinguished the specimens of the PD group from those of the DC group in the validation set with an AUC of 0.9092 (0.936 for pem-cis–treated patients and 0.881 for pem-carbo–treated patients; Fig. 4C; Supplementary Fig. S7A and S7B). Logistic regression values g(z) were obtained for the patients in the validation set (Fig. 4D).

For all 354 patients in the discovery and validation sets combined, we set the optimal cut-off value of g(z) at 0.155, which corresponded to the maximum Youden index of our discrimination model. Patients with g(z) values greater than 0.155 are classified as PD cases, and those with values equal to or less than 0.155 as DC cases. At this cut-off value, the discrimination model distinguished PD patients from DC patients with a sensitivity of 90.8% and a specificity of 79.5%.
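Choosing a cut-off by maximizing the Youden index (J = sensitivity + specificity − 1) can be sketched as follows. The function name, the use of observed scores as candidate thresholds, and the toy data are assumptions; the rule "score greater than the cut-off means PD" mirrors the classification rule stated above.

```python
def youden_cutoff(scores, is_pd):
    """Return (cutoff, J, sensitivity, specificity) with maximal Youden index.

    scores: g(z) value per patient
    is_pd:  True for progressive disease, False for disease control
    A patient is called PD when score > cutoff.
    """
    n_pd = sum(1 for label in is_pd if label)
    n_dc = len(is_pd) - n_pd
    if n_pd == 0 or n_dc == 0:
        raise ValueError("need at least one PD and one DC case")

    best = None
    for cutoff in sorted(set(scores)):  # candidate thresholds (assumption)
        true_pos = sum(1 for s, pd in zip(scores, is_pd) if pd and s > cutoff)
        true_neg = sum(1 for s, pd in zip(scores, is_pd) if not pd and s <= cutoff)
        sensitivity = true_pos / n_pd
        specificity = true_neg / n_dc
        j = sensitivity + specificity - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sensitivity, specificity)
    return best


# Toy scores and labels (made-up values, not study data).
toy_scores = [0.05, 0.10, 0.12, 0.20, 0.60, 0.90]
toy_is_pd = [False, False, False, False, True, True]
print(youden_cutoff(toy_scores, toy_is_pd))
```

Because the cut-off here is tuned on the same patients it is evaluated on, the resulting sensitivity and specificity tend to be optimistic compared with performance on new patients.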

We further analyzed the ability of this discrimination model to predict progression-free survival (PFS). The median PFS of patients who were classified as DC cases was 10.3 months (95% CI, 9.7–10.9), significantly longer than the 4.5 months (95% CI, 4.0–5.1) of those classified as PD cases (log-rank, P < 0.001; Fig. 4E). The overall survival (OS) of patients predicted as DC cases was similar to that of patients predicted as PD cases (median OS, 17.8 vs. 18.3 months; log-rank, P = 0.534; Fig. 4F), with a censoring rate of approximately 9% of all the patients.
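For readers unfamiliar with how a median survival time is read off a Kaplan–Meier curve in the presence of censoring, a minimal product-limit estimator is sketched below. It is a generic textbook illustration with made-up follow-up times, not a reproduction of the study's analysis, and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    times:  follow-up time per patient
    events: 1 if progression/death was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve


def median_survival(curve):
    """First time at which estimated survival drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Made-up follow-up data in months; the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 1])
print(curve, median_survival(curve))
```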

Prognostic prediction workflow

To illustrate the prognostic prediction workflow of the discriminant model, we selected three pretreatment serum samples whose clinical responses were typical of PR, SD, and PD, respectively. Metabolomics analysis of the serum samples and application of the discriminant model were conducted (Fig. 5A and B). On the basis of the relative abundance of the seven-metabolite panel, the logistic regression values g(z) of these three samples were calculated as 0.045, 0.112, and 0.882, respectively (Fig. 5C). With the g(z) cut-off value of 0.155, we predicted that samples A and B were from DC cases (either PR or SD) and that sample C was from a PD case. These predictions from the model were in line with CT imaging-based clinical evaluations of these three patients (Fig. 5D).
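The final classification step for the three example samples reduces to a threshold comparison. The g(z) values and the 0.155 cut-off below come from the text; the function and variable names are illustrative.

```python
CUTOFF = 0.155  # cut-off value of g(z) reported in the text


def classify(g_value, cutoff=CUTOFF):
    """PD when g(z) exceeds the cut-off, otherwise DC (PR or SD)."""
    return "PD" if g_value > cutoff else "DC"


# g(z) values reported for the three illustrative samples.
samples = {"A": 0.045, "B": 0.112, "C": 0.882}
predictions = {name: classify(value) for name, value in samples.items()}
print(predictions)
# {'A': 'DC', 'B': 'DC', 'C': 'PD'}
```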

Figure 5.

Analysis workflow of prognostic prediction. A, Typical base peak of the serum specimen in positive ion mode. B, Metabolite identification and quantification of the seven metabolites of hypotaurine, uridine, C12:0-carnitine, choline, DMG, niacinamide, C16:0-carnitine. C, Logistic regression values (g(z)) and outcome prediction of the three samples. D, Typical evaluation of pemetrexed-platinum chemotherapy efficacy according to RECIST version 1.1 by comparing CT imaging in three representative patients at baseline (pretreatment) and after two cycles of treatment (posttreatment). PD, progressive disease; PR, partial response; SD, stable disease. The tumor sizes provided under the images were evaluated by physicians.

This prediction modeling study was designed in accordance with the CHARMS checklist (27): the model was developed by untargeted analysis in a discovery set, and its predictive performance was quantified in an independent validation set using the same analytic method as in the discovery set. Using metabolomics analysis in a large lung cancer population, we developed an effective discriminant model based on a seven-metabolite panel that can predict the efficacy of pemetrexed plus platinum doublet chemotherapy prior to treatment delivery. This predictive model could be readily applied by physicians to select patients who might benefit from pemetrexed and platinum doublet therapy, providing a promising strategy to personalize this widely used chemotherapy.

Metabolomics, because of its close relationship with the phenotype and its sensitivity to many factors, has been widely used in biomarker discovery. As the “downstream” omics of genomics, transcriptomics, and proteomics, metabolomics can capture many subtle changes that reflect alterations of biological states, even when no measurable changes in genes and proteins are detected; it is therefore, at least theoretically, more likely to yield candidate biomarkers (28). In addition, endogenous metabolites are fewer in number than genes, transcripts, and proteins, and they share the same basic chemical structures and highly conserved pathways, which makes metabolomic data easier to interpret (29, 30). Moreover, metabolomic analyses can be performed noninvasively on biofluids such as blood and urine and are less expensive than other omics approaches, and are thus easier to translate into clinical practice (31, 32). Despite these advantages, metabolomics still has limitations that affect its applicability to systems biology studies. First, complete coverage of the metabolome by detection and identification of all metabolites remains a challenge, requiring the use of different detection platforms as well as further development of metabolome databases. Second, because the metabolome is highly dynamic and sensitive to a wide range of internal and external factors (33), the information obtained needs to be validated to ensure consistency and reproducibility.

The main reason underlying the frequent failure to identify biomarkers that predict patient response to particular chemotherapy treatments is that the targets of cytotoxic agents are not single gene or protein aberrations (34, 35). Instead, cytotoxic agents target general biological processes such as cell proliferation and apoptosis, which can be influenced by cancer-related metabolic alterations. These cancer-driven metabolic abnormalities may be representative of a tumor's intrinsic features and may even be indicative of the cancer's pathogenesis, giving rise to changes in serum metabolites that are biologically relevant to the system's phenotype (36, 37).

Patients with lung adenocarcinoma harboring EGFR-sensitizing mutations have an excellent objective response rate (60%–80%) and survival outcomes (PFS, 9–13 months) after EGFR-TKI treatment, meaning that, once identified as harboring an EGFR mutation, these selected patients have about a 70% probability of responding and a 50% probability of disease control lasting longer than about 9 months. In our study, for patients at or below the cut-off value (g(z) ≤ 0.155), the median PFS after chemotherapy was 10.3 months, intriguingly comparable with that achieved with EGFR-TKIs. These results indicate that our discriminant model has satisfactory predictive value for chemotherapy response in clinical practice. Importantly, our formulated gauge, which includes boundary values based on the probability of PD, can provide integral prognostic information for physicians and can be used conveniently in clinical practice.

In total, 11 of the 32 ion features with VIP > 1 were assigned chemical structures based on information in databases. These metabolites are mainly involved in the metabolism of amino acids, fatty acids, and pyrimidines, which is closely associated with cancer progression and drug resistance. Uridine, an important component of pyrimidine metabolism, plays a crucial role in the synthesis of RNA, glycogen, and biomembranes (38). Taurine is an essential, sulfur-containing organic compound with diverse biological functions, acting as a neurotransmitter, a cell membrane stabilizer, and a facilitator of the transport of ions such as sodium, potassium, calcium, and magnesium (39). Hypotaurine is a product of a reaction catalyzed by cysteamine dioxygenase and has been reported to function as an antioxidant and protective agent under physiologic conditions. Hypotaurine can also be oxidized to taurine by hypotaurine dehydrogenase (40). Carnitine can be acylated to form acylcarnitines such as l-palmitoylcarnitine and dodecanoylcarnitine and thus participates in the formation of organic compounds (41). These metabolites, involved in biosynthesis, solute transport, and physiologic protection, were found in our study to be potentially associated with the efficacy of pemetrexed plus platinum doublet therapy.

When these metabolites were examined in patients receiving pemetrexed monotherapy, choline, betaine, and DMG were found to be associated with its efficacy. Choline, betaine, and DMG can be converted through a series of demethylation reactions to sarcosine and glycine, both of which have been reported to play key roles in cancer progression (17, 42). Sarcosine and glycine are involved in the donation of carbon units in one-carbon metabolism (43). Notably, the anticancer effects of pemetrexed are known to occur through the inhibition of key folate cycle enzymes, such as thymidylate synthase (TS), dihydrofolate reductase (DHFR), and glycinamide ribonucleotide formyltransferase (GARFT; ref. 44), which disrupts folate metabolism and thus influences one-carbon metabolism. Accordingly, we deduce that the relatively high accumulation of these three metabolites in patients in the PD group indicates an abundant source of components involved in one-carbon metabolism, which may compensate for the disruption of one-carbon metabolism by pemetrexed treatment and thereby lead to poor clinical outcomes.

To the best of our knowledge, this is the first large-cohort study using metabolic biomarkers to predict the clinical response to pemetrexed plus platinum doublet therapy in patients with lung adenocarcinoma. The discriminant model developed in our study offers a feasible and convenient strategy to personalize the widely used pemetrexed plus platinum doublet chemotherapy. The high accuracy of our model in predicting the likelihood of chemotherapy response warrants further evaluation in prospective clinical trials.

No potential conflicts of interest were disclosed by the authors.

Conception and design: Y. Tian, Z. Wang, Y. Yin, J. He, J. Wang

Development of methodology: Y. Tian, X. Liu, J. Wang

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Z. Wang, X. Liu, D. Wang, J. Wang

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y. Tian, X. Liu, G. Feng, C. Zhang, D. Wang, J. Wang

Writing, review, and/or revision of the manuscript: Y. Tian, Z. Wang, J. Wang

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Z. Wang, J. Duan, Z. Chen, H. Bai, R. Wan, J. Jiang, J. Liu, J. Han, X. Zhang, L. Cai, J. Wang

Study supervision: J. Gu, S. Gao, J. Wang

We thank all patients who were involved in this study. We also thank Lina Xu, Xueying Wang, and Yupei Jiao from the Metabolomics Facility at the Technology Center for Protein Sciences of Tsinghua University for their support.

This work was supported by the National Natural Sciences Foundation Key Program (81630071 and 81330062); National Key Research and Development Project Precision Medicine Special Research (2016YFC0902300); National High Technology Research and Development Program 863 (SS2015AA020403); CAMS Innovation Fund for Medical Sciences (CIFMS 2016-I2M-3-008); Aiyou Foundation (KY201701); National Natural Sciences Foundation, China (81101778, 81472206, and 81702289); and Beijing Natural Science Foundation, China (7172045).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA: A Cancer J Clin 2015;65:87–108.
2. Scagliotti GV, Parikh P, von Pawel J, Biesma B, Vansteenkiste J, Manegold C, et al. Phase III study comparing cisplatin plus gemcitabine with cisplatin plus pemetrexed in chemotherapy-naive patients with advanced-stage non-small-cell lung cancer. J Clin Oncol 2008;26:3543–51.
3. Ettinger DS, Wood DE, Akerley W, Bazhenova LA, Borghaei H, Camidge DR, et al. NCCN Guidelines insights: non-small cell lung cancer, version 4.2016. J Natl Compr Canc Netw 2016;14:255–64.
4. Genova C, Rijavec E, Truini A, Coco S, Sini C, Barletta G, et al. Pemetrexed for the treatment of non-small cell lung cancer. Expert Opin Pharmacother 2013;14:1545–58.
5. Zinner RG, Fossella FV, Gladish GW, Glisson BS, Blumenschein GR Jr, Papadimitrakopoulou VA, et al. Phase II study of pemetrexed in combination with carboplatin in the first-line treatment of advanced nonsmall cell lung cancer. Cancer 2005;104:2449–56.
6. Sun JM, Ahn JS, Jung SH, Sun J, Ha SY, Han J, et al. Pemetrexed plus cisplatin versus gemcitabine plus cisplatin according to thymidylate synthase expression in nonsquamous non-small-cell lung cancer: a biomarker-stratified randomized phase II Trial. J Clin Oncol 2015;33:2450–6.
7. Shimizu T, Nakanishi Y, Nakagawa Y, Tsujino I, Takahashi N, Nemoto N, et al. Association between expression of thymidylate synthase, dihydrofolate reductase, and glycinamide ribonucleotide formyltransferase and efficacy of pemetrexed in advanced non-small cell lung cancer. Anticancer Res 2012;32:4589–96.
8. Olaussen KA, Dunant A, Fouret P, Brambilla E, Andre F, Haddad V, et al. DNA repair by ERCC1 in non-small-cell lung cancer and cisplatin-based adjuvant chemotherapy. N Engl J Med 2006;355:983–91.
9. Bowden NA. Nucleotide excision repair: why is it not used to predict response to platinum-based chemotherapy? Cancer Lett 2014;346:163–71.
10. Friboulet L, Olaussen KA, Pignon JP, Shepherd FA, Tsao MS, Graziano S, et al. ERCC1 isoform expression and DNA repair in non-small-cell lung cancer. N Engl J Med 2013;368:1101–10.
11. Lynch TJ, Bell DW, Sordella R, Gurubhagavatula S, Okimoto RA, Brannigan BW, et al. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non-small-cell lung cancer to gefitinib. N Engl J Med 2004;350:2129–39.
12. Kwak EL, Bang YJ, Camidge DR, Shaw AT, Solomon B, Maki RG, et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N Engl J Med 2010;363:1693–703.
13. Agarwal R, Kaye SB. Ovarian cancer: strategies for overcoming resistance to chemotherapy. Nat Rev Cancer 2003;3:502–16.
14. Beger RD. A review of applications of metabolomics in cancer. Metabolites 2013;3:552–74.
15. Spratlin JL, Serkova NJ, Eckhardt SG. Clinical applications of metabolomics in oncology: a review. Clin Cancer Res 2009;15:431–40.
16. Mapstone M, Cheema AK, Fiandaca MS, Zhong X, Mhyre TR, MacArthur LH, et al. Plasma phospholipids identify antecedent memory impairment in older adults. Nat Med 2014;20:415–8.
17. Sreekumar A, Poisson LM, Rajendiran TM, Khan AP, Cao Q, Yu J, et al. Metabolomic profiles delineate potential role for sarcosine in prostate cancer progression. Nature 2009;457:910–4.
18. Yuan M, Breitkopf SB, Yang X, Asara JM. A positive/negative ion-switching, targeted mass spectrometry-based metabolomics platform for bodily fluids, cells, and fresh and fixed tissue. Nat Protoc 2012;7:872–81.
19. Travis WD, Brambilla E, Noguchi M, Nicholson AG, Geisinger KR, Yatabe Y, et al. International association for the study of lung cancer/american thoracic society/european respiratory society international multidisciplinary classification of lung adenocarcinoma. J Thorac Oncol 2011;6:244–85.
20. Smilde AK, van der Werf MJ, Bijlsma S, van der Werff-van der Vat BJ, Jellema RH. Fusion of mass spectrometry-based metabolomics data. Anal Chem 2005;77:6729–36.
21. Yin P, Zhao X, Li Q, Wang J, Li J, Xu G. Metabonomics study of intestinal fistulas based on ultraperformance liquid chromatography coupled with Q-TOF mass spectrometry (UPLC/Q-TOF MS). J Proteome Res 2006;5:2135–43.
22. Kamarudin AN, Cox T, Kolamunnage-Dona R. Time-dependent ROC curve analysis in medical research: current methods and applications. BMC Med Res Methodol 2017;17:53.
23. Schisterman EF, Perkins NJ, Liu A, Bondell H. Optimal cut-point and its corresponding Youden index to discriminate individuals using pooled blood samples. Epidemiology 2005;16:73–81.
24. Want EJ, Wilson ID, Gika H, Theodoridis G, Plumb RS, Shockcor J, et al. Global metabolic profiling procedures for urine using UPLC-MS. Nat Protoc 2010;5:1005–18.
25. Huang Q, Tan Y, Yin P, Ye G, Gao P, Lu X, et al. Metabolic characterization of hepatocellular carcinoma using nontargeted tissue metabolomics. Cancer Res 2013;73:4992–5002.
26. Dunn WB, Broadhurst D, Begley P, Zelena E, Francis-McIntyre S, Anderson N, et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nat Protoc 2011;6:1060–83.
27. Moons KGM, de Groot JAH, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Medicine 2014;11:e1001744.
28. Monteiro MS, Carvalho M, Bastos ML, Guedes de Pinho P. Metabolomics analysis for biomarker discovery: advances and challenges. Curr Med Chem 2013;20:257–71.
29. Davis VW, Bathe OF, Schiller DE, Slupsky CM, Sawyer MB. Metabolomics and surgical oncology: Potential role for small molecule biomarkers. J Surg Oncol 2011;103:451–9.
30. Goodacre R, Vaidyanathan S, Dunn WB, Harrigan GG, Kell DB. Metabolomics by numbers: acquiring and understanding global metabolite data. Trends Biotechnol 2004;22:245–52.
31. Kind T, Tolstikov V, Fiehn O, Weiss RH. A comprehensive urinary metabolomic approach for identifying kidney cancer. Anal Biochem 2007;363:185–95.
32. Fiehn O, Kristal B, van Ommen B, Sumner LW, Sansone SA, Taylor C, et al. Establishing reporting standards for metabolomic and metabonomic studies: a call for participation. OMICS 2006;10:158–63.
33. Castle AL, Fiehn O, Kaddurah-Daouk R, Lindon JC. Metabolomics Standards workshop and the development of international standards for reporting metabolomics experimental results. Brief Bioinform 2006;7:159–65.
34. Muhsin M, Gricks C, Kirkpatrick P. Pemetrexed disodium. Nat Rev Drug Discov 2004;3:825–6.
35. Ahmad A, Gadgeel S. Lung Cancer and Personalized Medicine. Springer International Publishing; 2016.
36. Griffin JL, Shockcor JP. Metabolic profiles of cancer cells. Nat Rev Cancer 2004;4:551–61.
37. Loo JM, Scherl A, Nguyen A, Man FY, Weinberg E, Zeng Z, et al. Extracellular metabolic energetics can promote cancer progression. Cell 2015;160:393–406.
38. Yamamoto T, Koyama H, Kurajoh M, Shoji T, Tsutsumi Z, Moriwaki Y. Biochemistry of uridine in plasma. Clin Chim Acta 2011;412:1712–24.
39. Ripps H, Shen W. Review: taurine: a "very essential" amino acid. Mol Vis 2012;18:2673–86.
40. Fontana M, Pecci L, Dupre S, Cavallini D. Antioxidant properties of sulfinates: protective effect of hypotaurine on peroxynitrite-dependent damage. Neurochem Res 2004;29:111–6.
41. Reuter SE, Evans AM. Carnitine and acylcarnitines: pharmacokinetic, pharmacological and clinical aspects. Clin Pharmacokinet 2012;51:553–72.
42. Jain M, Nilsson R, Sharma S, Madhusudhan N, Kitami T, Souza AL, et al. Metabolite profiling identifies a key role for glycine in rapid cancer cell proliferation. Science 2012;336:1040–4.
43. Locasale JW. Serine, glycine and one-carbon units: cancer metabolism in full circle. Nat Rev Cancer 2013;13:572–83.
44. Chattopadhyay S, Moran RG, Goldman ID. Pemetrexed: biochemical and cellular pharmacology, mechanisms, and clinical applications. Mol Cancer Ther 2007;6:404–17.