Abstract
A mass spectrometry analysis was performed using serum from patients receiving checkpoint inhibitors to define baseline protein signatures associated with outcome in metastatic melanoma. Pretreatment serum was obtained from a development set of 119 melanoma patients on a trial of nivolumab with or without a multipeptide vaccine and from patients receiving pembrolizumab, nivolumab, ipilimumab, or both nivolumab and ipilimumab. Spectra were obtained using matrix-assisted laser desorption/ionization time of flight mass spectrometry. These data combined with clinical data identified patients with better or worse outcomes. The test was applied to five independent patient cohorts treated with checkpoint inhibitors and its biology investigated using enrichment analyses. A signature consisting of 209 proteins or peptides was associated with progression-free and overall survival in a multivariate analysis. The test performance across validation cohorts was consistent with the development set results. A pooled analysis, stratified by set, demonstrated a significantly better overall survival for “sensitive” relative to “resistant” patients, HR = 0.15 (95% confidence interval: 0.06–0.40, P < 0.001). The test was also associated with survival in a cohort of ipilimumab-treated patients. Test classification was found to be associated with acute phase reactant, complement, and wound healing pathways. We conclude that a pretreatment signature of proteins, defined by mass spectrometry analysis and machine learning, predicted survival in patients receiving PD-1 blocking antibodies. This signature of proteins was associated with acute phase reactants and elements of wound healing and the complement cascade. This signature merits further study to determine if it identifies patients who would benefit from PD-1 blockade. Cancer Immunol Res; 6(1); 79–86. ©2017 AACR.
Introduction
Treatment of metastatic melanoma with the anti–PD-1 antibodies nivolumab and pembrolizumab has led to clinically and statistically significant improvements in progression-free and overall survival (OS) compared with alternative first- and second-line therapies (1–3). The promise of these treatments is the induction of durable responses and clinical benefit in a subgroup of around 30% of patients (4). The selection of patients who would benefit from anti–PD-1 therapy based on pretreatment parameters would further clinical understanding of how PD-1 blockade functions and speed development of alternative treatments for the patients not likely to benefit from the use of PD-1 antibodies. Delivering treatments only to those patients likely to benefit would also result in substantial health care savings and decreased morbidity due to elimination of unnecessary toxicity.
Considerable effort has been directed toward establishing the utility of PD-L1 expression on tumor and/or immune-infiltrating cells measured by immunohistochemistry (IHC), which is currently the best candidate biomarker for selection of patients for anti–PD-1 therapy. Correlations between PD-L1 expression and outcome with PD-1/PD-L1 antibodies have been observed in many studies (5–8), but melanoma patients with negatively stained tumors may still benefit from anti–PD-1 therapy (9). In nonsquamous non–small cell lung cancer, benefit with PD-1 blockade was greater for those with strong PD-L1 staining (5–7). Three tests of PD-L1 expression by IHC are approved by the FDA to guide treatment decisions in bladder and non–small cell lung cancer as well as melanoma, with different assays using different screening thresholds for PD-L1 positivity (10). IHC staining for PD-L1 lacks standardization and universally accepted cutoffs, making comparison of biomarker data across trials difficult. PD-L1 expression is a variable, heterogeneous, and dynamic marker inducible by the amounts of IFNγ generated by infiltrating T cells (11). PD-L1 expression thus has limited utility as a predictive marker for the efficacy of PD-1/PD-L1 blockade. A variety of other tumor-associated markers have been found to be associated with resistance to checkpoint inhibition, including infiltration with CD8+PD-1+ T cells (9, 12), altered β-catenin signaling (13, 14), PTEN deletion (15), and a gene signature of IFNγ signaling (16, 17). These markers have yet to be validated in large cohorts of patients.
A serum-based pretreatment test of circulating proteins would not require tissue and, if found to be associated with a favorable response to PD-1 blocking antibodies, would be clinically useful. To develop such a test, sera from a development set of 119 patients with stage IV melanoma collected prior to treatment with the PD-1 antibody nivolumab were analyzed using matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry (18, 19). This is a “soft” ionization technique designed to preserve the integrity of large macromolecules. Machine-learning techniques were used to combine the clinical and mass spectral (MS) data to generate a signature of protein peaks associated with outcome after PD-1 blockade (20). We show the results of test development and blinded validation on four independent pretreatment sample sets from patients treated with checkpoint inhibitors, as well as a pooled analysis of patients treated with PD-1 blockade and an analysis of a small cohort of patients treated with a combination of PD-1 and CTLA4 antibodies. In addition, we show results of a protein set enrichment analysis (PSEA) to elucidate biological pathways associated with test classification.
Materials and Methods
Sample sets
Six pretreatment serum sample sets were used for test development and validation. Available patient characteristics are described in Table 1. For development, we used all available (119) pretreatment serum samples from a clinical study of 133 patients (21, 22) in which the efficacy of nivolumab monotherapy at 1, 3, and 10 mg/kg with or without a peptide vaccine in second- or later-line treatment of metastatic melanoma was evaluated. The majority of patients were treated at 3 mg/kg of nivolumab without vaccine, and no differences were observed by dose or with vaccine in this study in any measure of outcome (21). Pretreatment samples and clinical data for validation sets 1 (PD-1 antibodies) and 4 (ipilimumab) were obtained prospectively and consecutively from available patients receiving off-protocol treatment at approved doses collected by the Yale SPORE in Skin Cancer. Samples for validation set 3 (PD-1 antibodies) were collected as part of clinical practice at National Tumor Institute “Fondazione G. Pascale,” Naples. Fifty-two percent were treated with nivolumab or pembrolizumab in second line and 48% in third line. Validation set 2 was collected prospectively at the Massachusetts General Hospital Cancer Center, predominantly from patients treated in first or second lines for advanced disease. Set 5 (ipilimumab + nivolumab) was from patients treated off-protocol at Yale. All samples were collected with appropriate informed consent as part of institutional review board–approved protocols at the respective institutions. Validation samples were run blinded to the clinical data.
Spectral acquisition and processing
Samples were processed using standardized operating procedures described in the supplementary materials. We used the Deep MALDI method of mass spectrometry (18) on a SimulToF mass spectrometer (Virgin Instruments) to generate reproducible mass spectra from small amounts of serum (<2 μL), showing peaks from a higher abundance range than previously possible by exposing the samples to 400,000 MALDI laser “shots” compared with several thousand used in standard applications (23). The spectra were processed to render them comparable between patients and 351 MS peaks were selected for further analysis by assessing their reproducibility and stability. Sample processing and MS analysis methods are described fully in the supplementary materials. Parameters for these procedures were established using only the 119-sample development set, and this fixed procedure was applied to all other sample sets without modification.
Test development
The Diagnostic Cortex platform (20) incorporates machine learning concepts and advances in deep learning (24) and was designed for test development in cases with more attributes than samples, with the goal of minimizing the potential for overfitting and promoting the ability of the developed tests to generalize to unseen datasets. This is achieved by performing many different splits of the development set into training and test sets, obtaining performance estimates only from the test parts, and taking an average over all these results (25). The method is implemented in a semisupervised manner that allows simultaneous refinement of the test and the classes used in its training, revealing the underlying structure in the MS data associated with the chosen outcome measure, OS. Full details are provided in the supplementary materials (Supplementary Tables S3–S7 and Supplementary Figs. S1 and S2) and in Roder and colleagues (20). The results of a test developed using this platform that was able to stratify patients into those with better and worse outcomes on PD-1 blockade have been presented previously (26).
The parameters and reference data for the final test were generated solely on the development set, and these were then locked. All validations were performed using these fixed analysis procedures.
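The repeated-split evaluation described above can be illustrated with a minimal Python sketch. It is not the Diagnostic Cortex implementation: the nearest-centroid rule, the function name `out_of_bag_scores`, and the parameters `n_splits` and `test_frac` are assumptions chosen only to show how each sample's score is averaged over the splits in which it was held out of training.

```python
import numpy as np

def out_of_bag_scores(X, y, n_splits=200, test_frac=0.33, seed=0):
    """Average test-part predictions over many random train/test splits.

    X: (n_samples, n_features) feature matrix; y: 0/1 labels.
    A nearest-centroid rule stands in for the real classifier (assumption).
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = max(1, int(round(test_frac * n)))
    votes = np.zeros(n)   # sum of class-1 predictions made while held out
    counts = np.zeros(n)  # number of splits in which each sample was held out
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        # Skip degenerate splits whose training part lacks one class.
        if len(np.unique(y[train])) < 2:
            continue
        c0 = X[train][y[train] == 0].mean(axis=0)
        c1 = X[train][y[train] == 1].mean(axis=0)
        d0 = np.linalg.norm(X[test] - c0, axis=1)
        d1 = np.linalg.norm(X[test] - c1, axis=1)
        votes[test] += (d1 < d0)
        counts[test] += 1
    # Samples never held out get NaN rather than a training-set estimate.
    return np.where(counts > 0, votes / np.maximum(counts, 1), np.nan)
```

Because a sample contributes to its own score only when it was not used for training, the averaged score is less prone to the optimism of resubstitution estimates, which is the property the text emphasizes.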
Protein set enrichment analysis (PSEA)
This analysis applies the gene set enrichment analysis (GSEA) method to protein expression data (27, 28). This method identifies expression differences that are consistent across prespecified groups or sets of features, in our case, proteins. An additional independent reference set of 49 serum samples with matched mass spectrometry data and protein expression data from the panel of 1,129 proteins measured by SomaLogic was used for this analysis. Specific protein sets were created as the intersection of the list of SomaLogic 1,129 panel targets and results of queries for biological functions from GeneOntology, using AmiGO2 tools and UniProt databases (Supplement). The PSEA method associated test classification (sensitive and resistant) with these biological functions via a rank-based correlation of the measured protein expressions with the test classifications of the reference samples.
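As an illustration of the rank-based statistic underlying GSEA-type methods, the following Python sketch computes a running-sum enrichment score for one protein set. It is a simplified sketch under stated assumptions: the function name `enrichment_score` and the weighting exponent `p` are illustrative, and the permutation step used to obtain P values in (27) is omitted.

```python
import numpy as np

def enrichment_score(corr, in_set, p=1.0):
    """GSEA-style running-sum enrichment score.

    corr: correlation of each protein with the test classification.
    in_set: boolean mask marking proteins that belong to the protein set.
    """
    corr = np.asarray(corr, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    order = np.argsort(-corr)            # rank proteins, most positive first
    hits = in_set[order]
    weights = np.abs(corr[order]) ** p
    n_miss = (~hits).sum()
    hit_total = weights[hits].sum()
    if hit_total == 0 or n_miss == 0:
        raise ValueError("protein set must be a proper subset with nonzero weight")
    # Step up (weighted) at set members, step down (uniformly) elsewhere.
    step = np.where(hits, weights / hit_total, -1.0 / n_miss)
    running = np.cumsum(step)
    # The score is the running sum's largest deviation from zero, with sign.
    return running[np.argmax(np.abs(running))]
```

A score near +1 or −1 indicates that the set's proteins cluster at one end of the ranked list; a score near 0 indicates that they are spread throughout it.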
Statistical analysis
All analyses, except the PSEA, were performed using SAS 9.3 (SAS Institute). The PSEA was carried out in Matlab (MathWorks) using the method described by Subramanian and colleagues (27). OS and progression-free survival (PFS) plots and medians were generated using the Kaplan–Meier method. All P values are two-sided, except for PSEA P values, which are defined as described by Subramanian and colleagues (27).
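For readers unfamiliar with the Kaplan–Meier method, a minimal Python sketch of the estimator and of the median read from it follows. The analyses in this study were run in SAS; the function names `kaplan_meier` and `median_survival` are illustrative assumptions, and confidence intervals are not computed.

```python
import numpy as np

def kaplan_meier(times, events):
    """Return (event_times, survival) for the Kaplan-Meier estimator.

    times: follow-up time per patient; events: 1 = event observed, 0 = censored.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = (times >= t).sum()                 # still under observation at t
        d = ((times == t) & (events == 1)).sum()     # events occurring at t
        s *= 1.0 - d / at_risk
        surv.append(s)
    return event_times, np.array(surv)

def median_survival(event_times, surv):
    """First event time at which S(t) <= 0.5, or None if not reached."""
    below = np.nonzero(surv <= 0.5)[0]
    return float(event_times[below[0]]) if below.size else None
```

Censored patients leave the risk set without reducing the survival estimate, which is why a median may be "not reached" when follow-up is short.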
Results
A protein signature associated with PFS and OS with PD-1 antibody
The goal was to build a clinically useful test that identified patients receiving anti–PD-1 who would be likely to experience long survival. We constructed a test composed of multiple subtests, each using a clinically different subset of the development sample set, using the classifier development platform described above and in the supplementary materials (Supplementary Tables S3–S6, and Supplementary Figs. S1 and S2). For a sample to be assigned a test classification of sensitive (i.e., long survival), a sample had to classify uniformly into that group for each subtest. This would identify patients with the best outcomes, as the associated sample has to always demonstrate molecular features associated with good outcome whatever subset of samples is used to create the subtest. We generated a binary classifier for each of seven clinically different subsets (see Supplementary Material). Each of these subtests assigns a good or poor prognosis classification to a sample (by comparing this sample with the samples in the development subset of the test). Samples where at least one of the seven subtests indicated poor prognosis were labeled as “resistant.” The seven subtests utilize different numbers of MS peaks (n = 56, 69, 75, 82, 84, 85, and 85), which are partially overlapping and listed in the Supplementary Materials, so that the overall test uses 209 of the 351 analyzed mass spectrometry peak intensities in the classification algorithm.
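The combination rule described above, in which a sample is "sensitive" only if all subtests agree, can be sketched in a few lines of Python. The label strings "good" and "poor" and the function names `classify_sample` and `peaks_used` are illustrative assumptions; the subtests themselves are not reproduced here.

```python
def classify_sample(subtest_labels):
    """Combine subtest outputs: 'sensitive' only if every subtest says 'good'."""
    if not subtest_labels:
        raise ValueError("at least one subtest result is required")
    unanimous = all(label == "good" for label in subtest_labels)
    return "sensitive" if unanimous else "resistant"

def peaks_used(subtest_peaks):
    """Union of the peak identifiers used by any subtest."""
    used = set()
    for peaks in subtest_peaks:
        used |= set(peaks)
    return used
```

The union explains the peak count: the seven subtests use 56 + 69 + 75 + 82 + 84 + 85 + 85 = 536 peak slots in total, but because the peak lists overlap, only 209 distinct peaks enter the overall test.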
The test classified 34 (29%) of the patients in the development set as “sensitive” and 85 (71%) as “resistant.” Kaplan–Meier plots for OS and PFS are shown in Fig. 1A and B, respectively. Summary statistics are shown in the development set column of Table 2. Both OS and PFS show significant separation (P = 0.002 and 0.016, respectively) by test classification with substantial effect sizes for each (hazard ratios of 0.37 and 0.55, respectively). The sensitive group had an excellent two-year survival rate with nivolumab treatment of 67%.
The test was performed independently on the whole cohort a second time several months after initial assessment of the signature using the same mass spectrometer and settings to provide an estimate of technical assay reproducibility. Concordance of sample classification was 94% (112/119).
Test classification was independent of gender, age (<65 versus ≥65), PD-L1 expression (PD-L1 testing protocol outlined in supplementary materials), prior ipilimumab therapy, and nivolumab dose/addition of peptide vaccine (P ≥ 0.148), as shown in Supplementary Table S1. The classification was found to be significantly associated with serum LDH level (P = 0.006) and baseline tumor size (P < 0.001). However, the proposed test retained its predictive power for OS and PFS in a multivariate analysis including these factors (Table 3; P = 0.002 and 0.022, respectively). PD-L1 expression did not appear as a significant predictor of outcome in multivariate analysis, which may be due to the small subset of patients with available measured PD-L1 expression (n = 37).
Test classification was not found to be associated with the occurrence of severe immune adverse events (Supplementary Table S2), although it should be noted that the number of grade 3 to 4 events was low.
The potential biological relevance of the sensitive and resistant groups was investigated using the PSEA approach to assess whether pathways of biologic significance were associated with test classification. These data are summarized in Table 4. Biological processes significantly associated with classification with a false discovery rate <0.25 as assessed by the Benjamini–Hochberg approach (29) included the complement, wound healing, and acute phase reactant pathways. Based on the proteins that are included in the PSEA protein sets (shown in Supplementary Table S8), these processes all appear to be upregulated in the resistant group compared with the sensitive group. Proteins included in these protein sets that were significantly correlated (Mann–Whitney P < 0.05) with the test classification in the reference sample set are listed in Table 5.
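The Benjamini–Hochberg adjustment used for the false discovery rate threshold can be sketched in Python as follows. The function name `benjamini_hochberg` is an assumption made for illustration; the P values entering the published analysis are those in Table 4 and are not reproduced here.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (q-values) in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by m / k.
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working from the largest p-value downward.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(ranked, 1.0)
    return q
```

A protein set would then be reported when its adjusted value falls below the chosen threshold, here 0.25.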
Validation
The 209-protein signature was validated in a blinded manner in four independent sample sets from three additional institutions, one for patients treated with ipilimumab and three for patients treated with the PD-1 antibodies nivolumab or pembrolizumab. The results for OS are shown in Fig. 1C–F, and performance estimates of the test are summarized in the respective columns of Table 2. For the anti–PD-1 antibodies nivolumab and pembrolizumab (validation sets 1–3), the OS results are consistent with the development data, allowing for the small cohort sizes and limited follow-up in some cohorts. A pooled analysis of these validation sets, stratified by set, demonstrated a significantly better OS for the patients classified as “sensitive” relative to “resistant,” HR = 0.15 [95% confidence interval (CI): 0.06–0.40, P < 0.001]. PFS data were not available for validation set 1. Although the numerical estimate of the hazard ratio between sensitive and resistant groups for validation set 3 was the same as in the development set, and that for validation set 2 was somewhat larger, the set-stratified pooled analysis for the two sets with available PFS data did not show significantly better PFS for the sensitive group [HR = 0.63 (95% CI: 0.29–1.36), P = 0.239]. The ipilimumab-treated validation set 4 showed a significant difference in OS between sensitive and resistant groups (HR = 0.40, P = 0.004), even with the lower overall efficacy of this therapy compared with PD-1 or combination blockade.
In validation set 5, shown in Fig. 1G, where patients were treated with the combination of ipilimumab and nivolumab, 2-year survival rates were similar and high in both the sensitive and the resistant groups (83% and 63%, respectively), and there was no statistically significant separation of the groups by survival. This raises the question of whether outcome with combination therapy is associated with the signature at all, but analysis of larger numbers of patients is required to draw any firm conclusions.
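The pooled hazard ratios, confidence intervals, and P values reported above can be checked for approximate internal consistency with a short Python sketch. It assumes that each interval is symmetric on the log scale and uses a normal (Wald) approximation; the published P values come from the stratified analyses themselves, so only rough agreement with the rounded figures should be expected. The function names are illustrative.

```python
import math

def se_from_ci(lower, upper, z=1.959964):
    """Standard error of log(HR) implied by a 95% CI symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def wald_p_value(hr, se):
    """Two-sided Wald P value for H0: HR = 1, using the normal approximation."""
    z = abs(math.log(hr)) / se
    return math.erfc(z / math.sqrt(2.0))

# Pooled OS: HR = 0.15 (95% CI: 0.06-0.40), reported P < 0.001.
p_os = wald_p_value(0.15, se_from_ci(0.06, 0.40))

# Pooled PFS: HR = 0.63 (95% CI: 0.29-1.36), reported P = 0.239.
p_pfs = wald_p_value(0.63, se_from_ci(0.29, 1.36))
```

Under these assumptions the approximation gives a P value well below 0.001 for OS and one close to the reported 0.239 for PFS, in line with the published figures.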
Discussion
Definition of tumor biomarkers that predict response to and benefit from PD-1/PD-L1 blocking antibodies (9–17) would be clinically valuable. Following assays of the tumor microenvironment, the emerging consensus is that patients with T-cell-infiltrated tumors within an inflammatory milieu are most likely to respond to PD-1 blockade (12, 17). In contrast, little attention has been paid to developing serum or blood-based markers due to the concern that such markers may not reflect the immune events unfolding within the tumor. A biomarker strategy focused on the tumor suffers from the heterogeneity of gene expression within tumors, variability from one metastatic lesion to another, limited access to sufficient tumor for analysis, and the context-dependent expression and variation of markers like PD-L1 over time (10). A serum marker would be easy to measure from a single blood draw, might be independent of inter- and intratumor variation, and could assess aspects of the host that might not be apparent from the tumor microenvironment. Several serum markers have been suggested to be associated with outcome with checkpoint inhibitors, including soluble PD-L1 and angiopoietin-2 (30–32).
We developed a pretreatment serum test utilizing mass spectrometry to identify a group of serum proteins that distinguish metastatic melanoma patients who went on to have prolonged survival after receiving anti–PD-1 antibodies. The test was validated in a blinded manner in three independent sets of patients treated with PD-1 checkpoint inhibitors. The test measures the relative abundance of 209 circulating proteins or protein fragments via MALDI-TOF mass spectrometry and generates a test classification, designated either “sensitive” or “resistant,” by using machine learning. The test is reproducible, and its ability to identify patients likely to have prolonged survival when treated with different PD-1 antibodies was consistent across multiple sets of sera from different institutions and validated in a pooled analysis. The test appears to differentiate a group of patients who have a plateau of survival over 50% from those with less than 20% survival at 3 years. Long-term follow-up of a cohort of nivolumab-treated patients indicated that there was a plateau of the survival curve after 3 years, suggesting that some of those alive at that time point may be cured of melanoma (4).
The serum test described herein might identify patients expressing the “sensitive” serum classification that have long OS with PD-1 blockade alone or with the addition of ipilimumab to nivolumab (33). Although the test was developed in a manner that was agnostic to biological and immunological mechanism, it is important to investigate what differentiates the sensitive and resistant groups from a mechanistic point of view. The PSEA indicated that the serum of PD-1 resistant patients is characterized by acute phase, complement, and wound healing molecules. Inflammatory proteins and acute phase reactants, including CRP, are indicators of poor prognosis in many cancers (34), including melanoma (35–37), and in melanoma patients who received nivolumab (38). Here, we found that complement and wound healing pathways are associated with poor outcomes in patients treated with checkpoint inhibition. In support of these observations, a study established that T cells express C3a and C5a receptors that interact with the IL10 pathway upon ligand binding and suppress tumor-specific CD8+ T-cell function (39). An independent study showed that binding of the C3a receptor inhibited neutrophil and CD4+ T-helper activity (40), and C1q has also been shown to act as an immune inhibitor without promoting activation of complement (41). These data indicate that complement activation, or the presence of elements of the complement pathway, may inhibit the efficacy of adaptive antitumor immunity, with a putative role in stimulation of myeloid-derived suppressor cells (41, 42). The role of wound healing in anti–PD-1 resistance was also demonstrated as part of a transcriptomic primary resistance signature, IPRES, and the recruitment of T-regulatory cells in patients with active wound healing may explain the decreased efficacy of checkpoint inhibitors in such patients (17, 43).
The current serum test may help in the development of treatments overcoming primary anti–PD-1 resistance by adding inhibition of complement activation, suppression of components of wound healing or down modulation of acute phase pathways by blocking IL6 and IL1 signaling. Further experiments to demonstrate a role for these molecules in antitumor immunity are ongoing.
There are limitations to our data. Our validation sets are relatively small and have limited follow-up, and some data, such as PD-L1 expression status, are available only for a subset of the development cohort. Further validation in much larger retrospective sets with long-term follow-up is needed and is in progress. Although we did not observe association of PD-L1 expression status and test classification in this study, it would be of interest to reassess this in a larger cohort of patients to investigate whether a combination of both markers could provide better stratification of patient outcomes. Before use in clinical practice for patient selection, the utility of this test would have to be shown in prospective randomized studies to determine whether the signature is a prognostic or a predictive biomarker. It would also be important to observe changes in the levels of complement and wound healing activation at baseline and over time by a comparison of tumor tissue transcriptome and serum analysis. The availability of a validated serum assay to predict outcome with PD-1 blockade could provide guidance in selecting metastatic melanoma patients for immunotherapy and could further understanding of the mechanisms of sensitivity and resistance to checkpoint blockade.
Disclosure of Potential Conflicts of Interest
J.S. Weber has ownership interest (including patents) in Altor, Biond, and Cytomx and is a consultant/advisory board member for Bristol-Myers Squibb, Merck, WindMIL, Novartis, Glaxo Smith Kline, Genentech, Astra Zeneca, EMD Serono, Incyte, Dai Ichi, and Celldex. M. Sznol is a consultant/advisory board member for Biodesix. R.J. Sullivan is a consultant/advisory board member for Biodesix, Merck, Novartis, Amgen, and Takeda. H.M. Kluger reports receiving a commercial research grant from Merck and is a consultant/advisory board member for Biodesix. P.A. Ascierto reports receiving commercial research grants from Bristol-Myers Squibb, Roche-Genentech, and Array; is a consultant/advisory board member for Bristol-Myers Squibb, Roche-Genentech, Genmab, Medimmune, MSD, Array, Novartis, Amgen, Merck Serono, Pierre Fabre, Incyte, and NewLink Genetics. C. Oliveira, H. Roder, J. Roder, J. Grigorieva, S.G. Asmellash, and K. Meyer own stock options and patents with Biodesix. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: J.S. Weber, H.M. Kluger, H. Roder
Development of methodology: C. Oliveira, K. Meyer, S.G. Asmellash, J. Roder, H. Roder
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J.S. Weber, M. Sznol, R.J. Sullivan, G. Boland, H.M. Kluger, R. Halaban, P.A. Ascierto, M. Capone, S.G. Asmellash, H. Roder
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J.S. Weber, H.M. Kluger, P.A. Ascierto, C. Oliveira, K. Meyer, J. Grigorieva, J. Roder, H. Roder
Writing, review, and/or revision of the manuscript: J.S. Weber, M. Sznol, R.J. Sullivan, H.M. Kluger, R. Halaban, P.A. Ascierto, C. Oliveira, J. Grigorieva, S.G. Asmellash, J. Roder, H. Roder
Study supervision: J.S. Weber
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.