Abstract
Purpose: The aim of this study was to identify and validate novel predictive and/or prognostic serum proteomic biomarkers in patients with epithelial ovarian cancer (EOC) treated as part of the phase III international ICON7 clinical trial.
Experimental Design: ICON7 was a phase III international trial in EOC which showed a modest but statistically significant benefit in progression-free survival (PFS) with the addition of bevacizumab to standard chemotherapy. Serum samples from 10 patients who received bevacizumab (five responders and five nonresponders) were analyzed by mass spectrometry to identify candidate biomarkers. Initial validation and exploration by immunoassay was undertaken in an independent cohort of 92 patients, followed by a second independent cohort of 115 patients (taken from across both arms of the trial).
Results: Three candidate biomarkers were identified: mesothelin, fms-like tyrosine kinase-4 (FLT4), and α1-acid glycoprotein (AGP). Each showed evidence of independent prognostic potential when adjusting for high-risk status in initial (P < 0.02) and combined (P < 0.01) validation cohorts. In cohort I, individual biomarkers were not predictive of bevacizumab benefit; however, when combined with CA-125, a signature was developed that was predictive of bevacizumab response and discriminated benefit attributable to bevacizumab better than clinical characteristics. The signature showed weaker evidence of predictive ability in validation cohort II, but was still strongly predictive considering all samples (P = 0.001), with an improvement in median PFS of 5.5 months in signature-positive patients in the experimental arm compared with standard arm.
Conclusions: This study shows a discriminatory signature comprising mesothelin, FLT4, AGP, and CA-125 as potentially identifying those patients with EOC more likely to benefit from bevacizumab. These results require validation in further patient cohorts. Clin Cancer Res; 19(18); 5227–39. ©2013 AACR.
The benefit seen from the addition of bevacizumab to standard chemotherapy in the ICON7 trial was relatively modest in the overall population studied in terms of progression-free survival (PFS; 1.5 months). When additional clinical factors were used to identify the population of patients at high risk of progression, the benefit specifically to this group increased to 3.6 months. Several biomarkers predictive of bevacizumab response have been reported in a number of different tumors, with promising preliminary results. This translational research study aligned to the ICON7 trial identified a biomarker signature, which seems to assist in identifying patients with ovarian cancer most likely to benefit from bevacizumab treatment.
Introduction
Epithelial ovarian cancer (EOC) causes significant morbidity and mortality worldwide with an estimated 21,880 new cases and 13,850 attributable deaths in 2010 (1). Standard treatment for many years has involved debulking surgery combined with systemic platinum-based chemotherapy (2). However, despite EOC being very chemosensitive, most patients subsequently develop recurrent disease and die. Novel drug targets include VEGF: progression of EOC is commonly VEGF-driven, and increased VEGF expression is associated with more advanced disease, ascites, and a worse overall prognosis (3, 4). Preclinical and early clinical data supported the further investigation of bevacizumab, a monoclonal antibody to VEGF, in the treatment of EOC.
ICON7 was a two-arm phase III international randomized open-label trial of 1,528 women with high-risk early- or advanced-stage EOC, comparing six cycles of standard chemotherapy (carboplatin and paclitaxel) to six cycles of chemotherapy plus the addition of concurrent and maintenance bevacizumab. Initial results showed that bevacizumab improved progression-free survival (PFS), but although statistically significant, over the entire trial population the absolute PFS benefit was only 1.5 months (5). These data, along with the results of a similar trial, GOG218 (6), have led to bevacizumab being licensed in combination with carboplatin and paclitaxel in the first-line treatment of advanced EOC.
Subset analysis within ICON7 showed increased benefit when the population was restricted to those patients at higher risk of disease progression [Federation Internationale des Gynaecologistes et Obstetristes (FIGO) stage IV disease or FIGO stage III disease with >1 cm residual disease after surgical debulking], with an increase in median PFS of 3.6 months and a trend toward increased overall survival. Identifying novel biomarkers to select patients with EOC who will derive most benefit from bevacizumab is important, and is also required for patients with glioblastoma, colorectal, lung, and renal cancer where bevacizumab is also licensed for use, as there are no predictive biomarkers in clinical use.
Patients participating in ICON7 were asked to donate tissue and longitudinal blood samples for use in future translational research projects. The ICON7 sample bank, with rigorous sample collection, processing, and storage and associated high-quality clinical data, provides an excellent opportunity to identify potential EOC prognostic and predictive biomarkers, and to conduct initial validation. The aim of this study was to use serum samples from the ICON7 sample bank to identify, and subsequently validate, candidate biomarkers relating to bevacizumab use in EOC.
Materials and Methods
Patients and samples
ICON7 patient recruitment, trial design, and outcomes have been published elsewhere (5). Following ethical approval, blood samples were obtained from ICON7 patients following a schedule dependent on their level of consent (Fig. 1A). Samples were collected into plain clot activator tubes and allowed to clot for 1 hour before centrifugation at 2,000 × g for 10 minutes at 20°C; serum was then aliquoted and stored at −80°C. This was conducted according to standard operating procedures, and samples were transferred to the central ICON7 sample bank in Leeds. Following approval of this study by the ICON7 Trial Management Group, 762 serum samples from 217 patients were used (all patients with baseline samples from the total of 226 who donated serum; Fig. 1A and B).
Proteomic biomarker discovery by LC/MS-MS
Serum samples from 10 patients within the ICON7 experimental arm were selected for biomarker discovery, grouped as 5 responders (complete and partial response) and 5 nonresponders (stable or progressive disease) defined by Response Evaluation Criteria in Solid Tumors (RECIST) and/or CA-125 after six cycles of treatment. Selection on this basis was used as PFS data were not available at that time. However, the median (range) PFS was 24.7 (12.0–25.8) months in the responder group and 12.8 (8.12–23.8) months in the nonresponder group. All patients had grade 3 serous tumors and the two groups were matched as closely as possible by age, FIGO stage, and surgical outcome (optimal or suboptimal debulking; Supplementary Table S1). Paired serum samples at time point 1 (baseline) and time point 4 (pre-cycle 2) from each of the selected patients were subjected to proteomic analysis by label-free mass spectrometry and candidate biomarkers of response selected (Fig. 1B).
For each sample, 200 μL of serum was filtered through 0.22 μm Spin-X filters (Corning) and 150 μL then depleted of the 14 most-abundant proteins using a Multiple Affinity Removal System (MARS) human 14 column (7), leaving an average of 7% of the total protein in the samples. Samples were concentrated using 15 mL 10 kDa molecular weight cut off (MWCO) filters (Millipore), desalted using 2 mL 7 kDa MWCO Zeba spin-desalting columns (Thermo Scientific), and 150 μg protein was digested with trypsin using filter-aided sample preparation (FASP; refs. 7, 8).
Peptides (triplicate injections each of 2 μg) were separated by online reversed-phase capillary liquid chromatography (LC) and analyzed by electrospray tandem mass spectrometry (MS-MS) using a Thermo Orbitrap Velos (9). Data were searched against an International Protein Index (IPI 3.80) human protein sequence database with MaxQuant 1.1.1.36 software (10) and the Andromeda search engine (11). The initial maximal mass tolerance for the mass spectrometry scan was set to 10 ppm, and the fragment mass tolerance for MS-MS was set to 0.5 Th. The maximum protein and peptide false discovery rates were set to 0.01. Label-free quantitation was conducted with MaxQuant.
Results were subjected to initial exploratory data analysis using principal component analysis and hierarchical clustering considering the whole profile together to identify gross patterns and identify potential outliers. Each protein was then examined separately to identify differences in protein abundance between responders and nonresponders at time points 1 and 4 (Mann–Whitney tests) and to identify differences between these time points (Wilcoxon signed-rank tests). False discovery rate was estimated by the q-value method.
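The per-protein comparison of responders and nonresponders can be illustrated with a minimal sketch of an exact two-sided Mann–Whitney test, written here in plain Python. The intensity values are hypothetical and the function names are illustrative only; this is not the authors' analysis code, which was run in R. With five patients per group, even complete separation of the two groups gives an exact two-sided P of 2/252 (about 0.008), which shows how little resolution a discovery set of this size allows.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x versus y: each pair with x > y counts 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact p-value obtained by enumerating every relabelling of the pooled values."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    centre = n * m / 2.0  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - centre)
    hits = 0
    total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        group_x = [pooled[i] for i in idx]
        group_y = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        # Count relabellings at least as extreme as the observed split.
        if abs(mann_whitney_u(group_x, group_y) - centre) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical label-free intensities for one protein (arbitrary units).
responders = [8.1, 7.4, 9.0, 8.6, 7.9]
nonresponders = [5.2, 6.1, 4.8, 6.6, 5.9]

p_value = exact_two_sided_p(responders, nonresponders)
print(round(p_value, 4))  # 0.0079, i.e. 2/252
```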
Biomarker validation by immunoassay
The mass spectrometry results from the discovery analysis were confirmed by immunoassay. Validation and exploration of initial findings was undertaken using a cohort of 92 patients [627 longitudinal samples (Supplementary Table S2): validation cohort I], with further validation in an additional 115 patients (baseline samples only: validation cohort II). Samples in validation cohort I were selected to ensure similar numbers in each of the patient treatment and outcome groups described (limited by available assessable patients in the nonresponder groups). Validation cohort II consisted of all remaining baseline samples from the biobank. Validation cohorts included patients from both arms of the trial (to enable markers differing specifically in response to bevacizumab and not just chemotherapy to be distinguished; Fig. 1B), were independent of each other and the discovery set, and were representative of the trial population (Table 1). Patients in the validation cohorts were separated around the median PFS (18 and 16.8 months in validation cohort I and II, respectively) into early and late progressors (includes nonprogressors). This differed from the RECIST-based criteria used for selection of patients for biomarker discovery as at the later time of validation sample analysis, the ICON7 trial PFS data, which is more clinically relevant than response or nonresponse, had become available.
An ELISA for soluble fms-like tyrosine kinase-4 (FLT4; VEGFR3) was developed (Supplementary Methods) using the human sVEGFR3 DuoSet (R&D Systems) but with a substituted standard due to validation issues. Soluble mesothelin-related peptides (sMRP) were quantified using the U.S. Food and Drug Administration (FDA)–approved MESOMARK assay (Fujirebio Diagnostics) and α1-acid glycoprotein (AGP) and CA-125 concentrations were measured using routine clinical assays (Behring Nephelometric Analyzer II, Siemens and Siemens ADVIA Centaur CA-125 II assay, respectively) at the Leeds General Infirmary (Leeds, United Kingdom).
Statistical analysis
Results were examined for predictive use, either on the basis of baseline concentrations or patterns of longitudinal change of the proposed protein biomarkers. All analyses were conducted according to REMARK (REporting recommendations for tumor MARKer prognostic studies) criteria (12). Associations between the four biomarkers under study were visualized using a correlogram based on simple linear regression and Spearman rank correlation coefficient. Associations with demographic and clinical characteristics at baseline were investigated using Fisher exact test for categorical variables and Spearman rank correlation coefficient for continuous variables. Corrections for multiple testing were not applied because of the explorative nature of the study.
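As an illustration of the Spearman rank correlation used for these associations, the following minimal sketch ranks two sets of baseline values (ties share an average rank) and computes the Pearson correlation of the ranks. The marker values are hypothetical, and the helper names are not taken from the study.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def spearman_rho(x, y):
    """Spearman rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical baseline concentrations for six patients (arbitrary units).
ca125 = [35.0, 410.0, 120.0, 980.0, 64.0, 250.0]
mesothelin = [1.2, 3.9, 2.0, 3.1, 0.9, 2.6]

rho = spearman_rho(ca125, mesothelin)
print(round(rho, 3))  # 0.886
```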
Estimates of intra- and intersubject variation for each of the selected proteins were obtained using linear mixed effects models (13) considering biomarker concentrations from pretreatment time points 1 and 2. Longitudinal line and box plots were used to assess trends in biomarker concentration over time to inform salient nonparametric significance testing of differences between time points. A linear mixed effect model was also used to investigate the rate of change of FLT4 concentration over time in patients in the experimental arm.
PFS was calculated from the date of randomization to the date of disease progression or death, whichever occurred first. Patients who were alive without disease progression were censored as of the date of their last assessment, with a cutoff date of November 30, 2010. Cox proportional hazards models were used to estimate the association between biomarker concentrations and PFS, the Kaplan–Meier method was used to estimate survival functions and the log-rank test was used to compare survival functions. Markers were initially considered as continuous variates and then as binary factors using cutoff points derived by maximizing Harrell concordance (C) index (for visualization of effects) in validation cohort I and subsequently by predictive index-defined cutoff points in the combined data. The predictive potential of markers was assessed using interaction terms for treatment arm and biomarker concentration in Cox proportional hazards models.
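The Kaplan–Meier estimate and the median PFS derived from it can be sketched as follows. This is a minimal plain-Python illustration with hypothetical follow-up times, not the R code used in the study. It also shows why a median may be reported as "not attained": if the estimated survival function never falls to 0.5, no median exists within the follow-up period.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times: follow-up in months; events: 1 = progression or death, 0 = censored.
    Patients censored at time t are still counted as at risk at t.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = sum(1 for tt, e in data if tt == t and e == 1)
        leaving = sum(1 for tt, _ in data if tt == t)
        if failures:
            survival *= 1.0 - failures / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None if not attained."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical PFS data (months) for eight patients.
times = [6, 9, 12, 12, 15, 18, 20, 24]
events = [1, 1, 1, 0, 1, 0, 1, 0]

curve = kaplan_meier(times, events)
print(median_survival(curve))  # 15

# With few events the curve stays above 0.5 and the median is not attained.
print(median_survival(kaplan_meier([10, 20, 30, 40], [1, 0, 0, 0])))  # None
```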
An additive marker index with scale 0–4 was calculated from the individual markers. Further details of marker index construction are provided in Supplementary Methods. Briefly, the optimum cutoff point in the index was identified by using Cox proportional hazards models [by examining the likelihood ratio test (LRT) on the interaction term for treatment arm and each level of the index], with indices below cutoff points considered as signature-negative and those above as signature-positive. Internal validation of the optimal model was conducted using R = 1,000 bootstrap resamples to estimate the optimism in model predictive ability. Proportional hazards assumptions were tested for each model using tests based on Schoenfeld residuals (14). All analyses were initially conducted in validation cohort I and replicated in validation cohort II and the cohorts combined, with the exception of longitudinal analysis which was conducted only in validation cohort I. All statistical tests were two-sided and all analyses were undertaken in the R environment for statistical computing (15).
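One plausible reading of the additive index is sketched below: each of the four markers contributes one point when its baseline concentration exceeds a marker-specific cutoff, and the points are summed to give a score from 0 to 4. This scoring rule is an assumption made for illustration, as the actual construction is described only in the Supplementary Methods, and the cutoff values shown are hypothetical placeholders. The threshold separating signature-negative from signature-positive patients (index < 3 versus index ≥ 3) is the one reported in the Results.

```python
# Hypothetical cutoffs for illustration only; the study's values are not given in the main text.
CUTOFFS = {"mesothelin": 2.0, "FLT4": 60.0, "AGP": 1.5, "CA125": 300.0}

# Index >= 3 is signature-positive, as reported in the Results.
SIGNATURE_THRESHOLD = 3


def marker_index(baseline):
    """Count how many of the four markers exceed their cutoff (score 0 to 4).

    Assumes one point per marker above its cutoff; this rule is an assumption.
    """
    return sum(1 for name, cutoff in CUTOFFS.items() if baseline[name] > cutoff)


def signature_status(baseline):
    """Classify a patient as signature-positive or signature-negative."""
    return "positive" if marker_index(baseline) >= SIGNATURE_THRESHOLD else "negative"


# Hypothetical baseline measurements for two patients (arbitrary units).
patient_a = {"mesothelin": 3.4, "FLT4": 72.0, "AGP": 1.1, "CA125": 850.0}
patient_b = {"mesothelin": 0.8, "FLT4": 65.0, "AGP": 1.0, "CA125": 120.0}

print(marker_index(patient_a), signature_status(patient_a))  # 3 positive
print(marker_index(patient_b), signature_status(patient_b))  # 1 negative
```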
Results
Biomarker discovery phase
Serum profiling of the discovery cohort identified a total of 352 proteins with at least two significant and one unique peptide in the total dataset (www.proteomics.leeds.ac.uk). From these, four proteins were selected on the basis of differences between responders and nonresponders, either at time point 1 or longitudinally: mesothelin, FLT4/VEGFR3, and AGP1 and -2. In the discovery cohort, mesothelin did not differ at baseline between responders and nonresponders, but decreased in both groups, and was not detected in responders at time point 4 (Supplementary Fig. S1). FLT4 was only detected in two responders and at time point 1 only. Both these proteins were identified by only two peptides and therefore differences were potentially due to under-sampling during analysis, but given their biologic relevance they were investigated further. Subsequent ELISA measurements of serum samples from the 10 patients in the discovery set confirmed the initial findings (Supplementary Fig. S1 and Fig. 2A). AGP1 and -2 were higher in 3 responders at time point 1 and decreased by time point 4. Although nominally removed by immunodepletion, their presence could reflect saturation of the column, with the residual amount detected being proportional to the starting concentration. The mass spectrometry results were confirmed using an immunoassay which detects both AGP1 and -2, which are referred to hereafter as AGP.
Biomarker validation cohorts and clinical characteristics/associations
Biomarker validation cohort characteristics.
The three candidate biomarkers together with CA-125 were examined in a further 207 patients (n = 92 and 115 in validation cohort I and II, respectively) from both arms of the trial in relation to disease progression. Baseline clinical and pathologic characteristics and outcomes in the cohorts were similar to those in the overall trial population (Table 1; ref. 5). Median PFS was longer in the experimental arm as in the trial, but was not significantly different between trial arms in either validation cohort (validation cohort I, P = 0.860; Supplementary Fig. S2A; validation cohort II, P = 0.664; log-rank test) due to the reduced power in these smaller cohorts. Treatment effect on PFS was also similar in both cohorts to that seen in the overall trial population (HR, 0.956, validation cohort I; HR, 0.923, validation cohort II, cf. 0.81) with no evidence of being significantly different (validation cohort I, P = 0.340; validation cohort II, P = 0.664; Wald test). The lack of proportional hazards seen in the trial arms (5) was also apparent in the biomarker validation cohort, with comparable violation from proportionality (Supplementary Fig. S2B).
Association of biomarkers with clinical characteristics.
Examination of candidate biomarkers (Supplementary Fig. S3) and clinical characteristics at baseline in validation cohort I (Table 2) showed an association between mesothelin and FLT4 and patients at high risk of progression (P = 0.002 and 0.043, respectively), with patients at higher risk of progression having higher concentrations. These associations were confirmed in validation cohort II (Supplementary Table S3) and the combined validation cohort (Supplementary Table S4) with all candidate markers and CA-125 showing an association with high-risk status (all P < 0.02). Mesothelin and AGP were associated with FIGO stage in validation cohort I (P = 0.008 and 0.003, respectively), confirmed in validation cohort II (P < 0.001 and 0.001; Supplementary Table S3) and overall (P < 0.001 and 0.001; Supplementary Table S4). CA-125, as expected, was associated with FIGO stage (P < 0.001) and histologic subtype (P = 0.052) with highest concentrations in more advanced stages and serous tumors, confirmed in validation cohort II and overall (Supplementary Tables S3 and S4). Baseline CA-125 concentrations were higher in patients in the experimental arm of validation cohort I (P = 0.043) although ranges overlapped. This is likely an artifact of selection, supported by no such significant differences being seen in validation cohort II or overall. The markers showed no strong positive or negative associations (although correlations were significantly different from zero, all P < 0.01 in validation cohort I; Supplementary Fig. S4). Similar results were observed in validation cohort II and the combined validation cohort (data not shown).
Biomarker validation—cohort I
Longitudinal changes in biomarkers during treatment.
The inclusion in the validation group of 61 patients who had two baseline samples (ranging from 0 to 32 days apart) allowed for an estimate of “normal” intrapatient variation, important when considering biomarker changes in individuals during treatment. Intraindividual coefficients of variation (CV%) were 22.6% for mesothelin, 14.6% for FLT4, 12.0% for AGP, and 16.3% for CA-125.
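For intuition, intraindividual variation can be approximated from paired baseline samples with the standard duplicate-measurement formula, in which the within-subject standard deviation is the square root of the summed squared differences divided by twice the number of pairs. Note that this is a simpler stand-in for the linear mixed effects models actually used in the study, and the paired concentrations below are hypothetical.

```python
from math import sqrt


def within_subject_cv(pairs):
    """CV% from duplicate measurements.

    s_within = sqrt(sum(d^2) / (2n)), expressed as a percentage of the grand mean.
    """
    n = len(pairs)
    sum_sq_diff = sum((a - b) ** 2 for a, b in pairs)
    s_within = sqrt(sum_sq_diff / (2 * n))
    grand_mean = sum(a + b for a, b in pairs) / (2 * n)
    return 100.0 * s_within / grand_mean


# Hypothetical concentrations at time points 1 and 2 for four patients.
pairs = [(100.0, 110.0), (80.0, 70.0), (120.0, 130.0), (95.0, 85.0)]

cv_percent = within_subject_cv(pairs)
print(round(cv_percent, 2))  # 7.16
```

A change in an individual patient during treatment is more convincing when it clearly exceeds this baseline variation.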
In validation cohort I, mesothelin differed in concentration between time points 1 and 4 in the early progressor subset of the experimental arm (P < 0.001; Fig. 2B), with slight but not statistically significant downward trends in the late progressor groups for each treatment arm. Comparing the difference in change between these time points in early and late progressors separately showed evidence of a significantly different change in the experimental arm (P = 0.003) not observed in the standard arm (P = 0.945). The most striking decreases in mesothelin concentration were observed for patients at a high risk of progression, although they generally had higher baseline concentrations (Fig. 2C).
Patients in the experimental (bevacizumab) arm of the trial had generally decreasing concentrations of FLT4 between time points 1 and 4, irrespective of the timing of disease progression (Fig. 2D). This decrease was observed during treatment with plateauing during follow-up. At time point 10 (when 41 of 45 patients with available samples had experienced disease progression), a rebound increase in concentration was apparent. Considering the longitudinal mean profile and using a simple trend variable for time point in a linear mixed effects model, there was a significant decrease of 2.49 ng/mL per time point (time points, 1–9; P < 10⁻¹⁰). However, early progressors had higher concentrations overall as compared with late progressors even when taking this longitudinal effect into account (increased concentration of 9.99 ng/mL in early progressors; P = 0.01), indicating a potential prognostic effect.
Concentrations of AGP also tended to reduce throughout the period of treatment, particularly between time points 1 to 6 (during systemic therapy; Supplementary Fig. S5A), with patients at higher risk of progression showing the greatest reductions (Supplementary Fig. S5B).
Association between candidate biomarker concentrations at baseline and PFS.
When using the 88 time point 1 samples for which analyte measurements were available, each of the candidate biomarkers and CA-125 showed evidence of prognostic potential upon univariate analysis in Cox proportional hazards models (Table 3, validation cohort I, column 1, all P < 0.05), and independently of risk of progression status (Table 3, column 2) upon multivariable analysis. Similar results were observed when stratifying analysis by treatment arm (Table 3, column 3–4) and adjusting for high risk of progression status with the exception of mesothelin in the standard arm. This evidence of independent prognostic potential for FLT4 and AGP warrants further investigation in a larger cohort and was also evident in Kaplan–Meier survival functions for optimal cutoff points in biomarker concentration (Supplementary Fig. S6). None of these models showed evidence of violating the proportional hazards assumption.
Predictive potential of individual biomarkers.
All three candidate biomarkers and CA-125 showed evidence of prognostic potential (Table 3). However, to determine whether a marker has predictive potential, a contrasting effect in treatment arms must be identified (usually through a significant treatment/biomarker interaction term in a Cox proportional hazards model). Cox proportional hazards models with terms for marker concentration, treatment, and the relevant interaction term gave nonsignificant Wald tests on the interaction term, indicating that individually the markers showed no evidence of predictive potential (Supplementary Fig. S6, column d). However, examination of plots for treatment effects separated by optimal cutoff points (derived by maximizing Harrell C-index) for biomarker concentrations (Supplementary Fig. S6, columns b and c) showed some evidence of nonadditive effects, suggesting further investigation to combine the effect of biomarkers in a predictive score.
A predictive biomarker index to inform treatment decisions.
A biomarker index was constructed as described and when Cox proportional hazards models with index, experimental arm, and interaction terms were estimated for the dichotomized index at each potential cutoff point (no patients had index = 0, implying dichotomization around 1, 2, and 3), the optimum model had a significant interaction term (P = 0.006, LRT; full model shown in Supplementary Table S5) indicating potential predictive ability. When patients were separated at this optimum cutoff point (index < 3; index ≥ 3), Cox's proportional hazards models for each treatment arm yielded significant results with contrasting beneficial treatments [signature-negative (n = 35): HR, 2.38; 95% confidence interval (CI), 0.97–5.81; P = 0.058 and signature-positive (n = 53): HR, 0.51; 95% CI, 0.28–0.94; P = 0.031; Supplementary Table S5]. Kaplan–Meier estimates of survival functions highlighted the difference between treatment arms in each signature group (Fig. 3A). In the signature-negative group, patients responded better to the standard therapy (median PFS standard arm not attained, but >42 months; experimental arm, 22.8 months; P = 0.051; log-rank test), whereas in the signature-positive group, the median PFS for patients on the experimental arm was 4.1 months longer (median PFS standard arm, 13.8 months; experimental arm, 17.9 months; P = 0.028; log-rank test). Constructing the index similarly excluding CA-125 resulted in a significant interaction in the Cox proportional hazards model (P = 0.035), but an index with lesser predictive ability.
The biomarker index showed weak evidence of prognostic potential overall (P = 0.099; log-rank test; Supplementary Fig. S7A) representing the combined and conflicting effects in each arm (Supplementary Fig. S7B). Considering age, FIGO stage, histology, Eastern Cooperative Oncology Group (ECOG) performance status, surgical outcome, and those patients classified clinically to be at higher risk of progression, there was no significant difference in the composition of the groups which could explain the better prognosis in the signature-negative, standard arm (age P > 0.05, Kruskal–Wallis test; all others P > 0.05, Fisher exact test).
The final Cox proportional hazards model for the biomarker index was internally validated using bootstrap resamples to estimate the optimism in the model (16). The final model had an estimated Harrell C-index of 0.62 and the optimism (O) was estimated to be 0.05 from 1,000 resamples. As the estimated C–O was more than 0.5, this implies that the final model has predictive ability. The second baseline sample at time point 2 also allowed the evaluation of the model in a smaller subset of patients (N = 58 with 38 PFS events). The results were similar with coefficients of the same sign and similar magnitude, but with increased Wald test P values (all P < 0.150) most likely caused by the lack of power given the smaller cohort.
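The concordance measure behind this internal validation can be sketched as follows. Harrell C-index is the proportion of usable patient pairs in which the patient with the higher predicted risk progresses first; a value of 0.5 corresponds to chance. The small data set and risk scores are hypothetical, and the bootstrap refitting that yields the optimism estimate is not reproduced here; only the final subtraction uses the figures reported in the text.

```python
def harrell_c(times, events, risk_scores):
    """Fraction of usable pairs in which the higher-risk patient progresses first.

    A pair (i, j) is usable when patient i has an observed event strictly before
    the follow-up time of patient j. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / usable


# Hypothetical PFS times (months), event indicators, and model risk scores.
times = [5, 8, 12, 15, 20]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.6, 0.5, 0.1]

print(harrell_c(times, events, risk))  # 0.75

# Optimism correction with the values reported in the text.
apparent_c = 0.62
optimism = 0.05
corrected_c = apparent_c - optimism
print(round(corrected_c, 2), corrected_c > 0.5)  # 0.57 True
```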
Biomarker validation—cohort II
Association between candidate biomarker concentrations at baseline and PFS.
The univariate associations seen between candidate biomarkers and PFS observed in validation cohort I were confirmed in validation cohort II (Table 3, validation cohort II, P ≤ 0.05). Multivariable associations overall and in each trial arm also showed similar effects, but were not statistically significant (P > 0.05).
Validating the biomarker index.
The biomarker index was recalculated using the 115 time point 1 samples in validation cohort II with the same methods and cutoff points as outlined earlier. The subsequent model showed further evidence for a predictive effect for the biomarker index (interaction term from the LRT; P = 0.060; full model shown in Supplementary Table S5). Separating at the optimal cutoff point, there was further evidence from the Cox proportional hazards models of contrasting benefit in each treatment arm [signature-negative (n = 21): HR, 3.64; 95% CI, 0.80–16.57; P = 0.095 and signature-positive (n = 94): HR, 0.76; 95% CI, 0.47–1.25; P = 0.283]. Kaplan–Meier estimates of median survival indicated differences between treatment arms in each signature group (Fig. 3B) with patients in the signature-negative standard arm responding better than patients in the experimental arm of the same signature group (median PFS in the standard arm, 36.3 months; experimental arm, 17.9 months) with the converse result seen in the signature-positive group (median PFS in the standard arm, 10.9 months; experimental arm, 17.9 months).
Biomarker validation—combined cohorts I and II
Association between candidate biomarker concentrations at baseline and PFS.
Using the data from 203 time point 1 samples from both validation cohorts allowed a more powerful assessment of prognostic potential of the proteins. Significant univariate associations were again seen (Table 3, combined data, P ≤ 0.003). Upon multivariate analysis, the candidate markers and CA-125 were independently prognostic overall and in each arm separately (P ≤ 0.045), the exception being FLT4 in the experimental arm (P = 0.183).
Further validation of biomarker index.
The biomarker index was recalculated using time point 1 samples from the 207 patients from validation cohorts I and II combined, using the same methods and cutoff points as described earlier. The resultant model had a significant interaction term from the LRT (P = 0.001; full model shown in Supplementary Table S5) and Kaplan–Meier survival estimates show differences between treatment arm for both signature groups (Fig. 3C), again with the standard arm responding better in the signature-negative group (median PFS in the standard arm, 36.3 months; experimental arm, 20 months; P = 0.005; log-rank test) and the experimental arm responding better in the signature-positive group (median PFS in the standard arm, 12.4 months; experimental arm, 17.9 months; P = 0.040; log-rank test).
Further analysis of the candidate markers in the combined validation cohorts (with increased statistical power) suggested that combined mesothelin and AGP may perform equally as well as the biomarker index (Supplementary Tables S6–S8 and Supplementary Fig. S8).
Discussion
Bevacizumab has shown efficacy in a number of different solid tumors (17–20), but identification of patients who would derive most benefit would allow more selective and appropriate usage, improving efficacy and cost-effectiveness. In this exploratory translational study focused on proteomic identification of relevant serum biomarkers in ICON7 patients, we identified three candidate biomarkers (FLT4, AGP, and mesothelin), which we then sought to validate in larger cohorts. Although baseline values of all candidate markers seemed to have prognostic value, unfortunately none were individually predictive of benefit from the addition of bevacizumab. However, combining these with CA-125 in a biomarker index, a clear predictive value was seen in validation cohort I with signature-positive patients shown to benefit from the addition of bevacizumab.
We conducted additional validation of the biomarker index in a further independent cohort of ICON7 patients. Using an approximation based on the LRT for interaction (21) with the HRs observed in validation cohort I for the signature negative (HR1, 2.38) and signature positive (HR2, 0.51) groups, a sample size of 15 PFS events in each group (signature-positive or -negative crossed with trial arm) would be required at a significance level of 5% and 80% power. However, post hoc power analysis based on the true number of events observed in each group and the reduced interaction effect (HR1, 2.58 and HR2, 0.75) meant that the sample had a power of only 43%. In some part, this explains the disparate significance levels for the interaction effect in validation cohorts I and II as a larger and more matched cohort would have been more appropriate.
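The dependence of power on the number of events and on the size of the interaction can be illustrated with a common normal approximation, in which the variance of each log HR is taken as the sum of the reciprocals of the event counts in the two arms. This is a sketch under that assumption and is not necessarily the approximation of ref. 21; the function name and the equal allocation of events across the four groups are illustrative choices. With the HRs from validation cohort I and 15 events per group it gives a power of about 0.85, in the same region as the 80% target quoted above, and a smaller interaction effect lowers the power at the same number of events.

```python
from math import log, sqrt
from statistics import NormalDist


def interaction_power(hr_negative, hr_positive, events_per_group, alpha=0.05):
    """Approximate power of a two-sided test for a treatment-by-signature interaction.

    Assumes var(log HR) = 1/d_standard + 1/d_experimental in each signature group,
    with the same number of events in all four groups. The negligible contribution
    of the opposite tail is ignored.
    """
    effect = abs(log(hr_negative) - log(hr_positive))
    standard_error = sqrt(4.0 / events_per_group)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(effect / standard_error - z_alpha)


# HRs observed in validation cohort I, 15 PFS events in each of the four groups.
power_cohort_one_effect = interaction_power(2.38, 0.51, 15)
print(round(power_cohort_one_effect, 2))  # 0.85

# A weaker interaction (HRs 2.58 and 0.75) at the same event count, for comparison only.
power_weaker_effect = interaction_power(2.58, 0.75, 15)
print(power_weaker_effect < power_cohort_one_effect)  # True
```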
The identification of a biomarker index with potential clinical use shows the potential of proteomics. However, limitations need to be acknowledged. First, with practical limitations influencing sample number in the discovery phase, only patients on the experimental arm were included. Potential markers were then examined for predictive use by analyzing samples from patients in both arms in the subsequent validation phase. Ideally comparing both arms at the discovery phase would have allowed earlier highlighting of bevacizumab-specific response biomarkers. Second, although the initial immunoassay results on the discovery samples did reproduce the mass spectrometry findings, not all findings replicated independently within the larger validation cohorts, presumably due to biologic variation and the small size of the discovery set. For example, changes seen between time points 1 and 4 for the various proteins were reproduced in the validation cohort. However, the finding of higher baseline mesothelin being associated with poorer outcome/response in the validation set was not apparent in the mass spectrometry data.
Selection of candidate biomarkers for further validation included consideration of biologic relevance. Mesothelin is overexpressed in EOC, involved in tumor progression (22, 23) and associated with chemoresistance and a poorer overall prognosis (24). It has also been proposed as a potential biomarker for EOC identification (25, 26), and in mesothelioma as a putative biomarker to monitor response to treatment (27). The association we observed between baseline mesothelin and risk of EOC progression may relate to increased disease burden. Mesothelin is synthesized as a precursor and cleaved to form soluble megakaryocyte-potentiating factor and membrane-bound mesothelin. The two peptides identified in the mass spectrometry study were both in the mesothelin part of the molecule and are present in variants including the soluble forms. The immunoassay used detects forms including variant 1, most frequently found in ascitic fluid due to shedding from EOC cells (28), and variant 3.
FLT4 (VEGFR3) is a receptor for VEGF-C and -D, and plays a key role in angiogenesis, being implicated in breast cancer for example (29). FLT4 signaling leads to increased VEGF-C and -A. The reduction of soluble FLT4 here only in patients treated with bevacizumab is presumably due to the effect of bevacizumab on the VEGF-related angiogenic pathway. However, as it is not independently predictive of response, nonresponders may have alternative angiogenic pathways predominating. Similar changes are seen in serum VEGFR2 in patients with renal cancer treated with sorafenib, an antiangiogenic tyrosine kinase inhibitor (30).
AGP is an acute phase protein which is elevated in several cancers (31, 32), and although not previously linked with EOC, it has been reported to have proangiogenic properties and to support the proangiogenic effect of VEGF-A (33).
The predictive ability of the biomarker index is promising, and a strength of this study is its derivation from a major randomized controlled trial, which allowed a comparison between the standard and experimental arms and provided high-quality clinical data and stringent sample processing (34). The signature requires further validation, including eventual prospective validation within a clinical trial (for example, randomizing to bevacizumab based on the biomarker index), and investigation of its relevance in other solid tumors treated with bevacizumab should be considered. Circulating protein and cellular biomarkers, single-nucleotide polymorphisms (SNP), expression arrays, and clinical correlates such as hypertension have been examined for potential association with bevacizumab effectiveness (reviewed in ref. 35) but no definitive biomarkers have yet been found. Higher baseline VEGF-A is the most promising as potentially predictive of bevacizumab benefit in breast cancer (AVADO; ref. 36), gastric cancer (AVAGAST; ref. 37), and pancreatic cancer (AViTA; ref. 38). However, in some studies it has appeared only prognostic rather than predictive, for example in colorectal cancer (AVF2107g; ref. 39) and renal cancer (AVOREN; ref. 38), possibly reflecting study design, specific disease type, or assay. Other promising markers include delta-like ligand-4, VEGF-C, neuropilin (40), and a VEGFR2 SNP (41) in breast cancer, high circulating intercellular adhesion molecule (ICAM) in non–small cell lung cancer (although as there was still benefit in the low ICAM group this seemed to be prognostic rather than predictive; ref. 42) and day 4 circulating endothelial progenitor cells and proportion of baseline CXCR4-positive circulating endothelial cells in colorectal cancer (43). Hypertension has also been proposed (44, 45), although again this has not been borne out in all studies (46).
Predictive biomarkers relating to bevacizumab in EOC are exploratory and limited to smaller phase II studies (reviewed in ref. 47). High tumor VEGF levels have been reported to be associated with early progression in patients treated with bevacizumab and erlotinib (48), whereas in patients treated with gemcitabine/oxaliplatin and bevacizumab (n = 19), increases in plasma placental growth factor and VEGFR2-expressing monocytes correlated with EOC outcomes (49). Examination of biomarkers relating to alternative angiogenic pathways (for example platelet-derived growth factor and fibroblast growth factor) found none to be of predictive value in a study of 106 patients with chemotherapy-resistant EOC treated with single-agent bevacizumab (50).
On the basis of the results from ICON7 and GOG218, bevacizumab is the first targeted therapy to be licensed for use in the first-line treatment of EOC, in combination with carboplatin and paclitaxel. Current clinical practice is variable, with some clinicians using it in all patients with advanced EOC, and others using it only in those clinically identified as at high risk of progression. There is clearly still a need to identify biomarkers predictive of response to bevacizumab, to enable avoidance of potential toxicities in those least likely to benefit and also to maximize the cost-effectiveness of this drug. Our study has identified a biomarker index which seems predictive of benefit from bevacizumab, but further validation of clinical use in a larger population, including assessment of the predictive ability of all possible biomarker combinations, is needed.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: F. Collinson, R.A. Craven, D.A. Cairns, G. Hall, G.C. Jayson, P.J. Selby, R.E. Banks
Development of methodology: F. Collinson, D.A. Cairns, T.C. Wind, M.P. Messenger, P.J. Selby, R.E. Banks
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): R.A. Craven, A. Zougman, N. Gahir, M.P. Messenger, S. Jackson, D. Thompson, J. Ledermann, G. Hall
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): F. Collinson, M. Hutchinson, R.A. Craven, D.A. Cairns, T.C. Wind, N. Gahir, M.P. Messenger, D. Thompson, G. Hall, G.C. Jayson
Writing, review, and/or revision of the manuscript: F. Collinson, M. Hutchinson, R.A. Craven, D.A. Cairns, T.C. Wind, M.P. Messenger, D. Thompson, J. Ledermann, G. Hall, G.C. Jayson, P.J. Selby, R.E. Banks
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): D.A. Cairns, A. Zougman, S. Jackson, D. Thompson, C. Adusei
Study supervision: P.J. Selby, R.E. Banks
Acknowledgments
The authors thank all patients and staff at the various sites who participated in this research, the MRC CTU, and Dr. Helene Hoegsbro Thygesen and Prof. Jenny Barrett for statistical discussions.
Grant Support
This research study was funded by Cancer Research UK (C2075/A7966). The ICON7 trial was sponsored by the Medical Research Council and managed internationally by the MRC Clinical Trials Unit in London with the clinical trial and sample collection funded by Hoffmann-La Roche. D.A. Cairns is supported by a UK Medical Research Council Career Development Fellowship (G0802416).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.