Abstract
Purpose: We analyzed the outcomes of single-agent phase II clinical trials in non–small cell lung cancer (NSCLC) to determine trial parameters that predicted clinical activity.
Experimental Design: Data on response rate (RR), progression-free survival (PFS), and overall survival (OS) from all English language, single-agent phase II trials in advanced/metastatic NSCLC indexed by PubMed (January 2000 through December 2009) were abstracted.
Results: A total of 143 single-agent phase II trials (7,701 patients) were identified. The median RR was 10%, PFS 2.8 months, and OS 7.6 months. RR and PFS correlated with OS (r = 0.46, P < 0.001 and r = 0.52, P < 0.001, respectively), and RR correlated with PFS (r = 0.61, P < 0.001). Treatment arms enriched for patients with molecular targets had a higher median RR (48.8% vs. 9.7%, P = 0.005), longer median PFS (6 vs. 2.8 months, P = 0.005), and longer median OS (11.3 vs. 7.5 months, P = 0.05) as compared with those of unselected patients. In multivariate analysis, only studies enriched for patients with molecular targets or including drugs that eventually gained FDA/EMA approval were associated with a higher RR and longer PFS and OS.
Conclusions: In phase II trials in NSCLC, RR and PFS correlated with OS. Studies enriched for patients with putative molecular drug targets were associated with higher therapeutic benefit as compared with those of unselected populations. Clin Cancer Res; 18(22); 6356–63. ©2012 AACR.
Phase II studies in non–small cell lung cancer (NSCLC) and other cancers are designed to test the efficacy of cancer therapies to identify compounds with promising activity that should be considered for testing in phase III randomized trials. Despite phase II trials being used frequently in oncology, their ability to delineate promising compounds for phase III trials is not straightforward as only approximately 34% of phase III oncology trials are successful. In this study, we attempted to identify specific variables associated with better outcomes of single-agent phase II studies in NSCLC.
Introduction
There is a well-established pathway of anticancer drug development that progresses from dose-finding phase I studies through phase II efficacy trials, to phase III randomized trials to determine benefit compared with accepted standard treatment regimens (1). Phase II cancer clinical trials are designed to identify the spectrum of antitumor activity of new therapies using the dose and schedule identified in phase I trials (2). Therapies deemed to have promising activity advance to phase III testing in which they are typically compared with standard-of-care. However, only a minority of new anticancer therapies tested in early-phase trials eventually meet the standards for marketing approval by the U.S. Food and Drug Administration (FDA) or European Medicines Agency (EMA; refs. 3, 4). Indeed, phase III oncology trials are successful only approximately 34% of the time (5).
Non–small cell lung cancer (NSCLC) accounts for 85% of all lung cancers (6). NSCLC is often diagnosed at an advanced stage and has a poor prognosis. Here, we report an analysis of the outcomes of single-agent phase II studies of NSCLC published within the past decade to delineate the specific variables associated with increased antitumor activity.
Materials and Methods
Identification of phase II studies and data abstraction
We searched PubMed for all single-agent phase II trials published throughout a 10-year period from January 2000 to December 2009. The search included the terms “phase II” and “cancer” or “carcinoma” and was limited to articles in English. From that broad search, we included all phase II studies of single-agent therapies in advanced/metastatic NSCLC.
The following data were extracted from each study: number of treatment arms, type of treatment, number of treating centers, trial design, enrichment for molecular targets (as defined by testing for the molecular target directly or enrichment for patient populations expected to have a high rate of the target), previous systemic therapies, primary endpoint, the year of publication, name of the journal, and its 5-year impact factor [2008 ISI Journal of Citation Reports (Thomson Scientific)]. If the 5-year impact factor was not available, we substituted the 2008 impact factor. We recorded FDA and/or EMA approval status of the study drugs at the time of our analysis.
From reported trial results, we abstracted the number of patients enrolled, number of patients treated, response rate (RR), median progression-free survival (PFS), and overall survival (OS), if available. If the RR was based on number of evaluable patients, then the RR was recalculated using the intent-to-treat principle. All data were abstracted per treatment arm.
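The recalculation described above can be illustrated with a minimal sketch. It assumes that the intent-to-treat denominator is the number of patients enrolled in the arm; the function name and the example counts are hypothetical and are not taken from any abstracted trial.

```python
def itt_response_rate(responders: int, enrolled: int) -> float:
    """Return the response rate (%) with all enrolled patients in the denominator.

    Assumption: "intent-to-treat" is taken here to mean every enrolled patient
    counts, whether or not the patient was evaluable for response.
    """
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    if not 0 <= responders <= enrolled:
        raise ValueError("responders must be between 0 and enrolled")
    return 100.0 * responders / enrolled


# Hypothetical arm: 40 patients enrolled, 32 evaluable, 8 responders.
rr_per_evaluable = 100.0 * 8 / 32        # rate as it might be reported
rr_itt = itt_response_rate(8, 40)        # rate after recalculation
print(rr_per_evaluable, rr_itt)
```

Because the denominator can only grow, the recalculated rate is never higher than the rate reported per evaluable patient.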
Statistical analysis
RR was defined as a partial or complete response. Medians for RR, PFS, and OS were calculated. Statistical dispersion was measured by the interquartile range (IQR, the distance between the 75th and the 25th percentile). Assessment of independent samples was done using nonparametric tests, such as the Wilcoxon rank-sum test and Kruskal–Wallis one-way analysis of variance by ranks. Multivariate linear regression models were fit to assess the association between several variables and RR, PFS, and OS. The variables included type of treatment (chemotherapy vs. targeted), number of treating centers (single vs. multicenter), trial design (nonrandomized vs. randomized), enrichment for molecular targets, previous systemic therapies (treatment-naïve vs. pretreated), primary endpoint (RR vs. other), the year of publication (2000–2004 vs. 2005–2009), the journal's 5-year impact factor (<5 vs. ≥5), and FDA and/or EMA approval for NSCLC. The correlations between RR and PFS, RR and OS, and PFS and OS were estimated using the Spearman rank correlation coefficient. A P value of 0.05 or less was considered statistically significant. All statistical analyses were carried out using SPSS version 17 (SPSS) software.
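For readers who wish to reproduce the descriptive statistics outside SPSS, the following sketch shows one way to compute a median, an IQR, and a Spearman rank correlation with the Python standard library. It is not the authors' code: the per-arm values are hypothetical, the helper names are illustrative, and the quartile interpolation rule (`method="inclusive"`) is an assumption that may differ slightly from the SPSS default.

```python
from math import sqrt
from statistics import mean, median, quantiles


def iqr(values):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-arm results (not data from the abstracted trials).
rr_percent = [0.0, 4.0, 9.0, 12.5, 33.3, 50.0]
pfs_months = [1.9, 2.6, 2.8, 2.7, 4.7, 9.2]
print(median(rr_percent), iqr(rr_percent), spearman_rho(rr_percent, pfs_months))
```

Only arms that report both endpoints can contribute a pair to the correlation, so arms with a missing PFS or OS value would need to be dropped before calling `spearman_rho`.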
Results
Clinical trial designs and treatments
A total of 143 single-agent phase II studies in advanced or metastatic NSCLC published between January 2000 and December 2009 were identified in the PubMed database (Supplementary Appendix S1 and Table S1). These 143 studies had 163 treatment arms and enrolled a total of 7,701 patients. The median number of patients per treatment arm was 40 (range, 6–203). The details about abstracted trials are listed in Table 1.
Characteristics | Studies, N (%) | Arms, N (%) | Patients, N (%) | Median RR, % (IQR) | P | Median PFS, months (IQR) | P | Median OS, months (IQR) | P |
---|---|---|---|---|---|---|---|---|---|
Total studies | 143 (100) | 163 (100) | 7,701 (100) | 10 (15.8) | | 2.8 (2.3) | | 7.6 (3.6) | |
Agent | | | | | | | | | |
Chemotherapy | 91 (64) | 105 (64) | 4,817 (61) | 11 (14.4) | 0.65 | 3 (2) | 0.60 | 7.7 (3.3) | 0.28 |
Targeted therapy | 52 (36) | 58 (36) | 2,884 (37) | 8.2 (23.4) | | 2.7 (3.4) | | 7.5 (6.8) | |
Number of centers | | | | | | | | | |
Single-center | 44 (31) | 48 (29) | 1,833 (24) | 12.5 (22.5) | 0.08 | 3.3 (2.4) | 0.21 | 8.2 (4.6) | 0.27 |
Multicenter | 99 (69) | 115 (71) | 5,868 (76) | 9 (13.2) | | 2.8 (2) | | 7.5 (2.8) | |
Planned design | | | | | | | | | |
Randomized | 31 (22) | 49 (30) | 2,812 (37) | 10.9 (12.3) | 0.94 | 2.8 (2.3) | 0.25 | 7.3 (2.1) | 0.11 |
Nonrandomized | 112 (78) | 114 (70) | 4,889 (63) | 9.9 (20.6) | | 2.8 (2.2) | | 8 (4.3) | |
Enrichment for molecular targets | | | | | | | | | |
Enriched | 12 (8) | 14 (9) | 666 (9) | 48.8 (71) | 0.005a | 6 (6.8) | 0.005a | 11.3 (11.2) | 0.05a |
Not enriched | 131 (92) | 149 (91) | 7,035 (91) | 9.7 (13.6) | | 2.8 (1.9) | | 7.5 (3.2) | |
Prior therapies | | | | | | | | | |
Treatment-naïve | 60 (42) | 63 (39) | 2,819 (37) | 12.5 (22.2) | 0.12 | 3.5 (2.2) | 0.004 | 8.2 (3.6) | 0.002a |
Pretreated | 83 (58) | 100 (61) | 4,882 (63) | 9.3 (13.1) | | 2.7 (1.7) | | 7.1 (2.9) | |
Endpoint | | | | | | | | | |
RR | 115 (80) | 127 (78) | 5,468 (71) | 11 (17.5) | 0.47 | 2.9 (2.3) | 0.17 | 7.9 (3.6) | 0.02a |
PFS or OS | 21 (15) | 25 (15) | 1,638 (21) | 6 (12) | | 2.5 (2.1) | | 6.7 (3.7) | |
Other (PK, PS, symptoms, toxicity) | 7 (5) | 11 (7) | 595 (8) | 8 (9.7) | | 2.8 (1) | | 5.8 (3.5) | |
Year of publication | | | | | | | | | |
2000–2004 | 50 (35) | 55 (34) | 2,513 (33) | 12 (19) | 0.55 | 3 (2.1) | 0.42 | 7.6 (3.2) | 0.41 |
2005–2009 | 93 (65) | 108 (66) | 5,188 (67) | 9.5 (13.3) | | 2.8 (2.2) | | 7.5 (3.6) | |
Journal impact factor | | | | | | | | | |
≥5 | 49 (34) | 60 (37) | 3,666 (48) | 9 (12.7) | 0.57 | 2.7 (2.4) | 0.15 | 7.6 (4.1) | 0.39 |
<5 | 94 (66) | 103 (63) | 4,035 (52) | 11 (18.8) | | 3 (2.1) | | 7.5 (3.4) | |
FDA/EMA approval | | | | | | | | | |
Yes | 26 (18) | 27 (17) | 1,530 (20) | 26.5 (41.9) | <0.001a | 3.7 (5.2) | 0.001a | 9.3 (6.7) | 0.002a |
No | 117 (82) | 136 (83) | 6,171 (80) | 9 (13.1) | | 2.8 (1.9) | | 7.5 (3.1) | |
Abbreviations: PK, pharmacokinetics; PS, performance status.
a: Results from univariate analysis confirmed in multivariate analysis.
Enrichment for molecular targets is associated with better treatment outcomes
Response rate.
Of the 143 trials with 163 treatment arms, the median RR was 10% (IQR, 15.8). A RR of 0% was reported in 23 (14%) treatment arms (Fig. 1).
The 12 studies that were enriched for the presence of molecular targets showed a higher median RR of 48.8% (IQR, 71) as compared with 9.7% (IQR, 13.6) in the remaining 131 studies with unselected patients (P = 0.005; Tables 1 and 2). Figures 2 and 3 show the reported RR in trials with enriched and unselected patient populations with respect to number of patients per treatment arm (Fig. 2A), year of publication (Fig. 2B), PFS (Fig. 3A), and OS (Fig. 3B).
Study (reference in Supplementary Appendix S1) | Method of enrichment | Drug, mechanism | Number of patients | RR (%) | PFS (mo) | OS (mo) |
---|---|---|---|---|---|---|
EGFR | ||||||
Yang and colleagues (141) | Enriched for EGFR mutations (East Asians, mostly nonsmokers with adenocarcinoma) | Gefitinib, EGFR kinase inhibitor | 106 | 50.9 | 5.5 | 22.4 |
Inoue and colleagues (59) | EGFR mutations | Gefitinib, EGFR kinase inhibitor | 30 | 66 | 6.5 | 17.8 |
Sequist and colleagues (111) | EGFR mutations | Gefitinib, EGFR kinase inhibitor | 34 (3 not treated) | 50 | 9.2 | 17.5 |
Tamura and colleagues (125) | EGFR mutations | Gefitinib, EGFR kinase inhibitor | 28 | 75 | 11.5 | Not reached |
Cappuzzo and colleagues (17) | EGFR FISH, pAKT, nonsmokers | Gefitinib, EGFR kinase inhibitor | 42 | 47.6 | 6.4 | Not reached |
Sunaga and colleagues (121) | EGFR mutations | Gefitinib, EGFR kinase inhibitor | 21 | 76 | 12.9 | Not reported |
Sutani and colleagues (122) | EGFR mutations | Gefitinib, EGFR kinase inhibitor | 27 | 78 | 9.4 | 15.4 |
Asahina and colleagues (5) | EGFR mutations | Gefitinib, EGFR kinase inhibitor | 16 | 75 | 8.9 | Not reached |
West and colleagues (137) | Bronchioloalveolar carcinoma | Gefitinib, EGFR kinase inhibitor | 136 | 10 | 3.5 | 13 |
Chen and colleagues (23) | East Asian | Gefitinib, EGFR kinase inhibitor | 36 | 33.3 | 4.7 | 9.5 |
HER2 | ||||||
Clamon and colleagues (29) | HER2 expression on IHC | Trastuzumab, HER2 monoclonal antibody | 24 | 4 | 2.6 | 5.3 |
Pan-HER | ||||||
Janne and colleagues (61) | Expression of at least one ERBB family protein on IHC | CI-1033, panHER kinase inhibitor | 166 (3 arms) | 2–4 | 1.9 | 6–6.6 |
Because 10 of 14 arms with enriched populations used gefitinib, we compared those 10 gefitinib arms (476 patients) with 12 gefitinib arms with unselected patients (n = 764). Gefitinib arms with enriched populations showed a higher median RR of 58% (IQR, 41.8) as compared with 18% (IQR, 29.7) in the gefitinib arms with unselected patients (P = 0.001).
Treatment arms (n = 27) with drugs that eventually (at the time of analysis) gained FDA/EMA approval for advanced or metastatic NSCLC showed a higher median RR of 26.5% (IQR, 41.9) as compared with 9% (IQR, 13.1) in 136 treatment arms for other drugs (approved previously in NSCLC, approved previously for other indications, approved at the time of analysis for other indications, or not approved; P < 0.001; Table 1).
There were no differences in median RR in studies with chemotherapy versus targeted therapy; single institution versus multicenter studies; randomized versus nonrandomized studies; studies with treatment-naïve patients versus previously treated with a systemic therapy; studies published from 2000 to 2004 versus from 2005 to 2009; studies published in journals with an impact factor ≥5 versus <5. There were also no differences in the median RR in studies with different endpoints (Table 1).
In multivariate analysis, only studies enriched for patients with molecular targets (P < 0.001) and studies with drugs that eventually (at the time of analysis) gained FDA/EMA approval (P < 0.001) were associated with a higher RR.
Progression-free survival.
Of the 143 trials included in our analysis, median PFS was reported in 127 (78%) of 163 treatment arms. In these 127 treatment arms, the median PFS was 2.8 months (IQR, 2.3).
The studies that were enriched for the presence of putative molecular targets had a longer median PFS of 6 months (IQR, 6.8) as compared with 2.8 months (IQR, 1.9) in studies with unselected patients (P = 0.005).
In addition, gefitinib arms with enriched populations showed a longer median PFS of 7.7 months (IQR, 4.8) as compared with 2.8 months (IQR, 2.7) in the gefitinib arms with unselected patients (P = 0.003; Table 1).
Treatment arms with drugs that eventually (at the time of analysis) gained FDA and/or EMA approval for advanced or metastatic NSCLC also showed a longer median PFS of 3.7 months (IQR, 5.2) as compared with 2.8 months (IQR, 1.9) in other trials (approved previously for NSCLC, approved previously for other indications, approved at the time of analysis for other indications, or not approved; P < 0.001).
Treatment arms enrolling treatment-naïve patients showed a prolonged median PFS of 3.5 months (IQR, 2.2) as compared with 2.7 months (IQR, 1.7) in trials enrolling patients previously treated with a systemic therapy (P = 0.004).
There were no differences in the median PFS among studies with chemotherapy versus targeted therapy; single institution versus multicenter studies; randomized versus nonrandomized studies; studies published from 2000 to 2004 versus from 2005 to 2009; studies published in journals with impact factors ≥5 versus <5. There were no differences in the median PFS in studies with different endpoints (Table 1).
In multivariate analysis, the only factors that were selected as independent predictors of longer PFS were enrichment for patients with putative molecular targets (P = 0.001), studies with drugs that eventually (at the time of analysis) gained FDA/EMA approval (P = 0.05), and journal impact factor of less than 5 (P = 0.01).
Overall survival.
Of the 143 trials, the median OS was reported in 148 (91%) of 163 treatment arms. In these 148 treatment arms, the median OS was 7.6 months (IQR, 3.6).
The studies enriched for the presence of molecular targets showed a longer median OS of 11.3 months (IQR, 11.2) as compared with 7.5 months (IQR, 3.2) in studies with unselected patients (P = 0.05; Table 1).
Gefitinib arms with enriched populations showed a longer median OS of 16.4 months (IQR, 6.8) as compared with 7.6 months (IQR, 4.15) in the gefitinib arms with unselected patients (P = 0.01).
Treatment arms with drugs that eventually (at the time of analysis) gained FDA/EMA approval for advanced or metastatic NSCLC also showed a longer median OS of 9.3 months (IQR, 6.7) as compared with 7.5 months (IQR, 3.1) in other trials (approved previously for NSCLC, approved previously for other indications, approved at the time of analysis for other indications, or not approved; P = 0.002).
Treatment arms enrolling treatment-naïve patients showed a prolonged median OS of 8.2 months (IQR, 3.6) as compared with 7.1 months (IQR, 2.9) in trials enrolling patients previously treated with a systemic therapy (P = 0.002).
Studies with RR as a primary endpoint showed a longer median OS than studies with other primary endpoints, such as PFS/OS, and others [7.9 months (IQR, 3.6) vs. 6.7 months (IQR, 3.7) vs. 5.8 months (IQR, 3.5); P = 0.02].
There were no differences in the median OS among studies with chemotherapy versus targeted therapy; single institution versus multicenter studies; randomized versus nonrandomized studies; studies published from 2000 to 2004 versus from 2005 to 2009; studies published in journals with impact factors ≥5 versus <5 (Table 1).
In multivariate analysis, the only factors that were selected as independent predictors of survival were enrichment for patients with putative molecular targets (P = 0.005), studies with drugs that eventually (at the time of analysis) gained FDA/EMA approval (P = 0.003), enrollment of treatment-naïve patients (P = 0.009), and studies with RR as a primary endpoint (P = 0.02).
Response rate correlates with PFS and OS
RR correlated with PFS (r = 0.61, P < 0.001) and with OS (r = 0.46, P < 0.001), and PFS correlated with OS (r = 0.52, P < 0.001; Fig. 3).
Discussion
Phase II studies are among the most common studies in cancer clinical research. The quality of phase II data may be one of the most important factors for predicting the success of a phase III trial (7, 8). However, many phase II trials prove to be of limited value in the drug development trajectory.
In our report, the number of phase II studies with single-agent therapy published and indexed in the PubMed database nearly doubled in the 5-year period from 2005 to 2009 as compared with 2000 to 2004 (93 studies with 5,188 patients enrolled vs. 50 studies with 2,513 patients). However, this increase did not translate into improvements in outcomes as measured by RR, PFS, and OS. The actual, rather than apparent, reality might be even more disappointing as trials with poor outcomes tend to not be published (9).
The reported outcomes (RR, PFS, and OS) did not consistently differ with respect to the type of therapy, number of institutions involved, presence of randomization, trial endpoints, number of prior therapies, and impact factor of the publishing journals. However, the outcomes were improved for all parameters (RR, PFS, and OS) when trials were enriched for the presence of putative molecular drug targets. The only other factor that was consistently chosen as independent in multivariate analysis for RR, PFS, and OS, was studies with drugs that eventually gained FDA and/or EMA approval. The latter finding is, of course, not unexpected.
To improve the design and scientific use of phase II clinical trials, some authors suggest replacing single-arm phase II trials with randomized phase II trials (10). Their hypothesis is that randomized phase II trials would eliminate the bias associated with comparisons with historical controls and generate robust data before proceeding to phase III trials (11, 12). Other investigators view single-agent phase II trials as a valuable platform that is easily executed and can lead to drug approval (13). Moreover, while randomized phase II studies have been useful in gaining benefit and efficacy data for drugs whose effects are characterized by prolonged stable disease, the routine use of randomized phase II trials does not seem to improve the overall success rate in drug development (12, 14).
Our data showed better outcomes in trials enriched for putative molecular targets even though not every trial in the analysis used a validated enrichment strategy (Table 2). Single-agent phase II trials are often criticized for their propensity to produce false-positive results when compared with historical controls (11). Yet, in the last decade, well-conducted early-phase trials in populations enriched for molecular targets that define patients likely to benefit from therapy have produced robust outcomes (e.g., BRAF inhibitors in BRAF-mutated melanoma). The question therefore arises whether randomized trials are necessary or justifiable in such situations (15–19).
For many years, clinical cancer research has been tailored to low-yield therapies producing benefit in a small proportion of patients within a large clinical trial population. In these studies, the P value is driven to statistically significant levels, mostly by virtue of the large number of study participants (20). In contrast, if the treatment is given in a selective manner only to patients with tumors expressing the target of interest, the number of patients needed to adequately power the study is likely to be much smaller even if a randomized trial is conducted (21).
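The effect of enrichment on the required sample size can be illustrated with a rough calculation. The sketch below assumes a two-arm randomized comparison of response rates, a two-sided test, and the standard normal-approximation formula for two proportions with unpooled variance; the response rates used are hypothetical and are not estimates from this analysis.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(p_control, p_experimental, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect a difference in response rates.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2,
    a textbook approximation that ignores continuity correction and dropout.
    """
    if not (0 < p_control < 1 and 0 < p_experimental < 1):
        raise ValueError("response rates must lie strictly between 0 and 1")
    if p_control == p_experimental:
        raise ValueError("response rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_experimental * (1 - p_experimental)
    effect = p_experimental - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical scenarios: a small benefit in an unselected population
# versus a large benefit in a population enriched for the drug target.
n_unselected = patients_per_arm(0.10, 0.15)
n_enriched = patients_per_arm(0.10, 0.50)
print(n_unselected, n_enriched)
```

Under these assumed rates, the enriched design needs far fewer patients per arm because the required number falls with the square of the difference in response rates.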
Identifying relevant molecular targets is of increasing importance (22, 23). Unfortunately, it is sometimes unclear whether the putative molecular target is a real target or an incidental passenger (12). For example, initial cetuximab clinical trials in colorectal cancer were designed only for patients with EGF receptor (EGFR) expression by immunohistochemistry (IHC; refs. 24–26). However, ultimately no significant correlation between EGFR expression and outcomes was identified (27, 28). Our study also identified trials in which selected targets, such as HER1-4 expression detected through IHC, proved to be flawed (Table 2).
It is known that RR does not always correlate with PFS and OS, and there has been considerable debate about the use of RR as a surrogate marker for survival, mainly because RR can be subjective, and there is a concern that it may not reliably predict PFS and OS (29). However, in a review of randomized trials in NSCLC, a strong correlation existed between RR and OS (30). Furthermore, a survey of 31 anticancer drugs approved by the FDA on the basis of RR or PFS in the absence of a randomized trial showed that these agents fared well after long-term clinical use (13). Importantly, in this study, we also showed a positive correlation between reported RR and PFS, RR and OS, and PFS and OS (Fig. 3). On the other hand, this finding should be interpreted with caution as OS is also largely influenced by survival after progression, which frequently depends upon the administration of postprotocol therapies not captured in our analysis (31). In addition, another possible limitation is that PFS and OS were calculated together for treatment-naïve and previously treated patients.
In our study, most of the published trials enriched for putative molecular targets used an EGFR inhibitor, gefitinib. Nevertheless, the subgroup analysis of gefitinib studies showed that treatment arms enriched by patients with molecular targets outperformed treatment arms with unselected patient populations for all outcome parameters (RR, PFS, and OS). In addition, a recent clinical trial with an anaplastic lymphoma kinase (ALK) inhibitor, crizotinib, in patients with EML4-ALK–positive NSCLC showed a RR of 57% in the early-phase clinical trial (16). Because EML4-ALK positivity occurs in only about 2% to 7% of NSCLC, and patients without this aberration are unlikely to respond to crizotinib, these results further support our current observation that molecular enrichment strategies are uniquely associated with an increased RR (16). In addition, an enrichment approach is further supported by data from The Biomarker-integrated Approaches of Targeted Therapy for Lung Cancer Elimination (BATTLE) trial, which was a biomarker-based, adaptively randomized study in pretreated patients with NSCLC (32).
A second limitation of our study is the retrospective nature of the analysis. Third, there are likely to be unpublished studies or trials presented in abstract form that were not included in the analysis. Another limitation of our study is that we only included studies with single-agent therapies and not combinations. Lastly, reported studies used different criteria for response evaluation and in some of them, but not all, responses were confirmed by independent review. The use of different criteria could introduce a bias; however, several studies previously reported an acceptable concordance, ranging from 79% to 96%, between Response Evaluation Criteria in Solid Tumors and World Health Organization or SWOG criteria (33–37). In addition, many clinical trials showed that response assessment done by investigators is highly concordant with independent review (38).
In conclusion, 143 single-agent phase II trials that enrolled 7,701 patients showed similar, relatively poor, outcomes (medians for RR, PFS, and OS were 10%, 2.8 months, and 7.6 months, respectively) regardless of treatment type. In contrast, trials of targeted therapies enriched by the presence of molecular targets had significantly higher RRs, PFS, and OS (48.8%, 6 months, and 11.3 months, respectively). Furthermore, in these trials, RR correlated with PFS and both RR and PFS correlated with OS. These observations suggest that drug development will remain slow unless relevant biomarkers are identified and used in clinical trials in NSCLC, and that either RR or PFS may be a useful surrogate endpoint for OS.
Disclosure of Potential Conflicts of Interest
D.A. Berry is a co-owner and consultant/advisory board member of Berry Consultants, LLC and has ownership interest (including patents) in Berry Consultants, LLC. D.J. Stewart has honoraria from Speakers Bureau of Pfizer and is a consultant/advisory board member of Roche and Align2Action. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: F. Janku, D.A. Berry, R. Kurzrock
Development of methodology: F. Janku, D.A. Berry, H.A. Parsons
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): F. Janku, J. Gong, H.A. Parsons
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): F. Janku, D.A. Berry, H.A. Parsons
Writing, review, and/or revision of the manuscript: F. Janku, D.A. Berry, H.A. Parsons, D.J. Stewart, R. Kurzrock
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): F. Janku, H.A. Parsons
Study supervision: F. Janku
Acknowledgments
The authors thank Ms. Joann Aaron for scientific review and editing of this article.
Grant Support
This study was supported in part by grant number RR024148 from the National Center for Research Resources, a component of the NIH Roadmap for Medical Research.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.