Purpose: Gastrointestinal stromal tumor (GIST) is a relatively rare tumor that is treated with targeted therapies in advanced stages. Randomized clinical trials (RCT) often require long follow-up and large sample sizes to evaluate overall survival (OS), the gold-standard measure of treatment efficacy. However, changes in therapy following disease progression may complicate survival assessments. Establishing surrogate endpoints may facilitate the drug approval and availability of new efficacious treatments; however, no published studies have investigated this topic in unresectable and/or metastatic GIST.

Experimental Design: A systematic literature review identified 14 RCTs and five observational studies of sufficient methodologic quality published between January 1995 and December 2013 (29 treatment arms; 2,189 patients). Weighted linear regression was used to evaluate the relation between median OS and median progression-free survival (PFS) for all arms combined and stratified by treatment line, treatment type, and quality score.

Results: Median OS and PFS were positively related with a correlation of 0.91. The association was still moderate (correlation 0.72) after eliminating four influential data points. In stratified analyses, correlation of OS and PFS was greater in later lines of therapy (first line = 0.52; second line = 0.80; third- and later-line = 0.70) and imatinib showed a stronger association (0.91) than other evaluated treatments (−0.26 to 0.69).

Conclusion: This analysis identified a strong relationship between median OS and PFS, especially in later lines of therapy. Findings suggest that PFS could serve as a surrogate marker for OS; however, analyses of patient-level data are needed to establish its validity in GIST. Clin Cancer Res; 21(2); 295–302. ©2014 AACR.

Translational Relevance

This arm-level meta-analysis pertains directly to the evaluation of survival endpoints in oncology clinical trials. We evaluated the relationship between median overall survival (OS) and median progression-free survival (PFS) in gastrointestinal stromal tumor (GIST) trials. Establishing the validity of PFS as a surrogate marker for OS in advanced GIST may allow for shorter follow-up periods, reduce the costs of clinical trials, and facilitate the approval and availability of new and efficacious targeted therapies. No published studies have investigated this topic in unresectable and/or metastatic GIST. This analysis identified a strong relationship between median OS and PFS, especially in later lines of therapy. Findings suggest that PFS could serve as a surrogate marker for OS; however, analyses of patient-level data are needed to establish its validity in GIST.

Gastrointestinal stromal tumor (GIST) is the most common mesenchymal tumor of the gastrointestinal (GI) system, although GIST is a relatively rare cancer (1, 2). The annual incidence of GIST may range from 6.8 to 19.7 per million across several countries, including the United States (2, 3). This range may be an underestimate due to the challenges of diagnosis and the under-representation of small GIST cases in cancer registries (4). In early stages, GIST is treated by surgical resection, whereas advanced and unresectable forms are treated with targeted therapies: tyrosine kinase inhibitors (TKI) such as imatinib (5) or, in the case of imatinib resistance, sunitinib (6). Another targeted therapy, regorafenib, is used in patients with GIST with resistance to both imatinib and sunitinib (7). Several new therapies are also being evaluated for later lines of treatment, including the TKIs nilotinib and imatinib in combination with the mTOR inhibitor everolimus (8–10).

Targeted therapies have shown a profound effect on survival outcomes. On the basis of estimates from 1995 to 2000 and from 2001 to 2004 after the introduction of imatinib, survival in unresectable and metastatic GIST increased from 12 months to 33 months (11). More recent studies suggest that survival today may be as much as 60 months (12). The approval of the current targeted therapies in GIST is based on data measured by surrogate endpoints. Although overall survival (OS) is the gold-standard measure of treatment efficacy in most cancers, the evaluation of OS in clinical trials in metastatic or unresectable GIST requires a lengthy follow-up due to the improvements in survival time with targeted therapies. In addition, estimates of OS in GIST are increasingly impacted by treatment cross-over and the subsequent administration of effective later-line therapies.

There has been a significant interest in establishing and validating surrogate endpoints for OS in advanced cancers. Utilized as a substitute for another clinical endpoint, a surrogate is a measure or sign that is expected to sufficiently predict clinical benefit or decline. Correlation analyses have validated surrogate endpoints for OS in metastatic colorectal, ovarian, and other cancers (13–16). The use of surrogate endpoints in advanced cancers may facilitate earlier analysis of trial data and provide more straightforward estimates of efficacy, eliminating the impact of cross-over in the event of disease progression.

Typically evaluated as a secondary or coprimary endpoint in oncology trials, progression-free survival (PFS) reflects the time from beginning the treatment to initial disease progression or to death by any cause. PFS is an attractive candidate for a surrogate endpoint as it measures only the effect of the study drug and is not impacted by subsequent treatments patients receive, as OS may be in trial settings where cross-over is offered to participating patients (17). In addition, in disease settings such as advanced GIST, with incidence rates of only 11 to 20 cases per million per year (18), PFS is reported to be a commonly accepted primary endpoint (19).

The purpose of this study was to evaluate the relationship between median PFS and median OS in GIST to inform consideration of PFS as a surrogate endpoint for OS. A systematic review and an arm-level analysis evaluated the linear association between median OS and median PFS in clinical trials and observational studies of targeted therapies for GIST. Additional analyses explored the impact of treatment line, treatment type, and study quality on the strength of the relationship.

Literature review

Two systematic literature searches were conducted in PubMed and Embase according to Cochrane guidelines (20); an initial search from January 1, 1995 through July 1, 2010 was conducted in PubMed, and later an updated search through October 28, 2013 was conducted in both databases. Search terms included “GIST,” “advanced or metastatic,” and “unresectable or stage IV.” The searches were limited to clinical trials, prospective observational studies, and retrospective observational studies published in English. A manual search of the annual proceedings of the American Society of Clinical Oncology and the European Society for Medical Oncology was conducted from 2011 through 2013. Two researchers reviewed each abstract and text against the study inclusion and exclusion criteria. Studies were included if they evaluated targeted therapies for GIST, reported both median OS and median PFS values, and enrolled 15 or more patients.

Quality assessment

Two researchers evaluated all included studies using the Grades of Recommendation, Assessment, Development, and Evaluation (GRADE) approach, as outlined and adopted by the Cochrane Collaboration (20; Chapter 12: Interpreting results and drawing conclusions). This approach defines the quality of a body of evidence on the basis of study design and the directness, precision, and consistency of the results. Each study was assigned a score of low (1 to <2), moderate (2 to <3), or high (3 to ≤4), with high scores reserved for randomized controlled trials with high statistical power and consistent reporting. Studies with an average score of at least 1.75 were considered to be of sufficient quality for inclusion in the analysis; others were excluded to limit the impact of studies with critical problems and unsystematic clinical observations on the study results. Sensitivity analysis was conducted using different groups of quality scores.

Data extraction

For each included study of sufficient quality, data were extracted for study design, year of publication, sample size per treatment arm, treatment, treatment line, and Eastern Cooperative Oncology Group (ECOG; ref. 21) performance status. Data on the median OS and median PFS and outcome definitions were collected; reported point values took precedence over data from Kaplan–Meier curves. Confidence intervals were also collected to elucidate the variation in median outcome. In the event that a study was published in multiple articles or abstracts, the most recent data were used. A second researcher fully validated the extracted data.

Data analysis

Analyses were defined prospectively in a statistical analysis plan. To evaluate the relationship between median OS and median PFS, linear regression analyses were performed. Each treatment arm was weighted in the least-squares regression models by its respective sample size. In addition, the corresponding coefficient of determination (R2) and Pearson correlation coefficient were recorded for each model fit. Difference in Beta Scaled (DFBETAS; ref. 22) was used to identify influential data points within the key regression analyses, and additional models were then run to show the effect of excluding all highly influential points. Furthermore, to explore the impact of other study variables, additional analyses were performed stratifying treatment arms by (i) treatment line (first line; second line and line 2.5; third and later lines); (ii) treatment type (imatinib, sunitinib, sorafenib, combination, and other TKIs); and (iii) assigned study quality score (1.75, 2–2.75, 3–3.75). Studies designated as line 2.5 (2+) were those that allowed patients with second-line and later treatments but were largely considered to be second-line studies. All reported P values reflect two-sided tests, with a value of P < 0.05 indicative of statistical significance. Analyses were performed using R version 3.0.2.
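The sample-size-weighted regression described here can be sketched in a few lines. The following is a minimal illustration in Python with NumPy, not the authors' R code. The function name `weighted_fit` is hypothetical, and the five arms are an arbitrary subset of the third- and later-line arms in Table 1, used only to make the example concrete, so the output is not expected to match any row of Table 2. The source does not state whether the Pearson coefficient was weighted; this sketch assumes the plain unweighted coefficient, in which case the weighted R2 need not equal the square of r.

```python
import numpy as np

def weighted_fit(pfs, os_, n):
    """Regress median OS on median PFS, weighting each arm by its sample size.

    Hypothetical helper for illustration; returns intercept, slope,
    weighted R^2, and the unweighted Pearson correlation.
    """
    pfs = np.asarray(pfs, dtype=float)
    os_ = np.asarray(os_, dtype=float)
    n = np.asarray(n, dtype=float)

    # Design matrix with an intercept column.
    X = np.column_stack([np.ones_like(pfs), pfs])
    # Weighted normal equations: (X' W X) beta = X' W y, with W = diag(n).
    XtW = X.T * n
    intercept, slope = np.linalg.solve(XtW @ X, XtW @ os_)

    # Weighted R^2: residual and total sums of squares both use the weights.
    fitted = intercept + slope * pfs
    os_bar = np.average(os_, weights=n)
    ss_res = np.sum(n * (os_ - fitted) ** 2)
    ss_tot = np.sum(n * (os_ - os_bar) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    # Assumption: unweighted Pearson correlation between the two medians.
    r = np.corrcoef(pfs, os_)[0, 1]
    return {"intercept": intercept, "slope": slope, "r2": r2, "r": r}

# Illustrative subset of Table 1 (N, median PFS, median OS, in months).
arms_n = [33, 81, 31, 165, 50]
arms_pfs = [13.0, 1.8, 4.9, 3.6, 2.0]
arms_os = [27.0, 8.2, 9.7, 11.9, 19.0]

print(weighted_fit(arms_pfs, arms_os, arms_n))
```

Weighting by sample size means a large arm pulls the fitted line toward itself far more than a small one, which is why the largest trials can dominate an arm-level analysis of this kind.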

Following the systematic literature review (Fig. 1), a total of 51 studies were evaluated for quality. Nineteen studies were considered to be of sufficient quality for inclusion in analyses (5–10, 23–35). Most studies were phase II or III clinical trials, though three retrospective and one prospective observational study were included (25, 29, 32, 36). These studies included 29 treatment arms and 2,189 patients (Table 1). Across study treatment arms, there were between 15 and 349 patients with unresectable and/or metastatic GIST; the mean and the median for reported median OS were 18.7 and 11.8 months, respectively. The mean and the median for reported median PFS were 7.9 and 4.1 months, respectively, for the 29 treatment arms. Evaluated targeted therapies included imatinib, sunitinib, regorafenib, and nilotinib. Treatment line and ECOG status of evaluated patients varied across the studies.

Figure 1.

Systematic review yield. aThese publications did not provide any additional data for the study of interest.

Table 1.

Therapies and study characteristics

| First author and year (name, study design) | Treatment (daily dose) | N | Line | ECOG 0 / ECOG 1 / ECOG 2, proportion (%) | Median OS (mo) | Median PFS (mo) | Quality score |
|---|---|---|---|---|---|---|---|
| Blanke et al. 2008 (S0033, phase III) | Imatinib (800 mg) | 349 | | 96%a | 51.0 | 20.0 | H (3.75) |
| | Imatinib (400 mg) | 345 | | | 55.0 | 18.0 | |
| McAuliffe et al. 2007 (retrospective observational) | Imatinib (400–800 mg) | 53 | | — / — / — | 41.2 | 27.0 | L (2.00) |
| Ryu et al. 2009 (phase II) | Imatinib (400 mg) | 47 | | 27.7% / 68.1% / 4.3% | 65.0b | 40.0b | L (1.75) |
| George et al. 2009 (phase II) | Sunitinib (37.5 mg morning or evening) | 60 | | 56.7% / 41.7% / 1.7% | 24.6 | 7.8 | L (1.75) |
| Heinrich et al. 2008 (phase I–II) | Sunitinib (25–75 mg) | 97 | | 51.5% / 42.3% / 6.2% | 19.0 | 7.8 | L (2.00) |
| Raut et al. 2010 (retrospective observational) | Sunitinib (NR) | 50 | | — / — / — | 26.0 | 15.6 | L (1.75) |
| Schoffski et al. 2010c (phase I–II) | Everolimus + imatinib (2.5 mg + 600 mg) | 28 | | 39% / 57% / 4% | 14.9 | 1.9 | M (2.50) |
| Benjamin et al. 2011c (phase II) | Motesanib (125 mg) | 138 | | — / — / — | 14.7 | 3.7 | M (2.50) |
| Li et al. 2012c (prospective observational) | Imatinib (600 mg) | 52 | | 100% | 18.6 | 3.9 | L (2.00) |
| Maurel et al. 2010c (phase I–II) | Imatinib + doxorubicin (400 mg) | 26 | 2+ | 20.8% / 61.5% / 7.7% | 13.0 | 3.3 | L (2.00) |
| Kindler et al. 2011c (U. Chicago Phase II Consortium) | Sorafenib (800 mg) | 38 | 2+ | 47%d / 47%d / 5%d | 11.6 | 5.2 | M (2.50) |
| Schoffski et al. 2010c (phase I–II) | Everolimus + imatinib (2.5 mg + 600 mg) | 47 | | 45% / 49% / 6% | 10.7 | 3.5 | M (2.50) |
| Montemurro et al. 2013c (retrospective observational) | Sorafenib (400 mg) | 56 | | 63.9%e / 36.1%e | 17.9 | 6.0 | L (2.00) |
| Italiano et al. 2012c (retrospective observational) | Imatinib (doses NR) | 40 | | 80%f / 9%f | 7.5 | 2.9 | L (2.00) |
| | Imatinib + other agent (not “ib”; dose NR) | 27 | | | 8.7 | 3.0 | |
| | Nilotinib (dose NR) | 67 | | | 11.8 | 4.1 | |
| | Sorafenib (dose NR) | 55 | | | 10.7 | 4.9 | |
| George et al. 2013c (phase II) | Regorafenib (160 mg) | 33 | 3+ | 70.0% / 30.0% / 0% | 27.0 | 13.0 | M (2.50) |
| Kang et al. 2013c,g (RIGHT, phase III) | Imatinib (400 mg) | 81 | 3+ | 68%h / 32%h | 8.2 | 1.8 | M (3.00) |
| Park et al. 2012c (Korean GIST Study, phase II) | Sorafenib (800 mg) | 31 | 3+ | 29% / 61% / 10% | 9.7 | 4.9 | M (2.50) |
| Reichardt et al. 2012c (ENESTg3, phase III) | Nilotinib (800 mg) | 165 | 3+ | 54.5% / 37.6% / 7.9% | 11.9 | 3.6 | M (3.00) |
| | BSC, BSC + imatinib, or BSC + sunitinibi | 83 | 3+ | 39.8% / 49.4% / 9.6% | 9.2 | 3.6 | |
| Kang et al. 2012c (phase II) | Dovitinib (500 mg) | 30 | 3+ | — / — / — | 6.2 | 3.6 | M (2.50) |
| Trent et al. 2011c (phase II) | Dasatinib (140 mg) | 50 | 3+ | — / — / — | 19.0 | 2.0 | L (2.00) |
| Montemurro et al. 2013c (retrospective observational) | Sorafenib (400 mg) | 68 | | 63.9% / 36.1%e | 11.0 | 7.1 | L (2.00) |
| Italiano et al. 2012c (retrospective observational) | Imatinib (dose NR) | 37 | | 80%f / 9%f | 7.4 | 4.5 | L (2.00) |
| | Imatinib + other agent (not “ib”; dose NR) | 15 | | | 7.4 | 2.5 | |
| | Nilotinib (dose NR) | 21 | | | 5.4 | 3.8 | |

Abbreviations: BSC, best supportive care; H, high; L, low; M, moderate; NR, not reported.

aThis study also enrolled patients with performance status of 3 (4%).

bData were visually estimated from a Kaplan–Meier curve.

cIndicates study identified in the systematic review update.

dPerformance status tool not specified.

eProportions include ECOG of 3. Reported value is for the overall study population.

fProportion includes ECOG ≥2. Reported value is for the overall study population. 11% of patients had unknown performance status.

gWithin the RIGHT randomized trial, the placebo arm was not included in the analysis.

hProportions include ECOG of 3.

iThis treatment arm was not included in the analyses by treatment as it represents a mixture of sunitinib and imatinib along with BSC.

Overall analysis

Median OS was strongly associated with median PFS (R2 = 0.84, r = 0.91; Fig. 2A and Table 2). When analyzing influential data points, any DFBETA larger than 2/sqrt(n) or smaller than −2/sqrt(n) was flagged as influential (where n is the number of points in the model; ref. 22). Ryu and colleagues (35), McAuliffe and colleagues (31), and both low- and high-dose treatment arms from Blanke and colleagues (5) stood out as having a high degree of influence on the fitted model. Figure 2B shows the modest change in slope of the regression with removal of the influential data points. Although the strength of the relationship was also reduced in this sensitivity analysis, the result is still reflective of a strong linear association between median OS and median PFS (R2 = 0.52, r = 0.72; Table 2).
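The flagging rule above can be illustrated with a small leave-one-out computation. This is a sketch in Python with NumPy under stated assumptions, not the authors' R code: the function names `dfbetas_slope` and `flag_influential` are hypothetical, the toy data are invented, and the sketch looks only at the slope coefficient of a weighted simple regression. It uses the standard definition, in which the change in the coefficient when point i is dropped is scaled by the leave-one-out residual standard error and the corresponding diagonal element of the inverse weighted cross-product matrix.

```python
import numpy as np

def dfbetas_slope(x, y, w):
    """DFBETAS for the slope of a weighted simple linear regression.

    Hypothetical helper for illustration. Needs at least four points so
    that each leave-one-out fit has a positive residual degree of freedom.
    """
    x, y, w = (np.asarray(a, dtype=float) for a in (x, y, w))
    n = len(x)
    X = np.column_stack([np.ones(n), x])

    def fit(mask):
        Xm, ym, wm = X[mask], y[mask], w[mask]
        XtW = Xm.T * wm
        beta = np.linalg.solve(XtW @ Xm, XtW @ ym)
        resid = ym - Xm @ beta
        # Weighted residual variance with two estimated coefficients.
        s2 = np.sum(wm * resid ** 2) / (mask.sum() - 2)
        return beta, s2

    full_beta, _ = fit(np.ones(n, dtype=bool))
    # Diagonal element of (X' W X)^-1 for the slope, from the full model.
    c_slope = np.linalg.inv((X.T * w) @ X)[1, 1]

    out = np.empty(n)
    for i in range(n):
        mask = np.ones(n, dtype=bool)
        mask[i] = False
        beta_i, s2_i = fit(mask)
        out[i] = (full_beta[1] - beta_i[1]) / np.sqrt(s2_i * c_slope)
    return out

def flag_influential(x, y, w):
    """Flag points whose |DFBETAS| exceeds 2 / sqrt(n)."""
    d = dfbetas_slope(x, y, w)
    return np.abs(d) > 2.0 / np.sqrt(len(d))

# Invented toy data: seven points near a line plus one high-leverage outlier.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 10.0]
w = [1, 1, 1, 1, 1, 1, 1, 1]
print(flag_influential(x, y, w))
```

With 29 arms, the cutoff 2/sqrt(n) is about 0.37, so an arm is flagged when removing it moves the slope by more than roughly a third of a standard error.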

Figure 2.

Weighted linear regression analyses for the relationship between median OS and median PFS in 29 treatment arms evaluating patients (n = 2,189) with unresectable and/or metastatic GIST in observational studies (n = 4) and clinical trials (n = 15) of targeted therapies. Influential data points were identified using DFBETAS. A, regression for all arms. B, regression for all arms with four influential points removed. C, regression stratified by treatment line. D, regression stratified by treatment line with five influential points removed.

Table 2.

Weighted linear regression analyses for the relationship between median OS and median PFS in 29 treatment arms evaluating patients (n = 2,189) with unresectable and/or metastatic GIST in observational studies (n = 4) and clinical trials (n = 15) of targeted therapies

| Group | Correlation (95% CI) | R2 | Adjusted R2 | Slope (SE), P |
|---|---|---|---|---|
| Overall | | | | |
| All arms | 0.91 (0.82–0.96) | 0.84 | 0.83 | 2.08 (0.18), P < 0.0001 |
| All arms, with removal of four influential points | 0.72 (0.46–0.87) | 0.52 | 0.50 | 1.33 (0.27), P < 0.0001 |
| Stratified by treatment line | | | | |
| First line | 0.52 (−0.88–0.99) | 0.082 | −0.38 | 0.25 (0.59), P = 0.71 |
| First line, with removal of influential point | −1.0 (not availablea) | 0.98 | 0.97 | −1.57 (0.20), P = 0.082 |
| Second and 2.5-line | 0.80 (0.21–0.96) | 0.66 | 0.60 | 0.98 (0.29), P = 0.015 |
| Second and 2.5-line, with removal of influential point | 0.66 (−0.19–0.94) | 0.51 | 0.41 | 1.32 (0.58), P = 0.072 |
| Third and later lines | 0.70 (0.33–0.88) | 0.39 | 0.35 | 1.27 (0.41), P = 0.0074 |
| Third and later lines, with removal of three influential points | 0.62 (0.14–0.87) | 0.44 | 0.40 | 1.77 (0.57), P = 0.0093 |
| Stratified by treatment | | | | |
| Imatinib | 0.91 (0.56–0.98) | 0.72 | 0.67 | 1.75 (0.45), P = 0.008 |
| Sunitinib | 0.65 (not availablea) | 0.44 | −0.13 | 0.62 (0.71), P = 0.54 |
| Sorafenib | 0.29 (−0.80–0.93) | 0.03 | −0.29 | 0.58 (1.90), P = 0.78 |
| Combination therapy | −0.26 (−0.93–0.81) | 0.16 | −0.13 | −1.62 (2.18), P = 0.51 |
| Other TKIs | 0.69 (−0.13–0.95) | 0.39 | 0.27 | 1.28 (0.71), P = 0.13 |
| Stratified by quality score | | | | |
| 1.75 | 0.98 (not availablea) | 0.96 | 0.91 | 1.32 (0.28), P = 0.13 |
| 2–2.75 | 0.85 (0.66–0.94) | 0.73 | 0.72 | 1.22 (0.17), P < 0.0001 |
| 3–3.75 | 0.99 (0.83–1.0) | 0.96 | 0.95 | 2.66 (0.31), P = 0.0032 |

Abbreviation: CI, confidence interval.

aThe confidence interval was not generated by the linear regression model due to inclusion of only three data points.
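One common way to obtain a confidence interval for a Pearson correlation is the Fisher z-transformation, whose standard error, 1/sqrt(n − 3), is undefined when only three points are available. The source does not state which method produced the intervals in Table 2, so the following Python sketch is an assumption offered for illustration; the function name `fisher_ci` is hypothetical. For r = 0.91 and n = 29 it returns bounds that round to 0.82 and 0.96.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via Fisher's z.

    Hypothetical helper for illustration. Returns None when the interval
    is undefined (n <= 3, or |r| equal to 1).
    """
    if n <= 3 or abs(r) >= 1:
        return None
    z = math.atanh(r)                      # Fisher transformation of r
    half_width = z_crit / math.sqrt(n - 3) # z_crit times the standard error
    return math.tanh(z - half_width), math.tanh(z + half_width)

print(fisher_ci(0.91, 29))  # all 29 arms
print(fisher_ci(0.65, 3))   # three arms: no interval can be computed
```

The same calculation shows why small strata give such wide intervals: with eight arms the standard error on the z scale is 1/sqrt(5), more than twice that of the full 29-arm analysis.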

Analysis by treatment line

Four treatment arms evaluated a first-line population (5, 31, 35), eight arms evaluated a second- or 2.5-line population (6, 9, 23, 24, 28–30, 33), and 17 arms evaluated a third- and later-line population (7–10, 25–27, 32, 34). Treatment arms designated as line 2.5 (2+) were those that allowed patients receiving second-line and later treatments. In a stratified analysis, the linear association was strongest for second-line therapy (R2 = 0.66, r = 0.80) followed by third- and later-line therapy (R2 = 0.39, r = 0.70; Fig. 2C and Table 2). The association was less strong in first-line therapy (R2 = 0.08, r = 0.52), which may be related to the variable treatment options available to patients with GIST following progression on first-line treatment.

The arm from Ryu and colleagues (35) was identified as an influential point in the first-line therapy model, and the arm from Raut and colleagues (33) was identified for second line. In the third-line regression model, George and colleagues (7), Trent and colleagues (10), and the sorafenib fourth-line arm of Montemurro and colleagues (32) each contributed highly influential data points. The sensitivity analysis with removal of these data points showed a weaker relationship in second line but still reflected moderate correlation (R2 = 0.51, r = 0.66; Fig. 2D and Table 2). This was also true of the third- and later-line analysis (R2 = 0.44, r = 0.63). For first-line therapy, removal of the influential data point from Ryu and colleagues (35) changed the slope from positive to negative (R2 = 0.98, r = −1), suggesting that no significant association between median OS and median PFS is present in first-line therapy.

Analysis by treatment type

Eight treatment arms evaluated single-agent imatinib (5, 25, 27, 29, 31, 35), three evaluated sunitinib (6, 24, 33), five evaluated sorafenib (8, 25, 28, 32), five evaluated imatinib combinations (9, 25, 30), and seven evaluated other TKIs (7, 10, 23, 25, 26, 34), including regorafenib and nilotinib. (These analyses did not include the control arm from Reichardt and colleagues, as it could not be categorized as a single therapy; patients received BSC, BSC + imatinib, or BSC + sunitinib.) In a stratified analysis, imatinib was associated with the strongest linear association between median OS and median PFS (R2 = 0.72, r = 0.91), followed by other TKIs (R2 = 0.39, r = 0.69) and sunitinib (R2 = 0.44, r = 0.66; Fig. 3A and Table 2). Sorafenib (R2 = 0.03, r = 0.29) and imatinib combination therapy (R2 = 0.16, r = −0.26) demonstrated little or no association.

Figure 3.

Weighted linear regression analyses for the relationship between median OS and median PFS in 29 treatment arms evaluating patients (n = 2,189) with unresectable and/or metastatic GIST in observational studies (n = 4) and clinical trials (n = 15) of targeted therapies. All studies were scored for quality, with higher scores representing a higher level of evidence. A, regressions stratified by therapy (combination included imatinib + doxorubicin and imatinib + everolimus; “other TKI” included regorafenib, nilotinib, motesanib, dovitinib, and dasatinib). B, regressions stratified by quality score. Note that the analysis in A does not include the control arm from Reichardt et al. (34), as it could not be categorized as a single therapy. Patients received best supportive care or its combination with imatinib or sunitinib.

Analysis by quality score

The impact of quality grading on the relation between median OS and median PFS was evaluated by analyzing data in three groups: studies with a score of 3 to 3.75, a score of 2 to 2.75, and a score of 1.75. Three treatment arms were graded with a score of 1.75 (6, 33, 35). The majority of the treatment arms (n = 21) received scores of 2 to 2.75 (7–10, 23–26, 28–32). Five treatment arms were given scores of 3 to 3.75 (5, 27, 34).

In a stratified analysis, the higher-quality studies demonstrated the strongest linear relationship between median OS and median PFS (R2 = 0.96, r = 0.99), though the analyses of moderate-quality (R2 = 0.73, r = 0.85) and lower-quality (R2 = 0.96, r = 0.98) studies also showed strong association (Fig. 3B and Table 2).

Targeted therapies have markedly improved survival in patients with unresectable and metastatic GIST. As a result, clinical trials in GIST have been characterized by long periods of follow-up for survival endpoints; the pivotal S0033 trial of first-line imatinib, for example, required a median follow-up of 4.5 years (5). Receipt of additional therapies after treatment progression further complicates the evaluation of OS in clinical trials. PFS offers an advantage over OS because it requires patients to be followed only until their disease progresses; it therefore measures only the effect of the study drug and is not diluted by subsequent treatments patients receive, as OS may be (17). This analysis demonstrates potential for PFS as a valid surrogate for OS based on a strong relationship between median OS and median PFS in 29 study arms of targeted agents used to treat advanced or metastatic GIST.

The strong correlation between median OS and median PFS persisted even with the removal of four highly influential treatment arms (5, 31, 35). Two of these influential arms were from the S0033 trial, which evaluated the largest number of patients with GIST (n = 746) and assumed the greatest weight within the model (5). The other two removed arms had relatively high estimates of median OS and PFS compared with the others; each had a median PFS longer than 2 years. Moreover, these four arms represent the only first-line GIST populations evaluated in this analysis, and therefore their corresponding estimates of OS are most impacted by the use of any later therapies applied after disease progression.

In a stratified analysis, the association between median OS and median PFS differed by line of therapy. Second-line (eight arms) and third- and later-line (17 arms) therapies both demonstrated a moderate to strong relationship, whereas first-line therapy (four arms) showed only moderate correlation. The first-line sensitivity analysis with removal of an influential data point reflected the absence of a relationship (35). This finding and the nonsignificant slope indicate that there is insufficient evidence for the association between median OS and PFS within first-line therapy alone. The moderate correlation in the main analysis may be related to the variability in treatment options following progression. The power to detect a survival benefit using derivations from PFS declines substantially with increased survival postprogression (37). Thus, since survival after GIST progression in the first-line setting may be long, reduced correlation would be expected in this model.

Analyses stratified by study quality score reinforced the result from the analysis of all treatment arms, suggesting that the relationship between median OS and median PFS is not impacted by differences in study quality as assessed by researchers. In the analysis stratified by therapy, imatinib was the most-represented treatment (eight arms: 4 first-line, 1 second-line, 1 third-line, 2 later than third-line) and expected to show the strongest association between median OS and median PFS. This assumption was confirmed, and the model in “other TKIs” (seven arms) demonstrated moderate correlation. It should be noted that “other TKIs” consisted of arms from studies of second- or later-line therapies. As these showed strong association in the analysis stratified by treatment line, this may confound the interpretation of the analysis by therapy type.

Before this analysis, it was assumed that there would be few high-quality studies of patients with unresectable and/or metastatic GIST, as it is a relatively rare cancer. This expectation was confirmed by the results of the systematic review. Included studies were characterized by limitations in design, the most notable of which was small sample sizes. To improve the strength of this arm-level analysis, only higher-quality studies with at least 15 patients per arm were included. The introduction of the GRADE quality assessment feature of the review and the strict cutoff for inclusion attempted to improve the level of evidence. A relatively low sample cutoff was chosen as several key GIST trials evaluated small sample sizes and it was necessary to include these data. Nonetheless, this analysis still considered four observational studies, which differs from analyses in more common cancers that only include data from clinical trials (14, 15, 38, 39).

One limitation of this analysis is the potential for a high degree of heterogeneity among the analyzed studies. Studies used differing inclusion and exclusion criteria, progression criteria (e.g., RECIST and SWOG), and time intervals between radiologic and clinical assessments. These differences may have contributed to a wider range of survival outcomes, which could bias the findings toward a stronger relationship between endpoints. However, the results of this study were similar in magnitude to estimates reported in other cancers (39, 40). Future analyses should confirm the strength of the association in GIST using patient-level data from clinical trials, an approach that has been widely regarded as necessary for the validation of endpoint surrogacy (41).

The inherent relationship between OS and PFS should also be considered. By definition, PFS is contained within OS; thus, median OS is always expected to be longer than median PFS. The overlapping definitions of PFS and OS may account for part of the relationship between the values, especially if survival after progression is short (37). Part of this problem may be solved by performing a study-level analysis with relative estimates of survival efficacy (i.e., HR). However, few randomized trials have evaluated a GIST population, which precludes an analysis of this type.
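The structural overlap between the two endpoints can be made explicit with a short worked derivation. This is a simplified patient-level sketch and not part of the analysis: the symbol SPP (survival postprogression) and the independence assumption are introduced here for illustration only. Arm-level medians are not additive, so the result conveys intuition and does not describe the regression model.

```latex
% Patient-level decomposition (SPP >= 0; SPP = 0 if death occurs without progression)
\mathrm{OS} = \mathrm{PFS} + \mathrm{SPP}

% Covariance of OS with PFS
\operatorname{Cov}(\mathrm{OS},\mathrm{PFS})
  = \operatorname{Var}(\mathrm{PFS}) + \operatorname{Cov}(\mathrm{PFS},\mathrm{SPP})

% If PFS and SPP are assumed independent, Cov(PFS, SPP) = 0 and
\operatorname{Corr}(\mathrm{OS},\mathrm{PFS})
  = \frac{\operatorname{Var}(\mathrm{PFS})}
         {\sqrt{\operatorname{Var}(\mathrm{PFS})}\,
          \sqrt{\operatorname{Var}(\mathrm{PFS}) + \operatorname{Var}(\mathrm{SPP})}}
  = \frac{1}{\sqrt{1 + \operatorname{Var}(\mathrm{SPP})/\operatorname{Var}(\mathrm{PFS})}}
```

Under these assumptions, the correlation is positive even when PFS carries no information about what happens after progression. It approaches 1 when the variability of SPP is small relative to that of PFS, and it falls as postprogression survival becomes more variable.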

Relying on data from observational studies and clinical trials in patients with GIST, this analysis demonstrated a strong correlation between median OS and median PFS. The association was more apparent in later treatment lines than in first-line therapy. These findings provide some insight into the use of PFS as a surrogate marker for OS, which may be strengthened by future trial results; however, further patient-level analyses are needed to fully establish its validity in GIST.

This study highlights the need for further research to establish the validity of PFS as a surrogate marker, which may ultimately lead to earlier treatment decisions, help to reduce the considerable costs of clinical trials, and facilitate the approval, reimbursement, and availability of new efficacious treatments for GIST.

J. Chang and A. Mohamed are employees of Bayer HealthCare Pharmaceuticals. I. Özer-Stillman is an employee of and a consultant/advisory board member for Evidera. L. Strand is an employee of Evidera. K. Tranbarger-Freier was an employee of Evidera. No potential conflicts of interest were disclosed by the other authors.

Conception and design: I. Özer-Stillman, J. Chang

Development of methodology: I. Özer-Stillman, J. Chang, K.E. Tranbarger-Freier

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): L. Strand

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): L. Strand, A.F. Mohamed, K.E. Tranbarger-Freier

Writing, review, and/or revision of the manuscript: I. Özer-Stillman, L. Strand, J. Chang, A.F. Mohamed, K.E. Tranbarger-Freier

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): I. Özer-Stillman, L. Strand, J. Chang, A.F. Mohamed

Study supervision: I. Özer-Stillman, J. Chang

The authors thank Dr. Kyle Fahrbach of Evidera for statistical advising and review of the manuscript and Mina Jeong, Emily Shore, and Christopher Ngai for review of the manuscript.

This work was supported by Bayer HealthCare Pharmaceuticals.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Judson I, Demetri G. Advances in the treatment of gastrointestinal stromal tumors. Ann Oncol 2007;18 Suppl 10:x20–24.
2. Reddy P, Boci K, Charbonneau C. The epidemiologic, health-related quality of life, and economic burden of gastrointestinal stromal tumors. J Clin Pharm Ther 2007;32:557–65.
3. Chiang NJ, Chen LT, Tsai CR, Chang JS. The epidemiology of gastrointestinal stromal tumors in Taiwan, 1998–2008: a nation-wide cancer registry-based study. BMC Cancer 2014;14:102.
4. Tran T, Davila JA, El-Serag HB. The epidemiology of malignant gastrointestinal stromal tumors: an analysis of 1,458 cases from 1992 to 2000. Am J Gastroenterol 2005;100:162–8.
5. Blanke CD, Rankin C, Demetri GD, Ryan CW, Von Mehren M, Benjamin RS, et al. Phase III randomized, intergroup trial assessing imatinib mesylate at two dose levels in patients with unresectable or metastatic gastrointestinal stromal tumors expressing the kit receptor tyrosine kinase: S0033. J Clin Oncol 2008;26:626–32.
6. George S, Blay JY, Casali PG, Le Cesne A, Stephenson P, DePrimo SE, et al. Clinical evaluation of continuous daily dosing of sunitinib malate in patients with advanced gastrointestinal stromal tumor after imatinib failure. Eur J Cancer 2009;45:1959–68.
7. George S, Feng Y, Von Mehren M, Choy E, Corless CL, Hornick JL, et al. Prolonged survival and disease control in the academic phase II trial of regorafenib in GIST: Response based on genotype. J Clin Oncol 31, 2013 (suppl; abstr 10511).
8. Park SH, Ryu MH, Ryoo BY, Im SA, Kwon HC, Lee SS, et al. Sorafenib in patients with metastatic gastrointestinal stromal tumors who failed two or more prior tyrosine kinase inhibitors: a phase II study of Korean gastrointestinal stromal tumors study group. Invest New Drugs 2012;30:2377–83.
9. Schoffski P, Reichardt P, Blay JY, Dumez H, Morgan JA, Ray-Coquard I, et al. A phase I–II study of everolimus (RAD001) in combination with imatinib in patients with imatinib-resistant gastrointestinal stromal tumors. Ann Oncol 2010;21:1990–8.
10. Trent JC, Wathen K, Von Mehren M, Samuels BL, Staddon AP, Choy E, et al. A phase II study of dasatinib for patients with imatinib-resistant gastrointestinal stromal tumor (GIST). J Clin Oncol 29, 2011 (suppl; abstr 10006).
11. Artinyan A, Kim J, Soriano P, Chow W, Bhatia S, Ellenhorn JD. Metastatic gastrointestinal stromal tumors in the era of imatinib: improved survival and elimination of socioeconomic survival disparities. Cancer Epidemiol Biomarkers Prev 2008;17:2194–201.
12. Casali PG. Successes and limitations of targeted cancer therapy in gastrointestinal stromal tumors. Prog Tumor Res 2014;41:51–61.
13. Buyse M. Use of meta-analysis for the validation of surrogate endpoints and biomarkers in cancer trials. Cancer J 2009;15:421–5.
14. Petrelli F, Barni S. Correlation of progression-free and post-progression survival with overall survival in advanced colorectal cancer. Ann Oncol 2013;24:186–92.
15. Sidhu R, Rong A, Dahlberg S. Evaluation of progression-free survival as a surrogate endpoint for survival in chemotherapy and targeted agent metastatic colorectal cancer trials. Clin Cancer Res 2013;19:969–76.
16. Buyse M, Burzykowski T, Carroll K, Michiels S, Sargent DJ, Miller LL, et al. Progression-free survival is a surrogate for survival in advanced colorectal cancer. J Clin Oncol 2007;25:5218–24.
17. Mayfield E. Progression-free survival: patient benefit or lower standard? NCI Cancer Bulletin 2008;5:1–11.
18. Joensuu H. Current perspectives on the epidemiology of gastrointestinal stromal tumors. EJC Supplements 2006;4:44–9.
19. Pazdur R. Endpoints for assessing drug activity in clinical trials. Oncologist 2008;13:19–21.
20. Higgins JPT, Green S, eds. Cochrane Handbook for Systematic Reviews of Interventions. Version 5.1.0. The Cochrane Collaboration; 2011. [Accessed September 4, 2014]. Available from: www.cochrane-handbook.org.
21. Oken MM, Creech RH, Tormey DC, Horton J, Davis TE, McFadden ET, et al. Toxicity and response criteria of the Eastern Cooperative Oncology Group. Am J Clin Oncol 1982;5:649–55.
22. Neter J, Kutner M, Nachtsheim C, Wasserman W. Applied linear statistical models. 4th ed. New York: McGraw-Hill/Irwin; 1996.
23. Benjamin RS, Schoffski P, Hartmann JT, Van Oosterom A, Bui BN, Duyster J, et al. Efficacy and safety of motesanib, an oral inhibitor of VEGF, PDGF, and Kit receptors, in patients with imatinib-resistant gastrointestinal stromal tumors. Cancer Chemother Pharmacol 2011;68:69–77.
24. Heinrich MC, Owzar K, Corless CL, Hollis D, Borden EC, Fletcher CDM, et al. Correlation of kinase genotype and clinical outcome in the North American intergroup phase III trial of imatinib mesylate for treatment of advanced gastrointestinal stromal tumor: CALGB 150105 study by cancer and leukemia group B and southwest oncology group. J Clin Oncol 2008;26:5360–7.
25. Italiano A, Cioffi A, Coco P, Maki RG, Schoffski P, Rutkowski P, et al. Patterns of care, prognosis, and survival in patients with metastatic gastrointestinal stromal tumors (GIST) refractory to first-line imatinib and second-line sunitinib. Ann Surg Oncol 2012;19:1551–9.
26. Kang Y, Ryu M, Lee J, Park JH, Ryoo B. A phase II study of dovitinib in patients with metastatic or unresectable gastrointestinal stromal tumors after failure of two or more tyrosine kinase inhibitors. Poster Discussion at European Society of Medical Oncology (ESMO) 37th Congress, Vienna. Ann Oncol 2012;23(Suppl 9):Abstract 1481PD, Page ix1480.
27. Kang YK, Ryu MH, Yoo C, Ryoo BY, Kim HJ, Lee JJ, et al. Resumption of imatinib to control metastatic or unresectable gastrointestinal stromal tumours after failure of imatinib and sunitinib (RIGHT): a randomised, placebo-controlled, phase 3 trial. Lancet Oncol 2013;14:1175–82.
28. Kindler HL, Campbell NP, Wroblewski K, Maki RG, D'Adamo DR, Cho WA, et al. Sorafenib (SOR) in patients (pts) with imatinib (IM) and sunitinib (SU)-resistant (RES) gastrointestinal stromal tumors (GIST): final results of a University of Chicago Phase II Consortium trial. J Clin Oncol 2011 (suppl; abstr 10009).
29. Li J, Gong JF, Li J, Gao J, Sun NP, Shen L. Efficacy of imatinib dose escalation in Chinese gastrointestinal stromal tumor patients. World J Gastroenterol 2012;18:698–703.
30. Maurel J, Martins AS, Poveda A, Lopez-Guerrero JA, Cubedo R, Casado A, et al. Imatinib plus low-dose doxorubicin in patients with advanced gastrointestinal stromal tumors refractory to high-dose imatinib: a phase I–II study by the Spanish Group for Research on Sarcomas. Cancer 2010;116:3692–701.
31. McAuliffe JC, Hunt KK, Lazar AJF, Choi H, Qiao W, Thall P, et al. A randomized, phase II study of preoperative plus postoperative Imatinib in GIST: Evidence of rapid radiographic response and temporal induction of tumor cell apoptosis. Ann Surg Oncol 2009;16:910–9.
32. Montemurro M, Gelderblom H, Bitz U, Schutte J, Blay JY, Joensuu H, et al. Sorafenib as third- or fourth-line treatment of advanced gastrointestinal stromal tumor and pretreatment including both imatinib and sunitinib, and nilotinib: a retrospective analysis. Eur J Cancer 2013;49:1027–31.
33. Raut CP, Wang Q, Manola J, Morgan JA, George S, Wagner AJ, et al. Cytoreductive surgery in patients with metastatic gastrointestinal stromal tumor treated with sunitinib malate. Ann Surg Oncol 2010;17:407–15.
34. Reichardt P, Blay JY, Gelderblom H, Schlemmer M, Demetri GD, Bui-Nguyen B, et al. Phase III study of nilotinib versus best supportive care with or without a TKI in patients with gastrointestinal stromal tumors resistant to or intolerant of imatinib and sunitinib. Ann Oncol 2012;23:1680–7.
35. Ryu MH, Kang WK, Bang YJ, Lee KH, Shin DB, Ryoo BY, et al. A prospective, multicenter, phase 2 study of imatinib mesylate in Korean patients with metastatic or unresectable gastrointestinal stromal tumor. Oncology 2009;76:326–32.
36. McAuliffe JC, Lazar AJF, Yang D, Steinert DM, Qiao W, Thall PF, et al. Association of intratumoral vascular endothelial growth factor expression and clinical outcome for patients with gastrointestinal stromal tumors treated with imatinib mesylate. Clin Cancer Res 2007;13:6727–34.
37. Broglio KR, Berry DA. Detecting an overall survival benefit that is derived from progression-free survival. J Natl Cancer Inst 2009;101:1642–9.
38. Chirila C, Odom D, Devercelli G, Khan S, Sherif BN, Kaye JA, et al. Meta-analysis of the association between progression-free survival and overall survival in metastatic colorectal cancer. Int J Colorectal Dis 2012;27:623–34.
39. Petrelli F, Barni S. Surrogate endpoints in metastatic breast cancer treated with targeted therapies: an analysis of the first-line phase III trials. Med Oncol 2014;31:776.
40. Beauchemin C, Johnston J, Lapierre ME, Aissa F, Lachaine J. Relationship between progression-free survival and overall survival in chronic lymphocytic leukemia. Value Health 2012;15:A414–5.
41. Buyse M, Molenberghs G, Burzykowski T, Renard D, Geys H. The validation of surrogate endpoints in meta-analyses of randomized experiments. Biostatistics 2000;1:49–67.