Purpose:

Previous studies indicate that the benefit of therapy depends on patients' risk for cancer recurrence relative to noncancer mortality (ω ratio). We sought to test the hypothesis that patients with head and neck cancer (HNC) with a higher ω ratio selectively benefit from intensive therapy.

Experimental Design:

We analyzed 2,688 patients with stage III–IVB HNC undergoing primary radiotherapy (RT) with or without systemic therapy on three phase III trials (RTOG 9003, RTOG 0129, and RTOG 0522). We used generalized competing event regression to stratify patients according to ω ratio and compared the effectiveness of intensive therapy as a function of predicted ω ratio (i.e., ω score). Intensive therapy was defined as treatment on an experimental arm with altered fractionation and/or multiagent concurrent systemic therapy. A nomogram was developed to predict patients' ω score on the basis of tumor, demographic, and health factors. Analysis was by intention to treat.

Results:

Decreasing age, improved performance status, higher body mass index, node-positive status, P16-negative status, and oral cavity primary predicted a higher ω ratio. Patients with ω score ≥0.80 were more likely to benefit from intensive treatment [5-year overall survival (OS), 70.0% vs. 56.6%; HR of 0.73, 95% confidence interval (CI): 0.57–0.94; P = 0.016] than those with ω score <0.80 (5-year OS, 46.7% vs. 45.3%; HR of 1.02, 95% CI: 0.92–1.14; P = 0.69; P = 0.019 for interaction). In contrast, the effectiveness of intensive therapy did not depend on risk of progression.

Conclusions:

Patients with HNC with a higher ω score selectively benefit from intensive treatment. A nomogram was developed to help select patients for intensive therapy.

Translational Relevance

The effectiveness of intensive therapy for patients with head and neck cancer (HNC) who are older, or who have comorbidities, or who have favorable risk human papillomavirus–associated disease is unclear. Traditional risk stratification models pool patients at high risk for cancer events with patients at high risk for competing events, even though these groups have different expected benefit from intensive therapy. Studies indicate that the hazard for cancer recurrence relative to competing mortality (ω ratio) is a key determinant of treatment benefit, with newer regression methods developed to quantify effects on this ratio. This is the first study to examine the effectiveness of intensive therapy for HNC as a function of ω ratio. We found that patients with a predicted ω score ≥0.80 had improved overall survival with intensive therapy, using pooled data from three randomized controlled trials.

Although the effectiveness of intensive therapy [e.g., concurrent chemotherapy or altered fractionation (AFX)] for locoregionally advanced head and neck cancer has been established, there is considerable controversy surrounding which subsets of patients are most likely to benefit from this approach. In particular, the effectiveness of intensive therapy in patients who are older, or who have comorbidities, or who have relatively favorable risk disease [e.g., human papillomavirus (HPV)-associated disease, nonsmokers] is unclear.

Traditionally, risk stratification models used in cancer outcomes research have focused on the effects of treatments and risk factors on endpoints such as overall survival (OS) or progression-free survival (PFS). A problem is that these endpoints do not differentiate effects on primary events, such as disease recurrence or cancer mortality, from competing events, such as death from comorbid illness. As a result, such models are suboptimal, because they pool patients at high risk for cancer events with patients at high risk for competing events, even though these groups have different expected benefit from intensive therapy (1–9). Thus, staging systems and nomograms that predict for OS and PFS are likely to be suboptimal for selecting patients with head and neck cancer (HNC) for intensive therapeutic regimens.

Previous studies indicate that in patients with competing risks, the hazard for cancer relative to competing mortality events (i.e., ω ratio) is a key determinant of treatment benefit (7, 9–11). In particular, older patients with higher ω ratios may be good candidates for more intensive therapy; conversely, younger patients with lower ω ratios may not be. Further work is needed to define factors that critically affect ω ratios and correlate them with treatment effects. Correspondingly, newer methods have been developed to quantify effects on the ω ratio, with considerable improvement in risk stratification compared with standard models (10–13). However, it is not known whether the benefit of more intensive treatment varies according to ω ratio and, in particular, whether this is a more effective method to predict which patients are most likely to benefit from intensive treatment. The goal of this study was to develop a model to identify patients with locally advanced HNC with a higher ω ratio and to test the hypothesis that such patients selectively benefit from treatment intensification.

Population, sampling methods, and treatment

We studied 2,688 patients with locoregionally advanced (stage III–IVB) HNC treated on three clinical trials: RTOG 9003 (NCT00771641), RTOG 0129 (NCT00047008), and RTOG 0522 (NCT00265941). Details of these protocols have been published previously (14–17). Written informed consent was obtained for all patients. The study was conducted in accordance with recognized ethical guidelines and was approved by the institutional review boards at all participating institutions.

Briefly, patients on RTOG 9003 were randomized to one of four arms: hyperfractionated radiotherapy (HFX: 81.6 Gy in 68 fractions twice a day over 7 weeks), delayed concomitant boost radiotherapy (DCB: 72 Gy in 42 fractions over 6 weeks), split course radiotherapy (SC: 67.2 Gy in 42 fractions over 6 weeks), or standard fractionation (SFX: 70 Gy in 35 fractions over 7 weeks). For the purpose of this analysis, HFX and DCB were considered AFX, whereas SC and SFX were not. Patients on RTOG 9003 did not receive chemotherapy. Patients on RTOG 0129 were randomized to either AFX or SFX and received chemotherapy (two cycles of cisplatin 100 mg/m² in weeks 1 and 4 of chemoradiotherapy for patients receiving AFX, and three cycles at the same dose in weeks 1, 4, and 7 for patients receiving SFX). Patients on RTOG 0522 were randomized to either cetuximab (400 mg/m² loading dose, followed by 250 mg/m² weekly) or no cetuximab, and all patients received AFX (six fractions per week) along with concurrent cisplatin (two cycles of cisplatin 100 mg/m² in weeks 1 and 4 of chemoradiotherapy). All human investigations were performed after approval by a local human investigations committee and in accordance with an assurance filed with and approved by the Department of Health & Human Services.

Outcomes

Progression-free survival time was defined as the time from randomization to the first recurrence of disease, or death from any cause, or censoring. Overall survival time was defined as the time from randomization to death from any cause, or censoring. Time to recurrence and time to cancer-specific mortality were defined as the time from randomization to first recurrence (or cancer-related mortality), with competing mortality events treated as censored. Time to competing mortality for recurrence was defined as time from randomization to death from any cause, in the absence of a recurrence event, with recurrence events treated as censored. Correspondingly, time to competing mortality for cancer mortality was defined as time from randomization to death from any cause, in the absence of a cancer mortality event, with cancer mortality events treated as censored.
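The censoring rules above can be illustrated with a short R sketch. It assumes a hypothetical data frame with one row per patient and two flags, `recurred` and `died`; the column names and values are invented for illustration and are not taken from the trial data. Event times are omitted: in practice each indicator would be paired with the time to the first qualifying event or to censoring.

```r
# Hypothetical follow-up records; names and values are illustrative only.
patients <- data.frame(
  id       = 1:4,
  recurred = c(TRUE, FALSE, FALSE, TRUE),  # any recurrence observed
  died     = c(TRUE, TRUE, FALSE, FALSE)   # death from any cause observed
)

# Progression-free survival event: first recurrence or death from any cause.
patients$pfs_event <- as.integer(patients$recurred | patients$died)

# Overall survival event: death from any cause.
patients$os_event <- as.integer(patients$died)

# Cause-specific recurrence event: competing deaths are treated as censored (0).
patients$recurrence_event <- as.integer(patients$recurred)

# Competing mortality: death without a prior recurrence;
# patients who recurred are treated as censored (0).
patients$competing_event <- as.integer(patients$died & !patients$recurred)

patients
```

Under this coding, every PFS event is either a recurrence event or a competing mortality event, which is what allows the hazard for any progression-free event to be split into its two cause-specific parts.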

Statistical analysis

The study followed TRIPOD guidelines (18). The statistical approach involved two main steps: (i) development of a model to separate patients by ω ratio and (ii) validation of the model as a method to predict treatment effects (i.e., variation in treatment effects as a function of ω). Overall survival was used as the primary outcome assessment for model validation, because this endpoint was not used in model development and represents an outcome of clear clinical benefit to patients.

Kaplan–Meier functions were used to plot PFS and OS and cumulative incidence functions were used to plot competing events with respect to time. The basehaz function in R (version 3.4.2) was used to estimate cumulative hazards. Forest plots were used to analyze treatment effects within risk strata, according to intention to treat. Proportional hazards assumptions were tested using the Grambsch–Therneau method (cox.zph function in R).

We trained risk scores for recurrence, competing mortality, and PFS using data from the control arms from the three studies (Supplementary Fig. S1), based on the linear predictor from a multivariable Cox proportional hazards regression (19). For RTOG 9003, the SC and SFX arms were collectively considered the control group. For the multivariable models, we selected the following candidate variables for inclusion, based on their availability in all three trials and potential association with disease recurrence (15–17, 20, 21) and/or competing mortality (2–4, 11, 19, 22, 23): age (per 10 years; continuous), female sex, black/African-American race (vs. other), white/Caucasian race (vs. other), body mass index (BMI; ≤20 kg/m² vs. >20 kg/m²), ECOG performance status (0 vs. 1–2), marital status (married vs. other/unknown), anemia (yes/no), education history as a proxy for socioeconomic status (SES; any college vs. other/unknown), primary site (oral cavity vs. oropharynx vs. hypopharynx vs. larynx), T stage (0–2 vs. 3 vs. 4), and N stage (0 vs. 1–2a vs. 2b–2c vs. 3). Anemia was defined for males as a baseline hemoglobin ≤13.5 g/dL and for females as a baseline hemoglobin ≤12.5 g/dL. For patients with known smoking and tumor P16 status, we included pack-years (≤10 vs. >10) and P16 (positive vs. negative) as covariates. P16 was analyzed as a prognostic factor for both oropharyngeal and nonoropharyngeal sites, based on several studies that have found differences in outcomes by P16 or HPV status in both oropharyngeal (20, 21) and nonoropharyngeal HNC (24–26).

All variables were normalized by subtracting the sample mean and dividing by the sample standard deviation. The mean BMI value was imputed for 194 patients with missing data, using single imputation. Risk scores based on the linear predictor were generated by taking the inner product of the coefficient vector with the individual patient's data vector, as described previously (11). Risk strata were defined according to quantiles of the risk score distribution. We compared results with a standard model developed to stratify patients with oropharyngeal cancer (20, 21). Note that this model also stratifies patients with nonoropharyngeal cancer (Supplementary Fig. S2).

For modeling effects of covariates, we used generalized competing event (GCE) regression based on a proportional relative hazards model (11, 27, 28). A detailed description of the GCE modeling approach is provided below. In brief, the ratio of the cause-specific hazard for recurrence (λ1) versus the cause-specific hazard for competing mortality (λ2) is represented as ω+, whereas the ratio of the cause-specific hazard for recurrence (λ1) to the hazard for any progression-free event (λ1 + λ2) is represented as ω. We use the terms ω ratio and ω+ ratio to refer to observed values, whereas ω score and ω+ score refer to values of ω and ω+, respectively, predicted by the GCE model.

For GCE regression, the same variables were used as in the Cox proportional hazards models after normalization. Separate regression models were built for both cause-specific events (i.e., disease recurrence) and competing mortality (i.e., death in the absence of disease recurrence, with the cause-specific event treated as censored). Treatment-related deaths were classified as competing mortality events. To test the sensitivity of our conclusions to model specification, reduce overfitting, and facilitate clinical implementation, we generated a parsimonious GCE model using backward stepwise regression to exclude variables from the regression if P > 0.20 and by consolidating N stage (0 vs. 1–3). In this model, only age, performance status, BMI, oral cavity site, N stage, and P16 status had P < 0.20 and thus were retained in the final nomogram, which was trained on the subset of controls with known P16 status (N = 602).

GCE risk scores were generated by taking the inner product of the (normalized) individual patient's data vector with the difference of the coefficient vector for cause-specific events and competing mortality. For 95% confidence intervals (CIs) of estimates, we employed the gcerisk package in R (27). Risk strata were defined according to quantiles of the GCE risk score distribution. Tests of treatment effects and interactions included random effects for study and age (29). All P values are two-sided.
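A minimal sketch of the score construction described above, using base R only. The covariate values and both coefficient vectors are hypothetical placeholders (they are not estimates from this study), and tertiles are used as one example of quantile-based strata.

```r
# Hypothetical standardized covariates for three patients (rows) and
# three covariates (columns); values are illustrative only.
x <- matrix(c( 0.5, -1.0,  0.2,
              -0.3,  0.8,  1.1,
               1.2,  0.1, -0.7),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("age", "ecog", "n_stage")))

# Hypothetical Cox coefficients (log hazard ratios) from the two
# cause-specific models; these are not estimates from the paper.
beta_recurrence <- c(age = 0.05, ecog = 0.10, n_stage = 0.30)
beta_competing  <- c(age = 0.45, ecog = 0.60, n_stage = 0.05)

# GCE coefficients: difference between the two coefficient vectors.
beta_gce <- beta_recurrence - beta_competing

# Risk score: inner product of each patient's data vector with beta_gce.
gce_score <- as.vector(x %*% beta_gce)

# Risk strata defined by tertiles of the score distribution.
stratum <- cut(gce_score,
               breaks = quantile(gce_score, probs = c(0, 1/3, 2/3, 1)),
               labels = c("low", "mid", "high"),
               include.lowest = TRUE)

data.frame(gce_score, stratum)
```

A covariate that raises both hazards by a similar amount contributes little to the score, because its two coefficients largely cancel in the difference.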

GCE model

For mutually exclusive events of type k, we posit the following proportional relative hazards model:

$$\frac{\lambda_k(t \mid X)}{\sum_{j \ne k} \lambda_j(t \mid X)} = \omega_{k0}^{+}(t)\,\exp\left(\beta_{k\,GCE}^{+\prime} X\right) \qquad \text{(A)}$$

where

$$\omega_{k0}^{+}(t) = \frac{\lambda_{k0}(t)}{\sum_{j \ne k} \lambda_{j0}(t)} \qquad \text{(B)}$$

Here, $\lambda_{k0}(t)$ is the baseline hazard for an event of type k, $\sum_{j \ne k} \lambda_{j0}(t)$ is the baseline cause-specific hazard for the set of events competing with event type k, $X$ is a vector of covariates, and $\beta_{k\,GCE}^{+}$ is the vector of effects (coefficients) on the covariates. From this model, it can be shown that

$$\beta_{k\,GCE}^{+} = \beta_k - \beta_{j \ne k} \qquad \text{(C)}$$

where $\beta_k$ and $\beta_{j \ne k}$ represent effects on the baseline hazard for event type k and competing events, respectively, from the Cox proportional hazards model. We use $\hat{\beta}_{k\,GCE}^{+} = \hat{\beta}_k - \hat{\beta}_{j \ne k}$ as the estimator for $\beta_{k\,GCE}^{+}$ and $\hat{\omega}_{k0}^{+}(t) = \hat{\Lambda}_{k0}(t) / \hat{\Lambda}_{j \ne k\,0}(t)$, where $\hat{\Lambda}_{k0}(t)$ and $\hat{\Lambda}_{j \ne k\,0}(t)$ represent the Nelson–Aalen estimators (18) for the cumulative hazard for event type k and the set of competing events at time t, respectively. We estimate the predicted value of $\hat{\omega}_k^{+}(t \mid d)$ for an individual with given data vector $d$ as:

$$\hat{\omega}_k^{+}(t \mid d) = \hat{\omega}_{k0}^{+}(t)\,\exp\left(\hat{\beta}_{k\,GCE}^{+\prime} d\right) \qquad \text{(D)}$$

Note then that $\exp(\hat{\beta}_{k\,GCE}^{+})$ is the estimate of the ω+ ratio, which quantifies how the relative hazards for primary and competing events change in response to changes in covariates.

We define the omega value as the ratio of the hazard for an event of type k to the hazard for all events:

$$\omega_k(t \mid X) = \frac{\lambda_k(t \mid X)}{\sum_{j} \lambda_j(t \mid X)} \qquad \text{(E)}$$

and estimate the predicted omega value as:

$$\hat{\omega}_k(t \mid d) = \frac{\hat{\omega}_k^{+}(t \mid d)}{1 + \hat{\omega}_k^{+}(t \mid d)} \qquad \text{(F)}$$

Note that while $\hat{\omega}_k$ ranges from 0 to 1 inclusive, $\hat{\omega}_k^{+}$ ranges from 0 to ∞. With two event types (k = 1, 2), a value of $\hat{\omega}_1^{+} = 1$ means the hazard for event type 1 equals the hazard for event type 2, and therefore $\hat{\omega}_1 = \hat{\omega}_2 = 0.5$. For the purpose of this study, we defined ω+ as the ratio of the hazard for disease recurrence to the hazard for competing mortality in the absence of recurrence, and ω as the ratio of the hazard for disease recurrence to the hazard for recurrence or death from any cause. All values of ω are unscaled unless otherwise specified. Scaled estimates were obtained by factoring out the baseline ω+ values.
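The relationship between the two scales can be checked numerically. The sketch below assumes two event types, so that ω is the recurrence hazard divided by the sum of the recurrence and competing mortality hazards; the helper names are ours and are not part of the gcerisk package.

```r
# Convert the relative hazard omega-plus (range 0 to Inf) to omega, the share
# of the total event hazard attributable to the event of interest (range 0 to 1).
omega_from_plus <- function(omega_plus) omega_plus / (1 + omega_plus)

# Inverse mapping; defined for omega strictly below 1.
plus_from_omega <- function(omega) omega / (1 - omega)

omega_from_plus(1)     # equal hazards correspond to omega = 0.5
omega_from_plus(2.56)  # about 0.719
plus_from_omega(0.80)  # about 4
```

On this reading, the ω score threshold of 0.80 used in the analysis corresponds to a recurrence hazard roughly four times the competing mortality hazard, and the whole-cohort values reported in the Results (ω+ of 2.56 and ω of 0.719) are consistent with the same mapping.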

Sample size estimates

We used the power calculator described by Pintilie (30) to estimate sample sizes for hypothetical randomized trials with a primary endpoint of PFS, assuming balanced randomization, accrual time of 3 years, follow-up time of 2 years, two-sided α = 0.05, and β = 0.20. We considered two events, cancer recurrence (k = 1) and competing mortality (k = 2), and assumed an HR for cancer recurrence (θ1) of 0.5 and an HR for competing mortality (θ2) of 1. Under varying $\omega_1$ values, we allowed the HR for any event (θ) to vary according to the equation:

$$\theta = \omega_1 \theta_1 + (1 - \omega_1)\,\theta_2 \qquad \text{(G)}$$
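As a worked check, the projected HR can be sketched in R under two assumptions that are ours rather than stated in the text: that θ is the ω1-weighted average of the cause-specific HRs, θ = ω1·θ1 + (1 − ω1)·θ2, and that ω1 can be approximated by the share of 3-year events that are recurrences. With the cumulative incidences listed in Table 2 for the highest tertile of the whole cohort, this reproduces the tabulated HRs to three decimals.

```r
# Projected hazard ratio for any event as an omega-weighted average of the
# cause-specific hazard ratios (assumed form; defaults follow the text:
# theta1 = 0.5 for recurrence, theta2 = 1 for competing mortality).
projected_hr <- function(omega1, theta1 = 0.5, theta2 = 1) {
  omega1 * theta1 + (1 - omega1) * theta2
}

# Approximate omega1 from 3-year cumulative incidences given in percent.
# This approximation is an assumption made for illustration.
omega_from_incidence <- function(recurrence, competing) {
  recurrence / (recurrence + competing)
}

# Highest tertile, whole cohort (incidences taken from Table 2).
hr_cox_os <- projected_hr(omega_from_incidence(48.4, 25.3))
hr_gce    <- projected_hr(omega_from_incidence(36.6, 6.4))

round(c(cox_os = hr_cox_os, gce = hr_gce), 3)
```

Because θ2 is fixed at 1, a larger share of recurrence events pulls the projected HR closer to 0.5, which is why the group selected by the GCE model needs fewer patients despite its lower overall event rate.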

Final GCE risk score

R functions to define the GCE risk score and scaled predicted ω are:

    # Exponentiated GCE linear predictor, computed from normalized covariates
    exp.risk.score <- function(AGE, BMI, ECOG12, OC, N0, P16) {
      exp(-0.3693 * ((0.1 * AGE - 5.72) / 0.9126) +
           0.2044 * (((BMI > 20) - 0.88538) / 0.31883) -
           0.2262 * ((ECOG12 - 0.377076) / 0.4851) +
           0.1684 * ((OC - 0.03488) / 0.18364) -
           0.1274 * ((N0 - 0.14452) / 0.3519) -
           0.2147 * ((P16 - 0.488372) / 0.5))
    }

    # Scaled predicted omega (baseline omega+ factored out)
    scaled.omega.predicted <- function(AGE, BMI, ECOG12, OC, N0, P16) {
      r <- exp.risk.score(AGE, BMI, ECOG12, OC, N0, P16)
      r / (r + 1)
    }

    # Omega score, using the mean baseline omega+ estimate of 2.6
    omega.score <- function(AGE, BMI, ECOG12, OC, N0, P16) {
      r <- 2.6 * exp.risk.score(AGE, BMI, ECOG12, OC, N0, P16)
      r / (r + 1)
    }

For this calculation, AGE is in years, BMI is in kg/m², ECOG12 is 1 if ECOG performance status is >0 and 0 otherwise, OC is 1 for oral cavity tumors and 0 otherwise, N0 is 1 if there is no nodal involvement and 0 otherwise, and P16 is 1 for P16-positive tumors and 0 otherwise. The factor 2.6 is the mean baseline ω+ estimate from the control sample.
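To show how the score behaves, the sketch below restates the published coefficients, means, and standard deviations in tabular form and evaluates two hypothetical patients. The table layout, the function name `omega_score`, and both patient profiles are illustrative assumptions; only the numeric constants come from the functions above.

```r
# Coefficients and normalization constants restated from the functions above.
gce_terms <- data.frame(
  term = c("age10", "bmi_gt20", "ecog12", "oc", "n0", "p16"),
  beta = c(-0.3693, 0.2044, -0.2262, 0.1684, -0.1274, -0.2147),
  mean = c(5.72, 0.88538, 0.377076, 0.03488, 0.14452, 0.488372),
  sd   = c(0.9126, 0.31883, 0.4851, 0.18364, 0.3519, 0.5)
)

# Omega score for one patient: normalize each covariate, form the linear
# predictor, scale by the baseline omega+ (2.6), and map to the 0-1 range.
omega_score <- function(age, bmi, ecog12, oc, n0, p16, baseline = 2.6) {
  x <- c(0.1 * age, as.numeric(bmi > 20), ecog12, oc, n0, p16)
  linear_predictor <- sum(gce_terms$beta * (x - gce_terms$mean) / gce_terms$sd)
  omega_plus <- baseline * exp(linear_predictor)
  omega_plus / (omega_plus + 1)
}

# Two hypothetical patients (not drawn from the trial data).
younger <- omega_score(age = 50, bmi = 25, ecog12 = 0, oc = 0, n0 = 0, p16 = 0)
older   <- omega_score(age = 70, bmi = 25, ecog12 = 1, oc = 0, n0 = 0, p16 = 1)

c(younger = younger, older = older)
```

The first profile (age 50, ECOG 0, node-positive, P16-negative) scores above the 0.80 threshold, whereas the second (age 70, ECOG 1, node-positive, P16-positive) falls well below it, in line with the direction of the effects reported for age, performance status, and P16 status.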

Sample characteristics are provided in Supplementary Table S1. Comparisons of model estimates for the entire control group and the subset with known smoking history and P16 status appear in Table 1. Factors predicting a higher ω ratio were decreasing age, improved performance status, higher BMI, node-positive status, P16-negative status, and oral cavity primary. It is interesting to compare and contrast effect estimates from Cox versus GCE models. Although patients with poorer OS or PFS are typically identified as candidates for more intensive treatment, the GCE model indicates that patients with advanced age, poorer performance status, hypopharynx site, and advanced T category, for example, have a reduced hazard for cancer events relative to competing mortality, implying that such patients are relatively less likely to benefit from treatment intensification. Moreover, the effects of some factors, such as N3 category and marital, education, and smoking status, are attenuated in the GCE model due to offsetting effects on recurrence and competing mortality.

Table 1.

Comparison of Cox versus GCE models in all controls (left columns) and complete cases with known smoking and P16 status (right columns)

| Characteristic | All controls (N = 1352): Cox PH regression, HRᵃ (95% CI) | All controls (N = 1352): GCE regression, ω+ ratio (RHR)ᵇ (95% CI) | Subset with known smoking and P16 status (N = 527): Cox PH regression, HRᵃ (95% CI) | Subset with known smoking and P16 status (N = 527): GCE regression, ω+ ratio (RHR)ᵇ (95% CI) |
| --- | --- | --- | --- | --- |
| Age at diagnosis, per 10 yearsᶜ | 1.42 (1.31–1.53) | 0.66 (0.57–0.77) | 1.37 (1.19–1.58) | 0.67 (0.51–0.88) |
| Sex: female vs. male | 0.84 (0.70–1.01) | 1.06 (0.74–1.51) | 0.94 (0.68–1.29) | 0.94 (0.51–1.75) |
| Race: Black | 1.04 (0.74–1.47) | 0.51 (0.25–1.06) | 0.65 (0.35–1.22) | 0.88 (0.25–3.10) |
| Race: White | 0.80 (0.60–1.09) | 0.42 (0.22–0.80) | 0.59 (0.35–1.00) | 0.83 (0.29–2.41) |
| Race: nonblack/nonwhite | Reference | Reference | Reference | Reference |
| Body mass indexᶜ: ≤20 kg/m² vs. >20 kg/m² | 0.55 (0.46–0.67) | 1.03 (0.71–1.49) | 0.76 (0.54–1.09) | 1.66 (0.83–3.32) |
| ECOG performance statusᶜ: 1–2 vs. 0 | 1.35 (1.16–1.58) | 0.49 (0.36–0.67) | 1.54 (1.19–2.01) | 0.59 (0.34–1.00) |
| Anemia: yes vs. no/unknown | 1.11 (0.94–1.30) | 0.79 (0.58–1.08) | 0.92 (0.69–1.21) | 0.97 (0.56–1.67) |
| Married: yes vs. no/unknown | 0.75 (0.66–0.88) | 0.99 (0.74–1.32) | 0.77 (0.59–1.00) | 1.16 (0.70–1.93) |
| Education history: any college/vocational/technical vs. none/unknown | 0.60 (0.51–0.71) | 0.98 (0.71–1.36) | 0.57 (0.43–0.77) | 0.90 (0.52–1.55) |
| Anatomic subsite: oropharynx | Reference | Reference | Reference | Reference |
| Anatomic subsite: larynx | 1.16 (0.96–1.39) | 1.06 (0.73–1.53) | 0.93 (0.66–1.30) | 1.15 (0.59–2.24) |
| Anatomic subsite: hypopharynx | 1.65 (1.33–2.04) | 0.77 (0.50–1.18) | 1.72 (1.16–2.55) | 0.87 (0.39–1.91) |
| Anatomic subsite: oral cavityᶜ | 1.47 (1.13–1.91) | 2.55 (1.40–4.62) | 2.11 (1.22–3.64) | 2.46 (0.62–9.77) |
| T stage: 0–2 | Reference | Reference | Reference | Reference |
| T stage: 3 | 1.06 (0.88–1.28) | 0.85 (0.59–1.23) | 0.87 (0.63–1.21) | 0.84 (0.44–1.61) |
| T stage: 4 | 1.53 (1.26–1.86) | 0.77 (0.52–1.13) | 1.45 (1.04–2.01) | 0.84 (0.44–1.62) |
| N stage: 0ᶜ | Reference | Reference | Reference | Reference |
| N stage: 1–2a | 1.26 (1.01–1.57) | 1.36 (0.88–2.10) | 1.23 (0.82–1.84) | 1.53 (0.69–3.39) |
| N stage: 2b–2c | 1.29 (1.06–1.57) | 1.39 (0.94–2.06) | 1.54 (1.07–2.21) | 1.37 (0.67–2.82) |
| N stage: 3 | 2.19 (1.65–2.91) | 1.07 (0.60–1.91) | 2.93 (1.77–4.84) | 1.00 (0.36–2.81) |
| Smoking history, pack-years: ≤10 vs. >10 | n/a | n/a | 0.50 (0.36–0.70) | 1.05 (0.56–1.94) |
| P16 statusᶜ: positive vs. negative | n/a | n/a | 0.53 (0.39–0.72) | 0.66 (0.37–1.18) |

Abbreviations: ECOG, Eastern Cooperative Oncology Group; GCE, generalized competing event; PH, proportional hazards; RHR, relative hazard ratio.

ᵃ A value >1 indicates increased HR for progression-free survival.

ᵇ A value >1 indicates increased HR for cancer recurrence relative to competing mortality.

ᶜ Retained in parsimonious GCE model (nomogram).

Compared with standard models, GCE models improved stratification according to ω ratio within each risk group, with increasing ω from low risk to high risk according to both model predictions and observations (Supplementary Table S2). Agreement between predicted ω (i.e., ω score) and observed ω ratios was high, indicating excellent model fit and validity. The observed 3-year ω and ω+ ratios for the whole cohort were 0.719 and 2.56, respectively. The observed 3-year ω and ω+ ratios for the subset with known P16 and smoking status were 0.738 and 2.87, respectively.

As shown in Fig. 1, OS differed markedly across risk groups defined by standard models (Fig. 1A and C), whereas GCE models show little correspondence between OS and risk level when risk is defined by ω score (Fig. 1B and D). This suggests that, paradoxically, patients with a better predicted survival (and higher ω score) could be more likely to benefit from intensive treatment (by virtue of being much less likely to die from noncancer causes). This is further shown in Fig. 2, which plots the cumulative incidences of cancer recurrence and competing mortality within risk groups. Note that with standard risk stratification models, the probability of both cancer and noncancer mortality is increased in the highest risk strata relative to the GCE model, whereas the converse is true of the lower risk strata, further supporting GCE model validity. This is because while standard models are designed to separate groups according to PFS and OS, GCE models are designed to optimize the ratio of competing events in order to favor a particular event of interest.

Figure 1.

Overall survival by risk strata. A, Cox model in the whole cohort. B, Generalized competing event (GCE) model in the whole cohort. C, Fakhry and colleagues (21) nomogram in patients with known smoking history and P16 status. D, GCE nomogram in patients with known P16 status.

Figure 2.

Competing event incidences by risk score. A, Cox model in the whole cohort. B, Generalized competing event (GCE) model in the whole cohort. C, Fakhry and colleagues (21) nomogram in patients with known smoking history and P16 status. D, GCE model in patients with known P16 status.

Patients with the highest ω score (≥0.80)—representing the highest quintile—were more likely to benefit from intensive treatment (5-year OS, 70.0% vs. 56.6%; HR of 0.73, 95% CI: 0.57–0.94; Wald P = 0.016) than those with ω score <0.80 (5-year OS, 46.7% vs. 45.3%; HR of 1.02, 95% CI: 0.92–1.14; Wald P = 0.69; P = 0.019 for interaction). For patients with known P16 status, the GCE nomogram similarly identified a statistically significant benefit from treatment intensification in patients with ω score ≥0.80 (HR of 0.67; 95% CI: 0.47–0.95, Wald P = 0.027); in contrast, we did not find statistically significant treatment effects in the high-risk subgroups defined by standard models overall (Fig. 3), or in any of the trials separately. Treatment intensification was also associated with statistically significant improvement in OS in patients with ω score ≥0.80 in the RTOG 9003 trial separately (Supplementary Fig. S3). These results appeared robust over a range of potential cut-off points near the ω score of 0.80 (Supplementary Fig. S4A and S4B). Calibration plots showed excellent discriminatory ability, with better fitting at higher predicted ω values (Supplementary Fig. S4C). A nomogram for calculating an individual's ω score appears in Fig. 4.

Figure 3.

Interaction between experimental therapy and ω score. A, Whole cohort. Left: ω score <0.80; right: ω score ≥0.80. B, Patients with known P16 status. Left: ω score <0.80; right: ω score ≥0.80.

Figure 4.

Nomogram to predict patients' relative hazard for recurrence based on GCE regression model. ECOG, Eastern Cooperative Oncology Group.

Model estimates and performance were similar when patients with missing BMI data were omitted from the analysis. We found evidence of efficiency gains with the GCE model relative to standard models under varying definitions of “high risk” (Table 2), due to the higher ratio of primary to competing events. However, this analysis does not account for efficiency loss that could result from a lower event rate. Although the incidence of competing mortality was lower in the high-risk group using the GCE model, the lower incidence of cancer recurrence offsets some of the efficiency gains, indicating correlation between primary and competing events. As such, GCE models could be less efficient than models designed to predict recurrence, but this conclusion was sensitive to the lack of P16 status for the majority of the cohort. It is noteworthy, however, that sample size estimates were similar with the various approaches, despite a marked reduction in the overall event rate in the “high-risk” group defined by the GCE model.

Table 2.

Comparison of sample size estimates within variously defined high-risk groups

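| Cohort | High-risk group | Model | Cancer recurrence, 3-year cumulative incidence (%) | Competing mortality, 3-year cumulative incidence (%) | HRᵃ | N |
| --- | --- | --- | --- | --- | --- | --- |
| Whole cohort | Highest tertile | Cox model for OS | 48.4 | 25.3 | 0.672 | 442 |
| Whole cohort | Highest tertile | Cox model for recurrence | 50.7 | 21.9 | 0.651 | 364 |
| Whole cohort | Highest tertile | GCE modelᵃ | 36.6 | 6.4 | 0.574 | 307 |
| Whole cohort | Highest quintile | Cox model for OS | 51.9 | 26.7 | 0.670 | 409 |
| Whole cohort | Highest quintile | Cox model for recurrence | 54.6 | 24.3 | 0.654 | 346 |
| Whole cohort | Highest quintile | GCE modelᵃ | 36.6 | 5.5 | 0.565 | 293 |
| Subset with known P16 and smoking status | Highest tertile | Fakhry model for OS | 52.3 | 20.7 | 0.642 | 331 |
| Subset with known P16 and smoking status | Highest tertile | Cox model for recurrence | 53.4 | 18.5 | 0.629 | 298 |
| Subset with known P16 and smoking status | Highest tertile | GCE model | 39.7 | 8.9 | 0.592 | 315 |
| Subset with known P16 and smoking status | Highest quintile | Fakhry model for OS | 54.4 | 24.6 | 0.655 | 351 |
| Subset with known P16 and smoking status | Highest quintile | Cox model for recurrence | 58.9 | 19.6 | 0.625 | 264 |
| Subset with known P16 and smoking status | Highest quintile | GCE model | 34.6 | 11.0 | 0.600 | 301 |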

Abbreviations: GCE, generalized competing event; OS, overall survival.

ᵃ Projected HR for recurrence or death from any cause from Eq. (G), based on observed ω ratios.

In this study we found that intensive treatment differentially benefits patients with a higher relative recurrence risk (ω score ≥0.80). Previous studies involving HNC and other disease sites have found that ω scores could be used to identify patients with a greater likelihood to benefit from intensive therapy (3, 10–13). This study is the first to examine treatment effects within risk groups defined by this factor. We found evidence to support the hypothesis that relative recurrence risk is an important predictor of treatment effectiveness in the HNC population.

Advantages of this study were its large sample size and large number of known predictors of both cancer-related and competing events. Randomization also mitigated the impact of selection bias, which is a problem when studying treatment effects in other data sources. A limitation of this study, however, is the heterogeneity in the treatment and populations across the three trials. “Intensive treatment” was defined relative to the baseline (control) group and thus included AFX (with or without concurrent chemotherapy), or chemoradiotherapy with concurrent targeted therapy, depending on the trial. We were also unable to control directly for some predictors, such as comorbidity and SES, that likely would have helped optimize the model relative to standard approaches. For example, income was prognostic for both cancer recurrence and competing mortality but had to be omitted because it was not collected for all trials. Although previous studies have found increased survival for patients undergoing treatment at high-volume centers (31–33), radiotherapy quality in RTOG trials is considered to be high. The incidence of noncancer mortality in this cohort was also lower than has been observed in prior studies (2), indicating the exclusion of many patients at risk for competing events.

Variation in the definition of “intensive treatment” is a potential limitation of our study; however, the intent was to compare the effectiveness of treatment intensification relative to the control arm, and our results were unaffected whether we applied fixed- or random-effects models. Data from large randomized trials or meta-analyses involving homogeneous treatments (in particular, with an established survival benefit) will be important for further model validation. Note that both RTOG 0129 and RTOG 0522 failed to reject the null hypothesis; in the absence of effective therapy, it is not possible to identify subpopulations that would benefit. Future studies involving more trials that met their primary endpoint would be helpful to determine how treatment effects and toxicity vary with the ω ratio. However, we did observe a survival advantage with AFX in the high-risk group from RTOG 9003, lending support to the hypothesis that ω score is a useful predictive marker.

The effects of particular covariates in this study should be interpreted with caution, because CIs were fairly wide (leading to some differences in interpretation across samples). Lack of consistency and incomplete collection of key prognostic variables hamper efforts to compare risk models, requiring us to retrain multivariable models in new samples; however, GCE models have previously been validated in population-based studies (10, 11). It should also be noted that age cutoffs ≤50 (and >70) have been previously associated with a selective benefit (or lack thereof) of treatment intensification in HNC, including in the RTOG 0522 trial included in this analysis (16, 34, 35). However, age as a sole criterion for treatment selection is generally not favored (36), because other health factors can influence the appropriate intensity of therapy. In this study, age ≤50 years was not predictive of a treatment benefit in the whole cohort.

GCE regression is a modeling approach with clear differences relative to standard risk stratification methods. It contrasts with other nomograms (21, 37, 38) in that instead of predicting patients' event-free survival, which is preferable for prognostication, the GCE model seeks to predict the ratio of cancer events to competing events, which is considered preferable as a predictive model. Further studies are required to establish its advantages over standard methods, especially in the postoperative setting and in the larger population not participating in trials, who we expect would have differing risk for competing events. An important limitation is that the cutoff of 0.80 for the ω score, although robust, was not chosen a priori; our results should thus be considered hypothesis generating and should be validated in future studies. Ascertaining optimal cutoffs to define “high-risk” groups remains an area of investigation, especially with models controlling for comorbidity and other geriatric/frailty assessments. Perhaps most interestingly, our findings suggest that a higher absolute risk for recurrence/progression does not necessarily confer a higher likelihood of benefiting from intensive therapy (or greater power to detect treatment effects). This is because patients with a low risk for both recurrence and competing mortality may benefit as much from aggressive treatment approaches as patients with a high risk for both events.
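The distinction between absolute and relative recurrence risk can be illustrated with a small numerical sketch. It assumes, purely for illustration, that ω is the hazard for cancer recurrence divided by the sum of the hazards for recurrence and competing mortality; the function name `omega_ratio`, the three hypothetical patients, and their hazard values are invented for this example and are not taken from the trial data or from the fitted GCE model.

```python
def omega_ratio(recurrence_hazard: float, competing_hazard: float) -> float:
    """Assumed definition: recurrence hazard as a fraction of the total event hazard."""
    if recurrence_hazard < 0 or competing_hazard < 0:
        raise ValueError("hazards must be non-negative")
    total = recurrence_hazard + competing_hazard
    if total == 0:
        raise ValueError("at least one hazard must be positive")
    return recurrence_hazard / total


# Hypothetical (recurrence hazard, competing-mortality hazard) pairs,
# in events per person-year. Illustrative values only.
patients = {
    "low absolute risk": (0.08, 0.01),
    "high absolute risk": (0.40, 0.05),
    "highest recurrence hazard": (0.50, 0.50),
}

omegas = {label: omega_ratio(*hazards) for label, hazards in patients.items()}

for label, (recurrence, competing) in patients.items():
    print(f"{label}: recurrence hazard={recurrence:.2f}, "
          f"competing hazard={competing:.2f}, omega={omegas[label]:.2f}")
```

Under these assumptions, the first two patients share the same ω (about 0.89) even though their absolute hazards differ fivefold, whereas the third patient has the highest absolute recurrence hazard but an ω of only 0.5.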

In summary, we propose a method to predict an underreported but meaningful quantity for individual patients (i.e., relative recurrence risk, or ω ratio), with a clinically relevant interpretation (i.e., a value >50% means the individual's hazard for cancer recurrence exceeds the hazard for competing mortality). Our findings indicate that patients with a higher relative recurrence risk, indicated by an ω score ≥0.80, selectively benefit from intensive therapy. This approach is being implemented prospectively in the NRG-HN004 trial, along with a nomogram to inform clinical practice and trial design (comogram.org). Further research, however, is needed to optimize GCE models and to ascertain which patients derive the greatest benefit from intensive therapy.
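The >50% interpretation, and what a cutoff of 0.80 implies, can be worked out with simple algebra under one assumption: that ω is the hazard for cancer recurrence, written here as λ₁, expressed as a fraction of the total hazard λ₁ + λ₂, where λ₂ is the hazard for competing mortality. The symbols and this form of the definition are assumptions made for illustration, and the predicted ω score from a fitted model may be scaled differently.

```latex
\begin{align*}
\omega &= \frac{\lambda_1}{\lambda_1 + \lambda_2}, \qquad \lambda_1, \lambda_2 > 0 \\[4pt]
\omega > 0.50 &\iff \lambda_1 > 0.50\,(\lambda_1 + \lambda_2) \iff \lambda_1 > \lambda_2 \\[4pt]
\omega \ge 0.80 &\iff \lambda_1 \ge 0.80\,(\lambda_1 + \lambda_2)
  \iff 0.20\,\lambda_1 \ge 0.80\,\lambda_2
  \iff \lambda_1 \ge 4\,\lambda_2
\end{align*}
```

Under this assumed definition, an ω of at least 0.80 corresponds to a recurrence hazard at least four times the hazard for competing mortality.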

D.I. Rosenthal is an employee of Merck. S.J. Frank is an employee of Boston Scientific and Varian Medical; reports receiving commercial research grants from Hitachi, Eli Lilly, and Elekta; and holds ownership interest (including patents) in C4 Imaging. J.A. Bonner is an employee of Bristol-Myers Squibb, Eli Lilly, Merck Serono, and Cel-Sci. S.S. Yom is an employee of Galera; reports receiving commercial research grants from Bristol-Myers Squibb, Merck, Genentech, and BioMimetix; and reports receiving other remuneration from Springer and UpToDate. No potential conflicts of interest were disclosed by the other authors.

Conception and design: L.K. Mell, H. Shen, Q.-T. Le

Development of methodology: L.K. Mell, H. Shen, K. Zakeri, S.J. Wong

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): L.K. Mell, P.F. Nguyen-Tân, D.I. Rosenthal, A.M. Trotti III, J.A. Bonner, C.U. Jones, S.S. Yom, W.L. Thorstad, S.J. Wong, G. Shenouda, J.A. Ridge, Q.E. Zhang

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): L.K. Mell, H. Shen, D.I. Rosenthal, K. Zakeri, L.K. Vitzthum, S.J. Frank, P.B. Schiff, A.M. Trotti III, J.A. Bonner, S.J. Wong, G. Shenouda, J.A. Ridge, Q.E. Zhang

Writing, review, and/or revision of the manuscript: L.K. Mell, H. Shen, P.F. Nguyen-Tân, D.I. Rosenthal, K. Zakeri, L.K. Vitzthum, S.J. Frank, P.B. Schiff, A.M. Trotti III, J.A. Bonner, C.U. Jones, S.S. Yom, W.L. Thorstad, S.J. Wong, G. Shenouda, J.A. Ridge, Q.E. Zhang, Q.-T. Le

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): L.K. Mell, L.K. Vitzthum, J.A. Bonner, S.J. Wong

Study supervision: L.K. Mell

This project was supported by grants U10CA180868 (NRG Oncology Operations) and U10CA180822 (NRG Oncology SDMC) from the National Cancer Institute (NCI; NRG Oncology/RTOG 9003, NCT00771641, https://clinicaltrials.gov/ct2/show/NCT00771641, NRG Oncology/RTOG 0129, NCT00047008, https://clinicaltrials.gov/ct2/show/NCT00047008, and NRG Oncology/RTOG 0522, NCT00265941, https://clinicaltrials.gov/ct2/show/NCT00265941), and by Eli Lilly.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Argiris A, Brockstein BE, Haraf DJ, Stenson KM, Mittal BB, Kies MS, et al. Competing causes of death and second primary tumors in patients with locoregionally advanced head and neck cancer treated with chemoradiotherapy. Clin Cancer Res 2004;10:1956–62.
2. Mell LK, Dignam JJ, Salama JK, Cohen EE, Polite BN, Dandekar V, et al. Predictors of competing mortality in advanced head and neck cancer. J Clin Oncol 2010;28:15–20.
3. Rose BS, Jeong JH, Nath SK, Lu SM, Mell LK. Population-based study of competing mortality in head and neck cancer. J Clin Oncol 2011;29:3503–9.
4. Kwon M, Roh JL, Song J, Lee SW, Kim SB, Choi SH, et al. Noncancer health events as a leading cause of competing mortality in advanced head and neck cancer. Ann Oncol 2014;25:1208–14.
5. Mell LK, Weichselbaum RR. More on cetuximab in head and neck cancer. N Engl J Med 2007;357:2201–2.
6. Dignam JJ, Kocherginsky MN. Choice and interpretation of statistical tests used when competing risks are present. J Clin Oncol 2008;26:4027–34.
7. Mell LK, Jeong JH. Pitfalls of using composite primary end points in the presence of competing risks. J Clin Oncol 2010;28:4297–9.
8. Mell LK, Zakeri K, Rose BS. On lumping, splitting, and the nosology of clinical trial populations and end points. J Clin Oncol 2014;32:1089–90.
9. Mell LK, Carmona R, Gulaya S, Lu T, Wu J, Saenz CC, et al. Cause-specific effects of radiotherapy and lymphadenectomy in stage I-II endometrial cancer: a population-based study. J Natl Cancer Inst 2013;105:1656–66.
10. Carmona R, Gulaya S, Murphy JD, Rose BS, Wu J, Noticewala S, et al. Validated competing event model for the stage I-II endometrial cancer population. Int J Radiat Oncol Biol Phys 2014;89:888–98.
11. Carmona R, Zakeri K, Green G, Hwang L, Gulaya S, Xu B, et al. Improved method to stratify elderly patients with cancer at risk for competing events. J Clin Oncol 2016;34:1270–7.
12. Zakeri K, Rose BS, D'Amico AV, Jeong JH, Mell LK. Competing events and costs of clinical trials: analysis of a randomized trial in prostate cancer. Radiother Oncol 2015;115:114–9.
13. Zakeri K, Rose BS, Gulaya S, D'Amico AV, Mell LK. Competing event risk stratification may improve the design and efficiency of clinical trials: secondary analysis of SWOG 8794. Contemp Clin Trials 2013;34:74–9.
14. Fu KK, Pajak TF, Trotti A, Jones CU, Spencer SA, Phillips TL, et al. A Radiation Therapy Oncology Group (RTOG) phase III randomized study to compare hyperfractionation and two variants of accelerated fractionation to standard fractionation radiotherapy for head and neck squamous cell carcinomas: first report of RTOG 9003. Int J Radiat Oncol Biol Phys 2000;48:7–16.
15. Nguyen-Tân PF, Zhang Q, Ang KK, Weber RS, Rosenthal DI, Soulieres D, et al. Randomized phase III trial to test accelerated versus standard fractionation in combination with concurrent cisplatin for head and neck carcinomas in the Radiation Therapy Oncology Group 0129 trial: long-term report of efficacy and toxicity. J Clin Oncol 2014;32:3858–66.
16. Ang KK, Zhang Q, Rosenthal DI, Nguyen-Tân PF, Sherman EJ, Weber RS, et al. Randomized phase III trial of concurrent accelerated radiation plus cisplatin with or without cetuximab for stage III to IV head and neck carcinoma: RTOG 0522. J Clin Oncol 2014;32:2940–50.
17. Beitler JJ, Zhang Q, Fu KK, Trotti A, Spencer SA, Jones CU, et al. Final results of local-regional control and late toxicity of RTOG 9003: a randomized trial of altered fractionation radiation for locally advanced head and neck cancer. Int J Radiat Oncol Biol Phys 2014;89:13–20.
18. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. Br J Cancer 2015;112:251–9.
19. Cox DR. Regression models and life tables. J R Stat Soc Series B Stat Methodol 1972;B34:187–220.
20. Ang KK, Harris J, Wheeler R, Weber RS, Rosenthal DI, Nguyen-Tân PF, et al. Human papillomavirus and survival of patients with oropharyngeal cancer. N Engl J Med 2010;363:24–35.
21. Fakhry C, Zhang Q, Nguyen-Tân PF, Rosenthal DI, Weber RS, Lambert L, et al. Development and validation of nomograms predictive of overall and progression-free survival in patients with oropharyngeal cancer. J Clin Oncol 2017;35:4057–65.
22. Zakeri K, MacEwan I, Vazirnia A, Cohen EE, Spiotto MT, Haraf DJ, et al. Race and competing mortality in advanced head and neck cancer. Oral Oncol 2014;50:40–4.
23. Park A, Alabaster A, Shen H, Mell LK, Katzel JA. Undertreatment of women with locoregionally advanced head and neck cancer. Cancer 2019;125:3033–9.
24. Chung CH, Zhang Q, Kong CS, Harris J, Fertig EJ, Harari PM, et al. p16 protein expression and human papillomavirus status as prognostic biomarkers of nonoropharyngeal head and neck squamous cell carcinoma. J Clin Oncol 2014;32:3930–8.
25. Bryant AK, Sojourner EJ, Vitzthum LK, Zakeri K, Shen H, Nguyen C, et al. Prognostic role of p16 in nonoropharyngeal head and neck cancer. J Natl Cancer Inst 2018;110:1393–9.
26. Tian S, Switchenko JM, Jhaveri J, Cassidy RJ, Ferris MJ, Press RH, et al. Survival outcomes by high-risk human papillomavirus status in nonoropharyngeal head and neck squamous cell carcinomas: a propensity-scored analysis of the National Cancer Data Base. Cancer 2019;125:2782–2793.
27. Shen H, Carmona R, Mell LK. Generalized competing event model: gcerisk R package. R (CRAN). Available from: https://cran.r-project.org/.
28. Lunn M, McNeil D. Applying Cox regression to competing risks. Biometrics 1995;51:524–32.
29. Michiels S, Baujat B, Mahé C, Sargent DJ, Pignon JP. Random effects survival models gave a better understanding of heterogeneity in individual patient data meta-analyses. J Clin Epidemiol 2005;58:238–45.
30. Pintilie M. Competing risks: a practical perspective. Chichester, England: John Wiley & Sons; 2006. p. 115–26.
31. Wuthrick EJ, Zhang Q, Machtay M, Rosenthal DI, Nguyen-Tan PF, Fortin A, et al. Institutional clinical trial accrual volume and survival of patients with head and neck cancer. J Clin Oncol 2015;33:156–64.
32. Boero IJ, Paravati AJ, Xu B, Cohen EE, Mell LK, Le QT, et al. Importance of radiation oncologist experience among patients with head-and-neck cancer treated with intensity-modulated radiation therapy. J Clin Oncol 2016;34:684–90.
33. David JM, Ho AS, Luu M, Yoshida EJ, Kim S, Mita AC, et al. Treatment at high-volume facilities and academic centers is independently associated with improved survival in patients with locally advanced head and neck cancer. Cancer 2017;123:3933–42.
34. Bourhis J, Overgaard J, Audry H, Ang KK, Saunders M, Bernier J, et al. Meta-Analysis of Radiotherapy in Carcinomas of Head and neck (MARCH) Collaborative Group. Hyperfractionated or accelerated radiotherapy in head and neck cancer: a meta-analysis. Lancet 2006;368:843–54.
35. Pignon JP, le Maître A, Maillard E, Bourhis J; MACH-NC Collaborative Group. Meta-analysis of chemotherapy in head and neck cancer (MACH-NC): an update on 93 randomised trials and 17,346 patients. Radiother Oncol 2009;92:4–14.
36. Wildiers H, Mauer M, Pallis A, Hurria A, Mohile SG, Luciani A, et al. End points and trial design in geriatric oncology research: a joint European organisation for research and treatment of cancer–Alliance for Clinical Trials in Oncology–International Society Of Geriatric Oncology position article. J Clin Oncol 2013;31:3711–8.
37. Wang SJ, Patel SG, Shah JP, Goldstein DP, Irish JC, Carvalho AL, et al. An oral cavity carcinoma nomogram to predict benefit of adjuvant radiotherapy. JAMA Otolaryngol Head Neck Surg 2013;139:554–9.
38. Balachandran VP, Gonen M, Smith JJ, DeMatteo RP. Nomograms in oncology: more than meets the eye. Lancet Oncol 2015;16:e173–80.