Abstract
We conducted a randomized phase II multicenter clinical trial to test the hypothesis that physiologic MRI-based radiotherapy (RT) dose escalation would improve the outcome of patients with poor prognosis head and neck cancer.
MRI was acquired at baseline and at RT fraction 10 to create low blood volume/apparent diffusion coefficient maps for RT boost subvolume definition in gross tumor volume. Patients were randomized to receive 70 Gy (standard RT) or 80 Gy to the boost subvolume (RT boost) with concurrent weekly platinum. The primary endpoint was disease-free survival (DFS) with significance defined at a one-sided 0.1 level, and secondary endpoints included locoregional failure (LRF), overall survival (OS), comparison of adverse events and patient reported outcomes (PRO).
Among 81 randomized patients, neither the primary endpoint of DFS (HR = 0.849, P = 0.31) nor OS (HR = 1.19, P = 0.66) was significantly improved in the RT boost arm. However, the incidence of LRF was significantly improved with the addition of the RT boost (HR = 0.43, P = 0.047). Two-year estimates [90% confidence interval (CI)] of the cumulative incidence of LRF were 40% (27%–53%) in the standard RT arm and 18% (10%–31%) in the RT boost arm. Two-year estimates (90% CI) for DFS were 48% (34%–60%) in the standard RT arm and 57% (43%–69%) in the RT boost arm. There were no significant differences in toxicity or longitudinal differences seen in EORTC QLQ30/HN35 subscales between treatment arms in linear mixed-effects models.
Physiologic MRI-based RT boost decreased LRF without a significant increase in grade 3+ toxicity or longitudinal PRO differences, but did not significantly improve DFS or OS. Additional improvements in systemic therapy are likely necessary to realize improvements in DFS and OS.
Physiologic MRI imaging biomarkers before and during radiation therapy have been shown to improve prediction of radiation treatment response in head and neck cancer over clinical factors alone. MRI-directed adaptive radiation therapy is a promising adaptive strategy to personalize treatment for the individual patient. This study randomized patients with poor prognosis head and neck cancer to standard definitive chemoradiation or a physiologic MRI-directed radiotherapy boost. Both arms received concurrent weekly cisplatin. The results showed a locoregional control benefit to physiologic MRI-directed radiation boost, without significant differences in toxicity or changes in patient-reported outcomes between the treatment arms. Future studies assessing physiologic MRI-directed radiation therapy personalization are warranted to facilitate individual treatment decisions in locally advanced head and neck cancer.
Introduction
Locoregional failure (LRF) is common in poor prognosis locally advanced head and neck cancer and is often unsalvageable, leading to death in up to 50% of patients (1). To date, the addition of targeted therapies, immunotherapy or accelerated fractionation to standard-of-care radiotherapy (70 Gy) with concurrent platinum has failed to improve outcomes (2–5). On the basis of our experience and that of others, the vast majority of LRFs in patients receiving intensity-modulated radiotherapy occur within the targets that received the full prescription dose (6). Studies attempting to decrease LRF rates through intensifying treatment to the entire gross tumor volume (GTV) have been limited to modest dose increases due to toxicity and have not shown improvements in tumor control (7–9).
We have shown that dynamic contrast enhanced MRI (DCE-MRI) may be used to identify areas of poor perfusion within a tumor and that persistent low blood volume (BV) tumor subvolumes after 2 weeks of radiation are predictive of local failure (10). Low apparent diffusion coefficient (ADC) has been associated with highly cellular tumor subvolumes (11) and persistently low ADC during radiotherapy (RT) may be associated with LRF, while an increase in ADC during treatment may be associated with improved locoregional control (LRC; refs. 12–14). Furthermore, highly conformal RT can be used to redistribute dose within the tumor such that the dose to subvolumes likely to be resistant to standard RT can be increased, while standard doses to the rest of the tumor and surrounding normal tissues are maintained (15–17). This concept of RT adaptation may be used to assess response to therapy, as predictive changes seen within the tumor after 2 weeks occur early enough in treatment to allow time for meaningful intervention to the RT plan.
On the basis of these findings, we hypothesized that a physiologic MRI-directed RT boost would decrease LRF and improve disease-free survival (DFS). To test this hypothesis, we conducted a randomized phase II trial comparing definitive chemoradiation with physiologic MRI-directed RT boost to standard-dose chemoradiation.
Patients and Methods
After Institutional Review Board (IRB) approval, a randomized phase II study was conducted at the University of Michigan and the Ann Arbor Veterans Affairs Hospital (NCT0031250). The study was approved by the IRBs of all participating sites and was conducted in accordance with the Declaration of Helsinki, the Belmont Report, and U.S. Common Rule. Patients were randomly assigned 1:1 to (i) standard-dose RT (70 Gy in 35 fractions) or (ii) an experimental arm of MRI-directed RT boost to 80 Gy. Both arms received concurrent weekly platinum. Randomization was stratified by size of the primary GTV (< or ≥ 56 cc) and RT boost volume (< or ≥ 10 cc).
Objectives
The primary objective of the study was to compare DFS between patients treated with 70 Gy with concurrent platinum and patients treated with an MRI-directed boost to 80 Gy with concurrent platinum. Secondary objectives were to compare LRF, overall survival (OS), pattern of failure, adverse effects, and quality of life (QOL) between the treatment groups.
Patients
Patients aged >18 years with previously untreated squamous cell carcinoma of the head and neck, stage III–IV according to AJCC tumor–node–metastasis classification, 7th edition, without distant metastases and undergoing curative treatment with definitive RT were initially eligible for inclusion. This included p16(−) locally/regionally advanced (cT3-4 or cN2-3) oropharyngeal cancer, cT3-4 bulky (>40 cc) laryngeal/hypopharyngeal cancer, inoperable or resection-declined stage III/IV oral cavity or paranasal sinus cancers, and locally/regionally advanced EBV negative (cT3-4 or cN3) nonendemic nasopharyngeal cancer, together termed locally advanced head and neck cancer (LAHNSCC). In 2015, eligibility was expanded to include cT4 or cN3 p16+ oropharyngeal cancer. Karnofsky performance status > 70 and adequate renal, bone marrow, and liver function were required. Patients were excluded for any head and neck surgery other than biopsy, prior head and neck radiation, or contraindication to contrast enhanced MRI. In women of childbearing age, potential pregnancy had to be excluded before entering the study. All patients signed a written informed consent before study inclusion. All patients underwent a biopsy to confirm the diagnosis as well as CT scan of the head and neck and PET-CT for assessment of clinical tumor stage. IHC analysis for p16 was mandatory and used as a surrogate marker of human papillomavirus (HPV)–related disease.
Treatment
In p16+ oropharyngeal cancer, 70 Gy was prescribed to the high-risk planning tumor volume (PTVhigh) including primary tumor and lymph node metastases and 56 Gy to PTVlow elective neck volumes, given as daily fractions of 2.0 Gy and 1.6 Gy, respectively, 5 days per week. In LAHNSCC, an additional PTVmid volume was included for positive nodal levels or other areas at high risk for subclinical disease; PTVmid was treated to 59.5 Gy in 35 fractions. Details regarding RT planning, including target delineation instructions and dose-volume constraints, are described in the submitted trial protocol. The PTV margin was 3 mm for all volumes, including the boost volumes; however, volumes were edited to limit mucosal dose to 2 Gy/fraction.
DCE-MRI and diffusion MRI were acquired at baseline and at week 2 (fraction 9–11) as previously described to create BV and ADC maps (15, 18, 19). The pretreatment and midtreatment BV and ADC maps were overlaid with each other, and the persisting low BV (defined as <7.64 mL/100 g) and persisting low ADC (defined as <1.2 μm²/ms) in both primary tumor and lymph nodes were summed to create the boost volume. Patients with >1 cc persistent total low BV and low ADC subvolumes in GTVs were randomized. In the experimental arm, a union of the persisting low BV and persisting low ADC subvolumes (boost volume) received 2.5 Gy/fraction for the last 20 fractions and received a total dose of 80 Gy in 35 fractions (86 Gy EQD2 for an α/β ratio of 2.5). If the union of persisting subvolumes after 20 Gy was < 1 cc, the patient was entered into a nonrandomized cohort and treated by standard RT.
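The EQD2 figure quoted for the boost subvolume can be checked with the standard linear-quadratic conversion, EQD2 = D × (d + α/β) / (2 + α/β), where D is the total dose and d the dose per fraction. The sketch below assumes, from the stated total of 80 Gy in 35 fractions, that the boost subvolume received 2.0 Gy/fraction for the first 15 fractions before the 2.5 Gy/fraction boost began; it applies no correction for overall treatment time. The function and variable names are illustrative and are not taken from the trial protocol.

```python
def eqd2(dose_per_fraction, n_fractions, alpha_beta):
    """Equivalent dose in 2 Gy fractions under the linear-quadratic model."""
    total_dose = dose_per_fraction * n_fractions
    return total_dose * (dose_per_fraction + alpha_beta) / (2.0 + alpha_beta)


# Assumed schedule for the boost subvolume: (dose per fraction in Gy, fractions).
# The 15 x 2.0 Gy phase is inferred from "80 Gy in 35 fractions"; it is an assumption.
boost_phases = [(2.0, 15), (2.5, 20)]
ALPHA_BETA = 2.5  # Gy, the ratio quoted in the text

total_physical = sum(d * n for d, n in boost_phases)
total_eqd2 = sum(eqd2(d, n, ALPHA_BETA) for d, n in boost_phases)

print(f"Physical dose: {total_physical:.1f} Gy")
print(f"EQD2 (alpha/beta = {ALPHA_BETA} Gy): {total_eqd2:.1f} Gy")  # rounds to 86 Gy
```

Under these assumptions the two phases contribute 30 Gy and about 55.6 Gy in 2 Gy equivalents, which agrees with the 86 Gy EQD2 stated above.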
Cisplatin (40 mg/m²), or carboplatin (AUC = 2) for cisplatin-ineligible patients, was administered intravenously weekly during the 7 weeks of RT.
Follow-up
Toxicity assessment was performed 1 month after RT completion. Assessment of response was performed by clinical examination and PET-CT at 3 and 12 months post-RT. Patients were then followed clinically every third month for 2 years, and every sixth month until 5 years. Acute adverse effects were assessed until 3 months post-RT and late adverse effects were then monitored every 3 months up to 24 months after RT by CTCAE v4.0 and the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire Core 30 (QLQ-C30, version 3.0) and Quality of Life Head and Neck module (QLQ-H&N35).
Statistical analysis
The trial was designed as a superiority study that aimed to show a 20% absolute increase (from 50% to 70%) in DFS at 3 years for RT boost compared with standard radiation. A sample size of 80 patients was calculated to achieve 83% power to demonstrate improvement in DFS at a one-sided 0.1 level of significance.
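To give a sense of scale for these design parameters, the following sketch estimates the number of DFS events that a comparison of this kind would need. It is not the trial's actual sample size calculation, which is not described here: it assumes exponential survival (so that the 3-year DFS rates imply a constant hazard ratio), 1:1 allocation, and Schoenfeld's approximation for the log-rank test. All function names are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def implied_hazard_ratio(surv_control, surv_experimental):
    """Hazard ratio implied by two survival fractions at the same time point,
    assuming exponential (constant-hazard) survival in both arms."""
    return log(surv_experimental) / log(surv_control)


def schoenfeld_events(hr, one_sided_alpha, power):
    """Approximate total events needed for a 1:1 log-rank comparison
    (Schoenfeld's formula)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - one_sided_alpha)
    z_beta = std_normal.inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / log(hr) ** 2


# Design values from the text: 3-year DFS 50% vs. 70%, one-sided alpha 0.1, power 83%.
hr = implied_hazard_ratio(0.50, 0.70)
events_needed = schoenfeld_events(hr, one_sided_alpha=0.10, power=0.83)

print(f"Implied hazard ratio: {hr:.3f}")
print(f"Approximate events required: {ceil(events_needed)}")
```

The number of patients needed to observe that many events depends on accrual and follow-up duration, which this sketch does not model.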
DFS was defined as the time from date of diagnosis to the first of local, regional, or distant progression or death. Patients alive and disease free at last follow-up were censored at last follow-up. OS was defined as the time from date of diagnosis to death from any cause. LRF time was defined as the time from date of diagnosis to local or regional progression. Patients without LRF at last follow-up were censored at date of last follow-up. The intention-to-treat population was used for efficacy analyses and contains all randomized patients meeting eligibility criteria. Cox regression models were used to compare DFS, OS, and LRF between randomization arms. Because of imbalance in p16+ oropharynx cancer between treatment arms, Cox models included binary covariates for the two stratification variables of primary tumor volume (GTVprimary) < or ≥ 56 cc and RT boost volume < or ≥10 cc as well as p16+ oropharynx cancer versus other LAHNSCC. To assess whether the impact of boost on LRF varied with p16 status or with primary tumor volume, interaction terms for boost*p16 and boost*GTVprimary were assessed in separate models. Sensitivity analyses were performed to assess the robustness of the reported results to the choice of other covariates in the Cox model. The cumulative incidence of LRF was summarized by treatment arm over time accounting for the competing risks of distant progression or death without prior LRF. As specified by the protocol, comparisons of DFS and LRF outcomes between the randomized treatment arms were conducted at a one-sided 0.10 significance level, a level which would suggest that a phase III trial is warranted. Median follow-up time was calculated with the reverse Kaplan–Meier method. Proportional hazards assumptions were assessed visually and with Schoenfeld residual tests. Logistic regression models were used to test for differences between treatment arms in rates of toxicities while adjusting for p16 status and primary tumor volume. 
In exploratory analyses, we also assessed whether the continuous boosted volume (0 for patients randomized to standard arm) was associated with individual toxicity outcomes. Change from baseline in patient reported outcomes (PRO) during the first year following treatment was compared between arms using linear mixed-effects models adjusted for p16 status and primary tumor volume. Random patient level effects were included to account for within patient correlation. SAS v9.4 was used for all analyses.
Data availability statement
All analyzed data are presented in the article; the clinical trial protocol and raw data are available upon request by contacting first author M.L. Mierzwa.
Results
Between March 2014 and December 2019, 100 patients were recruited into the study (Fig. 1, CONSORT diagram). Seven patients were excluded before start of treatment: 4 patients were found to have distant metastasis (DM) during screening evaluation, and 3 patients were unable to complete required MRI scans. There were 93 patients evaluable for final analysis: 41 patients on the standard RT arm, 40 patients on the RT boost arm, and 12 patients treated in a nonrandomized cohort because they had a calculated boost volume < 1 cc. The median number of days between date of diagnosis and date of first RT fraction was 32 (range, 14–71).
Among 81 randomized patients, 63% had AJCC 8 stage 3 p16+ oropharyngeal cancer, and 37% had LAHNSCC. Forty-four patients (54%) had an Eastern Cooperative Oncology Group (ECOG) performance status of 0. Median gross total tumor volume (GTVtotal) was 76 cc (range, 15–250). Median RT boost volume was 6.2 cc and mean was 5.8 cc (range, 1–107). At the time of analysis, median follow-up was 29 months (range, 12–84 months). All patients received all prescribed radiotherapy and 77% of patients received all prescribed chemotherapy (76% on the standard RT arm and 78% on the RT boost arm). Mean number of weeks to complete RT was 7.0 (range, 6.8–7.2) and no patient had an RT break of more than 2 days.
Baseline characteristics of randomized patients by treatment group are shown in Table 1. LAHNSCC treatment site according to treatment arm can be found in Supplementary Table S1. Treatment arms were numerically imbalanced, with more patients with p16+ oropharyngeal cancer in the RT boost arm than in the standard arm (73% vs. 55%, P = 0.08). Total tumor volume (GTVtotal) was higher in the RT boost arm (P = 0.007).
Characteristic | Standard arm (Arm A), n (%) | RT boost arm (Arm B), n (%) | P |
---|---|---|---|
Median age | 64 | 63 | 0.12 |
Sex | |||
Male | 38 (92.68) | 34 (85.00) | 0.27 |
Female | 3 (7.32) | 6 (15.00) | |
ECOG PS | |||
0 | 26 (63.41) | 18 (45.00) | 0.10 |
1–2 | 15 (36.59) | 22 (55.00) | |
Smoking status | |||
<10 pack-years | 13 (31.71) | 12 (30.00) | 0.87 |
≥10 pack-years | 28 (68.29) | 28 (70.00) | |
p16 status | |||
p16+ OPSCC | 22 (53.66) | 29 (72.50) | 0.08 |
LAHNSCC | 19 (46.34) | 11 (27.50) | |
Chemotherapy | |||
Cisplatin | 14 (34.15) | 21 (52.50) | 0.10 |
Carboplatin | 27 (65.85) | 19 (47.50) | |
RT boost volume | |||
<10 cc | 28 (68.29) | 20 (50.00) | 0.09 |
≥10 cc | 13 (31.71) | 20 (50.00) | |
Median (cc) | 4.3 | 10.0 | 0.03 |
Primary tumor volume | |||
<56 cc | 24 (58.53) | 20 (50.00) | 0.34 |
≥56 cc | 17 (41.47) | 20 (50.00) | |
Median (cc) | 38.9 | 56.4 | 0.09 |
Total tumor volume | |||
<56 cc | 20 (48.78) | 8 (20.00) | 0.007 |
≥56 cc | 21 (51.22) | 32 (80.00) | |
Median (cc) | 61.0 | 89.1 | 0.12 |
Treatment outcomes
In 81 randomized patients, there were 27 LRFs, 22 distant metastases, and 30 deaths observed during follow-up (Supplementary Table S2). The primary endpoint DFS was not significantly improved in the boost arm [HR (80% confidence interval, CI) = 0.849 (0.559–1.290), one-sided P = 0.31] as assessed in a multivariable model for DFS adjusting for p16+ oropharyngeal cancer status, RT boost volume, and primary tumor volume. The 2-year estimate of DFS in the standard arm was 48% (90% CI, 34–60) and 57% (90% CI, 43–69) in the RT boost arm (Fig. 2A). Of the three adjustment covariates, only p16+ oropharyngeal cancer status was a significant predictor of DFS (HR = 0.46, P = 0.01). There were also no significant differences in OS between treatment arms (HR = 1.19, two-sided P = 0.66). In the multivariable model for OS, only p16+ oropharyngeal cancer status was a significant predictor of OS (HR = 0.31, P = 0.004). The 2-year estimate (90% CI) of OS was 80% (67%–88%) in the standard arm and 77% (64%–86%) in the RT boost arm (Fig. 2B).
At 2 years, there were 15 LRFs (37%) in the standard RT arm and seven LRFs in the RT boost arm (18%). The cumulative incidence of LRF was significantly improved with the addition of RT boost (HR = 0.43, one-sided P = 0.047) after adjusting for the same prognostic variables. The rate of LRF in patients randomized to RT boost was less than half that in patients randomized to standard RT. Two-year estimates (90% CI) of the cumulative incidence of LRF were 40% (27%–53%) in the standard RT arm and 18% (10%–31%) in the RT boost arm (Fig. 2C). The multivariable model for LRF showed that, in addition to randomization arm, p16+ oropharyngeal cancer status (P = 0.02) and primary tumor volume (P = 0.02) were significant predictors while RT boost volume (P = 0.30) was nonsignificant (Table 2). Given the importance of p16 status as a prognostic biomarker in oropharynx cancer, we performed an exploratory analysis to assess whether the impact of boost on LRF differed between p16+ oropharyngeal cancer and other LAHNSCC. A nonsignificant interaction term (P = 0.50) showed no evidence of differential benefit of RT boost. The HR estimates (80% CIs) for boost were 0.28 (0.10–0.82) for p16+ oropharyngeal cancer and 0.58 (0.25–1.33) for LAHNSCC. Supplementary Fig. S1 shows estimates of the cumulative incidence of LRF for patients with p16+ oropharyngeal cancer and LAHNSCC with the median value of trial stratification variables including RT boost volume and primary tumor volume. There was also no significant interaction between primary tumor volume (GTVprimary) and treatment arm for outcome of LRF (P = 0.72). In sensitivity analyses, models for LRF adding covariates of chemotherapy regimen and smoking status (<10 pack-years vs. ≥10 pack-years), showed similar results for treatment, p16+ oropharyngeal cancer status, and primary tumor volume, whereas chemotherapy and smoking status were not significantly predictive of LRF (Supplementary Table S3). 
In models for LRF using the covariates total tumor volume, p16+ oropharyngeal cancer status, and RT boost volume, total tumor volume was not significantly correlated with LRF (P = 0.82) while trends for other covariates were similar to the primary model. A Cox model for distant metastases showed no significant differences between randomization arms after accounting for p16+ oropharyngeal cancer status, primary tumor volume, and RT boost volume (arm HR = 0.92, P = 0.85). The marginal cumulative incidence of DM by treatment arm is shown in Fig. 2D.
Covariate | HR | 90% CI, lower | 90% CI, upper | P |
---|---|---|---|---|
p16 Status (ref = “Non-p16+ OPSCC”) | 0.350 | 0.164 | 0.745 | 0.022 |
Boost volume (cc) | 0.989 | 0.970 | 1.007 | 0.302 |
GTV Primary (per 10 cc) | 1.096 | 1.027 | 1.171 | 0.021 |
Armᵃ (ref = standard) | 0.429 | 0.224 | 0.819 | 0.047 |
ᵃP value is one sided as specified in the protocol and CI is two sided 80% so that the upper bound is equal to the one-sided 90% value to match the one-sided 0.10 significance level. P value for interaction term between p16+ OPSCC/LAHNSCC and treatment arm: 0.5025.
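The link between the two-sided 80% CI and the one-sided P value for the treatment arm can be illustrated numerically. The sketch below assumes a Wald-type interval that is symmetric on the log hazard ratio scale, recovers the standard error from the reported bounds, and converts the resulting z statistic to a one-sided P value. This is a consistency check under that assumption, not a reproduction of the SAS analysis, and the function name is hypothetical.

```python
from math import log
from statistics import NormalDist

std_normal = NormalDist()


def one_sided_p_from_ci(hr, lower, upper, two_sided_level):
    """Recover the log-HR standard error from a Wald CI and return
    (standard error, z statistic, one-sided P value for HR < 1)."""
    z_crit = std_normal.inv_cdf(1 - (1 - two_sided_level) / 2)
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(hr) / se
    return se, z, std_normal.cdf(z)


# Treatment arm row of the LRF model: HR 0.429, two-sided 80% CI 0.224 to 0.819.
se_arm, z_arm, p_arm = one_sided_p_from_ci(0.429, 0.224, 0.819, 0.80)

print(f"SE of log HR: {se_arm:.3f}")
print(f"z statistic: {z_arm:.3f}")
print(f"One-sided P: {p_arm:.3f}")
```

Under this assumption the recovered one-sided P value is close to the reported 0.047, and because the upper bound of the 80% interval lies below 1, the result meets the one-sided 0.10 threshold.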
In the 12 patients who were not randomized due to <1 cc persistent boostable tumor subvolume, 7 had p16+ oropharyngeal cancer and 5 had LAHNSCC (characteristics in Supplementary Table S4). Both LRC and OS were 83% at 2 years in this group, numerically but not significantly greater than the randomized patients (P = 0.28 and 0.33, respectively).
Toxicity
The proportion of patients with any grade 3+ toxicity was similar in the RT boost and standard arms: 64% versus 53% (P = 0.33). In multivariable logistic regression models accounting for p16+ oropharyngeal cancer status, primary tumor volume, and treated RT boost volume, no significant difference was seen between arms for any of the acute or late toxicities collected (Table 3). Larger treated RT boost volume was significantly associated with acute feeding tube use (P = 0.05; Supplementary Table S5). No feeding tubes were required at 12 months in either treatment arm. Larger primary tumor volume was significantly associated with need for intravenous fluid support (P = 0.04) and patients with LAHNSCC were more likely than patients with p16+ oropharyngeal cancer to have chemotherapy interruption (P = 0.03), feeding tube at 12 months (P = 0.05), or aspiration pneumonia (P = 0.003). There were two early deaths in the RT boost arm related to lingual arterial bleeding, one at 3 months and one at 12 months after RT, both after surgical manipulation and/or biopsy. Both of these patients with fatal hemorrhage had large cT4 base of tongue cancers (GTVprimary 51 cc and 62 cc) with extrinsic deep tongue muscle involvement as well as carotid and lingual artery involvement. Both had suspicious imaging findings prior to hemorrhage that were reviewed at tumor board, and biopsies were recommended after treatment. Boost GTV was close to the median value for the group in both patients (5.8 cc and 6.6 cc). Both underwent biopsies that ultimately proved negative within 14 days of fatal hemorrhage at home. Other non–cancer-related deaths included the following: among patients with LAHNSCC, 3 patients died of second primary cancers and 1 patient died of pulmonary embolism related to a noncancer hip fracture; 1 patient with p16+ oropharyngeal cancer died of unknown cause more than 5 years after cancer treatment.
Toxicity | Standard RT | RT boost | OR | P |
---|---|---|---|---|
ACUTE TOXICITIES ≤3 MONTHS AFTER RT | ||||
Hospitalization | 21.9% | 25.6% | 0.650 | 0.52 |
Interrupted Chemotherapy | 24.4% | 23.1% | 0.779 | 0.71 |
Feeding Tube | 24.4% | 38.5% | 1.875 | 0.28 |
Pain Requiring Long Acting Narcotics | 43.9% | 52.6% | 1.022 | 0.97 |
RT Delay | 2.5% | 0% | 0.150 | 0.22 |
Dehydration Requiring IVF Support | 53.7% | 46.2% | 0.511 | 0.19 |
Grade 2+ Mucositis | 80% | 73% | 0.582 | 0.36 |
Grade 3+ Mucositis | 9.8% | 10.0% | 0.361 | 0.29 |
Grade 3+ Dermatitis | 2.4% | 7.5% | 1.401 | 0.75 |
LATE TOXICITIES >3 MONTHS AFTER RT | ||||
Aspiration Pneumoniaᵃ | 17.5% | 12.5% | 1.183 | 0.83 |
Pain Requiring Narcotics | 29.3% | 30.8% | 0.862 | 0.78 |
Grade 3+ Xerostomia | 0.0% | 0.0% | 1.000 | 1.00 |
Feeding Tube | 11.8% | 6.9% | 1.212 | 0.86 |
Grade 3+ Dysgeusia Lasting 12 Months | 0.0% | 0.0% | 1.000 | 1.00 |
Osteoradionecrosis Managed Medically | 0.0% | 5.0% | 0.590 | 0.45 |
Osteoradionecrosis Requiring Debridement/Surgery | 9.8% | 12.5% | ||
Lymphedema Requiring Referral To OT | 41.0% | 42.5% | 1.157 | 0.77 |
Fatal Oral Hemorrhage | 0.0% | 5.0% | 4.498 | 0.29 |
ᵃAspiration pneumonia is defined by any radiographic finding consistent with aspiration or clinical event.
PROs
EORTC QLQ30 and HN35 were collected at clinic visits pretreatment and at 1, 3, 6, 12, 18, and 24 months post-RT. The survey completion rate per timepoint was ≥50% up to 12 months post-RT and <20% at 24 months, with similar completion rates between treatment arms (20, 21); statistical analysis was therefore performed up to the 12-month timepoint. Linear mixed-effects models adjusted for p16 status and primary tumor volume were used to evaluate longitudinal changes from baseline between the treatment arms. These models showed that no QOL subscale in QLQ30 or HN35 differed significantly between treatment arms longitudinally (Supplementary Table S6; Fig. 3).
Discussion
This is the first prospective study of an adaptive physiologic MRI-directed RT boost for patients with poor prognosis head and neck cancer to demonstrate a significant decrease in LRF. However, the primary endpoint of DFS was not improved. The improvement in LRF was obtained without an increase in toxicity or detriment in PROs. Indeed, grade 3+ toxicity overall was comparable with the literature for clinical trials including lower stage tumors (22, 23) and other contemporary trials incorporating RT boosts (7, 24). These findings suggest that biologically aggressive tumor subvolumes, persistently present during chemoradiation, may be targeted in real time to improve the therapeutic ratio of RT and advance the concept of image guided personalized RT dose in head and neck cancer.
Although intensified treatment based on MRI improved LRF, it did not have a significant impact on DFS or OS. The small size of our study could mean that a clinically significant improvement in DFS and OS was missed. In addition, patients on our study were at high risk for developing distant metastases (29%). More patients in the standard arm developed first failures at multiple locations (N = 6 in the standard arm and N = 3 in the RT boost arm), both locoregionally and distantly, but these still counted as a single event in the DFS analysis. Similarly, trials incorporating systemic therapy, such as induction chemotherapy, concurrent immunotherapy, targeted agents, or accelerated fractionation RT with concurrent chemotherapy, have failed to improve outcomes in head and neck cancer to date (2, 4, 5, 25, 26). Our results suggest that further improvements in systemic therapy are required to fully optimize the benefit of improved LRC produced by intensified RT.
Our approach has been driven by the acknowledgment that pre-RT volumes are frequently large such that attempting to dose escalate the entire tumor would produce excessive toxicity (15, 16, 27). The adaptive midtreatment boost is advantageous to balance treating the tumor subvolumes with persistently aggressive features that may require higher tumoricidal RT doses without adding unnecessary radiation-related toxicity to subvolumes that have responded during RT. The RT boost volumes here were smaller than other potential boost volumes (24, 28), with mean 6 cc and comprising approximately 10% of the total tumor volume. Clinically significant toxicities were seen in boost patients: the treated RT boost volume correlated with need for feeding tube placement up to 3 months post-RT, and 2 patients in the RT boost arm suffered fatal hemorrhages during surveillance. The fatal hemorrhages underscore the need for judicious tissue sampling after high-dose RT as aggressive biopsy may perpetuate soft-tissue necrosis and poor healing. While these events occurred in the RT boost group, we note that the large volume of tumor necrosis generated after RT to these large primary tumors and/or microscopic recurrence undetected by biopsy may have contributed to these fatal bleeding events. One consideration in these large cT4 BOT cancers post-RT, prior to any biopsy or surgical procedure, may be careful review of imaging and consideration of embolization prior to the procedure if radiographic abnormalities extend to the carotid. These events also underscore the need for further research into noninvasive means to detect tumor recurrence, including ctDNA: these patients were treated prior to clinical commercial availability of HPV ctDNA. In retrospective correlative analysis, one of these patients was negative for HPV ctDNA at the time of biopsy while the other was positive. 
As patients with large tumors may have suspicious imaging findings in follow-up scored through radiology as “possibly post-RT change,” knowing that ctDNA is negative in a patient with pretreatment positivity may help avoid biopsy in that patient. Although these events may occur with standard-dose RT in large tumors with carotid and lingual artery involvement, the clinically significant toxicities and deaths also highlight that significant increases in RT boost volume to high RT doses should be investigated incrementally and carefully.
On the other hand, the tumors treated on our trial had large primary tumors with relatively smaller nodal burden, and GTVprimary was significantly correlated with LRF while total tumor volume was not. The significant improvement seen in local control may avoid morbid salvage surgeries of primary tumors, at the expense of nonpermanent feeding tube as no feeding tubes were necessary at 12 months. Future considerations to balance tumor control and toxicity could also include modulation of dose within the RT boost volume, although this further increases treatment complexity.
Of note, grade 3+ oral mucositis rate was low on this trial at 10% while grade 2 mucositis was high at 76% overall. The low reported rate of grade 3+ mucositis may reflect subjectivity of the grading scale and well-reported underestimation of physician graded toxicity compared with patient reported outcomes (PRO; refs. 29, 30).
Although our trial used an adaptive physiologic MRI-guided RT boost, the optimal imaging to determine boost volume requires further consideration. Biologically aggressive tumor subvolumes including areas of hypoxia (31–33), poor perfusion (10, 34), high cellularity (11), and high metabolic activity may be captured through physiologic imaging. Each of these imaging modalities has distinct features with overlap within tumors (15, 34–36), and the best definition of boost volume is still under investigation. Pretreatment hypoperfused areas on MRI and highly 2[18F]fluoro-2-deoxy-D-glucose (FDG)-avid areas within the tumor have been consistently associated with poor prognosis (27, 37–39), with more variable ADC results (40–42). Our previous results and those of others suggest that midtreatment persistently hypoperfused (10) or FDG-avid areas (27, 43–45) are more predictive of poor outcomes compared with pretreatment tumor subvolumes. An FDG-PET–directed midtreatment boost is feasible with acceptable toxicity, but there is no proven benefit of this technique to date (24). One-third of LRFs in p16+ oropharyngeal cancer treated on our trial had a component outside the boost volume, often in FDG-avid areas, suggesting that strategies for optimizing RT boost could include multimodality imaging metrics, for instance FDG-PET. Furthermore, p16+ and p16− tumors demonstrate differing imaging characteristics (18, 46, 47) and thus should be considered separately in future trials.
The modest sample size, which was based on a potentially optimistic improvement of 20% in DFS, is a limitation of our trial. This likely contributed to imbalances between treatment arms with regard to p16+ oropharyngeal cancer status, total tumor volume, performance status, and chemotherapy regimen.
In summary, the current randomized phase II clinical trial, based on physiologic imaging biomarkers adapted in real time, showed improved LRC without increased toxicity with the addition of an RT boost. However, the primary endpoint of DFS was not improved. The LRC signal seen here moves us toward optimizing therapy in patients with poor prognosis head and neck cancer with the potential to personalize RT. Further phase IIR or IIR/III randomized studies are needed to validate this concept in larger patient numbers.
Authors' Disclosures
M.L. Mierzwa reports grants from University of Michigan during the conduct of the study and non-financial support from Bristol Myers Squibb outside the submitted work. M. Aryal reports grants from NIH during the conduct of the study. M. Schipper reports personal fees from Innovative Analytics outside the submitted work. P.L. Swiecicki reports a patent for Materials and Methods for Measuring HPV ctDNA (No:63/208,736) pending to Paul L. Swiecicki, Muneesh Tewari, J. Chad Brenner, and Chandan Bhambhani. K.M. Malloy reports other support from Up To Date outside the submitted work. A. Eisbruch reports grants from NIH during the conduct of the study. Y. Cao reports grants from NIH during the conduct of the study. No disclosures were reported by the other authors.
Authors' Contributions
M.L. Mierzwa: Conceptualization, data curation, formal analysis, supervision, funding acquisition, investigation, methodology, writing–original draft, writing–review and editing. M. Aryal: Resources, data curation, software, investigation, writing–review and editing, processed imaging in real time. C. Lee: Conceptualization, resources, data curation, software, investigation, writing–review and editing. M. Schipper: Conceptualization, resources, formal analysis, investigation, methodology, writing–original draft, writing–review and editing. M. VanTil: Formal analysis, methodology, writing–review and editing. K. Morales: Data curation, writing–review and editing. P.L. Swiecicki: Data curation, supervision, investigation, writing–review and editing. K.A. Casper: Conceptualization, resources, data curation, formal analysis, investigation, methodology, writing–original draft, writing–review and editing. K.M. Malloy: Conceptualization, data curation, investigation, writing–review and editing. M.E. Spector: Conceptualization, data curation, investigation, writing–review and editing. A.G. Shuman: Conceptualization, data curation, investigation, writing–review and editing. S.B. Chinn: Conceptualization, data curation, investigation, writing–review and editing. M.E.P. Prince: Conceptualization, resources, investigation, methodology, writing–review and editing. C.L. Stucken: Conceptualization, data curation, investigation, writing–review and editing. A.J. Rosko: Conceptualization, data curation, investigation, writing–review and editing. T.S. Lawrence: Conceptualization, resources, formal analysis, investigation, methodology, writing–original draft. J.C. Brenner: Conceptualization, resources, data curation, formal analysis, investigation, methodology, writing–review and editing. B. Rosen: Conceptualization, resources, software, formal analysis, methodology, writing–review and editing. C.A. Schonewolf: Data curation, investigation, writing–review and editing. J. Shah: Data curation, investigation, methodology, writing–review and editing. A. Eisbruch: Conceptualization, resources, funding acquisition, investigation, methodology, writing–review and editing. F.P. Worden: Conceptualization, resources, supervision, investigation, methodology, writing–review and editing. Y. Cao: Conceptualization, resources, data curation, formal analysis, funding acquisition, investigation, methodology, writing–original draft, writing–review and editing.
Acknowledgments
This work was funded in part by NIH grants R01-CA-184153 and U01-CA-183848.
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).