Abstract
PREDIX HER2 is a randomized Phase II trial that compared neoadjuvant docetaxel, trastuzumab, and pertuzumab (THP) with trastuzumab emtansine (T-DM1) for HER2-positive breast cancer. Rates of pathologic complete response (pCR) did not differ between the two groups. Here, we present the survival outcomes from PREDIX HER2 and investigate metabolic response and tumor-infiltrating lymphocytes (TIL) as prognostic factors.
In total, 202 patients with HER2-positive breast cancer were enrolled and 197 patients received at least one of six planned cycles of either THP or T-DM1. Secondary endpoints included event-free survival (EFS), recurrence-free survival (RFS), and overall survival (OS). Assessment with PET/CT was performed at baseline and after two and six treatment cycles. TILs were assessed manually in baseline biopsies, while image-based evaluation of TILs [digital TILs (DTIL)] was performed in digitized full-face sections.
After a median follow-up of 5.21 years, there was no difference between the two treatment groups in terms of EFS [HR = 1.26; 95% confidence interval (CI), 0.54–2.91], RFS (HR = 0.69; 95% CI, 0.24–1.93), or OS (HR = 0.52; 95% CI, 0.09–2.82). Higher SUVmax at cycle 2 (C2) predicted lower pCR (ORadj = 0.65; 95% CI, 0.48–0.87; P = 0.005) and worse EFS (HRadj = 1.27; 95% CI, 1.12–1.41; P < 0.001). Baseline TILs and DTILs provided additional prognostic information to clinical parameters and C2 SUVmax.
Long-term outcomes following neoadjuvant T-DM1 were similar to neoadjuvant THP. SUVmax after two cycles of neoadjuvant therapy for HER2-positive breast cancer may be an independent predictor of both short- and long-term outcomes. Combined assessment with TILs may facilitate early selection of poor responders for alternative treatment strategies.
With this report of the randomized PREDIX HER2 trial, we confirm that the contemporary prognosis of HER2-positive breast cancer is excellent, that achieving pathologic complete response (pCR) remains strongly prognostic regardless of previously administered treatment, and that the long-term efficacy of treatment with an antibody–drug conjugate is consistent with the short-term results. In addition, we show that metabolic response to treatment correlates with long-term survival, and that its combination with baseline tumor-infiltrating lymphocyte enumeration, either manually or based on machine learning, adds further prognostic value. These results support further investigation of de-escalation strategies and suggest that novel antibody–drug conjugates could challenge the place of chemotherapy and dual HER2 blockade combination as the standard of care for HER2-positive breast cancer. Moreover, capturing the host response at baseline and the metabolic response to treatment is promising and can be potentially implemented as a fully automated prognostic tool for early therapy adaptation, pending further validation.
Introduction
Neoadjuvant therapy for early and locally advanced nonmetastatic HER2-positive breast cancer has become the recommended approach (1). Three contributing factors to this paradigm change are its exquisite sensitivity to dual HER2 blockade combined with chemotherapy (2–5), the strong correlation between pathologic complete response (pCR) and long-term survival (6), and the availability of effective post-neoadjuvant salvage therapy in case of residual invasive cancer (7). The excellent survival rates with chemotherapy and trastuzumab alone (8) and the reported 15%–30% pCR rates with chemotherapy-free regimens (2, 9–11) indicate that a sizeable proportion of patients may be currently overtreated. In addition, a considerable proportion of patients do not attain pCR despite being treated with standard chemotherapy and dual HER2 blockade, and these patients have worse outcomes, which highlights the need for their early identification and alternative treatment strategies.
To remedy these issues, substantial efforts have been undertaken to explore and validate novel prognostic and predictive biomarkers. Histology-based markers like abundance of tumor-infiltrating lymphocytes (TIL; refs. 12, 13), or genomic markers such as mutational status (14) and gene expression profiling (15, 16), may select a priori good responders to neoadjuvant therapy. A potential disadvantage with this approach is that all information is obtained at baseline, thus ignoring how individuals respond to treatment and missing the opportunity for late de-escalation after a few initial treatment cycles. To this end, studies exploring longitudinal assessments with PET have shown promising results regarding therapy de-escalation and adaptation (11, 17). For example, PHERGAIN showed prospectively that PET-guided treatment adaptation may identify patients benefiting from neoadjuvant chemotherapy-free dual HER2 blockade (17), while in TBCRC026, an early drop in metabolic activity was prognostic for improved patient outcomes in the same disease setting (11). Presumably, combining a baseline and an on-treatment biomarker could refine response prediction and decrease the number of patients who are currently overtreated or undertreated.
We have previously reported the primary efficacy analysis of PREDIX HER2 (neoadjuvant response-guided treatment of HER2-positive breast cancer), a randomized Phase II study that compared neoadjuvant THP (docetaxel, trastuzumab, and pertuzumab) with trastuzumab emtansine (T-DM1). No difference between the two treatment groups was observed in terms of pCR rates (45.5% and 43.9% for the two groups, respectively; P = 0.82) or in event-free survival (EFS; log-rank P = 0.35), while baseline TILs and metabolic response were prognostic for pCR (18). Herein, we report the survival outcomes of PREDIX HER2 after longer follow-up and the protocol-predefined analysis on the prognostic value of PET/CT during therapy and baseline TILs assessment for both short- and long-term outcomes.
Patients and Methods
Study design and participants
PREDIX HER2 is an academic, prospective, randomized, open-label, multicenter Phase II trial that was conducted at nine centers in Sweden by the Swedish Breast Cancer Group, between December 1, 2014 and October 31, 2018. The study was approved by the Regional Ethical Committee in Stockholm (dnr 2014/1465-31/10) and the Swedish Medical Products Agency. All patients provided written informed consent to participate in the clinical trial and correlative analyses before inclusion. The study was conducted according to the Declaration of Helsinki and the principles of good clinical practice and was registered with EudraCT number 2014-000808-10 and at the ClinicalTrials.gov website, identifier NCT02568839. The clinical trial is reported according to the Consolidated Standards of Reporting Trials (CONSORT) guidelines (19) and the correlative analyses according to the Reporting Recommendations for Tumor Marker Prognostic Studies guidelines (20). The study's principal investigator (T. Hatschek), data manager (M. Hellström), and study statistician (H. Johansson) had full access to data. Editorial or medical writing assistance was not used.
Details regarding the trial design and study population have been presented previously (18). In brief, women and men over 18 years old were eligible for the study if they had early (at least 2 cm and/or node-positive) HER2-positive breast cancer, though patients with at most two distant metastases that could be treated with curative intent could be enrolled. HER2 positivity was defined as IHC score 3+, or 2+ plus ERBB2 amplification (4.0 or more ERBB2 copies, or ERBB2:CEP17 ratio over 2.0), in accordance with Swedish guidelines at the time the study was initiated. Adequate cardiac, renal, and hepatic function, and no history of other malignancies during the past 5 years were required for inclusion in the study.
Procedures
Patients were randomly assigned (1:1) using random permuted blocks (block size of 2 or 4) into the standard or the experimental treatment groups described hereunder. Randomization was conducted at the Central Trial Office at Karolinska University Hospital by a web-based procedure (TENALEA, TransEuropean Network for Clinical Trial Services, Amsterdam, the Netherlands). Random assignment was stratified by participating site.
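The allocation scheme described above (1:1 ratio, permuted blocks of size 2 or 4, stratified by site) can be illustrated with a minimal sketch. The algorithm actually used by the web-based randomization service is not described here, so the function name, the site labels, and the use of a seeded pseudorandom generator are assumptions made purely for illustration.

```python
import random

def permuted_block_schedule(n_patients, block_sizes=(2, 4),
                            arms=("standard", "experimental"), seed=None):
    """Return a 1:1 allocation list built from randomly sized permuted blocks."""
    rng = random.Random(seed)
    schedule = []
    while len(schedule) < n_patients:
        size = rng.choice(block_sizes)             # each block holds 2 or 4 patients
        block = list(arms) * (size // len(arms))   # equal number of slots per arm
        rng.shuffle(block)                         # random order within the block
        schedule.extend(block)
    return schedule[:n_patients]

# Stratification by site: one independent schedule per site (hypothetical labels).
sites = ["site_A", "site_B"]
schedules = {site: permuted_block_schedule(8, seed=i) for i, site in enumerate(sites)}
```

Because every completed block is balanced, the difference in patient numbers between the two groups within a site can never exceed half of the largest block size.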
Patients allocated to the standard group received six cycles of THP [docetaxel (first course given at 75 mg/m², followed by 100 mg/m² per cycle), trastuzumab (600 mg subcutaneously), and pertuzumab (first course given at 840 mg, followed by 420 mg per cycle)] every 3 weeks. Patients allocated to the experimental group received six cycles of T-DM1 (3.6 mg/kg i.v.) every 3 weeks. Response to treatment was evaluated using breast imaging (mammography, ultrasound, or MRI) after two, four, and six cycles. Patients with progressive or stable disease after two cycles or intolerable side effects due to the assigned treatment could cross over to the other treatment group. Breast surgery was performed 3–4 weeks after the final cycle. Patients received two (for the standard group) or four (for the experimental group) adjuvant cycles of epirubicin and cyclophosphamide, followed by 11 courses of subcutaneous trastuzumab, radiotherapy, and endocrine therapy in accordance with national guidelines and local practice.
Outcomes
The primary endpoint was locally assessed pathologic objective response to primary medical treatment with pCR defined as absence of invasive carcinoma in the breast and axillary lymph nodes (ypT0/Tis, ypN0). Secondary time-to-event endpoints included EFS, defined as time from randomization to disease progression, disease recurrence (local, regional, or distant), contralateral breast cancer, or death from any cause, whichever occurs first; recurrence-free survival (RFS), defined as time from surgery to disease recurrence (local, regional, or distant), or death from any cause, whichever occurs first; and overall survival (OS), defined as time from randomization to death from any cause.
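The three time-to-event definitions above differ only in their start date and in which events count. A minimal sketch of that logic follows; the function name, the argument layout, and the conversion of days to years by dividing by 365.25 are assumptions for illustration, and event-free patients are censored at the last clinical visit as stated in the statistical analysis section.

```python
from datetime import date

def time_to_event_years(start, event_dates, last_visit):
    """Return (time in years, event observed) for one patient.

    start       -- randomization date (EFS, OS) or surgery date (RFS)
    event_dates -- dates of the qualifying events; None where the event did not occur
    last_visit  -- date of last clinical visit, used for censoring
    """
    observed = [d for d in event_dates if d is not None]
    if observed:
        end, event = min(observed), True    # whichever qualifying event occurs first
    else:
        end, event = last_visit, False      # censored at last clinical visit
    return (end - start).days / 365.25, event

# Hypothetical patient: recurrence one year after randomization, no other event.
efs_time, efs_event = time_to_event_years(
    date(2015, 1, 1), [None, date(2016, 1, 1), None, None], date(2020, 1, 1))
```

For EFS the list would hold the dates of progression, recurrence, contralateral breast cancer, and death; for RFS, recurrence and death; for OS, death only.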
PET imaging
Fluorine 18–labeled fluorodeoxyglucose (18F-FDG) PET combined with CT was performed on patients enrolled at Karolinska University Hospital. Following intravenous FDG injection and a 60-minute uptake phase, a combined PET/CT scan was obtained from the thorax and regional lymph nodes to limit radiation exposure. Images were obtained at baseline within 2 weeks prior to treatment start and at 16 ± 2 days after cycles 2 (C2) and 6, before obtaining research biopsies. All scans were reviewed by an expert in nuclear medicine (P. Grybäck), who was not blinded to the patients’ electronic health charts. Within the scope of this analysis, the evaluation of the combined PET/CT images was made by calculating maximal Standardized Uptake Values (SUVmax) of the breast tumor at baseline and after C2.
Assessment of TILs
Core biopsies were obtained at baseline and after C2, while tissue was also available from the surgical specimen. Full-face sections were stained with hematoxylin and eosin (H&E). Stromal TILs were assessed by a certified breast pathologist (J. Hartman), who was blinded to other clinicopathologic and genomic characteristics, as the percentage (%) of tumor stroma covered by infiltrating lymphocytes, according to the recommendations of the International TILs Working Group (21).
An image-based, automated evaluation of TILs was performed in digitized H&E-stained full-face sections using the QuPath open-source software, as described previously (22–24). Briefly, a classifier algorithm compatible with the QuPath software has been created which defines tumor cells, lymphocytes, stromal cells, and other cells on the stained sections. The variable easTILs% = TILs cell area/stroma area*100 was calculated as a surrogate of the respective definition from the TILs Working Group for the visual assessment, henceforth termed digital TILs (DTIL).
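As a worked illustration of the easTILs% definition above, the ratio can be computed directly from the two area measurements. This is only a sketch: the function name and the example areas are hypothetical, and the cell classification that produces the areas is not reproduced.

```python
def eas_tils_percent(tils_cell_area, stroma_area):
    """easTILs% = TILs cell area / stroma area * 100 (both areas in the same unit)."""
    if stroma_area <= 0:
        raise ValueError("stroma area must be positive")
    return tils_cell_area / stroma_area * 100.0

# Hypothetical measurements in square micrometres.
dtil = eas_tils_percent(tils_cell_area=43_500.0, stroma_area=500_000.0)
```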
CelTIL is a prognostic model comprising cellularity and TILs after short exposure to HER2-directed therapy (25). Using the algorithm described above with the QuPath software, we calculated tumor cellularity after two cycles of treatment and thereafter CelTIL as a continuous variable.
Statistical analysis
Within the scope of this exploratory analysis, we aimed to test the hypothesis that the combination of two different protocol-predefined biomarkers, tumor metabolic activity following short-term exposure to neoadjuvant therapy, and TILs enumeration would provide superior prognostic information concerning both short-term (pCR) and long-term patient outcomes (EFS, RFS). In a post hoc power calculation, if we assume an α of 0.05 and power of 0.80 to detect a HR = 1.25 for the continuous SUVmax C2 variable, and an expected event rate of 15% at 5 years, a population of 125 patients with 19 events would be needed. As such, our study had power of 0.74 given the n = 109 patients and 16 observed events. In addition to this biomarker analysis, we present the predefined time-to-event endpoint analyses of PREDIX HER2.
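The method behind the post hoc power calculation is not specified above. The sketch below uses a common normal-approximation formula for a continuous covariate in a proportional hazards model, D = (z₁₋α/₂ + z_power)² / (σ² · ln(HR)²), which is an assumption and not necessarily the authors' method. The standard deviation σ of SUVmax at C2 is not reported in this section, so the value used in the example is purely hypothetical.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

_N = NormalDist()  # standard normal distribution

def required_events(hr, sd, alpha=0.05, power=0.80):
    """Events needed to detect a per-unit HR for a covariate with standard deviation sd."""
    z = _N.inv_cdf(1 - alpha / 2) + _N.inv_cdf(power)
    return z ** 2 / (sd ** 2 * log(hr) ** 2)

def achieved_power(hr, sd, events, alpha=0.05):
    """Approximate power for a given number of observed events."""
    return _N.cdf(sqrt(events) * sd * abs(log(hr)) - _N.inv_cdf(1 - alpha / 2))

def expected_events(n_patients, event_rate):
    """Whole number of events expected for a given event rate."""
    return ceil(n_patients * event_rate)

# 125 patients at a 15% event rate give 18.75, rounded up to 19 events.
events_needed = expected_events(125, 0.15)
# Hypothetical sd = 2.0, chosen only to show how the two functions relate.
power_check = achieved_power(1.25, 2.0, required_events(1.25, 2.0))
```

The last line illustrates internal consistency only: feeding the required number of events back into the power function returns the target power of 0.80.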
Binary outcomes were tested using Fisher exact test and continuous outcomes using the Wilcoxon rank-sum test. Correlations between continuous variables were estimated using Spearman rank correlation coefficient supplemented with 95% bootstrap confidence intervals. TILs and DTILs were dichotomized using the median as cutoff and agreement between the two variables is illustrated using the Bland–Altman plot, in which differences and averages of the measurements are graphed. For the time-to-event outcomes (EFS, RFS, OS as described in the Outcomes section), time for event-free patients was calculated to the date of last clinical visit. Associations between pCR and clinical factors were modeled using logistic regression. Survival outcomes (OS, EFS, and RFS) are graphically displayed as Kaplan–Meier plots and differences in survival times are tested using the log-rank test. Time to failure is modeled using proportional hazards regression. Results from the regression models are presented as ORs when the outcome is pCR, and HRs when the outcome is time to failure, together with 95% confidence intervals (CI) and Wald P values. SUVmax was included in all regression models as a continuous variable. SUVmax was also dichotomized for generating a combined TIL-SUVmax variable using the SUVmax and TIL median values as cutoff. Changes in likelihood ratio (LR-Δχ²) measured the relative amount of prognostic information of TILs/DTILs in relation to clinical variables and SUVmax. All P values are two sided, and the level of significance is set to 5%. All analyses were performed using the Stata software version 17 (StataCorp).
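The likelihood ratio comparison of nested models can be sketched as follows. The statistic is twice the difference in log-likelihood between the model with and without the added marker, and its P value comes from a χ² distribution. One degree of freedom is assumed here because a single dichotomized variable is added; the function names and the log-likelihood values are hypothetical.

```python
from math import erfc, sqrt

def lr_delta_chi2(loglik_reduced, loglik_full):
    """Likelihood ratio statistic for nested models: 2 * (logL_full - logL_reduced)."""
    return 2.0 * (loglik_full - loglik_reduced)

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(x / 2.0))

# Hypothetical log-likelihoods chosen to give a statistic of 6.44.
stat = lr_delta_chi2(loglik_reduced=-50.0, loglik_full=-46.78)
p_value = chi2_sf_1df(stat)
```

Under this one-degree-of-freedom assumption, a statistic of 6.44 corresponds to a P value of about 0.011.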
Data availability
The clinical study report is available upon request, after approval by the study principal investigator and the ethics board. Deidentified individual participant data from this clinical trial, as well as a data dictionary, can be requested by contacting the corresponding authors. The trial steering committee and the sponsor will review the requests on a case-by-case basis. In case of approval, a specific agreement between the sponsor and the researcher will be required for a data transfer.
Results
Patient characteristics
In total, 202 patients were included in the trial, of whom 197 received at least one treatment cycle and form the intention-to-treat (ITT) population (Fig. 1). Baseline and post-C2 PET/CT imaging was available for 112 and 109 patients, respectively. The demographic and clinicopathologic characteristics of the patients with baseline PET/CT data in relation to the ITT population are presented in Table 1, while the representativeness of the study population is described in Supplementary Table S1. In addition, 173 patients had available baseline TILs. The distribution of clinicopathologic characteristics depending on TILs using median value (10%) as cutoff is presented in Supplementary Table S2. Tumors with high TILs were more often estrogen receptor (ER) negative (Fisher exact test P = 0.022) and highly proliferative (Wilcoxon P = 0.007).
Table 1. Demographic and clinicopathologic characteristics of the ITT population and of the patients with available baseline PET/CT.

| Characteristic | ITT (n = 197), N (%) | Available PET/CT (n = 112), N (%) |
|---|---|---|
| Age, years (median, IQR) | 52 (43–61) | 50 (39–58) |
| Menopausal status | | |
| Premenopausal | 93 (47.2) | 33 (38.8) |
| Postmenopausal | 97 (49.2) | 49 (43.8) |
| Unknown | 7 (3.6) | 3 (2.7) |
| Gradeᵃ | | |
| I–II | 81 (41.1) | 51 (45.5) |
| III | 93 (47.2) | 50 (44.6) |
| Unknown | 23 (11.7) | 11 (9.8) |
| Tumor size (mm) | | |
| ≤20 | 34 (17.3) | 21 (18.8) |
| 21–50 | 123 (62.4) | 70 (62.5) |
| >50 | 34 (17.3) | 19 (17.0) |
| Unknown | 6 (3.0) | 2 (1.8) |
| Nodal status | | |
| Negative | 86 (43.7) | 51 (45.5) |
| Positive | 111 (56.3) | 61 (54.5) |
| Hormone receptorsᵇ | | |
| ER and PR negative | 72 (36.5) | 39 (34.8) |
| ER or PR positive | 125 (63.5) | 73 (65.2) |
| Allocated treatment | | |
| Standard | 99 (50.3) | 58 (51.8) |
| Experimental | 98 (49.7) | 54 (48.2) |
| Ki67, % (median, IQR) | 40 (30–57) | 40 (30–55) |
Abbreviations: ER: estrogen receptor; IQR: interquartile range; ITT: intention-to-treat population; PR: progesterone receptor.
ᵃNottingham histologic grade.
ᵇCutoff for positivity of hormone receptors of 10%, in accordance with Swedish national guidelines.
Efficacy
At the time of the latest data cutoff (June 2022), the median follow-up was 5.21 years (interquartile range, 4.33–5.44 years). There were 11 first events in the standard treatment group and 12 in the experimental treatment group. In the former, the most common first event was distant metastasis (n = 7 patients), while in the latter, disease progression during treatment and locoregional relapse were most frequent (n = 4 patients each).
There was no difference in risk for an event between the two treatment groups (HR = 1.26; 95% CI, 0.54–2.91; P = 0.591). Five-year event-free rates were 89.6% (95% CI, 81.5–94.3) for the standard versus 88.6% (95% CI, 80.4–93.5) for the experimental group (Fig. 2A). Moreover, risk for recurrence was similar between THP and T-DM1 (HR = 0.69; 95% CI, 0.24–1.93; P = 0.476). Five-year RFS rates were 91.6% (95% CI, 83.9–95.7) and 94.7% (95% CI, 87.9–97.8) for the two groups, respectively (Fig. 2B). Finally, risk for death was also similar between THP and T-DM1 (HR = 0.52; 95% CI, 0.09–2.82; P = 0.445), as were 5-year OS rates [96.7% (95% CI, 90.1–98.9) vs. 97.7% (95% CI, 91.1–99.4), respectively; Fig. 2C].
RFS was also analyzed in the ITT population according to pCR status. Patients that attained pCR had a lower risk for recurrence following surgery (HR = 0.17; 95% CI, 0.04–0.77; P = 0.027) and superior 5-year RFS rates [98.9% (95% CI, 92.1–99.8) vs. 88.9% (95% CI, 80.7–93.4)]. The improvement in RFS rates in patients with pCR compared with those with residual invasive cancer at the time of surgery was noted both in patients that were allocated to THP (log-rank P = 0.053) and to T-DM1 (log-rank P = 0.027).
Baseline metabolic activity and its change after two treatment cycles
Of the 112 patients with baseline PET/CT imaging, 16 experienced an event during follow-up. There was no difference between the two treatment groups in terms of risk for event among patients with baseline PET/CT (HR = 1.10; 95% CI, 0.41–2.94; P = 0.844).
The median baseline SUVmax in the entire cohort was 8.50 (interquartile range, 5.86–13.35) and at C2 2.60 (interquartile range, 2.10–3.50). Median SUVmax at baseline (Wilcoxon P = 0.10) and at C2 (Wilcoxon P = 0.054) did not differ between the two treatment groups, whereas the relative drop in SUVmax from baseline to C2 was greater in patients treated with THP (72.6% drop vs. 58.5% drop, Wilcoxon P = 0.035). Moreover, patients who attained pCR had similar median SUVmax at baseline with those that did not (Wilcoxon P = 0.380). However, SUVmax at C2 was significantly lower for patients with pCR (Wilcoxon P < 0.001), who also experienced greater relative decrease in metabolic activity from baseline (80.2% vs. 58.4% for patients with residual cancer, Wilcoxon P < 0.001).
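The relative drop in SUVmax referred to above is, under the usual definition, the decrease from baseline expressed as a percentage of the baseline value. A minimal sketch with a hypothetical patient follows; the function name and the example values are illustrative only.

```python
def relative_drop_percent(suv_baseline, suv_c2):
    """Relative decrease in SUVmax from baseline to C2, in percent."""
    if suv_baseline <= 0:
        raise ValueError("baseline SUVmax must be positive")
    return (suv_baseline - suv_c2) / suv_baseline * 100.0

# Hypothetical patient: SUVmax 10.0 at baseline and 2.5 after two cycles.
drop = relative_drop_percent(10.0, 2.5)
```

A negative result would indicate an increase in metabolic activity between the two scans.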
Prognostic implications of metabolic response to treatment
The associations between SUVmax at baseline and at C2, and pCR status are shown in Table 2. In univariate analysis, baseline SUVmax did not predict pCR (OR = 1.04; 95% CI, 0.97–1.12; P = 0.228). In contrast, in both univariate (OR = 0.68; 95% CI, 0.52–0.90; P = 0.007) and multivariable analysis when adjusting for hormone receptor status and treatment arm, higher SUVmax at C2 predicted lower pCR rate (ORadj = 0.65; 95% CI, 0.48–0.87; P = 0.005). Sensitivity analysis with the addition of other known prognostic factors to the model (tumor size, nodal status) did not change these results (ORadj = 0.64; 95% CI, 0.47–0.88), even when further adjusting for baseline SUVmax (ORadj = 0.58; 95% CI, 0.42–0.81). There was no interaction between treatment arm and SUVmax at C2 (P = 0.167).
Table 2. Associations of SUVmax at baseline and at C2 with pCR (logistic regression).

| Factor | Univariate OR (95% CI) | Univariate Pᵇ | Multivariableᵃ OR (95% CI) | Multivariableᵃ Pᵇ |
|---|---|---|---|---|
| SUVmax C2 | 0.68 (0.52–0.90) | 0.007 | 0.65 (0.48–0.87) | 0.005 |
| ER or PR positive | 0.42 (0.19–0.95) | 0.037 | 0.33 (0.14–0.82) | 0.017 |
| Experimental arm | 1.21 (0.55–2.63) | 0.638 | 1.36 (0.57–3.22) | 0.487 |
| SUVmax baseline | 1.04 (0.97–1.12) | 0.228 | | |
Abbreviations: CI: confidence interval; ER: estrogen receptor; OR: odds ratio; PR: progesterone receptor; SUVmax: maximum standardized uptake value.
ᵃAll variables presented in the table are included in the multivariable model.
ᵇWald test.
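A Wald P value can be approximately recovered from a ratio estimate and its 95% CI, which gives a quick consistency check on reported regression results. The sketch below assumes the interval is symmetric on the log scale, as is standard for Wald intervals; the function name is hypothetical, and because the published figures are rounded the result is only approximate.

```python
from math import log
from statistics import NormalDist

def wald_p_from_ci(estimate, lower, upper, level=0.95):
    """Two-sided Wald P value implied by a ratio estimate (OR or HR) and its CI."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - (1 - level) / 2)           # 1.96 for a 95% CI
    se = (log(upper) - log(lower)) / (2 * z_crit)      # standard error on the log scale
    z = log(estimate) / se
    return 2 * (1 - nd.cdf(abs(z)))

# Adjusted OR for SUVmax at C2 as reported in the text: 0.65 (95% CI, 0.48-0.87).
p_adj = wald_p_from_ci(0.65, 0.48, 0.87)
```

For this estimate the implied P value is approximately 0.005, in line with the reported value.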
The associations between SUVmax at baseline and at C2, and EFS are shown in Table 3. In both univariate and multivariable analysis when adjusting for hormone receptor status, treatment arm, tumor size, and nodal status, higher SUVmax at C2 was prognostic for worse EFS (HRadj = 1.25; 95% CI, 1.12–1.40; P < 0.001), even when further adjusting for baseline SUVmax (HRadj = 1.27; 95% CI, 1.12–1.43; P < 0.001). In contrast, SUVmax at baseline was not prognostic for EFS (univariate HR = 1.03; 95% CI, 0.94–1.11; P = 0.552). Moreover, SUVmax at C2 provided prognostic information for EFS beyond pCR status in multivariable analysis (HRadj = 1.22; 95% CI, 1.09–1.37; P = 0.007). Similar results were noted for the RFS endpoint (Supplementary Table S3).
Table 3. Associations of SUVmax at baseline and at C2 and of clinical factors with EFS (proportional hazards regression).

| Factor | Univariate HR (95% CI) | Univariate Pᵇ | Multivariableᵃ HR (95% CI) | Multivariableᵃ Pᵇ |
|---|---|---|---|---|
| SUVmax C2 | 1.25 (1.12–1.40) | <0.001 | 1.27 (1.12–1.41) | <0.001 |
| ER or PR positive | 0.51 (0.19–1.36) | 0.179 | 0.55 (0.19–1.61) | 0.278 |
| Tumor size | 1.57 (0.67–3.68) | 0.303 | 1.34 (0.53–3.43) | 0.278 |
| Experimental arm | 1.05 (0.39–2.80) | 0.923 | 1.21 (0.41–3.53) | 0.730 |
| Nodal status | 1.87 (0.65–5.38) | 0.248 | 1.47 (0.48–4.50) | 0.503 |
| SUVmax baseline | 1.03 (0.94–1.11) | 0.523 | 0.97 (0.88–1.07) | 0.97 |
Abbreviations: CI: confidence interval; ER: estrogen receptor; HR: hazard ratio; PR: progesterone receptor; SUVmax: maximum standardized uptake value.
ᵃAll variables presented in the table are included in the multivariable model.
ᵇWald test.
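Because SUVmax enters the models as a continuous variable, each HR in Table 3 applies to a one-unit increase. Under the log-linear assumption of the proportional hazards model, the ratio for a larger difference is obtained by exponentiation, as in the sketch below; the function name and the two-unit example are illustrative assumptions.

```python
def rescale_ratio(ratio_per_unit, units):
    """Ratio (HR or OR) for a difference of `units`, assuming a log-linear effect."""
    return ratio_per_unit ** units

# An adjusted HR of 1.27 per unit corresponds to about 1.61 for a two-unit difference.
hr_two_units = rescale_ratio(1.27, 2)
```

The same transformation applies to the confidence limits.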
Prognostic implications of TILs
Patients with TILs over the median cutoff compared with those with lower TILs (≥10% vs. <10%) had higher rates of pCR (51.4% vs. 28.1%, Pearson χ² P = 0.003). Baseline TILs ≥10% was an independent predictor of pCR when adjusting for hormone receptor status, treatment, tumor size, and nodal status (ORadj = 2.73; 95% CI, 1.33–5.60; P = 0.006). In addition, baseline TILs ≥10% provided additional prognostic information to clinical parameters and C2 SUVmax for the pCR endpoint (LR-Δχ² = 6.44; P = 0.011; ORadj = 3.47; 95% CI, 1.28–9.43; P = 0.014).
Information on both TILs and DTILs at baseline was available from 169 patients. TILs and DTILs were significantly correlated (Spearman rho = 0.72, P < 0.0001; Lin concordance coefficient = 0.52). The corresponding Bland–Altman plot is shown in Supplementary Fig. S1. Baseline DTILs ≥8.7% (median value) provided additional prognostic information to clinical parameters and C2 SUVmax for the pCR endpoint (LR-Δχ² = 5.13; P = 0.023; ORadj = 2.87; 95% CI, 1.12–7.35; P = 0.028).
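A rank correlation and a concordance coefficient answer different questions, which is why the two values above can diverge: the former measures whether two methods order samples similarly, the latter whether they return the same values. The sketch below computes Lin's concordance correlation coefficient from its usual definition; the function name and the toy data are illustrative and do not come from the study.

```python
from statistics import fmean

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)

# Toy data: the second method reads exactly 2 units higher than the first.
manual = [1.0, 2.0, 3.0]
digital = [3.0, 4.0, 5.0]
ccc = lin_ccc(manual, digital)
```

In this toy example the ranking is identical, so a rank correlation would equal 1, yet the concordance coefficient is only 0.25 because of the constant offset between the two methods.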
Neither TILs (univariate HR = 1.05; 95% CI, 0.42–2.55; P = 0.936) nor DTILs (univariate HR = 1.29; 95% CI, 0.51–3.26; P = 0.596) were prognostic for EFS.
Combined assessment of metabolic activity and TILs
We then assessed the combination of metabolic response and TILs/DTILs as a prognostic marker (Fig. 3A–D). Using the median SUVmax value at C2 as cutoff, 8.3% of patients with TILs <10% and C2 SUVmax ≥2.60 achieved pCR, compared with 35%–58.3% for the other groups (P = 0.002; Supplementary Fig. S2). The Kaplan–Meier curves for EFS for these groups are presented in Supplementary Fig. S2 (log-rank P = 0.072). Similar results were noted when grouping patients according to SUVmax at C2 and DTILs: 15.3% of patients with SUVmax ≥2.60 and DTILs <8.7% achieved pCR, compared with 34.4%–57.9% of the other groups (P = 0.031; Supplementary Fig. S3). Combining metabolic response and DTILs identified distinct prognostic groups in terms of EFS (log-rank P = 0.042; Supplementary Fig. S3).
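The four-group classification used above can be expressed as a simple rule on the two median cutoffs (TILs 10% and C2 SUVmax 2.60, or 8.7% when DTILs are used). The sketch below is illustrative: the function name, the group labels, and the example patient are assumptions.

```python
def combined_group(tils_percent, suvmax_c2, tils_cutoff=10.0, suv_cutoff=2.60):
    """Assign a patient to one of four groups defined by the median cutoffs."""
    tils = "high" if tils_percent >= tils_cutoff else "low"
    suv = "high" if suvmax_c2 >= suv_cutoff else "low"
    return f"TILs {tils} / SUVmax {suv}"

# Hypothetical patient with baseline TILs of 5% and C2 SUVmax of 3.0.
group = combined_group(5.0, 3.0)
# The same rule with the DTIL median as cutoff.
group_digital = combined_group(9.0, 2.0, tils_cutoff=8.7)
```

The group with low TILs and high SUVmax at C2 is the one reported above to have the lowest pCR rate.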
Exploratory analysis of CelTIL as a predictor of pCR
Machine learning–based assessment of cellularity correlated weakly but statistically significantly with PET/CT SUVmax at the same timepoint (baseline: Spearman rho = 0.26, P = 0.006; after two treatment cycles: Spearman rho = 0.34, P = 0.042). In total, 55 patients had an assessable on-treatment biopsy. In a post hoc analysis, CelTIL was not found to predict pCR in this patient group (univariate OR = 1.57; 95% CI, 0.39–6.32; P = 0.527).
Discussion
Initial clinical trials of antibody–drug conjugates (ADC) in the neoadjuvant setting for HER2-positive breast cancer focused on de-escalation without compromising short-term efficacy (4, 18, 26). With the development and ongoing evaluation of second-generation ADC (27), the question of long-term efficacy becomes even more pertinent. With a median follow-up of over 5 years, the longest reported of any ADC-based neoadjuvant trial, we provide reassuring evidence regarding the efficacy of preoperative ADC-based treatment. In addition, we interrogated the metabolic response to treatment and the tumor microenvironment as a potential new combination biomarker to facilitate early treatment adaptation.
This updated efficacy analysis of PREDIX HER2 leads to three main conclusions. First, PREDIX HER2 confirms the excellent contemporary outcomes of early and locally advanced HER2-positive breast cancer when treated perioperatively, with 5-year RFS exceeding 90% in both treatment groups. These results stand in stark contrast to the dismal prognosis 20 years ago before the introduction of adjuvant trastuzumab (8). In addition, there is no signal for worse long-term outcomes after treatment with an ADC instead of standard chemotherapy and dual HER2 blockade. If anything, besides a small numerical increase of disease progression during treatment, there were numerically fewer patients with disseminated cancer or death following treatment with T-DM1 than with standard treatment, an observation also noted in the point estimates for RFS and OS. Finally, PREDIX HER2 offers reassurance regarding the excellent prognosis of patients achieving pCR regardless of administered treatment, therefore dismantling any potential objections that the “quality” of pCR is lesser when attained with an ADC. Taking into consideration the hazards of cross-trial comparisons, these observations are in accordance with previously published results from the KRISTINE trial (28), and support further investigation of treatment optimization strategies and novel ADC in the treatment of HER2-positive breast cancer that could challenge the place of chemotherapy and dual HER2 blockade combinations as the standard of care.
The second objective of the current study was to determine whether the combination of metabolic response and TILs enumeration provides more precise prognostic information compared with either biomarker alone. Previous studies have shown that metabolic response to neoadjuvant treatment predicts pCR (11, 17), while the association with patient survival has been mostly demonstrated on HER2-negative breast cancer and in retrospective studies (29). This protocol-predefined analysis of a prospective randomized trial has several advantages compared with other studies that have prospectively evaluated PET/CT changes during preoperative treatment for HER2-positive breast cancer, although it should be acknowledged that other studies have used PET/CT-defined endpoints as the primary study objective (11, 17). For example, all patients regardless of hormone receptor expression were enrolled in PREDIX HER2, while TBCRC026 only included ER-negative patients (11). In addition, both standard treatment and ADC were used in PREDIX HER2 and no difference in prognostic value was noted according to treatment arm, which strengthens the generalizability of our results. Moreover, we demonstrate that metabolic activity after two treatment cycles was a strong predictor of long-term survival, a finding reported for the first time by a prospective trial with integrated PET/CT evaluations. However, the biggest novelty of our study is that we show for the first time that combining metabolic response and TIL enumeration provides superior prognostic information, whether TIL enumeration is performed manually or using digital image analysis and therefore through a wholly automated prognostic tool. The potential clinical implications are clear, since early identification of poor responders may facilitate treatment adaptation and escalation with alternative treatment strategies.
The idea of combinatory markers for neoadjuvant therapy prediction is not new. For example, CelTIL combines tumor cellularity and TIL enumeration and has been previously shown to be strongly prognostic in HER2-positive breast cancer (25, 30). In our study, CelTIL was not found to predict pCR. Although this analysis concerned a small patient group and the observed wide confidence intervals preclude any robust conclusions, these limitations highlight the inherent difficulties that biomarkers based on on-treatment tissue biopsies have and preclude their routine implementation in the clinic. Even in PREDIX HER2 where on-treatment biopsies were mandatory, only 1 of 4 patients had usable tissue for assessment, reflecting quality degradation due to highly effective treatment and sampling errors. The fact that CelTIL was assessed on the basis of digital image analysis and after two instead of after one treatment cycle as in prior studies may have affected our results, although the main concerns remain unanswered.
Potential limitations of the clinical trial have been previously described in detail (18) and include treatment switch for patients with progressive disease after two treatment cycles; the definition of HER2 positivity according to the guidelines in use when the trial was initiated; the number of postoperative chemotherapy cycles per treatment group, which, however, does not affect the primary efficacy endpoint; and the moderate sample size, which means that correlative analyses should be considered exploratory and hypothesis-generating. Furthermore, the metabolic and/or immune prognostic model needs to be validated in future studies prior to implementation. Whether the combination of other metrics of metabolic activity, such as SUV normalized to lean body mass, and other tissue- or liquid-based markers offers superior prognostic information is the subject of ongoing investigation within a translational program based on systematic sample collection from patients who were enrolled in PREDIX HER2.
In summary, after long-term follow-up, PREDIX HER2 confirms the efficacy of both standard and de-escalated neoadjuvant treatment with T-DM1 for early and locally advanced HER2-positive breast cancer. In addition, we demonstrate that capturing the host immune response at baseline and the metabolic response to treatment is promising and could potentially be implemented as a fully automated prognostic tool, pending further validation.
Authors' Disclosures
A. Matikas reports other support from Veracyte and Roche outside the submitted work. A. Bosch reports other support from Pfizer, Roche, and Eli Lilly outside the submitted work, and is co-owner and chair of the board for SACRA Therapeutics. H. Lindman reports grants from Roche, as well as personal fees from Lilly, AstraZeneca, Daiichi Sankyo, Pierre Fabre, MSD, Novartis, Seagen, and Gilead outside the submitted work. J. Hartman reports personal fees from Stratipath, MSD, Pfizer, Roche, AstraZeneca, and Eli Lilly outside the submitted work. J. Bergh reports grants from Amgen, AstraZeneca, Bayer, Merck, Pfizer, Roche, and Sanofi-Aventis outside the submitted work, as well as honoraria to Asklepios Medicine HB. In addition, J. Bergh is co-author on a chapter on prognostic and predictive factors in early, nonmetastatic breast cancer in UpToDate; J. Bergh has also been offered stocks in Stratipath and offered to be consultant for that diagnostic company in early development. T. Foukakis reports grants and personal fees from Novartis; grants, personal fees, and other support from Pfizer; other support from Gilead and Roche; grants and other support from AstraZeneca; and personal fees from Affibody, Exact Sciences, and Veracyte outside the submitted work. No disclosures were reported by the other authors.
Authors' Contributions
A. Matikas: Conceptualization, formal analysis, investigation, writing–original draft, writing–review and editing. H. Johansson: Formal analysis, writing–original draft, writing–review and editing. P. Grybäck: Investigation, writing–review and editing. J. Bjöhle: Investigation, writing–review and editing. B. Acs: Software, investigation, methodology, writing–review and editing. C. Boyaci: Investigation, methodology, writing–review and editing. T. Lekberg: Investigation, writing–review and editing. H. Fredholm: Investigation, writing–review and editing. E. Elinder: Investigation, writing–review and editing. S. Margolin: Investigation, writing–review and editing. E. Isaksson-Friman: Investigation, writing–review and editing. A. Bosch: Investigation, writing–review and editing. H. Lindman: Investigation, writing–review and editing. J. Arda: Investigation, writing–review and editing. A. Andersson: Investigation, writing–review and editing. S. Agartz: Investigation, methodology. M. Hellström: Data curation, project administration. I. Zerdes: Data curation, investigation, visualization, methodology, writing–review and editing. J. Hartman: Software, supervision, investigation, methodology, writing–review and editing. J. Bergh: Resources, supervision, funding acquisition, investigation. T. Hatschek: Supervision, funding acquisition, investigation, methodology, writing–review and editing. T. Foukakis: Conceptualization, resources, supervision, funding acquisition, investigation, writing–review and editing.
Acknowledgments
A. Matikas is supported by the Swedish Cancer Society (Cancerfonden) Junior Clinical Investigator award 2021, the Iris, Stig and Gerry Castenbäcks Foundation, The Swedish Breast Cancer Association (Bröstcancerförbundet), and The Research Funds at Radiumhemmet. B. Acs is supported by The Swedish Society for Medical Research (Svenska Sällskapet för Medicinsk Forskning) Postdoctoral grant and the Swedish Breast Cancer Association (Bröstcancerförbundet). I. Zerdes is supported by the Swedish Society of Medicine and by the Swedish Society of Oncology postdoctoral grant. J. Hartman is supported by MedtechLabs, Region Stockholm, the Swedish Breast Cancer Association, the Swedish Cancer Society and Vinnova. The authors would like to thank Athanasios Zouzos for providing the PET/CT images for Fig. 3.
This study was supported by grants from Region Stockholm, Karolinska Institutet including Cancer Research KI, the Swedish Research Council, the Swedish Cancer Society, the Research Funds at Radiumhemmet, and Roche Sweden. The funding sources had no role in the data analysis, interpretation, writing of the article, or decision to submit the article.
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).