Approximately 40% of patients with stage I–III triple-negative breast cancer (TNBC) recur after standard treatment, whereas the remaining 60% experience long-term disease-free survival (DFS). There are currently no clinical tests to assess the risk of recurrence in TNBC patients. We previously determined that TNBC patients with MHC class II (MHCII) pathway expression in their tumors experienced significantly longer DFS. To translate this discovery into a clinical test, we developed an MHCII Immune Activation assay, which measures expression of 36 genes using NanoString technology. Preanalytical testing confirmed that the assay is accurate and reproducible in formalin-fixed paraffin-embedded (FFPE) tumor specimens. The assay measurements were concordant with RNA-seq, MHCII protein expression, and tumor-infiltrating lymphocyte counts. In a training set of 44 primary TNBC tumors, the MHCII Immune Activation Score was significantly associated with longer DFS (HR = 0.17; P = 0.015). In an independent validation cohort of 56 primary FFPE TNBC tumors, the Immune Activation Score was significantly associated with longer DFS (HR = 0.19; P = 0.011) independent of clinical stage. An Immune Activation Score threshold for identifying patients with very low risk of relapse in the training set provided 100% specificity in the validation cohort. The assay format enables adoption as a standardized clinical prognostic test for identifying TNBC patients with a low risk of recurrence. Correlative data support future studies to determine if the assay can identify patients in whom chemotherapy can be safely deescalated and patients likely to respond to immunotherapy.
The MHCII Immune Activation assay identifies TNBC patients with a low risk of recurrence, addressing a critical need for prognostic biomarker tests that enable precision medicine for TNBC patients.
Triple-negative breast cancer (TNBC) is a clinical subtype of invasive breast cancer that is defined by the absence of standard markers used for prognosis and treatment decisions [estrogen receptor (ER), progesterone receptor (PR), and HER2]. TNBC is notable for its aggressive behavior and high rates of local and distant recurrence (1). TNBC patients are treated with local therapy and cytotoxic chemotherapy. Patient outcomes are disparate. Approximately 42% of patients experience rapid relapses with a peak at 3 years from diagnosis, whereas the remaining 58% of patients have long-term disease-free survival (DFS; ref. 2). Physicians cannot currently predict which patients will relapse, even after intensive chemotherapy, and which patients will have long-term DFS and might do equally well with deescalation of their chemotherapy regimen. Currently, most TNBC patients are treated with aggressive chemotherapy, which can result in serious long-term toxicity including permanent peripheral neuropathy, cardiac toxicity, and secondary malignancies (3–8). A current goal of the TNBC biomarker field is to develop clinical tools that can be used to identify patients who do not require aggressive treatment and can be spared the associated toxicities.
We recently reported that expression of the MHC Class II antigen presentation pathway (MHCII) in TNBC tumor cells is significantly associated with long-term DFS (9). Further, high MHCII expression in tumor cells was associated with the presence of tumor-infiltrating lymphocytes (TIL; ref. 9), which are known to be associated with good prognosis in patients with TNBC (10–14). An independent research team performed IHC on 681 TNBC patient tumors and confirmed that high expression of MHCII in tumor cells was associated with large amounts of tumor-infiltrating CD4- and CD8-positive T cells, and longer DFS (15). Mouse studies have shown that MHCII expression on tumor cells triggers T-cell recruitment and inhibits tumor progression (16–23). A standardized method for the morphologic evaluation of TILs in patient tumor samples has been developed, but has not entered routine clinical practice (24, 25). Although promising, broad clinical implementation of this method may be limited by pathologist training, interobserver variability, and time required for assessment (26). Furthermore, this approach does not discern lymphocyte subsets or T-cell activation states (24, 25).
Although histologic assays for several MHCII proteins and TIL counting could be combined to develop diagnostic criteria, the process would be complex. Historically, multiplexed IHC assays (e.g., IHC4) have not performed as well as multiplexed gene expression assays (27, 28). Compared with traditional pathologic scoring systems, a multiplexed gene expression test can measure the expression of many genes in the MHCII pathway, quantify TIL markers simultaneously, and has a larger dynamic range of measurements with finer resolution.
In routine clinical practice, patients' tumors are collected and processed as formalin-fixed, paraffin-embedded (FFPE) tissues, which results in significant degradation of mRNA (29). PCR was the first technology used to demonstrate that small fragmented RNA transcripts could be recovered from FFPE tissue and used to accurately quantify gene expression in breast tumors (30). This enabled the development of the first gene expression prognostic assay for patients with hormone receptor–positive (HR+) breast cancer (Oncotype Dx; ref. 31). There are now several gene expression assays that are indicated for use in patients with HR+ breast cancer (32–37); however, there are no clinically validated assays available for patients with TNBC.
The NanoString nCounter platform is an alternative method for measuring gene expression in clinical FFPE specimens. NanoString nCounter technology is unique in that it measures RNA directly without amplification or cloning, which eliminates the biases that can be introduced by PCR- or sequencing-based methodologies (38, 39). One clinical prognostic test for HR+ breast cancer (Prosigna) utilizes NanoString technology (32, 37, 40). NanoString obtained a CE Mark for its Prosigna assay in 2012, followed by FDA clearance in September 2013. Prosigna is now included in clinical oncology guidelines for the management of HR+ breast cancer (41) and is performed in qualified clinical laboratories around the world. In this study, we leveraged this previous success in clinical assay development on the NanoString nCounter platform to develop an assay for MHCII and TIL gene expression that could be used to assess prognosis in TNBC patients.
Materials and Methods
NanoString probe design
A custom panel of probes for measuring expression of 36 genes on the NanoString nCounter platform was designed. Probe sequences were compared with RNA-seq data from TNBC tumors (9) to confirm that mRNA isoforms in TNBC would be detected by the probe sequences, and probes were redesigned as necessary. The probe sequences were then synthesized by Integrated DNA Technologies, Inc. The Probe A oligos were purified using high-performance liquid chromatography, and the Probe B oligos were purified using polyacrylamide gel electrophoresis. The full sequences of the probes are provided in Supplementary Table S1.
NanoString nCounter assay
We used NanoString nCounter Elements TagSets and Master Kits to develop the assay. Custom gene-specific oligonucleotide probes (Probe Sequence in Supplementary Table S1) were produced by Integrated DNA Technologies. Hybridization and counting were performed according to the manufacturer's specifications. Briefly, gene-specific probes were hybridized with NanoString Elements TagSets and RNA at 67°C for 24 hours. After hybridization, samples were transferred to the automated nCounter Prep Station for purification and immobilization onto the sample cartridge. After sample preparation was complete, the sample cartridge was transferred to the nCounter Digital Analyzer for imaging and analysis. All samples were analyzed using the maximum resolution setting (555 images per sample).
Approval for use of patient specimens
Approval for the use of archival tissue specimens was granted by Institutional Review Boards (IRB) at the University of Utah and the University of Kentucky. The research was conducted in accordance with recognized ethical guidelines including the U.S. Common Rule. Written-informed consent was obtained for fresh-frozen tissue collections. For previously collected archival FFPE blocks, the IRBs waived the requirement for informed consent.
RNA from frozen tissues
RNA remaining from frozen tissue collected for previous studies was used (9, 42). The RNA-seq data from these samples are publicly available through GEO Accession GSE58135. For the comparison of frozen and FFPE sections from the same tumor, frozen breast cancer specimens were obtained from the University of Kentucky Markey Cancer Center Biospecimen Procurement and Translational Pathology Shared Resource Facility (BPTP SRF). These tissues were collected from breast surgical specimens under IRB protocols # 04-0454 and 11-0750. Fresh-frozen breast tissues were embedded in Tissue-Tek O.C.T. Compound (Sakura Finetek) and sectioned at −20°C on a cryostat. An initial 4 μm tissue section was cut and stained using hematoxylin and eosin (H&E) so that tumor cellularity could be assessed by a pathologist. Only cases with ≥10% tumor cellularity were included. After assessing the H&E slide, a pathologist cut an additional 10 unstained sections at 10 μm each. Unstained sections were collected in lysis buffer and homogenized in a bullet blender (NextAdvance); RNA was then isolated using an E.Z.N.A RNA Isolation Kit (Omega Bio-tek). After frozen sections had been taken for RNA isolation, the remnant block was taken off the cryostat, placed in a tissue cassette, and submitted for routine processing and embedding (creation of an FFPE block) in a pathology laboratory.
FFPE sample identification
This project was performed under an approved University of Utah IRB protocol (#24487). Natural language searches were used to identify surgical pathology cases with a diagnosis of invasive carcinoma of the breast. Only breast tumors from patients with primary stage I–III breast cancer were included in the study. Surgical pathology reports were reviewed by a pathologist to determine ER, PR, and HER2 status. Only TNBC cases with pretreatment tumor material available in the archives were included. Detailed clinicopathologic, stage, and outcome data were obtained through review of the pathology report and medical record. DFS was defined as the length of time that the patient survived after a primary diagnosis of breast cancer without any evidence of local disease recurrence or distant metastases. Events included ipsilateral breast recurrence and distant metastases.
Slide review, macrodissection, and RNA isolation from FFPE tissue
A pathologist reviewed all cases and selected the best FFPE block from each case for analysis, taking care to avoid blocks with low tumor cellularity, or with large areas of necrosis, calcification, or fibrosis. For each block, a fresh H&E-stained slide and adjacent unstained sections (10 μm) were obtained. A board-certified pathologist reviewed each H&E section and confirmed the presence of invasive breast cancer. Tumors were required to be ≥4 mm in size and to have at least 10% tumor cellularity. Using these requirements, only a single case was initially deemed inadequate due to low tumor cellularity (<10%). In this case, an alternate block was selected from the same surgical pathology specimen; the alternate block had 60% tumor cellularity and was therefore included in the study. After assessing tumor cellularity, the pathologist circled tumor on the H&E slide for macrodissection, taking care to exclude large areas of necrosis, hemorrhage, calcification, and ductal carcinoma in situ. The pathologist also measured the tumor surface area to determine the number of unstained slides required for the assay. Prior to macrodissection, unstained slides (10 μm) were deparaffinized using Hemo-De (Scientific Safety Solvents), washed in 100% Ethanol, air-dried for 10 minutes, and then briefly rinsed in 3% glycerol. Tumor macrodissection was performed with a scalpel in order to isolate tumor-rich regions from unstained FFPE sections. Macrodissected tissue was subject to RNA isolation using a Roche column-based kit (HighPure FFPET RNA Isolation Kit, Roche Diagnostics). Briefly, macrodissected tissue from FFPE-unstained slides was digested overnight in Proteinase K, and RNA was bound to a silica column, treated with DNase, and then eluted in 30 μL of buffer according to the manufacturer's instructions. Isolated RNA was quantified using the Qubit 3.0 and the RNA-BR (Broad-Range) assay kit (ThermoFisher Scientific). 
RNA quality was assessed on the 2200 TapeStation (Agilent Technologies) using the Agilent RNA ScreenTape Assay. RNA integrity number (RIN) values for each specimen were recorded.
Normalization of gene expression values
The gene expression count values for each sample were normalized to correct for differences in background signal intensity across runs, and to correct for differences in RNA template quality and quantity between samples. The first step of normalization is background subtraction. In each NanoString nCounter run, a “no template” control sample was analyzed. The count values for each probe in this control were subtracted from the count values for each of the patient samples in the run. This is called “Blank lane background subtraction” in the NanoString nSolver analysis software. Next, the geometric mean of the Housekeeping genes was used as a normalization factor for each sample. Notably, the probe for the Housekeeping gene PSMC4 exhibited a very high percent coefficient of variation (101%; Supplementary Fig. S1) and was excluded from the normalization factor calculation. The normalized counts for each sample were then analyzed as described below.
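The two normalization steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the nSolver implementation: the housekeeping gene names HK1 and HK2 and all count values are hypothetical, background-subtracted counts are floored at 1 so that the geometric mean stays defined, and any run-wide rescaling that analysis software may apply is omitted.

```python
from statistics import geometric_mean

def normalize_counts(sample, blank, housekeeping, exclude=("PSMC4",)):
    """Background-subtract raw counts, then divide by the housekeeping geometric mean."""
    # Step 1: blank-lane background subtraction, floored at 1 (an assumption
    # made here so that geometric means and log transforms remain defined).
    corrected = {gene: max(count - blank.get(gene, 0), 1)
                 for gene, count in sample.items()}
    # Step 2: normalization factor from housekeeping genes, skipping excluded probes.
    used = [gene for gene in housekeeping if gene not in exclude]
    factor = geometric_mean([corrected[gene] for gene in used])
    # Step 3: scale every gene in the sample by that factor.
    return {gene: value / factor for gene, value in corrected.items()}

# Hypothetical raw counts for one patient sample and the "no template" lane.
sample = {"CIITA": 510, "HK1": 410, "HK2": 1610, "PSMC4": 90}
blank = {"CIITA": 10, "HK1": 10, "HK2": 10, "PSMC4": 5}
normalized = normalize_counts(sample, blank, housekeeping=["HK1", "HK2", "PSMC4"])
print(round(normalized["CIITA"], 3))
```

Excluding a probe such as PSMC4 only changes which genes enter the factor; the excluded probe is still background-subtracted and scaled like any other gene.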
Statistical analyses were performed in R version 3.5.0 and Graphpad Prism version 7.0C. The geometric mean is often used in the literature, as well as the NanoString nSolver software, to calculate a composite score of multiple internal control housekeeping genes for normalization of gene expression assays (43, 44). In this study, the geometric mean was used to calculate composite scores for basal-like gene expression, MHCII gene expression, TIL gene expression, and immune activation gene expression. This ensures that each gene in the score has similar weight, regardless of its baseline expression levels and dynamic range. This was particularly important when incorporating TIL genes into the same score as the MHCII genes expressed in tumor cells, because TIL genes inherently have lower mRNA counts because they are derived from a smaller fraction of cells in the sample. Thus, higher scores represent higher expression of all of the genes in the signature and avoid the risk that a single extremely high or low expressed gene in the signature will have uneven influence on the score.
The Basal-like Subtype score, MHCII Score, TIL Score, and Immune Activation Score were each calculated as the geometric mean of the normalized counts of the genes in the corresponding signature.
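A short Python sketch can illustrate why a geometric mean gives each gene similar weight in a composite score. The gene names GENE_LOW and GENE_HIGH and their counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import geometric_mean, mean

def composite_score(normalized, genes):
    """Geometric mean of the normalized counts for the genes in a signature."""
    return geometric_mean([normalized[g] for g in genes])

# Hypothetical signature: one low-count gene and one high-count gene.
genes = ["GENE_LOW", "GENE_HIGH"]
baseline = {"GENE_LOW": 10, "GENE_HIGH": 1000}
low_doubled = {"GENE_LOW": 20, "GENE_HIGH": 1000}
high_doubled = {"GENE_LOW": 10, "GENE_HIGH": 2000}

base_score = composite_score(baseline, genes)
score_low_doubled = composite_score(low_doubled, genes)
score_high_doubled = composite_score(high_doubled, genes)

# With an arithmetic mean, the high-count gene would dominate the score.
arith_low_doubled = mean(low_doubled[g] for g in genes)
arith_high_doubled = mean(high_doubled[g] for g in genes)
print(base_score, score_low_doubled, score_high_doubled)
print(arith_low_doubled, arith_high_doubled)
```

Doubling either gene raises the geometric-mean score by the same factor, whereas the arithmetic mean barely moves when the low-count gene doubles. This matches the rationale given above for combining TIL genes, which have inherently lower counts, with MHCII genes.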
Heatmaps of log-normalized gene counts were created using the R package “pheatmap” version 1.0.10. Survival analysis (Kaplan–Meier plots and Cox regression) was performed using the R package “survival” version 2.42-3 and the R package “survminer” version 0.4.2. Receiver operating characteristic (ROC) curve analysis was performed using the R package “pROC” version 1.12.1. The linear model of Risk of Recurrence was created using the glm function in R.
Analysis of public microarray data
The Kaplan–Meier Plotter tool (http://kmplot.com; ref. 45) was used to perform correlative analysis of publicly available gene expression datasets. The intrinsic subtype classification provided by the Kaplan–Meier Plotter tool was used to select cases for analysis (46). The following selections were applied to all analyses: only one JetSet best probe (47) for each gene was used in the multigene classifier that calculates the mean expression of the selected probes, relapse-free survival was selected for the analysis, patients were censored at the follow-up threshold (60 months), biased arrays were excluded, and redundant samples were removed. The most significant cutpoint was used to split patients into two groups.
IHC staining was performed on 4-μm-thick sections of FFPE tissue. The following antibodies were used: HLA-DR [Santa Cruz Biotechnology (sc-53319)] and HLA-DR/DP/DQ/DX [Santa Cruz Biotechnology (sc-53302)]. FFPE sections were air-dried and then melted in a 60°C oven for 30 minutes. Slides were loaded onto the Ventana BenchMark ULTRA automated staining instrument (Ventana Medical Systems) and deparaffinized with the EZ Prep solution. Antigen retrieval was performed using Cell Conditioning 1 (CC1, pH 8.5) for 64 minutes at 95°C. The primary antibody (dilution of 1:1,000 for HLA-DR/DP/DQ/DX; 1:2,000 for HLA-DR) was applied for 1 hour at 37°C. Signal amplification was performed with the amplification kit. Positive signal was visualized using the UltraView DAB Detection Kit, which is a Universal HRP Multimer that contains a cocktail of HRP-labeled antibodies (goat anti-mouse IgG, goat anti-mouse IgM, and goat anti-rabbit), utilizing DAB (3,3′-diaminobenzidine) as the chromogen. Tissue sections were counterstained with hematoxylin for 8 minutes. The slides were removed from the immunostainer and gently washed in a mixture of deionized water and DAWN solution to remove any coverslip oil applied by the automated instrument. The slides were then gently rinsed in deionized water until all of the wash mixture was removed, dehydrated in graded ethanol, cleared in xylene, and coverslipped. Positive and negative controls were included in all staining runs and stained appropriately. Benign human tonsil was used as a positive control, whereas skeletal muscle was used as a negative control. In addition, positive staining in macrophages and infiltrating lymphocytes served as internal positive controls for all cases. Scoring for HLA-DR and HLA-DR/DP/DQ was performed by a board-certified pathologist who was blinded to clinical variables. 
Expression of HLA-DR and HLA-DR/DP/DQ was assessed in tumor epithelial cells using a standard semiquantitative system: negative (0), weak (1), moderate (2), and strong (3).
A diagrammatic outline of this study's design and analyses is provided in Supplementary Fig. S2.
Design of the MHCII immune activation assay
The major goal of this study was to develop a multiplexed gene expression assay on the NanoString nCounter platform that could accurately measure the expression of MHCII and TIL genes in FFPE TNBC tumor specimens. We have named this the “MHCII Immune Activation” assay.
The MHCII Immune Activation assay uses custom gene-specific oligo probes designed to 36 genes including MHCII signature genes, TIL genes, Subtype Verification genes, and Housekeeping Control genes (Fig. 1A; Probe Sequences in Supplementary Table S1). The MHCII genes were selected based on significant association with longer DFS in the previous study (9). CIITA is the master transcriptional transactivator of the MHCII pathway and is required to induce expression of the other genes in the pathway (48, 49). Candidate TIL genes were selected based on high Spearman correlation (R > 0.5) with CIITA expression in the TNBC tumors in the previous study (9) and membership in the Gene Ontology classification “Positive regulation of T cell activation” (50–52). Nine candidate genes that were identified as TIL markers in recent publications were selected for the assay (53–55). The selected TIL genes include markers of T-cell types, as well as markers of T-cell activation, T-cell memory, and T-cell interactions with tumor cells. The Subtype Verification genes were previously determined to be the best distinguishers of basal-like TNBC from other subtypes using the PAM50 gene set (56). During the analytical/technical development of the PAM50 signature, statistical algorithms to identify the best housekeeping control gene sets for normalization in breast cancer were developed by our group (57). The five best housekeeping control genes for normalizing classifier genes across all types of breast cancer and across different ages of FFPE procurement were selected for this assay (57).
Preanalytical testing of the MHCII immune activation assay
We chose to develop the assay on the NanoString nCounter platform because previous studies reported that the platform provides accurate gene expression measurements even in degraded RNA from FFPE specimens (39). To ensure that the MHCII Immune Activation assay accurately measures gene expression in FFPE specimens, the MHCII Immune Activation assay was performed on three pairs of matched frozen and FFPE breast tumor specimens. Measurements were highly correlated (Spearman R2 = 0.89–0.96; P < 0.0001) between the high-quality RNA from frozen tumor sections (RIN = 9.0–9.7) and the degraded RNA from matched FFPE tumor sections (RIN = 1.0–4.5; Fig. 1B). Thus, the MHCII Immune Activation assay on the NanoString platform can accurately quantify gene expression in FFPE specimens.
To evaluate the reproducibility of the MHCII Immune Activation assay, 11 pairs of replicate FFPE breast tumor RNA samples were analyzed on the NanoString nCounter instrument. The two sets of replicate samples were processed by two different technical teams at our institution. The normalized counts were highly correlated between the pairs of replicates for each of the 11 samples (Fig. 1C; Spearman R2 = 0.98–0.99; P < 0.0001). Genes whose normalized counts were below 10 in both replicates have higher variation between replicates, reflecting natural variation in counting rare molecules. Therefore, the MHCII Immune Activation assay provides highly reproducible results on RNA isolated from FFPE tissue.
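The concordance statistic reported in these comparisons can be sketched in Python using only the standard library. The replicate counts below are hypothetical, and the squared Spearman coefficient is computed here as the Pearson correlation of average ranks, which is one common way to handle ties and may differ in detail from the software used in the study.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical normalized counts for the same genes in two replicate runs.
replicate_1 = [12.0, 85.0, 240.0, 1300.0, 5200.0]
replicate_2 = [15.0, 79.0, 260.0, 1250.0, 5600.0]
r_squared = spearman(replicate_1, replicate_2) ** 2
print(r_squared)
```

Because the statistic depends only on rank order, it is insensitive to the overall scale difference between runs, but small rank swaps among low-count genes can still lower it.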
To confirm that the MHCII Immune Activation assay accurately measures TIL genes, the assay was performed on FFPE specimens from histologically confirmed TIL-high and TIL-low TNBC tumors. The TIL genes were differentially expressed between TIL-high and TIL-low TNBC tumors, as expected (Fig. 1D).
The MHCII signature is associated with improved DFS in patients with basal-like TNBC, but not in patients with HR+ breast cancer (Supplementary Fig. S3). This observation is consistent with previous studies that have investigated immune and TIL signatures across breast cancer subtypes. In one of the largest studies, the presence of TILs was identified as an adverse prognostic factor in patients with luminal breast cancer, potentially reflecting the unique immunobiology of this HR+ subtype (58). Subtype Verification genes are included in the MHCII Immune Activation assay to exclude tumors that are not basal-like TNBCs from analysis. To confirm that the Subtype Verification genes in the assay are able to discern basal-like TNBC from other subtypes of breast cancer, the MHCII Immune Activation assay was performed on 33 FFPE breast tumor RNA samples that had been previously classified into intrinsic subtypes [basal-like (n = 8), luminal A (n = 8), luminal B (n = 8), and HER2-enriched (n = 9)] using the PAM50 assay (59). The Subtype Verification genes were differentially expressed between these subtypes of breast cancer, as expected (Fig. 1E). To develop an inclusion criterion threshold for basal-like TNBC tumors, a “basal-like score” was calculated for each sample, defined as the geometric mean of the Subtype Verification genes that are highly expressed in basal-like tumors (FOXC1, MKI67, CDC20, CCNE1, and ORC6). A threshold for the basal-like score that perfectly distinguished basal-like tumors from other subtypes was selected (Fig. 1F).
Performance of the MHCII immune activation assay in a training set of TNBC tumors
To evaluate the accuracy of the MHCII Immune Activation assay in TNBC tumor specimens, we analyzed RNA from fresh-frozen tissue samples (n = 44) that had been previously analyzed using RNA-seq (9). From each sample, 50 to 250 ng of RNA was hybridized with the custom gene-specific probes and Elements TagSets and analyzed on the NanoString nCounter Analysis System. The gene expression counts in each sample were background subtracted and normalized to housekeeping genes, as described in Materials and Methods. Five samples were excluded from analysis because they did not meet the basal-like score threshold defined in the preanalytical testing. The remaining 39 samples were analyzed for MHCII and TIL gene expression.
Three gene probes (HLA-DQA1, HLA-DRB5, and HLA-DRB6) were excluded from further analysis due to poor concordance between the RNA-seq and NanoString data (Supplementary Fig. S4). The remaining MHCII gene expression measurements obtained from the MHCII Immune Activation assay and from RNA-seq on the same samples were highly correlated (mean Spearman R2 = 0.88, mean P = 0.008, Fig. 2A). This result confirms the accuracy of this new MHCII Immune Activation assay on the NanoString nCounter instrument.
To determine if the MHCII Immune Activation assay could detect differential expression of MHCII genes between TNBC patients who relapsed and those who did not, an “MHCII Score” for each sample was calculated, defined as the geometric mean of the MHCII gene expression values. MHCII scores were significantly higher (one-sided Mann–Whitney P = 0.0022) in TNBC patients who did not relapse compared with those who did relapse (Fig. 2B). A Kaplan–Meier curve using a threshold for MHCII score that provides the most significant log-rank P value demonstrated that the MHCII Immune Activation assay reproduced the significant prognostic difference between tumors with high and low MHCII expression (log-rank P = 0.0045, Fig. 2C, threshold depicted in Fig. 2B). This result confirms that the MHCII gene expression signature maintains its prognostic significance on the NanoString nCounter platform.
A heatmap of the MHCII and TIL genes in TNBC patient tumors demonstrated that expression of MHCII and TIL genes is highly correlated within a tumor (Fig. 2D). Similarly, MHCII and TIL scores were correlated across samples (Spearman R2 = 0.71; Supplementary Fig. S5). To determine whether expression of the MHCII and TIL genes could be combined into a score that could be used to assess prognosis, an Immune Activation Score for each sample was calculated using the geometric mean of the MHCII and TIL gene expression values. Immune Activation Scores were significantly higher (one-sided Mann–Whitney P = 0.0041) in TNBC patients who did not relapse compared with those who did relapse (Fig. 2E). A Kaplan–Meier curve using a threshold for the Immune Activation Score that provides the same specificity (90%) as the MHCII score demonstrated that patients with high Immune Activation Scores have a significantly higher probability of DFS than those with low Immune Activation Scores (log-rank P = 0.022, Fig. 2F, threshold = 1,750 depicted in Fig. 2E). This result confirms the prognostic power of the Immune Activation Score generated by the MHCII Immune Activation assay.
Validation of the MHCII immune activation assay in an independent cohort
The second major goal of this study was to validate that the MHCII Immune Activation assay could be used to assess prognosis in an independent institutional cohort of TNBC patients. Chart review was used to select cases that generally represent the diverse presentation and outcomes that are seen in TNBC patients in clinical practice at the University of Utah (n = 56). Selected cases included age 35–70 (median, 55), stage I–III disease (majority stage II), tumor size T1–T4 (majority T2), histologic grade 2–3 (majority grade 3), and patients with positive and negative lymph nodes (Supplementary Table S2). Overall, these demographics and the number of cases are similar to the cohort used in the previous study and the training set (Supplementary Table S2; ref. 9).
A board-certified anatomic pathologist selected clinical FFPE tissue blocks in which there was adequate tumor tissue for macrodissection. All specimens were collected prior to chemotherapy. The MHCII Immune Activation assay was performed on RNA isolated from the TNBC FFPE specimens using a protocol similar to the Prosigna test, as described in detail in Materials and Methods.
Eleven samples were excluded from analysis because they did not meet the basal-like score threshold defined in the preanalytical testing. The observation that not all TNBC tumors will be classified into the basal-like subtype based on gene expression is consistent with prior studies that report the presence of luminal androgen receptor subtype tumors and HER2-enriched subtype tumors among TNBCs (59, 60). The remaining 45 samples were analyzed for MHCII and TIL gene expression.
The expression of MHCII and TIL genes was correlated within each tumor, similar to the training set (Fig. 3A). MHCII and TIL scores were also correlated across samples (Spearman R2 = 0.58, Supplementary Fig. S5). The geometric mean of the MHCII and TIL gene expression values was used to calculate an Immune Activation Score for each sample. Immune Activation Scores were significantly higher (one-sided Mann–Whitney, P = 0.0278) in TNBC patients who did not relapse compared with those who did relapse (Fig. 3B). A Kaplan–Meier curve using the same Immune Activation Score threshold as the training set demonstrated a significant prognostic difference between tumors with high and low Immune Activation Scores (log-rank P = 0.021, Fig. 3C, threshold = 1,750 depicted as a dashed line in Fig. 3B). This result confirms the prognostic significance of the MHCII Immune Activation assay in this independent cohort.
Assessing risk of recurrence using the MHCII immune activation assay
The most likely clinical use of the MHCII Immune Activation assay would be to identify patients who have a very low risk of relapse, and distinguish them from patients who have an average risk of relapse. To determine if the MHCII Immune Activation assay could be used to identify patients who have a very low risk of relapse, an ROC curve was calculated for the Immune Activation Scores in the training set and validation cohort (Fig. 4A, ROC statistics are provided in Supplementary Fig. S6). This clinical application of the assay requires high specificity to correctly identify patients who have a low risk of recurrence and avoid misclassifying patients who may recur. To evaluate the specificity of the assay, threshold analysis of the ROC curve was used to calculate the Immune Activation Score that results in 95% specificity for identifying patients who do not relapse in the training set (threshold = 2,400). The 95% confidence intervals (CI) for the threshold that provides 95% specificity are depicted in the ROC curve in Fig. 4A. When this Immune Activation Score threshold was applied to the validation cohort, the specificity for identifying patients who did not relapse was 100%, i.e., 0 patients with Immune Activation Scores above the threshold relapsed (Fig. 4B). Kaplan–Meier curves were created using this Immune Activation Score threshold to stratify patients, which demonstrates the difference in probability of DFS in both the training set (Fig. 4C) and the validation cohort (Fig. 4D).
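The threshold analysis can be illustrated with a small Python sketch. It treats a score above the threshold as a "low risk" call, so that specificity is the fraction of relapsed patients who are not called low risk, consistent with the 100% figure described above. The scores and outcomes are hypothetical, and the search over observed score values is a simplification; ROC software such as pROC may place candidate thresholds differently.

```python
def specificity(scores, relapsed, threshold):
    """Fraction of relapsed patients whose score is at or below the threshold."""
    relapse_scores = [s for s, r in zip(scores, relapsed) if r]
    if not relapse_scores:
        raise ValueError("specificity is undefined without relapsed patients")
    return sum(s <= threshold for s in relapse_scores) / len(relapse_scores)

def lowest_threshold_for_specificity(scores, relapsed, target=0.95):
    """Smallest observed score that reaches the target specificity as a cutoff."""
    for candidate in sorted(set(scores)):
        if specificity(scores, relapsed, candidate) >= target:
            return candidate
    return None  # not reached when at least one relapse exists

# Hypothetical training data: Immune Activation Scores and relapse outcomes.
train_scores = [300, 500, 800, 1200, 1900, 2600, 3100, 4000]
train_relapsed = [True, True, True, False, True, False, False, False]
cutoff = lowest_threshold_for_specificity(train_scores, train_relapsed)

# Apply the fixed training cutoff to a hypothetical independent cohort.
valid_scores = [250, 900, 1500, 2200, 3500]
valid_relapsed = [True, True, False, False, False]
print(cutoff, specificity(valid_scores, valid_relapsed, cutoff))
```

Fixing the cutoff on the training data and only then applying it to the second cohort mirrors the training and validation design, which avoids tuning the threshold on the data used to evaluate it.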
In multigene clinical tests used to assess prognosis in HR+ breast cancer (e.g., Prosigna and Oncotype Dx), the results are continuous variables that are linearly related to a patient's risk of recurrence (27, 61). Currently, the quantitative results of these tests are used to classify patients into groups of low, intermediate, and high risk of recurrence for clinical management. The Immune Activation Score produced by this assay is also a continuous variable. To determine if the Immune Activation Score produced by this assay is linearly related to a patient's risk of recurrence, the cumulative risk of recurrence was calculated for patients across the range of Immune Activation Scores observed in the training set and validation cohort. The risk of recurrence in both the training set and validation cohort is a linear function of the log10 Immune Activation Score (Fig. 4E). This result confirms that a patient's risk of recurrence is linearly related to the log10 of the Immune Activation Score. In the future, larger studies could be used to define thresholds to classify TNBC patients into groups with low, intermediate, or high risk of recurrence.
Cox proportional hazards regression models were generated to test the association between DFS, clinical variables, and Immune Activation Score in the training set and validation cohort. In univariate Cox regression, Immune Activation Score and stage at diagnosis were significantly associated with DFS in both the training set and validation cohort (Table 1). The Immune Activation Score hazard ratio (HR) was 0.1430 (95% CI = 0.03683–0.5555) in the training set and 0.2111 (95% CI = 0.06075–0.7335) in the validation cohort, indicating a good prognostic factor. The HR for stage was 2.1227 (95% CI = 1.439–3.131) in the training set and 1.628 (95% CI = 1.204–2.201) in the validation cohort, indicating a poor prognostic factor. The other clinical parameters, including age at diagnosis and whether the patient received chemotherapy, were not significantly associated with DFS (Table 1). In the multivariable Cox proportional hazards regression model for both the training set and the validation cohort, Immune Activation Score and stage at diagnosis both remained significant, and their HRs were similar to those in the univariate analysis (Table 1). This result indicates that the Immune Activation Score is an independent predictor of DFS, even when accounting for the differences in DFS associated with a patient's disease stage at diagnosis.
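For readers who want to relate the reported HRs to the underlying Cox coefficients, the following sketch recovers the log-hazard coefficient and its standard error from each HR and 95% CI quoted above. It assumes the intervals are Wald intervals of the form exp(beta ± 1.96 × SE), which the text does not state; under that assumption the HR should equal the geometric mean of the interval bounds, and the sketch checks this. The function name is illustrative.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def log_scale_summary(hr, ci_low, ci_high):
    """Return (beta, se, geometric_mean) for a hazard ratio and its 95% CI.

    Assumes a Wald interval, exp(beta +/- 1.96 * se), so that the point
    estimate is the geometric mean of the two bounds.
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    geometric_mean = math.sqrt(ci_low * ci_high)
    return beta, se, geometric_mean


# HRs and 95% CIs exactly as quoted in the text (univariate models).
reported = {
    "score, training": (0.1430, 0.03683, 0.5555),
    "score, validation": (0.2111, 0.06075, 0.7335),
    "stage, training": (2.1227, 1.439, 3.131),
    "stage, validation": (1.628, 1.204, 2.201),
}

summaries = {name: log_scale_summary(*values) for name, values in reported.items()}

for name, (beta, se, geometric_mean) in summaries.items():
    hr = reported[name][0]
    print(f"{name}: beta={beta:.3f}, se={se:.3f}, "
          f"HR={hr}, geometric mean of CI={geometric_mean:.4f}")
```

A negative coefficient (HR below 1) corresponds to a favorable prognostic factor and a positive coefficient (HR above 1) to an unfavorable one, matching the interpretation given for score and stage.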
A Cox proportional hazards model of the effect of stage alone in the validation cohort predicts that a patient diagnosed with stage IIB disease has a 59% probability of 5-year DFS. A Cox proportional hazards model including both stage and Immune Activation Score predicts that a stage IIB patient with a high Immune Activation Score of 4,000 has a 79% probability of 5-year DFS, whereas a patient with the same disease stage and a low Immune Activation Score of 400 has a 32% probability of 5-year DFS. This suggests that a clinical decision-making tool that incorporated the Immune Activation Score in addition to the patient's disease stage could provide improved assessment of a patient's risk of recurrence. Further studies in larger cohorts will be needed to train and evaluate a predictive model that incorporates the Immune Activation Score.
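The two predicted probabilities for stage IIB patients can be related back to a hazard ratio using the proportional hazards identity S(t | x) = S0(t)^exp(beta·x): at a fixed time and stage, the ratio of the log survival probabilities equals the hazard ratio between the two patients. The sketch below applies this to the rounded 79% and 32% figures. Treating the score as entering the model on a log10 scale is an assumption made here for illustration, not something stated for this model, and the function names are hypothetical.

```python
import math


def implied_hazard_ratio(dfs_high, dfs_low):
    """Hazard ratio of the high-score patient relative to the low-score patient.

    Under proportional hazards, log S(t | x_high) / log S(t | x_low)
    equals exp(beta * (x_high - x_low)) at any fixed time t.
    """
    return math.log(dfs_high) / math.log(dfs_low)


def predicted_dfs(reference_dfs, hr_per_unit, delta):
    """DFS probability after shifting the covariate by `delta` units."""
    return reference_dfs ** (hr_per_unit ** delta)


# Rounded 5-year DFS predictions quoted in the text for stage IIB patients.
dfs_score_4000 = 0.79
dfs_score_400 = 0.32

hr_high_vs_low = implied_hazard_ratio(dfs_score_4000, dfs_score_400)

# ASSUMPTION: the score is modeled on a log10 scale, so 400 -> 4,000 is
# a shift of one unit.
delta_log10 = math.log10(4000) - math.log10(400)
hr_per_log10_unit = hr_high_vs_low ** (1 / delta_log10)

# Round trip: start from the high-score prediction and move down one unit.
dfs_back_at_400 = predicted_dfs(dfs_score_4000, hr_per_log10_unit, -delta_log10)

print(f"implied HR per 10-fold increase in score: {hr_per_log10_unit:.3f}")
print(f"round-trip DFS at score 400: {dfs_back_at_400:.2f}")
```

Under these assumptions the implied HR per 10-fold increase in score is roughly 0.21, which is in line with the univariate validation-cohort HR reported for the Immune Activation Score. Because the input probabilities are rounded, this is a consistency check rather than a re-estimate.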
Comparison of MHCII Immune Activation assay with IHC and histologic TIL counting
The results from the MHCII Immune Activation assay confirm that elevated expression of MHCII and TIL genes is associated with a significantly reduced risk of recurrence in TNBC patients. To determine if these gene expression measurements correlate with traditional histologic assessment of MHCII expression and TIL counting, IHC and H&E staining were performed on FFPE sections from the specimens analyzed in the validation cohort, and the stained sections were reviewed by a board-certified anatomic pathologist who specializes in breast pathology.
In tumors with the highest Immune Activation Scores, MHCII protein was strongly expressed in a membranous pattern within infiltrating carcinoma cells and in associated TILs (Fig. 5A). Tumors with an intermediate Immune Activation Score showed variable MHCII expression; in these cases, staining was often heterogeneous and of moderate intensity (Fig. 5A). In tumors with the lowest Immune Activation Scores, MHCII protein expression was absent in invasive carcinoma cells and present only in rare tumor-associated inflammatory cells (Fig. 5A).
TIL quantification was performed using a histologic “gold standard” protocol developed by a consensus committee on TILs in breast cancer (24, 25). The TIL Score measured by the MHCII Immune Activation assay was highly correlated with morphologic assessment of stromal TIL percentage (Spearman R2 = 0.69, P < 0.0001, Fig. 5B). These results confirm that the MHCII Immune Activation assay on the NanoString nCounter provides a standardized and multiplexed procedure for measuring MHCII expression and TILs in FFPE tumor specimens that is highly correlated with histologic assessments.
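As an illustration of the rank-based comparison reported above, the sketch below computes a Spearman correlation between an assay-derived TIL score and a pathologist's stromal TIL percentage. The paired values are hypothetical and are not the study data. The sketch reports both the coefficient and its square, since the text quotes the statistic as "Spearman R2" and it is not clear from the text alone which of the two is meant.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements (NOT the study data).
til_score = [1.2, 2.5, 3.1, 4.8, 5.0, 6.3]        # assay-derived TIL score
stromal_til_pct = [5, 10, 10, 30, 20, 60]          # histologic stromal TIL %

rho = spearman(til_score, stromal_til_pct)
print(f"Spearman rho = {rho:.3f}, rho squared = {rho ** 2:.3f}")
```

A rank-based statistic is a natural choice here because the assay score and the histologic percentage are on different scales and need only agree in ordering, not in magnitude.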
The purpose of this study was to develop and validate a multiplexed assay for MHCII and TIL gene expression that could be used on FFPE tissue to assess a TNBC patient's risk of recurrence. The results of this study demonstrate that performing the MHCII Immune Activation assay on FFPE tumor specimens using the NanoString nCounter instrument provides accurate measurements of MHCII and TIL gene expression that are highly correlated with reduced risk of recurrence in TNBC patients with primary stage I–III breast cancer.
The most likely clinical use of the MHCII Immune Activation assay would be to distinguish TNBC patients who have a very low risk of relapse from those who have an average risk of relapse. We demonstrate that an Immune Activation Score threshold can be established to identify patients who have a very low risk of recurrence (Fig. 4) and may not require systemic therapy. Both the training set and validation cohort in this study included patients who did not receive systemic chemotherapy for a variety of reasons including advanced age, comorbidities, and patient preference (Supplementary Table S2). Excitingly, we found that patients with high Immune Activation Scores who did not receive systemic chemotherapy did not relapse (Supplementary Fig. S7A). To investigate this preliminary association further, we analyzed public microarray data from a larger cohort of patients with primary stage I–III basal-like breast cancer who did not receive systemic chemotherapy. We found that patients with higher expression of MHCII and TIL genes had significantly longer relapse-free survival, even without systemic treatment (Supplementary Fig. S7B). Future clinical studies are warranted to evaluate whether this assay could be used routinely to identify TNBC patients who inherently have a good prognosis and can safely be treated with local therapy alone. The MHCII Immune Activation assay enables precision medicine for TNBC patients and could help reduce the burden of chemotherapy-induced side effects in TNBC survivors.
Another potential clinical application of the MHCII Immune Activation assay is predicting response to immunotherapy. Recent studies have shown that expression of MHC class II molecules in melanoma cells is associated with improved response to anti–PD-1 immunotherapy in melanoma patients (62–64). Data presented at the American Society of Clinical Oncology 2017 annual meeting from the phase II randomized, controlled, multicenter I-SPY 2 trial (NCT01042379) demonstrated that 60% of newly diagnosed TNBC patients achieved pathologic complete response (pCR) when treated with the immune checkpoint inhibitor pembrolizumab in combination with standard neoadjuvant chemotherapy. This was a significant improvement compared with the 20% of patients who achieved pCR with standard neoadjuvant chemotherapy alone (65). Although this result is promising, it also indicates that 40% of TNBC patients in the pembrolizumab arm did not achieve pCR but were exposed to the significant risks associated with immunotherapy, which in this trial included autoimmune-mediated adrenal insufficiency, hepatitis, colitis, and hypothyroidism. Future studies are needed to determine whether the MHCII Immune Activation assay can be used to identify patients who are most likely to benefit from immunotherapy.
The MHCII Immune Activation assay produces measurements similar to those of histologic assays for MHCII expression and TIL counting (Fig. 5), but provides standardized methodology, a larger dynamic range of measurements, and multiplexed analysis of small specimens. The development of the Prosigna test for HR+ breast cancer has demonstrated that one key strength of assays developed on the NanoString nCounter is the ability to implement them as Laboratory Developed Tests in clinical laboratory sites across the world while maintaining standardized protocols and data analysis. Following demonstration of its clinical utility, the format of the MHCII Immune Activation assay will enable similar broad adoption as a clinical test for prognosis in TNBC patients, for which there are currently no clinical tests available.
Disclosure of Potential Conflicts of Interest
N.L. Henry reports receiving other commercial research support from Pfizer, AbbVie, Innocrin Pharmaceutical, and H3 Biomedicine. P.S. Bernard has ownership interest (including stock, patents, etc.) in Bioclassifier LLC and PhenoTx LLC. No potential conflicts of interest were disclosed by the other authors.
Conception and design: R.L. Stewart, K.L. Updike, P.S. Bernard, K.E. Varley
Development of methodology: R.L. Stewart, P.S. Bernard, K.E. Varley
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): R.L. Stewart, K.L. Updike, K.E. Varley
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): R.L. Stewart, K.M. Boucher, P.S. Bernard, K.E. Varley
Writing, review, and/or revision of the manuscript: R.L. Stewart, R.E. Factor, N.L. Henry, K.M. Boucher
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): R.L. Stewart, K.E. Varley
Study supervision: K.E. Varley
We acknowledge the direct financial support for the research reported in this publication provided by the Huntsman Cancer Foundation and the Women's Cancer Disease Oriented Team at Huntsman Cancer Institute. The project described was also supported by the NIH National Center for Advancing Translational Sciences through grant number KL2TR001996 (R.L. Stewart). Research reported in this publication utilized the University of Utah Huntsman Cancer Institute Biorepository and Molecular Pathology Shared Resource and the High-Throughput Genomics Shared Resource, which are supported by the NCI of the NIH under Award Number P30CA042014. This research was also supported by the Biospecimen Procurement and Translational Pathology and Oncogenomics Shared Resource Facilities of the University of Kentucky Markey Cancer Center (P30CA177558). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. Special thanks to Sheryl Tripp at ARUP Laboratories for her histologic expertise, and special thanks to Darah Johnson, S. Emily Bachert, and Donna Wall at the University of Kentucky for their technical assistance.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.