To investigate immune escape during breast tumor progression, we analyzed the composition of leukocytes in normal breast tissues, ductal carcinoma in situ (DCIS), and invasive ductal carcinomas (IDC). We found significant tissue and tumor subtype-specific differences in multiple cell types including T cells and neutrophils. Gene expression profiling of CD45+CD3+ T cells demonstrated a decrease in CD8+ signatures in IDCs. Immunofluorescence analysis showed fewer activated GZMB+CD8+ T cells in IDC than in DCIS, including in matched DCIS and recurrent IDC. T-cell receptor clonotype diversity was significantly higher in DCIS than in IDCs. Immune checkpoint protein TIGIT-expressing T cells were more frequent in DCIS, whereas high PD-L1 expression and amplification of CD274 (encoding PD-L1) was only detected in triple-negative IDCs. Coamplification of a 17q12 chemokine cluster with ERBB2 subdivided HER2+ breast tumors into immunologically and clinically distinct subtypes. Our results show coevolution of cancer cells and the immune microenvironment during tumor progression.
Significance: The design of effective cancer immunotherapies requires the understanding of mechanisms underlying immune escape during tumor progression. Here we demonstrate a switch to a less active tumor immune environment during the in situ to invasive breast carcinoma transition, and identify immune regulators and genomic alterations that shape tumor evolution. Cancer Discov; 7(10); 1098–115. ©2017 AACR.
See related commentary by Speiser and Verdeil, p. 1062.
This article is highlighted in the In This Issue feature, p. 1047
The importance of tissue-infiltrating leukocytes in breast tumor development and therapeutic responses is widely accepted, but the mechanisms underlying their effects and alterations of leukocyte composition during tumor progression are still poorly understood (1). Leukocytes are one of the most dynamic cell populations present within tumors and they also play a role in normal breast tissue remodeling during pregnancy and involution (2, 3). Tumor-associated macrophages (TAM) are known to facilitate angiogenesis, extracellular matrix (ECM) degradation, and tumor invasion, and high frequency of TAMs is associated with poor clinical outcome (3, 4). In contrast, higher frequency of tumor-infiltrating lymphocytes (TIL) and especially more CD8+ and fewer FOXP3+ regulatory T cells within tumors is associated with better outcome (1). The numbers and composition of TILs within tumors seem to be especially relevant in HER2+ and triple-negative breast cancer (TNBC) where tumors with higher TIL fractions have better response to HER2-targeted therapies and chemotherapy, respectively (1).
In DCIS, high leukocyte density has been observed in a subset of tumors with enrichment of leukocytes at sites of focal myoepithelial cell layer disruptions (5), suggesting that they might play a role in invasive progression. In DCIS, cancer cells are still physically separated from the stroma and direct leukocyte–cancer cell contact is rarely detected. With invasive progression, cancer cells and leukocytes are intermingled, and only cancer cells that can survive and proliferate in this environment will contribute to disease progression. Thus, the transition from in situ to invasive carcinoma might be a critical tumor progression step for immune escape in breast cancer, and deciphering its mechanism would aid the design of immunotherapies for both advanced and early-stage disease. Most prior analyses of leukocytes in breast tumors, especially in DCIS, have been limited to inferring leukocyte composition from gene expression profiles of bulk tumors (6–10) and to the testing of a handful of markers in archived tissue samples (11–16). Thus, our understanding of immune-related changes in early stages of breast tumorigenesis is still rather limited.
Here we used a combination of global profiling and single-cell methods for the cellular and molecular characterization of tissue-infiltrating leukocytes, with particular emphasis on T cells, in normal breast tissues, pure DCIS (no histologic evidence of invasion), and in HER2+ and triple-negative (TN) invasive ductal breast carcinomas. We also characterized genetic alterations in cancer cells that might affect the tumor immune microenvironment and disease progression. Our goal was to gain insights into the coevolution of tumor and immune cell compartments during the in situ to invasive carcinoma transition. We focused on HER2+ and triple-negative tumor subtypes, as these DCIS have a higher risk of invasive recurrence and the resulting invasive tumors are also more likely to progress to metastatic disease.
Leukocyte Composition of Human Breast Tissues
We first characterized the composition of tissue-infiltrating leukocytes in normal and neoplastic breast tissues using a polychromatic FACS, which allows for the quantitative assessment of all major leukocyte cell populations (ref. 14; Fig. 1A; Supplementary Fig. S1A). Quantification based on FACS was reproducible and accurate as confirmed by the analysis of the same tumor stained and profiled separately and by comparing it to histologic examination of tissue slides (Supplementary Fig. S1B and S1C). We analyzed normal breast tissues from nulliparous and parous women, including BRCA1 and BRCA2 mutation carriers, as well as DCIS and invasive ductal carcinomas (IDC) of different subtypes (Supplementary Table S1). In normal breast tissues, we analyzed epithelial and stromal fractions separately to detect potential differences between intraepithelial and stromal leukocytes. We found that DCIS and IDC contained significantly (P = 0.0015 and P < 0.0001, respectively) higher numbers of leukocytes, compared with normal breast, whereas in normal tissues, more leukocytes were in the stromal than in the epithelial fraction (Fig. 1B). We also observed significant differences in the relative frequencies of several CD45+ cell types, including increased neutrophils and decreased CD8+/CD4+ T-cell ratios in tumors compared with normal stroma (Fig. 1A and C; Supplementary Table S2). The relative fraction of CD8+ and CD4+ T cells showed significant inverse correlation in normal breast tissues and DCIS, and in DCIS higher fractions of CD8+ T cells were associated with significantly higher frequency of macrophages (Supplementary Fig. S1D). The frequencies of dendritic cells and T cells showed significant inverse correlation in TN IDCs (Supplementary Fig. S1D), whereas the relative proportion of γδ T cells was significantly higher in IDCs compared with normal breast tissues (P = 0.008) and DCIS (P = 0.0476; Supplementary Fig. S1E). This latter result is potentially interesting in light of recent findings in animal models showing that γδ T cells promote breast cancer metastasis via their recruitment of neutrophils (17). Corroborating this, we detected higher fraction of neutrophils in IDCs compared with normal breast tissues (Fig. 1C). However, the clinical relevance of γδ T cells in human breast tumors is inconclusive; some studies suggest that their frequency is associated with good prognosis (18), whereas others show that their higher frequency indicates poor outcome (19). Overall, these changes in leukocyte composition imply a switch to a more immunosuppressive and metastasis-promoting environment during breast tumor progression. We focused our subsequent analyses on T cells due their central role in immune responses and being the most dominant CD45+ cell population in both normal and cancerous breast tissues.
Next, we performed multicolor immunofluorescence for CD45 pan-leukocyte and CD3 T-cell markers to assess the spatial distribution of leukocytes within normal and neoplastic breast tissues. We also stained for smooth muscle actin (SMA) to mark myoepithelial cells surrounding the ducts; intact myoepithelial cell layer and basement membrane are detected in normal breast and in DCIS but lost in invasive breast carcinomas. In normal breast tissues there were relatively few T cells, with the exception of mastitis or other benign inflammation, and minor differences were observed between nulliparous and parous women (Fig. 1D; Supplementary Fig. S1F and S1G). In DCIS, we found significantly higher fraction of T cells in high-grade compared with low-grade DCIS (P < 0.0001), and HER2+ compared with HER2− DCIS (P < 0.0001), and DCIS adjacent to IDC also contained significantly (P = 0.0002) higher frequency of T cells than pure DCIS (Fig. 1E and F). However, the spatial distribution of T cells was highly variable in DCIS; some areas of the tumor had very high leukocyte and T-cell density, whereas others contained almost none (Fig. 1E). In several DCIS, we even detected probable tertiary lymphoid structures (TLS; ref. 20) characterized by tight clusters of B cells surrounded by T cells and the presence of peripheral-node addressin-positive (PNAd+) endothelial venules (Fig. 1G; Supplementary Fig. S1H). In normal breast tissues, T cells surrounded the ducts and intraepithelial T cells were frequently detected despite their numbers being relatively low (Fig. 1D), in contrast to DCIS where T cells were rarely detected within the ducts but rather localized in the stroma and at sites of focal myoepithelial cell layer disruption (Fig. 1E). These findings suggest that in DCIS there is limited interaction between T cells and cancer cells; thus, immune escape is most likely to occur during in situ to invasive carcinoma transition.
Gene Expression Profiles of CD3+ T Cells
To begin dissecting the properties of T cells infiltrating normal and neoplastic breast tissues, we performed RNA sequencing (RNA-seq) on purified CD45+CD3+ T cells from normal breast tissues (n = 12), DCIS (n = 11), and IDC (n = 12), focusing on HER2+ (n = 5), and TN (n = 6) IDC cases (Supplementary Table S1). In normal breast tissues, parity and BRCA status did not appear to influence T-cell gene expression profiles; thus, we considered all normal samples as one group. Principal component analysis (PCA) of T-cell gene expression profiles demonstrated separation of normal samples from DCIS. Although the HER2+ IDCs were more similar to normal cases, TN IDCs clustered with DCIS (Fig. 2A and B). Unsupervised hierarchical clustering of the samples using differentially expressed immune-related genes showed a clear separation of T cells from normal breast, DCIS, and HER2+ and TN IDC, indicating that these groups have largely distinct T-cell expression profiles (Fig. 2B; Supplementary Table S3). The most abundantly expressed genes were related to T-cell receptor (TCR) signaling (e.g., NFATC1, NFKB1) and antigen presentation (e.g., HLA-A, HLA-C), which appeared to be more highly expressed in normal samples. Transcriptional regulators of T cells (e.g., TBX21, GATA3) were somewhat consistently expressed among samples with the exception of EOMES, which had low expression in T cells from TN IDC, and FOXP3, which was more abundant in T cells from IDCs (both HER2+ and TN cases). IL18R1, CXCL13, CXCL1, and CTLA4 were also more abundant in T cells from IDCs than in DCIS, whereas the expression of ILF3 and NFKBIA and several interleukin receptors (e.g., IL2RA, IL10RA) was higher in T cells in DCIS and normal breast than in IDCs. TNFRSF25 and other TNF receptors were more highly expressed in DCIS-associated T cells compared with normal breast and IDCs, whereas LCK kinase, which is involved in T-cell development and TCR signaling, showed the opposite pattern; these results were also confirmed by immunofluorescence analysis (Supplementary Fig. S2A).
Gene set enrichment analysis (GSEA) using the canonical gene set compendium (21) showed that, as expected, immune-related gene sets were enriched in our list of significantly differentially expressed genes (Supplementary Fig. S2B; Supplementary Table S4). This included a downregulation of genes in the IL4 signaling pathway in IDC and normal samples compared with DCIS. Furthermore, the Th1Th2 and CTLA4 signaling pathways were upregulated in IDC compared with DCIS, and the complement pathway was higher in normal compared with DCIS samples. Differences between HER2+ and TN IDC included the upregulation of pathways signaling through IL6, IL7, CTLA4, and antigen presentation by MHC I in TN cases (Supplementary Table S4). Also enriched were gene sets corresponding to cell signaling pathways, transcription and translation, and cell cycle in the HER2+/TN IDC comparison, suggesting potential T-cell expansion. Overall, these results suggest a more inflammatory and more immunosuppressive environment in IDCs compared with DCIS, especially in TN tumors.
To obtain a more detailed view of the specific activities of T cells, GSEA was conducted using ImmuneSigDB (21), a compendium of gene signatures summarizing differences between two given groups of immune cells. First, we summarized enriched gene sets according to the frequency of a particular immune lineage and compared with the overall frequency of that given cell type appearing in the compendium (Fig. 2C). Gene sets relating to thymocytes were not as common as expected when comparing DCIS with IDC and HER2+ to TN IDC samples. In addition, gene sets relating to CD8+ T cells were enriched when comparing HER2+ with TN IDCs, and natural killer T (NKT) cells when comparing both DCIS and IDC with normal cases (proportionality test, Benjamini–Hochberg adjusted P < 0.05), suggesting that differences between these groups are driven by their activation state rather than lineage (Fig. 2C). Enriched gene sets from ImmuneSigDB were collapsed into network diagrams to highlight the directionality within each comparison (Fig. 2D). When comparing T cells from DCIS with normal cases, multiple enriched gene sets suggest that the expression profiles of T cells from DCIS samples are more similar to CD4+ regulatory T cells (Treg) or an undifferentiated CD4+ T-cell state. In contrast, the expression profile of T cells from normal samples was more varied. Comparison with IDC samples showed that DCIS T cells had a CD8+ or CD4+Th17 phenotype, whereas IDC samples were more similar to Tregs, suggesting even stronger immunosuppression in IDC compared with DCIS. Comparison between HER2+ and TN IDCs suggested activation of Th cells, including Th17 and Tregs in TN IDCs, whereas T cells from HER2+ IDCs were enriched for Th1 and Th2 cells (Fig. 2D; Supplementary Fig. S2C). As T cells from HER2+ and TN IDC showed very distinct profiles (Fig. 2A), we compared T cells from DCIS to those of HER2+ IDC alone (Fig. 2D). T cells from HER2+ IDC showed a higher enrichment of Th1, Th2, and Tregs compared to undifferentiated CD4+ T-cell gene sets. Conversely, DCIS T cells more closely resembled CD8+ T cells when compared with Treg or CD4+ T cells. Deconvolution of the composition of T-cell populations within each sample using CIBERSORT (22) supported this trend, with samples having the greatest CD8 or CD4 naïve content being in normal or DCIS. Differences in γδ T cells, Tregs, and CD4+ memory-activated T cells were noted among the different groups, with higher proportions of activated CD4+ memory cells and γδ T cells in HER2+ IDC and lower proportions of Tregs in normal samples (Fig. 2E). Overall, these data suggest that compared with normal tissues, both DCIS and IDC have more Tregs, and T cells in DCIS are enriched for cytotoxic CD8+ and undifferentiated/naïve CD4+ T-cell signatures, whereas IDCs have lower proportion of CD8+ T cells but more activated CD4+ Th cells and Tregs. Although inference of cellular composition based on deconvolution of RNA-seq data has limitations and ideally it would be best to characterize each T-cell subpopulation by single-cell RNA-seq, it provides estimates that agree well with FACS profiles for more frequent cell types (Supplementary Fig. S2D). Nevertheless, further experimental validation of these predictions is needed.
Activation of CD8+ T Cells in DCIS
The enrichment of DCIS T-cell expression profiles for CD8+ T-cell signatures and the predicted higher frequency of CD4+ memory–activated T cells in HER2+ IDC suggest potential activation and clonal expansion of T cells in DCIS and HER2+ IDCs. In line with this observation, the expression of MKI67, a proliferation marker, in our RNA-seq data was highly variable across samples. All HER2+ IDCs and a subset of DCIS displayed high MKI67 levels, suggesting T-cell expansion in these cases (Supplementary Fig. S3A and S3B). The expression of granzyme B (GZMB), a marker of activated CD8+ T cells, was also high in MKI67hi T cells in DCIS (Supplementary Fig. S3B). Moreover, gene signatures specific for naïve and cytotoxic T cells showed a positive correlation in MKI67lo T cells but not in MKI67hi T cells (Supplementary Fig. S3C; Supplementary Table S5), suggesting that MKI67hi T cells reflect expansion of activated CD8+ T cells. Immunofluorescence analysis of markers of activated and effector T cells, including CD8, GZMB, Ki67, and IFNγ, in an independent cohort (Supplementary Table S1) confirmed the predictions of the RNA-seq data, as we detected significantly higher fractions of GZMB+, Ki67+, and IFNγ+ CD8+ T cells in DCIS than in IDC in both HER2+ and TN subtypes (Fig. 3A–D; Supplementary Fig. S3D). Furthermore, the relative fractions of Ki67+CD8+ and GZMB+CD8+ T cells showed significant positive correlation in both DCIS and IDCs (Supplementary Fig. S3E). Importantly, analysis of matched DCIS and IDC samples in a cohort of patients who were diagnosed with pure DCIS and underwent lumpectomy, but years later recurred locally with IDC, demonstrated the same results (Fig. 3E–H), confirming that a decrease in activated CD8+ T cells is a feature of in situ to invasive breast carcinoma progression.
To further explore the potential clonal expansion of T cells in DCIS, we aligned our RNA-seq data to a library of known variable human complementarity determining regions (CDR) to determine the frequency of unique TCR segments in each sample (23). Samples were normalized on the basis of the relative fraction of T cells. The TCR is rearranged during T-cell maturation, and a population of T cells that have the same unique TCR sequence is defined as a TCR clonotype. TCR clonotype repertoire can be quantitatively assessed using the Shannon index of diversity (24). DCIS cases overall appeared to have more clonally expanded T cells than IDCs (Supplementary Fig. S4A), and TCR clonotype diversity measured by the Shannon index was significantly (P = 0.03) higher in DCIS compared with IDCs (Fig. 3I). The correlation between MKI67 read counts and the Shannon index of TCR clonotype diversity was the highest in TN IDCs, implying the expansion of multiple T-cell clones in these tumors (Supplementary Fig. S4B). Similar positive correlation was observed in TN IDCs between the expression of T-cell activation markers (e.g., GZMB, PRF1, and CCL4) and TCR clonotype diversity (Supplementary Fig. S4C), further implying the presence of polyclonal activated T cells. In contrast, the generally lower TCR clonotype diversity in HER2+ cases might reflect expansion of a limited set of T-cell clones. As an alternative method to assess the TCR repertoire in DCIS, we performed immunoSEQ that defines TCR clonotype diversity based on DNA sequence (25). ImmunoSEQ also showed polyclonality in all DCIS, although a few dominant clones were also detected, and the top 10 most abundant clones in each case revealed differences among samples (Supplementary Fig. S4D and S4E). Similarly, clustering of samples based on normalized frequencies of TCR v and j genes highlighted several similarities and differences among samples (Supplementary Fig. S4F and S4G). The two samples for which we had both RNA-seq and immunoSEQ data (DCIS15 and DCIS17) appeared to have the highest fraction of expanded clones based on TCR and were also highly MKI67+ and GZMB+. Interestingly, we found a few TCR clones that were shared (at the amino acid level) among different tumors and represented relatively more abundant clones (Supplementary Fig. S4H). For one of these TCRs, we were able to predict the putative antigen as EBNA3, an Epstein–Barr virus protein based on prior TCR repertoire analyses (26). The presence of shared, relatively prominent clones, in combination with our RNA-seq data, support the possibility of an adaptive immune response in a subset of DCIS and a potential decline of this after progression to IDC.
Immune Checkpoint Proteins
Our observation that IDCs, particularly TN IDCs, contained relatively more TILs but fewer in the activated state than DCIS implied potential exhaustion and immunosuppression in IDCs. To investigate this hypothesis, we analyzed our RNA-seq data for genes characteristic of exhausted state, including genes encoding for immune checkpoint proteins (e.g., CTLA4, PD-1, TIGIT; Supplementary Table S5). We found a positive correlation between exhausted and cytotoxic gene signatures in both MKI67hi and MKI67lo T cells, which was expected as these markers are commonly coexpressed in the same T-cell population and exhaustion is usually due to chronic activation. On the other hand, the expression of inhibitory and activating immune checkpoint proteins showed positive correlation only in MKI67hi T cells (Supplementary Fig. S5A and S5B). Similarly, the expression of T-cell activation and dysfunction-related genes (27) was significantly different in tumor-associated MKI67hi and MKI67lo T cells (Supplementary Fig. S5C).
To explore the expression of immune checkpoint proteins in further detail, we performed immunofluorescence analysis for TIGIT, PD-1, and PD-L1 in DCIS and IDCs, as these proteins have been previously implicated in breast cancer with clinical trials targeting them currently ongoing (Fig. 4A–C). TIGIT was expressed mostly in CD3+ T cells but also in other leukocyte populations (Fig. 4A; Supplementary Fig. S5D). In contrast, PD-L1 expression was more widespread with high expression in various immune cells and in a subset of tumor epithelial cells (Fig. 4B; Supplementary Fig. S5E). In a luminal A IDC with an extensive in situ region, PD-L1 was highly expressed in DCIS-associated myoepithelial cells but not in tumor cells, whereas in the invasive areas, the tumor epithelial cells were highly PD-L1–positive (Supplementary Fig. S5F). This finding implies that in DCIS, myoepithelial cells might play a role in immune suppression and tumor epithelial cells that gain this phenotype have a selective advantage during invasive progression. TIGIT and PD-1 were also frequently coexpressed in CD8+ T cells in both DCIS and IDCs (Fig. 4C), similar to what was described in melanoma (28). Quantification of the signal revealed slightly but significantly more TIGIT+CD3+ T cells in DCIS compared with IDCs in both HER2+ and TN cases (Fig. 4D), whereas the relative fraction of PD-1+CD3+ T cells was not significantly different among tumors (Supplementary Fig. S5G). In a few cases, we analyzed the expression of TIGIT by both FACS and immunofluorescence to confirm our results, and we detected similar TIGIT levels by both techniques (Supplementary Fig. S5H). We also analyzed the spatial distribution of TIGIT+ or PD-1+ CD3+ T cells to see potential differences associated with histology or tumor subtype. For this, we digitized the immunofluorescence images by recording spatial coordinates and marking tumor cells and T cells positive or negative for TIGIT or PD-1 and assessed the relative distribution of the various cell types (Supplementary Fig. S6A). We found that the fraction of T cells invading the tumor cells was significantly higher in IDC compared with DCIS only in TN cases (P = 0.19 in HER2+ and 0.006 in TN based on t test), implying that there is more intermixing between T cells and cancer cells in the TN but not in the HER2+ cases (Supplementary Fig. S6B). Interestingly, we also found that in TN DCIS adjacent to IDC, PD-1+CD3+ T cells can commonly be observed within the ducts in the epithelial compartment, whereas intraepithelial T cells were rarely present in HER2+ DCIS adjacent to IDC (Fig. 4B). These results imply potential differences in the microenvironment such as the extracellular matrix (ECM) or differences in local myeloid complexity of HER2+ and TNBCs that could influence the ability of T cells to invade tumors and enter the ducts. Correlating with this finding, collagen trichrome staining showed some differences between HER2+ and TNBCs, especially in DCIS, with HER2+ DCIS displaying a denser fibrous ECM potentially forming a physical barrier (Supplementary Fig. S6C).
The high expression of PD-L1 in tumor epithelial cells in TN IDCs (Fig. 4B) raised the possibility of specific upregulation of PD-L1 in these tumors due to copy-number gain, as we and others previously described that a 9p24 amplicon including CD274 (encoding for PD-L1), PDCD1LG2 (encoding PD-L2), and JAK2 occurs only in basal-like breast tumors where approximately 30% of cases have a high copy-number gain of this region (29, 30). To investigate this hypothesis in more detail, we examined the frequency of gains and losses of the CD274 locus in The Cancer Genome Atlas (TCGA) cohort composed of invasive breast cancers, and found an enrichment of gains in the basal subtype of approximately 40% (P < 0.001 χ2 test, Fig. 4E; Supplementary Fig. S7A–S7C). 9p24 copy-number gain was not associated with any differences in cytotoxic T-cell gene signatures, probably due to the expression of PD-L1 by many stromal cell types besides tumor epithelial cells (Supplementary Fig. S7D). 9p24 gain was also independent of neoantigen load and estimated immune content, but was associated with overall copy-number aberration (CNA) status only in TN cases (Supplementary Table S6). Similar comparisons in the Oslo cohort (9) that included both DCIS and IDCs reported a higher frequency of CD274 and PDCD1LG2 gain in IDC samples (6/12 patients) compared with DCIS (2/9 patients), which had basal-like expression profiles (Fig. 4E). To confirm the preferential gain of CD274 and overexpression of PD-L1 in TN IDCs compared with DCIS, we developed an immuno-FISH protocol that allows for the combined analysis of PD-L1 protein levels and CD274 copy-number status at the single-cell level in archived samples. Testing of 10 TN IDCs revealed CD274 copy-number gain and PD-L1 overexpression in 3 of 10 cases, whereas none of the 10 TN DCIS showed gain of this locus and had relatively low expression of PD-L1 (Fig. 4F). These results suggest a possible mechanism for the in situ to invasive breast carcinoma transition in TNBC through the selection for tumor cells with higher expression of PD-L1 due to increased copy number for CD274. The clinical relevance of these findings as they relate to the likelihood of response to PD-L1 blockade remains to be determined.
17q12 Chemokine Amplicon
To explore whether other genomic regions encoding for genes involved in immune regulation may also display differential copy-number gain between DCIS and IDCs, we analyzed known breast cancer amplicons for the presence of such genes. We found that the 17q12 chromosomal region that contains a cluster of genes encoding chemokines is located proximal to the ERBB2 amplicon, making it prone to possible coamplification (Fig. 5A). Interestingly, in the TCGA cohort, chemokine cluster (CC) gain was associated with ERBB2 amplicon status independent of immune content and overall CNA but was negatively associated with neoantigen load, suggesting coamplification as an alternate mechanism for immune activation (Supplementary Table S6). An enrichment of CC gains was observed in HER2+ tumors defined either by PAM50 classification or ERBB2 amplification (50% compared with 2%, P < 0.001 χ2 test; Supplementary Fig. S7E). Interestingly, these HER2+ tumors can roughly be divided into two groups: those which are HER2+ER+ luminal-like and display evidence of coamplification, and those which are HER2+ER− as defined by PAM50, which do not exhibit coamplification and in fact suggest loss of this locus (Fig. 5B). To assess whether the copy-number gain of this locus is associated with changes in leukocyte composition, we analyzed the expression of immune activation and exhaustion-related genes in these samples and found higher expression of both activation and inhibition-related genes in tumors that lack CC gain (Fig. 5C). Cytotoxic gene scores were associated with CC gain particularly in HER2+ER− tumors (Supplementary Table S6). These results suggest that HER2+ER− tumors with loss of the CC region are more likely to have T-cell infiltrates than HER2+ER+ tumors with gain of the CC region.
Next, we developed a multicolor FISH assay to evaluate both the 17q12 CC and ERBB2 copy-number gain in HER2+ IDCs and DCIS at the single-cell level (Fig. 5D; Supplementary Table S7). Adjacent slides were used for immunofluorescence to assess T-cell frequency and activation. We found that cancer cells with ERBB2 gain had significantly higher chance of gaining the CC locus, especially in pure DCIS cases (Fig. 5E; Supplementary Table S7). Analysis of the intratumor spatial distribution of cells with or without CC gain did not reveal any specific pattern in DCIS nor in IDC (Supplementary Fig. S7F and S7G). The Shannon index of diversity for CC and ERBB2 copy numbers did not display significant correlation with frequencies of CD3+ T cells and CD45+ leukocytes, neither in pure DCIS nor in DCIS/IDC cases (Supplementary Fig. S7H). However, CC amplification showed significant inverse correlation with the frequency of GZMB+CD8+ T cells within tumors (Fig. 5F). Moreover, we found a relatively strong positive correlation between the Shannon index and patient age at diagnosis in HER2+ IDC cases but not in pure DCIS (Supplementary Fig. S7I). To investigate whether the decreased T-cell activation in tumors with CC gain might be due to higher frequencies of myeloid-derived suppressor cells (MDSC), we stained adjacent slides for CD68, CD33, and HLA-DRB1 that allow for the detection of macrophages (CD68+HLA-DRB1+CD33+), and monocytic (CD68+HLA-DRB1−CD33+) and granulocytic (CD68−HLA-DRB1−CD33+) MDSCs (Supplementary Fig. S7J). However, the numbers of macrophages and MDSCs did not correlate with CC copy number or fraction of GZMB+CD8+ T cells (Supplementary Fig. S7K and S7L), suggesting other mechanisms underlying the “immune cold” status of these tumors. Finally, a generalized linear model predicting CC coamplification (Supplementary Table S7) suggested an association between ERBB2 and ER, consistent with our findings from the TCGA cohort (Fig. 5B). These results imply potential selection for cancer cells that have lost the 17q12 CC amplicon during DCIS to IDC progression only in HER2+ER− tumors, but retention of coamplification in HER2+ER+ IDCs, and some of this selection might influence the immune microenvironment of the tumors.
Evasion of immune surveillance is a necessary step in tumor evolution. Despite its importance, our understanding of mechanisms of immune escape in human tumors is rather limited. In DCIS, the tumor cells are relatively protected from the immune system due to an intact myoepithelial cell layer and basement membrane, and intraductal T cells are rarely detected (Fig. 6). In contrast, in invasive disease, cancer and immune cells are intermingled. Thus, the in situ to invasive carcinoma transition might be a particularly important bottleneck for immune escape and tumor evolution, and the analysis of this could identify novel targets for immunotherapies, which has had limited success so far in patients with breast cancer (1, 31). One reason for this limited success might be that different tumors evade the immune system via different mechanisms, and thus the identification and characterization of immune escape mechanisms is critical for the design of novel and more effective immunotherapies.
Here, we investigated potential mechanisms of immune escape in breast cancer by analyzing the composition and molecular profiles of leukocytes, with special emphasis on T cells, in normal breast tissues, DCIS, and IDC. Using clinical samples from normal and neoplastic breast tissues, we performed FACS analyses and RNA-seq on sorted T cells. As tissue dissociation can influence the number of viable cells recovered and may also affect the detection of some antigens (the relatively high fraction of Lin− cells in our FACS may be due to this), we also tested selected genes by immunofluorescence on intact tissue sections. Using these approaches, we determined that T cells in DCIS are enriched in activated effector CD8+ T cells characterized by the expression of GZMB and MKI67, but the frequencies of these cells decreases in invasive disease. This was particularly evident in DCIS cases that recurred locally as IDC, implying that decreased immune activity may be necessary for invasive progression. Currently there are no clinically useful biomarkers that would predict the risk of invasive progression of DCIS, and considering that approximately 40,000 women are diagnosed annually with DCIS in the United States alone, this is an important clinical problem. Our data suggest that the frequency of activated CD8+ T cells may predict which DCIS is likely to progress to invasive disease. Analysis of the TCR repertoire also displayed significantly higher clonotype diversity in DCIS compared with IDCs, and we identified several more prominent shared clones among cases. Interestingly, one of these TCRs was predicted to react to an epitope in EBNA3, an Epstein–Barr virus protein. This finding does not necessarily imply a casual role for the Epstein–Barr virus in breast cancer, but it is consistent with a hypothesis that some T cells reactive to commensal microflora and common infectious agents may cross-react with some tumor neoantigens (32). However, further studies are required to test this hypothesis.
Several immune checkpoint proteins displayed significant differences between DCIS and IDC, particularly in TN cases. TIGIT+ T cells were more common in DCIS than in IDC, especially in the TN subtype, whereas the expression of PD-L1 was almost undetectable in DCIS tumor epithelial cells but increased to higher levels in IDC with the amplification of the CD274 locus encoding PD-L1 in a subset (∼30%) of the cases. In Hodgkin lymphoma, the presence of this same 9p24.1 amplicon is highly predictive of response to PD-1 blockade (33), raising the possibility that this might also be the case in TNBCs, but this remains to be explored. The expression of CTLA4 was also higher in T cells from IDCs compared with DCIS, further supporting the development of a suppressive immune microenvironment during invasive progression. The changes in the expression of these immune checkpoint proteins during DCIS to IDC transition suggest that the application of immunotherapies can be and may actually be more effective at earlier stages compared with metastatic disease. This hypothesis is supported by findings in animal models showing that the most efficient antitumor immune response was achieved when primary tumor and draining lymph nodes were still present (34, 35). Thus, it would be useful to test whether immunotherapy could be more effective in breast cancer when applied in the neoadjuvant phase in combination with chemotherapy.
Most prior studies analyzing leukocytes and the prognostic value of TILs in breast cancer have focused on invasive tumors, whereas DCIS and premalignant lesions have been relatively neglected. This is in part due to difficulties with obtaining fresh tissue samples from these small preinvasive lesions that are required for comprehensive profiling studies, and also due to the focus on advanced-stage disease that is subject to systemic therapies including immunotherapy. Previous genomic characterization of DCIS cases has identified gene expression signatures implying the presence of activated T cells in a subset of tumors (8), but this was not confirmed by any other method. Prior studies have also shown that the frequencies of Tregs increased during tumor progression, suggesting that this could be used to predict risk of invasive progression (13). However, in our experience the frequency of FOXP3+ T cells is very low in DCIS (<10% of T cells) and their topologic distribution is highly variable within tumors, making their assessment inaccurate in thin tissue sections (see Supplementary Methods). In contrast, activated GZMB+CD8+ and Ki67+CD8+ T cells were very common within DCIS, implying that they could potentially be better biomarkers for predicting risk of invasive progression. Our observation that the frequency of these cells is decreased in local invasive recurrence of DCIS supports this hypothesis, but further testing in large, uniformly treated cohorts with long-term follow-up would be required to determine this. It would also be useful to analyze each T-cell subset in detail in both DCIS and IDC, possibly by single-cell RNA-seq, as inferring the composition of T cells based on expression profiles has limitations.
We have also found evidence for coevolution of cancer cells and leukocytes as exemplified by the negative association between the coamplification of the 17q12 chemokine cluster with ERBB2 in HER2+ breast tumors and the presence of activated T cells within tumors. This coamplification was more common in PAM50 luminal HER2+ IDCs compared with the PAM50 HER2-enriched subset, which may reflect different evolutionary paths for luminal, commonly ER+, and nonluminal ER−HER2+ tumors. Interestingly, the copy-number gain was relatively low in a subset of luminal HER2+ tumors for both ERBB2 and the CC locus, which could imply higher intratumor heterogeneity in these tumors that can contribute to their lower response rate to HER2-targeted therapies. The lower frequency of TILs and particularly fewer activated GZMB+CD8+ T cells in these luminal HER2+ IDCs could further increase the probability of treatment resistance, as the presence of these cells is associated with better response to both chemotherapy and HER2-targeted therapies. Because of the high number of chemokines in this 17q12 cluster and their diverse function, dissecting the mechanism by which coamplification of this locus affects T-cell activity would require more detailed functional studies in the future.
Immunoediting is a key step in tumor evolution that can be divided into elimination, equilibrium, and escape phases (36). Our data suggest that in DCIS, due to a mostly intact basement membrane and myoepithelial cell layer, relatively few cancer cells are exposed to immune cells and these can be mostly eliminated, keeping the tumor in the intraductal “equilibrium” stage. Because of subsequent selection for cancer cells that have lost (or never had) neoepitopes and/or upregulated immunosuppressive mechanisms, tumor cells escape from immune surveillance, and invasive and subsequent metastatic progression occurs. Therefore, the in situ to invasive breast carcinoma transition appears to be the most critical tumor progression step from both clinical and immunology standpoints. A better understanding of cellular and molecular changes that occur during DCIS to IDC transition could improve the design of immunotherapies for the treatment of both early-stage and advanced-stage disease. At the same time, our results also suggest that assessing the frequencies of activated CD8+ T cells in DCIS may identify patients with high risk of invasive progression.
Human Tissues and FACS
Fresh and archival formalin-fixed, paraffin-embedded (FFPE) tissue samples were obtained from Brigham and Women's Hospital/Dana-Farber/Harvard Cancer Center, Sutter Health, Washington University (St. Louis, MO), Yonsei University Medical College Gangnam Severance Hospital (Seoul, Korea), and Seoul National University Bundang Hospital (Seoul, Korea). Samples were deidentified prior to transfer to the laboratory. All Institutional Review Boards approved the protocol and waived the informed consent requirement. Fresh human normal and neoplastic breast tissues were dissociated as described previously (37). Polychromatic FACS was performed essentially as described previously (14). Briefly, single-cell suspensions were blocked in PBS with 0.5% BSA and 2 mmol/L EDTA and stained at 4°C for 30 minutes with antibodies listed in Supplementary Table S8. Live-Dead Aqua (Invitrogen) was used to eliminate dead cells from the analysis, and isotype control antibodies and unstained cells were used as negative controls. Single-antibody staining was used for gating controls. Cells were analyzed and sorted using BD LSR Fortessa cell analyzer and BD FACSAria II SORP UV (Becton Dickinson), respectively. For RNA-seq, EpCAM+ cells were gated out to avoid epithelial contamination, and CD45+CD3+ double-positive cells were sorted into PBS or lysis buffer (1% 2-mercaptoethanol in RLT plus buffer, Qiagen). Associations between different cell populations was computed using Spearman correlation.
Histology and Immunofluorescence Analyses
Histology and multicolor immunofluorescence analyses were performed using 5-μm sections of FFPE tissues essentially as described previously (38). Briefly, slides were deparaffinized in xylene and hydrated in a series of descending ethanol concentrations. After heat-induced antigen retrieval in either citrate (pH = 6) or TRIS-EDTA (pH = 9) buffer, the samples were permeabilized with 0.5% TritonX-100, blocked with 5% goat serum PBS, and sequentially costained with antibodies as indicated in Supplementary Table S8. For TIGIT and CD33 staining, TSA indirect kit was used following the manufacturer's instructions (PerkinElmer). Image analysis was performed on 3 × 3 montage images acquired by Nikon Ti microscope attached to a Yokogawa spinning-disk confocal unit, 40 × Plan Apo objective, and OrcaER camera controlled by the Andor iQ software. Masson trichrome staining was performed as indicated by the manufacturer (American MasterTech). Myeloid suppressor cell classification and staining was performed as recommended by Bronte and colleagues (39).
Multicolor FISH and Immuno-FISH
FISH to analyze ERBB2 and 17q12 chemokine cluster was performed using whole sections of FFPE breast tumor samples as described previously (38). Briefly, 5 μm FFPE tissue sections on silylated glass slides were baked overnight at 70°C, dewaxed in xylene (Leica Microsystems), washed in 100%, 70%, and 50% ethanol (Leica Microsystems), rinsed with H2O, and air-dried for 1 hour. Tissue digestion was performed by treating the samples with 0.08% Pepsin in PBS (DAKO) at 37°C for 15–22 minutes. Afterward, the slides were washed in H2O, dehydrated in increasing concentrations of ethanol (50%, 70%, 85%, 100%), and air-dried. A FISH probe mix, containing ERBB2 BAC probe (RP11-94L15, provided by Drs. Hege Russness and Inga Rye, Oslo, Norway; labeled with SpectrumOrange dUTP by Nick Translation Kit; Abbott Molecular, according to the manufacturer's recommendations), CEP17Aqua probe (Abbott Molecular), and 17q12 chemokine cluster BAC probes (equal concentration of RP11-791G14 and RP11-278F22 cDNA; labeled with SpectrumGreen by Nick Translation Kit), was then applied to the slides, covered with coverslip, and sealed with rubber cement. Hybridization was performed for 7 minutes at 75°C followed by overnight incubation at 37°C in a humid chamber. Next, the slides were washed in 0.4× SSC with 0.3% NP-40 for 2 minutes at room temperature, in 0.4× SSC with 0.3% NP-40 for 2 minutes at 74°C, then in 2× SSC with 0.1% NP-40, in 2× SSC, and in PBS. Nuclear counterstaining was performed by 10-minute incubation at room temperature with 1 μmol/L To-Pro-3 in PBS (Molecular Probes, Life Technologies). After two additional washes in PBS and one in H2O, the slides were dried and mounted with Vectashield Mounting Medium (Vector Laboratories), covered with coverslips, and stored overnight to 3 days at −20°C. Different immunofluorescence images from multiple areas of each sample were acquired with a Nikon Ti microscope attached to a Yokogawa spinning-disk confocal unit, 40× plan apo objective, and OrcaER camera controlled by Andor iQ software.
For the detection of CD274 gene (encoding PD-L1) and PD-L1 protein, slides were deparaffinized and digested with Proteinase K for 20 minutes at 37°C before staining with PD-L1 antibody as described previously. Coordinates of imaged areas were recorded. After imaging, the same procedure as used for multicolor FISH was followed using CEP9 probe (Abbott Molecular) and BAC probe 599H20 (Life Technologies) labeled with SpectrumOrange dUTP by Nick Translation Kit (Abbott Molecular). After hybridization and washes, slides were mounted using Vectashield with DAPI (Vector Laboratories) and imaged using the set of coordinates from the first round of imaging. Images were then overlaid using R to compute overlaps and crop images.
RNA-seq and Data Analysis
RNA was isolated from purified CD45+CD3+ cells by cell sorting using the antibodies described above. After approximately 2,000 cells were sorted into 96-well plates with 2.5 μL PBS [containing 0.5 μL RNaseOut (Life Technologies) and 0.5 μL dithiothreitol (DTT; Life Technologies)], cytoplasmic RNA was isolated as described previously (40). Briefly, 2.5 μL of 2 × selected cytoplasm lysis buffer (SCLB) was added and the cells were lysed by pipetting. Lysates were centrifuged at 8,000 rpm for 5 minutes at 4°C. Supernatants (∼5 μL) containing the total cytoplasmic RNA were transferred to PCR tubes. 5′-phosphorylated oligo-GdT24 (pGdT24) primer was used for reverse transcription by Superscript Reverse Transcriptase III (Life Technologies) followed by second-strand synthesis. cDNAs were purified with Genomic DNA Clean & Concentrator Kit (Zymo Research) followed by DNA blunt ending, 5′-end phosphorylation and ligation with End-It DNA End-Repair Kit (Epicentre) and T4 DNA ligase (Epicentre). Products were directly amplified using REPLI-g UltraFast Mini Kit (Qiagen) and purified using Genomic DNA Clean & Concentrator Kit (Zymo Research Corp). Products were fragmented to 200–500 bp by a Bioruptor Sonicator (Diagenode). Fragment sizes were validated using High Sensitivity DNA Kit (Agilent Technologies) and Agilent 2100 Bioanalyzer. Functional concentrations were determined by qPCR with standards for Illumina sequencing libraries.
RNA-seq libraries were (single-end) sequenced using the Illumina NextSeq 500 Next Gen Sequencer. Datasets were aligned to the human reference GRCh37/hg19 genome using the STAR RNA-seq aligner (version STAR_2.5.1b; ref. 41). Two-pass mapping was performed using the following modified parameters: –outSAMstrandField intronMotif, –outFilterMultimapNmax 20, –alignSJoverhangMin 8, –alignSJDBoverhangMin 1, –outFilterMismatchNmax 999, –outFilterMismatchNoverLmax 0.1, –alignIntronMin 20, –alignIntronMax 1000000, –alignMatesGapMax 1000000, –outFilterType BySJout, –outFilterScoreMinOverLread 0.33, –outFilterMatchNminOverLread 0.33, –limitSjdbInsertNsj 1200000, –chimSegmentMin 15, –chimJunctionOverhangMin 15, –twopassMode Basic. The read counts for individual genes were generated using the htseq-count script of the HTSeq framework (version 0.6.1p1; ref. 42) using modified parameters (–stranded no) and the hg19 refGene annotation file available at the UCSC Genome Browser. Genes were filtered to retain only those with at least 45 counts across all samples. Differential gene expression analysis was performed with both edgeR (43) and DESeq2 (44) using design models which take into account batch effects. For PCA and heat-map visualizations, counts were converted to log2(counts per million) and batch adjusted using limma (45). Heat-map visualization was performed using the intersect of genes defined as differentially expressed using both edgeR and DESeq2, which also appear in the ImmPort database of immune-related genes (46). Estimation of the relative proportions of different T cells was performed using CIBERSORT (22), using the author's gene signatures for members of the T-cell lineage and 100 permutations. Testing for significant differences among groups was performed using ANOVA.
GSEA (47) was performed using the HTSAnalyzeR package (48) using a Benjamini–Hochberg corrected P value cutoff of 0.05 to define significant gene sets. Two gene set collections from MSigDB were assessed: (i) curated canonical pathways (c2) to determine whether immune-related gene sets were differentially expressed and (ii) immunologic signatures (c7) reflecting genetic perturbations or differences in cell states (21). The c7 compendium contains 389 curated studies comparing differences in gene expression between cells of different lineages, or cells before and after treatment with specific chemokines. Thus, the multiple appearance of a gene set (e.g., Treg vs. CD4+ T cell) in independent studies would strongly support a difference in phenotype. Enriched gene sets were summarized by: (i) the frequency of specific immune cell types (e.g., CD4+, CD8+, thymocytes) appearing as differentially expressed, with significance calculated using proportionality testing with correction for multiple hypothesis testing, and (ii) the corresponding directionality of these enriched gene sets in network diagrams. Node sizes in network diagrams were determined by the frequency of a given cell lineage appearing in the list of enriched gene sets. Edge weights were computed by the net number of gene sets supporting a link between cell types. For each network link, arrows are colored to highlight which group the test sample is more similar to (e.g., pink arrow in direction of CD4 T-cell suggests DCIS T cells are more similar to this phenotype)
A list of genes characteristic of T-cell activation, checkpoint inhibition, exhaustion, naïve T cells, activation-specific and dysfunction-specific were manually curated from the literature (refs. 23, 27, 49; Supplementary Table S4). A score for cytotoxicity, exhaustion, naïve, checkpoint activation, and checkpoint inhibition was obtained for each sample by computing the mean logCPM value. To take into account differences in cell cycle, patients were grouped into “high” and “low” categories using a logCPM Ki67 threshold of −0.5.
TCR Alignment and Identification of Clonotypic T-cell Repertoires
The clonotypic repertoires of T cells have been identified by aligning the RNA-seq reads to the human CDR regions using MiXCR (default parameters), a tool specific for immunoglobulin (IG) and T-cell receptor profiling (50). Only clonotypes with at least 2 reads were retained for further analyses. The clonotypes were then normalized using the number of sorted CD45+CD3+ T cells used for the RNA-seq. The diversity of individual samples based on the number and abundance of different clonotypes is represented using the Shannon index (24). The significance of the difference between DCIS, IDC, and normal samples based on clonotype diversity was calculated using the Wilcoxon rank sum test. For a comparison of the Shannon index of individual samples and also their subtypes to the expression of selected cytotoxic genes, the raw read counts of the genes have been normalized on the basis of library size using DESeq2 (44).
Copy-number log2 ratios were obtained from publicly available data from TCGA (51) and from the Oslo cohort (9) using Affymetrix SNP6 arrays with logR ratios adjusted for ploidy and tumor percentage using the allele-specific copy-number analysis for tumors (ASCAT) algorithm (52). Copy-number data was truncated to lie within a [−2, 2] range. To allow for intratumor heterogeneity, a relaxed log2 ratio of ±0.3 was used to define gains and losses. In the TCGA cohort, HER2+ patients were defined as positive either by PAM50 classification or by ERBB2 log2 ratio of at least 0.3. In both cohorts, triple-negative patients were defined using PAM50 Basal classification. Significance was determined using χ2 test.
Generalized linear models.
Spatial, Diversity, and Statistical Analyses
Association between DCIS/IDC cell type, ERBB2, and chemokine cluster amplification.
The association between the probability of a cell carrying the 17q12 CC amplification and ERBB2 amplification [together with the cell type (DCIS or IDC cells) in the DCIS/IDC cohorts] was estimated using logistic hierarchical model (or logistic mixed effects regression; ref. 55) to account for the correlation of cells within each sample and of samples within each patient. In addition, we adjusted for important clinical covariates collected in this study, including patients' ages, ER status, and others. The selection of covariates from those collected was justified using the Akaike Information Criterion (AIC; ref. 56). The models with the lowest AIC in both cohorts (AIC = 4412.2 in pure DCIS and 10909.7 in DCIS/IDC) were selected as the final model to interpret (Supplementary Table S5).
Diversity estimation from single-cell FISH data.
To compute the diversity from the multicolor FISH data, we first categorized each cell into the following four types based on copy number for ERBB2 and the CC: ERBB2+CC+, ERBB2+CC−, ERBB2−CC+, and ERBB2−CC−. Next, the number of cells belonging to each category was determined and the Shannon's entropy (24) was thereby calculated for each sample of each patient. The association between the diversity and patients' age was evaluated for both the pure DCIS and DCIS-IDC cohorts (Supplementary Fig. S7I). Interestingly, in the DCIS-IDC cohort, diversity was significantly associated with patients' age (P = 0.0183 for DCIS cell type and P = 0.0146 for IDC cell type), whereas there was no such association supported by statistical significance in the pure DCIS cohort. Furthermore, we also studied the association between diversity and CD3 and CD45 counts, and found no significant correlation between these two factors in both cohorts (Supplementary Fig. S7H).
We then considered another layer of information contained in the immuno-FISH images describing the spatial distribution of single cells. We sought to investigate the observed patterns relative to the null hypothesis—complete spatial randomness. Such analysis can help inform future evolutionary modeling of tumor development and intratumor heterogeneity. Here our definition of complete spatial randomness closely follows (57): (i) regardless of the type of cells, the cells are randomly localized on the 2-D slice of a tumor by a spatial uniform Poisson process with an intensity parameter estimated from all cells on the image; (ii) for each single cell, we randomly assign one of the four mutation types defined in the previous section based on the relative frequency of each mutation type, again estimated from the image. This approach is also called marked spatial Poisson process (58). To test such complete spatial randomness, we first used Monte Carlo simulation to generate a number of simulated samples under the null hypothesis, and calculated the so-called “cross” K function (a function of the radius of the neighborhood of each cell; ref. 57) using both the simulated samples and the patient sample. With a collection of simulated samples, we then constructed a 95% confidence band of the K function and determined if the K function computed from the patient data lies within the 95% confidence band under complete spatial randomness, to investigate whether the spatial distribution of different cell types in 2-D deviates from the null hypothesis. Supplementary Figure S7F shows that most of the patient samples do not significantly deviate from the complete spatial randomness, demonstrated by the K function of the data (black curve) lying within the 95% confidence band of the K function of the simulated datasets (gray area).
Analysis of invasiveness of tumor cells and T cells.
Because of the relatively small number of cells subdivided by the TIGIT or PD-1 expression levels, we considered only the comparison between tumor cell populations and T-cell populations in the image. In Supplementary Fig. S6A, we used different colors to distinguish T cells (black for TIGIT+ or PD-1+, red for TIGIT− or PD-1−) and tumor cells (green for TIGIT+ or PD-1+, gold for TIGIT− or PD-1−). We use k-means clustering (k = 2) to classify the cells on the image into two clusters based only on their coordinate information (or spatial distribution on the image) with an unsupervised approach. We then computed the percentages of T cells (or equivalently the tumor cells) classified into the two clusters and chose the lower value as the “invasive percentage,” because intuitively, the more invasive the T cells are into the tumor cells, the closer the percentage of T cells belonging to the two clusters should be to 50%, and hence there should be a higher “invasive percentage.” Consistent with our expectation (Supplementary Fig. S6B), in the “TIGIT” images (left), we observed a higher invasive percentage in “HER2+ IDC” than in “HER2+ DCIS” (P = 0.047) and also a higher invasive percentage in “TN IDC” than in “TN DCIS” (P = 0.003). However, the differences in “HER2+ IDC” and “HER2+ DCIS” are relatively weak. Similarly, in the “PD-1” images (right), we observed a higher invasive percentage in “TN IDC” than in “TN DCIS” (P = 0.039), but no significant difference between “HER2+ IDC” and “HER2+ DCIS” (P = 0.52).
All code for transcriptomic, spatial, and statistical analysis can be found at github.com/polyak-lab/LeukocyteDCISIDC
Gene Expression Omnibus: RNA-seq datasets have been deposited to GEO with accession number GSE87517 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE87517).
Disclosure of Potential Conflicts of Interest
D. Dillon is a consultant/advisory board member for Oncology Analytics. L.M. Coussens has received commercial research support from Jansen, Acerta, and Deciphera, and is a consultant/advisory board member for Syndax. J.E. Garber is a consultant/advisory board member for Helix Genetics, Novartis, and GTx. R. Fan is a scientific advisor/cofounder of IsoPlexis and is a consultant/advisory board member for the same. No potential conflicts of interest were disclosed by the other authors.
Conception and design: S.J. Huh, M.B. Ekram, K. Polyak
Development of methodology: C.R. Gil Del Alcazar, S.J. Huh, M.B. Ekram, F. Beca, L.M. Coussens, R. Fan, F. Michor
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C.R. Gil Del Alcazar, S.J. Huh, F. Beca, X. Zi, M. Kwak, Y. Su, L. Ding, H.G. Russnes, A.L. Richardson, K. Babski, E.M.H. Kim, C. McDonnell, J. Wagner, R. Rowberry, T. Sorlie, J.E. Garber, R. Fan, K. Bobolis, D.C. Allred, J. Jeong, S.Y. Park
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): C.R. Gil Del Alcazar, S.J. Huh, M.B. Ekram, A. Trinh, L.L. Liu, F. Beca, H. Bergholtz, Y. Su, G.J. Freeman, T. Sorlie, R. Fan, D.C. Allred, F. Michor, K. Polyak
Writing, review, and/or revision of the manuscript: C.R. Gil Del Alcazar, S.J. Huh, M.B. Ekram, A. Trinh, L.L. Liu, F. Beca, M. Kwak, H. Bergholtz, Y. Su, L. Ding, C. McDonnell, G.J. Freeman, D. Dillon, L.M. Coussens, J.E. Garber, R. Fan, K. Bobolis, D.C. Allred, F. Michor, K. Polyak
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): S.J. Huh, A. Trinh, X. Zi, G.J. Freeman, K. Bobolis, J. Jeong, K. Polyak
Study supervision: S.J. Huh, R. Fan, F. Michor, K. Polyak
Other (actively selecting patient to enroll in the study, collection and review of clinical data; obtaining patient consent for the study, communicate with multidisciplinary team members and the hospital and research staff to obtain the tissue sample according to the Hospital IRB/HIPPA): E.M.H. Kim
Other (acquisition of appropriate tissue samples): D. Dillon
We thank John F. Daley in the Dana-Farber Cancer Institute Flow Cytometry and Lisa Cameron in the Confocal and Light Microscopy Core facilities for their outstanding services and technical support. We thank Drs. Jane Grogan and Patrick Caplazi (Genentech) for their advice on TIGIT IHC and Dr. Drew Pardoll (Johns Hopkins University) for his critical reading of our manuscript. We thank Dr. Maria Grazia Daidone (Fondazione IRCCS Instituto Nazionale dei Tumori, Milan, Italy) for a subset of the data included in the Oslo cohort.
This work was supported by the National Cancer Institute F32CA156991 (to S.J. Huh), R35CA197623 (to K. Polyak), and U54CA193461 (to K. Polyak, F. Michor, and R. Fan; the Susan G. Komen Foundation (to Y. Su), and the Breast Cancer Research Foundation (to K. Polyak).