Purpose: This study was undertaken to conduct a comprehensive investigation of the role of DNA damage repair (DDR) defects in poor outcome ER+ disease.

Experimental Design: Expression and mutational status of DDR genes in ER+ breast tumors were correlated with proliferative response in neoadjuvant aromatase inhibitor therapy trials (discovery dataset), with outcomes in METABRIC, TCGA, and Loi datasets (validation datasets), and in patient-derived xenografts. A causal relationship between candidate DDR genes and endocrine treatment response, and the underlying mechanism, was then tested in ER+ breast cancer cell lines.

Results: Loss of expression of three genes, CETN2 (P < 0.001) and ERCC1 (P = 0.01) from the nucleotide excision repair (NER) pathway and NEIL2 (P = 0.04) from the base excision repair (BER) pathway, was associated with endocrine treatment resistance in the discovery dataset, and this association was subsequently validated in independent patient cohorts. Complementary mutation analysis supported associations between mutations in NER and BER genes and reduced endocrine treatment response. A causal role for CETN2, NEIL2, and ERCC1 loss in intrinsic endocrine resistance was experimentally validated in ER+ breast cancer cell lines and in ER+ patient-derived xenograft models. Loss of CETN2, NEIL2, or ERCC1 induced endocrine treatment resistance by dysregulating G1–S transition and, therefore, increased sensitivity to CDK4/6 inhibitors. A combined DDR signature score was developed that predicted poor outcome in multiple patient cohorts.

Conclusions: This report identifies DDR defects as a new class of endocrine treatment resistance drivers and indicates new avenues for predicting efficacy of CDK4/6 inhibition in the adjuvant treatment setting. Clin Cancer Res; 24(19); 4887–99. ©2018 AACR.

Translational Relevance

Estrogen receptor positive (ER+) breast cancer is treatable with endocrine drugs that interrupt estrogen receptor function, but fatal drug resistance occurs in at least 30% of cases. The comprehensive analysis of the molecular landscape of DNA repair defects in ER+ breast cancer patient tumors reported here identifies defects in select DNA repair pathways as a novel class of endocrine treatment resistance driver occurring in approximately 40% of patients with endocrine treatment–resistant ER+ breast cancer. Candidate DNA damage repair genes identified are experimentally shown to be linked by common dysregulation of G1–S transition, suggesting that loss of DNA proofreading pathways disrupts ER regulation of the cell cycle and, therefore, response to endocrine treatment. A combined DDR signature score was developed that predicted poor outcome in multiple patient cohorts. This score could have substantial translational implications for identifying patients who are unlikely to respond to endocrine therapy but may be good candidates for CDK4/6 inhibitor–based treatment. These findings, therefore, increase the understanding of factors underlying response to standard-of-care in patients with ER+ breast cancer and identify an alternative targeted therapeutic strategy for this subset of patients.

Breast cancer is the most frequent form of cancer affecting women, and estrogen-receptor positive (ER+) tumors account for 60%–70% of all reported cases. For patients with early-stage ER+ disease, endocrine therapy with tamoxifen or an aromatase inhibitor (AI) is the preferred first-line treatment. Despite these treatments, at least 1 in 4 patients develop fatal endocrine therapy resistance (1, 2). Although some markers predictive of endocrine treatment (ET) response are available, for example, ERBB2 mutation/amplification (3, 4), gene expression profiles (5), and on-neoadjuvant endocrine treatment Ki67 analysis (6, 7), resistance largely remains an unpredictable and poorly understood event. Efforts to study underlying mechanisms of ET resistance have focused on activation of peptide growth factors (e.g., EGFR, ERBB2) and on activating mutations or translocation in ESR1 (8, 9). However, these mechanisms mostly explain adaptive or acquired resistance to endocrine treatment in the advanced setting, which limits their use as predictive markers in primary tumors. The success of cyclin-dependent kinase (CDK) 4/6 inhibition for metastatic breast cancer (10, 11) indicates a major role for the target cell-cycle–dependent kinases in restoring growth control in ET-resistant tumors, but these agents lack accurate predictive markers. Consequently, adjuvant treatment will lead to overtreatment in some patients and undertreatment in others.

It is known that loss of the MutL components of the mismatch repair (MMR) complex causes poor initial response to ET (intrinsic ET resistance), but in the MutL-deficient setting tumors remain sensitive to CDK4/6 inhibition (12). Effects of other DNA damage response/repair (DDR) defects on ET resistance are understudied but might have similar relationships. In other cancer types, disruptions in DDR pathways associate with tumor formation, responsiveness to chemotherapy, and loss of replicative checkpoints (13). In addition, ER-induced signaling and proliferation downregulates many DDR pathways in the normal mammary gland (14), potentially signifying a relationship between defects in DDR and hormonal response in ER+ breast cancer cells. Several genomic studies on breast cancer have identified signatures of DNA repair defects generated by classifying types of mutations, but the impact of these studies has been diluted by uncertainty regarding the molecular origin and clinical relevance of these signatures (15). Therefore, there is a strong rationale for conducting a comprehensive molecular analysis of the role of DDR defects in regulating and predicting ET response.

DDR comprises eight canonical pathways: mismatch repair (MMR; which can be further broken down into MutL and MutS complementation groups), nucleotide excision repair (NER), base excision repair (BER), nonhomologous end joining (NHEJ), homologous recombination (HR), Fanconi Anemia (FA), trans-lesion synthesis (TLS), and direct repair (DR; ref. 16). The first six of these pathways fall into one of two larger categories: single-strand break repair (SSBR), consisting of MMR, BER, and NER, and double-strand break repair (DSBR), comprising NHEJ and HR (ref. 17; Fig. 1A). Here, we describe a comprehensive analysis of these canonical DDR pathways in ER+ breast tumors from patients and draw causal associations between BER and NER defects and ET resistance that parallel earlier observations on the role of MMR defects.

Figure 1.

Study outline. A, Network view of different DDR pathways along with the shared genes (gray nodes) and unique genes (indicated by name, adjacent to pathway name). Pathways associated with SSBR are denoted as yellow nodes and DSBR denoted as orange nodes. Lines indicate pathways that share common genes. MMR, mismatch repair; NER, nucleotide excision repair; BER, base excision repair; NHEJ, non-homologous end joining; HR, homologous recombination; FA, Fanconi anemia; TLS, trans-lesion synthesis; and DR, direct repair. B, Schematic representation of screening approach to identify DDR pathways and genes associated with ET response. Supporting data with detailed schema are presented in Supplementary Fig. S1.

DDR gene set compilation

A gene set for eight canonical DDR pathways, comprising DR, MMR, NER, BER, HR, NHEJ, and FA along with checkpoint genes, was built as the union of MSigDB (18, 19) KEGG (c2:curated) pathway–specific genes and the updated table of DDR pathway genes listed at http://sciencepark.mdanderson.org/labs/wood/dna_repair_genes.html. Genes shared across different DDR pathways or checkpoints were not included in the analysis. In addition, TLS pathway genes were left out of the current analysis because of their ambiguous role in SSBR and DSBR mechanisms. Cytoscape (20) was used to visualize the DDR pathway network.

Patient data

Datasets.

The Z1031/POL dataset (referred to as NeoAI) was used with permission from the Alliance consortium (21). The data were obtained after written informed consent from the patients, and the studies were conducted in accordance with recognized ethical guidelines approved by the IRB (21). TCGA (downloaded June 2017) and MSKCC-IMPACT (downloaded February 2018) mutation data were obtained from cBioPortal. TCGA (downloaded June 2017) and METABRIC (downloaded June 2017) copy number data were obtained from cBioPortal. TCGA analyses were restricted to ER+ patient tumors [except for Supplementary Fig. S3, where basal-like and HER2-enriched tumors were analyzed on the basis of published PAM50 categorization (22)]. For MSKCC-IMPACT analyses, a list of ER− breast cancer sample IDs was derived from previously published literature (23) and subtracted from the list available on cBioPortal. Although some contamination from patients with HER2+ or ER− breast cancer is expected, this subtracted list contains a majority of ER+ breast tumors. TCGA and METABRIC gene expression data and associated survival outcomes were downloaded from Oncomine. Standard cutoffs of mean − 1.5 × SD were used to identify “Low” subsets of each candidate gene in each dataset when multiple candidate genes were combinatorially analyzed. For individual analyses in METABRIC, “Low” subsets were identified using median cutoffs, and in Loi, using mean − 1.5 × SD.
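
The two cutoff rules described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption, because the text does not specify which was used.

```python
from statistics import mean, median, stdev


def low_by_mean_sd(values, k=1.5):
    """Flag samples whose expression falls below mean - k * SD.

    Assumption: SD is the sample standard deviation (statistics.stdev).
    """
    cutoff = mean(values) - k * stdev(values)
    return [v < cutoff for v in values]


def low_by_median(values):
    """Flag samples whose expression falls below the cohort median."""
    cutoff = median(values)
    return [v < cutoff for v in values]


if __name__ == "__main__":
    # Invented expression values for one gene across ten tumors.
    expression = [10, 10, 10, 10, 10, 10, 10, 10, 10, 1]
    print(low_by_mean_sd(expression))
    print(low_by_median(expression))
```

The mean − 1.5 × SD rule selects only a small tail of low expressors, whereas the median rule splits the cohort roughly in half, which is why the two rules can label the same tumor differently.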

Enrichment analysis.

Mutation calls from NeoAI (21, 24) were used as the discovery set along with DNA-sequenced ER+ tumors from TCGA. For the NeoAI dataset, mutations with Normal VAF<10/NA and Tumors VAF >10/NA were classified as somatic. Mutations were scored by SIFT (25, 26) to assess their effect on protein structure and function. In accordance with SIFT standards, missense mutations with scores <0.05 were considered to be damaging. For the enrichment analysis in TCGA, mutations were categorized into three variant types: missense, frameshift, and nonsense. Frameshift and nonsense mutations were collectively referred to as FS/NS mutations. Enrichment analysis in TCGA used the Z-score test of two population proportions to compare the proportion of missense to frameshift/nonsense mutations in each DDR pathway with the corresponding proportion in a control set of unrelated genes (MYH7, SYNE1, NEB) in tumors from patients who remained alive. This approach was used to ensure an adequate sample size for the analysis and to control for genome-wide levels of mutations in each tumor.
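
The Z-score test of two population proportions mentioned above can be sketched as follows. This is a textbook pooled-variance, two-sided form written for illustration; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test comparing proportions x1/n1 and x2/n2.

    Uses the pooled proportion for the standard error, which is the
    conventional form of the test; the exact variant used in the study
    is not stated, so this is an assumption.
    """
    if n1 <= 0 or n2 <= 0:
        raise ValueError("group sizes must be positive")
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("test undefined when pooled proportion is 0 or 1")
    z = (p1 - p2) / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Invented counts: FS/NS mutations out of all mutations in a pathway
    # versus FS/NS mutations out of all mutations in a control gene set.
    z, p = two_proportion_z_test(30, 100, 10, 100)
    print(f"z = {z:.2f}, P = {p:.4f}")
```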

Survival analysis.

For univariate and multivariate analyses, 887 tumors from Luminal (A/B) patients who received endocrine treatment were analyzed. mRNA (microarray) expression and survival information along with other clinical metadata were extracted from Oncomine (27–29). Only samples with survival metadata were included in the analysis. For this cohort, tumors with expression levels of candidate genes lower than the median were labeled “low,” while the rest were labeled “high.” All survival data were analyzed using Kaplan–Meier curves and log-rank tests. Proportional hazards were determined using Cox regression.
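
As an illustration of the Kaplan–Meier step, the product-limit estimator can be written in a few lines of Python. This is a minimal sketch with a hypothetical function name and invented follow-up data; it does not reproduce the log-rank test or Cox regression, for which an established statistics package would normally be used.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (e.g., death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Invented data: four patients, one censored at time 2.
    for t, s in kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]):
        print(t, round(s, 3))
```

Running the estimator separately on the “low” and “high” expression groups gives the two curves that a log-rank test would then compare.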

Statistical analysis

Missing data were coded as “NA” in the mutation, expression, and survival data analyses. Samples qualifying for more than one category were either removed (e.g., samples with mutations in both SSBR and DSBR genes) or treated as a separate set (e.g., cumulative analysis of tumors with dysregulation of new versus published candidates) for statistical comparisons. Pearson's correlation was performed on log-transformed normalized data for every DDR gene using an automated script in R. To control for multiple testing, false discovery rates were calculated using the Benjamini–Hochberg (30) method in R. Two-tailed Wilcoxon rank sum tests were used for two-sample tests of association between classes. Pathway over-representation analysis was performed using the thirteen candidate genes as input against all DDR genes (n = 104) as the background in WebGestalt (31) using the KEGG pathway database.
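
The Benjamini–Hochberg adjustment referred to above can be sketched in Python for readers who want to see the arithmetic. The study used R; this standalone version, with a hypothetical function name and invented P values, is intended to match the standard step-up procedure (as in R's p.adjust with method "BH") but is offered only as an illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented P values, e.g., one per gene-level correlation test.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"P = {p:.3f} -> FDR = {q:.3f}")
```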

A DDR expression signature score, termed the CENMP (CETN2, ERCC1, NEIL2, MLH1, PMS2) score, was devised as the mean of the standardized expression of these five genes. The CENMP score was calculated in three independent datasets (NeoAI, METABRIC, and Loi) using microarray expression levels. Survival of patients in the highest 20% quantile of CENMP score was compared with that of patients in the lowest 20% quantile.
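
A score of this kind (the mean of per-gene standardized expression, followed by a comparison of the extreme quantiles) can be sketched as below. The function names, the z-score standardization using the sample standard deviation, and the way the 20% groups are rounded are all assumptions made for illustration; the text does not give these details, nor the sign convention of the score.

```python
from statistics import mean, stdev

GENES = ("CETN2", "ERCC1", "NEIL2", "MLH1", "PMS2")


def zscores(values):
    """Standardize one gene's expression across samples."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def signature_scores(expression):
    """Mean standardized expression of the five genes for each sample.

    expression: dict mapping gene name -> list of values, one per sample,
    with samples in the same order for every gene.
    """
    standardized = [zscores(expression[g]) for g in GENES]
    return [mean(per_sample) for per_sample in zip(*standardized)]


def extreme_groups(scores, fraction=0.2):
    """Return (lowest, highest) sample indices for the given fraction."""
    n = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:n], order[-n:]


if __name__ == "__main__":
    # Invented expression values for five samples.
    toy = {g: [1.0, 2.0, 3.0, 4.0, 5.0] for g in GENES}
    scores = signature_scores(toy)
    print(scores)
    print(extreme_groups(scores))
```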

Cell lines, siRNA transfection, and growth assays

ZR75.1 and MCF7 parental cells were obtained from ATCC in 2015 and 2017, respectively, and tested for mycoplasma contamination upon arrival using the Lonza Mycoalert Plus Kit (CAT# LT07-710) as per the manufacturer's instructions, and annually since. Both lines were maintained in RPMI1640 1× with L-glutamine and 1% penicillin/streptomycin (Sigma-Aldrich, Catalog no. P4333-100 mL). Cell lines used for experiments had undergone <20 passages. Transient transfections with esiRNA (Sigma-Aldrich) toward human NEIL2 (Catalog no. EHU158461), CETN2 (catalog no. EHU137031), ERCC1 (catalog no. EHU156971), RAD23B (catalog no. EHU145881) and POLM (catalog no. SASI_Hs02_003344553), or scrambled control, each used at 50 nmol/L, were delivered using Polyplus jetPRIME Transfection Reagent (catalog no. 114-07) as per the manufacturer's instructions. Cells were plated for experiments 48 hours after transfection. Stable selection with puromycin after infection with lentivirus harboring RNAi oligos (ABM) against human NEIL2 (catalog no. i014980a), CETN2 (catalog no. i004479a), and ERCC1 (catalog no. i007023a) or scrambled control (catalog no. i000238c) was conducted using the manufacturer's protocol. Growth assays were repeated independently in triplicate as reported previously (32) using Alamar blue to detect cell viability. Final readings were taken between 4 and 6 days after initial drug treatment, and fold change was plotted for analysis.

Drug treatment

Fulvestrant (Thermo Fisher Scientific catalog no. 506242), 4-OHT (Sigma-Aldrich catalog no. H7904), palbociclib (Thermo Fisher Scientific catalog no. 508548), and abemaciclib (Thermo Fisher Scientific, catalog no. NC0577560), dissolved in DMSO, and β-estradiol (Sigma-Aldrich catalog no. E2758-1G), dissolved in water, at 10 mmol/L were stored long term at −80°C with working stocks at −20°C. Cells were treated 24 hours after plating with fresh drug added every 48 hours until the end of the experiment. Estradiol addition experiments were conducted in phenol red–free DMEM with 10% charcoal-stripped serum.

qPCRs, Western blot analysis, FACS and IFs

RNA from cell lines was extracted using the Qiagen RNeasy Mini Kit (catalog no. 74106) and converted to cDNA using Bio-Rad Reverse Transcriptase iScript (catalog no. TX1708891BAY), both following manufacturers’ instructions. cDNA was quantified by q-RT-PCR using Bio-Rad SsoAdvanced Universal SYBR Green Supermix (catalog no. 17525272) according to the manufacturer’s specifications. For immunofluorescence, 48 hours after esiRNA transfection, cells were plated on Poly-d-lysine–coated coverslips (catalog no. NC0746078), treated for 24 hours, then harvested. Cells on coverslips were then washed with 1× PBS, fixed in 4% PFA, and costained with Ki-67/MKI67 Antibody (Novus Biologicals catalog no. NB110-89717SS). Western blotting was conducted as described previously (32) using antibodies against CETN2 (Abclonal catalog no. A5397), NEIL2 (Abcam catalog no. ab221556), ERCC1 (Abcam catalog no. ab129267), and β-actin (Sigma-Aldrich catalog no. A5316-100 μL). FACS analysis for PCNA (Abcam CAT#ab29) was conducted after treating MCF7 siScr or siCEN cells with 100 nmol/L fulvestrant for 72 hours following a 20-hour treatment with 250 nmol/L nocodazole (Sigma-Aldrich catalog no. M1404). Cells were then probed for PCNA and run on a BD Accuri C6 flow cytometer using a standard protocol.

NER and BER downregulation associates with endocrine treatment resistance

The status of 104 DDR genes belonging to the six major DDR pathways (NER, BER, MMR, NHEJ, FA, and HR) was assessed in primary ER+ breast tumors. Genes that were shared between multiple pathways were excluded from the analysis to facilitate better understanding of the discrete contributions of individual pathways to ET resistance (Fig. 1A). In addition, DR and TLS were excluded because they fall outside canonical SSBR-DSBR categorization and appeared to be rarely dysregulated in ER+ tumors in initial observations. The discovery set, referred to as NeoAI, was composed of data from two neoadjuvant aromatase inhibitor trials [Z1031 (21, 33) and POL (24)]. These clinical trials were designed to assess intrinsic ET response by accruing serial biopsies from patients at diagnosis (baseline, BL), after 2–4 weeks of endocrine treatment (on-treatment), and in the surgical specimen (Fig. 1B). These biopsies were annotated by whole-genome/exome-sequencing, RNA-seq, and gene expression microarray, providing both mutational and transcriptomic information. The biopsies, both baseline and on-treatment, were also evaluated for Ki67, a marker of proliferation, by both IHC and expression (ref. 34; Supplementary Fig. S1A). An on-treatment Ki67 value >10% is a clinically relevant marker of intrinsic ET resistance associated with elevated risk of relapse in the first 5 years of follow-up (35). The discovery strategy for this study was, therefore, to correlate DDR gene expression at baseline with on-treatment Ki67 levels (both by IHC and by expression) and to combine the results with those from a complementary enrichment analysis for deleterious mutations, in order to identify a set of DDR genes that, when dysregulated, may predict intrinsic ET resistance (Fig. 1B).
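
The correlation step of this discovery strategy can be illustrated with a short Python sketch. It computes a Pearson coefficient between baseline expression of one gene and on-treatment Ki67, and flags tumors above the 10% Ki67 threshold. The function names and the example values are hypothetical, and the inputs are assumed to be already normalized and log-transformed where appropriate.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_et_resistant(on_treatment_ki67_percent, threshold=10.0):
    """Flag intrinsic ET resistance as on-treatment Ki67 above 10%."""
    return on_treatment_ki67_percent > threshold


if __name__ == "__main__":
    # Invented values for four tumors: baseline expression of one DDR
    # gene and on-treatment Ki67 (percent positive cells).
    baseline_expression = [8.1, 7.4, 6.2, 5.0]
    on_treatment_ki67 = [2.0, 6.0, 14.0, 25.0]
    print(pearson_r(baseline_expression, on_treatment_ki67))
    print([is_et_resistant(k) for k in on_treatment_ki67])
```

A negative coefficient here corresponds to the pattern sought in the study: lower baseline expression accompanying higher on-treatment proliferation.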

Transcriptomic analysis (schema outlined in Supplementary Fig. S1B) identified 13 DDR genes whose reduced expression (at baseline) correlated with increased on-treatment Ki67 levels (Fig. 2A). Three genes (ESR1, GATA3, and RUNX1; refs. 36–39) were used as positive controls because of their known associations with ET responsiveness (Fig. 2A). Twelve of the 13 DDR genes identified belonged to SSBR pathways, corresponding to approximately 20% of all unique SSBR genes compared with only 2% of all unique DSBR genes (Fig. 2B), supporting a strong role for SSBR pathway downregulation in ET resistance. FA was the only one of the three DSBR pathways examined to demonstrate significant association with ET resistance (Fig. 2C). Pathway enrichment analysis of the candidate gene list relative to all other DDR genes studied revealed significant over-representation of “Platinum drug resistance” (P = 0.04, comprised of MMR and NER genes) and “mismatch repair” (P = 0.04) terms (Fig. 2D). These results framed the proposition that underexpression of NER genes and, to a lesser extent, BER genes can reduce response to ET.

Figure 2.

RNA levels of MMR, BER, and NER genes associate inversely with on-endocrine treatment Ki67. A, Table describing 13 candidate genes with significant correlation between RNA levels and on-treatment Ki67 in ER+ tumors from NeoAI. Pearson correlation analysis determined the correlation coefficient. False discovery rate (FDR) is denoted in the table for each correlation. Blue boxes indicate correlations where FDR < 20%. Pathways to which candidate genes belong are noted in the DDR function column. Yellow boxes indicate SSBR pathways and orange boxes, DSBR. As a positive control for the analysis, three genes previously implicated in response to ET: ESR1, GATA3 and RUNX1, are included. B, Bar graph indicating enrichment for SSBR genes in the list of 13 candidate genes. Fisher exact test determined P value. C, Venn diagram depicting proportion of genes from each DDR pathway that was implicated in ET resistance from correlation analysis in A. Blue circle indicates candidate gene population. D, KEGG pathway enrichment analysis of the candidate gene list against all DDR genes used in the analysis revealed significant enrichment of indicated pathways. Number of genes from candidate list contributing to each enriched pathway is listed along the bars. A −log10 (P) of 1.2 denotes a P < 0.05.

Low expression of CETN2, ERCC1, and NEIL2 are poor prognostic factors in ER+ breast cancer patients treated with endocrine therapy

To understand the effect of downregulation of the 13 candidate genes on long-term patient outcomes, univariate associations between low expression of these 13 genes and patient survival were tested in an independent dataset (METABRIC). To increase the specificity of these correlations, only the subset of patients with luminal tumors who were treated with ET was included in these analyses. Univariate Cox proportional hazards analyses based on overall survival identified five genes as significantly associating with poor survival: CETN2, ERCC1, MLH1, NEIL2, and PMS2 (Supplementary Fig. S2). Subsequent multivariate analyses, including tumor size, grade, and node status, supported an independent role for these five candidates in predicting disease-specific survival (DSS, Fig. 3A). An association between reduced expression of the MutL genes MLH1 and PMS2 and poor survival has already been described in this dataset (12). Kaplan–Meier survival analysis demonstrated that low expression of CETN2, NEIL2, and ERCC1 individually also correlated with worse DSS compared with all other patients with ER+ breast cancer (Fig. 3B–D). However, no association between low RNA levels and poor survival for CETN2, NEIL2, and ERCC1 was observed in patients with either HER2-enriched (Supplementary Fig. S3A–S3C) or basal-like (Supplementary Fig. S3D–S3F) breast cancer, suggesting that the association between defects in these genes and survival is subtype-specific. A correlation between low expression of CETN2, NEIL2, and ERCC1 and poor recurrence-free survival was also observed in the Loi dataset (40), serving as independent validation (Supplementary Fig. S4). Furthermore, a composite signature based on low expression of any one of these genes (the CEN signature) associated with a significantly increased risk ratio of 5.1 in Loi (P = 0.02; Fig. 3E).

Figure 3.

CETN2, NEIL2, and ERCC1 loss associates with poor survival of patients with ER+ breast cancer. A, Forest plot summarizing results of multivariate analysis of the 13 candidate genes in METABRIC. Other factors included in the analysis were tumor size, grade, and node positivity. Boxes denote HR based on overall survival outcome, and error bars the 95% confidence interval (CI). HR for genes whose dysregulation associated with poor survival (P ≤ 0.05) by univariate analysis (presented in Supplementary Fig. S2) are shown as red boxes. B–D, Kaplan–Meier curves depicting disease-specific survival of patients with luminal breast cancer treated with ET whose tumors have low (mean-1.5 SD) CETN2 (B), NEIL2 (C), and ERCC1 (D) expression (red) in METABRIC dataset. Kaplan–Meier curves for HER2-enriched and basal-like tumors are presented in Supplementary Fig. S3. E, Kaplan–Meier curves depicting recurrence-free survival of tamoxifen-treated ER+ breast cancer patients whose tumors had low expression of CETN2, ERCC1, and NEIL2 (CEN Low in red) in Loi dataset. Individual Kaplan–Meier curves presented in Supplementary Fig. S4. All HRs were calculated using Cox Regression and log-rank P value determined significance of differences in survival.

Damaging mutations in nucleotide excision repair, base excision repair and non-homologous end joining genes are enriched in ER+ patient tumors

To further understand the involvement of DDR genes in ER+ breast cancer, incidence of damaging versus nondamaging missense mutations (as predicted by SIFT; refs. 25, 26) was analyzed in pretreatment biopsies from NeoAI (Supplementary Table S1). In this analysis, genes from SSBR pathways showed significant enrichment for damaging mutations when compared with all other genes in the genome, whereas DSBR genes did not (Fig. 4A). Enrichment for damaging and tolerant mutations over genome-wide frequency was then assessed for each individual DDR pathway in NeoAI (Supplementary Fig. S5). Damaging mutations were enriched in genes of NER, BER, NHEJ, and HR pathways (FDR < 0.05), but tolerant mutations were not, potentially indicating a selection for deleterious mutations in these pathways during tumor evolution. Damaging mutations were also enriched in genes of FA pathway, but nondamaging mutations showed even higher enrichment, suggesting that any role for FA gene mutation in ER+ breast cancer is likely complex (Supplementary Fig. S5). Enrichment for deleterious (frameshift/nonsense; FS/NS) over missense (MS) mutations in ER+ patient tumors was validated in TCGA for genes of BER and NHEJ, but not HR and FA, pathways, although due to limited follow-up time in this dataset, similar validation could not be obtained for NER (Fig. 4B). To facilitate rigorous statistical analysis, P values were generated by comparing the proportion of somatic FS/NS:MS mutations in each DDR pathway in patients who were alive, to a control set of genes that have not been implicated as cancer drivers (SYNE1, MYH7, NEB). Together, these enrichment analyses promote the postulate that NER, BER, and NHEJ gene mutations may be ER+ breast cancer drivers.

Figure 4.

NER, BER, and NHEJ genes are enriched for damaging mutations in endocrine treatment–resistant tumors. A, Enrichment analysis for prevalence of predicted damaging mutations (based on SIFT scores: lower the SIFT score, the more damaging the mutation is predicted to be) in SSBR and DSBR pathways compared with genome-wide prevalence in tumors from NeoAI. Significant P values were determined by Wilcoxon test analysis. Similar analysis for each individual DDR pathway is presented in Supplementary Fig. S5. B, Pie charts comparing proportion of missense (light yellow – SSBR, light orange – DSBR) and frameshift/nonsense (yellow – SSBR, orange – DSBR) mutations in SSBR and DSBR genes relative to proportion in control gene set (gray). Z-statistic for two population proportions was used to determine significant differences in proportion of missense to frameshift/nonsense mutations in patients who remained alive to maintain adequate sample size for the test. C, Forest plots depicting HRs for overall survival of patients from TCGA (above) and MSKCC-IMPACT (below) with ER+ tumors harboring nonsynonymous mutations in indicated pathways. Log-rank test was used to determine significance and Cox regression proportional hazards generated univariate HRs. Supporting data investigating a role for NHEJ gene mutation in ER+ breast cancer survival is presented in Supplementary Fig. S6, and analyses controlling for replication defects, genome instability, and mutation load are presented in Supplementary Figs. S7 and S8. D, Venn diagram and word cloud (yellow text, SSBR and orange text, DSBR) summarizing candidate pathways that significantly associate with poor survival of patients with ER+ breast cancer (red) based on mutational (green) or transcriptomic (violet) dysregulation. MMR, NER, and BER pathways are identified at the intersection of all analyses. Larger font size indicates greater confidence.

To address associations with clinical outcomes more directly, Cox regression analysis was conducted for tumors with mutations in each DDR pathway in two datasets: TCGA and MSKCC-IMPACT (Fig. 4C). TCGA has whole-exome sequencing data from >800 ER+ breast tumors, while MSKCC-IMPACT has targeted sequencing of a selected panel of genes (including a subset of DDR genes) in >300 ER+ primary breast tumors. Among the SSBR pathways, mutations in NER (ERCC2–5) and BER genes each associated with significantly higher HR in MSKCC-IMPACT and TCGA databases, respectively (Fig. 4C), validating observations made in the gene expression (Fig. 2) and enrichment analyses described above (Fig. 4A and B; Supplementary Fig. S5). Association of NER gene mutations in TCGA and BER gene mutations in MSKCC-IMPACT could not be made because of median follow-up being <6 months in either case.
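The legend of Fig. 4C states that a log-rank test was used to determine significance for the survival comparisons. The sketch below illustrates a two-group log-rank test in plain Python. The function name `logrank_test` and the follow-up values are hypothetical and invented for illustration; this is not the authors' analysis code, and the Cox regression used to generate HRs is not reproduced here.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value) with 1 d.f.

    times_*  : follow-up time for each patient
    events_* : 1 if death was observed at that time, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected deaths in group A
    variance = 0.0       # hypergeometric variance summed over event times
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi_sq = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi_sq / 2))
    return chi_sq, p_value

# Invented toy data: months of follow-up for mutated vs wild-type tumors
mutated_times, mutated_events = [2, 3, 4, 5, 6], [1, 1, 1, 1, 1]
wildtype_times, wildtype_events = [20, 22, 25, 30, 31], [0, 0, 0, 0, 0]
chi_sq, p = logrank_test(mutated_times, mutated_events,
                         wildtype_times, wildtype_events)
```

In practice a dedicated survival analysis package would be used; the sketch only makes explicit how observed and expected deaths are compared at each event time.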

Amongst the DSBR pathways, tumors with mutations in NHEJ genes associated with a significantly higher HR when compared with wild-type tumors in TCGA (Fig. 4C). No NHEJ gene was included in the targeted panel sequenced in MSKCC-IMPACT precluding validation in this dataset. To date, NHEJ has not been associated with ET resistance. Only five genes from this pathway had truncating mutations in either NeoAI or TCGA: PRKDC, XRCC5, DNTT, NHEJ1, and POLM. Patients whose tumors harbored either MS or FS/NS mutations in any of these genes associated with worse overall survival in TCGA, as did patients whose tumors had copy number loss of these loci (Supplementary Fig. S6A). When individual associations with survival were analyzed, only mutation and/or copy number loss of PRKDC associated independently with poor survival (HR = 2.8; P = 0.009, Supplementary Fig. S6B). In addition, tumors with either PRKDC copy number loss or mutations also had significantly lower gene expression of PRKDC (P < 0.001, Supplementary Fig. S6C), suggesting that this is unlikely to be a chance association. Although mutation data on PRKDC was not available in METABRIC, the association of PRKDC copy number loss with poor survival was validated in METABRIC (Supplementary Fig. S6D). In addition, the association of PRKDC mutations with poor prognosis was observed in parallel with this study, in an independent set of ER+ patient tumors (41).

Among the other DSBR pathways, primary tumors with mutations in HR genes did not associate with higher HRs than wild-type tumors in either TCGA or MSKCC-IMPACT where eight HR genes (RAD54L, RAD52, RAD51D, RAD51, NBN, BRCA1, and BLM) were included in the targeted panel (Fig. 4C). Mutations in FA genes had mixed associations, with patients whose tumors had mutations in FA genes associating with worse survival in TCGA, but not in MSKCC-IMPACT (FANCA, FANCC, PALB2, BRIP1; Fig. 4C).

Consideration of all three discovery parameters analyzed, that is, gene expression downregulation, gene mutation, and association with patient outcomes (increased HRs for overall survival in TCGA, METABRIC, or MSKCC-IMPACT) provides strongest support for the involvement of SSBR pathway dysregulation in poor clinical outcomes of patients with ER+ breast cancer (Fig. 4D). More specifically, all three discovery parameters support an understudied role for NER and BER dysregulation in ET resistance (Fig. 4D) that warrants functional investigation. Evidence for involvement of DSBR pathway dysregulation in ER+ breast cancer outcomes is less consistent across the three different screening parameters (Fig. 4D) and requires further investigation in larger patient cohorts, and in experimental model systems.

Two confounding factors affect interpretation of mutational analyses conducted here. First, it is possible that dysregulation of replication factors commonly associated with DSBR disruption affects proliferative response to ET. However, no decisive association between replication gene expression (RPA1-4) and on-treatment proliferation marker Ki67 (IHC and mRNA, Supplementary Fig. S7) was observed in NeoAI, suggesting that replicative disruption is unlikely to be a major confounding factor for this analysis. Second, previous reports have suggested an association between high mutation load or genome instability and poor patient outcomes in breast cancer (42), which may influence interpretation of associations between mutations in DDR genes and clinical outcome. No significant increase in genome instability or mutation load was observed in tumors with mutations in DSBR genes (Supplementary Fig. S8B and S8D). However, as expected, a significant increase in mutation load, but not genome instability, was observed in SSBR-mutated tumors (Supplementary Fig. S8A and S8C). Therefore, it is not possible to rule out high mutation load as a potential confounding factor in the mutational analysis presented above, without functional validation of causality. Dysregulation of genes from the NER (CETN2 and ERCC1) and BER (NEIL2) pathways were, therefore, functionally investigated as candidates from these pathways were most consistently correlated with ET resistance and poor patient outcomes (Fig. 4D).

Experimental validation of CETN2, ERCC1, and NEIL2 as endocrine therapy resistance genes

To test whether dysregulation of candidate genes from NER and BER pathways can directly cause ET resistance, pooled siRNA against each of the three candidate genes identified in the gene expression screen, that is, CETN2, ERCC1, and NEIL2, as well as a scrambled control, was transiently transfected into two ET-sensitive, ER+ breast cancer cell lines, MCF7 (Fig. 5) and ZR-75 (Supplementary Fig. S9). Knockdown of the genes was validated at RNA level in both cell lines (Supplementary Fig. S9A and S9C), and downregulation of each protein was confirmed by Western blots of MCF7 cell lysates (Fig. 5A). Cells transfected with siRNA against scrambled control (siScr), CETN2, NEIL2, or ERCC1 were then exposed to all three classes of ET: estrogen deprivation in media containing charcoal-stripped serum (to mimic AI), and tamoxifen or fulvestrant treatment in media containing full serum. Estrogen-deprived siCETN2, siNEIL2, and siERCC1 MCF7 (Supplementary Fig. S9B) and ZR-75 (Supplementary Fig. S9D) cells showed attenuated growth response to estradiol stimulation when compared with siScr control cells, indicating decreased influence of estrogen signaling on proliferation. Consistent with this notion, siCETN2, siNEIL2, and siERCC1 MCF7 (Fig. 5B and C) and ZR-75 (Supplementary Fig. S9E and S9F) cells demonstrated a significant lack of growth inhibition in response to either fulvestrant or tamoxifen treatment.

Figure 5.

Inhibition of CETN2, NEIL2, and ERCC1 induces resistance to all classes of endocrine therapy in ER+ breast cancer cells and PDXs. A, Western blot validation of siRNA-mediated knockdown of CETN2, NEIL2, and ERCC1, respectively, in MCF7 cells. Results from three independent experiments are depicted. Columns represent the mean and error bars the SD. RNA level validation of knockdown is presented in Supplementary Fig. S9A. B and C, Dose–response curves of MCF7 cells with transient inhibition of CETN2, NEIL2, or ERCC1 treated with fulvestrant (B) or 4-hydroxy-tamoxifen (4-OHT; C). Dose response to estrogen stimulation is presented in Supplementary Fig. S9B. IC50 values were calculated from three independent dose curves for each condition and Student t test used to determine significant differences in IC50 values. nmol/L, nanomolar. Independent validation in a second cell line is presented in Supplementary Fig. S9C–S9F, and orthogonal validation of knockdown results is presented in Supplementary Fig. S10. D, Box plot depicting tumor viability in vivo after anastrozole treatment of 7 ER+ PDX lines from BCaPE, calculated using area under the curve (AUC) measurements. CEN: CETN2, ERCC1, NEIL2; MP: MLH1, PMS2. Wilcoxon rank-sum test determined P value. E, Working model indicating peak expression levels of NEIL2, ERCC1, MLH1, and CETN2 genes across the cell cycle. Data generated from two independent double thymidine block experiments (www.dnarepairgenes.com). Cumulative peak expression level of all genes in NHEJ, HR, and FA pathways also indicated. y-axis indicates relative gene expression level and x-axis is plotted on the basis of number of hours post release of double thymidine block. Implication of CDKs and estrogen stimulation in the cell cycle is based on published reports. Supporting mechanistic data is presented in Supplementary Figs. S11 and S12.
F, Bar graphs represent growth inhibition, relative to vehicle treated cells, in response to 100 nmol/L of fulvestrant or 1 μmol/L of palbociclib, CDK4/6 inhibitor in MCF7 cells stably expressing pooled RNAi oligos against CETN2, ERCC1, NEIL2 or scrambled control. Student t test determined P values by comparing growth inhibition in response to palbociclib against that in response to fulvestrant.

As negative controls, two DDR genes that did not correlate with ET resistance, RAD23B and POLM, were also knocked down using siRNA in MCF7 cells (Supplementary Fig. S10A). Downregulation of these genes did not alter growth response to either fulvestrant (Supplementary Fig. S10B) or tamoxifen (Supplementary Fig. S10C). In addition, independent lentiviral RNAi oligos against CETN2, NEIL2, and ERCC1 were used to stably select MCF7 cells with CETN2, ERCC1, and NEIL2 knockdown (Supplementary Figs. S9A and S10D demonstrate knockdown at RNA and protein levels, respectively). Fulvestrant-independent (Supplementary Fig. S10E) and tamoxifen-independent (Supplementary Fig. S10F) growth phenotypes were faithfully replicated in these stable cells, providing orthogonal confirmation that growth effects caused by knockdown of the three candidate genes, CETN2, NEIL2, and ERCC1, are specific and causal.

Next, dysregulation of these candidates at either the mutational or RNA level was assessed across 7 ER+ PDXs (BCaPE; ref. 43). One of the 7 lines had strong downregulation of NEIL2 expression, and two other lines exhibited downregulation of ERCC1. In addition, one line had downregulation of PMS2 (MutL−). One of the lines with low ERCC1 RNA also harbored a missense mutation in MLH1, suggesting a compound phenotype. All four of these lines (designated CENMP− because of disruption of any CETN2, ERCC1, NEIL2, or MutL component) exhibited significantly higher tumor viability after treatment with the AI, anastrozole, when compared with the other three tumors designated CENMP+ in this PDX cohort (P = 0.02; Fig. 5D).

To test whether the loss of proliferative inhibition to ET observed in human tumors (Fig. 2) was reproducible in experimental systems, Ki67 levels were also assessed before and after fulvestrant treatment in siCETN2, siNEIL2 and siERCC1 MCF7 cells relative to siScr control. Inhibition of gene expression from any of these three candidates resulted in a profound lack of Ki67 inhibition in response to fulvestrant treatment, unlike control cells (Supplementary Fig. S11A), reproducing observations in clinical trial samples. Earlier studies indicated that MutL-defective ER+ breast cancer cells exhibit altered proliferative response to ET due to dysregulation of G1–S cell-cycle transition (12). To test whether the candidate DDR genes identified in this screen also participate in the regulation of G1–S transition, their gene expression pattern was analyzed across the cell cycle after a double thymidine block (www.dnarepairgenes.com; ref. 44). All three candidates, as well as NHEJ genes, had maximal gene expression specifically in G1 or around the G1–S transition point, similarly to MLH1 (Fig. 5E). On the other hand, FA gene expression was maximal in late S phase and HR gene expression in late G2 (Fig. 5E). These observations are consistent with published data (45, 46) and indicate a common role for the candidate endocrine resistance DDR genes identified here in G1–S transition. As in the case of MutL-defective tumors, CEN/PRKDC- ER+ patient tumors from TCGA also had significantly increased RNA levels of CDK4, the principal G1 cyclin-dependent kinase (Supplementary Fig. S11B) and protein levels of PCNA, a marker of successful S-phase transition, relative to CEN/PRKDC+ tumors (Supplementary Fig. S11C). Increased PCNA positivity was also confirmed in MCF7 cells with stable knockdown of CETN2, NEIL2, ERCC1, and MLH1 after fulvestrant treatment relative to control cells (Supplementary Fig. S11D).

To test ER regulation as an alternative mechanism uniting these candidate genes in their ability to cause ET-resistant growth, correlation of gene expression of each candidate gene was tested against ESR1/PGR expression in patient tumors from NeoAI (Supplementary Fig. S12). Partial correlation was observed between ESR1/PGR levels and CETN2 (R = 0.36/0.2), but not for the other two genes. A ChIP-seq dataset (47, 48) identified an ESR1 binding peak close to the CETN2 promoter but not for the other two candidates. Therefore, although ER-mediated regulation of some DDR genes cannot be ruled out as one mechanism underlying relationships with ET resistance, it is unlikely that it constitutes a common underlying mechanism for all the DDR candidate genes studied herein. Together, these data suggest that CEN− ER+ breast cancer cells, akin to MutL− cells, enable unchecked CDK4 activity, resulting in rapid G1–S transition even in the presence of ET.

To directly test whether inhibition of CDK4/6 can inhibit proliferation in CEN− ER+ breast cancer cells, MCF7 cells with stable knockdown of CETN2, NEIL2, or ERCC1 were exposed to the CDK4/6 inhibitors, palbociclib and abemaciclib. Control MCF7 cells demonstrated comparable sensitivity to both fulvestrant and CDK4/6 inhibitors, palbociclib (Fig. 5F) and abemaciclib (Supplementary Fig. S11E), in keeping with published reports (12). However, downregulation of any one of the three candidate genes in MCF7 cells induced resistance to fulvestrant, but persistent sensitivity to both palbociclib (Fig. 5F) and abemaciclib (Supplementary Fig. S11E). These data provide preliminary support for a role for DDR dysregulation in predicting ET resistance and sensitivity to CDK4/6 inhibitors.

Predictive value of candidate DDR gene dysregulation in ER+ breast cancer

To estimate the impact of DDR dysregulation as a novel class of ET resistance driver and a predictive marker for ET failure, the cumulative frequency of dysregulation, i.e., multiple or cooccurring downregulation of 3 of the 4 novel candidate genes discovered in this analysis, CETN2, NEIL2, ERCC1, mutation or copy number loss of the fourth candidate gene, PRKDC, and downregulation of the two previously known candidate genes, MLH1 and PMS2, was assessed in METABRIC (Fig. 6A) and TCGA (Fig. 6B). In both datasets, downregulation of one or a combination of these genes occurred in 40%–60% of tumors from patients with ER+ breast cancer who died within 5 years of diagnosis. A less significant enrichment for dysregulation of these genes was observed in patients with ER+ breast cancer who died more than 5 years after diagnosis, suggesting that downregulation of these genes predisposes ER+ breast cancer to early ET failure consistent with intrinsic resistance.
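The legend of Fig. 6A and B states that a Fisher exact test determined P values for these frequency comparisons. A minimal sketch of a two-sided Fisher exact test on a 2×2 table follows. The function name `fisher_exact_two_sided` and the counts are hypothetical and invented for illustration; they are not taken from METABRIC or TCGA.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Rows   : patient group (e.g., died within 5 years vs all others)
    Columns: tumor status (dysregulated vs not dysregulated)
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x dysregulated tumors in row 1
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum probabilities of all tables at least as extreme as the observed one
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Invented counts: 12 of 24 early deaths vs 30 of 150 other patients
# have dysregulation of at least one candidate gene.
p_value = fisher_exact_two_sided(12, 12, 30, 120)
```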

Figure 6.

Cumulative incidence and predictive potential of CETN2, NEIL2, ERCC1, MLH1, and PMS2 (CENMP) deficiency. A and B, Stacked columns indicating cumulative frequency of dysregulation (mutation or underexpression) of CETN2, ERCC1, NEIL2 (CEN); MLH1, PMS2 (MutL); and PRKDC mutation (mut) or copy number loss (cnl) in ER+ breast tumors from METABRIC (A) and TCGA (B). Fisher exact test determined P values. C, Box plots describing CENMP expression signature score in tumors from patients based on their response to AI treatment. Wilcoxon rank-sum test determined P values. D and E, Kaplan–Meier survival curves evaluating separation based on CENMP score in ET-treated ER+ patients from METABRIC (D) and Loi (E) datasets. Cox regression identified HR and log-rank test determined P values for survival analyses.

To identify a DDR-low signature in patients with ER+ breast cancer, a gene expression score was defined using mean normalized expression of each gene. The score was significantly lower in resistant tumors from NeoAI when compared against sensitive counterparts (P = 0.002; Fig. 6C). While this indicates that the score associates with ET resistance in patient tumors, the sensitivity of the score is approximately 70% and the specificity approximately 68%, indicating potential for further refinement of the signature by inclusion of other known factors, and mutational or copy number data.
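One way to read "mean normalized expression of each gene" is sketched below: each gene is z-scored across tumors, and the score for a tumor is the mean of its z-scores across the five CENMP genes. The z-score normalization, the threshold, the function names, and the expression values are all assumptions made for illustration; the text does not specify the normalization or cutoff that was used.

```python
from statistics import mean, pstdev

# CENMP genes as listed in the Fig. 6 legend
GENES = ["CETN2", "ERCC1", "NEIL2", "MLH1", "PMS2"]

def signature_scores(expression):
    """expression: {gene: [value per tumor]}. Returns one score per tumor."""
    z = {}
    for gene in GENES:
        values = expression[gene]
        mu, sd = mean(values), pstdev(values)
        # z-score each gene across tumors (assumed normalization)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    n_tumors = len(z[GENES[0]])
    return [mean(z[g][i] for g in GENES) for i in range(n_tumors)]

def sensitivity_specificity(scores, resistant, threshold):
    """Call a tumor 'predicted resistant' when its score is below threshold."""
    pairs = list(zip(scores, resistant))
    tp = sum(1 for s, r in pairs if r and s < threshold)
    fn = sum(1 for s, r in pairs if r and s >= threshold)
    tn = sum(1 for s, r in pairs if not r and s >= threshold)
    fp = sum(1 for s, r in pairs if not r and s < threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Invented toy data: four tumors, the first two resistant with low expression
toy_expression = {gene: [1.0, 2.0, 5.0, 6.0] for gene in GENES}
toy_resistant = [True, True, False, False]
toy_scores = signature_scores(toy_expression)
sens, spec = sensitivity_specificity(toy_scores, toy_resistant, threshold=0.0)
```

Sweeping the threshold over the observed scores would trace the trade-off between sensitivity and specificity that the reported values of approximately 70% and 68% summarize at a single cutoff.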

Using this signature, the lowest and highest scoring quintiles of ER+ tumors were identified in METABRIC and Loi. The lowest scoring quintile associated with poor disease-specific and recurrence-free survival of patients with ER+ tumors in METABRIC (P < 0.001; Fig. 6D) and Loi (P = 0.09; Fig. 6E) indicating the feasibility of using this score to predict short-term outcomes in patient cohorts. Of note, this analysis also demonstrated better survival of patients in the upper quintile of the DDR signature score, suggesting dual validity of the score in predicting both worse and better response to ET.
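The quintile comparison described above can be sketched as follows, assuming tumors are simply ranked by score and the bottom and top fifths are taken. The function name `quintile_groups`, the floor-based group size, and the scores are hypothetical choices for illustration.

```python
def quintile_groups(scores):
    """Return (lowest, highest): tumor indices in the bottom and top quintiles."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = len(scores) // 5  # quintile size, rounded down (assumption)
    if k == 0:
        return [], []
    return order[:k], order[-k:]

# Invented scores for ten tumors
toy_quintile_scores = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
lowest, highest = quintile_groups(toy_quintile_scores)
```

The two index lists would then define the groups compared in a Kaplan–Meier analysis.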

This study presents a comprehensive characterization of the molecular landscape of canonical DNA repair pathway defects in ER+ breast cancer as it relates to response to ET. A previous epidemiologic study examining a selected subset of BER proteins using IHC identified XRCC1, APE1, SMUG1, and FEN1 as associating with ER+ breast cancer–specific survival (49). However, this study did not investigate a role for NEIL2, the only BER gene that was identified here. It is also noteworthy that the screening strategy outlined here did not identify XRCC1 or SMUG1 loss as associating with ET resistance, but this may be due to the stringent criteria required for a positive finding in our screening approach, that is, independent validation in three datasets. We are unaware of other studies reporting a role for loss of any DDR candidates identified here in ET response. This is of specific interest with the publication of many large genome-wide studies of breast cancer, which were extremely valuable in our analyses. Although other studies using these large datasets did not identify a role for DDR loss in endocrine resistance, this is likely because in most datasets the diagnosis of endocrine resistance is based on relapse and death from disease, a phenotype that is highly dependent on the quality of follow up. The use of neoadjuvant datasets in our analysis not only specifically addresses the intrinsic ET phenotype based on the highly prognostic nature of on-treatment Ki67 values, but also demonstrates that etiologic diagnoses relating to endocrine resistance can be made very early on in the course of the disease, enabling interventions to address adverse biology early enough to improve overall outcomes.

The identification of DDR defects as regulators of response to ET also provides fundamental insights into the etiology of ER+ breast cancer. Previous studies have identified lower incidence of structural rearrangements in ER+ breast tumors when compared with either ER− or HER2+ tumors (50). Simultaneously, whole-exome sequencing identified a subset of ER+ tumors with high somatic mutation load as associating with poor survival, whereas high mutation load in ER− tumors trended toward an association with better patient survival (42). The ability of ER+ breast cancer cells to grow in the presence of SSBR defects may reflect the evolutionary context of normal ER+ mammary cells, which are primed for sudden and rapid bursts of proliferation, associated with downregulation of many SSBR pathways (14). In contrast, ER+ mammary cells may find it more difficult to tolerate large genomic rearrangements, commonly associated with DSBR defects, as this is not part of their etiology. Further analysis of the unique role of NHEJ loss in endocrine treatment response is warranted.

In terms of alternative therapeutic strategies for patients with CENMP− ER+ breast cancer, this study provides some preliminary but potentially important associations that warrant deeper investigation. MutL-defective, ET-resistant, ER+ breast cancer cells and tumors are sensitive to CDK4/6 inhibitors (12), currently in clinical use in advanced disease settings. Preliminary functional investigations presented herein extend these observations, indicating that a common mechanism underlying endocrine resistance caused by disruption of multiple DDR candidate genes from different pathways can generate a disconnect between ER and CDK4/6 that is targetable with CDK4/6 inhibition.

The CEN score, which takes into account MMR, BER, and NER pathway genes, is a new starting point for identifying patients who are not likely to respond to ET and will require alternative treatments, potentially including CDK4/6 inhibition. However, sophisticated algorithms and inclusion of additional DDR genes and other known factors that regulate ET response will be necessary to improve the sensitivity and specificity of this signature, particularly for the prediction of ET response. The ultimate validation of our hypotheses awaits results from the many adjuvant CDK4/6 inhibitor trials that are ongoing.

In summary, the results of this study most clearly identify single-strand DNA damage repair defects as a novel class of ET resistance drivers that may contribute to perhaps half of ER+ breast cancer patient deaths within the first 5 years after diagnosis. Detailed mechanistic studies focused on dysregulation of identified DDR components are ongoing to facilitate a better understanding of the fundamental connections between the ER, CDK4/6, and DNA repair pathways to further refine the therapeutic approach that should be offered to these patients.

M. J. Ellis is an employee of and holds ownership interest (including patents) in Bioclassifier LLC, and is a consultant/advisory board member for NanoString, Pfizer, Novartis, and AstraZeneca. No potential conflicts of interest were disclosed by the other authors.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Conception and design: M. Anurag, M.J. Ellis, S. Haricharan

Development of methodology: M. Anurag, M.N. Bainbridge, M.J. Ellis, S. Haricharan

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): N. Punturi, M.J. Ellis, S. Haricharan

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): M. Anurag, M.N. Bainbridge, M.J. Ellis, S. Haricharan

Writing, review, and/or revision of the manuscript: M. Anurag, M.N. Bainbridge, M.J. Ellis, S. Haricharan

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M. Anurag, N. Punturi, J. Hoog, M.J. Ellis

Study supervision: M.N. Bainbridge, M.J. Ellis, S. Haricharan

The authors would like to acknowledge Mr. Jonathan T. Lei and Drs. Shyam Kavuri and Eric Chang for scientific input.

Research reported in this publication was primarily supported by Susan G. Komen Promise grant (PG12220321; to M.J. Ellis), Cancer Prevention and Research Institute of Texas (CPRIT) Recruitment of Established Investigators award (RR140033; to M.J. Ellis), and Laura Ziskin award from Stand Up2 Cancer (to M.J. Ellis). Clinical trial data accrual and analysis was supported by the National Cancer Institute of the NIH under Award Numbers U10CA180821 and U10CA180882 (to the Alliance for Clinical Trials in Oncology), U10CA077440 (legacy), U10CA180833, and U10CA180858. NeoPalAna trial was supported by Pfizer Pharmaceuticals and Susan G. Komen Promise Grant (to M.J. Ellis).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Davies C, Godwin J, Gray R, Clarke M, Cutter D, Darby S, et al. Relevance of breast cancer hormone receptors and other factors to the efficacy of adjuvant tamoxifen: patient-level meta-analysis of randomised trials. Lancet 2011;378:771–84.
2. Ma CX, Reinert T, Chmielewska I, Ellis MJ. Mechanisms of aromatase inhibitor resistance. Nat Rev Cancer 2015;15:261–75.
3. Bose R, Kavuri SM, Searleman AC, Shen W, Shen D, Koboldt DC, et al. Activating HER2 mutations in HER2 gene amplification negative breast cancer. Cancer Discov 2013;3:224–37.
4. Slamon DJ, Clark GM, Wong SG, Levin WJ, Ullrich A, McGuire WL. Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene. Science 1987;235:177–82.
5. Yersal O, Barutca S. Biological subtypes of breast cancer: Prognostic and therapeutic implications. World J Clin Oncol 2014;5:412–24.
6. Goncalves R, Ma C, Luo J, Suman V, Ellis MJ. Use of neoadjuvant data to design adjuvant endocrine therapy trials for breast cancer. Nat Rev Clin Oncol 2012;9:223–9.
7. Cheang MC, Chia SK, Voduc D, Gao D, Leung S, Snider J, et al. Ki67 index, HER2 status, and prognosis of patients with luminal B breast cancer. J Natl Cancer Inst 2009;101:736–50.
8. Barone I, Cui Y, Herynk MH, Corona-Rodriguez A, Giordano C, Selever J, et al. Expression of the K303R estrogen receptor-alpha breast cancer mutation induces resistance to an aromatase inhibitor via addiction to the PI3K/Akt kinase pathway. Cancer Res 2009;69:4724–32.
9. Osborne CK, Schiff R. Mechanisms of endocrine resistance in breast cancer. Annu Rev Med 2011;62:233–47.
10. de Groot AF, Kuijpers CJ, Kroep JR. CDK4/6 inhibition in early and metastatic breast cancer: a review. Cancer Treat Rev 2017;60:130–8.
11. Kwapisz D. Cyclin-dependent kinase 4/6 inhibitors in breast cancer: palbociclib, ribociclib, and abemaciclib. Breast Cancer Res Treat 2017;166:41–54.
12. Haricharan S, Punturi N, Singh P, Holloway KR, Anurag M, Schmelz J, et al. Loss of MutL disrupts CHK2-dependent cell-cycle control through CDK4/6 to promote intrinsic endocrine therapy resistance in primary breast cancer. Cancer Discov 2017;7:1168–83.
13. Broustas CG, Lieberman HB. DNA damage response genes and the development of cancer metastasis. Radiat Res 2014;181:111–30.
14. Caldon CE. Estrogen signaling and the DNA damage response in hormone dependent breast cancers. Front Oncol 2014;4:106.
15. Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, et al. Mutational processes molding the genomes of 21 breast cancers. Cell 2012;149:979–93.
16. Dietlein F, Thelen L, Reinhardt HC. Cancer-specific defects in DNA repair pathways as targets for personalized therapeutic approaches. Trends Genet 2014;30:326–39.
17. Nickoloff JA, Jones D, Lee SH, Williamson EA, Hromas R. Drugging the cancers addicted to DNA repair. J Natl Cancer Inst 2017;109.
18. Liberzon A, Birger C, Thorvaldsdóttir H, Ghandi M, Mesirov JP, Tamayo P, et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Systems 2015;1:417–25.
19. Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics 2011;27:1739–40.
20. Cline MS. Integration of biological networks and gene expression data using Cytoscape. https://www.nature.com/articles/nprot.2007.324.
21. Ellis MJ, Suman VJ, Hoog J, Lin L, Snider J, Prat A, et al. Randomized phase II neoadjuvant comparison between letrozole, anastrozole, and exemestane for postmenopausal women with estrogen receptor-rich stage 2 to 3 breast cancer: clinical and biomarker outcomes and predictive value of the baseline PAM50-based intrinsic subtype–ACOSOG Z1031. J Clin Oncol 2011;29:2342–9.
22. Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol 2009;27:1160–7.
23. Zehir A, Benayed R, Shah RH, Syed A, Middha S, Kim HR, et al. Mutational landscape of metastatic cancer revealed from prospective clinical sequencing of 10,000 patients. Nat Med 2017;23:703–13.
24. Olson JA Jr, Budd GT, Carey LA, Harris LA, Esserman LJ, Fleming GF, et al. Improved surgical outcomes for breast cancer patients receiving neoadjuvant aromatase inhibitor therapy: results from a multicenter phase II trial. J Am Coll Surg 2009;208:906–14.
25. Liu X, Jian X, Boerwinkle E. dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions. Hum Mutat 2011;32:894–9.
26. Ng PC, Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 2003;31:3812–4.
27. Hovelson DH, McDaniel AS, Cani AK, Johnson B, Rhodes K, Williams PD, et al. Development and validation of a scalable next-generation sequencing system for assessing relevant somatic variants in solid tumors. Neoplasia 2015;17:385–99.
28. Rhodes DR, Kalyana-Sundaram S, Mahavisno V, Varambally R, Yu J, Briggs BB, et al. Oncomine 3.0: genes, pathways, and networks in a collection of 18,000 cancer gene expression profiles. Neoplasia 2007;9:166–80.
29. Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, Ghosh D, et al. ONCOMINE: a cancer microarray database and integrated data-mining platform. Neoplasia 2004;6:1–6.
30. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann Stat 2001;29:1165–88.
31. Wang J, Vasaikar S, Shi Z, Greer M, Zhang B. WebGestalt 2017: a more comprehensive, powerful, flexible and interactive gene set enrichment analysis toolkit. Nucleic Acids Res 2017;45:W130–7.
32. Haricharan S, Brown P. TLR4 has a TP53-dependent dual role in regulating breast cancer cell growth. Proc Natl Acad Sci U S A 2015;112:E3216–25.
33. Ellis MJ, Ding L, Shen D, Luo J, Suman VJ, Wallis JW, et al. Whole-genome analysis informs breast cancer response to aromatase inhibition. Nature 2012;486:353–60.
34. Ellis MJ, Suman VJ, Hoog J, Goncalves R, Sanati S, Creighton CJ, et al. Ki67 proliferation index as a tool for chemotherapy decisions during and after neoadjuvant aromatase inhibitor treatment of breast cancer: results from the American College of Surgeons Oncology Group Z1031 Trial (Alliance). J Clin Oncol 2017;35:1061–9.
35. Urruticoechea A, Smith IE, Dowsett M. Proliferation marker Ki-67 in early breast cancer. J Clin Oncol 2005;23:7212–20.
36.
Liu
J
,
Prager-van der Smissen
WJ
,
Look
MP
,
Sieuwerts
AM
,
Smid
M
,
Meijer-van Gelder
ME
, et al
GATA3 mRNA expression, but not mutation, associates with longer progression-free survival in ER-positive breast cancer patients treated with first-line tamoxifen for recurrent disease
.
Cancer Lett
2016
;
376
:
104
9
.
37.
Chimge
NO
,
Little
GH
,
Baniwal
SK
,
Adisetiyo
H
,
Xie
Y
,
Zhang
T
, et al
RUNX1 prevents oestrogen-mediated AXIN1 suppression and beta-catenin activation in ER-positive breast cancer
.
Nat Commun
2016
;
7
:
10751
.
38.
Thewes
V
,
Simon
R
,
Schroeter
P
,
Schlotter
M
,
Anzeneder
T
,
Büttner
R
, et al
Reprogramming of the ERRalpha and ERalpha target gene landscape triggers tamoxifen resistance in breast cancer
.
Cancer Res
2015
;
75
:
720
31
.
39.
Iwamoto
T
,
Booser
D
,
Valero
V
,
Murray
JL
,
Koenig
K
,
Esteva
FJ
, et al
Estrogen receptor (ER) mRNA and ER-related gene expression in breast cancers that are 1% to 10% ER-positive by immunohistochemistry
.
J Clin Oncol
2012
;
30
:
729
34
.
40.
Loi
S
,
Haibe-Kains
B
,
Desmedt
C
,
Wirapati
P
,
Lallemand
F
,
Tutt
AM
, et al
Predicting prognosis using molecular profiling in estrogen receptor-positive breast cancer treated with tamoxifen
.
BMC Genomics
2008
;
9
:
239
.
41.
Griffith
OL
,
Spies
NC
,
Anurag
M
,
Griffith
M
,
Luo
J
,
Tu
D
, et al
The prognostic effects of somatic mutations in ER-positive breast cancer
.
bioRxiv
2017
.
42.
Haricharan
S
,
Bainbridge
MN
,
Scheet
P
,
Brown
PH
. 
Somatic mutation load of estrogen receptor-positive breast tumors predicts overall survival: an analysis of genome sequence data
.
Breast Cancer Res Treat
2014
;
146
:
211
20
.
43.
Bruna
A
,
Rueda
OM
,
Greenwood
W
,
Batra
AS
,
Callari
M
,
Batra
RN
, et al
A Biobank of breast cancer explants with preserved intra-tumor heterogeneity to screen anticancer compounds
.
Cell
2016
;
167
:
260
74
.
44.
Mjelle
R
,
Hegre
SA
,
Aas
PA
,
Slupphaug
G
,
Drabløs
F
,
Saetrom
P
, et al
Cell cycle regulation of human DNA repair and chromatin remodeling genes
.
DNA Repair
2015
;
30
:
53
67
.
45.
Nalepa
G
,
Clapp
DW
. 
Fanconi anaemia and cancer: an intricate relationship
.
Nat Rev Cancer
2018
;
18
:
168
85
.
46.
Murray
JM
,
Carr
AM
. 
Integrating DNA damage repair with the cell cycle
.
Curr Opin Cell Biol
2018
;
52
:
120
5
.
47.
Hurtado
A
,
Holmes
KA
,
Ross-Innes
CS
,
Schmidt
D
,
Carroll
JS
. 
FOXA1 is a key determinant of estrogen receptor function and endocrine response
.
Nat Genet
2011
;
43
:
27
33
.
48.
Joseph
R
,
Orlov
YL
,
Huss
M
,
Sun
W
,
Kong
SL
,
Ukil
L
, et al
Integrative model of genomic factors for determining binding site selection by estrogen receptor-alpha
.
Mol Syst Biol
2010
;
6
:
456
.
49.
Abdel-Fatah
TM
,
Perry
C
,
Arora
A
,
Thompson
N
,
Doherty
R
,
Moseley
PM
, et al
Is there a role for base excision repair in estrogen/estrogen receptor-driven breast cancers?
Antioxidants Redox Signal
2014
;
21
:
2262
8
.
50.
Kwei
KA
,
Kung
Y
,
Salari
K
,
Holcomb
IN
,
Pollack
JR
. 
Genomic instability in breast cancer: pathogenesis and clinical implications
.
Mol Oncol
2010
;
4
:
255
66
.