Purpose: This study was undertaken to conduct a comprehensive investigation of the role of DNA damage repair (DDR) defects in poor outcome ER+ disease.
Experimental Design: Expression and mutational status of DDR genes in ER+ breast tumors were correlated with proliferative response in neoadjuvant aromatase inhibitor therapy trials (discovery dataset), with outcomes in METABRIC, TCGA, and Loi datasets (validation datasets), and in patient-derived xenografts. A causal relationship between candidate DDR genes and endocrine treatment response, and the underlying mechanism, was then tested in ER+ breast cancer cell lines.
Results: Loss of expression of three genes, CETN2 (P < 0.001) and ERCC1 (P = 0.01) from the nucleotide excision repair (NER) pathway and NEIL2 (P = 0.04) from the base excision repair (BER) pathway, was associated with endocrine treatment resistance in the discovery dataset and was subsequently validated in independent patient cohorts. Complementary mutation analysis supported associations between mutations in NER and BER genes and reduced endocrine treatment response. A causal role for CETN2, NEIL2, and ERCC1 loss in intrinsic endocrine resistance was experimentally validated in ER+ breast cancer cell lines, and in ER+ patient-derived xenograft models. Loss of CETN2, NEIL2, or ERCC1 induced endocrine treatment resistance by dysregulating G1–S transition, and therefore, increased sensitivity to CDK4/6 inhibitors. A combined DDR signature score was developed that predicted poor outcome in multiple patient cohorts.
Conclusions: This report identifies DDR defects as a new class of endocrine treatment resistance drivers and indicates new avenues for predicting efficacy of CDK4/6 inhibition in the adjuvant treatment setting. Clin Cancer Res; 24(19); 4887–99. ©2018 AACR.
Estrogen receptor positive (ER+) breast cancer is treatable with endocrine drugs that interrupt estrogen receptor function, but fatal drug resistance occurs in at least 30% of cases. The comprehensive analysis of the molecular landscape of DNA repair defects in ER+ breast cancer patient tumors reported here identifies defects in select DNA repair pathways as a novel class of endocrine treatment resistance driver occurring in approximately 40% of patients with endocrine treatment–resistant ER+ breast cancer. Candidate DNA damage repair genes identified are experimentally shown to be linked by common dysregulation of G1–S transition, suggesting that loss of DNA proof reading pathways disrupts ER regulation of the cell cycle, and therefore, response to endocrine treatment. A combined DDR signature score was developed that predicted poor outcome in multiple patient cohorts, which will have immense translational implications in stratifying patients who would not respond to endocrine therapy, but will be good candidates for CDK4/6 inhibitor–based treatment. These findings, therefore, significantly increase the understanding of factors underlying response to standard-of-care in patients with ER+ breast cancer and identify an alternative targeted therapeutic strategy for this subset of patients.
Breast cancer is the most frequent form of cancer affecting women, and estrogen-receptor positive (ER+) tumors account for 60%–70% of all reported cases. For patients with early-stage ER+ disease, endocrine therapy with tamoxifen or an aromatase inhibitor (AI) is the preferred first-line treatment. Despite these treatments, at least 1 in 4 patients develop fatal endocrine therapy resistance (1, 2). Although some markers predictive of endocrine treatment (ET) response are available, for example, ERBB2 mutation/amplification (3, 4), gene expression profiles (5), and on-neoadjuvant endocrine treatment Ki67 analysis (6, 7), resistance largely remains an unpredictable and poorly understood event. Efforts to study underlying mechanisms of ET resistance have focused on activation of peptide growth factors (e.g., EGFR, ERBB2) and on activating mutations or translocation in ESR1 (8, 9). However, these mechanisms mostly explain adaptive or acquired resistance to endocrine treatment in the advanced setting, thereby circumscribing their use as predictive markers in primary tumors. The success of cyclin-dependent kinase (CDK) 4/6 inhibition for metastatic breast cancer (10, 11) indicates a major role for the target cell-cycle–dependent kinases in restoring growth control in ET-resistant tumors, but these agents lack accurate predictive markers. Consequently, adjuvant treatment will lead to overtreatment in some patients and undertreatment in others.
It is known that loss of the MutL components of the mismatch repair (MMR) complex causes poor initial response to ET (intrinsic ET resistance), but in the MutL-deficient setting tumors remain sensitive to CDK4/6 inhibition (12). Effects of other DNA damage response/repair (DDR) defects on ET resistance are understudied but might have similar relationships. In other cancer types, disruptions in DDR pathways associate with tumor formation, responsiveness to chemotherapy, and loss of replicative checkpoints (13). In addition, ER-induced signaling and proliferation downregulates many DDR pathways in the normal mammary gland (14), potentially signifying a relationship between defects in DDR and hormonal response in ER+ breast cancer cells. Several genomic studies on breast cancer have identified signatures of DNA repair defects generated by classifying types of mutations, but the impact of these studies has been diluted by uncertainty regarding the molecular origin and clinical relevance of these signatures (15). Therefore, there is strong rationale for conducting a comprehensive molecular analysis of the role of DDR defects in regulating and predicting ET response.
DDR is constituted of eight canonical pathways: mismatch repair (MMR), which can be further broken down into MutL and MutS complementation groups; nucleotide excision repair (NER); base excision repair (BER); nonhomologous end joining (NHEJ); homologous recombination (HR); Fanconi Anemia (FA); trans-lesion synthesis (TLS); and direct repair (DR; ref. 16). The first six of these pathways fall into one of two larger categories: single-strand break repair (SSBR) consisting of MMR, BER, and NER, and double-strand break repair (DSBR) comprised of NHEJ and HR (ref. 17; Fig. 1A). Here, we describe a comprehensive analysis of these canonical DDR pathways in ER+ breast tumors from patients and draw causal associations between BER and NER defects and ET resistance that parallel earlier observations on the role of MMR defects.
Materials and Methods
DDR gene set compilation
Gene sets for the canonical DDR pathways (DR, MMR, NER, BER, HR, NHEJ, and FA), along with checkpoint genes, were built as the union of MSigDB (18, 19) KEGG (c2:curated) pathway–specific genes and the updated table of DDR pathway genes listed at http://sciencepark.mdanderson.org/labs/wood/dna_repair_genes.html. Genes shared across different DDR pathways or checkpoints were not included in the analysis. In addition, TLS pathway genes were left out of the current analysis because of their ambiguous role in SSBR and DSBR mechanisms. Cytoscape (20) was used to visualize the DDR pathway network.
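The compilation rule described above (union of two sources per pathway, followed by exclusion of genes assigned to more than one pathway) can be sketched as follows. This is an illustrative sketch only, not the script used in the study; the function name `build_ddr_gene_sets` and the placeholder genes `SHARED_GENE`, `EXTRA_NER`, and `EXTRA_BER` are hypothetical.

```python
from collections import Counter


def build_ddr_gene_sets(source_a, source_b):
    """Union two pathway -> gene-set mappings, then drop shared genes.

    A gene is dropped if, after the union, it appears in more than one
    pathway (hypothetical reading of "genes shared across pathways").
    """
    pathways = set(source_a) | set(source_b)
    merged = {
        p: set(source_a.get(p, ())) | set(source_b.get(p, ()))
        for p in pathways
    }
    # Count in how many pathways each gene appears after merging.
    counts = Counter(g for genes in merged.values() for g in genes)
    return {p: {g for g in genes if counts[g] == 1}
            for p, genes in merged.items()}


# Toy input; placeholder gene names are hypothetical.
kegg_like = {"NER": {"ERCC1", "CETN2", "SHARED_GENE"},
             "BER": {"NEIL2", "SHARED_GENE"}}
curated_like = {"NER": {"ERCC1", "EXTRA_NER"},
                "BER": {"NEIL2", "EXTRA_BER"}}
unique_sets = build_ddr_gene_sets(kegg_like, curated_like)
```

In this toy input, `SHARED_GENE` is removed from both pathways because it is assigned to NER and BER simultaneously.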
Z1031/POL dataset (referred to as NeoAI) was used with permission from the Alliance consortium (21). The data were obtained after written informed consent from the patients and the studies were conducted in accordance with recognized ethical guidelines approved by IRB (21). TCGA (downloaded June 2017) and MSKCC-IMPACT (downloaded February 2018) mutation data were obtained from cBioPortal. TCGA (downloaded June 2017) and METABRIC (downloaded June 2017) copy number data were obtained from cBioPortal. TCGA analyses were restricted to ER+ patient tumors [except for Supplementary Fig. S3 where basal-like and HER2-enriched tumors were analyzed on the basis of published PAM50 categorization (22)]. For MSKCC-IMPACT analyses, a list of ER− breast cancer sample IDs was derived from previously published literature (23) and subtracted from the list available on cBioPortal. While there is estimated to be some contamination from patients with HER2+ or ER− breast cancer, this subtracted list contains a majority of ER+ breast tumors. TCGA and METABRIC gene expression data and associated survival outcomes were downloaded from Oncomine. Standard cutoffs of mean-1.5× SD were used to identify “Low” subsets of each candidate gene in each dataset when multiple candidate genes were combinatorially analyzed. For individual analyses in METABRIC, “Low” subsets were identified using median cutoffs, and in Loi, using mean-1.5 × SD.
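The two cutoff rules used to define “Low” expression subsets (mean − 1.5 × SD and median) can be illustrated with a short sketch. The function names are hypothetical, and the use of the sample standard deviation is an assumption, as the text does not specify sample versus population SD.

```python
import statistics


def low_by_mean_sd(values, k=1.5):
    """Flag samples whose expression is below mean - k * SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD: an assumption
    cutoff = mean - k * sd
    return cutoff, [v < cutoff for v in values]


def low_by_median(values):
    """Flag samples whose expression is below the median."""
    med = statistics.median(values)
    return med, [v < med for v in values]


# Toy expression values (hypothetical): one clear low-expressing sample.
expr = [10, 10, 10, 10, 10, 10, 10, 10, 10, 1]
cutoff, low_flags = low_by_mean_sd(expr)
```

The mean − 1.5 × SD rule isolates only strongly underexpressing tumors, whereas a median cutoff labels up to half of a cohort as “Low”; this difference should be kept in mind when comparing results across datasets.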
Mutation calls from NeoAI (21, 24) were used as the discovery set along with DNA-sequenced ER+ tumors from TCGA. For NeoAI dataset, mutations with Normal VAF<10/NA and Tumors VAF >10/NA were classified as somatic. Mutations were scored by SIFT (25, 26) to assess their effect on protein structure and function. In accordance with SIFT standards, missense mutations with scores <0.05 were considered to be damaging. For the enrichment analysis in TCGA, mutations were categorized into three variant types—missense, frameshift, and nonsense. Frameshift and nonsense mutations were cumulatively referred to as FS/NS mutations. Enrichment analysis in TCGA used the Z-score test of two population proportions to compare the proportion of missense to frameshift/nonsense mutations in each DDR pathway to the proportion of missense to frameshift/nonsense mutations in a control set of unrelated genes (MYH7, SYNE1, NEB) in tumors from patients who remained alive. This approach was used to ensure the presence of appropriate sample size for the analysis employed and to control for genome-wide levels of mutations in each tumor.
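For reference, a Z-score test of two population proportions can be sketched as below. This is a generic pooled-variance formulation, not the study's code. It assumes that each proportion is computed as the FS/NS count divided by the total of FS/NS plus missense mutations; the function name and the counts in the example are hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided P value).

    x1/n1: e.g., FS/NS count over all mutations in a DDR pathway.
    x2/n2: the same quantity in the control gene set.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; test is undefined")
    z = (p1 - p2) / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 30/100 FS/NS in a pathway vs. 10/100 in controls.
z, p = two_proportion_z_test(30, 100, 10, 100)
```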
For univariate and multivariate analyses, 887 tumors from Luminal (A/B) patients who received endocrine treatment were analyzed. mRNA (microarray) expression and survival information along with other clinical metadata were extracted from Oncomine (27–29). Only samples with survival metadata were included in the analysis. For this cohort, tumors with expression levels of candidate genes lower than the median were labeled as “low,” while the rest were labeled as “high.” All survival data were analyzed using Kaplan–Meier curves and log-rank tests. Proportional hazards were determined using Cox regression.
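As an illustration of the Kaplan–Meier estimate underlying the survival curves, a minimal product-limit estimator is sketched below. This is a textbook formulation rather than the software used in the study; the function name and toy data are hypothetical, and events are coded 1 for death and 0 for censoring.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    Returns a list of (time, survival) pairs, one per distinct time
    at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set.
        n_at_risk -= removed
    return curve


# Toy follow-up data (hypothetical): the second subject is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Computing one curve for the “low” group and one for the “high” group gives the two curves that a log-rank test then compares.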
Missing values in the mutation, expression, and survival data were coded as “NA.” Samples falling into more than one category were either removed (e.g., samples with mutations in both SSBR and DSBR genes) or treated as a separate set (e.g., cumulative analysis of tumors with dysregulation of new versus published candidates) for statistical comparisons. Pearson's correlation was performed on log-transformed normalized data for every DDR gene using an automated script in R. To control for multiple testing, false discovery rates were calculated using the Benjamini–Hochberg (30) method in R. Two-tailed Wilcoxon rank sum tests were used for two-sample tests of association between classes. Pathway over-representation analysis was performed using the thirteen candidate genes as input against all DDR genes (n = 104) as the background in WebGestalt (31) using the KEGG pathway database.
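The Benjamini–Hochberg adjustment can be written in a few lines; the sketch below is a generic implementation of the standard step-up procedure in Python, provided for illustration only (the study performed this step in R). The function name and example P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in input order."""
    m = len(p_values)
    # Indices of P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene P values.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Genes whose adjusted value falls below the chosen threshold (e.g., FDR < 0.05) are retained as significant.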
A DDR expression signature score, termed the CENMP (CETN2, ERCC1, NEIL2, MLH1, PMS2) score, was devised as the mean of standardized expression of these five genes. The CENMP score was calculated in three independent datasets (NeoAI, METABRIC, and Loi) using microarray expression levels. Survival of patients in the highest 20% of CENMP scores was compared with that of patients in the lowest 20%.
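A minimal sketch of this score is given below, assuming that “standardized expression” means a per-gene z-score computed across samples within a dataset and that the sample standard deviation is used; neither detail is stated in the text. The function names and toy data are hypothetical.

```python
import statistics

GENES = ("CETN2", "ERCC1", "NEIL2", "MLH1", "PMS2")


def cenmp_scores(expression):
    """Mean of per-gene z-scores for each sample.

    expression: dict mapping gene -> list of expression values,
    with samples in the same order for every gene.
    """
    z = {}
    for gene in GENES:
        values = expression[gene]
        mu = statistics.mean(values)
        sd = statistics.stdev(values)  # sample SD: an assumption
        z[gene] = [(v - mu) / sd for v in values]
    n_samples = len(expression[GENES[0]])
    return [statistics.mean([z[g][i] for g in GENES])
            for i in range(n_samples)]


def extreme_groups(scores, fraction=0.2):
    """Return sample indices of the lowest and highest score fractions."""
    k = max(1, int(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k], order[-k:]


# Toy data (hypothetical): 10 samples with steadily rising expression.
toy = {g: [float(i + offset) for i in range(10)]
       for offset, g in enumerate(GENES)}
scores = cenmp_scores(toy)
lowest, highest = extreme_groups(scores)
```

The two index lists define the groups whose survival would then be compared.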
Cell lines, siRNA transfection, and growth assays
ZR75.1 and MCF7 parental cells were from ATCC, 2015 and 2017, respectively, and tested for mycoplasma contamination upon arrival using the Lonza Mycoalert Plus Kit (CAT# LT07-710) as per the manufacturer's instructions, and annually since. Both lines were maintained in RPMI1640 1× w/l-glutamine and 1% penicillin/streptomycin (Sigma-Aldrich, Catalog no. P4333-100 mL). Cell lines used for experiments were <20 passages. Transient transfections with esiRNA (Sigma-Aldrich) toward human NEIL2 (Catalog no. EHU158461), CETN2 (catalog no. EHU137031), ERCC1 (catalog no. EHU156971), RAD23B (catalog no. EHU145881) and POLM (catalog no. SASI_Hs02_003344553), or scrambled control used at 50 nmol/L each were delivered using Polyplus jetPRIME Transfection Reagent (catalog no. 114-07) as per the manufacturer's instructions. Cells were plated for experiments 48 hours after transfection. Stable selection with puromycin after infection with lentivirus harboring RNAi oligos (ABM) against human NEIL2 (catalog no. i014980a), CETN2 (catalog no. i004479a), and ERCC1 (catalog no. i007023a) or scrambled control (catalog no. i000238c) was conducted using manufacturer's protocol. Growth assays were repeated independently in triplicate as reported previously (32) using Alamar blue to detect cell viability. Final readings were between 4 and 6 days after initial drug treatment and fold change plotted for analysis.
Fulvestrant (Thermo Fisher Scientific catalog no. 506242), 4-OHT (Sigma-Aldrich catalog no. H7904), palbociclib (Thermo Fisher Scientific catalog no. 508548), and abemaciclib (Thermo Fisher Scientific, catalog no. NC0577560), dissolved in DMSO, and β-estradiol (Sigma-Aldrich catalog no. E2758-1G), dissolved in water, at 10 mmol/L were stored long term at −80°C with working stocks at −20°C. Cells were treated 24 hours after plating with fresh drug added every 48 hours until the end of the experiment. Estradiol addition experiments were conducted in phenol red–free DMEM with 10% charcoal-stripped serum.
qPCRs, Western blot analysis, FACS and IFs
RNA from cell lines was extracted using the Qiagen RNeasy Mini Kit (catalog no. 74106) and converted to cDNA using Bio-Rad Reverse Transcriptase iScript (catalog no. TX1708891BAY), both following manufacturers’ instructions. cDNA was quantified by q-RT-PCR using Bio-Rad SsoAdvanced Universal SYBR Green Supermix (catalog no. 17525272) according to the manufacturer's specifications. For immunofluorescence, 48 hours after esiRNA transfection, cells were plated on Poly-d-lysine–coated coverslips (catalog no. NC0746078), treated for 24 hours, then harvested. Cells on coverslips were then washed with 1× PBS, fixed in 4% PFA, and costained with Ki-67/MKI67 Antibody (Novus Biologicals catalog no. NB110-89717SS). Western blotting was conducted as described previously (32) using antibodies against CETN2 (Abclonal catalog no. A5397), NEIL2 (Abcam catalog no. ab221556), ERCC1 (Abcam catalog no. ab129267), and β-actin (Sigma-Aldrich catalog no. A5316-100 μL). FACS analysis for PCNA (Abcam CAT#ab29) was conducted after treating MCF7 siScr or siCEN cells with 100 nmol/L fulvestrant for 72 hours following a 20-hour treatment with 250 nmol/L nocodazole (Sigma-Aldrich catalog no. M1404). Cells were then probed for PCNA and run on a BD Accuri C6 flow cytometer using a standard protocol.
NER and BER downregulation associates with endocrine treatment resistance
The status of 104 DDR genes belonging to the six major DDR pathways: NER, BER, MMR, NHEJ, FA, and HR was assessed in primary ER+ breast tumors. Genes that were shared between multiple pathways were excluded from the analysis to facilitate better understanding of the discrete contributions of individual pathways to ET resistance (Fig. 1A). In addition, DR and TLS were excluded because they fall outside canonical SSBR-DSBR categorization, and appeared to be rarely dysregulated in ER+ tumors in initial observations. The discovery set, referred to as NeoAI, was composed of data from two neoadjuvant aromatase inhibitor trials [Z1031 (21, 33) and POL (24)]. These clinical trials were designed to assess intrinsic ET response by accruing serial biopsies from patients at diagnosis (baseline, BL), after 2–4 weeks of endocrine treatment (on-treatment) and in the surgical specimen (Fig. 1B). These biopsies were annotated by whole-genome/exome-sequencing, RNA-seq and gene expression microarray, providing both mutational and transcriptomic information. The biopsies, both baseline and on-treatment were also evaluated for Ki67, a marker of proliferation, by both IHC and expression (ref. 34; Supplementary Fig. S1A). An on-treatment Ki67 value >10% is a clinically relevant marker of intrinsic ET resistance associated with elevated risk of relapse in the first 5 years of follow up (35). The discovery strategy for this study was therefore, to correlate DDR gene expression at baseline with on-treatment Ki67 levels (both by IHC and by expression), and combine results with that from a complementary enrichment analysis for deleterious mutations, to identify a set of DDR genes, which when dysregulated may predict intrinsic ET resistance (Fig. 1B).
Transcriptomic analysis (schema outlined in Supplementary Fig. S1B) identified 13 DDR genes whose reduced expression (at baseline) correlated with increased on-treatment Ki67 levels (Fig. 2A). Three genes (ESR1, GATA3, and RUNX1; refs. 36–39) were used as positive controls because of their known associations with ET responsiveness (Fig. 2A). Twelve of the 13 DDR genes identified belonged to SSBR pathways, corresponding to approximately 20% of all unique SSBR genes compared with only 2% of all unique DSBR genes (Fig. 2B) supporting a strong role for SSBR pathway downregulation in ET resistance. FA was the only one of three DSBR pathways examined to demonstrate significant association with ET resistance (Fig. 2C). Pathway enrichment analysis of the candidate gene list relative to all other DDR genes studied revealed significant over-representation of “Platinum drug resistance” (P = 0.04, comprised of MMR and NER genes) and “mismatch repair” (P = 0.04) terms (Fig. 2D). These results framed the proposition that underexpression of genes serving NER, and to a lesser extent, BER genes can reduce response to ET.
Low expression of CETN2, ERCC1, and NEIL2 is a poor prognostic factor in ER+ breast cancer patients treated with endocrine therapy
To understand the effect of downregulation of the 13 candidate genes on long-term patient outcomes, univariate associations between low expression of these 13 genes and patient survival were tested in an independent dataset (METABRIC). To increase the specificity of these correlations, only the subset of patients with luminal tumors who were treated with ET was included in these analyses. Univariate Cox proportional hazards analyses based on overall survival identified five genes as significantly associating with poor survival: CETN2, ERCC1, MLH1, NEIL2, and PMS2 (Supplementary Fig. S2). Subsequent multivariate analyses, including tumor size, grade, and node status, supported an independent role for these five candidates in predicting disease-specific survival (DSS, Fig. 3A). An association between reduced expression of MutL genes, MLH1 and PMS2, and poor survival has already been described in this dataset (12). Kaplan–Meier survival analysis demonstrated that low expression of CETN2, NEIL2, and ERCC1 individually also correlated with worse DSS compared with all other patients with ER+ breast cancer (Fig. 3B–D). However, no association between low RNA levels and poor survival for CETN2, NEIL2, and ERCC1 was observed in patients with either HER2-enriched (Supplementary Fig. S3A–S3C) or basal-like (Supplementary Fig. S3D–S3F) breast cancer, suggesting that the association between defects in these genes and survival is subtype-specific. A correlation between low expression of CETN2, NEIL2, and ERCC1 and poor recurrence-free survival was also observed in the Loi dataset (40), serving as independent validation (Supplementary Fig. S4). Furthermore, a composite signature based on low expression of any one of these genes (the CEN signature) associated with a significantly increased risk ratio of 5.1 in Loi (P = 0.02; Fig. 3E).
Damaging mutations in nucleotide excision repair, base excision repair and non-homologous end joining genes are enriched in ER+ patient tumors
To further understand the involvement of DDR genes in ER+ breast cancer, incidence of damaging versus nondamaging missense mutations (as predicted by SIFT; refs. 25, 26) was analyzed in pretreatment biopsies from NeoAI (Supplementary Table S1). In this analysis, genes from SSBR pathways showed significant enrichment for damaging mutations when compared with all other genes in the genome, whereas DSBR genes did not (Fig. 4A). Enrichment for damaging and tolerant mutations over genome-wide frequency was then assessed for each individual DDR pathway in NeoAI (Supplementary Fig. S5). Damaging mutations were enriched in genes of NER, BER, NHEJ, and HR pathways (FDR < 0.05), but tolerant mutations were not, potentially indicating a selection for deleterious mutations in these pathways during tumor evolution. Damaging mutations were also enriched in genes of FA pathway, but nondamaging mutations showed even higher enrichment, suggesting that any role for FA gene mutation in ER+ breast cancer is likely complex (Supplementary Fig. S5). Enrichment for deleterious (frameshift/nonsense; FS/NS) over missense (MS) mutations in ER+ patient tumors was validated in TCGA for genes of BER and NHEJ, but not HR and FA, pathways, although due to limited follow-up time in this dataset, similar validation could not be obtained for NER (Fig. 4B). To facilitate rigorous statistical analysis, P values were generated by comparing the proportion of somatic FS/NS:MS mutations in each DDR pathway in patients who were alive, to a control set of genes that have not been implicated as cancer drivers (SYNE1, MYH7, NEB). Together, these enrichment analyses promote the postulate that NER, BER, and NHEJ gene mutations may be ER+ breast cancer drivers.
To address associations with clinical outcomes more directly, Cox regression analysis was conducted for tumors with mutations in each DDR pathway in two datasets: TCGA and MSKCC-IMPACT (Fig. 4C). TCGA has whole-exome sequencing data from >800 ER+ breast tumors, while MSKCC-IMPACT has targeted sequencing of a selected panel of genes (including a subset of DDR genes) in >300 ER+ primary breast tumors. Among the SSBR pathways, mutations in NER (ERCC2–5) and BER genes each associated with significantly higher HR in MSKCC-IMPACT and TCGA databases, respectively (Fig. 4C), validating observations made in the gene expression (Fig. 2) and enrichment analyses described above (Fig. 4A and B; Supplementary Fig. S5). Association of NER gene mutations in TCGA and BER gene mutations in MSKCC-IMPACT could not be made because of median follow-up being <6 months in either case.
Amongst the DSBR pathways, tumors with mutations in NHEJ genes associated with a significantly higher HR when compared with wild-type tumors in TCGA (Fig. 4C). No NHEJ gene was included in the targeted panel sequenced in MSKCC-IMPACT precluding validation in this dataset. To date, NHEJ has not been associated with ET resistance. Only five genes from this pathway had truncating mutations in either NeoAI or TCGA: PRKDC, XRCC5, DNTT, NHEJ1, and POLM. Patients whose tumors harbored either MS or FS/NS mutations in any of these genes associated with worse overall survival in TCGA, as did patients whose tumors had copy number loss of these loci (Supplementary Fig. S6A). When individual associations with survival were analyzed, only mutation and/or copy number loss of PRKDC associated independently with poor survival (HR = 2.8; P = 0.009, Supplementary Fig. S6B). In addition, tumors with either PRKDC copy number loss or mutations also had significantly lower gene expression of PRKDC (P < 0.001, Supplementary Fig. S6C), suggesting that this is unlikely to be a chance association. Although mutation data on PRKDC was not available in METABRIC, the association of PRKDC copy number loss with poor survival was validated in METABRIC (Supplementary Fig. S6D). In addition, the association of PRKDC mutations with poor prognosis was observed in parallel with this study, in an independent set of ER+ patient tumors (41).
Among the other DSBR pathways, primary tumors with mutations in HR genes did not associate with higher HRs than wild-type tumors in either TCGA or MSKCC-IMPACT where eight HR genes (RAD54L, RAD52, RAD51D, RAD51, NBN, BRCA1, and BLM) were included in the targeted panel (Fig. 4C). Mutations in FA genes had mixed associations, with patients whose tumors had mutations in FA genes associating with worse survival in TCGA, but not in MSKCC-IMPACT (FANCA, FANCC, PALB2, BRIP1; Fig. 4C).
Consideration of all three discovery parameters analyzed, that is, gene expression downregulation, gene mutation, and association with patient outcomes (increased HRs for overall survival in TCGA, METABRIC, or MSKCC-IMPACT) provides strongest support for the involvement of SSBR pathway dysregulation in poor clinical outcomes of patients with ER+ breast cancer (Fig. 4D). More specifically, all three discovery parameters support an understudied role for NER and BER dysregulation in ET resistance (Fig. 4D) that warrants functional investigation. Evidence for involvement of DSBR pathway dysregulation in ER+ breast cancer outcomes is less consistent across the three different screening parameters (Fig. 4D) and requires further investigation in larger patient cohorts, and in experimental model systems.
Two confounding factors affect interpretation of mutational analyses conducted here. First, it is possible that dysregulation of replication factors commonly associated with DSBR disruption affects proliferative response to ET. However, no decisive association between replication gene expression (RPA1-4) and on-treatment proliferation marker Ki67 (IHC and mRNA, Supplementary Fig. S7) was observed in NeoAI, suggesting that replicative disruption is unlikely to be a major confounding factor for this analysis. Second, previous reports have suggested an association between high mutation load or genome instability and poor patient outcomes in breast cancer (42), which may influence interpretation of associations between mutations in DDR genes and clinical outcome. No significant increase in genome instability or mutation load was observed in tumors with mutations in DSBR genes (Supplementary Fig. S8B and S8D). However, as expected, a significant increase in mutation load, but not genome instability, was observed in SSBR-mutated tumors (Supplementary Fig. S8A and S8C). Therefore, it is not possible to rule out high mutation load as a potential confounding factor in the mutational analysis presented above, without functional validation of causality. Dysregulation of genes from the NER (CETN2 and ERCC1) and BER (NEIL2) pathways were, therefore, functionally investigated as candidates from these pathways were most consistently correlated with ET resistance and poor patient outcomes (Fig. 4D).
Experimental validation of CETN2, ERCC1, and NEIL2 as endocrine therapy resistance genes
To test whether dysregulation of candidate genes from NER and BER pathways can directly cause ET resistance, pooled siRNA against each of the three candidate genes identified in the gene expression screen, that is, CETN2, ERCC1, and NEIL2, as well as a scrambled control, was transiently transfected into two ET-sensitive, ER+ breast cancer cell lines, MCF7 (Fig. 5) and ZR-75 (Supplementary Fig. S9). Knockdown of the genes was validated at RNA level in both cell lines (Supplementary Fig. S9A and S9C), and downregulation of each protein was confirmed by Western blots of MCF7 cell lysates (Fig. 5A). Cells transfected with scrambled control siRNA (siScr) or siRNA against CETN2, NEIL2, or ERCC1 were then exposed to all three classes of ET: estrogen deprivation in media containing charcoal-stripped serum (to mimic AI), and tamoxifen or fulvestrant treatment in media containing full serum. Estrogen-deprived siCETN2, siNEIL2, and siERCC1 MCF7 (Supplementary Fig. S9B) and ZR-75 (Supplementary Fig. S9D) cells showed attenuated growth response to estradiol stimulation when compared with siScr control cells, indicating decreased influence of estrogen signaling on proliferation. Consistent with this notion, siCETN2, siNEIL2, and siERCC1 MCF7 (Fig. 5B and C) and ZR-75 (Supplementary Fig. S9E and S9F) cells demonstrated a significant lack of growth inhibition in response to either fulvestrant or tamoxifen treatment.
As negative controls, two DDR genes that did not correlate with ET resistance, RAD23B and POLM, were also knocked down using siRNA in MCF7 cells (Supplementary Fig. S10A). Downregulation of these genes did not alter growth response to either fulvestrant (Supplementary Fig. S10B) or tamoxifen (Supplementary Fig. S10C). In addition, independent lentiviral RNAi oligos against CETN2, NEIL2, and ERCC1 were used to stably select MCF7 cells with CETN2, ERCC1, and NEIL2 knockdown (Supplementary Figs. S9A and S10D demonstrate knockdown at RNA and protein level, respectively). Fulvestrant-independent (Supplementary Fig. S10E) and tamoxifen-independent (Supplementary Fig. S10F) growth phenotypes were faithfully replicated in these stable cells, providing orthogonal confirmation that growth effects caused by knockdown of the three candidate genes, CETN2, NEIL2, and ERCC1, are specific and causal.
Next, dysregulation of these candidates at either the mutational or RNA level was assessed across 7 ER+ PDXs (BCaPE; ref. 43). One of the 7 lines had strong downregulation of NEIL2 expression, and two other lines exhibited downregulation of ERCC1. In addition, one line had downregulation of PMS2 (MutL−). One of the lines with low ERCC1 RNA also harbored a missense mutation in MLH1, suggesting a compound phenotype. All four of these lines (designated CENMP− because of disruption of CETN2, ERCC1, NEIL2, or a MutL component) exhibited significantly higher tumor viability after treatment with the AI anastrozole when compared with the other three tumors, designated CENMP+, in this PDX cohort (P = 0.02; Fig. 5D).
To test whether the loss of proliferative inhibition to ET observed in human tumors (Fig. 2) was reproducible in experimental systems, Ki67 levels were also assessed before and after fulvestrant treatment in siCETN2, siNEIL2 and siERCC1 MCF7 cells relative to siScr control. Inhibition of gene expression from any of these three candidates resulted in a profound lack of Ki67 inhibition in response to fulvestrant treatment, unlike control cells (Supplementary Fig. S11A), reproducing observations in clinical trial samples. Earlier studies indicated that MutL-defective ER+ breast cancer cells exhibit altered proliferative response to ET due to dysregulation of G1–S cell-cycle transition (12). To test whether the candidate DDR genes identified in this screen also participate in the regulation of G1–S transition, their gene expression pattern was analyzed across the cell cycle after a double thymidine block (www.dnarepairgenes.com; ref. 44). All three candidates, as well as NHEJ genes, had maximal gene expression specifically in G1 or around the G1–S transition point, similarly to MLH1 (Fig. 5E). On the other hand, FA gene expression was maximal in late S phase and HR gene expression in late G2 (Fig. 5E). These observations are consistent with published data (45, 46) and indicate a common role for the candidate endocrine resistance DDR genes identified here in G1–S transition. As in the case of MutL-defective tumors, CEN/PRKDC- ER+ patient tumors from TCGA also had significantly increased RNA levels of CDK4, the principal G1 cyclin-dependent kinase (Supplementary Fig. S11B) and protein levels of PCNA, a marker of successful S-phase transition, relative to CEN/PRKDC+ tumors (Supplementary Fig. S11C). Increased PCNA positivity was also confirmed in MCF7 cells with stable knockdown of CETN2, NEIL2, ERCC1, and MLH1 after fulvestrant treatment relative to control cells (Supplementary Fig. S11D).
To test ER regulation as an alternative mechanism uniting these candidate genes in their ability to cause ET-resistant growth, correlation of gene expression of each candidate gene was tested against ESR1/PGR expression in patient tumors from NeoAI (Supplementary Fig. S12). Partial correlation was observed between ESR1/PGR levels and CETN2 (R = 0.36/0.2), but not for the other two genes. A ChIP-seq dataset (47, 48) identified an ESR1 binding peak close to the CETN2 promoter but not close to the promoters of the other two candidates. Therefore, although ER-mediated regulation of some DDR genes cannot be ruled out as one mechanism underlying relationships with ET resistance, it is unlikely that it constitutes a common underlying mechanism for all the DDR candidate genes studied herein. Together, these data suggest that CEN− ER+ breast cancer cells, akin to MutL− cells, enable unchecked CDK4 activity, resulting in rapid G1–S transition even in the presence of ET.
To directly test whether inhibition of CDK4/6 can inhibit proliferation in CEN− ER+ breast cancer cells, MCF7 cells with stable knockdown of CETN2, NEIL2, or ERCC1 were exposed to the CDK4/6 inhibitors, palbociclib and abemaciclib. Control MCF7 cells demonstrated comparable sensitivity to both fulvestrant and CDK4/6 inhibitors, palbociclib (Fig. 5F) and abemaciclib (Supplementary Fig. S11E), in keeping with published reports (12). However, downregulation of any one of the three candidate genes in MCF7 cells induced resistance to fulvestrant, but persistent sensitivity to both palbociclib (Fig. 5F) and abemaciclib (Supplementary Fig. S11E). These data provide preliminary support for a role for DDR dysregulation in predicting ET resistance and sensitivity to CDK4/6 inhibitors.
Predictive value of candidate DDR gene dysregulation in ER+ breast cancer
To estimate the impact of DDR dysregulation as a novel class of ET resistance driver and a predictive marker for ET failure, the cumulative frequency of dysregulation, i.e., multiple or cooccurring downregulation of 3 of the 4 novel candidate genes discovered in this analysis, CETN2, NEIL2, ERCC1, mutation or copy number loss of the fourth candidate gene, PRKDC, and downregulation of the two previously known candidate genes, MLH1 and PMS2, was assessed in METABRIC (Fig. 6A) and TCGA (Fig. 6B). In both datasets, downregulation of one or a combination of these genes occurred in 40%–60% of tumors from patients with ER+ breast cancer who died within 5 years of diagnosis. A less significant enrichment for dysregulation of these genes was observed in patients with ER+ breast cancer who died more than 5 years after diagnosis, suggesting that downregulation of these genes predisposes ER+ breast cancer to early ET failure consistent with intrinsic resistance.
To identify a DDR-low signature in patients with ER+ breast cancer, a gene expression score was defined using mean normalized expression of each gene. The score was significantly lower in resistant tumors from NeoAI when compared against sensitive counterparts (P = 0.002; Fig. 6C). While this indicates that the score associates with ET resistance in patient tumors, the sensitivity of the score is approximately 70% and the specificity approximately 68%, indicating potential for further refinement of the signature by inclusion of other known factors, and mutational or copy number data.
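The scoring and the sensitivity/specificity calculation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the text says only that the score uses the mean normalized expression of each gene, so the per-gene z-score normalization, the function names `ddr_scores` and `sensitivity_specificity`, the cutoff, and the toy expression values are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def ddr_scores(expr):
    """Return one score per tumor: the mean of per-gene z-scores.

    expr maps gene name -> list of expression values, one per tumor,
    with tumors in the same order for every gene.
    Z-scoring is an assumed normalization, not taken from the paper.
    """
    n_tumors = len(next(iter(expr.values())))
    z = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        # A gene with no variance contributes 0 to every tumor's score.
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return [mean(z[g][i] for g in z) for i in range(n_tumors)]

def sensitivity_specificity(scores, resistant, cutoff):
    """Call a tumor 'predicted resistant' when its score is below cutoff.

    resistant is a list of booleans (True = observed ET resistance).
    Returns (sensitivity, specificity).
    """
    pairs = list(zip(scores, resistant))
    tp = sum(1 for s, r in pairs if s < cutoff and r)
    fn = sum(1 for s, r in pairs if s >= cutoff and r)
    tn = sum(1 for s, r in pairs if s >= cutoff and not r)
    fp = sum(1 for s, r in pairs if s < cutoff and not r)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data (hypothetical values, four tumors, three genes).
toy_expr = {
    "CETN2": [1.0, 2.0, 3.0, 4.0],
    "NEIL2": [1.0, 2.0, 3.0, 4.0],
    "ERCC1": [1.0, 2.0, 3.0, 4.0],
}
toy_scores = ddr_scores(toy_expr)
toy_resistant = [True, True, False, False]
sens, spec = sensitivity_specificity(toy_scores, toy_resistant, cutoff=0.0)
```

In this toy case the two low-scoring tumors are the resistant ones, so the cutoff of 0 separates the groups perfectly. Real data would give intermediate values such as the approximately 70% sensitivity and 68% specificity reported above, and the choice of cutoff would trade one against the other.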
Using this signature, the lowest and highest scoring quintiles of ER+ tumors were identified in METABRIC and Loi. The lowest scoring quintile associated with poor disease-specific and recurrence-free survival of patients with ER+ tumors in METABRIC (P < 0.001; Fig. 6D) and Loi (P = 0.09; Fig. 6E), indicating the feasibility of using this score to predict short-term outcomes in patient cohorts. Of note, this analysis also demonstrated better survival of patients in the upper quintile of the DDR signature score, suggesting dual validity of the score in predicting both worse and better response to ET.
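The quintile stratification step can be sketched in Python as follows. This is a simple rank-based assignment under stated assumptions: the function name `quintile_groups`, the label strings, and the example scores are hypothetical, ties are broken by input order, and the survival comparison between groups is not shown.

```python
def quintile_groups(scores):
    """Label each tumor as 'lowest', 'highest', or 'middle' by score rank.

    The bottom fifth of tumors (by score) is labeled 'lowest' and the
    top fifth 'highest'; group size is floor(n / 5).
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # indices, ascending score
    k = n // 5
    labels = ["middle"] * n
    for i in order[:k]:
        labels[i] = "lowest"
    for i in order[n - k:]:
        labels[i] = "highest"
    return labels

# Hypothetical signature scores for ten tumors.
example_scores = [0.5, -1.2, 0.1, 2.0, -0.3, 0.9, -2.1, 1.4, 0.0, -0.7]
example_labels = quintile_groups(example_scores)
```

With ten tumors, each extreme quintile contains two tumors. The resulting labels could then be passed to whatever survival analysis is used to compare the groups.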
Discussion
This study presents a comprehensive characterization of the molecular landscape of canonical DNA repair pathway defects in ER+ breast cancer as it relates to response to ET. A previous epidemiologic study examining a selected subset of BER proteins using IHC identified XRCC1, APE1, SMUG1, and FEN1 as associating with ER+ breast cancer–specific survival (49). However, that study did not investigate a role for NEIL2, the only BER gene that was identified here. It is also noteworthy that the screening strategy outlined here did not identify XRCC1 or SMUG1 loss as associating with ET resistance, but this may be due to the stringent criteria required for a positive finding in our screening approach, that is, independent validation in three datasets. We are unaware of other studies reporting a role for loss of any DDR candidates identified here in ET response. This is of specific interest given the publication of many large genome-wide studies of breast cancer, which were extremely valuable in our analyses. Although other studies using these large datasets did not identify a role for DDR loss in endocrine resistance, this is likely because in most datasets the diagnosis of endocrine resistance is based on relapse and death from disease, a phenotype that is highly dependent on the quality of follow-up. The use of neoadjuvant datasets in our analysis not only specifically addresses the intrinsic ET phenotype based on the highly prognostic nature of on-treatment Ki67 values, but also demonstrates that etiologic diagnoses relating to endocrine resistance can be made very early in the course of the disease, enabling interventions to address adverse biology early enough to improve overall outcomes.
The identification of DDR defects as regulators of response to ET also provides fundamental insights into the etiology of ER+ breast cancer. Previous studies have identified lower incidence of structural rearrangements in ER+ breast tumors when compared with either ER− or HER2+ tumors (50). Simultaneously, whole-exome sequencing identified a subset of ER+ tumors with high somatic mutation load as associating with poor survival, whereas high mutation load in ER− tumors trended toward an association with better patient survival (42). The ability of ER+ breast cancer cells to grow in the presence of SSBR defects may reflect the evolutionary context of normal ER+ mammary cells, which are primed for sudden and rapid bursts of proliferation, associated with downregulation of many SSBR pathways (14). In contrast, ER+ mammary cells may find it more difficult to tolerate large genomic rearrangements, commonly associated with DSBR defects, as this is not part of their etiology. Further analysis of the unique role of NHEJ loss in endocrine treatment response is warranted.
In terms of alternative therapeutic strategies for patients with CENMP− ER+ breast cancer, this study provides some preliminary but potentially important associations that warrant deeper investigation. MutL-defective, ET-resistant, ER+ breast cancer cells and tumors are sensitive to CDK4/6 inhibitors (12), currently in clinical use in advanced disease settings. Preliminary functional investigations presented herein extend these observations, indicating that a common mechanism underlying endocrine resistance caused by disruption of multiple DDR candidate genes from different pathways can generate a disconnect between ER and CDK4/6 that is targetable with CDK4/6 inhibition.
The CEN score, which takes into account MMR, BER, and NER pathway genes, is a new starting point for identifying patients who are not likely to respond to ET and will require alternative treatments, potentially including CDK4/6 inhibition. However, sophisticated algorithms and inclusion of additional DDR genes and other known factors that regulate ET response will be necessary to improve the sensitivity and specificity of this signature, particularly for the prediction of ET response. The ultimate validation of our hypotheses awaits results from the many adjuvant CDK4/6 inhibitor trials that are ongoing.
In summary, the results of this study most clearly identify single-strand DNA damage repair defects as a novel class of ET resistance drivers that may contribute to perhaps half of ER+ breast cancer patient deaths within the first 5 years after diagnosis. Detailed mechanistic studies focused on dysregulation of identified DDR components are ongoing to facilitate a better understanding of the fundamental connections between the ER, CDK4/6, and DNA repair pathways to further refine the therapeutic approach that should be offered to these patients.
Disclosure of Potential Conflicts of Interest
M. J. Ellis is an employee of and holds ownership interest (including patents) in Bioclassifier LLC, and is a consultant/advisory board member for NanoString, Pfizer, Novartis, and AstraZeneca. No potential conflicts of interest were disclosed by the other authors.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
Authors' Contributions
Conception and design: M. Anurag, M.J. Ellis, S. Haricharan
Development of methodology: M. Anurag, M.N. Bainbridge, M.J. Ellis, S. Haricharan
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): N. Punturi, M.J. Ellis, S. Haricharan
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): M. Anurag, M.N. Bainbridge, M.J. Ellis, S. Haricharan
Writing, review, and/or revision of the manuscript: M. Anurag, M.N. Bainbridge, M.J. Ellis, S. Haricharan
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M. Anurag, N. Punturi, J. Hoog, M.J. Ellis
Study supervision: M.N. Bainbridge, M.J. Ellis, S. Haricharan
Acknowledgments
The authors would like to acknowledge Mr. Jonathan T. Lei and Drs. Shyam Kavuri and Eric Chang for scientific input.
Research reported in this publication was primarily supported by a Susan G. Komen Promise grant (PG12220321; to M.J. Ellis), a Cancer Prevention and Research Institute of Texas (CPRIT) Recruitment of Established Investigators award (RR140033; to M.J. Ellis), and a Laura Ziskin award from Stand Up2 Cancer (to M.J. Ellis). Clinical trial data accrual and analysis were supported by the National Cancer Institute of the NIH under Award Numbers U10CA180821 and U10CA180882 (to the Alliance for Clinical Trials in Oncology), U10CA077440 (legacy), U10CA180833, and U10CA180858. The NeoPalAna trial was supported by Pfizer Pharmaceuticals and a Susan G. Komen Promise Grant (to M.J. Ellis).