Breast cancers are divided into subtypes with different prognoses and treatment responses based on global differences in gene expression. Luminal breast cancer gene expression and proliferation are driven by estrogen receptor alpha, and targeting this transcription factor is the most effective therapy for this subtype. By contrast, it remains unclear which transcription factors drive the gene expression signature that defines basal-like triple-negative breast cancer, and there are no targeted therapies approved to treat this aggressive subtype. In this study, we utilized integrated genomic analysis of DNA methylation, chromatin accessibility, transcription factor binding, and gene expression in large collections of breast cancer cell lines and patient tumors to identify transcription factors responsible for the basal-like gene expression program. Glucocorticoid receptor (GR) and STAT3 bind to the same genomic regulatory regions, which were specifically open and unmethylated in basal-like breast cancer. These transcription factors cooperated to regulate expression of hundreds of genes in the basal-like gene expression signature, which were associated with poor prognosis. Combination treatment with small-molecule inhibitors of both transcription factors resulted in synergistic decreases in cell growth in cell lines and patient-derived organoid models. This study demonstrates that GR and STAT3 cooperate to regulate the basal-like breast cancer gene expression program and provides the basis for improved therapy for basal-like triple-negative breast cancer through rational combination of STAT3 and GR inhibitors.
This study demonstrates that GR and STAT3 cooperate to activate the canonical gene expression signature of basal-like triple-negative breast cancer and that combination treatment with STAT3 and GR inhibitors could provide synergistic therapeutic efficacy.
Breast cancers can be divided into subtypes with different prognoses and treatment responses based on global gene expression differences (1). These expression differences reflect the transcription factors (TF), epigenetic states, and gene regulatory networks in the normal cells from which the tumors arise, as well as changes that accumulate during tumorigenesis.
Normal luminal breast cells proliferate in response to estrogen through activation of the TF estrogen receptor (ER)α and ER's downstream target genes. Breast cancers that express ER are most often classified into the luminal subtype (2). These luminal ER+ tumors are treated with hormone therapies that prevent ER from activating its proproliferative transcriptional program. Even though these tumors usually have wild-type ER, this is an extremely effective targeted therapy because it inhibits the master regulator of gene expression and proliferation in this cell type.
In contrast, 15% to 20% of patients with breast cancer are diagnosed with triple-negative breast cancer (TNBC), which is defined by a lack of expression of ER and progesterone receptor (PR), and normal expression of human epidermal growth factor receptor 2. There are no FDA-approved targeted therapies to treat TNBC. Even after aggressive treatment with surgery, radiotherapy, and cytotoxic chemotherapy, patients with TNBC have a high rate of recurrence within 3 years (3) and a high rate of death from disease (38%; ref. 4). There is a crucial need for alternative therapeutic strategies for TNBC. TNBC is a molecularly heterogeneous disease, and numerous previous studies have found subsets of TNBC tumors with distinct molecular characteristics. These subtypes include tumors with features that are associated with the epithelial-to-mesenchymal transition, which is often referred to as claudin-low or mesenchymal subtype (5–7), and an apocrine/luminal signature driven by androgen receptor (AR) instead of ER, now commonly known as the luminal androgen receptor subtype (LAR), which responds to antiandrogen therapy (8). The most broadly encompassing signature of TNBC that distinguishes it from ER-driven luminal breast cancer is the basal-like subtype (9). Approximately 70% to 80% of TNBC tumors are classified as basal-like breast cancer using PAM50 subtyping (9). Thousands of genes are differentially expressed between basal-like and luminal breast cancer, and the genes upregulated in basal-like breast cancer reflect the rapid growth and poor prognosis that are the hallmarks of TNBC (1, 5). It is unclear which TFs are responsible for regulating the proproliferative transcriptional program in the basal-like subtype. The goal of this study is to identify TFs that regulate clinically relevant gene regulatory programs in basal-like breast cancer, and determine if inhibiting these TFs could provide novel therapeutic strategies to improve clinical outcomes for patients with basal-like breast cancer.
The roles of many TFs have been studied in basal-like breast cancer (10, 11). Many of these TFs regulate important pathways in basal-like breast cancer, and these TFs often regulate each other's expression. Several of these TFs can be inhibited with small-molecule drugs and represent promising therapeutic targets (12). What remains unclear is which TFs are responsible for driving the gene expression signature that defines the basal-like subtype and which are downstream of the master regulator.
This study utilized integrated analysis of DNA methylation, chromatin accessibility, TF binding, gene expression, and cell growth in large collections of breast cancer cell lines and patient tumors to identify TFs that drive the basal-like gene expression program. We sought to identify regulatory regions that are specifically unmethylated and have accessible chromatin in basal-like compared with luminal breast cancers. We identified TFs whose motifs and binding sites are most highly enriched at these basal-specific regulatory regions. We then investigated whether these TFs were more highly expressed in basal-like compared with luminal breast cancer, whether the TFs regulated genes in the basal-like gene expression program, and whether the genes regulated by these TFs were associated with patient outcomes. This genome-wide approach led to the discovery that cooperation between two TFs [STAT3 and glucocorticoid receptor (GR)], rather than a single “master regulator,” drives expression of hundreds of genes in the basal-like gene expression program. This TF cooperation provides basal-like cells with their resilient growth capacity. Furthermore, inhibiting both TFs simultaneously led to synergistic decreases in cell growth in basal-like cell lines and patient-derived organoid models, which suggests a new treatment strategy for basal-like breast cancer.
Materials and Methods
The following basal-like breast cancer cell lines were used for this study: HCC1569, HCC1954, BT-20, HCC1143, HCC1187, HCC1599, HCC1937, HCC70, MDA-MB-468, 2-LMP, BT-549, HCC38, MDA-MB-157, MDA-MB-231, MDA-MB-436, SUM102, SUM149, and SUM159.
The following luminal breast cancer cell lines were used for this study: DY36T2, ZR-75–30, MDA-MB-453, SK-BR-3, BT-474, MDA-MB-361, MCF-7, MDA-MB-134, T-47D, and ZR-75–1.
Cells were obtained and cultured as described previously (13). Cell lines passaged for more than 3 months were authenticated every 6 months using short tandem repeat testing service offered through the University of Utah DNA Sequencing Core Facility.
Reduced representation bisulfite sequencing
Reduced representation bisulfite sequencing (RRBS) and primary analysis of CG methylation were performed on the panel of cell lines using the method described previously (14). CG positions with at least 10x coverage in at least 80% of the cell lines were used for subsequent analysis. To identify CGs that were differentially methylated between basal-like and luminal cell lines, linear regression was performed in the R software package version 3.5.0 using the lm function. P values were adjusted for multiple hypothesis testing using the p.adjust function in R and the Benjamini–Hochberg method. CGs with adjusted P values less than 0.05 were visualized in R using the pheatmap package version 1.0.10. RRBS data are publicly available through NCBI Gene Expression Omnibus (GEO) accession number GSE152202.
Enrichment of ENCODE TF binding sites
A file named wgEncodeRegTfbsClusteredV2.bed.gz containing TF-binding sites from chromatin immunoprecipitation sequencing (ChIP-seq) experiments performed on 149 TFs by the ENCODE Project (15) in a variety of cell lines was downloaded from the UCSC Genome Browser (16). Bedtools intersect (17) was used to determine which CG positions in the RRBS dataset overlapped binding site positions for each TF in the file. The number of overlaps with basal-like and luminal-specific CGs was normalized to the number of overlaps to all CGs in RRBS to calculate a fold-enrichment.
Analysis of The Cancer Genome Atlas RNA sequencing and ATAC-seq
RNA sequencing (RNA-seq) counts file for The Cancer Genome Atlas (TCGA) Breast Cancer samples was downloaded from the UCSC Xena browser (18, 19). These files contained log counts, so they were transformed to counts in R version 3.5.0. DESeq2 version 1.20.0 (20) was used to identify genes that were differentially expressed between basal-like and luminal tumors using the PAM50 Subtype classification provided by TCGA and available for download through the UCSC Xena browser (18, 19). Basal-specific genes were identified using an adjusted P value < 0.05 and coefficients less than 0. Luminal-specific genes were identified using a P value < 0.05 and coefficients greater than 0.
Assay for transposase accessible chromatin with high-throughput sequencing (ATAC-seq) data from TCGA breast tumors were downloaded as counts files and bigwig files from the Supplementary Materials provided in the recent publication by Corces and colleagues (21). DESeq2 version 1.20.0 (20) was used to identify peaks that were differentially accessible between basal-like and luminal tumors using the PAM50 Subtype classification provided by TCGA and available for download through the UCSC Xena browser (18, 19). Pheatmap package version 1.0.10 was used to visualize basal-specific peaks that were identified using an adjusted P value < 0.05 and coefficients less than 0. The ATAC-seq peaks specifically open in basal-like tumors were analyzed for enrichment of motifs for JUN (MA0491.1), STAT3 (MA0144.1), and GR(MA0113.2) downloaded from JASPAR2018 (22). AME (23) was used to perform the analysis of sequences in ATAC-seq peaks compared with a primary sequence shuffled conserving 2-mer frequencies, and P values were calculated using the Fisher exact test. MEME-ChIP (24) was used to perform motif discovery on the 14,094 most significant peaks (DESeq2 adjusted P < 5 × 10−11) that were specifically accessible in basal-like tumors using the human HOCOMOCOv11 database with default options.
To analyze ATAC-seq signal from TCGA breast tumors over the basal-specific GR and STAT3 shared binding sites, the UCSC hgLiftOver was used to convert the GR and STAT3 shared sites bed file to the hg38 genome build to be compatible with TCGA ATAC-seq alignments. Deeptools version 3.1.0 (25) computeMatrix was used to generate a matrix of counts over the GR and STAT3 shared sites for each sample. The mean accessibility was computed across the regions for each sample using R version 3.5.0. The Mann–Whitney test was used to determine if the average normalized ATAC-seq read depth across regions was significantly different between basal-like and luminal tumors. The mean and SD of the average normalized depth across basal-like and luminal samples were graphed using GraphPad Prism 7.0c.
Violin plots for averaged bigwig read counts from basal-like and luminal tumors at GR and STAT3 shared sites were created with vioplot_0.3.2, using default parameters and areaEqual = T, h = 0.04, wex = 0.9.
ChIP-seq for GR and STAT3 was performed on 4 basal-like cell lines (SUM159, MDA-MB-231, HCC1937, and HCC70) and 4 luminal cell lines (MDA-MB-361, BT-474, MDA-MB-453, and MCF-7).
For ChIP experiments, protein–DNA complexes were covalently cross-linked by incubating cells in 1% formaldehyde for 10 minutes at room temperature. Cells were incubated with 0.125 mol/L glycine for 5 minutes to quench cross-linking reaction. Cells were washed and scraped with PBS (pH 7.4; Lonza). Cells were lysed with Farnham Lysis Buffer (5 mmol/L PIPES at pH 8.0, 85 mmol/L KCl, and 0.5% NP-40) containing protease inhibitor (Roche). Cell lysate was centrifuged at 2,000 rpm for 5 minutes at 4°C. The crude nuclear extract contained in the supernatant was stored at -80°C. ChIP-seq was performed using antibodies for GR (sc-1003, Santa Cruz Biotechnology) and STAT3 (sc-482, Santa Cruz Biotechnology). A thorough version of the ChIP-seq protocol used in this study is available on the ENCODE Project website: https://www.encodeproject.org/documents/df9dd0ec-c1cf-4391-a745-a933ab1af7a7/@@download/attachment/Myers_Lab_ChIP-seq_Protocol_v042211.pdf
STAT3 and GR ChIP-seq datasets have been deposited in the NCBI GEO accession numbers GSE85579 and GSE152203.
Fastq files from ChIP-seq were aligned to the hg19 build of the human genome using Bowtie with the following parameters: -m 1 -t –best -q -S -l 32 -e 80 -n 2. ChIP-seq peaks were identified by comparing GR ChIP-seq in cells induced with dexamethasone with GR ChIP-seq in cells treated with ethanol, and STAT3 ChIP-seq to input control libraries. Peaks were called using Model-Based Analysis of ChIP-seq-2 (MACS2; ref. 26) with a P value cutoff of 1e-10, and the mfold parameter constrained between 15 and 100. Bedtools merge (17) was used to merge bed files of MACS2 narrow peak calls from each of the ChIP-seq experiments. Bedtools coverageBed (17) was used to extract read counts under each merged peak in each ChIP-seq experiment in each cell line. DESeq2 version 1.20.0 (20) was used to identify peaks with significantly different read depth between basal-like and luminal cell lines (adjusted P value < 0.05), and significantly different read depth between GR and STAT3 ChIP-seq experiments (adjusted P value < 0.05). A multivariate model was used to identify shared GR and STAT3 peaks whose read depth was significantly different between subtypes in both GR and STAT3 ChIP-seq experiments (adjusted P value < 0.05), but not significantly different between GR and STAT3 in the same subtype (adjusted P value > 0.05). Pheatmap package version 1.0.10 and Deeptools version 3.1.0 (25) computeMatrix and plotHeatmap functions were used to create the heatmaps of ChIP-seq data.
RNA-seq of the panel of cell lines was described previously (27), and the data are publicly available through NCBI GEO accession number GSE58135.
RNA-seq was also performed on MDA-MB-231 and HCC70 cell lines treated alone or in combination with dexamethasone, STAT3 siRNA, and SH4–54 as follows:
STAT3 siRNA knockdown utilized the ON-TARGETplus Human STAT3 siRNA kit from GE Healthcare (L-003544–00–0005). This SMARTpool siRNA contains four pooled siRNAs, each targeting a separate region of the STAT3 RNA sequence. ON-TARGETplus Non-targeting siRNA #1 (D-001810–01–05) is a nontargeting control. The siRNA SMARTpool and Nontargeting siRNA target sequences are below:
ON-TARGETplus SMARTpool siRNA J-003544–07, STAT3: GAGAUUGACCAGCAGUAUA
ON-TARGETplus SMARTpool siRNA J-003544–08, STAT3: CAACAUGUCAUUUGCUGAA
ON-TARGETplus SMARTpool siRNA J-003544–09, STAT3: CCAACAAUCCCAAGAAUGU
ON-TARGETplus SMARTpool siRNA J-003544–10, STAT3: CAACAGAUUGCCUGCAUUG
ON-TARGETplus Non-targeting siRNA #1: UGGUUUACAUGUCGACUAA
The siRNA transfection experiments were performed in 6-well plates in triplicate. Lipofectamine RNAiMAX Transfection Reagent (Thermo Fisher Scientific) was used per the manufacturer's instructions. To each well containing cells, 250 μL siRNA-transfection reagent mix was added to each well, for a final concentration of 25 pmol siRNA in 7.5 μL Lipofectamine RNAiMAX reagent per well.
The dexamethasone treatments were performed in 6-well plates in triplicate. Cell were treated with 100 nmol/L dexamethasone (D4902, Sigma Aldrich) or an equal volume of 100% molecular biology grade ethanol vehicle control (E7023, Sigma Aldrich) for 4 hours. Cells were lysed with Buffer RL (Norgen Biotek) containing 10% beta-mercaptoethanol. Total RNA was extracted using the Animal Tissue RNA Purification Kit (Norgen Biotek).
RNA-seq libraries for the STAT3 siRNA and Dexamethasone treatment were prepared from 250 ng total RNA via polyA-selection (Dynabead mRNA Purification Kit, Invitrogen) followed by transposase-mediated nonstranded library construction (28). Libraries were pooled and sequenced on an Illumina HiSeq 2000 or HiSeq 2500 sequencer using paired-end 50 bp reads with a 6 bp index read. TopHat v1.4.1 was used to align RNA-seq paired reads to GENCODE version 9. Cufflinks v1.3.0 and BEDTools were used to calculate raw counts for each GENCODE transcript.
DESeq2 version 1.20.0 (20) was used to identify genes with significantly different expression between treatment and control conditions in each cell line separately. To visualize differential expression of genes across cell lines with different base-line expression, gene expression was divided by the mean in the control sample, and normalized values were displayed in a heatmap using Pheatmap package version 1.0.10 in R.
For the GR and STAT3 interaction studies, MDA-MB-231 cells were plated in 6-well plates at an approximately 300,000 cells in 4 mL of media per well and grown for 24 hours. Three replicates wells were treated for 24 hours with 1 μmol/L of Dexamethasone (DEX), 8 μmol/L SH4–54, no treatment, or combination of DEX and SH4–54. The cells were lysed using 350 μL of Qiagen RLT buffer and 1% BME. RNA was extracted using Norgen Animal Tissue RNA Purification Kit according to the manufacturer's instructions. RNA-seq libraries were prepared using Illumina TruSeq Stranded RNA Kit with Ribo-Zero Gold and sequenced on the Illumina HiSeq 2500 as 50 cycle single-end reads. Reads were mapped using HISAT2 (29) version 2.1.0 with parameters –dta -p 28 -t -q -U and sorted with samtools (30). A counts file was generated from the bam files using featureCounts version 1.5.1 (31).
To identify genes that were additively and synergistically regulated by both GR and STAT3, the linear model (lm) function in R was used. DESeq2-normalized counts were fit to a multivariate model that included a term for GR activity, a term for STAT3 activity, and an interaction term (GR:STAT3). Coefficients and P values for each gene were used to identify those with additive and synergistic positive and negative regulation by GR and STAT3. Synergy was defined as a significant P value (<0.05) for the interaction term, and nonsignificant P values (>0.05) for the individual GR and STAT3 terms. Additivity was defined as a nonsignificant P value for the interaction term and significant P values for both of the individual GR and STAT3 terms with coefficients in the same direction. Coefficients from the linear model were used to determine whether the interactions between GR and STAT3 led to increased or decreased gene expression.
RNA-seq data from STAT3 siRNA treatments are publicly available through NCBI GEO accession number: GSE85579.
RNA-seq data from dexamethasone and SH4-54 treatments are publicly available through NCBI GEO accession numbers: GSE152201 and GSE137535.
The Kaplan–Meier plotter tool (http://kmplot.com; ref. 32) was used to analyze the association between gene expression and relapse-free survival. A Kaplan–Meier survival plot was generated using public gene expression microarray data from patient tumors for whom relapse-free survival data were available. The intrinsic subtype classification provided by the Kaplan–Meier plotter tool was used to select basal-like and luminal cases for analysis. The analysis was performed for GR, STAT3, and 438 basal-like genes that are upregulated by GR and STAT3. The following selections were applied to all analyses: only one JetSet best probe for each gene was used, multigene signatures used the mean expression of the selected probes, relapse-free survival was selected for the analysis, patients were censored at the follow-up threshold (60 or 120 months), biased arrays were excluded, and redundant samples were removed. The most significant cutpoint was used to split patients into two groups (“autoselect best cutoff” option).
Live imaging of cell growth
Assays were performed using the IncuCyte ZOOM live cell imaging system (Essen BioSciences). MDA-MB-231 and MDA-MB-436 cells were plated at 20,000 cells per well in 96-well plates. For titration experiments, cells were treated at a range of doses of SH4–54 between 0 μmol/L and 11 μmol/L of SH4–54, or at a range of doses of dexamethasone between 0 μmol/L and 2 μmol/L. For combination treatments, cells were treated with 8 μmol/L SH4–54 and 1 μmol/L dexamethasone, alone and in combination. Plates were imaged every 2 hours for 72 hours, and cell confluence was calculated by the IncuCyte ZOOM software. Relative percent confluence was calculated by subtracting the percent confluence at each timepoint by the starting percent confluence at the first timepoint.
Patient-derived organoid growth assay
Two patient-derived organoid cultures previously derived from primary basal-like TNBC tumors (HCI-001 and HCI-002) were utilized for this experiment (33). The organoids were cultured in media composed of Advanced DMEM/F12, (Thermo Fisher 12634028), 5% FBS Thermo Fisher 26140079), 1x Hepes (Thermo Fisher 15630080), 1× GlutaMAX Supplement (Thermo Fisher 35050061), 1 μg/mL hydrocortisone (Sigma Aldrich H0888), 10 ng/mL hEGF (Sigma Aldrich E9644), 50 μg/mL gentamicin (Genesee 25–533), and 10 μmol/L Y-27632 (Selleckchem S1049). Organoids were seeded as 50–100 organoids in 5% Matrigel (Corning 354230) per well of a 384-well plate, as described previously (33). MDA-MB-231 cell lines were included as a positive control in this experiment and plated at 500 cells per well in a 384-well plate. Cells were treated with all possible combinations of titrated doses of SH4–54 (0, 1, 2.1, 3.2, 4.3, 5.4, 6.5, 7.6, 8.8, 9.9, and 11 μmol/L) and mifepristone (0, 0.1, 1.42, 2.73, 4.05, 5.37, 6.68, and 8 μmol/L) in combination. Cell viability was measured relative to untreated cells after 72 hours using the CellTiter Glo 3D (Promega) that measures ATP. Loewe additivity analysis was used to compute synergy scores (34).
Identification of regulatory regions and candidate TFs specific to basal-like breast cancer
We hypothesized that genomic regulatory regions that are specifically unmethylated in basal-like breast cancer compared with luminal breast cancer will be enriched for TF-binding sites involved in controlling basal-specific gene expression. To identify regulatory regions specific to basal-like breast cancer, RRBS was performed on 28 breast cancer cell lines in order to measure DNA methylation across the genome. The cell lines include 18 cell lines that were previously classified as basal-like, and 10 cell lines that were previously classified as luminal by PAM50 subtyping (19, 35, 36). These 28 basal-like and luminal cell lines represent the heterogeneity found within TNBC, including cell lines previously classified as the mesenchymal and LAR subtypes (6), and cell lines with BRCA1 and BRCA2 mutations (Fig. 1A; refs. 6, 37). Of the 479,746 CG positions in the genome with at least 10x coverage in each cell line, 3,748 CGs were significantly differentially methylated (Linear Regression Benjamini–Hochberg adjusted P < 0.05) between basal-like and luminal cell lines. To focus our analysis on differential methylation in potential regulatory regions, rather than gene bodies, we identified 3,093 of these CG positions that were in promoter or intergenic regions of the genome (Fig. 1A). The DNA methylation differences were consistent across the luminal and basal-like cell lines, despite the molecular heterogeneity within these groups.
There were 1,300 CGs in intergenic or promoter regions of the genome that were significantly unmethylated in luminal breast cancer. These CG positions were intersected with the locations of TF-binding sites from ChIP-seq experiments performed on 149 TFs by the ENCODE Project in a variety of cell lines (15). To determine which TFs were enriched at regions that are specifically unmethylated in luminal breast cancer cells, the fraction of binding sites for each TF that intersect these CGs was divided by the fraction of binding sites for each TF in the ENCODE dataset (Supplementary Table S1). The most highly enriched sequence-specific TF-binding sites were ER (6.9-fold), FOXA1 (8.1-fold), and GATA3 (10.3-fold). This confirms that this approach is sound because ER, FOXA1, and GATA3 are known to be the master regulators of the luminal gene expression program (Fig. 1A; ref. 38).
There were 1,793 CGs that were in intergenic or promoter regions of the genome and were specifically unmethylated in basal-like breast cancer. These CG positions were intersected with the TF-binding sites locations from ChIP-seq experiments performed on 149 TFs by the ENCODE Project (15). The most highly enriched sequence-specific TFs in these regions were JUN and FOS TFs that heterodimerize to form the AP-1 complex (6.2-fold), STAT3 (4.8-fold), and GR (4.2-fold; Fig. 1A). This result led us to hypothesize that JUN, STAT3, and GR are involved in regulating basal-like breast cancer gene expression.
To further investigate this hypothesis, we sought to use primary human tumors rather than cell lines, to directly investigate chromatin accessibility with ATAC-seq rather than inferring it from a lack of DNA methylation, and use TF motif enrichment rather than the ENCODE Project ChIP-seq experiments, which did not include any basal-like breast cancer cell lines. We analyzed ATAC-seq data (21) from 59 primary breast tumors including 15 basal-like and 44 luminal that were previously classified by PAM50 subtyping (39). There were 110,156 regions of the genome that exhibit open chromatin specifically in basal-like tumors compared with luminal tumors (DESeq2 Adjusted P < 0.05; Fig. 1B). The chromatin accessibility differences were consistent across the basal-like and luminal tumors, despite the heterogeneity within these groups that included tumors previously classified as mesenchymal and LAR subtypes (40), varying clinical ER, PR, and HER2 status (39), varying AR expression, and germline BRCA1 and BRCA2 mutations (Fig. 1B; ref. 39). Motif enrichment analysis found significant enrichment for the canonical motifs for JUN/AP-1 (E-value = 1.59 × 10−1116), STAT3 (E-value = 7.13 × 10−373), and GR/PR/AR (E-value = 4.12 × 10−6) in the 110,156 chromatin regions that were specifically open in basal-like tumors and closed in luminal tumors (Fig. 1B). In addition, motif discovery was performed on the top 14,094 most significant (DESeq2 adjusted P < 5 × 10−11) chromatin regions specifically accessible in basal-like tumors, and motifs recognized by these TFs were significantly enriched (Supplementary Table S2). Although the enrichment for the GR motif is significant, it is lower than that of STAT3 and JUN/AP1, which is consistent with previous studies that suggest GR can be tethered to some enhancers through protein:protein interactions, rather than direct DNA binding (41).
Expression of GR and STAT3 is higher in basal-like breast cancer and is associated with a worse prognosis
If JUN, GR, and STAT3 regulate the basal-like breast cancer gene expression program, then one would expect these TFs to have higher expression in basal-like breast cancer compared with luminal breast cancer. RNA-seq was performed on the same 28 breast cancer cell lines used for DNA methylation analysis. Both GR (gene name: NR3C1) and STAT3 have significantly higher expression in basal-like cell lines compared with luminal cell lines, whereas JUN does not (Fig. 2A). An independent gene expression dataset containing 19 basal-like breast cancer cell lines and 21 luminal breast cancer cell lines (42) confirmed that both GR and STAT3 have higher expression in basal-like breast cancer (Fig. 2B). GR and STAT3 are equally highly expressed across basal-like tumors that were previously classified into TNBC subtypes (Supplementary Fig. S1A; ref. 40), and across tumors with and without BRCA1 and BRCA2 mutations (Supplementary Fig. S1B; ref. 39). These results confirm that GR and STAT3 have higher expression in basal-like breast cancer than luminal breast cancer.
Some luminal breast cancers do express low levels of GR and STAT3, so we investigated the association between prognosis and GR or STAT3 expression. Analysis of a dataset containing breast tumor gene expression and long-term survival data (32) revealed that higher expression of both GR and STAT3 was associated with shorter relapse-free survival in basal-like breast cancer (Fig. 2C). In contrast, luminal breast cancer patients with higher GR and STAT3 expression were more likely to have longer relapse-free survival (Fig. 2D). These results confirm that GR and STAT3 expression has a unique role in basal-like breast cancer and suggest that these TFs are associated with aggressive features of the disease.
GR and STAT3 bind to the same basal-specific regulatory regions across the genome
To directly test the hypothesis that GR and STAT3 bind to the genome and regulate the basal-like gene expression program, ChIP-seq was performed for GR and STAT3 in 8 breast cancer cell lines (4 basal-like, 4 luminal). GR is a steroid hormone nuclear receptor that is chaperoned in the cytoplasm until it binds glucocorticoid ligands, which allow it to homodimerize and translocate to the nucleus and regulate gene expression. To assess the binding of GR, cell lines were treated with 100 nmol/L dexamethasone for 1 hour prior to performing ChIP-seq. STAT3 is a TF whose dimerization and translocation to the nucleus are controlled through phosphorylation by Janus kinases (JAK). Normally, JAK phosphorylates STAT3 when cytokine receptors bind ligands such as interferons, epidermal growth factor, and IL5 and 6. STAT3 is constitutively phosphorylated and active in the majority of basal-like cell lines and patient tumors (43, 44). Thus, STAT3 ChIP-seq was performed on cells in standard media.
The genome-wide GR and STAT3 binding sites from ChIP-seq were used to perform unsupervised hierarchical clustering of the cell lines based on Spearman correlations. This analysis revealed that the basal-like and luminal subtypes have distinct genome-wide GR and STAT3 binding profiles (Fig. 3A). The basal-like cell lines form two distinct subclusters within the basal-like group, which corresponds to previous classifications as Basal A or Basal-like 1 and Basal-like 2 (HCC70 and HCC1937) and Basal B or mesenchymal stem-like (SUM159 and MDA-MB-231) subtypes (6, 45). Despite these differences, the four basal-like cell lines are still more similar to each other than they are to luminal cell lines when considering genome-wide GR and STAT3 binding. DESeq2 was used to identify regions of the genome that exhibited subtype-specific binding of GR. There were 2,002 regions of the genome that were bound by GR specifically in basal-like breast cancer (DESeq2 adjusted P < 0.05; Fig. 3B). The same analysis was repeated for STAT3 and 3,667 regions of the genome that were bound by STAT3 specifically in basal-like breast cancer (DESeq2 adjusted P < 0.05; Fig. 3C). This result demonstrates that both GR and STAT3 bind thousands of regions of the genome in basal-like breast cancer that they do not bind in luminal breast cancer, which supports the hypothesis that these TFs regulate the basal-like gene expression program.
The correlation heatmap revealed that genome-wide GR binding sites and STAT3 binding sites were highly correlated within each of the four basal-like cell lines. To investigate this further, DESeq2 was used to perform a multivariate analysis that combined both GR and STAT3 ChIP-seq datasets to identify sites where these TFs bound alone or together in a basal-like–specific manner. This analysis revealed 1,773 sites were bound by GR alone, 188 sites were bound by STAT3 alone, and 12,712 ChIP-seq peaks with both GR and STAT3 binding specifically in basal-like cell lines (Fig. 3D). This result indicates that GR and STAT3 bind the same basal-specific regulatory regions more often than they bind alone.
The majority of shared GR and STAT3 basal-like–specific binding sites are 50–500 kb from the nearest gene transcription start site, indicating that they are likely binding distal regulatory elements such as enhancers (Fig. 4A). Motif analysis was performed on the shared GR and STAT3 basal-like–specific binding sites. The canonical STAT3 motif was found in 5,044 of these peaks, resulting in an E-value of 4.97 × 10−425 compared with shuffled control sequences. The canonical GR motif was found in 1,309 of the peaks, resulting in an E-value of 8.00 × 10−37 compared with shuffled control sequences. This result indicates that GR and STAT3 shared binding sites are significantly enriched for motifs recognized by both TFs.
To test the hypothesis that shared GR and STAT3 basal-specific binding sites are near genes that are specifically expressed in basal-like breast cancer, RNA-seq data from 57 primary breast tumors (15 basal-like and 42 luminal) were analyzed (39). There were 7,121 genes that had significantly higher expression in basal-like breast tumors, and 3,464 genes that had significantly higher expression in luminal breast tumors (DESEQ adjusted P value 0.05). The distance between the transcription start site of each gene and the nearest shared GR and STAT3 binding site was calculated. Genes that had significantly higher expression in basal-like breast tumors were closer to shared GR and STAT3 basal-specific binding sites than genes that were higher in luminal tumors or genes that were not differentially expressed (Fig. 4B).
To further investigate whether the shared GR and STAT3 basal-specific binding sites discovered in cell lines are open and accessible in patient tumors, analysis of ATAC-seq data from 59 primary breast tumors (15 basal-like and 44 luminal) was performed (21). The shared GR and STAT3 basal-specific binding sites are significantly more open in basal-like patient tumors compared with luminal patient tumors (Mann–Whitney P = 7.375 × 10−231; Fig. 4C and D).
Together, these results indicate that GR and STAT3 bind together at thousands of regulatory regions that are specifically open in basal-like tumors and closed in luminal tumors, and these binding sites are near genes that have significantly higher expression in basal-like tumors. These data further support the hypothesis that GR and STAT3 regulate the basal-like gene expression program.
GR and STAT3 cooperatively regulate gene expression
To determine which genes are regulated by GR and STAT3, RNA-seq was performed on two basal-like cell lines (HCC70 and MDA-MB-231) treated with 100 nmol/L dexamethasone or an equal volume of 100% ethanol vehicle control for 4 hours, and STAT3 siRNA SMARTpool or Nontargeting siRNA for 96 hours. Dexamethasone induction of GR resulted in significant changes in the expression of 546 genes in both cell lines (DESeq2 adjusted P < 0.05; Fig. 5A). STAT3 siRNA knockdown resulted in significant changes in the expression of 1,967 genes in both cell lines (DESeq2 adjusted P < 0.05; Fig. 5B).
To test the hypothesis that genes that change expression are near the shared GR and STAT3 basal-like–specific binding sites, the distance between the transcription start site of each gene and the nearest shared GR and STAT3 binding site was calculated. Previous studies have reported these TFs are more likely to activate gene expression directly, and that associated repression is more likely a downstream indirect effect (46). Therefore, the genes that are activated and repressed by each TF were analyzed separately. This analysis revealed that a larger fraction of genes that are activated by GR or STAT3 are close to a shared site when compared with repressed genes, or genes that did not change expression when these TFs were modulated (Fig. 5C and D). Notably, 80% of genes that were activated by dexamethasone induction of GR and 60% of genes that are activated by STAT3 are within 100,000 bp of a shared GR and STAT3 binding site.
To further elucidate how GR and STAT3 cooperate to regulate gene expression, the MDA-MB-231 cell line was treated with dexamethasone (1 μmol/L for 24 hours) and the STAT3 inhibitor SH4–54 (8 μmol/L for 24 hours; ref. 47) alone and in combination. A multivariate linear model with a GR:STAT3 interaction term was used to identify genes that are differentially expressed when both TFs are fully active in the nucleus. There were 970 genes that had significantly higher expression when both GR and STAT3 were active in the nucleus (BH adjusted P value < 0.05; Fig. 6A). Strikingly, the majority of these genes (769) fit a synergistic model of activation, where the presence of both GR and STAT3 in the nucleus resulted in higher expression than the sum of the expression from GR and STAT3 alone. In addition, 201 genes fit an additive model, where GR and STAT3 each contribute additively to the expression. There were also 628 genes that had significantly lower expression when both GR and STAT3 were active in the nucleus (BH adjusted P value < 0.05; Fig. 6A). Again, the majority (515 genes) fit a synergistic model, where the presence of both GR and STAT3 in the nucleus resulted in greater repression than the sum of the repression from GR and STAT3 alone.
To determine if the genes that are cooperatively regulated by both GR and STAT3 are near shared GR and STAT3 basal-like–specific binding sites, the distance between the transcription start site of each gene and the nearest shared GR and STAT3 binding site was calculated. Consistent with previous observations, a larger fraction of genes that are cooperatively activated by GR and STAT3 are close to a shared site, when compared with repressed genes, or genes that did not change expression when these TFs were modulated (Fig. 6B).
Together, these results indicate that GR and STAT3 cooperate to synergistically regulate the expression of hundreds of genes.
GR and STAT3 regulate the basal-like gene expression program
To test the hypothesis that GR and STAT3 are key regulators of the basal-like gene expression program, RNA-seq data from 28 breast cancer cell lines (18 basal-like and 10 luminal) were analyzed to identify genes that exhibit subtype-specific expression. Of the 2,485 genes that had significantly higher expression in basal-like breast cancer, 438 were significantly upregulated in HCC70 and MDA-MB-231 when comparing dexamethasone induction with baseline media, and standard media with STAT3 siRNA or SH4–54 inhibition of STAT3 in the experiments described above. These 438 basal-like genes that are upregulated by GR and STAT3 are listed in Supplementary Table S3. These genes have a significant overlap with genes that were previously characterized as more highly expressed in basal-like breast cancer compared with luminal (Hypergeometric FDR adjusted q value = 2.6 × 10−95; ref. 48). This result confirms that the GR and STAT3 regulated genes identified in this analysis have been associated with basal-like breast cancer in previous studies.
Further characterization of the 438 basal-like genes upregulated by GR and STAT3 revealed enrichment for cellular processes and pathways associated with the aggressive nature of basal-like breast cancer. These genes were significantly enriched for the Gene Ontology Biological Process “Positive Regulation of Cell Proliferation” (Hypergeometric FDR adjusted q value 2.04 × 10−17), “hallmarks that define the epithelial-mesenchymal transition” (Hypergeometric FDR adjusted q value 7.22 × 10−15; ref. 49), “mammary stem cells” (Hypergeometric FDR adjusted q value = 1.04 × 10−22; ref. 50), and “adult tissue stem cells” (Hypergeometric FDR adjusted q value = 2.13 × 10−20; ref. 51).
Notably, of the 438 basal-like genes upregulated by GR and STAT3, 34 are TFs themselves, including those that are known to be upregulated by major signaling pathways involved in basal-like breast cancer growth including EGFR signaling (ELK3, ETS2, CEBPD, NFIL3, MBNL2, HIVEP2), and TGFB signaling (ELK3, ETS2, SNAI1, MAFB). In addition, the genes regulated by GR and/or STAT3 include 5 TFs (ETV6, NFIL3, HIF1A, DR1, and TFCP2L1) that were previously identified as “embryonic stem cell” TFs that are preferentially and coordinately overexpressed in the high-grade, ER-negative breast cancer tumors (52). These results suggest GR and STAT3 are upstream of the key TFs that control growth and stemness features in basal-like breast cancer.
These 438 basal-like genes are upregulated by GR and STAT3 in both HCC70 and MDA-MB-231 cell lines. To determine if they are associated with prognosis in patients with basal-like breast cancer, Kaplan–Meier survival plots were generated using public gene expression microarray data from 360 basal-like patient tumors for whom 5-year relapse-free survival data were available. Patients who had higher mean expression of these genes in their tumors had significantly shorter relapse-free survival than patients with lower expression of these genes (HR = 1.76, logrank P = 0.0068; Fig. 6C). This result indicates that even among patients with basal-like breast cancer, higher expression of genes that are upregulated by GR and STAT3 is associated with a worse prognosis.
Inhibiting both GR and STAT3 reduces cell growth synergistically
The observation that GR and STAT3 cooperate to regulate a large number of genes in the basal-like gene expression signature that are associated with cell proliferation and poor prognosis led us to investigate whether drugs that modulate GR and STAT3 would affect cell growth. To first evaluate the independent roles of GR and STAT3 in cell growth, the MDA-MB-231 cell line was treated with a range of doses of dexamethasone or the STAT3 inhibitor SH4–54 separately and was monitored for 72 hours on the Incucyte Zoom live cell imaging instrument. Reducing STAT3 activity with increasing doses of the STAT3 inhibitor SH4–54 resulted in decreased growth in a dose-dependent manner (Fig. 7A). Reducing GR activity by decreasing amounts of dexamethasone resulted in decreased cell growth in a dose-dependent manner (Fig. 7B). These results are consistent with previous publications that report that inhibiting GR or STAT3 separately decreases basal-like breast cancer proliferation (53–55).
To evaluate whether modulating both TFs simultaneously leads to differences in growth, the MDA-MB-231 and MDA-MB-468 cell lines were treated with different combinations of dexamethasone and SH4–54. The combination of inhibiting GR by withholding dexamethasone, and inhibiting STAT3 with 8 μmol/L SH4–54 resulted in the slowest cell growth (Fig. 7C and D). Conversely, the activation of GR with 1 μmol/L dexamethasone combined with the endogenous activity of STAT3 resulted in the fastest cell growth (Fig. 7C and D). Notably, when either GR or STAT3 was active alone, this led to intermediate growth rates. This result supports the hypothesis that GR and STAT3 cooperate to increase cell growth, and dual inhibition of both GR and STAT3 results in slower cell growth than inhibiting either TF alone.
To expand this analysis beyond cell lines, growth of patient-derived organoid cultures was analyzed in the context of GR and STAT3 inhibition. Two patient-derived organoid cultures previously derived from primary basal-like TNBC tumors (HCI-001 and HCI-002) were utilized for this experiment (33, 56). The standard media for the patient-derived organoid cultures include 1 μg/mL hydrocortisone, which is a glucocorticoid that induces GR activity. Therefore, escalating doses of mifepristone were used to inhibit GR activity. Mifepristone also inhibits PR and weakly inhibits AR activity (57), but these basal-like TNBC organoids do not express PR or AR, and thus this is not a confounding factor in these experiments (Supplementary Fig. S2). The STAT3 inhibitor SH4–54 was used in escalating doses to inhibit STAT3 activity. The cell line MDA-MB-231 was included in this experiment as a positive control and was grown in the same media as the organoids. The growth rate of the organoids is much slower than the cell line, which resulted in a narrower dynamic range for observing changes in growth. Despite this technical limitation, inhibition of GR and STAT3 for 72 hours resulted in significant dose-dependent decreases in cell growth, measured by ATP levels (Fig. 7E). An analysis of the relative decrease in cell growth across drug doses indicates that inhibiting both GR and STAT3 provides synergistic, rather than additive, decreases in cell growth in the MDA-MB-231 cell line, as well as both patient-derived organoid cultures (Fig. 7E). This result confirms that inhibiting the activity of both GR and STAT3 reduces cell growth in basal-like breast cancer patient–derived organoids. The synergy observed between GR and STAT3 inhibitors indicates there could be a window for therapeutic dose reduction, i.e., lower doses of each drug can be used in combination to achieve the same reduction in cell growth as higher doses of each drug alone (Fig. 7E). Additional experiments and analyses are needed to determine the synergistic potency and synergistic efficacy of the GR and STAT3 inhibitors (58).
The goal of this study was to determine which TFs drive the basal-like gene expression program. Genome-wide analysis of DNA methylation, chromatin accessibility, TF binding, and gene expression indicate that GR and STAT3 binding is enriched at genomic regulatory regions that are specifically open in basal-like breast cancer. These TFs bind to the same regulatory regions and cooperate to regulate hundreds of genes in the basal-like expression signature, including many other TFs. The genes that are cooperatively upregulated by GR and STAT3 are transducers of major signaling pathways implicated in basal-like breast cancer growth (EGFR and TGFB), are associated with aggressive features of basal-like breast cancer (proliferation, stemness, and epithelial-to-mesenchymal transition), and are associated with shorter relapse-free survival in patients. Furthermore, inhibiting both GR and STAT3 leads to synergistic reductions in cell growth in multiple cell lines and patient-derived organoid cultures. Together, these data suggest that inhibiting both GR and STAT3 simultaneously can reduce expression of the basal-like gene expression signature and create a synthetic lethality that could be exploited therapeutically.
Although GR and STAT3 have been separately implicated in regulating gene expression in basal-like breast cancer previously (54, 55, 59), their cooperative role has not been previously described. This study makes it apparent that understanding this cooperation is crucial to investigating their roles in driving basal-like gene expression and phenotypes, as well as arriving at more effective therapeutic strategies. Inhibiting GR alone is not effective; basal-like cell lines proliferate and express some components of the basal-like gene signature without glucocorticoids in their media. Inhibiting STAT3 alone is not completely effective; previous studies have shown that SH4–54 killed basal-like cell lines in culture, but was insufficient to completely stop tumor growth in vivo where glucocorticoids are naturally present (47). These results indicate that inhibiting either of these TFs alone is not sufficient to abrogate the proproliferative gene expression program of basal-like breast cancer in vivo, but this study suggests that inhibiting both GR and STAT3 could be a novel and effective therapeutic strategy.
There is a large body of previous literature supporting our conclusion that STAT3 is a critical factor for the growth, invasion, and survival of breast cancer cells (55, 60–63), including our own previous study that showed that STAT3 binding regulates genes that promote invasion (64). Due to its known oncogenic role, there are a number of clinical trials testing a variety of JAK/STAT3 pathway inhibitors in breast as well as other solid tumors that are summarized in Supplementary Fig. S3. These trials, along with the continual development of additional drugs that directly target STAT3 (65, 66), offer promising opportunities for future studies to determine if combination treatment with GR inhibitors improves efficacy, or if combination treatment can provide similar efficacy with lower, better tolerated, doses of each drug.
The study of GR's role in basal-like breast cancer has important clinical implications. GR's role in cancer is disease and subtype specific (67, 68). GR expression is associated with good prognosis in luminal breast cancer (Fig. 2D; ref. 69), and glucocorticoids are commonly used to treat hematologic malignancies because they induce apoptosis in lymphoid cells (68). In addition, potent synthetic glucocorticoids are often prescribed to patients with breast cancer undergoing chemotherapy to prevent side-effects such as nausea, loss of appetite, and rare severe immune reactions. Our preliminary data support a handful of other studies that report that induction of GR with glucocorticoids could promote tumor growth and therapy resistance in TNBC (54, 59, 70). In our study specifically, it is clear that glucocorticoid induction of GR leads to increased cell growth in basal-like breast cancer cell lines and patient-derived organoids. It also leads to increased expression of genes that are cooperatively regulated by STAT3, a known promoter of cancer growth and invasion (60). Recently, clinical trials have been initiated to test whether inhibiting GR directly with mifepristone or by inhibiting its chaperone, HSP90, will enhance the efficacy of chemotherapy in ER-negative breast cancer (71, 72). Although the administration of glucocorticoids is thought to be necessary to improve the tolerability of chemotherapy, the first trial of GR inhibition with mifepristone combined with nab-paclitaxel demonstrated that the primary toxicity, neutropenia, could be managed with filgrastim growth factor administration (71). This study found that the standard dose of nab-paclitaxel can be safely, and tolerably, combined with an effective dose of mifepristone, and a phase II trial of this regimen has been initiated (71). A list of current trials of GR and Hsp90 inhibitors is provided in Supplementary Fig. S4. Our results support this line of investigation and suggest that combination therapy that inhibits both GR and STAT3 could be even more effective in treating basal-like TNBC.
Disclosure of Potential Conflicts of Interest
P. Yue reports a patent for “Small molecule STAT3 inhibitors” pending. J. Turkson reports grants from National Institutes of Health (grant #R01 CA208851) during the conduct of the study; other compensation from Novella, LLC (co-founder) outside the submitted work; in addition, J. Turkson has a patent for (Non-provisional patent application on small molecule STAT3 inhibitors) pending and licensed to Novella, LLC. B.E. Welm reports grants from NIH during the conduct of the study; personal fees from University of Utah (compensation from the University of Utah from the licensing of PDX models) outside the submitted work. K.E. Varley reports grants from American Cancer Society during the conduct of the study; personal fees from Kailos Genetics Inc. (Consulting/Founder Shares) outside the submitted work. In addition, K.E. Varley has a patent for “Method for multiplexed nucleic acid patch polymerase chain reaction” issued, licensed, and with royalties paid from Kailos Genetics, Inc. (not directly related to this work, but broadly relevant to cancer research), a patent for “Capture Methodologies for Circulating Cell Free DNA” issued to Kailos Genetics, Inc. (not directly related to this work, but broadly relevant to cancer research), a patent for “Biomarkers for triple-negative breast cancer” pending (not directly related to this work, but broadly relevant to cancer research), a patent for “Novel Read-Through Fusion Polynucleotides and Polypeptides and Uses Thereof” pending (not directly related to this work, but broadly relevant to cancer research), a patent for “Multigene assay to assess risk of recurrence of cancer” pending (not directly related to this work, but broadly relevant to cancer research), and a patent for “Targeted sequencing to detect and quantify low levels of methylated DNA” pending (not directly related to this work, but broadly relevant to cancer research). No potential conflicts of interest were disclosed by the other authors.
M.E. Conway: Conceptualization, data curation, software, formal analysis, investigation, visualization, methodology, writing-original draft, writing-review and editing. J.M. McDaniel: Formal analysis, investigation, writing-original draft, writing-review and editing. J.M. Graham: Investigation. K.P. Guillen: Resources, data curation, formal analysis, validation, investigation, visualization, methodology, writing-review and editing. P.G. Oliver: Resources, investigation, writing-review and editing. S.L. Parker: Investigation, writing-review and editing. P. Yue: Resources, writing-review and editing. J. Turkson: Conceptualization, resources, supervision, writing-review and editing. D.J. Buchsbaum: Conceptualization, resources, supervision, funding acquisition, project administration, writing-review and editing. B.E. Welm: Resources, supervision, project administration, writing-review and editing. R.M. Myers: Conceptualization, resources, supervision, funding acquisition, project administration, writing-review and editing. K.E. Varley: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing-original draft, project administration, writing-review and editing.
This work was supported by American Cancer Society Research Scholar Grant Award #132596‐RSG‐18‐197‐01‐DMC awarded to K.E. Varley. We acknowledge support of funds in conjunction with grant P30 CA042014 awarded to Huntsman Cancer Institute and to the Nuclear Control Program at Huntsman Cancer Institute. Research reported in this publication utilized the High-Throughput Genomics and Bioinformatic Analysis Shared Resource at Huntsman Cancer Institute at the University of Utah and was supported by the NCI of the NIH under Award Number P30 CA042014. This research was also supported by funding from the HudsonAlpha Institute, a Cancer Research Fund from the State of Alabama to HudsonAlpha Institute, Susan G. Komen, and the NIH-National Cancer Institute Comprehensive Cancer Center Core Support Grant (5P30CA013148). J.M. McDaniel was supported by a fellowship from the National Science Foundation Graduate Research Fellowship Program (NSF-GRFP).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.