Alternative promoters (AP) occur in >30% protein-coding genes and contribute to proteome diversity. However, large-scale analyses of AP regulation are lacking, and little is known about their potential physiopathologic significance. To better understand the transcriptomic effect of estrogens, which play a major role in breast cancer, we analyzed gene and AP regulation by estradiol in MCF7 cells using pan-genomic exon arrays. We thereby identified novel estrogen-regulated genes (ERG) and determined the regulation of AP-encoded transcripts in 150 regulated genes. In <30% cases, APs were regulated in a similar manner by estradiol, whereas in >70% cases, they were regulated differentially. The patterns of AP regulation correlated with the patterns of estrogen receptor α (ERα) and CCCTC-binding factor (CTCF) binding sites at regulated gene loci. Interestingly, among genes with differentially regulated (DR) APs, we identified cases where estradiol regulated APs in an opposite manner, sometimes without affecting global gene expression levels. This promoter switch was mediated by the DDX5/DDX17 family of ERα coregulators. Finally, genes with DR promoters were preferentially involved in specific processes (e.g., cell structure and motility, and cell cycle). We show, in particular, that isoforms encoded by the NET1 gene APs, which are inversely regulated by estradiol, play distinct roles in cell adhesion and cell cycle regulation and that their expression is differentially associated with prognosis in ER+ breast cancer. Altogether, this study identifies the patterns of AP regulation in ERGs and shows the contribution of AP-encoded isoforms to the estradiol-regulated transcriptome as well as their physiopathologic significance in breast cancer. Cancer Res; 70(9); 3760–70. ©2010 AACR.
Estrogen receptors (ER), especially ERα, belong to the nuclear receptor superfamily of transcription factors and play a major role in breast cancer. Indeed, ∼50% of breast tumors are ER positive (ER+) and can be growth inhibited by pharmacologic blockade of estrogen. In addition, the ER status is one of the most widely used markers of good prognosis in breast cancer (1). Therefore, intensive efforts are currently being made to elucidate the mechanisms of estrogen action on breast cancer cells. Many estrogen-regulated genes (ERG) have been identified, mainly through microarray analyses in the MCF7 breast cancer cell line (2–8). The identification of ERGs has helped to explain how estrogens affect cell phenotype (e.g., induction of cell proliferation) and to identify prognostic markers for breast cancer (5, 9, 10).
It is now established that >30% of human protein-coding genes have alternative promoters (AP), and multiple biological functions of APs have been shown (11–14). Firstly, APs may allow for gene regulation by multiple signals and in multiple tissues. Secondly, APs may affect the efficiency of translation by affecting the sequence of 5′-untranslated regions (UTR). Thirdly, APs may affect open reading frames in mRNAs and give rise to protein isoforms with different NH2-terminal sequences and activities. Therefore, APs contribute a significant part of the proteome diversity and may have a wide biological effect. However, little is known about their regulation; in particular, large-scale analyses of AP regulation are lacking. Indeed, classic 3′-biased expression microarrays and promoter arrays are not designed to interrogate APs.
Recent studies indicated that most ERα binding sites in the genome of MCF7 cells are not located within promoter regions but up to >50 kb away (15, 16), opening up wide possibilities for the regulation of APs by estrogens. However, few studies have examined the effect of estrogens on APs. Studies of the Xenopus vitellogenin A1 gene and of the human GREB1 gene showed that estrogen can stimulate several promoters within a target gene (17, 18). More recently, a ChIP-chip study using a custom array of APs showed selective regulation of RNA polymerase II (Pol II) levels at APs in response to estrogen (19). However, Pol II levels at promoters do not always reflect productive initiation (20, 21), and therefore, such an analysis cannot determine which isoforms are generated in response to estradiol. In the present study, we determined the contribution of AP-encoded transcripts to the estrogen-regulated transcriptome in MCF7 cells using the Affymetrix Human Exon (HuEx) arrays, which have probes for most exons in the genome.
Materials and Methods
MCF7 cells were maintained in DMEM with 10% fetal bovine serum. Except for transfections, cells were grown for 3 days in phenol red–free medium supplemented with 2% charcoal-stripped serum before treatment with 17β-estradiol (10 nmol/L; Sigma) or 0.1% ethanol as a control. Transient transfections were performed using Lipofectamine RNAiMax (Invitrogen) and the indicated small interfering RNAs (siRNA; siGL2, CGUACGCGGAAUACUUCGAdTdT; siL, GACUUCACUUUGAAAAGAAdTdT; siS, CCAUACGAGUCCUAGAUGUdTdT; siDDX5/17, GGCUAGAUGUGGAAGAUGU). Adherent cells were harvested using trypsin. Adherent and floating cells were counted with a Coulter counter. Cell cycle analyses were performed by propidium iodide staining and fluorescence-activated cell sorting analysis.
RNA and protein analyses
Total RNA was extracted using Trizol (Invitrogen). RNA treated with DNase I (Ambion) was reverse transcribed using SuperScript II and random primers (Invitrogen). PCR was performed using GoTaq (Promega). Quantitative PCR was performed using Master SYBR Green I on a LightCycler (Roche) using 18S RNA for normalization. PCR primers are described in Supplementary Table S1. Protein extracts were prepared in 50 mmol/L Tris (pH 8.0), 0.4 mol/L NaCl, 5 mmol/L EDTA, 1% NP40, 0.2% SDS, and 1 mmol/L DTT with protease inhibitors. Following SDS-PAGE, blots were probed with anti-NET1 (Abcam) and anti-actin (Sigma) antibodies.
RNA was purified on Qiagen columns, and RNA integrity was verified using a Bioanalyzer. Total RNA (1 μg) was processed with the GeneChip WT Sense Target Labeling kit and hybridized to GeneChip Human Exon 1.0 ST arrays (Affymetrix), following the manufacturer's instructions. Stained arrays were scanned using the GeneChip Scanner 3000-7G, and quality assessment was performed using Expression Console (Affymetrix). For data analyses, probe intensities were normalized using the quantile normalization method, and background correction was made using antigenomic probes (22).
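As an illustration only, the quantile normalization step mentioned above can be sketched in Python. This is a minimal textbook version that assumes equal-length intensity lists and ignores ties; it is not the Affymetrix implementation, it omits the antigenomic background correction, and the function name and input values are hypothetical.

```python
def quantile_normalize(arrays):
    """Force every array to share the same intensity distribution.

    arrays: list of equal-length lists of probe intensities, one list per
    array (hypothetical input layout). Returns new lists; inputs are untouched.
    """
    n_probes = len(arrays[0])
    n_arrays = len(arrays)
    # Sort each array, then average the values found at each rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[rank] for s in sorted_arrays) / n_arrays
        for rank in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Indices of the probes ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            # Each probe receives the mean intensity of its rank.
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


example = quantile_normalize([[5, 2, 3], [4, 1, 6]])
print(example)
```

After normalization, every array contains the same set of values, so differences between arrays reflect only changes in probe rank.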
Array data analysis at the gene level
Using the ArrayAssist software (Stratagene), the average intensity of exonic probes was calculated for each gene in each sample, and the estradiol and control groups were compared using a Student's paired t test (fold > 1.5, P < 0.05). A compilation of 3,047 genes regulated by ER ligands in various cell lines was retrieved from previous studies (2–8). Gene functions were analyzed using the Panther software (23).
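The gene-level filter described above (fold > 1.5 and P < 0.05 in a paired t test) can be sketched as follows. This is a hedged illustration, not the ArrayAssist procedure: it assumes log2-scale gene intensities from exactly four paired experiments and replaces the exact P value with the tabulated two-sided critical value of Student's t for 3 degrees of freedom (3.182 at P = 0.05). The function name and the intensity values are hypothetical.

```python
import math
from statistics import mean, stdev

# Two-sided critical value of Student's t at P = 0.05 with 3 degrees of
# freedom, i.e. four paired experiments.
T_CRIT_DF3 = 3.182


def gene_level_call(treated_log2, control_log2, min_fold=1.5):
    """Return (fold, t_statistic, regulated) for one gene.

    Inputs are log2 gene-level intensities paired by experiment (an
    assumption of this sketch). fold is the unsigned linear fold change.
    """
    if len(treated_log2) != 4 or len(control_log2) != 4:
        raise ValueError("this sketch assumes exactly four paired experiments")
    diffs = [t - c for t, c in zip(treated_log2, control_log2)]
    mean_diff = mean(diffs)
    sd = stdev(diffs)
    if sd == 0:
        # Identical differences: the statistic is unbounded unless they are all zero.
        t_stat = math.copysign(math.inf, mean_diff) if mean_diff else 0.0
    else:
        t_stat = mean_diff / (sd / math.sqrt(len(diffs)))
    fold = 2 ** abs(mean_diff)
    regulated = fold > min_fold and abs(t_stat) > T_CRIT_DF3
    return fold, t_stat, regulated


fold, t_stat, regulated = gene_level_call(
    [9.0, 9.2, 8.9, 9.1], [8.0, 8.1, 8.0, 8.1]
)
print(round(fold, 2), regulated)
```

Both criteria must be met, so a gene with a large but inconsistent change across experiments is not called regulated.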
Array data analysis at the exon level
Exon-level analyses of array data were performed using the EASANA system (GenoSplice Technology). Several criteria were used to select probes for further analyses: exonic location according to cDNAs (13), GC content below 18, and no cross-hybridizing potential. The strategy used to annotate AP expression and regulation is described in Supplementary Fig. S2. Briefly, to annotate AP expression, AP-associated exons were compared for probe intensities using a Student's t test. To annotate AP regulation, the intensities of AP-associated probes were compared between estrogen-treated and control cells using a Student's paired t test. To identify estrogen-regulated APs within nonregulated genes, the gene-normalized intensities of each AP-associated exon were compared between estradiol-treated and control cells using a Student's paired t test (splicing index method). Probe data were visualized in their gene context using the EASANA visualization module (GenoSplice Technology).
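To make the splicing index idea concrete, the following Python sketch computes a gene-normalized exon intensity and the resulting log2 index for one AP-associated exon. It assumes linear-scale intensities and shows only the index itself, not the paired t test or the EASANA implementation; all names and values are hypothetical.

```python
import math
from statistics import mean


def gene_normalized(exon_intensities, gene_intensities):
    """Exon intensity divided by the gene-level intensity, per sample."""
    return [e / g for e, g in zip(exon_intensities, gene_intensities)]


def splicing_index(exon_treated, gene_treated, exon_control, gene_control):
    """log2 ratio of mean gene-normalized intensity, treated over control.

    A value far from 0 suggests that the exon (here, an AP-associated
    exon) changes relative to its gene, even if the gene as a whole does not.
    """
    ni_treated = mean(gene_normalized(exon_treated, gene_treated))
    ni_control = mean(gene_normalized(exon_control, gene_control))
    return math.log2(ni_treated / ni_control)


# Hypothetical exon whose signal rises 4-fold while the gene level is flat.
si = splicing_index(
    exon_treated=[400.0, 400.0], gene_treated=[100.0, 100.0],
    exon_control=[100.0, 100.0], gene_control=[100.0, 100.0],
)
print(si)
```

Dividing by the gene-level signal is what allows a promoter-specific change to be detected in a gene whose averaged expression is unchanged.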
Chromatin immunoprecipitation (ChIP) was performed as described previously (24) using antibodies against Pol II (CTD4H8), acetylated histone H3 (Upstate), and control mouse immunoglobulins. Immunoprecipitated DNA was purified using Qiagen columns and analyzed by quantitative PCR.
Tumor samples and survival studies
Biopsies from 74 ER+ primary unilateral breast carcinomas were collected. The patients (mean age, 59.3 y; range, 35–91) received no radiotherapy or chemotherapy before surgery. Sixty-nine percent of tumors were lymph node positive. The median follow-up was 8.5 years (range, 1.4–16.2), and 26 patients relapsed at distant sites. Total RNA was analyzed by quantitative reverse transcription-PCR (RT-PCR) using TBP levels for normalization. Associations between transcript levels and metastasis-free survival (MFS) were determined by the Kaplan-Meier method using the log-rank test for statistical analysis.
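For readers unfamiliar with the Kaplan-Meier method, a minimal Python sketch of the product-limit estimator is given below. It assumes follow-up times in years and a flag marking whether a distant relapse was observed (otherwise the patient is censored); the log-rank test is not implemented, and the function name and example data are hypothetical rather than taken from the patient cohort.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an event.

    times: follow-up time per patient; events: True if relapse was
    observed at that time, False if the patient was censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            n_leaving += 1
            i += 1
        if n_events:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set.
        at_risk -= n_leaving
    return curve


curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
print(curve)
```

In a comparison of two groups (for example, high versus low isoform levels), one such curve would be computed per group before applying the log-rank test.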
Identification of novel ERGs using exon arrays
To identify the APs regulated by estradiol, MCF7 cells were treated with estradiol or vehicle for 6 or 24 hours, and total RNA was analyzed with HuEx arrays. HuEx arrays contain probes targeting >1 million exons from well-annotated or computationally predicted genes. In this study, we focused on genes with known cDNAs. For each gene, the average intensity of exonic probes was calculated in each sample from four independent experiments. Using cutoffs of 1.5 for fold change and 0.05 for P value, 983 genes were regulated by estradiol at 6 and/or 24 hours (Supplementary Fig. S1; gene lists are given in Supplementary Tables S2–S4). Of 983 genes, 637 (65%) had been identified in a large compilation of previous studies (2–8), which showed the reliability of the method we used to identify ERGs (Fig. 1A, Known). However, 346 genes (35%) had not been identified as ERGs in the previous studies (Fig. 1A, New). Using a cutoff of 2 for fold change, there were still 61 genes (25%) that had not been previously found to be regulated by estradiol. We validated by quantitative RT-PCR the regulation of several novel ERGs (five of five tested; Supplementary Fig. S1). Functions similar to those previously reported, including cell cycle, were enriched in the 983 ERGs identified (Supplementary Fig. S1). Altogether, these data indicated that HuEx arrays allowed the reliable identification of ERGs.
Determination of AP regulation by estradiol
Among the 983 ERGs, 301 (31%) contain annotated APs in the FAST DB database (Fig. 1B). To identify the APs regulated by estradiol in ERGs, we then analyzed the array data at the exon level. For this, we used the EASANA web interface, which displays probes corresponding to alternative exons, particularly exons associated with APs, as defined by FAST DB (see below; refs. 13, 25). Using the strategy described in Supplementary Fig. S2, we defined two broad categories of AP regulation by estradiol.
The first category corresponds to APs that were similarly regulated in response to estradiol, which we defined as coregulated (CR) APs. To illustrate this category, Fig. 2A represents the analysis of the GREB1 gene, which is controlled by three different promoters (18), corresponding to exons E1, E2, and E3 in FAST DB (Fig. 2A, item 1). EASANA analysis indicated that both promoters E2 and E3 were upregulated by estradiol (Fig. 2A, item 2), in agreement with RT-PCR analysis (Fig. 2A, item 3) and with a previous study (18). Likewise, APs from the RBBP8, TPD52L1, SLC7A5, STC2, FKBP5, and SLC22A5 genes were similarly activated by estradiol (Supplementary Fig. S3). We also identified APs that were simultaneously repressed by estradiol, as illustrated with the TGFB3, BMF, LMO7, NPHP3, and MYO1B genes (Supplementary Fig. S3). A total of 43 genes containing CR promoters were identified (Supplementary Table S5), and 100% (11 of 11) were validated by RT-PCR (Supplementary Fig. S3).
The second category corresponds to APs that were differentially expressed or regulated (DE/DR) in response to estradiol. For example, both EASANA and RT-PCR analyses of the GREB1 gene indicated that promoter E1 was expressed at low levels compared with the other two promoters (Fig. 2A; data not shown), as previously reported (18). Similar cases of DE APs were found in 92 genes (Supplementary Table S5), and 85% (11 of 13) were validated by RT-PCR (Supplementary Fig. S4). More interestingly, we identified genes where coexpressed APs were DR by estradiol (i.e., they were regulated in a selective manner). Figure 2B illustrates the case of the UNG gene, which contains two promoters (26). Both EASANA and RT-PCR analyses indicated that the E1 promoter was upregulated by estradiol, whereas the E2 promoter was not regulated (Fig. 2B). Globally, 29 cases of genes with DR APs were identified (Supplementary Table S5), and 100% (5 of 5) were validated by RT-PCR, namely, the UNG, EFHD1, GRB7, DSCR1, and MCM7 genes (Supplementary Fig. S5; see below).
Globally, of 301 ERGs with APs, we could determine the expression regulation of APs for 150 (50%) genes (Supplementary Table S5). Of these, 43 genes (29%) contained APs that were similarly regulated by estradiol (CR genes), and 118 genes (79%) contained APs that were DE/DR in response to estradiol (DE/DR genes); among DE/DR genes, there were 92 DE and 29 DR genes (Fig. 1C). It should be noted that 19 genes containing more than two promoters were classified in two categories. The global validation rate by RT-PCR was 93% (27 of 29). The detailed annotation of AP expression and regulation in ERGs is given in Supplementary Table S5.
Patterns of AP regulation in ERGs correlate with patterns of ERα and CTCF binding sites
To determine whether the distinct patterns of AP regulation in ERGs may be explained by distinct patterns of ERα binding sites at gene loci, we focused on the 12 CR and 9 DR genes validated by RT-PCR throughout this study (Supplementary Table S6). We used ERα ChIP-chip data previously obtained using genome-wide tiling arrays, which showed ERα binding site enrichment within 50 kb of ERGs (15, 27). We also analyzed CTCF binding sites, which are insulators that were recently mapped genome-wide and shown to isolate genes from the action of distant regulatory sites, including ERα binding sites (28–30). As expected from previous studies (15, 16, 27), only 6 (4 CR and 2 DR) of 21 genes analyzed had an ERα binding site within 5 kb of an AP (Supplementary Table S6). However, 10 genes (6 CR and 4 DR) had an ERα binding site and an AP located in the same CTCF block (region between two consecutive CTCF binding sites; Supplementary Table S6). These data suggest that a large proportion of both CR and DR genes are direct ER target genes. In the majority of these genes, either CR or DR, both APs were located within 50 kb of an ERα binding site (Supplementary Fig. S6A). However, whereas in CR genes all APs were located in the same CTCF block as an ERα binding site (Fig. 2C, CR), in DR genes only one (regulated) promoter was located in the same CTCF block as an ERα binding site (Fig. 2C, DR; Supplementary Fig. S6B; Supplementary Table S6). Thus, in genes with ERα binding sites, the distinct patterns of AP regulation (CR versus DR) may be explained by distinct patterns of ERα location relative to APs and CTCF binding sites. As not all ERGs contain ERα binding sites, we then considered the location of CTCF binding sites in the 21 genes analyzed relative to APs only. In 85% CR cases, APs were located in the same CTCF block (“clustered APs”), whereas in 67% DR cases APs were separated by a CTCF binding site (“insulated APs”; Fig. 2D).
The binding of CTCF between APs was validated by ChIP in MCF7 cells (Supplementary Fig. S6C). Thus, distinct patterns of AP regulation by estradiol were associated with different patterns of ERα and CTCF binding sites at gene loci.
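The notion of a CTCF block (the region between two consecutive CTCF binding sites) can be expressed as a short Python sketch. It assumes that all features lie on the same chromosome and are reduced to single coordinates; the function names and coordinates are hypothetical and are not taken from Supplementary Table S6.

```python
from bisect import bisect_right


def ctcf_block(position, ctcf_sites):
    """Index of the CTCF block containing a position.

    ctcf_sites must be a sorted list of coordinates; block 0 lies
    upstream of the first site, block 1 between the first two, and so on.
    """
    return bisect_right(ctcf_sites, position)


def ap_arrangement(promoters, ctcf_sites):
    """'clustered' if all APs share one CTCF block, otherwise 'insulated'."""
    blocks = {ctcf_block(p, ctcf_sites) for p in promoters}
    return "clustered" if len(blocks) == 1 else "insulated"


def promoters_in_er_block(promoters, er_sites, ctcf_sites):
    """Promoters located in the same CTCF block as at least one ER site."""
    er_blocks = {ctcf_block(e, ctcf_sites) for e in er_sites}
    return [p for p in promoters if ctcf_block(p, ctcf_sites) in er_blocks]


ctcf = [1000, 5000, 9000]                      # hypothetical CTCF sites
print(ap_arrangement([1500, 3000], ctcf))      # both APs in one block
print(ap_arrangement([1500, 6000], ctcf))      # a CTCF site separates the APs
print(promoters_in_er_block([1500, 6000], [7000], ctcf))
```

Under this scheme, a CR-like locus would return all of its promoters from `promoters_in_er_block`, whereas a DR-like locus would return only the regulated one.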
Some genes are essentially regulated in a qualitative manner by estradiol
Further analyzing DR genes, we observed that estradiol can regulate the expression of two APs in an opposite manner. Indeed, EASANA analysis of the MCM7 gene predicted that the E1 promoter was upregulated by estradiol, like the global gene level, whereas the E2 promoter was downregulated (Fig. 3A). Quantitative RT-PCR analysis confirmed that the E1 promoter was strongly upregulated by estradiol (∼6-fold), whereas the E2 promoter was downregulated (∼2-fold; Fig. 3A).
Our finding that AP-encoded transcripts may be selectively or inversely regulated by estradiol within a gene (Figs. 2B and 3A; Supplementary Fig. S5) raised the possibility that estradiol might regulate APs within genes that, as a whole, are not significantly regulated. Such DR genes would have been missed in our first analysis of gene regulation by estradiol, where all the probes of a given gene were averaged. We therefore generated an algorithm to search for estradiol-regulated APs within genes that are not globally regulated. A stringent bioinformatic analysis, followed by RT-PCR validation, identified this pattern of regulation in the LMBR1, TPD52, NUMA1, and NET1 genes (Fig. 3B; Supplementary Fig. S5). In the NET1 gene, two promoters are located upstream of exons E1 (producing the long form) and E4 (producing the short form; Fig. 3B). Quantitative RT-PCR analysis confirmed the prediction that the E1 promoter was upregulated by estradiol (∼3.5-fold), whereas the E4 promoter was slightly downregulated (Fig. 3B). Remarkably, the whole NET1 gene expression, as measured by summarizing all the exonic probes, was not significantly affected by estradiol (−1.2-fold). This prediction was confirmed by quantitative RT-PCR analysis with primers located in the 3′ part of the gene measuring both isoforms (Fig. 3B, Total). The fact that NET1 isoforms were regulated in an opposite manner by estradiol, whereas the overall NET1 mRNA level was not affected, suggested that the hormone induced a switch in the expression of AP-encoded isoforms. This switch was confirmed by competitive amplification of both isoforms by RT-PCR (Fig. 3B, bottom).
The opposite effects of estradiol on AP expression, sometimes without changes in global gene expression levels, set a novel paradigm of qualitative gene regulation by a hormone. Further supporting the significance of this finding, the opposite effects of estradiol on NET1 isoform expression were observed at different time points and in a second breast cancer cell line (Supplementary Fig. S7). Moreover, the differential effect of estradiol on NET1 APs was specific, as progesterone, which stimulates NET1 expression in T47D cells (31), strongly increased the expression of both isoforms (Supplementary Fig. S7). This result also indicated that the E4 promoter can be activated at the same time as the E1 promoter. As an ERα binding site was mapped 8 kb downstream of the NET1 gene (15), NET1 might be a direct transcriptional target gene of estrogen. To determine whether the inverse regulation of NET1 isoforms by estradiol was indeed a transcriptional effect, we analyzed the effect of estradiol on Pol II levels on NET1 promoters using ChIP assay. Pol II levels increased over the E1 promoter while decreasing over the E4 promoter in response to estradiol (Fig. 4A). These data indicate that estradiol may have opposite effects on the transcriptional activity of promoters within the same target gene. AP-selective effects of estradiol on Pol II or pre-mRNA levels were also observed for other genes (Supplementary Fig. S8).
Our finding that both upregulation and downregulation of promoters may occur within the same gene in response to estradiol suggested that these processes might be mediated in part by common factors. Although little is known about the mechanisms of gene downregulation by estradiol, several coactivators recruited by ERα to target genes, including the DDX5/DDX17 family of proteins (also called p68/p72), were recently shown to repress transcription in specific promoter contexts (32–35). To test whether these coregulators might play a role in the inverse regulation of APs by estrogen, MCF7 cells were transfected with a siRNA efficiently targeting both DDX5 and DDX17 (siDDX5/17; Supplementary Fig. S9). DDX5/DDX17 depletion nearly abolished the estradiol-mediated effect on the relative expression levels of AP-encoded isoforms of the NET1 gene (Fig. 4B), repressing the estradiol-stimulated isoform while inducing the estradiol-repressed isoform (Fig. 4C). Similar observations were made for the MCM7 gene (Supplementary Fig. S9). Consistently, on DDX5/DDX17 depletion, Pol II levels and histone H3 acetylation were decreased at the NET1 P1 promoter while increasing at the P4 promoter (Fig. 4D). Altogether, these data show the ability of estradiol to switch the expression of promoters within target genes, and identify the DDX5/DDX17 coregulators as potential mediators of this promoter-switching effect.
Global effect of APs on transcriptome regulation by estradiol
Taking into account that APs occur in 31% ERGs and that their AP expression/regulation status could be determined in 50% cases, our data (Fig. 1C) imply that CR and DE/DR promoters occur in 9% and 24% ERGs, respectively. APs often encode distinct protein isoforms (11, 12), and we found this to be the case in 76% ERGs with APs (Supplementary Fig. S10). Thus, ∼23% ERGs are predicted to encode protein isoforms through APs, thereby contributing to proteome regulation by estrogen. In particular, the coregulation of APs with distinct open reading frames potentially increases the number of regulated protein isoforms by estradiol. Meanwhile, the differential regulation of APs may serve to regulate protein isoforms in a selective manner. Consistently, AP-specific open reading frames were more frequent in DR genes than in CR genes (88% versus 69%; P = 0.03; Supplementary Fig. S10). To further assess the potential contribution of APs to biological responses to estrogen, we analyzed the functions of DE/DR and CR genes. In comparison with the whole genome, DE/DR genes were enriched in several functions, including cell cycle, DNA metabolism, cell structure and motility, and related molecular functions (Supplementary Fig. S10). These functions were also enriched in ERGs as a whole but not in CR genes (Supplementary Figs. S1 and S10). Related functions were also enriched in DE/DR genes when directly compared with CR genes (Supplementary Fig. S10). Altogether, these data suggest that the ability of estradiol to regulate specific AP-encoded isoforms may be preferentially used to regulate gene subsets involved in specific functions.
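The percentages in this paragraph follow from simple arithmetic on the counts reported earlier, which the following Python sketch reproduces. It assumes, as the text does, that the proportions observed in the 150 annotated genes can be extrapolated to all ERGs with APs; the variable and function names are hypothetical.

```python
# Counts reported in the text.
ERGS = 983              # estrogen-regulated genes
ERGS_WITH_AP = 301      # ERGs with annotated APs (31%)
ANNOTATED = 150         # ERGs whose AP status could be determined (50%)
CR_GENES = 43           # genes with coregulated APs
DE_DR_GENES = 118       # genes with DE/DR APs
ISOFORM_FRACTION = 0.76 # ERGs with APs that encode distinct protein isoforms


def percent_of_all_ergs(fraction_of_ap_genes):
    """Scale a fraction of AP-containing ERGs to a percentage of all ERGs."""
    return round(100 * (ERGS_WITH_AP / ERGS) * fraction_of_ap_genes)


cr_pct = percent_of_all_ergs(CR_GENES / ANNOTATED)
de_dr_pct = percent_of_all_ergs(DE_DR_GENES / ANNOTATED)
isoform_pct = percent_of_all_ergs(ISOFORM_FRACTION)
print(cr_pct, de_dr_pct, isoform_pct)  # 9 24 23
```

The CR and DE/DR percentages do not sum to the 31% of ERGs with APs because genes with more than two promoters can belong to both categories.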
Differential functions of AP-encoded isoforms
We then assessed the potential role of DR isoforms in breast cancer cells, focusing on NET1, a RhoA-specific guanyl nucleotide exchange factor. NET1 APs encode long (594 amino acids) and short (542 amino acids) protein isoforms that have distinct properties when transfected into cells (36). However, little is known about the endogenous expression of these protein isoforms. We therefore analyzed their expression in MCF7 cells. Using an antibody directed against the COOH terminus of NET1, both the long and short forms could be detected; the identity of the isoforms was confirmed using isoform-selective siRNAs (Fig. 5A). As expected from analyses at the RNA level (Fig. 3B), only the long protein isoform was upregulated by estradiol (Fig. 5A). These data indicated that whereas both protein isoforms encoded by the NET1 gene promoters were expressed in MCF7 cells, estradiol treatment selectively increased the NET1 long protein isoform levels, owing to the selective upregulation of the E1 promoter.
Previous studies of NET1 isoforms showed that transfection of the short form (also called NET1A), but not of the long form, induces the formation of stress fibers, which are actin fibers mediating cell adhesion to substrate (36). On the other hand, the potential functions of the long NET1 isoform are currently unknown. In addition, a role for NET1 in cell growth was recently proposed (37), which might be relevant to the well-known mitogenic effect of estrogen on ER+ breast cancer cells. To investigate the functions of NET1 isoforms in MCF7 cells, we developed siRNAs selectively targeting either the long (siL) or the short (siS) NET1 isoform (Fig. 5A). Several days after transfection, adherent and floating cells were counted separately to determine the potential role of NET1 isoforms in cell adhesion; in addition, cell cycle analysis was performed on adherent cells to determine the potential effect of NET1 isoforms on cell proliferation. When compared with transfection with a control siRNA targeting luciferase (siGL2), transfection with siL and siS led to ∼30% and 75% decreases in the number of adherent cells, respectively (Supplementary Fig. S11). As expected, the proportion of floating cells was markedly increased by siS and only slightly affected by siL (Fig. 5B). These effects did not seem to be due to increases in cell death (Supplementary Fig. S11) and agree with the established role of the NET1 short isoform in stress fiber formation and cell adhesion (36). Conversely, cell cycle analysis indicated that siL decreased by 33% the proportion of cells in S phase relative to cells in G1, whereas siS had no significant effect (Fig. 5C). These data suggest a selective role for the NET1 long isoform in promoting the G1-S transition of the cell cycle in MCF7 cells. The differential effect of NET1 isoforms on cell growth and adhesion was confirmed using an independent set of siRNAs and was not due to an interference with ERα (Supplementary Fig. S12). 
Finally, the depletion of the NET1 long isoform decreased estrogen-stimulated MCF7 cell growth to levels observed in the absence of estrogen (Fig. 5D). Altogether, these data indicate that NET1 isoforms play differential roles in the regulation of cell adhesion and proliferation. Remarkably, estradiol selectively induces the isoform of NET1 that supports cell proliferation.
Physiopathologic significance of promoter-encoded isoforms
Because estrogen and ER play important roles in breast cancer cell proliferation, we next examined the expression of AP-encoded isoforms that are inversely regulated by estradiol in a collection of 74 ER+ breast tumors. Higher levels of the estrogen-stimulated long isoform of NET1 were associated with shorter MFS, whereas higher levels of the short isoform were not (Fig. 6A). In fact, when normalized to total NET1 mRNA levels, higher levels of the short isoform were associated with longer MFS (Supplementary Fig. S13). As expected from these data, the combination of high levels of the long form and low levels of the short form was associated with reduced MFS (Supplementary Fig. S13). In contrast, total NET1 mRNA levels were not associated with MFS (Fig. 6A). These data show the value of measuring individual isoforms to identify prognostic markers. Likewise, in the case of MCM7, higher levels of the estrogen-stimulated P1 isoform, but not of the estradiol-repressed P2 isoform, were associated with shorter MFS (Fig. 6B). In fact, higher levels of the P2 isoform tended to be associated with a delay in the occurrence of death or metastasis, and the combination of high P1 and low P2 expression levels was associated with shorter MFS (Fig. 6B). Thus, for both the NET1 and MCM7 genes, higher expression levels of the estrogen-stimulated isoform were selectively associated with shorter MFS in ER+ breast cancer. These data suggest the physiopathologic significance of estrogen-regulated APs. In the case of NET1, estradiol selectively induces the isoform of NET1 that supports cell proliferation (Figs. 3 and 5), in line with the adverse prognostic value of this isoform in ER+ breast cancer. Also in line with these data, the combined overexpression of DDX5 and DDX17 tended to be associated with shorter MFS in ER+ breast cancer (Supplementary Fig. S13).
Although >30% of human genes have APs (11, 14), little is known about AP regulation by transcriptional stimuli and about their physiopathologic significance. In this study, using pan-genomic exon arrays, we determined the patterns of AP regulation by estrogen in ERGs and showed their significance in breast cancer.
The identification of many novel ERGs in this study was likely due in part to the higher number and whole-exon distribution of probes in exon arrays when compared with classic 3′-biased expression microarrays (22). In addition, the ability to look at the regulation of individual promoters allowed us to identify ERGs whose overall expression levels were not affected. This also allowed us to determine the expression and regulation of AP-encoded transcripts in ERGs. The resulting large data sets (Supplementary Tables S2–S5) significantly increase the knowledge of the estrogen-regulated transcriptome in MCF7 breast cancer cells. In particular, the annotation of AP expression regulation in ERGs will help understand the biological consequences of gene regulations by estradiol because APs most often are associated with alternative open reading frames, and in many cases, the resulting protein isoforms have differential activities (Supplementary Table S5; see below). In addition, AP-encoded transcripts have distinct 5′-UTRs that may affect translation efficiency (11, 12).
The identification of the regulated AP-encoded transcripts within ERGs will also help understand the mechanisms of gene regulation by estrogen. Our data show that APs in ERGs can be regulated either similarly (29% of cases) or differentially (79% of cases) by estradiol (some genes with more than two promoters have both patterns). Both regulation patterns seem to apply to direct ERGs according to the presence of ERα binding sites in the gene vicinity. Coregulation of APs by estrogen was previously shown in the case of the GREB1 gene (18), which we confirmed and extended to various other genes. Differential AP regulation by estrogen is also supported by AP-selective effects on Pol II and pre-mRNA levels (Fig. 4A; Supplementary Fig. S8; ref. 19). Importantly, our data indicate that differential AP regulation may occur not only through selective effects but also through opposite effects on AP expression. Indeed, this study identifies a novel type of ERGs that are regulated by estradiol in a qualitative manner, sometimes in the absence of changes in global gene expression level (e.g., NET1 and NUMA1 genes; Fig. 3; Supplementary Fig. S5). These data also indicate that ERα association in the vicinity of a gene may affect its expression not only quantitatively but also qualitatively. This might explain, in part, why many ERα binding sites in the genome are not associated with changes in the expression level of neighboring genes (16).
We identified several factors that seem to determine, in part, the pattern of AP regulation by estradiol. Firstly, in the case of genes with ERα binding sites located in the close vicinity of promoters, the differential regulation of APs correlated with their differential association with ERα binding sites (Fig. 2C). Secondly, in the case of genes with no “local” ERα binding sites, our data suggest that promoter selectivity may be due to the occurrence of CTCF binding sites between APs (Fig. 2C and D). In support of this hypothesis, CTCF binding sites were recently implicated as insulators of distant ERα binding sites and in the tissue-specific regulation of APs (29, 30, 38). Thirdly, we identified the DDX5/DDX17 family of proteins as mediators of promoter switches induced by estradiol (Fig. 4). These proteins are ERα interactants and coactivators that were also shown to repress transcription in specific promoter contexts (32, 34, 35, 39). Their dual role in promoter activation and repression might rely on their ability to interact with both coactivators (e.g., the histone acetylase CBP) and corepressors (e.g., the histone deacetylase HDAC1; refs. 32–35), thereby leading to either increased or decreased acetylation of histones (Fig. 4D). Our data also support the recent finding that ER coactivators may be involved in promoter downregulation by estrogen (40, 41). This is an important observation because a large proportion of ERGs and APs are downregulated by estrogen (this study and refs. 5, 15).
Interestingly, ERGs with APs DE/DR in response to estradiol were enriched in specific functions relevant to the effects of estrogen on breast cancer cells (e.g., cell cycle; Supplementary Fig. S10). Moreover, our data suggest that the differential regulation of APs by estradiol has physiopathologic significance. Although very few instances of promoter-encoded isoforms associated with prognosis have been described overall (42, 43), this study identifies promoter-encoded isoforms of NET1 and MCM7 as differentially associated with poor prognosis in ER+ breast cancer. Such markers may be useful in the understanding and management of the subset of ER+ breast cancer that has poor prognosis (9, 10). Both NET1 and MCM7 are potentially involved in oncogenesis. A partial NET1 cDNA was cloned in a screen for oncogenes, and both isoforms of NET1 interact with the tumor suppressor DLG1 (44). Although neither isoform of NET1 transformed NIH 3T3 cells (36), NET1 isoforms have differential activities that might underlie their differential association with prognosis. Indeed, our data suggest a selective involvement of the NET1 long isoform in cell proliferation, whereas the short isoform promotes stress fiber formation and cell adhesion (this study and ref. 36). These differential activities may be due to differential protein localization; indeed, whereas the long isoform is located in nuclei, the short isoform can also be detected in the cytoplasm (36). MCM7 plays a major role in DNA replication (45). The MCM7 promoters have the potential to give rise to protein isoforms with different NH2-terminal sequences, but there are no data available about these isoforms. The NET1 and MCM7 genes might be prototypes of genes with AP-encoded isoforms that are differentially involved in ER+ breast cancer. Finally, other genes whose APs are DR by estradiol are thought to play roles in cancer (Supplementary Fig. S5). 
Altogether, this study shows the contribution of AP-encoded isoforms to the estrogen-regulated transcriptome, as well as their physiopathologic significance in breast cancer. More generally, we propose that AP profiling will help improve the analysis of gene regulation and the identification of prognostic markers in cancer.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant Support: Institut National de la Santé et de la Recherche Médicale AVENIR, Ligue Nationale Contre le Cancer (Comité de Paris), Agence Nationale de la Recherche, Institut National du Cancer, Groupement Entreprises Françaises Lutte Contre Cancer, and European Union FP6 (EURASNET). M. Dutertre was supported by Ligue Nationale Contre le Cancer and Institut National de la Santé et de la Recherche Médicale. L. Gratadou was supported by Agence Nationale de la Recherche. S. Germann was supported by Association pour la Recherche sur le Cancer.