In many tumors, cells transition reversibly between slow-proliferating tumor-initiating cells (TIC) and their differentiated, faster-growing progeny. Yet, how transcriptional regulation of cell-cycle and self-renewal genes is orchestrated during these conversions remains unclear. In this study, we show that as breast TIC form, a decrease in cell-cycle gene expression and increase in self-renewal gene expression are coregulated by SOX2 and EZH2, which colocalize at CpG islands. This pattern was negatively controlled by a novel long noncoding RNA (lncRNA) that we named Stem Cell Inhibitory RNA Transcript (SCIRT), which was markedly upregulated in tumorspheres but colocalized with and counteracted EZH2 and SOX2 during cell-cycle and self-renewal regulation to restrain tumorigenesis. SCIRT specifically interacted with EZH2 to increase EZH2 affinity to FOXM1 without binding the latter. In this manner, SCIRT induced transcription at cell-cycle gene promoters by recruiting FOXM1 through EZH2 to antagonize EZH2-mediated effects at target genes. Conversely, on stemness genes, FOXM1 was absent and SCIRT antagonized EZH2 and SOX2 activity, balancing toward repression. These data suggest that the interaction of an lncRNA with EZH2 can alter the affinity of EZH2 for its protein-binding partners to regulate cancer cell state transitions.
These findings show that a novel lncRNA SCIRT counteracts breast tumorigenesis by opposing transcriptional networks associated with cell cycle and self-renewal.
See related commentary by Pardini and Dragomir, p. 535
The clonal model of tumor growth has been revised by the discovery of heterogeneous, tissue-like organization of many cancer types (1, 2). In this view, cancers form hierarchies of tumor-initiating cells (TIC), which give rise to differentiated cells with limited proliferative potential (3). TICs can self-renew, divide indefinitely, and produce differentiated cells within the tumor mass, generating cell states that create intratumor heterogeneity (3). TICs are highly metastatic and, being slow-proliferating, are resistant to chemotherapy, two properties linked to treatment failure and relapse (4). These risks have sparked interest in the potential for TIC-targeting therapies. However, strategies for eliminating TIC populations could be complicated by the dynamic equilibrium that exists between TIC and non-TIC states (3), suggesting a need to target both compartments simultaneously.
The self-renewal capacity that drives the formation and expansion of TICs has parallels to that in embryonic stem cells, with the involvement of pluripotency-associated transcription factors (TF) such as SOX2 and chromatin modifiers such as EZH2 (5, 6). Interestingly, SOX2 and EZH2 also promote cancer cell plasticity in prostate cancer by inducing expression of neuroendocrine markers, which promote metastasis and antiandrogen resistance (7). In addition, EZH2 can display oncogenic Polycomb-independent functions to regulate transcription through binding specific TFs (8, 9).
Certain surface markers can prospectively isolate TIC populations from tumors or cell lines (10). However, the specificity of these markers for TIC populations, especially in triple-negative breast cancer (TNBC), is imperfect (11). As a result, retrospective approaches to enrich for TICs have been developed, which use three-dimensional (3D) culturing conditions and low plating densities to form clonal cultures of TIC-enriched tumorspheres (spheres; refs. 12, 13). By culturing spheres, we have studied the regulatory transcriptional networks that drive TIC formation in breast tumors.
Although TFs controlling self-renewal of TICs have been partially characterized (14), factors that regulate plasticity between TIC and non-TIC compartments remain unknown. We hypothesized that factors driving TIC formation could be counteracted by negative feedback loops keeping cells in a poised state to facilitate easy cell state transitions, with consequences for cancer treatment.
Here, we demonstrate that SOX2 and EZH2 directly repress cell-cycle gene transcription and activate self-renewal by recognizing CpG islands (CGI) in breast cancer cells. We further show that a previously undescribed long noncoding RNA (lncRNA) counteracts these processes, affecting these regulatory networks by negative feedback. Further understanding the regulatory dynamics by which lncRNAs control cell plasticity may aid the development of novel therapies that may not only target the existing TICs but also prevent the dynamic conversion of TICs to non-TICs and vice versa.
Materials and Methods
Mammalian cell culture
Breast cancer cell lines MDA-MB-231, MCF7, SKBR3, T47D, MDA-MB-468, BT549, MDA-MB-453, and BT474 were obtained from the ATCC. MDA-MB-231 and MCF7 cells were grown in DMEM (Sigma), SKBR3 in McCoy's 5a Medium Modified (Sigma), T47D and BT549 in RPMI-1640 Medium (Sigma), MDA-MB-468 and MDA-MB-453 in Leibovitz's L-15 Medium (Sigma), and BT474 in DMEM/F12 Medium (Gibco), supplemented with 10% FCS, 2 mmol/L L-glutamine, 100 U/mL penicillin, and 100 μg/mL streptomycin. Between thawing and use in the described experiments, the cells were passaged no more than 5 times. Cell morphology was routinely checked and compared with images from the ATCC website. All cell lines were tested monthly for mycoplasma (MycoAlert, Lonza) and were always found negative.
Sphere culture (tumorspheres)
MDA-MB-231, MCF7, SKBR3, T47D, MDA-MB-468, BT549, MDA-MB-453, and BT474 cells were plated in single-cell suspension in ultralow attachment plates (Corning, # CLS3471). Cells were grown in serum-free DMEM/F12 medium (Gibco) supplemented with B27 (1:50, Gibco), 20 ng/mL basic fibroblast growth factor (bFGF, Biolegend), and 20 ng/mL EGF (Sigma). Tumorspheres were collected after 16 hours or after 5 days of sphere formation. For the sphere formation assay, breast cancer cells were plated in ultralow attachment plates at a density of 2 × 10³ cells/well for MDA-MB-231 and 1 × 10³ cells/well for MCF7, and spheres larger than 75 μm were counted under the microscope. Sphere formation efficiency (%) was calculated as the number of spheres formed divided by the number of cells seeded, multiplied by 100.
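The efficiency calculation described above can be written out as a short Python sketch. The function name and the example count of 50 spheres are hypothetical and serve only to illustrate the arithmetic; the seeding density of 2 × 10³ cells/well is the one stated for MDA-MB-231.

```python
def sphere_formation_efficiency(spheres_counted: int, cells_seeded: int) -> float:
    """Return sphere formation efficiency as a percentage of cells seeded."""
    if cells_seeded <= 0:
        raise ValueError("cells_seeded must be positive")
    if spheres_counted < 0:
        raise ValueError("spheres_counted cannot be negative")
    # (spheres formed / cells seeded) x 100
    return 100.0 * spheres_counted / cells_seeded


# Hypothetical well: 50 spheres larger than 75 um counted after seeding 2,000 cells.
example_efficiency = sphere_formation_efficiency(50, 2000)
print(f"Sphere formation efficiency: {example_efficiency:.2f}%")
```

Only spheres above the 75 μm size threshold should be passed as `spheres_counted`, in line with the counting criterion given in the text.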
Silencer Select siRNAs were purchased from Ambion. Cells were transfected using Lipofectamine RNAiMAX (Invitrogen) following the manufacturer's recommended protocol. Unless otherwise specified, 25 nmol/L of siRNAs were transfected for 48 or 96 hours. siRNA sequences or catalog numbers can be found in Supplementary Table S8 (sheet #1).
RNA isolation and RT-qPCR assays
Total RNA from cultured cells was extracted using TRI Reagent (Sigma) following the manufacturer's instructions including DNase I treatment. For gene expression, cDNA was synthesized from 1 μg of purified DNase-treated RNA using RevertAid M-MuLV reverse transcriptase and random hexamer primers (Thermo Scientific), according to the manufacturer's protocols. RT-qPCR assays were performed on a StepOne Real-Time PCR System using Fast SYBR Green Master Mix (both from Applied Biosystems).
We calculated the transcript copy number by using a previously published protocol (15, 16). Briefly, RNA relative-copy numbers were determined by RT-qPCR using standard curves and normalized to β-actin levels. The primer sequences used are reported in Supplementary Table S8 (sheet #2).
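As an illustration of standard-curve quantification, the following Python sketch fits a line of Ct against log10(copies) for a dilution series, inverts it for an unknown sample, and normalizes to a reference transcript. It is a generic sketch and not the published protocol (15, 16), whose details are not given here; all Ct values, dilution points, and function names are hypothetical, and using one curve for both amplicons is a simplifying assumption.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_copies(target_copies, reference_copies):
    """Express target copies relative to a reference transcript (e.g., beta-actin)."""
    return target_copies / reference_copies


# Hypothetical 10-fold dilution series of a quantified standard.
standard_copies = [1e3, 1e4, 1e5, 1e6]
standard_ct = [30.0, 26.68, 23.36, 20.04]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Hypothetical Ct values for the target transcript and beta-actin in one sample.
target = copies_from_ct(28.0, slope, intercept)
actin = copies_from_ct(18.0, slope, intercept)
print(f"slope={slope:.2f}, relative copies={normalized_copies(target, actin):.2e}")
```

In practice each amplicon would normally have its own standard curve, because amplification efficiency can differ between primer pairs.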
Primary tumor preparation
Fresh primary samples were collected from Charing Cross Hospital within 1 hour of operating and retrieved in DMEM medium on ice. Samples were cut into pieces < 1 mm using a scalpel, washed once in DMEM medium, and then proteolytically digested for 1.5 to 2 hours in 5 mL of medium containing proteolytic enzymes (100 U/mL hyaluronidase and 3,000 U/mL collagenase). Once cells had appropriately detached from the extracellular matrix (assessed by checking cells under a hemocytometer), they were centrifuged at 200 × g for 10 minutes at room temperature, with supernatant carefully removed and cells resuspended in full medium. Tumor cells were then sorted using Magnet Activated Cell Sorting (MACS) to deplete extraneous cell types, using beads specific for blood cells (Lineage Depletion Kit, Human, MACS) and fibroblasts (Fibroblast Depletion Kit, Human, MACS).
Protein extraction and Western blotting
Cells were harvested in RIPA buffer (Sigma-Aldrich) supplemented with a protease inhibitor cocktail (Roche Applied Science). Cell lysates were centrifuged at 14,000 rpm, and the supernatant was collected. Protein lysates were quantified using a BCA Protein Assay Reagent kit (Pierce, Thermo Scientific). Fifty micrograms of lysates were resolved by SDS-PAGE and transferred to nitrocellulose membranes (Amersham). After blocking in 1X PBS containing 5% (w/v) milk and 0.1% Tween20 (v/v), membranes were incubated with the specific antibodies overnight at 4°C. Membranes were then washed 3 times with 1X PBS containing 0.1% Tween20 (v/v), incubated with horseradish peroxidase–conjugated goat anti-rabbit or goat anti-mouse antibody (Sigma-Aldrich), and washed again to remove unbound antibodies. Bound antibody complexes were detected with SuperSignal chemiluminescent substrate (GE Healthcare). Antibodies used for Western blotting can be found in Supplementary Table S8 (sheet #6).
Results for continuous variables are presented as mean ± SD unless stated otherwise, and significance was determined using the Mann–Whitney U test in GraphPad Prism 8 (GraphPad Software) or R (https://www.r-project.org/). Expression values and statistical analysis for differential gene expression studies were obtained with DESeq2 from Bioconductor or by using GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/). Significance for overlapping genes was computed by using a hypergeometric test. Significant enrichment for Stem Cell Inhibitory RNA Transcript (SCIRT) binding on promoters was determined using CEAS (https://anaconda.org/bioconda/cistrome-ceas). Significant differential expression of SCIRT, FOXM1, EZH2, and SOX2 in tumor, normal, and metastatic samples from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) datasets as well as correlation analyses were obtained by applying the two-sided Welch t test and Pearson correlation, respectively, and computed using R (https://www.r-project.org/). Log-rank P values for survival analyses were computed with Kaplan–Meier plot or cBioPortal.
Differences were considered significant when P values or P-adjusted values were < 0.05.
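The hypergeometric test for overlapping gene sets can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the authors' R code; the gene universe of 20,000 is an assumption (the background size is not stated), and the function name is hypothetical. The set sizes in the example are the upregulated gene counts reported in Results. The P value is returned on a log10 scale because very small values underflow ordinary floating point.

```python
from math import comb, log10


def hypergeom_overlap(universe, size_a, size_b, overlap):
    """Return (log10 upper-tail P, fold enrichment) for an overlap of two gene sets.

    P is the probability of drawing at least `overlap` members of set A when
    `size_b` genes are sampled without replacement from `universe` genes.
    """
    max_overlap = min(size_a, size_b)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap must lie between 0 and min(size_a, size_b)")
    # Exact integer arithmetic avoids rounding error in the tail sum.
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    total = comb(universe, size_b)
    log10_p = log10(tail) - log10(total)
    expected = size_a * size_b / universe
    return log10_p, overlap / expected


# Upregulated genes at 16 hours (2,559) and 5 days (2,636) with 1,482 in common;
# the universe of 20,000 genes is an assumed background, not a reported value.
log10_p, fold = hypergeom_overlap(20_000, 2_559, 2_636, 1_482)
print(f"log10(P) = {log10_p:.1f}, fold enrichment = {fold:.2f}")
```

The fold enrichment is the observed overlap divided by the overlap expected by chance, which is the quantity quoted alongside hypergeometric P values in the Results.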
The dataset supporting the conclusions of this article is available in the Gene Expression Omnibus SuperSeries GSE136195.
Gene expression change during sphere formation reflects the induction of self-renewal and reduction of cell cycle
To evaluate how gene expression changes during sphere formation in aggressive breast cancer, we cultured TNBC MDA-MB-231 cells in adherent (adh) and sphere conditions for 16 hours and 5 days and performed RNA sequencing (RNA-seq; Fig. 1A; Supplementary Table S1). We selected MDA-MB-231 (and later MCF7 for further validation) because these cell lines have been previously reported as good models to study breast cancer stem cells in a study that compared cell lines with primary tissues (17). We chose an early time point (16 hours) to specifically detect early gene expression changes during sphere formation without confounding factors, such as alterations of the geometry within the spheres or limited access to nutrients within the cultures. A 5-day time point was chosen to identify genes common between nascent and mature spheres. After bioinformatic analysis, we identified 2,559 and 2,636 genes that were upregulated in spheres grown for 16 hours and 5 days, respectively, with 1,482 upregulated in common. Furthermore, we uncovered that 2,344 and 2,856 genes were downregulated in spheres grown for 16 hours and 5 days, respectively, whereas 1,547 were downregulated in common (Fig. 1A; Supplementary Table S1, P-adjusted < 0.01). Pathways enriched for genes up- and downregulated in spheres grown for 16 hours and 5 days are shown in Supplementary Table S2. We validated top up- and downregulated genes by RT-qPCR (Supplementary Fig. S1A). As expected, transcripts coding for proteins known to be involved in stem cell renewal, including KLF4, NOTCH1, and TGFB1, were significantly upregulated after 16 hours or 5 days of sphere growth (Supplementary Fig. S1B; Supplementary Table S1, P-adjusted < 0.01), supporting the biological relevance of our model system and analysis. By using the Enrichr tool (18), we found that genes upregulated in sphere culture at both time points were significantly enriched in PI3K-Akt and TGFβ signaling pathways (Supplementary Fig. S1C, P-adjusted < 0.01), compared with cells grown in adh conditions. Each pathway has been shown to induce breast cancer stem cell formation (19, 20). Importantly, analysis of the chromatin immunoprecipitation (ChIP) Enrichment Analysis (ChEA) database (21), using Enrichr, indicated that ZNF217, KDM2B, and SOX2 were the top enriched TFs or chromatin-modifying enzymes that bind within the promoters of observed upregulated genes in spheres (Supplementary Fig. S1C, P-adjusted < 0.01). Each one of these factors has been identified as a strong regulator of breast cancer stem cell self-renewal and tumorigenesis (5, 22, 23). In summary, these results confirm that genes induced during sphere growth are enriched for tumorigenic pathways and regulated by tumorigenic TFs.
Interestingly, transcripts significantly downregulated at 16 hours and 5 days of sphere growth were strongly enriched for cell-cycle–promoting genes and frequently contained binding sites of two master transcriptional regulators of cell cycle, FOXM1 and E2F4, within their promoters (Supplementary Fig. S1D; P-adjusted < 0.01). Overall, these effects indicate concomitant reductions of proliferative and increases of self-renewal gene expression in breast cancer spheres during the adh to sphere transition.
SCIRT lncRNA is upregulated in spheres (16 hours and 5 days) but counteracts stemness
Next, we hypothesized that lncRNAs could regulate transcriptional dynamics observed during sphere formation and breast cancer tumorigenesis. To explore undescribed lncRNAs that could be involved in this process, we searched for long noncoding transcripts that were up- or downregulated in spheres after both 16 hours and 5 days, and displayed a degree of cross-species conservation, using a previously established pipeline (24). Seven lncRNA candidates presented such characteristics (Fig. 1B, P-adjusted < 0.01). We focused on RP5-1120P11.1 (also annotated as LOC101929705 or AL109615.3) because it was strikingly upregulated in spheres (Fig. 1B; Supplementary Table S1, P-adjusted < 0.01), but surprisingly, its depletion strongly induced sphere formation in both MDA-MB-231 and MCF7 breast cancer cells (Fig. 1C and D), suggesting that it may act through negative feedback. We named this lncRNA SCIRT.
Our RNA-seq data indicated several SCIRT isoforms including RP5-1120P11.1 (Supplementary Fig. S2A). To precisely map the transcriptional start sites (TSS) of SCIRT and to understand which one of these two isoforms is most expressed in breast cancer, we analyzed public Cap Analysis of Gene Expression sequencing (CAGE-seq) data from the ENCODE project (https://www.encodeproject.org/), performed in MCF7 breast cancer cells (Supplementary Fig. S2B). RP5-1120P11.1 and isoform n3 present a CAGE-seq signal overlapping with their common TSSs in MCF7 cells, reinforcing the hypothesis that the transcription of SCIRT starts from this region. Within the SCIRT locus, another undescribed transcript is annotated on the opposite strand of SCIRT (C6orf223); however, C6orf223 does not appear to be expressed in our system (Supplementary Fig. S2A). Next, we observed that SCIRT was also upregulated in spheres derived from primary breast cancer specimens (Fig. 2A) and from several other breast cancer cell lines (Supplementary Fig. S2C), except BT549, possibly owing to genetic or epigenetic defects accumulated in this cell line. In addition, we observed that SCIRT expression levels were heterogeneous across breast cancer cell lines (Supplementary Fig. S2C) but, in cells derived from primary tumors, were significantly higher in the more aggressive basal-like subtype than in luminal breast cancer subtypes (Fig. 2A).
Breast cancer cells in which SCIRT expression was depleted by two independent siRNAs (siRNA knockdown efficiency < 90%, Supplementary Fig. S2D) showed increases in sphere formation over two passages (Fig. 2B). In contrast, overexpression of SCIRT in MDA-MB-231 cells (Supplementary Fig. S2E) decreased sphere formation efficiency (Supplementary Fig. S2F). These results indicate that SCIRT RNA restrains breast cancer self-renewal capacity. In addition, when MDA-MB-231 spheres with stable downregulation of SCIRT (shSCIRT) were injected s.c. into the flanks of immunocompromised mice, tumor formation was enhanced, suggesting that SCIRT can also reduce breast cancer tumorigenesis in vivo (Fig. 2C). SCIRT silencing also significantly increased directional cell migration and cell speed (Supplementary Fig. S3A and S3B). In aggregate, these results indicate that SCIRT opposes breast cancer progression.
SCIRT is a chromatin-associated lncRNA that downregulates self-renewal genes and induces cell-cycle genes
To understand the mechanism by which SCIRT opposes stemness, we first looked at its subcellular localization (Fig. 2D and E; Supplementary Fig. S3C and S3D). RNA FISH showed that SCIRT was exclusively located in the nucleus of MCF7 breast cancer cells (Fig. 2D) and that its signal was strongly reduced in cells treated with SCIRT siRNA (siSCIRT). Subcellular fractionation followed by RT-qPCR indicated that SCIRT mainly localizes in the chromatin-associated compartment of spheres derived from TNBC MDA-MB-231 (Fig. 2E) and MDA-MB-468 (Supplementary Fig. S3C) cells. In addition, analysis of RNA-seq from MCF7 fractions (25) also showed chromatin localization of SCIRT in these cells (Supplementary Fig. S3D).
Next, to identify genes regulated by SCIRT, we performed RNA-seq from MDA-MB-231 spheres where we depleted SCIRT using two different siRNAs. SCIRT silencing (siSCIRT#1 and #2 overlap) increased the expression of 653 and decreased the expression of 768 genes (P-adjusted < 0.05, Wald Test; Supplementary Fig. S4A; Supplementary Table S3). Interestingly, SCIRT regulation of gene expression recapitulated pathway enrichment observed for genes modulated in spheres versus adh cells but in the opposite direction. Accordingly, genes induced by SCIRT (downregulated upon SCIRT depletion) were strongly enriched in cell-cycle–related signaling and mitosis, whereas genes repressed by SCIRT (upregulated upon SCIRT silencing) were strongly enriched in stem cell expansion as well as neuronal functions (Fig. 3A; Supplementary Fig. S4A–S4C). This indicates that SCIRT silencing increases sphere formation (Fig. 1C and D; Fig. 2B) because this lncRNA acts by repressing self-renewal and by activating cell-cycle gene expression, operating in negative feedback to fine-tune separate gene expression programs.
SCIRT globally binds to promoter or enhancer regions to increase expression of cell-cycle genes and decrease expression of self-renewal genes
Because SCIRT is an abundant chromatin-associated lncRNA (Fig. 2E; Supplementary Fig. S3C and S3D) with expression levels comparable with TFs that are active in breast cancer, such as SOX2 (Supplementary Table S3), it may regulate gene expression by interacting with several specific chromatin loci. To evaluate this possibility, we performed Capture Hybridization Analysis of RNA targets (CHART) with DNA sequencing (26) to identify chromatin regions bound by SCIRT. Pulldown with SCIRT probes showed strong enrichment (20–25-fold) of the SCIRT transcript compared with DNA oligonucleotides complementary to control LacZ sequence (from Escherichia coli, frequently used as control sequence for this kind of experiment; Fig. 3B; refs. 26–28), indicating that we enriched for specific endogenous regions of SCIRT. Moreover, SCIRT probes did not enrich for MALAT1 or GAPDH transcripts used as negative control RNAs (Fig. 3B). Following CHART, we used massively parallel sequencing (CHART-seq) to identify all the chromatin regions that are associated with SCIRT. Similar to the previous experiment (Fig. 3B), regions bound by SCIRT were depleted of LacZ peaks (Fig. 3C; Supplementary Fig. S5A). Moreover, SCIRT chromatin–binding regions show strong PhastCons signal (given by evolutionarily conserved elements in a multiple alignment) that peaks at their center (Supplementary Fig. S5B), implying that SCIRT interacts with conserved cis-regulatory regions across the genome. Conversely, LacZ peaks do not occur at conserved regions (Supplementary Fig. S5B). By calculating the fold change of SCIRT signal versus input control, followed by metagene profiling, we observed that SCIRT peaks occur in the proximity of TSSs and are depleted at transcriptional termination sites (Supplementary Fig. S5C). Interestingly, the highest levels of SCIRT binding, considering the ratio of observed over expected values, were at promoters and CGIs (Supplementary Fig. S5D; P < 2.2e-16, one-sided binomial test).
Using the HOMER peak caller (29), we detected 15,999 significant peaks for SCIRT (FDR < 0.001), which were common between two independent experiments (Supplementary Table S4). We observed that 12,724 of these peaks were associated with a total of 7,130 genes, being present within 200 kilobases (kb) from their TSSs (Fig. 3D). We found a significant overlap between the genes up- or downregulated following SCIRT depletion and the genes associated with the SCIRT peaks (Fig. 3D, P < 2.2e-16, hypergeometric test, 1.9-fold higher than expected by chance for both upregulated and downregulated genes), suggesting that the association of SCIRT with chromatin has a functional role in the regulation of those genes. Twenty-three percent of SCIRT peaks were located within promoters of genes regulated by SCIRT, as shown by colocalization with promoter-specific H3K4me3 marks (Supplementary Fig. S5E). In addition, 43% of the peaks were located within enhancers, as shown by colocalization with enhancer-specific H3K4me1 histone modifications (Supplementary Fig. S5E), and 32% in undefined positions. In both promoters and enhancers, SCIRT peaked exactly at the center of the valley formed by the histone modification peaks (Supplementary Fig. S5F), indicating that it may interact with proteins within these locations. This indicates that SCIRT can increase or repress gene transcription through association with promoters or enhancers. We also observed that transcripts upregulated by siSCIRT were significantly enriched for genes involved in TGFβ and PI3K-Akt pathways and had pluripotency factors (KDM2B, SOX2, or SOX9) enriched within their promoters (from ChEA; Fig. 3D, bottom-right). On the other hand, transcripts downregulated by siSCIRT were significantly enriched for cell-cycle and DNA replication pathways and had promoters that were enriched for TFs that induce cell cycle (FOXM1, E2F4, or E2F7; Fig. 3D, bottom-left).
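The association of peaks with genes by distance to the TSS can be illustrated with a short Python sketch. This is not the HOMER-based procedure used in the study; the exact assignment rule is not specified, so the sketch assumes that a peak is linked to every gene whose TSS lies within 200 kb of the peak center. All coordinates, gene names, and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

WINDOW_BP = 200_000  # 200 kb distance cutoff stated in the text


def build_tss_index(tss_by_gene):
    """Group TSS positions by chromosome and sort them for binary search."""
    by_chrom = defaultdict(list)
    for gene, (chrom, position) in tss_by_gene.items():
        by_chrom[chrom].append((position, gene))
    index = {}
    for chrom, entries in by_chrom.items():
        entries.sort()
        positions = [position for position, _ in entries]
        genes = [gene for _, gene in entries]
        index[chrom] = (positions, genes)
    return index


def assign_peaks_to_genes(peaks, tss_by_gene, window=WINDOW_BP):
    """Map each peak (chrom, start, end) to genes with a TSS within `window` bp
    of the peak center (assumed rule; see lead-in)."""
    index = build_tss_index(tss_by_gene)
    assignments = {}
    for chrom, start, end in peaks:
        center = (start + end) // 2
        positions, genes = index.get(chrom, ([], []))
        low = bisect_left(positions, center - window)
        high = bisect_right(positions, center + window)
        assignments[(chrom, start, end)] = genes[low:high]
    return assignments


# Hypothetical peaks and TSS annotations.
example_tss = {
    "GENE_A": ("chr1", 150_000),
    "GENE_B": ("chr1", 1_000_000),
    "GENE_C": ("chr2", 500_000),
}
example_peaks = [("chr1", 1_000, 1_200), ("chr1", 900_000, 900_200), ("chr2", 50, 150)]
peak_to_genes = assign_peaks_to_genes(example_peaks, example_tss)
for peak, genes in peak_to_genes.items():
    print(peak, genes)
```

Sorting the TSS positions once per chromosome keeps each lookup logarithmic, which matters when thousands of peaks are matched against a full gene annotation.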
SCIRT interacts with EZH2 to antagonize its polycomb-independent activity
By analyzing public ChIP sequencing (ChIP-seq) databases (ChEA), we showed that KDM2B represents the most enriched factor interacting significantly with the promoters of genes upregulated by siSCIRT and associated with SCIRT peaks (Fig. 3D, bottom-right). KDM2B maintains pluripotency of stem cells by recruiting PRC1 to CGIs, which colocalize with PRC2 (30). Several studies have suggested that nuclear lncRNAs can interact with PRC2 to promote transcriptional gene silencing (31, 32). To evaluate whether SCIRT acts through a similar mechanism, we performed RNA immunoprecipitation (RIP)–RT-qPCR for SCIRT, using an antibody recognizing the PRC2 catalytic component EZH2, as it has been shown that RNA preferentially binds PRC2 proximally to its methyltransferase center (33). In line with our hypothesis, SCIRT is associated with EZH2 in spheres formed from both MCF7 and MDA-MB-231 cells (Supplementary Fig. S6A). We then used streptavidin-binding S1 aptamers fused to four partially overlapping SCIRT fragments (F1–F4) to pull down interacting proteins in vitro, followed by immunoblotting for EZH2 protein, to identify the region of SCIRT where EZH2 binds. We observed that EZH2 interacts preferentially with the 5′ half of SCIRT (Supplementary Fig. S6B; F1 and F2) as predicted by catRAPID (http://service.tartaglialab.com/page/catrapid_group; Supplementary Fig. S6C and S6D). Based on this algorithm, EZH2 interacts with SCIRT at position 226–277 or 626–677 (Supplementary Fig. S6D). Interestingly, G4hunter (http://bioinformatics.ibp.cz:8888/#/analyze/quadruplex) predicted a G-quadruplex structure at position 214–264 of SCIRT. Because it has been shown that EZH2 specifically binds G-quadruplex structures present on RNA transcripts (34), this suggests that EZH2 specifically binds SCIRT by recognizing a G-quadruplex structure in its 5′ region.
Next, to evaluate the functionality of the SCIRT–EZH2 interaction, we silenced EZH2 (siEZH2) and performed RNA-seq in spheres to assess whether EZH2 regulates the same transcripts modulated by SCIRT (Supplementary Tables S3 and S5). Intriguingly, a significant fraction of genes upregulated by siEZH2 were also downregulated by siSCIRT (Fig. 4A, P < 2.2e-16, hypergeometric test; fold enrichment = 7.6), suggesting that SCIRT counteracts EZH2-mediated gene repression in breast TICs. In contrast, genes downregulated by siEZH2 significantly overlap with genes upregulated by siSCIRT (Fig. 4B, P = 1.01e-13, hypergeometric test; fold enrichment = 4.1). These data suggest that SCIRT may interact directly with EZH2 to antagonize both its repressive and activating functions. Next, we wondered whether SCIRT would change the profile of H3K27me3, a histone mark associated with gene repression induced by the EZH2/PRC2 complex, on genes coregulated by SCIRT and EZH2. To assess this, we performed ChIP-seq of H3K27me3 in cells growing in 3D conditions after siSCIRT, siEZH2, or control siRNA (siNC) treatment and assessed its levels across all genes as well as SCIRT-controlled genes, including promoters (Fig. 4C–E). As expected, H3K27me3 signal was elevated throughout the bodies of genes that are not expressed and absent at active genes (Fig. 4C); however, the global H3K27me3 profile did not change upon either siEZH2 or siSCIRT treatment (Fig. 4C). More importantly, H3K27me3 levels were low and did not change upon siSCIRT or siEZH2 treatment, either for SCIRT-regulated genes or their promoters (Fig. 4D and E). Lack of change in H3K27me3 global levels following transient EZH2 inhibition is in line with previous observations (35), and it is likely due to compensation from EZH1 (36). Based on these results, we propose that the gene expression changes observed following siEZH2 and siSCIRT are likely to be Polycomb-independent.
SCIRT interacts with chromatin loci and acts through FOXM1, EZH2, and SOX2
We observed that genes directly activated by SCIRT were enriched for cell-cycle and mitotic genes mostly controlled by FOXM1 (Fig. 3D, bottom-left), and those downregulated by SCIRT were enriched for self-renewal, TGFβ, and PI3K-Akt signaling, transcriptionally controlled by SOX2 or other pluripotency controlling factors (Fig. 3D, bottom-right). As has been shown in adh breast cancer cells (37), we observed that FOXM1 also activates the expression of mitotic genes in cells growing in spheres (Supplementary Fig. S6E). This led us to hypothesize that SCIRT could increase transcription of selected cell-cycle genes by recruiting FOXM1 to their promoters or enhancers, counteracting the Polycomb-independent repression of these genes exerted by EZH2. Accordingly, depletion of EZH2 (siEZH2) in spheres led to an increase in the expression of cell-cycle genes that are both targets of SCIRT and FOXM1 (Supplementary Fig. S6F). At the same time, SCIRT could decrease transcription of self-renewal genes by forming a complex with EZH2 and SOX2 and counteracting their activating effect on those promoters (Supplementary Fig. S6G and S6H). In support of these hypotheses, FOXM1, EZH2, and SOX2 were all necessary for the effect of SCIRT on sphere formation for both MDA-MB-231 (Fig. 4F) and MCF7 cells (Supplementary Fig. S6I), as silencing of any of these factors could prevent increases in sphere growth following downregulation of SCIRT (Fig. 4F; Supplementary Fig. S6I). Effects seen on sphere formation appear additive for EZH2 and SOX2, suggesting they may act independently (Fig. 4F). In addition, we observed that the copy number of SCIRT transcripts per sample was similar to the copy number of EZH2 or SOX2 mRNAs (Supplementary Fig. S6J: i, ii, and iii), indicating that they reach stoichiometric ratios and that SCIRT is an abundant lncRNA able to interact with several genomic regions together with these factors.
To assess this effect at the gene level, we performed ChIP-seq for EZH2 and SOX2 in spheres, reanalyzing publicly available ChIP-seq data for FOXM1 (GSE40762) as well as H3K4me1 and H3K4me3 (GSE124379) from MDA-MB-231 cells and integrating these data with our SCIRT-CHART-seq and our ChIP-seq from H3K27me3. K-means clustering of these peaks indicated that SCIRT, FOXM1, EZH2, and SOX2 mostly colocalize close to TSSs of a fraction of genes regulated by SCIRT (Fig. 5A). However, SOX2 and EZH2 colocalized with SCIRT at promoters of genes upregulated upon siSCIRT treatment, as anticipated (Fig. 5B and C; Supplementary Fig. S6A), but surprisingly also at promoters of cell-cycle genes downregulated upon SCIRT knockdown together with FOXM1 (Fig. 5B and D; Supplementary Fig. S6B). Intriguingly, this indicates that in addition to activating genes involved in self-renewal (Supplementary Fig. S6G; ref. 5), SOX2 can also regulate the transcription of cell-cycle genes that are targets of FOXM1 and SCIRT during sphere formation. In line with this, depleting SOX2 using siRNAs (siSOX2) increased the expression of a set of genes involved in cell cycle (Fig. 5E), which are also increased by siEZH2 (Supplementary Fig. S6F) but reduced by both siSCIRT and siFOXM1 (Fig. 6F; Supplementary Fig. S6E; Supplementary Table S3). As expected, genes increased by siSCIRT (Supplementary Table S3) were downregulated by siSOX2 and siEZH2 (Supplementary Fig. S6G and S6H). In aggregate, these data suggest that SCIRT, EZH2, and SOX2 colocalize at promoters of their target genes but that SCIRT antagonizes EZH2 and SOX2 in the transcriptional regulation of those genes.
In embryonic stem cells, the PRC2 complex and consequently H3K27me3 colocalize at promoters of developmental regulators with the pluripotency TFs SOX2, NANOG, and OCT4 (38). Importantly, in mammals, PRC2 is enriched at genomic regions with high GC content, which are enriched with CGIs (39). By applying K-means clustering in CHART-seq and ChIP-seq peaks at promoters of SCIRT-regulated genes, we showed that EZH2 and SOX2 colocalize next to SCIRT peaks and exactly on CGIs (Supplementary Fig. S7A). However, these two proteins not only colocalized within promoters of genes regulated by SCIRT or CGIs but much more broadly at multiple locations with high GC content (Supplementary Fig. S7B; Supplementary Fig. S8A and S8B). Accordingly, SOX2 and EZH2 global binding profiles showed a striking positive correlation of R² = 0.91 (Spearman's rank correlation, Supplementary Fig. S7B), and both factors showed a strong positive correlation with GC% (Supplementary Fig. S8B; EZH2∼GC% = 0.86; SOX2∼GC% = 0.77).
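Spearman's rank correlation between two binding profiles, or between a binding profile and GC content, can be sketched in Python as the Pearson correlation of average ranks. The binned signal and GC values below are hypothetical, the function names are illustrative, and this is not the code used to generate the reported coefficients.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-bin ChIP-seq signal and GC fraction for six genomic bins.
ezh2_signal = [0.2, 1.5, 3.1, 0.4, 2.2, 4.0]
sox2_signal = [0.1, 1.9, 2.8, 0.6, 2.0, 3.5]
gc_fraction = [0.38, 0.52, 0.61, 0.41, 0.49, 0.66]
print(f"EZH2 vs SOX2: {spearman(ezh2_signal, sox2_signal):.2f}")
print(f"EZH2 vs GC:   {spearman(ezh2_signal, gc_fraction):.2f}")
```

Because the statistic depends only on ranks, it is insensitive to the scale differences that usually exist between ChIP-seq libraries.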
SCIRT increases cell-cycle gene expression by binding EZH2 to induce EZH2–FOXM1 interaction within promoters
By specifically looking at promoters of genes regulated by SCIRT and by sorting regions for up- and downregulated genes, we found that FOXM1 mostly colocalized with SCIRT on promoters of genes downregulated by siSCIRT that are enriched for cell cycle and have a CHR motif (Fig. 5B). Importantly, promoters bound to FOXM1 and SCIRT were strongly depleted of H3K27me3 signal (Fig. 5B; Supplementary Fig. S8), indicating that the SCIRT/FOXM1/EZH2 complex acts on gene transcription independently of PRC2 activity. Further supporting this, the promoter of CCNA2, a direct target of both FOXM1 and SCIRT (Supplementary Fig. S8B; Supplementary Table S3), was enriched for EZH2 but not SUZ12 binding (Supplementary Fig. S9A). These data also indicate that in spheres, EZH2 activity is Polycomb-independent, regulating a subset of genes that are directly controlled by SCIRT/FOXM1/EZH2.
Interestingly, RIP–RT-qPCR for SCIRT after using an antibody against FOXM1 indicates that FOXM1 also interacts with SCIRT (Supplementary Fig. S9B), but suppression of FOXM1 did not reduce EZH2 binding with SCIRT (Fig. 6A). When we performed RIP after UV crosslinking (X_RIP), which only detects direct RNA–protein interactions, we found that only EZH2, not FOXM1, was able to form an RNA–protein complex with SCIRT (Supplementary Fig. S9C and S9D), indicating that FOXM1 binds to SCIRT indirectly. Next, by using coimmunoprecipitation, we found that FOXM1 physically interacted with EZH2 in spheres formed by MCF7 or MDA-MB-231 (Fig. 6B), but depletion of SCIRT completely disrupted this interaction (Fig. 6C, left and right plots).
Mechanistically, although siSCIRT treatment for 48 hours did not affect FOXM1 protein levels (Fig. 6D), it reduced FOXM1 (Fig. 6E) as well as Pol II and H3K27ac levels at promoters of cell-cycle genes (Supplementary Fig. S10A and S10B). The effect of siSCIRT on FOXM1 recruitment at cell-cycle gene promoters was stronger in MDA-MB-231 than MCF7 cells (Fig. 6E). This is likely due to the higher expression levels of SCIRT in MDA-MB-231 compared with MCF7. This suggested that cell-cycle genes are repressed by EZH2 (Supplementary Fig. S6F), but that SCIRT activates their transcription by recruiting FOXM1, Pol II, and histone acetyltransferases to promoters of cell-cycle genes to counteract the slow proliferation of spheres and induce differentiation. In aggregate, we revealed that SCIRT interacts with EZH2 to promote EZH2–FOXM1 protein–protein interactions at promoters of cell-cycle genes to constrain the transcriptional repression exerted by EZH2 on these genes. This fine-tunes their expression during TIC formation.
Because these genes are repressed during TIC formation whereas SCIRT is upregulated, SCIRT acts in a negative-feedback loop to attenuate this transcriptional response.
Single-cell RNA-seq analysis shows SCIRT is mostly present in primary breast cancer cells that express pluripotency TFs
Our results point to the role of SCIRT in counteracting stemness of TICs. It is thought that TICs are aggressive self-renewing tumor cells that are controlled by pluripotency TFs (40). To evaluate whether SCIRT is coexpressed with pluripotency TFs in breast cancer cells, we reanalyzed a public single-cell RNA-seq (scRNA-seq) study (GSE75688) performed in 515 cells isolated from 11 primary human breast cancers and representing all 4 breast cancer subtypes (luminal A, luminal B, HER2+, and TNBC; Supplementary Fig. S11; ref. 41). We identified two different HER2+ subtypes that cluster with two different patients (clusters 1 and 6, Supplementary Fig. S11A), and cluster 6 had SOX2 as a discriminative marker (Supplementary Fig. S11B and S11C; Supplementary Table S6, P-adjusted < 0.01). Interestingly, the HER2+ breast cancer subtype represented by cluster 6 (Supplementary Fig. S11A) contained the highest proportion of cells expressing SCIRT, and SCIRT was not expressed in stromal cells (Supplementary Fig. S11B and S11C). In this tumor, cells expressing SCIRT were fewer (20%) but had an average expression level similar to that of SOX2 (Supplementary Fig. S11C), indicating that both can reach a stoichiometric ratio in primary tumor cells as well as in cell lines. These data indicated that this HER2+ subtype, which has cells with high levels of SCIRT, is less differentiated than the second (cluster 1) and probably more aggressive. In line with this hypothesis, this HER2+ subtype also had high expression of additional pluripotency TFs, such as KLF4, EZH2, and ZNF217, which are strongly involved in breast cancer stemness (Supplementary Fig. S11B and S11C; ref. 22), in addition to SOX2. Finally, specific gene markers that characterize this tumor (Supplementary Table S6, P-adjusted < 0.01) were mostly regulated by ZNF217, SOX2, and FOXM1 TFs (Supplementary Fig. S11D), confirming that SCIRT is more expressed in less differentiated/more aggressive tumors.
POU5F1, which encodes OCT4, was detected in only a very small number of cells (Supplementary Fig. S11B and S11C), despite its documented role in breast cancer stemness and tumorigenesis (42), probably because its expression levels are mostly below the limit of detection of this scRNA-seq experiment.
SCIRT is upregulated in breast cancer specimens, but its high expression is associated with good prognosis
Next, we evaluated the clinical significance of this SCIRT-controlled transcriptional regulatory network. We first measured SCIRT, FOXM1, EZH2, and SOX2 expression from RNA-seq data obtained from 1,391 specimens from the TCGA and GTEx cohorts and related expression levels of these genes to available TCGA clinical data. Similar to FOXM1 and EZH2, SCIRT was significantly more expressed in breast cancers compared with normal tissues (Welch t test, P < 2.2e-16; Supplementary Fig. S12A–S12C) and showed the highest expression levels in the basal and HER2+ subtypes, which represent more aggressive tumor types (Supplementary Fig. S12D–S12F). We observed similar trends in cells derived from primary tumors (Fig. 2A; Supplementary Fig. S11B and S11C). However, differences detected for both FOXM1 and EZH2 were generally more dramatic and less variable than those observed for SCIRT (Supplementary Fig. S12A–S12F). Surprisingly, SOX2 levels showed a different pattern of expression in breast cancers compared with SCIRT, FOXM1, and EZH2 (Supplementary Fig. S12G), suggesting that the role of SOX2 in the SCIRT/EZH2/FOXM1 regulatory network is context-specific and can occur only in rarer TICs. In general, SCIRT levels correlated positively with both EZH2 and FOXM1 (Supplementary Fig. S13A and S13B) but did not correlate with SOX2 in either cancer or normal samples (Supplementary Fig. S13C). However, FOXM1 correlated better with EZH2 than with SOX2 (Supplementary Fig. S13D and S13E).
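For readers less familiar with the Welch t test named above, the following minimal sketch computes the Welch statistic and the Welch–Satterthwaite degrees of freedom for two groups with unequal variances. The expression values are invented toy numbers, not TCGA or GTEx data, and `welch_t` is a hypothetical helper name; obtaining a P value from the statistic additionally requires the Student t distribution, which a statistics library would normally supply.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t test on samples a and b."""
    # Squared standard error contributed by each group (sample variance / n).
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Toy log-expression values (assumed for illustration only).
tumor_expr = [5.1, 6.3, 5.8, 7.0, 6.4]
normal_expr = [3.9, 4.2, 4.0, 4.4]
t_stat, dof = welch_t(tumor_expr, normal_expr)
```

Unlike the pooled-variance Student t test, this form does not assume that tumor and normal groups share a variance, which suits comparisons between cohorts of different size and heterogeneity.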
Strikingly, although high expression levels of FOXM1 and EZH2 were associated with worse disease-free survival (DFS), high expression levels of SCIRT were associated with better DFS (Supplementary Fig. S14A). These results are in line with our findings that SCIRT is coexpressed with oncogenic EZH2 and FOXM1 and coregulates gene expression with them but acts in the opposite direction, restraining tumorigenesis when upregulated in tumorigenic cells.
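Survival comparisons of this kind are commonly summarized with Kaplan–Meier curves for the high- and low-expression groups. The sketch below shows the estimator itself under stated assumptions: the follow-up times and event flags are invented, `kaplan_meier` is a hypothetical helper name, and the section does not specify which survival method or software was used for Supplementary Fig. S14A.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    times  -- follow-up time for each patient
    events -- 1 if relapse/death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= n_leaving
    return curve


# Toy follow-up in months (assumed values, not patient data).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients contribute to the risk set until their last follow-up but never lower the curve themselves, which is what allows cohorts with incomplete follow-up to be compared.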
Next, anticipating that genes that are repressed by SCIRT (upregulated by siSCIRT) are oncogenic, we selected the top 100 genes with greatest upregulation upon depletion of SCIRT by 2 independent siRNAs (Supplementary Table S7) and evaluated their expression levels, clinical relevance, and mutational landscape in both the TCGA (43) and the METABRIC breast cancer datasets (44) interrogating a total of 3,334 breast cancer samples (Supplementary Fig. S14B). These genes were upregulated in almost all breast cancer specimens (Supplementary Fig. S14C), probably due to genome amplification at the genomic loci containing these genes (Supplementary Fig. S14D). Finally, breast cancer samples presenting amplification in these loci (Supplementary Fig. S14E) and samples with high expression of these genes (Supplementary Fig. S14F) presented worse overall survival.
In summary, we describe a novel lncRNA named SCIRT to be upregulated in TICs in breast cancer but to counteract tumorigenesis despite its increased expression. This suggests that re-expression of SCIRT or inhibition of genes that are strongly repressed by SCIRT (Supplementary Fig. S14B) in patients could represent a useful therapeutic approach to tackle the dynamic process of TIC self-renewal and differentiation in breast cancer. We also find that genes repressed by SCIRT are frequently amplified in breast cancer. Taken together, this gene set can be used as a novel signature to stratify the disease with diagnostic and prognostic implications.
Using an lncRNA that we describe for the first time, SCIRT, we expand our understanding of the transcriptional regulation of stemness and proliferative expression programs active in breast TICs. EZH2 and SOX2 are TFs able to increase self-renewal capacity of both embryonic stem cells and cancer stem cells (5, 6, 45). By using a breast cancer tumorsphere formation assay, which enriches for slow-proliferating cancer stem cells or TICs, we observed that these two factors widely colocalize within GC-rich chromatin regions, including CGIs located within promoters. When they interact with specific promoters, they repress cell-cycle gene expression and activate self-renewal and neuronal gene programs in breast cancer cells that grow in 3D conditions, indicating that SOX2 and EZH2 are also crucial factors that instruct TICs to proliferate slowly.
We further observed that this transcriptional program regulated by SOX2 and EZH2 in breast tumors does not occur unopposed but is counteracted by SCIRT. Although SCIRT is strongly upregulated during tumorsphere formation, it unexpectedly counteracts the effects of EZH2 and SOX2 on the transcription of many self-renewal and cell-cycle genes. By acting in this negative-feedback loop, SCIRT increases cell-cycle and represses self-renewal transcriptional programs of these cells (Supplementary Fig. S15). Our data are in line with the hypothesis that SCIRT acts as a regulator that only reduces transcription of genes involved in tumorigenesis without fully repressing their activity. In this manner, SCIRT tends to be more expressed in tumors than in normal cells, but when it is expressed, it is associated with a more favorable prognosis.
We show for the first time that during sphere growth EZH2 binds to cell-cycle gene promoters to act as a corepressor with its binding partner FOXM1, restraining its effects on transcription. SCIRT directly interacts with EZH2 to increase EZH2–FOXM1 interaction and recruit more FOXM1 to promoters to fine-tune this cell-cycle transcriptional program (Supplementary Fig. S15). Previously, it was reported that FOXM1 and EZH2 interact during hypoxia in breast cancer to activate MMP2 and MMP7 transcription (46), suggesting that SCIRT may mediate the interaction between FOXM1 and EZH2 and their transcriptional effects during hypoxia as well. However, to the best of our knowledge, the importance of the EZH2–FOXM1 antagonistic interaction for the regulation of cell-cycle gene transcription has not been described elsewhere. Our results suggest that this effect mediated by SCIRT is specifically related to an EZH2–FOXM1 interaction that seems to be independent of the PRC2 complex. When selected lncRNAs interact with EZH2, it appears they may change EZH2's affinity for its protein-binding partners. It would be interesting to evaluate whether the binding of selected lncRNAs to EZH2 also changes PRC2 complex composition and activity rather than its recruitment to chromatin as widely described, or whether this action may be mediated by SCIRT alone in other cellular contexts.
Unexpectedly, we also found that SOX2, a TF that promotes self-renewal gene expression in TICs (5), also colocalized with EZH2 on promoters of cell-cycle genes to repress their transcription. Yet transcription from these promoters occurs during sphere growth, due to the dominant effects of SCIRT and FOXM1. Supporting this novel role for SOX2 in cell-cycle regulation, alongside its established function in TIC self-renewal, SOX2 has been reported to repress pro-proliferative cell-cycle genes in cortical progenitors to maintain their slow proliferative state (47).
Although we demonstrate that the interaction between FOXM1 and EZH2 is modulated by SCIRT, FOXM1 is mostly absent from the promoters of self-renewal genes upregulated upon SCIRT depletion. We suggest that FOXM1 is not present within these genomic positions despite the presence of SCIRT and EZH2 because these genomic sites do not have a CHR motif (Fig. 5B), which is the FOXM1 consensus site essential for its DNA binding (48). Additionally or alternatively, a higher GC content or (consequently) high levels of EZH2/SOX2 interactions within these promoters may prevent FOXM1's DNA binding. In support of this latter view, we observed FOXM1-binding site peaks in promoters of genes downregulated upon SCIRT depletion that are adjacent to EZH2- and SOX2-binding sites (Supplementary Fig. S7A; Supplementary Fig. S8A), indicating that FOXM1 cannot interact with DNA in regions occupied by EZH2 and SOX2 but resides next to them. As reported by others in embryonic stem cells, we did not find a physical interaction between EZH2 and SOX2 in breast cancer, suggesting that their colocalization depends on DNA rather than protein–protein interaction (49).
In aggregate, based on these data, we propose that SCIRT acts as a tumor suppressor in breast cancer, despite appearing to be more expressed in TICs compared with their more differentiated counterparts and in tumors compared with healthy control tissues. Indeed, although SCIRT is strongly coexpressed with FOXM1 and EZH2, high expression of FOXM1 and EZH2 in clinical samples predicts poor DFS, whereas DFS is better in patients with high levels of SCIRT, supporting the idea that SCIRT is induced in tumorigenic cells but counteracts their aggressive properties (Fig. 2; Supplementary Fig. S2E and S2F).
Future cancer treatments may include TIC and differentiated cell–targeting components. Inducing SCIRT or SCIRT-like factors that promote differentiation toward chemotherapy-vulnerable cell states could enhance the success of such future approaches.
P. Cathcart reports grants from Medical Research Council (MRC) during the conduct of the study. S. Ottaviani reports grants from Action Against Cancer during the conduct of the study. J. Stebbing's conflicts can be found at https://www.nature.com/onc/editors, none of which are relevant here. L. Castellano reports grants from Action Against Cancer during the conduct of the study. No disclosures were reported by the other authors.
S. Zagorac: Investigation, visualization, methodology. A. de Giorgio: Investigation, methodology. A. Dabrowska: Validation, investigation. M. Kalisz: Investigation, methodology. N. Casas-Vila: Investigation, methodology. P. Cathcart: Investigation, methodology. A. Yiu: Investigation, methodology. S. Ottaviani: Investigation. N. Degani: Investigation, methodology. Y. Lombardo: Investigation. A. Tweedie: Investigation, methodology. T. Nissan: Investigation, methodology. K.W. Vance: Supervision, methodology. I. Ulitsky: Supervision, methodology. J. Stebbing: Resources, funding acquisition. L. Castellano: Conceptualization, data curation, formal analysis, supervision, validation, methodology, writing-original draft, project administration, writing-review and editing.
The authors thank Action Against Cancer (grant numbers: P75997_WSCC and PF9671), the Searle Memorial Trust, Charles and Diane Herlinger, Alessandro Dusi, Julian and Cat O' Dell, Sofiya Machulskaya, and Jackie McCarthy for funding this study. This work used the computing resources of the UK MEDical BIOinformatics partnership—aggregation, integration, visualization, and analysis of large, complex data (UK MED-BIO)—which is supported by the Medical Research Council. This study was supported by the Sussex University and the Imperial BRC and ECMC.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.