Glioblastoma is a lethal brain tumor that exhibits heterogeneity and resistance to therapy. Our understanding of tumor homeostasis is limited by a lack of genetic tools to selectively identify tumor states and fate transitions. Here, we use glioblastoma subtype signatures to construct synthetic genetic tracing cassettes and investigate tumor heterogeneity at cellular and molecular levels, in vitro and in vivo. Through synthetic locus control regions, we demonstrate that proneural glioblastoma is a hardwired identity, whereas mesenchymal glioblastoma is an adaptive and metastable cell state driven by proinflammatory and differentiation cues and DNA damage, but not hypoxia. Importantly, we discovered that innate immune cells divert glioblastoma cells to a proneural-to-mesenchymal transition that confers therapeutic resistance. Our synthetic genetic tracing methodology is simple, scalable, and widely applicable to study homeostasis in development and diseases. In glioblastoma, the method causally links distinct (micro)environmental, genetic, and pharmacologic perturbations and mesenchymal commitment.
Glioblastoma is heterogeneous and incurable. Here, we designed synthetic reporters to reflect the transcriptional output of tumor cell states and signaling pathways' activity. This method is generally applicable to study homeostasis in normal tissues and diseases. In glioblastoma, synthetic genetic tracing causally connects cellular and molecular heterogeneity to therapeutic responses.
This article is highlighted in the In This Issue feature, p. 521
The cellular and molecular heterogeneity of cancer is thought to contribute to resistance to targeted and immune therapies. Glioblastoma (GBM) is the most common, heterogeneous, and resistant primary adult brain tumor (1). Compared with most cancers, GBM is highly genomically and epigenomically characterized (2–5). Lineage tracing has provided important insights into GBM biology in the mouse, including the cellular origins of individual subtypes (6), and how aberrant homeostatic regulation can affect responses to treatments in vivo (7).
Transcriptome analyses of human GBM biopsies have repeatedly yielded a general classification into three subtypes, classic (CL), mesenchymal (MES), and proneural (PN), across cohorts (2–5). However, a single GBM tumor may exhibit the coexistence of a predominant subtype along with tumor cells of other subtypes (8, 9). In addition, recurrent tumors exhibit plasticity of genotype and phenotype, in which a dominant mutation and expression profile changes after treatment (2, 10). This makes it difficult to discern whether GBM subtypes represent hardwired entities or transient states imposed by differences in signaling or tumor regions. Yet, several studies established correlations between subtype-specific gene-expression signatures, differential response to therapy, and overall patient survival; the latter is particularly poor in highly mesenchymal tumors that exhibit infiltration of innate immune cells at recurrence (5). A better understanding of GBM subtype identities and fate changes might be crucial to develop effective therapies.
Technological advances in single-cell biology confirmed and extended our understanding of the complexity of brain tumor homeostasis (11, 12). Yet, as single-cell RNA sequencing (scRNA-seq) has increasingly expanded the catalogs of cell populations within tumors, the biological interpretation of novel cell types and disease states has remained difficult, due to a lack of experimental approaches for validation (13). Advanced experimental approaches will be required. Genetic tracing is extremely informative in developmental settings and diseases involving alterations in tissue homeostasis including metabolic, immunologic, neurologic, or psychiatric disorders, as well as inflammation and cancer (14–17). However, despite the expansion of sophisticated perturbation tools, such as CRISPR/Cas9 and optogenetics (14), genetic tracing has yet to be fully exploited in the dissection of complex disease traits. The pressing need for a novel strategy to characterize brain tumor heterogeneity prompted us to design synthetic reporters based on gene-expression signatures. These reporters are designed to integrate multiple pathways into a single genetic cassette, thereby mimicking endogenous regulatory elements. As an example, the β-globin locus control region shows cell type– and developmental stage–specific expression and engages transcription factors independently of its genomic position (15, 16).
Our method permits us to genetically label individual cell populations that share a similar state or undergo similar fate transitions within a heterogeneous tissue in vitro and in vivo, and to discover mechanisms regulating proneural-to-mesenchymal transition. Whereas a hierarchical and directional organization of the subtypes is continuously revised (2, 5, 12, 17), through synthetic genetic tracing in vitro and in vivo, we observed that interconversion between proneural and mesenchymal states is bidirectional. Our results are clinically relevant because they expose a causal connection between radiotherapy or innate immune cell infiltration and mesenchymal transdifferentiation, which was previously hypothesized on the basis of correlative analyses. Notably, we link innate immunity and mesenchymal commitment to the acquisition of selective resistance to therapies, which holds translational implications.
GBM Subtype Genetic Tracing by Synthetic Locus Control Regions
To trace complex GBM expression subtype identities, we developed a method that generates synthetic reporters. Our method relies on robust evidence that transcription factors govern cell identity or states through binding to cis-regulatory elements, which in turn control downstream target genes. From The Cancer Genome Atlas (TCGA) data sets (3), we annotated the genes specific to mesenchymal, classic, and proneural subtypes. Cell-intrinsic signature genes [i.e., differentially regulated genes (DRG)] were selected as genomic loci containing subtype-specific cis-regulatory elements (Fig. 1A and B; Supplementary Table S1). Next, we used gene expression and standard annotation tools to identify their regulators (i.e., genes encoding ubiquitous and tissue-specific transcription factors). Finally, we identified DNA elements with the potential of driving GBM subtype–specific expression according to the following criteria: a high transcription factor binding sites number (i) and diversity (ii), as well as a known distance from the nearest endogenous transcriptional start site (iii; Supplementary Table S1). This approach produced a list of potential cis-regulatory elements, including loci with endogenous binding by the predicted transcription factors at endogenous level (Supplementary Fig. S1A). Next, we generated distinct synthetic locus control regions (sLCR) to genetically trace mesenchymal, classic, and proneural GBM subtypes. This was accomplished by stitching 5 to 6 of the identified cis-regulatory elements, representing 40% to 60% of the regulatory potential (Supplementary Table S1 and Methods). Hereafter, we refer to these GBM subtype genetic tracing tools as MGT, CLGT, and PNGT.
The GBM subtype–specific reporters we generated can inform on the transcriptional identity of subsets of patients' GBM single cells. Consistently, single-sample gene set enrichment analysis (ssGSEA) showed a high correspondence between the potential expression pattern of each individual reporter, the known cell states of freshly purified single GBM cells (12), and their corresponding TCGA subtype (Fig. 1C). Each sLCR encodes the subtype-specific expression of a fluorescent reporter (mVenus or mCherry) and a second cassette expressing the nuclear H2B–CFP fusion via a ubiquitous viral promoter enabling reporter-independent selection (Fig. 1D). Reporter expression was validated in live transiently transfected cells, in stably transduced and cryosectioned tumorspheres, and in fixed tumor cells (Supplementary Fig. S1B–S1D). In the latter, dual RNA FISH and immunofluorescence demonstrated colocalization between nascent MGT#1 RNA and MED1, a master regulator organized in coactivator puncta and regulating endogenous cell-identity LCRs (also known as superenhancers; ref. 18).
To test the relative expression of synthetic reporters, which are representative of two opposite GBM subtypes, we next transduced proneural and mesenchymal sLCRs into a near-isogenic pair of human glioma-initiating cells (hGIC). These cells were engineered by genetically manipulating spontaneously immortalized human neural progenitor cells (19) and have a proneural-like expression signature, possibly inherited from the cell of origin (Supplementary Fig. S1E). In addition to preexisting and shared aberrations, the near-isogenic pair background consists of p53 and NF1 depletion or IDH1R132H and p53R273H mutant overexpression. Both lines display a DNA methylation profile concordant with IDH1 status in patients with high-grade glioma and are hereafter referred to as IDH-wild-type (WT)-hGICs and IDH-mutant (mut)-hGICs, respectively (data not shown). FACS quantification showed that the proneural reporter PNGT#2 is more expressed than the mesenchymal one in both near-isogenic lines, but MGT#1 expression was even lower in the IDH-mut genotype (Fig. 1D and E), possibly due to epigenetic suppression by IDH1R132H-dependent production of the oncometabolite 2-hydroxyglutarate (20).
Next, we tested the specificity of our reporters by extending the analysis of the proneural and mesenchymal reporters to patient-derived GBM stem-like cells (GSC), lung and breast cancer cell lines of epithelial origin, and cell lines of nonepithelial cancer origin such as leukemia cells. Each line was purified using FACS to select reporter-expressing cells. We detected each reporter using qRT-PCR and normalized their expression through endogenous GAPDH and the number of reporter integrations into genomic DNA (see Methods).
Consistent with the specificity of the design, both reporters were highly expressed in patient-derived GSCs and exhibited low expression in leukemia cells (Fig. 1D). In addition to line-to-line heterogeneity, an interesting common pattern was the high expression of the mesenchymal reporter MGT#1 in well-established mesenchymal cells independently of the tissue of origin (i.e., GBM166 and MDA-231; Supplementary Fig. S1E) and in epithelial cells committed to mesenchymal fate through TGFB signaling (i.e., A549+ TGFβ1; Fig. 1D).
Neurosphere culture conditions permit propagation of stem-like and short-lived progenitors and spontaneous differentiation to occur (21). We exploited this limited degree of heterogeneity to test whether cells with high reporter expression are more homogeneous than the bulk. Under these conditions, both near-isogenic hGICs showed high expression of PNGT#1/#2 and CLGT#1/#2 and low expression of MGT#1/#2 (Supplementary Table S1). FACS-purified IDH-mut-hGICs and IDH-WT-hGICs modified by selected reporters could be distinguished by RNA-seq and principal component analysis. Moreover, IDH-WT-hGICs with high reporter expression were less variable than bulk hGICs (Supplementary Fig. S1F). GSEA demonstrated that high expression of MGT#1 in IDH-WT-hGICs enriched for mesenchymal GBM gene sets compared with PNGT#2 or bulk unsorted cells (Fig. 1F). Consistently, bulk and PNGT#2 cells were both enriched for proneural and cell-cycle gene sets. Interestingly, mesenchymal GBM gene sets were more enriched in IDH-WT than in IDH-mut cells (Fig. 1F). Together, the data indicate that reporter expression by FACS reflects cell states at the endogenous level of gene expression. Finally, the signature genes retrieved in high MGT#1-expressing cells did not change when compared with either high proneural or classic reporter-expressing cells. Still, OLIG1 and OLIG2 marked proneural reporter-expressing cells, whereas CCNE2 was enriched in classic reporter-expressing cells (Supplementary Fig. S1G), suggesting that each reporter enriches for cells in a specific state.
In summary, this method can systematically leverage bulk or single-cell gene-expression data representing biologically or clinically relevant phenotypes into synthetic reporters, which preserve critical features of endogenous cis-regulatory elements and permit a genetic tracing of cell states.
Synthetic Genetic Tracing In Vivo Reveals GBM Heterogeneity and Hierarchies
The proneural GBM is considered the ancestor of all subtypes (22) and to reflect an oligodendrocytic origin (23). Recently, however, a mesenchymal-to-proneural hierarchy emerged by in silico transcriptomic lineage tracing in single cells (17).
To test whether sLCRs allow for a genetic tracing of tumor cell–fate changes in vivo, we intracranially transplanted IDH-WT-hGICs-MGT#1 into immunodeficient mice. Histologically, all tumors appeared as grade IV GBM (n = 10), with a large proportion of the mouse brain infiltrated by malignant cells, indicating extensive proliferation and invasion (Fig. 2A). IHC staining revealed MGT#1 expression was well confined within tumor areas, particularly at the invasive front (Fig. 2A and B). We detected tubulin immunostaining in MGT#1-negative cells as well as H2B–CFP puncta formation, facilitated through mitotic chromatin condensation in dividing cells. This ruled out that the cells expressing the reporter were simply located in areas of higher epitope accessibility or were escapers to lentiviral silencing (Fig. 2B–D). Thus, MGT#1 expression reflects functional intratumoral heterogeneity.
Next, we exploited dual-reporter combinations to gain insights in the dynamics of cell states in vivo. For this experiment, we generated IDH-WT or IDH-mut lines that carried one mVenus-driven mesenchymal reporter (MGT#1 or MGT#2) and one mCherry-driven nonmesenchymal reporter (PNGT#1, PNGT#2, CLGT#1, or CLGT#2). Upon xenograft formation (n = 18), we applied t-distributed stochastic neighbor embedding (t-SNE) to categorize parallel in vivo/in vitro flow cytometry data; this consisted of hierarchical components including cell shape, granularity, viability dyes, mesenchymal and nonmesenchymal fate reporters (Supplementary Fig. S2A and Methods). Strikingly, we observed a proportional increase of IDH-WT mesenchymal reporter–expressing cells in vivo (P < 0.0001), and alongside a concomitantly milder but significant decrease was observed for classic or proneural reporter–expressing cells (P < 0.01; Fig. 2E–G) when compared with their in vitro culture counterparts. In contrast, proportions of MGT#1-high (MGT#1hi) IDH-mut-hGICs did not increase in vivo (Supplementary Fig. S2B and Methods), raising the interesting possibility that the IDH1R132H restricts tumor cell plasticity.
To determine the transcriptional identity of the mesenchymal state in vivo, we isolated several hundred highly mesenchymal and nonmesenchymal reporter–expressing cells (MGT#1, and PNGT#2 or CLGT#1, respectively) from brain xenografts. We purified and profiled them using FACS and RNA-seq. IDH-WT-hGICs with high MGT#1 expression in vivo were significantly enriched for the TCGA–mesenchymal gene set compared with nonmesenchymal tumor cell fractions (i.e., PNGT#2 and CLGT#1); the latter were instead enriched for TCGA proneural and oligodendrocyte-like gene sets (Neftel-OPC; Fig. 2H). Of note, features of highest fitness in vivo such as viability and high transcriptome quality (Supplementary Fig. S2C and S2D) were more often associated with MGT#1 than with nonmesenchymal reporter cells. Compared with homogeneously proneural cells in vitro (i.e., PNGT#2hi), MGT#1 expression in vivo significantly enriched for a human mesenchymal GBM gene set (e.g., Neftel-MES1). Moreover, the acquisition of an astrocyte-like (AC-like) state appeared to be a dominant feature of MGT#1hi cells in vivo (Fig. 2H). This is consistent with the signature of bulk TCGA–mesenchymal GBM to be a mix of mesenchymal and astrocytic phenotypes at the single-cell level (Fig. 1C).
To identify the hallmarks of proneural-to-mesenchymal transition in vivo, we performed differential expression analysis between the two most abundant homogeneous populations in vitro and in vivo (proneural PNGT#2hi in vitro vs. mesenchymal MGT#1hi in vivo). This revealed that in vivo, the MGT#1hi GBM state relies on the activation of early-immediate response transcription factors, including EGR1, 2, and 3, JUN, JUNB, FOSL2, FOSB, and FOS (Fig. 2I). A similar outcome was obtained by comparing MGT#1hi in vivo with the MGT#1hi in vitro (Supplementary Fig. S2E). This indicated that those genes represent a signature of tumorigenesis in vivo.
Overall, 954 genes were specifically upregulated in the MGT#1 in vivo fraction, and 234 marked PNGT#2 cells in vitro (Padj < 0.05; log2FC ± 1.5; Fig. 2I). mVenus featured one of the most relevant upregulated genes, whereas mCherry was inversely regulated, although not significantly (log2FC = 5.34; Padj = 0 and log2FC = −0.48; Padj = 0.42, respectively; Fig. 2I). Of note, the MGT#1hi state in vivo showed an upregulation of GFAP and CHI3L1, which mark terminally differentiated astrocytic and mesenchymal cells. But their expression in the absence of other markers of terminally differentiated astrocytes indicates that most cells in the MGT#1 state are undifferentiated (Supplementary Fig. S2E).
Ingenuity pathway analysis (IPA) established that the genes marking MGT#1 cells in vivo locate downstream of several proinflammatory regulators, including TNFα (z-score = 2.655, P = 4.49–24), NFκB (z-score = 1.915, P = 3.05–13), and interferons (Fig. 2J). Among other well-known in vivo–specific pathways, we found known targets of the synthetic retinoic acid tretinoin (z-score = 2.99, P = 3.31–22), the TGFβ and VEGF pathways (z-score = 2.51, P = 3.28–26 and z-score = 1.531, P = 3.2–15, respectively; data not shown).
In summary, synthetic genetic tracing in a physiologically relevant tumor microenvironment revealed that hGICs with a low mesenchymal identity in vitro generate xenografts with phenotypically distinct populations of cells that include mesenchymal GBM cells.
The Mesenchymal GBM Identity Is Adaptive and Reversible
Our hGICs express proneural reporters in a genotype-independent manner, and high levels of mesenchymal activity were seen only in vivo (Fig. 2A–G). We reasoned that the proneural and mesenchymal identities are dependent on intrinsic and external signaling, respectively.
To investigate whether the mesenchymal state relies on upstream signaling cues that are absent in standard in vitro culture conditions, we modulated external signaling in a phenotypic screen. In neurobasal-like serum-free medium (see Methods), which recapitulates GBM heterogeneity in mouse xenografts (21, 24), hGICs were stimulated by the addition of selected cytokines and growth factors. We carried out a FACS screen after 48 hours (Fig. 3A) and also tested several signaling cues through individual FACS analyses or live-cell imaging (Supplementary Fig. S3A and S3B). Compared with the naïve cells, IDH-WT-hGICs-MGT#1 swiftly upregulated the reporter in response to TNFα signaling, human/bovine serum, and leukemia inhibitory factor (LIF), pointing to these factors as direct mesenchymal triggers, and to a lower extent Activin A (Fig. 3B; Supplementary Fig. S3A and S3B). These results were reproducible across two independent mesenchymal reporters (i.e., MGT#1–2; Fig. 3B). The response to the external signaling was generally dampened in IDH-mut cells.
In contrast, in a parallel screen for the same factors using proneural reporters, no factor consistently elicited a response in both reporters, with IFNγ and serum triggering PNGT#1 expression only (Fig. 3B). This supports that the proneural identity is largely uncoupled from the microenvironment but rather is either encoded in the cell of origin or embedded in the RTK signaling.
TNFα's role as a prominent promesenchymal signaling cue is consistent with the transcription factor NFκB binding at endogenous cis-regulatory elements included in the MGT#1 reporter upon TNFα stimulation (Supplementary Fig. S1A). Importantly, this is in line with our finding that NFκB activation is the key connecting pathway for mesenchymal genes in brain xenografts (Fig. 2J) and previous analyses in patient-derived cell lines (10).
Interestingly, side-by-side stimulation in vitro showed that MGT#1 blandly responded to TNFα in IDH-mut cells (Supplementary Fig. S3C), despite both IDH-WT and IDH-mut bearing comparable levels of MGT#1 expression, and were propagated under the same signaling conditions. This is reminiscent of the impaired MGT#1 activation in these cells in vivo (Supplementary Fig. S2B). Quantitative PCR confirmed that response to TNFα involved the amplification of promesenchymal genes in both cell types but to a lower extent in IDH-mut-hGICs (Supplementary Fig. S3D). The specificity of the mesenchymal reporter expression in response to TNFα was further confirmed by using a control reporter carrying the ubiquitin C gene promoter (Supplementary Fig. S3E). Notably, TNFα led to the phosphorylation of NFκB-p65, STAT3, and p38-MAPK in both cell types (Supplementary Fig. S3F). Thus, we conclude that our reporters permit the identification of differences in the transcriptional responses between IDH-WT-hGICs and IDH-mut-hGICs.
Genetic tracing of GBM subtype identities or states offers a means of testing whether the mesenchymal GBM is a stable entity (e.g., subtype) or a reversible cell state. To this end, we committed IDH-WT-hGICs to a mesenchymal fate using different external signaling cues and then performed a washout experiment. Within the timeframe of five days, the activation of two independent mesenchymal reporters was reset by the washout of the relevant signaling (Fig. 3C–F). This aligns with the observation in dual-reporter cells that MGT#1 expression was inducible but PNGT#2 levels were unchanged by the signaling cues tested (Supplementary Fig. S3A–S3C). To confirm that the transcriptional states obtained in vitro are representative of the human mesenchymal GBM in patients' biopsies, we carried out RNA-seq of FACS-sorted MGT#1hi cells after 48 hours of stimulation with TNFα, human serum, FBS, LIF, or Activin A. We compared these profiles with naïve control cells as well as with the FACS-sorted in vivo MGT#1hi cells. This revealed that in vitro stimulation with TNFα, human serum, and—to a lower extent—LIF promoted the acquisition of a cell state resembling human mesenchymal GBM (Fig. 3G). Hence, the same reporter can detect different signaling cues that lead to a mesenchymal transition. In addition, under in vitro conditions, TNFα, human serum, and LIF appeared to individually activate some of the genes that specifically mark MGT#1-expressing cells in vivo—even though, individually, these factors elicited a milder transcriptional activation of the MGT#1 reporter (Fig. 3H; Supplementary Fig. S3G). Together with our pathway analysis of in vivo MGT#1-expressing IDH-WT-hGICs (Fig. 2J), the data suggest that the mesenchymal transition that occurs in vivo is driven by specific external signals, most probably in a combinatorial way.
Overall, our genetic tracing of GBM states revealed that the proneural GBM is supported by intrinsic signaling, whereas the mesenchymal program is largely adaptive and builds on a reversible, preexisting identity. It is, therefore, non–necessarily hierarchical.
Mesenchymal GBM Genetic Tracing Reveals a Swift Cell-State Change in Response to Ionizing Radiation, but Not Hypoxia
Mesenchymal transdifferentiation in GBM is dominant at recurrence after standard of care (2), and scRNA-seq has shown a correlation between hypoxia and mesenchymal GBM (11, 12). However, a causal link had yet to be demonstrated.
IR is a major component in the standard of care for GBM (1). To experimentally test whether IR can induce mesenchymal transdifferentiation in a cell-intrinsic manner, we exposed IDH-WT-hGICs and IDH-mut-hGICs to medical X-rays. A dose-dependent MGT#1 activation in IDH-WT cells occurred in response to increasing radiation (Fig. 4A). TNFα amplified mesenchymal transdifferentiation in both genotypes (Supplementary Fig. S4A). The IR doses tested here included a single application of 10 Gy radiation that is sublethal in multiple human GSCs (25, 26). In all of the conditions, hGICs remained viable, also in combination with other treatments (e.g., TNFα or temozolomide; Supplementary Fig. S4A and data not shown). Phosphorylation of the DNA damage marker γH2AX one hour after irradiation confirmed that double-strand breaks had occurred (Fig. 4A). Forty-eight hours after 20 Gy radiation, in IDH-WT-hGICs-MGT#1, a proneural-to-mesenchymal transition was supported by RNA-seq followed by GSEA (Fig. 4B and C). Differential expression analysis revealed 135 genes significantly upregulated and 31 downregulated (log2FC ± 1.5; Padj < 0.05), featuring the activation of the ATM signaling that connects the mesenchymal commitment to the DNA damage induced by IR (Supplementary Fig. S4B and S4C).
To address whether hypoxia also causes mesenchymal transition, we exploited genetic tracing by sLCRs under ambient oxygen tension, physiologic glioma hypoxia, and severe experimental hypoxia. Low oxygen levels (3%), which are considered physiologic in gliomas (27), or hypoxia (1%) activated neither MGT#1 nor MGT#2 in IDH-WT-hGICs or IDH-mut-hGICs-MGT#1 (Fig. 4D). Of note, MGT#2 contains a HIF1A motif (P < 0.001), which makes it potentially responsive to HIF1A activation (Supplementary Table S1). However, mesenchymal reporter expression remained unchanged even in response to severe experimental hypoxia (0.5%; Supplementary Fig. S4D). Consistent with the outcome informed by the reporter, RNA-seq in IDH-WT-hGICs-MGT#1 followed by differential expression analysis indicated that genes that respond to low oxygen tension experienced significant regulation (Fig. 4E and F), but this was not accompanied by mesenchymal commitment.
Overall, genetic tracing indicates that both IR and a decrease in tissue oxygenation trigger transcriptional responses, but only radiation leads to a swift acquisition of a mesenchymal state in a tumor cell–intrinsic manner.
Synthetic Genetic Tracing and CRISPR/Cas9 Screens Connect Genetic and Pharmacologic Perturbations to Mesenchymal Commitment
To exploit synthetic genetic tracing in uncovering the genetic determinants of mesenchymal GBM, we used a genome-wide pooled CRISPR/Cas9 screen to uncover genes modulating MGT#1 expression in IDH-WT-hGICs in their naïve state, or upon induction of the mesenchymal state by external signaling or genotoxic stress (i.e., serum + TNFα or temozolomide + IR, respectively; Fig. 5A). Both external signaling and the exposure to therapy increased the number of MGT#1hi cells, as expected. Notably, applying the genome-wide CRISPR/Cas9 Brunello library alone was also able to increase the proportion of IDH-WT-hGICs-MGT#1 (Supplementary Fig. S5A), suggesting that this type of screen may identify neuroepithelial gatekeepers. Replicates clustering according to the expected sources of technical and biological variability, coupled with the depletion of guide RNAs (gRNA) associated with essential genes but not of the control gRNAs overall supported the good quality of the screen (Supplementary Fig. S5B–S5D).
To identify genes whose activity modulates homeostatic, therapy-induced, and signaling-induced mesenchymal transition, we compared the pool of gRNAs statistically depleted in MGT#1hi fractions against all control fractions. This analysis uncovered 341 significant unique genes (256 promoting and 85 opposing to MGT#1 expression, log2FC ± 1; Padj < 0.05; Fig. 5B; Supplementary Table S2). Among the top hits for which a loss of function potentially facilitates the mesenchymal transition, we found a chromatin modulator, the SAGA-complex acetyltransferase CCDC101/SGF29, which had not previously been linked to GBM or epithelial–mesenchymal transition (EMT). Recently, we had identified another chromatin regulator, the PRC2-complex scaffold EED, as negative regulator of EMT (28). Given this, we chose to delete CCDC101/SGF29 in IDH-WT-hGICs-MGT#1 cells and subject these to signaling- or therapy-driven EMT. In each case, the loss of SGF29 increased MGT#1 expression (Fig. 5C), thereby validating the results of the screening.
Consistent with our upstream regulator analysis of in vivo MGT#1-specific gene expression, several single-guide RNAs (sgRNA) required for MGT#1 activation lie downstream of proinflammatory pathways (Fig. 5D). These data connect the expression of proinflammatory genes to their function in mesenchymal commitment. Interestingly, the loss of two sgRNAs targeting RELA/p65 occurred in naïve IDH-WT-hGICs but not under the other conditions (Supplementary Fig. S5E). This suggests that the role of a single one of the NFκB transcription factors in MGT#1 regulation may be limited to homeostasis. Indeed, deleting RELA alone resulted in the short-lived downregulation of MGT#1 expression at bulk protein level, and TNFα-driven MGT#1 activation was only partially affected by RELA loss (Fig. 5E; Supplementary S5F–S5G). Consistently, a simultaneous deletion of RELA and NFκB1/p50 drastically impaired TNFα-driven MGT#1 activation (Supplementary Fig. S5G). These findings demonstrate that functional reporters can help to elucidate how a complex signaling pathway, such as the NFκB pathway, regulates the proneural–mesenchymal transition. Importantly, pharmacologic interference with the pathway by the IκB kinase (IKK) inhibitor-16 impaired MGT#1 activation in both WT and RELA KO cells, at concentrations that did not affect viability (Fig. 5E and data not shown). Thus, hits from the phenotypic screen can predict pharmacologic switches such as the notion that inhibiting IKK-mediated NFκB activation could prevent mesenchymal transition.
Having identified a clinically relevant compound that antagonizes the proneural–mesenchymal transition, we next focused on mesenchymal state amplifiers. All-trans-retinoic acid (ATRA; aka tretinoin) is a RAR/RXR agonist whose targets are expressed in MGT#1 cells in vivo (Fig. 2J), and can potentially induce its targets identified by sgRNAs in the screen (Fig. 5B and F). In response to a short pretreatment with ATRA, IDH-WT-hGICs appeared protected from the detrimental effects of radiotherapy and temozolomide, as compared with their responses to IKK-16 (Fig. 5G). To correlate this phenotype with the modulation of the GBM state, we first subjected dual-reporter-expressing cells (i.e., IDH-WT-hGICs with both MGT#1 and PNGT#2) to a short pretreatment with IKK-16, ATRA, or DMSO, followed by proneural–mesenchymal induction stimulated by TNFα. Subsequently, we measured the gene expression of mVenus (MGT#1 reporter), TNF (an EMT driver gene), and CHI3L1 (a marker of terminal mesenchymal differentiation). On the basis of those markers, after eight hours from induction, we observed that ATRA and IKK-16 had opposing effects on GBM state changes (Supplementary Fig. S5H). This showed that MGT#1 could report on cellular responses to targeted compounds.
Overall, these data demonstrate that synthetic genetic tracing can be an effective method to identify genetic modulators of cell states and prioritize pharmacologic treatments based on their ability to modulate cell identity.
Non–Cell-Autonomous Phenotypic Consequences of the Cross-Talk between Tumor and Innate Immune Cells
Having established that our genetic tracing strategy is well versed to address clinically relevant questions, we next dissected the cross-talk between tumor and innate immune cells. IDH-WT GBM infiltration by GBM-associated microglia/monocytes (GAM) correlates with NF1 deficiency and mesenchymal GBM (5), but whether there is a causal relationship between GAMs and mesenchymal transdifferentiation had yet to be demonstrated.
Rather than only being recruited by mesenchymal GBM cells, GAMs might contribute to their specification. To investigate this, we cocultured NF1-depleted IDH-WT cells with an early-passage immortalized human microglia cell line (hMG; cl.C20; ref. 29). The in vitro coculture between hGIC tumorspheres and hMG cells was set up using transwell inserts, which enable cell–cell communication but maintain a physical separation between the two populations. Under these conditions, hMG drove the upregulation of MGT#1 expression to an extent comparable to TNFα (Fig. 6A–D). Microglia also activated MGT#1 in IDH-mut-hGICs, but at lower levels (Fig. 6B).
Next, to test whether non-brain innate immune cells can also instruct mesenchymal differentiation in human GICs, we differentiated primary human CD34+ cells into immature cells broadly referred to as myeloid-derived suppressor-like (MDSC-like) and the human monocytic cell line THP-1 into M1 or M2 macrophage-like cells. MGT#1 was strongly upregulated when cocultured with macrophage-like cells, with M1-polarized cells driving the strongest phenotypic commitment, whereas MDSC-like cells triggered only a mild MGT#1 activation (Fig. 6A–C). This aligns with our observation that cytokines associated with innate immunity such as TNFα trigger a stronger direct MGT#1 activation to a larger extent than signaling molecules derived from adaptive immunity or stroma (IFNγ/IL2 and IL6, respectively; Fig. 3A and B; Supplementary Fig. S3A and S3B).
To determine how microglia induce a mesenchymal GBM state in a nonautonomous manner, we used FACS to purify and then transcriptionally profile MGT#1hi-expressing IDH-WT-hGICs upon coculture. TNFα treatment served as control, because innate immune cells are capable of secreting TNFα in mouse and human glioma (30, 31). The coculture transcriptionally remodeled both cell types (Supplementary Fig. S6A). Pathway analysis revealed that both microglia and TNFα lead to NFκB-related gene activation in hGICs, yet largely through a private set of genes (Fig. 6D and E). Our system provided no evidence to suggest that TNFα triggered the hMG-driven phenotype (data not shown). In fact, a pathway analysis on MGT#1hi-expressing IDH-WT-hGICs activated by microglia uncovered a remodeling of the metabolic transcriptome, including genes in the cholesterol biosynthesis and SREBP1/2 pathway (Fig. 6E and F). The presence of extracellular lipid droplets in GBM has been well established (32), and GBM cells rely on extracellular cholesterol uptake for growth (33), so we next treated our cells with oxidized low-density lipoprotein (oxLDL), which is known to activate the SREBP1/2 pathway through PPARγ. OxLDL drove MGT#1 activation in both IDH-WT and IDH-mut models (Supplementary Fig. S6B). Moreover, in a hypothesis-driven approach, we tested whether the nitric oxide (NO) synthesis pathway is capable of driving mesenchymal transdifferentiation. Upregulation of the activity of the NO pathway is a common feature of innate immune cell activation, and endothelial cell NO activity has previously been linked to glioma growth and invasiveness (34). Strikingly, NO donor NOC-18 also triggered MGT#1 expression to levels comparable to hMG or TNFα stimulation (Supplementary Fig. S6C). However, neither anti-LDLR treatment nor the NOS inhibitor L-NG-nitroarginine methyl ester (L-NAME) could rescue MGT#1 induction by microglia. This suggests that several concurrent mechanisms regulate the phenotypic changes driven by innate immune cells in glioma. Next, we performed gene-expression profiling of the oxLDL- and NOC-18–driven MGT#1-expressing IDH-WT-hGICs and analyzed these along with all the MGT#1hi expression profiles. This allowed us to search for common traits and to look into how different upstream signaling cues are integrated into the transcriptional response detected by our reporter. We found that each type of upstream signaling led to a specific transcriptional output, yet the microglia-driven mesenchymal transition shares features with oxLDL, NOC-18, TNFα, and—to a lesser extent—serum and LIF (Fig. 6G and H; Supplementary Fig. S6D).
Overall, our data establish a causal relationship between innate immune cell infiltration and mesenchymal transdifferentiation in GBM.
Fate Mapping by sLCRs Reveals Phenotypic and Molecular Features Conserved in Patients with GBM
To test the relevance of the predictions based on genetic tracing experiments that our models generated, we looked for the molecular signatures we had found in patients' gene-expression profiles. Patients with GBM were clustered using ssGSEA and several gene sets experimentally determined in our cells (Fig. 7A). Importantly, the microglia-driven phenotype and its related oxLDL and NOC-18 signatures clearly connected mesenchymal patients.
The patient groups identified by these signatures had statistical features, such as a trend toward poor survival, which were comparable to human mesenchymal GBM signatures that had been directly obtained from patients' biopsies (Supplementary Fig. S7). These analyses were replicated in patient-derived single GBM cells (12) and in a large set of patient-derived GSCs, including a well-characterized one (ref. 35; Fig. 7B; Supplementary Fig. S7C). The similarities between gene-expression data generated in our models and freshly derived patients' GBM cells or GSC-like cells argue that hypotheses generated with our genetic tracing strategy are possibly falsifiable in human GBM.
Therapeutic Implications of Phenotypic Changes Induced by Innate Immune Cells in hGICs
EMT has been linked to resistance to chemotherapy but also offers therapeutic opportunities (36, 37). Thus, we exploited sLCRs' ability to identify a mesenchymal state to determine whether the microglia-driven state we had discovered might have therapeutic implications.
In gene-expression signature seen in patients, both TNFα- and the microglia-driven signatures scored high in a similar cohort of patients and single GBM cells by ssGSEA (Fig. 7A and B; Supplementary Fig. S7). Yet, they had unique molecular features. Specifically, microglia appeared to impair expression of DNA damage and cell-cycle genes in GBM cells (Fig. 7C and D).
GBM cells must activate DNA-damage responses and undergo proliferation in order for the standard-of-care (1) to work. This means that the microglia-driven program identified here may have significant therapeutic implications, such as that GBM cells exposed to microglia respond differently to treatments. To validate our prediction, we FACS-sorted MGT#1hi- and MGT#1lo-expressing hGICs after microglia-driven conversion and exposed these cells to a selected battery of standard and targeted chemotherapeutics. Strikingly, in contrast to their sLCR-low and sLCR-naïve counterpart, both MGT#1hi-expressing IDH-WT-hGICs and IDH-mut-hGICs proved to be more resistant to therapies based on DNA-damage responses (olaparib, ATR inhibitor VE-821, topotecan, mitomycin C; Fig. 7E and F; Supplementary Fig. S7D and S7E). Moreover, compared with MGT#1lo-expressing hGICs, MGT#1hi cells were found more resistant to LXR623, an LXR agonist that regulates cholesterol efflux (Fig. 7E and F)—a mechanism proposed as a synthetic vulnerability for GBM cells (33). Our data suggest that the therapeutic benefit of lowering cell-intrinsic cholesterol levels in GBM cells may be unattained when innate immune cells induce GBM cells to activate cholesterol biosynthesis. Moreover, hGICs purified using either MGT#1hi or MGT#2hi, which both identify phenotypically distinct cells through different regulatory elements, exhibited similar dose–response patterns (Supplementary Fig. S7D), thereby indicating that the program identified by individual reporters in these cells is causal to the resistance. Importantly, IDH-WT/mut-hGICs had a dose response to targeted agents such as BAY11-7085 (IκB inhibitor) and WP1066 (STAT3 inhibitor) independent of stratification by MGT#1 expression (Fig. 7E and F; Supplementary Fig. S7E). This indicates that the microglia activate a selective response to chemotherapeutics in hGICs.
In summary, synthetic genetic tracing established a causal link between the innate immune cells and acquisition of two functionally relevant and therapeutically distinct GBM states.
GBM remains a lethal disease despite deep (epi)genomic characterization. Here, we approached the problem by combining epigenomic information and classic genetic tracing. We used synthetic genetic tracing of human GBM subtype expression programs to trace the fates of cells with homogeneous phenotypes. This way, we discovered causal relationships between perturbations, cell-fate commitment, and response to therapies (Fig. 7G). Our data provide direct evidence that links the tumor heterogeneity to patients' responses to treatments.
Proneural and mesenchymal GBM subtypes have consistently been identified across expression platforms, readouts, and patient cohorts. Yet, the origins of these subtypes, their location or spatiotemporal evolution, and—most importantly—their therapeutic significance have remained obscure. GBM subtypes rely on transcriptional programs (10, 38–40), providing the basis for our genetic tracing using sLCRs. The proneural and the mesenchymal reporters showed a higher relative expression level in patient-derived GBM cells than in unrelated counterparts. Further efforts and models will be required to determine whether the reporters are well suited to perform “absolute subtyping.”
The in vitro derivation and propagation of GBM stem-like cell lines results in generally high mesenchymal signaling in vitro (41). Yet, evidence presented here and in very recent work by others (42) suggests that these in vitro conditions favor proliferation, but lack critical features of microenvironmental complexity. Consistently, our data show that synthetic genetic tracing is best suited to trace changes in cell states. Using our transformed neural stem cell model and synthetic genetic tracing, the proneural GBM emerged as a default entity. In the absence of a tumor microenvironment, the proneural state appeared hardwired even in cells with a genotype often associated with mesenchymal GBM (e.g., NF1 depletion). However, the mesenchymal identity was swiftly amplified by acute inflammatory and proastroglial differentiation stimuli, experimentally supporting previous correlative evidence (3, 10). Interestingly, the presence of the IDH1R132H mutation correlated with a restriction in cells' ability to enter a mesenchymal state, in vitro and in vivo, in line with IDH-mut dampening transcriptional responses and validating its use as a genetic biomarker for disease classification (43). By capturing the activation status of signaling pathways and developmental stages (e.g., inflammation and differentiation), and any preexisting context-dependent difference (e.g., IDH-WT vs. mutant background), genetic tracing supports a mixed model in which the proneural GBM is an entity (2–4) and the mesenchymal GBM is a state (12). Yet the proneural GBM can be further subgrouped into specific “homeostates,” in which oncogenic, cell-of-origin signatures and the microenvironment make distinct contributions.
Our data on the reversibility of the mesenchymal state suggest an alternative view of proneural and mesenchymal GBM as sharp cell identities with a fixed hierarchical relationship, and support rather the potential for a fluid interconversion between states. This view is consistent with both genetic evidence from patients and mouse models (2–4, 12, 39, 40) and the apparently discordant finding of a mesenchymal-to-proneural transition predicted by in silico lineage tracing (17). Of note, the mesenchymal conversion observed in our in vivo genetic tracing is consistent with either a proneural-to-mesenchymal transition contingent on the microenvironment, with mesenchymal hGICs being more fit during tumor initiation and progression, thereby leading to their positive selection, or a combination of both. Along this line, even though the highly mesenchymal cells constituted only a fraction of the bulk tumor, they more often appeared viable and produced high-quality cDNA libraries. It is tempting to speculate that mesenchymal cells are better adjusted to survive in vivo and experimental stressors. A scenario in which mesenchymal fate activation enhances cell fitness would be consistent with its identification as a dominant entity at recurrence (2); it would also explain the presence of the mesenchymal surface receptor CD44 on a sizable fraction of primary GBM–propagating cells (44). To systematically discriminate between all the plausible models, future genetic and transcriptional lineage tracing using genetic barcodes is warranted, including secondary transplantations.
Substantial correlative evidence supports a link between inflammatory signaling, EMT, the infiltration of innate immune cells, and radioresistance (25, 26). Here, fate mapping by sLCRs established a causal link for the clinical observation that DNA-damage therapy is associated with mesenchymal transdifferentiation (2). Conversely, it offered no support for a causal hypothesis of hypoxia-driven mesenchymal commitment. Hypoxia regulates tumor homeostasis, GBM cells surrounding areas of pseudopalisading necrosis overexpress hypoxia-inducible factor-1, and scRNA-seq revealed a correlation between the hypoxia gene-expression signature and the mesenchymal GBM (11, 12, 45). Although individual genes of the mesenchymal GBM signature may be directly regulated by changes in levels of tissue oxygenation, our data support a model in which hypoxic gene expression is additive to rather than causal for the mesenchymal GBM program. This conclusion does not, however, exclude that the hypoxic microenvironment might trigger a recruitment of innate immune cells in vivo, in turn leading to a non–cell-autonomous mesenchymal commitment in vivo (Fig. 7G).
In line with a model of tumor EMT as the hijacking of a conserved developmental process, the mesenchymal reporters presented here are likely to uncover cell-fate mechanisms in a broader set of epithelioid cancers and may serve as tools to discover optimal treatment strategies, such as sequential dosing of targeted drugs to send cancer cells into a state followed by hitting such state with targeted treatments (46).
Finally, our strategy provides evidence of a causal link to support the clinical association between the mesenchymal GBM subtype and a specific immune landscape (5). Although TNFα is believed to drive therapeutic resistance (10) and GAMs to be the source for TNFα in mouse models (30) and human tumors (31), we uncover TNFα-independent routes to the GAM-driven mesenchymal GBM state. The signatures experimentally identified in this study resemble those observed in single mesenchymal GBM cells from patients, suggesting a relevance for our findings in vivo. In addition to identifying a pathophysiologically relevant non–cell-autonomous interaction, mesenchymal genetic tracing pointed toward the mechanistic activation of the NO and SREBP1/2 pathways as relevant for microglia-driven phenotypes.
The transcriptome remodeling induced in GAMs by cross-talk between glioma cells and innate immunity is consistent with regulation of inflammatory and metabolism genes (47). A unique finding of this study is to show that the phenotypic changes induced by innate immune cells in tumor cells drive mesenchymal commitment and lead to selective therapeutic resistance. The indications on LXR agonists, aimed at exploiting low cholesterol biosynthesis in GBM (33), are particularly timely given the identification of innate immune cells as the culprit in patients' lack of response to checkpoint inhibitor immunotherapy (48) and the current clinical trial combining immunotherapy and an LXR agonist (NCT02922764). The data support the combination of multicellular systems and sLCRs as preclinical tools to test therapeutic strategies that are more likely to succeed in clinical trials.
In summary, the synthetic genetic tracing developed here uncovered causal and mechanistic evidence for a pathologic role of innate immune cells in GBM through mesenchymal transition—a relationship that has been hinted at for a long time. The method is effective, scalable, and can be extended to a wide range of ex vivo or in vivo systems. The further development of the algorithm will help to broadly dissect cell-intrinsic and non–cell-autonomous mechanisms that control normal and tumor homeostasis in diverse contexts of clinical relevance.
GBM sLCR Generation
To focus on cell-intrinsic gene signatures, in a pilot approach, we filtered out genes with low expression in GSCs from our previous experiments and confirmed their potential intrinsic expression in a validated cohort of GSCs from others. A database of 1,818 motifs [position weight matrices (PWM)] representing known transcription factor binding preferences was generated from the literature (49–53). PWMs were preselected on the basis of subtype-specific TFs. Regions corresponding to DRGs were retrieved from the hg19 (Refseq table downloaded from UCSC genome browser on October 5, 2012) and decomposed in windows of 150 bp and 50 bp steps (hereafter referred to as CRE). The scanned area surrounding each signature gene was manually delimited by two distal CCCTC-binding factor (CTCF) sites, positioned ~10 kb away from the transcription start site or transcription end site. High-affinity TF-binding sites in defined genomic regions were identified using Find Individual Motif Occurrences (PMID: 21330290) with –output-pthresh 1e-4 –no-qvalue. For each window, whenever multiple matches for the same PWM were identified, the Padj < 0.01 of the best match (multiple backgrounds) was considered as a proxy for the affinity of that TF over that region. TFBS pairwise correlation heat maps in Fig. 1A used the top 500 regions in terms of the score defined as −log10 (P value). Genomic coordinates versus TFBS correlation heat maps, including the representative one in Fig. 1B, were generated with the top 100 scoring regions.
The sLCRs were synthetized initially at IDT, later at GenScript, and currently at VectorBuilder. MGT#1-mVenus was cloned in the PacI-BsrGI fragment of the Mammalian Expression, Lentiviral FUGW (gift from David Baltimore; Addgene #14883). Additional modifications, such as insertion of H2B-CFP (gift of Elaine Fuchs, Addgene #25998), swapping of mVenus to mCherry or MGT#1 with all other sLCRs used either restriction enzyme digestion or Gibson cloning.
The sequence of the Igk-mVenus-TM was from ref. 54. The mCherry was modified with a nuclear localization signal or sequence. The sLCR vectors are third-generation lentiviral system and have been used together with pCMV-G (Addgene #8454), pRSV-REV (Addgene #12253), and pMDLG/pRRE (Addgene #12251).
Oligonucleotides and Primers
All oligos used in this study are available upon request.
All lines used in this study were thawed from frozen batches and propagated for a limited number of passages (5–10×), and all lines regularly tested with the Mycoplasma Detection Kit (Jena Bioscience; 11828383, PP-401L) to exclude contamination. For all glioma-initiating cells and GSCs, cell line authentication occurred through global expression profiling.
Human Glioma Cell Lines
The IDH-WT-hGICs and IDH-mut-hGICs were generated by our lab and will be described elsewhere. Briefly, IDH-mut-hGICs were generated by transforming human neural progenitor cells (NPC; kindly provided by R. Glass, LMU Munich, Munich, Germany), by means of: pLenti6.2/V5-IDH1-R132H (kindly provided by Hai Yan, Duke University, Durham, NC), p53R173H and p53R273H (point mutations introduced into TP53 ccsbBroad304_07088 from the CCSB–Broad Lentiviral Expression Library), and pRS-Puro-sh-PTEN (#1; kind gift from D. Peeper). IDH-WT-hGICs were generated by transforming human NPCs with the constructs pRSPURO-sh-PTEN (#1), pLKO.1-sh-TP53 (TRCN0000003754), and pRS-shNF1. For these lines, thorough genetic, transcriptional, and epigenetic characterization has been performed, as well as in vivo tumor formation and phenotypic mimicking ability.
Patient-derived GSC lines GBM2 (ref. 55; TCP#2), GBM14, and NCH421K (56) were kindly provided by Rainer Glass, LMU Munich (Munich, Germany). Lines GBM166 and GBM179 were kindly provided by Peter Dirks (University of Toronto, Toronto, Canada; ref. 21), and lines BLN-5 and BLN-7 (57) were kindly provided by Phillip Euskirchen, Charité Berlin, Germany.
In vitro, all glioma lines were propagated as described (58) with one modification. In addition to EGF (20 ng/mL; R&D Systems, 236-EG), bFGF (20 ng/mL; R&D Systems, 233-FB), heparin (1 μg/mL; Sigma, H3149), and 1% penicillin and streptomycin, PDGF-AA (20 ng/mL; R&D Systems, 221-AA) is also supplemented to RHB-A (Takara, Y40001). This medium composition will be referred to as RHB-A complete. hGICs were cultured at 37°C in a 5% CO2–95% air incubator, 3% O2, and humidified incubator.
Cancer Cell Lines
The MCF7 and MDA-231 cell lines (kindly provided by the Rene Bernards lab, NKI Amsterdam, Amsterdam, the Netherlands) were cultured in RPMI medium (Life Technologies, 21875091). Both cell lines were supplemented with 10% FBS, and 1% penicillin and streptomycin at 37°C in a 5% CO2–95% air incubator. A549 and H1944 cell lines (kindly provided by the Rene Bernards lab) were cultured in RPMI medium. Both cell lines were supplemented with 10% FBS, and 1% penicillin and streptomycin at 37°C in a 5% CO2–95% air incubator.
Human Microglia Cell Line
Immortalized primary human microglia C20 cells (kindly provided by David Alvarez-Carbonell, Case Western Reserve University, Cleveland, Ohio; ref. 29) were cultured in RHB-A medium supplemented with 1% FBS, 2.5 mmol/L glutamine (Thermo Fisher; 35050038), 1 μmol/L dexamethasone (Sigma; D1756), and 1% penicillin and streptomycin at 37°C in a 5% CO2–95% air incubator.
Human Hematopoietic Progenitor CD34 Differentiation
Donor-derived CD34+ cells (kind gift from K. Rajewsky, MDC Berlin, Berlin, Germany) were propagated in SFEM II (STEMCELL Technologies, 09605), SCF, FLT3-L, TPO, IL6 (all 100 ng/mL; easyexperiments.com), UM171 (Selleck, 35 nmol/L), SR1 (Selleck, 0.75 μmol/L), and 19-deoxy-9-methylene-16,16-dimethyl PGE2 (Cayman Chemicals, 10 μmol/L). CD34+ differentiation toward immature MDSC-like cells was induced by switching culture medium to RHB-A-medium supplemented with human SCF (50 ng/mL) and human GM-CSF (100 ng/mL) for 7 to 12 days (cl.#1 and cl.#2), prior to coculture.
Human Monocyte Cell Line Differentiation
Human monocytic THP-1 cells (ATCC TIB-202) were acquired from S. Minucci (IEO Milan, Milan, Italy) and maintained in RPMI 1640 (Thermo Fisher Scientific) culture medium supplemented with 10% FBS (Gibco, 10270106), 1 mmol/L pyruvate (Life Technologies), 2 mmol/L GlutaMAX (Thermo Fisher Scientific, 35050-038). THP-1 monocytes are differentiated into macrophages by 48 hours of incubation with 150 nmol/L phorbol 12-myristate 13-acetate (PMA; Cayman Chemicals; Cay10008014) followed by 24-hour incubation in RPMI medium. Macrophages were polarized into M1 macrophages by incubation with 20 ng/mL of IFNγ (R&D Systems, 285-IF) and 10 pg/mL of lipopolysaccharide (Sigma, L2630). Macrophage M2 polarization was obtained by incubation with 20 ng/mL of IL4 (Sigma, A3134) and 20 ng/mL of IL13 (PeproTech, 200-13).
Transfection and Transduction
Transfection and transduction were previously described in detail (59). Briefly, 12 μg of DNA mix (lentivector, pCMV-G, pRSV-REV, pMDLG/pRRE) was incubated with the FuGENE (Promega, E2311)–DMEM/F12 (Life Technologies, 31331) mix for 15 minutes at room temperature, added to the antibiotic-free medium covering the 293T cells, and the first tap of viral supernatant was collected at 40 hours after transfection. Titer was assessed using the Lenti-X p24 Rapid Titer Kit (Takara, 631280) according to the manufacturer's instructions. We applied viral particles to target cells in the appropriate complete medium supplemented with 2.5 μg/mL protamine sulfate. After 12 to 14 hours of incubation with the viral supernatant, the medium was refreshed with the appropriate complete medium.
Transduced cell lines were harvested into single-cell suspensions and resuspended into cold medium and filtered into FACS tubes. Sorting was conducted using BD FACSAria III or Fusion. The appropriate laser-filter combinations were chosen depending on the fluorophores being sorted for. Typically, to remove dead cells, events were first gated on the basis of shape and granularity (FSC-A vs. SSC-A) and doublets were excluded (FSC-A vs. FSC-H). Positive gates were established on PGK-driven and constitutively expressed H2B-CFP as sorting reporter, to sort for populations with low to medium intensity of sLCR-dependent fluorophore expression.
All analyses were performed using FlowJo_v10. For analysis of in vivo data in Fig. 2E–G, freshly dissociated mouse brain tumor samples were pregated for viable cells before dimensionality reduction of all acquired parameters (FSC-H/W/A, SSC-H/W/A, positive and negative viability dies and mVenus/mCherry sLCR expression) was performed using the inbuilt auto t-distributed SNE (opt-SNE) algorithm (Perplexity 30, Iterations 1000). From resulting t-SNE maps, the glioma cell clusters separating apart from mouse cells were identified and gated by overlaying sLCR expression heat maps. Further t-SNE dimensionality reduction of gated glioma cells was performed to assess clustering of sLCR reporter distribution for individual in vivo–derived tumor cells and compared with simultaneously analyzed in vitro–cultured cells used for transplantation. Quantification of sLCR-high cells was established by defining a four-quadrant gating in mVenus versus mCherry plots on in vitro cells to mark sLCR-high populations and applying this gating to t-SNE–gated in vivo glioma cells (see Supplementary Fig. S2A).
RNA FISH and Dual FISH-IF
Cells were permeabilized in 70% ethanol (RNA FISH only) or with 0.5% triton X-100 (for dual IF-RNA FISH), washed in RNase-free PBS (Life Technologies, AM9932), fixed with 10% deionized formamide (EMD Millipore, S4117) in 20% Stellaris RNA FISH Wash Buffer A (Biosearch Technologies, Inc., SMF-WA1-60) and RNase-free PBS, for 5 minutes at room temperature. IgK-MGT#1-mVenus and H2B-CFP were probed using SMF-1084-5 CAL Fluor Red 635 and SMF-1063-5 Quasar 570 custom Stellaris FISH Probes (oligo sequence available upon request) in 10% deionized formamide 90% Stellaris RNA FISH Hybridization Buffer (Biosearch Technologies, SMF-HB1-10) at 31.5 μmol/L in 100 μL transferred to the coverglass, hybridized at 37°C in the dark. After overnight incubation, slides were washed three times with RNase-free PBS for 5 minutes. If primary/secondary staining occurred, it was performed as described in the Immunofluorescence (IF) section.
Tumor cells were propagated as described above until the screening. We seeded 15,000 cells/50 μL/well in 384-well plates (Corning) in Gibco FluoroBrite DMEM supplemented with the appropriate growth factors. Cells were dispensed as 50 μL suspension into each well using the SPARK 20M Injector system (50 μL injection volume; 100 μL/second injection speed). For nonadherent cells (e.g., hGICs), cells were further centrifuged at 1,500 rpm for 1 hour 30 minutes at 37°C. Bottom reading fluorescence was scanned using a SPARK 20M TECAN plate reader at 37°C in a 5% CO2 (additionally 3% O2 for hGICs) in a humidified cassette, with the following settings for mVenus: Monochromator, Ex 505 nm ± 20 nm, Em 535 nm ± 7.5 nm. In independent replica, cell viability was measured with 0.02% AlamarBlue solution in FluoroBrite medium with the following settings: Fluorescence Top reading, Monochromator, Ex 565 nm ± 10 nm, Em 592 nm ± 10 nm.
DMSO-soluble compounds such as GSK126 were robotically aliquoted using a D300e compound printer (TECAN), whereas cytokines were robotically aliquoted to each well using an Andrew pipetting robot (AndrewAlliance). Data were imported in PRISM7 (GraphPad). Fluorescence intensity from control dead cells was subtracted as background from all values. Individual values were normalized to the mean of controls and represented as fold change.
Irradiation of hGICs
Irradiation was delivered using the XenX irradiator platform (XStrahl Life Sciences), equipped with a 225 kV X-ray tube for targeted irradiation. hGICs cultured in either 6-well plates or 96-well plates were placed in the focal plane of the beamline and exposed to irradiation for a specific time, depending on the target dosage, as calculated with an internal calculation software.
Induction of Hypoxia
To investigate effects of hypoxia on IDH-WT-hGICs, cells were moved from the standard 3% O2 culture to ambient O2 levels for 24 hours. Cells were then seeded at 250,000 cells/well into 6-well plates and moved to 1% O2, 3% O2, and ambient O2, respectively. In case of severe hypoxia (Supplementary Fig. S4D), plates were cultured in pressurized incubators (Avatar, Xcellbio) at 0.5% O2 and 5 Psi (∼344 mbar) over the usual atmospheric pressure in Berlin of 14.7Psi (1,010–1,030 mbar). After three days of culture, cells were harvested into single-cell suspensions using Accutase for assessing sLCR expression using FACS, and RNA was extracted as described for RNA-seq.
Gene Knockout Using CRISPR/Cas9
The SGF29/CCDC101 deletion was performed using the Gene Knockout Kit v2 (Synthego). The sgRNAs were dissolved in nuclease-free 1xTE buffer to a stock concentration of 30 μmol/L. RNP complexes were formed by mixing the Cas9 nuclease-gRNAs in a ratio of 6:1. Each RNP complex was nucleofected into 250,000 IDH-WT-hGICs-MGT#1 using the CA-138 pulse program of the 4D-Nucleofector Core Unit (Lonza). Approximately seven days after electroporation, the gDNA was extracted with AMPure XP beads (Beckman Coulter, A63881), eluted in 50 μL of elution buffer with downstream PCR amplification of the target site of interest using 800 to 1,200 bp products centered around the gRNA target loci (primers available upon request). The efficiency of the knockouts was assessed using TIDE (NKI, https://tide.nki.nl/) or T7EI assays. The alteration of the MGT#1 fluorescence in bulk SGF29/CCDC101-KO cells was directly assayed by FACS using BD LSRFortessa and FlowJo.
Genome-Wide CRISPR Knockout In Vitro Screen
For the genome-wide pooled CRISPR knockout screen, we utilized the Brunello library consisting of 77,441 sgRNAs targeting 19,114 genes (average of 4 sgRNAs per gene) and 1,000 nontargeting controls. To achieve a library representation over 100×, we transduced a total of 16 × 106 IDH-WT-hGICs-MGT#1 cells at a multiplicity of infection of ∼0.5 and amplified the cells for 10 days prior to introducing the treatment. At day 10, the cells were either treated with TNFα (10 ng/mL) and FBS (0.5%); temozolomide (50 μmol/L) and irradiation (20 Gy), or left untreated. Before the gDNA extraction, we performed a FACS sorting of each condition, collecting the IDH-WT-hGICs-MGT#1hi, IDH-WT-hGICs-MGT#1lo, and the unsorted populations. Although the time window of the experiment is compatible with gene essentiality, we have specifically ruled out that our hits are essential in validation experiments. The genomic DNA was extracted by lysing the cell pellets for 10 minutes at 56°C in AL buffer (Qiagen, 19075), supplemented with Proteinase K (Invitrogen, AM2548) and RNAse A (Thermo Scientific, 10753721), subsequently purified with AMPure XP beads (Beckman Coulter, A63881), and eluted in EB buffer (Qiagen, 19086). Next-generation sequencing (NGS) libraries were constructed in a two-step PCR setup, where the PCR1 is used to amplify the sgRNA scaffold and insert a stagger sequence to increase library complexity across the flow cell, while the PCR2 introduced Illumina compatible adaptors with unique P7 barcodes, allowing sample multiplexity. For the PCR1, 5 μg of each gDNA sample was divided over five parallel reactions, which were subsequently pooled together and purified using AMPure XP beads. The optimal cycle numbers for PCR2 were determined for 1 μL of each PCR1 individually by conducting a qPCR amplification using KAPA HiFi HotStart Ready Mix (Roche, 7958927001) and 1× EvaGreen (Biotium, 31000). Ten microliters of the purified PCR1 of each sample was used as input for the final PCR2. Both PCR1 and PCR2 were performed using KAPA HiFi HotStart Ready Mix. Primers are available upon request. Quality control of the final libraries was performed using the Qubit dsDNA HS kit (Invitrogen, Q32854) for quantification and TapeStation High Sensitivity D1000 ScreenTapes (Agilent, 5067-5584) for determination of PCR fragment size. The barcoded libraries were pooled together in equal molarities and sequenced on an Illumina NextSeq500 using the 75 cycles V2 chemistry (1 × 75 nt single-read mode). Reads were aligned to the Brunello library using bwa and a custom script to generate the gRNA read counts. The resulting sgRNA read counts were prefiltered (>10 counts) and aggregated to the gene level with caRpools R package, and DESeq2 was applied for the identification of potential MGT#1 expression regulating hits. A hit was considered to be significant at Padj < 0.05 and log2FC ± 1. The identified significant differentially regulated target genes were used as input for the IPA (Qiagen Bioinformatics), predicting the upstream regulators and toxicity of MGT#1 activation (Fig. 5D and F). For the density plot in Supplementary Fig. S5D, the count data were log2-normalized and averaged between the replicates. Log2-normalized counts of the Brunello library were subtracted from the unsorted IDH-WT-hGIC conditions to calculate the log2FC. sgRNAs with <10 read counts in the plasmid library were removed prior to log2FC calculation.
cDNA was generated using the SuperScript VILO MasterMix (Invitrogen, 11755050) starting with 0.5–2.5 μg RNA as input in 20 μL reaction, incubated at 25°C for 10 minutes, at 42°C for 60 minutes, and at 85°C for 5 minutes. qRT-PCR was performed with 10 ng cDNA/well, in a 384w ViiA 7 System using 1× Power SYBR Green PCR Master Mix (Applied Biosystems, 4368702), in 10 μL/well. Primers are available upon request.
Automated Longitudinal Live-Cell Imaging
IncuCyte automated longitudinal imaging was performed in 96-well black-walled plates (Greiner). 3 × 105 cells per plate were seeded to reach optimal confluence at the end of the experiment. GSK126 was aliquoted using a D300e, whereas TGFβ1+2 were manually aliquoted to each well. Both were refreshed every second day. The last time point was independently verified using a plate reader (BMC Clariostar).
Cell pellets were lysed in RIPA buffer (20 mmol/L Tris-HCl pH7.5, 150 mmol/L NaCl, 1 mmol/L EDTA, 1 mmol/L EGTA, 1% NP-40) supplemented with a 1× Protease inhibitor cocktail (Roche), 10 mmol/L NaPPi, 10 mmol/L NaF, and 1 mmol/L sodium orthovanadate. The lysates were sonicated if necessary, and electrophoresis was performed using NuPAGE Bis-Tris precast gels (Life Technologies) in NuPAGE MOPS SDS running buffer (50 mmol/L MOPS, 50 mmol/L Tris Base, 0.1% SDS, 1 mmol/L EDTA). Protein was transferred onto nitrocellulose membranes in transfer buffer (25 mmol/L Tris-HCl pH 7.5, 192 mmol/L glycine, 20% methanol) at 120 mA for 1 hour. Protein transfer was assessed through staining with Ponceau Red for 5 minutes, following two washes with TBS-T. Blocking of membranes was done for 1 hour at room temperature with 5% BSA in PBS. Dilutions of primary antibodies were prepared in PBS + 5% BSA, and membranes were incubated overnight at 4°C. Following three washes for 5 minutes with TBS-T, dilutions of appropriate HRP-coupled secondary antibodies were prepared in PBS + 5% BSA, and membranes were incubated for 45 minutes at room temperature. After washing three times for 5 minutes with TBS-T, enhanced chemiluminescence (ECL) detection reagent (Sigma, RPN2209) was applied and membranes were exposed to ECL Hyperfilms (Sigma, GE28-9068-37) to detect chemoluminescent signals.
Copy Number Normalized sLCR Expression
hGICs, patient-derived GSCs, lung adenocarcinoma, breast adenocarcinoma, and leukemia cell lines were transduced with the sLCRs MGT#1 and PNGT#2 as described. To perform lentiviral copy-number normalization, gDNA was extracted using AMPure XP beads according to the manufacturer's protocol. Relative amounts of sLCR integration sites into the genome of target cells were assessed by qPCR, using mVenus (MGT#1)- and mCherry (PNGT#2)-specific primers and N2 primers targeting a genomic region in chromosome 13 for input normalization between samples. One nanogram of gDNA was amplified using respective primers and Power SYBR Green PCR Master Mix in a total reaction volume of 10 μL in quadruplicates. Relative DNA amounts of MGT#1 or PNGT#2 were normalized over N2 levels to calculate copy-number abundance for each sLCR in each sample.
Expression levels of sLCRs in corresponding samples were assessed in quadruplicates by qPCR using One-Step TB Green PrimeScript RT-PCR Kit II (Takara, RR086A) with an input of 2 ng total RNA using mVenus (MGT#1) and mCherry (PNGT#2)-specific primers and GAPDH primers for normalization. Relative sLCR expression of MGT#1 or PNGT#2 was calculated by normalizing over GAPDH expression for each sLCR in each sample. Both qPCR normalizations were carried out as two independent technical replicate experiments, and data from both runs were combined for final normalization. All primers are available upon request.
Final normalization as presented in Fig. 1D was done by first correcting GAPDH-normalized sLCR expression divided over N2-normalized copy-number abundance and then setting IDH-WT-hGICs as reference, by calculating the fold-change increase of copy-number normalized sLCR expression.
Intracranial Orthotopic Glioma Xenograft
All mouse studies were conducted in accordance with a protocol approved by the Institutional Animal Care and Use Committee and in agreement with regulations by the European Union. Orthotopic glioma xenograft studies were conducted in NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ (NSG) mice purchased from The Jackson Laboratory and maintained in specific pathogen-free conditions. We used male and female mice between 7 and 12 weeks of age. Orthotopic glioma xenograft studies were conducted as described previously with modifications (58, 59).
NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ (NSG) male and female mice between 7 and 12 weeks of age were used for injections. Briefly, the head of the mouse was immobilized within the stereotactic frame and the skull was exposed through a small skin incision. A small burr hole was made using a drill with the following stereotactic coordinates: 1.0 mm anterior and 2.0 mm lateral of the bregma. During the entire procedure, mice were kept under 1.5% to 2% isoflurane mixed with ambient air and oxygen anesthesia on a warming pad. hGIC tumorspheres were treated with accutase and resuspended as single cells at a concentration of 50,000 cells/μL. Generally, 4 to 5 μL of hGICs was stereotactically injected into the corpus callosum at a 3 mm depth to the cortical surface. Injection flow was set at 0.4 μL/minute, and the needle was withdrawn after one-minute rest, to avoid intracranial pressure and reflux, respectively. The burr hole was sealed with bone wax and the scalp with surgical clips. Mice were returned to their cages, monitored until full recovery, and checked daily for signs of neurologic symptoms. Mice were also monitored by imaging using in vivo imaging system (IVIS) as needed until neurologic symptoms occurred and mice were to be sacrificed. Brains were collected directly after euthanasia, examined under fluorescence microscope to identify the tumor border, and xenografted tumors were processed the same day for FACS analysis and sorting or cells were frozen in medium plus 10% DMSO until needed. The remaining brain tissues were then subjected to fixation as described below. For the experiment in Fig. 2E–L, performed at EPO GmbH, 8-week-old female NOG mice were used under similar experimental protocols, and welfare standards were approved by the Berlin authorities (LaGeSo).
Tissue Dissociation and Brain Tumor Cell Sorting
Brain tumor dissection was previously described in detail (58, 59). Briefly, the tissue was dissected with a scalpel and digested in Accutase/DNaseI (947 μL Accutase, 50 μL DNase I Buffer, 3 μL DNase I) at 37°C using C-tubes in an OctoMACS dissociator (Miltenyi Biotec) using program 37C_BTDK_1. Suspensions were filtered first through a 100-μm cell strainer, following a 70-μm cell strainer before RBC lysis (NH4Cl, 155 mmol/L; KHCO3, 10 mmol/L; EDTA, pH 7.4, 0.1 mmol/L). After washing in cold PBS, viability and cell count were assessed automatically with 0.4% Trypan Blue staining using a TECAN SPARK20M.
When surface markers were assessed, typically, 200,000 cells/antibody were used in 15 mL Falcons. Staining volume was 50 μL in RHB-A medium with primary antibody (e.g., CD133-APC; Miltenyi Biotec), on ice, in the dark, for 30 minutes. Unbound antibody was removed with two washes of PBS. Depending on whether cells were analyzed or sorted, data acquisition was performed on the BD LSRFortessa or cells were sorted using the BD Aria II/III or an Astrios Moflo. The appropriate laser-filter combinations were chosen depending on the fluorophores being analyzed. Typically, to remove dead cells, events were first gated on the basis of shape and granularity (FSC-SSC), and we used as viability dyes either Calcein UltraBlue or DRAQ5 and ZombieRed (depending on the fluorophores being analyzed). Analysis was performed with FlowJo_V10.
Preparation of Cryosections
Tumorspheres were allowed to settle by gravity, fixed in fresh prepared formaldehyde in PBS (1.0%), which was blocked with 140 mmol/L glycine, rinsed with 30% sucrose, followed by addition of freezing medium (O.C.T./cryomold). Frozen blocks were obtained by dry ice freezing and stored at -80°C until used. The blocks were cut with Leica CM 1950.
Tissues or tumorspheres were fixed in 4% paraformaldehyde (PFA) for 20 minutes. Following fixation, dehydration was performed with increasing EtOH from 70% to 100%, Xylene, and overnight paraffin incubation. Paraffin-embedded samples were cut using a HM 355S microtome (Thermo Scientific). Hematoxylin/eosin (H&E) staining was performed with standard protocols, and slide images were acquired with an automated microscope (Keyence).
At room temperature, cells were grown on coverslips or spheroids spun down on glass followed by 4% PFA (Sigma-Aldrich, 16005) in PBS for 10-minute fixation, washed in PBS 5 minutes (3×), permeabilized with 0.5% triton X-100 in PBS for 5 minutes, blocked 15 minutes with 4% BSA (Roth, 3854.4), stained with primary and secondary antibodies, and 20 μg/mL Hoechst 33258 (Cayman Chemicals, 16756–50), and finally mounted onto glass slides using nail polish and Vectashield (Linaris, H1000). On paraffin-embedded tissues, we performed deparaffinization and citrate buffer antigen retrieval with standard protocols. Permeabilization was performed with Triton X-100 0.25% in PBS and—when appropriate—endogenous peroxidases were blocked with 3% H2O2 in water. Typically, we performed blocking with 5% normal goat serum. Primary antibodies were anti-GFP (Abcam, ab6556, 1:1,000), anti-MED1 (Abcam, ab64965 1:500), and anti-Tubulin (BD T5168, 1:2,000), and secondary antibodies (1:200) were A31573, A11055, A31571 Alexa Fluor 647, A21206 Alexa Fluor 488, and A31570 Alexa Fluor 555.
Microscopes used in this study were Zeiss LSM800, Leica SP5-7-8, and Nikon Spinning Disk. Confocal images in Supplementary Fig. S1B were acquired with a Leica SP5, where mVenus fluorescence was acquired using Ex = 488 nm, Em = 535 nm. Images in Supplementary Fig. S1C were acquired with a Zeiss LSM800, using Ex = 558 nm, Em = 575 nm for mVenus-QUASAR570 and Ex = 653 nm, Em = 668 nm for BRD4- or MED1-AF647, respectively. For Supplementary Fig. S1D, the H2B-CFP-QUASAR670 used Ex = 631 nm, Em = 670 nm. Images were processed using ImageJ or Photoshop. Confocal live-cell imaging in Supplementary Fig. S3A was done using a Zeiss LSM700 with appropriate excitation laser/emission filter combinations to detect endogenous mVenus, mCherry, and CFP expression.
Cocultures of hGICs and immortalized primary human microglia C20 or human monocytes were set up using hydrophilic PTFE 6-well cell culture inserts with a pore size of 0.4 μm (Merck). Human microglia, CD34+ monocytes, or THP-1–derived M1/M2 macrophages were seeded at 1.5 × 105 cells/well for 24 hours on 6-well plates in respective medium. The medium was aspirated, and cells were washed once with PBS before 1 mL of RHB-A complete medium was added. Transwell inserts were placed into plates, and 5 × 105 single hGICs in a total volume of 1 mL of RHB-A complete medium were plated on insert surface. hGICs and C20 human microglia were harvested after 48 hours of coculture for further analysis.
Drug Dose–Response Screening
These experiments were performed as previously described (46). Transduced hGICs from transwell coculture experiments were harvested into single-cell suspension and sorted into mVenus-high and mVenus-low populations using a BD FACSAria III. Cells were counted and 7,000 cells/50 μL/well were seeded onto 384-well black-walled plates in RHB-A complete medium using the SPARK20M Injector system (50 μL injection volume; 100 μL/second injection speed). Drugs were typically dissolved as a 10 mmol/L stock in DMSO and dispensed using the D300e compound printer (TECAN) for targeted dose response with plate randomization and DMSO normalization. After 72 hours of incubation, cell viability was measured after 2 to 6 hours of incubation with 10 μL of Cell-Titer-Blue (Promega, 2813) assay reagent with the following settings: Fluorescence top reading, Monochromator, Ex 565 nm ± 10 nm, Em 592 nm ± 10 nm. Data were imported in PRISM7 (GraphPad). Fluorescence intensities from empty wells were subtracted as background from all values. Concentrations were log10-transformed into log[M] scale, and individual values were normalized to the mean of untreated positive and SDS-treated negative control conditions. Nonlinear regression modeling [log (inhibitor) vs. normalized response − variable slope) was used to derive dose–response curve and IC50 values.
Cell Viability Assay
For Fig. 5G, IDH-WT-hGICs-MGT#1 were plated in 5 replicates and pretreated for 16 hours with 5 μmol/L ATRA or 0.2 μmol/L IKK-16, with subsequent exposure to ionizing irradiation (20 Gy) and temozolomide (58 μmol/L). 72 hours after treatment, the viability was measured by the addition of Calcein UltraBlue at 1.5 μmol/L and 4 images were taken per well with the Operetta CLS (PerkinElmer) high-content imaging system. Integrated Harmony software was used to analyze the images, obtaining the mean intensity of the Calcein UltraBlue from the spheres per well. Data were analyzed with Prism8.
Standardly, the RNA was extracted using the TRIzol Reagent (Invitrogen, 15596026), with subsequent isopropanol precipitation and AMPure XP beads purification. The concentration of the RNA was assessed by the Qubit RNA HS Assay Kit (Invitrogen) and/or qPCR quantification with the One-Step TB Green PrimeScript RT-PCR Kit II (Takara Bio). The integrity of the RNA was determined by means of the High-Sensitivity RNA ScreenTape System (Agilent, 5067-5581).
To generate the GSC expression profiles for GBM sLCR construction, the TruSeq Stranded Total RNA Library Prep (Illumina, 20020596) and the Ribo-Zero Gold rRNA Removal Kits (Illumina, MRZG12324) were utilized following the manufacturer's protocol. The final libraries were profiled for quantity using the Qubit dsDNA HS kit (Invitrogen) and/or the KAPA Library Quantification Kit (Roche, 7960204001). The appropriate library size distribution was assessed by the TapeStation High-Sensitivity D1000 ScreenTapes kit (Agilent). Pooled libraries were sequenced on the Illumina HiSeq2500 or HiSeq4000 platforms in either single-read 51 bp or paired-end 100 bp mode. Illumina adaptors were trimmed using raw reads with Cutadapt (https://cutadapt.readthedocs.io/en/stable/), and raw reads were aligned to the human genome (hg19 or hg38) with TopHat (Tophat2 v2.1.0; parameters: –library-type fr-firststrand -g 1 -p 8 -G ENSEMBL_Annotation_v82.gtf). HTSeq-count v0.6.1p1 was used to assess the number of uniquely assigned reads for each gene (parameters: -m intersection-nonempty -a 10 -i gene_id -s reverse -f bam). Reads were normalized and log2-transformed to obtain log2 counts per millions.
The TruSeq Stranded Total RNA Library Prep Gold Kit (Illumina, 20020598) was utilized for the generation of the in vitro RNA-seq data (Figs. 3, 4, 6, and 7) according to the manufacturer's protocol with 0.1–1 μg of total RNA input. The SMARTer Stranded Total RNA-Seq Kit v2–Pico Input Mammalian (Takara Bio, 634413) was used for the in vivo RNA profiling (Figs. 1 and 2). The library construction was performed following the manufacturer's protocol with 0.2–10 ng of total RNA input. Quantity and quality controls were performed as described above. Pooled libraries were sequenced on the Illumina NextSeq500 or NovaSeq 6000 platforms in a 2 × 75 bp or 2 × 100 bp mode. The demultiplexing was performed using the bcl2fastq conversion software (v2.20.0). The reads were aligned to a custom genome (GRCh38 containing sLCRs and reporter sequences) using STARv2.6.0c. HTSeq was utilized to assess the number of uniquely assigned reads for each gene (–m intersection-nonempty -a 10 -i gene_id -s reverse -f bam).
RNA-seq analysis of in vivo and in vitro high/low data set was conducted using R v3.6 (Figs. 1 and 2). After the data processing step (above), the quality of each sample was individually assessed using dupRadar v1.18 R package (default parameters) and subsequently reevaluated by the correlation between the number of genes and average counts for each file (Supplementary Fig. S2C and S2D). Then, differential expression analyses between specific sLCR activation, high/low, and in vivo/in vitro were conducted using DESeq2 v1.24 on raw prefiltered counts (>100 and >50). Of note, principal component analysis was used to identify potential outliers in the in vivo samples data set (Supplementary Fig. S2D), and only MGT#1hi homogeneous samples were used in different comparisons (Fig. 2H; Supplementary Fig. 2E). Differential regulated genes were considered if log2FC > ±1, Padj < 0.05, and base mean > 5 (Fig. 2I; Supplementary Fig. S2E). The in vivo MGT#1hi gene set contains only the upregulated genes of the comparison between in vivo MGT#1hi and MGT#1lo samples. Graphical representations in this analysis were generated using ggplot2 v3.3.2.
Differential expression analyses of MGT#1 activation cues RNA-seq profiles were conducted using DESeq2 v1.24. Each condition [TNFα, Human Serum (huSer), IR, Activin A, NOC-18, oxidized LDL (OxLDL), and C20-human microglia coculture] was individually compared against control samples after filtering for low-count genes (filterByExpr function from the edgeR v3.26 R package). To account for technical differences between sequencing runs, batch correction using the sva v3.32 R package was applied when necessary. Upregulated genes (log2FC > 1, Padj < 0.05, and base mean > 5) were considered as gene sets for the different representations (Figs. 3H, 6G and H, and 7A; Supplementary Figs. S3G, S6D, S7A, and S7B). Common genes between MGT#1 activation cues were identified using the UpSetR v1.4 R package (Figs. 3H and 6H). Of note, the control gene set was obtained by the comparison between control samples and the rest of MGT#1hi samples. UMAP dimensional reduction (Fig. 6G) was generated using the umap function from the uwot R package (n_neighbors = 10, metric = “manhattan,” search_k = 100). The input to run the algorithm was the batch corrected matrix (removeBatchEffect limma v3.46 R package function) filtered by the combination of all upregulated genes in each comparison. Graphical representations of this analysis were generated using ggplot2 v3.3.2, with the exception of the upset plots (above) and the heat map in Supplementary Fig. S3G, which was generated using pheatmap v1.0.12 R package (hierarchical clustering is based on Manhattan distance and the ward.D2 clustering method).
The RNA-seq analysis in Fig. 4 was conducted for the indicated comparisons applying the DESeq2 pipeline on raw prefiltered counts (>50). The heat maps depicting the significantly differentially expressed genes (Padj < 0.05, log2FC ± 1.5) in Fig. 4B and E were generated using the pheatmap package. Color coding represents relative rlog-normalized gene-expression values across samples. The enrichGO and the dotplot function from the clusterProfiler R package v.3.16.1 were used to conduct and visualize the Gene Ontology enrichment analysis are shown in Fig. 4F.
In Fig. 6D–F, transcriptome profiles were analyzed using SeqMonk, and reads were normalized by the standard pipeline, applying DNA contamination correction. Raw counts were used to perform DESeq2 differential gene-expression analysis. The same pipeline with log-transformation was used for visualization. Significance was determined using standard SeqMonk settings, P < 0.05 after Benjamini and Hochberg correction, followed by independent intensity filtering. Quantitation was done as above. NFκB-related genes in hMG versus hGIC and TNFα versus hGIC comparisons were determined using IPA (Qiagen Bioinformatics). MES-GBM signatures were obtained by the respective publications and plots were generated using Venny. GSEA significance was determined for MES-GBM log2FC > 0.5-fold with Padj = 0, for PN log2FC < −0.4, Padj = 0, and for SREBP log2FC > 1-fold with Padj = 0.
The interaction map in Fig. 6F was generated using the function Ingenuity upstream regulator from IPA for the comparison of MGT#1hi TNFα with MGT#1hi C20MG coculture. In comparing TNFα and hMG samples, independent filter of Padj < 0.05, log2FC > 1, and log2Avg > 5 was applied to the DESeq2 results, selecting only upregulated genes to conform the TNFα and hMG signature gene set.
GSEA was used to identify the enrichment of any of the GBM subtypes using GBM subtype/state public gene sets (3, 5, 12) in the different comparisons of FACS sorted upon MGT#1 activation RNA-seq profiles (above). The enrichment was generated using the piano v2.0.2 R package function runGSA (parameters: geneSetStat=“page,” signifMethod=“geneSampling,” and nPerm=1000). Graphical representation was generated by computing the positive gene set enriched as −1*log10 (Padj (dist.dir.up) for the corresponding comparison (Figs. 1F, 2H, and 3G). GSEA for GBM signatures (12) in Fig. 4C and Supplementary Fig. S4C were conducted and visualized using the fast-preranked GSEA (fgsea) R package v.1.14.0, using all genes included in the comparison. The genes were preranked based on the log-fold change derived from the differential expression analysis for the indicated comparisons, with the number of permutations = 100,000 for assessing the enrichment significance.
Graphical representations of this analysis were generated using ggplot2 v3.3.2, with the exception of the heat map in Fig. 1F, which was generated using pheatmap v1.0.12 R package (hierarchical clustering is based on Manhattan distance and the ward.D2 clustering method).
ssGSEA using gsva (ssgsea.norm=TRUE) from GSVA v1.32.0 R package was applied to obtain a signature score-matrix of GBM bulk (TCGA) and single-cell patients' expression profiles (12) per each gene set (Figs. 1C, 7A–C; Supplementary Figs. S1E, S7A–S7C). Box-plot comparison (t test) in Supplementary Fig. S7A–S7D was carried out between each GBM transcriptional subtype. Heat map in Fig. 7A was directly generated using this matrix (pheatmap v1.0.12 R package; hierarchical clustering is based on Manhattan distance and the ward.D2 method). The division between patients with high- and low-expression GBM in the survival plots (Supplementary Fig. S7B) was established using the 0.9 percentile as a threshold given for each signature. Statistical tests log rank and Mantel–Cox were applied to contrast differences between survival distributions.
For GSC microarray analyses (Supplementary Fig. S7C), the brain tumor stem cell line data set was assembled with previously generated transcriptomic data: 44 RNA-seq (samples Illumina HiSeq 2500) from GSE119834, GSE67089, and GSE8049. At the exception of the GSE119834, for which preprocessed data were used, raw data were downloaded from the Gene Expression Omnibus repository (https://www.ncbi.nlm.nih.gov/geo/) and, subsequently, the affy R package v1.64.0 was used for robust multiarray average normalization followed by quantile normalization. For genes with several probe sets, the median of all probes had been chosen and only common genes. Heat map representation was generated using pheatmap v1.0.12 R package (hierarchical clustering is based on Euclidean distance and the complete clustering method).
Deposited Data and Code
All data and codes used in this study have been deposited to the relevant repositories (Supplementary Table S3). Additional protocols, data, and codes are available upon reasonable request.
G. Gargiulo reports a patent for EP18192715 issued to Max-Delbrück-Center for Molecular Medicine (MDC), Robert-Rössle-Str. 10, 13092 Berlin, Germany. No disclosures were reported by the other authors.
M.J. Schmitt: Validation, investigation, visualization, and methodology. C. Company: Validation, investigation, visualization, and methodology. Y. Dramaretska: Validation, investigation, visualization, and methodology. I. Barozzi: Resources, software, validation, investigation, and methodology. A. Göhrig: Validation and investigation. S. Kertalli: Validation. M. Großmann: Validation. H. Naumann: Validation. M.P. Sanchez-Bailon: Validation. D. Hulsman: Validation. R. Glass: Resources. M. Squatrito: Software, validation, investigation, visualization, methodology, writing–review and editing. M. Serresi: Formal analysis, validation, investigation, visualization, methodology, writing–review and editing. G. Gargiulo: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing-original draft, project administration, writing–review and editing.
We thank E. Guccione, A. Carugo, and A.P. Haramis for critical reading of the manuscript, L. Li for help with figures, R. Hodge for proofreading, and past and present members of the Gargiulo lab and advisory board for support. We are grateful to M.v. Lohuizen for support and critical advice, N. Zampieri, S. Dietrich, and J. Martins for support with imaging, genomics (R. Kerkhoven, NKI, S. Sauer and T. Borodina, MDC), FACS (F.v. Diepen, NKI, and H.-P. Rahn, MDC), Imaging (A. Sporbert, MDC) facilities and infrastructures, in vivo experiments (EPO GmbH), and A. Hufton and A. Sparmann for discussions. The GBM2, GBM14, NCH421K, and GBM166, GBM179 and BLN5, and BLN7 GSCs were a kind gift from R. Glass, P. Dirks, and P. Euskirchen, respectively. The Human Brunello CRISPR knockout pooled library was a gift from David Root and John Doench (Addgene #73178); hMG-cl.20, CD34+ cells, and THP-1 cells were kind gifts from D. Alvarez-Carbonell, K. Rajewsky, and S. Minucci, respectively. Data analyses include data generated by the TCGA Research Network: https://www.cancer.gov/tcga. C. Company, M.J. Schmitt, and Y. Dramaretska are graduate students with Humboldt University and S. Kertalli with BSIO-Charitè Medical University. M. Squatrito is supported by a grant from the Seve Ballesteros Foundation. The G. Gargiulo lab acknowledges funding from MDC, Helmholtz (VH-NG-1153), ERC (714922), and KWF (NKI-2013 and NKI-2014-7208).