Relapse of acute myeloid leukemia (AML) after allogeneic bone marrow transplantation has been linked to immune evasion due to reduced expression of major histocompatibility complex class II (MHCII) genes through unknown mechanisms. In this work, we developed CORENODE, a computational algorithm for genome-wide transcription network decomposition that identified a transcription factor (TF) tetrad consisting of IRF8, MYB, MEF2C, and MEIS1, regulating MHCII expression in AML cells. We show that reduced MHCII expression at relapse is transcriptionally driven by combinatorial changes in the expression of these TFs, where MYB and IRF8 play major opposing roles, acting independently of the IFNγ/CIITA pathway. Beyond the MHCII genes, MYB and IRF8 antagonistically regulate a broad genetic program responsible for cytokine signaling and T-cell stimulation that displays reduced expression at relapse. A small number of cells with altered TF abundance and silenced MHCII expression are present at the time of initial leukemia diagnosis, likely contributing to eventual relapse.
Our findings point to an adaptive transcriptional mechanism of AML evolution after allogeneic transplantation whereby combinatorial fluctuations of TF expression under immune pressure result in the selection of cells with a silenced T-cell stimulation program.
Treatment of acute myeloid leukemia (AML) with curative intent often utilizes allogeneic hematopoietic stem cell transplantation (alloSCT), which relies on the graft-versus-leukemia effect (GvL), whereby donor-derived T cells eliminate residual leukemia cells (1). Although the immunologic mechanisms of GvL are complex, relapse of AML after allogeneic transplantation has been linked to the loss of major histocompatibility complex class II (MHCII) expression (2–4). Notably, treatment of AML blasts with IFNγ restored MHCII expression, and no new DNA coding mutations were detected in comparative analyses of pre- and post-alloSCT relapsed samples (3, 4). Thus, the reduced expression of MHCII at relapse appears to have a regulatory basis. Decreased expression of the MHCII transcriptional coactivator CIITA was observed in some cases (4), but loss of MHCII expression was also seen despite unchanged or elevated expression of CIITA (3), suggesting alternative mechanisms of immune escape.
Network Decomposition by Combinatorial Regression
To identify the transcriptional regulators of MHCII in AML, we sought to reverse engineer a transcription regulatory network from public genome-scale data sets (Fig. 1A). In principle, the expression of a gene in a given context may be computationally inferred from the identities of its transcriptional regulators, their cellular abundance, and the gene-regulatory function that represents the quantitative relationship between a transcription factor (TF) and its target gene (5, 6). We began by defining a list of core regulatory (CR) TFs for AML lineage specification (7). Genes encoding CR TFs are typically marked by extended, closely spaced enhancers with markedly high histone acetylation and cofactor recruitment, termed superenhancers (SE; ref. 8). Among a chromatin immunoprecipitation sequencing (ChIP-seq) data set for the enhancer histone mark H3K27ac in a panel of 49 primary human AML samples (9), we identified 1,298 SEs present in at least 10 (∼20%) samples (Supplementary Fig. S1A–S1C). These common SEs were proximal to 2,748 genes, of which 220 encoded TFs. Because CR TFs are typically essential for lineage survival (10, 11), we focused this list further by asking which of the 220 SE-associated TFs were dependencies selective for AML versus other malignancies. We interrogated data from the Broad Cancer Dependency Map project, a collection of genome-scale CRISPR knockout screens of 18,119 genes in 769 cell lines, including 19 AML lines (12, 13). Applying a skewed-LRT test (14) to compare guide RNA drop out between AML and non-AML cell lines, we identified 40 TFs selectively essential in AML (Supplementary Fig. S1D). We then intersected the 40 essential TFs with the 220 SE-driven TFs, resulting in a list of 19 putative CR TFs (Fig. 1A).
We sought next to define the gene-regulatory functions linking CR TFs to all expressed genes. We developed CORENODE (COmbinatorial REgression for NetwOrk DEcomposition), a coexpression-based network decomposition algorithm where TF–target gene connections, or edges, are inferred from naturally occurring variation in gene expression within a reasonably large set of samples. We hypothesized that the expression of each gene could be estimated from the expression of a small subset, possibly 3 to 5 TFs, of the entire CR TF set. Given that the intersection of dependency and SE profiling data nominated only 19 TFs, it was computationally feasible to design a comprehensive combinatorial regression approach to fit expression of each target gene against all possible combinations of the CR TFs (Fig. 1A; Supplementary Fig. S1A). Using the protein-coding transcriptomes of 510 primary AML samples from the Beat AML study (15), we regressed the mRNA expression values of all genes in the expressed genome against mRNA expression values of the 19 CR TFs taken 3 at a time, using linear, quadratic, and cross terms. Use of 3-mers provided the best balance between fit quality and overfitting risk genome-wide. Thus, for every target gene, we generated 969 gene-regulatory functions corresponding to all possible 3-mer combinations of the 19 TFs. We chose the top 5% of best-fitting 3-mers and calculated an edge score (ES) for each TF–target gene pair as the number of times the TF appeared in the top 5% of best fits. Finally, we estimated the direction and amplitude of positive or negative regulation by aggregating response slopes for each TF across all samples and 3-mers, which was termed directional derivative (DD). Details of the computational algorithm are found in the Supplementary Note and online data browser (https://corenode.shinyapps.io/corenode/). Genome-wide ES and DD outputs are found in Supplementary Data S1 and S2, respectively.
A Transcription Factor Tetrad Regulates MHCII Genes in AML
CORENODE prioritized four TFs (IRF8, MYB, MEF2C, and MEIS1) as candidate regulators of the MHCII and several additional genes involved in antigen presentation that have been associated with immune escape in AML (CD74, IFI30, HLA-DMA, HLA-DMB, and CD86; ref. 3; Fig. 1B). MYB and MEIS1 were predicted to negatively regulate the expression of these genes, whereas IRF8 and MEF2C were predicted to be positive regulators (Fig. 1B–D). In parallel work, we used additional data sets and criteria to define an extended AML core regulatory circuit of 31 members (16). For additional validation, we repeated CORENODE regressions with this extended set of TFs, as well as a 37-member set that included all selectively essential AML TFs above a minimal expression cutoff and regardless of the SE status. In all cases, the same 4 TFs were predicted to be the top regulators of MHCII expression (Supplementary Fig. S2A–S2D). Combining these TFs in a single 4-mer resulted in a modest improvement of the regression fits (compared with 3-mers) without overfitting (Fig. 1C and E; Supplementary Fig. S3A; Supplementary Data S3).
We examined how variation in the expression of the TF tetrad correlated with changes in MHCII expression in two independent data sets, Beat AML and TCGA, including 510 and 151 patients, respectively (15, 17). In both data sets, the expression of MHCII and other genes associated with immune escape in patients with high (above the median) IRF8/MEF2C and low (below the median) MYB/MEIS1 was up to 17-fold higher than in patients with the opposite pattern of TF expression (Fig. 1F; Supplementary Fig. S3B). To validate the impact of these TFs on MHCII regulation, we inactivated each TF with CRISPR/Cas9 editing followed by RNA-seq in an AML cell line (Supplementary Fig. S4A and S4B; Supplementary Data S4). Consistent with predictions of CORENODE, loss of IRF8 and MEF2C led to reduced expression of MHCII and other genes associated with immune escape, whereas loss of MYB and MEIS1 was associated with increased expression (Fig. 2A). We confirmed these findings at the protein level by measuring the cell-surface expression of MHCII molecules after TF knockout in a panel of AML cell lines (Fig. 2B). Overall, MYB and IRF8 had a major impact on MHCII expression, whereas the effects of MEIS1 and MEF2C were more subtle and variable between cell lines, suggesting context-specific roles of MEIS1 and MEF2C as fine-tuners of MHCII expression. One cell line with a particularly high baseline MHCII expression (MONOMAC1) demonstrated decreased, rather than increased, MHCII expression after MYB knockout (Fig. 2B; Supplementary Fig. S4B), but we found this effect to be epistatic with CIITA and reversed after a CIITA knockout (see below). Treatment of AML cells with the MYB inhibitor mebendazole (18) resulted in a modest induction of MHCII expression (Supplementary Fig. S4C), consistent with the effect of MYB knockout.
The MHCII locus contains several SEs whose H3K27 acetylation correlates with the expression of MHCII genes in primary AML cells (Supplementary Fig. S5A). Using ChIP-seq, we confirmed binding of the TF tetrad at the MHCII SEs (Fig. 2C; Supplementary Fig. S5B). Spike-in controlled ChIP-seq analysis detected a significant increase in H3K27 acetylation at these SEs after MYB knockout, despite the expected global decrease of H3K27 acetylation elsewhere across the genome (Fig. 2C; Supplementary Fig. S5B and S5C). In contrast, IRF8 knockout was accompanied by the loss of H3K27 acetylation. Furthermore, the SEs associated with IRF8 and MEF2C in primary AML blast cells positively correlated with MHCII SEs, whereas SEs controlling MYB and MEIS1 negatively correlated (Fig. 2D; Supplementary Fig. S5D). These data provided further evidence of antagonistic functions of these TFs in MHCII regulation.
We next examined how changes in the MHCII gene expression after alloSCT correlated with expression of the TF tetrad. We accessed the published RNA-seq data from paired AML samples at initial presentation and relapse after alloSCT, where decreased MHCII expression was observed in 6 of the 7 examined patients (3). Despite a substantial variability of baseline expression, we observed significant downregulation of IRF8 and, to a lesser extent, MEF2C in all patients with reduced MHCII expression at relapse (Fig. 2E). In contrast, MYB and MEIS1 were upregulated in 5 of 6 patients at relapse, which was of borderline statistical significance. Each of these changes would be expected to lead to reduced expression of MHCII genes.
A Combinatorial Transcriptional Mechanism of Immune Escape
To this point, our computational predictions and knockout experiments demonstrated that perturbation of any one of the 4 candidate TFs was sufficient to effect a change in MHCII expression, and examination of AML relapses revealed changes in the abundance of these TFs in the direction consistent with reduced MHCII expression. However, the degree of observed TF change was often modest and, in the case of MYB and MEIS1, not evident in all post-alloSCT samples (Fig. 2E). This observation led us to hypothesize that immune escape may be enacted by combinatorial changes in the tetrad. To assess whether the observed changes in TF expression adequately accounted for the magnitude of MHCII downregulation at the time of relapse, we used the CORENODE gene-regulatory functions to predict the change in MHCII expression from the pre- and post-relapse expression values of the TF tetrad. For additional statistical robustness, we analyzed all paired samples from the two available studies (3, 4), including patients regardless of therapy (chemotherapy vs. alloSCT) or the direction of MHCII change. In both data sets, the predicted changes in MHCII expression closely correlated with the actual changes observed at the time of relapse (Fig. 2F and G). Given that the gene-regulatory functions were derived from an unrelated data set, these findings provided validation for the computational model and confirmed the combinatorial action of CR TFs in MHCII regulation.
Immune Escape Is Regulated Independently of the IFNγ/CIITA Pathway
MHCII expression in myeloid cells is regulated by IFNγ, which acts via the Jak/STAT/IRF pathway to induce the expression of CIITA, a specific transcriptional coactivator of the MHCII genes (19–21). Accordingly, expression of MHCII genes correlates with CIITA expression (Fig. 3A; Supplementary Fig. S6A). Therefore, we asked if the TF tetrad regulates MHCII expression via the IFNγ/CIITA pathway. Indeed, CORENODE identified MEF2C, MEIS1, and IRF8 as candidate regulators of CIITA (Fig. 1B). Knockouts of IRF8 and MEF2C led to decreased CIITA expression, whereas a MYB knockout was associated with increased expression (Fig. 2A). Thus, at least three of the four TFs appear to participate in MHCII expression by modulating CIITA. However, post-alloSCT silencing of the MHCII genes was not consistently associated with reduced CIITA expression (Fig. 2E), pointing to an independent mechanism of MHCII regulation. In addition, ChIP-seq confirmed binding of the entire tetrad at the MHCII locus (Fig. 2C; Supplementary Fig. S5B), consistent with a possibility of direct action. These conflicting observations prompted us to investigate further CIITA-dependent and -independent actions of the tetrad. First, we repeated CORENODE regressions of MHCII genes using 4-mers and 5-mers where CIITA was used as an additional regression term. The inclusion of CIITA in the model resulted in a marked improvement of goodness-of-fit for all genes in the MHCII locus without overfitting (Fig. 3A and B). Furthermore, t and P values of individual regression terms, representing the probability of their independent contribution to the overall fit (see Supplementary Note), indicated that the TFs were independent variables, with the strongest contributions from MYB and CIITA (Fig. 3C; Supplementary Data S5 and S6). This analysis suggested that members of the TF tetrad—and particularly MYB—regulate MHCII in a CIITA-independent manner. To confirm this inference, we asked if the negatively acting TF MYB regulates MHCII expression in the absence of CIITA coactivation. We generated a panel of CIITA knockout cell lines (Supplementary Fig. S6B). As predicted, baseline expression of MHCII proteins was reduced in CIITA-deficient cells (Fig. 3D). Nonetheless, loss of MYB led to increased MHCII expression (Fig. 3D and E), consistent with a CIITA-independent mechanism of MHCII regulation. Notably, MYB knockout in wild-type MONOMAC1, a cell line with a very high baseline expression of CIITA and MHCII, decreased MHCII expression, whereas knockout of MYB in CIITA-deficient MONOMAC1 cells increased MHCII expression (Figs. 2B and 3D; Supplementary Fig. S4B). This result suggests context-specific epistasis between MYB and CIITA and confirms the role of MYB as a CIITA-independent repressor of MHCII in AML.
Finally, we asked if the induction of MHCII expression by IFNγ is mediated by the tetrad TFs. We treated AML cells with IFNγ and measured the expression of the tetrad TFs, CIITA, and HLA-DRA by RT-qPCR. Although IFNγ dramatically induced the expression of HLA-DRA and CIITA, its effects on the tetrad TFs were modest (Fig. 3F). Furthermore, IFNγ induced expression of MHCII in AML cells with an IRF8 knockout (Fig. 3G). Collectively, our data indicate that the tetrad TFs regulate MHCII expression independently of the IFNγ/CIITA pathway.
IRF8 and MYB Antagonistically Regulate a Graft-Versus-Leukemia Effect Response Module
We sought to place the MHCII genes in the context of genome-wide transcriptional regulation by the 19 CR TFs. We began by evaluating the global accuracy of CORENODE in predicting transcriptional regulation. To this end, we generated a list of high-confidence targets of each CR TF by applying a minimal ES cutoff of 15, and cross-plotted DDs for each target gene versus the actual response observed on RNA-seq after knockout of the corresponding TF in MV411 cells. A significant correlation was observed between predicted and actual changes in gene expression following TF loss (Fig. 4A; see Supplementary Note for a detailed discussion). Having confirmed CORENODE's precision in predicting genomic regulation, we built a map of high-confidence edges in the AML transcription regulatory network, revealing distinct functional modules of CR TFs and coregulated genes (Fig. 4B). The model highlighted several TFs (MYB, FLI1, STAT5B, CEBPA, IRF8, and SPI1) as genome-wide regulators, whereas other TFs (GFI1, MEIS1, RUNX1, RUNX2, MEF2C, and MEF2D) were projected to regulate more narrow transcriptional programs. In addition, the model predicted partially antagonistic functions for some TFs on a pairwise comparison (Supplementary Fig. S7A and S7B). We focused attention on the antagonistic actions of MYB and IRF8. CORENODE highlighted a module of 58 genes predicted to be upregulated by IRF8 and downregulated by MYB (Fig. 4C; Supplementary Data S7). The predicted outcome conformed closely to the actual responses following TF knockouts (Fig. 4D). The module was highly enriched for genes participating in multiple aspects of alloreactivity (Fig. 4E; Supplementary Fig. S8A), including response to IL2 and IFNγ (MAP2K1, OAS1, IFI30, MX1, and TNFRSF1B), T-cell stimulation (CD86, BTN3A1, TNFSF8, and SH2B3), adhesion (ITGB7 and TGFBI), inflammasome formation (NLRP1), and transcription regulation (NOD2, SPI1, MALT1, CBX7, and MAFB). Importantly, the module was downregulated in 5 of 6 patients with reduced MHCII expression after alloSCT, with the exception of the sole patient (142074) who displayed decreased MYB expression at relapse (Fig. 4F). In contrast, patients with relapse after chemotherapy displayed inconsistent changes in the expression of the module (Supplementary Fig. S8B). Thus, antagonistic actions of MYB and IRF8 facilitate immune evasion by silencing a broad transcriptional program that regulates the interaction between AML blasts and donor T cells (Fig. 4G).
Immune Escape Is Uncoupled from the Myeloid Differentiation State
Members of the TF tetrad have previously been implicated in the regulation of normal and malignant myelopoiesis: MYB and MEIS1 restrain myeloid differentiation, whereas IRF8 and MEF2C promote it (22–25). Additionally, AML cells form a functional hierarchy of differentiation states arising from a leukemia stem cell (LSC; ref. 26), and activation of immune pathways is a hallmark of monocytic development (27). Thus, we hypothesized that reduced expression of IRF8 and MEF2C and increased expression of MYB and MEIS1 at relapse reflect the evolution of leukemia into a less differentiated state. To test this hypothesis, we computed a myeloid differentiation index from the mRNA expression of 19 cell type–specific markers, which correctly reproduced both the normal myeloid trajectory and the functional AML hierarchy (Supplementary Fig. S9A and S9B). However, MHCII expression correlated poorly with myeloid differentiation in the Beat AML and TCGA data sets, and changes in the MHCII expression at relapse could not be predicted from the changes in the myeloid differentiation index (Fig. 4H; Supplementary Fig. S9C and S9D). We conclude that reduced MHCII expression at relapse is regulated independently from the myeloid differentiation state.
Transcriptional Plasticity Underlies Relapse
We examined the expression of MHCII genes in AML cells at the single-cell level. We analyzed single-cell transcriptomes obtained from patient 452198 at presentation and post-alloSCT relapse (3). Plotting all cells together using t-distributed stochastic neighbor embedding (t-SNE) revealed distinct clusters of AML cells corresponding to initial presentation and relapse (Fig. 5A). Although most AML cells displayed high MHCII expression at presentation, approximately 1 in 40 cells had low or undetectable MHCII similar to those seen at relapse (Fig. 5B and C). Some of these MHCII-low cells coclustered with the relapsed cell population, indicating similar global transcription patterns (Fig. 5C). The MHCII-low cells displayed significantly higher MYB and lower IRF8 compared with the cells expressing MHCII highly at presentation (Fig. 5D). The average MYB expression in the MHCII-low cell population at presentation was still lower, and IRF8 was still higher than the average of these TFs at relapse. However, up to 1 in 6 cells in the MHCII-low population (approximately 1 in 250 of all AML cells at presentation) expressed undetectable IRF8 and markedly elevated MYB that matched or exceeded the average MYB expression seen at relapse (Fig. 5C). Although expression of MEIS1 was increased and expression of MEF2C was decreased in most cells at relapse, we found no statistically significant change in their expression in the MHCII-low cells at presentation, indicating that stable changes in these TFs were acquired later in the course of leukemia evolution (Fig. 5D). We conclude that a minor population of cells with reduced MHCII expression driven by altered MYB and IRF8 expression was present in this patient at the time of initial presentation and likely contributed to the subsequent relapse. For additional validation, we examined scRNA-seq data from six diagnostic AML samples from another report (28). MHCII-low cells were universally present in all patients at the time of initial diagnosis (Fig. 6A). IRF8 and MEF2C were significantly lower in MHCII-low cells than in MHCII–high cells (Fig. 6B). MYB and MEIS1 were higher in MHCII-low cells in 4 of 6 patients (Fig. 6B). Importantly, in the 2 samples where MHCII-low cells displayed paradoxically lower expression of MYB and MEIS1, this was compensated by a much greater fractional decrease in IRF8 and MEF2C. Overall, these observations are consistent with the results of bulk RNA-seq of post-alloSCT samples (Fig. 2E) and lend further support to a combinatorial transcriptional mechanism of immune escape. Importantly, AML cells with transcriptionally driven low MHCII expression appear to be common at the time of initial diagnosis in all examined patients, suggesting a unifying potential mechanism of immune escape and eventual relapse.
By developing a network decomposition approach that captures combinatorial TF interactions, we identified a tetrad of core TFs that regulate immune escape of AML cells after alloSCT, where IRF8 and MYB appear to act as major regulators of MHCII expression, acting antagonistically. Remarkably, the transcriptional activator MYB acts as a strong negative regulator of MHCII genes. Consistent with our observations, Myb-deficient mouse B cells exhibit elevated MHCII expression (29). Our data reinforce the status of MYB as an important therapeutic target in AML (30) and further indicate that even partial MYB inhibition may represent a viable strategy to prevent or reverse immune escape after alloSCT. Indeed, the tool compound mebendazole, which appears to inhibit MYB by targeting it for proteosomal degradation (18), increases MHCII expression, albeit modestly. However, a complete MYB knockout results in an extremely potent induction of MHCII expression, and it is likely that a potent direct MYB inhibitor would produce a similarly dramatic effect. Although IRF8 is an established effector of IFNγ and has been reported to control MHCII expression in myeloid cells (31–33), we demonstrate that both IRF8 and MYB regulate MHCII expression largely independently of the IFNγ/CIITA axis. Notably, various IRF paralogs (IRF1, IRF2, IRF4, and IRF8) have been implicated as transcriptional effectors of the IFNγ/Jak/STAT cascade, inducing MHCII expression both directly and via CIITA (19, 24, 31, 33, 34). We suspect that the IRF paralogs exhibit context-specific and partially redundant roles in MHCII regulation. Although binding of the tetrad TFs at the MHCII locus suggests that they may directly regulate the expression of MHCII genes, TF occupancy is a poor predictor of direct regulation (35, 36), and additional studies are needed to elucidate the mechanism by which the tetrad TFs regulate MHCII expression. Furthermore, the MHCII locus is a point of convergence of complex regulatory mechanisms (20), and it is likely that there are other important regulators of MHCII expression contributing to immune escape. Indeed, CORENODE is limited to a preselected set of regulators and is unable to capture posttranscriptional regulation. While this manuscript was in revision, a study by Gambacorta and colleagues reported epigenetic silencing by the polycomb repressive complex 2 (PRC2) as a mechanism of MHCII loss in post-alloSCT relapse (37). It would be interesting to examine whether altered expression of the tetrad TFs identified in our study contributes to the recruitment of PRC2 to the MHCII locus.
Beyond reducing MHCII expression, antagonistic actions of MYB and IRF8 facilitate immune escape by enacting a broader program of transcriptional changes resulting in reduced expression of multiple genes involved in antigen presentation, cytokine response, and T-cell costimulation. Notably, to the extent that these core TFs are essential for AML survival, post-SCT relapse appears to rely on relatively modest changes in their expression that cooperatively mediate immune evasion without affecting cell viability. A small population of such cells with reduced MHCII expression due to altered TF expression is present at the time of initial diagnosis and likely contributes to relapse under in vivo selection. Whether transcriptional heterogeneity is hard-wired (for example, by mutations in noncoding regulatory DNA elements) or driven by stochastic TF fluctuations remains to be elucidated. Nonetheless, we speculate that the immune pressure imposed by allogeneic transplantation selects AML cells with a pattern of CR TF expression that provides an optimal balance between immune escape and overall fitness. Such “transcriptional evolution” by the selection of cells with favorable transcriptional states mediated by altered combinatorial patterns of key TFs may drive the progression of other tumors and eventual relapse.
External Data Sets
RNA-seq BAM files for 510 patients from the Beat AML project (15) were provided by Oregon Health & Science University and processed through the CCLE RNA processing pipeline (STAR/RSEM, described at https://github.com/broadinstitute/ccle_processing). Reads were normalized to transcripts per million (TPM) and filtered for protein-coding genes. The expression values were transformed to log2(TPM + 1). The genes were ranked by average expression across the samples and 9,000 top expressed genes were chosen for further analysis (Supplementary Fig. S1).
H3K27ac ChIP-seq data from primary AML samples (9) were downloaded from Sequence Read Archive (SRA) under accession number SRP103200 and processed using the AQUAS pipeline (https://github.com/kundajelab/chipseq_pipeline) with minor modifications and according to the ENCODE3 guidelines. Reads were aligned to the hg19 genome build using BWA-ALN and peaks were called using MACS2. For SE calling, each sample was processed using ROSE2 (https://github.com/linlabbcm/rose2), excluding 2,500 bp around TSSs (−t 2,500) and the hg19 Encode blacklisted regions. SE regions were then merged and ROSE2 was run again for each sample on the merged regions, producing the SE signal matrix, which was then normalized by a median signal for each sample.
TCGA mRNA (FPKM) data for 151 patients with AML was downloaded from the NCI Genomic Data Commons site using the GDCquery routine from the R package TCGAbiolinks (39); the procedure was last verified on September 18, 2019. TCGA mRNA gene information was converted from ENSG codes to hgnc symbols using the Bioconductor program biomaRt (40, 41). mRNA FPKM expression values were converted to TPM.
RNA-seq and associated metadata for normal hematopoietic progenitors and AML samples (42) for myeloid index analysis were downloaded from GEO GSE74246, converted to FPKM using transcript lengths acquired through the Bioconductor package “biomaRt” and then converted to TPM.
Paired processed RNA-seq data from the two available studies of patients with AML at the time of initial presentation and relapse (3, 4) were downloaded from https://www.nejm.org/doi/full/10.1056/NEJMoa1808777 and https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-7456/samples/.
Genetic dependency data are available for download at the Broad DepMap portal database (https://depmap.org/portal/download/). Data release 20q1 was used for this study. For the purpose of this study, we considered a gene to represent a selective AML dependency if it had a probability of dependency of ≥0.5 in 3 or more AML cell lines and not be considered a pan-lethal dependency based on the LRT score. This approach resulted in a total of 225 selective AML dependencies.
Gene set enrichment analysis was performed using the Enrichr database (43).
AML cell lines were purchased from the American Type Culture Collection (https://www.atcc.org) within the past 10 years. The MV411 cell line was authenticated within the last year by confirming the presence of the t(4;11) translocation. Other cell lines were not independently authenticated. Cells were cultured in the RPMI-1640 media (Gibco 11875085) containing 10% FBS (Thermo Fisher A4766) and regularly tested to be free of Mycoplasma spp.
CRISPR-Cas9 Gene Knockouts
Synthetically modified sgRNA constructs were purchased from Synthego (refer to Supplementary Table S1 for sgRNA sequences). Ribonucleoprotein (RNP) assembly was performed by mixing 2 to 3 sgRNAs (a total of 120 pmol/L) with 8.5 μg recombinant Cas9 (Invitrogen A36499). The resulting RNP mix was electroporated into 0.3 × 106 cells using a Lonza 4D Nucleofector, program DJ-100, in 20 μL Nucleocuvette strips (Lonza V4XC-2032). Unless otherwise noted, cells were incubated in media for 72 hours postelectroporation before subsequent analyses. Knockout efficiency was confirmed by Western blotting (Supplementary Fig. S4A). The following antibodies were used: actin (Santa Cruz sc-47778), IRF8 (Cell Signaling 5628s), MEF2C (Cell Signaling 5030s), MYB (Abcam ab45150). A guide RNA targeting the AAVS1 “safe harbor” locus was used as a negative control (44).
Genome-Wide Occupancy Analysis
ChIP-seq was performed as previously described (45) and in accordance with the Encode guidelines (46). Briefly, 10 × 106 (for histone ChIPs) or 100 × 106 (for TF ChIPs) exponentially growing MV411 cells were fixed with 1% formaldehyde for 10 minutes at room temperature and then quenched with 125 mmol/L glycine for 5 minutes. Nuclei were isolated using the Nuclei EZ Kit (Sigma-Aldrich NUC101) and resuspended in 10 mmol/L Tris-HCl, pH 8.0, 1 mmol/L EDTA, 0.1% SDS with 1× HALT protease inhibitor (Thermo Fisher; 87786). Chromatin was fragmented by sonication on an E220 Covaris-focused sonication machine using the parameters of 140 mV, 5% duty factor, 200 cycles/burst for 14 minutes, and cleared by centrifugation. The following antibodies were used for ChIP: H3K27ac (Abcam ab4729), IRF8 (SantaCruz sc365042X), MYB (Abcam ab45150), MEIS1 (Abcam ab19867), and MEF2C (GeneTex GTX105433). ChIP-seq libraries were prepared using Swift S2 Acel reagents (Swift 21096) on a Beckman Coulter Biomek i7 liquid handling platform from approximately 1 ng of DNA according to the manufacturer's protocol and using 14 cycles of PCR amplification. Sequencing libraries were quantified by Qubit fluorometer and Agilent TapeStation 2200. Library pooling and indexing were evaluated by shallow sequencing on Illumina MiSeq. Subsequently, libraries were sequenced on NovaSeq targeting 40 million 100 bp read pairs by the Molecular Biology Core facilities at the Dana-Farber Cancer Institute. Sequencing reads were processed using the AQUAS pipeline (https://github.com/kundajelab/chipseq_pipeline) with minor modifications and according to the ENCODE3 guidelines. Reads were aligned to the hg38 genome build using BWA-ALN and peaks were called using MACS2. Quality control of the ChIP-seq data were performed using the nf-core pipeline (https://github.com/nf-core/chipseq), targeting FRiP scores of >0.3.
For quantitative ChIP-seq analysis of H3K27 acetylation, we used Drosophila chromatin/antibody spike-in control as previously described (47). Briefly, 4 μg of anti-H3K27ac antibody, 2 μg of spike-in antibody, and 20 ng of spike-in chromatin (Active Motif 61686 and 53083, respectively) were added to chromatin prepared from 2.5 × 106 MV411 cells 72 hours after RNP-mediated TF knockout. The rest of the ChIP-seq experiment was performed in the standard fashion. After ChIP-seq reads were mapped to the Drosophila genome and the hg38 human genome in parallel, human tag counts were normalized to Drosophila tag counts.
Global Transcriptome Analysis
For RNA-seq experiments, the total cellular RNA was extracted using the QuickRNA kit (Zymo Research R1054). Purified total RNA was mixed with the ERCC synthetic spike-in control (Invitrogen 4456740). RNA-seq libraries were prepared on a Beckman Coulter Biomek i7 liquid handling platform using Roche Kapa mRNA HyperPrep strand-specific sample preparation kits (Roche; 08098123702) from 200 ng of purified total RNA according to the manufacturer's protocol. Library quantification and Illumina sequencing were performed as described in the ChIP-seq section above.
The total cellular RNA was extracted using the RNeasy Plus Mini Kit (Qiagen; 74134). Target gene expression was quantified by a TaqMan ΔΔCt experiment using TaqPath 1-step master mix (Thermo Fisher A28525) and HPRT1 as an internal control. The following TaqMan assays were used: CIITA, Hs00172106_m1; MYB, Hs00920556_m1; MEIS1, Hs00180020_m1; MEF2C, Hs00231149_m1; IRF8, Hs00175238_m1; HPRT1, Hs99999909_m1; HLA-DRA, Hs00219575_m1).
Measurement of MHCII Expression by FACS
For MHCII analysis, 1 × 106 cells were first incubated for 10 minutes with 0.5 μg of Human Fc block (BD Biosciences; 564219). Cells were then split and stained for 20 minutes with either a pan-specific FITC-conjugated Mouse Anti-Human HLA-DR, DP, DQ antibody (BD Biosciences 555558, clone Tu39), or FITC-conjugated Mouse IgG2a, κ Isotype Control (BD Biosciences 555573). Data were analyzed in FlowJo (FlowJo LLC). Surface MHCII expression was calculated by subtracting the isotype control fluorescence from the pan–HLA-DR, DP, and DQ fluorescence.
Whole-cell lysates were prepared in RIPA buffer (Boston Bio-Products BP-115-500) with a protease inhibitor cocktail (Thermo Fisher; 23225). Lysates were boiled in Laemmli buffer (Bio-Rad 1610737), separated by SDS-PAGE, and transferred and blocked using standard methodology. HRP-conjugated anti-mouse and anti-rabbit IgG secondary antibodies were used for imaging (Bio-Rad; 1706515 and 1706515) with an enhanced chemiluminescence substrate (PerkinElmer NEL104001EA) according to the manufacturers’ instructions. The following primary antibodies were used: anti-MYB (Abcam ab45150), anti-MEF2C (Cell Signaling; 5030S), anti-MEIS1 (Abcam; ab19867), anti-IRF8 (Cell Signaling; 5628S).
Details of the statistical analyses pertaining to CORENODE are found in the Supplementary Note. All experiments were done in at least three replicates. Two-tailed Student t test was used to compare the mean fluorescence of antibody-stained cells following TF knockout. Changes in gene expression in patient samples between initial diagnosis and relapse were ascertained using a Wilcoxon signed-rank test. Statistical analysis of ChIP-seq and RNA-seq data were performed using the DESeq2 package (48) with genome-wide adjustment for multiple hypothesis testing.
Myeloid Differentiation Index
To identify markers of myeloid development, genome-wide mRNA expression values in “HSC” and “monocyte” samples from ref. (42) were processed to yield the mean and variance of expression by the gene. For each gene, the two variances were pooled [pooled variance = mean (HSC variance, monocyte variance)]. A separation index was then defined for each gene as the difference between the HSC and monocyte mean expressions divided by the square root of the pooled variance. Markers were chosen as the 19 genes with the highest separation indices. Lymphoid markers were identified with the same procedure, comparing HSC samples to the T-cell and B-cell samples. To compute the myeloid index, each sample's expression of the marker genes as determined above was converted to a z-score using the mean of all eight HSC and monocyte expressions and the pooled variance for that gene, and further normalized to a ± 1 scale by dividing by the maximum absolute value of all z-scores. For genes where the mean monocyte score was higher than the mean HSC score, normalized scores were multiplied by (−1). The normalized scores for the set of marker genes were summed to define the myeloid index for each sample. The same procedure was used for the lymphoid index, with appropriate cell-type substitution. Plotting normal and leukemia cell types according to the myeloid and lymphoid indices yielded the expected development vectors as shown in Supplementary Fig. S6. The same markers were then used to define a myeloid index for the samples from the other data sets (3, 4, 15, 17). The HLA index was defined as the mean expression (in log2[TPM + 1]) of the nine HLA-D genes.
Data and Materials Availability
The CORENODE code (written in R) is found at https://app.box.com/s/a3x7f4zwcgv3f4jgshphe4opsen13mwf. Genome-wide matrix of CR TF ESs and DDs can be found in Supplementary Data S1 and S2, respectively. Genome-wide CORENODE output, including n-mer fits and scores, is found at https://corenode.shinyapps.io/corenode/. Statistics on 5-mer CORENODE fits for the MHC type II genes is displayed in Supplementary Data S3. Gene-dependency scores from the CRISPR–Cas9 screen are available at https://depmap.org/portal/download/. ChIP-seq and RNA-seq data have been deposited to the SRA database under accession no. PRJNA751732. DESeq2 outputs from the RNA-seq experiments are found in Supplementary Data S4. CORENODE statistics for MHC type II 5-mer regressions are found in Supplementary Data S5. Plots of CORENODE n-mer regression fits for MHC type II genes are found in Supplementary Data S6. Expression of the GvL module in primary and post-allo-SCT relapse samples is found in Supplementary Data S7.
K. Eagle reports personal fees from Flare Therapeutics and Third Rock Ventures outside the submitted work; in addition, K. Eagle has a patent for Method of Inducing Therapeutic Selenium Deficiency pending. C.Y. Lin reports other support from Kronos Bio outside the submitted work. N.V. Dharia reports grants from Julia's Legacy of Hope St. Baldrick's Foundation Fellowship during the conduct of the study; other support from Genentech, Inc., a member of the Roche Group outside the submitted work. K. Stegmaier reports grants from NCI during the conduct of the study; personal fees from AstraZeneca, grants from KronosBio, Novartis, and other support from Auron Therapeutics outside the submitted work. No disclosures were reported by the other authors.
K. Eagle: Conceptualization, data curation, software, formal analysis, investigation, visualization, methodology, writing–original draft, writing–review and editing. T. Harada: Validation, investigation, visualization, methodology, writing–review and editing. J. Kalfon: Data curation, software, formal analysis, investigation, visualization, methodology. M.W. Perez: Data curation, software, formal analysis, investigation, visualization. Y. Heshmati: Formal analysis, validation, investigation, methodology, writing–review and editing. J. Ewers: Investigation. J. Vrabič Koren: Data curation, formal analysis, methodology. J.M. Dempster: Data curation. G. Kugener: Data curation. V.R. Paralkar: Conceptualization, investigation, writing–review and editing. C.Y. Lin: Conceptualization, supervision, investigation, writing–review and editing. N.V. Dharia: Conceptualization, resources, formal analysis, supervision, funding acquisition, investigation, project administration, writing–review and editing. K. Stegmaier: Resources, supervision, funding acquisition, investigation, project administration, writing–review and editing. S.H. Orkin: Conceptualization, resources, supervision, funding acquisition, investigation, writing–original draft, project administration, writing–review and editing. M. Pimkin: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing.
M. Pimkin was supported by a Damon Runyon–Sohn Pediatric Cancer Fellowship and a Young Investigator Award from the Alex's Lemonade Stand Foundation. This work was supported, in whole or in part, by research grants to M. Pimkin from the Curing Kids Cancer Foundation, When Everyone Survives Foundation, Pedals for Pediatrics, Children's Cancer Research Fund, Children's Leukemia Research Association, Hyundai Hope on Wheels, Kate Amato Foundation, and Boston Children's Hospital Office of Faculty Development. K. Stegmaier was supported by NIH 5R35 CA210030 and NIH P50 CA206963. N.V. Dharia was supported by the Julia's Legacy of Hope St. Baldrick's Foundation Fellowship. We thank Drs. L.S. Kean, G.C. Yuan and S. Avagyan for critical reading of this manuscript, and Drs. T. Ley and C. Miller for access to the scRNA-seq data.
Note: Supplementary data for this article are available at Blood Cancer Discovery Online (https://bloodcancerdiscov.aacrjournals.org/).