STAG2 encodes a cohesin component and is frequently mutated in myeloid neoplasms, showing highly significant comutation patterns with other drivers, including RUNX1. However, the molecular basis of cohesin-mutated leukemogenesis remains poorly understood. Here we show a critical role of an interplay between STAG2 and RUNX1 in the regulation of enhancer–promoter looping and transcription in hematopoiesis. Combined loss of STAG2 and RUNX1, which colocalize at enhancer-rich, CTCF-deficient sites, synergistically attenuates enhancer–promoter loops, particularly at sites enriched for RNA polymerase II and Mediator, and deregulates gene expression, leading to myeloid-skewed expansion of hematopoietic stem/progenitor cells (HSPC) and myelodysplastic syndromes (MDS) in mice. Attenuated enhancer–promoter loops in STAG2/RUNX1–deficient cells are associated with downregulation of genes with high basal transcriptional pausing, which are important for regulation of HSPCs. Downregulation of high-pausing genes is also confirmed in STAG2–cohesin-mutated primary leukemia samples. Our results highlight a unique STAG2–RUNX1 interplay in gene regulation and provide insights into cohesin-mutated leukemogenesis.

Significance:

We demonstrate a critical role of an interplay between STAG2 and a master transcription factor of hematopoiesis, RUNX1, in MDS development, and further reveal their contribution to regulation of high-order chromatin structures, particularly enhancer–promoter looping, and the link between transcriptional pausing and selective gene dysregulation caused by cohesin deficiency.

This article is highlighted in the In This Issue feature, p. 747

Myelodysplastic syndromes (MDS) and related disorders are heterogeneous groups of myeloid neoplasms showing varying degrees of cytopenia due to ineffective hematopoiesis and a high propensity to progression to acute myeloid leukemia (AML; ref. 1). During the past two decades, a complete registry of recurrent mutational targets, or driver genes, has been identified using advanced genomics (2–4). However, the functional basis of mutations has not been fully elucidated in many of those drivers. Among these are STAG2 and other members of the cohesin complex, including SMC1, SMC3, and RAD21, which are a new class of driver genes mutated in approximately 10% of MDS and other myeloid neoplasms, with STAG2 being most frequently affected (5–7). Most mutations in STAG2 are nonsense or frameshift, leading to protein truncation and loss-of-function (5). Involved in different cellular processes, such as sister chromatid cohesion during cell division and DNA repair (8), cohesin is also implicated in transcriptional control (9–12), possibly through regulating high-order chromatin structures (13). However, it is largely unknown how mutated cohesin contributes to myeloid leukemogenesis. In this study, through the analysis of interactions of gene mutations in a large cohort of MDS followed by the analysis of relevant mouse models, we show a strong functional interplay between STAG2 and RUNX1 in the regulation of chromatin structures and gene expression in the hematopoietic compartment, providing novel insight into the leukemogenic mechanism of a unique subset of myeloid neoplasms characterized by STAG2 and mutually highly correlated mutations.

Genetic Interaction of Mutations in Human MDS/AML

In MDS/AML, STAG2 mutations are rarely seen as a solitary mutation, but almost always accompanied by other mutations, frequently involving SRSF2, RUNX1, ASXL1, CEBPA, BCOR, EZH2, IDH2, and NRAS (2–4). To see this in more detail, we investigated significant mutational correlations in MDS and related myeloid neoplasms, using in-house or publicly available mutation data sets from 3,047 cases with MDS (n = 2,498) and related myeloid neoplasms (n = 549; refs. 2, 3, 14–18). After exhaustively evaluating correlations across all pair-wise combinations among 42 major drivers commonly mutated in MDS/AML, we detected a number of significant positive and negative correlations (Fig. 1A; Supplementary Tables S1–S3). Remarkably, the top-ranked six correlations were exhausted by all possible pair-wise combinations among four genes, STAG2, RUNX1, SRSF2, and ASXL1 (“SRSA” genes), which are involved in gene regulation, and were comutated at significantly higher frequencies than expected only by chance (Fig. 1A and B). One or more of these genes accounted for 31.8% (n = 970) of cases of MDS/AML, of which 346 (35.7%) had mutations in ≥2 of these genes and 75 (7.7%) and 33 (3.4%) carried mutations in three and all four genes, respectively (Fig. 1B; Supplementary Fig. S1A and S1B). Patients with ≥2 SRSA mutations had a significantly poor overall survival, compared with those with just one, which still negatively affected survival (Fig. 1C). Numbers of other driver mutations did not differ according to the number of SRSA mutations, suggesting that SRSA combination is not just a consequence of increased total mutations (Supplementary Fig. S2A). No significant difference was observed in the frequency of missense versus nonsense or frameshift mutations in the RUNX1 gene between STAG2 wild-type (WT) and mutated cases (Supplementary Fig. S2B). Analysis of variant allele frequency (VAF) suggests that SRSF2 mutations are acquired earlier than the other three mutations, followed by RUNX1 mutations and then STAG2 and ASXL1 mutations (Fig. 1D; Supplementary Fig. S2C). We also observed high frequency of converging evolution by way of “parallel” STAG2 mutations; multiple, as many as four, independent STAG2-mutated subclones were detected in 17 (22%) of 76 evaluable cases with STAG2 mutations, of which 16 carried RUNX1, SRSF2, and/or ASXL1 mutations in the major tumor population, indicating that STAG2 mutations should confer a strong selective advantage in these mutational contexts (Fig. 1E). Combined, these findings suggest strong functional interactions among SRSA mutations in positive selection that underlie the development/progression of MDS.

Expanded HSPC Pools and Differentiation Block in Stag2 Knockout Mice

To understand the leukemogenic mechanism of SRSA-mutated MDS, particularly focusing on STAG2 mutation, which showed a unique converging evolution pattern (Fig. 1E) and has been less studied in terms of functional consequence compared with other SRSA genes, we first generated a mouse model having a conditional Stag2 knockout allele with an Mx1-Cre transgene (SKO; Mx1-Cre+Stag2fl/−; Supplementary Fig. S3A–S3C; see Methods). After polyinosinic-polycytidylic acid (pIpC) injection, SKO mice exhibited a slightly decreased white blood cell (WBC) count with a large reduction in B-lymphocytes (B220+), compared with littermate WT mice (Mx1-CreStag2fl/−; Fig. 2A). Although no significant changes were observed in hemoglobin level and platelet count between SKO and WT mice, SKO mice showed significantly increased red cell distribution width (RDW), suggestive of dyserythropoiesis (19), and pathologic examination revealed mild trilineage bone marrow (BM) dysplasia and a slightly enlarged spleen with evidence of extramedullary hematopoiesis in SKO mice (Fig. 2A; Supplementary Fig. S3D and S3E). Collectively, these findings support the presence of a mild MDS-like phenotype in SKO mice, although overall survival did not differ between SKO and WT mice (Supplementary Fig. S3F). In BM, SKO mice had a higher frequency of Lin/Sca1+/c-Kit+ (LSK) cells, compared with WT littermates, where all major subfractions of LSK cells were increased (Fig. 2B; Supplementary Fig. S4A), indicating expanded hematopoietic stem/progenitor cell (HSPC) pools. The increase was most prominent in myeloid-biased progenitors, including multipotent progenitor 2 (MPP2; Lin/c-Kit+/Sca-1+/CD135/CD150+/CD48+) and MPP3 (Lin/c-Kit+/Sca-1+/CD135/CD150/CD48+), followed by long-term hematopoietic stem cells (LT-HSC; Lin/c-Kit+/Sca-1+/CD135/CD150+/CD48) and short-term HSCs (ST-HSC; Lin/c-Kit+/Sca-1+/CD135/CD150/CD48), suggesting the presence of myeloid skewing (20). The myeloid skewing was also evident in more differentiated progenitors as evident from increased common myeloid progenitors (CMP; Lin/c-Kit+/Sca-1/CD34+/FcγRmed) and granulocyte–macrophage progenitors (GMP; Lin/c-Kit+/Sca-1/CD34+/FcγRhi), and decreased common lymphoid progenitors (CLP; Lin/c-Kitmed/Sca-1med/IL7Rα+/Flt3+; Fig. 2C; Supplementary Fig. S4B). Of note, SKO mice showed decreased frequencies of megakaryocyte/erythrocyte lineage-restricted progenitors (MEP; Lin/c-Kit+/Sca-1/CD34/FcγRlo) and erythroid progenitors (Ter119+/CD71+), suggesting a blocked differentiation into erythromegakaryocyte lineages (Fig. 2C; Supplementary Fig. S4B–S4D). Mature cells were also skewed to myeloid lineages with increased granulocytes/monocytes (CD11b+) and decreased B-lymphocytes in the BM and spleen (Fig. 2D; Supplementary Fig. S4E and S4F). Extramedullary hematopoiesis was evident from increased frequencies of erythroid progenitors in the spleen (Supplementary Fig. S4C and S4D). In agreement with expanded HSPC pools, SKO-derived BM cells showed an enhanced clonogenicity in replating assays (Fig. 2E). HSCs (CD150+/CD48 LSK cells) from SKO mice showed a decreased frequency of apoptotic cells (Annexin+/7-AAD) and an enhanced cell cycling (S/G2/M; Ki-67+/Hoechst+) with decreased quiescent (G0; Ki-67/Hoechst) cells, compared with those from WT mice (Fig. 2F and G; Supplementary Fig. S4G and S4H). In competitive repopulation assay, SKO-derived cells showed enhanced chimerism within the LSK fraction, although the chimerism of SKO-derived cells was not significantly changed in total BM and even reduced in peripheral blood, particularly within lymphocytes (Fig. 2H). These results suggest an enhanced self-renewal and repopulation capacity of SKO-derived progenitors with a block in lymphoid differentiation.

As expected from the myeloid skewing in SKO mice, transcriptome analysis demonstrated upregulation and downregulation of genes implicated in myeloid and lymphoid programs in SKO, respectively, including a number of transcription factors (TF; Fig. 2I and J; Supplementary Fig. S4I). Among these, we noted the elevated expression of Runx1 in SKO-derived LSK and CMP cells (Fig. 2K; ref. 21). The following assay for transposase-accessible chromatin using sequencing (ATAC-seq) analysis (22) showed enrichment of the binding motifs of RUNX1 and GATA2 in enhanced ATAC peaks and enrichment of the binding motifs of interferon regulatory factors (IRF) in reduced ATAC peaks in SKO-derived LSK and CMP cells (Fig. 2L and M; Supplementary Fig. S5A–S5C). Furthermore, we confirmed an increased RUNX1 binding to its consensus motifs in SKO cells by chromatin immunoprecipitation sequencing (ChIP-seq; Supplementary Fig. S5D). These findings suggest upregulated expression of RUNX1-regulated genes. However, interestingly, genes downregulated in SKO-derived LSK cells showed a highly significant enrichment in genes downregulated in Runx1-knockout mice (RKO; Mx1-Cre+Runx1fl/fl) and vice versa (Fig. 2N), suggesting a functional interplay between STAG2 and RUNX1, a typical SRSA combination.

Stag2/Runx1 Codeficiency Induces MDS in Mice

To see the interplay between both proteins, we investigated the effects of Stag2/Runx1 double knockout (DKO; Mx1-Cre+Stag2fl/−Runx1fl/fl) on hematologic phenotype in the transplantation setting, in which DKO- as well as single KO–derived BM cells were transplanted into lethally irradiated mice, followed by pIpC injection. Compared with single KO–transplanted mice, in which cytopenia was confined to lymphocytes (SKO and RKO) and platelets (RKO), DKO-transplanted mice exhibited more profound cytopenia (Fig. 3A). WBC count was markedly reduced and the reduction was seen not only in lymphocytes but also in granulocytes and monocytes. Although not apparent in single KO–transplanted mice, anemia was evident with markedly increased MCV and RDW (Fig. 3A; Supplementary Fig. S6A). In contrast to severely reduced peripheral blood counts, the frequency of immature BM progenitors (LSK cells) was almost doubled compared with that in single KO–transplanted mice, where the increase was mostly explained by MPP2 and MPP3, although LT- and ST-HSCs were significantly reduced compared with those in SKO-transplanted mice (Fig. 3B and C; Supplementary Fig. S6B). Although more mature progenitors exhibited variable profiles depending on genotype, they were largely maintained in DKO-transplanted mice, except for severely reduced frequency of colony-forming unit erythroid cells (CFUe) and Ter119+CD71+ erythroid progenitors (Fig. 3DF; Supplementary Fig. S6C–S6F). Lineage-committed mature cells were synergistically skewed to myeloid lineages by Stag2/Runx1 deletion (Fig. 3G; Supplementary Fig. S6E). Prolonged clonogenicity was also observed in replating assays (Supplementary Fig. S6G). In competitive repopulation assays, the peripheral blood chimerism was severely impaired, even though the chimerism in the progenitor (LSK) fraction was not significantly affected, which was compatible with severe peripheral blood cytopenias (Supplementary Fig. S6H). Conspicuously, all mice transplanted with DKO-derived BMcells developed overt MDS mostly within half a year after pIpC injection, with severe cytopenia and marked trilineage dysplasia, whereas none of the animals transplanted with SKO or RKO cells developed MDS, although a number of deaths were also observed in RKO-transplanted mice with no exacerbation of blood cell counts (Fig. 3H and I; Supplementary Fig. S6I–S6L). In transcriptome analysis and ATAC-seq of LSK cells, DKO-derived LSK cells showed more extensive alterations in gene expression and chromatin accessibility, compared with single KO–derived cells; DKO cells showed a much higher number of differentially expressed genes (DEG) and differentially enhanced or attenuated chromatin accessibility sites, compared with single KO (Fig. 3J and K; Supplementary Fig. S6M). These findings in the phenotype of DKO mice further support the functional interplay between STAG2 and RUNX1.

Unique Binding of STAG2–Cohesin and RUNX1 to Enhancers

To understand the molecular basis of the interplay between STAG2 and RUNX1, we performed an extensive ChIP-sequencing analysis, in which genomic localization of STAG2 and other cohesin components (STAG1 and SMC1), RUNX1, CTCF, and major histone marks was investigated in WT-derived HSPCs (c-Kit+ BM cells). We identified a total of 27,997 cohesin binding sites that showed signal peaks in STAG1 and/or STAG2. On the basis of relative ChIP signals for CTCF and H3K27ac, indicative for insulator and enhancer sites, respectively, these cohesin binding sites were separated into two discrete groups showing high relative CTCF signals [cohesin binding cluster-I (CC-I) sites] and high relative H3K27ac signals [cohesin binding cluster-II (CC-II) sites; Fig. 4A; Supplementary Fig. S7A and S7B]. Explaining most of the cohesin binding sites (n = 24,364, 87%), CC-I sites generally had comparable STAG1 and STAG2 signal intensities with discrete signal peaks and were characterized by the paucity of active histone marks (Fig. 4A). In contrast, accounting for only 13% (n = 3,633) of all cohesin binding sites detected, CC-II sites had stronger STAG2 than STAG1 signals showing broader chromatin binding peaks (Fig. 4A and B). As expected from their association with high H3K27ac signals, most CC-II sites also had abundant signals of other active histone marks, including H3K4me1 and H3K4me3, with scarcity of H3K27me3 signals, suggesting their enrichment in active promoters and enhancers (23). Conspicuously, CC-II sites were highly enriched for RUNX1 binding (Fig. 4A). In accordance with this, a subset (approximately 10%) of STAG2 signals were colocalized with RUNX1 signals in immunofluorescence of mouse HSPCs and human K562 leukemia cell lines using super-resolution microscopy (Fig. 4C). Moreover, RUNX1 and STAG2 were shown to be coimmunoprecipitated with SMC1 and SMC3 in mouse (32Dcl3) and human (K562) myeloid leukemia cell lines (Supplementary Fig. S7C and S7D). In addition, according to the published ChIP-seq data from a murine HSPC cell line (HPC-7; ref. 24), CC-II sites were highly enriched not only for RUNX1 binding but also for binding of other TFs implicated in hematopoiesis as well as ASXL1 (25), RNA polymerase II (Pol II) and MED12 (ref. 26; Supplementary Fig. S7E and S7F), which was also supported by an enrichment of consensus motifs of many hematopoietic TFs in CC-II sites compared with CC-I sites (Supplementary Fig. S7G). The distinct role of STAG2 at CC-II sites as compared with at CC-I sites was further supported by ChIP-seq analysis in SKO-derived cells, in which only slight changes in STAG1 binding were observed at CC-II sites in contrast to remarkably increased binding of STAG1 at CC-I sites. This suggests that STAG1–cohesin does not replace STAG2–cohesin on CC-II sites, even in the face of STAG2 deficiency (Fig. 4D). CTCF signals slightly decreased at CC-I sites but not at CC-II sites, consistent with specific binding of CTCF at CC-I sites (Fig. 4D). These findings in ChIP-seq suggest a distinct role of STAG2 and RUNX1 bindings at CC-II sites, in which active enhancers are enriched (Fig. 4E).

Stag2/Runx1 Codeficiency Disrupts Enhancer–Promoter Loops

We next investigated the effects of STAG2 and/or RUNX1 depletion on chromatin structures, using deep in situ Hi-C analysis of c-Kit+ HSPCs from mice with different genotypes (27), which yielded 1.79 billion valid interactions per genotype on average. Overall, Stag2/Runx1 deletions did not substantially affect boundaries of the large genomic structures that are known as compartment A/B, which largely correspond to transcriptionally active/inactive genomic regions, respectively (Supplementary Fig. S8A–S8C). However, insulations at boundaries of all topologically associating domains (TAD) were slightly enhanced in SKO and DKO cells, compared with those in WT cells (Supplementary Fig. S8D). Cohesin binding sites CC-I and CC-II were highly enriched at boundaries and inside regions of TADs, especially those located in compartment A (A-TAD), respectively (Fig. 5A; Supplementary Fig. S8E) and as expected most of the DEGs in SKO-, RKO-, or DKO-derived LSK cells, as compared with WT-derived cells, were mapped within A-TADs (Fig. 5B). Thus, we calculated average difference in Hi-C contacts between mutant and WT cells, focusing on A-TADs (n = 3,295) after their size was normalized (Fig. 5C). SKO-derived cells showed slightly reduced short-range contacts in the vicinity of the bottom line, whereas the contacts were slightly enhanced at the TAD corner. In contrast, chromatin contacts in the corresponding A-TADs exhibited a more profound and widespread reduction in DKO cells, even though RKO alone minimally influences intra-TAD chromatin contacts. When the analysis was stratified for TAD hierarchy, the contacts tended to be more attenuated in nested TADs than the parental TADs (Fig. 5D).

We also evaluated the effects of Stag2/Runx1 KO on the number of chromatin loops. Consistent with relative spatial distribution of CC-I and CC-II sites (Supplementary Fig. S8E), CC-II–anchored loops were shorter than CC-I–anchored loops (Fig. 5E). The number of loops originating from CC-II sites was reduced in SKO-derived cells and further decreased in DKO-derived cells, whereas the effects on loops originating from CC-I sites remained minimum across the genotypes (Fig. 5F; Supplementary Fig. S8F–S8H). When these loops were characterized by the presence or absence of CTCF (CC-I), promoter, and enhancer at loop anchors, we found that the loops between enhancer–promoter (E-P), promoter–promoter (P-P), and enhancer–enhancer (E-E) were more selectively disrupted in SKO and DKO mice than other types of loops associated with CTCF (Fig. 5G and H). Moreover, these findings in the mouse model were confirmed in human leukemia cells using a series of isogenic AML cell lines derived from HL-60, which were targeted for STAG2 and RUNX1 using a CRISPR/Cas9 system, in which synergistic disruption of CC-II loops was recapitulated in STAG2/RUNX1 DKO cells (Supplementary Fig. S9A–S9F).

To further understand the combinatorial effects on chromatin looping, we evaluated how the CC-II–anchored loops identified in WT cells are altered in SKO and DKO cells (Fig. 5I). As shown in Fig. 5J, the CC-II sites anchoring loops identified in WT cells (group 1) were divided into four groups (groups 2–5) depending on whether the loop at each site was lost or preserved in SKO and DKO cells, which were further investigated for the binding patterns of cohesin components, Pol II, Mediator, and ten hematopoietic TFs at each group. Interestingly, we found relative enrichment of STAG2 at groups 2 and 4, strong enrichment of Pol II and Mediator at group 4, and relative enrichment of certain TFs at group 4 (Fig. 5K). In contrast, STAG1 enrichment was observed at group 5, that is, CC-II sites with DKO-resistant loops. In addition, group 4 is bound by multiple TFs more frequently than other groups (Fig. 5L). These results suggest that, in the absence of STAG2, RUNX1 deficiency additionally disrupts E-P loops anchored at STAG2 binding sites enriched for Pol II, Mediator, and TFs, leading to profound disruption of short-range chromatin interactions in DKO (Fig. 5M).

Transcriptional Pausing Underlies Transcriptional Vulnerability to Stag2/Runx1 Codeficiency

To see the effects of SKO and/or RKO on gene expression, we next investigated gene expression in LSK cells from these mice in greater details. On the basis of unsupervised clustering, we identified five discrete groups of genes (groups I–V) differentially expressed across four genotypes (Fig. 6A and B; Supplementary Fig. S10A). These groups were associated with distinct gene ontology terms and tissue- or hematopoietic lineage–specific gene-expression profiles (Fig. 6C; Supplementary Fig. S10B and S10C), and group II and IV genes were synergistically upregulated and downregulated in DKO mice, respectively. These indicate that despite being globally seen across the entire genome, loop attenuation does not necessarily result in a uniform change in overall gene expression. This is in agreement with the report that alteration in gene expression remained moderate and was seen only in a subset of genes, even when all loop structures were disrupted by complete loss of cohesin (28), suggesting that in general the effect of chromatin looping on gene expression is modest and may be influenced by other contexts.

In this regard, it has been reported that superenhancer (SE)-associated genes are more prone to downregulated expression in cohesin-deficient cells (28). In our mouse models, SE-associated genes also showed a trend of being downregulated in SKO and DKO cells, compared with RKO and WT cells, on average (Fig. 6D and E; Supplementary Fig. S10D). However, the effect of knockout was highly variable depending on SE site, compared with the case with typical enhancer (TE), particularly in DKO. This was exemplified for four SE-associated genes, Hoxa9 (group IV), Gata2 (group II), and Fos/Fosb (group II; Fig. 6D). Hoxa9 and other Hoxa cluster genes, which have long been implicated in the regulation of normal hematopoiesis and leukemogenesis (29), were downregulated in SKO/DKO cells, where attenuated loops were observed (Fig. 6F; Supplementary Fig. S10E). Thus, we further investigated the roles of Hoxa9, which was also downregulated in shRNA mouse models targeting the cohesin components Stag2 and Smc1 (ref. 10; Supplementary Fig. S11A). In serial replating assay, Hoxa9 overexpression did not prolong the replating for SKO/DKO cells, although the number of colonies increased substantially in Hoxa9-expressing cells (Supplementary Fig. S11B). On the other hand, in in vitro single-cell differentiation assay, DKO-derived HSPCs showed a reduced frequency of erythroid-containing colonies compared with WT-derived HSPCs, which, however, was substantially rescued by Hoxa9 overexpression (Supplementary Fig. S11C–S11E). This suggests that the enhanced self-renewal in DKO is not attributed to Hoxa9 downregulation but mediated by other mechanisms, while downregulated Hoxa9 might contribute to differentiation block in these mice.

In contrast to Hoxa9 downregulation, even being associated with attenuated E-P loops, Gata2 and Fos/Fosb, as well as other AP1 components, showed upregulated expression (Supplementary Fig. S12A). Of note, in ATAC-seq analysis of DKO-derived LSK cells, consensus binding motifs of AP1, GATA2, RUNX1, IRFs, and HOXA9 were enriched in enhanced ATAC peaks frequently associated with group II and V genes (AP1 and GATA2) and attenuated ATAC peaks frequently associated with group IV genes (RUNX1, IRFs, and HOXA9), respectively (Fig. 6G and H; Supplementary Fig. S12B and S12C). Moreover, AP1 and GATA2 motifs were highly enriched in promoter regions in group II genes (Supplementary Fig. S12D). Thus, modulation of promoter activities by AP1 and GATA2 may correspond to gene upregulation in group II genes by overriding detrimental effects of global loop suppression.

To further investigate the effect of E-P loop attenuation on gene expression, we next evaluated the link with transcriptional pausing, which has been known to correlate with enhancer activity and be implicated in the regulation of differentiation potential of stem cells (30–32). Of particular interest in this regard, group IV genes showed a high basal pausing level, as assessed by pausing index calculated by total Pol II ChIP-seq (Fig. 6I; Supplementary Fig. S13A). We separately confirmed that high-pausing genes were consistently and significantly downregulated in SKO and more profoundly decreased in DKO (Supplementary Fig. S13B). Degrees of pausing did not influence the expression specificity across diverse hematopoietic lineages (Supplementary Fig. S13C). These findings suggest a higher vulnerability of genes with high basal pausing to attenuated chromatin interactions caused by Stag2/Runx1 deficiency. In addition, in coordination with reduced numbers of E-P loops in SKO and DKO cells across all DEG groups (Fig. 6J), we observed a substantial decrease in ChIP-seq intensities of Ser5-phosphorylated (Ser5-P) Pol II at the promoter proximal regions in SKO and DKO in all DEG groups, suggesting the global impact of E-P loop attenuation on promoter-proximal Pol II dynamics (Fig. 6K).

Explaining partial similarities of transcriptomes in SKO and RKO (Fig. 2N), group IV genes were downregulated in SKO/RKO in a synergistic manner (Fig. 6A and B; Supplementary Fig. S10A) and were most strongly enriched for genes specifically expressed in HSCs across diverse hematopoietic lineages (Fig. 6C), suggesting a role of downregulated group IV genes in the hematologic phenotypes of Stag2/Runx1 deficiency. We also performed RNA-sequencing (RNA-seq) analysis in HL-60 cell lines with STAG2/RUNX1 deficiency. Although there were substantial differences in altered pathways in mouse LSK cells and HL-60 cells, we observed considerable overlaps of downregulated pathways in two experimental systems (Supplementary Fig. S14A and S14B).

Downregulation of High-Pausing Genes in STAG2/Cohesin-Mutated Human MDS/AML

Finally, we evaluated to what extent the above findings in mice could be extended in human MDS/AML samples carrying STAG2 and/or other SRSA mutations, as well as other cohesin mutations, using transcriptome data from three published cohorts of MDS/AML (6, 33, 34). An integrated pathway analysis of altered gene expression revealed a number of pathways changing in the same directions among the SKO mouse model, cohesin-mutated MDS/AML cohorts, and the HL-60 cell line model, including interferon (IFN) response, regulation of leukocyte, adaptive immune responses, inflammatory response, ribosomal translation, and regulation of DNA/expression (Fig. 7A). Recapitulating the case with Stag2/Runx1-deficient mouse models, expression of HOXA9 and other HOXA cluster genes showed a uniform trend of downregulation with an increasing number of SRSA mutations in human samples (Fig. 7B). SE-associated genes identified in human normal HSPCs (35) were marginally downregulated in cohesin/STAG2-mutated MDS samples, as compared with TE-associated genes; however, this trend was less clear in AML samples (Supplementary Fig. S15A and S15B). We also evaluated the effect of basal level of transcriptional pausing on gene expression in SRSA-mutated samples, where the basal pausing level was calculated using total Pol II ChIP-seq of human normal HSPCs (36). High-pausing genes identified in mouse and human HSPCs were associated with several molecular pathways including IFN response and DNA repair response (Fig. 7C), consistent with downregulation of IFN and inflammatory responses in pathway network analysis (Fig. 7A). Conspicuously, high-pausing genes are consistently downregulated in cohesin- or STAG2-mutated samples in all three cohorts studied (Fig. 7D; Supplementary Fig. S15C and S15D). Moreover, high-pausing genes were preferentially downregulated in samples with ≥2 SRSA mutations, and STAG2-mutated cases with other SRSA mutations (Fig. 7E and F; Supplementary Fig. S15E). Taken together, these results support the relevance of transcriptional pausing and downregulation of high basal pausing genes in leukemogenesis not only of Stag2/Runx1 DKO mice but also of human MDS/AML carrying multiple SRSA mutations.

We have revealed a unique interplay between STAG2 and RUNX1 in the maintenance of short-range chromatin interactions and transcriptional regulation, disruption of which leads to the development of MDS in mice. The interplay seems to depend on their localization to CTCF-deficient, enhancer-proficient sites, in which the role of STAG2 cannot be replaced by STAG1. The distinct role of STAG2 in enhancer-rich regions is also suggested from the analysis of cultured cell lines in a recent report (37). Given the minimum changes with RKO alone, it is rather unexpected that STAG2/RUNX1 DKO causes more profound chromatin alterations across the entire genome, compared with SKO alone. In the absence of STAG2, RUNX1 loss additionally disrupts E-P loops anchored at cohesin binding sites highly enriched for Pol II and Mediator. This suggests that such E-P loops vulnerable to combined loss of STAG2 and RUNX1 may be associated with formation of transcriptional condensates through phase separation capacities of Pol II, Mediator, and TFs, as has emerged from several recent studies (38–40).

The effects of cohesin loss on whole transcriptomes are modest despite of the essential roles in regulation of looping (28). Of interest, genes showing high basal transcriptional pausing are more prone to downregulated expression, compared with genes with low transcriptional pausing. Such genes may require the assistance of intact E-P loop formation and relevant enhancer-bound TFs, such as RUNX1, for their transcription/expression, as group IV genes are more profoundly downregulated in DKO cells. The link between gene expression and transcriptional pausing is in contrast to a previous study (28) that highlighted the effect of cohesin-mediated looping on the expression of SE-associated genes as typically seen for Hoxa genes in DKO mice. Given that SE-associated genes show a lower degree of pausing than TE-associated genes (Supplementary Fig. S13A; ref. 32), transcriptional pausing should be considered as a novel gatekeeper to determine selective expression changes, distinct from association with SE. High-pausing genes in HSPCs are associated with several molecular pathways including IFN response and DNA-repair response. Thus, downregulation of high-pausing genes is consistent with downregulation of IFN and inflammatory responses at whole-transcriptome levels and enrichment of IRFs in reduced ATAC peaks in SKO and DKO. These findings suggest that transcriptional pausing provides the molecular basis of a previously reported downregulation of IFN and inflammatory responses, which control self-renewal and differentiation of HSPCs, in cohesin-mutated AML and hematopoietic cells (41). Taken together, perturbation of both several SE-associated genes including Hoxa9 and highly paused genes may contribute to multiple differentiation abnormalities and cell transformation in MDS/AML.

Distinct roles of STAG1 and STAG2 in HSPC self-renewal and differentiation in mouse models has been reported in a recent study (42). Although STAG2-deficient mice display similar phenotypes in two studies, our study illustrates several novel findings: (i) association of mutations in SRSA genes; (ii) mechanistic insights of combined effects of STAG2 and RUNX1 loss on chromatin looping; and (iii) the link between transcriptional pausing and selective gene dysregulation. Explaining the top significant associations between driver mutations, SRSA genes showed a conspicuous comutation pattern among MDS and related disorders, and also significantly comutated in a subset of primary AML, reported as AML with chromatin spliceosome mutation (43) or secondary-type primary AML (15), suggesting that these mutations define a unique subset of myeloid neoplasms, where the deregulated interplay of these genes in the maintenance of enhancer activity might explain the common pathogenesis. The significant downregulation of high-pausing genes in samples with cohesin/STAG2 and other SRSA mutations in three independent cohorts of MDS/AML strongly supports this idea. A previous study reporting the interaction between ASXL1 and cohesin (25) and our finding of highly enriched ASXL1 binding at CC-II sites in mice (Supplementary Fig. S7F) are also suggestive of this, as well as a recent report demonstrating the enhanced transcriptional pausing induced by mutated SRSF2 (44). We did not observe a significant increase in abnormal splicing in Stag2/Runx1 DKO cells, suggesting that myelodysplasia in DKO mice is not likely due to altered splicing (Supplementary Fig. S15F). It is warranted in future studies to better clarify the molecular basis of this unique subset of myeloid neoplasms.

Correlations between Driver Mutations in Human MDS/AML

We analyzed in-house or publicly available data on large-scale genetic profiling of MDS/AML (2, 3, 14–18) for the investigation of correlations among major driver mutations commonly mutated in myeloid neoplasms. A total of 3,047 cases with MDS/AML were analyzed for correlations across 42 frequently mutated genes by Fisher exact test as described previously (45). Patient survival was analyzed in 831 patients, for which data on survival was available (2). We analyzed number of driver mutations, VAFs of mutations, type of RUNX1 mutations, and status of STAG2 mutations in 76 cases which harbored more than one STAG2 mutation, in an MDS cohort (2). The tumor cell fraction was calculated as previously described (46) with minor modifications. The tumor content was regarded as the maximum adjusted VAF among driver mutations in each case. Patient cohort is summarized in Supplementary Table S1. Clinical characteristics of patients with MDS (2) with STAG2, RUNX1, SRSF2, and/or ASXL1 mutations are summarized in Supplementary Table S2. Correlations between mutations in patients with MDS/AML are described in Supplementary Table S3.

Mice

Animal care was in accordance with institutional guidelines and approved by the Animal Research Committee, Graduate School of Medicine, Kyoto University (Kyoto, Japan). To generate the Stag2 conditional knockout mouse model, the Stag2-targeting vector was constructed based on the Cre-LoxP system, in which two LoxP sites were inserted flanking exon 7, which encodes a STAG domain, and Frt-flanked neomycin selection cassette in a downstream intron. The linearized targeting vector was electroporated into murine C57BL/6 embryonic stem cells (RENKA strain) and cells were selected in G418. Homologous recombination was confirmed by direct sequencing of both outer regions of the neomycin cassette, and also by Southern blotting using neomycin-specific probes. Chimeric mice were produced by microinjection of targeted embryonic stem cells into blastocysts and were bred to C57BL/6 mice to establish germline transmission (Trans Genic Inc.). The generated mice were initially crossed to a germline Flp deleter (The Jackson Laboratory) to eliminate the neomycin cassette, and subsequently to the IFN-inducible Mx1-Cre transgenic mice (47). Mx1-Cre expression was induced by intraperitoneally injecting 500 μg of pIpC on six alternate days. Genotyping of Stag2 floxed alleles was performed as described previously (48) using the following primers: 5′-GACCACTAAGCTCATAATCGC-3′ and 5′-ATTTCTGGCTACTACGCTTGC-3′. Runx1 conditional knockout mice were described previously (49). CD57BL/6 CD45.1+ mice and CD57BL/6 F1-CD45.1+/CD45.2+ mice were purchased from Sankyo Lab Service.

Retroviral Transduction

For Hoxa9 overexpression, we used FLAG-tagged Hoxa9 (50) in the pMSCV-neo retroviral vector in serial replating assay and pGCDNsam-IRES-EGFP retroviral vector in single-cell differentiation assay. Retroviruses were produced by transient transfection of Plat-E packaging cells with retroviral constructs. Purified c-Kit+ HSPCs were cultured in Iscove's Modified Dulbecco's Medium containing 20% FBS and 50 ng/mL mouse SCF, mouse TPO, and human FLT-3L for 24 hours, and then were retrotransduced using Retronectin (Takara).

Serial Replating Assay

Freshly isolated 20,000 BM or 2,000 c-Kit+ cells were seeded into cytokine-supplemented methylcellulose medium (Methocult, M3434; STEMCELL Technologies) according to the manufacturer's protocol. For Hoxa9 overexpression assay, c-Kit+ cells transduced by pMSCV-neo vector were selected by adding G418 to the medium at the final concentration of 1 mg/mL. Colonies propagated in culture were scored at day 14 in duplicate. For replating, cells were resuspended in PBS and counted, and 10,000 cells were replated in duplicate once a week.

Flow Cytometry

Single-cell suspensions were stained with mAbs as described previously (48). Stained cells were analyzed with FACS Aria III or LSRFortessa X-10 flow cytometers (BD Biosciences). Cell sorting was performed with FACS Aria III. Cell cycle and apoptosis analysis were performed as described (48). Data were analyzed by FlowJo software (Tree Star). The antibodies used in FACS experiments are described in Supplementary Table S4.

Isolation of Mouse Hematopoietic Progenitors

Purification of HSPCs was performed as described previously (48). Briefly, freshly isolated BM cells were stained with APC-conjugated anti–c-Kit antibody and were enriched using anti-APC magnetic beads and MACS LS columns (Miltenyi Biotec) according to the manufacturer's instructions.

BM Transplantation

In noncompetitive BM transplantation (BMT) assay, unfractionated BM cells (2 × 106 cells) from CD45.2 donor mice were transplanted into 8- to 12-week-old female CD45.1 recipient mice, which were lethally irradiated at 9.5 Gy dose. In competitive BMT assay, unfractionated BM cells (1 × 106 cells) from CD45.2 donor mice were transplanted into 8- to 12-week-old female CD45.1 recipient mice which were lethally irradiated at 9.5 Gy dose, together with the equal number of BM cells from CD45.1/CD45.2 competitor mice. In the BMT experiments, pIpC was injected at 4 weeks after BMT. The donor chimerism in peripheral blood was evaluated at 4, 8, 12, 16, and 20 weeks after transplantation by flow cytometry. Mice were sacrificed at 20 weeks after transplantation, and the chimerism in each BM fraction was assessed by flow cytometry.

Generation of CRISPR Knockout Cell Lines

Human AML HL-60 cell lines were obtained from ATCC and not tested for Mycoplasma contamination. Single guide RNAs (sgRNA) targeting STAG2/RUNX1 were designed, and knockout efficiency was confirmed by Guide-it Mutation Detection Kit (Takara). To express SpCas9, the HL-60 cell line was transduced by lentiCas9-Blast vector (Addgene #52962) and selected by blasticidin at the concentration of 5 μg/mL. To generate WT, SKO, RKO, and DKO cells, cells were transduced by pLKO5-sgRNA-EFS-GFP vector (Addgene #57822) containing sgRNA targeting STAG2 or nontargeting control sgRNA and pLKO5-sgRNA-EFS-tRFP vector (Addgene #57823) containing sgRNA targeting RUNX1 or nontargeting control sgRNA, and GFP+/RFP+ cells were sorted at single cell per well into a 96-well plate. Generated colonies were genotyped by amplicon sequencing using iSeq 100 (Illumina). All clones for SKO, RKO, and DKO were subjected to Western blotting to confirm the knockout of STAG2 and/or RUNX1. PCR primers and sgRNA sequences are provided in Supplementary Table S5, and genotypes of generated clones are described in Supplementary Table S6.

RNA-seq

RNA was extracted using RNeasy Mini Kit (Qiagen) or NucleoSpin RNA XS (Macherey-Nagel). Libraries for RNA-seq were prepared using the NEBNext Ultra RNA Library Prep Kit for Illumina (New England BioLabs) and were subjected to sequencing using HiSeq 2500 or NovaSeq 6000 instrument (Illumina) with a standard 100 to 150-bp paired-end protocol as previously described (51). RNA-seq experiments were performed in two or more biological replicates. The sequencing reads were aligned to the reference genome (hg19 or mm9) using STAR (v2.5.3; ref. 52). Reads on each refSeq gene were counted with featureCounts (v1.5.3; ref. 53) from Subread package, and edgeR package in R (54) was used to identify the differentially expressed genes with FDR threshold of 0.05 and to generate the multidimensional scaling plot. The analysis was performed in genes expressed at >1 count per million in two or more samples, and generalized linear models were used to compare gene expression data. MSigDB overlap analysis was performed using MSigDB database and hallmark gene sets (https://www.gsea-msigdb.org/gsea/msigdb/index.jsp). RNA expression analysis in the hematopoietic system was carried out using Haemopedia RNA-seq datasets (55), and averages of log2 (TPM + 1) values of each gene set were calculated for each cell type. Gene set enrichment analysis (GSEA; v2.2.4; ref. 56) was used to determine the sets of genes that are significantly different between groups. Enriched pathways were visualized using Enrichment Map (57) with the q-value < 0.05 and overlap similarity coefficient parameters > 0.5. RNA-seq datasets used are described in Supplementary Table S7. DEGs shown in Fig. 2I are described in Supplementary Tables S8 and S9. Differentially expressed genes shown in Fig. 3J are described in Supplementary Tables S10–S15. More detailed methods are described in the Supplementary Methods.

ATAC-seq

ATAC-seq experiments were performed using LSK or CMP cells obtained from WT-, SKO-, RKO-, or DKO-transplanted mice using Fast-ATAC protocol as described previously (58) with minor modifications. Briefly, 10,000 freshly isolated cells were pelleted and 50 μL of transposase mixture (25 μL of 2x TD buffer, 2.5 μL of TDE1, 0.5 μL of 1% digitonin, and 22 μL of nuclease-free water; FC-121-1030, Illumina; G9441) was added to the cells. After transposition reactions at 37°C for 30 minutes, transposed DNA was purified using QIAGEN MinElute Reaction Cleanup Kit. Transposed fragments were PCR-amplified, and the resulting library was sequenced on HiSeq 2500. ATAC-seq experiments were performed in biological duplicates. Reads were aligned to the mouse mm9 reference genome using bowtie2 (v2.3.3; ref. 59) with -X 2000 –no-mixed –very-sensitive parameters following adapter trimming using cutadapt (v1.14; ref. 60). Duplicates were removed by Picard (v2.6.0; https://broadinstitute.github.io/picard/), and reads on mitochondria genome or blacklisted regions (ENCODE) were removed by bedtools (v2.27.1; ref. 61). Peaks were called with MACS (v2.1.1; ref. 62) with –nomodel –broad parameters with a q-value threshold of 1 × 10−5 for individual replicate as well as merged data of all replicates. Read counts on peaks for merged data were counted with the multicov function in bedtools, and edgeR was used to identify the peaks with differential accessibility with FDR threshold of 0.05. TF motifs were discovered with HOMER (63) in differentially accessible sites (up or down, compared with WT) using stable peaks or random genome as backgrounds. Peak annotation was performed with HOMER. Differentially accessible ATAC regions in SKO-derived LSK and CMP cells (vs. WT) shown in Fig. 2 are described in Supplementary Tables S16–S19.

ChIP-seq

ChIP-seq experiments were performed using c-Kit+ HSPCs or HL-60 cell lines. Cells were fixed in PBS with 1% formaldehyde (Thermo Fisher Scientific) for 10 minutes at room temperature with gentle mixing. The reaction was stopped by adding glycine solution (10x; Cell Signaling Technology) and incubating for 5 minutes at room temperature, and the cells were washed in cold PBS twice. The cells were then processed with SimpleChIP Plus Sonication Chromatin IP Kit (Cell Signaling Technology) and Covaris E220 (Covaris) according to the manufacturer's protocol. After purification of ChIPed DNA, ChIP-seq libraries were constructed using ThruPLEX DNA-seq kit (Takara) according to the manufacturer's protocol, and then subjected to sequencing using HiSeq 2500 or NoveSeq 6000 (Illumina). ChIP-seq experiments were performed in two or more biological replicates with input controls. The sequencing reads were aligned to the reference genome (hg19 or mm9) using bowtie (v1.2.2; ref. 64) following trimming of adapters and read tails to a total length of 50 base pairs using cutadapt. Duplicates and reads on blacklisted regions (ENCODE) were removed by Picard and bedtools, respectively. Peaks were called using MACS (v2.1.1) for each replicate individually with a P value threshold of 1 × 10−3 unless otherwise specified, and overlapped peaks among replicates were regarded as consensus peak sets. Motif analysis and peak annotation were performed with HOMER. SEs were identified with H3K27ac ChIP-seq data in WT HSPCs using ROSE (65) with default parameters. Identified SEs are described in Supplementary Table S20. SEs in human HSCs were described previously (35), and we used the dataset of BI_CD34_Primary_RO01536 for assignment of SE-associated genes. Calculation of ChIP signal intensities around peaks and generation of read density profile plots and heat maps were performed using deeptools (66). For clustering of cohesin-binding sites, we calculated logarithm of H3K27ac and CTCF ChIP signal intensities summed up around ± 200 bp from centers of cohesin-binding sites (STAG1 and/or STAG2 peaks) by deeptools, performed clustering using flowPeaks (67), and regarded the H3K27ac high clusters as cohesin cluster II and the others as cluster I. Binding profiles of cohesin components, Pol II, Mediator, and 10 hematopoietic TFs were similarly calculated around ± 200 bp from centers of cohesin binding sites (Fig. 5K). We also used ChIP-seq datasets (10 TFs: ASXL1, FLI1, GATA2, GFI1B, LMO2, LYL1, MEIS1, PU1, RUNX1, SCL; POL II, MED12 in mouse, Pol II in human; refs. 24–26, 36) in previous studies. More detailed methods are described in the Supplementary Methods.

Pol II Pausing Analysis

For total Pol II ChIP-seq, two replicates were merged and ChIP-seq peaks were called using MACS (v1.4.2) with a P value threshold of 1 × 10−3. For analysis of pausing indices in mice, expressed genes, whose transcription start sites (TSS) overlapped with Pol II ChIP-seq peaks in WT HSPCs, were subjected to downstream analyses. Pausing indices were calculated as the input-subtracted read density in the promoter-proximal region (-50 bp to +300 bp around TSS) divided by that of the gene body (from +300 bp to 2 kb downstream of the end of gene). Genes with positive signals in both the promoter-proximal region and gene body were further considered. ChIP-seq intensities of Ser5-P Pol II in the promoter proximal region were similarly calculated. Pol II ChIP-seq in human CD133-positive cells was described previously (36), and ChIP-seq peaks were called using MACS (v1.4.2) with a P value threshold of 1 × 10−5. For pausing analysis in human, input subtraction was not performed, and genes whose TSSs overlapped with Pol II ChIP-seq peaks were considered. Statistical significance was assessed with one-sided Wilcoxon rank-sum test.

Hi-C

Hi-C experiments were performed using MboI restriction enzyme as described previously (27). Briefly, two million cells were cross-linked with 1% formaldehyde for 10 minutes at room temperature. Cells were permeabilized and chromatin was digested with MboI restriction enzyme, and the ends of restriction fragments were labeled with biotinylated nucleotides and ligated. After cross-link reversal, DNA was purified and sheared with Covaris M220 (Covaris). Then point ligation junctions were pulled down with streptavidin beads. Then libraries were constructed with Nextera Mate Pair Sample Preparation Kit (Illumina) according to the manufacturer's protocol, and subject to sequencing using NovaSeq 6000 (Illumina) with a standard 100- or 150-bp paired-end protocol. Hi-C experiments were performed in biological duplicates. The sequencing reads were processed using Juicer (27) and hg19 or mm10 reference genome. After filtering of reads, the average valid interactions per genotype resulted in 1.79 billion for mouse HSPCs and 1.66 billion for HL-60 cells. For comparative analysis, the valid interactions after filtering were randomly resampled and arranged in the number of the lowest sample. Contact matrices used for further analysis were created for each replicate as well as merged one by genotype and Knight-Ruiz–normalized with Juicer. Loops were called at 5 kb and 10 kb resolutions using HICCUPS (27) and then merged to construct loop sets. Loops were classified into CC-I loops (whose anchors overlapped with at least one CC-I sites but not with CC-II) and CC-II loops (whose anchors overlapped with at least one CC-II but not CC-I sites). TADs were called at 5 kb resolution using Arrowhead (27). TAD boundaries were defined as +/– 5 kb from the 5′- or 3′- ends of TADs, and insides were regions inside of both boundaries. More detailed methods are described in the Supplementary Methods.

Statistical Analysis

In human MDS/AML, correlations across 42 frequently mutated genes were assessed by Fisher exact test. For mouse phenotype analysis, we calculated P values with two-tailed unpaired Student t test for two-group comparison, or ordinary one-way ANOVA with Bonferroni analysis for three or more group comparisons using GraphPad Prism (v6). For survival analysis, survival was estimated using the Kaplan–Meier method, and groups were compared using the log-rank test, with survival package in R software. In RNA-seq, ATAC-seq, and ChIP-seq analyses, statistical analyses were performed using edgeR, HOMER, DAVID, Tissue Specific Expression Analysis (TSEA), GSEA, or MACS. For metaplot analysis, bin-wise P values were obtained using one-sided Wilcoxon rank-sum test. For splicing analysis, differential PSI was assessed using two-sided moderated t test and Benjamini–Hochberg correction. For pausing analysis and RNA expression analysis according to pausing indices, statistical significance was assessed with one-sided Wilcoxon rank-sum test.

Data Availability

External ChIP-seq datasets used in this study are summarized in Supplementary Table S21. The sequencing data generated in this study are available in the Gene Expression Omnibus (GEO) repository, under accession number GSE131583. All other data will be made available upon request to the corresponding author.

Additional Methods

Detailed methods for qRT-PCR, Western blotting, coimmunoprecipitation, histology and cytology, immunostaining, microscopy and data analysis, single-cell differentiation assay, and splicing analysis are described in the Supplementary Methods.

No potential conflicts of interest were disclosed.

Conception and design: Y. Ochi, A. Kon, K. Kataoka, H.I. Suzuki, S. Ogawa

Development of methodology: Y. Ochi, A. Kon, N. Nakazawa, A. Yoda, Y. Kogure

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Y. Ochi, A. Kon, N. Nakazawa, H. Koseki, M. Nakayama, D. Morishita, A. Yoda, R. Okuda, K. Yoshida, S. Kotani, H. Makishima, L. Malcovati, K. Takeuchi, E. Sugihara, T.-A. Sato, M. Cazzola

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y. Ochi, A. Kon, T. Sakata, M.M. Nakagawa, N. Nakazawa, M. Kakuta, T. Tsuruyama, R. Saiki, T. Yoshizato, Y. Shiozawa, Y. Nannya, L. Malcovati, M. Sanada, S. Miyano, H.I. Suzuki, S. Ogawa

Writing, review, and/or revision of the manuscript: Y. Ochi, A. Kon, T. Sakata, T. Yoshizato, H. Makishima, L. Malcovati, A. Takaori-Kondo, M. Cazzola, H.I. Suzuki, S. Ogawa

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): A. Kon, T. Sakata, H. Koseki, N. Kakiuchi, T. Nishimura, A. Yokoyama, M. Kengaku, K. Shirahige, S. Ogawa

Study supervision: A. Kon, A. Takaori-Kondo, H.I. Suzuki, S. Ogawa

We thank Satoko Yabuta, Ai Takatsu, Akiko Ozaki, Atsuko Ryu, Maki Nakamura, Takeshi Shirahari (Department of Pathology and Tumor Biology, Kyoto University, Japan), and Satoko Baba (Pathology Project for Molecular Targets, Cancer Institute, Japanese Foundation for Cancer Research) for technical assistance; Dr. Kazuko Miyazaki and Dr. Masaki Miyazaki (Department of Immunology, Institute for Frontier Life and Medical Sciences, Kyoto University) for technical advice on ATAC-seq and ChIP-seq experiments; Dr. Shuhei Asada and Dr. Toshio Kitamura (Division of Cellular Therapy, The Institute of Medical Science, University of Tokyo, Tokyo, Japan) for providing vectors; the Center for Anatomical, Pathological and Forensic Medical Research, Kyoto University Graduate School of Medicine, for preparing microscope slide; and iLAC, Co., Ltd. for sequencing support. Super-resolution imaging was performed at the iCeMS Analysis Center, Institute for Integrated Cell-Material Sciences (iCeMS), Kyoto University Institute for Advanced Study (KUIAS), and we thank Dr. Takahiro Fujiwara for data analysis and Fumiyoshi Ishidate for technical assistance (iCeMS Analysis Center). The super-computing resource was provided by Human Genome Center, the Institute of Medical Science, the University of Tokyo. The results shown here are in part based upon data generated by The Cancer Genome Atlas Research Network: https://www.cancer.gov/tcga. This work was supported by Grant-in-Aid for Scientific Research (MEXT/JSPS KAKENHI JP26221308 and JP26253060, to S. Ogawa; JP17J05245, to Y. Ochi; JP18K16084, to A. Kon; JP19H03560, to A. Yoda), MEXT as “Priority Issue on Post-K computer” (Integrated Computational Life Science to Support Personalized and Preventive Medicine; hp180198, hp190179, to S. Ogawa and S. Miyano), Grants-in-Aid from the Japan Agency for Medical Research and Development (AMED; JP15cm0106056, JP19cm0106501, JP16ck0106073, JP19ck0106250, to S. Ogawa; JP19cm0106138, to A. Kon), Scientific Research on Innovative Areas (15H05909, 15H05912, to S. Ogawa and S. Miyano; JP15H05979, to K. Yoshida and A. Kon; 15H05976, to K. Shirahige; 19H04806, to A. Yoda), “Stem Cell Aging and Disease” (14430052, to M. Sanada), JST CREST (JPMJCR18S5, to K. Shirahige), grants from Takeda Science Foundation (to S. Ogawa, H. Makishima, T. Yoshizato, and A. Kon), Naito Foundation (to A. Yoda), TERUMO LIFE SCIENCE FOUNDATION (to A. Yoda), Yasuda Medical Foundation (to A. Yoda), and JSPS Core-to-Core Program (to S. Ogawa). Studies conducted at Massachusetts Institute of Technology were supported by the United States Public Health Service Grant R01-GM034277 and R01-CA133404 (to P.A. Sharp) and P01-CA042063 (to T. Jacks) from the NIH and by the Koch Institute Support (core) grant P30-CA14051 from the National Cancer Institute, and supported in part by an agreement between the Whitehead Institute for Biomedical Research and Novo Nordisk. Studies conducted at the University of Pavia and Fondazione IRCCS Policlinico San Matteo, Pavia, Italy, were supported by Associazione Italiana per la Ricerca sul Cancro (AIRC), Milan, Italy (Investigator Grant #20125, to L. Malcovati; AIRC 5 × 1000 project #21267, to M. Cazzola).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1.
Cazzola
M
,
Della Porta
MG
,
Malcovati
L
. 
The genetic basis of myelodysplasia and its clinical relevance
.
Blood
2013
;
122
:
4021
34
.
2.
Haferlach
T
,
Nagata
Y
,
Grossmann
V
,
Okuno
Y
,
Bacher
U
,
Nagae
G
, et al
Landscape of genetic lesions in 944 patients with myelodysplastic syndromes
.
Leukemia
2014
;
28
:
241
7
.
3.
Papaemmanuil
E
,
Gerstung
M
,
Malcovati
L
,
Tauro
S
,
Gundem
G
,
Van Loo
P
, et al
Clinical and biological implications of driver mutations in myelodysplastic syndromes
.
Blood
2013
;
122
:
3616
27
.
4.
Ogawa
S
. 
Genetics of MDS
.
Blood
2019
;
133
:
1049
59
.
5.
Kon
A
,
Shih
LY
,
Minamino
M
,
Sanada
M
,
Shiraishi
Y
,
Nagata
Y
, et al
Recurrent mutations in multiple components of the cohesin complex in myeloid neoplasms
.
Nat Genet
2013
;
45
:
1232
7
.
6.
Ley
TJ
,
Miller
C
,
Ding
L
,
Raphael
BJ
,
Mungall
AJ
,
Robertson
A
, et al
Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia
.
N Engl J Med
2013
;
368
:
2059
74
.
7.
Yoshida
K
,
Toki
T
,
Okuno
Y
,
Kanezaki
R
,
Shiraishi
Y
,
Sato-Otsubo
A
, et al
The landscape of somatic mutations in down syndrome-related myeloid disorders
.
Nat Genet
2013
;
45
:
1293
9
.
8.
Remeseiro
S
,
Losada
A
. 
Cohesin, a chromatin engagement ring
.
Curr Opin Cell Biol
2013
;
25
:
63
71
.
9.
Viny
AD
,
Ott
CJ
,
Spitzer
B
,
Rivas
M
,
Meydan
C
,
Papalexi
E
, et al
Dose-dependent role of the cohesin complex in normal and malignant hematopoiesis
.
J Exp Med
2015
;
212
:
1819
32
.
10.
Mullenders
J
,
Aranda-Orgilles
B
,
Lhoumaud
P
,
Keller
M
,
Pae
J
,
Wang
K
, et al
Cohesin loss alters adult hematopoietic stem cell homeostasis, leading to myeloproliferative neoplasms
.
J Exp Med
2015
;
212
:
1833
50
.
11.
Mazumdar
C
,
Shen
Y
,
Xavy
S
,
Zhao
F
,
Reinisch
A
,
Li
R
, et al
Leukemia-associated cohesin mutants dominantly enforce stem cell programs and impair human hematopoietic progenitor differentiation
.
Cell Stem Cell
2015
;
17
:
675
88
.
12.
Galeev
R
,
Baudet
A
,
Kumar
P
,
Rundberg Nilsson
A
,
Nilsson
B
,
Soneji
S
, et al
Genome-wide RNAi screen identifies cohesin genes as modifiers of renewal and differentiation in human HSCs
.
Cell Rep
2016
;
14
:
2988
3000
.
13.
Rowley
MJ
,
Corces
VG
. 
Organizational principles of 3D genome architecture
.
Nat Rev Genet
2018
;
19
:
789
800
.
14.
Walter
MJ
,
Shen
D
,
Ding
L
,
Shao
J
,
Koboldt
DC
,
Chen
K
, et al
Clonal architecture of secondary acute myeloid leukemia
.
N Engl J Med
2012
;
366
:
1090
8
.
15.
Lindsley
RC
,
Mar
BG
,
Mazzola
E
,
Grauman
PV
,
Shareef
S
,
Allen
SL
, et al
Acute myeloid leukemia ontogeny is defined by distinct somatic mutations
.
Blood
2015
;
125
:
1367
76
.
16.
Makishima
H
,
Yoshizato
T
,
Yoshida
K
,
Sekeres
MA
,
Radivoyevitch
T
,
Suzuki
H
, et al
Dynamics of clonal evolution in myelodysplastic syndromes
.
Nat Genet
2017
;
49
:
204
12
.
17.
Yoshizato
T
,
Nannya
Y
,
Atsuta
Y
,
Shiozawa
Y
,
Iijima-Yamashita
Y
,
Yoshida
K
, et al
Genetic abnormalities in myelodysplasia and secondary acute myeloid leukemia: impact on outcome of stem cell transplantation
.
Blood
2017
;
129
:
2347
58
.
18.
Meggendorfer
M
,
de Albuquerque
A
,
Nadarajah
N
,
Alpermann
T
,
Kern
W
,
Steuer
K
, et al
Karyotype evolution and acquisition of FLT3 or RAS pathway alterations drive progression of myelodysplastic syndrome to acute myeloid leukemia
.
Haematologica
2015
;
100
:
e487
90
.
19.
Hu
L
,
Li
M
,
Ding
Y
,
Pu
L
,
Liu
J
,
Xie
J
, et al
Prognostic value of RDW in cancers: a systematic review and meta-analysis
.
Oncotarget
2017
;
8
:
16027
35
.
20.
Pietras
EM
,
Reynaud
D
,
Kang
YA
,
Carlin
D
,
Calero-Nieto
FJ
,
Leavitt
AD
, et al
Functionally distinct subsets of lineage-biased multipotent progenitors control blood production in normal and regenerative conditions
.
Cell Stem Cell
2015
;
17
:
35
46
.
21.
de Bruijn
M
,
Dzierzak
E
. 
Runx transcription factors in the development and function of the definitive hematopoietic system
.
Blood
2017
;
129
:
2061
9
.
22.
Buenrostro
JD
,
Giresi
PG
,
Zaba
LC
,
Chang
HY
,
Greenleaf
WJ
. 
Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position
.
Nat Methods
2013
;
10
:
1213
8
.
23.
Shlyueva
D
,
Stampfel
G
,
Stark
A
. 
Transcriptional enhancers: from properties to genome-wide predictions
.
Nat Rev Genet
2014
;
15
:
272
86
.
24.
Wilson
NK
,
Foster
SD
,
Wang
X
,
Knezevic
K
,
Schutte
J
,
Kaimakis
P
, et al
Combinatorial transcriptional control in blood stem/progenitor cells: genome-wide analysis of ten major transcriptional regulators
.
Cell Stem Cell
2010
;
7
:
532
44
.
25.
Li
Z
,
Zhang
P
,
Yan
A
,
Guo
Z
,
Ban
Y
,
Li
J
, et al
ASXL1 interacts with the cohesin complex to maintain chromatid separation and gene expression for normal hematopoiesis
.
Sci Adv
2017
;
3
:
e1601602
.
26.
Aranda-Orgilles
B
,
Saldana-Meyer
R
,
Wang
E
,
Trompouki
E
,
Fassl
A
,
Lau
S
, et al
MED12 regulates HSC-specific enhancers independently of mediator kinase activity to control hematopoiesis
.
Cell Stem Cell
2016
;
19
:
784
99
.
27.
Rao
SS
,
Huntley
MH
,
Durand
NC
,
Stamenova
EK
,
Bochkov
ID
,
Robinson
JT
, et al
A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping
.
Cell
2014
;
159
:
1665
80
.
28.
Rao
SSP
,
Huang
SC
,
Glenn St Hilaire
B
,
Engreitz
JM
,
Perez
EM
,
Kieffer-Kwon
KR
, et al
Cohesin loss eliminates all loop domains
.
Cell
2017
;
171
:
305
20
.
29.
Collins
CT
,
Hess
JL
. 
Deregulation of the HOXA9/MEIS1 axis in acute leukemia
.
Curr Opin Hematol
2016
;
23
:
354
61
.
30.
Ghavi-Helm
Y
,
Klein
FA
,
Pakozdi
T
,
Ciglar
L
,
Noordermeer
D
,
Huber
W
, et al
Enhancer loops appear stable during development and are associated with paused polymerase
.
Nature
2014
;
512
:
96
100
.
31.
Williams
LH
,
Fromm
G
,
Gokey
NG
,
Henriques
T
,
Muse
GW
,
Burkholder
A
, et al
Pausing of RNA polymerase II regulates mammalian developmental potential through control of signaling networks
.
Mol Cell
2015
;
58
:
311
22
.
32.
Henriques
T
,
Scruggs
BS
,
Inouye
MO
,
Muse
GW
,
Williams
LH
,
Burkholder
AB
, et al
Widespread transcriptional pausing and elongation control at enhancers
.
Genes Dev
2018
;
32
:
26
41
.
33.
Tyner
JW
,
Tognon
CE
,
Bottomly
D
,
Wilmot
B
,
Kurtz
SE
,
Savage
SL
, et al
Functional genomic landscape of acute myeloid leukaemia
.
Nature
2018
;
562
:
526
31
.
34.
Shiozawa
Y
,
Malcovati
L
,
Galli
A
,
Pellagatti
A
,
Karimi
M
,
Sato-Otsubo
A
, et al
Gene expression and risk of leukemic transformation in myelodysplasia
.
Blood
2017
;
130
:
2642
53
.
35.
Hnisz
D
,
Abraham
BJ
,
Lee
TI
,
Lau
A
,
Saint-Andre
V
,
Sigova
AA
, et al
Super-enhancers in the control of cell identity and disease
.
Cell
2013
;
155
:
934
47
.
36.
Cui
K
,
Zang
C
,
Roh
TY
,
Schones
DE
,
Childs
RW
,
Peng
W
, et al
Chromatin signatures in multipotent human hematopoietic stem cells indicate the fate of bivalent genes during differentiation
.
Cell Stem Cell
2009
;
4
:
80
93
.
37.
Kojic
A
,
Cuadrado
A
,
De Koninck
M
,
Gimenez-Llorente
D
,
Rodriguez-Corsino
M
,
Gomez-Lopez
G
, et al
Distinct roles of cohesin-SA1 and cohesin-SA2 in 3D chromosome organization
.
Nat Struct Mol Biol
2018
;
25
:
496
504
.
38.
Cho
WK
,
Spille
JH
,
Hecht
M
,
Lee
C
,
Li
C
,
Grube
V
, et al
Mediator and RNA polymerase II clusters associate in transcription-dependent condensates
.
Science
2018
;
361
:
412
5
.
39.
Sabari
BR
,
Dall'Agnese
A
,
Boija
A
,
Klein
IA
,
Coffey
EL
,
Shrinivas
K
, et al
Coactivator condensation at super-enhancers links phase separation and gene control
.
Science
2018
;
361
:
pii:eaar3958
.
40.
Boija
A
,
Klein
IA
,
Sabari
BR
,
Dall'Agnese
A
,
Coffey
EL
,
Zamudio
AV
, et al
Transcription factors activate genes through the phase-separation capacity of their activation domains
.
Cell
2018
;
175
:
1842
55
.
41.
Cuartero
S
,
Weiss
FD
,
Dharmalingam
G
,
Guo
Y
,
Ing-Simmons
E
,
Masella
S
, et al
Control of inducible gene expression links cohesin to hematopoietic progenitor self-renewal and differentiation
.
Nat Immunol
2018
;
19
:
932
41
.
42.
Viny
AD
,
Bowman
RL
,
Liu
Y
,
Lavallee
VP
,
Eisman
SE
,
Xiao
W
, et al
Cohesin members Stag1 and Stag2 display distinct roles in chromatin accessibility and topological control of HSC self-renewal and differentiation
.
Cell Stem Cell
2019
;
25
:
682
96
.
43.
Papaemmanuil
E
,
Gerstung
M
,
Bullinger
L
,
Gaidzik
VI
,
Paschka
P
,
Roberts
ND
, et al
Genomic classification and prognosis in acute myeloid leukemia
.
N Engl J Med
2016
;
374
:
2209
21
.
44.
Chen
L
,
Chen
JY
,
Huang
YJ
,
Gu
Y
,
Qiu
J
,
Qian
H
, et al
The augmented R-loop is a unifying mechanism for myelodysplastic syndromes induced by high-risk splicing factor mutations
.
Mol Cell
2018
;
69
:
412
25
.
45.
Suzuki
H
,
Aoki
K
,
Chiba
K
,
Sato
Y
,
Shiozawa
Y
,
Shiraishi
Y
, et al
Mutational landscape and clonal architecture in grade II and III gliomas
.
Nat Genet
2015
;
47
:
458
68
.
46.
Ochi
Y
,
Hiramoto
N
,
Yoshizato
T
,
Ono
Y
,
Takeda
J
,
Shiozawa
Y
, et al
Clonally related diffuse large B-cell lymphoma and interdigitating dendritic cell sarcoma sharing MYC translocation
.
Haematologica
2018
;
103
:
e553
e6
.
47.
Kuhn
R
,
Schwenk
F
,
Aguet
M
,
Rajewsky
K
. 
Inducible gene targeting in mice
.
Science
1995
;
269
:
1427
9
.
48.
Kon
A
,
Yamazaki
S
,
Nannya
Y
,
Kataoka
K
,
Ota
Y
,
Nakagawa
MM
, et al
Physiological Srsf2 P95H expression causes impaired hematopoietic stem cell functions and aberrant RNA splicing in mice
.
Blood
2018
;
131
:
621
35
.
49.
Ichikawa
M
,
Asai
T
,
Saito
T
,
Seo
S
,
Yamazaki
I
,
Yamagata
T
, et al
AML-1 is required for megakaryocytic maturation and lymphocytic differentiation, but not for maintenance of hematopoietic stem cells in adult hematopoiesis
.
Nat Med
2004
;
10
:
299
304
.
50.
Yokoyama
A
,
Ficara
F
,
Murphy
MJ
,
Meisel
C
,
Hatanaka
C
,
Kitabayashi
I
, et al
MLL becomes functional through intra-molecular interaction not by proteolytic processing
.
PLoS One
2013
;
8
:
e73649
.
51.
Yoshida
K
,
Sanada
M
,
Shiraishi
Y
,
Nowak
D
,
Nagata
Y
,
Yamamoto
R
, et al
Frequent pathway mutations of splicing machinery in myelodysplasia
.
Nature
2011
;
478
:
64
9
.
52.
Dobin
A
,
Davis
CA
,
Schlesinger
F
,
Drenkow
J
,
Zaleski
C
,
Jha
S
, et al
STAR: ultrafast universal RNA-seq aligner
.
Bioinformatics
2013
;
29
:
15
21
.
53.
Liao
Y
,
Smyth
GK
,
Shi
W
. 
featureCounts: an efficient general purpose program for assigning sequence reads to genomic features
.
Bioinformatics
2014
;
30
:
923
30
.
54.
Robinson
MD
,
McCarthy
DJ
,
Smyth
GK
. 
edgeR: a Bioconductor package for differential expression analysis of digital gene expression data
.
Bioinformatics
2010
;
26
:
139
40
.
55.
Choi
J
,
Baldwin
TM
,
Wong
M
,
Bolden
JE
,
Fairfax
KA
,
Lucas
EC
, et al
Haemopedia RNA-seq: a database of gene expression during haematopoiesis in mice and humans
.
Nucleic Acids Res
2019
;
47
:
D780
D5
.
56.
Subramanian
A
,
Tamayo
P
,
Mootha
VK
,
Mukherjee
S
,
Ebert
BL
,
Gillette
MA
, et al
Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles
.
Proc Natl Acad Sci U S A
2005
;
102
:
15545
50
.
57.
Merico
D
,
Isserlin
R
,
Stueker
O
,
Emili
A
,
Bader
GD
. 
Enrichment map: a network-based method for gene-set enrichment visualization and interpretation
.
PLoS One
2010
;
5
:
e13984
.
58.
Corces
MR
,
Buenrostro
JD
,
Wu
B
,
Greenside
PG
,
Chan
SM
,
Koenig
JL
, et al
Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution
.
Nat Genet
2016
;
48
:
1193
203
.
59.
Langmead
B
,
Salzberg
SL
. 
Fast gapped-read alignment with Bowtie 2
.
Nat Methods
2012
;
9
:
357
9
.
60.
Martin
M
. 
Cutadapt removes adapter sequences from high-throughput sequencing reads
.
EMBnet Journal
2011
;
17
:
10
2
.
61.
Quinlan
AR
,
Hall
IM
. 
BEDTools: a flexible suite of utilities for comparing genomic features
.
Bioinformatics
2010
;
26
:
841
2
.
62.
Zhang
Y
,
Liu
T
,
Meyer
CA
,
Eeckhoute
J
,
Johnson
DS
,
Bernstein
BE
, et al
Model-based analysis of ChIP-Seq (MACS)
.
Genome Biol
2008
;
9
:
R137
.
63.
Heinz
S
,
Benner
C
,
Spann
N
,
Bertolino
E
,
Lin
YC
,
Laslo
P
, et al
Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities
.
Mol Cell
2010
;
38
:
576
89
.
64.
Langmead
B
. 
Aligning short sequencing reads with Bowtie
.
Curr Protoc Bioinformatics
2010
;
Chapter 11:Unit 11.7
.
65.
Loven
J
,
Hoke
HA
,
Lin
CY
,
Lau
A
,
Orlando
DA
,
Vakoc
CR
, et al
Selective inhibition of tumor oncogenes by disruption of super-enhancers
.
Cell
2013
;
153
:
320
34
.
66.
Ramirez
F
,
Dundar
F
,
Diehl
S
,
Gruning
BA
,
Manke
T
. 
deepTools: a flexible platform for exploring deep-sequencing data
.
Nucleic Acids Res
2014
;
42
:
W187
91
.
67.
Ge
Y
,
Sealfon
SC
. 
flowPeaks: a fast unsupervised clustering for flow cytometry data via K-means and density peak finding
.
Bioinformatics
2012
;
28
:
2052
8
.