The genetic subtypes of Burkitt lymphoma have been defined, but the role of epigenetics remains to be comprehensively characterized. We searched genomic DNA from 218 patients across four continents for recurrent DNA methylation patterns and their associations with clinical and molecular features. We identified DNA methylation patterns that were not fully explained by the Epstein–Barr virus status or mutation status, leading to two epitypes described here as HypoBL and HyperBL. Each is characterized by distinct genomic and clinical features including global methylation, mutation burden, aberrant somatic hypermutation, and survival outcomes. Methylation, gene expression, and mutational differences between the epitypes support a model in which each arises from a distinct cell of origin. These results, pending validation in external cohorts, point to a refined risk assessment for patients with Burkitt lymphoma who may experience inferior outcomes.

Significance:

Burkitt lymphoma can be divided into two epigenetic subtypes (epitypes), each carrying distinct biological, transcriptomic, genomic, and clinical features. Epitype is more strongly associated with clinical and mutational features than the Epstein–Barr virus status or genetic subtype, highlighting an important additional layer of Burkitt lymphoma pathogenesis.

Burkitt lymphoma is the most prevalent type of B-cell non–Hodgkin lymphoma (NHL) affecting children, accounting for approximately 50% of pediatric NHL cases, but it is rarer in adults (1–3). The genetic hallmark is a chromosomal translocation that places MYC under the regulation of a potent enhancer, causing constitutive expression (4, 5). Burkitt lymphoma has historically been classified based on the clinical variant status (endemic or sporadic) and patient age (pediatric or adult; refs. 6, 7). Recent evidence has underscored the association of Epstein–Barr virus (EBV) tumor positivity with distinct genomic features (8–10). These insights have challenged the relevance of the clinical variant status, pointing to the EBV status as being more biologically meaningful. Using genome-wide mutational information, four genetic subgroups of Burkitt lymphoma have been proposed, which span pediatric and adult Burkitt lymphoma. These are marked by distinct combinations of driver mutations, namely, in DDX3X, GNA13, GNAI2 (DGG-BL), ID3, CCND3 (IC-BL), and TP53 (Q53-BL), with some of them bearing distinct patterns of noncoding mutations caused by aberrant somatic hypermutation (aSHM; ref. 9).

The genome-wide methylation features of Burkitt lymphoma are yet to be extensively studied. Changes to CpG methylation in tumors typically involve widespread/global hypomethylation and local hypermethylation of specific genes. Specifically, changes to promoters and other regulatory regions can influence the expression of tumor suppressor genes (11, 12). Tumors tend to preserve some of the methylation pattern of the founding cell, and this epigenetic memory can inform on the cell of origin (13–17). Owing to different natural histories, differentiation stages, and selective pressures experienced by these tumors, DNA methylation patterns may also be associated with different driver mutations (16–18). Considering the relatively low number of Burkitt lymphoma driver genes compared with other mature B-cell neoplasms, we sought to explore the contribution of epigenetic changes to Burkitt lymphoma and how these might inform about other clinical or molecular features.

Identifying and Distinguishing the Epigenetic Landscape of Burkitt Lymphoma

We applied array-based genome-wide methylation profiling to 218 Burkitt lymphoma biopsies collected at diagnosis (156 pediatric and 62 adult samples) along with six normal centroblast samples. We also performed whole-genome sequencing (WGS) on nine of these samples using the PromethION platform and used these data to resolve CpG methylation sites and entropy. Our analysis leveraged existing genomic and clinical information from the same cohort, which has been described previously (8, 9). As has been observed in other cancers, the Burkitt lymphomas exhibited widespread DNA hypomethylation with localized hypermethylation relative to centroblasts (Fig. 1A and B; refs. 11, 12). Specifically, this comparison identified 357,124 differentially methylated probes (DMP) with 191,724 of these showing hypomethylation and 165,400 showing hypermethylation in Burkitt lymphoma. When this comparison was performed separately according to EBV status, EBV-positive Burkitt lymphomas had more regions affected by methylation changes relative to centroblasts than EBV-negative Burkitt lymphomas did, and a significantly higher number of DMPs exhibited hypermethylation (Fig. 1C and D; P < 1 × 10⁻¹⁶, Fisher exact test). The number of DMPs in each chromatin state should be approached with caution as the extent of probe coverage in each of these regions is constrained by the array design. It is clear, however, that most Burkitt lymphoma–associated hypomethylation affects regions that are typically heterochromatin, regardless of the EBV status. By contrast, the regions affected by hypermethylation were more diverse in both EBV-positive and EBV-negative Burkitt lymphomas (Fig. 1D).

Figure 1.

Burkitt lymphoma (BL) genomes contain global demethylation and localized hypermethylation. A, Global averaged methylation of normal centroblast (n = 6) and EBV-positive BL (n = 130) samples. Methylation values from individual EPIC array probes were binned into 1-Mbp windows, and the average of each bin is shown on a heat scale. Data are arranged in concentric rings starting from the outermost ring: normal centroblasts, the difference in methylation levels between EBV-positive BL and centroblasts, and EBV-positive BL. B, Global averaged methylation of normal centroblast (n = 6) and EBV-negative BL (n = 88) samples. Methylation values from individual EPIC array probes were binned into 1-Mbp windows, and the average of each bin is shown on a heat scale. Data are arranged in concentric rings starting from the outermost ring: normal centroblasts, the difference between EBV-negative BL and normal centroblasts, and EBV-negative BL. C, A comparison of the number of significant DMPs (|abs(logFC)|> 0.25 and Q < 0.05) from Illumina EPIC array that were hypermethylated and hypomethylated among each EBV-positive BL (n = 130) and EBV-negative BL (n = 88) samples when compared with normal centroblast (n = 6) samples. D, The chromatin state associated with the DMPs in (C) are shown as a total number of probes with DMP in BL vs. normal centroblasts. E, The averaged methylation of samples across DMRs shared between EBV-positive (n = 130) and EBV-negative (n = 88) samples when compared with normal centroblast (n = 6) samples. CNV, copy-number variation; Heterochrom/lo, heterochromatin/low signal; Txn, transcriptional.

Through region-level analyses, we identified 10,715 and 7,320 differentially methylated regions (DMR) in EBV-positive and EBV-negative Burkitt lymphoma samples, respectively (Supplementary Tables S1 and S2). Of these, 49,855 DMPs and 5,334 DMRs were shared between EBV-positive and EBV-negative comparisons, suggesting that changes to methylation within these regions are a shared feature across Burkitt lymphomas (Fig. 1E). These pan–Burkitt lymphoma DMRs span 3,340 genes, including genes differentially expressed between centrocytes and centroblasts (CXCR5, CCND1, and SLAMF1), genes hypermethylated in other B-cell lymphomas (TP73, IL12BR2, GSTP1, MT1G, CDH1, RARB, RBP1, SLIT2, DLC1, p16, DAPK1, KLF4, and DBC1), and genes with differential expression between Burkitt lymphomas and diffuse large B-cell lymphomas (DLBCL; SOX11, STAT3, CD44, CTSH, DLEU1, BATF, and BCL2; Supplementary Table S3; refs. 12, 19–28).

We next investigated whether genes commonly mutated in Burkitt lymphomas (“Burkitt lymphoma genes”; refs. 8, 9, 29) are also associated with DNA methylation patterns consistent with transcriptional activity. We compared the methylation state of the promoter with the expression of the corresponding gene across all samples. Among these genes, there was a negative correlation between methylation and expression for CREBBP, DTX1, HIST1H1D, ID3, KLHL6, SMARCA4, TCL1A, and TET2 (Supplementary Fig. S1A). This is consistent with the notion that the expression of these genes is repressed by DNA methylation in a subset of Burkitt lymphomas. As the function of a gene could be similarly suppressed genetically, we incorporated the mutation status of each gene into our analysis using a linear model. This confirmed an association between ID3 promoter methylation and mutation status. Samples carrying ID3 mutations exhibited lower promoter methylation (mean β = 0.158), whereas Burkitt lymphomas with intact ID3 had a significantly higher degree of methylation (mean β = 0.199; P < 0.001). Although the promoter exhibited a predominantly hypomethylated state across all Burkitt lymphoma samples (mean β = 0.182), we noted that subtle differences in ID3 promoter methylation were associated with significant changes in gene expression (P < 0.001; Supplementary Fig. S1B). A similar trend was observed for SMARCA4, although for this gene the difference in the expression level did not reach significance (P = 0.09; Supplementary Fig. S1B). Taken together, these data support the hypothesis that the loss of ID3 function can be acquired through promoter hypermethylation as well as genetic loss.
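The promoter methylation versus expression comparison described above can be illustrated with a simple correlation. The following is a minimal sketch, assuming a Pearson correlation across samples; the function name `pearson_r` and the toy values are hypothetical and are not taken from the study, which additionally incorporated mutation status through a linear model.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (NOT from the study): one promoter beta value and one
# normalized expression value per sample for a single gene.
promoter_beta = [0.12, 0.15, 0.18, 0.21, 0.25, 0.30]
expression = [11.2, 10.9, 10.1, 9.8, 9.0, 8.4]

r = pearson_r(promoter_beta, expression)
print(f"r = {r:.2f}")  # a negative r is consistent with methylation-linked repression
```

A negative coefficient alone does not establish silencing; in the analysis above, mutation status was added as a covariate because gene function can also be lost genetically.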

Because noncoding mutations also have the potential to influence expression through the alteration of regulatory sequences, we investigated the methylation of noncoding regions enriched for mutations in Burkitt lymphomas, specifically those affected by aSHM (9). Given the disparate rates of aSHM observed between EBV-positive and EBV-negative Burkitt lymphomas, we explored whether methylation patterns associated with EBV status were correlated with frequencies of aSHM (8, 9). To compare with DLBCLs, which have higher rates of aSHM relative to Burkitt lymphomas (9), this analysis included array data from 56 DLBCLs from a previous study (18). We noted a general trend of low average β values across most of the analyzed aSHM regions, which is seen across Burkitt lymphomas, DLBCLs, and 15 normal germinal center (GC) B cells, whereas a minority of these regions exhibited variable levels of methylation among the Burkitt lymphomas and DLBCLs (Supplementary Fig. S1C). The highly methylated regions were consistently observed across the Burkitt lymphomas and GC B cells, implying this epigenetic pattern reflects the state of B cells at this differentiation stage. The hypermethylation of SEPT9 was predominantly observed in EBV-positive Burkitt lymphomas, consistent with the lower rates of aSHM across this region in EBV-positive Burkitt lymphoma samples than in EBV-negative Burkitt lymphomas and DLBCLs (Supplementary Fig. S1D; ref. 9). Additionally, ST6GAL1 had the highest degree of methylation in EBV-positive Burkitt lymphomas, wherein similar rates of aSHM were observed between EBV-positive and EBV-negative Burkitt lymphomas (Supplementary Fig. S1D). Although most of these hypermethylated regions were also methylated in GC cells, ST6GAL1 methylation was uniquely observed in Burkitt lymphomas. Overall, there was no consistent association between the methylation state and degree of mutations in these regions.

EBV-Associated DNA Methylation Patterns in Burkitt Lymphomas

To explore the effect of EBV on the Burkitt lymphoma methylome, we stratified Burkitt lymphomas on EBV status and searched for methylation differences. This identified 30,845 significant DMPs {|absolute log fold change [abs(logFC)]|> 0.25 and Q < 0.05}, with most (30,726) hypermethylated in EBV-positive Burkitt lymphomas. DMPs were primarily located in open seas (15,985), CpG shores (8,742), CpG islands (3,021), and CpG shelves (2,965). Using chromatin and transcriptional state annotations, we found a significant (P < 0.001) enrichment of hypermethylated DMPs in annotated enhancer regions (Fig. 2A). Using a region-level analysis, we identified a total of 6,000 DMRs, again with most (5,991) hypermethylated in EBV-positive Burkitt lymphomas (Fig. 2B; Supplementary Table S4).
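The DMP significance filter quoted above (|logFC| > 0.25 and Q < 0.05) can be sketched as follows. This is an illustration only: it assumes that Q values are Benjamini-Hochberg adjusted P values and that a positive logFC means higher methylation in the first group, neither of which is stated in this section. The function names and probe identifiers are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values (Q values) in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone Q values.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        q_values[idx] = running_min
    return q_values


def call_dmps(probes, min_abs_logfc=0.25, max_q=0.05):
    """Split probes into hyper- and hypomethylated DMPs.

    probes: list of (probe_id, logFC, p_value) tuples.
    Assumption: positive logFC = hypermethylated in the group of interest.
    """
    q_values = benjamini_hochberg([p for _, _, p in probes])
    hyper, hypo = [], []
    for (probe_id, logfc, _), q in zip(probes, q_values):
        if abs(logfc) > min_abs_logfc and q < max_q:
            (hyper if logfc > 0 else hypo).append(probe_id)
    return hyper, hypo


# Hypothetical toy input (NOT from the study).
toy_probes = [
    ("cg_a", 0.40, 0.001),   # large effect, significant
    ("cg_b", -0.30, 0.004),  # large negative effect, significant
    ("cg_c", 0.10, 0.002),   # significant but effect too small
    ("cg_d", 0.35, 0.20),    # large effect but not significant
]
hyper_dmps, hypo_dmps = call_dmps(toy_probes)
print(hyper_dmps, hypo_dmps)
```

Requiring both an effect-size and a Q-value threshold keeps probes with statistically significant but very small methylation differences out of the DMP list.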

Figure 2.

Methylation patterns in Burkitt lymphoma (BL) are associated with EBV status. A, The chromatin state associated with significant DMPs from the EPIC array that were hypermethylated and hypomethylated from comparing EBV-positive (n = 130) against EBV-negative (n = 88) BLs shown as a number of all significant DMPs. The number of EPIC array probes contained within each chromatin region that was differentially methylated between BL and centroblasts (CB) (from Fig. 1C) is displayed as the background reference. B, The averaged methylation of samples across DMRs identified from the EPIC array comparing EBV-positive (n = 130) and EBV-negative (n = 88) BLs. Normal centroblast samples (n = 6) are shown as the reference. C, The heatmap on the left represents averaged methylation of samples across DMRs identified from the EPIC array comparing EBV-positive (n = 130) and EBV-negative (n = 88) BLs that were significantly (R2 > 0.4 and Q < 0.01) correlated with gene expression as quantified by RNA-seq analysis. The heatmap on the right indicates the expression of genes associated with each DMR. Rows in both heatmaps are in the same order with each row depicting a DMR and the expression of its associated gene. CNV, copy-number variation; Heterochrom/lo, heterochromatin/low signal; neg, negative; pos, positive; Txn, transcriptional.

There are gene expression differences between EBV-positive and EBV-negative Burkitt lymphomas, which may be explained, in part, by DNA methylation changes orchestrated by the virus. We searched for genes near DMRs with correlation between expression and methylation levels. Among the 4,782 genes associated with DMRs, 476 had significant correlations (|abs(R2)|> 0.4 and Q < 0.01; Fig. 2C; Supplementary Table S5). The majority (430) of these genes exhibited anticorrelation between methylation and expression level. Among these DMR/gene pairs, we identified several genes with relevance in B-cell lymphomas, including TERT, SOX11, DNMT3A, HVCN1, and RCOR2. The association between hypermethylation and reduced expression of TERT in EBV-positive Burkitt lymphoma was compelling given its association with the reactivation of the EBV lytic cycle (30, 31).

Pathway enrichment analysis of 441 negatively correlated genes revealed an enrichment for RNA polymerase II transcription regulatory region sequence-specific DNA binding (Q = 0.03; Supplementary Table S6). Overall, it seems that the hypermethylation landscape of EBV-positive Burkitt lymphoma methylomes could broadly affect the expression of genes responsible for regulating transcription, thereby broadly reprogramming the transcriptome of Burkitt lymphomas.

Identification and Characterization of Burkitt Lymphoma Epitypes

To resolve DNA methylation patterns that cannot be explained by EBV status, we used unsupervised clustering and applied nonnegative matrix factorization. The optimal result resolved two patient clusters with distinct methylation patterns among these probes (Supplementary Fig. S2A and S2B). The genomes in one cluster were predominantly hypermethylated, hereafter referred to as HyperBL, with the remaining cases having lower methylation of most probes (HypoBL). Using the cluster assignments as training data, we implemented a statistical model to determine the probability that each Burkitt lymphoma belonged to either epigenetic subtype (epitype). The optimized classifier used 60 probes and had a high accuracy (88%) in the training set of 65% of samples (Supplementary Table S7) and similar accuracy (86%) when applied to the held-out validation set. Of the training samples, 27 (13 HyperBL and 14 HypoBL) were unclassified by this approach because of an intermediate level of methylation at these probes (Fig. 3A; Supplementary Table S8). These epitype labels were used for all subsequent analyses. We noted an even representation of adult and pediatric cases between HyperBL and HypoBL. Although HypoBL was approximately equally split between EBV-positive (N = 56/88) and EBV-negative (N = 32/88) tumors, HyperBL had an enrichment of EBV-positive (N = 78/102) when compared with EBV-negative (N = 24/102, Fig. 3B and C) tumors. Considering the recently described Burkitt lymphoma genetic subgroups (9), DGG-BL was predominantly HyperBL (N = 45/70), whereas IC-BL was mostly HypoBL (N = 38/65, Fig. 3D). The methylation differences associated with these epitypes cannot be explained by differences in tumor sample purity, as there was no significant (P > 0.05) difference in tumor purities estimated by four separate approaches: β values (32), RNA sequencing (RNA-seq; ref. 33), and WGS data using either copy-number variations (34) or mean variant allele frequency of coding mutations (Supplementary Table S8).
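The probability-based epitype assignment can be illustrated with a linear predictor score (LPS), the label used for the classes in Fig. 3A. The sketch below rests on several assumptions that this section does not confirm: probe weights are Welch t-statistics from the training clusters, class scores are modeled as normal distributions, and samples with a HyperBL probability between 0.1 and 0.9 are left unclassified. All function names and the toy β values are hypothetical.

```python
import math
from statistics import mean, stdev


def welch_t(a, b):
    """Welch t-statistic for the difference in means between two groups."""
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se


def lps(weights, sample):
    """Linear predictor score: weighted sum of probe beta values."""
    return sum(w * v for w, v in zip(weights, sample))


def fit_lps_model(hyper_samples, hypo_samples):
    """Derive probe weights and per-class score distributions from training data."""
    n_probes = len(hyper_samples[0])
    weights = [
        welch_t([s[j] for s in hyper_samples], [s[j] for s in hypo_samples])
        for j in range(n_probes)
    ]
    hyper_scores = [lps(weights, s) for s in hyper_samples]
    hypo_scores = [lps(weights, s) for s in hypo_samples]
    return {
        "weights": weights,
        "hyper": (mean(hyper_scores), stdev(hyper_scores)),
        "hypo": (mean(hypo_scores), stdev(hypo_scores)),
    }


def log_normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)


def classify(model, sample, cutoff=0.9):
    """Return (label, P(HyperBL)); the 0.9 cutoff is an assumed threshold."""
    score = lps(model["weights"], sample)
    diff = log_normal_pdf(score, *model["hypo"]) - log_normal_pdf(score, *model["hyper"])
    # Work in log space and clamp to avoid overflow in exp().
    if diff > 700:
        p_hyper = 0.0
    elif diff < -700:
        p_hyper = 1.0
    else:
        p_hyper = 1.0 / (1.0 + math.exp(diff))
    if p_hyper >= cutoff:
        return "HyperBL", p_hyper
    if p_hyper <= 1.0 - cutoff:
        return "HypoBL", p_hyper
    return "unclassified", p_hyper


# Hypothetical training data (NOT from the study): 3 probes, 4 samples per cluster.
hyper_train = [[0.80, 0.75, 0.70], [0.85, 0.70, 0.78], [0.78, 0.82, 0.74], [0.90, 0.77, 0.69]]
hypo_train = [[0.20, 0.25, 0.30], [0.15, 0.30, 0.22], [0.25, 0.18, 0.28], [0.22, 0.27, 0.19]]

model = fit_lps_model(hyper_train, hypo_train)
print(classify(model, [0.82, 0.76, 0.73]))
print(classify(model, [0.20, 0.22, 0.25]))
```

Reporting a probability rather than a hard label is what allows samples with intermediate methylation at the selected probes to remain unclassified, as described for the training samples above.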

Figure 3.

Identification of distinct epitypes in Burkitt lymphoma. A, Heatmap depicting the optimal 60 probes from the EPIC array which achieved the greatest accuracy for classifying samples into one of the two epitypes (LPS class). Samples are ordered on the basis of the probability score associated with belonging to the HyperBL epitype. Alluvial plots showing the distribution of epitype membership by EBV status (B), age group (C), and genetic subgroup (D). All 218 samples with the available EPIC array profiling are included in the analysis.

We next compared the methylation differences between each epitype and normal centroblasts, revealing marked changes in the methylome of HyperBL (Fig. 4A and B). The magnitude of changes in the methylome of HyperBL was greater than the magnitude of changes when EBV-positive Burkitt lymphoma was compared with normal centroblasts. Specifically, although HyperBLs displayed a similar number of hypermethylated DMPs when compared with EBV-positive Burkitt lymphomas (57,596 vs. 61,336), they harbored significantly (P < 0.001) more hypomethylated DMPs (83,933 vs. 47,609). The methylation features of HypoBL more closely resembled those of centroblasts, with fewer DMPs (8,227 hypermethylated and 32,581 hypomethylated). Directly comparing HyperBL and HypoBL revealed 19,832 significant DMPs (Q < 0.05 and |logFC| > 0.25). Most of these were methylated in HyperBL and were predominantly found in open seas (8,733), followed by CpG islands (4,233), CpG shores (2,957), and CpG shelves (1,068). The analysis of chromatin and transcriptional state annotations revealed a significant enrichment (P < 0.001) of hypermethylated DMPs in promoters (Fig. 4C). Performing the region-level analyses resolved 5,550 DMRs, mostly showing hypermethylation in HyperBL (Fig. 4D; Supplementary Table S9). Interestingly, unlike the result when stratifying on EBV status, the DMRs associated with HyperBL corresponded to regions in which methylation has been acquired compared with normal centroblasts and HypoBL. Overall, the hypermethylation patterns in HyperBL were more distinct from those found in normal centroblasts, whereas EBV-positive Burkitt lymphoma exhibits methylation patterns closer to normal centroblasts (Figs. 2B and 4D).
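The DMP counts quoted above can be arranged as a 2 × 2 table to show how the balance of hyper- and hypomethylated probes differs between the HyperBL and EBV-positive comparisons. The sketch below computes an odds ratio and a Pearson chi-square statistic. It is an illustration only: the test behind the reported P value is not stated in this section, and the two DMP sets come from overlapping samples and probes, so the independence assumption of the chi-square test does not strictly hold. The function name is hypothetical.

```python
def odds_ratio_and_chi2(a, b, c, d):
    """For the 2x2 table [[a, b], [c, d]], return the odds ratio and the
    Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi2


# Counts quoted in the text, each relative to normal centroblasts.
# Rows: HyperBL, EBV-positive BL. Columns: hypermethylated, hypomethylated DMPs.
odds, chi2 = odds_ratio_and_chi2(57_596, 83_933, 61_336, 47_609)

CHI2_CRITICAL_P001 = 10.828  # chi-square critical value for P = 0.001 at 1 df
print(f"odds ratio = {odds:.2f}, exceeds critical value: {chi2 > CHI2_CRITICAL_P001}")
```

An odds ratio below 1 here means that, among its DMPs, HyperBL carries a smaller share of hypermethylated probes than EBV-positive Burkitt lymphoma does, which reflects its larger number of hypomethylated DMPs.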

Figure 4.

Burkitt lymphoma (BL) epitypes have distinct methylation patterns. A, Global averaged methylation of normal centroblast (n = 6) and HyperBL (n = 102) samples. Methylation values from individual EPIC array probes were binned into 1-Mbp windows and the average of each bin is shown on a heat scale. Data are arranged in concentric rings starting from the outermost ring: normal centroblast, the difference in methylation levels between HyperBL and centroblast, and HyperBL samples. B, Global averaged methylation of normal centroblast (n = 6) and HypoBL (n = 88) samples. Methylation values from individual EPIC array probes are binned into 1-Mbp windows and the average of each bin is shown on a heat scale. Data are arranged in concentric rings starting from the outermost ring: normal centroblast, the difference in methylation levels between HypoBL and centroblast, and HypoBL samples. C, The chromatin state associated with significant DMPs from the EPIC array that were hypermethylated and hypomethylated from comparing HyperBL (n = 102) against HypoBL (n = 88) samples is shown as a number of all significant DMPs (|abs(logFC)|> 0.25 and Q < 0.05). The number of EPIC array probes contained within each chromatin region that were differentially methylated between BL and centroblasts (CB) (from Fig. 1C) is displayed as the background reference. D, The averaged methylation of samples across DMRs identified from the EPIC array comparing HyperBL (n = 102) and HypoBL (n = 88). Normal centroblast samples (n = 6) are shown as the reference, and unclassified samples (n = 28) are shown for comparison. E, The heatmap on the left represents averaged methylation of samples across DMRs identified from the EPIC array comparing HyperBL (n = 102) and HypoBL (n = 88) that were significantly correlated with gene expression. The heatmap on the right indicates the expression of the genes associated with each DMR. Rows in both heatmaps are in the same order with each row depicting a DMR and the expression of its associated gene. Unclassified samples (n = 28) are included for comparison. F, Dot plot depicting the top 15 enriched families of TFs with binding sites identified within the DMRs based on epitype. The x-axis depicts the different families of TFs, and the y-axis depicts the −log10(P value) of each TF. CNV, copy-number variation; Heterochrom/lo, heterochromatin/low signal; Txn, transcriptional.

As before, we searched for genes with evidence for epigenetic silencing, focusing on DMRs distinguishing the epitypes. These regions were associated with 2,825 genes, of which 210 exhibited significant correlations with expression (R2 > 0.4 and Q < 0.01; Fig. 4E; Supplementary Table S10). These include several genes that are mutated in DLBCLs and to a lesser extent in Burkitt lymphomas. The hypermethylation and reduced expression of IRF4 were intriguing, given the role of this transcription factor (TF) in terminal differentiation. Within the DMRs, we also observed hypermethylation of several IRF4 target genes, all of which were negatively correlated with expression in HyperBL. These genes are involved in the NF-κB pathway, important in plasma cell differentiation, and associated with activated B cell–like DLBCL (ABC-DLBCL). Hypermethylation of the TET2 promoter was associated with significantly lower expression (R2 = −0.29, P < 0.001). When compared by epitype, HyperBLs exhibited significantly lower TET2 expression than HypoBLs (P < 0.001; Supplementary Fig. S2C). Moreover, by comparing TET2 expression across various B-cell lymphoma subtypes, we found HyperBL had the lowest TET2 expression (mean variance stabilizing transformation = 9.98; Supplementary Fig. S2D). Given its role in CpG demethylation, it is possible that lower TET2 expression in HyperBL permits malignant cells to progressively acquire increased levels of methylation.

The methylation patterns in B-cell neoplasms retain some features of the epigenome of their cell of origin (18, 35–37). We analyzed the Burkitt lymphomas along with 1,595 methylation profiles from a variety of B-cell neoplasms. By principal component analysis, the Burkitt lymphomas were generally separated from other malignancies with the exception of DLBCLs. There was also a general separation of HyperBL and HypoBL. Relative to the trajectory of B-cell differentiation, the methylation profiles of HypoBLs were closer to those of normal memory B or plasma cells (Supplementary Fig. S2E; refs. 15, 16). HypoBL also exhibited a closer association with m-CLL and C2.MCL, both of which have proposed origins in memory B cells. The proximity of HypoBL to normal memory B and plasma cell populations, m-CLL and C2.MCL, further supports a model wherein the two epitypes arise from distinct cells of origin.

To further resolve the underlying factors contributing to HyperBL, we explored whether the affected regions were associated with specific regulatory features. As TFs can promote hypomethylation of regulatory regions, we performed TF motif discovery within the DMRs (24, 25) and identified motifs associated with TF families responsible for B-cell development, including the NF-κB family, IRF4, and PRDM1 (Fig. 4F; Supplementary Table S11). In normal B-cell development, increased expression of these TFs promotes terminal differentiation and is linked to the hypomethylation of their associated targets in memory B cells and plasma cells (15). These observations imply that these TFs are either active or primed for activation within HypoBL. Additionally, the activation of TFs associated with memory B cells and plasma cells may indicate that the two epitypes arise from B cells at distinct differentiation stages. When this analysis was performed using only DMRs associated with CpG islands, enhancers, and promoters and not with gene body or intergenic regions, we obtained comparable results (Supplementary Fig. S3A). For comparison, we conducted similar analyses based on EBV status, which resulted in fewer enrichments for motifs, leaving only the zinc-finger family and AP-2 family TFs as significant (Supplementary Fig. S3B; Supplementary Table S12). The absence of B cell–specific TF-binding motifs among the EBV-related DMRs suggests that the hypermethylation in EBV-positive Burkitt lymphoma is not a consequence of inactivation of their cognate TF but more plausibly attributable to pressures imposed by the viral infection.

Using epiCMIT (16), a computational approach that infers historical mitotic activity based on specific methylation changes, we estimated the relative number of cell divisions in each tumor. HyperBL samples exhibited significantly higher epiCMIT values than HypoBL samples (P = 0.001; Supplementary Fig. S3C). When comparing epiCMIT values across B-cell malignancies, we noted a trend toward higher epiCMIT values in the entities thought to arise from B cells closer to terminal differentiation. This could be explained by the cumulative effect of mitotic cell divisions during normal differentiation and after malignant transformation (Supplementary Fig. S3D; ref. 18).

To explore methylation changes comprehensively, we sequenced nine of these samples using long-read sequencing. Nanopolish reported the CpG methylation status of ∼28 million CpG sites, representing nearly 99% of all CpGs in the genome. There was strong correlation between these β values and those from the EPIC array (Supplementary Fig. S4A). We then performed differential methylation analysis to compare the two epitypes. This revealed a total of 984,369 CpGs that exhibited differential methylation between HypoBL and HyperBL, along with 34,008 DMRs (Supplementary Fig. S4B). Most of these regions displayed hypermethylation in HyperBLs, and they were prominently enriched in promoter regions. These DMRs were strongly correlated with those identified through the EPIC array, providing additional support for our findings (Spearman correlation, P value < 0.001).

Long-read sequencing allows the resolution of CpG methylation status of each DNA molecule and yields a digital measurement. This affords the opportunity to quantify the heterogeneity of methylation across DNA molecules (entropy; ref. 38), which is not possible with analog measurements such as β values. We compared the entropy at CpG sites within genes and regulatory elements in both epitypes. The methylation entropy of promoter regions was substantially higher in HyperBL than in HypoBL, indicating a more uniform DNA methylation state at these sites in HypoBL (Supplementary Fig. S4C and S4D). This may imply that hypomethylation of DMRs in promoter regions of HypoBL results from a strong selective pressure, involving active enzymatic DNA demethylation. Using an overrepresentation analysis of promoter regions with low entropy in HypoBL, we found an enrichment of genes involved in hematopoietic differentiation (GO:0048534) and hematopoietic stem cell differentiation (GO:0060218; Supplementary Fig. S4E). The low entropy within these promoters suggests their epigenetic derepression, or sustained maintenance of derepression, in HypoBL. These genes are activated in long-lived memory B cells, which are considered to possess a stemness reminiscent of CD8+ memory T cells (39–41). These results further indicate a distinct cell of origin for HypoBL.

Genetic Features of Epitypes

Using previously published genome-wide mutation profiles (9), we next compared the epitypes for differences in mutation patterns (Fig. 5A) and mutation burden. The latter was significantly higher in HyperBL (P < 0.001; Fig. 5B). This disparity was consistent when restricted to coding mutations (P < 0.001) or known Burkitt lymphoma driver mutations (P = 0.012). Comparing the mutation frequency in each Burkitt lymphoma gene between epitypes and by modules of genes with related function (Supplementary Fig. S5A) revealed several distinctions. At the single-gene level, only ID3 exhibited a higher frequency of mutations in HypoBL. For modules, genes associated with epigenetic regulation and antigen presentation by MHC exhibited a higher frequency of mutations in HyperBL (**, Q < 0.01; *, Q < 0.05, Fisher exact test; Supplementary Fig. S5A). Interestingly, ID3 promoter methylation was significantly higher in patients lacking ID3 mutations, implying epigenetic silencing may be an alternative avenue to reducing ID3 function in Burkitt lymphoma (P = 0.005, Fisher exact test).

Figure 5.

Mutational profiles associated with the Burkitt lymphoma (BL) epitypes. A, Oncoplot of coding mutations identified by WGS in the genes determined to be associated with BL (29) and their occurrence across epitypes. The percentages on the left indicate the frequency of coding mutations of each specific gene across all samples. Each column of the oncoplot represents an individual sample. The mutations are colored according to their type. Gray tiles on the oncoplot represent the absence of mutations of the specific gene. Samples (n = 190) within each epitype are ordered on the basis of EBV status to highlight key differences. B, Box and whisker plots showing the mutation burden across HyperBL (n = 102), HypoBL (n = 88), and unclassified (n = 28) samples. In order from left to right, the plots show the total number of coding mutations, driver mutations, and genome-wide mutation load as identified from the WGS. Samples were subjected to the Wilcoxon rank-sum test. C, Estimated number of mutations per COSMIC signatures SBS1, SBS5, and SBS9 in BL tumors (n = 218) stratified by epitype. Each point represents an individual sample, and the y-axis shows the total number of mutations associated with that signature. Samples were subjected to the Wilcoxon rank-sum test. *, P < 0.05; **, P < 0.01; ***, P < 0.001 (P values at or above 0.05 are not shown).

Considering the overlap between HyperBL and EBV positivity, we searched for differences within each epitype by stratifying patients into four categories using both epitype and EBV status. Interestingly, although TP53 mutations were common in EBV-positive HyperBL (HyperBL+; 24%), they were absent among the EBV-positive HypoBL (HypoBL+; Fig. 5A). Among EBV-negative cases, all HyperBL− cases had mutations in at least one epigenetic modifier, whereas no such mutations were found in HypoBL−. For example, SMARCA4 mutations were common in EBV-negative Burkitt lymphoma overall but were exclusive to HyperBL− (Fig. 5A). This interplay implies distinct selective paths to lymphomagenesis that may be influenced by both EBV status and epitype.

The genome-wide landscape of mutations in HyperBL was strikingly different, with higher global mutation burden along with more coding mutations and mutations in Burkitt lymphoma genes (Fig. 5B). This contrasts with prior observations based on EBV status in Burkitt lymphomas, in which EBV-positive Burkitt lymphomas possess a higher frequency of global and coding mutations but notably fewer driver mutations. Using mutational signatures, we compared the single-base substitution exposures between the epitypes. Each of the signatures SBS1 (clock-like), SBS5 (clock-like/age-associated), and SBS9 (associated with AICDA and polymerase η activity) exhibited significant differences between the epitypes (Fig. 5C). The mutation levels attributed to all three signatures were consistently higher in HyperBL (P < 0.001). Given the previous link between the SBS9 mutation signature and EBV-positive Burkitt lymphoma, we asked whether EBV status or epitype was the primary factor influencing mutation rates. Using a linear model, we found that mutation rates linked to SBS9, SBS1, and SBS5 were more significantly associated with epitype than with EBV status. Similarly, epitype more accurately explained the variations in mutation burden, encompassing global, coding, and driver mutations. This underscores the unique molecular attributes linked to each Burkitt lymphoma epitype, suggesting that this classification is more relevant than EBV status when considering the molecular features of the Burkitt lymphoma genome.

Mutations occurring within regions known for aSHM are attributed to the abnormal activity of activation-induced cytidine deaminase, a pattern that has been observed predominantly in EBV-positive Burkitt lymphomas. Comparing the level of aSHM-associated mutations between the two epitypes, we found a higher rate in the HyperBL genomes (Supplementary Fig. S5B). We assessed whether aSHM at each of these sites and the total burden of aSHM were more strongly associated with epitype or EBV status. Again, epitype membership consistently exhibited a stronger association than EBV status (Supplementary Fig. S5B; Supplementary Table S13). Although this does not exclude an association between aSHM and EBV status, aSHM is more strongly correlated with epitype membership regardless of viral infection.

Additional Molecular Distinctions between the Epitypes

Although DNA methylation influences expression, it was unclear whether the epitypes would harbor consistent gene expression changes shaped by their unique DNA methylation landscape. Comparing the epitypes directly while controlling for EBV status revealed only 213 genes with significant differential expression [adjusted P value (Padj) < 0.01 and |log2FC| > 1], including TET2, IRF4, CD44, and CD24 (Fig. 6A; Supplementary Table S14). The differential expression of CD44, CD24, and IRF4 is of interest due to their dynamic expression in the GC reaction (CD44 and CD24) and role in plasmablastic differentiation (23, 42, 43). Higher IRF4 expression in HypoBL is reminiscent of the difference that we observed between IC-BL and DGG-BL genetic subtypes. Although the variation in IRF4 expression was more pronounced when categorized by genetic subtype as opposed to epitype, the parallels in the distribution of IRF4 expression based on both criteria imply that these classification methods are complementary yet capture distinct biological features (Supplementary Fig. S5C and S5D).

Figure 6.

Patterns of gene expression associated with epitype membership. A, The heatmap displays the 213 differentially expressed genes between epitypes, with rows representing differentially expressed genes and columns representing samples. Rows and columns are clustered using Manhattan distances. The epitype membership and EBV status of each sample are shown at the top. Samples are split on the basis of epitype with HyperBL (n = 102) cases on the left, HypoBL (n = 88) cases in the middle, and unclassified (n = 28) cases on the right. B, Heatmap representing the expression of 32 gene sets from SignatureDB with significant (Q ≤ 0.0025) differences between epitypes. Samples from 218 patients were clustered and ordered on their expression of genes within each gene set. Rows represent the gene sets and columns represent samples. Rows and columns were clustered using Manhattan distances.

We analyzed the RNA-seq data to infer B-cell receptor expression and search for evidence of class-switch recombination (CSR). We classified each sample based on immunoglobulin heavy chain constant gene expression (IGHM, IGHD, IGHA, IGHG, or IGHE; Supplementary Fig. S5E) and observed significantly higher (P = 0.03) representation of IGHG among the HyperBLs. To complement this, we attributed the mechanism underlying each of the MYC rearrangements. Although 32% of HyperBL cases had an MYC translocation that could be attributed to CSR, the majority (56%) of MYC translocations in HypoBL could be attributed to CSR (Supplementary Fig. S5F). Considering the higher rate of IGHM expression in HypoBL and the significantly higher (P = 0.003) proportion of MYC rearrangements resulting from CSR, we conclude that HypoBL may be derived from a memory B cell that has attempted and failed to undergo CSR (44–47).

Given the limited number of genes with consistent expression differences, we used gene set enrichment analyses to search for more subtle effects on genes with related function. We identified 32 differentially expressed gene sets (23), including two associated with the IRF4 network (Fig. 6B; Supplementary Fig. S6; Supplementary Table S15). HypoBL displayed elevated expression of genes involved in IRF4 induction in ABC-DLBCL, other ABC-DLBCL–related pathways, as well as a memory B-cell precursor pathway (Fig. 6B). HyperBL showed elevated expression of genes related to IRF4 repression in GC B cell–like DLBCL and of other pathways associated with this subtype (Fig. 6B). The pathways elevated in each epitype are consistent with the earlier results suggesting that HypoBL originates from a memory B cell, whereas HyperBL likely originates from centroblasts.

Relationship between Epitypes and Patient Outcomes

We conducted Kaplan–Meier survival analyses within the adult cases, for whom the follow-up data were most complete. Among these cases, patients with HyperBL had significantly shorter progression-free and overall survival. This difference remains significant (P = 0.035 and P = 0.014, respectively) when the unclassified patients are also included in the analysis (Fig. 7A and B). Moreover, the difference cannot be explained by other molecular biomarkers such as TP53 mutation status (9) because frequencies of TP53 mutations were relatively balanced in both epitypes within adult patients and epitype status remained significant in a multivariate analysis with TP53 mutation status (Supplementary Fig. S7A–S7D). Although this finding requires confirmation in additional cohorts, the potential association between epitype and outcomes highlights DNA methylation status as a potential new prognostic biomarker for Burkitt lymphoma.

Figure 7.

Survival outcomes of adult patients with Burkitt lymphoma (BL) stratified by epitype. Kaplan–Meier survival analyses were conducted within the adult cases for whom the follow-up data were most complete. Patients with HyperBL were compared with those with HypoBL to analyze progression-free survival (A) and overall survival (B), including the unclassified cases. All times are shown in years. A log-rank P value is shown, and pairwise comparisons were conducted separately across the indicated groups. The risk tables show the number of patients at the specified timepoint.

Previous studies provided limited exploration of the Burkitt lymphoma methylome and its contribution to pathogenesis, primarily due to limited sample sizes and the narrow scope of comparisons focusing solely on EBV status or comparisons with follicular lymphoma (20, 48). In this study, we aimed to address these limitations by conducting a comprehensive examination of the Burkitt lymphoma methylome. Our findings confirm the influence of EBV status on DNA methylation and provide further evidence of the connection between EBV tumor positivity and hypermethylation within the DNA methylome, particularly in enhancer regions. Moreover, our investigation of the most variable CpGs in EBV-positive and EBV-negative Burkitt lymphomas allowed us to identify two distinct epitypes that extend beyond the classification based on EBV status alone.

EBV infection is detected in a variable proportion of Burkitt lymphoma tumors, differing by clinical variant and age. EBV-positive Burkitt lymphoma cases exhibit distinct characteristics, including a lower number of driver mutations specifically related to apoptotic genes, higher rates of aSHM, and increased AICDA activity (8, 9). Consistent with a previous study (8), we find that EBV-positive Burkitt lymphomas display significant hypermethylation patterns in their DNA methylomes compared with EBV-negative Burkitt lymphomas and normal centroblast samples. This hypermethylation predominantly occurs within OpenSea annotations, targets a number of genes previously identified as hypermethylated across B-cell lymphomas, and shows enrichment in enhancer regions (49–52). Our results are consistent with previous studies that show EBV can promote DNA hypermethylation aimed at precisely controlling the expression of specific genes, such as PRDM1, which can trigger EBV lytic reactivation, thereby increasing its capacity to evade detection by the immune system (30, 49, 50, 52).

The occurrence of focal hypermethylation within Burkitt lymphoma has been described in the context of EBV (49, 50, 53, 54). Although HyperBL is dominated by EBV-positive samples, HyperBL occurrence in EBV-negative cases suggests EBV infection may not be a prerequisite for this phenomenon. TET2 hypermethylation in HyperBL was recurrent and associated with reduced expression. Furthermore, we note the highest incidence of mutations affecting regulators of epigenetic state among the genomes of HyperBLs. Indeed, among the 21 EBV-negative HyperBL cases, three had loss-of-function mutations affecting TET2, whereas there were none observed among the 63 EBV-positive HyperBLs (P = 0.01396, Fisher exact test). Given the suggested role of TET2 mutations in genome hypermethylation in DLBCLs (55, 56), it is plausible that TET2 mutation or hypermethylation, along with mutations in other epigenetic-related genes, contributes to the hypermethylation pattern characteristic of HyperBL, particularly in the absence of EBV.
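As an illustration of the comparison quoted above, the following minimal sketch recomputes a two-sided Fisher exact test from the stated counts (3 of 21 EBV-negative versus 0 of 63 EBV-positive HyperBL cases with TET2 loss-of-function mutations). The function name `fisher_exact_two_sided` is hypothetical, and the two-sided convention used here (summing all tables no more probable than the observed one) is an assumption about how the test was run.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x first-column counts in row 1
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table whose probability does not exceed the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Rows: EBV-negative, EBV-positive HyperBL; columns: TET2 mutated, not mutated
p_tet2 = fisher_exact_two_sided(3, 18, 0, 63)
print(round(p_tet2, 5))  # 0.01396
```

With these counts, the observed table is the most extreme one possible, so the two-sided value equals the probability of that single table.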

HyperBL also has elevated mutation frequencies in genes associated with epigenetic regulation and antigen presentation (Supplementary Fig. S5A). By contrast, HypoBL exhibited a higher prevalence of mutations in ID3. An inverse correlation emerged between ID3 mutation status and promoter methylation patterns. Specifically, we observed a significant prevalence of ID3 promoter hypermethylation in patients devoid of ID3 mutations. This pattern suggests that the inactivation of ID3 can occur either through genetic mutations or through epigenetic modifications, such as promoter hypermethylation, underscoring the multifaceted mechanisms that contribute to the functional loss of this gene. Although we noted a pronounced correlation between HyperBL and EBV-positive Burkitt lymphoma, distinct variations exist that are sufficient to identify epitype as biologically distinct from EBV status. For example, approximately a quarter of EBV-positive HyperBL cases exhibited TP53 mutations, but none were observed in EBV-positive HypoBL. In the context of EBV-negative Burkitt lymphoma, all HyperBL cases exhibited mutations in at least one gene related to epigenetic regulation; conversely, these mutations were not observed in EBV-negative HypoBL. These observed disparities underscore a multifaceted interaction and unique selective pressures that characterize the process of lymphomagenesis in each epitype.

Noncoding mutations consistent with aSHM are a pattern previously linked to EBV-positive status in Burkitt lymphomas (8, 9). However, our findings reveal a notable enrichment of aSHM in HyperBL, a pattern that is correlated with but not wholly explained by EBV-positive Burkitt lymphoma status per se. Our comparative analysis established a stronger linkage of aSHM rates with epitype after controlling for EBV status. Based on this comprehensive analysis, we infer that both HyperBL and HypoBL encompass EBV-positive and EBV-negative tumors, and that each epitype represents a unique biological entity independent of EBV status. Our data underscore that the mechanisms leading to variations in aSHM, mutation burden, and mutation signatures are complex and are influenced by EBV status and, more specifically, by epitype. Thus, epitype may serve as a more relevant biomarker of mutational processes within Burkitt lymphoma genomes than EBV status, offering a new way to study Burkitt lymphoma pathogenesis.

The use of gene expression for classification has been instrumental in understanding the diverse nature and prognosis of various NHLs, including Burkitt lymphoma. In DLBCL, it has offered insights into distinct biological underpinnings and facilitated its categorization into different cell-of-origin (COO) subgroups (57–59). However, when extrapolated to other lymphomas, this approach has shown limitations, particularly in diseases like chronic lymphocytic leukemia and mantle cell lymphoma, in which distinguishing COO through gene expression profiling has been challenging (16, 17). In our study, we describe epitypes within Burkitt lymphoma that potentially illuminate a COO hypothesis, each associated with distinct survival outcomes. HyperBL and HypoBL are distinguished by notable molecular characteristics, with biological and transcriptomic distinctions that echo the COO subgroups in DLBCL. Specifically, HyperBL was associated with pronounced promoter hypermethylation and inferior overall survival, appearing epigenetically closer to the GC B cell. Conversely, HypoBL appears epigenetically closer to a memory B cell, maintaining methylation patterns typical of normal B cells at this developmental phase and being associated with better survival outcomes. Furthermore, we found reduced entropy in the promoters of genes tied to hematopoietic stem cell differentiation in HypoBL.

In our observations, HypoBL exhibited a pronounced elevation in IRF4 expression compared with HyperBL, a trend that aligns with the distinctions noted between IC-BL and DGG-BL when sorted by genetic subtypes. Although the disparity in IRF4 expression is more accentuated under genetic subtype classification, the similarities in IRF4 expression distribution under both genetic and epitype categorizations suggest that these two classification approaches, though capturing different nuances, are complementary. In HypoBL, there is an amplified expression of genes associated with IRF4 induction in ABC-DLBCL and a pathway indicative of memory B-cell precursors. These findings, coupled with distinct methylation patterns, suggest the possibility of unique cellular origins for HyperBL and HypoBL, each carrying distinct biological and clinical connotations.

We have identified a subgroup within Burkitt lymphoma characterized by elevated levels of DNA methylation and have also shown elevated levels of methylation present within EBV-positive Burkitt lymphoma. The shared occurrence of hypermethylation in both EBV-positive Burkitt lymphoma and the HyperBL subgroup highlights the potential value of investigating demethylating agents as part of Burkitt lymphoma treatment approaches. Further detailed investigations are essential to confirm these hypotheses and understand their broader implications in the context of Burkitt lymphoma pathogenesis and treatment.

Case Accrual

This study was performed in accordance with the ethical standards of the Declaration of Helsinki. All contributing sites provided institutional review board approvals for the use of tissues submitted for molecular characterization. Written informed consent was obtained from the parents or guardians of the children and written informed assent was obtained from children of 7 years of age and older prior to enrollment.

Methylation Data Collection and Integration

Samples were analyzed by Illumina Infinium MethylationEPIC BeadChip arrays by following the manufacturer’s protocol. Beginning with a compendium of 866,836 probes, a systematic filtration process was executed using the minfi R package (60). Initial exclusion criteria encompassed the removal of 977 CpG probes with inadequate signal detection, followed by the omission of 30,435 CpGs corresponding to SNPs and 19,298 CpGs located on sex chromosomes. The retained subset of 812,126 CpGs exhibited a detection P value ≤ 0.05 in more than 10% of samples. Subsequent exclusions targeted samples with suboptimal intensity signals or problematic probe conversions, culminating in the elimination of 10 samples. After filtration, a total of 218 Burkitt lymphoma samples and six normal centroblast samples (8) were preserved for further analysis, encompassing DNA methylation values for 812,126 CpG probes. These values were normalized using the Subset-quantile Within Array Normalization (SWAN) algorithm. Comprehensive annotation of all CpGs was facilitated by using the IlluminaHumanMethylationEPICanno.ilm10b4.hg19 (version 0.6) R package. To enable comparative analyses across B-cell neoplasms and normal B cell types, we analyzed methylation data from 1,595 B-cell samples from Duran-Ferrer and colleagues (18) and integrated these with our data. Subsequent analyses were conducted using the 452,679 CpG sites common to both methylation array platforms.

Differential DNA Methylation Analyses

DMRcate (61) was used to identify DMPs and DMRs using preprocessed β values as input. DMPs required abs(logFC) > 0.25 and Padj < 0.05. DMRs required a mean methylation difference >0.1, an FDR < 0.05, and at least four CpG sites. The analysis of differential methylation between epitypes included EBV status as a variable in the design model to facilitate the identification of DMPs and DMRs not directly associated with EBV status.
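The thresholds above can be expressed as simple filters. The sketch below is only an illustration of the stated cutoffs and is not DMRcate's interface: the function names and argument names are hypothetical, and treating the DMR mean methylation difference as an absolute value is an assumption.

```python
def is_dmp(log_fc, p_adj):
    """Differentially methylated position: abs(logFC) > 0.25 and Padj < 0.05."""
    return abs(log_fc) > 0.25 and p_adj < 0.05


def is_dmr(mean_diff, fdr, n_cpgs):
    """Differentially methylated region: mean difference > 0.1 (taken as
    absolute here, an assumption), FDR < 0.05, and at least four CpG sites."""
    return abs(mean_diff) > 0.1 and fdr < 0.05 and n_cpgs >= 4


# Toy records (illustrative values only, not study data)
toy_regions = [
    {"mean_diff": 0.22, "fdr": 0.001, "n_cpgs": 7},   # passes all criteria
    {"mean_diff": 0.30, "fdr": 0.001, "n_cpgs": 3},   # too few CpGs
    {"mean_diff": -0.05, "fdr": 0.001, "n_cpgs": 9},  # difference too small
]
kept = [r for r in toy_regions if is_dmr(r["mean_diff"], r["fdr"], r["n_cpgs"])]
print(len(kept))  # 1
```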

To quantify enrichment among DMPs, the entire complement of CpGs from the EPIC array was used to establish a background reference. To ascertain enrichments of CpG probes within regulatory regions (ChromHMM), we assessed the proportion of differentially methylated CpGs located in these regions relative to the proportion of background EPIC array probes within the same regulatory context. Enrichment significance was determined through the calculation of odds ratio scores using the Fisher exact test (P < 0.01).
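A minimal sketch of this kind of enrichment calculation follows. It assumes a 2x2 table of probes split by DMP status and by membership in a regulatory region, with a one-sided (enrichment) Fisher exact P value; the function name, the sidedness of the test, and the toy counts are all assumptions made for illustration and are not taken from the study.

```python
from math import comb


def region_enrichment(dmp_in, dmp_out, bg_in, bg_out):
    """Odds ratio and one-sided Fisher exact P value for DMP enrichment.

    dmp_in / dmp_out: DMPs inside / outside the regulatory region
    bg_in / bg_out:   remaining background probes inside / outside the region
    """
    odds_ratio = (dmp_in * bg_out) / (dmp_out * bg_in)
    n_dmp = dmp_in + dmp_out
    n_in = dmp_in + bg_in
    total = n_dmp + bg_in + bg_out
    denom = comb(total, n_dmp)
    # Upper tail: probability of at least dmp_in DMPs falling in the region
    upper = min(n_dmp, n_in)
    p_value = sum(comb(n_in, x) * comb(total - n_in, n_dmp - x)
                  for x in range(dmp_in, upper + 1)) / denom
    return odds_ratio, p_value


# Toy counts: 40 DMPs (30 in enhancers) among 1,000 probes (200 in enhancers)
toy_or, toy_p = region_enrichment(30, 10, 170, 790)
print(round(toy_or, 2), toy_p < 0.01)
```

Exact integer arithmetic via `math.comb` keeps the tail sum stable for small tables; for array-scale counts a statistics library would be the more practical choice.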

Nonnegative Matrix Factorization Clustering

Nonnegative matrix factorization (NMF) clustering was performed using the NMF package (v0.23.0). Five thousand probes with the highest variance within each of the EBV-positive and EBV-negative samples were used as input (10,000 probes total). To ascertain the optimal factorization rank for NMF, 100 bootstrapped iterations ranging from ranks 2 to 5 were executed, with the optimal rank identified through the assessment of the cophenetic coefficient, dispersion, and silhouette metrics. Upon determining a factorization rank of 2 as the best fit, various algorithms, specifically, those of Lee, Brunet, offset, and nonsmooth NMF, were evaluated by iteratively running them on the feature matrix until convergence was achieved. The Brunet algorithm emerged as the most suitable for subsequent analysis based on these evaluations. Throughout the estimation of factorization rank, algorithm selection, and NMF computations, a consistent random seed value of 12345 was used to ensure reproducibility. Sample clustering was determined using the Euclidean distance metric, with individual sample subgroup membership ascertained by extracting matrix coefficients and assigning samples to the subgroup corresponding to their highest coefficient.
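The final assignment step, in which each sample joins the subgroup with its highest coefficient, can be sketched as follows. This is a generic illustration rather than the NMF package's interface; the function name, the orientation of the coefficient matrix (one row per factor, one column per sample), and the toy values are assumptions.

```python
def assign_subgroups(coef, sample_ids):
    """Assign each sample to the factor with its largest coefficient.

    coef: list of rows, one per factor; each row holds one value per sample.
    Returns a dict mapping sample ID to a zero-based factor index.
    """
    assignments = {}
    for j, sample_id in enumerate(sample_ids):
        column = [row[j] for row in coef]
        assignments[sample_id] = column.index(max(column))
    return assignments


# Toy rank-2 coefficient matrix for three hypothetical samples
toy_coef = [
    [0.9, 0.2, 0.6],
    [0.1, 0.7, 0.4],
]
toy_assignments = assign_subgroups(toy_coef, ["s1", "s2", "s3"])
print(toy_assignments)  # {'s1': 0, 's2': 1, 's3': 0}
```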

Training Burkitt Lymphoma Epitype Classifier

Using the linear predictor score (LPS) R package, a linear predictor model was built to determine the optimal probe count necessary for categorizing samples into HyperBL and HypoBL epigenetic subtypes, similar to that described in Wright and colleagues (57). Initially, a training set composed of 77 HyperBL and 66 HypoBL specimens was compiled, in conjunction with a validation set containing 40 HyperBL and 35 HypoBL samples. A series of models, each incorporating a variable number of probes, underwent assessment within the training set, using a leave-one-out cross-validation strategy to identify the optimal number of probes for the LPS model. The iteration incorporating 60 probes, selected from an initial pool of 10,000 probes used for NMF clustering, demonstrated the lowest mean error rate. When this model was applied to the validation cohort, the LPS score distribution within each epigenetic subtype mirrored that of the training cohort, arguing against model overfitting. Using probability determinations based on Bayes’ theorem, a 90% certainty threshold was established for definitive epitype classification. Accordingly, samples were assigned to an epitype only if the likelihood of their belonging met or exceeded the 90% threshold; otherwise, they were considered unclassified.
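The classification rule can be sketched in a few lines. This is a simplified illustration and not the LPS R package's interface: it assumes that the score is a weighted sum of probe β values, that scores in each epitype follow a normal distribution, and that the two epitypes have equal prior probability. All function names, weights, and distribution parameters below are hypothetical.

```python
from math import exp, pi, sqrt


def lps_score(betas, weights):
    """Linear predictor score: weighted sum of probe beta values."""
    return sum(w * b for w, b in zip(weights, betas))


def normal_pdf(x, mean, sd):
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def classify(score, hyper_params, hypo_params, threshold=0.90):
    """Apply Bayes' theorem with equal priors (an assumption) and a 90% cutoff.

    hyper_params / hypo_params: (mean, sd) of training scores per epitype.
    """
    like_hyper = normal_pdf(score, *hyper_params)
    like_hypo = normal_pdf(score, *hypo_params)
    p_hyper = like_hyper / (like_hyper + like_hypo)
    if p_hyper >= threshold:
        return "HyperBL"
    if 1.0 - p_hyper >= threshold:
        return "HypoBL"
    return "unclassified"


# Hypothetical training distributions, for illustration only
HYPER, HYPO = (10.0, 2.0), (-10.0, 2.0)
calls = [classify(s, HYPER, HYPO) for s in (8.0, 0.0, -9.0)]
print(calls)  # ['HyperBL', 'unclassified', 'HypoBL']
```

A score midway between the two distributions yields a posterior of 0.5 for each epitype, which falls below the 90% cutoff and leaves the sample unclassified.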

Genome-Wide Methylation Profiling with Long Reads

The occurrence of 5mC in the long-read sequencing data was determined using Nanopolish (v. 0.8.5; ref. 62) using the combination of raw signals from the sequencer found in the fast5 files and aligned PromethION BAM files as input. Methylation frequencies (β values) at reference CpGs were then summarized using the Nanopolish “calculate methylation frequency” script and by enabling the split flag (-s) to obtain single CpG methylation. β values used in subsequent analyses were restricted to those within autosomes (chr:1-22), excluding the sex (X and Y) chromosomes. The analysis of long-read methylation profiling was done using the Lymphoid Cancer Research (LCR) modules, and the code is available at https://github.com/LCR-BCCRC/lcr-modules.

The DSS (63) R package was used to execute differential methylation analyses on PromethION sequencing data, adhering to the procedure outlined in the official tutorial. Differentially methylated CpG sites were characterized by a mean methylation differential exceeding 0.1 between epitypes, coupled with a FDR < 0.05. DMRs were identified under the criteria of a mean methylation difference surpassing 0.1 between epitypes, an FDR < 0.05, and a composition of at least four CpGs.

Estimating Methylation Entropy from Long-Read Data

Shannon entropy scores were computed from methylation values from Nanopolish using CpelNano (64). Comparative analysis of these scores across varied transcriptional states between the two epitypes was conducted using the Wilcoxon rank-sum test (base R, RRID: SCR_001905). Promoter regions exhibiting significant (Q < 0.05) differences in entropy scores, specifically those promoters hypomethylated with low entropy scores in HypoBL, were further analyzed for enrichment using the rGREAT (65) package within R by following the recommended guidelines outlined in the tutorial. The resulting Gene Ontology biological processes enrichment data were compiled via the getEnrichmentTables function to obtain the final over-enrichment results table.
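To convey the intuition behind methylation entropy, the sketch below computes the Shannon entropy of a single CpG site from per-read binary methylation calls. This is a deliberately simplified stand-in and not the CpelNano model, which estimates entropy over regions with a statistical model of the nanopore signal; the function names and toy reads are assumptions.

```python
from math import log2


def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli variable with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1.0 - p) * log2(1.0 - p))


def site_entropy(calls):
    """Entropy at one CpG from per-read calls (1 = methylated, 0 = not)."""
    if not calls:
        raise ValueError("at least one read is required")
    return binary_entropy(sum(calls) / len(calls))


# Toy reads: a uniformly methylated site and a maximally mixed site
uniform_site = site_entropy([1, 1, 1, 1])
mixed_site = site_entropy([1, 0, 1, 0])
print(uniform_site, mixed_site)  # 0.0 1.0
```

A site at which every molecule carries the same state has zero entropy, whereas an even mixture of methylated and unmethylated molecules reaches the maximum of one bit, even though an averaged β value of 0.5 could not distinguish that mixture from other scenarios.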

DNA Extraction and PromethION Sequencing

High molecular weight DNA was extracted using the MagAttract HMW DNA Kit (cat. #67563, QIAGEN). Genomic libraries were prepared according to Oxford Nanopore Technologies’ protocols using the SQK-LSK109 Ligation Library Kit. The NEBNext Ultra II kit (cat. #E7646A, New England Biolabs) was used for end-repair and A-tailing, and NEBNext Quick Ligase (cat. #E6056S) was used to ligate the Oxford Nanopore adapters. A final size selection at a 0.4:1 ratio (magnetic beads to library) was performed to select against smaller molecules. Sequencing was performed on R9.4.1 flow cells on the PromethION Alpha-Beta instrument running beta release software version 19.06.9 (MinKNOW v3.4.6, GUI v3.4.12). A DNase I nuclease flush (cat. #AM2222, Invitrogen) was performed after 18 hours as per the Oxford Nanopore protocol, version NFL_9076_v109_revD_08Oct2018. Base calling from the resulting fast5 files was performed using guppy v3.2.1, and the reads were mapped to GRCh38 using minimap2.

Annotation of Probes with Chromatin and Transcriptional State

To refine our annotations, probes were mapped to chromatin and transcriptional states as defined for the lymphoblastoid B-cell line GM12878 in the ChromHMM track of the UCSC Genome Browser. The chromatin states were grouped as follows: states 1 to 3 were classified as promoter regions (active, weak, and poised promoters); states 4 to 7 as enhancer regions (strong and weak enhancers); state 8 as insulator regions; state 9 as transcriptional transition; state 10 as transcriptional elongation; state 11 as weak transcription; state 12 as polycomb-repressed regions; and state 13 as heterochromatin.
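
The grouping of ChromHMM states described above amounts to a lookup from state number to category. A minimal Python sketch follows; the function name and the "unassigned" label for states outside 1 to 13 are assumptions, since the text does not describe how other states were handled.

```python
# Single-state categories as listed in the text.
SINGLE_STATE_LABELS = {
    8: "insulator",
    9: "transcriptional transition",
    10: "transcriptional elongation",
    11: "weak transcription",
    12: "polycomb-repressed",
    13: "heterochromatin",
}


def chromatin_category(state: int) -> str:
    """Map a ChromHMM state number to the grouping described in the text."""
    if 1 <= state <= 3:
        return "promoter"  # active, weak, and poised promoters
    if 4 <= state <= 7:
        return "enhancer"  # strong and weak enhancers
    # States outside 1-13 are not described in the text (assumed label).
    return SINGLE_STATE_LABELS.get(state, "unassigned")


print(chromatin_category(2))   # promoter
print(chromatin_category(12))  # polycomb-repressed
```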

TF-Binding Motif Enrichment Analyses

The JASPAR enrichment analysis tool was used for TF-binding motif enrichment analysis. The input consisted of DMRs defined by EBV status and by epitype, which were compared against a background comprising EPIC array probes. Before analysis, both the DMRs and the background were converted from hg19 to hg38 genomic coordinates. Following the JASPAR protocol, enrichment was calculated within a specified universe of genomic regions: the DMRs served as the user-defined input set, and the EPIC array probes served as the user-defined background (universe) set.

Gene Expression and Pathway Analyses

Gene expression was quantified at the transcript level using Salmon, as described previously by Thomas and colleagues (9). Differential gene expression between the epitypes was analyzed with edgeR for samples with available RNA-seq data. The design model controlled for cell sorting status, patient sex, and EBV status while testing for differential expression by epitype. Genes with Padj < 0.01 and |log2FC| > 1.5 were considered significantly differentially expressed and were visualized using the R package ComplexHeatmap (version 2.2.0). Because EBV-positive samples were unevenly distributed between epitypes, and to confirm that the most significantly differentially expressed genes were best explained by epitype, we fitted a linear model for these genes that included both EBV status and epitype as variables.
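
The significance cutoffs can be read as a two-part filter on the differential expression results. The Python sketch below applies them to a toy table; it does not reproduce the edgeR model, and the gene names, field names, and values are hypothetical.

```python
PADJ_CUTOFF = 0.01    # adjusted P value threshold
LOG2FC_CUTOFF = 1.5   # absolute log2 fold-change threshold


def is_differentially_expressed(padj: float, log2fc: float) -> bool:
    """Apply both cutoffs: Padj < 0.01 and |log2FC| > 1.5."""
    return padj < PADJ_CUTOFF and abs(log2fc) > LOG2FC_CUTOFF


# Hypothetical results table (placeholder genes, illustrative values).
results = [
    {"gene": "GENE_A", "padj": 0.001, "log2fc": 2.3},
    {"gene": "GENE_B", "padj": 0.001, "log2fc": -1.9},
    {"gene": "GENE_C", "padj": 0.200, "log2fc": 3.0},
    {"gene": "GENE_D", "padj": 0.005, "log2fc": 0.8},
]

significant = [
    r["gene"]
    for r in results
    if is_differentially_expressed(r["padj"], r["log2fc"])
]
print(significant)  # ['GENE_A', 'GENE_B']
```

Using the absolute fold change keeps genes that are strongly higher in either epitype.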

Gene set enrichment analysis was performed using GSVA (version 1.34.0) comparing the two epitypes. Normalized expression data and gene sets obtained from SignatureDB were used as input for gene set variation analyses. Gene sets were filtered for a minimum of five genes and a maximum of 500. Significantly differentially enriched gene sets were determined based on a cutoff of Padj < 0.01 and visualized using the R package ComplexHeatmap.
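
The gene set size filter can be sketched as follows. This Python example only illustrates the five-to-500 size bounds; it is not the GSVA implementation, the function and gene set names are hypothetical, and counting unique genes per set is an assumption.

```python
def filter_gene_sets(gene_sets, min_size=5, max_size=500):
    """Keep gene sets whose number of unique genes lies in [min_size, max_size]."""
    kept = {}
    for name, genes in gene_sets.items():
        unique_genes = sorted(set(genes))
        if min_size <= len(unique_genes) <= max_size:
            kept[name] = unique_genes
    return kept


# Hypothetical gene sets with placeholder gene identifiers.
gene_sets = {
    "too_small": [f"G{i}" for i in range(3)],
    "in_range": [f"G{i}" for i in range(40)],
    "too_large": [f"G{i}" for i in range(750)],
}

kept = filter_gene_sets(gene_sets)
print(list(kept))  # ['in_range']
```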

Mitotic Clock Estimates

Methylation-based mitotic clock estimates were calculated from the EPIC array dataset using epiCMIT (https://github.com/Duran-FerrerM/Pan-B-cell-methylome), following the workflow described in the mitotic clock calculator documentation. Mitotic clock estimates were also obtained for a cohort of 1,595 B-cell samples described by Duran-Ferrer and colleagues (18). These estimates were used to compare proliferative histories across B-cell malignancies and against normal B-cell populations.

Data Availability

All primary WGS, long-read PromethION, and transcriptome sequencing data, as well as the clinical data used in this publication, are available from the NCI’s Genomic Data Commons at https://portal.gdc.cancer.gov/projects/CGCI-BLGSP. The raw DNA methylation data from EPIC arrays are publicly available through Gene Expression Omnibus under accession number GSE292690 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE292690). All custom bioinformatics workflows, scripts, and postprocessing and visualization functions are openly available on GitHub in the LCR modules repository (https://github.com/LCR-BCCRC/lcr-modules), the LCR scripts repository (https://github.com/LCR-BCCRC/lcr-scripts), and the GAMBLR.open package (https://github.com/morinlab/GAMBLR.open).

Authors’ Disclosures

N. Thomas reports personal fees from Vevo Therapeutics outside the submitted work. C. Casper reports grants from NIH during the conduct of the study. J.M. Gastier-Foster reports grants from NIH during the conduct of the study, as well as other support from Beckman Coulter Biosciences outside the submitted work. A.S. Gerrie reports grants and personal fees from Eli Lilly Canada and BeiGene, personal fees from AstraZeneca, and grants from Janssen outside the submitted work. T.C. Greiner reports other support from Leidos Biomedical Research, Inc., a subcontractor for NIH, during the conduct of the study. C.G. Mullighan reports personal fees from Illumina during the conduct of the study, as well as grants from Pfizer, personal fees from Amgen, and other support from Cyrus outside the submitted work. D.W. Scott reports personal fees from AbbVie, AstraZeneca, Genmab, Kite/Gilead, and Veracyte and grants and personal fees from Roche/Genentech outside the submitted work, as well as a patent for Using Gene Expression to Assign Cell-of-origin Class to Aggressive B-cell Lymphomas issued and licensed to NanoString Technologies and a patent for Using Gene Expression to Identify the Dark Zone Signature in Aggressive B-cell Lymphomas issued. M. Esteller reports grants from Incyte and personal fees from Quimatryx outside the submitted work. No disclosures were reported by the other authors.

Authors’ Contributions

N. Thomas: Resources, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. C.A. García-Prieto: Resources, software, formal analysis, validation, investigation, visualization, methodology, writing–review and editing. K. Dreval: Resources, data curation, software, formal analysis, visualization, methodology, writing–review and editing. L.K. Hilton: Resources, data curation, software, formal analysis, investigation, visualization, methodology, writing–review and editing. J.S. Abramson: Resources, data curation, writing–review and editing. N.L. Bartlett: Resources, formal analysis, investigation, methodology, writing–review and editing. J. Bethony: Resources, data curation, writing–review and editing. J. Bowen: Resources, data curation, writing–review and editing. A.C. Bryan: Resources, data curation, writing–review and editing. C. Casper: Resources, data curation, writing–review and editing. M.A. Dyer: Resources, data curation, funding acquisition, project administration. J.M. Gastier-Foster: Resources, data curation, writing–review and editing. A.S. Gerrie: Resources, data curation, formal analysis, methodology, writing–review and editing. T.C. Greiner: Resources, formal analysis, investigation, methodology, writing–review and editing. N.B. Griner: Resources, data curation, writing–review and editing. T.G. Gross: Resources, data curation, writing–review and editing. N. Harris: Resources, formal analysis, investigation, methodology, writing–review and editing. J.D. Irvin: Resources, data curation, supervision, funding acquisition, project administration. E.S. Jaffe: Resources, formal analysis, investigation, methodology, writing–review and editing. F.E. Leal: Resources, data curation, writing–review and editing. S.M. Mbulaiteye: Resources, data curation, formal analysis, funding acquisition, methodology, writing–review and editing. C.G. Mullighan: Resources, data curation, writing–review and editing. A.J. Mungall: Resources, data curation, project administration, writing–review and editing. K.L. Mungall: Resources, data curation, project administration, writing–review and editing. C. Namirembe: Resources, data curation, writing–review and editing. A. Noy: Resources, data curation, formal analysis, methodology, writing–review and editing. M.D. Ogwang: Resources, data curation, writing–review and editing. J. Orem: Resources, data curation, writing–review and editing. G. Ott: Resources, data curation, writing–review and editing. H. Petrello: Resources, data curation, project administration, writing–review and editing. S.J. Reynolds: Resources, data curation, funding acquisition, writing–review and editing. S.H. Swerdlow: Resources, data curation, methodology, writing–review and editing. A. Traverse-Glehen: Resources, data curation, writing–review and editing. W.H. Wilson: Resources, data curation, writing–review and editing. M.A. Marra: Conceptualization, resources, data curation, supervision, project administration, writing–review and editing. L.M. Staudt: Conceptualization, resources, data curation, supervision, funding acquisition, writing–review and editing. D.W. Scott: Conceptualization, resources, data curation, supervision, methodology, project administration, writing–review and editing. M. Esteller: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, methodology, writing–original draft, project administration, writing–review and editing. R.D. Morin: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing.

Acknowledgments

The authors thank the Foundation for Burkitt Lymphoma Research Working Group for interesting discussions. They also acknowledge the Information Management Services (Silver Spring, MD), Westat, Inc. (Rockville, MD), and African Field Epidemiology Network (Kampala, Uganda) for coordinating The Epidemiology of Burkitt Lymphoma in East African Children and Minors (EMBLEM) study fieldwork in Uganda. The authors are grateful for contributions from various groups at Canada’s Michael Smith Genome Sciences Centre, including those from the biospecimen, library construction, sequencing, bioinformatics, technology development, quality assurance, laboratory information management system, purchasing, and project management teams. They also thank the Biorepository of St. Jude Children’s Research Hospital (NCI grants P30 CA021765 and R35 CA197695 to C.G. Mullighan). This work has been funded in part by the Foundation for Burkitt Lymphoma Research (http://www.foundationforburkittlymphoma.org) and in whole or in part with Federal funds from the NCI, NIH, under contract number 75N91019D00024, task order number 75N91020F00003, contract number HHSN261200800001E, contract number HHSN261201100063C, and contract number HHSN261201100007I (Division of Cancer Epidemiology and Genetics), as well as in part (S.J. Reynolds) by the Division of Intramural Research, NIAID, NIH. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services nor does the mention of trade names, commercial products, or organizations imply endorsement by the US government. This work was supported by a Terry Fox New Investigator Award (number 1043) and by an operating grant from the Canadian Institutes for Health Research, as well as a New Investigator Award from the Canadian Institutes for Health Research (to R.D. Morin). R.D. Morin is a Michael Smith Foundation for Health Research Scholar, and D.W. Scott is a Michael Smith Foundation for Health Research Health Professional-Investigator. M.A. Marra is the recipient of the Canada Research Chair in Genome Science. M. Esteller is supported by the CERCA Programme/Generalitat de Catalunya (2021 SGR01494), Spanish Ministry of Science and Innovation MCIN/AEI/10.13039/501100011033/ERDF “A way to make Europe” (PID2021-125282OB-I00), and the Cellex Foundation (CEL007). This article is dedicated to the memory of Daniela S. Gerhard.

Note: Supplementary data for this article are available at Blood Cancer Discovery Online (https://bloodcancerdiscov.aacrjournals.org/).

References

1. Hämmerl L, Colombet M, Rochford R, Ogwang DM, Parkin DM. The burden of Burkitt lymphoma in Africa. Infect Agents Cancer 2019;14:17.
2. Stefan DC, Lutchman R. Burkitt lymphoma: epidemiological features and survival in a South African centre. Infect Agents Cancer 2014;9:19.
3. Bouska A, Bi C, Lone W, Zhang W, Kedwaii A, Heavican T, et al. Adult high-grade B-cell lymphoma with Burkitt lymphoma signature: genomic features and potential therapeutic targets. Blood 2017;130:1819–31.
4. Schmitz R, Ceribelli M, Pittaluga S, Wright G, Staudt LM. Oncogenic mechanisms in Burkitt lymphoma. Cold Spring Harb Perspect Med 2014;4:a014282.
5. Love C, Sun Z, Jima D, Li G, Zhang J, Miles R, et al. The genetic landscape of mutations in Burkitt lymphoma. Nat Genet 2012;44:1321–5.
6. Molyneux EM, Rochford R, Griffin B, Newton R, Jackson G, Menon G, et al. Burkitt’s lymphoma. Lancet 2012;379:1234–44.
7. Magrath I. Epidemiology: clues to the pathogenesis of Burkitt lymphoma. Br J Haematol 2012;156:744–56.
8. Grande BM, Gerhard DS, Jiang A, Griner NB, Abramson JS, Alexander TB, et al. Genome-wide discovery of somatic coding and noncoding mutations in pediatric endemic and sporadic Burkitt lymphoma. Blood 2019;133:1313–24.
9. Thomas N, Dreval K, Gerhard DS, Hilton LK, Abramson JS, Ambinder RF, et al. Genetic subgroups inform on pathobiology in adult and pediatric Burkitt lymphoma. Blood 2023;141:904–16.
10. Alaggio R, Amador C, Anagnostopoulos I, Attygalle AD, Araujo IBO, Berti E, et al. The 5th edition of the World Health Organization classification of haematolymphoid tumours: lymphoid neoplasms. Leukemia 2022;36:1720–48.
11. Bergman Y, Cedar H. DNA methylation dynamics in health and disease. Nat Struct Mol Biol 2013;20:274–81.
12. Esteller M. Epigenetic gene silencing in cancer: the DNA hypermethylome. Hum Mol Genet 2007;16:R50–9.
13. Fernandez AF, Assenov Y, Martin-Subero JI, Balint B, Siebert R, Taniguchi H, et al. A DNA methylation fingerprint of 1628 human samples. Genome Res 2012;22:407–19.
14. Xia D, Leon AJ, Yan J, Silva A, Bakhtiari M, Tremblay-LeMay R, et al. DNA methylation-based classification of small B-cell lymphomas: a proof-of-principle study. J Mol Diagn 2021;23:1774–86.
15. Kulis M, Merkel A, Heath S, Queirós AC, Schuyler RP, Castellano G, et al. Whole-genome fingerprint of the DNA methylome during human B-cell differentiation. Nat Genet 2015;47:746–56.
16. Kulis M, Heath S, Bibikova M, Queirós AC, Navarro A, Clot G, et al. Epigenomic analysis detects widespread gene-body DNA hypomethylation in chronic lymphocytic leukemia. Nat Genet 2012;44:1236–42.
17. Queirós AC, Beekman R, Vilarrasa-Blasi R, Duran-Ferrer M, Clot G, Merkel A, et al. Decoding the DNA methylome of mantle cell lymphoma in the light of the entire B cell lineage. Cancer Cell 2016;30:806–21.
18. Duran-Ferrer M, Clot G, Nadeu F, Beekman R, Baumann T, Nordlund J, et al. The proliferative history shapes the DNA methylome of B-cell tumors and predicts clinical outcome. Nat Cancer 2020;1:1066–81.
19. Hummel M, Bentink S, Berger H, Klapper W, Wessendorf S, Barth TFE, et al. A biologic definition of Burkitt’s lymphoma from transcriptional and genomic profiling. N Engl J Med 2006;354:2419–30.
20. Kretzmer H, Bernhart SH, Wang W, Haake A, Weniger MA, Bergmann AK, et al. DNA-methylome analysis in Burkitt and follicular lymphomas identifies differentially methylated regions linked to somatic mutation and transcriptional control. Nat Genet 2015;47:1316–25.
21. Shawky SA, El-Borai MH, Khaled HM, Guda I, Mohanad M, Abdellateif MS, et al. The prognostic impact of hypermethylation for a panel of tumor suppressor genes and cell of origin subtype on diffuse large B-cell lymphoma. Mol Biol Rep 2019;46:4063–76.
22. Hayslip J, Montero A. Tumor suppressor gene methylation in follicular lymphoma: a comprehensive review. Mol Cancer 2006;5:44.
23. Victora GD, Dominguez-Sola D, Holmes AB, Deroubaix S, Dalla-Favera R, Nussenzweig MC. Identification of human germinal center light and dark zone cells and their relationship to human B-cell lymphomas. Blood 2012;120:2240–8.
24. Esteller M, Guo M, Moreno V, Peinado MA, Capella G, Galm O, et al. Hypermethylation-associated inactivation of the cellular retinol-binding-protein 1 gene in human cancer. Cancer Res 2002;62:5902–5.
25. Esteller M. Profiling aberrant DNA methylation in hematologic neoplasms: a view from the tip of the iceberg. Clin Immunol 2003;109:80–8.
26. Amara K, Trimeche M, Ziadi S, Laatiri A, Hachana M, Korbi S. Prognostic significance of aberrant promoter hypermethylation of CpG islands in patients with diffuse large B-cell lymphomas. Ann Oncol 2008;19:1774–86.
27. Mohamed G, Talima S, Li L, Wei W, Rudzki Z, Allam RM, et al. Low expression and promoter hypermethylation of the tumour suppressor SLIT2, are associated with adverse patient outcomes in diffuse large B cell lymphoma. Pathol Oncol Res 2019;25:1223–31.
28. Grønbaek K, Ralfkiaer U, Dahl C, Hother C, Burns JS, Kassem M, et al. Frequent hypermethylation of DBC1 in malignant lymphoproliferative neoplasms. Mod Pathol 2008;21:632–8.
29. Coyle KM, Dreval K, Hodson DJ, Morin RD. Audit of B-cell cancer genes. Blood Adv 2025;9:2019–31.
30. Dolcetti R, Giunco S, Dal Col J, Celeghin A, Mastorci K, De Rossi A. Epstein-Barr virus and telomerase: from cell immortalization to therapy. Infect Agent Cancer 2014;9:8.
31. Bellon M, Nicot C. Regulation of telomerase and telomeres: human tumor viruses take control. J Natl Cancer Inst 2008;100:98–108.
32. Qin Y, Feng H, Chen M, Wu H, Zheng X. InfiniumPurify: an R package for estimating and accounting for tumor purity in cancer methylation research. Genes Dis 2018;5:43–5.
33. Steen CB, Liu CL, Alizadeh AA, Newman AM. Profiling cell type abundance and expression in bulk tissues with CIBERSORTx. Methods Mol Biol 2020;2117:135–57.
34. Nik-Zainal S, Van Loo P, Wedge DC, Alexandrov LB, Greenman CD, Lau KW, et al. The life history of 21 breast cancers. Cell 2012;149:994–1007.
35. Gravemeyer J, Spassova I, Verhaegen ME, Dlugosz AA, Hoffmann D, Lange A, et al. DNA-methylation patterns imply a common cellular origin of virus- and UV-associated Merkel cell carcinoma. Oncogene 2022;41:37–45.
36. Oakes CC, Martin-Subero JI. Insight into origins, mechanisms, and utility of DNA methylation in B-cell malignancies. Blood 2018;132:999–1006.
37. Pan H, Jiang Y, Boi M, Tabbò F, Redmond D, Nie K, et al. Epigenomic evolution in diffuse large B-cell lymphomas. Nat Commun 2015;6:6921.
38. Xie H, Wang M, de Andrade A, Bonaldo MF, Galat V, Arndt K, et al. Genome-wide quantitative assessment of variation in DNA methylation patterns. Nucleic Acids Res 2011;39:4099–108.
39. Kurosaki T, Kometani K, Ise W. Memory B cells. Nat Rev Immunol 2015;15:149–59.
40. Luckey CJ, Bhattacharya D, Goldrath AW, Weissman IL, Benoist C, Mathis D. Memory T and memory B cells share a transcriptional program of self-renewal with long-term hematopoietic stem cells. Proc Natl Acad Sci U S A 2006;103:3304–9.
41. Good-Jacobson KL. Strength in diversity: phenotypic, functional, and molecular heterogeneity within the memory B cell repertoire. Immunol Rev 2018;284:67–78.
42. Ochiai K, Maienschein-Cline M, Simonetti G, Chen J, Rosenthal R, Brink R, et al. Transcriptional regulation of germinal center B and plasma cell fates by dynamical control of IRF4. Immunity 2013;38:918–29.
43. Laidlaw BJ, Cyster JG. Transcriptional regulation of memory B cell differentiation. Nat Rev Immunol 2021;21:209–20.
44. Zhang Y, Garcia-Ibanez L, Ulbricht C, Lok LSC, Pike JA, Mueller-Winkler J, et al. Recycling of memory B cells between germinal center and lymph node subcapsular sinus supports affinity maturation to antigenic drift. Nat Commun 2022;13:2460.
45. Mesin L, Schiepers A, Ersching J, Barbulescu A, Cavazzoni CB, Angelini A, et al. Restricted clonality and limited germinal center reentry characterize memory B cell reactivation by boosting. Cell 2020;180:92–106.e11.
46. Reusch L, Angeletti D. Memory B-cell diversity: from early generation to tissue residency and reactivation. Eur J Immunol 2023;53:2250085.
47. Palm A-KE, Henry C. Remembrance of things past: long-term B cell memory after infection and vaccination. Front Immunol 2019;10:1787.
48. Manara F, Jay A, Odongo GA, Mure F, Maroui MA, Diederichs A, et al. Epigenetic alteration of the cancer-related gene TGFBI in B cells infected with Epstein–Barr virus and exposed to aflatoxin B1: potential role in Burkitt lymphoma development. Cancers (Basel) 2022;14:1284.
49. Hernandez-Vargas H, Gruffat H, Cros MP, Diederichs A, Sirand C, Vargas-Ayala RC, et al. Viral driven epigenetic events alter the expression of cancer-related genes in Epstein-Barr-virus naturally infected Burkitt lymphoma cell lines. Sci Rep 2017;7:5852.
50. Zhang T, Ma J, Nie K, Yan J, Liu Y, Bacchi CE, et al. Hypermethylation of the tumor suppressor gene PRDM1/Blimp-1 supports a pathogenetic role in EBV-positive Burkitt lymphoma. Blood Cancer J 2014;4:e261.
51. Stanland LJ, Luftig MA. The role of EBV-induced hypermethylation in gastric cancer tumorigenesis. Viruses 2020;12:1222.
52. Sinclair AJ. Could changing the DNA methylation landscape promote the destruction of Epstein-Barr virus-associated cancers? Front Cell Infect Microbiol 2021;11:695093.
53. Ghosh Roy S, Robertson ES, Saha A. Epigenetic impact on EBV associated B-cell lymphomagenesis. Biomolecules 2016;6:46.
54. Saha A, Jha HC, Upadhyay SK, Robertson ES. Epigenetic silencing of tumor suppressor genes during in vitro Epstein–Barr virus infection. Proc Natl Acad Sci U S A 2015;112:E5199–207.
55. Asmar F, Punj V, Christensen J, Pedersen MT, Pedersen A, Nielsen AB, et al. Genome-wide profiling identifies a DNA methylation signature that associates with TET2 mutations in diffuse large B-cell lymphoma. Haematologica 2013;98:1912–20.
56. Rosikiewicz W, Chen X, Dominguez PM, Ghamlouch H, Aoufouchi S, Bernard OA, et al. TET2 deficiency reprograms the germinal center B cell epigenome and silences genes linked to lymphomagenesis. Sci Adv 2020;6:eaay5872.
57. Wright G, Tan B, Rosenwald A, Hurt EH, Wiestner A, Staudt LM. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc Natl Acad Sci U S A 2003;100:9991–6.
58. Scott DW, Wright GW, Williams PM, Lih C-J, Walsh W, Jaffe ES, et al. Determining cell-of-origin subtypes of diffuse large B-cell lymphoma using gene expression in formalin-fixed paraffin-embedded tissue. Blood 2014;123:1214–7.
59. Yan W-H, Jiang X-N, Wang W-G, Sun Y-F, Wo Y-X, Luo Z-Z, et al. Cell-of-Origin subtyping of diffuse large B-cell lymphoma by using a qPCR-based gene expression assay on formalin-fixed paraffin-embedded tissues. Front Oncol 2020;10:803.
60. Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, et al. Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 2014;30:1363–9.
61. Peters TJ, Buckley MJ, Statham AL, Pidsley R, Samaras K, V Lord R, et al. De novo identification of differentially methylated regions in the human genome. Epigenetics Chromatin 2015;8:6.
62. Simpson JT, Workman RE, Zuzarte PC, David M, Dursi LJ, Timp W. Detecting DNA cytosine methylation using nanopore sequencing. Nat Methods 2017;14:407–10.
63. Park Y, Wu H. Differential methylation analysis for BS-seq data under general experimental design. Bioinformatics 2016;32:1446–53.
64. Abante J, Kambhampati S, Feinberg AP, Goutsias J. Estimating DNA methylation potential energy landscapes from nanopore sequencing data. Sci Rep 2021;11:21619.
65. Gu Z, Hübschmann D. rGREAT: an R/bioconductor package for functional enrichment on genomic regions. Bioinformatics 2023;39:btac745.

This open access article is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.

Supplementary data