Abstract
While cancers evolve during disease progression and in response to therapy, temporal dynamics remain difficult to study in humans due to the lack of consistent barcodes marking individual clones in vivo. We employ mitochondrial single-cell assay for transposase-accessible chromatin with sequencing to profile 163,279 cells from 9 patients with chronic lymphocytic leukemia (CLL) collected across disease course and utilize mitochondrial DNA (mtDNA) mutations as natural genetic markers of cancer clones. We observe stable propagation of mtDNA mutations over years in the absence of strong selective pressure, indicating clonal persistence, but dramatic changes following tight bottlenecks, including disease transformation and relapse posttherapy, paralleled by acquisition of copy-number variants and changes in chromatin accessibility and gene expression. Furthermore, we link CLL subclones to distinct chromatin states, providing insight into nongenetic sources of relapse. mtDNA mutations thus mirror disease history and provide naturally occurring genetic barcodes to enable patient-specific study of cancer subclonal dynamics.
Single-cell multi-omic profiling of CLL reveals the utility of somatic mtDNA mutations as in vivo barcodes, which mark subclones that can evolve over time along with changes in accessible chromatin and gene expression profiles to capture dynamics of disease evolution.
See related commentary by Hilton and Scott, p. 2965.
This article is highlighted in the In This Issue feature, p. 2945
Introduction
Cancer is a clonal disease originating from a founder cell population and is typically characterized by occurrence of divergent subclones with different fitness over time through the acquisition of somatic mutations, epigenetic reprogramming, and changes in transcriptional state (1–3). When exposed to evolutionary bottlenecks in the form of systemic therapy or disease transformation, fitter clones with distinct genomic properties may emerge or be selected for, which form the basis for acquired therapeutic resistance (4, 5). Long-term studies of the clonal evolution of cancer cells with their phenotypic characterization on a molecular and genomic level are therefore pivotal for a better understanding of evolving biology and differential therapeutic susceptibilities within heterogeneous cancer cell populations.
Chronic lymphocytic leukemia (CLL) is an opportune setting for assessing clonal evolution due to ease of access to high purity tumor cells from blood or lymphoid tissue. This, combined with the long-term nature of the disease, often characterized by an initial period of watchful waiting and subsequent disease progression requiring intermittent therapies with eventual relapse, allows for sequential sampling. Indeed, the genetic heterogeneity and clonal dynamics of CLL have been well described based on somatic nuclear mutations during disease evolution and in response to treatment (4, 6). Although identification of mutations conferring resistance has the potential to provide insights into the mechanisms of relapse, one hurdle is the identification of nongenetic contributions to disease recurrence, of which there is growing evidence (7). Related to this shortcoming is the lack of bona fide markers or in vivo barcodes that allow individual subclones to be consistently tracked over time within a patient at the single-cell level. Although somatic nuclear mutations can be utilized, these are limited to known mutations, often requiring additional targeted amplification (8) and may be subject to selective pressure.
Mitochondrial DNA (mtDNA) mutations are located on a small 16.6 kB circular genome that is replicated independently from nuclear DNA. Because of the high copy number of mtDNA per cell, variant allele frequencies of mtDNA mutations (heteroplasmy) may have a wide dynamic range from <1% to 100% (homoplasmy). While mtDNA mutations at low heteroplasmy are likely functionally silent, there is mounting evidence that truncating mtDNA mutations can play a role in cancer (9, 10). Recent studies have demonstrated the ability to detect mtDNA mutations in single cells based on the Assay for Transposase-Accessible Chromatin using sequencing (mtscATAC-seq; refs. 11, 12). This technology leverages naturally occurring mtDNA mutations to mark subclonal populations, providing a complementary and unbiased approach to the temporal study of clonal evolution. Of note, mtscATAC-seq enables the pairing of clonal information based on mtDNA mutations and inferred chromosomal copy-number variants (CNV) with an accessible chromatin readout, thus allowing the characterization of nongenetic mechanisms of clonal evolution (13).
Here, we evaluate the chromatin and CNV changes as well as mtDNA mutations in serially collected peripheral blood (PB) samples from 9 patients with CLL across a span of up to 10 years and across different clinical scenarios. These include watchful waiting (W/W); chemoimmunotherapy with fludarabine, cyclophosphamide, and rituximab (FCR); targeted inhibition of Bruton tyrosine kinase (BTK) through ibrutinib; and immunotherapy in the form of reduced-intensity conditioning allogeneic transplantation (RIC allo-HSCT) that reflect the evolving therapeutic strategies for CLL (14). We also track CLL subclones during their progression to Richter syndrome (RS), the transformation to an aggressive lymphoma, which is associated with dismal prognosis (15). We observe mtDNA mutations follow natural disease evolution and their profiles are dramatically shaped by tight therapeutic bottlenecks, such as chemotherapy or RIC allo-HSCT, but remain largely unaffected by ibrutinib treatment. These changes in mtDNA mutations after chemoimmunotherapy are paralleled by acquisition of additional CNVs and chromatin changes that are associated with profound shifts in transcriptional state, thus demonstrating the contribution of genetic and nongenetic features to disease evolution and therapeutic resistance.
Results
mtscATAC-seq on Serially Collected Samples Representative of the Clinical Spectrum of CLL
We performed mtscATAC-seq on 26 serial samples [24 PB, 1 bone marrow (BM), 1 lymph node (LN)] collected from 9 patients with CLL at time of active disease [median white blood cell (WBC) count 43.8 × 109/L, range 6.7–607 × 109/L; median %CD19+ CD5+ 99.5, range 42.4–100, where available; Fig. 1A; Supplementary Table S1]. Twenty-four of 26 samples were also coanalyzed using single-cell RNA sequencing (scRNA-seq). The cohort included representative scenarios of CLL disease course such as W/W (CLL1), chemotherapy with FCR (CLL1–3), FCR followed by RIC allo-HSCT (CLL4–6), treatment with ibrutinib (CLL6–8), and transformation to RS (CLL9). In total, we obtained single-cell ATAC-seq (scATAC-seq) profiles from 163,279 high-quality cells [transcriptional start site (TSS) enrichment >4; unique nuclear fragments >1,000; median cells per sample 5,425 (range 501–29,833); Supplementary Fig. S1A–S1C; Supplementary Table S1]. While scATAC-seq profiles of non-CLL immune cells from CLL1–9 cells were highly similar, each CLL formed a distinct cluster distinguished by 41,384 marker peaks (Fig. 1B), illustrating patient-specific chromatin states. For mtDNA analysis, 97,690 cells were used (mean >10–20× coverage per base). To extend our analysis of mtDNA mutations in CLL, we also included two previously reported mtscATAC-seq CLL profiles (CLLA and CLLB) for a total of 109,118 cells (11). CLL cells could be distinguished from physiologic B cells based on chromatin accessibility profiles. The estimated median CLL content of the CLL/B-cell compartment was 99% (range, 66%–100%) across samples (Supplementary Fig. S1D–S1I).
Longitudinal accessible chromatin and mtDNA mutation dynamics across a cohort of 9 patients with CLL. A, Circulating WBC counts of 9 CLL patients across disease course. Black dots, time points at which mtscATAC-seq samples were obtained. Day 0, day of diagnosis. Treatments are indicated. B, Distinct accessible chromatin profiles in 163,279 CLL and non-CLL cells across 9 study subjects. UMAP, Uniform Manifold Approximation and Projection. C, mtDNA mutations detected in CLL cells from 26 samples collected at the time of detectable circulating disease. Available samples from CLL9 at time of transformation to RS (LN, PB, and BM; n = 3) are pooled. Colors of bars indicate individual patients. mtscATAC-seq profiles from a previously published dataset originating from two additional CLL patients (CLLA and CLLB) are included (11). D, Recurrent mtDNA mutations (black) in the CLL cell compartment from 11 patients with CLL. Gray dots, nonrecurrent mtDNA mutations (presence in <4 patients or in <1% of cells). Magenta boxed mutations mark known frequent mtDNA variants. E and F, Number of mtDNA mutations detected per cell and percentage of cells with detectable mtDNA mutations in T cells, CLL cells, and monocytes across the samples. Statistical testing was performed using Wilcoxon signed-rank test. G, mtDNA mutations enriched in T cells, CLL cells, and monocytes individually or as combinations. The majority of mutations (n = 457) was equally distributed among all three compartments. Inset shows distribution of percentage monocytes (green), T cells (blue), or B cells (orange) marked by mtDNA mutations enriched in either cell compartment. Gray, distribution of percentage cells not marked by enriched mtDNA mutations.
Longitudinal accessible chromatin and mtDNA mutation dynamics across a cohort of 9 patients with CLL. A, Circulating WBC counts of 9 CLL patients across disease course. Black dots, time points at which mtscATAC-seq samples were obtained. Day 0, day of diagnosis. Treatments are indicated. B, Distinct accessible chromatin profiles in 163,279 CLL and non-CLL cells across 9 study subjects. UMAP, Uniform Manifold Approximation and Projection. C, mtDNA mutations detected in CLL cells from 26 samples collected at the time of detectable circulating disease. Available samples from CLL9 at time of transformation to RS (LN, PB, and BM; n = 3) are pooled. Colors of bars indicate individual patients. mtscATAC-seq profiles from a previously published dataset originating from two additional CLL patients (CLLA and CLLB) are included (11). D, Recurrent mtDNA mutations (black) in the CLL cell compartment from 11 patients with CLL. Gray dots, nonrecurrent mtDNA mutations (presence in <4 patients or in <1% of cells). Magenta boxed mutations mark known frequent mtDNA variants. E and F, Number of mtDNA mutations detected per cell and percentage of cells with detectable mtDNA mutations in T cells, CLL cells, and monocytes across the samples. Statistical testing was performed using Wilcoxon signed-rank test. G, mtDNA mutations enriched in T cells, CLL cells, and monocytes individually or as combinations. The majority of mutations (n = 457) was equally distributed among all three compartments. Inset shows distribution of percentage monocytes (green), T cells (blue), or B cells (orange) marked by mtDNA mutations enriched in either cell compartment. Gray, distribution of percentage cells not marked by enriched mtDNA mutations.
Within CLL cells, we detected a total of 516 mtDNA mutations across the 28 samples (median 38 mutations per sample; range, 6–171; Supplementary Table S2), with each patient displaying a largely distinct mutational profile (Fig. 1C). We confirmed a strong enrichment of T>C and C>T base substitutions among these mutations, verifying our experimental and computational workflow (Supplementary Fig. S2A; ref. 11). We found no association between the number of mutations detected and the number of CLL cells sequenced or age at diagnosis (Supplementary Fig. S2B and S2C). In addition, samples with fewer mtDNA mutations did not show a more skewed distribution as measured by the normalized Shannon index (Supplementary Fig. S2D). Ten mtDNA mutations were found in more than three subjects, all identifiable in <1% of cells at <1% mean (pseudobulk) heteroplasmy (Fig. 1D; Supplementary Fig. S2E). These recurring mutations at low frequency were not unexpected, and were in keeping with the relatively small mitochondrial genome and its 10-fold higher mutation rate compared with genomic DNA (16). In fact, three of these mutations (64C>T, 16390G>A, 709G>A) have been reported as frequently occurring variants, and none are known to be pathogenic or have a functional role (17). However, 3244G>A has been associated with mitochondrial encephalopathy, lactic acidosis, and stroke-like episodes (MELAS) syndrome (18). One mutation within MT-ND1 (3412G>A) was detectable in >10% of cells in 2 patients but was not found in a previously reported cohort of 20 patients with CLL (19). Overall, we did not observe substantial evidence of specific and recurrent mtDNA mutations that could be directly implicated in CLL pathogenesis within our cohort. These results support a model where the observed somatic mtDNA mutations occur randomly to mark clonal lineages within human cells, as previously suggested (12).
When comparing mtDNA mutations between the T-cell, CLL, and monocyte compartments, we observed CLL cells to have the highest number of mtDNA mutations [median mutations per cell 0.59 vs. 1.36 vs. 1.0; P < 0.05 (pairwise comparisons; Wilcoxon signed-rank test)] and the highest percentage of cells with at least one detectable mutation [median percentage of cells with mutations 46.8% vs. 78.2% vs. 62.6%; P < 0.01 (pairwise comparisons; Wilcoxon signed-rank test); Fig. 1E and F]. This difference is likely explained by a higher mtDNA coverage in CLL cells (median 38×, range 16–103) than in T cells (median 31×, range 15–60) or monocytes (median 34×, range 10–51). While 457 of 518 mutations (88.2%) were equally distributed among these three compartments, only 11 (2.1%), 25 (4.8%), and 9 (1.7%) were enriched in either T cells, CLL cells, or monocytes, respectively [Padj <0.05 (pairwise comparisons; Fisher exact test); Fig. 1G]. Mutations that were equally distributed across these tissue compartments were typically detectable in <1% cells. Enrichment of mtDNA mutations in CLL cells was most pronounced with some mutations detectable in up to 99% of cells, while such mutations were least enriched in monocytes (Fig. 1G, inset), consistent with a lack of antigen-driven clonal expansion (Supplementary Fig. S2F–S2H).
Long-term Tracking of mtDNA Mutations and Chromatin Remodeling
To assess the stability of mtDNA mutations within CLL cells over a prolonged disease course and to associate their kinetics with chromatin states, we focused first on CLL1, from whom we obtained mtscATAC-seq profiles of circulating CLL samples over a period of 10 years. Two of five serially collected samples were procured across a 6-year treatment-free W/W period (W/W-1 and W/W-2), during which elevated WBC counts remained relatively stable before entering a phase of exponential growth requiring the initiation of FCR chemotherapy (pre-FCR). CLL1 experienced disease relapse after a 3-year remission, for which we profiled two further samples (remission and relapse; Fig. 2A).
mtDNA mutations as long-term markers of disease evolution. A, Clinical information and mean heteroplasmy of mtDNA mutations at five time points (CLL1). Sixteen of 43 identified mtDNA mutations (mean heteroplasmy >0.01%) are displayed. For a complete list, see Supplementary Table S2. UMAP, Uniform Manifold Approximation and Projection. B, UMAP plots of single CLL cells, clustered on the basis of chromatin accessibility profiles, demonstrating distinct clusters based on time points (top). The heteroplasmy of 15261G>A, 7053G>A, 9144C>A, and 4966G>A are shown in the individual panels and segregate in distinct cell clusters. C, Trajectory of subclonal structure inferred from mean heteroplasmy of mtDNA mutations and copy-number changes, derived from scATAC-seq data (see Supplementary Fig. S3). D and E, Chromatin accessibility of differentially expressed peaks and gene activity scores in CLL cells during watch and wait (W/W-1/2), before FCR chemotherapy (pre-FCR), or at relapse marked by 7053G>A, 9144C>A, or other mutations. F, Integrated scRNA-seq expression counts of RGS1, TNIK, and ICOS projected onto the CLL1 scATAC-seq UMAP.
mtDNA mutations as long-term markers of disease evolution. A, Clinical information and mean heteroplasmy of mtDNA mutations at five time points (CLL1). Sixteen of 43 identified mtDNA mutations (mean heteroplasmy >0.01%) are displayed. For a complete list, see Supplementary Table S2. UMAP, Uniform Manifold Approximation and Projection. B, UMAP plots of single CLL cells, clustered on the basis of chromatin accessibility profiles, demonstrating distinct clusters based on time points (top). The heteroplasmy of 15261G>A, 7053G>A, 9144C>A, and 4966G>A are shown in the individual panels and segregate in distinct cell clusters. C, Trajectory of subclonal structure inferred from mean heteroplasmy of mtDNA mutations and copy-number changes, derived from scATAC-seq data (see Supplementary Fig. S3). D and E, Chromatin accessibility of differentially expressed peaks and gene activity scores in CLL cells during watch and wait (W/W-1/2), before FCR chemotherapy (pre-FCR), or at relapse marked by 7053G>A, 9144C>A, or other mutations. F, Integrated scRNA-seq expression counts of RGS1, TNIK, and ICOS projected onto the CLL1 scATAC-seq UMAP.
In total, we identified 43 mtDNA mutations across all cell types with mean pseudobulk heteroplasmy ranging from 0.0001% to 98.7%. These mutations each followed distinct patterns over time, suggesting their ability to mark clonal substructure. The mutation 15261G>A, not known to be pathogenic, was detectable at high mean heteroplasmy in all samples (>95% in CLL cells and >30% in T cells and monocytes), indicating it to broadly mark hematopoietic cells. In contrast, all other detected mutations followed dynamic patterns: 7053G>A had a stably high mean heteroplasmy among CLL cells during W/W and at the time of accelerated growth (>8%) but disappeared after chemotherapy. Other mutations appeared only during exponential growth before therapy at low frequency [i.e., 15215G>A (1.3% mean heteroplasmy), 16065G>A (2.3%), 9144C>A (4.3%), 1119T>C (6.3%); Fig. 2B and C]. Among the mtDNA mutations that appeared before FCR, CLL cells shared 7053G>A and 1119T>C, but presence of 7053G>A rarely co-occurred with 16065G>A and 9144C>A (Supplementary Fig. S3A–S3E). Of these emerging mutations, only cells carrying the 9144C>A mutation survived FCR therapy and reemerged at relapse (32.5% mean heteroplasmy in CLL cells) along with partial loss of chromosome 6q, which was also detected through clinical karyotyping (Table 1), and an additional novel mutation (4966G>A). This latter mutation was present virtually exclusively within cells that also had the relapse-dominating 9144C>A mutation (Supplementary Fig. S3F–S3H), suggesting this to be a marker of a diverging subclone that emerged within the relapse population.
Molecular and clinical karyotypic characterization of CLL1–9
CLL . | Age . | Sex . | IGHV . | Cytogenetics before FCR . | Cytogenetics at relapse post-FCR . | Karyotype before FCR . | Karyotype at relapse post-FCR . | ||
---|---|---|---|---|---|---|---|---|---|
1 | 37 | M | Unmutated | tri(12) | tri(12), del(6) | n/a | 47,XY,t(2;14)(p15;q32),+12,t(14;19)(q32;q13.3)[11]/ | ||
47,sl,del(6)(q13q27)[2]/ | |||||||||
46,XY,del(20)(q11.2q13.3)[5]/ | |||||||||
46,XY,t(7;8)(p14;q11.2)[1]/46,XY[1] | |||||||||
2 | 37 | M | Unmutated | del(13q), del(p53) | del(13q), del(p53) | 46,XYnuc ish 13q14.3 (D13S319 × 1),17p13.1(p53 × 1) | 45,XY,der(17;18)(q10;q10)[cp6]/46,XY[cp14]nuc ish(ATMx2,P53 × 1)[48/200],(D12Z3 × 2,D13S319 × 1,LSI13q34 × 2)[55/200]/(D12Z3 × 2,D13S319 × 0,LSI13q34 × 2)[138/200] | ||
3 | 58 | M | Unmutated | Normal | Normal | 46,XY | 46,XY | ||
CLL | Age | Sex | IGHV | Cytogenetics before FCR | Cytogenetics at relapse post–allo-HSCT | Karyotype before FCR | Karyotype at relapse post–allo-HSCT | ||
4 | 54 | F | Unmutated | del(11q) | del(17p) | nuc ish(CCND1,IGH) x2[100], (ATMx1,P53 × 2)[93/100] | 45,X,-X[7]/45,X,-X,inv(3)(p21p25)[7]/ | ||
45,X,-X,inv(3),t(1;6)(p34;p23)[cp4]/ | |||||||||
45,X,-X,t(1;5)(p36;p13)[cp2] | |||||||||
nuc ish(CCND1,IGH)x2[200], | |||||||||
(ATMx2,TP53 × 1)[155/200], | |||||||||
(D12Z3 × 2,D13S319 × 2,LSI13q34 × 2)[200], | |||||||||
(DXZ1 × 2)[2/200]/(DXZ1 × 1)[131/200]/ | |||||||||
(DXZ1,DYZ1) x1[67/200] | |||||||||
5 | 59 | F | Unmutated | der(6)t(6;17), del(13q), | del(13q) | 45,XX,der(6)t(6;17)(q2?1;q21),-17[6]/46,XX[cp14] | nuc ish(DXZ1 × 2)[12/200]/(DXZ1,DYZ1) x 1[92/200] | ||
del(17p) | nuc ish (D13S319 × 1)[10/100], | nuc ish(DXZ1 × 1)[96/200] nuc ish(DYZ1 × 0), | |||||||
(P53 × 1)[15/100], | (D13S319 × 1),(LSI13q34 × 2)[13/100] | ||||||||
(ATM,D12Z3,LSI13q34,) x 2[100] | nuc ish(DYZ1 × 0),(P53 × 2)[100] | ||||||||
6 | 40 | M | Mutated | del(14q) | del(14q),der(13;17) | nuc ish(P53 × 1)[7/100],(CCND1,ATM,D12Z3,D13S319,LSI13q34,IGH) x2[100] | 45,XY,der(13;17)(q10;q10),del(14)(q22q32)[cp5] | ||
nuc ish(DXZ1,DYZ1) x1[45/200]// | |||||||||
(DXZ1 × 2)[155/200] | |||||||||
CLL | Age | Sex | IGHV | Cytogenetics before Ibrutinib | |||||
7 | 52 | F | Unmutated | del(13q) | |||||
8 | 65 | M | n/a | del(13q) | |||||
CLL | Age | Sex | IGHV | Cytogenetics CLL | Cytogenetics RS | Karyotype CLL | Karyotype RS | ||
9 | 56 | M | Mutated | del(17p) | del(17p), | nuc ish(TP53 × 1)[107/200], | 44,XY,t(3;4)(q28;q25), | ||
t(8;22) | (ATM,D12Z3,DLEU1, DLEU2,TFDP1)x2[200] | der(8)t(8;17)(p11.2;q11,2)t(8;22)(q24;q11), -9, add(14)(q24),-17,add(20)(q13.3), | |||||||
nuc ish(CCND1, IGH)x2[200] | der(22)t(8;22)[cp7]/78–88<4n>, | ||||||||
idemx2,ider(3)(q10)add(3)(q27)[cp9]/46,XY[4] | |||||||||
CLL | Age | Sex | IGHV | Cytogenetics | Karyotype | ||||
A | 84 | M | n/a | tri(12), del(13q14.3) | nuc ish(MYB x 2)[200] nuc ish(ATM x 2)[200] nuc ish(D12Z3 × 3)[82/200] | ||||
nuc ish(D13S319,D13S25 × 1,LAMP 1 × 2)[24/200] nuc ish(CCND1, IGH) × 2[200] | |||||||||
nuc ish(TP53,D17Z1)[200] | |||||||||
B | 45 | F | Mutated | del(13q) | n/a |
CLL . | Age . | Sex . | IGHV . | Cytogenetics before FCR . | Cytogenetics at relapse post-FCR . | Karyotype before FCR . | Karyotype at relapse post-FCR . | ||
---|---|---|---|---|---|---|---|---|---|
1 | 37 | M | Unmutated | tri(12) | tri(12), del(6) | n/a | 47,XY,t(2;14)(p15;q32),+12,t(14;19)(q32;q13.3)[11]/ | ||
47,sl,del(6)(q13q27)[2]/ | |||||||||
46,XY,del(20)(q11.2q13.3)[5]/ | |||||||||
46,XY,t(7;8)(p14;q11.2)[1]/46,XY[1] | |||||||||
2 | 37 | M | Unmutated | del(13q), del(p53) | del(13q), del(p53) | 46,XYnuc ish 13q14.3 (D13S319 × 1),17p13.1(p53 × 1) | 45,XY,der(17;18)(q10;q10)[cp6]/46,XY[cp14]nuc ish(ATMx2,P53 × 1)[48/200],(D12Z3 × 2,D13S319 × 1,LSI13q34 × 2)[55/200]/(D12Z3 × 2,D13S319 × 0,LSI13q34 × 2)[138/200] | ||
3 | 58 | M | Unmutated | Normal | Normal | 46,XY | 46,XY | ||
CLL | Age | Sex | IGHV | Cytogenetics before FCR | Cytogenetics at relapse post–allo-HSCT | Karyotype before FCR | Karyotype at relapse post–allo-HSCT | ||
4 | 54 | F | Unmutated | del(11q) | del(17p) | nuc ish(CCND1,IGH) x2[100], (ATMx1,P53 × 2)[93/100] | 45,X,-X[7]/45,X,-X,inv(3)(p21p25)[7]/ | ||
45,X,-X,inv(3),t(1;6)(p34;p23)[cp4]/ | |||||||||
45,X,-X,t(1;5)(p36;p13)[cp2] | |||||||||
nuc ish(CCND1,IGH)x2[200], | |||||||||
(ATMx2,TP53 × 1)[155/200], | |||||||||
(D12Z3 × 2,D13S319 × 2,LSI13q34 × 2)[200], | |||||||||
(DXZ1 × 2)[2/200]/(DXZ1 × 1)[131/200]/ | |||||||||
(DXZ1,DYZ1) x1[67/200] | |||||||||
5 | 59 | F | Unmutated | der(6)t(6;17), del(13q), | del(13q) | 45,XX,der(6)t(6;17)(q2?1;q21),-17[6]/46,XX[cp14] | nuc ish(DXZ1 × 2)[12/200]/(DXZ1,DYZ1) x 1[92/200] | ||
del(17p) | nuc ish (D13S319 × 1)[10/100], | nuc ish(DXZ1 × 1)[96/200] nuc ish(DYZ1 × 0), | |||||||
(P53 × 1)[15/100], | (D13S319 × 1),(LSI13q34 × 2)[13/100] | ||||||||
(ATM,D12Z3,LSI13q34,) x 2[100] | nuc ish(DYZ1 × 0),(P53 × 2)[100] | ||||||||
6 | 40 | M | Mutated | del(14q) | del(14q),der(13;17) | nuc ish(P53 × 1)[7/100],(CCND1,ATM,D12Z3,D13S319,LSI13q34,IGH) x2[100] | 45,XY,der(13;17)(q10;q10),del(14)(q22q32)[cp5] | ||
nuc ish(DXZ1,DYZ1) x1[45/200]// | |||||||||
(DXZ1 × 2)[155/200] | |||||||||
CLL | Age | Sex | IGHV | Cytogenetics before Ibrutinib | |||||
7 | 52 | F | Unmutated | del(13q) | |||||
8 | 65 | M | n/a | del(13q) | |||||
CLL | Age | Sex | IGHV | Cytogenetics CLL | Cytogenetics RS | Karyotype CLL | Karyotype RS | ||
9 | 56 | M | Mutated | del(17p) | del(17p), | nuc ish(TP53 × 1)[107/200], | 44,XY,t(3;4)(q28;q25), | ||
t(8;22) | (ATM,D12Z3,DLEU1, DLEU2,TFDP1)x2[200] | der(8)t(8;17)(p11.2;q11,2)t(8;22)(q24;q11), -9, add(14)(q24),-17,add(20)(q13.3), | |||||||
nuc ish(CCND1, IGH)x2[200] | der(22)t(8;22)[cp7]/78–88<4n>, | ||||||||
idemx2,ider(3)(q10)add(3)(q27)[cp9]/46,XY[4] | |||||||||
CLL | Age | Sex | IGHV | Cytogenetics | Karyotype | ||||
A | 84 | M | n/a | tri(12), del(13q14.3) | nuc ish(MYB x 2)[200] nuc ish(ATM x 2)[200] nuc ish(D12Z3 × 3)[82/200] | ||||
nuc ish(D13S319,D13S25 × 1,LAMP 1 × 2)[24/200] nuc ish(CCND1, IGH) × 2[200] | |||||||||
nuc ish(TP53,D17Z1)[200] | |||||||||
B | 45 | F | Mutated | del(13q) | n/a |
Abbreviations: F, female; M, male; n/a, not available.
We wondered whether the two dynamic mutations 7053G>A (contraction at relapse) and 9144C>A (expansion at relapse) could mark CLL populations with different chromatin states. Indeed, cells bearing 7053G>A or 9144C>A did not colocalize on the Uniform Manifold Approximation and Projection (UMAP) plot derived from scATAC-seq profiles and marked distinct populations (Fig. 2B). Differential analysis of chromatin accessibility showed higher accessibility of transcription factor motifs associated with B-cell development (SPI1, REL, RELA, SPIC) in cells marked by 7053G>A than those marked by 9144C>A (Supplementary Fig. S3I). Notably, CLL cells marked by 9144C>A before FCR had chromatin peaks that were already more similar with cells at relapse than those marked by 7053G>A (Fig. 2D). These differences in chromatin peaks led to higher gene activity scores of genes previously described to play a role in B-cell malignancies such as RGS1 (20), TNIK (21), and ICOS (22) in CLL cells marked by 9144C>A (Fig. 2E). Indeed, we observed higher expression of these genes at relapse through the integration of matched scRNA-seq profiles of CLL1 cells into the scATAC-seq space (Fig. 2F). Finally, we wondered whether clonal dynamics inferred with mtDNA mutations would be consistent with analyses based on somatic nuclear mutations from matched whole-exome sequencing (WES) of the same tumors. Reanalysis of WES data (23) similarly showed that a number of therapy-sensitive clones were eradicated by FCR, while therapy-resistant clones expanded at relapse (Supplementary Fig. S3J). WES also confirmed the acquisition of del(6q) at relapse (Supplementary Fig. S3K and S3L).
Our analysis revealed that the dynamics of mtDNA mutations mirrored disease evolution in CLL1 over a period of 10 years and marked putative subclones with distinct chromatin profiles and differential therapeutic sensitivity. Furthermore, single-cell chromatin and mtDNA profiles demonstrated continued evolution and selection of a therapy-resistant phenotype.
Dynamics of mtDNA Mutations Following Chemotherapy and Stem Cell Transplantation
Having observed evidence of selective pressure exercised by natural disease history or therapeutic intervention in CLL1, we sought to determine whether consistent patterns of change in mtDNA mutations and chromatin profiles during clinical inflection points could also be detected in other patients with CLL. To this end, we evaluated the mtDNA mutation profiles of 2 additional patients undergoing FCR chemotherapy (CLL2–3) and 3 patients treated with FCR followed by RIC allo-HSCT (CLL4–6, see Table 1). We first focused on CLL4, who relapsed 6 years after allo-HSCT, and asked whether this longer disease course was associated with even more extreme changes in mtDNA mutations than in CLL1. Indeed, while we observed that the 16247G>A mutation (mean heteroplasmy of 30.5% pre-FCR) was almost undetectable at relapse (mean heteroplasmy <0.2%), two other mutations (6426G>A, 16290C>T) that were initially almost undetectable reached high levels of mean heteroplasmy at relapse (50.7%, 43.3%). At the same time, 3538G>A continued to be detectable in practically all CLL cells with a mean heteroplasmy of almost 100% but was absent from T cells or monocytes, indicating this to be a CLL cell lineage–specific mutation (Fig. 3A). We observed a uniformly detectable chromosome 4p deletion in all CLL cells at both time points, which, along with the occurrence of the 3538G>A mutation, seems to have been an early genetic change in the disease history of this patient. Consistent with the profound shift in the detection of 6426G>A and 16290C>T mutations at relapse, and the corresponding loss of 16247G>A, a chromosome 11q deletion was only detectable before FCR and a 17p deletion was present in most CLL cells at relapse but undetectable before FCR ( Fig. 3A; see arrows), consistent with the known cytogenetics (Fig. 3A and B; Table 1).
Selection pressure during intensive chemoimmunotherapy revealed by mtDNA and copy-number alterations. A, mtDNA mutations, inferred copy-number changes from scATAC-seq data, and differential chromatin peaks in CLL cells (CLL4) before chemoimmunotherapy with FCR and RIC allo-HSCT as well as at relapse. For the heat maps, only CLL cells with at least one detectable mtDNA mutation are shown. UMAP plots show cell populations and localization of cells marked by the 3538G>A, 16247A>G, 6426G>A, and 16290C>T mutations. For analogous analyses on CLL2 and 3, see Supplementary Fig. S4. Mono, monocyte. B, Schematic representation of the trajectory of subclonal structure inferred from mean heteroplasmy of mtDNA mutations and copy-number changes (CLL4). C, mtDNA mutations, inferred copy-number changes, and differential chromatin peaks of CLL cells (CLL5) before FCR chemotherapy and RIC allo-HSCT (pre-FCR) and at relapse (relapse post-RIC). For the heat maps, only CLL cells with at least one detectable mtDNA mutation are shown. D, Schematic representation of the trajectory of subclonal structure inferred from mean heteroplasmy of mtDNA mutations and copy-number changes (CLL5). E and F, Identification of CLL subclones 1 (2332C>T; blue) and 2 (5979G>A; yellow) in CLL5 with distinct chromatin states. G, Changes in chromatin peaks at relapse after FCR (CLL1–3) or FCR and RIC allo-HSCT (CLL4–6). H, Transcription factor motifs depleted from marker peaks shown in G at relapse. Colors indicate statistical significance of enrichment before treatment as Padj (FDR) ranging from 10−136 (blue) to 0 (white).
Selection pressure during intensive chemoimmunotherapy revealed by mtDNA and copy-number alterations. A, mtDNA mutations, inferred copy-number changes from scATAC-seq data, and differential chromatin peaks in CLL cells (CLL4) before chemoimmunotherapy with FCR and RIC allo-HSCT as well as at relapse. For the heat maps, only CLL cells with at least one detectable mtDNA mutation are shown. UMAP plots show cell populations and localization of cells marked by the 3538G>A, 16247A>G, 6426G>A, and 16290C>T mutations. For analogous analyses on CLL2 and 3, see Supplementary Fig. S4. Mono, monocyte. B, Schematic representation of the trajectory of subclonal structure inferred from mean heteroplasmy of mtDNA mutations and copy-number changes (CLL4). C, mtDNA mutations, inferred copy-number changes, and differential chromatin peaks of CLL cells (CLL5) before FCR chemotherapy and RIC allo-HSCT (pre-FCR) and at relapse (relapse post-RIC). For the heat maps, only CLL cells with at least one detectable mtDNA mutation are shown. D, Schematic representation of the trajectory of subclonal structure inferred from mean heteroplasmy of mtDNA mutations and copy-number changes (CLL5). E and F, Identification of CLL subclones 1 (2332C>T; blue) and 2 (5979G>A; yellow) in CLL5 with distinct chromatin states. G, Changes in chromatin peaks at relapse after FCR (CLL1–3) or FCR and RIC allo-HSCT (CLL4–6). H, Transcription factor motifs depleted from marker peaks shown in G at relapse. Colors indicate statistical significance of enrichment before treatment as Padj (FDR) ranging from 10−136 (blue) to 0 (white).
The analysis of the mtscATAC-seq profiles of CLL4 illustrated how samples separated by years of disease history may contain distinct mtDNA mutational profiles, suggesting a model of clonal succession. We repeatedly observed similar changes in mtDNA mutations across 4 more patients with mtscATAC-seq profiles available before treatment and at relapse after FCR chemotherapy and RIC allo-HSCT, although to different extents and with distinct dynamics (Supplementary Fig. S4A–S4C). For example, in CLL5, we observed how two mutations with high mean heteroplasmy 2332C>T (15.8%) and 5979G>A (6.6%) disappeared at relapse, while an additional copy of chromosome 7 was uniformly present across all CLL5 cells at relapse, illustrating the utility of combining the detection of distinct types of somatic mutation events to trace clonal structure over time (Fig. 3C and D). In CLL5, we were able to identify two subpopulations within CLL cells before treatment marked by two different mtDNA mutations, 2332C>T (clone 1) and 5979G>A (clone 2). These two subpopulations had detectable differences in accessibility of transcription factor motifs implicated in B-cell malignancies such as YY1, BCL11A, POU2F2, and SPI1 (24–27), and clone 1 harbored evidence of a 17p deletion that was absent in clone 2 (Fig. 3C–F). Chromatin accessibility of CD38 and ITGA4 (encoding for CD49d), two unfavorable prognostic markers in CLL, differed between these clones (Supplementary Fig. S4D–S4F).
Finally, we assessed the ability to identify common changes at the chromatin level at relapse following fludarabine-based treatment in CLL1–6. The CLL cells from all 6 patients demonstrated chromatin remodeling. With the exception of CLL5, a number of transcription factor motifs (SPIB, SPI1, BCL11B, BCL11A, and IRF1) were depleted from differentially accessible chromatin peaks at relapse, consistent with a less differentiated B-cell phenotype than prior to treatment (ref. 26; Fig. 3G and H; Supplementary Fig. S5A). Similarly, although corresponding scRNA-seq profiles were essentially unique for each CLL, they showed concordant expression changes of B-cell–associated genes such as downregulation of MEF2C or CD24 and upregulation of CXCR4 and RGS1/2 at relapse (Supplementary Fig. S5B–S5E), reinforcing the idea of defined selective pressure through therapeutic interventions. Altogether, these observations demonstrate how somatic events (CNVs and mtDNA mutations) in CLL cells demarcate subclones with distinct genomic properties within the pool of tumor cells and their evolution under profound selective pressure.
Dynamics of mtDNA Mutations under Ibrutinib Treatment
The targeted BTK inhibitor ibrutinib is known to deepen clinical responses over months to years (28), and we therefore compared changes of clonal cell populations as measured by mtDNA mutations in this setting with those seen in chemotherapy or allo-HSCT. We first focused on CLL6 with the longest follow-up on ibrutinib and from whom we obtained mtscATAC-seq profiles of PB at four time points: (i) before FCR and RIC allo-HSCT, (ii) at relapse after RIC allo-HSCT shortly before ibrutinib treatment, (iii) at the peak of ibrutinib-induced lymphocytosis 20 days after initiation, and (iv) at relapse after 656 days with mixed nodal and blood manifestation (Fig. 4A).
Stability of mtDNA mutations during ibrutinib treatment. A and B, mtDNA mutations, inferred copy-number changes from scATAC-seq data, and differential chromatin peaks in CLL cells (CLL6) (i) before FCR (pre-FCR), (ii) at relapse after RIC allo-HSCT (relapse post-RIC), (iii) after 20 days of ibrutinib treatment, as well as (iv) at relapse on ibrutinib. Arrows indicate copy-number changes unique to cells marked by 7205C>T at relapse on ibrutinib. UMAP plots show cell populations and localization of cells marked by 3526G>A, 3830T>C, 930G>A, and 7205C>T. In the heat maps, only CLL cells with at least one detectable mtDNA mutation are shown. Mono, monocyte. C, Percentage of CLL cells marked by mtDNA mutations during ibrutinib treatment in CLL cells in CLL6 (partial remission, PR with relapse), CLL7 (complete remission, CR), and CLL8 (stable disease, SD). D and E, Frequency of individual mtDNA mutations (D) and total number of mtDNA mutations (E) detected per sample in reanalysis of public bulk ATAC-seq profiles of CD19+ CD5+ CLL cells from 7 patients during ibrutinib treatment (29). F, Changes in abundance of mtDNA mutations detected in CLL cells relapsing after FCR/RIC allo-HSCT (CLL1–6) compared with CLL cells during ibrutinib treatment (CLL6–8). Statistical testing performed using Kolmogorov–Smirnov test.
Stability of mtDNA mutations during ibrutinib treatment. A and B, mtDNA mutations, inferred copy-number changes from scATAC-seq data, and differential chromatin peaks in CLL cells (CLL6) (i) before FCR (pre-FCR), (ii) at relapse after RIC allo-HSCT (relapse post-RIC), (iii) after 20 days of ibrutinib treatment, as well as (iv) at relapse on ibrutinib. Arrows indicate copy-number changes unique to cells marked by 7205C>T at relapse on ibrutinib. UMAP plots show cell populations and localization of cells marked by 3526G>A, 3830T>C, 930G>A, and 7205C>T. In the heat maps, only CLL cells with at least one detectable mtDNA mutation are shown. Mono, monocyte. C, Percentage of CLL cells marked by mtDNA mutations during ibrutinib treatment in CLL cells in CLL6 (partial remission, PR with relapse), CLL7 (complete remission, CR), and CLL8 (stable disease, SD). D and E, Frequency of individual mtDNA mutations (D) and total number of mtDNA mutations (E) detected per sample in reanalysis of public bulk ATAC-seq profiles of CD19+ CD5+ CLL cells from 7 patients during ibrutinib treatment (29). F, Changes in abundance of mtDNA mutations detected in CLL cells relapsing after FCR/RIC allo-HSCT (CLL1–6) compared with CLL cells during ibrutinib treatment (CLL6–8). Statistical testing performed using Kolmogorov–Smirnov test.
At initiation of ibrutinib (after relapse from FCR and RIC allo-HSCT), CLL6 was dominated by cells with a 17p deletion and two mtDNA mutations (3526G>A, 3830T>C), which had expanded from a minor clone (13.2% of CLL cells) before FCR. During ibrutinib treatment, mtDNA mutations remained stable: only negligible change was observed after 20 days of ibrutinib and both predominant mutations marked large fractions of cells [36.2% and 37.3% (3526G>A), 25.1% and 25.5% (3830T>C) in cells with ≥1 detectable mtDNA mutation]. At relapse, CLL cells marked by these two mutations diminished to 16.2% (3526G>A) and 12.4% (3830T>C), and distinct groups marked by two independent mutations became apparent in 6.8% (930G>A) and 5.8% (7205C>T) of cells. Although differences in chromatin accessibility were undetectable in CLL cells harboring either the decreased (3526G>A, 3830T>C) or increased mutations (930G>A, 7205C>T), we observed cells marked by 7205C>T to demonstrate additional copy-number changes in chromosomes 1, 3, and 13, indicating they represent a newly emerging relapse- associated subclone (Fig. 4A and B, arrows).
We also observed relatively stable mtDNA genotypes over the course of ibrutinib treatment in CLL7 and 8, whose best response was complete remission and stable disease, respectively (Fig. 4C). Ibrutinib did not appear to affect the T cells and monocytes in CLL6 and 7, as we did not observe any major shifts of mtDNA mutations in these populations (Supplementary Fig. S6A–S6C). Despite the different clinical responses in CLL6–8, a reduction in NFKB1 motif accessibility under ibrutinib treatment was observed in all 3 subjects (Supplementary Fig. S7A and S7B), consistent with recently reported data (29). Similarly, scRNA-seq profiles showed a consistent pattern of ibrutinib-induced changes in 20 differentially expressed genes, some of which have critical roles in CLL such as MS4A1 (30) and MIR155HG (ref. 31; Supplementary Fig. S7C and S7D). These ibrutinib-induced gene expression changes largely reversed in CLL6 at relapse, suggesting acquired treatment resistance to ibrutinib (Supplementary Fig. S7E and S7F).
To determine whether the observed clonal stability as captured via mtDNA mutations in CLL6–8 was generalizable, we took advantage of public bulk ATAC-seq profiles obtained from FACS-sorted CD19+CD5+ cells from 7 independent CLL patients, serially sampled before and during ibrutinib treatment (29). We similarly observed stability in the frequency of individual mtDNA mutations during 6 months of ibrutinib treatment (Fig. 4D). Some mtDNA mutations with low abundance showed small changes at approximately day 10 of ibrutinib treatment, which coincided with a transiently increased number of detectable mtDNA mutations (Fig. 4E) and mirrored the kinetics of ibrutinib-induced lymphocytosis. The stability of mtDNA mutations in this dataset was thus highly concordant with the dynamics observed in CLL6–8 and contrasted with the dramatic changes observed in CLL1–6 at relapse after chemoimmunotherapy (Fig. 4F). Taken together, ibrutinib and chemotherapy appear to induce distinct selective pressures, with ibrutinib inducing more uniform chromatin remodeling across CLL subclones.
Clonal Dynamics during CLL Transformation to RS
Although CLL is typically characterized as an indolent B-cell malignancy, a subset of patients can experience transformation into an aggressive lymphoma, a process known as RS. Because RS cells expand from a subclone of antecedent CLL in the majority of cases, we reasoned that they are enriched for distinct mtDNA mutations. CLL9 progressed to RS, with evidence of circulating disease, 20 months after the initial CLL diagnosis and in the absence of any intervening therapy. PB samples were procured early in this patient's CLL course and at the time of RS, as well as BM and LN specimens at the time of RS (Fig. 5A and B). Comparison of mtDNA mutations in CLL and RS demonstrated clonal outgrowth: within the CLL PB cells (CLL PB1 and PB2), 24% of cells contained the 3412G>A or 9553G>A mutations at a high heteroplasmy (>5%) and 4.7% of cells had the 2343G>A mutation also at high heteroplasmy, marking two distinct subclones (Fig. 5C; Supplementary Fig. S8A–S8D). Upon transformation to RS, cells with 2343G>A became undetectable, while those with 3412G>A or 9553G>A were observed in a higher proportion of Richter cells at comparable percentages in PB, BM, and LN (57.4%–64.2%; Fig. 5C). Consistent CNV changes were detected across these compartments, such as deletion of chromosomes 8p, 9q and 14q in addition to the 17p deletion already present at time of CLL (purple arrows, Fig. 5C), which were confirmed by WES. WES also revealed enrichment of subclones at RS compared with CLL and was consistent with our observations made using mtDNA mutations (Fig. 5C; Supplementary Fig. S8E–S8G). Within the LN, we also identified a subclone with additional copy-number changes of chromosomes 4 and 13 that were marked by high heteroplasmy of 3412G>A (92.7%) and 9553G>A (71.1%; green arrows, Fig. 5C).
Accessible chromatin and mtDNA mutational dynamics during transformation to RS. A, Circulating WBC count (CLL9). Analyzed samples are PB during CLL phase (CLL; black) and PB, BM, and LN at time of Richter transformation (Richter, magenta). B, UMAP plot of CLL, Richter, and immune cells based on chromatin profiles (n = 30,395), with identification of CD4+ T cells; CD8+ T cells; physiologic B cells; CLL cells from PB; and Richter cells from LN, PB, and BM. C, Heteroplasmy of mtDNA mutations and inferred copy-number changes from scATAC-seq data of cells in clusters identified in B. Only cells with at least one detectable mtDNA mutation are shown. D, Heteroplasmy of 3412G>A and 9553G>A in Richter cells. Coloring of dots indicates sequencing coverage (20–350×). Inset table shows number of cells with both, either one, or none of the mutations detectable. E, Distribution of heteroplasmy of 3412G>A (blue) or 9553G>A (red) in Richter cells. F, Differential chromatin accessibility of transcription factor motifs between CLL (CLL PB1, CLL PB2) and Richter cells (Richter LN1, Richter LN2, Richter PB, Richter BM). Precision limit for Padj value 2.6e-297. G, Chromatin accessibility of POU2F2 motif across clusters. H, Differential gene expression of scRNA-seq profiles of CLL cells from PB (CLL PB; black) and Richter cells from PB (Richter PB, magenta) or BM (Richter BM, magenta).
Accessible chromatin and mtDNA mutational dynamics during transformation to RS. A, Circulating WBC count (CLL9). Analyzed samples are PB during CLL phase (CLL; black) and PB, BM, and LN at time of Richter transformation (Richter, magenta). B, UMAP plot of CLL, Richter, and immune cells based on chromatin profiles (n = 30,395), with identification of CD4+ T cells; CD8+ T cells; physiologic B cells; CLL cells from PB; and Richter cells from LN, PB, and BM. C, Heteroplasmy of mtDNA mutations and inferred copy-number changes from scATAC-seq data of cells in clusters identified in B. Only cells with at least one detectable mtDNA mutation are shown. D, Heteroplasmy of 3412G>A and 9553G>A in Richter cells. Coloring of dots indicates sequencing coverage (20–350×). Inset table shows number of cells with both, either one, or none of the mutations detectable. E, Distribution of heteroplasmy of 3412G>A (blue) or 9553G>A (red) in Richter cells. F, Differential chromatin accessibility of transcription factor motifs between CLL (CLL PB1, CLL PB2) and Richter cells (Richter LN1, Richter LN2, Richter PB, Richter BM). Precision limit for Padj value 2.6e-297. G, Chromatin accessibility of POU2F2 motif across clusters. H, Differential gene expression of scRNA-seq profiles of CLL cells from PB (CLL PB; black) and Richter cells from PB (Richter PB, magenta) or BM (Richter BM, magenta).
We noted that the heteroplasmy of both 3412G>A and 9553G>A mutations in RS cells was highly correlated (R = 0.89; Fig. 5D), suggesting co-occurrence within the same clone. Notably, both mutations showed a wide range of heteroplasmy, which we cannot attribute to variation in coverage (Supplementary Fig. S8H). The distribution of heteroplasmy of both mutations was such that the maximum value was approximately 10% and fewer than 1% of cells had a heteroplasmy <1% (Fig. 5E). This analysis confirms that mtDNA mutations in monoclonal populations can have a wide range of heteroplasmy (12); mtDNA mutations thus behave more dynamically than somatic nuclear mutations due to processes such as asymmetrical segregation of mitochondria, relaxed replication, or mitophagy, which should be considered when interpreting longitudinal mtDNA kinetics.
Given the clear changes in mtDNA mutations and CNV associated with RS, we asked whether similarly profound chromatin and transcriptional changes could be detected. Analysis of chromatin accessibility revealed RS cells to have strong reduction of accessibility in motifs related to the AP-1 transcription factor complex and higher accessibility of POU2F transcription factor motifs (Fig. 5F and G), which are known to play an important role in B-cell lymphomagenesis (25). Likewise, scRNA-seq profiles of RS cells were quite distinct from those of antecedent CLL cells, with downregulation of genes such as JUNB, JUN, FOS, and FOSB and upregulation of B-cell–related genes such as LTB, CD24, and CD70 (Fig. 5H; Supplementary Fig. S8I). Of note, the RS cells in BM and PB were transcriptionally indistinguishable (Supplementary Fig. S8J) and matched our finding that the same mtDNA mutation profiles could be found between these two compartments. Our analyses thus revealed the dramatic changes in accessible chromatin and transcriptional profiles along with the acquisition of numerous CNVs in the transformation from CLL to RS.
Characteristic Chromatin Profiles and mtDNA Mutations in T-cell Populations
Because T cells undergo antigen-driven clonal expansion, we also evaluated whether mtDNA mutations from the 33,573 available scATAC-seq profiles of T cells could mark distinct T-cell subpopulations coexisting with CLL. The scATAC-seq profiles of T cells from PB and BM were highly similar across all 9 subjects but clearly differed from T cells isolated from LN, which were collected from RS (Fig. 6A and B). On the basis of gene activity scores, we identified CD4+ and CD8+ T-cell populations (Supplementary Fig. S9A). For CLL4, CLL6, and CLLA, we could identify six mutations that marked subsets of clonally expanded CD8+ T cells. In CLL4, 1918G>A marked donor-derived CD8+ T cells and was undetectable in CLL cells or host CD8+ T cells (Supplementary Fig. S9B). Similarly, in CLL6, 2647G>A and 10408T>C were only detectable in donor-derived CD8+ T cells (Fig. 6C; Supplementary Fig. S9C) and marked two distinct CD8+ T-cell clones with a shared chromatin state that expanded and contracted at different time points (Fig. 6D). Finally, in CLLA, three mtDNA mutations marked two CD8+ T-cell clones with distinct chromatin states (Fig. 6E–G; Supplementary Fig. S9D and S9E). 2464G>A and 4573T>C marked CD8+ T cells with higher chromatin accessibility of GNLY and GZMH, indicative of an effector memory phenotype, while 7762G>A marked cells with higher chromatin accessibility of IL7R, TCF7, NFATC1, and CD28, consistent with a naïve T-cell phenotype (32). Taken together, these six mtDNA mutations likely mark individual expanded T-cell clones with distinct specificities, thus demonstrating the potential of mtscATAC-seq to track phenotypically diverse cell populations.
T-cell phenotypes and clonal dynamics revealed through mtDNA mutations and accessible chromatin profiles. A and B, UMAP plots showing clustering of 33,573 scATAC-seq profiles of T cells from CLL1–9 isolated from PB, BM, or LN. C and D, Identification of two CD8+ T-cell clones marked by 2647G>A (clone 1; yellow) and 10408T>C (clone 2; brown) in CLL6 that expand at relapse after RIC allo-HSCT (clone 1) or after 21 months of ibrutinib treatment (clone 2). E–G, Identification of two CD8+ T-cell clones marked by 4573T>C and 2464G>A (clone 1; yellow) or 7762G>A (clone 2; magenta) in CLLA. Clones 1 and 2 show different chromatin accessibility of T cell–related genes associated with an effector memory (GZMH, GNLY) and naïve (TCF7, NFATC1, IL7R, and CD28) T-cell phenotype. H, Differential analysis of gene activity scores between T cells isolated from PB/BM (CLL1–9) and LN (CLL9 at time of RS). I, Browser tracks showing chromatin accessibility of the TOX gene across T-cell clusters (Supplementary Fig. S9A). Tfh, T follicular helper cell; Treg, regulatory T cell.
T-cell phenotypes and clonal dynamics revealed through mtDNA mutations and accessible chromatin profiles. A and B, UMAP plots showing clustering of 33,573 scATAC-seq profiles of T cells from CLL1–9 isolated from PB, BM, or LN. C and D, Identification of two CD8+ T-cell clones marked by 2647G>A (clone 1; yellow) and 10408T>C (clone 2; brown) in CLL6 that expand at relapse after RIC allo-HSCT (clone 1) or after 21 months of ibrutinib treatment (clone 2). E–G, Identification of two CD8+ T-cell clones marked by 4573T>C and 2464G>A (clone 1; yellow) or 7762G>A (clone 2; magenta) in CLLA. Clones 1 and 2 show different chromatin accessibility of T cell–related genes associated with an effector memory (GZMH, GNLY) and naïve (TCF7, NFATC1, IL7R, and CD28) T-cell phenotype. H, Differential analysis of gene activity scores between T cells isolated from PB/BM (CLL1–9) and LN (CLL9 at time of RS). I, Browser tracks showing chromatin accessibility of the TOX gene across T-cell clusters (Supplementary Fig. S9A). Tfh, T follicular helper cell; Treg, regulatory T cell.
For CLL9, we had the opportunity to compare T cells from the CLL phase with the RS phase of disease. LN-associated T cells (CLL9 at the time of RS) showed higher gene activity of TOX and immune checkpoint molecules such as PDCD1, TIGIT, CTLA4, and ICOS than T cells from PB (CLL1–9), consistent with the immunophenotype of follicular T helper cells (ref. 33; Fig. 6H and I). In particular, LN-associated CD8+ T cells (CLL9 at the time of RS) had higher gene scores of EOMES and GZMK, while CD8+ T cells from PB (CLL1–9) expressed TBX21 and GZMB, which may point to different effector memory populations (ref. 34; Supplementary Fig. S9F). LN-associated CD8+ T cells also highly expressed HAVCR2 and PDCD1 (Supplementary Fig. S9G). These observations are especially intriguing given that PD-1 inhibition has shown clinical activity in RS.
Discussion
Longitudinal tracking of distinct clones within heterogenous cell populations and linking clonal lineages to distinct functional states at the single-cell level have been the subject of numerous efforts aimed at understanding disease evolution and mechanisms leading to therapeutic resistance in a wide range of cancers. While in vitro and in vivo model systems have the potential to exploit technologies that introduce synthetic genetic markers into cells (35) and can provide high-resolution clonal tracking capabilities, in vivo human studies primarily rely on the presence and detection of naturally occurring barcodes. The most common approach to monitoring cancer subclones has been through the tracking of somatic nuclear mutations, although for many diseases, including CLL, these are often sparse events and frequently subject to selective pressures confounding the inferences that can be made from such mutations. For noncancer populations, such as B cells and T cells, B-cell receptor (BCR; ref. 36) and T-cell receptor (TCR) sequences (37, 38), respectively, have been commonly exploited for clone tracking. While such sequences cannot be extracted from scATAC-seq data, we confirm herein that mtDNA mutations lend themselves as natural markers of clonal lineage, and as a single entity, enable the tracking of multilineage populations from both cancer and noncancer fractions of tissues. We further demonstrate their promise in linking subclonal structures to distinct functional states combined with coevolving CNV changes.
Our study of mtscATAC-seq applied to 9 patients with CLL to longitudinally assess their clonal and accessible chromatin dynamics in the context of distinct treatment modalities revealed several notable new biologic insights. First, the extent to which cells carrying distinct mtDNA mutation profiles evolved with therapy appeared to reflect the degree of therapeutic bottleneck. We observed mutational stability over years in the absence of treatment but also saw dramatic changes in mtDNA mutations with disease acceleration, transformation, or relapse after strong therapeutic pressure such as chemoimmunotherapy and allo-HSCT. In contrast, ibrutinib therapy was associated with less dramatic changes of mtDNA genotypes. In specific instances, we also noted that mtDNA mutations behaved more dynamically than somatic nuclear mutations. For example, in CLL9, two mutations showed a wide range of heteroplasmy within RS cells. In CLL1 and CLL4, a high percentage of cells carried mtDNA mutations at baseline that were absent at relapse, suggesting vulnerability to prior treatment. Our data highlight the potential utility of mtDNA mutations for direct ex vivo tracking of clonal populations of human cells, but we also caution that the unique nature of mitochondrial genetics warrants further high-resolution exploration of mtDNA mutational dynamics to understand these processes in greater detail. The large mtDNA copy number, the diverse levels of heteroplasmy present among these mutations, the relaxed replication of mtDNA, mitophagy, and largely stochastic allocation to daughter cells during cell division are all processes that require further consideration when making inferences about clonal dynamics (39).
Second, we observed that mtDNA mutations can identify CLL subpopulations with distinct functional states, demonstrating how mtscATAC-seq can capture information on cell state, clonal lineage, and copy-number alterations at single-cell resolution. In the case of CLL1, we identified a specific mtDNA subclone at the time of disease acceleration that not only expanded following FCR chemotherapy, but had a distinct chromatin state more similar to relapse even prior to therapy. In keeping with parallel genomic and nongenomic evolution, we identified changes in CNVs that co-occurred with changes in mtDNA mutations and chromatin accessibility such as in CLL5, CLL6, and CLL9, thus strengthening the link between disease evolution and composition of mtDNA mutations within the CLL compartment. Given that mtDNA mutations are enriched in CLL cells, key open questions remain, such as at what stage during lymphoid development these occur, the impact of IGHV mutational status on mtDNA mutations, whether they are already detectable at the stage of lymphoid progenitor or even hematopoietic stem cell cells, or whether they are result of later events in the disease pathogenesis (40) and the relationship with prognosis.
We noted that chromatin profiles were dynamic in relation to disease history: In CLL9, Richter cells, compared with CLL cells, showed increased chromatin accessibility for POU transcription factor motifs, which was most pronounced in LN-associated cells and potentially reflects that POU2F2 (encoding Oct-2) is known to play a role in diffuse large B-cell lymphoma (DLBCL) and has even been proposed as a diagnostic marker (25, 41). Although underpowered to generalize, our study identified common changes at the level of chromatin accessibility and gene expression across 6 CLLs who had undergone fludarabine-based treatment. Transcription factor motifs involved in B cell development (SPIB, SPI1) were depleted at relapse, which speaks to the selection of a less differentiated CLL phenotype, in line with a recent study reporting enrichment of stem cell gene sets in CLL late relapse specimens (42). We also detected higher expression of CXCR4 at relapse, which is consistent with studies that have defined high CXCR4 expression as an adverse prognostic factor in DLBCL and CLL (43) and as a mechanism of acquired resistance to fludarabine/mafosfamide in an in vitro model (35). Altogether, these observations support the idea that despite marked heterogeneity across patients, defined therapeutic interventions lead to common principles of response and acquired resistance, thereby confirming other studies documenting convergent epigenomic evolution with acquired drug resistance (44). This justifies further research into epigenetic mechanisms of acquired resistance given that genomic somatic mutations only identify mechanisms of resistance in a proportion of patients (45) and that it is still unclear how many known recurrent somatic mutations drive complex changes in transcriptional state and resistance (46).
Third, although the majority of cells were CLL cells, analysis of chromatin profiles of single T cells was feasible and showed stark phenotypic differences between LN and PB. Furthermore, we could identify mtDNA mutations and associated chromatin profiles to define subclones of non-CLL immune cells. These analyses thus demonstrate their ability to mark physiologic cell subpopulations with diverging chromatin states.
Clonal tracking with mtscATAC-seq is an alternative to other DNA- and RNA-based single-cell approaches that utilize somatic nuclear mutations as natural barcodes. Compared with somatic nuclear mutations, mtDNA mutations have the advantage that a priori knowledge is not required for their detection. However, the exact extent to which mtDNA mutations contribute to tumorigenesis is not yet clearly established (9, 10, 16). In addition, they may be shared between cell compartments at different frequencies, which needs to be taken into account when inferring clonal lineage. The high copy number and small size of the mitochondrial genome allows mtscATAC-seq to read out mtDNA mutations in single cells at high coverage and avoids the necessity of additional targeted amplification steps while simultaneously providing information on cell states. Although scRNA-seq can provide higher resolution of cell states, genotyping using transcriptomic data has a far lower detection rate than mtscATAC-seq (8). Emerging multi-omic technologies for the combined capture of accessible chromatin, transcriptome, and protein expression along with sufficient mtDNA profiling are being developed to overcome current limitations and will surely enable the future improved detection of subtle but important differences in functional states between subclones identified by mtDNA mutations or other clonal markers (47, 48). Combined capture of mtDNA mutations and different types of nuclear mutations would further enhance the elucidation of both genomic and nongenomic evolution of malignant cells. Higher throughput and increased coverage of single cells in the future will increase the level of detection of mtDNA mutations, thereby improving the tracking of clones using mtDNA mutations in settings such as premalignant states or minimal residual disease. Our observations using mtscATAC-seq support the use of mtDNA mutations as natural barcodes to track changes in the subclonal structure of disease populations over time and the contribution of nongenetic changes in disease progression, transformation, and therapeutic resistance.
Methods
Sample Acquisition
PB, BM, and LN samples were collected from patients treated at Dana-Farber Cancer Institute (Boston, MA) or in collaboration with the CLL Research Consortium (CRC) under Institutional Review Board–approved protocols from patients who provided written informed consent. PB and BM mononuclear cells were isolated by Ficoll-Hypaque density gradient centrifugation. LN mononuclear cells were isolated through (Miltenyi GentleMACS, RRID:SCR_020280) dissociation. All samples were stored in vapor-phase liquid nitrogen after cryopreservation with 10% dimethyl sulfoxide until analysis.
mtscATAC-seq Cell Preparation
The mtscATAC-seq cell preparation was performed as described previously (11). Briefly, cells were fixated with 1% formaldehyde for 10 minutes at room temperature and quenched with 0.125 mol/L glycine final. After cell lysis in 100 μL with a modified lysis buffer (10 mmol/L Tris-HCl (pH 7.4), 10 mmol/L NaCl, 3 mmol/L MgCl2, 0.1% Nonidet P40 Substitute, 1% BSA, nuclease-free H2O) for 3 minutes, cells were washed with a modified wash buffer (10 mmol/L Tris-HCl (pH 7.4), 10 mmol/L NaCl, 3 mmol/L MgCl2, 1% BSA, nuclease-free H2O) and resuspended in nuclei buffer (10x Genomics) at a concentration of 5,000 cells/μL.
scATAC-seq Sequencing
After loading (targeted recovery of 7,000 cells) onto a Chromium Chip E (10x Genomics, catalog no. 1000082), library preparation was performed using the Chromium Single Cell ATAC Library & Gel Bead Kit (catalog no. 1000110) according to the manufacturer's instructions. Following quality control using a BioAnalyzer 2100 (RRID:SCR_019715) High Sensitivity DNA Kit (Agilent), pooled libraries were sequenced on an Illumina NovaSeq 6000 Sequencing System (RRID:SCR_016387) S1 or S2 platform with 50 bp paired-end reads, 8 bp for index 1, and 16 bp for index 2.
scRNA-seq Sequencing
A total of 17,000 cells per sample were loaded onto a Chromium Chip A (10x Genomics, catalog no. 120236). Single-cell gene expression was obtained using the Chromium Single Cell 3′ or 5′ Library & Gel Bead Kit (catalog no. 120237 or 1000006). Library preparations were performed according to the manufacturer's instructions. Following quality control with a Bioanalyzer High Sensitivity DNA Kit (Agilent), pooled libraries were sequenced on an Illumina NovaSeq 6000 Sequencing System (RRID:SCR_016387) with 150 bp paired-end reads, and 8 bp for index 1.
scRNA-seq Data Analysis
Raw sequencing reads were demultiplexed and aligned using Cell Ranger (RRID:SCR_017344) version 3.1.0 with the GRCh38–3.0.0 reference genome. Downstream analyses of scRNA-seq data were performed in (RStudio, RRID:SCR_000432) using the Seurat, (RRID:SCR_007322) package version 3.2.0. (49). Low-quality cells were excluded from downstream analyses based on percentage of mitochondrial reads <20, features per cell >200 and <2,500, and number of reads per cell <10,000. Differential gene expression analysis was performed on pseudobulk transcriptome profiles using Seurat::FindMarkers() and visualized using ggplot2 (RRID:SCR_014601).
mtscATAC-seq Data Analysis
Demultiplexing and alignment of raw sequencing reads was performed using Cell Ranger ATAC (RRID:SCR_021160) version 1.2.0 with a custom GRCh38 reference genome. mtDNA mutations were called with mgatk (RRID:SCR_021159) 0.5.6 using default parameters (11) and call_mutations_mgatk() downloaded from https://github.com/caleblareau/mtscATACpaper_reproducibility or the Signac (RRID:SCR_021158) functionality for mtscATAC-seq data. mtDNA mutations were selected for further analysis based on variant mean ratio >0.01 and strand concordance >0.6. Downstream analyses of mtscATAC-seq data were performed using the ArchR (RRID:SCR_020982) package (50). The ArchR manual was followed for exclusion of low-quality cells (TSS enrichment <4 and fragments per cell <1,000) and doublets as well as analyses of chromatin marker peaks, calculation of imputed gene activity scores, and calculation of transcription factor deviation scores.
Mitochondrial mutations and copy-number alterations were analyzed as described previously (https://github.com/caleblareau/mtscATACpaper_reproducibility) in cells with coverage of mtDNA >10–20×. Specifically, CNVs were calculated using patient-specific mtscATAC-seq data from non-CLL immune cells as healthy controls (11).
Statistical Analysis
Statistical significance of differences between median numbers of mtDNA mutations per cell of CLL, T cells, and monocytes was calculated using a Wilcoxon signed rank test by pairwise comparison of CLL with T cells and monocytes, and between T cells and monocytes. mtDNA mutations with significant enrichment in CLL, T cells, and monocytes were detected with a Fisher exact test using the number of cells with a heteroplasmy >0.05% for each mutation between CLL and T cells as well as CLL cells and monocytes. Adjustment for multiple hypothesis testing was performed using a Benjamini–Hochberg correction (FDR).
Data and Materials Availability
scATAC and RNA-seq data for CLL1–9 can be downloaded from Gene Expression Omnibus (GEO; GSE163579 and GSE165087). scATAC-seq data for CLL A and B were previously published (11) and can be accessed from GEO (GSE142745) and at https://osf.io/bupge. WES data from CLL1 were accessed from database of Genotypes and Phenotypes (dbGaP; phs001431.v1.p). WES data from CLL9 are available via dbGaP (Phs002458.v1).
Authors' Disclosures
L. Penter reports grants from DFG/German Research Foundation (PE 3127/1–1) during the conduct of the study. S.H. Gohil reports grants from Kay Kendall Leukaemia Fund during the conduct of the study. C. Lareau reports a patent for 62/683,502 pending. L.S. Ludwig reports a patent for 62/683,502 pending to Broad Institute and Boston Children's Hospital. E.M. Parry reports grants from ASCO (Conquer Cancer Young Investigator Award), Dana-Farber Cancer Institute (FLAMES FLAIR Grant Award), and Doris Duke Charitable Foundation during the conduct of the study. S. Li reports grants from NCI during the conduct of the study. I. Leshchiner reports personal fees from PACT Pharma, Inc. outside the submitted work. G. Getz reports research funds from IBM and Pharmacyclics; is an inventor on patent applications related to MuTect, ABSOLUTE, MutSig, MSMuTect, MSMutSig, MSIdetect, POLYSOLVER, and TensorQTL; and is a founder of, consultant for, and holds privately held equity in Scorpion Therapeutics. J.R. Brown reports personal fees from AbbVie, Acerta/AstraZeneca, BeiGene, Bristol Myers Squibb/Juno/Celgene, Catapult, Dynamo, Eli Lilly, Genentech/Roche, Janssen, Kite, Loxo, MEI Pharma, Nextcea, Novartis, Octapharma, Pfizer, Pharmacyclics, Rigel, Sunesis, Verastem, and Teva; grants and personal fees from Gilead and TG Therapeutics; personal fees and other support from Morphosys AG; grants from Loxo/Lilly and Verastem/SecuraBio; and other support from Invectys outside the submitted work. M.S. Davids reports grants and personal fees from AbbVie, Ascentage Pharma, AstraZeneca, Genentech, Pharmacyclics, TG Therapeutics, and Verastem; personal fees from Adaptive Biotechnologies, BeiGene, Bristol Myers Squibb, Celgene, Eli Lilly, Gilead Sciences, Janssen, Merck, Research to Practice, Syros Pharmaceuticals, Takeda, and Zentalis; and grants from MEI Pharma, Surface Oncology, and Novartis outside the submitted work. K.J. Livak reports grants from NIH/NCI during the conduct of the study. V.G. Sankaran reports a patent for PCT/US2019/036583 issued to Broad Institute and Boston Children's Hospital. C.J. Wu reports grants from NIH (P01CA206978 and P01CA229092) during the conduct of the study, as well as other support from BioNTech and Pharmacyclics outside the submitted work. No disclosures were reported by the other authors.
Authors' Contributions
L. Penter: Conceptualization, resources, data curation, software, formal analysis, investigation, visualization, writing–original draft. S.H. Gohil: Conceptualization, investigation, methodology, writing–original draft. C. Lareau: Conceptualization, resources, software, visualization, methodology, writing–original draft. L.S. Ludwig: Conceptualization, resources, investigation, methodology, writing–original draft. E.M. Parry: Resources. T. Huang: Performed single–cell experiments. S. Li: Software. W. Zhang: Performed experiments. D. Livitz: Data curation, software, formal analysis. I. Leshchiner: Data curation, software, formal analysis. L. Parida: Resources. G. Getz: Resources. L.Z. Rassenti: Resources. T.J. Kipps: Resources. J.R. Brown: Resources. M.S. Davids: Resources. D.S. Neuberg: Formal analysis. K.J. Livak: Resources. V.G. Sankaran: Supervision. C.J. Wu: Conceptualization, resources, supervision, funding acquisition, investigation, visualization, writing–original draft, project administration.
Acknowledgments
This work was supported by grants from the NCI and from the National Institute of Diabetes and Digestive and Kidney Diseases of the NIH (R01DK103794, to V.G. Sankaran; UG1 CA233338, 1U24CA224331-01, P01CA206978, and P01CA229092, to C.J. Wu). L. Penter is supported by a research fellowship from the German Research Foundation (DFG, PE 3127/1-1). S.H. Gohil is supported by a Kay Kendall Leukaemia Fund Fellowship. S. Li is supported by the NCI Research Specialist Award (R50CA251956). C. Lareau is supported by a Stanford Science Fellowship. L.S. Ludwig is supported by an Emmy Noether fellowship by the German Research Foundation (DFG, LU 2336/2-1). V.G. Sankaran is a New York Stem Cell Foundation-Robertson Investigator. E.M. Parry is supported by a Doris Duke Charitable Foundation Physician-Scientist Fellowship, a Dana-Farber FLAIR Fellowship in Lymphoma/Leukemia Research, and an American Society of Clinical Oncology Young Investigator Award. J.R. Brown is supported by NIH R01 CA 213442 (principal investigator, J. Brown). This work was partially supported by the Broad/IBM Cancer Resistance Research Project. We thank Doreen Hearsey and the members of the DFCI Ted and Eileen Pasquarello Tissue Bank in Hematologic Malignancies for provision of samples and the patients who generously consented for the research use of these samples, as well as Stacey Fernandes for expert technical assistance. We are grateful to Hayley Lyons, Sam Pollock, Oriol Olive, and Brian Danysh for project management support. We thank the members of the Wu Lab for constructive discussions of the data.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.