Abstract
Pediatric glioblastoma (pGBM) is a lethal cancer with no effective therapies. To understand the mechanisms of tumor evolution in this cancer, we performed whole-genome sequencing with linked reads on longitudinally resected pGBM samples. Our analyses showed that all diagnostic and recurrent samples were collections of genetically diverse subclones. Clonal composition rapidly evolved at recurrence, with less than 8% of nonsynonymous single-nucleotide variants being shared in diagnostic-recurrent pairs. To track the origins of the mutational events observed in pGBM, we generated whole-genome datasets for two patients and their parents. These trios showed that genetic variants could be (i) somatic, (ii) inherited from a healthy parent, or (iii) de novo in the germlines of pGBM patients. Analysis of variant allele frequencies supported a model of tumor growth involving slow-cycling cancer stem cells that give rise to fast-proliferating progenitor-like cells and to nondividing cells. Interestingly, radiation and antimitotic chemotherapeutics did not increase overall tumor burden upon recurrence. These findings support an important role for slow-cycling stem cell populations in contributing to recurrences, because slow-cycling cell populations are expected to be less prone to genotoxic stress induced by these treatments and therefore would accumulate few mutations. Our results highlight the need for new targeted treatments that account for the complex functional hierarchies and genomic heterogeneity of pGBM.
This work challenges several assumptions regarding the genetic organization of pediatric GBM and highlights mutagenic programs that start during early prenatal development.
Introduction
Pediatric glioblastoma (pGBM) is a lethal brain tumor. Unlike adult GBM, no effective therapeutic intervention or standard of care currently exists for this cancer. Despite their histologic similarities, pGBM is clinically, biologically, and molecularly distinct from adult GBM (1). Unfortunately, the biological principles uncovered by studying the adult disease have not translated to the pediatric malignancy (2). A major breakthrough in understanding the molecular principles underlying pGBM stemmed from the discovery that a third of patients harbor somatic mutations in H3F3A, a gene encoding the histone variant H3.3 (3, 4). Mutations in this gene cause one of two amino acid substitutions in H3.3: (i) Mutation of lysine 27 to methionine (K27M); (ii) mutation of glycine 34 to arginine (G34R) or valine (G34V). Both types of mutations interfere with the wild-type function of H3.3 (5–8), but the H3.3K27M variant has been studied more extensively because it is more frequent and trimethylation of K27 has an established and important role in repressing gene transcription and compacting chromatin. Several experimental models have been used to show that H3F3A mutations can induce cellular hyperproliferation on their own, but they need to cooperate with other drivers, including TP53 mutations, to produce overt malignancies (6, 9, 10). For the majority of pGBM patients who harbor no H3.3 mutations, the early genetic events leading to overt malignancy are not fully understood.
It is currently believed that pGBM, similarly to other pediatric cancers, is characterized by relatively bland genomes (11). Furthermore, H3F3A mutations were shown to be clonal in pGBM (12) and to result in global loss of trimethylation of lysine 27 of histone 3 (H3K27me3) in virtually all cells in a tumor (6). These findings have contributed to the view that pGBM might have a relatively low level of intratumoral complexity at the genomic and functional levels. However, a recent report identified differences in mutational signatures between matched primary and recurrent pGBM samples using exome sequencing (12).
Evidence of subclonal architecture in diffuse intrinsic pontine glioma and pGBM was recently provided (13). The authors reanalyzed previously published exome datasets for 36 pGBM samples and identified evidence for the coexistence of subclones in each tumor. To the best of our knowledge, no study to date has systematically assessed subclonal composition and evolution in matched pGBM primary and recurrent samples from the same patient using whole-genome information. Such studies are important because they may allow inference of biological principles responsible for tumor evolution and recurrence, and they might identify intergenic variants with important roles in the etiology of this cancer. Furthermore, treatment regimens for pGBM differ between centers and individual oncologists, raising the possibility that different therapeutic approaches might skew recurrences toward the acquisition of specific molecular features. The relative rarity of pGBM (adjusted incidence rate of 0.15/0.16 in 100,000 population; ref. 14) and the lack of standardized practices for surgical resection contribute to the scarce availability of diagnostic-recurrent sample pairs for study.
New whole-genome technologies that allow sensitive measurements of cellular heterogeneity in a single tumor, and documentation of tumor evolution over time, have yet to be systematically employed for this malignancy. We hypothesized that whole-genome sequencing (WGS) of primary and recurrent tumors, coupled with targeted bioinformatic approaches to specifically test for subclonal architecture of each tumor, might bring new insight into the intratumoral organization of pGBM and its evolution. Understanding the principles exploited by the tumor to escape treatment may have important implications for rationally selecting or designing personalized treatment options for these rare and aggressive cancers.
Materials and Methods
Human samples
All samples were collected and used for research with written-informed consent and with approval by the Health Research Ethics Board of Alberta and the Research Ethics Board of the Hospital for Sick Children. Fresh tumor tissue and blood samples from each patient were collected by the Clark Smith Brain Tumour biobank at the University of Calgary and the Hospital for Sick Children and preserved at either −80°C or in vapor phase liquid nitrogen. For selected cases, a portion of the fresh tumor material was also allocated for patient-derived xenograft preparation (see below). Anagraphical information of sample donors is in Supplementary Tables S2 and S3.
Patient-derived xenografts
All mice were used with approval from the Animal Care Committee of the University of Calgary. Fresh tumor tissue was gently dissociated by trituration, filtered using a 0.7 μm filter, washed, and resuspended in sterile DPBS (Gibco). Tumor cells were injected into the right striatum of the brains of CB17 SCID mice (Charles River Laboratory). Animals were monitored for tumor growth, and tumors were isolated when signs of morbidity were observed. Initial tumor establishment for patient specimens SM4021 (recurrence of patient 3) and SM4058 (third recurrence of patient 5) took 59 and 41 days, respectively. Tumors were then maintained by serial in vivo passaging in the brain of SCID mice every 1–2 months or cut into 1 mm x 1 mm tissue pieces and cryostored in 1 mL CryoStor CS10 solution (Stem Cell) for 10 minutes on ice prior to being stored at −80°C. In experiments where cryostored tissue was used, tissue was thawed, dissociated into a single-cell suspension, filtered using a 0.7 μm filter, and implanted into the brain of SCID mice. All xenografts were identity matched to the original patient material by short tandem repeat analysis using the AmpFLSTR Identifiler Plus Kit (Applied Biosystems).
Genomic DNA extraction and library preparation
High-molecular-weight genomic DNA (gDNA) was extracted from frozen tumor tissue and frozen blood samples with the Qiagen MagAttract HMW DNA Kit (catalog# 67563). gDNA fragment size and distribution were quantified with the Agilent Technologies 2200 TapeStation Genomic DNA Assay. Samples with gDNA fragment size > 50 kb were used for library construction. The 10xGenomics Chromium Genome Library Kit & Gel Bead Kit v2, 16 reactions (catalog# PN-120258) was used for library preparation. Size and distribution of all sequencing libraries were quantified with the Agilent Technologies 2200 TapeStation D1000 Assay.
Whole-genome sequencing and linked-read data analysis
Whole-genome sequencing was performed at The Centre for Applied Genomics at the Hospital for Sick Children (Toronto, Ontario, Canada) with a HiSeq X Series (Illumina) instrument. Libraries were validated on a Bioanalyzer DNA High Sensitivity chip to check for size and absence of primer dimers, and quantified by qPCR using Kapa Library Quantification Illumina/ABI Prism Kit protocol (KAPA Biosystems). Validated libraries were paired-end sequenced on an Illumina HiSeq X platform following Illumina's recommended protocol to generate paired-end reads of 150 bases in length. Each library was loaded on a single lane. All libraries were sequenced at 30x coverage. Postsequencing, fastq files were analyzed with the package LongRanger-2.1.6 (10xGenomics) on servers hosted by the Centre for Health Genomics and Informatics (University of Calgary, Calgary, AB). LongRanger performed alignment of reads to the reference genome (GRCh38), assembled linked-reads, generated haplotype blocks, and called single-nucleotide polymorphisms and structural variants (SV). LongRanger output files were visualized with Loupe browser 2.1.2 (10xGenomics). Single-nucleotide variants (SNV) were called using both Mutect2 (15) and Strelka2 (https://github.com/Illumina/strelka). Only SNVs called by both methods were included for downstream analysis. Annotation of SNVs was performed using ANNOVAR (16).
Copy-number assays for the ATRX locus
Custom TaqMan Copy Number Assay Probes (catalog# 4400294) from Applied Biosystems by Thermo Fisher Scientific were used to detect changes in copy number at the ATRX locus in all patients. Quantification was performed according to the manufacturer's protocols. Briefly, for each patient, a control DNA sample, patient germline DNA, primary tumor DNA, and recurrent tumor DNA were diluted to 5 ng/μL. DNA was quantified in triplicates with the Qubit dsDNA HS Assay Kit (catalog# Q32851). The TaqMan Genotyping Master Mix and TaqMan Copy Number Assay were added to each well of a 96-well plate. Note that 20 ng of DNA was added in triplicate wells of a 96-well plate for each assay condition. The samples were then run on a Bio-Rad CFX Connect Real-Time PCR Detection System with the following parameters: hold at 95°C for 10 minutes, 40 cycles of 95°C for 15 seconds, and 60°C for 60 seconds. Fluorescence signal for the ATRX putative-deleted region (ATRXdel probe) was first normalized to the fluorescence signal from the control diploid region probe. Fluorescence was additionally normalized to the diploid control DNA sample. TaqMan Probes were labeled with FAM at 5′ end and MGBNFQ at the 3′ end. For the probe in the ATRX-deleted region, probe sequence: CACACCCAAATATTGGTAAAAAT, forward primer: TCCAGGACTTAGCAGGATGTGA, reverse primer: ACCCCATCAAGTAGATGGTAAGAAACT. For the probe in the control diploid region, probe sequence: ACACGGGTGTTGAAGACGC, forward primer: AGGCGCTGTGGAAACAATTATAGTA, reverse primer: TGTCCCCGTCAACGATCAC.
Data and software availability
The linked-read WGS datasets we generated have been deposited to the European Genome-Phenome Archive (EGA) with accession number EGAS00001003432. scRNA-seq datasets have been deposited to the Gene Expression Omnibus under accession number GSE117599. Bulk RNA-seq data from the Toronto cohort have been deposited to the EGA with accession number EGAS00001003070.
Results
pGBM samples and whole-genome sequencing with linked-reads
We characterized longitudinally collected pGBM samples that include primary tumors and their recurrences, in addition to peripheral blood (used as germline control), and were biobanked at our institution (Calgary cohort). For 4 of 5 patients, we had at least two longitudinal samples, and for 1 patient, we had three recurrent samples (Supplementary Table S1). For the purposes of this article, diagnostic samples are referred to as “sample 1,” and each consecutive resection from the same patient is referred to as “sample n.” Treatment history for each patient is known (Supplementary Table S2). We also obtained a set of matched germline and primary pGBM samples biobanked in a second institution (Toronto cohort; Supplementary Table S3) and used these for validation purposes. We extracted high-molecular-weight DNA (average length distribution > 40 kbp) from each sample and generated libraries for WGS with linked-read technology (10x Genomics). With this system, short next-generation sequencing reads originating from the same DNA molecule (i.e., an individual chromosome fragment) are tagged with an identical barcode, enabling the assembly of Mb-size haplotypes from individual chromosomes (Supplementary Table S4).
Diagnostic and recurrent pGBM samples share few SNVs
We used linked-read WGS data for pGBM samples and matched germlines to call somatic variants in diagnostic samples and their recurrences. Focusing specifically on nonsynonymous SNVs (nsSNV), our data are in agreement with previous reports of low mutation burden (3, 17, 18) in pGBM (range, 26–124 SNVs per tumor; average, 65.6 SNVs per tumor in primary samples, 60 SNVs per tumor in the recurrences). Overall, we found that recurrent tumors shared few somatic mutations with their matched primary malignancies, with only between 1 and 5 nsSNVs shared between longitudinal samples from the same patient. This translated to an average of 7.8% nsSNVs that were shared between a primary tumor and its recurrence. A similar amount (11.8%) of somatic SNVs and indels were shared between diagnostic and recurrent samples in the pediatric brain tumor medulloblastoma (19), in contrast with the much higher proportion shared between primary and recurrent adult glioma samples (54%; ref. 20). Surprisingly, we observed that the total number of somatic SNVs tended to be stable in recurrences after treatment (Fig. 1A–E; Supplementary File S1), whereas mutational load is known to increase 5-fold at recurrence in pediatric medulloblastoma (19) as well as in adult gliomas (20). Patient 1 was one exception (Fig. 1A), but this individual harbored a germline deletion of the last exon of TP53 (Supplementary Fig. S1A and S1B), which might have contributed to accelerated tumor evolution by increasing the number of mutational events. Sample 2 had 126 somatic nsSNVs, and only lesions in ARMC9 and DSE were shared with the diagnostic sample. Interestingly, DSE encodes a dermatan sulfarase epimerase and is expressed in the human brain at higher levels before birth than after birth (Supplementary Fig. S1C and S1D). Patient 2 had a somatic mutation in H3F3A that was retained in the second sample (Fig. 1B), together with a mutation in MUC22 and two nsSNVs in TP53. This patient was treated with temozolomide and local radiation (Supplementary Table S2), which is the standard of care for adult GBM patients. However, the overall number of somatic nsSNVs was lower in the second sample (69 SNVs) than in the diagnostic sample (109 SNVs), whereas this treatment is known to increase mutational burden in adults. This difference in response to chemotherapy and radiation between pGBM and adult GBM was also noted in an independent cohort of patients in a recent publication (12). Samples 1 and 2 from patient 3 only shared a mutation in PCNX (Fig. 1C). Patient 5 had 26 nsSNVs in the diagnostic sample and in sample 2, 38 in sample 3, and 39 in sample 4 (Fig. 1D). SNVs in 5 genes were shared between sample 1, sample 2, and sample 3 (ELAVL2, VWF, KCNJ12, HEXIM2, and ZNF816). Only SNVs in VWF (encoding the von Willebrand factor) and KCNJ12 (encoding an ion channel) were conserved between all samples for this patient.
No SNV was shared among all samples. Only 4 genes were mutated in at least two primary tumors (Supplementary Table S5), including VWF. Six genes were mutated in at least two recurrences from different patients (Supplementary Table S6). Our data therefore show that pGBM is characterized by large intertumoral genetic heterogeneity and by large evolutionary changes following surgical resection and treatment.
No obvious pattern of gene mutations was observed in our dataset. Gene ontology analysis did not identify any significantly enriched class of mutated genes. However, we found that nsSNVs in diagnostic samples were significantly enriched for genes with an OMIM entry and linked to hereditary syndromes (19.6% of the 321 mutated genes, permutation test P = 0.00082; Fig. 1E; Supplementary Fig. S1E and S1F). Note that 41.3% of these genes are associated with syndromes of the central nervous system (CNS). Similarly, nsSNVs in subsequent resections had a significant enrichment for genes with OMIM entries (20.2% of the 420 mutated genes, permutation test P < 10−5; Supplementary Fig. S1E and S1F), and 35.2% of these genes were associated with hereditary CNS syndromes (Fig. 1F). These observations suggest that disruption of multiple unrelated networks of genes may be sufficient for pGBM initiation and progression. The identification of nsSNVs in genes linked to hereditary syndromes is intriguing, given the early-life onset of this malignancy.
Germline sequencing of mother–father–patient trios
Intrigued by the identification of somatic nsSNVs in genes associated with hereditary disorders, we asked whether pGBM behaves as a developmental syndrome. We reasoned that if we assumed cancer was a developmental disorder of progressive mutagenesis, individual patients might undergo systemic accumulation of mutations over time. We decided to investigate this possibility by assembling trios composed of mother, father, and patient, implementing an approach similar to what is done with genetic disorders. We were able to obtain the germlines (peripheral blood) of the parents of patient 1 and patient 3, and used them for WGS with linked reads. Analysis of these trios enabled us to categorize nsSNVs as either unique to the pGBM patient germline (i.e., not found in either parent's germline), or inherited from one of the parental genomes. Our data show that a TP53 germline variant identified in patient 1 (Supplementary Fig. S1A and S1B) was inherited from one of the parents. In addition, nsSNVs in 9 genes were identified uniquely in the germline of this patient (Fig. 1G). Among these genes, RASAL2 encodes a RasGAP and is a tumor suppressor (21); PARP4 encodes a member of the poly(ADP)-ribose family and was recently shown to be mutated in the germlines of patients with thyroid and breast cancer (22); HUWE1 encodes an E3 ubiquitin ligase that was shown to be a tumor suppressor in mouse models of skin cancer (23) and colorectal cancer (24). Similarly, we called 4 de novo nsSNVs in the germline of patient 3 (Fig. 1H). Among these genes, KIAA1551 was recently identified as a candidate tumor suppressor and is frequently deleted in different cancer types (25). Overall, we noticed a significant increase in the number of short (30–50,000 bp) deletions in the germlines of pGBM patients (n = 8) compared with the control (parental) germlines (n = 4) we sequenced (Mann–Whitney test P = 0.0040; Supplementary Fig. S1G). This collection of sequenced trios is a unique dataset in the context of pGBM and offers a new perspective on the emergence of germline variants in pediatric patients with sporadic cancer.
Subclonal architecture of pGBM at diagnosis and recurrence
We next asked whether the small proportion of SNVs shared between each diagnostic sample and its recurrence corresponded to an unexpected intratumoral genomic complexity and consequent clonal selection. We assessed the presence of genetically distinct subclonal cell lineages in each tumor before and after treatment by using somatic mutations and copy-number alterations to build a genetic phylogeny for each patient using the EXPANDS algorithm (26). These phylogenies show that each pGBM and its recurrence(s) are characterized by extensive intratumoral genomic heterogeneity and complex subclonal architecture (Fig. 2A–E). These results support the presence of multiple genetically distinct subclones that change significantly in frequency throughout the evolution of a tumor. Importantly, each recurrent tumor also had a few subclones clustering close to the germline, likely indicating the stable presence of early (ancestral) clones that are not actively accumulating somatic alterations. This observation is best illustrated in patient 5 where three recurrences were available (Fig. 2E). Throughout this patient's tumor progressive divergence from the germline occurs at each successive recurrence in a number of genetic lineages, yet a number of ancestral subclones remain detectable at all timepoints (e.g., 2agh at the first recurrence; 3f at the second recurrence; 4ef at the third recurrence). Although these subclones likely had ancestral roles in the early part of the evolutionary path of this patient's tumor, each subsequent recurrence is marked by distinct lineages that have diverged significantly from this baseline.
Having already inferred the phylogenetic lineages present in each sample, let us select SNVs corresponding to distinct clusters of subclones, such that we could then compare the range of mutation signatures from subclones within each sample. In each patient, we see evidence for multiple mutation signatures, indicating the influence of more than one mutagenic process. For instance, in the diagnostic sample of patient 2, the subclones clustering near the germline (1bcfhij) were characterized by mutational signatures 3, 4, 6, and 24, whereas the more evolutionarily divergent subclones accumulated mutations dominated by signatures 4, 6, and 1 (Fig. 2C). Similar patterns of divergence between mutational signatures specific to clusters of subclones were observed for all other patients (Fig. 2B–E; Supplementary Fig. S2). All together, these data indicate that different mutational processes may operate at different times during the evolution of individual tumors. Alternatively, different mutational processes are acting concurrently at the onset, and the changing selective pressure present at each treatment stage enables different genetic lineages to become dominant within the tumor.
Identification of structural variants in pGBM
pGBM is known to harbor somatic SVs, some of which recur at low frequency in the patient population (27, 28). Because WGS with linked reads data are optimal for identification and visualization of SVs and subclonal events, we explored the behavior of SVs during tumor evolution. SVs were called with the LongRanger package. We defined large SVs as variants affecting > 30 kbp of the genome. We processed our linked-read WGS data for germlines and tumors independently of each other. This approach allowed us to retain SVs that might be important for tumor etiology, but that would otherwise have been filtered out because they are also present in the germlines. With this approach, we identified 20 to 30 SVs in the germlines of each pGBM patient compared with the reference genome (Fig. 3A–E; Supplementary File S2). The overall number of large SVs did not change dramatically between the primary and recurrent tumors. Patient 1 was the only one displaying a large increase in the number of SVs in the primary tumor compared with the germline (Fig. 3A). This was probably because of chromosome 6 chromothripsis events, which represented 62% of total SVs for this patient. No trace of chromosome 6 chromothripsis was observed in the second resection, and the total number of SVs was similar to the germline. This was unexpected because most patients were treated with radiation, which is known to induce SVs. However, this absence is in accord with the observed divergent clonal structure of pGBM (Fig. 2) and suggests that the chromothriptic event may have been present in a subclone. Adult gliomas experience a large number of mutational events (SVs and SNVs) during standard-of-care treatment (radiation plus temozolomide; ref. 20). Patient 2 underwent radiation and temozolomide treatment, yet the overall mutational burden did not drastically increase (Fig. 3B). We detected SVs affecting previously known cancer genes as well. Sequencing data provide evidence of a duplication of NTRK2, the overexpression of which was previously shown to contribute to tumor metastasis (29). Gene fusions involving NTRK2 have been previously reported in pGBM (28). Patient 3 had a somatic ETV6-NTRK3 gene fusion in both primary and recurrent tumors. This gene fusion has been reported in pGBM before (28), and it was also shown to have oncogenic function in other malignancies, including leukemia (30) and breast cancer (31). We also identified a translocation predicted to truncate the proto-oncogene MET in patient 4. MET amplifications are relatively common in adult GBM (32), and lesions in this gene are known to occur in pGBM (18, 28).
We have identified several SVs that recurred in more than one patient in our cohorts. Among these, we found germline deletions affecting the coding sequences of BTNL3, BTNL8, APOBEC3A, and APOBEC3B (Supplementary Fig. S3). Interestingly, BTNL8 encodes a cell surface protein involved in T-cell activation (33), whereas the APOBEC3 genes encode cytidine deaminases with roles in antiviral immunity (34). These deletions abrogated gene expression (Supplementary Fig. S3).
Germline and somatic SVs during tumor evolution
In some instances, tumors had lower numbers of SVs than their matched germlines (see diagnostic sample in Fig. 3C and samples 2–4 in Fig. 3E). We found that every patient carried some germline SVs that were not detected in their diagnostic samples and their recurrences. There were 8 such germline-specific SVs for patient 1 (Fig. 3A), 1 for patient 2 (Fig. 3B), 7 for patient 3 (Fig. 3C), 9 for patient 4 (Fig. 3D), and 3 for patient 5 (Fig. 3E). Each SV call was made with LongRanger and was manually reviewed, checked for coverage in the region of the call, and confirmed. Some SVs with allelic frequencies between 0.4 and 0.7 in the germline also underwent fluctuations in the tumors, whereas others were stable (Supplementary Fig. S4A–S4D). SVs with allelic frequencies > 0.7 in the germline, which we considered clonal, tended to be detected at stable levels in primary and recurrent tumors (Supplementary Fig. S4E–S4H). The subclonal nature of some SVs in the germline, and their fluctuations in primary and recurrent tumors, were unexpected findings, and we set out to further substantiate these observations.
In order to explore the observed SVs with an independent computational method, we recalled SVs with the GROC-SVs package (35). We subset SV predictions to those for which we had sufficient physical coverage in all samples (see Supplementary Methods), thus retaining SVs supported by at least 200 barcodes (i.e., DNA molecules) spanning each breakpoint (Supplementary File S3). We stratified SVs in each patient based on their pattern of presence or absence across samples. For instance, somatic SVs were absent from the germline and appeared in one or more tumor samples (Fig. 4A–I). Germline events were detected in all samples, and in patients 1 and 3 could be further stratified into those inherited from a parent or occurring de novo. In all pGBM patients, we identified changes in SV allelic frequencies between tumor resections. Allelic frequency changes could represent loss or acquisition of somatic SVs in the tumors (Fig. 4H) and are often observed in our cohort to occur in genes with previous evidence for structural variation in pediatric cancers (Fig. 4B, C, G, and I; ref. 36). For patient 1, we identified several de novo germline SVs (which were not detected in the parents) with allelic frequencies in the tumor samples (Fig. 4C). These data point to complex patterns of propagation of SVs from the germline to tumor samples, and highlight for the first time the presence of putative de novo germline SVs in pGBM patients.
Subclonal deletions at the ATRX locus in germlines and pGBM samples
The data organization principles underlying linked-read WGS data facilitated identification and visualization of subclonal SVs. Among these SVs, we identified large deletions in ATRX that removed almost the entire coding region of the gene (Fig. 5A). Somatic SNVs in ATRX are known to occur in pGBM (3, 37), and loss of ATRX protein is routinely tested in GBM specimens in the clinic with immunostaining techniques. Our data indicate that ATRX harbors both SNVs and SVs in pGBM. Of note, we detected putative subclonal ATRX deletions in the germline of pGBM patients. In order to validate the presence of putative ATRX deletions with an independent approach, we designed a TaqMan probe for copy-number assays that targets the putatively deleted region ("ATRXdel" probe; Fig. 5B; see Materials and Methods). We then performed copy-number TaqMan assays using genomes from both the germline and the tumors of our pGBM patients. This TaqMan assay confirmed the ATRX deletion in both germlines and tumors of all pGBM patients with the exception of patient 5, where the deletion was only detected in sample 2 (Fig. 5C; Supplementary Fig. S5). In order to validate these findings in an independent cohort, we performed WGS with linked reads on three more pGBM samples and their matched germlines collected in Toronto (Supplementary Table S3). Our data show evidence of subclonal deletions at the ATRX locus in all samples (Supplementary Fig. S6A). These findings confirm our WGS and copy-number data and show that subclonal deletions in ATRX exist in both the germline and tumors of most pGBM patients.
We expected that subclonal ATRX deletions may result in heterogeneous expression of the gene in populations of tumor cells. Because our longitudinal collection was assembled over a long period of time, storage of pGBM samples was not compatible with single-cell RNA-seq (scRNA-seq) methods. We therefore decided to assess transcriptional heterogeneity using patient-derived xenografts (PDX), which were available for sample 2 of patient 3 and for sample 4 of patient 5. We successfully generated scRNA-seq data for 1,327 cells from patient 3′s sample 2, and for 2,204 cells patient 5′s sample 4. As predicted from the WGS data and TaqMan assays, scRNA-seq data showed heterogeneous expression of ATRX in both PDXs (Fig. 5D and E). In contrast, other genes, for instance H3F3A and H3F3B (both of which encode the histone variant H3.3), were more homogeneously expressed (Supplementary Fig. S6B and S6C). Widespread decreased ATRX expression in pGBM compared with nonneoplastic brain tissue (n = 4) was confirmed by RNA-seq of bulk pGBM tissue (n = 3) and their matched primary cultures from an independent cohort collected in Toronto (Fig. 5F). Furthermore, analysis of an independent genomic dataset confirmed that subclonal ATRX mutations are detected in the tumors of pGBM patients (Supplementary Fig. S6D). Our data therefore show that in addition to somatic SNVs in ATRX, pGBM patients may also carry subclonal ATRX deletions in both their germlines and tumors.
Evidence for the existence of slow-cycling cancer stem cells in pGBM
We questioned whether the observed variability of the mutational profile might correlate with proliferative heterogeneity in the respective contributions made by subclones to tumor growth and recurrence. Recently, cell tracing studies based on the genetic barcoding of freshly isolated adult GBM cells transplanted into PDX mouse models have suggested that tumor cells may adhere to a common fate behavior, reminiscent of a normal neurogenic program (38). Specifically, a signature negative binomial dependence of the distribution of variant allele frequencies (VAF) provides evidence of a conserved proliferative hierarchy in which slow-cycling tumor stem–like cells give rise to a rapidly cycling, self-renewing, progenitor-like population (Fig. 6A; Extended Data Fig. 4 of ref. 38). Applied to the patient samples in the current study, we find that the VAF distribution obtained from tumor samples is also largely consistent with a negative binomial dependence (Fig. 6B–E; Supplementary Fig. S7A–S7I; Supplementary Table S7), suggesting pGBM tumor cells may be defined by a similar proliferative hierarchy (see Supplementary Methods). This conclusion is reinforced by focusing on the ensemble of de novo point mutations that are acquired, or only rise above the detection threshold, during recurrence (Supplementary Fig. S7J–S7N and Supplementary Methods). By contrast, mutations that are shared between the primary and recurrent tumor samples show VAFs that are peaked at larger values, consistent with the predominance or fixation of subclones within the tumor population (Supplementary Fig. S7O–S7R). Our data from our pGBM patient cohort are therefore consistent with a hierarchical model for pGBM, characterized by slow-cycling cancer stem cells that give rise to fast-proliferating progenitor-like cells, which eventually produce nondividing "differentiated" cells.
Discussion
Cancer is often compared with a caricature of developmental processes. Molecular profiling of pGBM suggests that this cancer has features that make it appear like a caricature of hereditary disorders. Notwithstanding large differences in terms of mutated genes among tumors, in every patient, we identified somatic SNVs in several genes linked to hereditary conditions, including MeCP2, which is linked to Rett syndrome (39), and autosomal-dominant genes linked to severe intellectual (ZMYND11, OMIM number 608668) and developmental syndromes (DSE, linked to Ehlers–Danlos syndrome; REEP1, OMIM number 609139). Interestingly, the pediatric brain tumor diffuse intrinsic pontine glioma has recurrent somatic mutations in ACVR1 (28, 40–42), and germline mutations in this gene lead to the hereditary syndrome fibrodysplasia ossificans progressiva (43). The detection of germline variants in pGBM patients and the identification of somatic SNVs in genes associated with hereditary syndromes give pGBM a characteristically different molecular flavor from adult GBM. Because of the early onset of pGBM, studies of larger cohorts will be required to determine if germline and somatic events occurring during embryonic and/or fetal development cooperatively contribute to pGBM initiation and progression. These studies will be particularly important given that currently no genetic or environmental factor has been associated with sporadic pGBM. It would be important to determine whether the de novo germline variants identified in this study arise in the germ cells of the parents during the normal mutagenic processes of genome replication (44), or if they arise during early embryonal or fetal development. Our data showing that some variants have low (<0.2) allelic frequency in the germline of pGBM patients suggest that these individuals might display somatic mosaicism resulting from mutational processes during postzygotic stages of development. Interestingly, clonal hematopoiesis has recently been shown to be associated with a 10-fold increase in risk of developing hematologic cancer (45, 46), and was observed in both adult and pediatric patients with non-hematologic cancers (47). In pediatric cancer patients, clonal hematopoiesis was identified in 5% to 10% of individuals, with highest frequency in children aged 0 to 9 years. There is therefore a possibility that some subclonal SVs with < 0.2 allelic frequencies in the germline that became barely detectable or undetectable in tumors might reflect clonal hematopoiesis. In this scenario, the frequency of germline SVs in the tumor may depend on the extent of tumor infiltration of hematopoietic cells carrying that specific SV. However, we identified this phenomenon in all pGBM samples, which is more than expected based on available published data (47). The low allelic frequency of some germline SVs is therefore more likely the result of developmental processes that selected them over time.
By sequencing the germlines of trios, our data provide an important perspective on the developmental origins of pGBM. Our data show that some potentially deleterious germline variants (like a TP53 exonic deletion in patient 1) can be inherited from a healthy parent. At the same time, we observed de novo nsSNVs in the germlines of both patient 1 and patient 3, as well as de novo germline SVs in patient 1. The expected rate of emergence of de novo nsSNVs is very low, based on recent literature evidence on trios that included probands affected by disorders with high heritability (48, 49). While acknowledging that our sample set is small and that there may be differences in numbers of SNVs detected and called based on sequencing platform and computational pipelines used in these studies, our data offer proof of principle that mutational processes are at play in the parents' germ cells or in the early developing zygote leading to de novo germline variants in at least some pGBM patients. Because of the rare nature of pGBM, and difficulties in recontacting families of pGBM patients, the trios we generated represent extremely valuable resources. In the future, it will be important to assemble germline WGS data on trios of pGBM probands in a systematic fashion, in order to assess and investigate genomic predisposing events.
Using WGS data with linked reads, we have identified a set of germline SVs that tend to occur at high frequency in our cohort. One of these examples is subclonal deletions at the ATRX locus in both germlines of 4 of 5 patients and tumor tissues of all patients. Recent work showed that deletion of Atrx in mouse neural progenitor cells or ATRX in adult glioma stem cells alters patterns of chromatin openness genome-wide and leads to enhanced cell migration (50). Because of the established involvement of ATRX in gliomagenesis, we speculate that ATRX deletions might be a predisposing genetic lesion for pGBM.
Importantly, our analyses show that pGBM is characterized by intratumoral genetic heterogeneity and subclonal architecture, which rapidly evolves at recurrence. Our work, together with previous reports on pediatric medulloblastoma (19) and diffuse intrinsic pontine glioma (13), indicates that pediatric brain cancers have more complex genetic architectures than previously thought. The genetic divergence between diagnostic and recurrent samples should be kept into consideration when designing personalized treatment approaches. In addition to genetic heterogeneity, our data provide evidence for the existence of a functional hierarchy in pGBM. VAF distributions in pGBM samples are consistent with a hierarchy comprising slow-cycling cancer stem cells, fast-proliferating progenitor cells, and non–self-renewing and nondividing cells destined to undergo cell death. Single-cell barcoding experiments with xenografts derived from adult GBM patients supported a similar hierarchy (38). Although adult GBM and pGBM are molecularly different, hierarchical organization may be a common feature of both. If slow-cycling, self-renewing cancer stem cells reside at the apex of a pGBM hierarchy, these cells might be less sensitive to radiation and antimitotics used to treat this cancer, explaining why recurrences did not have a significantly higher mutational burden than the diagnostic samples. Whereas temozolomide and radiation treatment significantly increased the mutational burden in recurrences in adult GBM, it did not have this effect in pGBM. We propose that this might be because the cancer stem cell population might be more quiescent in pGBM than in adult GBM. Alternatively, it is possible that other biological differences between pGBM and adult GBM play a role.
Overall, our data provide strong evidence for the existence of subclonal architecture in pGBM and relatively rapid evolution at recurrence, irrespective of treatment. Reconstruction of mother–father–patient trios revealed the potential for inherited and de novo germline variants and somatic mutations to contribute to the etiology of this pediatric cancer.
Disclosure of Potential Conflicts of Interest
T.J. Pugh reports receiving commercial research grant from Boehringer Ingelheim and is a consultant/advisory board member for Merck, Dynacare, and Chrysalis Biomedical Advisors. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: M. Hoffman, M. Gallo
Development of methodology: M. Hoffman, A.H. Gillmor, A.J. Mungall, R. Moore, A.S. Morrissy, M. Gallo
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M. Hoffman, M. Zarrei, J. King, K. Ellestad, N.H. Dang, F.M.G. Cavalli, F.J. Coutinho, Y. Ma, A.J. Mungall, R. Moore, M.D. Taylor, P.B. Dirks, S. Scherer, J.A. Chan
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): M. Hoffman, A.H. Gillmor, D.J. Kunz, M.J. Johnston, A. Nikolic, K. Narta, M. Zarrei, F.M.G. Cavalli, Y. Ma, M.A. Marra, T.J. Pugh, S. Scherer, D.L. Senger, B.D. Simons, A.S. Morrissy, M. Gallo
Writing, review, and/or revision of the manuscript: M. Hoffman, A.H. Gillmor, D.J. Kunz, A. Nikolic, M. Zarrei, A.J. Mungall, M.D. Taylor, T.J. Pugh, S. Scherer, B.D. Simons, J.A. Chan, A.S. Morrissy, M. Gallo
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M. Hoffman, A.H. Gillmor, A. Nikolic, M.M. Kushida, B. Luu, Y. Ma, D.L. Senger, M. Gallo
Study supervision: D.L. Senger, B.D. Simons, M. Gallo
Acknowledgments
We thank Bhooma Thiruv and Jeff MacDonald for support of our analyses. Funds for this work were provided by the Canadian Institutes of Health Research (CIHR) early career award (Institute of Cancer Research) to M. Gallo (ICT-156651); a CIHR project scheme grant to M. Gallo (PJT-156278); a Cancer Research Society Scholarship for the Next Generation of Scientists grant to M. Gallo; a CIHR graduate scholarship to M. Hoffman; Royal Society EP Abraham Research Professorship and a Wellcome Trust Senior Investigator Award (098357) to B.D. Simons; Wellcome trust grants 203828/Z/16/A and 203828/Z/16/Z to D.J. Kunz; a Cancer Research Society Scholarship for the Next Generation of Scientists grant to A.S. Morrissy. B.D. Simons and D.J. Kunz acknowledge core funding to the Gurdon Institute from the Wellcome Trust (092096) and CRUK (C6946/A14492). Research supported by SU2C Canada Cancer Stem Cell Dream Team Research Funding (SU2C-AACR-DT-19-15) provided by the Government of Canada through Genome Canada and the Canadian Institute of Health Research, with supplemental support from the Ontario Institute for Cancer Research, through funding provided by the Government of Ontario. SU2C Canada is a Canadian Registered Charity (Reg.# 80550 6730 RR0001). Research Funding is administered by the AACR International - Canada, the Scientific Partner of SU2C Canada.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.