Abstract
Adult diffuse gliomas commonly recur regardless of therapy. As recurrence typically arises from the peritumoral edema adjacent to the resected bulk tumor, the profiling of somatic mutations from infiltrative malignant cells within this critical, unresected region could provide important insights into residual disease. A key obstacle has been the inability to distinguish between next-generation sequencing (NGS) noise and the true but weak signal from tumor cells hidden among the noncancerous brain tissue of the peritumoral edema. Here, we developed and validated True2 sequencing to reduce NGS-associated errors to <1 false positive/100 kb panel positions while detecting 97.6% of somatic mutations with an allele frequency ≥0.1%. True2 was then used to study the tumor and peritumoral edema of 22 adult diffuse gliomas including glioblastoma, astrocytoma, oligodendroglioma, and NF1-related low-grade neuroglioma. The tumor and peritumoral edema displayed a similar mutation burden, indicating that surgery debulks these cancers physically but not molecularly. Moreover, variants in the peritumoral edema included unique cancer driver mutations absent in the bulk tumor. Finally, analysis of multiple samples from each patient revealed multiple subclones with unique mutations in the same gene in 17 of 22 patients, supporting the occurrence of convergent evolution in response to patient-specific selective pressures in the tumor microenvironment that may form the molecular foundation of recurrent disease. Collectively, True2 enables the detection of ultralow frequency mutations during molecular analyses of adult diffuse gliomas, which is necessary to understand cancer evolution, recurrence, and individual response to therapy.
True2 is a next-generation sequencing workflow that facilitates unbiased discovery of somatic mutations across the full range of variant allele frequencies, which could help identify residual disease vulnerabilities for targeted adjuvant therapies.
Introduction
Adult diffuse gliomas are a group of malignant, infiltrative tumors intrinsic to the central nervous system that arise from glial cells such as astrocytes and oligodendrocytes (1). The World Health Organization (WHO) Classification of Tumors of the Central Nervous System partitions adult diffuse gliomas into three types: (i) WHO grade 4 glioblastoma, isocitrate dehydrogenase (IDH)-wildtype, (ii) WHO grade 2–4 astrocytoma, IDH-mutant, and (iii) WHO grade 2–3 oligodendroglioma, IDH-mutant and 1p/19q-codeleted (1). Hereafter, these tumors are referred to as glioblastoma, astrocytoma, and oligodendroglioma, respectively. Among these tumor types, median survival is shortest for glioblastoma, the most common of all diffuse gliomas (8–15 months; refs. 2–4), and longest for grade 2 oligodendroglioma (15–18 years; refs. 2, 5, 6). Although the extent of resection is a clinically well-recognized prognostic factor within each diagnostic category, adult diffuse gliomas inevitably recur with few exceptions even after gross total resection (7–10). Because recurrence typically arises from the peritumoral edema that extends beyond radiographic evidence of the bulk tumor (11, 12), profiling somatic mutations in the malignant cells that infiltrate the surrounding brain parenchyma may prove pivotal for the successful implementation of postsurgical adjuvant therapies.
To date, few studies have profiled somatic mutations arising from tumor cells beyond the surgical margins of adult diffuse gliomas. Using microarray-based technologies, early studies detected amplifications, deletions, and aneuploidies in the peritumoral edema of patients with glioblastoma (13, 14). Subsequently, whole-exome sequencing found putative truncal mutations from primary and recurrent glioblastoma to be divergent in peritumoral regions (15, 16). Although these studies established molecular evidence of residual disease in the peritumoral edema of high-grade gliomas, two key gaps in our knowledge persist. First, mutations unique to migratory tumor cells in peritumoral edema and absent in the bulk tumor remain unmapped in glioblastoma. Second, the extent and profile of somatic mutations in peritumoral edema associated with diffuse gliomas beyond glioblastoma remain unknown. The ability to close these knowledge gaps has been hampered by technical limitations associated with next-generation sequencing (NGS) and is further complicated by the reduced tumor content in the peritumoral edema. Specifically, true somatic mutations with a low allele frequency discovered during an untargeted search regardless of tissue location may be indistinguishable from NGS-associated noise.
NGS-associated noise is the presence of nonreference alleles in DNA sequencing reads that are caused by the accrual of errors during an NGS workflow (e.g., PCR errors, sequencing artifacts, alignment errors, etc.). The median error rate for Illumina sequencing is platform dependent and ranges from 0.087% to 0.613% (17). In practical terms, an error rate of 0.1% applied to a small 100-kb hotspot panel at a read depth of 100× yields an astounding 10,000 errors. While a majority of the error will occur as a single nonreference allele count at positions distributed across the genome, errors can also accumulate at a specific genomic coordinate, which increases the allele frequency of the nonreference allele. For germline analyses that classify variants in the heterozygous and homozygous state, these cumulative errors that can cause a false positive to have an allele frequency generally <5% are inconsequential because the expected allele fraction of a true variant is 50% and 100%, respectively. However, as investigations have increasingly sought to detect low-frequency genomic variants (i.e., <1%) for a variety of emerging applications, such as the detection of circulating tumor DNA in blood (18) or somatic mosaicism in healthy tissues (19), methods to minimize sequencing errors have become critically important.
A commonly employed approach to reduce NGS errors has been duplex barcoding for labeling each double-stranded DNA molecule with a unique molecular identifier. Duplex barcoding tracks amplicons from both the sense and antisense strands of DNA and subsequently uses bioinformatics to combine all sequencing reads linked to the original DNA molecule into a single consensus sequence (20). Practical applications of duplex barcoding; however, have encountered two key limitations. First, early PCR errors persist and the empirical gains in error correction have been substantially less than originally theorized (Fig. 1A; refs. 21–23). Second, read depth remains low because sequencing both the sense and antisense strands originating from the same molecule among the large milieu of DNA in the amplicon pool has been inefficient. Strategies to overcome these obstacles have included the development of novel duplex adapters (21, 22), probability modeling of position error (21, 23, 24), reductions in sample complexity to increase the likelihood of sequencing both strands (23), and conducting a two-fold iterative panel capture-enrichment (i.e., using the output from the first capture as the input for a second capture; ref. 25), among other proposed methods (26, 27). While each approach reduces error, discerning the appropriate methodology for data analysis becomes challenging, particularly once the multitude of variables associated with NGS such as total paired reads, sample complexity, read depth, etc., are considered.
Residual false positives with high allele frequencies persist after traditional duplex sequencing. A, The perception versus reality of duplex sequencing is diagrammed. In step 0, the sense (yellow) and antisense (blue) strands of double-stranded DNA (dsDNA) are shown with a true mutation. In step 1, duplex molecular barcodes are used to label dsDNA. The perception of duplex barcoding (left) is that amplicons of the sense and antisense strands are individually tracked back to the strand of origin from step 2 onwards. Single-stranded DNA (ssDNA) is first grouped on the basis of strand origin for consensus sequence determination (i.e., singleton; left), followed by double-stranded DNA sequence determination of sense and antisense strands (i.e., duplex, left– step 4). Sequencing artifacts and PCR errors are presumed to be removed. In reality, however, sense and antisense strands are made from both original strands during PCR amplification and strand origin is unknown (step 2; right). During step 3 (right), representations of sense (and antisense) amplicons are grouped regardless of strand origin. The inability to distinguish true strand origin for each amplicon allows a subset of PCR errors, particularly early PCR errors, to persist in the final dsDNA consensus sequence (step 4; right). B and C, The effects of singleton and duplex consensus sequence determination on read depth and false positives are shown, respectively. The dashed and dotted gray lines in B identify the theoretical maximum read depth for a 100-ng DNA library input and the maximum observed read depth for duplex sequencing (∼4,750×), respectively. D, The distribution of false positives (FP) based on allele frequency for duplex sequencing. Lines in B–D represent the mean value of eight healthy controls.
Residual false positives with high allele frequencies persist after traditional duplex sequencing. A, The perception versus reality of duplex sequencing is diagrammed. In step 0, the sense (yellow) and antisense (blue) strands of double-stranded DNA (dsDNA) are shown with a true mutation. In step 1, duplex molecular barcodes are used to label dsDNA. The perception of duplex barcoding (left) is that amplicons of the sense and antisense strands are individually tracked back to the strand of origin from step 2 onwards. Single-stranded DNA (ssDNA) is first grouped on the basis of strand origin for consensus sequence determination (i.e., singleton; left), followed by double-stranded DNA sequence determination of sense and antisense strands (i.e., duplex, left– step 4). Sequencing artifacts and PCR errors are presumed to be removed. In reality, however, sense and antisense strands are made from both original strands during PCR amplification and strand origin is unknown (step 2; right). During step 3 (right), representations of sense (and antisense) amplicons are grouped regardless of strand origin. The inability to distinguish true strand origin for each amplicon allows a subset of PCR errors, particularly early PCR errors, to persist in the final dsDNA consensus sequence (step 4; right). B and C, The effects of singleton and duplex consensus sequence determination on read depth and false positives are shown, respectively. The dashed and dotted gray lines in B identify the theoretical maximum read depth for a 100-ng DNA library input and the maximum observed read depth for duplex sequencing (∼4,750×), respectively. D, The distribution of false positives (FP) based on allele frequency for duplex sequencing. Lines in B–D represent the mean value of eight healthy controls.
Here, we systematically consider both experimental and in silico methods that increase sensitivity and reduce the error rate to construct True2, a duplex-based sequencing approach that enables the detection of variants with an allele frequency >0.05%. We establish a roadmap for True2 sequencing to achieve a specific false positive rate at ultra-deep read depths and then benchmark sensitivity using a reference standard. Subsequently, True2 is used to conduct an unbiased search for somatic mutations in the bulk tumor and the adjacent, unresected peritumoral edema from high-grade and low-grade adult diffuse gliomas. Our approach leads to the discovery and profiling of a diverse and unexpected mutational landscape in the tumor and peritumoral edema of adult diffuse gliomas that would otherwise remain hidden without the inclusion of somatic mutations present below commonly used NGS detection limits.
Materials and Methods
Participants and sample collection
Adults (age ≥18 years) presenting to the University of Utah with radiographic evidence of a new high-grade to low-grade diffuse glioma were recruited for enrollment. Healthy controls for donation of white blood cell (WBC) DNA were recruited from the University of Utah. All study procedures were approved by the University of Utah Institutional Review Board prior to study initiation. All study participants signed written informed consent prior to enrollment.
Brain tumor patients were categorized into Group A (contrast-enhancing) and Group B (nonenhancing) based on their diagnostic MRI. In preparation for surgical treatment, standard tumor preoperative imaging was obtained and included acquisitions for the following contrasts: T1-weighted, T2-weighted, FLAIR, postcontrast T1-weighted, diffusion-weighted, and either postcontrast T1-weighted (group A) or T2-weighted (Group B) stereotactic sequences. Surgery was performed within 24 hours of the imaging studies. Stereotactic MRI images were loaded on the intraoperative navigation system and used to select the desired regions of tumor biopsy sites. All surgeries were performed under the same attending neurosurgeon with >30 years of experience. Biopsies for Group A patients included areas of preoperative T1-weighted contrast enhancement (suspected tumor, i.e., “tumor” or “bulk tumor”) and areas without contrast enhancement but within adjacent T2/FLAIR signal abnormality (i.e., “peritumoral edema”). Biopsies for Group B patients included areas with preoperative T2-weighted and FLAIR signal abnormalities (suspected tumor, i.e., “tumor” or “bulk tumor”) and areas outside of the T2/FLAIR signal abnormalities (i.e., “peritumoral edema”). Patients underwent maximal safe surgical resection of either enhancing abnormality (Group A) or T2/FLAIR abnormality (Group B). Samples corresponding to peritumoral edema were acquired a minimum of 1 cm from the bulk tumor. The tumor was removed using the intraoperative navigation system, which allowed for documentation and correlation of the location of the tissue removed with the preoperative imaging studies. The majority of the tumor was sent for standard pathological examination and tumor grade determination according to the WHO classification system (1). Only tumors 2 cm in diameter or larger were included in this study to ensure that the small amount of tumor used as part of this study would not interfere with the ability of the neuropathologist to make an accurate diagnosis. Patients then underwent an intraoperative MRI using the same MRI sequences as had been obtained preoperatively. Resection was extended to any remaining abnormality on the intraoperative imaging. When feasible, additional tumor biopsies were acquired from areas with residual T1-weighted contrast enhancement (“tumor” or “bulk tumor” in Group A) or residual T2/FLAIR signal abnormalities (“tumor” or “bulk tumor” in Group B). Areas outside of the suspected tumor were also biopsied for comparison (i.e., “peritumoral edema”). Biopsy sites were verified by the navigation system and recorded. All study biopsy tissues were snap-frozen in liquid nitrogen. Prior to the extraction of DNA, a slice from the tissue was used to make an hematoxylin and eosin slide to estimate the tumor content by the study neuropathologist. Estimation of tumor content was performed blinded to the underlying clinical diagnosis, tissue of origin (bulk tumor vs. peritumoral edema), and NGS results.
Whole blood was collected in either EDTA or Streck BCT tubes via venipuncture. For patients with brain tumor, blood was collected prior to surgery. Within 24 hours of sample collection, whole blood was centrifuged at 1,900 × g for 20 minutes. After the removal of plasma, the buffy coat containing WBC DNA was isolated and stored at −80°C.
MGG8 cell line
MGG8 stem-like cells were acquired from Dr. Hiroaki Wakimoto at Massachusetts General Hospital (Boston, MA; ref. 28). The MGG8 stem-like cell line was isolated from a newly diagnosed glioblastoma patient. MGG8 stem-like cells were grown in DMEM/F-12 media (Thermo Fisher Scientific) supplemented with GlutaMAX (3 mmol/L; Thermo Fisher Scientific), retinoic acid-free B27 supplement (Thermo Fisher Scientific), N2 supplement (Thermo Fisher Scientific), heparin (5 mg/mL; Sigma), EGF (20 ng/mL; PeproTech), and FGF2 (20 ng/mL; PeproTech) to establish neurosphere cultures. Cell media was supplemented with growth factors every 3 to 4 days with fresh media exchange every 7 days. Cells were split or passaged as needed with Trypsin or Versene (Thermo Fisher Scientific) to maintain cultures as small neurospheres. MGG8-organoid cultures were initiated by soft-pelleting 0.2×106 cells in a 1.5 mL Eppendorf tube. The seed-organoid was then transferred into a 12- or 6-well plate for growth. MGG8-organoid cultures received fresh EGF and FGF supplements every 2–3 days, with fresh media exchanges at 5–7 days. Large organoids were subcultured (via gentle pipette-transfer) into fresh wells every 14–17 days. MGG8 neurospheres and organoids were harvested via centrifugation and then snap-frozen in liquid nitrogen for subsequent DNA isolation. MGG8 cells (and organoids) were started in culture at P26. MGG8 cells (P39) and organoids were harvested after 6 weeks of tumor growth.
DNA extraction, library preps, and sequencing
WBC DNA and MGG8 DNA were extracted using the QIAamp DNA Blood Mini Kit (Qiagen). DNA from frozen patient samples was extracted using the DNeasy Blood & Tissue Kit on the QIAcube (Qiagen). HD701 DNA, a quantitative multiplex reference standard composed of genomic DNA from three different human colon cancer cell lines (HCT116/RKO/SW48) was purchased from Horizon Discovery Ltd.
Libraries for all DNA samples were prepared using the NEBNext Ultra II FS DNA kit (NEB) following the manufacturer's instructions with minor alterations. The enzymatic shearing during library preparation was performed for 10 minutes at 37°C. Dual-index, dual molecularly barcoded adapters (xGen CS Adapter, IDT) were used during adapter ligation. The dual-index was used for sample identification during demultiplexing and to minimize errors caused by template switching. The molecular barcodes are unique molecular identifiers (3-mers) added to each end of the insert for tracking amplicons. During library preparation, the 3 μL NEB USER Enzyme was not used after ligation since IDT adapters were being used and the volume of AMPure XP Beads during postligation cleanup was reduced from 57 μL to 55 μL to maintain the 0.8× ratio. DNA input (100 ng) was used for all library preps. In addition, the eight WBC DNA samples used for error profiling and the HD701 DNA used 50 ng DNA inputs for additional library generation. Both the 100 ng and 50 ng DNA inputs for HD701 DNA were performed independently in duplicate. PCR amplification used six and seven PCR cycles for the 100 ng and 50 ng inputs, respectively. After library generation, hybridization capture was performed with the xGen NGS Hybridization Capture Kit (IDT) using a custom-designed 114 kb panel targeting hotspot exons from 115 genes and the TERT promoter region (IDT; Supplementary Table S1) following the manufacturer's instructions. Library DNA (500 ng) was used as input and each panel capture enrichment was single-plexed. Each library from the eight WBC DNA samples used for error profiling and HD701 DNA underwent a single (1×) and iterative (2×) panel capture enrichment. Iterative panel capture enrichment (2×) was accomplished by performing a second panel capture enrichment using the output from a single panel capture-enrichment as the input. The second panel capture enrichment is a complete iteration of the first panel capture enrichment effectively doubling preparation time and consumable costs, most notably blocking oligos and panel probes. PCR cycles prior to sequencing are detailed in Supplementary Table S2. All samples underwent paired-end (150×150) NGS using the Illumina NovaSeq 6000 (v1.5).
Bioinformatics
Consensus sequence generation
The NovaSeq 6000 BCL files were demultiplexed using bcl2fastq2 version 2.20 with dual index barcodes to create sample-specific FASTQ files that exclude reads associated with index hopping (29). FASTQ files from WBC DNA used for error analysis and HD701 DNA used for sensitivity were trimmed to 10, 20, 30, 40, 50, 60, 80, 100, 120, 140, 160, and 180 million total paired reads (TPR). The maximum value of 180 million TPR represented the highest minimum TPR present for all WBC DNA and HD701 DNA samples. FASTQ files from WBC DNA used for error modeling specific to the panel were trimmed to 100 million total paired reads. FASTQ files from tumor DNA were not trimmed and the maximum TPR available for each sample was used. Mean TPR was 141 ± 16 million (median: 142 million; range: 103 million to 181 million; Supplementary Table S3). Preconsensus, singleton collapsed, and duplex collapsed aligned reads (GRCh37) from FASTQ files were achieved using BWA, Picard, Samtools, and FGBIO. Although the adapters used in the experiment labeled double-stranded DNA by adding a 3-mer UMI to both ends of the original molecule, to assess the effect of singleton adapters the two 3-mers were combined into a 6-mer for consensus determination of single-stranded DNA. A comprehensive description of each step with an example command line entry is provided in Supplementary Data S2. Applications available in USeq were used to measure read depth and base counts on BAM files (Supplementary Data S2). Somalier, a tool for estimating relatedness in cancer and germline studies (30), was used on duplex BAM files originating from the adult diffuse glioma cohort to detect sample swaps.
Proper-pair-plus
The initial step used the samtools view flag “-f 0×2” to identify reads where each segment was properly aligned according to the aligner with a minimum alignment quality (MQ) of ≥20. Unmapped reads (MQ = 255) were removed. The resulting SAM file was then searched to select reads with a uniquely matched pair. Finally, only reads with one of the following cigar structures were included: “XM”, “XMYIZM,” or “XMYDZM,” where X, Y, and Z were integers, M is a match, I is an insertion, and D is a deletion. If an insertion or deletion was present, X and Z were ≥5.
Alignment and base quality
The initial generation of bam files, mpileup files, and base count files used ≥MQ20 and a base quality (BQ) ≥20. To examine the effects of alignment and base quality on error, additional mpileup and base count files were generated using all combinations of alignment (MQ20, MQ30, MQ40, MQ50, and MQ60) and base quality (BQ20, BQ30, BQ40, and BQ50) scores. Of note, the FGBIO algorithm combines base quality scores during duplex consensus sequence determination resulting in higher base quality values for a base call compared to Q-scores associated with each read from an Illumina sequencer.
End trimming
The first step in converting FASTQ files to BAM files was to create an unmapped BAM file from the FASTQ files (Supplementary Data S2). Read structure (“3M2S146T”) identified the location of the UMI (“3M”), the number of bases between the UMI and the insert (“2S”), and the insert length (“146T”). For end trimming, the read structure was incrementally altered to achieve zero to five base-pair trimming from both ends of the insert. For example, to trim 1 bp from the end the read structure was “3M3S145T.” Subsequently, the trimmed insert went through the entire sequence alignment and consensus calling pipeline outlined in Supplementary Data S2 to produce BAM files for read depth and base count analysis.
Family size
The “FilterConsensusReads” FGBIO application was used to select family size (Supplementary Data S2). All initial duplex data required agreement between strands (flag: —require-single-strand-agreement true) and representation from the sense and antisense strands of at least one molecule each (i.e., single-stranded DNA FS ≥1; flag: -M 2 1 1). Increments in family size required that both sense and antisense strands match the family size criteria (e.g., FS ≥2 used the flag: -M 2 2 2).
Removal of potential contamination
Single-nucleotide polymorphisms (SNP) with a minor allele frequency of at least 1% mapped to a single location in the reference genome (i.e., common SNPs) were downloaded from dbSNP build 151 (ftp.ncbi.nlm.nih.gov/snp). The specific criteria used to define common SNPs is available at http://www.genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=899875475_jlFIn3eyAUiLV8aG7Y6G8TZR42ea&g=snp151Common&hgTracksConfigPage=configure. During in silico error reduction and detection of variants in glioma tumor samples, variants corresponding to common SNPs (Supplementary Data S2) were removed from the analysis.
Common error modeling
Libraries were prepared from the WBC DNA of 26 participants using 100 ng DNA input, duplex barcoding, and 1X panel capture enrichment with our 114 kb custom-designed panel. The 26 participants included both healthy controls and glioma patients. FASTQ files were trimmed to 100 million TPR for all samples. In silico error correction included sense and antisense agreement and the cigar structure described under proper-pair-plus. To increase the identification of regions and positions susceptible to error the maximum error rate for a single consensus base was increased to 1 (default value is 0.1). Base count files were generated from collapsed consensus sequences using a minimum base quality of BQ20. A nonreference allele with ≥2 counts was considered an error. Two types of common errors were identified: (i) an error present in ≥2 samples; and (ii) a 50-bp region with an error at ≥10% of the positions across samples. For the former, the singular genomic position was recorded in a blacklist. For the latter, all positions within the 50-bp window were included in the blacklist. From the 26 samples, one sample was randomly selected as the test sample while the remaining 25 samples were randomly ordered and used to generate a pool-of-normal incrementally increased in size from 1 to 25 samples. For each pool-of-normal size, common error was established and the residual error in the test sample, where an error was defined as a panel position with a nonreference allele count ≥1 and absent from the blacklist, was tabulated and categorized on the basis of the variant allele frequency. When the pool-of-normal sample size was one, a nonreference allele with ≥2 counts was considered a common error. This process was performed for 1,000 iterations to study the effects of sample order and the minimum number of samples necessary to achieve the identification of common errors.
The final blacklist was constructed from all 26 samples. Although the genomic coordinates associated with the two common TERT promoter mutations (228C>T, 250C>T) had a low error rate, the positions were within a 50 bp window of high error that was included on the blacklist. Because TERT promoter mutations are common to glioblastoma and oligodendroglioma, these two positions were removed from the final blacklist.
Somatic mutation calling
Nonreference alleles identified as variants after the True2 workflow and absent in the patient's WBC DNA as a single nucleotide polymorphism (variant allele frequency >20%) were passed to Variant Effect Predictor (31) for annotation. Somatic mutations in each tissue sample were identified independently of all other tissue samples within and between patients. For each sample, SAM files were searched to determine if mutations occurring in the same gene at a different position were present/absent in the same read.
NGS comparisons to histology
Copy-number variation
Sequencing data from the same 26 samples used to model common error (WBC DNA) were used to train the CNV model. The sequencing data in the singleton configuration was used to maximize read depth to provide a narrower normal distribution for each gene during modeling. The median read depth rounded to the nearest thousand (14,000×) from the 26 samples was used for normalization – the read depth for each sample was multiplied by the normalization factor so that the average read depth was similar across all samples. The mean, normalized, read depth for each gene was then used to measure the mean and standard deviation of read depth from all 26 samples. The presence of a CNV was defined for each gene as the ±3.891 × mean (99.99th percentile). A comprehensive description of each training and testing step with an example Python script is provided in Supplementary Data S4. The specificity of the CNV model was tested on the WBC DNA from eight healthy controls described above using the singleton data from 100 million total paired reads. Detection of CNVs was tested on DNA from the MGG8 cell line grown in culture as cells and organoids. Subsequently, CNVs were identified in tumor tissue samples using singleton data and 100 million total paired reads.
Truth set generation for sensitivity analysis
HD701 DNA is a reference standard composed of DNA from three human colon cancer cell lines (HCT116/RKO/SW48). To generate a range of target VAFs, HD701 DNA was serially diluted with WBC DNA from a healthy control (Control 4: 25 yrs, female) using a 1:2 ratio at each iteration. A portion of the output from each dilution was used as input for the next dilution. Four dilutions were produced with theoretical VAF reductions of 1/3rd (dilution 1), 1/9th (dilution 2), 1/27th (dilution 3), and 1/81st (dilution 4). The undiluted HD701 DNA sample and a subset of the diluted HD701 DNA samples (dilutions 2, 3, and 4) underwent library preparation (100 ng and 50 ng DNA inputs), single (1×) and iterative (2×) panel capture enrichment, and NGS identical to the procedures described above.
Potential variants were identified in the undiluted HD701 DNA data derived from 100 ng input, duplex adapters, a single panel capture enrichment, 180 million TPR, family size ≥1, and prior to exclusion of reads without proper-pairing and target cigar structure. The VAFs we observed with duplex sequencing strongly correlated with the manufacturer's reported VAFs measured with droplet digital PCR indicating that any loss of DNA during the NGS workflow was random (Supplementary Fig. S1A). Single-nucleotide variants (SNV) with an allele frequency >1% in at least one replicate and >0.9% in both replicates were selected for use. In addition, the EGFR p.Thr790Met was included because the variant was confirmed by the manufacturer to be present in HD701 DNA at an expected allele frequency of 1% even though our study replicates measured the VAF at 0.97% and 0.79%. Because we sought to measure the sensitivity of variants with an allele frequency <0.1% in the diluted HD701 DNA, a potential variant with an average nonreference allele count >2 in the corresponding sequencing data from the eight control samples was eliminated to minimize noise as a confounder of sensitivity. The remaining 72 variants in 47 genes were established as true variants (Supplementary Table S4). Not all variants were present in the manufacturer's whole-exome sequencing VCF files for the cell lines that comprise HD701, which may be attributable to differences in read depth.
To establish dilution factors, true SNVs with an average allele frequency >0.5% in the diluted DNA from both replicates were used to measure the ratio between the VAF in the diluted sample and the previously diluted sample (or undiluted sample for the first dilution; Supplementary Fig. S1B). Variants with a VAF <0.5% were not used during the calculation of dilution factors because relatively minor differences in low VAFs may substantially skew the ratio. Dilution factors were then determined using the median value of the dilution factors (Supplementary Fig. S1C) and applied to the mean value of the two replicates for the undiluted 72 variants to establish 288 SNVs with an allele frequency ranging from 0.01% to 69.6% divided into six categories (Supplementary Table S4). Sensitivity results are reported as the mean of the two replicates.
Statistical analysis
All statistical tests were performed with GraphPad Prism 9 (version 9.4.1). The independent Student t test was used for between-group comparisons. Pearson correlation was used to evaluate the association between groups. Kaplan–Meier analysis was used to assess overall survival. Results were considered statistically significant for P < 0.05.
Data availability
The BED file used during analysis has been included in the Supplemental Data Files as ‘Supplement_1-gbmPanel.bed’ (Supplementary Data S1). A complete set of instructions to generate singleton and duplex consensus sequence determination has been included in the Supplementary Data Files as “Supplement_2-ConsensusCalling” (Supplementary Data S2). A complete list of the common SNPs removed during variant detection has been included in the Supplementary Data Files as “Supplement_3-humanCommonSnpsGRCh37_gbmPanel.vcf” (Supplementary Data S3). Commented Python scripts for copy number variation analysis have been included in the Supplementary Data Files as “Supplement_4-CNV_methods” (Supplementary Data S4). The blacklist file used to exclude specific genomic coordinates based on error has been included in the Supplementary Data Files as “Supplement_5-blacklist.txt” (Supplementary Data S5). Data used to compile results and figures are provided as an Excel spreadsheet in the Supplementary Data Files as “Supplement_6-FigureData.xlsx” (Supplementary Data S6). All associated demultiplexed FASTQ files are available for download in the NCBI Sequence Read Archive (SRA) repository (accession PRJNA1039491). All other raw data are available upon request from the corresponding author.
Results
Construction and validation of True2 sequencing
Read depth and false positives were the two principal endpoints tracked during the testing of experimental and in silico error reduction methods. As a point of reference, we first measured read depth and error for our custom-designed capture-enrichment panel (114 kb or 1.14 × 105 bp, 115 genes; Supplementary Table S1; Supplementary Data S1) prior to consensus sequence determination in WBC DNA obtained from eight healthy controls. Read depth increased linearly with total paired reads (TPR) to a maximum of approximately 100,000× at 180 million TPR (Supplementary Fig. S2A), which is a similar TPR used during whole genome sequencing (3.2 × 109 bp) at a 15–20× read depth. An error was defined as a nonreference allele count excluding insertions/deletions. Counts associated with sample-specific, germline, single-nucleotide polymorphisms (SNP) were removed from the error tally by only including counts associated with a nonreference allele having a cumulative allele frequency <20%. The error rate was relatively uniform at approximately 0.03% for all TPR (Supplementary Fig. S2B), which was below the median error rate of 0.109% recently reported for the NovaSeq (17), suggesting the standard experimental methods used herein did not introduce error beyond other commonly used protocols. We tabulated >250,000 false positives per 100-kb panel positions, where a false positive was defined as a nonreference allele with a count ≥1 and an allele frequency <20% (Supplementary Fig. S2C). Insertions/deletions were not included. As expected, the implementation of molecular barcodes reduced both read depth (Fig. 1B) and false positives (Fig. 1C). Application of barcodes in the singleton and duplex configurations reduced read depth below the theoretical maximum consistent with the consensus collapse of single-stranded and double-stranded DNA amplicons, respectively. The associated error reduction with barcode use, which was greater with duplex sequencing, occurred predominately with false positives at a variant allele frequency (VAF) <0.1% (Supplementary Fig. S3A). The persistence of false positives with an allele frequency >0.1% (Fig. 1D) indicated that further improvements in error suppression were necessary to enable the untargeted detection of somatic mutations regardless of VAF.
Given the overall favorable initial error profile, we used duplex technology as the foundation for True2 sequencing. We then systematically tested experimental and in silico methods to reduce the total false positive rate inclusive of all VAFs to <1 false positive per 100-kb panel positions (Fig. 2A). Selection of aligned reads corresponding to proper-pair-plus criteria and increments in minimum base quality and alignment quality reduced the quantity of total false positives in all categories with <2% loss of read depth separately and combined (Fig. 2B; Supplementary Fig. S3B–S3I). Applying a threshold to the minimum number of counts for a nonreference allele to qualify as a true variant reduced false positives with a VAF <1% but had limited effects on the other VAF categories (Supplementary Fig. S4A and S4B). Accounting for potential contamination (e.g., template switching) by removing variants identified as common in the dbSNP database had a minimal overall effect (Supplementary Fig. S5). Thus, we hypothesized that false positives associated with a VAF >1% may be caused by recurrent PCR errors, residual alignment artifacts, or both at specific genomic coordinates or regions of the genome. By mapping common error using a set of WBC DNA, we found that increments in the number of controls to define the pool-of-normal continuously reduced false positives in all VAF categories (Fig. 2C). Notably, the false positive rate for variants with an allele frequency >1% fell below 1 per 100 kb using ≥3 samples to define the pool-of-normal supporting our conjecture that specific locations on the panel were highly susceptible to error. A final blacklist using all 26 WBC DNA samples as a pool-of-normal was compiled for subsequent use, which resulted in a 3.09% loss of panel positions (Supplementary Fig. S6; Supplementary Data S4). Subsequent reapplication of a thresholding strategy to proper-pair-plus data selected for base quality ≥50, alignment quality ≥60, and removal of potential contamination and common error allowed the workflow to achieve the target error rate of <1 false positive per 100 kb at a threshold value ≥5 (Fig. 2D; Supplementary Fig. S7). Collectively, these criteria constitute the True2 duplex sequencing workflow (Fig. 2A). Read depths and false positive rates for True2 versus singleton and the underlying duplex technology are shown in Fig. 2E and F, respectively. Importantly, True2 was constructed from commercially available reagents and open-source materials (Supplementary Data S2) allowing for rapid adaptation by any entity.
True2 integrates key experimental and in silico modules to minimize false positives and achieve high sensitivity for variant detection across all allele frequencies. A, The experimental and in silico modules that were included and excluded in the final True2 workflow based on systematic testing. B, False positive counts for allele frequency categories are shown for duplex adapters after the application of proper-pair-plus criteria and selection for BQ ≥ 50 and MQ ≥ 60. C, Identification of common errors specific to the custom-designed panel using a pool-of-normal was necessary to minimize false positives with a high allele frequency, particularly once the sample size of the pool-of-normal exceeded two (data from 180 million total paired reads). D, Using proper-pair-plus duplex data, requiring BQ ≥ 50 and MQ ≥ 60, and removing false positives present in the dbSNP database or associated with positions harboring common error enabled the total false positive rate to be reduced to <1 per 100 kb by applying a threshold value of ≥5 counts for a nonreference allele to qualify as a false positive. E, True2 had a minimal reduction on duplex sequencing read depth. The dashed and dotted lines in read depth identify the theoretical maximum read depth (28,000×) and the maximum observed read depth for duplex sequencing (∼4,750×), respectively. F, The cumulative false positive rate for each type of sequencing output is displayed. All data in B, D, E, and F represent the mean value from eight independent samples. Data in C are the mean value from 1,000 random permutations of the pool-of-normal at each sample size. G, To benchmark sensitivity, a truth set of 288 SNVs distributed across a range of allele frequencies was generated using a dilution series derived from a reference standard. H, True2 sensitivity for each allele frequency category of truth set variants is displayed. Data lines in H represent the mean value of two replicates. The gray dashed line identifies 90% sensitivity and the black dotted line corresponds to 33% sensitivity.
True2 integrates key experimental and in silico modules to minimize false positives and achieve high sensitivity for variant detection across all allele frequencies. A, The experimental and in silico modules that were included and excluded in the final True2 workflow based on systematic testing. B, False positive counts for allele frequency categories are shown for duplex adapters after the application of proper-pair-plus criteria and selection for BQ ≥ 50 and MQ ≥ 60. C, Identification of common errors specific to the custom-designed panel using a pool-of-normal was necessary to minimize false positives with a high allele frequency, particularly once the sample size of the pool-of-normal exceeded two (data from 180 million total paired reads). D, Using proper-pair-plus duplex data, requiring BQ ≥ 50 and MQ ≥ 60, and removing false positives present in the dbSNP database or associated with positions harboring common error enabled the total false positive rate to be reduced to <1 per 100 kb by applying a threshold value of ≥5 counts for a nonreference allele to qualify as a false positive. E, True2 had a minimal reduction on duplex sequencing read depth. The dashed and dotted lines in read depth identify the theoretical maximum read depth (28,000×) and the maximum observed read depth for duplex sequencing (∼4,750×), respectively. F, The cumulative false positive rate for each type of sequencing output is displayed. All data in B, D, E, and F represent the mean value from eight independent samples. Data in C are the mean value from 1,000 random permutations of the pool-of-normal at each sample size. G, To benchmark sensitivity, a truth set of 288 SNVs distributed across a range of allele frequencies was generated using a dilution series derived from a reference standard. H, True2 sensitivity for each allele frequency category of truth set variants is displayed. Data lines in H represent the mean value of two replicates. The gray dashed line identifies 90% sensitivity and the black dotted line corresponds to 33% sensitivity.
Additional experimental and in silico methods were evaluated during the development of True2 but were excluded from the final pipeline (Fig. 2A) because of marginal overall effects on error reduction, adverse effects on read depth, or both. The absence of noise reduction with insert trimming (Supplementary Fig. S8A) was likely due to less damage caused by enzymatic shearing during library preparation compared to mechanical shearing used in other studies (32). The selection of larger single-stranded DNA family sizes reduced false positives with a VAF <1%, but a substantial reduction in read depth was also observed with each family size increment (Supplementary Fig. S8A). Using a two-fold (2×) iterative panel capture enrichment method improved on-target performance and read depth on average by approximately 7% and 20%, respectively, compared with conventional single (1×) panel capture-enrichment. However, the gains in read depth at 2× were matched by increasing the TPR at 1X (Supplementary Fig. S8B). Compared with the original 100-ng input, read depth was reduced with a 50 ng input (i.e., less complex sample), but the reduction was <50%, suggesting that using a less complex sample as a library input increased the efficiency of paired strand identification (Supplementary Fig. S8B). This conjecture is supported by the larger mean family sizes observed in less complex samples (Supplementary Fig. S8B). The potential gain in error correction associated with larger family sizes may have been mitigated by the two-fold increase in PCR cycles during iterative panel capture enrichment because total false positive counts were increased compared with the single capture enrichment approach (Supplementary Table S2; Supplementary Fig. S8B). The principal gain in error reduction for VAFs <1% through the selection of larger family sizes (i.e., ≥2) was similar regardless of sample complexity and capture enrichment method (Supplementary Fig. S8C). Overall, we did not find compelling evidence in our direct comparisons to integrate these additional experimental and in silico methods into True2 from an error reduction perspective. Nevertheless, some or all of these strategies may be beneficial in future studies depending on the study design.
Benchmarking True2 sensitivity with a reference standard
We next sought to measure the sensitivity of variant detection across different allele frequency categories using the optimized True2 workflow. HD701 DNA is a reference standard composed of DNA from three human colon cancer cell lines (HCT116/RKO/SW48). We diluted HD701 DNA with WBC DNA to establish 288 SNVs with an allele frequency ranging from 0.01% to 69.6% divided into six categories (Fig. 2G; Supplementary Table S4). Using a 100-ng DNA input, True2 sensitivity improved with each increase in TPR (Fig. 2H). At 180 million TPR, the overall sensitivity for SNVs with a VAF ≥0.1% was 97.6%. Notably, 89.1% and 36.8% of SNVs with an allele frequency of 0.1–0.3% and <0.1%, respectively, were detected at 180 million TPR (Fig. 2H). For these two VAF categories, the read depth, allele frequency, and counts associated with each detected and missed SNV are shown in Supplementary Fig. S9A–S9C. Although 46.7% of variants with a VAF between 0.05% and 0.1% were detected, none were detected with a VAF ≤0.05%. Greater TPR to increase both variant and wildtype counts may further improve sensitivity. As with read depth, reducing the DNA library input from 100 ng to 50 ng did not cause a 50% loss in sensitivity likely because the reduced sample complexity enabled greater detection of paired strands (Supplementary Fig. S10A and S10B). Using iterative panel capture-enrichment for 100 ng and 50 ng DNA inputs yielded a similar sensitivity for all categories of allele frequencies providing further evidence that iterative panel capture enrichment may have limited application within the True2 paradigm (Supplementary Fig. S10C and S10D).
Mutation burden in the tumor and peritumoral edema from adult diffuse gliomas
Next, we used True2 to profile somatic mutations in the tumor and peritumoral edema of adult diffuse gliomas. Twenty-two adult patients with primary diffuse gliomas were enrolled (mean age: 46.9 ± 18.3 yrs; range: 22–86 yrs; 40.9% female; Supplementary Table S5) – eight patients with glioblastoma, one patient with grade 4 astrocytoma, five patients with grade 2–3 astrocytoma, seven patients with grade 2–3 oligodendroglioma, and one patient with neurofibromatosis type 1 (NF1)-associated low-grade neuroglioma. Thirty-eight frozen tissue samples from bulk tumor and 31 frozen tissue samples from peritumoral edema were acquired. Representative stereotactic magnetic resonance images to demonstrate the locations of bulk tumor and peritumoral edema are shown in Fig. 3A. Tumor was histologically identified in 98.6% (68/69) of all tissue samples. As expected, tumor cell content by histology was significantly higher in bulk tumor compared with tissue from stereotactically guided sampling of peritumoral edema (73.0 ± 27.6% vs. 45.5 ± 33.4%, respectively; P < 0.001), but the ranges for each strongly overlapped (tumor: 1–95% with a median of 80% vs. edema: 0–95% with a median of 50%; Fig. 3B). The lower tumor content (<5%) observed in bulk tumor arose from patients with grade 2 astrocytoma and NF1-associated low-grade neuroglioma. Of the 20 patients with at least one peritumoral edema sample available for analysis, 15 (75%) were found to have a tumor content ≥50% by histology in one or both samples.
True2 uncovers a similar mutation burden between tumor and peritumoral edema in high-grade and low-grade adult diffuse gliomas. A, Postcontrast T1-weighted images from a patient with glioblastoma (left) show an enhancing, midline lesion (green arrowhead). T2-weighted images for an oligodendroglioma patient with a hyperintense mass (blue arrowhead) in the operculum are also displayed (right). In both panels, the yellow crosshairs intersect at the location of tissue samples obtained from the bulk tumor (top row) and peritumoral edema (bottom row). B, The tumor purity by histology for tumor (T) and peritumoral edema (PE) is shown for all histopathologies. C, True2 measurement of VAFs for principal driver mutations associated with each tumor type are shown. In B and C, solid lines identify the mean value; values of zero are not shown because of the logarithmic scale. D, The association between histology and VAF is shown, where a regression line and zero values are included. E, Mutation burden as defined by the average number of mutations per tumor and peritumoral edema sample is shown for each tumor type. F, Overall survival for patients undergoing initial surgery at our center from 2016 to 2022 is shown for glioblastoma, astrocytoma, and oligodendroglioma. The dashed gray lines identify 20.6 months as the median overall survival for patients with glioblastoma. GBM, glioblastoma; astro, astrocytoma; oligo, oligodendroglioma; NF1, NF1-related low-grade neuroglioma. ns, not significant; *, P < 0.05; **, P < 0.01; ***, P < 0.001.
True2 uncovers a similar mutation burden between tumor and peritumoral edema in high-grade and low-grade adult diffuse gliomas. A, Postcontrast T1-weighted images from a patient with glioblastoma (left) show an enhancing, midline lesion (green arrowhead). T2-weighted images for an oligodendroglioma patient with a hyperintense mass (blue arrowhead) in the operculum are also displayed (right). In both panels, the yellow crosshairs intersect at the location of tissue samples obtained from the bulk tumor (top row) and peritumoral edema (bottom row). B, The tumor purity by histology for tumor (T) and peritumoral edema (PE) is shown for all histopathologies. C, True2 measurement of VAFs for principal driver mutations associated with each tumor type are shown. In B and C, solid lines identify the mean value; values of zero are not shown because of the logarithmic scale. D, The association between histology and VAF is shown, where a regression line and zero values are included. E, Mutation burden as defined by the average number of mutations per tumor and peritumoral edema sample is shown for each tumor type. F, Overall survival for patients undergoing initial surgery at our center from 2016 to 2022 is shown for glioblastoma, astrocytoma, and oligodendroglioma. The dashed gray lines identify 20.6 months as the median overall survival for patients with glioblastoma. GBM, glioblastoma; astro, astrocytoma; oligo, oligodendroglioma; NF1, NF1-related low-grade neuroglioma. ns, not significant; *, P < 0.05; **, P < 0.01; ***, P < 0.001.
Subsequently, the specificity and sensitivity of True2 were leveraged for the unbiased detection of somatic mutations in samples from the tumor and peritumoral edema. Specifically, variant information in one tumor or peritumoral edema sample was not used to guide variant detection in any other sample. The mean True2 read depth from all tumor and peritumoral edema samples was 3,863× (Supplementary Fig. S11A). Total paired reads ranged between 103 million to 181 million (mean: 141.6 million; Supplementary Table S3; Supplementary Fig. S11A and S11B), which is in the range of True2 validation yielding <1 total false positives per 100 kb panel positions. Prior to variant calling, all WBC, tumor, and peritumoral edema sequencing files (N = 91) were assessed for relatedness to confirm the correct assignment of samples to each patient (Supplementary Fig. S12). Because True2 did not detect variants with an allele frequency ≤0.05% during validation, only variants with an allele frequency >0.05% were recorded. True2 detected a common diagnostic mutation specific to each tumor type [i.e., TERT promoter (pTERT) for glioblastoma, IDH1 or IDH2 for astrocytoma and oligodendroglioma] in 95.7% (66/69) of samples. For comparisons to histology, a ploidy correction that accounted for location-specific variation in read depth was applied to VAFs (Supplementary Fig. S13). Although the mean allele frequency of variants was higher in tumor compared with edema (28.1 ± 15.2% vs. 17.5 ± 15.2%, respectively; P = 0.006), there was a strong overlap in the ranges (tumor: 0.73–59.6% vs. edema: 0–48.3%; Fig. 3C). Tumor purity measured histologically was significantly correlated with VAF by NGS for the overall cohort (Fig. 3D) and for each diagnosis separately (Supplementary Fig. S14A–S14D). However, VAFs within a histologic burden of tissue were variable. For example, glioblastomas with dense tumor by histology (>75%) exhibited pTERT mutations with a wide range of allele frequencies (4.8% to 51.7%). The contrary was also present in each diagnostic category. Astrocytomas with a low density of tumor by histology (<10%) were found to have a similarly wide range of IDH1/2 allele frequencies (0% to 37.8%). The discordance in a subset of samples between histology and NGS most likely reflects the use of a relatively small portion of tissue for histology that may not have been fully representative of tumor burden in the frozen tissue mass used for sequencing.
True2 detected molecular evidence of tumor in 100% (69/69) of tissue samples when analyzed beyond the common diagnostic mutations in pTERT, IDH1, and IDH2. Hereafter, all VAFs are reported without ploidy correction to show the innate distribution of VAFs discovered. As before, the mean allele frequency of variants was higher in tumor compared with edema (15.5 ± 20.9% vs. 11.2 ± 17.4%, respectively; P = 0.028) and there was a strong overlap in the ranges (tumor: 0.07% to 90.3% vs. edema: 0.10% to 75.6%; Supplementary Fig. S15). Furthermore, the average number of somatic mutations between tumor and edema for each histopathology was remarkably similar and demonstrated a comparable mutation burden between the bulk tumor and peritumoral edema (Fig. 3E). Because of the extensive tumor presence being detected beyond the surgical margin by histology and True2, we considered the possibility that the extent of residual tumor in peritumoral edema in our patients may be associated with worse overall survival, particularly in glioblastoma. Thus, we reviewed survival from adult diffuse glioma patients who underwent initial resection as part of a research study at our center by the study's neurosurgeon between January 2016 and December 2022 (Fig. 3F). The median overall survival for 69 glioblastoma patients was 20.6 months, which was similar or exceeded previous reports for glioblastoma patients undergoing surgery (33, 34). Overall survival for the 38 astrocytoma patients and 23 oligodendroglioma patients was similarly consistent with the literature (Fig. 3F). Survival for the 22 patients included in this study was distributed throughout this larger cohort of patients. As such, the high malignant cell content detected by True2 in peritumoral edema was consistent with the diffusely infiltrative nature of the tumors being studied (11, 12). The sample size for each diagnosis was not sufficient to determine if the severity of infiltrative tumor in the peritumoral edema was an independent predictor of survival. Additional studies are warranted to determine the impact, if any, of peritumoral edema tumor cell content on survival and disease-free survival.
The hidden mutational landscape of adult diffuse gliomas
Recognizing that VAF ≥5% is a common cutoff for somatic mutation detection in tumor tissue, we next sought to understand the potential impact that the inclusion of variants <5% may have on contributing to the mutational landscape. In 69 tumor samples, True2 identified a total of 182 distinct mutations (Supplementary Data S5) in 53 of the 115 genes (46.1%) on our custom-designed panel. Of the 365 mutations detected across all patient samples, 61.1% had a VAF <5%, suggesting an extensive presence of mutations that may commonly go undetected using standard NGS thresholds (Fig. 4A). Importantly, the discovery of somatic mutations with a VAF <5% was not constrained to samples with less tumor burden (i.e., peritumoral edema), but occurred in tumor and peritumoral edema similarly (Fig. 4B). We also found that 42.5% of mutations had a VAF <1%, and nearly one-third (32.6%) had an allele frequency <0.5%. Thus, methods that lower the detection threshold to VAF >1% or even VAF >0.5% will not detect a wide array of mutations that could significantly impact prognosis and treatment.
The unbiased detection of ultra-low frequency variants with True2 reveals a hidden landscape of somatic mutations in high-grade and low-grade adult diffuse gliomas. A, The number of somatic mutations for each histopathology with a VAF <5% was similar or greater to the number of variants with VAF >5%. B, The distribution of somatic mutations above and below 5% VAF was similar between tumor (T) and peritumoral edema (PE). C, The conventional visualization of somatic mutations (VAF >5%) for an oligodendroglioma is compared with variant detection with True2, which reveals a TP53 mutation and multiple CIC mutations, among others at VAFs <5% in the tumor and peritumoral edema. A dashed line between symbols identifies a specific variant present in more than one sample (i.e., shared). A mutation present in only a single sample was considered unique. The light dashed and dark dotted horizontal lines correspond to a VAF of 5% and 1%, respectively. D, The full landscape of somatic mutations for each patient is shown using the format from C. Vertical solid lines separate patients and a patient identifier is at the top of each column. GBM, glioblastoma; astro, astrocytoma; oligo, oligodendroglioma; NF1, NF1-related low-grade neuroglioma; TERT, TERT promoter 228C>T or 250C>T; IDH, IDH1 or IDH2. ns, not significant; *, P < 0.05; **, P < 0.01.
The unbiased detection of ultra-low frequency variants with True2 reveals a hidden landscape of somatic mutations in high-grade and low-grade adult diffuse gliomas. A, The number of somatic mutations for each histopathology with a VAF <5% was similar or greater to the number of variants with VAF >5%. B, The distribution of somatic mutations above and below 5% VAF was similar between tumor (T) and peritumoral edema (PE). C, The conventional visualization of somatic mutations (VAF >5%) for an oligodendroglioma is compared with variant detection with True2, which reveals a TP53 mutation and multiple CIC mutations, among others at VAFs <5% in the tumor and peritumoral edema. A dashed line between symbols identifies a specific variant present in more than one sample (i.e., shared). A mutation present in only a single sample was considered unique. The light dashed and dark dotted horizontal lines correspond to a VAF of 5% and 1%, respectively. D, The full landscape of somatic mutations for each patient is shown using the format from C. Vertical solid lines separate patients and a patient identifier is at the top of each column. GBM, glioblastoma; astro, astrocytoma; oligo, oligodendroglioma; NF1, NF1-related low-grade neuroglioma; TERT, TERT promoter 228C>T or 250C>T; IDH, IDH1 or IDH2. ns, not significant; *, P < 0.05; **, P < 0.01.
To gain insight into the detection of somatic mutations within the bulk tumor and peritumoral edema, we plotted variants present in ≥2 samples or present in only one sample according to tissue type and allele frequency (Fig. 4C and D). As a result, we identified a hidden mutational landscape for adult diffuse gliomas and discerned several observations that merit emphasis. First, in 17 of 22 patients (77.3%), we identified multiple mutations specific to a single gene without an increased mutational burden across all other genes with mutations in the corresponding tumor and peritumoral edema sample (Fig. 5). For example, in one patient with glioblastoma (G5), ten unique NF1 mutations with VAFs from 0.27% to 19.8% were detected, while no other genes harbored more than one mutation. In other patients with glioblastoma, multiple mutations were identified in MTOR (G6) and EGFR (G1, G6, and G8). Multiple mutations were present in TP53 in one patient with glioblastoma (G4), four patients with astrocytoma (A1, A3, A4, and A5), and one patient with oligodendroglioma (O5) with VAFs ranging from 0.10% to 52.9%. Five of seven patients with oligodendroglioma (O1, O2, O4, O5, and O6) were found to have ≥3 different CIC mutations with VAFs ranging from 0.13% to 74.7%. The allele frequency >70% for CIC is likely from a high tumor purity in association with loss of heterozygosity due to 19q deletion, which in combination with loss of 1p is a defining molecular characteristic of oligodendroglioma (1). In the NF1-associated low-grade diffuse glioma, seven distinct NF1 mutations were present from bulk tumor and peritumoral edema. All seven mutations were present at a VAF <3% and three mutations were present at a VAF <0.3%. We also found multiple mutations in PIK3CA (O1; six unique mutations), PTEN (G6 and G7), and MTOR (G6) among others (Fig. 5). None of the gene-specific mutations from the same sample cooccurred on a sequencing read supporting the presence of subclones rather than a multigenic hypermutable state such as that seen in cancers with microsatellite instability (35) and also described in a subset of adult gliomas after treatment with temozolomide (36). Although we observed more transitions than transversions in the subclones (Supplementary Fig. S16), a specific mutational signature was not apparent (37). However, future studies composed of larger sample sizes are necessary to discern the presence/absence of subclone-specific mutational signatures in adult diffuse gliomas. Notably, subclones were recently detected in pediatric H3 K27-altered diffuse midline gliomas that phylogenic analysis demonstrated to be consistent with convergent evolution (38). Thus, a possible etiology for the subclones occurring on different genes described herein is a survival response to patient- and tumor-specific microenvironment selective pressures.
A heat map of the genes containing subclones in at least one patient shows the diversity of genes involved and the clustering of affected genes based on the underlying histopathology. Columns represent either a tumor (T) or peritumoral edema (P) sample. Vertical solid lines separate patients and a patient identifier is at the top. Gray shadowing separates the different histopathologies. Horizontal lines separate genes. For each mutation, the type of variant is identified (i.e., nonsense, missense, etc.). Variants are displayed on the basis of allele frequency. NF1, NF1-related low-grade neuroglioma; sa/sd, splice acceptor or splice donor.
A heat map of the genes containing subclones in at least one patient shows the diversity of genes involved and the clustering of affected genes based on the underlying histopathology. Columns represent either a tumor (T) or peritumoral edema (P) sample. Vertical solid lines separate patients and a patient identifier is at the top. Gray shadowing separates the different histopathologies. Horizontal lines separate genes. For each mutation, the type of variant is identified (i.e., nonsense, missense, etc.). Variants are displayed on the basis of allele frequency. NF1, NF1-related low-grade neuroglioma; sa/sd, splice acceptor or splice donor.
Second, the inclusion of multiple tissues from the same patient identified distinct somatic mutations within and between bulk tumor and peritumoral edema. In patients with two bulk tumor samples, driver mutations were commonly present in one sample but not the other. In glioblastoma, this was observed with PTEN (G1, G2, G6, and G7), EGFR (G3, G6, and G8), MTOR (G6), and PIK3CA (G8). In patient G5, NF1 mutations were present in only three of four tissue samples. NF1 mutations were absent in one of the bulk tumor samples despite high tumor cell content (pTERT VAF = 39.4%), suggesting the association of the tumor with NF1 would have been missed without studying multiple tumor samples. In two grade 2–3 oligodendroglioma patients, TP53 mutations were detected in only one of two bulk tumor samples (O2 and O5). In peritumoral edema, mutations associated with glioblastoma but unique to a single sample occurred in TP53, EGFR, PTEN, NF1, ATRX, SCN9A, and MTOR at an allele frequency ranging from 0.10% to 20.26%. The edema of astrocytoma similarly harbored unique mutations in TP53, PTEN, and NF1, but mutations in EGFR, ATRX, SCN9A, and MTOR were absent. While unique mutations in TP53, EGFR, and MTOR were present in the peritumoral edema of oligodendrogliomas, unique CIC and PIK3CA mutations were also present at an allele frequency between 0.10% and 1.69%. These additional mutations present in the peritumoral edema represent only a portion of the total unique mutations detected, many of which had a VAF <5% (Supplementary Fig. S17). Collectively, these results provide compelling evidence for studying multiple tissue samples from different regions of the same tumor when feasible. Moreover, our data support the detection of mutations with VAFs below commonly used cutoffs (i.e., >5%) to more strongly characterize the true mutational landscape through the identification of common and unique subclones in multiple tissues associated with the same tumor.
Finally, PIK3CA mutations were constrained to higher grade tumors, as detected in two glioblastomas (G6, G8), one grade 4 astrocytoma (A1), and one grade 3 oligodendroglioma (O1). PIK3CA mutations were absent in all five grade 2–3 astrocytomas (A2–A6), all five grade 2 oligodendrogliomas (O3–O7), and the single NF1-associated low-grade neuroglioma. The exclusive presence of PIK3CA mutations in higher grade diffuse gliomas suggested a possible association with disease severity, but further data are needed to adequately assess this hypothesis.
CNVs in adult diffuse gliomas
Finally, we sought to detect evidence of CNVs in the bulk tumor and peritumoral edema of diffuse gliomas. Characteristic findings associated with each histopathology were observed (Fig. 6). Trisomy 7, monosomy 10, PDGFRA and EGFR amplifications, and CDKN2A codeletion among others were evident in glioblastoma. Notably, the PDGFRA/KIT/KDR amplification in patient G4 was detected in peritumoral edema, but absent in the bulk tumor. In patients G5 and G6, an EGFR amplification was detected in only one of the two bulk tumor samples from each patient even though the tumor purity by histology was similar (Supplementary Fig. S18). Moreover, an EGFR amplification was present in only the peritumoral edema of patient G3. Interestingly, tumor heterogeneity of EGFR amplification has been previously described with evidence that overexpression may be related to tumor invasion (39). Oligodendrogliomas readily demonstrated the classic 1p/19q codeletion. In one patient (O1), trisomy 7 was also present in both the bulk tumor and peritumoral edema. CNVs in astrocytoma were less abundant, but in the few that were identified a specific pattern was not present. While some CNVs were present throughout all of a patient's tissues, other CNVs were divergent. For example, a large loss in Chromosome 4, reduction in CDKN2A, and amplification in ABBC9 and KRAS were present in all samples from patient A2. In contrast, an FGFR2 amplification was present in peritumoral edema but absent in the tumor sample from patient A1. The opposite occurred in patient A4 for an FGFR2 amplification. Overall, CNV patterns observed in bulk tumor tended to be echoed in the peritumoral edema, but we also found evidence for heterogeneity within and between tissue regions. Nevertheless, the presence of detectable CNVs in the peritumoral edema of diffuse gliomas further attests to the high tumor burden in tissues that remain after resection.
CNV detection is consistent with the underlying histopathology and demonstrates the high malignant cell burden in the peritumoral edema of high-grade and low-grade adult diffuse gliomas. Heat maps of CNVs are shown for controls (top), tumor (middle), and peritumoral edema (bottom) for identification of codeletions (i.e., biallelic loss) and amplifications. WBC DNA from eight healthy controls was used as a negative control. DNA collected from MGG8 as a cultured cell line (P26 and P39) and organoid (Org 1 and 2) was used as a positive control. Chromosome numbers/letters are shown across the top of each heat map along with p- (orange) and q-arms (purple). A subset of genes associated with CNVs in gliomas (e.g., EGFR, PDGFRA, CDKN2A) or of potential interest (CIC) is highlighted. Vertical lines identify common locations for CNVs associated with high-grade and low-grade diffuse gliomas (e.g., 1p/19q codeletion in oligodendroglioma; trisomy 7 and monosomy 10 in glioblastoma). Gray shadowing separates the different histopathologies. Tumor purity (green bar at right) was derived from histologic assessment and is provided as a measure of malignant cell content.
CNV detection is consistent with the underlying histopathology and demonstrates the high malignant cell burden in the peritumoral edema of high-grade and low-grade adult diffuse gliomas. Heat maps of CNVs are shown for controls (top), tumor (middle), and peritumoral edema (bottom) for identification of codeletions (i.e., biallelic loss) and amplifications. WBC DNA from eight healthy controls was used as a negative control. DNA collected from MGG8 as a cultured cell line (P26 and P39) and organoid (Org 1 and 2) was used as a positive control. Chromosome numbers/letters are shown across the top of each heat map along with p- (orange) and q-arms (purple). A subset of genes associated with CNVs in gliomas (e.g., EGFR, PDGFRA, CDKN2A) or of potential interest (CIC) is highlighted. Vertical lines identify common locations for CNVs associated with high-grade and low-grade diffuse gliomas (e.g., 1p/19q codeletion in oligodendroglioma; trisomy 7 and monosomy 10 in glioblastoma). Gray shadowing separates the different histopathologies. Tumor purity (green bar at right) was derived from histologic assessment and is provided as a measure of malignant cell content.
Discussion
We have developed and validated True2 as a readily accessible NGS workflow for the unbiased discovery of somatic mutations across the full range of variant allele frequencies. Because we systematically evaluated common methods to improve specificity and sensitivity, the True2 methodologies outlined herein may help guide parameter determination and optimization of future study designs seeking to accomplish an unbiased search for somatic mutations using duplex-based sequencing. Our initial application of True2 to the study of high-grade and low-grade adult diffuse gliomas uncovered an extensive population of tumor cell subclones in both the bulk tumor and peritumoral edema. Within these distinct sites, True2 revealed a new mutational landscape consisting of gene-specific subclones, common and unique mutations between different tumor regions, and a mutation burden in the peritumoral edema of unresected tissue similar to the bulk tumor. Combined, our data highlight the need in adult diffuse gliomas for broader tissue sampling and molecular profiling inclusive of ultralow frequency mutations to identify the molecular origins of recurrence, provide greater context to the interpretation of tumor phylogeny, and improve our understanding of individual response to therapy.
Although the spatial origin of recurrent disease has been widely recognized to arise from unresected peritumoral edema, the origin of mutations that drive recurrence remains ambiguous. Using whole-exome sequencing (WES; ∼157×), Spiteri and colleagues studied somatic mutations in the tumor and infiltrative margin from ten glioblastomas and one grade 3 astrocytoma (16). Phylogenetic reconstruction found that subclones in the infiltrative margin diverged early in tumor development and seeded the cancer cell lineages in 11 primary lesions and two recurrent tumors. Because of read depth limitations associated with WES, statistical modeling and custom-designed panels were leveraged to adjudicate the presence/absence of truncal mutations (VAF >5%) in the infiltrative margin. Consequently, the molecular profile of the infiltrative margin was largely constrained to mutations within the bulk tumor, which limits the potential to identify whether recurrent lesions emerged de novo as a consequence of therapy or if therapy allowed for previously present cancerous subclones to emerge from peritumoral regions. Comparisons of bulk tumor mutational profiles between primary and recurrent lesions to similarly examine molecular evolution have yielded conflicting results. The Glioma Longitudinal Analysis Consortium (GLASS) studied 222 patients with glioblastoma, astrocytoma, and oligodendroglioma with primary and recurrent lesions using WES and whole-genome sequencing at a minimum read depth of 15× to detect somatic mutations with a VAF >5% (40). The clonal structure at initial disease was observed to mostly persist into recurrence. Similarly, Lee and colleagues also used WES and an allele frequency >5% for variant calling to report that locally recurrent glioblastoma arose from the same clones as the primary lesion (41). In contrast, Wang and colleagues found that dominant clones present before treatment in 93 patients with glioblastoma were replaced by new clones in recurrent lesions without many of the same mutations (42). The criteria for somatic mutation detection were similar (VAF >5%) within these studies, but read depth on WES was greater at 100× for 76% of coding bases. Finally, Draaisma and colleagues used a 287 gene panel at approximately 700× and an allele frequency >10% for mutation detection in 176 patients with glioblastoma with primary and recurrent tumor pairs and demonstrated that 81.3% of mutations present in the primary tumor were retained at recurrence (43). Collectively, these studies highlight the technical challenges associated with sequencing limitations (e.g., error rate), tumor heterogeneity, and sample impurity. The modest read depths and absence of consensus sequencing to support error reduction in each of these prior studies constrained somatic mutation detection to relatively high allele frequencies. In contrast, True2 commonly observed bulk tumor mutations with a VAF >5% to be present throughout a patient's samples at varying allele frequencies, suggesting stereotypical mutations for a particular histopathology were ubiquitous regardless of clonality. Moreover, we found an extensive cohort of subclones with a VAF <0.5% in peritumoral edema that may be inconsequential or represent the tumor cells resistant to therapy that emerge as the dominant clone at recurrence. Collectively, our data provide compelling evidence that future studies seeking to discern tumor phylogeny, determine molecular origins of recurrent disease, and monitor response to therapy may benefit from the incorporation of not only multiple tissue samples but inclusion of low-frequency somatic mutations to more accurately map the mutational landscape.
We were surprised by the multitude of gene-specific subclones across all histopathologies. There are several notable features of these mutations. First, PIK3CA mutations have been previously associated with grade 3 oligodendroglioma (44). Although our study included only two grade 3 oligodendrogliomas, one tumor was characterized by six different mutations in PIK3CA. In addition, PIK3CA mutations were only detected in grade 3 to 4 diffuse gliomas in this study, suggesting that PIK3CA may be a prognostic marker not only in oligodendroglioma, but more generally in diffuse gliomas. PIK3CA subclones have recently been reported in pediatric diffuse pontine gliomas (38) – a lethal tumor with a principal driver mutation that affects histone H3 proteins. In glioblastoma, the presence of an EGFR amplification, chromosome 7 gain/10 loss, and/or TERT promoter mutation has been associated with reduced overall survival compared with patients with glioblastoma without any of these molecular features (45). Our observation that the detection of subclones in specific genes, such as PIK3CA, occurred in higher grade gliomas suggests that the integration of additional molecular features could more accurately grade disease in diffuse gliomas. However, a larger cohort with survival data is needed to evaluate this conjecture. Second, subclones of a single gene were found not only in tumor but were also common in peritumoral edema, suggesting gene-specific subclones may be prevalent outside the tumor margin. Notably, our study may have underestimated the extent of subclones present because only a few genes (e.g, ALK, EGFR, MTOR, PTEN, and TP53) had full exonic coverage while the overwhelming majority of genes on the panel (e.g., CIC, NF1, PIK3CA, etc.) targeted hotspot exons. Future studies may benefit from full coverage of genes associated with increased subclone occurrence to optimally characterize subclone extent. Third, the VAF associated with single gene subclones ranged widely. Most commonly, an affected gene was associated with a somatic mutation at a high allele frequency (>5%) with an entourage of distinct mutations in the same gene at lower allele frequencies. The etiology for a specific gene to be subclonal within a tumor is unclear. While mutations in mismatch repair genes and microsatellite instability is a plausible hypothesis, additional genes within the same tumor should have similarly manifested multiple mutations as seen in hypermutable glioblastomas with an associated mismatch repair gene variant (46). Thus, the presence of mutations within a specific gene may not represent a hypermutable state but rather reflect selective pressures in the tumor microenvironment favoring survival for alterations in a particular gene or pathway (i.e., convergent evolution; ref. 38). For example, Draaisma and colleagues has previously reported the replacement of driver mutations in EGFR, PTEN, NF1, RB1, and TP53 with other mutations in the same gene at recurrence of glioblastoma (43). Identification of subclonal genes/pathways in diffuse gliomas may help understand the selective pressures for a particular tumor and subsequently guide therapy to improve outcomes.
While the error correction afforded by duplex adapter sequencing is strongly favorable compared to preconsensus sequencing and singleton adapters, the read depth required to achieve very low frequency variant detection has proven challenging to achieve. Early studies reported that duplex sequencing rendered read depths inadequate for rare variant detection, particularly when library inputs were low (21). Thus, duplex barcoding was leveraged for error correction when observed, but otherwise single-stranded data was relied upon (21). To increase duplex sequencing yield, Chabon and colleagues subsequently combined a larger fraction of library into a capture with greater TPR to demonstrate the feasibility of achieving duplex read depths of ∼1,700× (22). Our True2 approach builds upon these prior studies by using a high TPR and small panel size to achieve ultra-deep duplex sequencing at common DNA input amounts. Importantly, True2 does not rely on statistical modeling for variant detection in the NGS workflow, which is a key distinction compared with prior investigations. The use of statistical analyses to adjudicate the presence/absence of variants is indicative of the persistent residual error in duplex sequencing. A potential confounder, however, is the stochastic component of NGS-associated noise that may evade error modeling and lead to false positives (47). Instead of statistical modeling, we identified locations and regions with noise common across multiple samples to effectively mitigate errors associated with a high allele frequency. While such locations are likely specific to each panel, library preparation method, and sequencer, our data indicate that relatively few control samples were necessary to generate an impactful exclusion list. Furthermore, we note that maximizing pool-of-normal size may universally improve the overall error profile. To minimize low-frequency errors, we found that the simplistic approach of thresholding effectively eliminated background noise. In so doing, however, somatic mutation detection below 0.1% was likely reduced, resulting in false negatives. For example, in astrocytoma patient A6 the IDH1 R132H mutation present in the bulk tumor was absent in peritumoral edema. Inspection of the raw data file, however, found that four counts for the variant were present but since the total was below the threshold requirement the variant was not recorded. Thus, additional noise reduction strategies that allow for a lower threshold without compromising read depth may further improve the sensitivity of somatic mutations below 0.1%, reduce the necessary TPR, or both. Regardless, we anticipate the implementation of True2 in future investigations of cancers and studies of mosaicism in healthy tissues will enrich discoveries through the unbiased detection of variants across all allele frequencies.
Authors' Disclosures
H. Underhill reports grants from NCI, NIH, and grants from the National Institute of Neurological Disorders and Stroke (NINDS) during the conduct of the study. M. Karsy reports Thieme Medical (royalties), Hoth (consulting), Altus Medical (consulting). S. Hellwig reports personal fees from ARUP Laboratories during the conduct of the study and personal fees from ARUP Laboratories outside the submitted work. No disclosures were reported by the other authors.
Authors' Contributions
H.R. Underhill: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. M. Karsy: Conceptualization, resources, data curation, formal analysis, investigation, methodology, writing–review and editing. C.J. Davidson: Conceptualization, data curation, formal analysis, investigation, methodology, writing–review and editing. S. Hellwig: Data curation, formal analysis, methodology, writing–review and editing. S. Stevenson: Data curation, formal analysis, methodology, writing–review and editing. E.A. Goold: Data curation, formal analysis, writing–review and editing. S. Vincenti: Investigation, writing–review and editing. D.L. Sellers: Conceptualization, funding acquisition, investigation, methodology, writing–original draft, project administration, writing–review and editing. C. Dean: Data curation, investigation, writing–review and editing. B.E. Harrison: Data curation, investigation, writing–review and editing. M.P. Bronner: Resources, project administration, writing–review and editing. H. Colman: Resources, writing–review and editing. R.L. Jensen: Conceptualization, resources, supervision, funding acquisition, investigation, methodology, writing–original draft, project administration, writing–review and editing.
Acknowledgments
This study received grant support from the NCI (R37CA246183 to H. Underhill) and the National Institute for Neurological Disorders and Stroke (R01NS118247 to D. Sellers; UG3NS132134 to H. Underhill) of the NIH. The computational resources used were partially funded by the NIH Shared Instrumentation Grant 1S10OD021644–01A1. This study was conducted with support from the Biorepository and Molecular Pathology Shared Resource and the Cancer Biostatistics Shared Resource supported by the Cancer Center Support Grant awarded to the Huntsman Cancer Institute by the NCI of the NIH (P30CA042014).
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
References
Supplementary data
Data used to construct figures.