Abstract
Microsatellite instability (MSI) and high tumor mutation burden (TMB-High) are promising pan-tumor biomarkers used to select patients for treatment with immune checkpoint blockade; however, real-time sequencing of unresectable or metastatic solid tumors is often challenging. We report a noninvasive approach for detection of MSI and TMB-High in the circulation of patients.
We developed an approach that utilized a hybrid-capture–based 98-kb pan-cancer gene panel, including targeted microsatellite regions. A multifactorial error correction method and a novel peak-finding algorithm were established to identify rare MSI frameshift alleles in cell-free DNA (cfDNA).
Through analysis of cfDNA derived from a combination of healthy donors and patients with metastatic cancer, the error correction and peak-finding approaches produced a specificity of >99% (n = 163) and sensitivities of 78% (n = 23) and 67% (n = 15), respectively, for MSI and TMB-High. For patients treated with PD-1 blockade, we demonstrated that MSI and TMB-High in pretreatment plasma predicted progression-free survival (hazard ratios: 0.21 and 0.23, P = 0.001 and 0.003, respectively). In addition, we analyzed cfDNA from longitudinally collected plasma samples obtained during therapy to identify patients who achieved durable response to PD-1 blockade.
These analyses demonstrate the feasibility of noninvasive pan-cancer screening and monitoring of patients who exhibit MSI or TMB-High and have a high likelihood of responding to immune checkpoint blockade.
See related commentary by Wang and Ajani, p. 6887
Microsatellite instability and mismatch repair deficiency represent the first pan-cancer biomarker indication approved for treatment of patients with the immune checkpoint inhibitor, pembrolizumab. However, tumor biopsy or resection tissue is not easily obtained for genetic testing, and, therefore, more accessible alternatives must be explored. Detection of circulating tumor DNA derived from plasma provides a viable alternative due to its noninvasive nature and ability to capture tumor heterogeneity and affords the possibility of monitoring patient response to therapy. Here, we describe the development of a liquid biopsy method to identify tumors in patients with MSI and high tumor mutation burden and demonstrate the efficacy of the approach for determination of response to immune checkpoint blockade.
Introduction
Microsatellite instability (MSI) and mismatch repair (MMR) deficiency have recently been demonstrated to predict response to immune checkpoint blockade (1, 2). The checkpoint inhibitor pembrolizumab is indicated for the treatment of patients with any unresectable or metastatic solid tumors identified as having either of these biomarkers (1, 2). The accumulation of somatic mutations in cancers has the potential to result in the expression of neoantigens, which may elicit T-cell–dependent immune responses against tumors (3–5). MMR is a mechanism by which postreplicative mismatches in daughter DNA strands are repaired and replaced with the correct DNA sequence. MMR deficiency results in both MSI and high tumor mutation burden (TMB-High), which increases the likelihood that acquired somatic mutations may be transcribed and translated into proteins that are recognized as immunogenic neoantigens. Historically, testing for MSI has been restricted to screening for hereditary non-polyposis colorectal cancer (HNPCC), which is often characterized by early age onset colorectal cancer and endometrial cancer, as well as other extracolonic tumors (6, 7). HNPCC, commonly referred to as Lynch syndrome, is caused by mutations in the DNA MMR genes (MLH1, MSH2, MSH6, and PMS2; refs. 8–15), as well as the more recently described, EPCAM (16). In addition to familial conditions, MSI can occur sporadically in cancer, and both hereditary and sporadic MSI patients respond to immune checkpoint blockade (1, 2). A recent study, conducted across 39 tumor types and 11,139 patients to determine the landscape of MSI prevalence, concluded that 3.8% of these cancers across 27 tumor types displayed MSI, including 31.4% of uterine/endometrial carcinoma, 19.7% of colon adenocarcinoma, and 19.1% of stomach adenocarcinoma (17, 18).
MSI can be detected by measuring the length of altered microsatellite sequences; these alterations are typically deletions of repetitive units that change the lengths of the sequences in tumor DNA as compared with matched-normal DNA. Current methods for MSI testing, using tissue biopsies and resection specimens, include polymerase chain reaction (PCR)-based amplification, followed by capillary electrophoresis (19), and more recently, next-generation-sequencing (NGS)–based approaches (17, 20–24), which are used to quantify microsatellite allele lengths. Both methodologies have sensitivity limitations for tissue applications due to polymerase-induced errors (stutter bands) and inaccurate estimation of homopolymer lengths in the PCR-based and NGS-based approaches, respectively.
Because MSI has become a valuable marker that predicts a robust response to checkpoint blockade, we were interested in utilizing circulating tumor DNA (ctDNA) to assess MSI status in patients with metastatic disease. Such an approach would be desirable because it is often not possible to readily obtain biopsy or resection tissue for genetic testing due to insufficient samples (biopsy size and tumor cellularity), exhaustion of specimens after pathologic analyses, logistical considerations for obtaining tumor and normal samples after initial diagnosis, or safety concerns related to additional tissue biopsy interventions (25). Plasma-based approaches offer the unique opportunity to obtain a rapid and real-time view of the primary tumor and metastatic lesions along with associated response to therapy. ctDNA can be used to monitor and assess residual disease in response to clinical intervention, such as surgery or chemotherapy (26–34), which can directly impact patient care. A novel method was recently described for determination of MSI in liquid biopsies using pre-PCR elimination of wild-type DNA homopolymers (35), but it simply reports MSI status and was not designed to interrogate multiple genetic alterations as will be required in the future for tumor-profiling applications. To determine the clinical impact of identifying tumors that harbor MSI and TMB-High, we developed and applied a 98-kb 58-gene targeted panel to noninvasively assess patients with cancer with advanced disease treated with PD-1 blockade.
Materials and Methods
Patients and sample collection
Formalin-fixed paraffin-embedded (FFPE) tumor and matched normal buffy coat specimens (n = 61) from individuals with cancer were obtained after surgical resection through commercial biorepositories from BioIVT, Indivumed, and iSpecimen. Plasma samples from healthy individuals (n = 163) were procured through BioIVT during routine screening with negative results and no prior history of cancer. Human cells from previously characterized MSI cell lines were obtained from ATCC (n = 5; LS180, LS411N, SNU-C2B, RKO, and SNU-C2A). Baseline and serial plasma samples from patients with cancer with progressive metastatic carcinoma were obtained while patients were enrolled in a phase II clinical trial to evaluate immune checkpoint blockade with pembrolizumab (1, 2). Radiographic and serum protein biomarker data for CEA and CA19-9 were collected as a part of routine clinical care. All samples were obtained under institutional review board–approved protocols with informed consent for research. Orthogonal testing of FFPE tissue was performed for MSI status using the Promega MSI analysis system as recommended by the manufacturer.
Sample preparation and NGS
FFPE tumor and normal analyses.
Sample processing from tissue or buffy coat, library preparation, hybrid capture, and sequencing were performed as previously described at Personal Genome Diagnostics (36, 37). Briefly, DNA was extracted from FFPE tissue and matched normal buffy coat cells using the Qiagen FFPE Tissue Kit and DNA Blood Mini Kit, respectively. Genomic DNA was sheared using a Covaris sonicator and subsequently used to generate a genomic library using the New England Biolabs end-repair, A-tailing, and adapter ligation modules. Finally, genomic libraries were amplified and captured using the Agilent SureSelect XT in-solution hybrid capture system with a custom panel targeting the predefined regions of interest across 125 genes (Supplementary Table S1). Captured libraries were sequenced on the Illumina HiSeq 2000/2500 with 100-bp paired end reads.
Plasma analyses.
Sample processing from plasma, library preparation, hybrid capture, and sequencing were performed as previously described at Personal Genome Diagnostics (36). Cell-free DNA (cfDNA) was extracted from plasma using the QIAamp Circulating Nucleic Acid Kit. Libraries were prepared with 5 to 250 ng of cfDNA using the NEBNext DNA Library Prep Kit. Targeted hybrid capture was performed using Agilent SureSelect XT in-solution hybrid capture system with a custom panel targeting the predefined regions of interest across 58 genes (Supplementary Table S4) according to the manufacturer protocol. Captured libraries were sequenced on the Illumina HiSeq 2000/2500 with 100-bp paired end reads. For limit of detection analyses, the 1% mutant allele fraction (MAF) level was chosen on the basis of the expected level of distinct sequencing coverage across the mononucleotide tracts, assuming a binomial distribution, requiring a minimum of three distinct observations.
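The limit-of-detection reasoning above (a binomial model with a minimum of three distinct observations) can be illustrated with a short calculation. This is a minimal sketch, not the authors' implementation: the function name `prob_at_least_k` and the example coverage values of 500 and 1,000 are assumptions chosen for illustration, while 2,600 mirrors the distinct coverage reported for healthy donors in the Results.

```python
from math import comb


def prob_at_least_k(n_fragments: int, maf: float, k: int = 3) -> float:
    """Probability of seeing at least k mutant fragments.

    Models the number of mutant fragments as Binomial(n_fragments, maf),
    where n_fragments is the distinct coverage at a mononucleotide tract.
    """
    p_fewer_than_k = sum(
        comb(n_fragments, i) * maf**i * (1 - maf) ** (n_fragments - i)
        for i in range(k)
    )
    return 1.0 - p_fewer_than_k


# Illustrative coverage levels (assumed values) at a 1% mutant allele fraction.
for coverage in (500, 1000, 2600):
    print(coverage, round(prob_at_least_k(coverage, 0.01), 3))
```

Under this model, the chance of meeting the three-observation requirement at a fixed MAF rises with distinct coverage, which is why the detection threshold is tied to the expected coverage across the tracts.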
MSI analyses by NGS
Sequence data were aligned to the human reference genome assembly (hg19) using BWA-MEM (38). Reads mapping to microsatellites were excised using Samtools (38) and analyzed for insertion and deletion events (indels). In most cases, alignment and variant calling did not generate accurate indel calls in repeated regions due to low-quality bases surrounding the microsatellites. Therefore, a secondary local realignment and indel quantitation was performed. Reads were considered for an expanded indel analysis if (i) the mononucleotide repeat was contained to more than eight bases inside of the start and end of the read, (ii) the indel length was ≤ 12 bases from the reference length, (iii) there were no single-base changes found within the repeat region, (iv) the read had a mapping score of 60, and (v) ≤ 20 bases of the read were soft clipped for alignment. After read-specific mononucleotide length analysis, error correction was performed for the mononucleotide indel to allow for an accurate quantitation among duplicated fragments using molecular barcoding. Indels were error corrected by using the ordered and combined read 1 and read 2 alignment positions with the molecular bar code. An indel with a given bar code was considered for downstream analysis if it had at least two observations and >50% of indels with the given bar code had consistent mononucleotide lengths. The error-corrected mononucleotide length distribution based on indel size was subjected to a peak-finding algorithm where local maxima were required to be greater than the error-corrected distinct fragment counts of the adjacent lengths ± 2 bp. Identified peaks were further filtered to include only those that had > 3 error-corrected distinct fragments at 1% or more of the absolute coverage. The shortest identified mononucleotide allele length was compared with the hg19 reference length.
If the allele length was ≥ 3 bp shorter than the reference length, the given mononucleotide locus was classified as exhibiting instability. The sum of the error-corrected distinct coverage of the shortest allele length for all tracts classified as MSI was divided by the sum of the total error-corrected distinct coverage for all tracts classified as MSI to generate an MSI MAF. This approach was applied across all mononucleotide loci. In the targeted 58-gene plasma panel, the BAT25 (chr4:55598211-55598236 hg19), BAT26 (chr2:47641559-47641586 hg19), MONO27 (chr2:39536689-39536716 hg19), NR21 (chr14:23652346-23652367 hg19), and NR24 (chr2:95849361-95849384 hg19) mononucleotide loci were used for the determination of MSI status. In the 125-gene targeted tissue panel, an additional 65 microsatellite regions were used for MSI classification (Supplementary Fig. S5).
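The barcode-based error correction, peak finding, and MSI MAF calculation described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' software: all function names are hypothetical, `fragment_key` stands in for the combined read 1/read 2 alignment positions plus molecular bar code, and "absolute coverage" is assumed to mean the total error-corrected distinct fragment count at the locus. The toy read counts are invented.

```python
from collections import Counter, defaultdict


def consensus_lengths(reads):
    """Collapse reads into one tract length per bar-coded fragment.

    reads: iterable of (fragment_key, tract_length) pairs.
    """
    families = defaultdict(list)
    for key, length in reads:
        families[key].append(length)
    counts = Counter()
    for lengths in families.values():
        length, support = Counter(lengths).most_common(1)[0]
        # Keep a family with >= 2 observations and a >50% majority length.
        if len(lengths) >= 2 and support / len(lengths) > 0.5:
            counts[length] += 1
    return counts


def find_peaks(counts, min_fragments=3, min_fraction=0.01):
    """Return tract lengths that are local maxima within +/- 2 bp."""
    total = sum(counts.values())
    peaks = []
    for length, n in counts.items():
        neighbours = [counts.get(length + d, 0) for d in (-2, -1, 1, 2)]
        is_local_max = all(n > m for m in neighbours)
        # Assumption: coverage = all error-corrected fragments at the locus.
        if is_local_max and n > min_fragments and n / total >= min_fraction:
            peaks.append(length)
    return sorted(peaks)


def is_unstable(peaks, reference_length, min_shift=3):
    """Unstable if the shortest peak is >= 3 bp below the reference."""
    return bool(peaks) and reference_length - min(peaks) >= min_shift


def msi_maf(loci):
    """loci: list of (counts, reference_length) pairs, one per tract."""
    shortest = total = 0
    for counts, reference_length in loci:
        peaks = find_peaks(counts)
        if is_unstable(peaks, reference_length):
            shortest += counts[min(peaks)]
            total += sum(counts.values())
    return shortest / total if total else 0.0


# Toy locus (invented numbers): reference length 26, 200 wild-type
# fragments and 10 fragments shortened by 5 bp, each sequenced twice,
# plus one singleton read that fails the two-observation requirement.
reads = [(f"wt{i}", 26) for i in range(200) for _ in range(2)]
reads += [(f"del{i}", 21) for i in range(10) for _ in range(2)]
reads += [("singleton", 20)]
counts = consensus_lengths(reads)
print(find_peaks(counts), msi_maf([(counts, 26)]))
```

Requiring a peak to exceed its neighbours within ± 2 bp means that a stutter shoulder adjacent to the wild-type allele is not called as a separate allele, whereas a distinct shortened population is.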
TMB analyses by NGS
Next-generation sequencing data were processed and variants were identified using the VariantDx custom software as previously described (36). A final set of candidate somatic mutations was selected for TMB analyses based on the following: (i) variants enriched due to sequencing or alignment error were removed (≤5 observations or <0.30% MAF), (ii) nonsynonymous and synonymous variants were included, but variants arising in noncoding regions were removed, (iii) hot spot variants annotated in COSMIC (version 72) were not included to reduce bias toward driver alterations (requiring a given genomic alteration to be mutated in at least 50 tumors with the exact nucleotide change), (iv) common germline single nucleotide polymorphisms (SNPs) found in dbSNP (version 138) were removed as well as variants deemed private germline variants based on the variant allele frequency, and (v) variants associated with clonal hematopoietic expansion were filtered as previously described and not included in the candidate variant set (36, 40).
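The five selection criteria above amount to a sequence of filters applied to each candidate variant. The sketch below is a hypothetical illustration only: the `Variant` record and its field names are assumptions, and it does not reproduce the VariantDx software or its annotation sources.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical annotated variant record (field names are assumed)."""
    observations: int          # distinct supporting fragments
    maf: float                 # mutant allele fraction; 0.003 == 0.30%
    coding: bool               # synonymous or nonsynonymous coding change
    cosmic_tumor_count: int    # COSMIC tumors with the exact nucleotide change
    in_dbsnp: bool             # common germline SNP in dbSNP
    private_germline: bool     # judged germline from variant allele frequency
    clonal_hematopoiesis: bool


def keep_for_tmb(v: Variant) -> bool:
    """Apply criteria (i)-(v) in order; True if the variant counts toward TMB."""
    if v.observations <= 5 or v.maf < 0.0030:   # (i) likely artifact
        return False
    if not v.coding:                            # (ii) noncoding removed
        return False
    if v.cosmic_tumor_count >= 50:              # (iii) driver hot spot
        return False
    if v.in_dbsnp or v.private_germline:        # (iv) germline
        return False
    if v.clonal_hematopoiesis:                  # (v) clonal hematopoiesis
        return False
    return True


def tmb_count(variants) -> int:
    """Number of candidate somatic mutations retained for TMB."""
    return sum(keep_for_tmb(v) for v in variants)
```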
In silico The Cancer Genome Atlas analyses
To evaluate the accuracy of the 98-kb targeted panel for prediction of TMB, a comparison to whole-exome sequencing data derived from The Cancer Genome Atlas (TCGA; ref. 41) was performed by considering synonymous and nonsynonymous alterations, excluding known hot spot mutations that may not be representative of TMB in the tumor.
MSI and TMB cutoff selection from plasma
For MSI analyses, samples were classified as MSI-H if 20% or more of loci were determined to be MSI, based on previous reports of these targeted loci when evaluating DNA derived from tumor tissue (19). For TMB analyses, the TMB-High cutoff was determined on the basis of in silico analyses of the 58-gene plasma panel compared with the TCGA whole-exome analyses (r = 0.91, P < 0.0001; Pearson correlation; Fig. 2A). In this training cohort, we determined that a cutoff of five mutations (50.8 mutations/Mbp sequenced) in the targeted plasma panel could be used to identify tumors with exceptionally high TMB related to MMR deficiency (>36 mutations/Mbp of the whole exome) with more than 95% accuracy.
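The arithmetic behind these two cutoffs can be made explicit. The sketch below is illustrative and rests on two assumptions not stated in the text: an effective panel size of 0.0984 Mbp, inferred from five mutations corresponding to 50.8 mutations/Mbp, and treating a sample with exactly five mutations as TMB-High. Function names are hypothetical.

```python
PANEL_MBP = 0.0984        # assumed effective size of the "98-kb" panel
TMB_HIGH_MUTATIONS = 5    # cutoff from the in silico TCGA analysis
MSI_HIGH_PERCENT = 20     # percentage of loci that must be unstable


def tmb_per_mbp(n_mutations: int, panel_mbp: float = PANEL_MBP) -> float:
    """Convert a panel mutation count to mutations per Mbp sequenced."""
    return n_mutations / panel_mbp


def is_tmb_high(n_mutations: int) -> bool:
    # Assumption: the cutoff is inclusive (>= 5 mutations).
    return n_mutations >= TMB_HIGH_MUTATIONS


def is_msi_high(n_unstable: int, n_loci: int = 5) -> bool:
    """MSI-H if 20% or more of the evaluated loci are unstable."""
    return 100 * n_unstable >= MSI_HIGH_PERCENT * n_loci


print(round(tmb_per_mbp(5), 1))   # cutoff expressed per Mbp sequenced
print(is_msi_high(1))             # one unstable locus out of five
```

With five plasma loci, the 20% threshold means that a single unstable locus is sufficient for an MSI-H call, so the specificity of the per-locus classification directly determines the per-sample specificity.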
Statistical analyses
Due to small sample size, Firth's Penalized Likelihood was used to evaluate significant differences between Kaplan–Meier curves for progression-free survival and overall survival with the classifiers baseline MSI status and baseline TMB status. Pearson correlations were used to evaluate the significance of the association between TMB in the 58-gene targeted panel compared with whole-exome analyses, progression-free and overall survival compared with residual protein biomarker levels, and progression-free and overall survival compared with residual MSI and TMB allele levels. A Student t test was used to evaluate significant differences between the mean TMB level in TMB-High and TMB-Low patients.
Results
Development of an assay to identify MSI in cfDNA
To identify MSI in tumor-derived cfDNA from the plasma as well as in tissue specimens, we developed a highly sensitive error correction approach incorporating the commonly used mononucleotide tracts BAT25, BAT26, MONO27, NR21, and NR24 (see Materials and Methods). To address the technical challenges associated with low-level allele length polymorphisms obtained from NGS, we combined an error correction approach for accurate determination of insertions and deletions (indels) present in the cfDNA fragments, together with a digital peak-finding (DPF) method for quantification of MSI-High (MSI-H) and microsatellite stable (MSS) alleles. Redundant sequencing of each cfDNA fragment was performed, and reads were aligned to the five microsatellite loci contained in the human reference genome (hg19). cfDNA sequences were then analyzed for indels through a secondary local alignment at these five microsatellite loci to more accurately determine the indel length. To perform the error correction, duplicated reads associated with each cfDNA molecule were consolidated, recognizing only indels present throughout bar-coded DNA fragment replicates obtained through redundant sequencing. Finally, the DPF approach was applied across the error-corrected distribution of indels to identify high-confidence alleles that exhibit MSI (Fig. 1A).
Plasma-based detection of microsatellite instability. A, Across the BAT25, BAT26, MONO27, NR21, and NR24 mononucleotide loci in 23 clinical MSI-H patients, 6 clinical MSS patients, and 163 healthy donor plasma specimens (169 total MSS cases), the error-corrected mononucleotide count distribution was assessed with a DPF algorithm to identify mononucleotide alleles and determine MSI status. Prior to combined bar coding and DPF (raw) and with bar coding alone, the majority of clinical MSS and healthy donor samples exhibit alleles below the cutoff for MSI and MSS classification (red line) making MSI-H and MSS cases indistinguishable. With the DPF algorithm alone and with combined bar coding and DPF, the majority of samples were correctly classified (15/23 MSI; 169/169 MSS and 18/23 MSI; 168/169 MSS, respectively). Kaplan–Meier curves for progression-free survival (B) and overall survival (C) among patients with progressive metastatic carcinoma were determined using MSI status from pretreatment plasma specimens. In MSI patients (n = 18*), median progression-free survival and median overall survival were 16.2 and 16.3 months, respectively. In MSS patients (n = 11*), median progression-free survival and median overall survival were 2.8 and 6.9 months, respectively. *Five patients with a tissue enrollment status of MSI-H were classified as MSS using pretreatment baseline cfDNA obtained from plasma.
To demonstrate the feasibility of this approach, we first evaluated the performance of the method for detection of MSI in FFPE tumor tissue specimens obtained from 31 MSI-H and 30 MSS tumors previously characterized with the PCR-based Promega MSI analysis system. In addition to these five mononucleotide markers, we sequenced 125 selected cancer genes that harbor clinically actionable genetic alterations consisting of sequence mutations (single-base substitutions and indels), copy number alterations, and gene rearrangements in cancer (Supplementary Table S1). Analysis of these five mononucleotide loci, together with 65 additional microsatellite regions contained within the 125-gene panel, resulted in 100% sensitivity (31/31) and 100% specificity (30/30) for determination of MSI status using the patient-matched tumor and normal samples (Supplementary Tables S2 and S3). Across this cohort, MSI was observed as a tract length shortening compared with the matched normal tissue. Based on these data, only mononucleotide tract length shortening was considered for the subsequent analyses.
Next, we evaluated the signal-to-noise ratio in homopolymer regions from NGS data obtained using cfDNA extracted from the plasma of healthy individuals and patients with cancer. Together with the five mononucleotide loci, we developed a 98-kb, 58-gene panel for sequence mutation (single-base substitutions and indels) analyses of clinically actionable genetic alterations in cancer (ref. 36; Supplementary Table S4). To demonstrate the specificity of this approach for direct detection of MSI, we first obtained plasma from healthy donors (n = 163), all of whom would be expected to be tumor-free and MSS. These analyses resulted in 2,600-fold distinct coverage across the 98-kb targeted panel and resulted in a per-patient specificity of 99.4% (162/163) for determination of MSI status (Fig. 1A; Supplementary Tables S5 and S6). The single false-positive result was obtained from a sample with 974-fold distinct coverage, lower than any patient with late-stage cancer evaluated, indicating that the specificity of 99.4% is likely a lower bound for the intended use population. This is consistent with reduced cfDNA yields and lower coverage in healthy donor populations compared with patients with cancer with metastatic disease (36).
Because ctDNA may be present at MAFs less than 5% even in patients with advanced cancer, we characterized the ability of DPF for sensitive and reproducible detection of MSI at low MAFs. Five previously characterized MSI cell line samples obtained from ATCC (LS180, LS411N, SNU-C2B, RKO, and SNU-C2A) were sheared to a fragment profile simulating cfDNA and diluted with normal DNA to yield a total of 25 ng evaluated at 1% MAF. In addition, three of these cell lines (LS180, LS411N, and SNU-C2B) were evaluated at 1% MAF in triplicate within and across library preparation and sequencing runs (Supplementary Table S5). On the basis of the MAF observed in the parental cell line, the cases detected as MSI were confirmed to contain MSI alleles at MAFs of 1.2% to 4.6%, with a median MSI allele MAF of 1.8%. Through our analysis, MSI was detected in 90% (18/20) of samples and demonstrated 93.3% (14/15) reproducibility within and across runs (Supplementary Table S6). For one case that was not detected as MSI, one MSI allele was identified, and for the other case, no MSI alleles were detected.
Assessment of MSI in cfDNA in patients treated with PD-1 blockade
To evaluate the analytical and clinical performance of this approach for determination of MSI in cfDNA from patients with late-stage cancers, we obtained baseline and serial plasma from patients with metastatic cancers, including 19 colorectal, 3 ampullary, 3 small intestine, 2 endometrial, 1 gastric, and 1 thyroid, with or without MMR deficiency, while enrolled in a clinical trial to evaluate response to immune checkpoint blockade with the PD-1 blocking antibody, pembrolizumab (refs. 1, 2; Supplementary Table S7). In total, 23 MSI-H cases and 6 MSS cases, determined through archival tissue-based analyses, were assessed at a baseline plasma time point, and 16 of the patients were evaluated across at least one additional plasma time point, including after approximately 2 weeks, 10 weeks, 20 weeks, and more than 100 weeks.
Patients with MSI tumors as determined by archival tissue analyses had improved progression-free survival (hazard ratio, 0.26; P = 0.014, likelihood ratio test) and overall survival (hazard ratio, 0.27; P = 0.02, likelihood ratio test; Supplementary Fig. S1A and S1B; Supplementary Table S8). In cfDNA, we detected MSI in 78.3% (18/23) of the MSI-H patients and correctly identified 100% (6/6) of the MSS patients (Supplementary Table S6). Of the five cases that were MSI in the tumor tissue and MSS in the cfDNA, three were colorectal tumors (two patients exhibited progressive disease and the third was not evaluable) and two were small intestinal tumors (one patient exhibited a partial response and one exhibited progressive disease). Of these cases, two had no detectable ctDNA, two had low levels of ctDNA (average MAF of 0.4%, 1.1%), and one had an average sequence mutation MAF of 24.7%.
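The performance figures quoted in this section follow directly from the reported counts. As a quick check, the sketch below recomputes them; the helper name `percent` is hypothetical, and the counts are those given in the text and the Fig. 1 legend.

```python
def percent(hits: int, total: int) -> float:
    """Proportion expressed as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * hits / total, 1)


# Counts reported in the text and the Fig. 1 legend.
sensitivity_msi = percent(18, 23)        # MSI-H patients detected in cfDNA
specificity_patients = percent(6, 6)     # MSS patients called MSS
specificity_donors = percent(162, 163)   # healthy donors called MSS
specificity_combined = percent(168, 169) # donors plus MSS patients

print(sensitivity_msi, specificity_patients,
      specificity_donors, specificity_combined)
```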
We evaluated pretreatment MSI status in ctDNA to predict response and clinical outcome to treatment with PD-1 blockade. We assessed progression-free and overall survival to predict clinical outcome. Similar to tissue-based analyses, direct detection of MSI in baseline cfDNA could be used to predict progression-free survival to immune checkpoint blockade (hazard ratio, 0.21; P = 0.001, likelihood ratio test; Fig. 1B) but was not statistically significant for overall survival (hazard ratio, 0.41; P = 0.063, likelihood ratio test; Fig. 1C). When considering only cases for which adequate ctDNA was detected (median sequence mutation MAF ≥0.5%), there were 25/29 cases evaluable, with 17/19 MSI-H cases detected and 6/6 MSS cases detected. For this subset of patients, direct detection of MSI in baseline cfDNA predicted progression-free and overall survival to immune checkpoint blockade (hazard ratio, 0.15; P = 0.001, likelihood ratio test and hazard ratio, 0.26; P = 0.01, likelihood ratio test, respectively; Supplementary Fig. S2A–S2C).
Estimating TMB in ctDNA
In addition to MSI status, we also evaluated the ability of our cfDNA panel to predict TMB across a range of tumor types, using whole-exome sequencing data from 8,493 samples from TCGA (41). We considered synonymous and nonsynonymous alterations identified by TCGA and excluded known driver hot spot mutations as these have been selected during tumorigenesis and may not be representative of TMB in the tumor. These analyses demonstrated a positive correlation between predicted TMB from our targeted 58-gene plasma panel compared with the TCGA whole-exome analyses (r = 0.91, P < 0.0001; Pearson correlation; Fig. 2A). We determined that a cutoff of five mutations in the targeted plasma panel corresponding to approximately 51 mutations/Mbp sequenced could be used to identify tumors with exceptionally high TMB related to MMR deficiency (>36 mutations/Mbp of the whole exome) at more than 95% accuracy.
Plasma-based detection of high tumor mutation burden. A, Using whole-exome sequencing data derived from The Cancer Genome Atlas (TCGA), a significant positive correlation between the tumor mutation burden (TMB) evaluated in the 98-kb targeted regions compared with the whole-exome analyses was observed (r = 0.91, P < 0.0001; Pearson correlation). B, Comparison of the accuracy for determination of the TMB derived from the targeted panel in plasma at baseline compared with whole-exome analyses of matched archival tissue samples in 20 patients yielded a positive trend (r = 0.38, P = 0.095; Pearson correlation). C, The overall TMB status at baseline was assigned as TMB-High or TMB-Low using a cutoff of 50.8 mutations/Mbp sequenced. In total, 13 patients were categorized as TMB-High and 16 patients as TMB-Low, with a median load of 152.4 mutations/Mbp sequenced and 20.3 mutations/Mbp sequenced, respectively. In addition, 163 healthy donor cases were evaluated, all of which were determined to be TMB-Low, with a median load of 0 mutations/Mbp sequenced across the panel. Kaplan–Meier curves for progression-free survival (D) and overall survival (E) among this same cohort of patients were determined using TMB status from pretreatment plasma specimens with a cutoff of 50.8 mutations/Mbp sequenced. In TMB-High patients (n = 13), median progression-free survival and median overall survival were not reached. In TMB-Low patients (n = 16), median progression-free survival and median overall survival were 2.8 and 7.6 months, respectively.
Patients with TMB-High tumors (≥10 mutations/Mbp of the whole exome) as determined by analyses of archival tissue from 20 tumor samples (12 colorectal, three ampullary, two small intestine, one endometrial, one gastric, and one thyroid) had improved progression-free survival (hazard ratio, 0.24; P = 0.021, likelihood ratio test) and overall survival (hazard ratio, 0.28; P = 0.043, likelihood ratio test; Supplementary Fig. S1C and S1D). We also evaluated the accuracy of TMB derived from the targeted panel in 20 baseline plasma samples from these cases compared with whole-exome analyses of tumor and matched normal tissue in the same patients (1, 2), and a similar trend was observed (r = 0.38, P = 0.095; Pearson correlation; Fig. 2B). This correlation was lower than that observed through in silico TCGA analyses, potentially due to the biological variability associated with low ctDNA levels and tumor heterogeneity. These patients were classified as either TMB-High or TMB-Low using a cutoff of 51 mutations/Mbp sequenced (selected through in silico TCGA analyses), which captured 10 of the 15 tumors categorized as TMB-High by archival tissue and provided a statistically significant difference in the TMB classification (P < 0.0001, t test; Fig. 2C). This algorithm was also applied to the same 163 healthy donor plasma samples and 100% (163/163) were determined to be TMB-Low (Fig. 2C). When considering TMB classification as a predictor of clinical outcome for these patients enrolled in a clinical trial to evaluate response to immune checkpoint blockade with the PD-1 blocking antibody, pembrolizumab, baseline plasma TMB-High status was associated with favorable progression-free survival (hazard ratio, 0.23; P = 0.003, likelihood ratio test) and overall survival (hazard ratio, 0.26; P = 0.008, likelihood ratio test; Fig. 2D and E). 
When we considered only cases for which adequate ctDNA was detected (median detected sequence mutation allele fraction ≥0.5%), there were 17/20 cases evaluable with archival tissue data available, with 9/12 TMB-High cases detected and 5/5 TMB-Low cases detected. When compared with progression-free and overall survival, direct detection of TMB in baseline cfDNA (n = 25) could be used to predict response to immune checkpoint blockade (hazard ratio, 0.27; P = 0.013, likelihood ratio test and hazard ratio, 0.23; P = 0.006, likelihood ratio test, respectively) with this ctDNA requirement (Supplementary Fig. S2D–S2F). Interestingly, all five MSI-H patients exhibiting a complete response, determined through archival tissue analyses, were classified as TMB-High through plasma-based analyses, and six of seven MSI-H patients with progressive disease, determined through archival tissue analyses, were classified as TMB-Low through plasma-based analyses (Supplementary Tables S7 and S8).
Assessment of molecular remission and biomarker dynamics in patients treated with PD-1 blockade
In addition to baseline plasma analyses, we hypothesized that molecular remission, as measured by changes in ctDNA levels during treatment, would also be predictive of long-term durable response to immune checkpoint blockade. We first evaluated the utility of monitoring serum tumor protein biomarkers (CA125, CEA, CA19-9, or PSA) for determination of response at 3.5 to 7 weeks posttreatment initiation (Supplementary Table S7). We evaluated 45 patients with metastatic cancers with MMR deficiency and elevated baseline serum tumor protein biomarker levels who were enrolled in a clinical trial to evaluate response to immune checkpoint blockade with the PD-1 blocking antibody, pembrolizumab. We landmarked the first time point between 3.5 and 7 weeks and found that multiple consecutive time points with a more than 75% reduction from the baseline protein biomarker level were associated with improved overall and progression-free survival (hazard ratio, 0.27; P = 0.027 and hazard ratio, 0.38; P = 0.052, likelihood ratio test, respectively; Fig. 3A and B; Supplementary Fig. S3A and S3B). For 12 patients enrolled in this clinical study, evaluation of the on-treatment serial plasma samples for residual ctDNA levels showed a significant inverse correlation between the residual MSI allele levels at last dose and both overall and progression-free survival (r = −0.91, P = 0.0006 and r = −0.98, P < 0.0001, respectively; Pearson correlation; Fig. 3C; Supplementary Fig. S3C); however, only a limited subset of the time points was available for these analyses (see Patients and Methods). By requiring multiple consecutive on-treatment time points with 0% residual alleles displaying MSI, we correctly identified 4 of the 6 MSI patients who would achieve a long-term durable clinical response, all four of whom displayed a complete response (hazard ratio, 0.09; P = 0.032, likelihood ratio test for overall survival; Fig. 3D; Supplementary Fig. S3D).
A similar trend was observed when considering patients with more than 90% decrease in overall TMB across two time points when compared with baseline (HR, 0.07; P = 0.013, likelihood ratio test for overall survival; Fig. 3E and F; Supplementary Fig. S3E and S3F).
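The on-treatment criteria described above share one form: a marker must fall by a stated percentage relative to baseline at consecutive time points. The sketch below shows one way to express that rule under stated assumptions. The function names, the default of two consecutive time points (taken from the figure legend wording), and the `inclusive` flag are illustrative choices, not the study's implementation. The sketch does not perform the 3.5 to 7 week landmarking or any survival modeling.

```python
from typing import Sequence


def percent_reduction(baseline: float, value: float) -> float:
    """Percent decrease from baseline; negative if the marker rose above baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive (undetected baselines are excluded)")
    return 100.0 * (baseline - value) / baseline


def has_sustained_reduction(baseline: float,
                            on_treatment: Sequence[float],
                            threshold_pct: float,
                            min_consecutive: int = 2,
                            inclusive: bool = False) -> bool:
    """True if `min_consecutive` successive time points meet the reduction threshold.

    inclusive=False models "more than X% reduction"; inclusive=True models
    "at least X%", which with X = 100 corresponds to 0% residual alleles.
    """
    run = 0
    for value in on_treatment:
        reduction = percent_reduction(baseline, value)
        met = reduction >= threshold_pct if inclusive else reduction > threshold_pct
        run = run + 1 if met else 0  # any time point that misses the threshold resets the run
        if run >= min_consecutive:
            return True
    return False


# Hypothetical marker values, for illustration only (not study data).
protein_ok = has_sustained_reduction(100.0, [20.0, 10.0, 5.0], threshold_pct=75)
protein_interrupted = has_sustained_reduction(100.0, [20.0, 40.0, 10.0], threshold_pct=75)
msi_cleared = has_sustained_reduction(2.0, [0.5, 0.0, 0.0], threshold_pct=100, inclusive=True)
tmb_borderline = has_sustained_reduction(100.0, [10.0, 10.0], threshold_pct=90)
```

With these made-up inputs, a marker that sits exactly at the threshold (`tmb_borderline`) does not qualify under the strict "more than" reading, which is why the comparison mode is exposed as a parameter rather than fixed.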
Serial plasma-based overall survival analysis for patients treated with immune checkpoint blockade. A, Evaluation of overall survival with the protein biomarker level at last dose (CA125, CEA, CA19-9, or PSA). A significant inverse correlation was observed between the overall survival in months when compared with the residual protein biomarker (r = −0.67, P < 0.001; Pearson correlation). B, Kaplan–Meier curves for overall survival among patients with tissue enrollment status of MSI and detectable protein biomarker levels (n = 45). For patients with two consecutive time points with more than 75% reduction in protein biomarker levels, landmarked 3.5 to 7 weeks posttreatment initiation (n = 12), median overall survival was not reached. For patients with 75% or less reduction in protein biomarker levels (n = 33), median overall survival was 35.1 months. C, Evaluation of overall survival compared with residual MSI allele levels at last dose. A significant inverse correlation was observed between the overall survival when compared with the residual MSI allele levels (r = −0.91, P < 0.001; Pearson correlation). D, Kaplan–Meier curves for overall survival among patients with tissue enrollment status of MSI and detectable MSI status at baseline (n = 9). For patients with two consecutive time points displaying no residual MSI alleles (n = 4), median overall survival was not reached. For patients with multiple time points containing residual MSI alleles (n = 5), median overall survival was 7.64 months. E, Evaluation of overall survival compared with residual TMB levels at last dose. A significant inverse correlation was observed between the overall survival in months when compared with the residual TMB levels (r = −0.95, P < 0.001; Pearson correlation). F, Kaplan–Meier curves for overall survival among patients with tissue enrollment status of MSI and detectable TMB levels at baseline (n = 11). For patients with more than 90% reduction in TMB levels (n = 4), median overall survival was not reached. For patients with 90% or less reduction in TMB levels (n = 7), median overall survival was 7.64 months. “/” indicates a censored data point; “*” indicates cases in which baseline protein biomarker, MSI, or TMB was not detected and were not included in the subsequent analyses; in cases in which residual protein biomarker, MSI, or TMB levels increased when compared with baseline, values of greater than 100% are indicated.
In addition, for three patients (CS97, CS98, and CS00) with a complete response, one patient with a partial response (CS06), and two patients (CS05 and CS94) without a response to immune checkpoint blockade, circulating protein biomarkers (CEA, ng/mL or CA19-9, units/mL) and residual alleles exhibiting MSI and TMB were evaluated over time during treatment (Fig. 4). In each of the patients exhibiting a complete response, there was a concurrent decrease in the circulating protein biomarker levels, the residual MSI alleles, and TMB levels, which correlated with reduced overall tumor volume as assessed by radiographic imaging. Patient CS97 demonstrated a partial radiographic response at 10.6 months; however, the patient had already achieved a 100% reduction in residual MSI and TMB levels at 2.8 months. CS97 then went on to a complete radiographic response at 20.2 months (Supplementary Table S7). Patient CS98 appeared to develop new liver lesions at 20 weeks, suggestive of progressive disease (Supplementary Fig. S4). However, following an initial spike, protein biomarkers and residual MSI and TMB levels demonstrated a biochemical tumor response at 1.3 and 4.8 months. A liver biopsy demonstrated only inflammatory changes in the location where new lesions were noted, suggesting checkpoint therapy-induced inflammation. Radiographic imaging ultimately demonstrated resolution of all hepatic lesions and a 100% reduction in tumor volume at 16.8 months. A similar pattern was observed for patient CS00, in whom a significant reduction in protein biomarker and residual MSI and TMB levels occurred at 1.5 and 0.6 months, respectively; however, radiographic imaging did not demonstrate a 100% reduction in tumor volume until 17 months. These data suggest that the residual MSI allele burden and TMB levels are indicative of overall tumor response to immune checkpoint blockade.
Monitoring of patients during immune checkpoint blockade. For three patients with a complete response to immune checkpoint blockade (CS97 (A), CS98 (B), and CS00 (C)), one patient with partial response (CS06 (D)), and two patients with progressive disease (CS05 (E) and CS94 (F)), residual alleles exhibiting MSI, TMB levels, circulating protein biomarkers (CEA, ng/mL and CA19-9, units/mL), and radiographic imaging were evaluated over time during treatment. In each case exhibiting a complete response, residual MSI and TMB alleles were reduced to 0% mutant allele fraction (MAF) between 0.61 and 4.81 months after first dose. For each patient, the grey horizontal bar represents time on treatment.
Discussion
The checkpoint inhibitor pembrolizumab is now indicated for the treatment of adult and pediatric patients with unresectable or metastatic solid tumors identified as having MSI or MMR deficiency (1, 2). However, it is often not possible to readily obtain biopsy or resection tissue for genetic testing due to insufficient material, exhaustion of the limited material available after prior therapeutic stratification, logistical considerations for tumor and normal sample acquisition after initial diagnosis, or safety concerns related to additional tissue biopsy interventions (25). We have described the development of an analytical method for simultaneous detection of MSI and TMB-High directly from cfDNA and demonstrated proof of concept for the clinical utility afforded through these analyses for the prediction of response to immune checkpoint blockade.
We present the first comprehensive tumor profiling approach for evaluation of MSI status from plasma utilizing NGS. Among previously described NGS-based methods, MSISensor (20), MANTIS (22), and mSINGS (24) extract sequencing reads associated with microsatellite loci to create a distribution, which is compared with a matched normal or a panel of normal samples. MIRMMR (21) utilizes methylation and sequence mutational data from genes in the MMR pathway, from which a regression model is trained, and MSIseq (23) utilizes a single nucleotide variant and indel classifier to determine MSI status. The methods described herein for determination of MSI status are the first to employ error correction of the sequencing reads associated with microsatellite loci through molecular bar coding, together with local maxima detection of low-level microsatellite alleles associated with cfDNA.
Recently, Kim and colleagues have described the use of a 73-gene NGS panel to correlate ctDNA mutational load scores with the mutational load calculated from tumor exome sequencing from 23 patients with metastatic gastric cancer treated with pembrolizumab as salvage treatment (42). A second, larger scale study led by Gandara and colleagues was performed to evaluate the clinical utility of plasma TMB with a 1.1-Mb panel using samples collected prospectively from the POPLAR and OAK randomized clinical trials for second-line or higher patients with non-small cell lung cancer (NSCLC) (43). Interestingly, in our study all five MSI-H patients exhibiting a complete response were classified as TMB-High through plasma-based analyses, and six of seven MSI-H patients with progressive disease, determined through archival tissue analyses, were classified as TMB-Low through plasma-based analyses. These data suggest that a baseline TMB measurement in plasma may be more accurate than archival tissue, as was the case in this study, because it provides a real-time analysis and corrects for the sampling error that is inherent to tissue sequencing.
In addition to the baseline evaluation of cfDNA for response prediction, ctDNA monitoring represents an approach to obtain a real-time analysis of tumor response to immune checkpoint blockade. Assessment of the efficacy of response to immune checkpoint inhibition has proven challenging utilizing imaging-based methodologies, particularly in the context of pseudoprogression, whereby an initial increase in tumor volume is observed, potentially due to immune cell infiltration, followed by tumor shrinkage (44, 45). Therefore, cfDNA-based approaches for comprehensive genome-profiling may be useful for the rapid determination of patients who ultimately may benefit from immune checkpoint blockade. This hypothesis has been previously demonstrated for patients with melanoma treated with CTLA-4 blockade (46), as well as immune checkpoint blockade in NSCLC (34, 47–49). Our data further support this hypothesis and, given the concordance with circulating protein biomarker data, suggest that the residual MSI allele burden and TMB prognostic signature could be applied to other tumor types where standardized protein biomarkers do not exist and may be an earlier predictor of response than radiographic imaging.
While every effort has been made to minimize potential confounding variables in the analyses described, the current study is limited to a small population of patients with cancer, and prospective clinical trials will need to be conducted across a broader range of tumor types to confirm these findings in a pan-cancer setting. Furthermore, the sensitivity for accurate detection of MSI and TMB-High is highly dependent upon ctDNA levels and, as such, there will be a proportion of patients with low levels of ctDNA for whom these analyses will not be informative. TMB status and MSI status are highly correlated and, therefore, in the context of this study population, we cannot differentiate the predictive value of each for determination of response to immune checkpoint blockade. Finally, for serial monitoring applications, while protein biomarker data were collected at similar time intervals for patients exhibiting a clinical response or lack of clinical response, sample availability across standardized time points was limited for evaluation of plasma MSI allele levels and TMB load. Nevertheless, the methods described herein demonstrate the feasibility of a viable diagnostic approach for screening and monitoring of patients who exhibit MSI or TMB-High and may respond to immune checkpoint blockade.
Disclosure of Potential Conflicts of Interest
A. Georgiadis is an employee/paid consultant for Personal Genome Diagnostics. L.A. Keefer is an employee/paid consultant for Personal Genome Diagnostics. M.M. Zielonka is an employee/paid consultant for Personal Genome Diagnostics. J.R. White is an employee/paid consultant for Personal Genome Diagnostics. V.E. Velculescu is an employee/paid consultant for Takeda Pharmaceuticals and Ignyta and holds ownership interest (including patents) in Personal Genome Diagnostics. D.T. Le reports receiving commercial research grants from Merck and Bristol-Myers Squibb; holds ownership interest (including patents) in Hopkins (MSI Diagnostics); and is an unpaid consultant/advisory board member for Merck and Bristol-Myers Squibb. L.A. Diaz is an employee/paid consultant for Personal Genome Diagnostics and Neophore; holds ownership interest (including patents) in Personal Genome Diagnostics, Thrive Detect, Neophore, Johns Hopkins Patents, Amgen (via immediate family members), and Chromacode; and has immediate family members who are unpaid consultant/advisory board member for Merck. M. Sausen is an employee/paid consultant for Personal Genome Diagnostics and Bristol-Myers Squibb. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: A. Georgiadis, R. A. Anders, V.E. Velculescu, D.T. Le, L.A. Diaz, M. Sausen
Development of methodology: A. Georgiadis, J.R. White, D.R. Riley, R. A. Anders, L.A. Diaz, M. Sausen
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): B.R. Bartlett, D. Murphy, S. Lu, E.L. Verner, F. Ruan, D.R. Riley, R. A. Anders, D.T. Le, L.A. Diaz, M. Sausen
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): A. Georgiadis, L.A. Keefer, M.M. Zielonka, J.R. White, E.L. Verner, F. Ruan, D.R. Riley, E. Gedvilaite, S. Jones, V.E. Velculescu, D.T. Le, L.A. Diaz, M. Sausen
Writing, review, and/or revision of the manuscript: A. Georgiadis, B.R. Bartlett, J.R. White, S. Lu, E.L. Verner, S. Jones, V.E. Velculescu, D.T. Le, L.A. Diaz, M. Sausen
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J. N. Durham, B.R. Bartlett, D. Murphy, S. Lu, F. Ruan, D.R. Riley, S. Angiuoli, L.A. Diaz, M. Sausen
Study supervision: D.T. Le, L.A. Diaz, M. Sausen
Acknowledgments
We would like to acknowledge D. Tsui for her data analysis contributions and critical review of the manuscript. This research was supported in part by NIH grants 1R43CA217544, CA121113, CA180950, NCI Contract HHSN261201600005C, Stand Up to Cancer Colorectal Cancer Dream Team Translational Research Grant (grant number SU2C-AACR-DT22-17), Stand Up to Cancer–Dutch Cancer Society International Translational Cancer Research Dream Team Grant (grant number SU2C-AACR-DT1415), and the Commonwealth Foundation. Stand Up to Cancer is a program of the Entertainment Industry Foundation administered by the American Association for Cancer Research. B.R. Bartlett is supported by NIH grants 5P30GM114737 and P20GM103466.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.