Abstract
Genomic analyses of small-cell lung cancer (SCLC) are limited by the availability of tumor specimens. This study aimed to investigate the suitability of single-cell sequencing of circulating tumor cells (CTC) as a method of inferring the evolution and progression of SCLCs.
Between July 1, 2011, and July 28, 2014, 48 consecutively diagnosed patients with SCLC were recruited for this study. CTCs were captured from each patient with the CellSearch system. Somatic mutations and copy number alterations (CNA) were monitored by single-cell sequencing of CTCs during chemotherapy.
Single-cell sequencing of CTCs can provide a mutational atlas for SCLC. A 10-CNA score based on single CTCs was established as a classifier for outcomes of initial chemotherapy in patients with SCLC. The survival analyses demonstrated that patients with low CNA scores (<0) had significantly prolonged progression-free survival (PFS) and overall survival (OS) after first-line chemotherapy in comparison with those with high scores (≥0; PFS: 212 days vs. 110.5 days, P = 0.0042; and OS: 424 days vs. 223.5 days, P = 0.0006). The positive predictive value and negative predictive value of the CNA score for clinical subtype (refractory vs. sensitive) were 80.0% and 93.7%, respectively. By tracing allele-specific CNAs in CTCs isolated at different time points during chemotherapy, we showed that CNA heterogeneity might result from allelic losses of initially consistent CNAs.
Single CTC-based sequencing can be utilized to depict the genomic profiles and evolutionary history of SCLC, thus offering the potential for clinical stratification of patients with SCLC.
Small-cell lung cancer (SCLC) is one of the most aggressive cancers and often shows early metastases. Genomic analyses of the evolution and progression of SCLC are limited by the availability of tumor specimens. In our study, single-cell genomic amplification was applied to detect somatic mutations and/or copy number alterations (CNA) in single circulating tumor cells (CTC) from patients with SCLC. Based on single CTC sequencing, we extracted a subset of CNA regions and established a CNA score, which was used to predict the outcomes of initial chemotherapy for patients with SCLC. By tracing allele-specific CNAs in CTCs during chemotherapy, we found that CNA heterogeneity might be a result of allelic losses of initially consistent CNAs. Our results show that monitoring the genomic changes of single CTCs provides a noninvasive method to depict genomic features related to tumorigenesis, evolution during treatment and drug resistance, thus offering the potential for clinical stratification of patients with SCLC.
Introduction
Small-cell lung cancer (SCLC) is an exceptionally aggressive subtype of lung cancer that is most common in heavy smokers and represents 13% of all lung cancers (1). SCLC is mainly diagnosed by bronchoscopic biopsy based on histopathologic features and selected neuroendocrine markers. Currently, there is no clinical incentive for larger biopsies because of the lack of compelling treatment options except for chemotherapy. Moreover, the proximity of the lesions to large blood vessels causes complications for trans-thoracic biopsies. A few recent genomic studies identified ubiquitous genomic alterations involving TP53 and RB1; however, few oncogenic druggable genomic alterations have been identified (2–4). The paucity of tumor tissue available from patients undergoing treatment prevents a detailed investigation of the genomic evolution of SCLC.
The poor cohesiveness of malignant cells in SCLC suggests that molecular characterization through analysis of circulating tumor cells (CTC) may be more feasible in SCLC than in other cancer types. CTCs derived from patients with SCLC were able to maintain tumorigenic properties in immune-compromised mice, and the resultant CTC-derived explants provided a means of interrogating responses to different therapeutic interventions (5). Our previous study illustrated that copy number alteration (CNA) analyses of single CTCs showed the potential to distinguish SCLC from lung adenocarcinoma (6). A recent study of a cohort of 31 patients with SCLC generated a CNA-based classifier to distinguish chemosensitive from chemorefractory patients (7). However, genomic analyses of CTCs, especially single CTCs, for clinically relevant alterations remain largely unexplored.
Exploration of the mutational evolutionary process driving the transition from a primary tumor to a metastatic tumor could facilitate our understanding of the mechanisms underlying cancer metastases and the events promoting tumor progression. Paired primary and metastatic studies reveal late dissemination of primary tumor cells to seed metastases (8, 9). Analysis of the mutational profiles of CTCs, in addition to those of primary and metastatic tumors, could provide a full spectrum of the evolutionary process of SCLC.
Here we performed single-cell genomic analyses of individual CTCs isolated from 48 patients with SCLC using a single-cell sequencing approach with uniform genome coverage and a relatively low allele dropout rate (10). Tumor samples from some of these patients were obtained, and both single-nucleotide variants (SNV) and CNAs in individual CTCs and tumor samples were analyzed.
Materials and Methods
Patients and samples
Between July 1, 2011, and July 28, 2014, a total of 48 consecutively diagnosed patients with SCLC were recruited for this study. CTCs were collected from each patient, and 10 patients provided sufficient paired tumor tissue samples for further genomic analyses.
The treatment regimens used in this study utilized etoposide plus platinum (cisplatin, nedaplatin, or carboplatin), which are the standard chemotherapy regimens for patients with SCLC. The doses and schedules were as follows: etoposide was administered at a dose of 100 mg/m² on days 1, 2, and 3, whereas cisplatin was administered at a dose of 37.5 mg/m² on days 1 and 2, or carboplatin (AUC 4–5) or nedaplatin (75 mg/m²) was administered on day 1. Every regimen was repeated every 21 days. Radiographic imaging methods, including CT and MRI, were used for tumor response assessment, which was performed by both the investigator and an independent radiologist. Baseline tumor assessments were performed within 1 to 28 days prior to the initiation of chemotherapy, with subsequent assessments performed every 2 cycles until the development of objective disease progression. The objective response rate (ORR) was defined as the percentage of patients with confirmed complete response (CR) or partial response (PR) by RECIST version 1.1 (11). Progression-free survival (PFS) was defined as the time from the start of chemotherapy until disease progression (assessed by an investigator using RECIST version 1.1) or death from any cause. Patients who had not progressed at the time of statistical analysis or were lost to follow-up before progression or death were censored at the time of their last evaluation. Overall survival (OS) was defined as the time from the start of chemotherapy until death from any cause. Detailed information on patient diagnosis and treatment is available in Supplementary Table S1.
This study was conducted in accordance with the Declaration of Helsinki and approved by the institutional ethics committee at Peking University Cancer Hospital & Institute (No. 2015KT13) and the Committee on the Use of Human Subjects in Research at Harvard University (No. F22221-101). All participants provided written informed consent.
Isolation of CTCs and sequencing strategies
CTCs from 7.5 mL of blood from each patient were captured with the CellSearch® Epithelial Cell Kit (Veridex LLC) using magnetic beads conjugated to anti–epithelial cell adhesion molecule (EpCAM) antibodies. The CTC isolation and whole-genome amplification of single CTCs and leukocytes were performed as described previously (6, 12). After whole-genome amplification of the genome of a single CTC, qPCR was performed for 8 randomly selected loci to check the genomic integrity of the whole-genome amplification product. DNA samples with 7 of 8 loci amplified by qPCR with a reasonable Ct number were used for subsequent analyses. Ninety percent of CTCs passed the genomic integrity filter. Among the CTCs that failed to meet the criteria for inclusion, most did not show a reasonable Ct number at any of the 8 loci, which indicated failure to transfer CTCs to the PCR tube during the micro-pipetting step. Genomic DNA (gDNA) was extracted from blood samples using the Blood & Cell Culture DNA Mini Kit (Qiagen). DNA was extracted from the frozen tumor samples and formalin-fixed, paraffin-embedded (FFPE) tumor samples using the QIAamp DNA Micro Kit (Qiagen) and the QIAamp DNA FFPE Tissue Kit (Qiagen), respectively.
For each patient with tumor samples, whole-genome sequencing (WGS) and whole-exome sequencing (WES; refs. 6, 12, 13) were performed using more than three CTCs from each treatment stage, as well as gDNA and tumor samples, on an Illumina HiSeq X Ten system (read lengths of 2 × 150 bp) or an Illumina HiSeq 2500 system (read lengths of 2 × 100 bp). For patients for whom only CTCs collected before treatment were available, 1 CTC from each patient was subjected to WGS.
Bioinformatics analysis
First, the sequencing reads were aligned to reference genome hg19 using the Burrows-Wheeler Aligner (BWA; ref. 14). Next, the aligned reads were sorted and merged with Samtools 0.1.18 (15). INDEL realignment was performed with the Genome Analysis Toolkit (GATK 2.1-8; ref. 16), and mate pair fixing and duplicate removal were conducted with Picard-tools. Base quality was recalibrated by GATK using dbSNP135.
Based on WES data, GATK UnifiedGenotyper was used to detect all mutations in all samples (CTCs, tumor samples, single leukocyte, and gDNA) from each patient. Next, germline mutations were removed based on gDNA, whereas false-positive mutations were removed based on single leukocytes (6). The functional effects of variants were annotated with SNPEFF 3.0 (17). Variations that were present in dbSNP 135 and the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP), but not in COSMIC v61 (18), were removed.
Based on WGS data, the CNAs of each sample were identified as described previously (6). Briefly, the likely diploid regions were determined from the normalized (by total reads) coverage at a bin size of 500 kb using a hidden Markov model (HMM). The identified diploid regions were then used to provide a normalization factor for copy number determination. Significance analyses of gain and loss regions in CTCs from all patients with SCLC were performed following the GISTIC algorithm as described previously (6). To summarize, 1 CTC was collected from each patient prior to chemotherapy, and copy numbers in the 500-kb bins were determined. Similar copy numbers in adjacent bins were merged using a circular binary segmentation (CBS) algorithm implemented in the DNAcopy package (www.bioconductor.org/packages/DNAcopy). Next, P values for gains and losses in each region were calculated using the GISTIC algorithm. After FDR P value adjustment, a q value was assigned to each region. A significance level of 10⁻⁵ was determined according to the q values for gains and losses in normal leukocytes; no gain or loss regions were observed in the normal leukocytes based on this significance threshold.
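The normalization step described above can be illustrated with a minimal sketch. It assumes the presumed-diploid bins are already known (in the study they are inferred with an HMM, which is not reproduced here) and uses the median coverage of those bins as the 2-copy reference level; the function name, the use of the median, and the toy read counts are illustrative assumptions, not the published pipeline.

```python
from statistics import median

def copy_numbers(bin_reads, diploid_bins):
    """Estimate per-bin copy number for one cell from binned read counts.

    bin_reads: read counts per fixed-size bin (e.g. 500 kb).
    diploid_bins: indices of bins presumed diploid (supplied by the caller;
    the HMM-based detection used in the study is not implemented here).
    """
    total = sum(bin_reads)
    # Normalize by total reads so that cells sequenced to different depths are comparable.
    norm = [r / total for r in bin_reads]
    # Assumption: the median coverage of presumed-diploid bins corresponds to 2 copies.
    diploid_level = median(norm[i] for i in diploid_bins)
    return [2.0 * x / diploid_level for x in norm]

# Toy example: bins 3-4 look like a gain, bin 5 like a single-copy loss.
reads = [100, 102, 98, 150, 151, 50, 100]
cn = copy_numbers(reads, diploid_bins=[0, 1, 2, 6])
print([round(c, 2) for c in cn])
```

In practice the resulting per-bin values would then be segmented (for example with CBS) before calling gains and losses.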
The raw sequence data are deposited at the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov) under accession number PRJNA448888.
Survival analyses
To explore CNAs related to disease progression in patients with SCLC, leave-one-out cross validation was used to find regions correlated with remission duration time (time interval from initial chemotherapy discontinuation to relapse, which was recorded as “0 days” if 4 cycles of initial chemotherapy were not finished). Sufficient clinical data were collected for 41 patients for the leave-one-out cross validation. For each step, 40 patients were selected and divided into 2 groups: one group with progressive disease during or within 3 months after the initial chemotherapy and another group with disease control for 3 months or longer. The CNAs in each cytoband were binned. For CNAs in each cytoband locus, Fisher exact test was used to determine the significance of copy number gains or losses for distinguishing the 2 groups. The loci with gains or losses that were most significantly correlated with disease progression were candidate chromosome regions for the CNA score and subjected to a visual inspection to ensure the uniformity of their sequencing coverage. The remaining sample was used as the test set. We selected the regions that frequently appeared in the score model and could validate the remaining sample.
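As a rough illustration of this procedure, the sketch below runs a Fisher exact test per region inside a leave-one-out loop and counts how often each region is selected. The significance cutoff `alpha`, the binary "altered" flags, the region names, and the use of selection frequency as the stability criterion are assumptions made for the example; the text does not specify the exact thresholds used.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum over all tables that are as likely as, or less likely than, the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def select_regions(altered, refractory, alpha=0.05):
    """Regions whose alteration status differs between the two response groups."""
    selected = []
    for region, flags in altered.items():
        pairs = list(zip(flags, refractory))
        a = sum(1 for f, y in pairs if f and y)          # altered, refractory
        b = sum(1 for f, y in pairs if f and not y)      # altered, sensitive
        c = sum(1 for f, y in pairs if not f and y)      # unaltered, refractory
        d = sum(1 for f, y in pairs if not f and not y)  # unaltered, sensitive
        if fisher_exact_two_sided(a, b, c, d) < alpha:
            selected.append(region)
    return selected

def leave_one_out_counts(altered, refractory, alpha=0.05):
    """Count how many leave-one-out folds select each region."""
    n = len(refractory)
    counts = {region: 0 for region in altered}
    for held_out in range(n):
        keep = [i for i in range(n) if i != held_out]
        sub_alt = {r: [f[i] for i in keep] for r, f in altered.items()}
        sub_y = [refractory[i] for i in keep]
        for region in select_regions(sub_alt, sub_y, alpha):
            counts[region] += 1
    return counts

# Hypothetical data: 10 patients, two candidate regions.
refractory = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
altered = {
    "region_A": [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],  # tracks the outcome
    "region_B": [1, 0, 1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the outcome
}
counts = leave_one_out_counts(altered, refractory)
print(counts)
```

Regions selected in most folds would be the natural candidates for a score; the held-out patient in each fold can then be used to check whether the resulting score classifies that patient correctly.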
Single-cell allelic-specific CNA analysis
To obtain allele-specific copy numbers for each CTC from patient 3, we first identified germline heterozygous SNPs based on WES. Population SNPs in tumor samples, CTCs, and gDNA extracted from white blood cells were detected using the Genome Analysis Toolkit (GATK 2.1-8; ref. 16). Among them, SNPs with mutant allele frequencies between 30% and 70% in the gDNA were considered germline heterozygous SNPs. The major allele and minor allele were inferred from the 2 metastatic tumor samples; for each germline heterozygous SNP, we designated a genotype as the major allele if its allele frequency was higher than 80% in any of the metastatic tumors, whereas the other genotype was designated as the minor allele.
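The two filtering rules in this paragraph can be sketched as follows. The function names, the inclusive treatment of the 30% and 70% boundaries, and the decision to leave a SNP unassigned when the two tumors give conflicting evidence are assumptions of this example rather than details stated in the text.

```python
def germline_het_snps(gdna_af, low=0.30, high=0.70):
    """Keep SNPs whose mutant-allele frequency in gDNA lies within [low, high]."""
    return [snp for snp, af in gdna_af.items() if low <= af <= high]

def assign_major_allele(tumor_alt_afs, threshold=0.80):
    """Decide which genotype is the major allele at one heterozygous SNP.

    tumor_alt_afs: mutant-allele frequency of the SNP in each metastatic tumor.
    Returns "alt", "ref", or None when no allele exceeds the threshold
    (or, by assumption, when the tumors disagree).
    """
    alt_major = any(af > threshold for af in tumor_alt_afs)
    ref_major = any((1.0 - af) > threshold for af in tumor_alt_afs)
    if alt_major and not ref_major:
        return "alt"
    if ref_major and not alt_major:
        return "ref"
    return None

# Hypothetical allele frequencies.
gdna = {"snp1": 0.50, "snp2": 0.95, "snp3": 0.32}
het = germline_het_snps(gdna)
print(het)
print(assign_major_allele([0.90, 0.60]))
```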
To estimate the allele-specific CNA, we first determined the ratio of minor alleles in each bin. If less than 50% of germline SNPs in the bin were assigned to a major or a minor allele, we did not consider differences in allele frequencies and set the allele ratio to 0.5. Otherwise, we counted the sequence reads supporting designation as a major allele (Rmajor) or minor allele (Rminor) and calculated the allele ratio r as follows: r = Rminor/(Rminor + Rmajor). For a total copy number N in a bin, the allele-specific CNA for the minor allele was N × r, whereas that of the major allele was N × (1 − r).
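A minimal sketch of this calculation is given below. The argument names are illustrative, and the fallback to r = 0.5 when a bin has no informative reads is an added assumption to avoid division by zero.

```python
def allele_specific_cn(total_cn, r_major, r_minor, n_assigned, n_snps):
    """Split a bin's total copy number N into (major, minor) allele copies.

    r_major, r_minor: reads supporting the major and minor alleles in the bin.
    n_assigned: germline SNPs in the bin with a major/minor designation.
    n_snps: all germline heterozygous SNPs in the bin.
    """
    informative = n_snps > 0 and n_assigned / n_snps >= 0.5
    if not informative or (r_major + r_minor) == 0:
        r = 0.5  # too little allelic information: assume balanced alleles
    else:
        r = r_minor / (r_minor + r_major)
    return total_cn * (1 - r), total_cn * r

# A 3-copy bin with twice as many major-allele reads: about 2 major + 1 minor copies.
print(allele_specific_cn(3, r_major=200, r_minor=100, n_assigned=8, n_snps=10))
# A 2-copy bin with no minor-allele reads: copy-number-neutral loss of heterozygosity.
print(allele_specific_cn(2, r_major=50, r_minor=0, n_assigned=9, n_snps=10))
```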
Results
The majority of mutations in primary and metastatic tumors can be detected in CTCs
To assess the suitability of CTCs for evaluating the overall tumor mutational landscape, we performed WES to identify SNVs and small insertions/deletions (indels) in paired samples of CTCs and tumor tissues from 10 patients with SCLC. The DNA in individual CTCs was amplified by whole-genome amplification prior to exome sequencing (Fig. 1A; ref. 10). In each patient, SNVs/indels shared by more than 2 CTCs were analyzed.
An average of 126.5 (SD = 41.6) nonsynonymous SNVs/indels were identified in the coding exons of each tumor. Mutations in the TP53 and RB1 genes, the 2 most significantly mutated genes in SCLC, were observed in 90% and 50% of the patients, respectively (Supplementary Fig. S1). Most of the SNVs/indels (68%–99%) present in the tumors were detected by CTC sequencing (Fig. 1B; Supplementary Tables S2–S11). This observation indicated that single-cell sequencing of CTCs, regardless of the concern of heterogeneity, can provide a mutational atlas for SCLC and thus facilitate liquid biopsy-based applications such as tumor mutational burden assessment for PD-1 blockade treatment (19).
CNA profiling of SCLC
Chromosomal-level genomic alterations in cancer are not random events, as evidenced by the consistent CNAs observed in single nuclei from invasive ductal carcinoma of breast cancer and CTCs (6, 20). CTCs are ideal specimens for CNA profiling because they possess reproducible CNAs resembling those of metastatic tumors (6, 12). To illustrate the CTC-based CNA profiling in SCLCs, we performed WGS for isolated CTCs in 48 patients with SCLC (Table 1). The high concordance of the CNA profiles across individual CTCs from each patient was identified and confirmed in 10 patients with a total of 91 sequenced CTCs, yielding satisfactory correlation coefficients (median = 0.84, 0.60–0.92, all P < 0.0001; Supplementary Fig. S2).
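One way to quantify this kind of concordance is to correlate the binned copy-number profiles of every pair of CTCs from a patient. The sketch below uses the Pearson coefficient, which is an assumption (the text does not state which correlation measure was used), and the profiles are toy values.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def pairwise_concordance(profiles):
    """Correlation for every pair of CTC copy-number profiles (dict name -> list)."""
    return {
        (i, j): pearson(profiles[i], profiles[j])
        for i, j in combinations(sorted(profiles), 2)
    }

# Hypothetical per-bin copy numbers for three CTCs from one patient.
profiles = {
    "ctc1": [2, 2, 3, 1, 2],
    "ctc2": [2, 2, 3, 1, 2],
    "ctc3": [2, 2, 3, 1, 3],
}
r = pairwise_concordance(profiles)
print({pair: round(value, 2) for pair, value in r.items()})
```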
Characteristic | N (%) |
---|---|
Age (years, median, range) | 60 (32–78) |
Gender | |
Male | 36 (75%) |
Female | 12 (25%) |
Smoking status | |
Never | 19 (39.58%) |
Ever/current | 29 (60.42%) |
Disease stage | |
Limited | 8 (17%) |
Extensive | 40 (83%) |
Chemoresponse^a | |
Sensitive | 25 (60.98%) |
Refractory | 16 (39.02%) |
Response to initial chemotherapy^b | |
CR | 0 (0%) |
PR | 33 (71.74%) |
SD | 5 (10.87%) |
PD | 8 (17.39%) |
PFS (days, median, 95% CI)^a | 144 (136.54–203.07) |
OS (days, median, 95% CI)^a | 287 (261.48–367.16) |
CTC counts (median, range) | 167 (4–40,000) |
Abbreviations: PD, progressive disease; SD, stable disease.
^a For 41 patients with complete clinical records.
^b For 46 patients with evaluations of chemoresponse.
CNA profiling of a single representative CTC from each patient was utilized for the subsequent analysis. The segmented copy numbers in each CTC were visualized in a heatmap and the significance of chromosome alterations was determined by GISTIC analysis (Fig. 2; Supplementary Fig. S3). Copy number losses in 2 frequently inactivated genes in SCLC, TP53 and RB1, were identified in 64.6% and 81.3% of the patients, respectively (Supplementary Fig. S1). Generally, the single-CTC CNA profiling recapitulated both the recurrent focal events (such as the losses of TP53, FHIT, and RB1 and gains of MYCL1, PIK3CA, and SOX4) and arm-level alterations (such as losses in chromosomes 3p, 13q, and 17p, and gains in chromosomes 3q and 5p) reported in analyses of primary tumors (2–4). Unobscured by stromal cell admixture, our single-cell–based CNA profiling revealed more significant arm-level events. One of these events, the loss of chr4, was observed in 22 of 48 patients. Another loss region, chr5q, affected 18 of 48 patients.
A single-CTC-based 10-CNA classifier correlates with clinical outcomes
CNAs remained largely consistent among CTCs from each patient, which indicated that they might be utilized as predictive or prognostic markers for clinical outcomes. We analyzed the correlations between CTC-based CNA profiles and the efficacy and survival outcomes of first-line chemotherapy (etoposide plus platinum). A group of 10 CNA regions was identified using leave-one-out cross validation and found to be strongly correlated with remission duration time (time interval from chemotherapy discontinuation to relapse) in 41 patients completing first-line chemotherapy and with adequate clinical data (Fig. 3A). Based on these regions, we established a CNA score from CTCs obtained before treatment to predict outcomes. A score of +1 or −1 was assigned to each of these regions based on whether CNAs in these regions were associated with a shorter or a longer remission duration, respectively. Therefore, “+1” for alteration in the CNA region was associated with a shorter remission duration, and “−1” was associated with a longer remission duration. A total CNA score was assigned to each patient (Fig. 3A).
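The scoring rule can be sketched as a simple weighted sum. The region names and weights below are placeholders, not the 10 regions of the published classifier, and the high/low cutoff follows the ≥0 versus <0 convention used in the survival analyses.

```python
def cna_score(altered_regions, region_weights):
    """Sum the +1/-1 weights of the classifier regions that are altered in a CTC."""
    return sum(w for region, w in region_weights.items() if region in altered_regions)

def classify(score):
    """High score (>= 0) versus low score (< 0)."""
    return "high" if score >= 0 else "low"

# Hypothetical weights: +1 = alteration linked to shorter remission, -1 = longer.
weights = {"region_1": +1, "region_2": +1, "region_3": -1, "region_4": -1}

patient_a = {"region_1", "region_2", "region_3"}  # score +1
patient_b = {"region_3", "region_4"}              # score -2
for altered in (patient_a, patient_b):
    s = cna_score(altered, weights)
    print(s, classify(s))
```

Note that under this convention a CTC with none of the classifier regions altered receives a score of 0 and therefore falls in the high-score group.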
The CNA score was subsequently used as a classifier to determine whether it could reflect the clinical outcomes of first-line chemotherapy. Generally, a higher CNA score (≥0) stratified patients with worse clinical outcomes to the first-line chemotherapy in comparison with patients with a low CNA score (<0). A total of 41 patients with detailed clinical records were used for further survival analyses (high score, ≥0, n = 20; low score, <0, n = 21). The survival analyses demonstrated that patients with low CNA scores (<0) had a significantly prolonged PFS after first-line chemotherapy in comparison with those with high scores (≥0; 212 days vs. 110.5 days, P = 0.0042; Fig. 3B), which indicated that the CNA scores of patients with SCLC could be used as a predictive marker for PFS. Using the same 10-CNA classifier, a significant difference in OS was also observed between patients with a high score (≥0) and those with a low score (<0; 223.5 days vs. 424 days, P = 0.0006; Fig. 3C). A multivariate analysis using a Cox proportional hazards model, including gender (male vs. female), age (≥65 years vs. <65 years), disease stage (limited vs. extensive), and CNA score (high vs. low), demonstrated that a high CNA score was the only factor that was independently predictive of poor PFS [HR = 3.53; 95% confidence interval (CI), 1.665–7.484; P < 0.001] and OS (HR = 4.201; 95% CI, 1.829–9.649; P = 0.00072).
Clinically, patients with SCLC were classified as chemorefractory with progressive disease during or within 3 months after the initial treatment, or chemosensitive with disease control for 3 months or longer. Using the established CNA score as a criterion for predicting the clinical subtypes of patients, 20 of 25 chemorefractory patients were correctly identified as having a high score (≥0), and 15 of 16 chemosensitive patients with CNA scores below 0 were also correctly identified (P < 0.0001, 2-tailed Fisher exact test). The positive predictive value (PPV) of the CNA score for the clinical subtype of patients with SCLC was 80.0%, whereas the negative predictive value (NPV) of the CNA score was 93.7%.
The proportion of nonsmokers in our cohort is significantly higher than that in the western population. In our study, 19 of 48 patients with SCLC had never smoked tobacco, whereas only 3 of 159 patients with SCLC in a study of a western population had never smoked (4). The prevalence of nonsmokers in patients with SCLC in the Chinese population was reported to range from 22.8% to 53.8% in studies with sample sizes ranging from 78 to 303 patients with SCLC (21–23). Taken together with our results, these findings indicate that the proportion of nonsmokers among Chinese patients with SCLC is likely to be much higher than that in western populations. This discrepancy most likely reflects an etiologic difference between Chinese and western populations. To make sure that our 10-CNA classifier is not confounded by the higher proportion of nonsmokers in the Chinese population, we examined both mutational signatures and CNAs in groups of nonsmokers and smokers. The prevailing C-to-A transversions associated with heavy smokers were found at a similar frequency in both groups (24.9% of all mutations on average in nonsmokers and 24.6% of all mutations on average in smokers). A GISTIC analysis of the nonsmokers and smokers showed CNAs characteristic of patients with SCLC (e.g., losses in chromosomes 3p, 13q, and 17p, and gains in chromosomes 3q and 5p). However, one chromosome region, chromosome 18q, was frequently amplified in the nonsmokers but not in the smokers or patients with SCLC in western populations (Supplementary Fig. S4; refs. 2, 4). This chromosome region was not selected for establishing the CNA score based on the leave-one-out cross validation. The survival analysis showed no significant difference between the smokers and nonsmokers (P = 0.8 for PFS and P = 0.6 for OS; Supplementary Fig. S4). Our observations indicate that further studies of these 2 groups are merited.
Allelic losses drive CNA heterogeneity during disease progression
To trace the evolution of CNAs during disease progression, we analyzed CNAs in CTCs from the same patient (patient 3). Two needle-biopsy specimens of liver metastasis were collected before first-line chemotherapy and at the time of disease progression (PD) after first-line chemotherapy, respectively, and CTCs were collected at 4 time points during the treatment (before first-line chemotherapy, during first-line chemotherapy, before the start of second-line chemotherapy, and during third-line chemotherapy). CTCs from different stages showed consistent CNAs in chromosomal regions, such as a gain in chromosome 1p and losses in chromosomes 3p, 7q, and 13q (Fig. 4A). The correlation coefficients between the CNAs of any 2 CTCs ranged from 0.71 to 0.97 (median = 0.87), which demonstrated that CNAs were largely reproducible. However, we did observe heterogeneous CNAs in certain chromosomal regions (chromosomes 3q, 4q, 6p, and 10p).
To explore the causes of heterogeneous CNAs, we analyzed allele-specific copy numbers in each CTC. We identified major and minor alleles based on exome sequencing of tumor specimens (Fig. 4B). In chromosome 3q, the CNAs exhibited 2 states among different CTCs: copy number gain and copy number neutral. This finding could be explained as a heterogeneous gain in this region. However, the allele-specific copy number analyses showed 3 different states: gain in 13 CTCs, copy number neutral with both alleles in 7 CTCs, and copy number neutral loss of heterozygosity (CNNLOH) in 1 CTC (Fig. 4B; Supplementary Fig. S5). The CNNLOH could arise from the loss of the minor allele in chromosome 3q followed by the gain of the major allele or, conversely, arise from the subsequent loss of the minor allele in the CTC following the initial gain of the major allele. The latter is intuitively plausible, because a gain in the same major allele was also observed in 13 CTCs. Because both major and minor alleles can be lost, we believe that the copy number neutral state with both alleles could be due to the loss of a major allele. This hypothesis is further supported by the observation of a gain in chromosome 6p25.3-6p21.2 [6p(1)]. Starting from the major population of CTCs that exhibited a gain in the major allele, but retained 1 copy of the minor allele, 2 additional populations of cells evolved; 1 population lost the minor allele, leading to copy number gain loss of heterozygosity (CNGLOH), whereas the other population lost 1 copy of the major allele, leading to reduced copy number gain. Using 3 CTCs as examples, we illustrated the evolution of heterogeneous chromosomal alterations in chromosome 3q and chromosome 6p(1) (Fig. 4B).
Heterogeneous losses in 3 other chromosome regions, chromosome 4q, chromosome 6p21.2-6p11.2 [6p(2)], and chromosome 10p, were also observed among CTCs from patient 3. We traced the evolution of all 5 chromosome regions in individual CTCs throughout different courses of chemotherapy. Originating from a presumed founding clone (clone 1) with gains in chr3q and chr6p(1), continuous losses of one or both alleles could lead to the evolution of genomic heterogeneity among diverse clones that evolved during treatment (Fig. 4C).
CTCs mainly disseminate from primary tumors and continue to progress at metastatic sites
To infer the evolution of mutational clonality among primary tumors, bulk CTCs, and metastatic tumors, the exome sequencing libraries of 2 patients with more than 10,000 CTCs were prepared directly from extracted CTC DNA without whole-genome amplification. In patient 1, the CTCs were isolated before initial treatment at the same time that primary and liver metastatic tumor samples were obtained. Approximately 82% of mutations were shared among primary tumor, CTCs, and metastatic tumor, including mutations in known driver genes (RB1, TP53, NSD1, and ERG; Fig. 5A; Supplementary Table S2). Only 3 mutations were specific to primary tumors, and 3 mutations were specific to metastatic tumors. In patient 2, primary and metastatic tumor samples were collected at the same time; however, bulk CTCs were collected 1 year later. Primary tumor and CTCs shared 79% of mutations, whereas 61% of the mutations were shared by CTCs and metastatic tumor. Although mutations in known driver genes (RB1, TP53, MLL3, and JAK2) were shared among all specimens, 42 mutations were specific to metastatic tumors (Fig. 5A; Supplementary Table S3).
Next, we analyzed the mutant allele frequency and clonality of the selected patients (Fig. 5A; Supplementary Fig. S6A). Mutations in patient 1 formed a major population at an allele frequency close to 50%, a minor population at an allele frequency close to 100%, and a few specimen-specific populations at low allele frequencies. To infer tumor clonality, PyClone, a Bayesian clustering method, was used, which normalized variant allele frequencies (VAF) with segmented copy numbers (24). Among the mutations, 74% were prevalent in most of the tumor cells and formed a main clone (clone 1) that was maintained as it evolved from the primary tumor to the circulatory system and the metastatic tumor (Supplementary Fig. S6A). The other 3 inferred clones (clones 2–4) formed rare clones specific to primary, CTC, or metastasis specimens.
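To make the copy-number normalization of VAFs concrete, the sketch below computes a simple point estimate of the fraction of tumor cells carrying a mutation. This is a widely used back-of-the-envelope formula and is not PyClone's Bayesian model; the purity, multiplicity, and copy-number values are illustrative assumptions.

```python
def cancer_cell_fraction(vaf, total_cn, purity=1.0, multiplicity=1, normal_cn=2):
    """Point estimate of the fraction of tumor cells carrying a mutation.

    vaf: observed variant allele frequency.
    total_cn: tumor copy number at the locus.
    purity: tumor cell fraction of the sample (1.0 for isolated tumor cells).
    multiplicity: number of mutated copies per mutated cell.
    """
    # Expected VAF = purity * multiplicity * CCF / (purity * total_cn + (1 - purity) * normal_cn)
    ccf = vaf * (purity * total_cn + (1 - purity) * normal_cn) / (purity * multiplicity)
    return min(ccf, 1.0)  # cap estimates that exceed 1 because of sampling noise

# A heterozygous clonal mutation in a diploid region (VAF near 50%).
print(cancer_cell_fraction(0.5, total_cn=2))
# A clonal mutation in a region with loss of heterozygosity (VAF near 100%).
print(cancer_cell_fraction(1.0, total_cn=2, multiplicity=2))
# A subclonal mutation present in about half of the tumor cells.
print(cancer_cell_fraction(0.25, total_cn=2))
```

Under these assumptions, mutation populations near 50% and near 100% allele frequency can both correspond to mutations present in essentially all tumor cells, which is why clustering is performed after normalizing for copy number.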
In patient 2, except for the populations of mutations around the allele frequencies of 50% and 100%, a major population of mutations was observed at low allele frequencies. Clonality analysis revealed that, similar to patient 1, a major clone of mutations was prevalent in most of the tumor cells among all 3 types of specimens (Fig. 5A; Supplementary Fig. S6A). Most of the mutations in the second clone were specific to metastatic tumor and were not detected in the primary tumor specimens collected at the same time point or bulk CTCs captured after 1 year.
In patient 3, the majority (83%) of mutations were present in most of the tumor cells in these 2 liver metastatic specimens, whereas the others formed specimen-specific clones (Fig. 5B; Supplementary Fig. S6B; Supplementary Table S4). Most of the posttreatment tumor-specific mutations (clone 4 in Supplementary Fig. S6B) were not detected in the CTCs.
We did not observe a continuously evolving clone whose prevalence was shaped by selection pressure. This observation is consistent with a previous report showing much lower subclonal diversity in SCLC in comparison with lung adenocarcinoma (4). Although CTCs mainly disseminated from the primary tumor and shared a majority of the mutations with the primary tumor, tumors in the metastatic site were observed to progress continuously to form minor clones unobserved in CTCs or primary tumors.
DNA repair/replication mutations enriched in CTCs and metastasis
To elucidate mutational processes associated with metastatic potential or disease progression, we analyzed mutation enrichment in metastatic tumors, CTCs, and relapsed tumors (Fig. 5C). In patient 2, mutations in the BRCA1 gene were detected in metastatic tumors and late biopsies of CTCs. In patient 3, 2 mutations in genes involved in DNA repair and replication were observed. One of these genes, PRKDC, encodes the catalytic subunit of DNA-dependent protein kinase (DNA-PK), which plays an important role in nonhomologous end joining to repair double-strand breaks in DNA (25). In the other gene, TOP2A, a nonsense mutation (R673*) generated a stop codon at arginine 673, truncating the protein at the TOPRIM (topoisomerase/PRIMase) domain, which led to the absence of binding sites (M762, S800, R487, M766, and D463) for the TOP poison etoposide (26). Among the other 8 patients whose tumors were also sequenced, a BRCA1 mutation was detected in CTCs from patient 4, whereas BRCA2 mutations were detected in CTCs from patient 7 and in both primary tumors and CTCs from patient 8. This analysis revealed that mutations in the DNA repair/replication genes TOP2A, BRCA1, and BRCA2 were present in the CTCs or relapse/metastatic tumors of 5 of the 10 selected patients.
Discussion
Our study explored the clinical utility of genomic analyses of CTCs. We demonstrated that the genomic alterations of a single CTC can be representative of the somatic mutational profiles and CNAs of the primary tumors in patients with SCLC. A CNA score established based on a single CTC was a promising prior classifier for clinical outcomes after chemotherapy in patients with SCLC. Tracing the evolution of CNAs in single CTCs indicated that CNA heterogeneity might result from allelic losses, revealing that cancer cells may be incapable of maintaining their metastable CNAs, which are initially consistent among cells.
CTCs represent viable cells with metastatic potential in the circulatory system. Enumeration of CTCs based on immunostaining has been deployed in the clinical setting. However, the clinical utility of genomic analysis of CTCs has not been fully demonstrated, partly because of the low yield of DNA from these cells, which presents challenges for library preparation and bioinformatics analysis in next-generation sequencing. A PCR-based assay for the EGFR T790M mutation showed 57% concordance between CTCs and biopsy in non–small cell lung cancer, which argued for the combined use of CTCs and cfDNA to provide a more complete assessment of mutations in each patient (27). With whole-genome amplification, we were able to analyze both the mutational and CNA profiles of CTCs from patients with SCLC. Exome sequencing of multiple CTCs can recapitulate the somatic mutational profiles of the primary tumors in patients with SCLC. Paired analyses of primary tumors, CTCs, and metastatic tumors showed stable mutational profiles in patients with SCLC, supporting the use of CTCs as a liquid biopsy approach in cancer biology exploration and suggesting that such an approach could have potential clinical applications.
CTC-based CNA analysis is appealing because it is not affected by normal cell contamination. CNAs occur in recurrent chromosomal regions among different patients and have significant potential as predictive and prognostic markers. Currently, biomarkers for predicting responses to initial chemotherapy in patients with SCLC remain largely unknown. This study established a single-CTC–based CNA score to predict the response to first-line chemotherapy, which provides insight into biomarker development and a convenient approach for clinical disease differentiation. Our preliminary results warrant a large cohort study to translate this knowledge into practice and thereby improve patient outcomes.
The evolutionary history of genomic alterations sheds light upon the initiation, maintenance, and progression of tumors. Single-cell analyses showed high consistency between large-scale CNAs among cancers, implicating them as early events during tumorigenesis (6, 20). Single-cell breakpoint analyses tracing the formation of CNAs in individual cells, as well as allelic imbalance analyses pinpointing the contributions of different parental alleles to CNAs at multiple intratumoral regions, revealed CNA convergence (12, 28). On the other hand, pronounced heterogeneity in CNAs was observed in patients with lung cancer (29). During longitudinal monitoring of CTCs in 1 patient, we identified a few chromosomal regions with heterogeneous large-scale CNAs that occurred and evolved during treatment. Although those heterogeneous CNAs may have originated from newly emerging subclones, our single-cell allele-specific copy number analyses suggested that the inability of CTCs to maintain either or both alleles during tumor progression led to heterogeneity in initially consistent CNA regions. We hypothesize a 2-step CNA evolution model: (i) early tumorigenesis events during tumor initiation produce consistent CNAs, a step that might be cancer-type–specific because similar CNA patterns were observed in CTCs from different patients with certain cancer types; and (ii) CNA heterogeneity could arise because of chromosome exacerbation of initial CNAs, which are known to be metastable at the single cell level, during tumor maintenance and progression. To address genomic heterogeneity, it is desirable to obtain the genomic profiles of as many subclones as possible (30). Sequencing of multiple intratumor regions or tumors from multiple organ sites could reveal genomic heterogeneity at the scale of a mixed population of cancer cells.
However, multiregion and multisite tumor sequencing could miss genomic alterations from tumor cells in concealed sites and cannot provide information regarding dynamic changes over time. Liquid biopsy methods utilizing CTCs could be used to decipher heterogeneity at the single-cell level and provide information about the overall heterogeneity for a patient, and such methods have the potential to examine such heterogeneity in “real time.”
The increased frequency of BRCA1/2 mutations in metastatic tumors or CTCs collected at late time points after the initiation of treatment was surprising. Mutations in DNA repair/replication genes were not frequent events in primary SCLC tumors. In a large cohort study of 110 SCLC tumors, mutations in TOP2A, BRCA1, and BRCA2 were reported in only 5 of 105 treatment-naïve tumors (4). In comparison, we observed mutations in these genes in CTCs or relapse/metastatic tumors in 5 of 10 patients in our study. Significant enrichment (P = 0.0003) of DNA repair/replication-associated mutations was observed in our combined sequencing of CTCs and metastatic tumors. Tumors with defects in DNA repair genes have been shown to have increased sensitivity to platinum-based chemotherapy drugs, such as cisplatin (31). Unlike secondary mutations that restore the wild-type reading frame in BRCA1/2, which could lead to platinum-based chemoresistance (32, 33), our observation of newly emerged BRCA1/2 mutations in metastatic tumors or CTCs at late time points suggests a rationale for follow-up treatments with additional cycles of platinum-based chemotherapy or inhibitors of the DNA repair protein PARP.
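The enrichment comparison above (5 of 10 patients in this study vs. 5 of 105 treatment-naïve tumors) can be illustrated with a short calculation. The statistical test behind the reported P value is not named in this section, so the sketch below assumes a one-sided Fisher exact test on a 2 × 2 table built from those counts; the function name and table layout are illustrative choices and not the authors' analysis code.

```python
from math import comb  # requires Python 3.8+


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X follows the hypergeometric distribution
    obtained by fixing the row and column totals of the table.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of mutated cases
    n = a + b + c + d     # all cases
    upper = min(row1, col1)
    favorable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return favorable / comb(n, row1)


# Assumed table (counts taken from the text):
#                        mutated   not mutated
# this study (n = 10)       5           5
# reference (n = 105)       5         100
p_value = fisher_exact_greater(5, 5, 5, 100)
print(f"one-sided Fisher exact P = {p_value:.2g}")
```

Under these assumptions the P value is on the order of 10^-4, which is in line with the reported enrichment; a different test or table construction would give a somewhat different number.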
Disclosure of Potential Conflicts of Interest
X.S. Xie holds ownership interest (including patents) in Yikon Genomics. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: Z. Wang, X. Ni, J. Duan, Y. Gao, F. Bai, J. Wang
Development of methodology: Z. Su, Z. Wang, X. Ni, Y. Gao, J. Wang
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Z. Wang, X. Ni, M. Zhuo, J. Zhao, H. Bai, H. Chen, S. Wang, X. Chen, T. An, Y. Wang, Y. Tian, J. Yu, D. Wang, J. Wang
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Z. Su, Z. Wang, X. Ni, J. Duan, R. Li, Q. Ma, S. Wang, J. Yu, D. Wang, J. Wang
Writing, review, and/or revision of the manuscript: Z. Su, Z. Wang, X. Ni, J. Duan, Y. Gao, S. Wang, T. An, D. Wang, F. Bai, J. Wang
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Z. Su, Z. Wang, M. Zhuo
Study supervision: J. Duan, X.S. Xie, F. Bai, J. Wang
Acknowledgments
We thank all patients for their participation in this study. We thank Y. Lv (Beijing Cancer Hospital) for useful discussion, D. Cao (Beijing Cancer Hospital) for help with the histologic diagnosis, and Y. Zhang and W. Ma (BIOPIC) for assistance with sequencing. This research was supported by the National Key Research and Development Program (2016YFC0900102); the National High Technology Research and Development Program of China (863 Program, 2015AA020403); the Beijing Natural Science Foundation (7172045); the Beijing Municipal Science & Technology Commission (Z141100000214013); the National Natural Sciences Foundation Key Program (81630071, 81330062); the Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (CIFMS 2016-I2M-3-008, 2017-I2M-1-005); Aiyou Foundation (KY201701); the Ministry of Education Innovation Team Development Project (IRT-17R10); CAMS Key lab of translational research on lung cancer (2018PT31035); the China National Natural Sciences Foundation (81871889); the Beijing Novel Program Grants for cross-cooperation (Z181100006218130); and the Non-Profit Central Research Institute Fund of CAMS (2018RC320009).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.