Natural History of Germline BRCA1 Mutated and BRCA Wild-type Triple-negative Breast Cancer

Abstract We report a deep next-generation sequencing analysis of 13 sequentially obtained tumor samples, eight sequentially obtained circulating tumor DNA (ctDNA) samples and three germline DNA samples over the life history of 3 patients with triple-negative breast cancer (TNBC), 2 of whom had germline pathogenic BRCA1 mutation, to unravel tumor evolution. Tumor tissue from all timepoints and germline DNA was subjected to whole-exome sequencing (WES), custom amplicon deep sequencing (30,000X) of a WES-derived somatic mutation panel, and SNP arrays for copy-number variation (CNV), while whole transcriptome sequencing (RNA-seq) was performed only on somatic tumor. There was enrichment of homologous recombination deficiency signature in all tumors and widespread CNV, which remained largely stable over time. Somatic tumor mutation numbers varied between patients and within each patient (range: 70–216, one outlier). There was minimal mutational overlap between patients with TP53 being the sole commonly mutated gene, but there was substantial overlap in sequential samples in each patient. Each patient's tumor contained a founding (“stem”) clone at diagnosis, which persisted over time, from which all other clones (“subclone”) were derived (“branching evolution”), which contained mutations in well-characterized cancer-related genes like PDGFRB, ARID2, TP53 (Patient_02), TP53, BRAF, BRIP1, CSF3R (Patient_04), and TP53, APC, EZH2 (Patient_07). Including stem and subclones, tumors from all patients were polyclonal at diagnosis and during disease progression. ctDNA recapitulated most tissue-derived stem clonal and subclonal mutations while detecting some additional subclonal mutations. RNA-seq revealed a stable basal-like pattern, with most highly expressed variants belonging to stem clone. 
Significance: In germline BRCA1 mutated and BRCA wild-type patients, TNBC shows a branching evolutionary pattern of mutations with a single founding clone, is polyclonal throughout its disease course, and has widespread copy-number aberrations. This evolutionary pattern may be associated with treatment resistance or sensitivity and could be therapeutically exploited.


Introduction
Triple-negative breast cancer (TNBC) is an aggressive disease characterized by the lack of expression of estrogen receptor (ER), progesterone receptor (PR), and HER2/neu receptor (HER2). Endocrine and HER2 targeted therapies are ineffective in these patients. Thus, surgery, radiotherapy, and chemotherapy remain the standard treatment for this disease. Many patients with TNBC experience locoregional and/or distant relapse after primary treatment, with short post-relapse survival and acquisition of resistance to multiple drugs.
including TNBC (4-7). These and other studies have classified TNBC into subtypes with distinct genomic, transcriptomic and copy-number profiles, and clinical outcomes (8-13). These studies have shown that basal-like TNBC are characterized by high levels of genomic instability leading to widespread, genome-wide copy-number aberrations, and marked interpatient heterogeneity in mutational profiles. These studies using predominantly treatment-naïve TNBC samples have provided important insights about its landscape and molecular drivers, but are inadequate for explaining the subsequent disease course.
Notably, the drivers of cross-sectional (across patients) and sequential (in a patient over time) heterogeneity may be different, with the latter arising under the pressure of drugs used sequentially (14). Some landmark studies have attempted to resolve the clonality of TNBC at a snapshot in time or multiregional snapshots in time (14-17), while others have characterized its evolution under therapeutic pressure (18), or in patient-derived xenograft models (19-22). Some studies have evaluated the evolutionary pattern of TNBC by comparing primary and metastatic tumors obtained from different organs at autopsy (23,24) and identified considerable similarity in genomic aberrations between them within individual patients.
The sequential evolution of a tumor through its life history, from initial diagnosis in a non-metastatic stage to the development of metastatic disease and subsequent repeated disease progressions in a single patient, is less well reported. This could provide vital insights into adaptation mechanisms and escape from different treatments. It could also help elucidate whether the founding clone(s) ("trunk") persists through the life history of a tumor, with evolution mainly comprising "branches" from the "trunk," or whether there is clonal extinction and the emergence of new clones.
We performed a prospective sequential sampling of somatic tissue and circulating tumor DNA (ctDNA) from 3 patients with TNBC over their disease course to evaluate clonal evolution under the selection pressure of chemotherapy drugs.

Study Design, Patients, and Samples
The study was approved by the Institutional Ethics Committee of Tata Memorial Centre, Mumbai, India as IEC study protocol 151 and registered in the Clinical Trials Registry-India (CTRI/2016/11/007430). This study was conducted in accordance with the Declaration of Helsinki. Patients were included in the study after obtaining written informed consent, including consent for publication. This was an ambispective (prospective and retrospective) study with respect to sample and data collection. Patients were eligible if they had histopathologically proven breast cancer, which was negative for ER, PR, and HER2 by IHC or FISH, if required. The included patients were those who had their first relapse (local or distant) after prior curative treatment with surgery, (neo)adjuvant chemotherapy, with or without radiotherapy. Tumor specimens [fresh-frozen or formalin-fixed and paraffin-embedded (FFPE) tissue] from initial diagnosis and surgery were required to be available in the hospital tumor tissue repository.
After recruitment in the study, patients underwent blood sampling and tumor tissue sampling from the most accessible site of relapse. Briefly, multiple cores of fresh tumor biopsy were stored in three to four tubes of an RNA preservative (RNAlater) at 2°C-8°C for 12-16 hours, followed by −80°C until further analysis. The stored samples were subjected to whole-exome sequencing (WES), whole transcriptome sequencing (RNA-seq), high-depth targeted resequencing, and SNP array for copy-number variation (CNV), as described below and in the Supplementary Materials and Methods. Routine histopathologic evaluation, including IHC analysis for ER, PR, HER2, and tumor content, was performed on a few cores of the fresh biopsy.
Blood (12-15 mL in ethylenediaminetetraacetic acid (EDTA) tube, except in 1 patient at one timepoint in whom only 3.6 mL was available) was separated into buffy coat and plasma using cold centrifugation at 820 × g for 10 minutes at 4°C followed by 20,000 × g for 10 minutes at 4°C, which were then stored at −80°C until further analysis. Germline DNA from buffy coat was subjected to WES, high-depth targeted resequencing, SNP array for CNV, and custom amplicon next-generation sequencing (NGS) assay for hereditary predisposition genes, as described below and in Supplementary Materials and Methods. Plasma was used for ctDNA analysis as described below and in the Supplementary Materials and Methods.
The patients were treated with chemotherapy per standard practice and followed up with clinical and radiological evaluation. At the time of documented progression, fresh tumor biopsy and blood samples were again obtained and stored as described above. These steps were repeated at the time of each documented disease progression after subsequent lines of treatment until the last disease progression before death. FFPE tissues obtained at the time of initial diagnosis and surgery (after neoadjuvant chemotherapy) were used to extract DNA and subjected to WES and high-depth targeted sequencing as described below and in the Supplementary Materials and Methods.

DNA-based Assays
DNA from fresh-frozen and buffy coat samples was extracted using the Qiagen DNA mini kit (catalog no./ID: 51306) and from FFPE samples using Qiagen DNA FFPE kit (catalog no./ID: 56404), as per manufacturer's protocols. Agilent V4+UTR (71M) exome capture was used to perform WES at an average depth of 200X (tumor samples) and 100X (buffy coat DNA). Somatic mutations from WES analysis of each patient's tumor tissue (all samples) were pooled to design a custom targeted amplicon panel assay, which was performed on DNA from all tumor samples, corresponding WBC, and ctDNA from plasma. The OncoScan assay (Affymetrix, Thermo Fisher Scientific, catalog no./ID: 902293), a high-density microarray platform, was used for SNP profiling on DNA extracted from all fresh-frozen and buffy coat samples, as per the manufacturer's protocol. Insufficient DNA was available from FFPE samples for this assay and WES data were used to infer copy numbers in these samples using VarScan2 (RRID:SCR_006849) and Sequenza (RRID:SCR_016662) tools (details in Supplementary Materials and Methods). A total of 6 mL of plasma from each blood sample was used to extract ctDNA using the Circulating Nucleic Acid kit (catalog no./ID: 55114) from Qiagen, according to the manufacturer's instructions. Extracted ctDNA from each plasma sample was eluted into 55 μL of AVE buffer and stored at −20°C. The concentration of extracted ctDNA was determined using Qubit dsDNA HS (High Sensitivity) Assay Kit (Invitrogen), and its size distribution was assessed using a fragment analyzer. In addition, a custom amplicon targeted NGS assay for germline variants in genes with an established role in hereditary cancers was performed on buffy coat DNA, as reported earlier (25). WES capture and library preparation, targeted custom amplicon deep sequencing assay for pooled somatic variants, panel germline variant NGS assay design, and targeted sequencing and Sanger validation of germline variants are described in Supplementary Materials and Methods.

RNA-based Assays
RNA-seq could only be performed on prospectively collected tumor samples because insufficient tumor was available for this assay in FFPE blocks. RNA was extracted from fresh-frozen samples using the Qiagen RNeasy mini kit (catalog no./ID: 74004) per the manufacturer's instructions. RNA concentration was measured using Qubit Fluorometer, and the 28S/18S ratio was determined using Agilent 2100 bioanalyzer. RNA integrity number was checked using the Agilent Bioanalyzer and was >7 for all fresh-frozen samples. NGS libraries for RNA-seq were prepared using the manufacturer's instructions (Illumina TruSeq mRNA Sample Preparation Kit) and are described in detail in Supplementary Materials and Methods.

NGS and its Analysis
High-throughput sequencing was independently performed for each captured library to ensure that each sample met the desired coverage. Sequences were generated as 150 bp paired-end reads for WES (200X depth), 100 bp paired-end reads for targeted custom amplicon deep sequencing assay for pooled somatic variants (30,000X), and 150 bp paired-end reads for whole transcriptome libraries (∼60 million reads per sample). Raw image files were processed for base calling using the Illumina base-calling Software 1.7 (RRID:SCR_014332) with default parameters. Bioinformatics analysis for CNV analysis using SNP profiling, copy-number inference from WES, somatic variant calling from WES and RNA-seq data, germline variant calling from white blood cell (WBC) DNA, and deep sequencing validation at 30,000X from targeted custom amplicon assay, are described in detail in Supplementary Materials and Methods. We categorized the variants chosen to be validated by ultra-high depth sequencing data (n = 1006) into four tiers as described in Supplementary Materials and Methods.

Clonality and Phylogenetic Analysis
We used the PyClone (RRID:SCR_016873; ref. 26) software for clonality analysis. A clone was defined as a cluster of mutations with the same allelic frequency at each timepoint and whose allelic frequencies tracked together over time. Any clone that persisted throughout the disease course was termed the "stem clone" or "founder clone." Additional mutations gained by the founder clone resulted in "subclones." Clonal maps of each sample were constructed, and clonal evolution was inferred from sequential samples in each patient. Briefly, allele frequency obtained from deep sequencing of sequential samples in each patient was integrated with allele-specific CNV information and tumor content (inferred from sequencing data), and inputted into the PyClone software. We ran the PyClone algorithm for 10,000 iterations using the beta-binomial model, with other options kept at default settings. PyClone estimates each mutation's cellular prevalence (CP) and clusters mutations into various groups ("clones"). Clusters with more than one mutation were considered further. CALDER (27) was used to infer an evolutionary phylogenetic tree for each patient, and TimeScape (28) was used for visualization.
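The relationship between allele frequency, tumor content, and copy number that underlies a cellular prevalence estimate can be illustrated with a simple point estimate. The sketch below is not PyClone's algorithm, which fits a statistical model to read counts; it is the commonly used deterministic approximation, and the function name, the `multiplicity` parameter (number of mutated copies per cancer cell), and the example values are assumptions made for illustration only.

```python
def cellular_prevalence(vaf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Point estimate of the fraction of cancer cells carrying a mutation.

    Illustrative approximation only (not PyClone's beta-binomial model).
    vaf          : variant allele frequency observed in the sample (0-1)
    purity       : tumor content of the sample (0-1]
    tumor_cn     : total copy number of the locus in cancer cells
    multiplicity : assumed number of mutated copies per mutated cancer cell
    normal_cn    : copy number of the locus in normal cells
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")
    # Average number of copies of the locus per cell in the mixed sample.
    avg_copies = purity * tumor_cn + (1 - purity) * normal_cn
    estimate = vaf * avg_copies / (purity * multiplicity)
    # Sampling noise can push the estimate above 1; cap it at 1.
    return min(estimate, 1.0)


# Hypothetical example: VAF 0.20 in a diploid region of a sample with 80% tumor content.
example = cellular_prevalence(vaf=0.20, purity=0.80, tumor_cn=2)
print(round(example, 3))
```

In this hypothetical case the mutation would be estimated to be present in about half of the cancer cells, which is the kind of value that is then tracked across timepoints when mutations are grouped into clones.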

Data Availability Statement
All raw data are deposited in ArrayExpress with the following IDs: E-MTAB-11379 (Exome sequencing), E-MTAB-11375 (Transcriptome sequencing), E-MTAB-11375 (SNP array data), and E-MTAB-11376 (ultra-deep sequencing). Other data generated in this study are available within the article and its Supplementary Data files.

Clinical Characteristics and Sample Accrual
Three patients with TNBC, ages 29 years, 34 years, and 28 years, were included in this study between December 23, 2015, and March 17, 2016. The important clinical characteristics of the patients are depicted in Table 1. All 3 patients received anthracycline-based neoadjuvant chemotherapy followed by surgery, and all patients also received paclitaxel, 1 in neoadjuvant setting and 2 after surgery.
All patients had residual tumors after neoadjuvant chemotherapy and relapsed at 1 month, 11 months, and 16 months, respectively, from the date of surgery.
All patients received carboplatin during their disease course, 1 during neoadjuvant treatment and 2 for metastatic disease, and all had two episodes of disease progression after the first relapse. The 3 patients died at 9 months, 6 months, and 7 months after experiencing the first relapse. The disease events, collection timepoints, nature of the sample, intermediate chemotherapy drugs, and the genomic assays applied on each sample in the 3 patients are listed in Table 1.
The sites of prospective tumor biopsies and their histopathologic details, including nodal involvement and tumor grade, are detailed in Supplementary Table S1. In total, we were able to accrue 13 sequential tissue samples (five FFPE, eight fresh-frozen), eight sequential ctDNA samples at each relapse/progression, and three germline WBC samples at first relapse, from 3 patients. DNA from eight fresh-frozen tissues and three germline samples was subjected to WES, SNP array, and deep sequencing assays. The eight ctDNA samples were subjected to deep sequencing assays. RNA from eight fresh-frozen tissues was also subjected to RNA-seq assays. The three germline DNA samples were additionally subjected to a 35-gene NGS germline assay and Sanger sequencing for BRCA mutations.

Tumor Subtyping
IHC on tumor samples obtained at each timepoint confirmed all samples in the 3 patients to be ER, PR, and HER2 negative. Analysis of genome-wide RNA-seq data by AIMS (Absolute Assignment of breast cancer intrinsic Molecular Subtype; ref. 16) on fresh tumor samples from sites of relapse or progression confirmed that all of them belonged to the basal-like category.

Germline Mutations in Patients
WES on the germline DNA from Patient_02 (BRCA1 c.5035delC, heterozygous deletion) and Patient_07 (BRCA1 c.4676-1 G>C, heterozygous splice site) suggested the presence of pathogenic mutations in the BRCA1 gene. The third patient had wild-type BRCA1 and BRCA2 status in her germline. Details of mutations and significance are provided in Supplementary Data. These findings were confirmed using targeted 35-gene germline panel testing with NGS and with Sanger sequencing (Supplementary Fig. S1).

Somatic Mutations and Mutational Signatures from WES
The median coverage in WES was 264.20X for fresh-frozen samples, 150.26X for buffy coat samples, and 107.62X for FFPE samples. The median number of mutations per megabase (Mb) in somatic tumor samples (n = 13) was 1.59 (range: 0.97-5.65), which is concordant with the known mutation rate in TNBC (Fig. 1A).
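The mutations-per-megabase figure is a simple ratio of the somatic mutation count to the size of the sequenced target. The sketch below shows the arithmetic only; the denominator used by the authors is not stated in this section, so the 71 Mb value (the nominal size of the exome capture named in the methods) and the per-sample counts are assumptions chosen for illustration and will not reproduce the reported values.

```python
from statistics import median


def mutations_per_mb(n_mutations, target_mb):
    """Somatic mutation count divided by the sequenced target size in Mb."""
    if target_mb <= 0:
        raise ValueError("target_mb must be positive")
    return n_mutations / target_mb


# Assumed denominator: nominal capture size in Mb (not confirmed by the source).
ASSUMED_TARGET_MB = 71.0

# Hypothetical per-sample somatic mutation counts, for illustration only.
hypothetical_counts = [71, 142, 213]

per_sample = [mutations_per_mb(n, ASSUMED_TARGET_MB) for n in hypothetical_counts]
print(per_sample)
print(median(per_sample))
```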

Variant Validation and Identification of Low Allelic Prevalence Variants Using Targeted Deep Sequencing
Deep sequencing allowed us to identify low allele frequency subclonal mutations across various samples, which could have been missed in WES. Deep sequencing could not be performed on three FFPE samples (02_Bio, 04_Bio, 04_Sur) due to inadequate tissue. Of the 1,083 mutations shortlisted from WES, 77 could not be sequenced at high depth, while we could successfully design primers for 1,006 variants (Supplementary Table S3). There was concordance in 834 of the 1,006 mutations (82.9%) between exome sequencing and deep sequencing. There was a very high concordance in allele frequency of mutations between WES and targeted deep sequencing for fresh-frozen samples (99%) and moderately high concordance for FFPE samples (78%). Variant allele frequencies from exome and deep sequencing data were strongly correlated with R > 0.9 for all samples (Supplementary Fig. S2), which suggests that WES can effectively capture variant allele frequencies for almost all single-point mutations. There was discordance in 172 of 1,006 unique single point mutations in that some of them were not detected by WES but were detected by deep sequencing in that sample (Supplementary Fig. S3). This suggests that these mutations may not have been adequately covered in WES or that these were subclonal mutations detectable only by high-depth sequencing. There were 25 Tier 1 variants, 41 Tier 2 variants, 286 Tier 3 variants, and 619 Tier 4 variants across 13 samples from the 3 patients.
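The two summary statistics used in this comparison, the proportion of concordant calls and the correlation of allele frequencies between platforms, can be sketched as follows. The concordance figure uses the counts reported above; the paired allele frequencies are hypothetical, and the use of the Pearson coefficient is an assumption, since the section reports only "R".

```python
from math import sqrt


def concordance_rate(n_concordant, n_total):
    """Fraction of variants called consistently by both assays."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_concordant / n_total


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Counts reported in the text: 834 concordant of 1,006 validated variants.
print(round(100 * concordance_rate(834, 1006), 1))

# Hypothetical paired allele frequencies (WES vs. deep sequencing) for one sample.
wes_vaf = [0.05, 0.12, 0.30, 0.45]
deep_vaf = [0.04, 0.13, 0.28, 0.47]
print(pearson_r(wes_vaf, deep_vaf) > 0.9)
```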

Copy-number Analysis
Because DNA from baseline FFPE samples was exhausted, we could not subject them to SNP array analysis and only the eight fresh-frozen samples collected prospectively at the time of disease progression were subjected to this analysis. Reduced Segment (RS) analysis identified changes in 98.3% of the genome across eight samples (mean values: focal amplification 49.18%, LOH 28.14%, copy-neutral LOH 19.86%, and homozygous deletion 1.11%) with mean 1.68% (range: 1.03%-2.06%) of the genome being heterozygous diploid, consistent with previous reports (15,29). Patient-specific details are shown in the Supplementary Data. Allele-specific copy-number analysis identified a mean aberrant cell fraction of 0.79 in our samples, suggesting high tumor content for all samples assayed, with a mean ploidy of 2.6 (Table 2; Supplementary Fig. S4), which is consistent with previous reports (30).
All 3 patients showed copy-number losses at sites of known tumor suppressor genes and gains at the sites of known oncogenes, consistent with previous reports (Supplementary Fig. S5). The gene-specific copy numbers in the 3 patients are listed in Supplementary Table S4.

TNBC Shows Branching Pattern of Clonal Evolution with a Persistent Stem Clone
Phylogenetic clonal evolution in the 3 patients is shown in Fig. 2. The tumor comprised a founder clone and two subclones in all 3 patients at the time of diagnosis. Tumors from all 3 patients in our study exhibited a branching pattern of evolution, with one stem clone persisting through the lifespan of cancer, giving rise to various subclones by acquiring additional mutations, which in turn propagated further subclones of their own (Fig. 2). The stem clone (clone A, Fig. 2) in all 3 patients contained at least one Tier 1 mutation, that is, a known cancer driver mutation. However, only a few subclones gained additional Tier 1 mutations which were absent in the stem clone.

Patient_02
In Patient_02 (germline BRCA mutated), we identified seven clones (one stem and six subclones) comprised of 311 mutations in five samples collected sequentially (Fig. 2A).The treatment-naïve primary tumor biopsy obtained at diagnosis (02_Bio) showed three clones: stem clone A (cellular prevalence = 0.01), and subclones B (CP = 0.02) and D (CP = 0.21).Stem clone A which persisted through the disease course contained two Tier 1 mutations (PDGFRB p.V761I, COLA p.S1425I), four Tier 2 mutations (ARID p.S564X), and 11 Tier 3 mutations (NECTIN p.T226M, MYH p.N1922S).Subclones C, E, F, and G were gained at subsequent timepoints, of which C and F were lost but E and G persisted up to the point of last biopsy.One of the initial subclones, B, was also lost in the last biopsy.The complete list of mutations with their functional annotation is provided in the Supplementary Table S5 and detailed description of clonal architecture and dynamics over time are explained in Supplementary Data.

Patient_04
In Patient_04, we identified 518 total mutations in four sequentially collected samples, which were classified into 19 clones by PyClone, of which nine clones comprised a single mutation each (eight Tier 4 mutations and one Tier 2 mutation). We excluded these nine clones from further phylogenetic analysis and visualization using CALDER. The complete list of mutations with their functional annotation is provided in Supplementary Table S6. Three of the 10 clones (named clone AA, BB, and CC) could not be visualized by CALDER because very high read counts led to narrow confidence intervals and the inability to find values common to all clusters. Widening the confidence interval did not yield an optimum tree presenting all 10 clones. Therefore, these three clones comprising 66 mutations (clone AA 48 mutations, clone BB 4 mutations, clone CC 14 mutations) are not represented in the clonality analysis. Of note, 62 of 66 mutations in these three clones were Tier 4 mutations. We present the PyClone analysis and CALDER visualization for seven clones from four samples collected sequentially in this patient (Fig. 2B). The treatment-naïve primary tumor biopsy (04_Bio) showed three clones: stem clone A with cellular prevalence 0.01 (18 mutations), subclone C with cellular prevalence 0.21 (27 mutations), and subclone E with cellular prevalence 0.49 (four mutations).
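The exclusion of single-mutation clusters described above amounts to counting mutations per cluster and keeping clusters with at least two. The sketch below assumes a simple mapping from mutation identifier to cluster label; this data layout, the function name, and the toy labels are hypothetical and do not reflect PyClone's actual output format.

```python
from collections import Counter


def clusters_with_min_size(assignments, min_size=2):
    """Return {cluster: n_mutations} for clusters with at least min_size mutations.

    assignments maps a mutation identifier to its cluster label
    (a hypothetical layout used only for this illustration).
    """
    sizes = Counter(assignments.values())
    return {cluster: n for cluster, n in sizes.items() if n >= min_size}


# Toy example: cluster "A" has three mutations, "B" has two, "C" is a singleton.
toy_assignments = {
    "mut1": "A", "mut2": "A", "mut3": "A",
    "mut4": "B", "mut5": "B",
    "mut6": "C",
}
print(clusters_with_min_size(toy_assignments))

# Clone bookkeeping reported for Patient_04: 19 clusters, 9 singletons removed,
# 3 further clones not placed on the tree.
print(19 - 9, 19 - 9 - 3)
```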
Subclones B, D, F, and G were gained at subsequent timepoints, of which B and G were lost but D and F persisted up to the point of last biopsy. Of the initial subclones, E persisted but C was lost in the last biopsy. Stem clone A, which contained one Tier 1 mutation (TP53 p.I119S), two Tier 2 mutations (BRAF p.Q386L and KMTC p.G4125C), and two Tier 3 mutations (GOLGB p.P2841L, KDME p.R240W), underwent clonal expansion with cellular prevalence 0.06 and persisted up to the last biopsy. A detailed description of clonal architecture and dynamics over time is provided in Supplementary Data.

Patient_07
In Patient_07, we identified seven clones comprising 142 mutations in four samples collected during the disease course (Fig. 2C). The complete list of mutations with their functional annotation is provided in Supplementary Table S7.
Treatment-naïve primary tumor biopsy (07_Bio) showed three clones: stem clone A with a cellular prevalence of 0.15, subclone B with a cellular prevalence of 0.
Deep sequencing of the custom panel of somatic mutations was performed on eight plasma-derived ctDNA samples collected at the time of disease progression in the 3 patients, at an average coverage of 30,000X. In all 3 patients, all mutations identified in the stem clones were also present in ctDNA from respective plasma samples with a prevalence of more than 2% variant allele frequency (Fig. 3), as also reported by others (31). Furthermore, some subclones which could not be detected in high-depth tissue sequencing at some timepoints were detected with low variant allele frequency in corresponding ctDNA, as shown in Fig. 3 and described in detail in Supplementary Data. This suggests that ctDNA analysis can potentially better evaluate subclonal architecture compared with single-site tissue biopsy.
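Variant allele frequency in ctDNA is the share of reads at a position that carry the variant. The sketch below computes it from read counts and compares it with a 2% level; treating 2% as a cutoff is an assumption made for illustration, because the text reports it as an observation about stem clone mutations and not as a calling threshold. The read counts are hypothetical.

```python
def variant_allele_frequency(alt_reads, total_reads):
    """Fraction of reads at a position that support the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must be between 0 and total_reads")
    return alt_reads / total_reads


def exceeds_vaf_level(alt_reads, total_reads, level=0.02):
    """True if the observed VAF is above the given level (assumed 2% here)."""
    return variant_allele_frequency(alt_reads, total_reads) > level


# Hypothetical counts at roughly 30,000X coverage.
print(variant_allele_frequency(750, 30000))  # 0.025
print(exceeds_vaf_level(750, 30000))         # above the 2% level
print(exceeds_vaf_level(300, 30000))         # 1% VAF, below the 2% level
```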

Many Stem Clone Mutations Continue to be Expressed During Tumor Evolution
Variant allele frequencies were inferred from RNA-seq of eight sequential samples in 3 patients. Our results suggest that many somatic mutations, including stem and branch mutations, were expressed at the RNA level. In five of eight samples, stem clone mutations were the sole highest expressed variants and in the remaining three samples both stem and some non-stem mutations were present among the highest expressed variants. These results are shown in Fig. 4 and described in detail in Supplementary Data.

Discussion
We report the clonal architecture and evolution pattern in 3 patients with metastatic TNBC, 2 of whom had germline pathogenic BRCA1 mutation (Supplementary Table S8), using a sequential, multiple timepoint, multiomics analysis. Notably, the sample set in all 3 patients spanned the initial diagnostic biopsy in the treatment-naïve non-metastatic stage through to the ultimately fatal metastatic tumor, with intervening biopsies at each progression. Our analysis suggests that the primary non-metastatic tumor was polyclonal in all patients, comprising a stem clone and daughter clones. The tumor remained polyclonal throughout its life history with persistence of the stem clone. However, daughter clones were extinguished and gained at various timepoints, likely under the selective pressure of various chemotherapy agents (Supplementary Table S9). Interestingly, this pattern was similar in the 2 patients with germline BRCA1 mutation and the single patient without any germline pathogenic abnormality. There was no reversion of the BRCA1 mutation in somatic tumor tissue of either patient throughout their clinical course. Our analysis suggests that clonal biology and evolution are likely to be similar in those with and without germline predisposition (32). Our analysis also suggests that the subsequent relapses in the 2 patients with germline BRCA1 mutation were derived from the primary index tumor and were not second cancers. Transcriptome data from all samples in all patients confirmed their classification into the basal-like category, irrespective of the physical and temporal distance from the primary tumor, attesting to an enduring intrinsic subtype pattern through several relapses in patients with TNBC.
We detected the branching evolution pattern of mutations through the disease course in all 3 patients, with a stem clone comprising stem mutations, which acquired further mutations to branch into daughter clones. Because the stem clone was present throughout the clinical course in all 3 patients, the mutations comprising these clones could be related to treatment resistance. Notably, many mutations comprising the stem clone continued to be expressed in multiple timepoint transcriptomic analysis, further suggesting their possible association with treatment resistance. The acquisition and extinction of subclones suggest the possibility of their correlation with treatment resistance and sensitivity, respectively. Some daughter clones in initial biopsy and subsequent surgery were extinguished during metastatic evolution, suggesting that these were associated with sensitivity to (neo)adjuvant chemotherapy and radiotherapy.
In line with previous reports (4,14,23,33), there was minimal overlap in the mutational landscape across the 3 patients, with TP53 (different hotspot mutations in the 3 patients) being the only commonly mutated gene. These findings affirm the marked intertumor heterogeneity in TNBC, a potential challenge for the personalized medicine paradigm in these patients.
We found enrichment of signature 3 [homologous recombination deficiency (HRD) signature] in the 3 patients, persisting in all sequential samples, including in 1 patient without germline pathogenic BRCA mutation. This may have implications for treating patients with HRD-positive TNBC with DNA-damaging agents such as PARP inhibitors. Signature 3 has been previously reported in breast and pancreatic cancers and shows a strong association with germline and somatic BRCA1 and BRCA2 mutations, leading to defective homologous recombination repair of DNA double-strand breaks (34-36). This is consistent with the presence of germline BRCA1 mutations in 2 of our patients, while the presence of signature 3 in the third patient's tumor could be due to epigenetic silencing of homologous recombination repair genes. We also found signature 13 in all 3 patients in tumor samples obtained at 11 of 13 timepoints. This signature is associated with the AID/APOBEC family of cytidine deaminases (35), has been reported to be widespread in human cancers (37), is usually associated with HER2-like tumors (38), but has also been reported in TNBC (39). We found germline polymorphisms in APOBEC family genes in 2 of the 3 patients, one of whom also had hepatitis C, which has also been associated with the presence of APOBEC signature in tumors (37,40,41). Others have also reported the co-occurrence of signatures 3 and 13 in breast cancer (42). The presence of APOBEC signature may be associated with a benefit of immunotherapy (43), and we can speculate about the possible benefit of the combination of immunotherapy and PARP inhibitors in patients with such tumors. Our data suggest that mutational signatures that predominate at the time of diagnosis remain stable and continue to be the predominant signatures throughout the life course of a tumor through several exposures to different chemotherapy agents. Our data corroborate the results of Nik-Zainal and colleagues (16) who subjected 20 primary breast cancer samples to a 30-40X sequencing depth and applied a novel common ancestor statistical approach to identify evolutionary periods underlying subclonal divergence. Their data suggest that the accumulation of thousands of mutations is required for subclones to emerge, suggesting that cancer-specific signatures of point mutations and genomic instability emerge at late-stage disease. A limitation of their study was a lack of sequential samples from the same patients over time, which we could accomplish in this study.
The median tumor mutation burden (TMB) in our sample set was 1.59 mutations/Mb, which is consistent with previous reports in TNBC (4). Notably, TMB did not markedly change between non-metastatic and metastatic tumor samples in our patients.
We performed SNP arrays to infer copy numbers for eight fresh samples from recurrent/metastatic sites in 3 patients, in which the mean ploidy was 2.63, consistent with previous reports (44). We identified a high level of copy-number changes in our patients attesting to the high genomic instability in metastatic TNBC, which has been reported earlier (39,44-46). Gao and colleagues (15) sequenced 1,000 single cells from 12 patients with TNBC at diagnosis and found that TNBCs exhibited punctuated copy-number evolution, with a clonal blast occurring at presentation with copy numbers remaining stable throughout the disease. Each patient had one to three major clonal subpopulations that shared a common evolutionary lineage. More recently, the same group has reported (47) that most cancer cells undergo a period of transient instability wherein a large number of subclones are produced, followed by steady ongoing copy-number evolution that persists during clonal expansion of the tumor. This led the group to propose a revised model of copy-number evolution in TNBC under which TP53 mutations occur early, resulting in genomic instability with acquisition of subclones which continue to evolve as the tumor expands. In line with this finding, we observed early acquisition of TP53 mutations in all 3 patients, a high number of copy-number gains and losses in many cancer-related and other genes in these samples, followed by transient instability, with discordant copy-number profiles in 40%, 22.54%, and 40.21% reduced segments, respectively, in sequential samples from the 3 patients. Our findings of a substantial concordance of copy-number profiles in the evolving tumor suggest that CNAs could constitute stable targets for treatment throughout the life course of these tumors, should one or more of these aberrations be proven to be the drivers.
Methodologically, we found a strong correlation between WES and ultra-high depth targeted sequencing for variant allele frequency of each mutation (R > 0.9 for all samples; Supplementary Fig. S2), which has also been reported by others (48). This suggests that WES at about 200X depth may be adequate to infer the clonal architecture of tumors. Ultra-high depth sequencing captured some additional subclonal mutations present at very low allelic prevalence, but given the high correlation of VAF between the two techniques, it is unlikely that clonal architecture would differ if derived from WES compared with deep sequencing data.
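The kind of comparison described above can be sketched as a Pearson correlation between paired VAF estimates for the same mutations. This is a minimal sketch under stated assumptions: the text does not specify which correlation coefficient was used, so Pearson's is assumed here, and the VAF values and the `pearson_r` helper are hypothetical and purely illustrative.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical VAFs for the same five mutations in one sample,
# measured by WES and by targeted deep sequencing (illustrative only).
wes_vaf = [0.45, 0.30, 0.12, 0.08, 0.52]
deep_vaf = [0.47, 0.28, 0.10, 0.09, 0.50]

r = pearson_r(wes_vaf, deep_vaf)
print(f"Pearson R between WES and deep sequencing VAF: {r:.3f}")
```

A high coefficient only indicates agreement for mutations detected by both assays; low-prevalence variants seen solely by deep sequencing do not enter the calculation.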
In line with previous findings (31), our deep sequencing analysis (∼30,000X) of ctDNA can recapitulate the clonal architecture of tumors at all timepoints.
We detected the stem clone mutations in ctDNA from corresponding plasma samples with high confidence and more than 2% VAF (Fig. 3). In addition, we could detect some mutations in ctDNA which were missed in tumor biopsy, likely because of intratumor heterogeneity of the biopsied lesion. Another possibility is that these clones were seeded from a metastatic site that we did not biopsy, and a remote possibility is that these clones were seeded from a tumor that was as yet clinically undetected.
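The 2% VAF cutoff for high-confidence ctDNA calls can be illustrated with a small classification sketch. The statistical test used to call a mutation significant is not described in this section, so it is represented here by a precomputed boolean flag; the `CtdnaCall` class, the gene labels, and the read counts are hypothetical and are not data from this study.

```python
from dataclasses import dataclass

VAF_THRESHOLD = 0.02  # 2% VAF cutoff for high-confidence calls


@dataclass
class CtdnaCall:
    gene: str
    alt_reads: int
    depth: int
    significant: bool  # assumed to come from an upstream statistical test

    @property
    def vaf(self) -> float:
        """Variant allele frequency: variant-supporting reads / total reads."""
        return self.alt_reads / self.depth if self.depth > 0 else 0.0


def classify(call: CtdnaCall) -> str:
    """Label a call as high_confidence, low_vaf, or not_detected."""
    if not call.significant:
        return "not_detected"
    return "high_confidence" if call.vaf > VAF_THRESHOLD else "low_vaf"


# Hypothetical calls at roughly 30,000X depth (illustrative only).
calls = [
    CtdnaCall("GENE_A", alt_reads=900, depth=30_000, significant=True),
    CtdnaCall("GENE_B", alt_reads=150, depth=30_000, significant=True),
    CtdnaCall("GENE_C", alt_reads=3, depth=30_000, significant=False),
]
labels = {call.gene: classify(call) for call in calls}
print(labels)
```

Keeping significance and the VAF cutoff as separate criteria mirrors the two categories of detected mutations: those above 2% VAF, and those that are statistically supported but below it.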
Other single-timepoint analyses (49) have suggested a simple clonal organization comprising a stem and daughter clone(s). This can be resolved into more detailed clonal architecture only when samples from other timepoints are available, which show differing variant allele frequencies of various subclones. Kim and colleagues (18), in their single-cell analysis of TNBC before and after neoadjuvant chemotherapy, described clonal persistence or extinction in response to treatment, indicating that resistance occurred as a result of adaptive selection of genomic aberrations present at the time of diagnosis. When the detailed clonal architecture of the evolving tumor is available at various timepoints along with treatments that preceded each sample, as is the case in our dataset, it may be possible to find associations between the acquisition and extinction of clones and treatment sensitivity and resistance. In contrast, multiple-location, single-timepoint sampling can only demonstrate spatial heterogeneity.
Our analysis showed an interesting finding in 1 patient (P2), in whom subclone B was present in the initial treatment-naïve biopsy, slightly expanded in the post-neoadjuvant surgical specimen, present in the first-relapse lung metastasis sample, present in ctDNA at first relapse and two subsequent disease progressions, but absent from local chest wall samples at progressions 2 and 3. This suggests that subclones can already be seeded in distant metastatic sites at the initial diagnosis, may have tropism for particular organs, and that ctDNA may provide a complementary and perhaps more complete picture of the entire tumor burden in patients with multiple sites of metastases. Previous studies in patients with TNBC similarly suggest that different clones pre-existing in the primary tumor subsequently seed multiple metastatic lesions (22, 50). Although we biopsied only one tumor site at progression, the fact that these changes were captured in the ctDNA at all progression timepoints suggests that our findings are in line with previous reports.
The ctDNA sampling, multiomics analytic strategy, and inclusion of patients with germline pathogenic BRCA mutations are other important aspects of our study. Although there is no strong a priori reason for TNBC tumor evolution to differ in patients with and without gBRCA pathogenic variants, empirical proof of clonal patterns is provided by our study.
The patients in this study had a rapid clinical deterioration after relapse. However, the median overall survival of patients with TNBC after relapse is known to be short, in the range of 12–15 months. Our patients eventually, and relatively rapidly, became refractory to treatments after experiencing relapse, which is consistent with the biological behavior of relapsed TNBC. Given the short clinical course of all 3 patients, it is possible that inherent tumor evolutionary mechanisms and chemotherapy-related genotoxicity could have contributed to the branching pattern seen in our study. The design of our study does not allow evaluation of the contributing mechanisms.
The main limitation of our study is its small sample size of 3 patients. Therefore, our findings, including the branching evolution pattern of mutations, have to be considered preliminary and need to be replicated in larger cohorts. However, it needs to be appreciated that repeated sequential tumor sampling from individual patients through several episodes of disease progression is a challenging sample set to assemble. Second, we did not obtain tumor samples after death, so additional terminal-stage spatial heterogeneity was possibly not captured, especially from metastatic sites such as the brain, which are difficult to access. Replication in larger cohorts should therefore include autopsy-based samples. Third, each patient's initial diagnostic biopsy and surgical specimen tissue were retrospectively obtained as paraffin-embedded blocks. Although the pathologist selected the most suitable block for inclusion, it is possible that the quality of DNA extracted from some of this retrospectively collected material was lower. This could have led to systematically different tumor mutational burden between fresh tissue from metastatic sites and archival FFPE tissue.
In summary, our analysis of a sequentially sampled TNBC patient cohort suggests the presence of a branching evolutionary pattern of mutations, widespread copy-number aberrations, stability of mutational signatures and intrinsic subtype over disease course, high interpatient tumor heterogeneity, and the ability of ultra-high depth sequenced ctDNA to recapitulate the clonal architecture of the somatic tumor. The evolutionary pattern was similar in patients with and without germline pathogenic BRCA mutations.

FIGURE 1 Mutation rate and mutational signatures over disease course. A, X-axis represents samples, Y-axis represents mutations per Mb. The mutation rate in pre-therapy samples is lower than that in post-therapy samples. B, X-axis represents 96 possible classes of mutations. Y-axis represents the fraction of these classes. The sample name is shown in each box.

FIGURE 2 Graphical representation of clonal evolution observed in sequentially collected samples over time in 3 patients with TNBC. Timepoints are depicted on the X-axis and Cellular Prevalence (Clonal Prevalence) is shown on the Y-axis. Different colors in each patient represent various clones and their changing dynamics over time. Clonal phylogeny for each patient is shown on the left side of the main figure. A, Patient_02. B, Patient_04. C, Patient_07.

FIGURE 3 Heat map representation of mutations identified in fresh-frozen recurrent tumor samples and corresponding plasma samples. Mutations were clustered on the basis of clonal structure identified by PyClone and renamed as in the CALDER tool. Red rectangles denote high-confidence mutations with variant allele frequency (VAF) > 2%, while blue rectangles indicate mutations with statistical significance but VAF < 2%. A, Patient_02. B, Patient_04. The last three clusters (AA, BB, and CC) were identified in the PyClone analysis but are not represented in CALDER. C, Patient_07.

FIGURE 4 Sample-specific allele frequencies from deep sequencing were compared with allele-specific gene expression frequency from the RNA-seq experiment. Colors indicate individual clones identified by PyClone, and each mutation is represented by an "×" symbol. A, Patient_02. B, Patient_04. C, Patient_07.

TABLE 1
Detailed description of patient samples and applied genomic assays

TABLE 1
Detailed description of patient samples and applied genomic assays (Cont'd ) Six cycles of gemcitabine plus carboplatin plus oral bicalutamide (March 12, 2016 to July 08, 2016)

TABLE 2
Ploidy and aberrant cell fraction in samples

ctDNA Captures Clonal and Subclonal Mutations with Higher Sensitivity Compared with Tissue Biopsy
3 mutations (SCNA p.V1759M).Subclone B gained four Tier 4 mutations but did not gain any Tier 1, 2, or 3 mutations, while subclone E gained one Tier 2 mutation (ZEB p.K419I) and three Tier 3 mutations.Subclones E, C, D, F, and G were gained at subsequent timepoints, of which E and D were lost but C, F, and G were persisted up to the point of last biopsy.One of the initial subclones, E, persisted but C was lost in the last biopsy.Detailed description of clonal architecture and dynamics over time are explained in Supplementary Data.
13, and subclone E with a cellular prevalence of 0.49. This patient showed a unique pattern of clonal evolution wherein stem clone A resulted in a daughter subclone B from the very beginning, and subclone B persisted throughout the tumor's life course, giving rise to all other subclones. Stem clone A, which persisted throughout, contained one Tier 1 mutation (TP p.R81X), seven Tier 2 mutations (APC p.S1393T, ESR p.F591L, EZH p.D36N, FAT p.K251N), and four Tier