DNA extracted from cancer patients' whole blood may contain somatic mutations from circulating tumor DNA (ctDNA) fragments. In this study, we introduce cmDetect, a computational method for the systematic identification of ctDNA mutations using whole-exome sequencing of a cohort of tumor and corresponding peripheral whole-blood samples. Through the analysis of simulated data, we demonstrated an increase in sensitivity in calling somatic mutations by combining cmDetect with two widely used mutation callers. In a cohort of 93 patients with metastatic breast cancer, cmDetect identified ctDNA mutations in 54% of the patients and recovered somatic mutations in the cancer genes EGFR, PIK3CA, and TP53. We further showed that cmDetect detected ctDNA in 89% of patients with mutated cell-free tumor DNA confirmed by plasma analyses (n = 9) within a group of 46 pan-cancer patients. Our results prompt immediate consideration of the use of this method as an additional step in somatic mutation calling using whole-exome sequencing data with blood samples as controls. Cancer Res; 76(20); 5954–61. ©2016 AACR.

Mutations acquired in a patient's tumor can be identified by deep sequencing, using either targeted sequencing, in which only the mutations informative for clinical practice are screened, or whole-exome sequencing, which gives a broader picture of the genetics of the tumor. Unlike targeted sequencing, whole-exome sequencing is applied not only to the tumor DNA but also to germline DNA obtained from a normal cell population of the patient, to differentiate tumor-acquired (somatic) mutations from the patient's germline variations or polymorphisms. Current bioinformatics methods for somatic mutation identification from whole-exome sequencing data are designed to accurately identify DNA variations (nucleotide substitutions and small insertions or deletions) in the tumor DNA that are not detected in the germline DNA (1). However, for solid tumors, the germline DNA is often extracted from peripheral whole blood, which may contain cell-free tumor DNA (cfDNA). cfDNA is composed of cell-free fragments of DNA coming from the patient's tumor and circulating in the bloodstream (2) and has been associated with tumor burden (3). Therefore, DNA extracted from a cancer patient's blood may contain various levels of tumor DNA, also called circulating tumor DNA (ctDNA). If the level of ctDNA in the blood is high enough, some mutated DNA fragments may be detected by whole-exome sequencing of the DNA extracted from the whole-blood sample. This becomes a problem when these ctDNA mutations are detected in whole-blood samples used as normal controls to distinguish somatic from germline events, because it prevents the accurate determination of somaticity. The existence of cfDNA in the peripheral blood has already been reported for various cancers, with a variable contribution ranging from <0.1% to >10% of the DNA molecules (4–8).
ctDNA is usually considered the mutated fraction of cfDNA and was recently shown by sequencing analyses to reach up to 88% of cfDNA in metastatic patients (9). However, the impact of this contamination has rarely been taken into consideration in somatic variant calling, and whole-blood samples are still widely used as normal samples in sequencing studies of solid tumors. As an illustration, of a total of 9,202 solid tumor samples with whole-exome sequencing data available in The Cancer Genome Atlas project (TCGA) through the Cancer Genomics Hub (https://cghub.ucsc.edu/), 7,257 cases (79%) had only a blood sample as normal control. In particular, in a set of 829 breast carcinomas, 715 (86%) had only a blood sample as control, 32 (3%) had both blood and tissue as normal controls, and 82 (9%) had only tumor-adjacent tissue as normal control (Supplementary Table S1). To understand the level of ctDNA contamination in whole-exome sequencing of cancer patients' whole-blood DNA and the extent to which it affects somatic mutation calls, we developed a method for the systematic identification of ctDNA mutations from a set of tumor/blood samples. The method we propose should be executed as an additional step after classic somatic mutation analysis, to recover bona fide somatic mutations for which the allele read count or frequency in the blood is higher than expected.

### Simulated data and performance assessment

We first retrieved the whole-exome sequencing data of 99 individuals of the CEU population [Utah Residents (CEPH) with Northern and Western European Ancestry] from the 1000 Genomes Project phase III dataset (Supplementary Table S2). The BAM files were then remapped to the hg19 reference using BWA. To create a set of samples with comparable coverage, we included the 66 samples with a mean coverage between 80 and 200.

#### Simulated tumor samples.

We randomly assigned to each sample a number of variants (SNPs and indels) ranging from 50 to 500, with mutant allele frequencies following a Gaussian distribution (10). To account for tumor heterogeneity and tumor sample purity, the mean of the Gaussian distribution was set between 0.2 and 0.45. We then selected variants from the COSMIC database with CNT ≥ 2 and added them to the BAM files using Bamsurgeon (11).
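The sampling scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simulate_tumor_variants`, the standard deviation of 0.05, and the clamping of frequencies to [0.01, 1.0] are hypothetical choices that are not given in the text, and the actual read spike-in with Bamsurgeon is not reproduced.

```python
import random


def simulate_tumor_variants(rng, min_variants=50, max_variants=500,
                            mean_range=(0.2, 0.45), sd=0.05):
    """Draw a variant count and per-variant mutant allele frequencies.

    sd and the clamping bounds are assumptions made for this sketch;
    only the count range and the range of the Gaussian mean come from
    the text.
    """
    # Number of variants assigned to this sample (bounds inclusive).
    n_variants = rng.randint(min_variants, max_variants)
    # One Gaussian mean per sample, reflecting purity and heterogeneity.
    mean_af = rng.uniform(*mean_range)
    freqs = []
    for _ in range(n_variants):
        af = rng.gauss(mean_af, sd)
        # Keep each frequency inside a valid range.
        freqs.append(min(max(af, 0.01), 1.0))
    return mean_af, freqs


rng = random.Random(42)  # fixed seed so the sketch is reproducible
mean_af, freqs = simulate_tumor_variants(rng)
print(len(freqs), round(mean_af, 3))
```

Each simulated frequency would then be passed, together with a COSMIC variant position, to the tool that edits the reads.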

### Plasma mutations

Details of plasma DNA extraction and sequencing can be found in the publication by Jovelet and colleagues (9). Fastq files were processed with the Torrent Suite BaseCaller version 4.0 or 4.2. In each plasma sample, we retrieved hotspot variants using GATK HaplotypeCaller and kept those satisfying the following filters: strand bias FS < 30; variant supporting reads >4; total base depth >50; and variant allele frequency ≥0.1. We filtered out variants present in polymorphism databases (ESP, the Exome Sequencing Project, or 1000G, the European samples of the 1000 Genomes Project; ref. 18) with a minor allele frequency >0.001.
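The filters listed above translate directly into a predicate. The sketch below is a minimal illustration: the `PlasmaVariant` record and its field names are hypothetical, and it assumes the annotations have already been extracted from the caller's output.

```python
from dataclasses import dataclass


@dataclass
class PlasmaVariant:
    """Hypothetical record holding the annotations the filters need."""
    fisher_strand: float   # strand bias (FS)
    alt_reads: int         # reads supporting the variant
    depth: int             # total base depth at the site
    population_maf: float  # highest MAF in ESP / 1000G, 0.0 if absent


def passes_plasma_filters(v):
    """Apply the thresholds quoted in the text to one variant."""
    if v.depth <= 50:               # total base depth must be > 50
        return False
    vaf = v.alt_reads / v.depth     # variant allele frequency
    return (v.fisher_strand < 30          # strand bias FS < 30
            and v.alt_reads > 4           # supporting reads > 4
            and vaf >= 0.1                # allele frequency >= 0.1
            and v.population_maf <= 0.001)  # drop known polymorphisms


candidate = PlasmaVariant(fisher_strand=5.0, alt_reads=12, depth=100,
                          population_maf=0.0)
print(passes_plasma_filters(candidate))
```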

### Survival analyses

Overall survival was estimated by the Kaplan–Meier method. Correlation between the number of ctDNA mutations and survival was assessed using Cox proportional hazards models. Univariate analysis was performed using the log-rank test for categorized variables. Multivariate analysis was performed using Cox proportional hazards modeling. All factors with P < 0.10 in univariate analysis were evaluated in multivariate analysis. All P values reported are two-sided. For all statistical tests, differences were considered significant at the 5% level. Stata 13.0 was used for all statistical analyses.
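For readers unfamiliar with the Kaplan–Meier estimator, the following from-scratch sketch shows the product-limit calculation. It is not the authors' analysis, which was run in Stata; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time of each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving t.
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


# Toy data (hypothetical): four patients, one censored at time 2.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
# → [(1, 0.75), (3, 0.375), (4, 0.0)]
```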

### ctDNA mutation detection workflow

We introduce a method, cmDetect (ctDNA mutation detection), for the systematic identification of ctDNA mutations by leveraging information from the tumor and blood samples (Fig. 1). cmDetect consists of three steps and is described in detail in Materials and Methods. Briefly, the workflow first retrieves heterozygous variants in gene coding regions in each tumor independently using the Genome Analysis Toolkit (GATK; ref. 19) and selects those variants with supporting read(s) in the corresponding blood sample. The patient's germline variants and common polymorphisms are then filtered out to obtain a set of mutations that are identified with high confidence in the tumor sample and at lower levels in the blood sample. At this step, as the read frequency of the selected variants in the blood may be very small (<0.02), it is important to distinguish ctDNA mutations from sequencing biases. Therefore, the frequency of each selected variant is tested against the observed frequencies of the same variant in the pool of blood samples to estimate its probability of being a false positive (due to sequencing bias or misalignment).
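The idea behind the last step can be illustrated with a one-sided binomial test. This is a stand-in chosen for the sketch, not necessarily the test implemented in cmDetect, whose details are given in Materials and Methods. The function names, the pseudocount, and the read counts below are all hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))


def false_positive_probability(alt_reads, depth, pool_alt_reads, pool_depth):
    """Probability of seeing at least alt_reads by background error alone.

    The background rate is estimated from the other blood samples at the
    same position; the pseudocount (an assumption of this sketch) keeps
    the rate above zero when the pool shows no variant read.
    """
    background = (pool_alt_reads + 1) / (pool_depth + 2)
    return binomial_upper_tail(alt_reads, depth, background)


# Hypothetical counts: 7 variant reads out of 64 in the patient's blood,
# versus 3 variant reads out of 5,000 in the pooled blood samples.
p_signal = false_positive_probability(7, 64, 3, 5000)
# Hypothetical noisy site: 1 variant read out of 60, pool 50 out of 5,000.
p_noise = false_positive_probability(1, 60, 50, 5000)
print(p_signal < 0.05, p_noise < 0.05)
```

A small probability suggests the variant reads in the blood are unlikely to be explained by position-specific sequencing error, whereas a large one suggests the site is noisy across samples.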

Figure 1

cmDetect workflow. The method consists of three main steps including: (i) tumor variant identification based on GATK and SnpEff; (ii) selection of non-germline variants with some evidence support in the corresponding blood sample; and (iii) filtration of candidate variants based on sequencing error from the pool of blood samples.


### Benchmarking

To estimate the statistical power of the cmDetect workflow, a set of 66 tumor/blood pairs of samples was simulated on the basis of real whole-exome sequencing data from the 1000 Genomes Project (Supplementary Table S2; ref. 18). Briefly, for each sample, we derived a tumor sample by introducing cancer-causing mutations (COSMIC; ref. 20) at different read frequencies and a normal blood sample contaminated with tumor DNA at various levels (see Materials and Methods; Supplementary Table S2). In this way, we introduced a total of 12,469 somatic mutations, including 2,676 ctDNA mutations (Supplementary Table S3). We identified ctDNA mutations with cmDetect and retrieved somatic mutations with Mutect (21) and Varscan2 (22) for a range of filters on the maximum detection level of the mutation in the normal sample. Application of cmDetect identified 1,219 ctDNA mutations, all true positives (Supplementary Table S4). In addition, 1,458 (54%) ctDNA mutations were not identified by cmDetect for the following reasons: (i) the mutations did not have enough supporting evidence in the normal sample to be called a ctDNA mutation; (ii) the read frequency in the tumor and corresponding normal was comparable; and (iii) the read frequency in the normal could not be distinguished from a polymorphism (see the polymorphism filtering section in Methods and Supplementary Fig. S1). However, a large proportion (>50%) of the ctDNA mutations missed because of very low coverage in the normal were correctly identified by the somatic mutation callers [749 (51%) by Mutect and 830 (57%) by Varscan2]. By evaluating the performance of the somatic mutation callers with and without cmDetect at the various filters, we show that without decreasing the sensitivity, adding ctDNA mutations to the results of traditional somatic variant callers decreased the number of false negatives for all the different filters tested (Supplementary Table S5; Fig. 2A).
Importantly, the observed recall was not dependent on the initial filters applied in Mutect or Varscan2 (Fig. 2B), indicating that cmDetect can be efficiently combined with somatic mutation callers. This also demonstrated that the gain in sensitivity cannot be achieved by relaxing the initial mutation caller filters, for example by increasing the maximum allowed allele frequency of the variant in the blood, which will have dramatic effects on specificity.
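The three metrics used in this benchmark are standard and can be computed from counts of true positives, false positives, and false negatives. The sketch below uses hypothetical toy counts, not the values from the supplementary tables, to show how rescuing false negatives raises recall while precision stays the same when no false positive is added.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F-score) from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: a caller alone finds 80 of 100 true mutations
# with 5 false positives; adding 15 rescued ctDNA mutations (all true)
# turns 15 false negatives into true positives.
caller_only = precision_recall_f1(tp=80, fp=5, fn=20)
combined = precision_recall_f1(tp=95, fp=5, fn=5)
print([round(x, 3) for x in caller_only])
print([round(x, 3) for x in combined])
```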

Figure 2

A, Combining cmDetect with somatic mutation caller(s) improves the performance of somatic mutation identification. F-score, recall, and precision (y-axis) for Mutect (red), Mutect+cmDetect (green), Varscan2 (blue), and Varscan2+cmDetect (violet) at different filters (x-axis) are shown. Error bars correspond to 95% CIs. B, Sensitivity of somatic mutation calling when adding cmDetect to Mutect or Varscan2 for different filters. Bar plots show the numbers of false positives (blue), true positives (green), and false negatives (yellow) called by Mutect only, Mutect+cmDetect, Varscan2 only, and Varscan2+cmDetect with different filters for somatic mutations with at least one supporting read in the blood. The vertical axis shows the number of mutations; true positives are shown above 0, and false negatives are shown below 0. The horizontal axis corresponds to the different filters applied for somatic mutation calling, from the most stringent filter (left) to the least stringent filter (right).


### ctDNA is detectable in whole-exome sequencing of metastatic breast cancer patients' blood

We applied our method to a panel of 93 whole-exome sequenced pairs of breast cancer metastases/blood samples from the SAFIR01 (NCT01414933; ref. 13) and MOSCATO (NCT01566019) clinical trials (see Supplementary Tables S6 and S7, for clinical information and sequence quality metrics). We first applied Mutect and indelGenotyper and identified 7,334 somatic mutations (Supplementary Table S8) for the 93 pairs of metastasis/blood samples (see Materials and Methods). The application of cmDetect identified 263 ctDNA mutations (Supplementary Table S9) distributed in 50 patients (54%). Among these 263 mutations, 141 were correctly identified by the somatic mutation analysis whereas 122 were false negatives, defined as true somatic mutations incorrectly filtered out by the mutation callers due to the high level of detection in the control blood sample, representing 1.67% of the total number of somatic mutations. Importantly, among the false negatives we found 12 mutations reported in clinical databases (COSMIC) including two PIK3CA missense mutations (V344M, H1047R) and two stop-gain TP53 mutations (R306*, E349*), with a read frequency as high as 0.11 with seven supporting reads in the corresponding blood sample (E349*; Fig. 3). Other mutations of interest among the false negatives included one EGFR missense mutation (G322S) and one small insertion in TP53 (N288fs). In the following, we consider a total of 7,457 somatic mutations among the 93 pairs of metastases/blood samples.
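The bookkeeping in this paragraph amounts to set operations on mutation calls. The sketch below is a minimal illustration with hypothetical mutation keys; it is not the authors' code. With the counts reported above, the shared set has 141 members and the rescued set has 122, which together give the 263 ctDNA mutations.

```python
def split_ctdna_calls(cmdetect_calls, caller_calls):
    """Split cmDetect calls by whether the somatic caller also made them.

    shared   : ctDNA mutations also reported by the somatic caller
    rescued  : ctDNA mutations the caller filtered out (false negatives)
    combined : final somatic call set after adding the rescued mutations
    """
    cmdetect_calls = set(cmdetect_calls)
    caller_calls = set(caller_calls)
    shared = cmdetect_calls & caller_calls
    rescued = cmdetect_calls - caller_calls
    combined = caller_calls | cmdetect_calls
    return shared, rescued, combined


# Hypothetical keys of the form (patient, gene, protein change).
caller = {("patient_A", "GENE1", "A10T"), ("patient_A", "GENE2", "R20*")}
cmdetect = {("patient_A", "GENE2", "R20*"), ("patient_A", "GENE3", "K30fs")}
shared, rescued, combined = split_ctdna_calls(cmdetect, caller)
print(len(shared), len(rescued), len(combined))
# → 1 1 3
```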

Figure 3

Observed read frequencies of ctDNA mutations identified by cmDetect in breast cancer metastatic samples. The color of the gene corresponds to the status of the call of the mutation according to Mutect applied with default filters as described in Materials and Methods. Shown is the allele frequency in the blood (x-axis) and in the corresponding tumor (y-axis). False negative (FN; orange) corresponds to the ctDNA mutations identified by cmDetect but not by Mutect. True positive (TP; green) corresponds to the ctDNA mutations identified by cmDetect and Mutect. The mutations that are documented in COSMIC are shown in darker color.


The 50 patients had on average 5.26 ctDNA mutations, with a maximum of 38, and most of them (82%) had fewer than 10 ctDNA mutations identified. Patients with detectable ctDNA in the blood, defined as patients with at least one ctDNA mutation, had more somatic mutations in their tumor than patients with no detectable ctDNA (t test P = 0.0003; Supplementary Fig. S2). However, there was no direct correlation between the number of ctDNA mutations and the total number of mutations (Pearson ρ = 0.12; P = 0.24), indicating that the mutational load of the tumor does not reflect the amount of ctDNA detectable in the whole-blood sample. Importantly, ctDNA mutations represented up to 53% of the total number of somatic mutations, whereas false-negative rates ranged from 0% to 35% of the total number of mutations per patient (Fig. 4). We also confirmed in the 86 SAFIR01 cases that detectable ctDNA mutations were associated with poor outcome, marginally when considering only the presence/absence of ctDNA mutations (P = 0.068; Fig. 5), but very significantly when considering ctDNA mutations quantitatively (P < 0.001). Indeed, the number of mutations was highly significant in both univariate analysis (HR = 1.14; P < 0.001) and multivariate analysis (HR = 1.15; 95% CI, 1.07–1.23; P < 0.001). Finally, we found that patients who had received prior chemotherapy (Supplementary Table S6) presented more ctDNA mutations (mean = 3.07) than patients who had not (mean = 0.56; t test P = 0.002; Supplementary Fig. S3).

Figure 4

Somatic mutation profile of breast cancer metastatic patients with detectable ctDNA in whole blood. Left, total number of somatic mutations per sample; right, percentage of somatic mutation types. Mutect only, mutations identified by Mutect but not cmDetect; Mutect+cmDetect, ctDNA mutations identified by Mutect and cmDetect; cmDetect only, ctDNA mutations identified by cmDetect but not Mutect.

Figure 5

Patient survival in the SAFIR01 trial according to detectable ctDNA status. The black line (0) represents patients with no detectable ctDNA, whereas the gray line (>0) represents patients with at least one ctDNA mutation identified.


To further evaluate the extent of ctDNA contamination in early-stage disease, we applied cmDetect to a set of 60 primary tumor–blood pairs of whole-exome sequencing data from the TCGA breast cancer collection (see Materials and Methods; ref. 23). We identified 41 ctDNA mutations in 18 patients (30%) harboring from 1 to 14 ctDNA mutations with an average of two mutations per patient (Supplementary Table S10). Among these 41 mutations, 10 were present in the somatic mutation results file from TCGA whereas 31 (0.7% of the total number of somatic mutations) were false negatives including nine that were reported as clinical variants (COSMIC; Supplementary Table S10).

### Validation with plasma samples

We applied cmDetect to a set of 60 pairs of metastasis/blood samples from the MOSCATO clinical trial sequenced on a NextSeq500, including cancers of different tissues of origin (Supplementary Table S7). For 46 MOSCATO patients, 43 in this cohort and three in the breast cancer metastasis cohort, targeted sequencing data of the plasma were also available (Supplementary Table S11) and reported 12 COSMIC mutations (Supplementary Table S12), validating the presence of ctDNA in a total of nine patients. In parallel, cmDetect identified ctDNA mutations from the whole-exome sequencing data for eight of these nine (89%) patients, confirming the sensitivity of our approach (Supplementary Table S13). We found that two of the 12 COSMIC mutations identified in the plasma were also identified by cmDetect from the whole-blood sample: one TP53 stop-gain mutation (E349*) in patient BC93 with an allele frequency of 0.76 in the plasma, 0.39 in the tumor, and 0.12 in whole blood, and one TP53 stop-gain mutation (R213*) in patient PCAN39 with an allele frequency of 0.2 in the plasma, 0.35 in the tumor, and 0.013 in whole blood. It is interesting to note that BC93 presented the highest mutated DNA fraction (0.76) in the plasma and also had the highest number of ctDNA mutations (38) detected by cmDetect.

M. Campone has received speakers bureau honoraria from Novartis, AstraZeneca, Pfizer, and is a consultant/advisory board member for Pfizer and AstraZeneca. T. Bachelot has received speakers bureau honoraria from Roche and Novartis and is a consultant/advisory board member for Roche, Novartis, and AstraZeneca. J.-C. Soria is a consultant/advisory board member for AstraZeneca. No potential conflicts of interest were disclosed by the other authors.

Conception and design: Y. Fu, J. Garrabey, J.-C. Soria, L. Lacroix, F. André, C. Lefebvre

Development of methodology: Y. Fu, Y. Luo, L. Lacroix, C. Lefebvre

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C. Massard, M. Campone, C. Levy, V. Diéras, T. Bachelot, J. Garrabey, J.-C. Soria, L. Lacroix

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y. Fu, T. Filleron, M. Pedrero, N. Motté, Y. Boursin, Y. Luo, V. Diéras, L. Lacroix, F. André, C. Lefebvre

Writing, review, and/or revision of the manuscript: Y. Fu, T. Filleron, N. Motté, Y. Boursin, C. Massard, M. Campone, V. Diéras, T. Bachelot, J. Garrabey, J.-C. Soria, F. André, C. Lefebvre

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Y. Fu, C. Jovelet, N. Motté, Y. Boursin

Study supervision: L. Lacroix, C. Lefebvre

We thank Marta Jimenez from UNICANCER; Marc Deloger and Guillaume Meurice of the bioinformatics core facility of Gustave Roussy and Mélanie Letexier and Emmanuel Martin from Integragen for their assistance. We thank the NIH TCGA project for granting us access to the sequencing data (under project #9082).

This work was supported by the Breast Cancer Research Foundation (F. André), Fondation Lombard-Odier “Philanthropia” (F. André), Odyssea (F. André), Operation Parrains Chercheurs (F. André), Dassault Foundation (F. André), French NCI: INCa-DGOS-INSERM 6043 (F. André), SIRIC Socrate (J.-C. Soria), and Fondation ARC (to UNICANCER).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Wang Q, Jia P, Li F, Chen H, Ji H, Hucks D, et al. Detecting somatic point mutations in cancer genome sequencing data: a comparison of mutation callers. Genome Med 2013;5:91.
2. M, Dawson SJ. Circulating tumor cells and circulating tumor DNA for precision medicine: dream or reality? Ann Oncol 2014;25:2304–13.
3. Bettegowda C, Sausen M, Leary RJ, Kinde I, Wang Y, Agrawal N, et al. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci Transl Med 2014;6:224ra24.
4. Diehl F, Schmidt K, Choti MA, Romans K, Goodman S, Li M, et al. Circulating mutant DNA to assess tumor dynamics. Nat Med 2008;14:985–90.
5. Spindler KL, Pallisgaard N, Andersen RF, Brandslund I, Jakobsen A. Circulating free DNA as biomarker and source for mutation detection in metastatic colorectal cancer. PLoS One 2015;10:e0108247.
6. Haber DA, Velculescu VE. Blood-based analyses of cancer: circulating tumor cells and circulating tumor DNA. Cancer Discov 2014;4:650–61.
7. Leary RJ, Kinde I, Diehl F, Schmidt K, Clouser C, Duncan C, et al. Development of personalized tumor biomarkers using massively parallel sequencing. Sci Transl Med 2010;2:20ra14.
8. McBride DJ, Orpana AK, Sotiriou C, Joensuu H, Stephens PJ, Mudie LJ, et al. Use of cancer-specific genomic rearrangements to quantify disease burden in plasma from patients with solid tumors. Genes Chromosomes Cancer 2010;49:1062–9.
9. Jovelet C, Ileana E, Le Deley MC, Motté N, Rosellini S, Romero A, et al. Circulating cell-free tumor DNA analysis of 50 genes by next-generation sequencing in the prospective MOSCATO trial. Clin Cancer Res 2016;22:2960–8.
10. Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, et al. Signatures of mutational processes in human cancer. Nature 2013;500:415–21.
11. Ewing, Houlahan KE, Hu Y, Ellrott K, Caloian C, Yamaguchi TN, et al. Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection. Nat Methods 2015;12:623–30.
12. Clopper S, Pearson ES. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 1934;26:404–13.
13. Andre F, Bachelot T, Commo F, Campone M, Arnedos M, Dieras V, et al. Comparative genomic hybridisation array and DNA sequencing to direct treatment of metastatic breast cancer: a multicentre, prospective trial (SAFIR01/UNICANCER). Lancet Oncol 2014;15:267–74.
14. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010;26:589–95.
15. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 2009;25:2078–9.
16. Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 2012;6:80–92.
17. Bansal V. A statistical method for the detection of variants from next-generation resequencing of DNA pools. Bioinformatics 2010;26:i318–24.
18. Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, et al. An integrated map of genetic variation from 1,092 human genomes. Nature 2012;491:56–65.
19. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010;20:1297–303.
20. Forbes SA, Beare D, Gunasekaran P, Leung K, Bindal N, Boutselakis H, et al. COSMIC: exploring the world's knowledge of somatic mutations in human cancer. Nucleic Acids Res 2015;43:D805–11.
21. Cibulskis K, Lawrence MS, Carter SL, Sivachenko A, Jaffe D, Sougnez C, et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnol 2013;31:213–9.
22. Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res 2012;22:568–76.
23. Cancer Genome Atlas Network. Comprehensive molecular portraits of human breast tumours. Nature 2012;490:61–70.
24. Dawson SJ, Tsui DW, Murtaza M, Biggs H, Rueda OM, Chin SF, et al. Analysis of circulating tumor DNA to monitor metastatic breast cancer. N Engl J Med 2013;368:1199–209.
25. Bidard FC, Peeters DJ, Fehm T, Nolé F, R, Mavroudis D, et al. Clinical validity of circulating tumour cells in patients with metastatic breast cancer: a pooled analysis of individual patient data. Lancet Oncol 2014;15:406–14.