Next-generation sequencing (NGS) is mainly used to detect sequence variants such as single-nucleotide variants (SNVs). However, obtaining copy number results from NGS has gained momentum in both research and clinical applications. Targeted panel sequencing is a popular way to achieve high depth of coverage for selected regions of interest at an affordable cost compared to whole-genome sequencing (WGS). Shallow WGS, where average read depth can be as low as 0.1x, provides a cost-saving approach for identifying large copy number variant (CNV) events and has been used in various application areas, including oncology. Here we introduce the BAM (MultiScale Reference) algorithm, currently available in Nexus Copy Number, which handles shallow and targeted sequencing data, as well as WGS and whole-exome sequencing (WES) data, using a novel dynamic binning approach. This approach uses a Hidden Markov Model to segment the genome into target areas, using the reads in targeted regions, and backbone areas, using the off-target reads and additional areas. Coarse binning in the backbone areas provides the copy number baseline as well as large copy number events, while fine binning in the target areas provides high-resolution copy number detection in targeted regions. Shallow WGS data and targeted panel NGS data, as well as WES data with normal depth of coverage, were used for testing. The results were compared with those from microarray and/or other algorithms in Nexus Copy Number: BAM ngCGH (matched) and BAM (pooled reference). GC correction schemes based on a range of window sizes and the presence or absence of probe GC content were applied to the data and assessed for overall quality. Differences in overall read depth resulted in variable sample quality across the cohorts; however, most samples were of adequate quality for copy number estimation, and a quality threshold was assessed. Among the samples tested, the best quality after GC correction was obtained with the 50 kb region size, with or without the probes.
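The two-scale binning idea described above can be illustrated with a minimal Python sketch. It is not the Nexus Copy Number implementation: the target intervals are passed in directly instead of being found by the Hidden Markov Model step, and the function names (`build_bins`, `count_reads`, `log2_ratios`), the bin sizes, and the pseudocount are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def _tile(start, end, size, kind):
    """Cut the half-open interval [start, end) into bins of at most `size` bases."""
    return [(s, min(s + size, end), kind) for s in range(start, end, size)]


def build_bins(chrom_length, targets, backbone_size=50_000, target_size=100):
    """Fine bins inside target intervals, coarse bins everywhere else.

    `targets` is assumed to hold non-overlapping half-open (start, end) intervals.
    """
    bins, pos = [], 0
    for t_start, t_end in sorted(targets):
        bins.extend(_tile(pos, t_start, backbone_size, "backbone"))
        bins.extend(_tile(t_start, t_end, target_size, "target"))
        pos = t_end
    bins.extend(_tile(pos, chrom_length, backbone_size, "backbone"))
    return bins


def count_reads(bins, read_positions):
    """Count read start positions per bin (bins are sorted and contiguous)."""
    starts = [b[0] for b in bins]
    counts = [0] * len(bins)
    for p in read_positions:
        i = bisect_right(starts, p) - 1
        if i >= 0 and p < bins[i][1]:
            counts[i] += 1
    return counts


def log2_ratios(sample_counts, reference_counts, pseudocount=0.5):
    """Depth-normalised log2 ratio of sample to reference, bin by bin.

    Both count lists are assumed to contain at least one read in total.
    """
    s_total, r_total = sum(sample_counts), sum(reference_counts)
    return [
        math.log2(((s + pseudocount) / s_total) / ((r + pseudocount) / r_total))
        for s, r in zip(sample_counts, reference_counts)
    ]


# Toy example: a 1,000-base "chromosome" with one target at [400, 600).
bins = build_bins(1000, [(400, 600)], backbone_size=200, target_size=50)
counts = count_reads(bins, [0, 199, 200, 425, 999, 1000])
```

In this toy layout the target interval is covered by four fine bins and the flanking sequence by four coarse bins, so sparse off-target reads are pooled over wide windows while the targeted region keeps its resolution.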
Next, the copy number profiles of the samples from WES and microarray were compared for accuracy. Using microarray results as the reference for assessing calls greater than 5 Mb, no false-positive calls and one false-negative call were observed; the single false-negative call was attributable to low-level mosaicism in the tumor sample. The results indicate that relative copy number can be estimated from NGS data and is comparable to that obtained with microarray for the same targeted regions. To validate these results, the analysis series was repeated using a secondary cohort of unrelated samples analyzed by both microarray and targeted panel NGS. The BAM (MultiScale Reference) method has been tested in a variety of cancer samples. It is an ideal tool for copy number estimation from NGS results in cancer samples because it enables non-matched-pair analysis of genome, exome, and targeted NGS data.
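A comparison of this kind, restricted to calls larger than 5 Mb, could be sketched as follows. The abstract does not state its matching rule, so the criteria here (same chromosome, same event type, at least 50% reciprocal overlap) are assumptions, as are the function names; the calls listed are made-up toy data, not results from the study.

```python
MIN_SIZE = 5_000_000  # only calls larger than 5 Mb are assessed


def overlap_fraction(a, b):
    """Reciprocal overlap of two (chrom, start, end, kind) calls on one chromosome."""
    inter = min(a[2], b[2]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return min(inter / (a[2] - a[1]), inter / (b[2] - b[1]))


def matches(a, b, min_overlap=0.5):
    """Assumed rule: same chromosome, same event type, sufficient reciprocal overlap."""
    return a[0] == b[0] and a[3] == b[3] and overlap_fraction(a, b) >= min_overlap


def compare_calls(reference, test, min_size=MIN_SIZE, min_overlap=0.5):
    """Return (false_positives, false_negatives) relative to the reference calls."""
    ref = [c for c in reference if c[2] - c[1] > min_size]
    tst = [c for c in test if c[2] - c[1] > min_size]
    false_neg = [r for r in ref if not any(matches(r, t, min_overlap) for t in tst)]
    false_pos = [t for t in tst if not any(matches(t, r, min_overlap) for r in ref)]
    return false_pos, false_neg


# Toy calls: (chromosome, start, end, event type).
array_calls = [
    ("chr1", 10_000_000, 30_000_000, "gain"),
    ("chr8", 0, 40_000_000, "loss"),
    ("chr3", 1_000_000, 2_000_000, "loss"),  # below 5 Mb, ignored
]
ngs_calls = [("chr1", 11_000_000, 29_000_000, "gain")]

false_pos, false_neg = compare_calls(array_calls, ngs_calls)
```

Filtering both call sets by size before matching keeps the sketch simple, but a call just above 5 Mb on one platform and just below it on the other would then be counted as a discrepancy; a production comparison would need to handle that boundary case.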

Citation Format: Andrea J. OHara, Zhiwei Che, Soheil Shams. Copy number estimation from targeted and shallow sequencing in cancer samples [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2017; 2017 Apr 1-5; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2017;77(13 Suppl):Abstract nr 3582. doi:10.1158/1538-7445.AM2017-3582