Whole-genome sequencing (WGS) has started a new era in human genetics in which data can be used to more fully understand the role of genetic variation in common complex diseases, including the role of less frequent and rare variants and structural variation. To explore the impact of these variants on colorectal cancer (CRC) risk, we conducted the first large-scale WGS study for CRC, including 1,961 CRC cases and 981 controls. These WGS data, as well as those from the Haplotype Reference Consortium, were imputed into 13,104 CRC cases and 15,521 controls with genome-wide association study (GWAS) data that are part of the Colorectal Cancer Family Registry (CCFR) and the Genetics and Epidemiology of Colorectal Cancer Consortium (GECCO). Focusing on rare and less frequent variants, insertions, and deletions, we observed potentially novel variants: a less frequent variant (MAF = 0.026) on chromosome 5 located in NREP/STARD4-AS1 (p-value 4E-08); and a novel rare multi-allelic variant (MAF = 0.003) on chromosome 9 near KLF9 and TRPM3 (p-value 2E-09; the other allele of this multi-allelic variant had a MAF of 0.0003 and p-value of 0.55). Furthermore, we observed an independent locus close to the known region 8q24 that was located upstream of GSDMC (MAF = 0.16, p-value 5E-08). Within the known region 8q23/EIF3H, we identified several low-frequency variants with similar MAF (0.0181 to 0.0204), including a 6 bp deletion, with p-values between 4E-08 and 1E-09, that were independent of the common variant signal in this region. In addition, we identified statistically significant (p<5E-08) deletions, insertions, and an essential splice site within known GWAS loci that present interesting candidates for functional studies. We will follow up these findings in independent samples from the Colorectal Cancer Transdisciplinary Study (CORECT) and CCFR, as well as additional samples currently genotyped in GECCO.
In conclusion, next-generation sequencing combined with imputation in large GWAS data sets has the potential to identify novel low-frequency and rare genetic variants, aid fine-mapping of known CRC susceptibility loci, and point to interesting functional candidates.

Citation Format: Jeroen Huyghe, Sai Chen, Hyun M. Kang, Tabitha A. Harrison, Sonja I. Berndt, Stephane Bézieau, Hermann Brenner, Graham Casey, Andrew T. Chan, Jenny Chang-Claude, Gallinger J. Steven, Stephen B. Gruber, Andrea Gsur, Michael Hoffmeister, Thomas J. Hudson, Loic Le Marchand, Polly A. Newcomb, John D. Potter, Conghui Qu, Martha L. Slattery, Joshua D. Smith, Emily White, Li Hsu, Goncalo R. Abecasis, Deborah A. Nickerson, Ulrike Peters. Large scale whole genome sequencing with imputation into GWAS improves our understanding of the genetic architecture of colorectal cancer. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 5230.