A comprehensive understanding of the mutations that drive tumorigenesis and disease progression is essential for elucidating tumor biology and designing precision therapies. The landscape of driver mutations in the protein-coding genome has been well characterized by large exome-sequencing studies. Surprisingly, many tumors do not have mutations in any known protein-coding driver. Non-coding driver mutations are hypothesized to explain many of these cases, but aside from a few regions, such as the TERT promoter, our understanding of drivers in the complex regulatory genome remains limited. To fill this gap, we analyzed 150,000 cis-regulatory modules with clustered transcription factor binding sites in 1,844 whole cancer genomes from the ICGC-TCGA PCAWG project. Using a new statistical method, ActiveDriverWGS, we identified dozens of frequently mutated regulatory elements (FMREs) enriched in non-coding SNVs and indels (FDR < 0.05), with many structural rearrangements and focal copy number alterations in additional samples. The FMREs were enriched in super-enhancers, long-range chromatin interactions and H3K27ac marks derived from primary tumors, suggesting a gene regulatory role for these mutations through three-dimensional genome organization. The interaction network of chromatin loops and FMREs revealed putative target genes located dozens to hundreds of kilobases away from the mutated regulatory elements. We found known and putative oncogenes and tumor suppressors whose expression correlated significantly with mutations in FMREs, suggesting novel oncogenic mechanisms. Most of the FMREs were also confirmed by additional driver discovery methods, lending confidence to our statistical approach. We also validated ActiveDriverWGS on protein-coding sequence and accurately recovered known driver genes.
The non-coding regulatory genome is characterized by diverse mutational processes, regional hypermutation and technically challenging areas with suboptimal sequencing coverage. Thus, our findings, most of which are reported for the first time, should be carefully vetted and experimentally validated in future studies. Our integrative analysis of somatic mutations, cis-regulatory regions and long-range chromatin interaction networks is a novel framework for cancer discovery and reveals the largest set of potential non-coding drivers reported to date in a pan-cancer cohort.
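The abstract does not state how the FDR < 0.05 threshold for calling FMREs was computed. As a minimal sketch, assuming the Benjamini-Hochberg procedure (a common choice for multiple-testing correction, not confirmed by this abstract) and entirely hypothetical element names and p-values, selecting elements at that threshold could look like this:

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping q-values monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-element p-values (illustrative only, not study data).
element_p_values = {
    "element_A": 0.0001,
    "element_B": 0.004,
    "element_C": 0.04,
    "element_D": 0.2,
    "element_E": 0.6,
}

names = list(element_p_values)
q_values = benjamini_hochberg([element_p_values[n] for n in names])

# Keep elements whose adjusted p-value falls below the FDR threshold.
significant = [n for n, q in zip(names, q_values) if q < 0.05]
print(significant)
```

In this toy input, element_C has a raw p-value below 0.05 but does not pass after adjustment, which shows why correcting across all tested elements matters when 150,000 regions are screened.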
Citation Format: Juri Reimand. Candidate non-coding driver mutations in super-enhancers and long-range chromatin interaction networks across 1,800 whole cancer genomes [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 2354.