In this issue of Cancer Discovery, Li and colleagues provide a blueprint for the identification and functional validation of cancer-associated mutations in noncoding regions of the genome. Integration of whole-genome sequencing and high-throughput epigenome editing screens is starting to reveal the extent to which noncoding genetic lesions contribute to cancer.
See related article by Li et al., p. 724.
Cancer initiation and progression are driven by the acquisition of genetic mutations. Thanks to multi-institutional sequencing collaborations such as The Cancer Genome Atlas and the Catalogue of Somatic Mutations in Cancer, mutations in protein-coding regions are well documented. In many cases, the effect of recurrent exon mutations on protein translation and function is known. However, nearly 99% of the genome lies outside protein-coding exons and instead contains gene-regulatory regions, structural elements, and noncoding RNAs. The role of the noncoding genome in cancer biology remains poorly understood.
Noncoding regions that regulate the transcription of nearby genes are termed cis-regulatory elements (CRE). Several studies have linked CRE mutations to cancer progression, including TERT promoter mutations that increase TERT expression in melanoma (1, 2). In breast cancer, promoters are mutated at frequencies similar to those of protein-coding regions, and these mutations often affect transcription factor binding sites (3). Beyond promoters, the best-characterized CREs are enhancers, which are located in introns and intergenic regions. Enhancers promote gene transcription from a distance of a few kilobases to several megabases. The characterization of CREs has been greatly facilitated by epigenetic technologies such as Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) to measure chromatin accessibility and chromatin immunoprecipitation sequencing (ChIP-seq) to map histone modifications. Chromatin accessibility can be used as a proxy for active regions. Enhancers are enriched for histone 3 lysine 4 mono- and dimethylation (H3K4me1 and H3K4me2). The presence of H3K27 acetylation (H3K27ac) is often equated with enhancer activation (4). Depending on analysis thresholds, 10,000 to 100,000 active enhancers can be mapped in most cell types, and their activity is highly cell-type dependent. Our molecular understanding of how enhancers contribute to cancer biology has steadily improved over recent years (5). For example, mutations that create de novo binding sites for the transcription factor MYB lead to superenhancer initiation and upregulation of the TAL1 oncogene in T-cell acute lymphoblastic leukemia (6). Another mechanism that can drive oncogene transcription is enhancer hijacking, such as rearrangements that result in MYC upregulation by BCL6 enhancers in B-cell lymphomas (7). These studies emphasize the importance of comprehensive identification and characterization of genetic lesions in the noncoding genome.
This issue of Cancer Discovery features an integrated approach that discovers noncoding mutations and follows up with high-throughput epigenome editing screens and mechanistic validation (Fig. 1; ref. 8). Li and colleagues used published datasets of H3K27ac mapping to define blood-associated CREs and designed an enrichment panel for 22,262 CREs as well as 86 leukemia-associated protein-coding genes. Sequencing of enriched targets was performed on 120 samples spanning healthy donors, acute myeloid leukemia (AML), B-cell lymphoma, and acute lymphoblastic leukemia. The analysis pipeline incorporated five mutation callers for single-nucleotide variants and insertions/deletions (indels), emphasizing confidence over sensitivity. Using these methods, the authors discovered recurrent noncoding mutations in 1,836 CREs.
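To make the idea of favoring confidence over sensitivity concrete, the following Python sketch keeps only variants supported by several callers and then counts how many samples carry a mutation in each CRE. It is an illustration under stated assumptions, not the pipeline used by Li and colleagues: the voting rule, the thresholds `min_callers` and `min_samples`, the caller names, and the toy variants are all hypothetical.

```python
from collections import Counter


def consensus_variants(calls_by_caller, min_callers=3):
    """Keep variants reported by at least `min_callers` callers.

    `calls_by_caller` maps a caller name to an iterable of variants,
    each written as a (chrom, pos, ref, alt) tuple. The voting rule and
    the default threshold are assumptions made for illustration.
    """
    support = Counter()
    for variants in calls_by_caller.values():
        support.update(set(variants))  # each caller votes once per variant
    return {variant for variant, n in support.items() if n >= min_callers}


def recurrent_cres(mutated_cres_by_sample, min_samples=2):
    """Return CREs mutated in at least `min_samples` samples, with counts."""
    counts = Counter()
    for cres in mutated_cres_by_sample.values():
        counts.update(set(cres))  # count each CRE once per sample
    return {cre: n for cre, n in counts.items() if n >= min_samples}


# Toy input: five hypothetical callers run on one sample.
calls = {
    "caller_a": [("chr12", 100, "A", "G"), ("chr2", 50, "C", "T")],
    "caller_b": [("chr12", 100, "A", "G")],
    "caller_c": [("chr12", 100, "A", "G"), ("chr2", 50, "C", "T")],
    "caller_d": [("chr12", 100, "A", "G")],
    "caller_e": [("chr7", 9, "G", "GA")],
}
high_confidence = consensus_variants(calls, min_callers=3)

# Toy input: CREs carrying a high-confidence variant in each sample.
recurrent = recurrent_cres(
    {"s1": ["cre1", "cre2"], "s2": ["cre1"], "s3": ["cre3"]},
    min_samples=2,
)
```

Raising `min_callers` discards variants seen by only one or two callers, trading sensitivity for confidence, which is the general trade-off described above.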
Once recurrently mutated CREs have been identified, assessing the functional impact of the noncoding mutations is a major challenge. The CRISPR/Cas9 system allows high-throughput genome-editing screens; however, the effect of small indels on enhancer function is difficult to predict. To address this, the authors developed an epigenome editing system that combines recruitment of deactivated Cas9 (dCas9) with recruitment of MCP RNA-binding proteins, each fused to epigenetic regulator domains (9). With the ability to target activating and repressive domains to CREs, single guide RNA dropout and enrichment screens were performed in AML cells.
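Dropout and enrichment in such screens are commonly read out as the change in each guide's relative abundance between an early and a late time point. The Python sketch below shows that calculation in its simplest form, assuming read counts per guide are already available. The function names, the pseudocount, the median aggregation per CRE, and the toy counts are hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict
from statistics import median


def log2_fold_changes(start_counts, end_counts, pseudocount=1.0):
    """Log2 change in each guide's read fraction from start to end.

    Counts are scaled by library size so that sequencing depth does not
    drive the result; the pseudocount avoids taking the log of zero.
    """
    total_start = sum(start_counts.values())
    total_end = sum(end_counts.values())
    lfc = {}
    for guide, c0 in start_counts.items():
        c1 = end_counts.get(guide, 0)
        f0 = (c0 + pseudocount) / total_start
        f1 = (c1 + pseudocount) / total_end
        lfc[guide] = math.log2(f1 / f0)
    return lfc


def per_cre_score(lfc, guide_to_cre):
    """Summarize guides targeting the same CRE by their median log2 change."""
    grouped = defaultdict(list)
    for guide, value in lfc.items():
        grouped[guide_to_cre[guide]].append(value)
    return {cre: median(values) for cre, values in grouped.items()}


# Toy counts for four hypothetical guides targeting two hypothetical CREs.
start = {"g1": 100, "g2": 100, "g3": 100, "g4": 100}
end = {"g1": 10, "g2": 20, "g3": 200, "g4": 170}
guide_to_cre = {"g1": "cre_A", "g2": "cre_A", "g3": "cre_B", "g4": "cre_B"}

scores = per_cre_score(log2_fold_changes(start, end), guide_to_cre)
```

In this toy example `cre_A` drops out (negative score) and `cre_B` is enriched (positive score). How such a score is interpreted depends on the recruited domain: with a repressive domain, dropout would be consistent with a CRE that supports growth, whereas with an activating domain it would be consistent with a CRE that restrains growth.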
The epigenetic screen revealed CREs that were associated with growth of the AML cell line MKPL-1. One such CRE was located 135 kb upstream of KRAS, and its identity as an oncogenic enhancer acting through KRAS was supported by chromatin accessibility, H3K27ac mapping, and chromosome conformation capture. Conversely, a tumor-suppressive CRE was located 11 kb from the PER2 tumor suppressor gene. The oncogenic role of the KRAS enhancer and the tumor-suppressive role of the PER2 enhancer were supported by AML cell growth assays in vitro and in xenotransplanted mice. Both enhancers had additional mutations in orthogonal cancer whole-genome sequencing datasets, supporting their recurrent contribution to cancer development.
The CRE mutations were located close to transcription factor binding sites (motifs) for the nuclear receptor (NR) family transcription factors PPARG and RXRA. Indeed, follow-up studies showed that PPARG and RXRA bind to the KRAS and PER2 enhancers and affect enhancer activity in reporter assays, and that knock-in of the identified noncoding mutations affects KRAS and PER2 expression. Combined with motif analyses, these findings establish that NR family transcription factor binding site mutations can affect the growth of leukemia cells.
This work highlights two aspects of present-day science. First, publicly available datasets and computational tools are indispensable. Data from different sources allowed the authors to identify and evaluate CREs based on chromatin accessibility, histone modifications, and distant contact sites mapped by chromosome conformation capture. Five computational tools were integrated for variant calling. The conclusions were further supported by intersection with orthogonal sequencing datasets, patient survival analyses, and transcription factor motif enrichment tools. Second, the work underscores the importance of technology development in the field of epigenomics. Perhaps most notable are epigenome editing methods that now enable high-throughput screens with activation and repression of CREs. Functional validation involved chromatin immunoprecipitation, enhancer reporters, electrophoretic mobility shift assays, and variant knock-in by CRISPR/Cas9. This range of tools shows how existing datasets and innovative technologies can drive biological understanding.
Despite the extensive nature of the studies by Li and colleagues, we are just beginning to understand the role of the noncoding genome in cancer cell growth, differentiation, and environmental interactions. With the recent release of whole-genome sequences from thousands of tumors across tissues (10), additional experimental and computational innovations will help to clarify the complex mechanisms by which genomic alterations contribute to cancer.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.