Promoter elements play important roles in isoform and cell type–specific expression. We surveyed the epigenomic promoter landscape of gastric adenocarcinoma, analyzing 110 chromatin profiles (H3K4me3, H3K4me1, H3K27ac) of primary gastric cancers, gastric cancer lines, and nonmalignant gastric tissues. We identified nearly 2,000 promoter alterations (somatic promoters), many deregulated in various epithelial malignancies and mapping frequently to alternative promoters within the same gene, generating potential pro-oncogenic isoforms (RASA3). Somatic promoter–associated N-terminal peptides displaying relative depletion in tumors exhibited high-affinity MHC binding predictions and elicited potent T-cell responses in vitro, suggesting a mechanism for reducing tumor antigenicity. In multiple patient cohorts, gastric cancers with high somatic promoter usage also displayed reduced T-cell cytolytic marker expression. Somatic promoters are enriched in PRC2 occupancy, display sensitivity to EZH2 therapeutic inhibition, and are associated with novel cancer-associated transcripts. By generating tumor-specific isoforms and decreasing tumor antigenicity, epigenomic promoter alterations may thus drive intrinsic tumorigenesis and also allow nascent cancers to evade host immunity.

Significance: We apply epigenomic profiling to demarcate the promoter landscape of gastric cancer. Many tumor-specific promoters activate different promoters in the same gene, some generating pro-oncogenic isoforms. Tumor-specific promoters also reduce tumor antigenicity by causing relative depletion of immunogenic peptides, contributing to cancer immunoediting and allowing tumors to evade host immune attack. Cancer Discov; 7(6); 630–51. ©2017 AACR.

This article is highlighted in the In This Issue feature, p. 539

Gastric cancer is the third leading cause of global cancer mortality with high prevalence in many East Asian countries (1). Patients with gastric cancer often present with late-stage disease (2, 3), and clinical management remains challenging, as exemplified by several recent negative phase II and phase III clinical trials (4–7). At the molecular level, studies have identified characteristic gene mutations (8, 9), copy-number alterations, gene fusions (10), and transcriptional patterns in gastric cancer (11, 12). However, few of these have been clinically translated into targeted therapies, with the exception of HER2-positive gastric cancer and trastuzumab (13). There is thus a strong need for additional and more comprehensive explorations of gastric cancer, as these may highlight new biomarkers for disease detection, predicting patient prognosis or responses to therapy, as well as new therapeutic modalities.

Promoter elements are cis-regulatory elements that function to link gene transcription initiation to upstream regulatory stimuli, integrating inputs from diverse signaling pathways (14). Promoters represent an important reservoir of biological, functional, and regulatory diversity, as current estimates suggest that 30% to 50% of genes in the human genome are associated with multiple promoters (15), which can be selectively activated as a function of developmental lineage and cellular state (16). Differential usage of alternative promoters causes the generation of distinct 5′ untranslated regions (5′ UTR) and first exons in transcripts, which in turn can influence mRNA expression levels (17), translational efficiencies (18, 19), and the generation of different protein isoforms through gain and loss of 5′ coding domains (15, 20). In cancer, alternative promoters in genes such as ALK (21), TP53 (22), LEF1 (23), and CYP19A1 (24) have been reported, producing cancer-specific isoform variants with oncogenic properties. To date, promoter alterations in cancer have been largely studied on a gene-by-gene basis, and very little is known about the global extent of promoter-level diversity in gastric cancer and other solid malignancies.

Promoters in the genome can be experimentally identified by various methods, which can be broadly divided into RNA-based and epigenomic approaches. The former involve technologies such as RNA sequencing (RNA-seq), cap analysis of gene expression (CAGE), and global run-on sequencing (25–27). For the latter, active promoters have been shown to exhibit characteristic chromatin modifications, specifically H3K4me3 positivity, H3K27ac positivity, and H3K4me1 depletion (28–31). Compared with transcriptome sequencing (25), using histone modifications to identify promoters carries certain advantages. First, epigenome-guided promoter identification allows genomic localization of the promoter element itself, rather than the ensuing transcript product. Second, particularly for clinical samples, epigenome-guided promoter identification is less prone to transcript degradation artifacts caused by 5′ RNA exonucleases (32). Epigenome-marked promoters may also highlight transcript classes not easily detectable by other means, such as promoters originating via recapping events, short-lived RNAs (27), or unstable RNAs with greater sensitivity to exosome-mediated decay, such as promoter upstream transcripts (33) and/or enhancer RNAs (34–36).

In this work, we analyzed a gastric cancer cohort of primary samples and cell lines to survey the landscape of altered promoter elements in gastric cancer. Our study applied microscale histone modification profiling [Nano–chromatin immunoprecipitation sequencing (Nano-ChIP-seq)] to profile primary cancers (37, 38), which allows the measurement of epigenomic modifications in vivo, compared with laboratory-cultured cell lines that may harbor epigenomic artifacts due to in vitro culture (39, 40). By comparing the epigenomic promoter profiles of primary gastric cancers and matched normal tissues, we observed pervasive alterations in promoter usage in gastric cancer, and an important role for somatic promoters in enhancing gastric cancer transcript, protein isoform, and immunogenic diversity. We also found that many somatic promoters observed in gastric cancer are deregulated in other cancer types, supporting a generalized role for somatic promoters in solid malignancies. To our knowledge, our study represents one of the largest and most comprehensive surveys of somatic promoters in any single tumor type.

Identifying Epigenomic Promoter Alterations in Gastric Cancer

Using Nano-ChIP-seq (37), we profiled three histone modification marks (H3K4me3, H3K27ac, and H3K4me1) across 17 gastric cancers, matched normal gastric mucosae (34 samples), and 13 gastric cancer cell lines, generating 110 epigenomic profiles (Supplementary Tables S1 and S2 provide clinical and sequencing metrics; Fig. 1A). Quality control of the Nano-ChIP-seq data was performed using two independent methods: assessing ChIP enrichment at known promoters, and applying the ChIP-seq quality control and validation tool CHANCE (ChIP-seq analytics and confidence estimation; ref. 41). Comparisons of Nano-ChIP-seq read densities at 1,000 promoters associated with highly expressed protein-coding genes confirmed successful enrichment in all H3K27ac and H3K4me3 libraries. CHANCE analysis also revealed that the large majority (81%) of samples exhibited successful enrichment (Supplementary Table S2). We have also previously shown that Nano-ChIP signals exhibit good concordance with orthogonal ChIP-qPCR results (38).

Figure 1.

Somatic promoters in primary gastric adenocarcinoma. A, Example of an unaltered gastric cancer (GC) promoter. The UCSC genome track of the RHOA TSS (shaded box) highlights similar H3K4me3 signals in gastric cancer and matched normal samples. Similar signals are seen in gastric cancer lines. The bottom two tracks display similar levels of RNA expression in the same gastric cancer and matched normal samples (RNA-seq). B, Example of a gained somatic promoter. The UCSC genome track of the CEACAM6 TSS (shaded box) highlights gain of H3K4me3 signals in gastric cancer samples and gastric cancer lines, compared with matched normal samples. In contrast, no changes are observed at the TSS of CEACAM5, an adjacent gene. Concordant tumor-specific gain of RNA expression is shown in the bottom two tracks displaying RNA-seq profiles of the same gastric cancer and matched normal samples. C, Example of a lost somatic promoter. The UCSC genome track of the ATP4A TSS (shaded box) highlights loss of H3K4me3 signals in gastric cancer samples and gastric cancer lines compared with matched normal samples. Concordant tumor-specific loss of RNA expression is shown in the bottom two tracks displaying RNA-seq profiles of the same gastric cancer and matched normal samples. D, Heat map of H3K4me3 read densities (row scaled) of somatic promoters (rows) in primary gastric cancer and matched normal samples. E, Correlation between H3K4me3 promoter signals and H3K27ac activity signals in primary gastric samples (r = 0.91, P < 0.001). Each data point corresponds to a single H3K4me3hi/H3K4me1lo region. Gold points, all promoters; blue points, somatic promoters. Analysis was performed using data from 16 N/T pairs (Supplementary Table S2). F, Top five gene sets associated with canonical gained and lost somatic promoters. Gene sets associated with genes upregulated and downregulated in gastric cancer are rediscovered. Also note that gene sets related to H3K27me3 and SUZ12, a PRC2 component, are enriched. 
ESC, embryonic stem cell; HCP, high-CpG promoter.

To enable accurate promoter identification, we integrated data from multiple histone modifications, selecting H3K4me3 regions simultaneously codepleted for H3K4me1 (“H3K4me3hi/H3K4me1lo regions”; Supplementary Fig. S1; Methods; ref. 42). Comparisons against data from external sources, including GENCODE reference transcripts, ENCODE chromatin state models, and CAGE databases, validated the vast majority of H3K4me3hi/H3K4me1lo regions as true promoter elements (Supplementary Text; Supplementary Fig. S1). Because primary gastric tissues comprise several different tissue types, including epithelial cells, immune cells, and stroma, we further confirmed that our promoter profiles were reflective of bona fide gastric epithelia by comparison against Epigenome Roadmap data for gastric and nongastric tissues. Gastric tumor and matched normal promoter profiles exhibited the highest correlations to Roadmap gastric mucosae and were distinct from other gastrointestinal tissues (small intestine, colon mucosa, colon sigmoid), stomach-associated muscle, skin, and blood (CD14; Supplementary Fig. S2). Primary tissue promoter profiles also showed a significant overlap with promoter profiles of gastric cancer cell lines (87%), which are purely epithelial in origin, compared with gastrointestinal fibroblast lines (58%–69%), and colon carcinoma lines (59%–74%; Supplementary Fig. S2).
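The selection of H3K4me3hi/H3K4me1lo regions can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Region` record, the signal units, and both thresholds (`me3_min`, `ratio_min`) are assumptions chosen only to show the idea of keeping regions that are strongly marked by H3K4me3 while being depleted for H3K4me1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int
    h3k4me3: float  # normalized H3K4me3 signal (hypothetical units)
    h3k4me1: float  # normalized H3K4me1 signal (hypothetical units)


def select_promoter_like(regions, me3_min=2.0, ratio_min=1.0):
    """Keep regions with strong H3K4me3 that also exceed their H3K4me1 signal.

    me3_min and ratio_min are illustrative thresholds, not values from the paper.
    """
    return [
        r for r in regions
        if r.h3k4me3 >= me3_min and r.h3k4me3 > ratio_min * r.h3k4me1
    ]


# Hypothetical input: one promoter-like region, one enhancer-like region
# (H3K4me1 dominant), and one weakly marked region.
example_regions = [
    Region("chr1", 100, 600, h3k4me3=5.0, h3k4me1=0.5),
    Region("chr1", 1000, 1500, h3k4me3=4.0, h3k4me1=6.0),
    Region("chr2", 1, 500, h3k4me3=1.0, h3k4me1=0.1),
]
selected = select_promoter_like(example_regions)
```

Only the first region passes both filters; the second fails on H3K4me1 dominance and the third on weak H3K4me3 signal.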

In total, we mapped approximately 23,000 promoter elements in the Nano-ChIP-seq cohort. Visual exploration of these promoter elements identified three main promoter categories: unaltered promoters, promoters gained in tumors (gained somatic or tumor-specific promoters), and promoters present in normal gastric tissues but lost or decreased in gastric cancer (lost somatic or normal-specific promoters; Fig. 1A–C). Representative examples of unaltered promoters included RHOA (Fig. 1A), whereas CEACAM6, an intercellular adhesion gene, exhibited somatic promoter gain at the CEACAM6 transcription start site (TSS) in tumor samples and cell lines (Fig. 1B). Conversely, ATP4A, a parietal cell–associated H+/K+ ATPase with decreased expression in gastric cancer (43), exhibited somatic promoter loss (Fig. 1C). The CEACAM6 and ATP4A promoter alterations were correlated with increased CEACAM6 and decreased ATP4A gene expression, respectively, in the same samples (Fig. 1B and C).

Previous studies have established distinct molecular subtypes of gastric cancer (11, 12, 44). Because of limited sample sizes, however, we elected in the current study to identify promoter alterations (“somatic promoters”; ref. 45–47) present in multiple gastric cancer tissues relative to control tissues irrespective of subtype. Focusing on recurrent alterations also has the benefit of reducing potential artifacts due to “private” epigenomic variation or individual sample-specific technical errors. Using two complementary read count–based algorithms commonly used for analysis of ChIP-seq data (48–50), we identified approximately 2,000 highly recurrent somatic promoters, of which 75% were gained in gastric cancers [fold change (FC) ≥ 1.5, q < 0.1]. Two-dimensional heat map clustering and principal component analysis (PCA) plots based on somatic promoters confirmed a separation of gastric cancer samples from normal samples (Fig. 1D; Supplementary Fig. S3). Somatic promoter H3K4me3 levels were also highly correlated with H3K27ac signals, commonly regarded as a marker of active regulatory activity (42, 51). This correlation was observed across all somatic promoters (r = 0.84, P < 0.001, Fig. 1E), and also when gained somatic promoters and lost somatic promoters were analyzed separately (r = 0.78, P < 0.001 for gained somatic; r = 0.82, P < 0.001 for lost somatic, Supplementary Fig. S3). Pathway analysis revealed that both gained somatic promoters and lost somatic promoters were significantly associated with expression gene sets previously reported to be upregulated and downregulated in gastric cancer, respectively (Fig. 1F). These included upregulated oncogenes (MET, ABL2), cell adhesion genes (CEACAM6), and claudin family members (CLDN7, CLDN3). Fifteen percent to 18% of somatic promoters mapped to noncoding RNAs, including HOTAIR and PVT1, previously associated with gastric cancer (Supplementary Table S3; refs. 52, 53). 
Additional analyses at increasing thresholds of stringency (FC from 1.5 to 2 and FDR from 0.1 to 0.001) yielded similar results, supporting the robustness of this analysis (Supplementary Fig. S3). These results demonstrate that normal gastric epithelia and gastric cancers can be distinguished on the basis of epigenomic promoter profiles.
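The thresholding step (FC ≥ 1.5, q < 0.1) can be sketched as follows. This is an illustration under stated assumptions, not the read count–based algorithms cited above: per-promoter P values are assumed to come from an upstream statistical test, q values are obtained here with a plain Benjamini-Hochberg adjustment, and the symmetric cutoff for lost promoters (FC ≤ 1/1.5) is an assumption. All promoter names and signal values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues


def call_somatic(promoters, fc_min=1.5, q_max=0.1):
    """Classify promoters given (name, tumor_signal, normal_signal, pvalue)."""
    qvalues = benjamini_hochberg([p[3] for p in promoters])
    calls = {}
    for (name, tumor, normal, _), q in zip(promoters, qvalues):
        fold_change = tumor / normal
        if q < q_max and fold_change >= fc_min:
            calls[name] = "gained"
        elif q < q_max and fold_change <= 1 / fc_min:
            calls[name] = "lost"
        else:
            calls[name] = "unaltered"
    return calls


# Hypothetical H3K4me3 signals and p-values for four promoters.
example = [
    ("P1", 30.0, 10.0, 0.001),
    ("P2", 5.0, 20.0, 0.002),
    ("P3", 10.0, 10.0, 0.9),
    ("P4", 16.0, 10.0, 0.5),
]
calls = call_somatic(example)
```

In this toy input, P4 exceeds the fold-change cutoff but is not called because its adjusted P value is too high, which is the reason both criteria are applied together.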

Somatic Promoters in Gastric Cancer Exhibit Deregulation in Diverse Cancer Types

To explore relationships between epigenomic promoter alterations and gene expression, we analyzed RNA-seq data from the same discovery cohort (∼106 million reads/sample), quantifying RNA-seq transcript reads mapping to the epigenome-guided promoter regions or directly downstream. Examining somatic promoter regions (Fig. 2A provides an illustrative example of a gained somatic promoter), we observed significantly increased expression at gained somatic promoters in gastric cancers and significantly decreased expression at lost somatic promoters, compared with either all promoters (P < 0.001; Fig. 2B), or unaltered promoters (P < 0.001; Supplementary Fig. S4; refs. 27, 51, 54). Among other types of epigenetic modifications, previous studies have also reported a reciprocal relationship between active regulatory regions and DNA methylation (54, 55). Using Infinium 450K DNA methylation arrays, we identified 7,505 CpG sites overlapping the somatic promoter regions (5,213 sites for gained somatic promoters; 2,292 sites for lost somatic promoters). Promoters gained in gastric cancer were significantly hypomethylated compared with all promoters (P < 0.001, Wilcoxon test), whereas promoters lost in gastric cancer were hypermethylated (P < 0.001, Wilcoxon test; Fig. 2B, bottom). As DNA methylation typically occurs in CpG-rich regions (56), we then repeated the analysis focusing only on CpG island–bearing promoters (Methods). Similar to the original results, CpG island–bearing promoters gained in gastric cancer were significantly hypomethylated compared with all CpG island–bearing promoters (P < 0.001, Wilcoxon test), whereas CpG island–bearing promoters lost in gastric cancer were hypermethylated (P < 0.001, Wilcoxon test; Supplementary Fig. S5).
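For readers unfamiliar with the test behind these comparisons, the following sketch implements a two-sided Wilcoxon rank-sum (Mann-Whitney U) test with a normal approximation. The text reports only "Wilcoxon test", so the unpaired rank-sum form is an assumption here, as are the function names; the variance uses no tie correction, and the β-values in the example are hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (U statistic for x, z score, two-sided p) by normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, z, p


# Hypothetical methylation beta-values: gained promoters vs. all promoters.
gained = [0.1, 0.2, 0.3, 0.4, 0.5]
background = [0.6, 0.7, 0.8, 0.9, 1.0]
u_stat, z_score, p_value = rank_sum_test(gained, background)
```

A negative z score for the first group indicates that its values rank lower than the comparison group, which corresponds to hypomethylation in this toy example.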

Figure 2.

Association of somatic promoters with gene expression in gastric cancer (GC) and other tumor types. A, Example of a gastric cancer somatic promoter (red, H3K4me3 signal in gastric cancer; blue, H3K4me3 signal in gastric normal). Example is for illustrative purposes only. B, Changes in RNA-seq expression (top) and DNA methylation (bottom) in discovery samples between somatic promoters and all promoters. Top, box plot depicting changes in RNA-seq expression between 9 paired primary gastric cancer and gastric normal samples at genomic regions exhibiting somatic promoters (gained and lost; ***, P < 0.001, Wilcoxon test). Bottom, box plot depicting changes in DNA methylation (β-values) at regions exhibiting somatic promoters between 20 paired gastric cancer and gastric normal samples, compared with all promoters (***, P < 0.001, Wilcoxon test). C, Independent validation cohorts. Box plot depicting changes in RNA-seq expression at genomic regions exhibiting somatic promoters across 354 (321 gastric cancer, 33 normal) TCGA stomach adenocarcinoma (STAD) samples, compared with all promoters (***, P < 0.001, Wilcoxon test). D, Somatic promoters in other cancer types. Box plot depicting changes in RNA-seq expression at genomic regions exhibiting gastric cancer somatic promoters compared against all promoters, across 326 TCGA colon adenocarcinoma (COAD) samples (286 COAD, 40 normal; ***, P < 0.001, Wilcoxon test), 170 TCGA kidney ccRCC samples (98 ccRCC and 72 normal; ***, P < 0.001, Wilcoxon test), and 115 TCGA LUAD samples (58 LUAD, 57 normal; ***, P < 0.001 somatic gain vs. all promoters and somatic gain vs. somatic loss, Wilcoxon test). T/N, tumor/normal.

To validate the somatic promoters in a larger independent gastric cancer cohort and also to examine their behavior in other cancer types, we proceeded to query RNA-seq data of 354 gastric cancer samples from The Cancer Genome Atlas (TCGA) consortium (n = 321 gastric cancer samples, n = 33 matched normals). To perform this analysis, RNA-seq reads from TCGA samples were mapped against the epigenome-guided somatic promoter regions defined by the discovery samples and normalized to calculate fold-change differences in expression between gastric cancer samples and normal samples (see Methods). Similar to the discovery series, we observed that TCGA gastric cancer samples also exhibited significantly increased expression at gained somatic promoters, whereas lost somatic promoters exhibited decreased expression, relative to either all promoters (P < 0.001; Fig. 2C) or unaltered promoters (P < 0.001; Supplementary Fig. S4). We further tested the tissue specificity of the gastric cancer somatic promoters by querying RNA-seq data from other tumor types, including colon cancer, kidney renal clear cell carcinoma (ccRCC), and lung adenocarcinoma (LUAD; Fig. 2D). Almost two-thirds (n = 1,231, 63%, FC ≥ 1.5) of gastric cancer somatic promoters were also differentially regulated in TCGA colon cancer samples, and similarly, a significant proportion of gastric cancer somatic promoters were also associated with differential RNA-seq expression in TCGA ccRCC (n = 939, 48%, FC ≥ 1.5) and LUAD samples (n = 1,059, 54%, FC ≥ 1.5; Fig. 2D). This result suggests that many gastric cancer somatic promoters are also likely associated with deregulated promoter activity in other solid epithelial malignancies.
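The cross-cohort comparison amounts to computing a tumor-versus-normal fold change at each promoter region and counting how many regions pass FC ≥ 1.5 in either direction. A minimal sketch is given below; the function name, the pseudocount, the symmetric treatment of decreases, and the expression values are all assumptions for illustration, and the normalization actually used is described in the paper's Methods.

```python
def fraction_deregulated(tumor_means, normal_means, fc_min=1.5, pseudocount=1.0):
    """Count promoter regions whose tumor/normal fold change passes fc_min.

    A pseudocount (an assumption of this sketch) avoids division by zero
    for regions with no reads in normal samples.
    """
    count = 0
    for tumor, normal in zip(tumor_means, normal_means):
        fold_change = (tumor + pseudocount) / (normal + pseudocount)
        if fold_change >= fc_min or fold_change <= 1 / fc_min:
            count += 1
    return count, count / len(tumor_means)


# Hypothetical mean normalized expression at four promoter regions.
n_dereg, frac_dereg = fraction_deregulated([9, 1, 4, 0], [1, 9, 4, 0])
```

Here two of the four regions change by at least 1.5-fold, so the deregulated fraction is 0.5.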

Role of Alternative Promoters

By comparing the somatic promoters against the reference GENCODE database (V19; ref. 57), we discovered extensive use of alternative promoters (18%) in gastric cancers, defined as situations where a common unaltered promoter is present in both normal tissues and tumors (canonical promoter) but a secondary tumor-specific promoter is engaged in the latter (alternative promoter). The remaining 82% of somatic promoters corresponded to single major isoforms or unannotated transcripts (see below). Fifty-seven percent of the alternative promoters occurred downstream of the canonical promoter. Using multiple RNA-seq analysis methods, we confirmed that transcript isoforms driven by alternative promoters are overexpressed in gastric cancers to a significantly greater degree than canonical promoters in the same gene (Methods; Supplementary Fig. S6). For example, HNF4A, a transcription factor overexpressed in gastric cancer (58–61), is driven by two promoters (P1 and P2). At the HNF4A canonical promoter (“P2”), we observed equal promoter signals in gastric cancer tissues and normal tissues; however, we also further observed gain of an additional promoter in gastric cancers at a TSS 45 kb downstream (“P1”). Similarly, HNF4A P1 promoter gains were also observed in gastric cancer cell lines (Fig. 3A), with RNA-seq analysis supporting expression of the HNF4A P1 isoform in gastric cancers. Alternative promoter usage was also observed at the EPCAM gene, frequently used to identify circulating tumor cells, causing expression of EPCAM transcript ENST00000263735.4 (Fig. 3B). Notably, both the HNF4A and EPCAM alternative isoforms exhibited significantly greater cancer overexpression compared with their canonical isoforms (Supplementary Fig. S6). Other genes associated with tumor-specific alternative promoters, many reported for the first time, include NKX6-3 (FC = 1.83, q < 0.05) and GRIN2D (FC = 1.9, q < 0.001). 
A complete list of gastric cancer tumor–specific promoters is provided in Supplementary Table S4.
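The definition of an alternative promoter used here (an unaltered canonical promoter plus a secondary tumor-specific promoter in the same gene) can be expressed as a short classification sketch. The record layout, gene names, and coordinates below are hypothetical, and real annotation matching against GENCODE would involve more steps than shown.

```python
def find_alternative_promoters(promoters):
    """Return (gene, tss, position) for gained promoters in genes whose
    canonical promoter is unaltered.

    Each promoter is a dict with keys: gene, tss, strand ('+' or '-'),
    status ('unaltered', 'gained', or 'lost'), and canonical (bool).
    """
    canonical = {p["gene"]: p for p in promoters if p["canonical"]}
    results = []
    for p in promoters:
        ref = canonical.get(p["gene"])
        if ref is None or p["canonical"]:
            continue
        if ref["status"] != "unaltered" or p["status"] != "gained":
            continue
        # Downstream means further along the direction of transcription,
        # so the coordinate offset is flipped on the minus strand.
        offset = p["tss"] - ref["tss"]
        if p["strand"] == "-":
            offset = -offset
        position = "downstream" if offset > 0 else "upstream"
        results.append((p["gene"], p["tss"], position))
    return results


# Hypothetical annotations for three genes.
example_promoters = [
    {"gene": "GENE_A", "tss": 1000, "strand": "+", "status": "unaltered", "canonical": True},
    {"gene": "GENE_A", "tss": 46000, "strand": "+", "status": "gained", "canonical": False},
    {"gene": "GENE_B", "tss": 50000, "strand": "-", "status": "unaltered", "canonical": True},
    {"gene": "GENE_B", "tss": 60000, "strand": "-", "status": "gained", "canonical": False},
    {"gene": "GENE_C", "tss": 2000, "strand": "+", "status": "gained", "canonical": True},
]
alternative = find_alternative_promoters(example_promoters)
```

GENE_C is excluded because its only somatic promoter is the canonical one, which corresponds to the "single major isoform" category rather than alternative promoter usage.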

Figure 3.

Alternative promoters in gastric cancer (GC). A, UCSC browser track of the HNF4A gene. Gastric cancer and matched gastric normal samples have equal H3K4me3 signals at the canonical HNF4A promoter. However, an alternative promoter, seen by H3K4me3 gain, can be observed at a downstream TSS in gastric cancers compared with matched normals. At the RNA level, both in-house and TCGA STAD samples also show gain of gene expression at the alternate promoter TSS compared with normal samples. B, UCSC browser track of the EPCAM gene. Another example of alternative promoter usage at a downstream TSS. Gain of H3K4me3 is observed at a TSS downstream of the canonical promoter, while the canonical promoter exhibits equal H3K4me3 signals in gastric cancer and gastric normal. Gain of RNA-seq expression can also be observed in gastric cancer at the alternative promoter–driven transcript in both in-house and TCGA STAD samples. C, UCSC browser track of the RASA3 gene, demonstrating H3K4me3 and RNA-seq signals highlighting gain of promoter activity at an unannotated TSS (dark gray box) corresponding to a novel N-terminal truncated RASA3 transcript. Expression of this variant transcript was validated through 5′ Rapid Amplification of cDNA Ends (RACE) in gastric cancer lines (bottom). D, Functional domains of the translated RASA3 canonical and alternate isoform. The alternate transcript is predicted to encode a RASA3 protein missing the RASGAP domain. E, Effect of overexpression of RASA3 canonical (CanT) and alternate (SomT) isoforms on the migration capability of SNU1967 (top) and GES1 (bottom) cells. Representative images of RASA3-Ctl (empty vector), RASA3-CanT, and RASA3-SomT in migration assays (n = 3). Bar plots show the percentage area of migrated cells versus the area of Transwell membrane. Data are shown as mean ± SD; n = 3 (**, P < 0.01; Student one sided t test).

To explore the influence of alternative promoters on protein diversity, we identified 714 tumor-specific promoter alterations predicted to change N-terminal protein composition and also supported by both H3K4me3 and RNA-seq data. The vast majority of these alterations (>95%) were in-frame to that of the canonical protein. Of these, 47% (n = 338) were predicted to cause gains of new N-terminal peptides in tumors (see Methods). To confirm protein-level expression of these N-terminal peptides in gastrointestinal cancer, we queried publicly available peptide spectral data of 90 TCGA colorectal cancer and 60 normal colon samples (62, 63). Colorectal cancer data were used for this analysis, as large-scale proteomic data of primary gastric cancers are not currently available, and because many gastric cancer somatic promoters are also observed in colorectal cancer (Fig. 2D). Across all proteins detected, proteins with gained promoters were overexpressed in cancer to a significantly greater extent than proteins without gained promoters (P < 0.001; 63% vs. 54%; Fisher test). Then, examining specific N-terminal peptides predicted to be gained in tumors, we confirmed protein expression of 33% (112/338) in the colorectal cancer data (Supplementary Table S5), of which 51.8% were overexpressed in colorectal cancer samples relative to normal colon samples (FDR, 10%). In a separate experiment, we further investigated whether these N-terminal peptides also exhibit tumor overexpression in proteomic data from 3 gastric cancer cell lines and 1 normal gastric epithelial line (GES1; see Methods). Similar to the colorectal cancer data, 48% of the N-terminal peptides were overexpressed in the gastric cancer lines relative to normal GES1 gastric cells, and again, a significantly greater proportion of proteins with gained promoters were overexpressed in cancer compared with proteins without gained promoters (P < 0.001; Fisher test). 
Taken collectively, these analyses suggest that alternative promoters may contribute significantly toward proteomic diversity in gastrointestinal cancer.
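The proportions quoted above can be checked directly, and the Fisher test used to compare overexpression rates can be sketched with exact integer arithmetic. The function below is a generic two-sided Fisher exact test, not the authors' code. Because the text gives percentages rather than the underlying counts for the Fisher comparisons, the 2×2 table in the example is a small hypothetical one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalized hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    extreme = sum(
        weight(k) for k in range(lowest, highest + 1) if weight(k) <= observed
    )
    return extreme / comb(row1 + row2, col1)


# Proportions reported in the text.
pct_gain = round(100 * 338 / 714)      # alterations predicted to gain N-terminal peptides
pct_detected = round(100 * 112 / 338)  # gained peptides detected in colorectal cancer data

# Hypothetical 2x2 table: rows = with/without gained promoter,
# columns = overexpressed / not overexpressed.
p_example = fisher_exact_two_sided(8, 2, 1, 5)
```

The two rounded percentages come out to 47 and 33, matching the values stated in the text.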

To examine possible functions of somatic promoters in cancer development, we focused on RASA3, which encodes a RAS GTPase-activating protein required for Gαi-induced inhibition of MAPK (64). In both gastric cancers (50%) and gastric cancer lines, we observed gain of promoter activity at an intronic region 127 kb downstream of the canonical RASA3 TSS (Fig. 3C, top; Supplementary Fig. S7). RNA-seq and 5′ Rapid Amplification of cDNA Ends (RACE) analysis confirmed expression of this shorter RASA3 isoform (Fig. 3C, bottom), and expression of this shorter RASA3 isoform was also observed in TCGA RNA-seq data (Fig. 3C). Using isoform-specific quantitative PCR, we confirmed that although both the canonical full-length RASA3 and shorter RASA3 isoform are overexpressed in gastric cancer tissues relative to matched normal tissues (P < 0.01, Student one sided t test), the shorter RASA3 isoform is overexpressed to a significantly greater extent (FC = 2.64, P = 0.01, Student one sided t test; Supplementary Fig. S7). Compared with the canonical full-length RASA3 protein (CanT), the shorter 31-kDa RASA3 somatic isoform (SomT) is predicted to lack the N-terminal RASGAP domain (Fig. 3D). Consistent with these predictions, transfection of RASA3 CanT into GES1 normal gastric epithelial cells induced lower levels of active GTP-bound RAS compared with either empty vector or RASA3 SomT transfected cells, indicating that RASA3 CanT has higher RASGAP activity (Supplementary Fig. S7).

To address functions of RASA3 SomT, we transfected the RASA3 CanT and SomT isoforms into SNU1967 gastric cancer cells. Compared with untransfected cells, transfection of RASA3 SomT into SNU1967 cells significantly stimulated migration (P < 0.01) and invasion (P < 0.01), whereas RASA3 CanT significantly suppressed invasion (P < 0.001; Fig. 3E, Supplementary Fig. S7). Similarly, transfection of RASA3 SomT into GES1 cells significantly stimulated migration (P < 0.01, Fig. 3E) and invasion (P < 0.01, Supplementary Fig. S7), whereas RASA3 CanT did not. When tested on KRAS-mutated AGS gastric cancer cells that are innately highly migratory, expression of RASA3 CanT potently suppressed migration, whereas RASA3 SomT exhibited significantly less attenuation (P = 0.03, Supplementary Fig. S7). These results suggest that tumor-specific use of RASA3 SomT is likely to increase gastric cancer cell migration and invasion. Notably, RASA3 CanT and SomT transfections did not alter SNU1967, GES1, or AGS cellular proliferation rates (Supplementary Fig. S7). To confirm that these observations are not due to nonphysiologic in vitro expression levels, we then examined NCC24 gastric cancer cells, which normally express high endogenous levels of RASA3 SomT and minimal RASA3 CanT (Supplementary Fig. S7). Silencing of endogenous RASA3 SomT using two independent siRNA constructs significantly inhibited NCC24 migration and invasion (P < 0.01–0.001; Supplementary Fig. S7), consistent with RASA3 SomT playing a role in promoting cancer migration and invasion.

In an earlier study (38), we reported a transcript isoform of the MET receptor tyrosine kinase driven by an internal alternative promoter, which has been independently confirmed in other cancer types (65). However, functional implications of this MET variant (Var) remain unclear. RNA-seq and 5′ RACE analysis confirmed transcript expression of this shorter isoform, predicted to harbor a truncated SEMA domain (Supplementary Fig. S8). To assess functional differences between wild-type (WT) and Var MET, we performed transient transfections of MET-WT and MET-Var into HEK293 cells. In both untreated and HGF-treated conditions, MET-Var transfected cells exhibited significantly higher levels of pGAB1 (Y627), a key mediator of MET signaling [e.g., 2.48- to 3.95-fold comparing MET-Var vs. MET-WT, P = 0.003 (untreated), P < 0.05 (T15 and T30); ref. 66]. In addition, in HGF-untreated samples, cells transfected with MET-Var also exhibited higher pERK1/2 levels (2.74-fold) and higher pSTAT3 (Y705; refs. 67–70) levels (1.80-fold) compared with MET-WT [P = 0.023 and P = 0.026 for pERK and pSTAT3 (Y705), respectively]. These results suggest that expression of the MET-Var isoform may promote MET downstream signaling kinetics in a manner important for gastric cancer tumorigenesis.

Somatic Promoters Correlate with Tumor Immunity

Cancer immunoediting is a process where developing tumors sculpt their immunogenic and antigenic profile to evade host immune surveillance (71, 72). Mechanisms of cancer immunoediting are diverse, including upregulation of immune checkpoint inhibitors, such as PD-L1 (72). To explore potential contributions of somatic promoters to tumor immunity, we identified somatic promoter–associated N-terminal peptides with high predicted affinity binding to various MHC class I HLA alleles (Supplementary Tables S6 and S7), which are required for antigen presentation to CD8+ cytotoxic T cells (IC50 ≤ 50 nmol/L, Fig. 4A; ref. 73). Analysis of recurrent somatic promoter–associated peptides using the NetMHCpan-2.8 (74) algorithm against patient-specific MHC class I alleles revealed a significant enrichment in high-affinity MHC I binding compared with multiple control peptide populations, including canonical gastric cancer peptides (average 36% vs. 24%; P < 0.01), randomly selected peptides (P < 0.001), and C-terminal peptides (P < 0.01; Fig. 4B shows HLA-A, B, and C combined; Supplementary Fig. S9A depicts data for HLA-A only). The majority of these high-affinity somatic promoter–associated peptides corresponded to situations where the somatic transcript lacking the N-terminal peptide is overexpressed in tumors relative to normal tissues (78% lost; 76/97 high-affinity peptides, Fig. 4C), and the proportion of high-affinity MHC-binding peptides among lost peptides was significantly greater than among gained peptides (37% vs. 21%, P < 0.05, Fisher test). Notably, because transcripts driven by the N-terminal lacking somatic TSSs are also overexpressed in tumors to a significantly greater degree than transcripts driven by the canonical TSS (P < 0.05, Wilcoxon one-sided test; Supplementary Fig. S6), such a scenario would be expected to result in relative depletion of N-terminal immunogenic peptides in tumors. 
Interestingly, an analogous N-terminal analysis using RNA-seq data alone (in the absence of epigenomic data) revealed that epigenome-guided N-terminal peptides exhibited significantly higher predicted immunogenicity scores compared with RNA-seq–only identified peptides (36.10% vs. 27% for MHC presentation, P = 0.02, Fisher test), suggesting that epigenome-guided promoter identification can provide complementary value to RNA-seq–only guided analyses (Supplementary Fig. S9).
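The high-affinity criterion used above (predicted IC50 ≤ 50 nmol/L) amounts to a threshold applied to per-peptide, per-allele predictions. The following is a minimal sketch of that bookkeeping, not the pipeline used in this study: the peptide identifiers, alleles, and IC50 values are hypothetical, the input is not in NetMHCpan output format, and counting a peptide as high affinity when at least one allele passes the threshold is an assumption made for illustration.

```python
# Hypothetical predictions: (peptide_id, hla_allele) -> predicted IC50 in nmol/L.
HIGH_AFFINITY_IC50_NM = 50.0  # threshold stated in the text

predictions = {
    ("pepA", "HLA-A*02:01"): 12.0,
    ("pepA", "HLA-B*07:02"): 900.0,
    ("pepB", "HLA-A*02:01"): 340.0,
    ("pepB", "HLA-B*07:02"): 75.0,
    ("pepC", "HLA-A*02:01"): 48.5,
    ("pepC", "HLA-B*07:02"): 5000.0,
}


def high_affinity_peptides(preds, threshold=HIGH_AFFINITY_IC50_NM):
    """Peptides with at least one allele at or below the IC50 threshold (assumed rule)."""
    return {pep for (pep, _allele), ic50 in preds.items() if ic50 <= threshold}


def fraction_high_affinity(preds, threshold=HIGH_AFFINITY_IC50_NM):
    """Share of distinct peptides classified as high affinity."""
    peptides = {pep for pep, _allele in preds}
    return len(high_affinity_peptides(preds, threshold)) / len(peptides)


print(sorted(high_affinity_peptides(predictions)))
print(fraction_high_affinity(predictions))
```

Computing this fraction separately for each peptide population (for example, lost versus gained N-terminal peptides) gives the proportions that a Fisher test would then compare.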

Figure 4.

Somatic promoters correlate with immunoediting signatures. A, Schematic outlining alternative promoter usage [H3K4me3 box, overlapping gastric cancer (GC) in red and normal gastric tissue in blue] leading to alternative transcript usage (transcript box) and N-terminally truncated protein isoforms (protein box). B, Bar plot showing the average percentage of peptides with predicted high-affinity binding to MHC class I (HLA-A, B, and C, IC50 ≤ 50 nmol/L). N-terminal peptides associated with recurrent somatic promoters (alternative promoters) show significantly enriched predicted MHC I binding compared with canonical gastric cancer peptides (P < 0.01, Fisher test), random peptides from the human proteome (P < 0.001), and C-terminal peptides (P < 0.01) derived from the same genes exhibiting the N-terminal alterations. Canonical peptides refer to peptides derived from protein-coding genes overexpressed in gastric cancer through nonalternative promoters. C, Percentage (%) of high-affinity peptides predicted to bind different patient-specific HLA alleles categorized by somatic gain or loss. Most alleles have a greater number of N-terminal lost peptides predicted to have high binding affinity. The percentage of patients bearing specific HLA alleles is denoted inside the brackets. D, Quantification of somatic promoter expression using NanoString profiling. Top, distinct NanoString probes were designed to measure the expression of alternate and canonical promoter–driven transcripts. Two probes were designed for each gene, a canonical probe at the 5′ transcript marked by unaltered H3K4me3, and an alternate probe at the 5′ transcript of the somatic promoter. Bottom, heat map of alternative promoter expression from 95 gastric cancer and matched normal samples. Gastric cancer samples have been ordered left to right by their levels of somatic promoter usage. E, Association between somatic promoters and T-cell immune correlates. NS, not significant. 
Samples with high somatic promoter usage are in red, whereas those with low usage are in blue. Top left, expression of T-cell markers CD8A (P = 0.1443) and the T-cell cytolytic markers GZMA (P = 0.0001) and PRF1 (P = 0.00806) in gastric cancer samples with either high or low somatic promoter usage (SG cohort). Samples with high alternative promoter usage show lower expression of immune markers. All P values are from Wilcoxon one-sided test. Top right, Kaplan–Meier analysis comparing overall survival curves between validation samples with high somatic promoter usage (top 25%) and low somatic promoter usage (bottom 25%; HR = 2.56, P = 0.02). Bottom left, expression of T-cell markers CD8A (P = 0.02), GZMA (P = 0.01), and PRF1 (P = 0.03) in TCGA STAD with either high or low somatic promoter usage. T-cell markers were evaluated by RNA-seq [transcripts per million (TPM)]. Bottom right, expression of T-cell markers CD8A (P = 0.035), GZMA (P = 0.001), and PRF1 (P = 0.025) in Asian Cancer Research Group (ACRG) gastric cancer samples with either high or low somatic promoter usage. All P values are from Wilcoxon one-sided test. F, EPIMAX heat map of total cytokine responses (fold change relative to actin) for 15 peptide pools against 9 donors. G, Individual cytokine responses against 15 peptides for two individual donors (donor 2 and donor 3) showing complex cytokine responses (FC ≥ 2). *, P < 0.05; **, P < 0.01; ***, P < 0.001.

To explore whether somatic promoters might contribute to reducing tumor antigen burden and immunoreactivity in vivo, we proceeded to examine correlations between promoter alterations and intratumor T-cell activity in various primary gastric cancer cohorts. First, to detect promoter alterations in a cohort of 95 gastric cancer–normal pairs [Singapore (SG) cohort], we generated a customized NanoString panel targeting the top 95 recurrent gastric cancer somatic promoters, measuring transcripts associated with either the canonical promoter or the alternative promoter. There was a significant correlation between the NanoString data and RNA-seq (Supplementary Fig. S10, r = 0.65, P < 0.001), with approximately 35% of transcripts driven by alternate promoters upregulated in more than half of the gastric cancers (Fig. 4D). Second, to examine markers of T-cell activity in these same gastric cancer samples, we analyzed previously published microarray data (75) to measure CD8A (a measure of CD8+ tumor-infiltrating lymphocytes), and granzyme A (GZMA) and perforin (PRF1; refs. 76–78), which are both T-cell effectors and validated markers of T-cell cytolytic activity. We confirmed that these three genes (CD8A, GZMA, and PRF1) were not themselves associated with somatic promoters. Comparing the top and bottom quartiles, gastric cancers with high somatic promoter usage exhibited significantly lower GZMA and PRF1 levels (P < 0.001 and P = 0.01, Wilcoxon test) indicating lower T-cell cytolytic activity (Fig. 4E, top left), and also a trend toward lower CD8A levels (P = 0.14, Wilcoxon one-sided test). These findings support a lower level of tumor antigenicity in gastric cancers with high somatic promoter usage, as recurrent N-terminal peptides lost through somatic promoters are predicted to be collectively more immunogenic than peptides gained through somatic promoters (Fig. 4C). Using two different algorithms (ASCAT, ref. 79, and ESTIMATE, ref. 80), we further confirmed that the decreased GZMA and PRF1 levels are independent of tumor purity differences between gastric cancers (Supplementary Fig. S10). Similar results were obtained upon splitting the gastric cancer samples based on median promoter usage score (GZMA, P < 0.001; and PRF1, P = 0.03). Patients with gastric cancers exhibiting high somatic promoter usage (top 25%) also showed poor survival compared with patients with gastric cancers with low somatic promoter usage (bottom 25%; Fig. 4E, top right, HR = 2.56, P = 0.02). Again, dividing patients by their median somatic promoter usage score also showed similar survival differences (Supplementary Fig. S10, HR = 1.81, P = 0.04).
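The quartile comparison described above can be sketched as follows. This is an illustrative outline under stated assumptions rather than the analysis code of the study: sample identifiers, usage scores, and expression values are hypothetical, and the one-sided rank-sum P value uses the plain normal approximation (no tie or continuity correction), which is only a rough guide for small groups.

```python
from statistics import NormalDist


def quartile_groups(usage_scores):
    """Return (bottom 25%, top 25%) sample ids ranked by somatic promoter usage."""
    ranked = sorted(usage_scores, key=usage_scores.get)
    k = len(ranked) // 4
    if k == 0:
        raise ValueError("need at least 4 samples to form quartile groups")
    return ranked[:k], ranked[-k:]


def mann_whitney_less(x, y):
    """One-sided rank-sum test of H1: values in x tend to be lower than in y.

    Returns (U statistic of x, P value from the normal approximation).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share their mean rank
        for t in range(i, j + 1):
            ranks[t] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_v, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    return u_x, NormalDist().cdf((u_x - mean_u) / sd_u)


# Hypothetical cohort: usage score and GZMA expression per sample.
usage = {f"s{i}": i / 10 for i in range(1, 9)}
gzma = {"s1": 6.0, "s2": 5.0, "s3": 4.5, "s4": 4.0,
        "s5": 3.5, "s6": 3.0, "s7": 2.0, "s8": 1.0}

low, high = quartile_groups(usage)
u, p = mann_whitney_less([gzma[s] for s in high], [gzma[s] for s in low])
print(low, high, u, p)
```

Splitting at the median instead of the quartiles only changes the grouping step; the test itself is unchanged.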

To validate these findings, we then analyzed two other prominent gastric cancer cohorts: one from TCGA and another from the Asian Cancer Research Group (ACRG). In the TCGA cohort, availability of RNA-seq data allowed us to infer somatic promoter usage directly from next-generation sequencing data (Fig. 2C). Similar to the Singapore cohort, TCGA gastric cancers with high somatic promoter usage (top 25%) exhibited decreased CD8A (P = 0.002, Wilcoxon one-sided test), GZMA (P = 0.001, Wilcoxon one-sided test), and PRF1 levels (P = 0.005, Wilcoxon one-sided test, Fig. 4E, bottom left) compared with gastric cancers with low somatic promoter usage (bottom 25%) in a manner independent of tumor purity (Supplementary Fig. S10). Notably, as previous studies have suggested that somatic mutation burden may also correlate with intratumor T-cell cytolytic response (81), we further repeated the analysis after adjusting for the total number of missense mutations in each sample using a regression-based approach. Even after correcting for somatic mutation burden, we still observed decreased CD8A (P = 0.02, Wilcoxon one-sided test), GZMA (P = 0.01, Wilcoxon one-sided test), and PRF1 expression (P = 0.03, Wilcoxon one-sided test) in samples with high somatic promoter usage (top 25% against bottom 25%; Supplementary Fig. S10).
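The text does not specify the form of the regression-based adjustment. One simple reading, shown here purely as an assumption, is to regress each marker's expression on the per-sample missense mutation count by ordinary least squares and then compare the residuals between the high-usage and low-usage groups. All values below are hypothetical.

```python
def residualize(values, covariate):
    """Residuals of `values` after ordinary least-squares regression on `covariate`."""
    n = len(values)
    mean_x = sum(covariate) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(covariate, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]


# Hypothetical per-sample data: missense mutation counts and GZMA expression.
missense_counts = [40, 55, 70, 120, 300, 90]
gzma_expression = [5.1, 4.8, 5.6, 6.0, 8.2, 5.0]

adjusted = residualize(gzma_expression, missense_counts)
print([round(r, 3) for r in adjusted])
```

The adjusted values are centered on zero by construction, so the subsequent group comparison reflects only the variation in expression that mutation burden does not linearly explain.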

We leveraged a third independent cohort of gastric cancer samples from ACRG (11). Using NanoString to target 89 canonical and alternative promoters along with various immune markers, we profiled 264 primary gastric cancer samples from the ACRG cohort (11). Forty percent of alternative promoter transcripts showed tumor-specific expression in more than half of the samples (Supplementary Fig. S10). Once again, samples with high somatic promoter usage (top 25%) showed significantly lower expression of T-cell cytolytic activity markers, including CD8A (P = 0.035, Wilcoxon one-sided test), CD4A (P = 0.005, Wilcoxon one-sided test), GZMA (P = 0.001, Wilcoxon one-sided test), and PRF1 (P = 0.025, Wilcoxon one-sided test; Fig. 4E, bottom right; Supplementary Fig. S10). Similar results were obtained upon splitting the gastric cancer samples based on median promoter usage score (Supplementary Table S8). Also, after adjusting for mutational burden (for cases where information is available), samples with high somatic promoter usage still showed decreased CD8A (P = 0.167, Wilcoxon one-sided test), GZMA (P = 0.009, Wilcoxon one-sided test), and PRF1 (P = 0.03, Wilcoxon one-sided test) expression (Supplementary Fig. S10). Taken collectively, these results, observed across multiple gastric cancer cohorts and assessed using diverse technologies (microarray, RNA-seq, NanoString), all support a significant association between somatic promoter usage and reduced tumor immunity levels. Importantly, the decreased levels of T-cell cytolytic activity associated with somatic promoter usage are likely independent of tumor purity and mutational load.

Somatic Promoter–Associated Peptides Are Immunogenic In Vitro

To functionally test the ability of N-terminal peptides depleted in gastric cancer to elicit immune responses, we conducted in vitro assays using the high-throughput Epitope Maximum (EPIMAX) platform (82, 83), which allows multiepitope testing for both T-cell proliferation and cytokine production. First, we identified N-terminal peptides predicted to exhibit high HLA-binding affinities across a pool of healthy peripheral blood mononuclear cell (PBMC) donors. Second, selecting 15 alternative promoter–associated peptides for testing, we generated peptide pools for each peptide (Supplementary Tables S9 and S10; Methods), which were then used to stimulate PBMCs from 9 healthy donors. T-cell proliferation and cytokine production levels were measured and benchmarked against control peptides (Supplementary Table S11). Across all 135 exposures (15 peptides across 9 donors), we observed strong cytokine responses for 79 peptide pools (58%; FC ≥ 2 relative to actin peptides; Fig. 4F and G) inducing complex Th1, Th2, and Th17 polarizations in a donor-dependent fashion (Supplementary Fig. S11).

To test the immunogenic capacity of specific N-terminal peptides in a more cellular setting, we then assessed responses of T cells previously primed to recognize either altered or WT peptides, when cocultured with HLA-matched isogenic gastric cancer cells expressing either altered or WT peptides, respectively (Supplementary Fig. S11). Similar approaches have been used by previous investigators to investigate protein and peptide immunogenicity (84–86). By MHC-I affinity screening, a VMCDIFFSL nonamer in the WT RASA3 N-terminus was predicted to exhibit high MHC-I affinity binding for both the HLA-A*02:01 (IC50 = 6.93 nmol/L) and HLA-A*02:06 (IC50 = 9.74 nmol/L) alleles. Using HLA-A*02:06 T cells that are cross-reactive to HLA-A*02:01–positive AGS cells (87, 88), we tested the release of IFNγ from primed T cells after exposure to AGS lysates expressing either RASA3 CanT or SomT isoforms. ELISA assays demonstrated that T cells primed to recognize RASA3 CanT released significantly more IFNγ when cocultured with RASA3 CanT–expressing AGS cells than when cocultured with RASA3 SomT–expressing AGS cells. In contrast, T cells primed with RASA3 SomT did not exhibit appreciable IFNγ release when cocultured with RASA3 SomT–expressing AGS cells (Supplementary Fig. S11). Thus, under similar in vitro conditions, RASA3 CanT is capable of eliciting a stronger immune response than RASA3 SomT, consistent with the RASA3 CanT N-terminus being more immunogenic. Taken collectively, these in vitro results demonstrate that peptides predicted to exhibit relative depletion in gastric cancers through somatic promoters can produce immunogenic responses, with the magnitude of immune responses depending on both peptide sequence and host immune background.

Somatic Promoters Are Associated with EZH2 Occupancy

To identify potential oncogenic mechanisms driving the somatic promoters, we intersected the genomic locations of the somatic promoters with transcription factor binding sites of 237 transcription factors from 83 different tissues (89). Regions exhibiting somatic promoters were significantly enriched in regions associated with EZH2 (P < 0.01) and SUZ12 (P < 0.01) binding (Fig. 5A; Supplementary Table S12), confirming earlier findings on a smaller cohort (38). Both EZH2 and SUZ12 are components of Polycomb repressive complex 2 (PRC2), an epigenetic regulator complex that is upregulated in many cancer types, including gastric cancer (90–95). To validate these findings, we then performed EZH2 ChIP-seq on HFE-145 normal gastric epithelial cells (Methods). Concordant with the previous findings, we observed significant enrichment of EZH2 binding sites at somatic promoters compared with all promoters (enrichment score 27 vs. 13 for all promoters, P < 0.01), and this EZH2 enrichment remained significant when the gained somatic (enrichment score 28, P < 0.01) and lost somatic promoters (enrichment score 24, P < 0.01) were analyzed separately (Supplementary Fig. S12).

Figure 5.

Somatic promoters are associated with EZH2 occupancy. A, Binding enrichment of ReMap-defined transcription factor–binding sites at genomic regions exhibiting somatic promoters. Transcription factors were sorted according to their binding frequency at all H3K4me3-defined promoter regions. EZH2 and SUZ12 binding sites significantly overlap regions exhibiting somatic promoters (gained and lost; P < 0.01, empirical distribution test). B, Proportion of RNA transcripts associated with somatic promoters changing upon GSK126 treatment in IM95 cells, compared with RNA transcripts associated with unaltered promoters. The top somatic promoter figure is for illustrative purposes only. Unaltered promoters were defined as all gene promoters except the somatic promoters. The proportion of genes changing upon treatment, as a proportion of all genes, is also shown. Somatic promoters are more likely to change expression after GSK126 treatment relative to unaltered promoters (OR = 1.46, P < 0.001) or all GSK126-regulated genes (OR = 9.21, P < 0.001, Fisher test). ***, P < 0.001. C, UCSC browser track of the SLC9A9 TSS, a gene with loss of promoter activity [overlapping gastric cancer (GC; red) and normal gastric tissue (blue) H3K4me3]. Gain of expression is seen after inhibition of EZH2 using GSK126 in IM95 cells at both day 6 (D6) and day 9 (D9) treatment. D, UCSC browser track of the PSCA TSS, with loss of promoter activity [GC (red) and normal gastric tissue (blue) H3K4me3]. Gain of expression is seen after inhibition of EZH2 using GSK126 in IM95 cells at both day 6 (D6) and day 9 (D9) treatment.

To experimentally test whether inhibiting EZH2/PRC2 activity might modulate somatic promoter usage in gastric cancer, we treated IM95 gastric cancer cells with GSK126, a highly selective small-molecule inhibitor of EZH2 methyltransferase activity (90, 96). This line was selected because it has previously been shown to be sensitive to EZH2 depletion (Supplementary Fig. S12; ref. 97). RNA-seq analysis of GSK126-treated IM95 cells at two treatment time points (days 6 and 9) confirmed that genes upregulated upon EZH2 inhibition are enriched in previously identified PRC2 target gene sets (Supplementary Fig. S12). GSK126 treatment caused deregulation of 2,134 promoters in total. Of 1,959 promoters exhibiting somatic alterations in primary gastric cancers (Fig. 1D), GSK126 treatment caused deregulation of 251 somatic promoters in IM95 cells (12.8%). This proportion was significantly greater than the proportion of unaltered promoters exhibiting deregulation after GSK126 challenge (8.8%, OR = 1.46, P < 0.001, Fisher test, Fig. 5B), suggesting heightened sensitivity of somatic promoters to EZH2 inhibition. The proportion of somatic promoters deregulated after EZH2 inhibition was also greater than the total proportion of genes (as defined by GENCODE) regulated by GSK126 (1.5%, OR = 9.21, P < 0.001, Fig. 5B). Of those promoters exhibiting both GSK126 deregulation and also mapping to somatic promoters lost in primary gastric cancer, 89.6% were reactivated following GSK126 administration (78/87, FC ≥ 2, q < 0.1; Methods), consistent with EZH2 functioning to repress these promoters. For example, Fig. 5C and D highlight two lost somatic promoters (SLC9A9 and PSCA) that exhibit expression gain after GSK126 treatment. These results thus suggest a role for EZH2 in regulating somatic promoters in gastric cancer.
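The comparisons above rest on 2 × 2 contingency tables (deregulated versus not deregulated, somatic versus comparison promoters). As a sketch of how an odds ratio and a one-sided Fisher exact P value are obtained from such a table, the helper functions below are illustrative and use a small hypothetical table; only the 251 of 1,959 count is taken from the text, and the comparison-group counts behind the reported odds ratios are not reproduced here.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the top-left cell under the hypergeometric null (fixed margins)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favorable = sum(comb(row1, k) * comb(n - row1, col1 - k)
                    for k in range(a, upper + 1))
    return favorable / comb(n, col1)


# Fraction of somatic promoters deregulated by GSK126 (counts from the text).
somatic_fraction = 251 / 1959
print(round(100 * somatic_fraction, 1))

# Hypothetical small table: rows = group, columns = (deregulated, unchanged).
print(odds_ratio(3, 1, 1, 3), fisher_one_sided(3, 1, 1, 3))
```

Because the function works with exact integer binomial coefficients, it also handles tables with thousands of promoters without loss of precision, at the cost of some speed.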

Somatic Promoters Reveal Novel Cancer-Associated Transcripts

Finally, when analyzing the altered somatic promoters with respect to proximity to known genes, we found that somatic promoters could be classified into annotated and unannotated categories. Annotated promoters were defined as promoters mapping close (<500 bp) to a known GENCODE TSS, whereas unannotated promoters were those mapping to genomic regions devoid of known GENCODE TSSs. The majority of promoters present in nonmalignant tissues, and also promoters unchanged between tumors and normal tissues, mapped closely to previously annotated TSSs (72%–92%). In contrast, only 41% of somatic promoters mapped to annotated promoter locations, whereas the remaining 59% mapped to “unannotated” locations, distant from GENCODE TSSs and in many cases 2 to 10 kb away (Fig. 6A).
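The annotated versus unannotated distinction can be expressed as a nearest-TSS distance check. The sketch below is a simplification under stated assumptions: promoters and TSSs are reduced to single coordinates on one chromosome, strand and promoter width are ignored, any promoter at least 500 bp from every TSS is treated as unannotated, and the coordinates are hypothetical rather than GENCODE positions.

```python
from bisect import bisect_left

ANNOTATED_MAX_DISTANCE_BP = 500  # "<500 bp" threshold from the text


def distance_to_nearest_tss(position, sorted_tss):
    """Distance in bp from `position` to the closest entry of a sorted TSS list."""
    i = bisect_left(sorted_tss, position)
    # Only the neighbors on either side of the insertion point can be nearest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(position - tss) for tss in candidates)


def classify_promoter(position, sorted_tss):
    """Label a promoter 'annotated' if it lies within 500 bp of a known TSS."""
    distance = distance_to_nearest_tss(position, sorted_tss)
    return "annotated" if distance < ANNOTATED_MAX_DISTANCE_BP else "unannotated"


# Hypothetical TSS coordinates on a single chromosome (must be sorted).
known_tss = [1_000, 5_000, 20_000]

for promoter in (1_200, 8_000, 25_000):
    print(promoter, distance_to_nearest_tss(promoter, known_tss),
          classify_promoter(promoter, known_tss))
```

Binning the returned distances (for example, 2 to 10 kb) would give the kind of distribution summarized for each promoter category.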

Figure 6.

Somatic promoters reveal novel cancer-associated transcripts. A, Distribution of distances for different promoter categories to the nearest annotated TSSs. Left, the first bar plot shows distance distributions for promoters present in gastric normal tissues, the second for promoters present in gastric cancer (GC) samples, and the third for promoters exhibiting somatic alterations (i.e., different in tumor vs. normal). Right, the bar plots present distance distributions associated with either lost or gained somatic promoters. A substantial proportion of gained somatic promoters occupy locations distant from previously annotated TSSs (red, green, purple, blue, orange). B, Median functional scores of unannotated promoters as predicted by GenoSkyline across 7 different tissues. Unannotated promoters exhibited high functional scores for gastrointestinal, fetal, and embryonic stem cell (ESC) tissues. C, Box plot depicting average RNA-seq reads for CAGE-validated promoters, comparing either all promoters or somatic promoters and also supported by CAGE data (***, P < 0.001, Wilcoxon one-sided test). Somatic promoters are observed to have lower levels of RNA-seq expression. D, Cartoon depicting proposed effects of dynamic range on Nano-ChIP-seq and RNA-seq sensitivity in detecting lowly expressed transcripts. NGS, next-generation sequencing. Because of a more restricted dynamic range, epigenomic profiling may detect active promoters missed by RNA-seq, due to the random sampling of abundantly expressed genes by RNA-seq. E, Down-sampling and up-sampling analysis. The y-axis depicts the number of transcripts detected that overlap either all promoters (blue line) or somatic promoters (red line) at varying RNA-seq depths. Original primary sample RNA-seq data were sequenced at approximately 106 M reads, which were down-sampled to 20, 40, and 60 M reads. Deep RNA-seq data were additionally generated at approximately 139 M read depth. 
F, Cancer-associated transcripts detected at deep but not regular RNA-seq depth. The UCSC genome browser track for ABCA13 shows an example of a novel transcript detected by Nano-ChIP-seq at a read depth of 20 M but detected by RNA-seq only at a read depth of approximately 139 M (deep sequencing GC). This transcript is not detected by regular-depth RNA-seq (GC).



To test the functional relevance of these unannotated promoters, we used GenoCanyon, a nucleotide-level quantification of genomic functional potential that integrates multiple levels of conservation and epigenomic information (98). We observed that 81% of the unannotated promoter regions exhibited a maximum genome-wide functional score of greater than 0.9 (range 0–1), indicating high functional potential. To ascertain tissue-type specificities, we then applied tissue-specific annotations using GenoSkyline (99), an extension of the GenoCanyon framework integrating Roadmap Epigenomics data (54). We observed that gastrointestinal tissues had the third highest median score after embryonic stem cell (ESC) and fetal tissues, consistent with our tumors being gastric in lineage and also dedifferentiated (Fig. 6B). In a separate analysis, recent studies have also suggested that endogenous repeat elements in the human genome may contribute significantly to regulatory element variation (100), and hypomethylation of repeat elements can induce cancer-associated transcription (101). We found that unannotated promoters were also significantly enriched for the repeat elements ERV1 (P < 0.0001 unannotated vs. all) and L1 (P < 0.0001 unannotated vs. all; Supplementary Fig. S13).

Compared with annotated promoters, unannotated promoters exhibited weaker H3K27ac signals, suggesting that unannotated promoters might have lower activity and decreased gene expression levels (Supplementary Fig. S13). Supporting this, somatic promoters, even those supported by CAGE tags (indicating true promoters), exhibited significantly lower RNA-seq expression levels compared with all CAGE tag–supported promoters (Fig. 6C). We thus hypothesized that unannotated promoters might be associated with low transcript levels, thereby rendering them more challenging to detect by conventional-depth transcriptome sequencing given the very wide dynamic range of cellular transcriptomes (10–10,000 transcripts per cell for different genes; Fig. 6D; ref. 102). To test this possibility, we employed both down-sampling and up-sampling analyses. Not surprisingly, decreasing RNA-seq depth caused a concomitant decrease in detected somatic promoter transcripts. For example, down-sampling to approximately 40 M reads rendered approximately 250 transcripts undetectable at somatic promoters (FPKM > 0; Fig. 6E). More convincingly, in the reciprocal experiment, we generated deep RNA-seq data for 5 matched gastric cancer/normal pairs (average read depth 140 M compared with standard 100 M) and confirmed the additional detection of 435 new somatic promoter–associated transcripts (FPKM > 0; Fig. 6E). We estimate that deep RNA-seq data allowed us to discover additional transcripts, not previously detectable at regular-depth RNA-seq, for 22% of the unannotated promoters (Fig. 6F). These results demonstrate that despite being associated with bona fide cancer-associated transcripts, many somatic promoters defined by epigenomic profiling may have been missed by conventional-depth RNA-seq.
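The effect of sequencing depth on detection can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the analysis pipeline used in the study: the study counted transcripts with FPKM > 0 at each depth, whereas the hypothetical `detected_transcripts` function below substitutes a simpler "at least one sampled read" criterion on a toy list of read-to-transcript assignments.

```python
import random


def detected_transcripts(read_assignments, depth, seed=0):
    """Count transcripts that keep >= 1 read after subsampling to `depth` reads.

    `read_assignments` is a hypothetical list with one transcript ID per read.
    """
    if depth > len(read_assignments):
        raise ValueError("depth exceeds the number of available reads")
    rng = random.Random(seed)                     # fixed seed for reproducibility
    sample = rng.sample(read_assignments, depth)  # sampling without replacement
    return len(set(sample))


# Toy library: one abundant transcript and 50 rare transcripts (one read each).
reads = ["abundant"] * 950 + [f"rare_{i}" for i in range(50)]

for depth in (200, 400, 600, 1000):
    print(depth, detected_transcripts(reads, depth))
```

Because most sampled reads come from the abundant transcript, rare transcripts tend to drop out as depth decreases, which mirrors the reasoning given for Fig. 6D and 6E.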

Identifying somatically altered cis-regulatory elements and understanding how these elements direct cancer-associated gene expression (103) represents a critical scientific goal (104, 105). Here, we defined close to 2,000 promoters exhibiting altered activity in gastric cancer, indicating that somatic promoters in gastric cancer are pervasive. Promoters are traditionally defined as proximal cis-regulatory elements that recruit general transcription factors to initiate transcription (106, 107). However, selection and activation of TSSs by RNA polymerase at core promoters is dependent on multiple factors. Core promoters are differentially distributed between genes of different functions (15, 106), and chromatin distributions and epigenetic landscapes of core promoter regions can also differ in a tissue-specific manner (15, 108–110). Presence of multiple transcription initiation sites within the same gene can generate distinct transcript isoforms with different 5′ UTRs that can act to regulate gene expression (111–113), and usage of alternative 5′ UTRs can also affect both translation and protein stability of cancer-associated genes such as BRCA1, TGFβ, and ERG (18, 114–117). Such findings demonstrate that specific promoter element activity is complex and cell-context dependent, with impact on downstream transcriptional, translational, and functional processes.

A significant proportion (∼18%) of somatic promoters corresponded to alternative promoters. In cancer, alternative promoter utilization is of major relevance, as increasing numbers of genes (e.g., LEF1, TP53, TGFB3) are now being shown to exhibit distinct alternative promoter–associated isoforms that differentially affect malignant growth (21, 118). In the current study, we identified alternative promoters in genes both known and novel to gastric cancer biology with significant clinical and translational implications. For example, we discovered an alternative promoter at the EPCAM gene locus specifically activated in gastric tumors. In gastric cancer, EPCAM encodes a transmembrane glycoprotein that has been proposed as a marker for circulating tumor cells (119), and EPCAM expression levels have been correlated with prognosis of patients with gastric cancer (120). However, little is known about the specific cellular mechanisms driving high EPCAM expression in gastric cancer. Our finding that EPCAM is regulated in gastric cancer not through its canonical promoter, but instead through a cancer-specific alternative promoter may lend credence to recent reports suggesting that in addition to acting as an experimentally convenient surface marker, EPCAM may actually play a more direct pro-oncogenic role in stimulating cellular proliferation (121).

Another novel example of an alternative promoter–associated gene, identified for the first time in our study, is RASA3. Although a functional role for RASA3 in cancer remains to be definitively established, studies from other biological fields have shown that RASA3 can inhibit RAP1 (122), which in turn has been implicated in invasion and metastasis in various cancers (123, 124). RASA3 depletion can enhance signaling by integrins (125) and MAPK (64), and the possibility that RASA3 can act as a tumor suppressor has also been recently suggested through independent cross-species cancer studies (126). Our results suggest that RASA3 may play a more complex role in cancer, as the expression of WT RASA3 inhibited cell migration and invasion in gastric cancer cell lines, whereas N-terminal Var RASA3 enhanced migration and invasion. A third example of an alternative promoter–driven gene is MET, which has been extensively investigated as a target for cancer therapy (127–129). Although we and others have previously reported (38, 65) the expression of an N-terminal truncated MET-Var in cancer, functional implications of this truncated MET-Var have remained unclear. In this study, experimental assessment of MET WT and Var signaling revealed that truncated MET variants may have different downstream signaling effects compared with full-length MET isoforms. Under the experimental conditions used, we observed significant differences in phosphorylation patterns of ERK, STAT3, and GAB1, in a manner consistent with MET-Var being more pro-oncogenic compared with MET-WT, as ERK, STAT3, and GAB1 have all been shown to facilitate MET-induced signaling (130–132). The MET signaling pathway is known to be particularly complex, with multiple feedback loops (133), and understanding how expression of the N-terminal short MET isoform might promote downstream survival signaling will be an important subject of future research, particularly in light of recent unsuccessful clinical trials targeting MET in lung cancer using antibodies (5).

Our study also revealed an unexpected relationship between somatic promoters and tumor immunity. Specifically, we discovered that alternative promoter isoforms overexpressed in gastric cancer were significantly depleted of N-terminal peptides predicted to be potentially immunogenic, based on computational predictions of high-affinity MHC class I binding and other immunologic assays. We believe that this finding is relevant to cancer immunity, as it builds on previous findings from the literature establishing the existence of self-reactive T cells, the potential immunogenicity of overexpressed tumor antigens, and the process of tumor immunoediting. First, although the majority of self-reactive T cells are clonally deleted during early development, numerous groups have also demonstrated the frequent persistence of self-reactive T cells in the periphery (134). For example, analysis of transgenic mice has shown that 25% to 40% of autoreactive T cells are likely to escape clonal deletion even in the presence of the deleting ligand (135), and in humans, Yu and colleagues have demonstrated that clonal deletion prunes the T-cell repertoire but does not fully eliminate self-reactive T-cell clones (136). Importantly, although such self-reactive T cells are typically low-avidity and are not capable of recognizing self-antigens under normal physiologic conditions (137–140), they still retain the ability to become activated and to produce effector and memory cells under conditions of appropriate stimulation, such as infection and the mounting of antitumor responses (141, 142).

Second, in cancer, several studies have shown that self-reactive T cells can exhibit immunologic activity toward overexpressed tumor antigens, even if these antigens are also expressed at lower levels in normal tissues. One well-known example is the melanocyte differentiation antigen Melan-A/MART1, which is both expressed by normal melanocytes and overexpressed in malignant melanoma cells (143–147). T-cell recognition of Melan-A/MART1 has been detected in 50% of patients with melanoma (148), and even healthy individuals have been shown to exhibit a disproportionately high frequency of Melan-A/MART1–specific T cells in the peripheral blood (148). Besides Melan-A/MART1, other examples of tumor-associated self-antigens (149–151) inducing immunologic recognition in both healthy individuals and patients with cancer (152) include tyrosinase-related proteins (TRP1 and TRP2; refs. 153–159) and glycoprotein (gp) 100 (147, 160–163) in melanoma, and P1A in mastocytoma cells (164). Such examples clearly demonstrate that in certain cases, normally expressed proteins can still become immunogenic when overexpressed in cancer. Third, tumor immunoediting, the acquired capacity of developing tumors to escape immune control, is a recognized hallmark of cancer (165–172). Tumor immune escape can occur via different mechanisms, such as through upregulation of immune checkpoint inhibitors (e.g., PD-L1) and altered transcription of antigen-presenting genes (173–176) or tumor-specific antigens. For example, decreased expression of melanoma antigens (e.g., gp100, MART1, and P1A) has been associated with melanoma progression to later disease stages (177). Besides overt downregulation of the entire gene, it is thus highly plausible that transcriptional changes affecting splice forms and promoter variants may also contribute to tumor immunoediting. 
For example, very recent work (178) in B-cell acute lymphoblastic leukemia has described the production of N-terminally truncated CD19 variants in response to CD19 CART (chimeric antigen receptor–armed T cells) therapy, clearly showing that promoter transcript variants can indeed arise as a consequence of immunologic pressure. Taken collectively, we believe that these previously established findings all point to a plausible role for alternative promoters in reducing the immunogenic potential of tumors. In this regard, our observation that somatic promoter regions exhibit a significant overlap with binding targets of the PRC2 epigenetic regulator complex, and are particularly sensitive to EZH2 inhibition, suggests that pharmacologic approaches for reawakening somatic promoter–associated epitopes might represent an attractive strategy for increasing antitumor T-cell immunoreactivity and antitumor activity (86, 179).

In conclusion, our study indicates an important role for somatic promoters in gastric cancer. We also note that a significant portion (52%) of the somatic promoters localized to unannotated TSSs, consistent with recent studies indicating the existence of hundreds of transcript loci remaining to be annotated (180). Interestingly, a large portion of the human transcriptome has been shown to originate from repetitive elements that can exhibit promoter activity and/or express noncoding RNAs (181, 182). Unannotated promoters activated in our gastric cancer study were found to be enriched in ERV1 and L1 repeat elements that have been shown to be associated with stage-specific transcription in early human embryonic cells (183), suggesting a yet-unknown functional role for these promoters. Analysis of these unannotated promoters is likely to provide fertile ground for new and hitherto unanticipated insights into mechanisms of gastric cancer development and progression.

Primary Tissue Samples and Cell Lines

Primary patient samples were obtained from the SingHealth tissue repository with approvals from the SingHealth Centralised Institutional Review Board and signed patient informed consent. “Normal” (nonmalignant) samples used in this study refer to samples harvested from the stomach, from sites distant from the tumor and exhibiting no visible evidence of tumor or intestinal metaplasia/dysplasia upon surgical assessment. Tumor samples were confirmed by cryosectioning to contain >60% tumor cells. FU97, IM95, MKN7, OCUM1, and RERF-GC-1B cell lines were obtained from the Japan Health Science Research Resource Bank. AGS, KATOIII, and SNU16 cells and the Hs 1.Int and Hs 738.St/Int gastrointestinal fibroblast lines were obtained from the ATCC. NCC-59, NCC-24, SNU-1967, and SNU-1750 were obtained from the Korean Cell Line Bank. YCC3, YCC7, YCC21, and YCC22 were gifts from Yonsei Cancer Centre (Seoul, South Korea). HFE145 cells were a gift from Dr. Hassan Ashktorab (Howard University, Washington, DC). GES1 cells were a gift from Dr. Alfred Cheng (Chinese University of Hong Kong). Cell line identities were confirmed by short tandem repeat DNA profiling using ANSI/ATCC ASN-0002-2011 guidelines in mid-late 2015. All cell lines were negative for Mycoplasma contamination as assessed by the MycoAlert Mycoplasma Detection Kit (Lonza) and the MycoSensor qPCR Assay Kit (Agilent Technologies). PBMCs from healthy donors were collected under protocol CIRB ref no. 2010/720/E.

ChIP-seq

Nano-ChIP-seq was performed as described previously (38) with slight modifications (see Supplementary Text). Eight Nano-ChIP-seq libraries were multiplexed (New England Biolabs) and sequenced on 2 lanes of a HiSeq2500 sequencer (Illumina) to an average depth of 20 to 30 million reads per library. We assessed ChIP library qualities (H3K27ac, H3K4me3, and H3K4me1) using two different methods, ChIP enrichment assessment and CHANCE (see Supplementary Text; ref. 41). For EZH2 ChIP-seq, EZH2 antibodies (catalog #5246, Cell Signaling Technology) were used for ChIP. Thirty nanograms of ChIPed DNA was used for each sequencing library preparation (New England Biolabs).

Promoter Analysis

Promoter (H3K4me3hi/H3K4me1lo) regions were identified by calculating the H3K4me3:H3K4me1 ratio for all H3K4me3 regions merged across normal and gastric cancer samples. We estimated the sample size required to achieve 80% power at a 10% type I error rate (http://powerandsamplesize.com/) based on the average signals of the top 100 differential promoters between tumor and normal samples. This yielded a recommended sample size of 11 (average), which is met in our study (16 N/T). Regions with H3K4me3:H3K4me1 ratios <1 in both normal and gastric cancer samples were excluded from further analysis. For all analyses performed in this study, promoter regions were defined as genomic locations exhibiting H3K4me3hi/H3K4me1lo signals, and H3K4me3 signals were compared only within this predefined H3K4me3hi/H3K4me1lo subset. H3K27ac data were used for correlative analysis. H3K4me3 data (FASTQs) for colon carcinoma lines were downloaded from public databases: HCT116 and Caco2 from ENCODE and V503 and V400 from GSE36204. To compare promoter signals between gastric cancer and normal samples, we used the DESeq2 (46) and edgeR (47) Bioconductor packages on a read count matrix of ChIP-seq signals, adjusting for replicate information. Regions with FCs greater than 1.5 (FDR < 0.1) were selected as significantly different. The criteria of FC = 1.5 and q < 0.1 were based on previous literature that compared ChIP-seq profiles with DESeq2 and edgeR using similar thresholds (49, 50). Significantly altered promoters identified by DESeq2 overlapped almost completely with altered promoters found by edgeR. A regularized log transformation of the DESeq2 read counts was used to plot PCAs and heat maps.
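The filtering and thresholding steps described above can be summarized in a short sketch. It is illustrative only: the `Region` fields and the `classify` helper are hypothetical names, the fold changes and q values are assumed to come from a prior DESeq2 or edgeR run, and the symmetric cutoff of 1/1.5 for lost promoters is an assumption that the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    ratio_normal: float  # H3K4me3:H3K4me1 ratio in normal samples
    ratio_tumor: float   # H3K4me3:H3K4me1 ratio in gastric cancer samples
    fold_change: float   # tumor vs. normal H3K4me3 signal (linear scale)
    q_value: float       # FDR-adjusted P value from the differential test


def is_promoter(region):
    # A region is excluded only when the ratio is < 1 in BOTH sample groups.
    return not (region.ratio_normal < 1 and region.ratio_tumor < 1)


def classify(region, fc_cutoff=1.5, fdr_cutoff=0.1):
    if not is_promoter(region):
        return "excluded"
    if region.q_value >= fdr_cutoff:
        return "unaltered"
    if region.fold_change > fc_cutoff:
        return "gained"
    if region.fold_change < 1 / fc_cutoff:  # assumed symmetric cutoff
        return "lost"
    return "unaltered"


regions = [
    Region("r1", 0.5, 0.8, 3.0, 0.01),
    Region("r2", 0.5, 2.0, 2.0, 0.05),
    Region("r3", 2.0, 2.0, 0.5, 0.05),
    Region("r4", 2.0, 2.0, 2.0, 0.50),
]
for r in regions:
    print(r.name, classify(r))
```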

Transcriptome Analysis

RNA-seq data were obtained from the European Genome-Phenome Archive under accession no. EGAS00001001128. Data were processed by first aligning to GENCODE v19 transcript annotations using TopHat v2.0.12 (184). Cufflinks 2.2.0 was used to generate FPKM abundance measures. For identification of novel transcripts, Cufflinks was used without employing a reference transcript annotation. Transcripts were then merged across all gastric cancer and normal samples and compared against GENCODE annotations to identify novel transcripts using Cuffmerge 2.2.0. Deep-depth strand-specific RNA-seq was also performed on 10 additional primary samples (paired-end 101 bp). TCGA datasets were downloaded from TCGA Data Portal (https://gdc.cancer.gov/) in the form of FASTQ files, which were then aligned to GENCODE v19 transcript annotations using TopHat v2.0.12. To analyze promoter-associated RNA expression, RNA-seq reads from TCGA samples (tumor and normal) were mapped against the genomic locations of promoter regions originally defined by epigenomic profiling in the discovery samples, including all promoters, gained somatic promoters, and lost somatic promoters (see Fig. 1). RNA-seq reads mapping to these epigenome-defined promoter regions were then quantified and normalized by promoter length (kilobases) and by total library size, and FCs in expression were computed between tumor and normal TCGA sample groups. Length of promoter loci was defined as the number of base pairs (bp) between the start and stop genomic coordinate of the H3K4me3 region as identified by the peak caller program CCAT v3.0 (185). Isoform level quantification for alternative promoter–driven transcripts was performed using Cufflinks (FPKM; ref. 186), Kallisto (TPM; ref. 187), and MISO (isoform-centric analysis; ref. 188). Assigned counts for each isoform were normalized by DESeq2.
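As an illustration of the promoter-level normalization described above, the following sketch divides read counts by promoter length in kilobases and by library size. The per-million scaling, the pseudocount, and the function names are assumptions introduced for the example; the text states only that counts were normalized by length and total library size.

```python
from statistics import mean


def normalized_expression(read_count, start, stop, library_size):
    """Reads per kilobase of promoter per million reads in the library."""
    length_kb = (stop - start) / 1000.0  # promoter length from region coordinates
    if length_kb <= 0 or library_size <= 0:
        raise ValueError("promoter length and library size must be positive")
    return read_count / length_kb / (library_size / 1e6)


def fold_change(tumor_values, normal_values, pseudocount=1.0):
    """Ratio of group means; the pseudocount (an assumption) avoids division by zero."""
    return (mean(tumor_values) + pseudocount) / (mean(normal_values) + pseudocount)


# 200 reads on a 2-kb promoter in a library of 20 million reads.
print(normalized_expression(200, 1000, 3000, 20_000_000))
print(fold_change([9.0, 9.0], [4.0, 4.0]))
```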

Other Analyses

Other analyses, including DNA methylation analysis, survival analysis, gene set enrichment analysis, analysis of repetitive elements, functional element analysis using GenoCanyon and GenoSkyline, and analysis of transcription factor binding sites, are presented in the Supplementary Text.

Mass Spectrometry and Data Analysis

Peptide-level spectral data for 90 colon and rectal cancer samples (63) were downloaded from the CPTAC portal (https://cptac-data-portal.georgetown.edu/cptac/s/S016) generated by the Clinical Proteomic Tumor Analysis Consortium (NCI/NIH). Spectral counts were extracted using IDPicker's idQuery tool (189). Differentially expressed peptides were identified by fitting a linear model (limma R; ref. 190) on quantile-normalized and log2-transformed spectral counts. For gastric cancer cell line mass spectrometry, AGS, GES1, SNU1750, and MKN1 proteomic profiles were generated using nanoflow liquid chromatography on an EASY-nLC 1200 system coupled to a Q Exactive HF mass spectrometer (Thermo Fisher Scientific; Supplementary Text). The Q Exactive HF was operated with a TOP20 MS-MS spectra acquisition method per MS full scan. MS scans were conducted with 60,000 resolution and MS-MS scans with 15,000 resolution. For data analysis, raw files were processed with MaxQuant (191) version 1.5.2.8 against the UniProt annotated human protein database (192). Carbamidomethylation was set as a fixed modification, whereas methionine oxidation and protein N-acetylation were considered as variable modifications. Search results were processed with MaxQuant filtered with an FDR of 0.01. The match between run option and Label-Free Quantification (LFQ; ref. 193) was activated. LFQ intensities were filtered for potential contaminants and reverse proteins, and log2 transformed. They were then imputed using the open-source software Perseus (0.5 width, 1.8 downshift; ref. 194) and fitted using linear models (limma R; ref. 190).
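The log2 transformation and imputation step can be sketched as follows. This is a simplified stand-in rather than the Perseus implementation: it assumes that missing values are drawn from a normal distribution whose mean is shifted down by 1.8 standard deviations and whose standard deviation is 0.5 times that of the observed values, and that zero or absent intensities count as missing. The function name is hypothetical.

```python
import math
import random
import statistics


def log2_and_impute(intensities, width=0.5, downshift=1.8, seed=0):
    """Log2-transform intensities and fill missing values from a down-shifted normal."""
    # Treat None and non-positive intensities as missing (assumption).
    logged = [math.log2(x) if x is not None and x > 0 else None for x in intensities]
    observed = [v for v in logged if v is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed values")
    mu = statistics.mean(observed)
    sd = statistics.stdev(observed)
    rng = random.Random(seed)  # fixed seed for reproducibility
    return [
        v if v is not None else rng.gauss(mu - downshift * sd, width * sd)
        for v in logged
    ]


sample = [1024.0, None, 4096.0, 0.0, 2048.0]
print(log2_and_impute(sample))
```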

Molecular Biology and Biochemistry

Procedures for 5′ rapid amplification of cDNA ends (5′ RACE), gene cloning, Western blotting (MET variants), RASA3 mRNA measurement, and RAS-GTP assays are presented in the Supplementary Text.

Transfection with RASA3 siRNAs

Two RASA3 siRNAs were used to silence the RASA3 SomT transcript in NCC24 cells [hs.Ri.RASA3.13.1 TriFECTa Kit DsiRNA Duplex (Integrated DNA Technologies), and Silencer Select Pre-Designed siRNA s355 (Life Technologies)]. NCC24 cells were transfected either with the above two siRNAs or a nontargeting control (ON-TARGETplus non-targeting pool, Dharmacon) at a final concentration of 100 nmol/L for 48 hours, subsequently followed by qPCR and Western validation and migration/invasion assays.

Cell Proliferation, Migration, and Invasion Assays

For cell proliferation, 3 × 10³ GES1, SNU1967, and AGS cells were plated into 96-well plates in media with 10% FBS and left overnight to attach. The next day (day 0), cells were transiently transfected with WT and Var RASA3 constructs using Lipofectamine 3000 (Thermo Fisher Scientific). The amount of the constructs was 40 ng per well for AGS and 100 ng per well for GES1 and SNU1967 cells. Cell proliferation was measured by the WST-8 assay (Cell Counting Kit-8, Dojindo) from 24 to 120 hours posttransfection. WST-8 solution (10 μL) was added per well, and the absorbance reading was measured at 450 nm after 2 hours of incubation in a humidified incubator. To determine cell-migratory capacities, RASA3 WT and Var-transfected GES1, SNU1967, and AGS cells and siRNA-treated NCC24 cells were tested using Corning Costar 6.5-mm Transwell with 8.0-μm Pore Polycarbonate Membrane Inserts (3422, Corning). AGS cells (2.5 × 10⁴), GES1 cells (2 × 10⁴), SNU1967 cells (3 × 10⁴), and NCC24 cells (5 × 10⁴) were suspended in 0.1 mL serum-free RPMI medium and added to the top of the Transwell insert. RPMI (0.6 mL) containing 10% FBS was added into the bottom well as a chemoattractant. After incubation for 24 hours at 37°C in a 5% CO2 incubator, cells were fixed with 3.7% formaldehyde and permeabilized with 100% methanol. Nonmigrated cells were scraped off with cotton swabs from the upper surface of the membrane. Migrated cells were stained with 0.5% crystal violet. The number of migrated cells was represented as the total area of migrated cells versus the area of Transwell membrane calculated using ImageJ software. For cell invasion assays, the above Transwell inserts were coated with 0.1 mL (300 μg/mL) Corning Matrigel matrix (354234, Corning) for 2 to 4 hours at 37°C before use.

Altered Peptide and Antigen Prediction

Altered peptides were defined as variant N-terminal protein sequences arising from somatic alterations in alternative promoter usage. The following filters were applied to select the pool of altered peptides: (i) FC of at least 1.5 for alternate versus canonical RNA-seq expression; (ii) only one canonical and one alternate isoform per gene loci; and (iii) annotated transcripts confirmed as protein coding by GENCODE. Canonical promoters were defined as regions exhibiting unaltered H3K4me3 peaks. Random peptides from the human proteome were generated from amino acid sequences of GENCODE coding transcripts. N-terminal peptide gains were identified as cases in which the alternative transcript was associated with a different 5′ region predicted to result in a different translated protein sequence compared with the canonical transcript. For each N-terminal altered protein, we evaluated binding of 9-mer peptides using the NetMHCpan 2.8 using a strict threshold of IC50 ≤ 50 nmol/L to identify strong MHC binders (74, 195). Antigen predictions were performed against patient-specific HLA types of gastric cancer samples predicted using OptiType (196). OptiType was run using default parameters, except BWA mem was used as an aligner for prefiltering reads aligning to the OptiType-provided reference sequences.
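The peptide enumeration and binder selection steps can be sketched as below. The example does not call NetMHCpan; it assumes that predicted IC50 values (in nmol/L) are already available in a dictionary, and the sequence, the IC50 values, and the function names are hypothetical.

```python
def nine_mers(protein, k=9):
    """Return every overlapping k-mer (default 9-mer) of a protein sequence."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def strong_binders(ic50_by_peptide, threshold_nm=50.0):
    """Keep peptides whose predicted IC50 is at or below the strict threshold."""
    return sorted(p for p, ic50 in ic50_by_peptide.items() if ic50 <= threshold_nm)


# Hypothetical N-terminal sequence and hypothetical predicted IC50 values.
peptides = nine_mers("ACDEFGHIKLM")
predictions = dict(zip(peptides, [12.5, 50.0, 830.0]))

print(peptides)
print(strong_binders(predictions))
```

In practice, the predictions would be generated per patient-specific HLA allele, so the dictionary would be keyed by both peptide and allele.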

Association of Cytolytic Markers with Alternative Promoter Usage

Local immune cytolytic activity was evaluated using the expression of GZMA and PRF1 as previously used by Rooney and colleagues (81). Tumor content was estimated using two algorithms, ASCAT (aberrant cell fraction; ref. 79) and ESTIMATE (tumor purity; ref. 80). Expression data for the SG series were downloaded (GSE15460), normalized using the robust multiarray average algorithm in the “affy” R package (197), and log2 transformed. Affymetrix SNP Array 6.0 data for the SG series were downloaded from GSE31168 and GSE85466. Mutation frequencies for TCGA stomach adenocarcinoma (STAD) samples were downloaded from the TCGA STAD publication data (https://tcga-data.nci.nih.gov/docs/publications/stad_2014/; ref. 198) using level 2 curated MAF files (QCv5_blacklist_Pass.aggregated.capture.tcga.uuid.curated.somatic.maf) filtered for “Missense” variant classification. Expression data for TCGA STAD samples (TPM) were computed using the Kallisto algorithm (187). Raw SNP Array 6.0 .CEL files for TCGA gastric cancers (STAD) were downloaded from the GDC data portal (https://gdc-portal.nci.nih.gov/). Access to this dataset was obtained using Database of Genotypes and Phenotypes (dbGaP) credentials and an ID issued by eRA Commons. Precomputed ESTIMATE scores for TCGA STAD were downloaded from http://bioinformatics.mdanderson.org/estimate/ and converted to tumor purity using the formula cos(0.6049872018 + 0.0001467884 × ESTIMATE score) (80). Preprocessed expression data for the ACRG series were downloaded from GSE62254, and precomputed ASCAT scores were obtained from collaborators (J. Lee). Expression of cytolytic markers was adjusted for missense mutation frequency and tumor purity using a spline regression model.
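The purity conversion quoted above is simple enough to write out directly. In the sketch below, `estimate_to_purity` applies the stated formula; `cytolytic_score` is an assumption, since the text says only that GZMA and PRF1 expression were used, and the geometric mean of the two markers is the form commonly attributed to Rooney and colleagues. Both function names are hypothetical.

```python
import math


def estimate_to_purity(estimate_score):
    """Convert an ESTIMATE score to tumor purity using the formula in the text."""
    return math.cos(0.6049872018 + 0.0001467884 * estimate_score)


def cytolytic_score(gzma, prf1):
    """Geometric mean of GZMA and PRF1 expression (assumed form, linear scale)."""
    if gzma < 0 or prf1 < 0:
        raise ValueError("expression values must be non-negative")
    return math.sqrt(gzma * prf1)


for score in (-1000, 0, 1000, 2000):
    print(score, round(estimate_to_purity(score), 3))
print(cytolytic_score(4.0, 9.0))
```

Higher ESTIMATE scores, which reflect more stromal and immune signal, map to lower purity values over this range.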

EPIMAX Assays

Peptides for 15 representative alternative promoters were synthesized by GenScript (Supplementary Table S10). Control peptide pools for human actin were purchased from JPT [PM-ACTS, PepMix Human (Actin) JPT]. PBMCs were obtained from 9 healthy volunteers, of which 8 PBMC samples were HLA typed (Supplementary Table S9). PBMCs were labeled with 1 μmol/L CFSE (Life Technologies, Thermo Fisher Scientific) and cultured at a density of 2 × 10⁵ cells per well in complete culture medium [cRPMI comprising RPMI-1640 medium (Gibco, Thermo Fisher Scientific), 15 mmol/L HEPES (Gibco), 1% nonessential amino acid (Gibco), 1 mmol/L sodium pyruvate (Gibco), 1% penicillin/streptomycin (Gibco), 2 mmol/L L-glutamine (Gibco), 50 μmol/L β-mercaptoethanol (Sigma, Merck), and 10% heat-inactivated FCS (HyClone)] for 5 days. Individual peptide pools of each alternative promoter were added at the start of the culture at a concentration of 1 μg/mL for each peptide. At the end of day 5, cells were stained with LIVE/DEAD Fixable Near-IR Dead Cell Stain Kit (Life Technologies) and labeled with CD4-BUV737 (BD Biosciences), CD8-PacificBlue (BD Biosciences), CD3-PE (BioLegend), CD19-PE/TexasRed (Beckman Coulter), and CD56-APC (BD Biosciences). In addition, magnetic bead–based cytokine multiplex analysis (human cytokine panel 1, Millipore, Merck) was performed on cell culture supernatants to measure secreted cytokine levels.

IFNγ Assays

To test the immunogenicity of the RASA3 WT and Var protein sequences, CD14+ monocytes were isolated from an HLA-A*02:06 donor by positive selection using magnetic beads (Miltenyi Biotec). Dendritic cells (DC) were generated by GM-CSF (1,000 IU/mL) and IL4 (400 IU/mL) and further matured by TNF (10 ng/mL), IL1β (10 ng/mL), IL6 (10 ng/mL; Miltenyi Biotec), and PGE2 (1 μg/mL; Stemcell Technologies) for 24 hours. The DCs were then primed with AGS cell lysates expressing WT RASA3 or Var RASA3 for 24 hours, before being cocultured with T cells from the same donor at the ratio of 1:5. After 5 days of coculture with DCs, T cells were isolated by positive selection using CD3 magnetic beads (Miltenyi Biotec) and cocultured with AGS cells expressing either WT or Var RASA3 at the ratio of 20:1 for 2 days. Supernatants were harvested and IFNγ release was measured by ELISA (R&D Systems).

NanoString Analysis

NanoString nCounter Reporter CodeSets were designed for 95 genes (83 upregulated in gastric cancer and 11 downregulated) and 5 housekeeping genes (AGPAT1, CLTC, B2M, POL2RL, and TBP covering a broad expression range) on the SG series samples. For each gene, we designed three probes, targeting (i) the 5′ end of the alternate promoter location; (ii) the 5′ end of the canonical promoter (defined by promoter regions of equal enrichment in both GC and normal samples or the longest protein-coding transcript); and (iii) a common downstream probe. A separate NanoString assay was designed for 88 genes on the ACRG cohort, using similar criteria. Vendor-provided nCounter software (nSolver) was used for data analysis. Raw counts were normalized using the geometric mean of the internal positive control probes included in each CodeSet.
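Positive-control normalization of this kind can be sketched as follows. This is not the nSolver code: it assumes that each sample is scaled so that the geometric mean of its positive control probes matches the average geometric mean across samples, and the sample names, counts, and function names are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive counts."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def positive_control_factors(pos_counts_by_sample):
    """Per-sample scaling factor: cross-sample reference / sample geometric mean."""
    geo = {s: geometric_mean(c) for s, c in pos_counts_by_sample.items()}
    reference = sum(geo.values()) / len(geo)  # assumed reference: arithmetic mean
    return {s: reference / g for s, g in geo.items()}


def normalize(raw_counts, factor):
    """Apply a sample's scaling factor to all of its probe counts."""
    return {probe: count * factor for probe, count in raw_counts.items()}


positive_controls = {"sample_1": [100, 400], "sample_2": [50, 200]}
factors = positive_control_factors(positive_controls)
print(factors)
print(normalize({"alt_5prime": 80, "canonical_5prime": 120}, factors["sample_2"]))
```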

EZH2 Inhibition

IM95 cells were treated with GSK126 (Selleck Chemicals; dissolved in DMSO), a selective EZH2 inhibitor (96, 199), at a concentration of 5 μmol/L. Control cells were treated with the same concentration of DMSO (0.1%). Cell proliferation was monitored in 96-well plates posttreatment with GSK126 using the CellTiter-Glo Luminescent Cell Viability Assay (Promega) for three independent experiments. For RNA-seq analysis, total RNA was extracted using the Qiagen RNeasy Mini Kit according to the manufacturer's instructions. RNA-seq differential analysis for promoter loci was carried out using edgeR (47) on read counts mapping to H3K4me3 regions estimated using featureCounts (200). RNA-seq gene-level differential analysis was performed using Cuffdiff 2.2.1.

Accession Codes

Genomic data for this study have been deposited in the National Center for Biotechnology Information (NCBI) GEO database under accession numbers GSE51776 and GSE75898 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?token=kfoxqeamzfetpal&acc=GSE75898).

P. Tan has ownership interest (including patents) in ASTAR. No potential conflicts of interest were disclosed by the other authors.

Conception and design: A. Qamra, M. Xing, N. Padmanabhan, P.K.H. Chow, B.T. Teh, P. Tan

Development of methodology: M. Xing, J.J.T. Kwok, J.S. Lin, X. Yao, B.T. Teh, P. Tan

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M. Xing, N. Padmanabhan, S. Zhang, C. Xu, Y.S. Leong, A.P.L. Lim, Q. Tang, X. Yao, X. Ong, M. Lee, S.T. Tay, E.G. Santoso, C.C.Y. Ng, A. Jusakul, D. Smoot, S.Y. Rha, K.G. Yeoh, W.P. Yong, P.K.H. Chow, W.H. Chan, H.S. Ong, K.C. Soo, K.-M. Kim, W.K. Wong, B.T. Teh, D. Kappei, J. Lee, P. Tan

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): A. Qamra, M. Xing, J.J.T. Kwok, Q. Tang, W.F. Ooi, J.S. Lin, T. Nandi, X. Yao, A. Ng, S.Y. Rha, S.G. Rozen, D. Kappei, J. Connolly, P. Tan

Writing, review, and/or revision of the manuscript: A. Qamra, M. Xing, J.J.T. Kwok, A. Ng, S.Y. Rha, K.G. Yeoh, W.P. Yong, P.K.H. Chow, S.G. Rozen, B.T. Teh, J. Connolly, P. Tan

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): A.P.L. Lim, A.T.L. Keng, H. Ashktorab, S.G. Rozen, B.T. Teh, P. Tan

Study supervision: P.K.H. Chow, P. Tan

Other (provided the cell lines): H. Ashktorab

We thank the Sequencing and Scientific Computing teams at the Genome Institute of Singapore for providing sequencing services and data management capabilities, and the Duke-NUS Genome Biology Facility for sequencing services. We also thank Dr. Shyam Prabhakar for helpful discussions. We thank Dr. Wanjin Hong for the gift of HEK293 cells and pCI-Puro-HA vector and Dr. Alfred Cheng for the gift of GES1 cells.

This work was supported by a core institutional grant from the Genome Institute of Singapore under the Agency for Science, Technology and Research; core funding from Duke-NUS Medical School; and National Medical Research Council grants TCR/009-NUHS/2013, BnB/0005b/2013 (BnB11dec069), and NMRC/STaR/0026/2015. Other sources of support include the Cancer Science Institute of Singapore, NUS, under the National Research Foundation Singapore and the Singapore Ministry of Education under its Research Centres of Excellence initiative.

1. Ferlay J, Soerjomataram I, Dikshit R, Eser S, Mathers C, Rebelo M, et al. Cancer incidence and mortality worldwide: sources, methods and major patterns in GLOBOCAN 2012. Int J Cancer 2015;136:E359–86.
2. Layke JC, Lopez PP. Gastric cancer: diagnosis and treatment options. Am Fam Physician 2004;69:1133–40.
3. Schmidt N, Peitz U, Lippert H, Malfertheiner P. Missing gastric cancer in dyspepsia. Aliment Pharmacol Ther 2005;21:813–20.
4. Hecht JR, Bang YJ, Qin SK, Chung HC, Xu JM, Park JO, et al. Lapatinib in combination with capecitabine plus oxaliplatin in human epidermal growth factor receptor 2-positive advanced or metastatic gastric, esophageal, or gastroesophageal adenocarcinoma: TRIO-013/LOGiC–A randomized phase III trial. J Clin Oncol 2016;34:443–51.
5. Roche. Roche provides update on phase III study of onartuzumab in people with specific type of lung cancer. Available from: http://www.roche.com/media/store/releases/med-cor-2014-03-03.htm.
6. Ohtsu A, Shah MA, Van Cutsem E, Rha SY, Sawaki A, Park SR, et al. Bevacizumab in combination with chemotherapy as first-line therapy in advanced gastric cancer: a randomized, double-blind, placebo-controlled phase III study. J Clin Oncol 2011;29:3968–76.
7. Dikken JL, van Sandick JW, Maurits Swellengrebel HA, Lind PA, Putter H, Jansen EP, et al. Neo-adjuvant chemotherapy followed by surgery and chemotherapy or by surgery and chemoradiotherapy for patients with resectable gastric cancer (CRITICS). BMC Cancer 2011;11:329.
8. Wang K, Kan J, Yuen ST, Shi ST, Chu KM, Law S, et al. Exome sequencing identifies frequent mutation of ARID1A in molecular subtypes of gastric cancer. Nat Genet 2011;43:1219–23.
9. Zang ZJ, Cutcutache I, Poon SL, Zhang SL, McPherson JR, Tao J, et al. Exome sequencing of gastric adenocarcinoma identifies recurrent somatic mutations in cell adhesion and chromatin remodeling genes. Nat Genet 2012;44:570–4.
10. Yao F, Kausalya JP, Sia YY, Teo AS, Lee WH, Ong AG, et al. Recurrent fusion genes in gastric cancer: CLDN18-ARHGAP26 induces loss of epithelial integrity. Cell Rep 2015;12:272–85.
11. Cristescu R, Lee J, Nebozhyn M, Kim KM, Ting JC, Wong SS, et al. Molecular analysis of gastric cancer identifies subtypes associated with distinct clinical outcomes. Nat Med 2015;21:449–56.
12. Tan IB, Ivanova T, Lim KH, Ong CW, Deng N, Lee J, et al. Intrinsic subtypes of gastric cancer, based on gene expression pattern, predict survival and respond differently to chemotherapy. Gastroenterology 2011;141:476–85.
13. Bang YJ, Van Cutsem E, Feyereislova A, Chung HC, Shen L, Sawaki A, et al. Trastuzumab in combination with chemotherapy versus chemotherapy alone for treatment of HER2-positive advanced gastric or gastro-oesophageal junction cancer (ToGA): a phase 3, open-label, randomised controlled trial. Lancet 2010;376:687–97.
14. Lenhard B, Sandelin A, Carninci P. Metazoan promoters: emerging characteristics and insights into transcriptional regulation. Nat Rev Genet 2012;13:233–45.
15. Davuluri RV, Suzuki Y, Sugano S, Plass C, Huang TH. The functional consequences of alternative promoter use in mammalian genomes. Trends Genet 2008;24:167–77.
16. D'Alessio JA, Wright KJ, Tjian R. Shifting players and paradigms in cell-specific transcription. Mol Cell 2009;36:924–31.
17. Bieberstein NI, Carrillo Oesterreich F, Straube K, Neugebauer KM. First exon length controls active chromatin signatures and transcription. Cell Rep 2012;2:62–8.
18. Zammarchi F, Boutsalis G, Cartegni L. 5′ UTR control of native ERG and of Tmprss2:ERG variants activity in prostate cancer. PLoS One 2013;8:e49721.
19. Ong CK, Leong C, Tan PH, Van T, Huynh H. The role of 5′ untranslated region in translational suppression of OKL38 mRNA in hepatocellular carcinoma. Oncogene 2007;26:1155–65.
20. Valen E, Pascarella G, Chalk A, Maeda N, Kojima M, Kawazu C, et al. Genome-wide detection and analysis of hippocampus core promoters using DeepCAGE. Genome Res 2009;19:255–65.
21. Wiesner T, Lee W, Obenauf AC, Ran L, Murali R, Zhang QF, et al. Alternative transcription initiation leads to expression of a novel ALK isoform in cancer. Nature 2015;526:453–7.
22. Muller M, Schleithoff ES, Stremmel W, Melino G, Krammer PH, Schilling T. One, two, three–p53, p63, p73 and chemosensitivity. Drug Resist Updat 2006;9:288–306.
23. Arce L, Yokoyama NN, Waterman ML. Diversity of LEF/TCF action in development and disease. Oncogene 2006;25:7492–504.
24. Agarwal VR, Bulun SE, Leitch M, Rohrich R, Simpson ER. Use of alternative promoters to express the aromatase cytochrome P450 (CYP19) gene in breast adipose tissues of cancer-free and breast cancer patients. J Clin Endocrinol Metab 1996;81:3843–9.
25. Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, et al. The transcriptional landscape of the mammalian genome. Science 2005;309:1559–63.
26. Forrest AR, Kawaji H, Rehli M, Baillie JK, de Hoon MJ, Haberle V, et al. A promoter-level mammalian expression atlas. Nature 2014;507:462–70.
27. Core LJ, Martins AL, Danko CG, Waters CT, Siepel A, Lis JT. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat Genet 2014;46:1311–20.
28. Wang Z, Zang C, Rosenfeld JA, Schones DE, Barski A, Cuddapah S, et al. Combinatorial patterns of histone acetylations and methylations in the human genome. Nat Genet 2008;40:897–903.
29. Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, et al. High-resolution profiling of histone methylations in the human genome. Cell 2007;129:823–37.
30. Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, Steine EJ, et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci U S A 2010;107:21931–6.
31. Rada-Iglesias A, Bajpai R, Swigut T, Brugmann SA, Flynn RA, Wysocka J. A unique chromatin signature uncovers early developmental enhancers in humans. Nature 2011;470:279–83.
32. Gallego Romero I, Pai AA, Tung J, Gilad Y. RNA-seq: impact of RNA degradation on transcript quantification. BMC Biol 2014;12:42.
33. Preker P, Almvig K, Christensen MS, Valen E, Mapendano CK, Sandelin A, et al. PROMoter uPstream Transcripts share characteristics with mRNAs and are produced upstream of all three major types of mammalian promoters. Nucleic Acids Res 2011;39:7179–93.
34. Kim TK, Hemberg M, Gray JM, Costa AM, Bear DM, Wu J, et al. Widespread transcription at neuronal activity-regulated enhancers. Nature 2010;465:182–7.
35. Andersson R, Refsing Andersen P, Valen E, Core LJ, Bornholdt J, Boyd M, et al. Nuclear stability and transcriptional directionality separate functionally distinct RNA species. Nat Commun 2014;5:5336.
36. Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M, et al. An atlas of active enhancers across human cell types and tissues. Nature 2014;507:455–61.
37. Ng JH, Kumar V, Muratani M, Kraus P, Yeo JC, Yaw LP, et al. In vivo epigenomic profiling of germ cells reveals germ cell molecular signatures. Dev Cell 2013;24:324–33.
38. Muratani M, Deng N, Ooi WF, Lin SJ, Xing M, Xu C, et al. Nanoscale chromatin profiling of gastric adenocarcinoma reveals cancer-associated cryptic promoters and somatically acquired regulatory elements. Nat Commun 2014;5:4361.
39. Tanasijevic B, Dai B, Ezashi T, Livingston K, Roberts RM, Rasmussen TP. Progressive accumulation of epigenetic heterogeneity during human ES cell culture. Epigenetics 2009;4:330–8.
40. Smiraglia DJ, Rush LJ, Fruhwald MC, Dai Z, Held WA, Costello JF, et al. Excessive CpG island hypermethylation in cancer cell lines versus primary human malignancies. Hum Mol Genet 2001;10:1413–9.
41. Diaz A, Nellore A, Song JS. CHANCE: comprehensive software for quality control and validation of ChIP-seq data. Genome Biol 2012;13:R98.
42. Andersson R, Sandelin A, Danko CG. A unified architecture of transcriptional regulatory elements. Trends Genet 2015;31:426–33.
43. Raja UM, Gopal G, Rajkumar T. Intragenic DNA methylation concomitant with repression of ATP4B and ATP4A gene expression in gastric cancer is a potential serum biomarker. Asian Pac J Cancer Prev 2012;13:5563–8.
44. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature 2014;513:202–9.
45. Rapaport F, Khanin R, Liang Y, Pirun M, Krek A, Zumbo P, et al. Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data. Genome Biol 2013;14:R95.
46. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014;15:550.
47. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010;26:139–40.
48. Steinhauser S, Kurzawa N, Eils R, Herrmann C. A comprehensive comparison of tools for differential ChIP-seq analysis. Brief Bioinform 2016;17:953–66.
49. Ross-Innes CS, Stark R, Teschendorff AE, Holmes KA, Ali HR, Dunning MJ, et al. Differential oestrogen receptor binding is associated with clinical outcome in breast cancer. Nature 2012;481:389–93.
50. Decker KF, Zheng D, He Y, Bowman T, Edwards JR, Jia L. Persistent androgen receptor-mediated transcription in castration-resistant prostate cancer under androgen-deprived conditions. Nucleic Acids Res 2012;40:10765–79.
51. Okitsu CY, Hsieh JC, Hsieh CL. Transcriptional activity affects the H3K4me3 level and distribution in the coding region. Mol Cell Biol 2010;30:2933–46.
52. Zhang ZZ, Shen ZY, Shen YY, Zhao EH, Wang M, Wang CJ, et al. HOTAIR long noncoding RNA promotes gastric cancer metastasis through suppression of Poly r(C)-Binding Protein (PCBP) 1. Mol Cancer Ther 2015;14:1162–70.
53. Ding J, Li D, Gong M, Wang J, Huang X, Wu T, et al. Expression and clinical significance of the long non-coding RNA PVT1 in human gastric cancer. Onco Targets Ther 2014;7:1625–30.
54. Kundaje A, Meuleman W, Ernst J, Bilenky M, Yen A, Heravi-Moussavi A, et al. Integrative analysis of 111 reference human epigenomes. Nature 2015;518:317–30.
55. Smith ZD, Meissner A. DNA methylation: roles in mammalian development. Nat Rev Genet 2013;14:204–20.
56. Jones PA. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat Rev Genet 2012;13:484–92.
57. Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res 2012;22:1760–74.
58. Chia NY, Deng N, Das K, Huang D, Hu L, Zhu Y, et al. Regulatory crosstalk between lineage-survival oncogenes KLF5, GATA4 and GATA6 cooperatively promotes gastric cancer development. Gut 2015;64:707–19.
59. Chang HR, Nam S, Kook MC, Kim KT, Liu X, Yao H, et al. HNF4alpha is a therapeutic target that links AMPK to WNT signalling in early-stage gastric cancer. Gut 2016;65:19–32.
60. Tanaka T, Jiang S, Hotta H, Takano K, Iwanari H, Sumi K, et al. Dysregulated expression of P1 and P2 promoter-driven hepatocyte nuclear factor-4alpha in the pathogenesis of human cancer. J Pathol 2006;208:662–72.
61. Takano K, Hasegawa G, Jiang S, Kurosaki I, Hatakeyama K, Iwanari H, et al. Immunohistochemical staining for P1 and P2 promoter-driven hepatocyte nuclear factor-4alpha may complement mucin phenotype of differentiated-type early gastric carcinoma. Pathol Int 2009;59:462–70.
62. Edwards NJ, Oberti M, Thangudu RR, Cai S, McGarvey PB, Jacob S, et al. The CPTAC data portal: a resource for cancer proteomics research. J Proteome Res 2015;14:2707–13.
63. Zhang B, Wang J, Wang X, Zhu J, Liu Q, Shi Z, et al. Proteogenomic characterization of human colon and rectal cancer. Nature 2014;513:382–7.
64. Nafisi H, Banihashemi B, Daigle M, Albert PR. GAP1(IP4BP)/RASA3 mediates Galphai-induced inhibition of mitogen-activated protein kinase. J Biol Chem 2008;283:35908–17.
65. The Cancer Genome Atlas Research Network, Linehan WM, Spellman PT, Ricketts CJ, Creighton CJ, Fei SS, et al. Comprehensive molecular characterization of papillary renal-cell carcinoma. N Engl J Med 2016;374:135–45.
66. Nishida K, Hirano T. The role of Gab family scaffolding adapter proteins in the signal transduction of cytokine and growth factor receptors. Cancer Sci 2003;94:1029–33.
67. Darnell JE Jr, Kerr IM, Stark GR. Jak-STAT pathways and transcriptional activation in response to IFNs and other extracellular signaling proteins. Science 1994;264:1415–21.
68. Ihle JN. Cytokine receptor signalling. Nature 1995;377:591–4.
69. Wen Z, Zhong Z, Darnell JE Jr. Maximal activation of transcription by Stat1 and Stat3 requires both tyrosine and serine phosphorylation. Cell 1995;82:241–50.
70. Decker T, Kovarik P. Serine phosphorylation of STATs. Oncogene 2000;19:2628–37.
71. Schumacher TN, Schreiber RD. Neoantigens in cancer immunotherapy. Science 2015;348:69–74.
72. Mittal D, Gubin MM, Schreiber RD, Smyth MJ. New insights into cancer immunoediting and its three component phases–elimination, equilibrium and escape. Curr Opin Immunol 2014;27:16–25.
73. Sette A, Vitiello A, Reherman B, Fowler P, Nayersina R, Kast WM, et al. The relationship between class I binding affinity and immunogenicity of potential cytotoxic T cell epitopes. J Immunol 1994;153:5586–92.
74. Hoof I, Peters B, Sidney J, Pedersen LE, Sette A, Lund O, et al. NetMHCpan, a method for MHC class I binding prediction beyond humans. Immunogenetics 2009;61:1–13.
75. Ooi CH, Ivanova T, Wu J, Lee M, Tan IB, Tao J, et al. Oncogenic pathway combinations predict clinical prognosis in gastric cancer. PLoS Genet 2009;5:e1000676.
76. Johnson BJ, Costelloe EO, Fitzpatrick DR, Haanen JB, Schumacher TN, Brown LE, et al. Single-cell perforin and granzyme expression reveals the anatomical localization of effector CD8+ T cells in influenza virus-infected mice. Proc Natl Acad Sci U S A 2003;100:2657–62.
77. Ji RR, Chasalow SD, Wang L, Hamid O, Schmidt H, Cogswell J, et al. An immune-active tumor microenvironment favors clinical response to ipilimumab. Cancer Immunol Immunother 2012;61:1019–31.
78. Herbst RS, Soria JC, Kowanetz M, Fine GD, Hamid O, Gordon MS, et al. Predictive correlates of response to the anti-PD-L1 antibody MPDL3280A in cancer patients. Nature 2014;515:563–7.
79. Van Loo P, Nordgard SH, Lingjaerde OC, Russnes HG, Rye IH, Sun W, et al. Allele-specific copy number analysis of tumors. Proc Natl Acad Sci U S A 2010;107:16910–5.
80. Yoshihara K, Shahmoradgoli M, Martinez E, Vegesna R, Kim H, Torres-Garcia W, et al. Inferring tumour purity and stromal and immune cell admixture from expression data. Nat Commun 2013;4:2612.
81. Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell 2015;160:48–61.
82. Skibinski DA, Hanson BJ, Lin Y, von Messling V, Jegerlehner A, Tee JB, et al. Enhanced neutralizing antibody titers and Th1 polarization from a novel Escherichia coli derived pandemic influenza vaccine. PLoS One 2013;8:e76571.
83. Palucka AK, Ueno H, Fay J, Banchereau J. Dendritic cells: a critical player in cancer therapy? J Immunother 2008;31:793–805.
84. Robbins PF, Lu YC, El-Gamil M, Li YF, Gross C, Gartner J, et al. Mining exomic sequencing data to identify mutated antigens recognized by adoptively transferred tumor-reactive T cells. Nat Med 2013;19:747–52.
85. Tran E, Turcotte S, Gros A, Robbins PF, Lu YC, Dudley ME, et al. Cancer immunotherapy based on mutation-specific CD4+ T cells in a patient with epithelial cancer. Science 2014;344:641–5.
86. Matsushita H, Vesely MD, Koboldt DC, Rickert CG, Uppaluri R, Magrini VJ, et al. Cancer exome analysis reveals a T-cell-dependent mechanism of cancer immunoediting. Nature 2012;482:400–4.
87. Sidney J, Southwood S, Mann DL, Fernandez-Vina MA, Newman MJ, Sette A. Majority of peptides binding HLA-A*0201 with high affinity crossreact with other A2-supertype molecules. Hum Immunol 2001;62:1200–16.
88. Torikai H, Akatsuka Y, Miyauchi H, Terakura S, Onizuka M, Tsujimura K, et al. The HLA-A*0201-restricted minor histocompatibility antigen HA-1H peptide can also be presented by another HLA-A2 subtype, A*0206. Bone Marrow Transplant 2007;40:165–74.
89. Griffon A, Barbier Q, Dalino J, van Helden J, Spicuglia S, Ballester B. Integrative analysis of public ChIP-seq experiments reveals a complex multi-cell regulatory landscape. Nucleic Acids Res 2015;43:e27.
90. Simon JA, Lange CA. Roles of the EZH2 histone methyltransferase in cancer epigenetics. Mutat Res 2008;647:21–9.
91. Matsukawa Y, Semba S, Kato H, Ito A, Yanagihara K, Yokozaki H. Expression of the enhancer of zeste homolog 2 is correlated with poor prognosis in human gastric cancer. Cancer Sci 2006;97:484–91.
92. Fujii S, Ochiai A. Enhancer of zeste homolog 2 downregulates E-cadherin by mediating histone H3 methylation in gastric cancer cells. Cancer Sci 2008;99:738–46.
93. Varambally S, Dhanasekaran SM, Zhou M, Barrette TR, Kumar-Sinha C, Sanda MG, et al. The polycomb group protein EZH2 is involved in progression of prostate cancer. Nature 2002;419:624–9.
94. Kleer CG, Cao Q, Varambally S, Shen R, Ota I, Tomlins SA, et al. EZH2 is a marker of aggressive breast cancer and promotes neoplastic transformation of breast epithelial cells. Proc Natl Acad Sci U S A 2003;100:11606–11.
95. Hock H. A complex Polycomb issue: the two faces of EZH2 in cancer. Genes Dev 2012;26:751–5.
96. McCabe MT, Ott HM, Ganji G, Korenchuk S, Thompson C, Van Aller GS, et al. EZH2 inhibition as a therapeutic strategy for lymphoma with EZH2-activating mutations. Nature 2012;492:108–12.
97. Cheng LL, Itahana Y, Lei ZD, Chia NY, Wu Y, Yu Y, et al. TP53 genomic status regulates sensitivity of gastric cancer cells to the histone methylation inhibitor 3-deazaneplanocin A (DZNep). Clin Cancer Res 2012;18:4201–12.
98. Lu Q, Hu Y, Sun J, Cheng Y, Cheung KH, Zhao H. A statistical framework to predict functional non-coding regions in the human genome through integrated analysis of annotation data. Sci Rep 2015;5:10576.
99. Lu Q, Powles RL, Wang Q, He BJ, Zhao H. Integrative tissue-specific functional annotations in the human genome provide novel insights on many complex traits and improve signal prioritization in genome wide association studies. PLoS Genet 2016;12:e1005947.
100. Villar D, Berthelot C, Aldridge S, Rayner TF, Lukk M, Pignatelli M, et al. Enhancer evolution across 20 mammalian species. Cell 2015;160:554–66.
101. Wolff EM, Byun HM, Han HF, Sharma S, Nichols PW, Siegmund KD, et al. Hypomethylation of a LINE-1 promoter activates an alternate transcript of the MET oncogene in bladders with cancer. PLoS Genet 2010;6:e1000917.
102. Marinov GK, Williams BA, McCue K, Schroth GP, Gertz J, Myers RM, et al. From single-cell to cell-pool transcriptomes: stochasticity in gene expression and RNA splicing. Genome Res 2014;24:496–510.
103. Kouzarides T. Chromatin modifications and their function. Cell 2007;128:693–705.
104. Rivera CM, Ren B. Mapping human epigenomes. Cell 2013;155:39–55.
105. Jones PA, Baylin SB. The fundamental role of epigenetic events in cancer. Nat Rev Genet 2002;3:415–28.
106. Kadonaga JT. Perspectives on the RNA polymerase II core promoter. Wiley Interdiscip Rev Dev Biol 2012;1:40–51.
107. Sandelin A, Carninci P, Lenhard B, Ponjavic J, Hayashizaki Y, Hume DA. Mammalian RNA polymerase II core promoters: insights from genome-wide studies. Nat Rev Genet 2007;8:424–36.
108. Frith MC, Valen E, Krogh A, Hayashizaki Y, Carninci P, Sandelin A. A code for transcription initiation in mammalian genomes. Genome Res 2008;18:1–12.
109. Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Ponjavic J, et al. Genome-wide analysis of mammalian promoter architecture and evolution. Nat Genet 2006;38:626–35.
110. Rach EA, Winter DR, Benjamin AM, Corcoran DL, Ni T, Zhu J, et al. Transcription initiation patterns indicate divergent strategies for gene regulation at the chromatin level. PLoS Genet 2011;7:e1001274.
111. Trinklein ND, Aldred SJ, Saldanha AJ, Myers RM. Identification and functional analysis of human transcriptional promoters. Genome Res 2003;13:308–12.
112. Zhang T, Haws P, Wu Q. Multiple variable first exons: a mechanism for cell- and tissue-specific gene regulation. Genome Res 2004;14:79–89.
113. Araujo PR, Yoon K, Ko D, Smith AD, Qiao M, Suresh U, et al. Before it gets started: regulating translation at the 5′ UTR. Comp Funct Genomics 2012;2012:475731.
114. Pal S, Gupta R, Kim H, Wickramasinghe P, Baubet V, Showe LC, et al. Alternative transcription exceeds alternative splicing in generating the transcriptome diversity of cerebellar development. Genome Res 2011;21:1260–72.
115. Sobczak K, Krzyzosiak WJ. Structural determinants of BRCA1 translational regulation. J Biol Chem 2002;277:17349–58.
116. Arrick BA, Lee AL, Grendell RL, Derynck R. Inhibition of translation of transforming growth factor-beta 3 mRNA by its 5′ untranslated region. Mol Cell Biol 1991;11:4306–13.
117. Han B, Dong Z, Liu Y, Chen Q, Hashimoto K, Zhang JT. Regulation of constitutive expression of mouse PTEN by the 5′-untranslated region. Oncogene 2003;22:5325–37.
118. Vizoso M, Ferreira HJ, Lopez-Serra P, Carmona FJ, Martinez-Cardus A, Girotti MR, et al. Epigenetic activation of a cryptic TBC1D16 transcript enhances melanoma progression by targeting EGFR. Nat Med 2015;21:741–50.
119. Toyoshima K, Hayashi A, Kashiwagi M, Hayashi N, Iwatsuki M, Ishimoto T, et al. Analysis of circulating tumor cells derived from advanced gastric cancer. Int J Cancer 2015;137:991–8.
120. Warneke VS, Behrens HM, Haag J, Kruger S, Simon E, Mathiak M, et al. Members of the EpCAM signalling pathway are expressed in gastric cancer tissue and are correlated with patient prognosis. Br J Cancer 2013;109:2217–27.
121. Chaves-Perez A, Mack B, Maetzel D, Kremling H, Eggert C, Harreus U, et al. EpCAM regulates cell cycle progression via control of cyclin D1 expression. Oncogene 2013;32:641–50.
122. Stefanini L, Paul DS, Robledo RF, Chan ER, Getz TM, Campbell RA, et al. RASA3 is a critical inhibitor of RAP1-dependent platelet activation. J Clin Invest 2015;125:1419–32.
123. Che YL, Luo SJ, Li G, Cheng M, Gao YM, Li XM, et al. The C3G/Rap1 pathway promotes secretion of MMP-2 and MMP-9 and is involved in serous ovarian cancer metastasis. Cancer Lett 2015;359:241–9.
124. House CD, Wang BD, Ceniccola K, Williams R, Simaan M, Olender J, et al. Voltage-gated Na+ channel activity increases colon cancer transcriptional activity and invasion via persistent MAPK signaling. Sci Rep 2015;5:11541.
125. Molina-Ortiz P, Polizzi S, Ramery E, Gayral S, Delierneux C, Oury C, et al. Rasa3 controls megakaryocyte Rap1 activation, integrin signaling and differentiation into proplatelet. PLoS Genet 2014;10:e1004420.
126. Tang J, Li Y, Lyon K, Camps J, Dalton S, Ried T, et al. Cancer driver-passenger distinction via sporadic human and dog cancer comparison: a proof-of-principle study with colorectal cancer. Oncogene 2014;33:814–22.
127. Gherardi E, Birchmeier W, Birchmeier C, Vande Woude G. Targeting MET in cancer: rationale and progress. Nat Rev Cancer 2012;12:89–103.
128. Peters S, Adjei AA. MET: a promising anticancer therapeutic target. Nat Rev Clin Oncol 2012;9:314–26.
129. Kawakami H, Okamoto I, Arao T, Okamoto W, Matsumoto K, Taniguchi H, et al. MET amplification as a potential therapeutic target in gastric cancer. Oncotarget 2013;4:9–17.
130. Tanimura S, Chatani Y, Hoshino R, Sato M, Watanabe S, Kataoka T, et al. Activation of the 41/43 kDa mitogen-activated protein kinase signaling pathway is required for hepatocyte growth factor-induced cell scattering. Oncogene 1998;17:57–65.
131. Zhang YW, Wang LM, Jove R, Vande Woude GF. Requirement of Stat3 signaling for HGF/SF-Met mediated tumorigenesis. Oncogene 2002;21:217–26.
132. Weidner KM, Di Cesare S, Sachs M, Brinkmann V, Behrens J, Birchmeier W. Interaction between Gab1 and the c-Met receptor tyrosine kinase is responsible for epithelial morphogenesis. Nature 1996;384:173–6.
133. Trusolino L, Bertotti A, Comoglio PM. MET signalling: principles and functions in development, organ regeneration and cancer. Nat Rev Mol Cell Biol 2010;11:834–48.
134. Ota K, Matsui M, Milford EL, Mackin GA, Weiner HL, Hafler DA. T-cell recognition of an immunodominant myelin basic protein epitope in multiple sclerosis. Nature 1990;346:183–7.
135. Bouneaud C, Kourilsky P, Bousso P. Impact of negative selection on the T cell repertoire reactive to a self-peptide: a large fraction of T cell clones escapes clonal deletion. Immunity 2000;13:829–40.
136. Yu W, Jiang N, Ebert PJ, Kidd BA, Muller S, Lund PJ, et al. Clonal deletion prunes but does not eliminate self-specific alphabeta CD8(+) T lymphocytes. Immunity 2015;42:929–41.
137. Schild H, Rotzschke O, Kalbacher H, Rammensee HG. Limit of T cell tolerance to self proteins by peptide presentation. Science 1990;247:1587–9.
138. Morgan DJ, Kreuwel HT, Fleck S, Levitsky HI, Pardoll DM, Sherman LA. Activation of low avidity CTL specific for a self epitope results in tumor rejection but not autoimmunity. J Immunol 1998;160:643–51.
139. Sandberg JK, Franksson L, Sundback J, Michaelsson J, Petersson M, Achour A, et al. T cell tolerance based on avidity thresholds rather than complete deletion allows maintenance of maximal repertoire diversity. J Immunol 2000;165:25–33.
140. Poplonski L, Vukusic B, Pawling J, Clapoff S, Roder J, Hozumi N, et al. Tolerance is overcome in beef insulin-transgenic mice by activation of low-affinity autoreactive T cells. Eur J Immunol 1996;26:601–9.
141. McMahan RH, Slansky JE. Mobilizing the low-avidity T cell repertoire to kill tumors. Semin Cancer Biol 2007;17:317–29.
142. Enouz S, Carrie L, Merkler D, Bevan MJ, Zehn D. Autoreactive T cells bypass negative selection and respond to self-antigen stimulation during infection. J Exp Med 2012;209:1769–79.
143. Coulie PG, Brichard V, Van Pel A, Wolfel T, Schneider J, Traversari C, et al. A new gene coding for a differentiation antigen recognized by autologous cytolytic T lymphocytes on HLA-A2 melanomas. J Exp Med 1994;180:35–42.
144. Sensi M, Traversari C, Radrizzani M, Salvi S, Maccalli C, Mortarini R, et al. Cytotoxic T-lymphocyte clones from different patients display limited T-cell-receptor variable-region gene usage in HLA-A2-restricted recognition of the melanoma antigen Melan-A/MART-1. Proc Natl Acad Sci U S A 1995;92:5674–8.
145. Kawakami Y, Eliyahu S, Delgado CH, Robbins PF, Rivoltini L, Topalian SL, et al. Cloning of the gene coding for a shared human melanoma antigen recognized by autologous T cells infiltrating into tumor. Proc Natl Acad Sci U S A 1994;91:3515–9.
146. Kawakami Y, Eliyahu S, Sakaguchi K, Robbins PF, Rivoltini L, Yannelli JR, et al. Identification of the immunodominant peptides of the MART-1 human melanoma antigen recognized by the majority of HLA-A2-restricted tumor infiltrating lymphocytes. J Exp Med 1994;180:347–52.
147. Cox AL, Skipper J, Chen Y, Henderson RA, Darrow TL, Shabanowitz J, et al. Identification of a peptide recognized by five melanoma-specific human cytotoxic T cell lines. Science 1994;264:716–9.
148. Zippelius A, Pittet MJ, Batard P, Rufer N, de Smedt M, Guillaume P, et al. Thymic selection generates a large T cell pool recognizing a self-peptide in humans. J Exp Med 2002;195:485–94.
149. Houghton AN, Taormina MC, Ikeda H, Watanabe T, Oettgen HF, Old LJ. Serological survey of normal humans for natural antibody to cell surface antigens of melanoma. Proc Natl Acad Sci U S A 1980;77:4260–4.
150. Livingston PO, Natoli EJ, Calves MJ, Stockert E, Oettgen HF, Old LJ. Vaccines containing purified GM2 ganglioside elicit GM2 antibodies in melanoma patients. Proc Natl Acad Sci U S A 1987;84:2911–5.
151. Houghton AN, Eisinger M, Albino AP, Cairncross JG, Old LJ. Surface antigens of melanocytes and melanomas. Markers of melanocyte differentiation and melanoma subsets. J Exp Med 1982;156:1755–66.
152. Lewis JJ, Janetzki S, Schaed S, Panageas KS, Wang S, Williams L, et al. Evaluation of CD8(+) T-cell frequencies by the Elispot assay in healthy individuals and in patients with metastatic melanoma immunized with tyrosinase peptide. Int J Cancer 2000;87:391–8.
153. Brichard V, Van Pel A, Wolfel T, Wolfel C, De Plaen E, Lethe B, et al. The tyrosinase gene codes for an antigen recognized by autologous cytolytic T lymphocytes on HLA-A2 melanomas. J Exp Med 1993;178:489–95.
154. Brichard VG, Herman J, Van Pel A, Wildmann C, Gaugler B, Wolfel T, et al. A tyrosinase nonapeptide presented by HLA-B44 is recognized on a human melanoma by autologous cytolytic T lymphocytes. Eur J Immunol 1996;26:224–30.
155. Robbins PF, el-Gamil M, Kawakami Y, Stevens E, Yannelli JR, Rosenberg SA. Recognition of tyrosinase by tumor-infiltrating lymphocytes from a patient responding to immunotherapy. Cancer Res 1994;54:3124–6.
156. Wolfel T, Van Pel A, Brichard V, Schneider J, Seliger B, Meyer zum Buschenfelde KH, et al. Two tyrosinase nonapeptides recognized on HLA-A2 melanomas by autologous cytolytic T lymphocytes. Eur J Immunol 1994;24:759–64.
157. Topalian SL, Rivoltini L, Mancini M, Markus NR, Robbins PF, Kawakami Y, et al. Human CD4+ T cells specifically recognize a shared melanoma-associated antigen encoded by the tyrosinase gene. Proc Natl Acad Sci U S A 1994;91:9461–5.
158. Wang RF, Appella E, Kawakami Y, Kang X, Rosenberg SA. Identification of TRP-2 as a human tumor antigen recognized by cytotoxic T lymphocytes. J Exp Med 1996;184:2207–16.
159. Mattes MJ, Thomson TM, Old LJ, Lloyd KO. A pigmentation-associated, differentiation antigen of human melanoma defined by a precipitating antibody in human serum. Int J Cancer 1983;32:717–21.
160. Trager U, Sierro S, Djordjevic G, Bouzo B, Khandwala S, Meloni A, et al. The immune response to melanoma is limited by thymic selection of self-antigens. PLoS One 2012;7:e35005.
161. Castelli C, Rivoltini L, Andreola G, Carrabba M, Renkvist N, Parmiani G. T-cell recognition of melanoma-associated antigens. J Cell Physiol 2000;182:323–31.
162. Vijayasaradhi S, Bouchard B, Houghton AN. The melanoma antigen gp75 is the human homologue of the mouse b (brown) locus gene product. J Exp Med 1990;171:1375–80.
163. Kawakami Y, Robbins PF, Wang X, Tupesis JP, Parkhurst MR, Kang X, et al. Identification of new melanoma epitopes on melanosomal proteins recognized by tumor infiltrating T lymphocytes restricted by HLA-A1, -A2, and -A3 alleles. J Immunol 1998;161:6985–92.
164. Brandle D, Bilsborough J, Rulicke T, Uyttenhove C, Boon T, Van den Eynde BJ. The shared tumor-specific antigen encoded by mouse gene P1A is a target not only for cytolytic T lymphocytes but also for tumor rejection. Eur J Immunol 1998;28:4010–9.
165. Dunn GP, Koebel CM, Schreiber RD. Interferons, immunity and cancer immunoediting. Nat Rev Immunol 2006;6:836–48.
166. Dunn GP, Old LJ, Schreiber RD. The three Es of cancer immunoediting. Annu Rev Immunol 2004;22:329–60.
167. Hanahan D, Weinberg RA. Hallmarks of cancer: the next generation. Cell 2011;144:646–74.
168. Dunn GP, Bruce AT, Ikeda H, Old LJ, Schreiber RD. Cancer immunoediting: from immunosurveillance to tumor escape. Nat Immunol 2002;3:991–8.
169. Shankaran V, Ikeda H, Bruce AT, White JM, Swanson PE, Old LJ, et al. IFNgamma and lymphocytes prevent primary tumour development and shape tumour immunogenicity. Nature 2001;410:1107–11.
170. Schreiber RD, Old LJ, Smyth MJ. Cancer immunoediting: integrating immunity's roles in cancer suppression and promotion. Science 2011;331:1565–70.
171. Koebel CM, Vermi W, Swann JB, Zerafa N, Rodig SJ, Old LJ, et al. Adaptive immunity maintains occult cancer in an equilibrium state. Nature 2007;450:903–7.
172. Vesely MD, Kershaw MH, Schreiber RD, Smyth MJ. Natural innate and adaptive immunity to cancer. Annu Rev Immunol 2011;29:235–71.
173. Hicklin DJ, Marincola FM, Ferrone S. HLA class I antigen downregulation in human cancers: T-cell immunotherapy revives an old story. Mol Med Today 1999;5:178–86.
174. Nie Y, Yang G, Song Y, Zhao X, So C, Liao J, et al. DNA hypermethylation is a mechanism for loss of expression of the HLA class I genes in human esophageal squamous cell carcinomas. Carcinogenesis 2001;22:1615–23.
175. Fonsatti E, Sigalotti L, Coral S, Colizzi F, Altomonte M, Maio M. Methylation-regulated expression of HLA class I antigens in melanoma. Int J Cancer 2003;105:430–1.
176. Soong TW, Hui KM. Locus-specific transcriptional control of HLA genes. J Immunol 1992;149:2008–20.
177. de Vries TJ, Fourkour A, Wobbes T, Verkroost G, Ruiter DJ, van Muijen GN. Heterogeneous expression of immunotherapy candidate proteins gp100, MART-1, and tyrosinase in human melanoma cell lines and in human melanocytic lesions. Cancer Res 1997;57:3223–9.
178. Sotillo E, Barrett DM, Black KL, Bagashev A, Oldridge D, Wu G, et al. Convergence of Acquired Mutations and Alternative Splicing of CD19 Enables Resistance to CART-19 Immunotherapy. Cancer Discov 2015;5:1282–95.
179. Wolfel T, Hauer M, Schneider J, Serrano M, Wolfel C, Klehmann-Hieb E, et al. A p16INK4a-insensitive CDK4 mutant targeted by cytolytic T lymphocytes in a human melanoma. Science 1995;269:1281–4.
180. Djebali S, Davis CA, Merkel A, Dobin A, Lassmann T, Mortazavi A, et al. Landscape of transcription in human cells. Nature 2012;489:101–8.
181. Faulkner GJ, Kimura Y, Daub CO, Wani S, Plessy C, Irvine KM, et al. The regulated retrotransposon transcriptome of mammalian cells. Nat Genet 2009;41:563–71.
182. Speek M. Antisense promoter of human L1 retrotransposon drives transcription of adjacent cellular genes. Mol Cell Biol 2001;21:1973–85.
183. Goke J, Lu X, Chan YS, Ng HH, Ly LH, Sachs F, et al. Dynamic transcription of distinct classes of endogenous retroviral elements marks specific populations of early human embryonic cells. Cell Stem Cell 2015;16:135–41.
184. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol 2013;14:R36.
185. Xu H, Handoko L, Wei X, Ye C, Sheng J, Wei CL, et al. A signal-noise model for significance analysis of ChIP-seq with negative control. Bioinformatics 2010;26:1199–204.
186. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 2010;28:511–5.
187. Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 2016;34:525–7.
188. Katz Y, Wang ET, Airoldi EM, Burge CB. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat Methods 2010;7:1009–15.
189. Ma ZQ, Dasari S, Chambers MC, Litton MD, Sobecki SM, Zimmerman LJ, et al. IDPicker 2.0: Improved protein assembly with high discrimination peptide identification filtering. J Proteome Res 2009;8:3872–81.
190. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res 2015;43:e47.
191. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008;26:1367–72.
192. UniProt Consortium. UniProt: a hub for protein information. Nucleic Acids Res 2015;43:D204–12.
193. Cox J, Hein MY, Luber CA, Paron I, Nagaraj N, Mann M. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol Cell Proteomics 2014;13:2513–26.
194. Tyanova S, Temu T, Sinitcyn P, Carlson A, Hein MY, Geiger T, et al. The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat Methods 2016;13:731–40.
195. Nielsen M, Lundegaard C, Blicher T, Lamberth K, Harndahl M, Justesen S, et al. NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence. PLoS One 2007;2:e796.
196. Szolek A, Schubert B, Mohr C, Sturm M, Feldhahn M, Kohlbacher O. OptiType: precision HLA typing from next-generation sequencing data. Bioinformatics 2014;30:3310–6.
197. Gautier L, Cope L, Bolstad BM, Irizarry RA. affy–analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 2004;20:307–15.
198. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature 2014;513:202–9.
199. Diaz E, Machutta CA, Chen S, Jiang Y, Nixon C, Hofmann G, et al. Development and validation of reagents and assays for EZH2 peptide and nucleosome high-throughput screens. J Biomol Screen 2012;17:1279–92.
200. Liao Y, Smyth GK, Shi W. featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 2014;30:923–30.

Supplementary data