Background: Chromosome 13q22.1 has previously been identified to be a susceptibility locus for pancreatic cancer in Chinese and European ancestry populations. This pleiotropy study aimed to identify novel variants in this region associated with susceptibility to different types of human cancer.

Method: To fine-map the 13q22.1 region, imputation analyses were conducted on the basis of the GWAS data of 2,031 esophageal squamous cell cancer (ESCC) cases and 2,044 controls and 5,930 SNPs (625 directly genotyped and 5,305 well imputed). Promising associations were then examined in ESCC (4,146 cases and 4,135 controls), gastric cardia cancer (1,894 cases and 1,912 controls), noncardia gastric cancer (1,007 cases and 2,243 controls), and colorectal cancer (1,111 cases and 1,138 controls). Fine mapping and biochemical analyses were further performed to elucidate the potential function of novel variants.

Results: Two novel variants, rs1924966 and rs115797771, were associated with ESCC risk (P = 1.37 × 10⁻¹⁰ and P = 2.32 × 10⁻¹⁰, respectively) and were also associated with risk of gastric cardia cancer (P = 0.0003 and P = 0.0018, respectively) but not with noncardia gastric cancer or colorectal cancer. Fine-mapping revealed another SNP, rs58090485, in strong linkage disequilibrium with rs115797771 (r² = 0.94). Functional analysis showed that this SNP disturbs the binding of a transcriptional repressor to the promoter region of KLF5, which might result in high constitutional expression of KLF5.

Conclusions: These results demonstrate that variants mapped on 13q22.1 are associated with the risk of different types of cancer.

Impact: 13q22.1 might serve as a biomarker for the identification of individuals at risk for ESCC and gastric cardia cancer. Cancer Epidemiol Biomarkers Prev; 24(11); 1774–80. ©2015 AACR.

Genome-wide association studies (GWAS) have been shown to be a powerful and successful tool for identifying genetic variants associated with susceptibility to human diseases or phenotypes. However, the majority of newly identified risk alleles confer a relatively small risk, ranging from 1.1 to 1.5, and the estimates of disease risk based on these established associations are not a substantial improvement over previous risk prediction models (1–5). Taking esophageal squamous cell carcinoma (ESCC) as an example, four GWAS have identified 26 novel variants mapped on 14 different chromosomal regions, showing effects on the risk of ESCC in Chinese populations not only of the genetic variants themselves but also of gene-drinking interactions (6–9). The area under the curve (AUC) for a risk model using GWAS-identified SNPs and four nongenetic factors (sex, age, smoking status, and drinking status) was 70.9%, an improvement of only 7.0% over a model using the four nongenetic factors alone (10). It is believed that “missing heritability” exists and that more genetic variants need to be identified through other new strategies.

Different types of human cancer might share genetic susceptibility factors. For example, on chromosome 8q24, a “gene desert” region, more than 20 loci have been identified to be associated with the risk of multiple cancers (11–16). Functional studies have clarified that these variants might affect an enhancer's effect on long-range regulation of MYC expression (17). Therefore, it seems to be a complementary strategy to identify novel variants for a specific cancer through fine-mapping of susceptibility loci that have been identified to be associated with other cancers.

Chromosome 13q22.1, in which KLF5 and KLF12 are located, has been shown to be associated with susceptibility to pancreatic cancer in both Chinese and European ancestry populations, and has also been reported to be associated with prostate cancer risk in a Japanese population (18–20). KLF5 is an important gene that is aberrantly expressed in multiple cancers of the digestive tract, such as ESCC, gastric cancer, and colon cancer (21, 22). KLF12 has also been shown to be amplified in esophageal adenocarcinoma and gastric cancer (23, 24) and might play an oncogenic role in the progression of poorly differentiated gastric cancer. Five variants in this region have been shown to be associated with susceptibility to pancreatic cancer or prostate cancer. rs4885093 and rs9573163 have been reported to be associated with the risk of pancreatic cancer in the Chinese population, and two other susceptibility variants for pancreatic cancer, rs9543325 and rs9564966, were identified in populations of European ancestry (18, 19). In the Japanese population, rs9600079 has been found to be a prostate cancer susceptibility variant (20). To identify novel susceptibility variants at 13q22.1, we fine-mapped the region and examined the associations between five tag SNPs and the risk of four different cancers of the digestive tract: ESCC, gastric cardia cancer, noncardia gastric cancer, and colorectal cancer. A series of biochemical assays was then conducted to elucidate the potential function of the newly identified variants.

Study subjects and genotyping analysis

A two-stage study was conducted to examine the associations between the variants at 13q22.1 and risk of four different cancers of the digestive tract: ESCC, gastric cardia cancer, noncardia gastric cancer, and colorectal cancer. In the imputation stage, we conducted imputation analyses based on the GWAS data of 2,031 ESCC cases (response rate, 95%) and 2,044 controls (8) to identify potential susceptibility variants. All these samples had been genotyped using the Affymetrix GeneChip Human Mapping 6.0 set in our previous GWAS. Five tag SNPs (rs1924966, rs1924956, rs115797771, rs9543527, and rs970040) associated with ESCC risk (P < 0.001) were then tested in three independent case–control sets: 2,248 cases (response rate, 94%) and 2,238 controls recruited in Beijing (Replication I); 935 cases (response rate, 91%) and 959 controls recruited in Hebei province (Replication II); and 963 cases (response rate, 92%) and 938 controls collected in Hubei province (Replication III). To test whether these SNPs were also associated with risk of other cancers of the digestive tract, we investigated 1,894 cases with gastric cardia cancer (response rate, 93%) and 1,912 controls, 1,007 cases with noncardia gastric cancer (response rate, 93%) and 2,243 controls, and 1,111 cases with colorectal cancer (response rate, 95%) and 1,138 controls, all recruited in Beijing. The cases were recruited from the Han Chinese population through collaboration with multiple hospitals in Beijing, Hebei province, and Hubei province. All the controls were cancer-free individuals selected from a community nutritional survey in the same region during the same period in which the cases were collected. All the SNPs in the replication phase were genotyped using the TaqMan assay platform (ABI 7900HT system, Applied Biosystems).

Several genotyping quality controls were implemented in the replication stage: (i) case and control samples were mixed in the plates, and the persons who performed the genotyping assays were not aware of case or control status; (ii) positive and negative (no DNA) samples were included on every 384-well assay plate; and (iii) direct sequencing of PCR products was used to replicate sets of 50 randomly selected, TaqMan-genotyped samples for rs1924966, rs115797771, and rs58090485, and the concordance between the two methods was 100%.

Demographic characteristics including age and sex were obtained from the medical records of these individuals. Diagnosis of pancreatic ductal adenocarcinoma, ESCC, gastric cardia cancer, noncardia gastric cancer or colorectal cancer was confirmed histopathologically or cytologically by at least two local pathologists according to the World Health Organization classification. All the cancer cases had no previous diagnosis of another type of cancer. All the controls were selected on the basis of physical examination, health-screening program, or community cancer screening program. Informed consent was obtained from each subject at recruitment and this study was approved by the Institutional Review Board of the Chinese Academy of Medical Sciences Cancer Institute.

Imputation and linkage disequilibrium pattern analysis

To broaden the spectrum of variants tested for association with risk of ESCC in the 13q22.1 region, we used MaCH-Admix software to impute ungenotyped SNPs in a 2 Mb region centered on rs4885093, with the linkage disequilibrium (LD) and haplotype information from the 1000 Genomes Project November 2010 ASN (Asian) samples as the reference (25). LD structures and haplotype block plots were generated using the snp.plotter package in R.
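The pairwise LD statistic r² reported throughout this article has a simple closed form for two biallelic loci. The following is a minimal sketch of that standard formula, not the computation performed by MaCH-Admix or snp.plotter; the function name and the example frequencies are hypothetical and chosen only for illustration.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies (not taken from the study data).
# The two minor alleles always travel on the same haplotype: perfect LD.
r2_perfect = ld_r2(p_ab=0.06, p_a=0.06, p_b=0.06)
# The haplotype frequency equals the product of allele frequencies: no LD.
r2_none = ld_r2(p_ab=0.06 * 0.40, p_a=0.06, p_b=0.40)
```

In practice, haplotype frequencies are not observed directly and have to be estimated from genotypes or taken from a phased reference panel, which is what dedicated LD software does.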

Quantitative real-time PCR

Total RNA was isolated from surgically removed normal esophageal tissues adjacent to tumors in 67 patients with ESCC and then converted to cDNA using oligo (dT)15 primer and SuperScript II (Invitrogen). KLF5 RNA was measured in triplicate by real-time quantitative reverse transcription-PCR on an ABI 7900HT Real-Time PCR system using the SYBR-Green method. Individual KLF5 RNA expression was determined relative to that of GAPDH using a modification of the method described by Lehmann and Kreipe (26). The primer sequences used for detecting the different RNAs are available upon request.
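Relative quantification against a reference gene is commonly computed as 2 to the power of minus ΔCt. The sketch below shows that common form only; the text does not describe the authors' modification, so this should not be read as their exact calculation. It assumes roughly 100% amplification efficiency for both amplicons, and the Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target, ct_reference):
    """Return target expression relative to the reference gene as 2^-(delta Ct).

    ct_target, ct_reference: replicate Ct values from the same cDNA sample.
    Assumes both amplicons amplify with about 100% efficiency.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate Ct values for one tissue sample.
klf5_ct = [24.1, 24.0, 24.2]
gapdh_ct = [22.0, 22.1, 21.9]
klf5_relative = relative_expression(klf5_ct, gapdh_ct)
```

A higher Ct for the target than for the reference yields a value below 1, which is the scale on which the KLF5 levels in the Results are reported.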

Luciferase assay

Three SNPs, rs58090485 A/−, rs3812852 A/G, and rs141391427 C/−, were located in the 5′-flanking region of KLF5. A 2,053 bp DNA fragment containing the rs58090485 A allele, the rs3812852 A allele, and the rs141391427 C allele was generated by PCR and subcloned into the pGL4.10[luc2] vector (Promega). The resultant plasmid was designated p-[AAC]. Because of the perfect LD among rs58090485, rs3812852, and rs141391427, the p-[AAC] construct was then site-specifically mutated to create four different constructs, p-[−AC], p-[AGC], p-[AA−], and p-[−G−], each containing the other allele of the corresponding SNP or SNPs. All the constructs were restriction mapped and sequenced to confirm their authenticity.

Three human ESCC cell lines, KYSE-30, KYSE-150, and KYSE-510, were gifts from Y. Shimada (First Department of Surgery, Faculty of Medicine, Kyoto University, Japan) and were used for luciferase assays. All cell lines used in this study were regularly authenticated by morphologic observation and tested for absence of mycoplasma contamination (MycoAlert, Lonza Rockland). All cell lines were maintained in RPMI1640 medium with 10% FBS at 37°C in 5% CO2. We seeded 5 × 10⁵ cells per well in 48-well plates and transfected them with the empty pGL4.10[luc2] vector (a promoterless control) or the p-[AAC], p-[−AC], p-[AGC], p-[AA−], or p-[−G−] construct. The pRL-SV40 plasmid (Promega) was cotransfected as a normalizing control. All transfections were carried out in triplicate and in three different ESCC cell lines. After 48 hours, cells were collected and analyzed for luciferase activity with the Dual-Luciferase Reporter Assay System (Promega).
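In a dual-luciferase design of this kind, the firefly signal from each well is usually divided by the cotransfected Renilla signal before constructs are compared. The sketch below illustrates that normalization under the assumption that activity is expressed as a fold change over the promoterless vector; the function names and luminescence readings are hypothetical and are not the study's data.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Return per-well firefly/Renilla ratios to correct for transfection efficiency."""
    if len(firefly) != len(renilla):
        raise ValueError("each well needs one firefly and one Renilla reading")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_over_control(construct_ratios, control_ratios):
    """Return mean normalized activity of a construct relative to the control vector."""
    return mean(construct_ratios) / mean(control_ratios)


# Hypothetical triplicate readings (arbitrary luminescence units).
empty_vector = normalized_activity([100.0, 200.0, 300.0], [100.0, 100.0, 100.0])
construct = normalized_activity([200.0, 400.0, 600.0], [100.0, 100.0, 100.0])
fold = fold_over_control(construct, empty_vector)
```

Normalizing within each well before averaging keeps a poorly transfected well from dragging down a construct's apparent promoter activity.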

Electrophoretic mobility shift assay

Synthetic double-stranded and 3′ biotin–labeled oligonucleotides corresponding to the rs58090485[A] and rs58090485[−] sequences and nuclear extracts obtained from KYSE-150 cells were incubated for 20 minutes using the Light Shift Chemiluminescent EMSA kit (Pierce). Reaction mixtures were separated by 8% PAGE, and products were detected by stabilized streptavidin–horseradish peroxidase conjugate (Pierce). For competition assays, unlabeled oligonucleotides at 10- or 100-fold molar excess were added to the reaction mixture before addition of the biotin-labeled probe.

Statistical analysis

Association analyses between 5,930 (625 directly genotyped and 5,305 well-imputed) SNPs and risk of ESCC in the imputation stage were performed using unconditional logistic regression with age, sex, smoking status, drinking status, and the first three principal components from EIGENSTRAT, which were calculated in the previous ESCC GWAS (8), as covariates. Association analyses with risk of different cancers of the digestive tract were adjusted for age, sex, smoking status, and drinking status but not for the principal components, because population stratification could not be assessed in the replication samples given the small number of SNPs typed. For colorectal cancer, we adjusted only for age and sex because data on smoking status and drinking status were not available. The ORs were calculated for the minor allele of each SNP. Conditional association analyses were performed to identify independent signals. For each locus, the associations between SNPs and risk of ESCC were conditioned on the most significant SNP. Among the five loci, we then conducted stepwise conditional analyses to examine the dependence among them.
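The additive and conditional models described above can be written compactly as follows. The notation is ours rather than the authors': Y_i is case status, G_i is the minor-allele dosage of the tested SNP, T_i is the dosage of the most significant SNP at the locus, and C_i collects the covariates. The Wald form of the confidence interval is an assumption, since the text does not state how the intervals were obtained.

```latex
\begin{align}
\operatorname{logit}\Pr(Y_i = 1) &= \beta_0 + \beta_G G_i + \boldsymbol{\gamma}^{\top}\mathbf{C}_i
  && \text{(additive model)} \\
\mathrm{OR} &= \exp(\hat{\beta}_G), \qquad
  95\%\ \mathrm{CI} = \exp\!\bigl(\hat{\beta}_G \pm 1.96\,\mathrm{SE}(\hat{\beta}_G)\bigr) \\
\operatorname{logit}\Pr(Y_i = 1) &= \beta_0 + \beta_G G_i + \beta_T T_i + \boldsymbol{\gamma}^{\top}\mathbf{C}_i
  && \text{(conditional on the top SNP)}
\end{align}
```

If the signal at the tested SNP is explained by LD with the top SNP, the estimate of β_G in the conditional model moves toward zero and its P value rises; an independent signal remains significant.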

In the imputation stage, 109 SNPs with P < 0.001 were selected for further replications. The Student t test was used to examine the differences in luciferase reporter gene expression, and the Mann–Whitney U test was used to assess differences in KLF5 transcript abundance with different genotypes. All statistical tests were carried out in a two-sided manner and a P value less than 0.05 was considered to be statistically significant and was reported without corrections for multiple testing.
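For readers unfamiliar with the Mann–Whitney U test used for the expression comparison, the statistic can be computed from pooled ranks. This is a minimal sketch using the large-sample normal approximation without tie or continuity corrections; it is not the software the authors used (which the text does not name), and the example values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic for sample x, two-sided P value by normal approximation)."""
    # Pool the observations, remembering which sample (0 = x, 1 = y) each came from.
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_x - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_value


# Hypothetical expression values for two genotype groups.
u_low, p_low = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
u_high, p_high = mann_whitney_u([4.0, 5.0, 6.0], [1.0, 2.0, 3.0])
```

With groups as small as these, an exact P value would be preferable to the normal approximation; the approximation is shown only because it needs nothing beyond the standard library.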

Characteristics of study subjects

This study consisted of four ESCC case–control sets including a total of 6,177 cases with ESCC and 6,179 controls recruited from three geographical regions. The select characteristics of these cases and controls such as sex, age, smoking status, and drinking status are shown in Supplementary Table S1. This study also consisted of a gastric cardia cancer case–control set (1,894 cases and 1,912 controls), a noncardia gastric cancer case–control set (1,007 cases and 2,243 controls), and a colorectal cancer case–control set (1,111 cases and 1,138 controls). The distributions of selected characteristics of these study subjects are shown in Supplementary Table S2.

Novel variants associated with risk of ESCC

In our previous GWAS on pancreatic cancer, the SNP rs4885093 was the most significant marker on 13q22.1 (18); however, this SNP was not associated with the risk of ESCC in the current study [OR, 1.05; 95% confidence interval (CI), 0.96–1.15; P = 0.2620]. To identify whether other variants in this chromosomal region are potentially associated with the risk of ESCC, we imputed a 2 Mb region centered on rs4885093 based on the previous GWAS data of 2,031 cases with ESCC and 2,044 controls. After imputation, we were able to test 5,930 (625 directly genotyped and 5,305 well-imputed) SNPs (Supplementary Fig. S1). Association analyses showed that 109 SNPs were potentially associated with risk of ESCC (all P < 0.001; Supplementary Table S3). These SNPs were scattered across five different loci; the most significant SNPs at these loci were rs1924966, rs1924956, rs115797771, rs9543527, and rs970040. These five SNPs were the tag SNPs on 13q22.1, and the other SNPs at each locus were in strong LD with the corresponding tag SNP; the r² values are shown in Supplementary Table S3. We then performed conditional association analyses to assess the dependence of variants in this region. When conditioning on the most significant tag SNP of each locus in turn, the association P values for the other SNPs in the same locus increased by at least five orders of magnitude, but the associations of SNPs in the other four loci remained significant. These results suggest that the five loci are likely to be independent signals, each marked by its most significant tag SNP (Supplementary Fig. S2).

We then performed replication of these five potentially associated SNPs, i.e., rs1924966, rs1924956, rs115797771, rs9543527, and rs970040, in three additional ESCC case–control sets. We found that only two SNPs, rs1924966 and rs115797771, were significantly associated with risk in the fast-track Replication I group (P = 0.0009 and P = 0.0005, respectively) while the other three were not (Supplementary Table S4). We then further examined these two significant SNPs in the Replication II and Replication III groups and the results showed a consistent significant association of them with risk of ESCC. The P values for rs1924966 and rs115797771 in the combined sample were 1.37 × 10⁻¹⁰ and 2.32 × 10⁻¹⁰, and the minor alleles for both SNPs showed a protective effect with odds ratios (OR) of 0.84 (95% CI, 0.80–0.89) and 0.69 (95% CI, 0.62–0.78), respectively (Table 1). All Hardy–Weinberg equilibrium P values were >0.05. Stratified analyses showed that the risks associated with rs1924966, rs115797771, and rs58090485 were not significantly different among subgroups according to smoking status or drinking status (Supplementary Table S5).
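The Hardy–Weinberg check mentioned above is typically a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts. The sketch below shows that standard test; the text does not say whether the authors used this or an exact test, and the genotype counts are hypothetical because per-genotype counts are not given in the text.

```python
import math


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int):
    """Return (chi-square statistic, P value) for Hardy-Weinberg equilibrium, 1 df."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele a
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts in controls.
chi2_ok, p_ok = hwe_chi2(36, 48, 16)    # matches HWE proportions for p = 0.6
chi2_bad, p_bad = hwe_chi2(50, 20, 30)  # strong heterozygote deficit
```

A P value above 0.05 in controls, as reported here for all SNPs, gives no evidence of genotyping error or population substructure at that marker.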

Table 1.

Significant association of three SNPs with esophageal cancer risk in the imputation, replication, and combined samples

| SNP (alleles) | Position | Gene | Location | Phase | MAF, cases | MAF, controls | OR (95% CI) | P |
|---|---|---|---|---|---|---|---|---|
| rs1924966 (A > C) | 73007053 | KLF5 | Upstream | Imputation | 0.35 | 0.40 | 0.82 (0.75–0.90) | 1.69 × 10⁻⁵ |
| | | | | Replication I | 0.36 | 0.40 | 0.86 (0.79–0.94) | 0.0009 |
| | | | | Replication II | 0.34 | 0.39 | 0.83 (0.73–0.96) | 0.0091 |
| | | | | Replication III | 0.36 | 0.39 | 0.86 (0.75–0.98) | 0.0255 |
| | | | | All replication | 0.36 | 0.40 | 0.85 (0.80–0.91) | 1.30 × 10⁻⁶ |
| | | | | Combined | 0.35 | 0.40 | 0.84 (0.80–0.89) | 1.37 × 10⁻¹⁰ |
| rs115797771 (A > C) | 73638643 | KLF5 | Intron | Imputation | 0.04 | 0.06 | 0.66 (0.54–0.82) | 9.17 × 10⁻⁵ |
| | | | | Replication I | 0.05 | 0.07 | 0.72 (0.60–0.86) | 0.0005 |
| | | | | Replication II | 0.04 | 0.07 | 0.64 (0.48–0.87) | 0.0040 |
| | | | | Replication III | 0.05 | 0.06 | 0.70 (0.52–0.93) | 0.0141 |
| | | | | All replication | 0.05 | 0.07 | 0.69 (0.60–0.80) | 2.17 × 10⁻⁷ |
| | | | | Combined | 0.05 | 0.06 | 0.69 (0.62–0.78) | 2.32 × 10⁻¹⁰ |
| rs58090485 (A > −) | 73631518 | KLF5 | Upstream | Imputation | 0.04 | 0.06 | 0.67 (0.55–0.83) | 0.0001 |
| | | | | Replication I | 0.05 | 0.07 | 0.71 (0.59–0.85) | 0.0002 |
| | | | | Replication II | 0.04 | 0.07 | 0.64 (0.48–0.86) | 0.0030 |
| | | | | Replication III | 0.05 | 0.06 | 0.68 (0.51–0.91) | 0.0103 |
| | | | | All replication | 0.05 | 0.07 | 0.69 (0.60–0.79) | 7.27 × 10⁻⁸ |
| | | | | Combined | 0.05 | 0.07 | 0.69 (0.62–0.77) | 1.23 × 10⁻¹⁰ |

NOTE: P values are two sided and were calculated by an additive model in logistic regression analysis adjusted for sex, age, smoking status, and drinking status.

Abbreviation: MAF, minor allele frequency.
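As a rough sanity check on Table 1, a crude allelic OR can be recovered from the minor allele frequencies and sample sizes. The sketch below uses the combined rs1924966 row (MAF 0.35 in 6,177 cases and 0.40 in 6,179 controls). It is an unadjusted calculation from rounded frequencies with a Wald interval, so it is expected to differ from the covariate-adjusted OR of 0.84 in the table; the function name is hypothetical.

```python
import math


def allelic_or_from_maf(maf_cases, maf_controls, n_cases, n_controls):
    """Return (crude allelic OR, lower 95% bound, upper 95% bound).

    Allele counts are reconstructed as 2 * N * frequency, so they are approximate
    whenever the reported frequencies are rounded.
    """
    a = 2 * n_cases * maf_cases              # minor alleles in cases
    b = 2 * n_cases * (1 - maf_cases)        # major alleles in cases
    c = 2 * n_controls * maf_controls        # minor alleles in controls
    d = 2 * n_controls * (1 - maf_controls)  # major alleles in controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Combined-sample row for rs1924966 from Table 1.
crude_or, crude_lower, crude_upper = allelic_or_from_maf(0.35, 0.40, 6177, 6179)
```

The crude estimate comes out at about 0.81, in the same protective direction as the adjusted estimate; the gap reflects both rounding of the frequencies and adjustment for age, sex, smoking status, and drinking status.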

Association with risk of three other types of cancer in the digestive tract

We next examined whether these two SNPs were also associated with the risk of gastric cardia, noncardia gastric, and colorectal cancer. The minor alleles of rs1924966 A>C (OR, 0.84; 95% CI, 0.77–0.93; P = 0.0003) and rs115797771 A>C (OR, 0.73; 95% CI, 0.60–0.89; P = 0.0018) showed significant protective effects for gastric cardia cancer, which were similar to those observed for ESCC. However, we did not find any association between these two SNPs and the risk of noncardia gastric and colorectal cancer (all P > 0.05; Supplementary Table S6; Supplementary Fig. S3).

Functional analysis of significant variants

rs1924966 is located 622 kb away from the transcriptional start site of KLF5, and rs115797771 is located in an intron of KLF5. To study the regulatory effects of these two SNPs on KLF5 gene expression, we first examined the association between the genotypes of these two SNPs and the KLF5 RNA level in esophageal tissue samples. Although there was no significant association between the rs1924966 genotype and KLF5 RNA levels (Supplementary Fig. S4), we did find a significant association between KLF5 RNA levels and the rs115797771 genotype. Subjects with the rs115797771 AA genotype had significantly lower KLF5 RNA levels (mean ± SE) than those with the rs115797771 AC genotype [0.2003 ± 0.0156 (n = 50) vs. 0.2722 ± 0.0239 (n = 17), P = 0.0198; Supplementary Fig. S4].

Although rs115797771 is the most significant SNP in this high-LD region, it might not be the functional variant because it is an intronic SNP. We therefore used HaploReg v2 to perform functional annotation of the 35 SNPs that were identified in the imputation stage and were in high LD with rs115797771 (Supplementary Table S7). Three SNPs, rs58090485, rs3812852, and rs141391427, which are in perfect LD (r² = 1.00) with each other and in high LD (r² = 0.94) with rs115797771 (Fig. 1), all showed significant associations with risk of ESCC (OR, 0.69; 95% CI, 0.62–0.77; P = 1.23 × 10⁻¹⁰) in the combined samples and with risk of gastric cardia cancer (OR, 0.73; 95% CI, 0.60–0.88; P = 0.0012; Table 1 and Supplementary Table S6). Because all three SNPs are located in the promoter region of KLF5, and previous studies have shown that many promoter histone marks (H3K4me3, H3K9ac, and H3K27ac) are located in this DNaseI-hypersensitivity region in multiple cell types, we first examined whether these SNPs affect KLF5 promoter activity using a set of luciferase reporter gene assays. We found that the p-[AAC] construct, which contains the KLF5 promoter with all three major alleles, drove significantly lower reporter gene expression than the p-[−G−] construct, which contains the KLF5 promoter with the minor alleles of the three SNPs (P < 0.0001). To identify the functional SNP, we then compared three constructs, p-[−AC], p-[AGC], and p-[AA−], each of which contains the minor allele of only one of rs58090485, rs3812852, and rs141391427. The p-[−AC] construct, containing the rs58090485 [−] allele, drove significantly higher reporter gene expression than the p-[AAC] construct, containing the rs58090485 [A] allele (P < 0.0001). However, the p-[AGC] and p-[AA−] constructs, containing rs3812852 [G] and rs141391427 [−], respectively, did not drive significantly different reporter gene expression compared with the p-[AAC] construct (Fig. 2A). These results were consistent across the three ESCC cell lines and suggest that rs58090485 may affect the regulation of KLF5 expression and needs to be further investigated.

Figure 1.

Regional plot of the association results between variants in the KLF5 region and risk of ESCC. The −log₁₀ P values of 110 SNPs were assessed using an additive model in unconditional logistic regression analysis with adjustment for age, sex, smoking status, and drinking status in 2,031 cases with ESCC and 2,044 controls in the discovery stage. The 14 SNPs with P < 0.001 are above the black horizontal line and the smallest P value of rs115797771 is 9.17 × 10⁻⁵. The LD plot was based on the genotypes after the imputation in the discovery stage.

Figure 2.

Functional characterization of rs58090485 A/−, rs3812852 A/G, and rs141391427 C/− haplotypes in the promoter of KLF5. A, reporter gene assays with five constructs containing all three major alleles (p-[AAC]), all three minor alleles (p-[−G−]), and only one minor allele (p-[−AC], p-[AGC], and p-[AA−]) of the SNPs, rs58090485 A>−, rs3812852 A>G, rs141391427 C>−, respectively, in the KLF5 promoter in KYSE-30, KYSE-150, and KYSE-510 cells. All constructs were cotransfected with pRL-SV40 to standardize transfection efficiency. Luciferase levels of pGL4.10[luc2] and pRL-SV40 were determined in triplicate. Data shown are the mean ± SD from three independent transfection experiments, each performed in triplicate. The rs58090485A-containing KLF5 promoter drove significantly lower reporter gene expression than the rs58090485 [−]-containing KLF5 promoter (P < 0.0001). B, EMSA with biotin-labeled oligonucleotides containing the rs58090485 [A] or rs58090485 [−] allele and nuclear extracts from KYSE-150 cells. Lanes 1 and 6 show the mobilities of the labeled oligonucleotides without nuclear extracts; lanes 2 and 7 show the mobilities of the labeled oligonucleotides with nuclear extracts in the absence of competitor; lanes 3 and 8, 4 and 9, and 5 and 10 show the mobilities of the labeled oligonucleotides with nuclear extracts in the presence of unlabeled rs58090485 [A] or rs58090485 [−] competitors. The arrow indicates an additional DNA–protein complex for the rs58090485 [A] allele.


Regulatory sequences with discrete alleles might influence gene expression upon binding of transcriptional activators or inhibitors that instruct their regulatory control. Therefore, we then examined whether the rs58090485A deletion changes the binding pattern of nuclear proteins using EMSA. As a result, we found that the binding pattern for the rs58090485 [A] allele differed from that for the rs58090485 [−] allele; one DNA–protein complex disappeared when the rs58090485 [−] probe was incubated with nuclear proteins from the KYSE-150 cell line (Fig. 2B, lane 7) compared with the rs58090485 [A] probe under the same experimental conditions (Fig. 2B, lane 2). Competition assays showed that the addition of 100-fold excess unlabeled rs58090485 [A] probe (Fig. 2B, lane 4) but not rs58090485 [−] probe (Fig. 2B, lane 5) to the reaction mixture markedly eliminated the DNA–protein complex formed by the interaction between the rs58090485[A] allele and nuclear proteins, indicating that binding is sequence-specific.

In this study, we performed fine-mapping of a pancreatic cancer susceptibility region at 13q22.1 and conducted the association analyses between variants located in this region and the risk of four different cancers in the digestive tract. Two novel SNPs, rs1924966 and rs115797771, were found to be associated with the risk of ESCC and gastric cardia cancer, but not noncardia gastric cancer or colorectal cancer. Functional analysis suggests that rs58090485 located in the KLF5 promoter region, which is in high LD (r² = 0.94) with rs115797771, had an impact on the regulation of KLF5 expression.

Previous GWAS identified numerous cancer susceptibility loci, but only a small proportion of the heritability can be explained by these discovered risk loci (27). Patterns of pleiotropic association have shown that key loci or shared pathways might affect multiple cancers, such as the 8q24 region (11–16) and the telomerase reverse transcriptase (TERT) gene region on chromosome 5 (14, 19, 28–30). Our previous GWAS identified only one locus on 13q22.1, tagged by rs4885093, that was associated with the risk of pancreatic cancer. In this study, we did not find any association between rs4885093 and the risk of ESCC; however, we found two novel SNPs, rs1924966 and rs115797771, that were significantly associated with the risk of ESCC and gastric cardia cancer but not with the risk of noncardia gastric cancer or colorectal cancer. Epidemiologic studies suggest shared geographic distributions and environmental risk factors for ESCC and gastric cardia cancer, but not for noncardia gastric cancer and colorectal cancer, in China (31, 32). In addition, a GWAS on ESCC and gastric cancer has shown a shared genetic risk factor, PLCE1 variation, for ESCC and gastric cardia cancer but not for noncardia gastric cancer (6). These results suggest that some types of cancer share environmental and genetic susceptibility factors. The analytical strategy used in the current study may not only help to identify new genetic risk loci but also elucidate common etiologies between different cancers.

The most proximate gene on 13q22.1 is KLF5, which has complex roles in digestive tract carcinogenesis. In colon cancer, KLF5 acts as an important mediator of KRAS during intestinal carcinogenesis process (33–36). A similar oncogenic function of KLF5 has also been observed in the severity of premalignant lesions in human gastric carcinogenesis induced by H. pylori (37–40). Considering ESCC, however, KLF5 might play a tumor suppressor role. Downregulation of KLF5 might reduce its limitation on NOTCH1 activity and is sufficient on its own to transform primary human keratinocytes to form invasive tumors in the context of P53 mutation or loss of function (41). KLF5 might also activate the JNK pathway and lead to apoptosis and reduced cell survival (42). Our findings in the present study are consistent with the postulation that KLF5 acts as a tumor suppressor in ESCC carcinogenesis. Compared with individuals with the rs115797771 AA genotype, individuals with at least one protective C allele (AC or CC genotype) of rs115797771 showed higher expression of KLF5, suggesting that upregulation of KLF5 could protect individuals from ESCC.

The current study has several strengths. First, 13q22.1 was the only region identified in pancreatic cancer in both Chinese and European ancestry populations. The current study is the first association study to perform fine-mapping of this region and to identify novel risk variants for different types of cancer in the digestive tract. Previous GWAS used very stringent P values as the statistical significance level (P < 5 × 10⁻⁸ in most studies) to identify susceptibility loci associated with the risk of disease. Although this strategy might decrease false-positive findings, it might miss some true susceptibility loci (43). By using our strategy, three loci on 13q22.1 were found to be associated with the risk of different cancers, and two novel loci were associated with ESCC and gastric cardia cancer in addition to pancreatic cancer. Another major strength of our study is the two-phase design and case–control sets for SNP imputation and replication that were recruited from three different geographical regions, which would largely reduce false positive findings. We also characterized the function of rs58090485 SNP, making the association of this SNP with the risk of ESCC biologically plausible.

Despite these strengths, we acknowledge several limitations of this study. First, because GWAS data were available only for pancreatic cancer and ESCC, the imputation stage explored associations between SNPs in the 13q22.1 region and the risk of ESCC only. Although we analyzed the two novel variants in gastric cardia cancer, noncardia gastric cancer, and colorectal cancer, it would be interesting to analyze other types of cancer in the digestive tract or other systems. Second, in the imputation stage, the associations between 5,930 SNPs and the risk of ESCC were analyzed, and 109 SNPs had P values less than 0.001. After conditional and LD analyses, we selected five tag SNPs for further replication and functional analyses, but correction for multiple testing was not performed at this stage, and we cannot rule out the possibility of false-positive results. Further investigations, including resequencing the 13q22.1 region in Asian or Caucasian populations, are still warranted. Third, in vitro functional analysis and expression data can only indirectly support this association. It would be better to include the expression levels of KLF5 in individuals with the rs115797771 CC genotype; however, we did not find any individual with this genotype in our cohort. In addition, although functional analyses showed that rs58090485 A/− disrupts a transcriptional inhibitor binding site, the nuclear protein that binds to this site remains to be identified. Finally, because KLF5 expression data were lacking for controls, the relationships among rs115797771, KLF5 expression, and ESCC risk are still unknown and need to be further explored.
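To put the multiple-testing limitation in concrete terms, the short sketch below works through the arithmetic implied by the numbers reported above (5,930 SNPs tested, 109 with P < 0.001). It is an illustration only and was not part of the study analysis. The family-wise error rate of 0.05 and the treatment of the 5,930 tests as independent are assumptions made here for simplicity; because neighboring SNPs are in LD, the effective number of independent tests is smaller, so the Bonferroni cutoff shown is conservative. The function names are hypothetical.

```python
# Illustrative arithmetic only; not the authors' analysis.
# Assumption: the 5,930 tests are treated as independent, which LD between
# neighboring SNPs makes conservative.
N_SNPS = 5930      # SNPs tested in the imputation stage (from the text)
N_BELOW = 109      # SNPs with P < 0.001 (from the text)
SCREEN_P = 0.001   # screening cutoff mentioned in the text
ALPHA = 0.05       # conventional family-wise error rate (assumption)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P value cutoff that holds the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_null_hits(p_cutoff: float, n_tests: int) -> float:
    """Expected number of tests below p_cutoff if no SNP is truly associated."""
    return p_cutoff * n_tests


regional_cutoff = bonferroni_threshold(ALPHA, N_SNPS)
null_hits = expected_null_hits(SCREEN_P, N_SNPS)

print(f"Bonferroni cutoff for {N_SNPS} tests: {regional_cutoff:.2e}")
print(f"Expected hits at P < {SCREEN_P} under the null: {null_hits:.2f}")
print(f"Observed hits at P < {SCREEN_P}: {N_BELOW}")
```

Under these assumptions, roughly six of the 5,930 SNPs would be expected to fall below P < 0.001 by chance alone, compared with the 109 observed. Because many of the 109 SNPs are correlated through LD, they should not be read as 109 independent signals.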

In conclusion, through fine-mapping of a potential susceptibility region for cancer combined with functional analysis, we have extended our GWAS results to the discovery of two novel susceptibility loci for ESCC. Among them, the rs58090485 A/− change may cause upregulation of the KLF5 gene, which in turn might inhibit ESCC carcinogenesis. These results support our hypothesis that some types of human cancer might share genetic susceptibility factors at the level of the gene or chromosomal region, which extends our understanding of the genetic etiology of disease.

No potential conflicts of interest were disclosed.

Conception and design: J. Chang, X. Zhang, C. Wu, D. Lin

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): L. Wei, W. Tan, X. Zhang

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J. Chang, L. Wei, X. Miao, D. Yu, C. Wu

Writing, review, and/or revision of the manuscript: J. Chang, X. Zhang, C. Wu, D. Lin

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): X. Miao, W. Tan, C. Wu

This work was supported by the National High-Tech Research and Development Program of China (2014AA020601; to C. Wu) and National Basic Research Program of China (2013CB910301; to D. Lin).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Kraft P, Hunter DJ. Genetic risk prediction - are we there yet? N Engl J Med 2009;360:1701–3.
2. Pharoah PD, Antoniou AC, Easton DF, Ponder BA. Polygenes, risk prediction, and targeted prevention of breast cancer. N Engl J Med 2008;358:2796–803.
3. Wacholder S, Hartge P, Prentice R, Garcia-Closas M, Feigelson HS, Diver WR, et al. Performance of common genetic variants in breast-cancer risk models. N Engl J Med 2010;362:986–93.
4. Lindstrom S, Schumacher FR, Cox D, Travis RC, Albanes D, Allen NE, et al. Common genetic variants in prostate cancer risk prediction–results from the NCI Breast and Prostate Cancer Cohort Consortium (BPC3). Cancer Epidemiol Biomarkers Prev 2012;21:437–44.
5. Pierce BL, Tong L, Kraft P, Ahsan H. Unidentified genetic variants influence pancreatic cancer risk: an analysis of polygenic susceptibility in the PanScan study. Genet Epidemiol 2012;36:517–24.
6. Abnet CC, Freedman ND, Hu N, Wang Z, Yu K, Shu XO, et al. A shared susceptibility locus in PLCE1 at 10q23 for gastric adenocarcinoma and esophageal squamous cell carcinoma. Nat Genet 2010;42:764–7.
7. Wang LD, Zhou FY, Li XM, Sun LD, Song X, Jin Y, et al. Genome-wide association study of esophageal squamous cell carcinoma in Chinese subjects identifies susceptibility loci at PLCE1 and C20orf54. Nat Genet 2010;42:759–63.
8. Wu C, Hu Z, He Z, Jia W, Wang F, Zhou Y, et al. Genome-wide association study identifies three new susceptibility loci for esophageal squamous-cell carcinoma in Chinese populations. Nat Genet 2011;43:679–84.
9. Wu C, Kraft P, Zhai K, Chang J, Wang Z, Li Y, et al. Genome-wide association analyses of esophageal squamous cell carcinoma in Chinese identify multiple susceptibility loci and gene-environment interactions. Nat Genet 2012;44:1090–7.
10. Chang J, Huang Y, Wei L, Ma B, Miao X, Li Y, et al. Risk prediction of esophageal squamous-cell carcinoma with common genetic variants and lifestyle factors in Chinese population. Carcinogenesis 2013;34:1782–6.
11. Amundadottir LT, Sulem P, Gudmundsson J, Helgason A, Baker A, Agnarsson BA, et al. A common variant associated with prostate cancer in European and African populations. Nat Genet 2006;38:652–8.
12. Tomlinson I, Webb E, Carvajal-Carmona L, Broderick P, Kemp Z, Spain S, et al. A genome-wide association scan of tag SNPs identifies a susceptibility variant for colorectal cancer at 8q24.21. Nat Genet 2007;39:984–8.
13. Kiemeney LA, Thorlacius S, Sulem P, Geller F, Aben KK, Stacey SN, et al. Sequence variant on 8q24 confers susceptibility to urinary bladder cancer. Nat Genet 2008;40:1307–12.
14. Shete S, Hosking FJ, Robertson LB, Dobbins SE, Sanson M, Malmer B, et al. Genome-wide association study identifies five susceptibility loci for glioma. Nat Genet 2009;41:899–904.
15. Goode EL, Chenevix-Trench G, Song H, Ramus SJ, Notaridou M, Lawrenson K, et al. A genome-wide association study identifies susceptibility loci for ovarian cancer at 2q31 and 8q24. Nat Genet 2010;42:874–9.
16. Easton DF, Pooley KA, Dunning AM, Pharoah PD, Thompson D, Ballinger DG, et al. Genome-wide association study identifies novel breast cancer susceptibility loci. Nature 2007;447:1087–93.
17. Pomerantz MM, Ahmadiyeh N, Jia L, Herman P, Verzi MP, Doddapaneni H, et al. The 8q24 cancer risk variant rs6983267 shows long-range interaction with MYC in colorectal cancer. Nat Genet 2009;41:882–4.
18. Wu C, Miao X, Huang L, Che X, Jiang G, Yu D, et al. Genome-wide association study identifies five loci associated with susceptibility to pancreatic cancer in Chinese populations. Nat Genet 2012;44:62–6.
19. Petersen GM, Amundadottir L, Fuchs CS, Kraft P, Stolzenberg-Solomon RZ, Jacobs KB, et al. A genome-wide association study identifies pancreatic cancer susceptibility loci on chromosomes 13q22.1, 1q32.1 and 5p15.33. Nat Genet 2010;42:224–8.
20. Takata R, Akamatsu S, Kubo M, Takahashi A, Hosono N, Kawaguchi T, et al. Genome-wide association study identifies five new susceptibility loci for prostate cancer in the Japanese population. Nat Genet 2010;42:751–4.
21. Dong JT, Chen C. Essential role of KLF5 transcription factor in cell proliferation and differentiation and its implications for human diseases. Cell Mol Life Sci 2009;66:2691–706.
22. Diakiw SM, D'Andrea RJ, Brown AL. The double life of KLF5: opposing roles in regulation of gene-expression, cellular function, and transformation. IUBMB Life 2013;65:999–1011.
23. Nakamura Y, Migita T, Hosoda F, Okada N, Gotoh M, Arai Y, et al. Kruppel-like factor 12 plays a significant role in poorly differentiated gastric cancer progression. Int J Cancer 2009;125:1859–67.
24. Frankel A, Armour N, Nancarrow D, Krause L, Hayward N, Lampe G, et al. Genome-wide analysis of esophageal adenocarcinoma yields specific copy number aberrations that correlate with prognosis. Genes Chromosomes Cancer 2014;53:324–38.
25. Liu EY, Li M, Wang W, Li Y. MaCH-admix: genotype imputation for admixed populations. Genet Epidemiol 2013;37:25–37.
26. Lehmann U, Kreipe H. Real-time PCR analysis of DNA and RNA extracted from formalin-fixed and paraffin-embedded biopsies. Methods 2001;25:409–18.
27. Yang J, Manolio TA, Pasquale LR, Boerwinkle E, Caporaso N, Cunningham JM, et al. Genome partitioning of genetic variation for complex traits using common SNPs. Nat Genet 2011;43:519–25.
28. McKay JD, Hung RJ, Gaborieau V, Boffetta P, Chabrier A, Byrnes G, et al. Lung cancer susceptibility locus at 5p15.33. Nat Genet 2008;40:1404–6.
29. Rafnar T, Sulem P, Stacey SN, Geller F, Gudmundsson J, Sigurdsson A, et al. Sequence variants at the TERT-CLPTM1L locus associate with many cancer types. Nat Genet 2009;41:221–7.
30. Turnbull C, Rapley EA, Seal S, Pernet D, Renwick A, Hughes D, et al. Variants near DMRT1, TERT and ATF7IP are associated with testicular germ cell cancer. Nat Genet 2010;42:604–7.
31. Yang L. Incidence and mortality of gastric cancer in China. World J Gastroenterol 2006;12:17–20.
32. Gao Y, Hu N, Han XY, Ding T, Giffen C, Goldstein AM, et al. Risk factors for esophageal and gastric cancers in Shanxi Province, China: a case-control study. Cancer Epidemiol 2011;35:e91–9.
33. Nandan MO, McConnell BB, Ghaleb AM, Bialkowska AB, Sheng H, Shao J, et al. Kruppel-like factor 5 mediates cellular transformation during oncogenic KRAS-induced intestinal tumorigenesis. Gastroenterology 2008;134:120–30.
34. Bialkowska AB, Du Y, Fu H, Yang VW. Identification of novel small-molecule compounds that inhibit the proproliferative Kruppel-like factor 5 in colorectal cancer cells by high-throughput screening. Mol Cancer Ther 2009;8:563–70.
35. Chanchevalap S, Nandan MO, Merlin D, Yang VW. All-trans retinoic acid inhibits proliferation of intestinal epithelial cells by inhibiting expression of the gene encoding Kruppel-like factor 5. FEBS Lett 2004;578:99–105.
36. McConnell BB, Klapproth JM, Sasaki M, Nandan MO, Yang VW. Kruppel-like factor 5 mediates transmissible murine colonic hyperplasia caused by Citrobacter rodentium infection. Gastroenterology 2008;134:1007–16.
37. Soon MS, Hsu LS, Chen CJ, Chu PY, Liou JH, Lin SH, et al. Expression of Kruppel-like factor 5 in gastric cancer and its clinical correlation in Taiwan. Virchows Arch 2011;459:161–6.
38. Fujii Y, Yoshihashi K, Suzuki H, Tsutsumi S, Mutoh H, Maeda S, et al. CDX1 confers intestinal phenotype on gastric epithelial cells via induction of stemness-associated reprogramming factors SALL4 and KLF5. Proc Natl Acad Sci U S A 2012;109:20584–9.
39. Noto JM, Khizanishvili T, Chaturvedi R, Piazuelo MB, Romero-Gallo J, Delgado AG, et al. Helicobacter pylori promotes the expression of Kruppel-like factor 5, a mediator of carcinogenesis, in vitro and in vivo. PLoS One 2013;8:e54344.
40. Kwak MK, Lee HJ, Hur K, Park do J, Lee HS, Kim WH, et al. Expression of Kruppel-like factor 5 in human gastric carcinomas. J Cancer Res Clin Oncol 2008;134:163–7.
41. Yang Y, Nakagawa H, Tetreault MP, Billig J, Victor N, Goyal A, et al. Loss of transcription factor KLF5 in the context of p53 ablation drives invasive progression of human squamous cell cancer. Cancer Res 2011;71:6475–84.
42. Tarapore RS, Yang Y, Katz JP. Restoring KLF5 in esophageal squamous cell cancer cells activates the JNK pathway leading to apoptosis and reduced cell survival. Neoplasia 2013;15:472–80.
43. Panagiotou OA, Ioannidis JP. What should the genome-wide significance threshold be? Empirical replication of borderline genetic associations. Int J Epidemiol 2012;41:273–86.