Abstract
Background: Important risk factors for esophageal adenocarcinoma and its precursor, Barrett's esophagus, include gastroesophageal reflux disease, obesity, and cigarette smoking. Recently, genome-wide association studies have identified seven germline single-nucleotide polymorphisms (SNP) that are associated with risk of Barrett's esophagus and esophageal adenocarcinoma. Whether these genetic susceptibility loci modify previously identified exposure–disease associations is unclear.
Methods: We analyzed exposure and genotype data from the BEACON Consortium discovery phase GWAS, which included 1,516 esophageal adenocarcinoma case patients, 2,416 Barrett's esophagus case patients, and 2,187 control participants. We examined the seven newly identified susceptibility SNPs for interactions with body mass index, smoking status, and report of weekly heartburn or reflux. Logistic regression models were used to estimate ORs for these risk factors stratified by SNP genotype, separately for Barrett's esophagus and esophageal adenocarcinoma.
Results: The odds ratio for Barrett's esophagus associated with at least weekly heartburn or reflux varied significantly with the presence of at least one minor allele of rs2687201 (nominal P = 0.0005, FDR = 0.042). ORs (95% CIs) for weekly heartburn or reflux among participants with 0, 1, or 2 minor alleles of rs2687201 were 6.17 (4.91–7.76), 3.56 (2.85–4.44), and 3.97 (2.47–6.37), respectively. No statistically significant interactions were observed for smoking status or body mass index.
Conclusion: Reflux symptoms are more strongly associated with Barrett's esophagus risk among persons homozygous for the major allele of rs2687201, which lies approximately 75 kb downstream of the transcription factor gene FOXP1.
Impact: The novel gene–exposure interaction discovered in this study provides new insights into the etiology of esophageal adenocarcinoma. Cancer Epidemiol Biomarkers Prev; 24(11); 1739–47. ©2015 AACR.
Introduction
The incidence of esophageal adenocarcinoma, an infrequent but often lethal disease, has risen sharply over the past four decades, especially among white populations in developed countries (1–6). The majority of esophageal adenocarcinoma cases arise from Barrett's esophagus, a precursor lesion defined by a specialized columnar metaplasia of the distal esophagus (7). Barrett's esophagus and esophageal adenocarcinoma share many risk factors, including gastroesophageal reflux disease (GERD), European ancestry, male sex, obesity, and tobacco smoking, although the magnitudes of these associations may differ between the two conditions (8–14). Population-based studies in the United States and Australia indicate that GERD symptoms, high body mass, and tobacco smoking account for approximately 75% of the population risk for esophageal adenocarcinoma (10, 12).
Less is known about inherited genetic susceptibility to Barrett's esophagus and esophageal adenocarcinoma. Using the candidate gene approach, a number of genetic association studies have investigated the role of genetic predisposition in several biologic pathways (15–21). Recently, large genome-wide association studies (GWAS) have identified several single-nucleotide polymorphisms (SNP) for Barrett's esophagus (22, 23), including variants in or near FOXF1, TBX5, GDF7, and the MHC genes. Additional susceptibility loci for Barrett's esophagus/esophageal adenocarcinoma were also identified in or near CRTC1, BARX1, and FOXP1 (24). These loci were close to or passed the stringent genome-wide significance threshold. Notably, the GWAS based on the Barrett's and Esophageal Adenocarcinoma Consortium (BEACON) reported that risk of Barrett's esophagus and esophageal adenocarcinoma is influenced by many germline variants of small effect and, perhaps not surprisingly, genetic heritability is largely shared between the two diseases (25).
Understanding the interplay between genetic susceptibility and epidemiologic risk factors should improve risk prediction and provide insights into the pathogenesis of Barrett's esophagus and esophageal adenocarcinoma. Some gene–exposure interactions have been identified using the candidate gene approach, for example, between smoking and variants of GSTM1, GSTT1, and VEGF (26–31). Few of the previously reported variants have been replicated, however, and more systematic studies are warranted. The statistical power to detect a gene–environment interaction in a genome-wide search is typically low because interaction estimates are highly variable and genome-wide testing requires a stringent P value correction.
An alternative approach is to focus the search for gene–exposure interactions on genetic susceptibility loci with significant genome-wide associations. This approach is mostly driven by statistical considerations to increase the chances of discovery, and has been used successfully in past studies of breast cancer (32). If a genetic variant interacts with an exposure, it is quite plausible that the marginal genetic association, that is, the genetic association observed across all exposure levels (ignoring gene–exposure interaction), will also be evident. Limiting the search for gene–exposure interactions to SNPs that already demonstrate significant marginal genetic associations decreases the risk of chance associations and reduces the multiple-testing burden (32–34). Furthermore, marginal genetic associations are independent of gene–environment interactions as long as they are assessed in nested models, so that a much reduced multiple-testing penalty is required to test for gene–environment interactions among SNPs that rank high in marginal association tests (34). This approach therefore offers greater statistical power than an agnostic genome-wide interaction search.
In this report, using BEACON consortium data, we examine gene–exposure interactions, focusing on the seven SNPs previously shown to be associated with Barrett's esophagus or esophageal adenocarcinoma at or near genome-wide significance and confirmed in replication studies. The risk factors under investigation are body mass index, cigarette smoking, and acid reflux or heartburn symptoms.
Materials and Methods
Study population and SNP genotyping
Information from participants with Barrett's esophagus and esophageal adenocarcinoma, and from associated controls, was collected by BEACON investigators in 14 studies conducted in Australia, Western Europe, and North America. All esophageal adenocarcinoma and Barrett's esophagus diagnoses were histologically confirmed. Detailed population characteristics and genotyping protocols have been previously described (24). Demographic and exposure data were harmonized into standard variables. Genotyping of DNA from buffy coat or whole blood was performed using the Illumina HumanOmni1-Quad platform. Quality assurance and quality control of genotyping calls followed standard procedures (35). All participants gave written informed consent, and the project was approved by the ethics review boards at each participating center as well as for the study overall at the Fred Hutchinson Cancer Research Center.
All 1,516 esophageal adenocarcinoma case patients and 2,416 Barrett's esophagus case patients in the discovery phase of the BEACON GWAS were included in this investigation, together with 2,187 control participants. Three control participants were excluded from Barrett's esophagus analyses because they were related to cases. Although all study sites collected data on age, sex, BMI, and smoking status, some did not ascertain history of heartburn or reflux. The extent of missing data for the major risk factors under investigation is tabulated in Supplementary Table S1. The vast majority of missing data occurred because not all variables were included in all study questionnaires.
Statistical analysis
Seven SNPs confirmed as associated with risk of esophageal adenocarcinoma or Barrett's esophagus (22–24) were included in our analysis: rs3072 in 2p24.1 (GDF7), rs2687201 in 3p14 (FOXP1), rs9257809 in 6p21 (MHC), rs11789015 in 9q22 (BARX1), rs2701108 in 12q24.21 (TBX5), rs9936833 in 16q24 (FOXF1), and rs10419226 in 19p13 (CRTC1). The reported ORs for these SNPs can be found in Supplementary Table S2. Three established risk factors were investigated for potential interaction with the SNPs: cigarette smoking, BMI [weight (kg)/height² (m²)], and gastroesophageal reflux symptoms. These variables were coded as ever smoking (yes or no), BMI (<25, ≥25–<30, ≥30 kg/m²), and at least weekly heartburn or weekly reflux (yes or no). The definition of gastroesophageal reflux symptoms has been previously described (36). For each of these variables, logistic regression models were fitted separately for Barrett's esophagus and esophageal adenocarcinoma. Each model included age, sex, the first four principal components (PC1–PC4) derived using genome-wide SNP data to account for population stratification by ancestry (24), the environmental exposure under investigation, a genotype variable, and the product term of the exposure and genotype variables. Genotype was coded in one of two ways: as a binary indicator (0/1) for the presence of at least one minor allele of the given SNP, or as a continuous variable taking the values 0, 1, or 2 (the number of minor alleles). The statistical significance of the interaction was assessed by a 1 degree of freedom (df) test of the product term. With the binary coding, this tests whether the exposure OR is equal between participants without the minor allele and participants with one or two copies of the minor allele (equality test); with the continuous coding, it tests whether the exposure OR changes with each additional minor allele (trend test). Although correlated, these two types of interaction tests explore parsimoniously the unknown true functional form of the interactions under investigation.
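The model described above can be written compactly as follows. This is a sketch in notation introduced here for illustration (D for case–control status, E for the exposure, G for the genotype variable, and the β and γ coefficients are not symbols used in the original analysis), and it assumes that each exposure enters the model as a single term.

```latex
\begin{align*}
\operatorname{logit}\,\Pr(D = 1) &= \beta_0 + \beta_1\,\mathrm{age} + \beta_2\,\mathrm{sex}
   + \sum_{k=1}^{4} \gamma_k\,\mathrm{PC}_k
   + \beta_E E + \beta_G G + \beta_{GE}\,(E \times G) \\
\mathrm{OR}_{E \mid G = g} &= \exp\!\left(\beta_E + \beta_{GE}\, g\right) \\
H_0 &: \beta_{GE} = 0 \qquad (1\ \mathrm{df})
\end{align*}
```

With G coded 0/1 for carriage of at least one minor allele, the test of H0 corresponds to the equality test; with G coded 0, 1, or 2, it corresponds to the trend test.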
A conservative Bonferroni correction for multiple testing was applied across the 84 tests of gene–exposure interaction: 7 SNPs × 3 risk variables × 2 outcomes (Barrett's esophagus and esophageal adenocarcinoma) × 2 interaction tests (presence of the minor allele and trend). The false discovery rate (FDR) was computed for each of the nominal P values for interaction (37). A sensitivity analysis was conducted to assess the impact of missing data in risk factors. Inverse probability weighted logistic regression models were fitted, in which the weight is the inverse of the probability of being observed given age, sex, site, genotype, and the top four principal components.
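The multiple-testing arithmetic can be sketched in a few lines of Python. The Bonferroni threshold follows directly from the 84 tests described above. The FDR function is an assumption: it implements the Benjamini–Hochberg step-up adjustment, which the text does not name explicitly, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in input order.

    Assumption: the source cites an FDR method without naming it;
    BH is used here for illustration only.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the
    # adjusted values are monotone in the original P values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# 7 SNPs x 3 risk variables x 2 outcomes x 2 interaction tests
n_tests = 7 * 3 * 2 * 2
threshold = bonferroni_threshold(0.05, n_tests)

# For the smallest of m P values, the BH-adjusted value is at most p * m.
smallest_q_bound = 0.0005 * n_tests

# Toy P values (hypothetical, not from the study) to exercise the function.
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Under this assumption, the smallest nominal P value (0.0005) multiplied by 84 gives about 0.042, which agrees with the FDR reported for the rs2687201–reflux interaction; this is consistent with, but does not establish, the use of this procedure.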
After examining interactions for each SNP–exposure pair, one at a time, additional analyses were conducted for SNP–exposure pairs that satisfied the criterion of family-wise error rate <0.05 after Bonferroni correction. To control for potential confounding from correlated risk factors, a logistic regression model was fitted to assess the three interaction terms between a SNP and the three risk variables, namely BMI, cigarette smoking, and at least weekly heartburn or acid reflux, controlling for age, sex, the main effects of the SNP and risk factors, and the top four principal components.
Association analyses were conducted for imputed SNPs based on the 1000 Genomes project around any genotyped SNPs that showed significant interactions with one of the three risk factors. The details of the imputation procedure were presented previously (24).
Results
Characteristics of participants in this study have been described elsewhere (24). Table 1 lists the associations of the three studied risk factors with esophageal adenocarcinoma and Barrett's esophagus, adjusted for age and sex. The ORs for Barrett's esophagus were generally larger than those for esophageal adenocarcinoma, except for cigarette smoking.
Risk factor | Barrett's esophagus: cases (n) | Barrett's esophagus: controls (n) | Barrett's esophagus: OR (95% CI)ᵃ | Esophageal adenocarcinoma: cases (n) | Esophageal adenocarcinoma: controls (n) | Esophageal adenocarcinoma: OR (95% CI)ᵃ |
---|---|---|---|---|---|---|
Categorized body mass index (kg/m²) | | | | | | |
<25 | 426 | 785 | Referent | 246 | 787 | Referent |
25–29.9 | 883 | 944 | 1.77 (1.52–2.06) | 457 | 944 | 1.44 (1.20–1.73) |
≥30 | 752 | 437 | 3.18 (2.68–3.76) | 296 | 438 | 2.20 (1.78–2.72) |
Ever smoked cigarettes | | | | | | |
No | 801 | 888 | Referent | 348 | 889 | Referent |
Yes | 1,570 | 1,284 | 1.41 (1.25–1.60) | 1,066 | 1,286 | 1.95 (1.68–2.27) |
Weekly heartburn or reflux | | | | | | |
No | 958 | 1,447 | Referent | 566 | 1,449 | Referent |
Yes | 997 | 349 | 4.62 (3.97–5.37) | 438 | 350 | 3.46 (2.89–4.13) |
ᵃ Logistic regression model for association of the risk factor with case–control status, adjusted for sex and age.
Figure 1 shows the q-q plot of interaction P values for each SNP–exposure pair, separately for esophageal adenocarcinoma and Barrett's esophagus. P values for the two types of interaction tests are plotted separately because the two tests are potentially highly correlated. As a comparison, a randomly selected set of seven SNPs from the BEAGESS study was assessed for interactions with the three exposures. The observed P values were generally smaller than the expected P values from the random set, possibly because of random noise and correlation of these statistics. Two SNPs had interaction P values <0.05 (Table 2, equality test and trend test) and deviated significantly from the diagonal line in the q-q plot. The first SNP, rs2687201 (G/T, major and minor plus strand allele), lies near the FOXP1 gene, and the second, rs10419226 (C/A, major and minor plus strand allele), is located near the CRTC1 gene. Both SNPs were identified in the BEAGESS study (24).
Risk factor | OR (95% CI), 0 minor alleles | OR (95% CI), 1 minor allele | OR (95% CI), 2 minor alleles | P valueᵇ (FDRᵃ) for equality test | P valueᶜ (FDRᵃ) for trend test |
---|---|---|---|---|---|
rs2687201 | | | | | |
Barrett's esophagus | | | | | |
Categorized body mass indexᵈ | 1.65 (1.46–1.86) | 1.95 (1.72–2.22) | 1.74 (1.33–2.26) | 0.083 (0.635) | 0.224 (0.783) |
Ever smoked cigarettes | 1.59 (1.33–1.91) | 1.35 (1.12–1.62) | 0.96 (0.65–1.42) | 0.065 (0.909) | 0.019 (0.328) |
At least weekly heartburn or reflux | 6.17 (4.91–7.76) | 3.56 (2.85–4.44) | 3.97 (2.47–6.37) | 0.0005 (0.042) | 0.003 (0.122) |
Esophageal adenocarcinoma | | | | | |
Categorized body mass index | 1.35 (1.16–1.58) | 1.55 (1.32–1.82) | 1.79 (1.31–2.46) | 0.125 (0.582) | 0.081 (0.678) |
Ever smoked cigarettes | 2.19 (1.75–2.74) | 1.85 (1.47–2.33) | 1.37 (0.88–2.13) | 0.141 (0.627) | 0.067 (0.624) |
At least weekly heartburn or reflux | 4.05 (3.10–5.30) | 3.15 (2.42–4.09) | 2.48 (1.45–4.27) | 0.099 (0.525) | 0.066 (0.691) |
rs10419226 | | | | | |
Barrett's esophagus | | | | | |
Categorized body mass index | 1.75 (1.49–2.05) | 1.92 (1.70–2.17) | 1.55 (1.30–1.85) | 0.811 (0.987) | 0.373 (0.825) |
Ever smoked cigarettes | 1.43 (1.14–1.80) | 1.45 (1.22–1.74) | 1.36 (1.05–1.77) | 0.971 (0.995) | 0.785 (0.999) |
At least weekly heartburn or reflux | 4.04 (3.07–5.30) | 4.99 (4.03–6.21) | 4.74 (3.41–6.48) | 0.227 (0.733) | 0.407 (0.795) |
Esophageal adenocarcinoma | | | | | |
Categorized body mass index | 1.54 (1.25–1.88) | 1.54 (1.33–1.79) | 1.33 (1.07–1.66) | 0.728 (0.956) | 0.378 (0.813) |
Ever smoked cigarettes | 1.76 (1.33–2.34) | 1.99 (1.61–2.46) | 2.12 (1.54–2.93) | 0.415 (0.792) | 0.381 (0.799) |
At least weekly heartburn or reflux | 2.35 (1.67–3.31) | 3.81 (2.97–4.89) | 4.30 (2.94–6.28) | 0.010 (0.280) | 0.016 (0.329) |
ᵃ False discovery rate.
ᵇ One degree of freedom test of equality of the two ORs among participants with no minor alleles versus those with at least one minor allele; logistic regression model for case–control status including terms for sex, age, PC1–PC4, SNP genotype, the exposure variable together with its interaction with an indicator of at least one minor allele.
ᶜ P value associated with the test for no interaction between exposure and SNP; logistic regression model for case–control status including terms for sex, age, PC1–PC4, SNP genotype coded as a continuous variable (0, 1, 2), the exposure variable along with its interaction with the SNP.
ᵈ Body mass index in three categories: <25, ≥25–<30, ≥30 kg/m².
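As a rough plausibility check on the equality test, the published stratum-specific ORs can be used to approximate a Wald test. The sketch below is not the authors' method: it recovers standard errors from the 95% CIs in Table 2 (Barrett's esophagus, weekly heartburn or reflux, rs2687201), pools the one- and two-allele strata by inverse-variance weighting, and treats the strata as independent, which is only approximately true for estimates from a single adjusted model. The function names are hypothetical.

```python
import math


def log_or_and_se(odds_ratio, lower, upper, z_crit=1.96):
    """Recover log(OR) and its standard error from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return math.log(odds_ratio), se


def pool_fixed(estimates):
    """Inverse-variance (fixed-effect) pooling of (log_or, se) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    return pooled, math.sqrt(1 / total)


# Values copied from Table 2 (rs2687201, Barrett's esophagus, reflux row).
non_carriers = log_or_and_se(6.17, 4.91, 7.76)
carriers = pool_fixed([
    log_or_and_se(3.56, 2.85, 4.44),  # one minor allele
    log_or_and_se(3.97, 2.47, 6.37),  # two minor alleles
])

# Approximate Wald test for equal log ORs, assuming independent strata.
diff = non_carriers[0] - carriers[0]
se_diff = math.sqrt(non_carriers[1] ** 2 + carriers[1] ** 2)
z = diff / se_diff
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
carrier_or = math.exp(carriers[0])
```

Under these simplifying assumptions the pooled carrier OR is about 3.6 and the z statistic is roughly 3.4, which is of the same order as the reported nominal P value of 0.0005 from the fitted interaction model.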
The top half of Table 2 shows the odds ratios of the three risk factors for groups with 0, 1, or 2 minor alleles of rs2687201 separately for Barrett's esophagus and esophageal adenocarcinoma. The interaction between rs2687201 (G/T, major and minor plus strand allele) and GERD in relation to the risk of Barrett's esophagus had the smallest P value (0.0005), which is statistically significant under Bonferroni correction for 84 comparisons (P < 5.95 × 10⁻⁴). The presence of at least one minor allele of rs2687201 appeared to decrease the magnitude of disease risk associated with GERD. The interaction of this SNP with ever smoking reached borderline statistical significance, whereas the association of BMI with risk of Barrett's esophagus did not appear to vary substantially with rs2687201 genotype. The interactions of this SNP with exposures on the risk of esophageal adenocarcinoma were not statistically significant, although the patterns of the ORs among the three genotype groups were similar to those for Barrett's esophagus; having one or two minor alleles generally decreased the magnitude of risk associated with reflux/heartburn and smoking. The interaction between BMI and this variant in relation to esophageal adenocarcinoma risk was of borderline nominal significance (P = 0.08), and this association was also in the same direction as that for Barrett's esophagus risk. In sensitivity analyses, where the missing data in risk factors were accounted for by the inverse probability weighting method, the interaction between rs2687201 and GERD remained statistically significant (Supplementary Table S3). In a separate analysis, where Barrett's esophagus and esophageal adenocarcinoma case participants were combined into a single case group, the significance level of the interaction between rs2687201 and GERD decreased (Supplementary Table S4A). In results not shown, the interaction ORs in the three continents did not differ significantly (P = 0.54).
The bottom half of Table 2 shows the results for rs10419226 (C/A, major and minor plus strand allele). The smallest P value occurred for the interaction of this SNP with GERD in association with esophageal adenocarcinoma risk (nominal P = 0.01). The FDR for this interaction was 0.28.
In Table 3, the three risk factors were investigated simultaneously in one logistic regression model for their interactions with the rs2687201 variant, the only SNP with an interaction P value satisfying the Bonferroni significance threshold. This analysis helped to differentiate the independent contributions of the risk factors and their interactions, since the three risk factors were correlated across the study populations. The P value for the interaction with GERD remained nominally statistically significant (P = 0.001), whereas the statistical significance of the interaction between the SNP and ever-smoking decreased when compared with the results in Table 2. These findings suggest that the newly discovered interaction between the SNP and GERD is unlikely to be substantially confounded by the other two risk factors.
Risk factor | OR (95% CI), 0 minor alleles | OR (95% CI), 1 minor allele | OR (95% CI), 2 minor alleles | Pᵃ for equality test | Pᵇ for trend test |
---|---|---|---|---|---|
Barrett's esophagus | | | | | |
Categorized body mass index | 1.78 (1.53–2.08) | 2.03 (1.75–2.37) | 1.77 (1.29–2.43) | 0.299 | 0.554 |
Ever smoked cigarettes | 1.69 (1.34–2.12) | 1.34 (1.07–1.70) | 1.11 (0.68–1.79) | 0.102 | 0.075 |
At least weekly heartburn or reflux | 7.21 (5.66–9.19) | 4.17 (3.28–5.31) | 4.50 (2.73–7.43) | 0.001 | 0.004 |
Esophageal adenocarcinoma | | | | | |
Categorized body mass index | 1.43 (1.17–1.75) | 1.35 (1.11–1.65) | 1.83 (1.26–2.66) | 0.906 | 0.451 |
Ever smoked cigarettes | 2.05 (1.49–2.83) | 1.58 (1.16–2.17) | 1.10 (0.61–1.99) | 0.127 | 0.070 |
At least weekly heartburn or reflux | 4.01 (2.95–5.45) | 3.22 (2.38–4.36) | 2.54 (1.39–4.66) | 0.185 | 0.124 |
ᵃ One degree of freedom test of equality of the two ORs among participants with no minor alleles versus those with at least one minor allele; logistic regression model for case–control status including terms for sex, age, PC1–PC4, SNP genotype, the three exposure variables together with their interactions with an indicator for the presence of at least one minor allele.
ᵇ P value associated with the test for no interaction between exposure and SNP; logistic regression model for case–control status including terms for sex, age, PC1–PC4, SNP genotype coded as a continuous variable (0, 1, 2), the three exposure variables along with their interaction with the SNP.
Table 4 shows ORs stratified by rs2687201 genotype and GERD status, adjusted for age, sex, and the top four principal components. Participants without minor alleles and without GERD comprised the reference group. The genetic relative risk (GRR) in the absence of GERD was 1.50, a statistically significant elevation of Barrett's esophagus risk (nominal P value = 0.00004). GERD increased the risk of Barrett's esophagus substantially, regardless of the genotype group. The odds ratio for GERD-positive participants with at least one copy of the minor allele was less than that observed for GERD-positive participants homozygous for the major allele.
GERD status and genotype | Frequency in study populationᵃ | Barrett's esophagus cases (n) | Controls (n) | ORᵇ (95% confidence interval) |
---|---|---|---|---|
GERD-negative, no minor alleles | 30% | 391 | 733 | Referent |
GERD-negative, at least one minor allele | 34% | 567 | 714 | 1.50 (1.27–1.77) |
GERD-positive, no minor alleles | 16% | 460 | 150 | 6.17 (4.93–7.73) |
GERD-positive, at least one minor allele | 20% | 537 | 199 | 5.44 (4.42–6.70) |
NOTE: The stratum with no minor alleles and less than weekly heartburn or reflux is the baseline group.
ᵃ Barrett's esophagus cases and controls combined.
ᵇ Logistic regression model for association of SNP and GERD with the Barrett's esophagus case–control status adjusted for sex, age, and four eigenvectors.
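The joint ORs in Table 4 can be rearranged to show the interaction on the multiplicative scale. The sketch below uses only the adjusted ORs and the cell counts printed in the table; the variable and function names are hypothetical, and the crude ORs computed from the counts are unadjusted, so they are not expected to match the adjusted estimates exactly.

```python
# Adjusted joint ORs from Table 4 (referent: GERD-negative, no minor alleles).
or_g = 1.50   # GERD-negative, at least one minor allele
or_e = 6.17   # GERD-positive, no minor alleles
or_ge = 5.44  # GERD-positive, at least one minor allele

# OR for GERD within each genotype stratum.
gerd_or_noncarriers = or_e
gerd_or_carriers = or_ge / or_g

# Multiplicative interaction: observed joint OR relative to the product
# of the two single-factor ORs. A value below 1 means the joint effect is
# smaller than expected under no multiplicative interaction.
interaction_ratio = or_ge / (or_g * or_e)


def crude_or(cases, controls, ref_cases, ref_controls):
    """Unadjusted OR from cell counts against the referent stratum."""
    return (cases / controls) / (ref_cases / ref_controls)


# Crude ORs from the case and control counts in Table 4.
crude_g = crude_or(567, 714, 391, 733)
crude_e = crude_or(460, 150, 391, 733)
crude_ge = crude_or(537, 199, 391, 733)
```

The GERD OR among carriers derived this way (5.44 / 1.50, about 3.6) sits between the one- and two-allele estimates in Table 2, and the interaction ratio of about 0.59 reflects the weaker GERD association among carriers of the minor allele.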
Figure 2 shows the regional interaction associations around rs2687201, including both genotyped and imputed SNPs. Of the 2,017 adjacent SNPs assessed, a cluster of adjacent imputed SNPs with correlations (r²) > 0.6 showed stronger evidence of interaction with GERD. The smallest nominal P value obtained was for rs7638679 (P = 0.00006, Supplementary Table S5), considerably smaller than that observed for rs2687201 (P = 0.0005). The rs7638679 variant is situated approximately 50 kb away from the FOXP1 3′UTR.
Discussion
We systematically investigated interactions between known risk factors for Barrett's esophagus and esophageal adenocarcinoma and previously identified genome-wide significant associations in germline susceptibility. We found that rs2687201 (G/T, major and minor plus strand allele), a SNP on chromosome 3 near the FOXP1 gene, modifies the association between gastroesophageal reflux disease and risk of Barrett's esophagus. None of the SNPs showed evidence of statistically significant interactions with BMI and smoking. As reported in our GWAS analysis of genetic effects alone (24), the rs2687201 polymorphism was associated with increased risk of Barrett's esophagus and esophageal adenocarcinoma when the two diseases were combined into a single case group (OR = 1.14).
The rs2687201 variant (chromosome 3p13) is located 75 kb distal to the FOXP1 3′UTR, in an approximately 1 Mb intergenic region containing several pseudogenes (RNPC3P1, UQCRHP4, COX6CP6, HMGB1P36) and a predicted approximately 130 bp “novel miRNA” locus (AC096971.1; Supplementary Fig. S1). On the basis of data from the 1000 Genomes Project (38), rs2687201 is in strong linkage disequilibrium (LD; r² > 0.8) with approximately 60 other SNPs, located within 30–60 kb (20 SNPs with r² > 0.94). rs2687201 is situated within a 1,600-bp region characterized as heterochromatin in esophageal tissue, according to chromatin state segmentation data derived from the NIH Roadmap Epigenome Project (39). DNA regulatory motifs for multiple transcriptional regulators (e.g., CPHX, IK-2, ZNF143, HOXA9/10, MEF2, CTCF, RAD21, YY1, and NF-κB) are predicted to be altered by rs2687201 or other nearby variants in high LD (r² > 0.90; ref. 40). Data derived from cell lines analyzed in the ENCODE project indicate recruitment of several transcription factors (CEBPB, CJUN, P300, RAD21, STAT3) to recognition sequences within 4–8 kb (41).
Further assessment of the 13 imputed SNPs with more significant interaction P values than rs2687201 revealed that several of these variants (e.g., rs2597312, rs7611254, and rs1522554) are situated in putative enhancer sequences according to data from the Roadmap Epigenome Project (Supplementary Table S6.1); most of the 13 SNPs also appear to alter predicted DNA regulatory motifs. Expression quantitative trait locus (eQTL) analyses using the Genotype-Tissue Expression (GTEx) Project resource (42) provided some evidence that eight of these 13 SNPs may be associated (P < 0.05) with altered FOXP1 expression levels in esophageal mucosa (Supplementary Table S6.2). Cautious interpretation is required, however, given the sizable number of comparisons and relatively weak P values. The precise functional effects of rs2687201 and/or linked variants on expression levels of FOXP1 or other neighboring loci remain to be determined. While biologic characterization will require experimental follow-up studies, a potential regulatory role for several of these SNPs appears plausible.
The FOXP1 gene encodes a member of the Forkhead box (FOX) family of transcription factors, which share an evolutionarily conserved “winged-helix” DNA binding domain and function as versatile regulators of a wide range of biologic processes, including development and cancer (43–45). Knockout studies in mice demonstrated that FOXP1 and FOXP2 cooperatively regulate lung and esophageal development (46), while human FOXP1 was first identified as a candidate tumor suppressor gene on chromosome 3p, a region known to exhibit loss of heterozygosity in many tumors, and in premalignant epithelial lesions of the oral cavity, breast, and cervix (47). In subsequent studies, expression of FOXP1 was associated with improved survival in breast cancer (48), but reduced survival in diffuse large B-cell lymphoma (49). FOXP1 appears to play opposing roles in different tissues and functions as either a tumor suppressor or oncogene depending upon context (50). It is interesting to note that genetic variation (rs9936833) in proximity to another FOX family member, FOXF1, has also been associated with altered risk of Barrett's esophagus (22). While significant gene–environment interactions were not observed between this variant and reflux, BMI, or smoking, our present findings for FOXP1 provide further evidence implicating FOX transcription factors in the modulation of disease risk for Barrett's esophagus/esophageal adenocarcinoma, two conditions arising within an organ and tissue regulated by these same proteins during embryogenesis.
Several previous studies have pursued candidate-based gene–environment analyses of esophageal adenocarcinoma, and described potential interactions between GERD, smoking, or BMI and variants in a number of different genes related to detoxification, angiogenesis, DNA repair, apoptosis, and extracellular matrix degradation (27–31). These studies, however, were all limited by small sample sizes and lack of independent confirmation, and interactions in relation to risk of Barrett's esophagus were not examined.
The present study has several strengths. First, our use of genetic and epidemiologic data from a large consortium-based GWAS of esophageal adenocarcinoma/Barrett's esophagus (24) provided us with greater statistical power to detect gene–environment interactions than has been available in any previous study. Our decision to focus on the top marginal genetic signals identified in recent genome-wide analyses also eliminated the need to correct for a massive number of comparisons (34). Second, all genotyping from this GWAS was conducted on a single platform, and subjected to stringent quality control procedures. Third, inclusion of both esophageal adenocarcinoma and Barrett's esophagus cases in the current analysis enabled parallel assessment of gene–environment interactions for both an epithelial cancer and its metaplastic precursor lesion.
Our study also has certain limitations. We did not have a replication study for the newly discovered interaction because relevant exposure data were lacking in other studies. A full exploration of variants around the signal SNP and their functional consequences will be needed to elucidate the mechanism of the interaction. The extent of missing data for the environmental variables examined (34% missing BMI data among esophageal adenocarcinoma cases; 34% and 18% missing reflux data among esophageal adenocarcinoma and Barrett's esophagus cases, respectively) considerably reduced statistical power for the indicated analyses, and may have led us to miss some true gene–environment interactions. While systematic differences cannot be ruled out between included subjects with complete covariate data and those excluded because of missing data, the distributions of age, sex, and race were essentially comparable between these two groups, and most missing values for reflux occurred because certain study sites did not include this variable in their questionnaires. Our ability to detect gene–environment interactions may also have been limited by the manner in which we modeled (or assessed) environmental exposures (e.g., at least weekly heartburn). Inclusion of more precise covariates, such as duration or timing of heartburn relative to diagnosis, might have helped capture additional interactions, but such information was often unavailable. While this analysis focused on seven SNPs previously shown to be associated at the genome-wide level with altered risk of esophageal adenocarcinoma or Barrett's esophagus, a number of variants not satisfying this stringent threshold in marginal analysis may nonetheless exhibit significant interactions with reflux or BMI. Genome-wide gene–environment studies would be required to further explore this possibility.
Finally, some potential for measurement error of the exposures examined, especially reflux, should be acknowledged, given our consortium-based pooled study population. These exposures were ascertained across two decades in multiple studies on different continents. Nevertheless, an extensive effort was devoted to ensuring accurate data harmonization, as documented in several recent pooled analyses (36, 51, 52).
Our study describes a novel interaction between an intergenic germline polymorphism and weekly reflux symptoms in relation to risk of Barrett's esophagus, the known precursor of esophageal adenocarcinoma. Further studies will be necessary to validate these findings in external populations and investigate the potential biologic basis for differential disease risk associated with reflux in the presence of this variant.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: J.Y. Dai, H.A. Risch, L. Bernstein, W. Ye, J. Lagergren, N.C. Bird, B.J. Reid, T.L. Vaughan
Development of methodology: J.Y. Dai, L. Bernstein
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J.Y. Dai, W.-H. Chow, L. Bernstein, W. Ye, J. Lagergren, N.C. Bird, D.A. Corley, N.J. Shaheen, A.H. Wu, B.J. Reid, L.J. Hardie, D.C. Whiteman, T.L. Vaughan
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J.Y. Dai, J. de Dieu Tapsoba, M.F. Buas, D.M. Levine, H.A. Risch, W. Ye, D.A. Corley
Writing, review, and/or revision of the manuscript: J.Y. Dai, J. de Dieu Tapsoba, M.F. Buas, L.E. Onstad, D.M. Levine, H.A. Risch, W.-H. Chow, L. Bernstein, W. Ye, J. Lagergren, N.C. Bird, D.A. Corley, N.J. Shaheen, A.H. Wu, B.J. Reid, L.J. Hardie, D.C. Whiteman, T.L. Vaughan
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): N.C. Bird, D.A. Corley
Study supervision: J.Y. Dai, L.E. Onstad, L. Bernstein, N.C. Bird, D.C. Whiteman
Acknowledgments
The authors thank Stuart MacGregor and Puya Gharahkhani for performing genotype imputation based on the 1000 Genomes Project and providing the imputed genotypes.
Grant Support
This work was directly supported by the NIH (R01HL114901 to J.Y. Dai and J.D. Tapsoba; P01CA53996 to J.Y. Dai; R01CA136725 to T.L. Vaughan and D.C. Whiteman; T32CA009168 to T.L. Vaughan and M.F. Buas; and K05CA124911 to T.L. Vaughan). D.C. Whiteman was supported by a Research Fellowship from the National Health and Medical Research Council of Australia.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.