Abstract
Background: A potential susceptibility locus for colorectal cancer on chromosome 9p24 (rs719725) was initially identified through a genome-wide association study, though replication attempts have been inconclusive.
Methods: We genotyped this locus and explored interactions with known risk factors as potential sources of heterogeneity, which may explain the previously inconsistent replication. We included Caucasians with colorectal adenoma or colorectal cancer and controls from 4 studies (total 3,891 cases, 4,490 controls): the Women's Health Initiative (WHI); the Diet, Activity and Lifestyle Study (DALS); a Minnesota population-based case–control study (MinnCCS); and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO). We used logistic regression to evaluate the association and test for gene–environment interactions.
Results: SNP rs719725 was statistically significantly associated with risk of colorectal cancer in WHI (OR per A allele 1.19; 95% CI, 1.01–1.40; Ptrend = 0.04), marginally associated with adenoma risk in PLCO (OR per A allele 1.11; 95% CI, 0.99–1.25; Ptrend = 0.07), and not associated in DALS and MinnCCS. Evaluating for gene–environment interactions yielded no consistent results across the studies. A meta-analysis of 17 studies (including these 4) gave an OR per A allele of 1.07 (95% CI, 1.03–1.12; Ptrend = 0.001).
Conclusions: Our results suggest the A allele for SNP rs719725 at locus 9p24 is positively associated with a small increase in risk for colorectal tumors. Environmental risk factors for colorectal cancer do not appear to explain heterogeneity across studies.
Impact: If this finding is supported by further replication and functional studies, it may highlight new pathways underlying colorectal neoplasia. Cancer Epidemiol Biomarkers Prev; 19(12); 3131–9. ©2010 AACR.
Introduction
Colorectal cancer (CRC) is a major cause of morbidity and mortality in the United States. In 2010, an estimated 142,520 new cases of CRC and 51,370 deaths from CRC are expected in the United States (1). Colorectal cancer is the third most commonly diagnosed cancer and the third leading cause of cancer-related deaths in the United States when men and women are considered separately (2), and the second leading cause for both sexes combined (3). A substantial proportion of CRC is due to genetic risk factors, with estimates of heritable effects reaching as high as 35% [95% CI, 10%–48%] in a study of twins from Finland, Denmark, and Sweden (4). However, less than 5% of CRC is due to known highly penetrant variants inherited in an autosomal dominant manner (5). Thus, a large proportion of the remaining uncharacterized inherited susceptibility is expected to be due to numerous low-penetrance variants (6). The use of high-throughput technology, allowing the simultaneous ascertainment of hundreds of thousands of single nucleotide polymorphisms (SNP), particularly in the context of genome-wide association studies (GWAS), has been especially useful in exploring such potential variants.
Recent GWAS have succeeded in identifying 10 genetic regions that contain SNPs associated with CRC risk (7). One of these regions, 8q24, has been replicated in multiple studies and has also been implicated in various other cancers, such as prostate, breast, and bladder cancer (8). Along with 8q24, a second locus, 9p24, was identified in the Assessment of Risk for Colorectal Tumors in Canada (ARCTIC) genome-wide scan (8). Within this region, rs719725 was identified as being the most strongly associated with CRC (OR 1.14, P = 1.32 × 10⁻⁵), although with lower statistical significance and less cross-study consistency than for 8q24 (8). However, in replication studies by these researchers, only 3 of 7 study populations showed a statistically significant association consistent with the GWAS finding, resulting in a lower strength of association (OR 1.08, P = 0.023; ref. 8).
A subsequent study replicated the rs719725 association with CRC using a case-unaffected sibling control design from the Colon Cancer Family Registry (9). This study found a statistically significant association among population-based families (P = 0.011), but not among clinic-based families (P = 0.97), although the difference between these 2 was, itself, not statistically significant (P = 0.26; ref. 9). A later study was unable to replicate rs719725 using a combined analysis of 3 study populations from Sheffield and Leeds, UK, and Utah, USA (OR 1.04, 95% CI, 0.91–1.19; ref. 10). Overall results from these subsequent replications were inconsistent, demonstrating an association in some populations but not in others (8–10).
Although GWAS have identified 10 SNPs associated with colorectal tumor risk, the biological functionality of these markers is generally unknown. At the 9p24 locus, SNP rs719725 does not lie within a gene itself, but there are several genes nearby. These include the following: protein kinase NYD-SP25 isoform 3 (TPD52L3, 37-kb telomeric), which is a member of the tumor protein D52 family; interleukin 33 (IL33, 124-kb telomeric); ubiquitin-like PHD and RING finger domain-containing protein (UHRF2, 47-kb centromeric); and glycine dehydrogenase (GLDC, 167-kb centromeric; ref. 9). Potentially, rs719725 could also be in linkage disequilibrium with a variant that modifies a long-range enhancer of any of these genes, as seen for the variants in 8q24 and MYC (11). rs719725 is upstream of UHRF2 and downstream of TPD52L3, IL33, and GLDC; it lies in a haplotype block that is ∼114 kb in size, which is part of a larger block that is ∼407 kb in size (Haploview 4.1, release 21, CEU population; ref. 12). The small block contains TPD52L3; the large block includes TPD52L3, IL33, and UHRF2. As with other tag SNPs, associations between rs719725 and colorectal tumors may be due to linkage disequilibrium between the genetic marker and the true susceptibility allele, or alleles, at a neighboring locus. Beyond risk identification, a recent study found rs719725 to be statistically significantly associated with time to tumor recurrence in adjuvant CRC patients (13), suggesting this SNP may also be important for prognosis and survival.
To further investigate whether the genetic variant on 9p24 is associated with colorectal tumors, we examined the impact of rs719725 on both colorectal adenoma and cancer, allowing the capture of a broader spectrum of colorectal tumor development. Because the large majority of colorectal malignancies develop from an adenomatous polyp (14), any genetic influence on these precursor lesions is likely to have an effect on cancer as well. We examined this association in 4 well-characterized study populations. These studies allowed for a detailed investigation of potential interactions with environmental risk factors known to be associated with CRC risk including obesity, physical activity, nonsteroidal anti-inflammatory drug (NSAID) use, smoking, and diet. Such factors may contribute to heterogeneity between studies and may help to explain the inconsistent findings previously reported.
Materials and Methods
Study samples
To investigate the full spectrum of CRC, we included both colorectal adenoma and CRC from 4 study populations (total 3,891 cases, 4,490 matched controls): the Women's Health Initiative (WHI: 656 CRC cases, 664 matched controls); the Diet, Activity and Lifestyle Study (DALS: 1,461 colon cancer cases, 1,813 matched controls); a Minnesota population-based case–control study (MinnCCS: 517 colorectal adenoma cases, 628 matched controls); and the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial (PLCO: 1,257 colorectal adenoma cases, 1,385 matched controls).
The WHI Observational Study (OS) is a prospective cohort study, which enrolled 93,676 postmenopausal women aged 50 to 79 years living near 40 clinical centers across the United States from 1993 to 1998, with continuous follow-up (15). We included invasive incident CRC cases diagnosed up through September 12, 2005. Controls were randomly selected and individually matched on age at screening, enrollment date, ethnicity, hysterectomy status, and prevalent CRC at baseline (16), using risk-set sampling (17).
DALS is a population-based case–control study of colon cancer (18). Cases and controls were recruited from the Kaiser Permanente Medical Care Program of Northern California, an 8-county area in Utah, and the metropolitan twin cities of Minnesota. Cases had been diagnosed with first primary colon cancer between October 1, 1991, and September 30, 1994. Cancers of the rectosigmoid junction or rectum were not included in this study. Controls were matched to cases by 5-year age groups and sex, and came from membership lists (Kaiser), driver license lists (Utah up to age 65, Minnesota), Health Care Financing Administration lists (Utah over age 65), or state identification lists (Minnesota).
The MinnCCS is a case–control study within the Minnesota Cancer Prevention Research Unit (19). Cases and controls, aged 30 to 74 years, were recruited from patients that were scheduled for colonoscopy from April 1991 to April 1994 within a large, multiclinic gastroenterology practice in metropolitan Minneapolis. Cases were subjects found to have adenoma at the time of colonoscopy, and controls were subjects found to be without adenomatous or hyperplastic polyps at colonoscopy (20).
The PLCO is a randomized trial of ∼155,000 persons aged 55 to 74, enrolled during 1993 to 2001 from 10 U.S. centers, to evaluate the effectiveness of screening on cancer mortality (21). We conducted a nested case–control study among participants randomized to the screening arm, who underwent a 60-cm flexible sigmoidoscopy examination at study entry. Cases included participants with advanced colorectal adenoma of the distal colon or rectum (adenoma ≥ 1 cm in size, diagnosed as: villous/tubulovillous; high-grade dysplasia; or carcinoma in situ). Controls were participants who had a successful baseline screening exam that was negative for polyps in the distal colon and rectum. Controls were frequency matched to cases on ethnicity, sex, and, for a subset, age (21).
Further details regarding the methods used for selection of subjects and data collection in these studies have been previously reported. Given the small proportion of non-Caucasians in these studies (2.9%–16.2%), all 4 data sets were restricted to Caucasians only, which the numbers above reflect. All reported studies obtained informed consent and received Institutional Review Board (IRB) approval.
Genotyping
For the WHI and DALS studies, genotyping was performed by MALDI-TOF mass spectrometry on the Sequenom MassARRAY 7K platform using the iPLEX Gold (low-plex) reaction (16). For the PLCO and MinnCCS studies, genotyping was performed using the TaqMan assay system: PLCO used the Fluidigm Biomark system (Fluidigm) and MinnCCS used an ABI 7900HT instrument (ABI; ref. 20). These methods have been previously reported and described in further detail elsewhere (16, 20). Quality-control measures in these studies included call-rate cutoffs, blinded duplicates, and other standard methods. In all 4 studies, the minor allele frequency (MAF; C allele) was ∼37% and controls were in Hardy–Weinberg equilibrium (HWE; P > 0.18).
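The HWE check on controls can be illustrated with a minimal sketch. It assumes a 1-degree-of-freedom Pearson chi-square test on genotype counts, which is a common choice but is not confirmed as the test used in these studies; the function name and the counts in the usage line are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ac, n_cc):
    """Pearson chi-square test for Hardy-Weinberg equilibrium (1 df).

    Arguments are observed genotype counts (hypothetical inputs).
    Returns (chi_square, p_value).
    """
    n = n_aa + n_ac + n_cc
    p = (2 * n_aa + n_ac) / (2 * n)  # frequency of the A allele
    q = 1.0 - p                      # frequency of the C allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ac, n_cc)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control counts, not taken from any of the 4 studies.
print(hwe_chi_square(400, 460, 140))
```

A large P value (for example, above the 0.18 reported here) gives no evidence of departure from HWE among controls.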
Statistical analysis
We calculated odds ratios (OR) and 95% CIs to estimate the association between SNP rs719725 and colorectal neoplasia. We conducted logistic regression analyses for each study separately, including study-specific covariates, such as age, sex, and study center, as appropriate. To be consistent with most previous studies, we used the minor allele C as the reference allele. We calculated risk on the basis of the log-additive model, by including a single variable coded as 0, 1, or 2, reflecting the number of copies of the major A allele. We also evaluated risk for an unrestricted model, without any assumption of the underlying genetic model, which provides OR estimates comparing genotypes AC versus CC and AA versus CC. We combined the individual study-specific adjusted log-ORs using a linear mixed-effects model and then estimated the overall ORs and 95% CIs (22).
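The log-additive coding described above can be sketched as follows. This is an unadjusted illustration only: it omits the study-specific covariates (age, sex, study center) used in the actual analyses, fits the model by Newton-Raphson on grouped genotype counts, and uses a hypothetical function name and hypothetical counts.

```python
import math


def fit_log_additive(cases, controls, max_iter=50, tol=1e-10):
    """Unadjusted logistic regression of case status on A-allele dosage.

    cases / controls: dicts mapping 'CC', 'AC', 'AA' to counts
    (hypothetical inputs). Returns (OR per A allele, lower, upper 95% CI).
    """
    dosage = {"CC": 0, "AC": 1, "AA": 2}  # copies of the A allele; C is referent
    groups = [(dosage[g], cases[g] + controls[g], cases[g]) for g in dosage]
    b0 = b1 = 0.0
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for x, n, y in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = n * p * (1.0 - p)
            u0 += y - n * p          # score for the intercept
            u1 += x * (y - n * p)    # score for the slope
            i00 += w                 # information matrix entries
            i01 += x * w
            i11 += x * x * w
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Slope variance from the inverse information at the final iteration.
    se = math.sqrt(i00 / det)
    z = 1.959964
    return math.exp(b1), math.exp(b1 - z * se), math.exp(b1 + z * se)


# Hypothetical genotype counts, for illustration only.
print(fit_log_additive({"CC": 90, "AC": 300, "AA": 266},
                       {"CC": 100, "AC": 310, "AA": 254}))
```

The unrestricted model would instead use two indicator variables (AC versus CC and AA versus CC) in place of the single dosage term.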
To investigate whether the association between rs719725 and cancer differed by anatomic location, we also performed stratified analyses for colon versus rectum. Cases with tumors located at both the colon and the rectum and those with unknown location were excluded from these stratified analyses. Polytomous regression was used to evaluate statistical differences among models. Because WHI only included females, we also evaluated these associations stratified by sex.
Risk factors potentially related to CRC were evaluated for interaction with rs719725 using likelihood ratio tests. The pertinent risk factors considered were: family history of CRC (having at least 1 first-degree relative, yes/no); cigarette smoking (never/former/current and pack-years); use of NSAIDs (yes/no); alcohol consumption (g/d); physical activity (hours of moderate/vigorous exercise per week); body mass index (BMI; kg/m²); folate intake (total folate in mcg/d); calcium intake (mg/d); red meat intake (g/d); and use of postmenopausal hormones (never/former/current and duration). Dietary risk factors were adjusted for daily caloric intake by including this term in the regression models used for the likelihood ratio tests. Risk factors with a statistically significant test in at least 2 studies were considered to lead to potential interaction effects, and OR estimates were calculated stratified on that factor. To assess the overall effect of the interaction we also performed random effects meta-analyses of the betas and standard errors of the interaction terms for each risk factor.
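A likelihood ratio test for interaction compares a model with main effects only against a model that adds the SNP-by-risk-factor product term(s). The sketch below assumes the two maximized log-likelihoods have already been obtained from fitted logistic models; the function names and the numbers in the usage line are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with z = x / 2.
    """
    z = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (LR statistic, P value); df = number of added interaction terms."""
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods; df = 2 would suit a 3-level factor such as
# never/former/current smoking crossed with allele dosage.
print(likelihood_ratio_test(-1520.4, -1517.9, 2))
```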
We performed a random effects meta-analysis combining our 4 studies with 13 results presented in 3 previously published studies on rs719725 and CRC (8–10). For this combined analysis, we included the log-additive OR estimates and 95% CIs for each study population; for a few published results we needed to transform from A allele referent to C allele referent (10) or from an unrestricted model to a log-additive model (9). Forest plots were used to display the results from individual studies, as well as the summary results. The statistical significance of between-study heterogeneity was evaluated using Cochran's Q statistic (23). If the P value was less than 0.10, the heterogeneity was considered statistically significant. We also quantified heterogeneity using the I² metric. I² takes values between 0% and 100%, with higher values indicating higher levels of heterogeneity (24). Potential for publication bias was assessed using Egger's test and visual inspection of funnel plots (25).
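The pooling steps above can be sketched as follows, assuming the DerSimonian-Laird estimator of between-study variance; the text does not name the estimator, so this choice and all function names are assumptions. The sketch also shows how a reported OR and 95% CI convert to a log-OR and standard error, and how an estimate switches referent allele.

```python
import math

Z975 = 1.959964  # standard normal quantile for a 95% CI


def log_or_and_se(or_, lo, hi):
    """Convert an OR and its 95% CI into (log OR, standard error)."""
    return math.log(or_), (math.log(hi) - math.log(lo)) / (2 * Z975)


def flip_referent(or_, lo, hi):
    """Re-express an OR against the other allele: invert and swap CI limits."""
    return 1.0 / or_, 1.0 / hi, 1.0 / lo


def random_effects(estimates):
    """DerSimonian-Laird pooling of (log OR, SE) pairs."""
    y = [b for b, _ in estimates]
    w = [1.0 / se ** 2 for _, se in estimates]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(estimates) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for _, se in estimates]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - Z975 * se_pooled),
               math.exp(pooled + Z975 * se_pooled)),
        "q": q, "df": df, "tau2": tau2, "i2": i2,
    }


# The WHI and PLCO per-A-allele estimates reported in Results, pooled only
# to illustrate the mechanics; this is not the 17-study analysis.
inputs = [log_or_and_se(1.19, 1.01, 1.40), log_or_and_se(1.11, 0.99, 1.25)]
print(random_effects(inputs))
```

The P value for Q would come from a chi-square distribution with `df` degrees of freedom.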
Analyses were performed using SAS version 9.1 (SAS Institute), STATA version 11 (StataCorp), and HaploView version 4.2 (12). Our statistical significance cutoff was a P value of 0.05 and for marginal statistical significance was 0.10.
Results
Demographic information for each study is shown in Tables 1a and b. Participants in the 4 studies had a mean age of 57.9 to 67.1 years. By definition, the WHI study population was 100% female, whereas the proportion female was 45.5% in DALS, 50.7% in MinnCCS, and 35.8% in PLCO. As cases and controls were matched on age and sex, there were no substantial differences between cases and controls on these variables. Due to restriction, all participants were self-reported Caucasians. As expected, cases were more likely to have a positive family history of CRC, except in MinnCCS. The difference in family history in MinnCCS is plausibly explained by the fact that participants were selected among those who elected to get screened for CRC: for cases, indications for screening are likely to be symptom- or disease-related, whereas among controls, screening was more likely sought by the worried well with a positive family history. Except in WHI, cases tended to be more likely than controls to be current or former smokers. The proportion of smokers was generally similar across these populations. Rectal cancers made up 18.2% of the WHI cases and 0% of the DALS cases; rectal adenoma made up 16.3% of the MinnCCS cases and 22.3% of the PLCO cases (Table 2).
In the WHI study, rs719725 was statistically significantly associated with risk of CRC, with an OR of 1.19 per A allele (95% CI, 1.01–1.40; Ptrend = 0.04; Table 3). Similarly, the PLCO data provided a marginally statistically significant finding in the same direction for risk of advanced adenoma (OR 1.11 per A allele; 95% CI, 0.99–1.25; Ptrend = 0.07). In both MinnCCS and DALS, rs719725 was not associated with risk of colorectal tumors. The OR estimates from the unrestricted models were generally consistent in their direction and trend with the results from the additive models.
To further evaluate the association between rs719725 and colorectal neoplasia risk, we performed logistic regression analyses stratified by tumor location (Table 3). These stratified analyses did not show any statistically significant differences between colon and rectal tumors in any study (P for difference > 0.07). Stratifying these analyses by sex did not produce any significantly different results (Table 4).
Only 3 interactions were found to be statistically significant and all were in MinnCCS (Table 5). These interactions were between rs719725 and each of folate intake (P = 0.02); hormone replacement therapy (HRT) use (P = 0.001); and duration of HRT use (P = 0.01).
A meta-analysis of the 4 studies presented here and 13 previous studies (total n = 17; Fig. 1) resulted in an overall OR of 1.07 (95% CI, 1.03–1.12) per A allele (P = 0.001), showing a statistically significant association between rs719725 and colorectal neoplasia across the study populations. There was little evidence for between-study heterogeneity (I² = 24.4%, heterogeneity P = 0.17). Information (where available) on the number of cases and controls, MAF, HWE, and sex of the other studies included in the meta-analysis is shown in Table 6.
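As a rough consistency check on these summary figures, the sketch below recovers an approximate z statistic and two-sided P value from the pooled OR and its CI, and back-calculates Cochran's Q from I² using I² = (Q − df)/Q with df = 17 − 1. Because the reported values are rounded, the recovered numbers are approximate; the function names are hypothetical.

```python
import math


def z_and_p_from_ci(or_, lo, hi):
    """Approximate Wald z and two-sided P from an OR and its 95% CI."""
    se = (math.log(hi) - math.log(lo)) / (2 * 1.959964)
    z = math.log(or_) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p


def q_from_i2(i2_percent, n_studies):
    """Invert I^2 = (Q - df) / Q for Q, valid when I^2 > 0."""
    df = n_studies - 1
    return df / (1.0 - i2_percent / 100.0)


z, p = z_and_p_from_ci(1.07, 1.03, 1.12)
print(round(z, 2), round(p, 4))
print(round(q_from_i2(24.4, 17), 2))
```

The recovered P value is of the same order as the reported P = 0.001, and a Q of about 21.2 on 16 degrees of freedom is in line with the reported heterogeneity P of 0.17.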
Because the interpretation of the results in MinnCCS may be impacted by the large fraction of controls with a positive family history, given that controls were selected among those that chose to be screened for CRC, we also obtained results excluding this study. Taking the MinnCCS study out of the meta-analysis, however, did not produce substantially different results (OR 1.08 per A allele; 95% CI, 1.03–1.13; P = 0.001). Due to potential differences between adenoma and cancer, we also obtained results among studies of cancer only. Taking the adenoma studies PLCO and MinnCCS out of the meta-analysis, however, also did not produce substantially different results (OR 1.08 per A allele; 95% CI, 1.03–1.13; P = 0.002).
Discussion
Among 4 well-characterized epidemiologic studies, we found a statistically significant association with CRC in 1 nested case–control study (WHI) and a marginally significant association with advanced adenoma in another (PLCO), whereas 2 other case–control studies, of colon cancer (DALS) and colorectal adenoma (MinnCCS), did not show evidence for association. We were not able to explain the differences in our results by tumor site or by interactions with environmental risk factors for CRC. A meta-analysis of 17 studies (including these 4) showed a statistically significant association for this SNP with colorectal tumor. Although the primary distinction among the 4 studies presented here is that an association was found in the cohort-based studies but not in the case–control studies, this pattern does not characterize the rest of the data. Indeed, the first study to report the association was a case–control study.
Our results are similar to those of other publications that showed inconsistent findings for rs719725 and risk of CRC. Summarizing findings in a meta-analysis of all available results provided evidence for a positive association between the rs719725 A allele and colorectal tumor (OR 1.07, P = 0.001). However, results do not reach genome-wide significance levels of 10⁻⁷ to 10⁻⁸ (26, 27). It is notable that only 5 of the 17 reported risk estimates showed no positive association and only 2 of these showed an inverse association (Fig. 1). We observed no statistical evidence for heterogeneity, though we may not have had the necessary power to detect it (24, 28). A funnel plot of the estimates of the association used in the meta-analysis did not demonstrate any evidence of publication bias (Egger's test; P = 0.68), though our power to detect such bias may be limited (Fig. 2).
The weak OR estimates found in this study are not surprising, given the generally weak risk estimates commonly found in GWAS. Although it seems likely that this particular SNP carries only a slight increase in risk of colorectal tumor, its identification, if true, would nevertheless be important for identifying potentially new pathways relevant to preventive and treatment strategies. Furthermore, in concert with many other susceptibility loci, rs719725 could potentially be used to measure an overall risk profile for colorectal tumor. A recent review estimated that ∼172 SNPs account for all of the genetic variance for CRC (7). In this way, rs719725 could potentially resemble a single small piece of the large CRC genetic risk profile puzzle; the problem of what is sufficient and what is necessary is likely to remain unresolved for a long time.
Among the 4 study populations, we did not find any consistent evidence for any gene–environment interaction. We observed only 3 statistically significant interactions in 1 study (MinnCCS), for rs719725 and folate intake and HRT use and duration, but these were not replicated in the other 3 study populations. These 3 findings do not remain statistically significant after adjustment for multiple comparisons. Further, the lack of a clear pattern across all 4 studies, along with observing these interactions in the smallest of the 4 studies, suggests that these findings are likely to be spurious. Our hypothesis that these environmental factors could explain some of the heterogeneity between studies is not supported by our data, though we may have been underpowered to explore such effects or perhaps did not evaluate the correct environmental factor. We included essentially all of the currently known CRC environmental factors. The biological function of this SNP is unknown, and therefore we did not have an a priori hypothesis to examine interactions with specific environmental factors. We chose the approach of evaluating all established environmental risk factors as potential effect modifiers, with the goal of determining whether these factors are potentially responsible for the inconsistent results reported in previous papers.
A limitation of this study is the inability to investigate gene–gene interactions, which may be important for this locus. It is possible that such interactions may play a role in explaining the different results across studies, and future research will be necessary to determine how risk is impacted by considering multiple genetic variants simultaneously. We examined only the previously reported SNP in this region and it is possible that other variants are more relevant. Although we restricted study participants to only include Caucasians, population stratification is still a potential reason for the inconsistent findings reported across studies. The definition of an adenoma case was also more broadly interpreted in MinnCCS than in PLCO, and this difference in definition could have contributed to the inconsistent association found in these 2 studies.
Strengths of this study include the use of multiple studies with large sample sizes and the availability of detailed information on environmental risk factors for CRC, allowing us to evaluate whether environmental factors modify the association between this SNP and colorectal tumor.
Although we can neither prove nor disprove the existence of a risk locus on chromosome 9p24 that is tagged by SNP rs719725, our analysis, as well as the meta-analysis including previously published studies, provides additional support for an association between variation at 9p24 and colorectal neoplasia. Further replication in larger studies seems to be warranted. If the association continues to be replicated, this could be followed by studies to identify the disease-related polymorphism being tagged and its possible function. It may also be pertinent to evaluate this association among different races and ethnicities and to further explore potential interactions with other genetic or environmental factors.
Disclosure of Potential Conflicts of Interest
No conflicts of interest to declare.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health.
Acknowledgments
The authors thank Dr. Roberd Bostick and Ms. Lisa Fosdick for their contributions to the study design and data collection in the MinnCCS. The authors thank Drs. Christine Berg and Philip Prorok, Division of Cancer Prevention, National Cancer Institute, the Screening Center investigators and staff of the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, Mr. Tom Riley and staff, Information Management Services, Inc., Ms. Barbara O'Brien and staff, Westat, Inc., Mr. Tim Sheehy and staff, DNA Extraction and Staging Laboratory, SAIC-Frederick, Inc., and Ms. Jackie King and staff, BioReliance, Inc. Most importantly, we acknowledge the study participants for their contributions to making this study possible.
Grant Support
This study was funded by National Cancer Institute (NCI), National Institutes of Health (NIH), U.S. Department of Health and Human Services (DHHS) awards NIH R01 CA120582 and NIH K22 CA118421 (Dr. Peters), NIH R01 CA48998 (Dr. Slattery), NIH R01 CA059045 (Dr. Potter), NIH R25 CA094880 (Mr. Kocarnik), NIH K22 CA118421 (Dr. Hutter), NIH R01 AG14358 and PO1 CA53996 (Dr. Hsu). The WHI program is funded by the National Heart, Lung, and Blood Institute, NIH, DHHS through contracts N01WH22110, 24152, 32100-2, 32105-6, 32108-9, 32111-13, 32115, 32118-32119, 32122, 42107-26, 42129-32, and 44221 (Dr. Caan, Dr. Beresford, Dr. Rajkovic, Dr. Sarto, Dr. Wallace, Dr. Prentice). The project described was supported by Award Numbers R01 CA059045 and R01 CA120582 from the National Cancer Institute. This research was also supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics and by contracts from the Division of Cancer Prevention, National Cancer Institute, NIH, DHHS.