Advances in genomics offer new strategies for assessing the association of common genetic variations at multiple loci and risk of many diseases, including colorectal cancer. Low-penetrance alleles of genes in many biological pathways, such as DNA repair, metabolism, inflammation, cell cycle, apoptosis, and Wnt signaling, may influence the risk of nonfamilial colorectal cancer. To identify susceptibility genes for colorectal cancer, we designed a large-scale case-control association study nested within the Nurses' Health Study (190 cases and 190 controls) and the Health Professionals' Follow-up Study (168 cases and 168 controls). We used a custom GoldenGate (Illumina) oligonucleotide pool assay including 1,536 single nucleotide polymorphisms (SNP) selected in candidate genes from cancer-related pathways, which have been sequenced and genotyped in the SNP500Cancer project; 1,412 of the 1,536 SNPs (92%) were genotyped successfully within 388 genes. SNPs in high linkage disequilibrium (r2 ≥ 0.90) with another assayed SNP were excluded from further analyses. As expected by chance (and not significant compared with a corrected Bonferroni P = 0.00004), in the additive model, 11 of 1,253 (0.9%) SNPs had a Ptrend < 0.01 and 38 of 1,253 (3.0%) SNPs had a Ptrend ≥ 0.01 and Ptrend < 0.05. Of note, the MGMT Lys178Arg (rs2308327) SNP, in linkage disequilibrium with the previously reported MGMT Ile143Val SNP, had an inverse association with colorectal cancer risk (MGMT Lys178Arg: odds ratio, 0.52; 95% confidence interval, 0.35-0.78; unadjusted Ptrend = 0.0003 for the additive model; gene-based test global P = 0.00003). The SNP500Cancer database and the Illumina GoldenGate Assay allowed us to test a larger number of SNPs than previously possible. We identified several SNPs worthy of investigation in larger studies. (Cancer Epidemiol Biomarkers Prev 2008;17(2):311–9)

Colorectal cancer, a complex disease arising from both genetic and environmental factors, is the third most common cancer and the second most common cause of death due to cancer in the United States (1).

So far, susceptibility to colorectal cancer has been characterized by the identification of rare inherited mutations in a small number of established genes and by diet and lifestyle factors, including intake of red meat and alcohol as well as smoking, physical activity, and obesity. High-penetrance mutations in the APC/WNT pathway (in the APC, AXIN, and CTNNB1 genes) and mismatch repair pathway (in the MLH1, MSH1, MSH2, MSH3, PMS1, and PMS2 genes) are found in a proportion of cases of familial colorectal cancer, but these alterations account for only a small fraction of the risk of colorectal cancer in the general population. However, genetic variation in common, low-penetrance genes in multiple biological pathways, such as DNA repair, metabolism, inflammation, cell cycle, apoptosis, and WNT signaling, may also contribute to the etiology of inherited and sporadic cases of colorectal cancer. The spectrum of allelic differences in low-penetrance genes could account for the interindividual variation in response to diet and lifestyle factors.

The advent of highly annotated single nucleotide polymorphism (SNP) data from the International HapMap Project, coupled with the development of high-throughput genotyping platforms, has made SNPs attractive markers for large-scale association studies in candidate genes (2-5). The candidate gene approach may offer insight valuable for detecting genetic associations with colorectal cancer. However, most published studies have been limited to a single or a small number of SNPs in a small number of genes. Therefore, we conducted a prospective nested case-control study in the Nurses' Health Study (NHS) and the Health Professionals' Follow-up Study (HPFS) to evaluate common sequence variants at 1,536 loci in 388 genes to assess associations with susceptibility to colorectal cancer.

Study Population

The NHS is an ongoing prospective study of 121,700 U.S. female registered nurses. Details of the design and follow-up of this cohort have been described previously (9, 10). Briefly, at enrollment in 1976, the participants, ages 30 to 55 years, completed a questionnaire providing information on risk factors for cancer and cardiovascular disease. Exposure and disease information are updated biennially. From 1989 to 1990, blood samples were collected from 32,826 of the NHS participants. After blood collection through June 2000, 197 incident cases of colorectal cancer were confirmed through medical records or death reports, of which 190 cases were successfully genotyped. Controls were randomly selected from women who were alive and free of cancer at the time of case ascertainment. One control was matched to each case on year of birth and month of blood draw.

The HPFS began in 1986 when 51,529 U.S. male dentists, optometrists, osteopaths, podiatrists, pharmacists, and veterinarians, ages 40 to 75 years, responded to a mailed questionnaire (9). These men provided baseline information on age, marital status, height, weight, ancestry, medications, smoking history, medical history, physical activity, and diet. Exposure and medical history information are updated every 2 years. Blood samples were collected from 18,225 of the HPFS participants between 1993 and 1995. Among these men, 168 incident cases of colorectal cancer were identified between the date of blood draw and January 2002. Men who were alive and free of diagnosed cancer at the time of case ascertainment were selected as controls and were matched to cases on year of birth and month of blood draw.

Sample Collection

Venous blood samples were separated into plasma, buffy coat, and RBCs and stored in liquid nitrogen. Genomic DNA was extracted from 50 μL buffy coat diluted with 150 μL PBS using the QIAamp (Qiagen) 96-spin blood protocol according to the manufacturer's instructions. Concentrations of genomic DNA were measured in 96-well format with PicoGreen technology (Molecular Probes).

Gene and SNP Selection

The SNP500Cancer database (http://snp500cancer.nci.nih.gov), a component of the National Cancer Institute's Cancer Genome Anatomy Project, provides sequence and genotype assay information for candidate SNPs in genes hypothesized to be related to cancer. National Cancer Institute's SNP500Cancer reports sequence analysis and allele prevalence information in anonymized control DNA samples (n = 102 Coriell samples representing four self-described ethnic groups: African/African American, Caucasian, Hispanic, and Pacific Rim; refs. 2, 4).

We designed an Illumina oligonucleotide pool assay (OPA) selecting candidate genes that were resequenced in the SNP500Cancer project (4). As of November 2004, there were a total of 5,800 SNPs with a minor allele frequency (MAF) of >3% in the combined four self-described populations (African/African American, Caucasian, Hispanic, and Pacific Rim) annotated in the SNP500Cancer database (2). Overall, the SNP selection approach for this study was to examine 10 kb upstream and 10 kb downstream in accordance with design score validations based on Illumina in-house measurements and the 60-bp limitation (a SNP cannot be closer than 60 bp to another SNP on this OPA). After excluding SNPs with high r2 (defined as r2 ≥ 0.8), we did a preliminary screen of the remaining 3,072 SNPs in the SNP500Cancer database (2-8, 11). From the 3,072-SNP panel, the final selection of the 1,536 polymorphisms chosen for the OPA included SNPs with a MAF of >3% in the unrelated (that is, parents) HapMap CEPH Utah (CEU, with European ancestry) samples and favored nonsynonymous SNPs, SNPs with evidence for functional significance, and SNPs evaluated previously in relation to cancer risk (2, 4).

Genotyping Methods

Whole-genome amplified DNA plates were sent to Illumina for high-throughput genotyping by the highly multiplexed GoldenGate assay (5). The assay was conducted on DNA that was whole-genome amplified using the multiple displacement amplification method relying on the Phi29 DNA polymerase protocol (12, 13). The high concordance rates and minimal excess of significant departures from Hardy-Weinberg equilibrium (HWE) support the high fidelity of the whole-genome amplification procedure. The GoldenGate assay queries genomic DNA with probes (two allele-specific oligonucleotides and one locus-specific oligonucleotide) and creates DNA fragments that can be amplified by standard PCR methods with three universal primers. Fluorescently labeled strands are hybridized to the appropriate probe on the microbead array, which allows identification of a particular SNP (5). Of the 1,536 SNPs, 1,421 loci were successfully genotyped with high completion rates (>80%) and MAFs >0.03 in controls, including 9 duplicate SNPs (to verify concordance); thus, there were 1,412 unique SNPs in the OPA.

SNP Characteristics

The SNPs included in the OPA had a MAF of >3% in 60 unrelated parents of CEU trios from HapMap (3). Of the 1,536 assays chosen for this study, 103 were dropped from the analysis because of low MAF (<3% in our study population) or assay problems (low genotyping success rate determined by Illumina using the GenCall score threshold at 50%). Thus, we obtained data on 1,421 SNPs (including 9 duplicate SNPs) in 388 genes. Of the 1,421 SNPs we genotyped, 51% were intronic, 37% were exonic, and 12% were in the promoter region. Overall, in the 388 genes studied, the mean number of SNPs per gene was 3.5 and the maximum number was 37 (in the GSK3B gene). Of the 1,412 unique SNPs included in this study, 310 were coding SNPs, 145 of which were nonsynonymous SNPs.

In ∼55 (14%) genes, we attempted to select an adequate number of SNPs to tag common haplotypes. This category also included the genes AXIN2, CDKN2A, CTNNB1, GSK3B, KRAS, MSH2, and PMS1, which were submitted because of their high probability of association with colorectal cancer. These genes were then tagged with Tagger based on the CEU population with a MAF of >5% in dbSNP and HapMap data available in January 2005.

Statistical Analyses

Controls were matched to cases 1:1 according to age, month of blood draw, and fasting status at blood draw. Colorectal cancer risk was considered in relation to the 1,412 SNPs. Each polymorphic locus was tested for HWE in the control population. χ2 statistics were used to test for differences in the distributions of the genotypes or other potential risk factors between cases and controls. Conditional logistic regression was used to compute odds ratios (OR) and 95% confidence intervals (95% CI) for the main effect of genotype and to control for potentially confounding variables. Polymorphisms were categorized into three genotypes: homozygous wild-type, heterozygous, and homozygous variant. Homozygosity for the more common allele was treated as the reference category. Tests for linear trend of log ORs (additive model) were calculated using an ordered categorical variable by assigning scores to the genotypes: 0 (no variant allele), 1 (carrying one variant allele), and 2 (carrying two variant alleles). For polymorphisms with fewer than five individuals with the homozygous variant genotype, homozygotes for the variant genotype were combined with heterozygotes. There were no significant differences between men and women in the main effect of genotype for SNPs with additive Ptrend < 0.01. Therefore, to maximize power, only the combined analyses are presented. Analyses were controlled for known risk factors for colorectal cancer, including family history of colorectal cancer, smoking history, body mass index, multivitamin use, aspirin use, postmenopausal hormone use (among women in the NHS), physical activity, and intake of red meat, folate, and alcohol.
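The genotype scoring described above can be illustrated with a short sketch. This is not the SAS code used in the study; the function names, the two-character genotype strings, and the `min_homozygous` parameter are assumptions introduced only for illustration.

```python
from collections import Counter

def additive_score(genotype, variant_allele):
    """Count copies of the variant allele (0, 1, or 2) in a genotype such as 'CT'."""
    return sum(1 for allele in genotype if allele == variant_allele)

def code_genotypes(genotypes, variant_allele, min_homozygous=5):
    """Return additive scores, collapsing to a carrier model for rare homozygotes.

    When fewer than `min_homozygous` individuals are homozygous for the
    variant allele, homozygous variants are combined with heterozygotes
    (score 1), mirroring the rule stated in the text.
    """
    scores = [additive_score(g, variant_allele) for g in genotypes]
    n_homozygous_variant = Counter(scores)[2]
    if n_homozygous_variant < min_homozygous:
        scores = [min(score, 1) for score in scores]
    return scores

# Hypothetical genotypes at one locus, with T as the variant allele.
example = ["CC", "CT", "TT", "CT"]
print(code_genotypes(example, "T"))                    # carrier coding
print(code_genotypes(example, "T", min_homozygous=1))  # full additive coding
```

The resulting score would enter the regression model as a single ordinal term, so that the trend test has one degree of freedom per SNP.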

For all covariates, we used updated data from the most recent questionnaire completed before diagnosis of cancer. Individuals with missing genotype data or covariate values were excluded from specific analyses. All P values are based on two-sided tests. All statistical analyses were done with SAS (version 9.1; SAS Institute; ref. 14).

SNP spectral decomposition was used to calculate the Meff value to correct for multiple testing (http://genepi.qimr.edu.au/general/daleN/SNPSpD/; ref. 15). This correction strategy accounts for the linkage disequilibrium (LD) between polymorphic sites. Failure to account for the nonindependence of SNPs would make the Bonferroni correction overconservative. The ratio of observed eigenvalue variance, Var(λobs) to its maximum (M) gives the proportional reduction in the number of variables in a set. The effective number of variables (Meff) is calculated as Meff = 1 + (M - 1) [1 - Var(λobs) / M] (15).
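As a minimal sketch of the Meff formula above, the calculation can be written without an explicit eigendecomposition. It assumes the input is a symmetric pairwise correlation matrix with a unit diagonal. For such a matrix the eigenvalues sum to M (so their mean is 1) and their squares sum to the sum of the squared matrix entries, which gives the sample variance of the eigenvalues directly. The function name is hypothetical and this is not the SNPSpD implementation.

```python
def effective_number_of_tests(corr):
    """Meff = 1 + (M - 1) * (1 - Var(lambda_obs) / M) for a correlation matrix.

    `corr` is a list of lists holding a symmetric correlation matrix with
    ones on the diagonal (an assumption of this sketch).
    """
    m = len(corr)
    if m == 1:
        return 1.0
    # Sum of squared eigenvalues equals the sum of squared entries
    # for a symmetric matrix.
    sum_squared_entries = sum(r * r for row in corr for r in row)
    # Sample variance of the eigenvalues (mean eigenvalue is 1).
    var_lambda = (sum_squared_entries - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_lambda / m)

# Three independent SNPs count as three tests.
print(effective_number_of_tests([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
# Three perfectly correlated SNPs count as one test.
print(effective_number_of_tests([[1, 1, 1], [1, 1, 1], [1, 1, 1]]))
```

The two limiting cases show the intended behavior: Meff equals M when the SNPs are uncorrelated and falls to 1 when they are in complete LD.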

In addition to the SNP-based tests, we also evaluated two gene-based tests for association using the P value obtained from the likelihood ratio test comparing models with and without (a) terms for heterozygous and homozygous variant genotypes for each SNP in a gene (df = 2 × number of SNPs per gene) and (b) terms for each SNP (genotypes assigned as 0, 1, 2) in a gene (df = number of SNPs per gene).
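The likelihood ratio tests described above compare twice the difference in log-likelihood against a χ2 distribution with the stated degrees of freedom. The following sketch shows that step only; the log-likelihood values are hypothetical placeholders, the function names are assumptions, and the χ2 tail probability is computed from the standard recurrence for the regularized upper incomplete gamma function rather than from any particular statistics package.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1.0
    return q

def lrt_p_value(loglik_null, loglik_full, df):
    """P value for a likelihood ratio test of nested models."""
    statistic = 2.0 * (loglik_full - loglik_null)
    return chi2_sf(statistic, df)

# Hypothetical gene with four SNPs: test (a) uses 2 terms per SNP, test (b) 1.
n_snps = 4
print(lrt_p_value(-480.0, -468.5, df=2 * n_snps))  # genotype-term test
print(lrt_p_value(-480.0, -470.0, df=n_snps))      # trend-term test
```

Test (b) spends fewer degrees of freedom, so it tends to be more powerful when effects are roughly additive, whereas test (a) can pick up dominant or recessive patterns.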

In this study, 358 cases of colorectal cancer and 358 matched controls from the prospective NHS and HPFS cohorts were genotyped for 1,412 unique SNPs in cancer pathways. After excluding SNPs in high LD (r2 > 0.90), we examined the association of 1,299 SNPs and colorectal cancer risk and present a summary of the OPA performance and the associations for the SNPs with a Ptrend < 0.01 in the additive model.

OPA Characteristics

The OPA contained 1,421 successfully genotyped SNPs derived from 388 genes. The loci success rate was 92.5% (1,421 of 1,536 SNPs successfully genotyped). The genotyping success rate for the DNA samples was 98.4% [874 of 888 total samples, including quality-control (QC) samples]. Nine duplicate SNPs were included in the analysis for additional QC measures and were 99.9% concordant (data not shown). After duplicate SNPs were filtered, 1,412 unique SNPs remained for further analysis (see Supplementary Table S1 for a comprehensive list of the 1,299 SNPs analyzed after excluding SNPs in LD with an r2 > 0.90).

QC Analysis

Fourteen percent blinded QC samples were included. Analysis of the 35 QC replicates, involving 130 samples, resulted in a 99.95% overall concordance rate (184,640 of 184,730 QC genotype pairs were concordant). Of the 35 QC replicate sample sets, 20 (57.14%) had a 100% concordance rate. The other 15 QC sets had concordance rates ranging from 99.62% to 99.96%. The 90 discordant QC genotype pairs were confined to 30 loci (2.11% of total loci), of which 6 (rs3765459, rs1052576, rs3774268, rs3736228, rs7260, and E3359_310) had concordance rates from 89.23% to 96.92%.

Study Population

Select characteristics of the study population are shown in Table 1. Briefly, the mean age at diagnosis was 71.0 years (range, 50-86) in the HPFS and 65.5 years (range, 46-78) in the NHS. Cases were more likely than controls to have a family history of colorectal cancer and to consume ≥1 serving per day of red meat, less likely to be regular users of aspirin, and tended to be less physically active and have lower intakes of dietary folate.

Hardy-Weinberg Analysis

The genotype distributions for the 1,421 polymorphisms were examined and tested for agreement with HWE. Among the controls, 72 loci (5.49%) had HWE χ2 test P values ≤0.05 and 6 loci (0.42%) had HWE χ2 test P values ≤0.001.
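For readers unfamiliar with the HWE test, a minimal sketch of the one-degree-of-freedom χ2 goodness-of-fit test for a biallelic locus follows. The genotype counts are hypothetical, the function name is an assumption, and the study itself used SAS rather than this code.

```python
import math

def hwe_chi_square(n_hom_ref, n_het, n_hom_var):
    """Chi-square test (1 df) for Hardy-Weinberg equilibrium at a biallelic SNP.

    Returns the test statistic and its P value.
    """
    n = n_hom_ref + n_het + n_hom_var
    p = (2 * n_hom_ref + n_het) / (2 * n)  # frequency of the reference allele
    q = 1 - p
    observed = (n_hom_ref, n_het, n_hom_var)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical control genotype counts.
print(hwe_chi_square(25, 50, 25))  # counts exactly at HWE proportions
print(hwe_chi_square(50, 0, 50))   # complete heterozygote deficit
```

With roughly 1,400 loci tested, about 5% are expected to fall below P = 0.05 by chance alone, which is the benchmark against which the observed proportions above are read.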

Correction for Multiple Comparisons

SNPs in high LD were excluded from the analysis. First, we evaluated which SNPs had pairwise r2 values <0.9 on an individual chromosome basis. If there were multiple SNPs in one gene at the r2 = 0.9 level, we included nonsynonymous SNPs and SNPs with MAF of >1% (113 SNPs were excluded in total). The Meff value (15) calculated by SNP spectral decomposition (see Supplementary Table S2) for the remaining 1,299 SNPs is 1,253, and the resultant Bonferroni-corrected genome-wide significance threshold is 3.99 × 10^-5. Thus, the interpretation of the Meff correction is that the single-locus tests for 1,299 SNPs distributed on the 22 human autosomes are effectively equivalent to 1,253 statistically independent single-locus tests.
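The arithmetic behind the corrected threshold can be checked in a few lines. The sketch assumes a family-wise α of 0.05, which the text does not state explicitly but which reproduces the reported value; the variable names are illustrative.

```python
ALPHA = 0.05   # assumed family-wise error rate
M_EFF = 1253   # effective number of independent tests reported in the text

# Bonferroni threshold applied to the effective number of tests.
threshold = ALPHA / M_EFF
print(f"{threshold:.2e}")  # 3.99e-05

# Number of tests expected to reach Ptrend < 0.01 under the global null.
expected_below_001 = M_EFF * 0.01
print(round(expected_below_001, 1))  # 12.5
```

About 12.5 of 1,253 independent tests would be expected to reach Ptrend < 0.01 by chance, which is close to the 11 SNPs observed at that level.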

SNPs with Ptrend Values <0.01 for the Additive Model

Although not significant after this Bonferroni correction, among the 1,299 SNPs genotyped (Fig. 1A), 11 SNPs had a Ptrend < 0.01 (Fig. 1B) in the additive model. These SNPs were in glutathione peroxidase 3 (GPX3; rs8177426 and rs8177477), hemochromatosis (HFE; rs707889), lymphotoxin α (LTA; rs3093546), postmeiotic segregation increased 2 (PMS2; rs6463524), cyclin-dependent kinase inhibitor 2A (CDKN2A; rs3731239), O6-methylguanine-DNA methyltransferase (MGMT; rs2308327), apoptotic peptidase activating factor 1 (APAF1; rs2288729), RAD51 homologue (RAD51; rs2412546), cytochrome P450, family 19, subfamily A, polypeptide 1 (CYP19A1; rs28566535), and spermidine/spermine N1-acetyltransferase (SAT2/SSAT; rs13894) and were analyzed further in multivariate conditional models (Table 2) using a SNP-based approach and a gene-based approach. The distributions of all these SNPs, except the GPX3 SNPs, were in agreement with HWE in cases and controls from both the NHS and the HPFS cohorts. The nonsynonymous MGMT Lys178Arg SNP had a Ptrend = 0.0003 in the additive model (Fig. 1B). We found a significant inverse association among individuals with one copy (multivariate OR, 0.54; 95% CI, 0.37-0.79) and one or more copies of the MGMT Lys178Arg variant allele (multivariate OR, 0.51; 95% CI, 0.35-0.75). The relationship remained after adjusting for other known risk factors for colorectal cancer (one or more copies of the variant allele: OR, 0.52; 95% CI, 0.35-0.78). The MGMT gene-based tests for association (Bonferroni-corrected P = 0.0001), evaluating four SNPs, had a notable global P value for genotype associations (P = 0.0003) and a global P value based on trend test for individual SNPs (P = 0.00003).

Compared with the HFE IVS6 + 462 CC genotype, the CT variant (multivariate OR, 0.60; 95% CI, 0.41-0.87) and the TT variant (OR, 0.22; 95% CI, 0.08-0.60) were associated with a reduced risk of colorectal cancer. Compared with the LTA 50 CC genotype, the CT + TT variants (OR, 0.40; 95% CI, 0.21-0.79) were inversely associated with colorectal cancer. The APAF1 2093 AG + GG variants were associated with a suggestive reduction in risk of colorectal cancer (OR, 0.71; 95% CI, 0.51-1.00).

The PMS2 S260S, RAD51 A4480G, and SAT2 Arg126Cys SNPs were associated with an increased risk of colorectal cancer. The risk of colorectal cancer was increased in individuals carrying the PMS2 260 CC genotype (OR, 2.34; 95% CI, 0.95-5.79) relative to the risk in those carrying the GG genotype. The OR was slightly attenuated in the variant carrier model (OR, 1.60; 95% CI, 1.11-2.30). Differences were observed in the genotype distribution between cases and controls for rs2412546 in intron 5 of RAD51 (Ptrend = 0.009). The minor allele of the RAD51 SNP was associated with an increased risk of colorectal cancer (OR, 1.62; 95% CI, 1.11-2.34). The nonsynonymous SNP at Arg126Cys (rs13894) in exon 6 of SAT2 (Ptrend = 0.006) was associated with an increased risk of colorectal cancer (variant carrier genotypes: OR, 2.54; 95% CI, 1.50-4.29).

SNPs with Ptrend > 0.01 and Ptrend < 0.05

Table 3 presents the main effects for 43 SNPs with a Ptrend < 0.05 but Ptrend > 0.01. Of interest is the functional nonsynonymous polymorphism catechol-O-methyltransferase (COMT) Val158Met involved in the dopaminergic pathway and associated with several psychiatric conditions (16), which in this study was associated with a reduced risk of colorectal cancer (OR, 0.75 and 0.58 for the heterozygote and minor allele variant, respectively).

Nonsynonymous SNPs

A total of 145 nonsynonymous SNPs were successfully genotyped (see Supplementary Table S3), including nonsynonymous SNPs in genes, such as APC, AXIN, MLH1, and MSH3, associated previously with colorectal cancer risk. All had P values >0.05, except for the previously noted MGMT K178R, SAT2 R126C, and COMT Val158Met polymorphisms (Bonferroni-corrected P value for the 145 nonsynonymous SNPs = 0.0003).

New technologies for large-scale SNP analysis enable much larger sets of SNPs to be tested for association simultaneously and may help in deciphering the complex nature of colorectal and other cancers. In this set of a priori candidate genes in cancer pathways, almost all the SNP associations were null, although our sample size and thus the statistical power to detect a modest association is limited. A limitation of genetic association studies is the lack of functional information for many SNPs. Nonsynonymous SNPs are common genetic variants that alter encoded amino acids in proteins; thus, nonsynonymous SNPs may modify the structure or function of expressed proteins. Our analysis included 145 nonsynonymous SNPs that had a higher probability of being associated with colorectal cancer risk; 3 of the nonsynonymous SNPs had nominally significant P values <0.05. An additional benefit of large-scale analysis, which yields an abundance of genotypic data, is that many SNPs that are not significantly associated with disease risk can be reported in a single study, thus reducing the probability of publication bias.

Quality Control

As this is one of the first case-control analyses using the Illumina GoldenGate assay to be reported, we did a detailed evaluation of the data quality. The high concordance of the QC data confirms that large-scale association studies using the GoldenGate technology provide highly reliable results at a cost of about five cents per genotype.

SNPs with Ptrend Values <0.01

There were 11 SNPs in 10 genes in various biological pathways associated with risk of colorectal cancer at the Ptrend < 0.01 level in the additive model. SNPs in HFE, LTA, PMS2, CDKN2A, MGMT, APAF1, RAD51, and SAT2 genes were analyzed further in multivariate conditional models. Main-effects analysis identified the association of two nonsynonymous SNPs, MGMT Lys178Arg (“benign” by polymorphism phenotyping, which combines a conservation score with additional properties to predict the functional importance of an amino acid alteration) and SAT2 Arg126Cys (“possibly damaging” by polymorphism phenotyping), with colorectal cancer risk (17). In a recent evaluation of 1,041 nonsynonymous SNPs in 2,575 colorectal cancer cases and 2,707 controls, these nonsynonymous SNPs were not significantly associated with colorectal cancer risk (18).

MGMT eliminates mutagenic DNA adducts from the O6 position of the guanine nucleotide in the DNA direct reversal repair pathway (19, 20). The MGMT Lys178Arg variant allele was strongly associated with a reduced risk of colorectal cancer. In addition, gene-based tests for MGMT also suggested associations with colorectal cancer. The MGMT Lys178Arg SNP is in strong LD with MGMT Ile143Val. The current findings overlap with our previously reported associations for the MGMT Ile143Val among NHS women (10); however, the MGMT Lys178Arg polymorphism among men in the HPFS has not been evaluated before. The Ile143Val SNP, in exon 5, lies in close proximity to the 145Cys alkyl-acceptor residue (21) and to the conserved estrogen-receptor interacting helix (10) and may alter the function of this DNA repair gene.

SAT2/SSAT catalyzes the transfer of the acetyl group from acetyl-CoA to the N1 position of spermidine or spermine and has a critical role in governing intracellular polyamine concentrations, compounds involved in cell proliferation. SAT2 is overexpressed in colorectal cancer cells (22) in the presence of a K-Ras mutation. The SAT2 Arg126Cys SNP, associated with increased risk in our study, may modulate susceptibility to colorectal cancer via alterations in polyamine metabolism especially in association with aspirin/NSAID usage (23).

In this study, the RAD51 4480 AG + GG, the CYP19A1 14872 AC + CC, and the PMS2 260 GC + CC variants were associated with a significantly increased risk of colorectal cancer. RAD51 is involved in the repair of double-strand breaks by homologous recombination (24). PMS2 forms a MutLα heterodimer with MLH1 in the mismatch repair complex. Mutations in mismatch repair have been associated with hereditary nonpolyposis colorectal cancer as well as with a subset of colon tumors. Aberrations in mismatch repair related to hereditary nonpolyposis colorectal cancer are typically characterized by mutations in the MLH1 and MSH2 genes and occasional mutations in MSH6 (25). Our data suggest that the PMS2 S260S CC genotype is associated with an increased risk of colorectal cancer, although it does not alter the protein sequence; LD with another gene variant may be responsible for this association. The HFE 462 TT, LTA C50T, GPX3 G1961A (intron 1) GA + AA, and GPX3 C14T (intron 4) genotypes were associated with a reduced risk of colorectal cancer. HFE SNPs involved in iron metabolism have been associated with colorectal adenoma and cancer risk (26-28). The CDKN2A A185G variants and the APAF1 A2093G variants, involved in the cell cycle and apoptosis, respectively, were also associated with colorectal cancer risk.

SNPs with Ptrend Values <0.05

The nonsynonymous COMT Val158Met polymorphism regulates COMT activity with the Met/Met variant associated with a 3- to 4-fold difference in function (intermediate phenotype observed in heterozygotes; ref. 29). This COMT Met/Met genotype may be inversely associated with colorectal cancer risk via the estrogen metabolism pathway (30, 31). Previous studies have found inverse associations with the Met/Met genotype for postmenopausal breast cancer but an increased risk for premenopausal breast cancer (32).

This study has several limitations. Large-scale association studies entail the performance of numerous statistical tests. Evaluating each test by an uncorrected threshold would yield a surplus of loci deemed significant due to chance. Therefore, we corrected for multiple testing using the Bonferroni and SNP spectral decomposition (for tests that are not independent because of high LD between the markers) approaches to reduce spurious associations. However, given our modest sample size, we had limited power to detect significant differences at these extreme P values.

With 358 cases of colorectal cancer, we were underpowered to evaluate gene-environment and gene-gene interactions. Although the SNPs were selected among candidate genes, we did not use a haplotype-tagging approach. Therefore, the association with colorectal cancer risk is not definitive for many genes, for which only one to two SNPs were included in this study. Nevertheless, there were 145 nonsynonymous SNPs in this analysis, of which two loci were strongly associated with colorectal cancer risk. In addition, the low MAF of several SNPs may make accurate detection of a modest association with colorectal cancer risk difficult. Further, we recognize that susceptibility to colorectal cancer is influenced by genetic epistasis and determined by synergistic interactions between environmental carcinogens and allelic variants of multiple genes in numerous pathways. Last, the functional relevance of many of the polymorphisms examined in this study is unknown.

In summary, these data extend the current knowledge of genetic variation associated with colorectal cancer risk. SNP-based and gene-based approaches suggest that genetic variants in MGMT may be associated with colorectal cancer risk. Our study lends further support to the previously reported association of the MGMT Ile143Val SNP, which is located near the 145Cys residue and is in linkage disequilibrium with the MGMT Lys178Arg SNP genotyped in this OPA, with risk of colorectal cancer. In addition, we identified novel polymorphisms, including the nonsynonymous SNPs SAT2 Arg126Cys and COMT Val158Met, associated with colorectal cancer risk. The PMS2 S260S SNP is in a gene in the mismatch repair pathway, a major pathway with an established relation to colorectal cancer. In addition to replication of these findings in other populations, further investigation to establish the functional relationship of these SNPs with colorectal cancer is warranted. With advances in affordable genotyping technology and annotation of common human genetic variation, large-scale analyses such as this study have the potential to substantially clarify the inherited component of colorectal cancer risk.

Grant support: NIH research grants CA70817, CA87969, and CA55075, Entertainment Industry Foundation National Colorectal Cancer Research Alliance, and NIH training grant T-32 CA 09001-30 (A. Hazra).

Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).

We thank Pati Soule, Hardeep Ranu, and Craig Labadie for laboratory assistance and the dedicated participants of the NHS and the HPFS for ongoing commitment.

1. American Cancer Society. Cancer facts and figures. 2006.
2. Packer BR, Yeager M, Staats B, et al. SNP500Cancer: a public resource for sequence validation and assay development for genetic variation in candidate genes. Nucleic Acids Res 2004;32:D528–32.
3. A haplotype map of the human genome. Nature 2005;437:1299–320.
4. Packer BR, Yeager M, Burdett L, et al. SNP500Cancer: a public resource for sequence validation, assay development, and frequency analysis for genetic variation in candidate genes. Nucleic Acids Res 2006;34:D617–21.
5. Steemers FJ, Gunderson KL. Illumina, Inc. Pharmacogenomics 2005;6:777–82.
6. Foster CB, Aswath K, Chanock SJ, McKay HF, Peters U. Polymorphism analysis of six selenoprotein genes: support for a selective sweep at the glutathione peroxidase 1 locus (3p21) in Asian populations. BMC Genet 2006;7:56.
7. Hughes AL, Packer B, Welch R, Chanock SJ, Yeager M. High level of functional polymorphism indicates a unique role of natural selection at human immune system loci. Immunogenetics 2005;57:821–7.
8. Hunter DJ, Riboli E, Haiman CA, et al. A candidate gene approach to searching for low-penetrance breast and prostate cancer genes. Nat Rev Cancer 2005;5:977–85.
9. Tranah GJ, Giovannucci E, Ma J, Fuchs C, Hunter DJ. APC Asp1822Val and Gly2502Ser polymorphisms and risk of colorectal cancer and adenoma. Cancer Epidemiol Biomarkers Prev 2005;14:863–70.
10. Tranah GJ, Bugni J, Giovannucci E, et al. O6-methylguanine-DNA methyltransferase Leu84Phe and Ile143Val polymorphisms and risk of colorectal cancer in the Nurses' Health Study and Physicians' Health Study (United States). Cancer Causes Control 2006;17:721–31.
11. Garcia-Closas M, Malats N, Real FX, et al. Large-scale evaluation of candidate genes identifies associations between VEGF polymorphisms and bladder cancer risk. PLoS Genet 2007;3:e29.
12. Tranah GJ, Lescault PJ, Hunter DJ, De Vivo I. Multiple displacement amplification prior to single nucleotide polymorphism genotyping in epidemiologic studies. Biotechnol Lett 2003;25:1031–6.
13. Paynter RA, Skibola DR, Skibola CF, Buffler PA, Wiemels JL, Smith MT. Accuracy of multiplexed Illumina platform-based single-nucleotide polymorphism genotyping compared between genomic and whole genome amplified DNA collected from multiple sources. Cancer Epidemiol Biomarkers Prev 2006;15:2533–6.
14. SAS. Genetics user's guide. Cary: SAS Institute, Inc.; 2002.
15. Nyholt DR. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet 2004;74:765–9.
16. Rothe C, Koszycki D, Bradwejn J, et al. Association of the Val158Met catechol O-methyltransferase genetic polymorphism with panic disorder. Neuropsychopharmacology 2006;31:2237–42.
17. Sunyaev S, Ramensky V, Bork P. Towards a structural basis of human non-synonymous single nucleotide polymorphisms. Trends Genet 2000;16:198–200.
18. Webb EL, Rudd MF, Sellick GS, et al. Search for low penetrance alleles for colorectal cancer through a scan of 1467 non-synonymous SNPs in 2575 cases and 2707 controls with validation by kin-cohort analysis of 14 704 first-degree relatives. Hum Mol Genet 2006;15:3263–71.
19. Lindahl T, Demple B, Robins P. Suicide inactivation of the E. coli O6-methylguanine-DNA methyltransferase. EMBO J 1982;1:1359–63.
20. Pegg AE. Regulation of ornithine decarboxylase. J Biol Chem 2006;281:14529–32.
21. Chueh LL, Nakamura T, Nakatsu Y, Sakumi K, Hayakawa H, Sekiguchi M. Specific amino acid sequences required for O6-methylguanine-DNA methyltransferase activity: analyses of three residues at or near the methyl acceptor site. Carcinogenesis 1992;13:837–43.
22. Linsalata M, Giannini R, Notarnicola M, Cavallini A. Peroxisome proliferator-activated receptor γ and spermidine/spermine N1-acetyltransferase gene expressions are significantly correlated in human colorectal cancer. BMC Cancer 2006;6:191.
23. Babbar N, Gerner EW, Casero RA, Jr. Induction of spermidine/spermine N1-acetyltransferase (SSAT) by aspirin in Caco-2 colon cancer cells. Biochem J 2006;394:317–24.
24. Thacker J. The RAD51 gene family, genetic instability and cancer. Cancer Lett 2005;219:125–35.
25. Modrich P. Mechanisms and biological effects of mismatch repair. Annu Rev Genet 1991;25:229–53.
26. Stevens RG, Morris JE, Cordis GA, Anderson LE, Rosenberg DW, Sasser LB. Oxidative damage in colon and mammary tissue of the HFE-knockout mouse. Free Radic Biol Med 2003;34:1212–6.
27. Shaheen NJ, Silverman LM, Keku T, et al. Association between hemochromatosis (HFE) gene mutation carrier status and the risk of colon cancer. J Natl Cancer Inst 2003;95:154–9.
28. Chan AT, Ma J, Tranah GJ, et al. Hemochromatosis gene mutations, body iron stores, dietary iron, and risk of colorectal adenoma in women. J Natl Cancer Inst 2005;97:917–26.
29. Mannisto PT, Kaakkola S. Catechol-O-methyltransferase (COMT): biochemistry, molecular biology, pharmacology, and clinical efficacy of the new selective COMT inhibitors. Pharmacol Rev 1999;51:593–628.
30. Lotta T, Vidgren J, Tilgmann C, et al. Kinetics of human soluble and membrane-bound catechol O-methyltransferase: a revised mechanism and description of the thermolabile variant of the enzyme. Biochemistry 1995;34:4202–10.
31. Dawling S, Roodi N, Mernaugh RL, Wang X, Parl FF. Catechol-O-methyltransferase (COMT)-mediated metabolism of catechol estrogens: comparison of wild-type and variant COMT isoforms. Cancer Res 2001;61:6716–22.
32. Thompson PA, Shields PG, Freudenheim JL, et al. Genetic polymorphisms in catechol-O-methyltransferase, menopausal status, and breast cancer risk. Cancer Res 1998;58:2107–10.
