Prior studies of lung cancer and CYP1A1/2 in African-American and Latino populations have shown inconsistent results and have not yet investigated the haplotype block structure of CYP1A1/2 or addressed potential population stratification. To investigate haplotypes in the CYP1A1/2 region and lung cancer in African-Americans and Latinos, we conducted a case-control study (1998–2003). African-Americans (n = 535) and Latinos (n = 412) were frequency matched on age, sex, and self-reported race/ethnicity. We used a custom genotyping panel containing 50 single nucleotide polymorphisms in the CYP1A1/2 region and 184 ancestry informative markers selected to have large allele frequency differences between Africans, Europeans, and Amerindians. Latinos exhibited significant haplotype main effects in two blocks even after adjusting for admixture [odds ratio (OR), 2.02; 95% confidence interval (95% CI), 1.28–3.19 and OR, 0.55; 95% CI, 0.36–0.83], but no main effects were found among African-Americans. Adjustment for admixture revealed substantial confounding by population stratification among Latinos but not African-Americans. Among Latinos and African-Americans, interactions between smoking level and haplotypes were not statistically significant. Evidence of population stratification among Latinos underscores the importance of adjusting for admixture in lung cancer association studies, particularly in Latino populations. These results suggest that a variant occurring within the CYP1A2 region may be conferring an increased risk of lung cancer in Latinos. [Cancer Res 2009;69(6):2340–8]

Lung cancer is the third most frequently diagnosed malignancy in the United States, and its mortality surpasses that of all other cancers. Racial/ethnic differences in lung cancer incidence and mortality persist, with African-Americans and Latinos experiencing the highest and lowest incidence and mortality, respectively, among all racial/ethnic groups in the United States.⁷

⁷ Ries LAG, Melbert D, Krapcho M, et al. SEER cancer statistics review, 1975-2004 [database on the Internet]. Bethesda (MD): National Cancer Institute [cited 2007 April 17]. Available from http://seer.cancer.gov/csr/1975_2004/.

Reasons for the disparate incidence rates remain unclear, perhaps because of the paucity of studies in Latinos; in addition, investigations in African-Americans have yet to definitively identify genetic loci associated with their increased cancer susceptibility. By examining recently admixed populations with contrasting incidence and accounting for population substructure, genetic susceptibility loci for the disease may be identified.

CYP1A1 and CYP1A2 are located on chromosome 15q in opposite orientation and separated by 23.3 kb (1). CYP1A1 is a major phase I enzyme, present in lung tissue, which activates procarcinogens present in tobacco smoke (2–4). CYP1A2 is preferentially expressed in hepatic tissue, although it is also present in lung tissue (5, 6). CYP1A2 biotransforms the tobacco-specific nitrosamines 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone and N′-nitrosonornicotine (4) and is activated among smokers (7, 8). Mechanistically, lung cancer susceptibility is mediated by allelic variants of CYP1A1/2 resulting in phenotypes characterized by intermediate or poor metabolism of procarcinogens. These procarcinogens can become ultimate carcinogens or proximate intermediates able to form DNA adducts, causing mutations in tumor suppressor genes and eventually initiating carcinogenesis (3, 4).

Considerable racial/ethnic differences in allele and haplotype frequencies (9, 10) occur in the CYP1A1/2 region, commensurate with racial/ethnic differences in CYP1A1 induction (11) and >60-fold interindividual differences noted in CYP1A2 activity (3). Polymorphisms in CYP1A1 have been extensively investigated in lung cancer etiology, especially the CYP1A1 T6235C (rs4646903, also referred to as M1 or Msp1) and A4889G (rs1048943, also known as M2 or Ile462Val) variants. Investigations of these two loci suggest ethnic variation in genetic susceptibility to lung cancer, although it remains unclear whether these loci are involved in lung carcinogenesis in African-Americans or Latinos (12–16). Less attention has been focused on CYP1A2 in lung cancer susceptibility despite its role in metabolizing important tobacco carcinogens and other compounds. Overall, the lack of consistent associations in studies of CYP1A1/2 conducted in different racial/ethnic populations may be due to differences in allele frequencies or linkage disequilibrium patterns.

Debate has ensued as to whether population stratification is of concern and whether studies in recently admixed populations provide credible results (17, 18). Because both lung cancer incidence and allele frequencies vary across populations, population stratification should be considered. We investigated CYP1A1/2 haplotypes and lung cancer in African-Americans and Latinos, accounting for potential confounding by population stratification.

Study population. Newly diagnosed lung cancer patients residing in the San Francisco Bay Area were identified using rapid case ascertainment methods conducted by the Northern California Cancer Center and Summit Medical Center from September 1998 to March 2003. Incident patients were eligible for participation if they (a) self-identified as African-American or Latino; (b) were 21 y or older; (c) resided within the counties of Alameda, Contra Costa, Santa Clara, San Francisco, or San Mateo; and (d) had a diagnosis of primary lung cancer. Cases meeting eligibility criteria were invited to participate in an in-person interview and to donate a biologic (blood or buccal smear) sample. A total of 368 cases (255 African-Americans and 113 Latinos) are included in this analysis.

Potential controls were recruited from the following: (a) random digit dialing; (b) Health Care Financing Administration (HCFA) records for persons ages 65 or older; and (c) community-based sources, such as churches, health fairs, and senior centers. For each case, approximately twice as many eligible controls were recruited having the same age (±10 y), sex, and self-identified race/ethnicity. Eligible controls were invited to participate in an in-person interview and to donate a biologic sample. A total of 579 controls (280 African-Americans and 299 Latinos) are included in the analysis.

The San Francisco Bay Area Lung Cancer Study was approved by the University of California Committee for the Protection of Human Subjects. Written informed consent was obtained from all participating subjects.

Interview data collection and specimen processing. Epidemiologic data were collected during in-person interviews using a structured questionnaire to ascertain exposure histories from before diagnosis for cases and before interview date for controls. At the time of interview, blood and buccal specimens were collected. Specimens were transported to a University of California, San Francisco laboratory within 48 h of collection and processed for storage until genotyping. When samples from all participants had been collected, biospecimens were thawed and DNA was isolated by automated phenol chloroform extraction using the Autogen 3000 (Autogen, Inc.). DNA concentration was measured by fluorescence (PicoGreen, Invitrogen Corp.) and normalized to 30 to 100 ng/μL, for a total amount of 150 to 500 ng used for genotyping. Whole genome amplification was performed on samples yielding insufficient DNA (blood, n = 2; buccal, n = 4) in accordance with the Omniplex protocol (Sigma-Aldrich Corp.), and the amplified product was cleaned with a Millipore Montage PCR96 filter plate (Millipore Corp.; ref. 19).

Genotyping. Genetic markers included 50 single nucleotide polymorphisms (SNP) in the CYP1A1/2 region on chromosome 15. CYP1A1/2 SNPs were identified from the following sources: (a) published literature, (b) International HapMap Project (20), and (c) SNP500Cancer database (21). Literature SNPs were selected if they had a minor allele frequency (MAF) >5% in any HapMap population (Build 34) and were previously identified as either associated with cancer or a tagSNP of a reported haplotype in the CYP1A1/2 region, including several SNPs previously characterized (10). The HapMap database was used to identify tagSNPs (r2 ≥ 0.8) in the CYP1A1/2 region having a MAF >5% in the Yoruba and CEPH populations. SNPs 10,000 bp upstream and downstream of the CYP1A1/2 gene boundaries were included to ensure gene coverage when generating tagSNPs from HapMap data. The SNP500Cancer database was queried for SNPs in CYP1A1 and CYP1A2. SNPs were selected for genotyping if the MAF was >5% in any SNP500Cancer or Human Diversity Panel population. Based on HapMap CEPH data (Build 36 and r2 ≥ 0.8), the total number of SNPs either genotyped or tagged by SNPs in our panel was 85 across the genotyped CYP1A1/2 region for a marker density of one SNP per 1.4 kb. Based on HapMap Yoruban data, the total number of SNPs either genotyped or tagged by SNPs in our panel was 65 for a marker density of one SNP per 1.9 kb.

DNA collected from African-American and Latino participants was genotyped according to the manufacturer's protocol at the University of California Davis Genome Center using the Illumina Bead Station 500G Golden Gate genotyping platform. A total of 996 African-American or Latino participants were genotyped. Participants were selected for genotyping if they were a lung cancer case (Latino or African-American) or a Latino control. A random sample of African-American controls was selected to complete the study. Participants were removed from statistical analyses if they self-reported more than one ethnicity (n = 44) or DNA sample quality was poor (n = 5), resulting in a final sample size of 947 self-described African-American and Latino participants.

In addition to CYP1A1/2 SNPs, a panel of 184 autosomal biallelic ancestry informative markers (AIM) distinguishing the continental ancestor populations comprising Latinos and African-Americans was genotyped to address potential population stratification. The AIM panel (previously designed by author M.F.S.) was developed by selecting SNPs across the genome with high informativeness for ancestry between Amerindian, European, and sub-Saharan African continental populations (22, 23). Criteria for the AIMs included validation in multiple subgroups for each continent and a lack of linkage disequilibrium (r2 < 0.6) between AIMs in continental populations. Genotyping of the AIMs was conducted using not only DNA collected from the Latino and African-American participants but also DNA from European Americans (San Francisco Bay Area, n = 47), West Africans (Bini from Edo State and Kanuri from Nigeria, n = 46), and Amerindians (Mayans from Bola De Oro and Cienega Grande, Guatemala, n = 46) to improve genetic ancestry estimation of the Latinos and African-Americans. Mean fixation indices (FST), estimated using FSTAT following Weir and Cockerham (24), were 0.52 for West Africans versus Europeans, 0.52 for West Africans versus Amerindians, and 0.48 for Europeans versus Amerindians.

Genotypes obtained from genomic and whole genome amplified (WGA) DNA were called using separate clustering analyses. Genotype call rates (GenCall ≥ 0.25) averaged 99.41% and 99.16% (SD ± 0.010) for genomic and WGA samples, respectively. Genotype reproducibility was verified with duplicates of unamplified DNA and WGA/genomic DNA pairs. Unamplified duplicates (n = 31) had a mean reproducibility of 99.99%. WGA/genomic DNA pairs amplified from blood (n = 18 pairs) and buccal specimens (n = 28 pairs) exhibited a mean genotype reproducibility of 99.39% and 98.49%, respectively.
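The reproducibility figures above can be read as the fraction of markers with identical calls in two runs of the same sample. The following is a minimal sketch of that calculation; the function name, the example calls, and the choice to skip markers with a missing call are illustrative assumptions, not details reported by the study.

```python
def concordance(calls_a, calls_b):
    """Fraction of markers with identical genotype calls in two runs.

    Markers with a missing call (None) in either run are skipped; this
    handling of no-calls is an assumption, not taken from the study.
    """
    compared = [(a, b) for a, b in zip(calls_a, calls_b)
                if a is not None and b is not None]
    if not compared:
        raise ValueError("no markers were called in both runs")
    matches = sum(1 for a, b in compared if a == b)
    return matches / len(compared)


# Hypothetical calls for one genomic/WGA pair at five markers.
genomic = ["AA", "AG", "GG", None, "AG"]
wga = ["AA", "AG", "AG", "GG", "AG"]
pair_rate = concordance(genomic, wga)  # 3 of the 4 comparable calls agree
```

A mean reproducibility would then be the average of `pair_rate` over all duplicate pairs.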

Statistical analysis. Analyses were conducted separately for African-Americans and Latinos. Exact tests for Hardy-Weinberg equilibrium (HWE) and measures of linkage disequilibrium were conducted using Statistical Analysis System v9.1/Genetics software (SAS). Using African-American and Latino control genotyping data, the haplotype block structure was estimated using the confidence interval algorithm (25) in Haploview 3.2 (26). Haplotypes and their frequencies were estimated from unphased individual genotype data using the HAPPY macro (27). Genetic ancestry of African-American and Latino participants was determined using 184 AIMs and a maximum likelihood approach based on estimation methods described by Chakraborty and Weiss (28), Chakraborty (29), and Hanis and colleagues (30). To improve ancestry assignment, ancestral population AIM frequencies were input along with the genotypes of the admixed participants.
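As an illustration of maximum likelihood ancestry estimation, the sketch below uses an expectation-maximization (EM) update for a single individual, treating each allele as drawn from one ancestral population with known allele frequencies. It is a simplified stand-in under those assumptions, not the estimation procedure of references 28 to 30, and the allele frequencies and genotypes are hypothetical.

```python
def estimate_admixture(genotypes, anc_freqs, n_iter=200):
    """EM estimate of one individual's ancestry proportions.

    genotypes: copies (0, 1, or 2) of the reference allele at each AIM;
               None marks a missing genotype.
    anc_freqs: for each AIM, the reference-allele frequency in each
               ancestral population (same population order throughout).
    """
    k = len(anc_freqs[0])
    q = [1.0 / k] * k  # start from equal ancestry proportions
    for _ in range(n_iter):
        totals = [0.0] * k
        n_alleles = 0
        for g, freqs in zip(genotypes, anc_freqs):
            if g is None:
                continue
            # A genotype contributes g reference and 2 - g alternate alleles.
            for count, is_ref in ((g, True), (2 - g, False)):
                if count == 0:
                    continue
                like = [q[i] * (f if is_ref else 1.0 - f)
                        for i, f in enumerate(freqs)]
                norm = sum(like)
                if norm == 0:
                    continue  # allele impossible under current estimate
                for i in range(k):
                    totals[i] += count * like[i] / norm
                n_alleles += count
        if n_alleles == 0:
            raise ValueError("no informative genotypes")
        q = [t / n_alleles for t in totals]
    return q


# Hypothetical frequencies: (African, European, Amerindian) per marker.
AIMS = [(0.9, 0.1, 0.1), (0.1, 0.9, 0.1), (0.1, 0.1, 0.9)] * 4
genotypes = [2, 0, 0] * 4  # alleles typical of the first population
q = estimate_admixture(genotypes, AIMS)
```

Because the proportions sum to one, only two of the three estimates carry independent information, which is why only two would enter a regression model as covariates.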

Logistic regression models were used to estimate odds ratios (OR) assessing the association between CYP1A1/2 haplotypes and lung cancer, adjusting for the variables age, sex, and smoking pack-years. Subject-specific haplotype probabilities were incorporated as covariates into regression models, which estimated ORs associated with having a specific haplotype under an additive model. Likelihood ratio tests were used to assess the influence of the explanatory variables. Individual admixture estimates obtained from the AIMs were added as continuous covariates to logistic regression models to assess possible confounding by population heterogeneity. Only two of the three admixture proportions (European and Amerindian) were included in the model because the three admixture proportions sum to one, making them collinear. Models with and without genetic admixture were compared to identify population stratification. A confounding risk ratio (CRR) comparing ORs adjusted and unadjusted for admixture was chosen to provide a quantitative measure of the amount of confounding. Admixture proportions were included as a covariate in follow-up logistic regression analyses if adjusted and unadjusted haplotype ORs differed by ≥10%. For haplotypes showing significant associations with lung cancer, individual SNP analyses were conducted for genotyped SNPs present in each haplotype. Logistic regression models for single SNP analyses included variables coding for the heterozygous and homozygous variants such that no inheritance mode was assumed. Trend tests were conducted using the log-additive model by assigning 0, 1, or 2 copies of the minor allele to genotypes. A type I error rate of α = 0.05/7 haplotype blocks = 0.007 was set for inference of main effects. This approach decreases the type I error without being too conservative for this candidate gene study having a priori hypotheses. 
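The CRR and the 10% criterion can be sketched as follows. The function names are illustrative, and reading "differed by ≥10%" as |CRR − 1| ≥ 0.10 is an assumption. The OR pairs are published values from Tables 2 and 5; because they are rounded, a recomputed CRR can differ slightly from the tabulated one.

```python
def confounding_risk_ratio(or_unadjusted, or_adjusted):
    """CRR = OR without admixture adjustment / OR with admixture adjustment."""
    return or_unadjusted / or_adjusted


def shows_confounding(or_unadjusted, or_adjusted, threshold=0.10):
    """True when the CRR departs from 1 by at least `threshold`."""
    crr = confounding_risk_ratio(or_unadjusted, or_adjusted)
    return abs(crr - 1.0) >= threshold


# Published OR pairs (unadjusted, admixture-adjusted).
crr_2c = confounding_risk_ratio(2.17, 2.02)   # Latinos, haplotype 2C
flag_2c = shows_confounding(2.17, 2.02)       # below the 10% criterion
flag_latino = shows_confounding(0.93, 0.80)   # Latinos, block 2
flag_aa = shows_confounding(1.36, 1.57)       # African-Americans, block 1
```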
Exploratory analyses examined interactions between haplotypes and smoking using logistic regression and a type I error rate of α = 0.1 to infer the presence or absence of interaction. Smoking level was represented in the model as a three-level design variable, allowing the joint influence of smoking and haplotypes to be unconstrained.

Descriptive statistics. Participation rates (completed the questionnaire and provided a biologic sample) for African-American and Latino cases were 69.6% and 54.0%, respectively. The median time between diagnosis and interview date for cases was 99 days for African-Americans and 123 days for Latinos. Among African-American and Latino cases, 19.4% and 16.2%, respectively, were not recruited due to death. Participation rates for African-American and Latino controls were 58.1% and 55.6%, respectively. Among African-American controls, 24.9% were from population-based sources (random digit dialing and HCFA lists) and 75.1% were identified by community-based outreach methods. Among Latino controls, 31.8% were from population-based sources and 68.2% were from community-based outreach methods.

African-American (n = 535) and Latino (n = 412) lung cancer cases and controls did not differ according to the frequency strata-matched variables age and sex (Table 1). Smoking patterns varied between cases and controls for African-Americans and Latinos. Among Latinos, significantly more cases were born in the United States and controls had lower household income compared with cases (Table 1). All genotyped CYP1A1/2 SNPs were in HWE among Latino controls, whereas African-American controls had eight SNPs that were not in HWE (Supplementary Table S1), which decreased to one SNP after correction for multiple testing using false discovery rate (31).

Table 1.

Characteristics of San Francisco Bay Area Lung Cancer Study participants by race/ethnicity, San Francisco Bay Area, California, 1998–2003

| Characteristic | African-American cases (n = 255), n (%) | African-American controls (n = 280), n (%) | P* | Latino cases (n = 113), n (%) | Latino controls (n = 299), n (%) | P* |
|---|---|---|---|---|---|---|
| Age (y)† | | | | | | |
| <50 | 19 (7.5) | 34 (12.1) | | 10 (8.9) | 29 (9.7) | |
| 50–59 | 80 (31.4) | 90 (32.1) | | 24 (21.2) | 53 (17.7) | |
| 60–69 | 77 (30.2) | 82 (29.3) | | 25 (22.1) | 77 (25.8) | |
| 70–79 | 61 (23.9) | 53 (18.9) | | 43 (38.1) | 110 (36.7) | |
| ≥80 | 18 (7.1) | 21 (7.5) | | 11 (9.7) | 30 (10.0) | |
| Mean (SE) | 63.5 (0.7) | 61.8 (0.7) | 0.08 | 65.8 (1.1) | 66.3 (0.7) | 0.73 |
| Min | 30 | 30 | | 36 | 28 | |
| Max | 92 | 87 | | 90 | 95 | |
| Sex† | | | 0.33 | | | 0.54 |
| Male | 120 (47.1) | 120 (42.9) | | 59 (52.2) | 146 (48.8) | |
| Female | 135 (52.9) | 160 (57.1) | | 54 (47.8) | 153 (51.2) | |
| Foreign-born | | | 0.25 | | | <0.01 |
| Yes | 6 (2.4) | 3 (1.1) | | 52 (46.0) | 196 (65.6) | |
| No | 249 (97.7) | 277 (98.9) | | 61 (54.0) | 103 (34.5) | |
| Household income ≥$20,000‡ | 117 (50.9) | 145 (52.5) | 0.71 | 55 (66.1) | 122 (46.6) | 0.02 |
| Mean education years (SE) | 12.3 (0.2) | 13.6 (0.2) | <0.01 | 10.1 (0.4) | 11.0 (0.3) | 0.06 |
| Ever smoked | | | <0.01 | | | <0.01 |
| Yes | 240 (94.1) | 190 (67.9) | | 81 (71.7) | 141 (47.2) | |
| No | 15 (5.9) | 90 (32.4) | | 32 (28.3) | 158 (52.8) | |

* Categorical and continuous variables were assessed with χ² test and two-sample t test, respectively.

† Frequency-matched variable.

‡ Median income.

Linkage disequilibrium and haplotype estimation. Block structure for the genotyped region of chromosome 15 in the 412 unrelated Latino study participants and the 535 unrelated African-Americans is shown in Supplementary Figs. S1 and S2, respectively. Three blocks defined the genotyped region for Latinos and four smaller blocks defined the region for African-Americans (Supplementary Tables S2 and S3). Haploview identified a total of 10 and 15 haplotype-tagging SNPs among Latinos and African-Americans, respectively, for discrimination of haplotypes with estimated frequencies >5% in the CYP1A1/2 region (Supplementary Figs. S3 and S4).

M1 and M2 variants. The M1 and M2 variants were not genotyped in the panel of 50 SNPs because M1 did not meet design criteria for the Illumina platform and a SNP near M2 was selected for genotyping instead. PCR genotyping data were available for both the M1 and M2 loci. In Latino controls, M1 was in high linkage disequilibrium (r² ≥ 0.8) with several SNPs in block 1 (rs17861120, rs12441817, and rs4886605), whereas the M2 variant was in high linkage disequilibrium with two SNPs in block 1 (rs17861120 and rs12441817) and two SNPs in block 2 (rs16972208 and rs17861140). In African-American controls, M1 was in high linkage disequilibrium with two SNPs in block 1 (rs17861109 and rs4886605) and M2 was in high linkage disequilibrium with two SNPs in block 2 (rs16972208 and rs17861140), indicating that these SNPs are tightly linked. M1 and M2 were also in linkage disequilibrium with each other. Because M1 and M2 are in linkage disequilibrium with SNPs in blocks 1 and 2 for Latinos and African-Americans, observed associations with haplotypes in blocks 1 and 2 will indirectly assess associations with the frequently investigated CYP1A1 variants M1 and M2.
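For reference, the r² statistic used for the linkage disequilibrium threshold can be computed from two allele frequencies and the frequency of the haplotype carrying both alleles. This sketch assumes phased haplotype frequencies are already available (in practice they are estimated from unphased genotypes), and the frequencies shown are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical frequencies: two alleles at 30% that almost always co-occur.
r2 = ld_r_squared(p_ab=0.28, p_a=0.30, p_b=0.30)
high_ld = r2 >= 0.8  # the threshold used in the text
```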

Haplotype associations among Latinos. Associations between haplotypes and lung cancer risk for Latinos are presented in Table 2. Only haplotype C in block 2 (haplotype 2C) was significantly associated with an increased risk of lung cancer [OR, 2.17; 95% confidence interval (95% CI), 1.39–3.41], adjusting for the frequency-matched variables and number of smoking pack-years. None of the other haplotypes was significantly associated with lung cancer (P > 0.007). After adjusting for the influence of admixture, haplotype 2C remained significantly associated (P < 0.007) with lung cancer (OR, 2.02; 95% CI, 1.28–3.19) and haplotype 3B became significantly associated (OR, 0.55; 95% CI, 0.36–0.83). The global test for haplotype association for block 2 was significant both before and after adjusting for admixture (P = 0.002; Table 2). Comparison of the crude and adjusted ORs using the CRR showed at least a 10% reduction in five of the point estimates (50% of the haplotype associations), with all but one of the estimates decreasing after adjustment for admixture (Table 2). Individual admixture was therefore included as a covariate in logistic regression analyses to control for this confounding by population stratification.

Table 2.

Assessment of confounding by population stratification for the association between CYP1A1/2 haplotypes and lung cancer among Latinos participating in the San Francisco Bay Area Lung Cancer Study, 1998–2003

| Block | Haplotype | OR (95% CI)* | OR (95% CI)† | CRR‡ |
|---|---|---|---|---|
| 1 | A | 1.00 (reference) | 1.00 (reference) | |
| 1 | B | 1.15 (0.74–1.78) | 1.06 (0.67–1.66) | 1.09 |
| 1 | C | 1.49 (0.89–2.49) | 1.43 (0.85–2.39) | 1.05 |
| 1 | D | 1.01 (0.59–1.74) | 0.90 (0.51–1.58) | 1.13§ |
| 1 | Global test P | 0.48 | 0.49 | |
| 2 | A | 1.00 (reference) | 1.00 (reference) | |
| 2 | B | 0.93 (0.58–1.49) | 0.80 (0.49–1.32) | 1.16§ |
| 2 | C | 2.17 (1.39–3.41) | 2.02 (1.28–3.19) | 1.07 |
| 2 | D | 1.08 (0.52–2.25) | 0.89 (0.42–1.93) | 1.21§ |
| 2 | E | 1.03 (0.47–2.29) | 0.98 (0.44–2.19) | 1.05 |
| 2 | Global test P | 0.002 | 0.002 | |
| 3 | A | 1.00 (reference) | 1.00 (reference) | |
| 3 | B | 0.64 (0.43–0.95) | 0.55 (0.36–0.83) | 1.17§ |
| 3 | C | 0.92 (0.54–1.55) | 0.98 (0.58–1.66) | 0.94 |
| 3 | D | 0.58 (0.24–1.37) | 0.53 (0.22–1.27) | 1.10§ |
| 3 | Global test P | 0.12 | 0.02 | |

NOTE: Haplotypes correspond to haplotypes shown in Supplementary Fig. S3 and Supplementary Table S2. Rare haplotypes with frequencies <5% are not included in regression models. Global tests for haplotype association (comparing models with and without haplotypes) are shown in the rows labeled "Global test P". ORs and 95% CIs are derived using 411 subjects (1 individual is missing number of pack-years).

* Adjusted for age, sex, and number of pack-years.

† Adjusted for age, sex, number of pack-years, and admixture (Amerindian and European ancestry).

‡ CRR = pooled OR/adjusted OR (38).

§ Indicates CRR had a change of ≥10%.

Stratification by European ancestry revealed no significant associations among Latinos having <54% European ancestry (Table 3). Among participants with European ancestry ≥54%, haplotype 2C was significantly associated with an increased risk of lung cancer (OR, 3.56; 95% CI, 1.82–6.95); however, the Woolf test for homogeneity revealed no significant difference between the two ORs for haplotype 2C (P = 0.014) at our specified type I error rate (α = 0.007).
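The homogeneity comparison can be approximated from the published stratum-specific estimates alone. The sketch below recovers each log OR variance from its 95% CI and forms a 1-degree-of-freedom chi-square statistic for two strata. This is an inverse-variance approximation built from rounded, adjusted ORs and is not necessarily the exact computation the authors ran; the function names are illustrative.

```python
import math


def log_or_and_variance(odds_ratio, lower, upper, z=1.96):
    """Recover log(OR) and its variance from a reported OR and 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return math.log(odds_ratio), se ** 2


def homogeneity_two_strata(stratum_1, stratum_2):
    """Chi-square (1 df) test that two stratum-specific ORs are equal."""
    theta_1, var_1 = log_or_and_variance(*stratum_1)
    theta_2, var_2 = log_or_and_variance(*stratum_2)
    chi2 = (theta_1 - theta_2) ** 2 / (var_1 + var_2)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail, chi-square 1 df
    return chi2, p_value


# Haplotype 2C ORs (OR, lower, upper) from Table 3.
low_ancestry = (1.06, 0.53, 2.12)    # European ancestry <54%
high_ancestry = (3.56, 1.82, 6.95)   # European ancestry >=54%
chi2, p_value = homogeneity_two_strata(low_ancestry, high_ancestry)
```

With these inputs the P value comes out close to the reported 0.014, which is above the α = 0.007 threshold.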

Table 3.

Association between CYP1A1/2 haplotypes and lung cancer, stratified by European ancestry, among Latinos participating in the San Francisco Bay Area Lung Cancer Study, 1998–2003

| Block | Haplotype | European ancestry <54% (n = 180), OR (95% CI) | European ancestry ≥54% (n = 231), OR (95% CI) |
|---|---|---|---|
| 1 | A | 1.00 (reference) | 1.00 (reference) |
| 1 | B | 0.73 (0.33–1.60) | 1.38 (0.78–2.44) |
| 1 | C | 0.99 (0.43–2.28) | 1.90 (0.96–3.77) |
| 1 | D | 0.40 (0.11–1.49) | 1.42 (0.73–2.75) |
| 2 | A | 1.00 (reference) | 1.00 (reference) |
| 2 | B | 0.66 (0.28–1.55) | 1.14 (0.61–2.14) |
| 2 | C | 1.06 (0.53–2.12) | 3.56 (1.82–6.95) |
| 2 | D | 0.20 (0.02–1.78) | 1.80 (0.73–4.41) |
| 2 | E | 1.34 (0.41–4.30) | 0.89 (0.28–2.82) |
| 3 | A | 1.00 (reference) | 1.00 (reference) |
| 3 | B | 0.53 (0.24–1.17) | 0.62 (0.38–1.00) |
| 3 | C | 1.14 (0.52–2.49) | 0.80 (0.38–1.67) |
| 3 | D | 0.47 (0.09–2.44) | 0.55 (0.20–1.56) |

NOTE: Median European ancestry among Latino controls is 54%. Haplotypes correspond to haplotypes shown in Supplementary Fig. S3 and Supplementary Table S2. Rare haplotypes having frequencies <5% are not included in the regression models. ORs and 95% CIs are derived from logistic regression models, adjusted for age, sex, and number of pack-years.

Among Latino light smokers, haplotype 2C was again strongly associated (P < 0.007) with an increased risk of lung cancer (OR, 7.80; 95% CI, 2.74–22.15; Table 4). A significantly decreased risk of lung cancer was observed among nonsmokers having haplotype 3B (OR, 0.34; 95% CI, 0.17–0.71; Table 4). An interaction was observed between smoking exposure and haplotypes in block 2 (P = 0.04; Table 4); however, after adjusting for multiple comparisons (P = 0.04 × 7 = 0.28), this result was no longer statistically significant (P > 0.1).

Table 4.

Association between CYP1A1/2 haplotypes and lung cancer, stratified by smoking status, among Latinos participating in the San Francisco Bay Area Lung Cancer Study, 1998–2003

| Block | Haplotype | Nonsmokers (n = 190), OR (95% CI) | Light smokers* (n = 139), OR (95% CI) | Heavy smokers* (n = 82), OR (95% CI) | P† |
|---|---|---|---|---|---|
| 1 | A | 1.00 (reference) | 1.00 (reference) | 1.00 (reference) | |
| 1 | B | 0.98 (0.48–2.02) | 2.40 (1.07–5.38) | 0.58 (0.19–1.73) | |
| 1 | C | 0.96 (0.40–2.32) | 3.73 (1.41–9.90) | 1.38 (0.51–3.69) | |
| 1 | D | 0.45 (0.15–1.32) | 1.87 (0.69–5.10) | 0.69 (0.21–2.25) | |
| 1 | Interaction P | | | | 0.17 |
| 2 | A | 1.00 (reference) | 1.00 (reference) | 1.00 (reference) | |
| 2 | B | 0.42 (0.18–0.98) | 1.63 (0.59–4.50) | 1.72 (0.62–4.77) | |
| 2 | C | 1.39 (0.69–2.81) | 7.80 (2.74–22.15) | 1.72 (0.71–4.17) | |
| 2 | D | 0.63 (0.18–2.19) | 0.75 (0.15–3.81) | 1.47 (0.35–6.19) | |
| 2 | E | 0.87 (0.25–3.05) | 0.56 (0.10–3.01) | 1.45 (0.25–8.29) | |
| 2 | Interaction P | | | | 0.04 |
| 3 | A | 1.00 (reference) | 1.00 (reference) | 1.00 (reference) | |
| 3 | B | 0.34 (0.17–0.71) | 0.73 (0.36–1.48) | 0.70 (0.30–1.63) | |
| 3 | C | 0.95 (0.41–2.19) | 0.77 (0.27–2.19) | 1.29 (0.44–3.77) | |
| 3 | D | 0.48 (0.12–1.93) | 0.72 (0.13–4.06) | 0.23 (0.05–1.13) | |
| 3 | Interaction P | | | | 0.56 |

NOTE: ORs are derived from logistic regression models, adjusted for age, sex, and admixture (Amerindian and European). Haplotypes correspond to haplotypes shown in Supplementary Fig. S3 and Supplementary Table S2. Rare haplotypes having frequencies <5% are not included in the regression model.

* Light smokers are subjects with <30 pack-year history and heavy smokers are subjects with ≥30 pack-year history.

† P values for interaction between smoking level and haplotypes are derived from likelihood ratio tests. P values corrected for multiple comparisons are block 1: 0.17 × 7 = 1.19, capped at 1.00; block 2: 0.04 × 7 = 0.28; and block 3: 0.56 × 7 = 3.92, capped at 1.00.
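The multiple-comparison arithmetic used for the interaction P values is a Bonferroni correction over seven haplotype blocks, with adjusted values capped at 1. A minimal sketch follows; the names are illustrative, and reading the seven blocks as the three Latino plus four African-American blocks is an inference from the block counts reported in the text.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value, capped at 1.0."""
    return min(1.0, p_value * n_tests)


N_BLOCKS = 7  # presumably 3 Latino + 4 African-American haplotype blocks
alpha_main = 0.05 / N_BLOCKS  # per-test threshold for main effects

# Interaction P values for the three Latino blocks (Table 4).
interaction_p = {"block 1": 0.17, "block 2": 0.04, "block 3": 0.56}
adjusted = {block: round(bonferroni(p, N_BLOCKS), 2)
            for block, p in interaction_p.items()}
```

Comparing an adjusted P value with α = 0.1 is equivalent to comparing the raw P value with 0.1/7.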

Haplotype associations among African-Americans. Associations between haplotypes and lung cancer in African-Americans with and without adjustment for admixture are presented in Table 5. None of the haplotypes was associated with lung cancer and only one of the admixture-adjusted estimates differed from the crude estimates by >10%. Further analyses did not include individual admixture as a covariate because strong evidence of confounding by population stratification was not apparent.

Table 5.

Assessment of confounding by population stratification for the association between CYP1A1/2 haplotypes and lung cancer among African-Americans participating in the San Francisco Bay Area Lung Cancer Study, 1998–2003

BlockHaplotypeOR (95% CI)*OR (95% CI)CRR
1.00 (reference) 1.00 (reference)  
 1.10 (0.79-1.52) 1.11 (0.80-1.55) 0.98 
 0.97 (0.62-1.52) 1.06 (0.67-1.70) 0.91 
 1.10 (0.67-1.80) 1.09 (0.67-1.79) 1.01 
 0.87 (0.52-1.43) 0.83 (0.50-1.38) 1.04 
 0.72 (0.41-1.28) 0.74 (0.41-1.31) 0.98 
 1.36 (0.74-2.52) 1.57 (0.83-2.98) 0.87§ 
  0.70 0.53  
1.00 (reference) 1.00 (reference)  
 1.02 (0.72-1.45) 0.99 (0.69-1.40) 1.03 
 1.01 (0.72-1.42) 1.04 (0.73-1.48) 0.97 
 1.40 (0.85-2.30) 1.38 (0.84-2.26) 1.02 
 0.83 (0.48-1.44) 0.83 (0.47-1.44) 1.01 
  0.65 0.66  
1.00 (reference) 1.00 (reference)  
 1.06 (0.81-1.39) 1.07 (0.81-1.41) 0.99 
 1.21 (0.72-2.02) 1.20 (0.72-2.01) 1.00 
  0.75 0.74  
1.00 (reference) 1.00 (reference)  
 0.90 (0.66-1.24) 0.87 (0.63-1.20) 1.04 
 1.14 (0.82-1.59) 1.18 (0.84-1.65) 0.97 
 0.70 (0.39-1.26) 0.70 (0.39-1.25) 1.01 
  0.34 0.23  
BlockHaplotypeOR (95% CI)*OR (95% CI)CRR
1.00 (reference) 1.00 (reference)  
 1.10 (0.79-1.52) 1.11 (0.80-1.55) 0.98 
 0.97 (0.62-1.52) 1.06 (0.67-1.70) 0.91 
 1.10 (0.67-1.80) 1.09 (0.67-1.79) 1.01 
 0.87 (0.52-1.43) 0.83 (0.50-1.38) 1.04 
 0.72 (0.41-1.28) 0.74 (0.41-1.31) 0.98 
 1.36 (0.74-2.52) 1.57 (0.83-2.98) 0.87§ 
  0.70 0.53  
1.00 (reference) 1.00 (reference)  
 1.02 (0.72-1.45) 0.99 (0.69-1.40) 1.03 
 1.01 (0.72-1.42) 1.04 (0.73-1.48) 0.97 
 1.40 (0.85-2.30) 1.38 (0.84-2.26) 1.02 
 0.83 (0.48-1.44) 0.83 (0.47-1.44) 1.01 
  0.65 0.66  
1.00 (reference) 1.00 (reference)  
 1.06 (0.81-1.39) 1.07 (0.81-1.41) 0.99 
 1.21 (0.72-2.02) 1.20 (0.72-2.01) 1.00 
  0.75 0.74  
1.00 (reference) 1.00 (reference)  
 0.90 (0.66-1.24) 0.87 (0.63-1.20) 1.04 
 1.14 (0.82-1.59) 1.18 (0.84-1.65) 0.97 
 0.70 (0.39-1.26) 0.70 (0.39-1.25) 1.01 
  0.34 0.23  

NOTE: Haplotypes correspond to haplotypes shown in Supplementary Fig. S4 and Supplementary Table S3. Rare haplotypes having frequencies <5% are not included in the regression models. Global tests for haplotype association are indicated in italics (comparing model with and without haplotypes). ORs and 95% CIs are derived using 530 subjects (3 individuals are missing number of pack-years and 2 individuals were unable to have their haplotypes estimated).

* Adjusted for age, sex, and number of pack-years.

† Adjusted for age, sex, number of pack-years, and admixture (Amerindian and European ancestry).

‡ CRR = pooled OR/adjusted OR (38).

§ Indicates CRR had a change of ≥10%.
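
As an illustration of the CRR footnote, the following minimal Python sketch recomputes the ratio from the rounded ORs printed in the table. The function names are hypothetical, and reading "a change of ≥10%" as a deviation of the CRR from 1.00 of at least 0.10 is an assumption made here for illustration, not a detail stated by the authors. Because the published CRRs were presumably computed from unrounded ORs, values recomputed from the table can differ in the second decimal place.

```python
def confounding_risk_ratio(or_unadjusted: float, or_adjusted: float) -> float:
    """CRR = OR without admixture adjustment / OR with admixture adjustment."""
    return or_unadjusted / or_adjusted


def flags_confounding(crr: float, threshold: float = 0.10) -> bool:
    """Assumed rule: flag when the CRR departs from 1.00 by at least `threshold`."""
    return abs(crr - 1.0) >= threshold


# Rounded ORs taken from two rows of the table above.
flagged_row = confounding_risk_ratio(1.36, 1.57)    # row marked with the section sign
unflagged_row = confounding_risk_ratio(1.10, 1.09)  # row without a mark

print(round(flagged_row, 2), flags_confounding(flagged_row))
print(round(unflagged_row, 2), flags_confounding(unflagged_row))
```

Under this reading, the only row in the visible portion of the table that crosses the threshold is the one the authors marked (1.36 versus 1.57).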

Associations remained null when stratifying by the median European ancestry (17%) among the African-American controls (data not shown). Evaluation of interaction between smoking exposure and haplotypes in African-Americans revealed no significant interactions between CYP1A1/2 haplotypes and smoking, and no consistent relationships were observed among any of the haplotype blocks (data not shown).

Single SNP associations. Among Latinos, two SNPs (rs2472299 and rs762551) in haplotype 2C were significantly (P < 0.007) associated with an increased risk of lung cancer (Table 6), analogous to the positive association identified with haplotype 2C. The remaining associations with lung cancer for SNPs present in this haplotype were not significant. Only one SNP (rs11072508) in haplotype 3B was associated with a decreased risk of lung cancer (P = 0.007; Supplementary Table S4), corresponding to the decreased haplotype association observed with haplotype 3B. No single SNP associations were conducted for African-Americans because haplotype associations were null.

Table 6.

Single SNP analysis for SNPs present in haplotype 2C among Latinos participating in the San Francisco Bay Area Lung Cancer Study, 1998–2003

SNP         Genotype  Controls, n (%)  Cases, n (%)  OR (95% CI)
rs2472297   GG        251 (84.0)       97 (85.8)     1.00 (reference)
            AG        47 (15.7)        15 (13.3)     0.72 (0.38-1.36)
            AA        1 (0.3)          1 (0.9)       1.96 (0.11-36.52)
            Trend                                    0.79 (0.43-1.43)
rs16972208  GG        133 (44.5)       66 (58.4)     1.00 (reference)
            AG        134 (44.8)       36 (31.9)     0.63 (0.39-1.03)
            AA        32 (10.7)        11 (9.7)      0.94 (0.43-2.08)
            Trend                                    0.82 (0.57-1.18)
rs2472299   GG        173 (57.9)       47 (42.0)     1.00 (reference)
            AG        112 (37.5)       50 (44.6)     1.66 (1.04-2.67)
            AA        14 (4.7)         15 (13.4)     4.20 (1.87-9.44)
            Trend                                    1.89 (1.33-2.67)
rs17861140  GG        133 (44.5)       66 (58.4)     1.00 (reference)
            AG        132 (44.2)       36 (31.9)     0.64 (0.39-1.05)
            AA        34 (11.4)        11 (9.7)      0.86 (0.40-1.88)
            Trend                                    0.81 (0.57-1.16)
rs762551    AA        171 (57.2)       46 (40.7)     1.00 (reference)
            AC        114 (38.1)       51 (45.1)     1.67 (1.04-2.68)
            CC        14 (4.7)         16 (14.2)     4.45 (2.00-9.90)
            Trend                                    1.93 (1.36-2.73)
rs2472304   GG        118 (39.7)       54 (48.2)     1.00 (reference)
            AG        143 (48.2)       43 (38.4)     0.53 (0.32-0.87)
            AA        36 (12.1)        15 (13.4)     0.57 (0.27-1.19)
            Trend                                    0.68 (0.47-0.96)
rs2470890   GG        118 (39.5)       55 (48.7)     1.00 (reference)
            AG        145 (48.5)       43 (38.1)     0.51 (0.31-0.84)
            AA        36 (12.0)        15 (13.3)     0.55 (0.26-1.17)
            Trend                                    0.66 (0.46-0.94)
rs11854147  AA        100 (33.4)       47 (41.6)     1.00 (reference)
            AG        145 (48.5)       48 (42.5)     0.59 (0.36-0.97)
            GG        54 (18.1)        18 (15.9)     0.49 (0.25-0.97)
            Trend                                    0.68 (0.48-0.94)

NOTE: ORs are derived from logistic regression models, adjusted for age, sex, and admixture (Amerindian and European).
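
For orientation, the genotype counts in Table 6 can be turned into crude (unadjusted) odds ratios. The Python sketch below uses the standard cross-product OR with a Woolf (log-based) 95% CI; the function name is hypothetical. The result is not expected to match the table exactly, because the published ORs come from logistic regression models adjusted for age, sex, and admixture.

```python
import math


def crude_odds_ratio(cases_variant, controls_variant, cases_ref, controls_ref, z=1.96):
    """Cross-product OR for a variant genotype versus the reference genotype,
    with a Woolf (log-scale) confidence interval. No covariate adjustment."""
    odds_ratio = (cases_variant * controls_ref) / (cases_ref * controls_variant)
    se_log_or = math.sqrt(
        1 / cases_variant + 1 / controls_variant + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# rs762551 counts from Table 6: CC (16 cases, 14 controls) vs. AA (46 cases, 171 controls).
or_cc, lo_cc, hi_cc = crude_odds_ratio(16, 14, 46, 171)
print(f"crude OR = {or_cc:.2f} (95% CI, {lo_cc:.2f}-{hi_cc:.2f})")
# The crude OR is about 4.25, compared with the adjusted 4.45 reported in Table 6.
```

The crude estimate lands close to the adjusted one, which is consistent with the table note that these models already account for age, sex, and admixture.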

To examine whether the frequently posited CYP1A1/2 region is associated with lung cancer incidence, we analyzed haplotypes in this chromosome 15q region in a genetic association study of lung cancer. The haplotype block structure inferred from extensive genotyping data in the CYP1A1/2 region differed between African-Americans and Latinos, with African-Americans showing greater haplotype diversity and smaller blocks, as expected based on known population origins.

Evidence suggests a positive association between haplotype 2C and lung cancer in Latinos. Haplotype 2C was consistently associated with lung cancer in these analyses, suggesting that a genetic variant in haplotype 2C confers an increased risk of lung cancer in Latinos. The positive association between haplotype 2C and lung cancer in Latinos was robust: it remained after adjusting for admixture and became stronger among participants with European ancestry ≥54%. Although the ORs for haplotype 2C stratified by European ancestry did not differ significantly, the observed association suggests that Latinos with high European ancestry may carry a susceptibility variant located in haplotype 2C. Exploratory analyses suggest an increased risk of lung cancer for this haplotype among light smokers, consistent with reports in the literature suggesting greater susceptibility to lung cancer for certain gene variants at lower carcinogen levels (32).

Within haplotype 2C are two SNPs (rs762551 and rs2472299) with variant alleles present only in this common haplotype of block 2. Single SNP analyses for SNPs contained within haplotype 2C confirmed that these were the only two SNPs associated with an increased risk of lung cancer. SNP rs762551, located in intron 1, was found to have different frequencies between high and low metabolic phenotypes for CYP1A2, although the difference was not statistically significant, perhaps because of small sample sizes (33). Jiang and colleagues (33) note that intron 1 likely contains the regulatory region of CYP1A2 because it is highly conserved between human, rat, and mouse genes. A study by Aklillu and colleagues (34) conducted in Ethiopians found that this same variant (rs762551), alone and in a haplotype, did not influence CYP1A2 activity, supporting the lack of an association observed in the African-Americans in this study. Pavanello and colleagues (35) identified not only increased CYP1A2 metabolic activity but also increased urine mutagenicity among Italian heavy smokers having the ancestral A allele of this variant. Sachse and colleagues (36) describe increased CYP1A2 activity for the A/A genotype of this variant in Caucasians. Another report showed that Swedish subjects with the A/A genotype had increased metabolic activity, which was not observed in Koreans (7). In contrast, the variant C allele and the C/C genotype were associated with an increased risk of lung cancer in our Latino study population. The reason for the associations with opposite alleles in our study and prior studies is unknown but may be due to differences in linkage disequilibrium with other genetic variants (37). The significant association with rs2472299, which is located upstream of both CYP1A1 and CYP1A2, supports the involvement of this regulatory region in lung cancer (1).

The combined results of this study and previously published results suggest that haplotype 2C may contain a variant, possibly rs2472299 or rs762551, in the CYP1A2 locus with a functional role in genetic susceptibility to lung cancer for certain racial/ethnic groups. It is possible that the observed association of an increased lung cancer risk with haplotype 2C results from linkage disequilibrium with M1 and M2 in CYP1A1 rather than from other variants captured by this haplotype; however, it is difficult to separate the individual effects of these linked polymorphisms from those of the variants present on haplotype 2C.

To our knowledge, only two CYP1A1 studies of lung cancer have been conducted in Latinos (15, 16), one of which includes some subjects in this analysis (15). Although this study did not directly assess the M1 and M2 variants, linkage disequilibrium between these variants and SNPs in block 1 allows associations between M1 and M2 and lung cancer to be inferred. Results of this analysis do not corroborate prior findings with these variants. The reasons for this inconsistency are unknown but may include small sample sizes or different linkage disequilibrium patterns in the Latino populations studied.

The significance of the negative associations observed in Latinos with haplotype B in block 3 and with the single SNP rs11072508, located within haplotype 3B, is unclear. SNP rs11072508 maps to the CYP1A2 gene and is the only SNP in haplotype 3B that displays a variant allele compared with the other haplotypes. This variant allele may be associated with low CYP1A2 enzyme activity, reducing the risk of lung cancer, or could be a marker locus in linkage disequilibrium with another variant associated with a reduced risk of lung cancer. It is also possible that this association is a result of random variation. However, this haplotype was associated with a reduced risk of lung cancer in several analyses in this study, becoming significant after accounting for confounding by admixture and among nonsmokers, suggesting that it may not be due to type I error.

CYP1A1 and CYP1A2 haplotypes do not seem to be associated with lung cancer in African-Americans, although additional studies are necessary for confirmation. These findings are compatible with other published reports (12–14, 16) and with previous studies in this population (15). Among African-Americans, little evidence for confounding by population stratification was apparent. Our results suggest that the association between CYP1A1/2 and lung cancer differs by racial/ethnic group. The cancer-promoting effects of tobacco smoke may be mediated by several metabolic pathways, and it is possible that other genes or enzymes may play an important role in lung cancer in African-Americans.

Population stratification was present among Latino participants. After adjusting for admixture proportions, associations became less strong, revealing positive confounding by population substructure. Importantly, the significance of associations remained after adjusting for admixture. Our results suggest that population stratification may confound genetic association studies of lung cancer in Latinos.

A potential limitation of this study is its focus on only two CYP genes, because the cancer-promoting effects of tobacco smoke may be mediated by several metabolic pathways. CYP1A1/2 have broad substrate specificity, and other enzymes may also metabolize cigarette smoke carcinogens. Consideration of CYP1A1/2 in concert with other metabolizing loci will allow evaluation of possible epistasis in lung cancer susceptibility. Another limitation is that the M1 and M2 variants were not included on the Illumina CYP panel. However, these SNPs were genotyped using PCR technology and subsequently found to be in linkage disequilibrium with SNPs in the CYP1A1/2 haplotypes, allowing inferences to be made about M1 and M2, although they were not directly assessed in the haplotypes. An important limitation of this study is its modest sample size, which likely limits the statistical power for detection of weak associations with lung cancer. Moreover, the gene-environment interactions and subgroup analyses should be considered exploratory because of the limited statistical power. Replication of the results is warranted in studies with larger sample sizes of African-Americans and Latinos to confirm the role of these two candidate genes.

A strength of this study is its haplotype-based approach to identifying a genetic variant associated with lung cancer. To our knowledge, no other lung cancer studies in African-Americans or Latinos have done extensive genotyping allowing examination of haplotype associations in this CYP1A1/2 region. Haplotypes capture most of the genetic variation across a large chromosomal region, allowing for reduced genotyping efforts and an efficient statistical approach that provides more information than single-marker analyses. Although there are genetic variants in the CYP1A1/2 region that were not genotyped, it is likely that many of the untested markers are in linkage disequilibrium with the inferred haplotypes. The presence of linkage disequilibrium allows interrogation of disease variants as long as a genotyped marker variant is in linkage disequilibrium with the disease susceptibility variant. Inclusion of the ancestry informative genetic marker panel provided requisite information about whether population heterogeneity is causing spurious associations and confounding this genetic association study of lung cancer. This analysis benefited from newly available statistical methods and genetic markers and was able to address potential confounding by population stratification. Including ancestry markers adds confidence to these results, which are based on relatively small sample sizes.
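
To make the linkage disequilibrium argument above concrete, the following Python sketch computes the usual pairwise measures D, D′, and r² for two biallelic loci. The function name and the frequencies are hypothetical illustration values, not estimates from this study. The example shows why a genotyped marker tags an untyped variant well only when r² is high: two loci can have D′ = 1 and still have a low r² when their allele frequencies differ.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float):
    """Pairwise LD between allele A at locus 1 and allele B at locus 2.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: allele frequencies of A and B
    Returns (D, D', r^2).
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Hypothetical frequencies: allele B (10%) always occurs on haplotypes carrying A (30%).
d, d_prime, r_squared = ld_measures(p_ab=0.10, p_a=0.30, p_b=0.10)
print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```

In this hypothetical case D′ is 1 because B never occurs without A, but r² is only about 0.26, so genotypes at the first locus would be a weak proxy for the second.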

In summary, our results show consistent evidence that variants in the CYP1A2 region may increase lung cancer risk among Latinos. Future studies should confirm this result in Latino populations and consider examining whether this association may be present in European populations. It is unknown whether variants comprising this haplotype or linked to the polymorphisms in this haplotype alter CYP1A2 enzyme activity. Demonstration of a phenotype for variants in this haplotype, such as examination of ethnic variation in blood levels of nicotine and cotinine and fine mapping of this region in an ancestral population such as Europeans, would lend support for a susceptibility locus for lung cancer. Moreover, admixture was found to have an important effect on the relationship between CYP1A1/2 and lung cancer in Latinos; thus, future studies of CYP1A1/2 in lung cancer should consider potential bias by population stratification before making inferences about results.

No potential conflicts of interest were disclosed.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

Grant support: National Institute of Environmental Health Sciences grants R01ES06717 (J.K. Wiencke) and 2R01ES09137-06 (P.A. Buffler), National Institute of Arthritis and Musculoskeletal and Skin Diseases grant R01AR050267 (M.F. Seldin), and National Institute of Diabetes and Digestive and Kidney Diseases grant R01K071185 (M.F. Seldin).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

We thank the Northern California Cancer Center and Summit Medical Center for their assistance with case ascertainment, Drs. Rick Kittles and Gabriel Silva for kindly providing ancestral DNA from West Africans and Mayans, and Dr. John Belmont for his support with collection of the Mayan population samples.

1. Corchero J, Pimprale S, Kimura S, Gonzalez FJ. Organization of the CYP1A cluster on human chromosome 15: implications for gene regulation. Pharmacogenetics 2001;11:1–6.
2. Nebert DW, Dalton TP, Okey AB, Gonzalez FJ. Role of aryl hydrocarbon receptor-mediated induction of the CYP1 enzymes in environmental toxicity and cancer. J Biol Chem 2004;279:23847–50.
3. Nebert DW, Dalton TP. The role of cytochrome P450 enzymes in endogenous signalling pathways and environmental carcinogenesis. Nat Rev Cancer 2006;6:947–60.
4. Hecht SS. Cigarette smoking: cancer risks, carcinogens, and mechanisms. Langenbecks Arch Surg 2006;391:603–13.
5. Bernauer U, Heinrich-Hirsch B, Tonnies M, Peter-Matthias W, Gundert-Remy U. Characterisation of the xenobiotic-metabolizing cytochrome P450 expression pattern in human lung tissue by immunochemical and activity determination. Toxicol Lett 2006;164:278–88.
6. Wei C, Caccavale RJ, Kehoe JJ, Thomas PE, Iba MM. CYP1A2 is expressed along with CYP1A1 in the human lung. Cancer Lett 2001;171:113–20.
7. Ghotbi R, Christensen M, Roh HK, Ingelman-Sundberg M, Aklillu E, Bertilsson L. Comparisons of CYP1A2 genetic polymorphisms, enzyme activity and the genotype-phenotype relationship in Swedes and Koreans. Eur J Clin Pharmacol 2007;63:537–46.
8. McLemore TL, Adelberg S, Liu MC, et al. Expression of CYP1A1 gene in patients with lung cancer: evidence for cigarette smoke-induced gene expression in normal lung tissue and for altered gene regulation in primary pulmonary carcinomas. J Natl Cancer Inst 1990;82:1333–9.
9. Wooding SP, Watkins WS, Bamshad MJ, Dunn DM, Weiss RB, Jorde LB. DNA sequence variation in a 3.7-kb noncoding sequence 5′ of the CYP1A2 gene: implications for human population history and natural selection. Am J Hum Genet 2002;71:528–42.
10. Jiang Z, Dalton TP, Jin L, et al. Toward the evaluation of function in genetic variability: characterizing human SNP frequencies and establishing BAC-transgenic mice carrying the human CYP1A1_CYP1A2 locus. Hum Mutat 2005;25:196–206.
11. Cosma G, Crofts F, Currie D, Wirgin I, Toniolo P, Garte SJ. Racial differences in restriction fragment length polymorphisms and messenger RNA inducibility of the human CYP1A1 gene. Cancer Epidemiol Biomarkers Prev 1993;2:53–7.
12. Shields PG, Caporaso NE, Falk RT, et al. Lung cancer, race, and a CYP1A1 genetic polymorphism. Cancer Epidemiol Biomarkers Prev 1993;2:481–5.
13. Cote ML, Wenzlaff AS, Bock CH, et al. Combinations of cytochrome P-450 genotypes and risk of early-onset lung cancer in Caucasians and African Americans: a population-based study. Lung Cancer 2007;55:255–62.
14. Wenzlaff AS, Cote ML, Bock CH, et al. CYP1A1 and CYP1B1 polymorphisms and risk of lung cancer among never smokers: a population-based study. Carcinogenesis 2005;26:2207–12.
15. Wrensch M, Miike R, Sison J, et al. CYP1A1 variants and smoking-related lung cancer in San Francisco Bay area Latinos and African Americans. Int J Cancer 2005;113:141–7.
16. Ishibe N, Wiencke JK, Zuo ZF, McMillan A, Spitz M, Kelsey KT. Susceptibility to lung cancer in light smokers associated with CYP1A1 polymorphisms in Mexican- and African-Americans. Cancer Epidemiol Biomarkers Prev 1997;6:1075–80.
17. Thomas DC, Witte JS. Point: population stratification: a problem for case-control studies of candidate-gene associations? Cancer Epidemiol Biomarkers Prev 2002;11:505–12.
18. Wacholder S, Rothman N, Caporaso N. Counterpoint: bias from population stratification is not a major threat to the validity of conclusions from epidemiological studies of common polymorphisms and cancer. Cancer Epidemiol Biomarkers Prev 2002;11:513–20.
19. Hansen HM, Wiemels JL, Wrensch M, Wiencke JK. DNA quantification of whole genome amplified samples for genotyping on a multiplexed bead array platform. Cancer Epidemiol Biomarkers Prev 2007;16:1686–90.
20. The International HapMap Consortium. The International HapMap Project. Nature 2003;426:789–96.
21. Packer BR, Yeager M, Staats B, et al. SNP500Cancer: a public resource for sequence validation and assay development for genetic variation in candidate genes. Nucleic Acids Res 2004;32:D528–32.
22. Tian C, Hinds DA, Shigeta R, Kittles R, Ballinger DG, Seldin MF. A genomewide single-nucleotide-polymorphism panel with high ancestry information for African American admixture mapping. Am J Hum Genet 2006;79:640–9.
23. Tian C, Hinds DA, Shigeta R, et al. A genomewide single-nucleotide-polymorphism panel for Mexican American admixture mapping. Am J Hum Genet 2007;80:1014–23.
24. Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution 1984;38:1358–70.
25. Gabriel SB, Schaffner SF, Nguyen H, et al. The structure of haplotype blocks in the human genome. Science 2002;296:2225–9.
26. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005;21:263–5.
27. Kraft P, Cox DG, Paynter RA, Hunter D, De Vivo I. Accounting for haplotype uncertainty in matched association studies: a comparison of simple and flexible techniques. Genet Epidemiol 2005;28:261–72.
28. Chakraborty R, Weiss KM. Frequencies of complex diseases in hybrid populations. Am J Phys Anthropol 1986;70:489–503.
29. Chakraborty R. Gene admixture in human populations: models and predictions. Yearb Phys Anthropol 1986;29:1–43.
30. Hanis CL, Chakraborty R, Ferrell RE, Schull WJ. Individual admixture estimates: disease associations and individual risk of diabetes and gallbladder disease among Mexican-Americans in Starr County, Texas. Am J Phys Anthropol 1986;70:433–41.
31. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B 1995;57:289–300.
32. Vineis P. Molecular epidemiology: low-dose carcinogens and genetic susceptibility. Int J Cancer 1997;71:1–3.
33. Jiang Z, Dragin N, Jorge-Nebert LF, et al. Search for an association between the human CYP1A2 genotype and CYP1A2 metabolic phenotype. Pharmacogenet Genomics 2006;16:359–67.
34. Aklillu E, Carrillo JA, Makonnen E, et al. Genetic polymorphism of CYP1A2 in Ethiopians affecting induction and expression: characterization of novel haplotypes with single-nucleotide polymorphisms in intron 1. Mol Pharmacol 2003;64:659–69.
35. Pavanello S, Pulliero A, Lupi S, Gregorio P, Clonfero E. Influence of the genetic polymorphism in the 5′-noncoding region of the CYP1A2 gene on CYP1A2 phenotype and urinary mutagenicity in smokers. Mutat Res 2005;587:59–66.
36. Sachse C, Brockmoller J, Bauer S, Roots I. Functional significance of a C→A polymorphism in intron 1 of the cytochrome P450 CYP1A2 gene tested with caffeine. Br J Clin Pharmacol 1999;47:445–9.
37. Lin PI, Vance JM, Pericak-Vance MA, Martin ER. No gene is an island: the flip-flop phenomenon. Am J Hum Genet 2007;80:531–8.
38. Wacholder S, Rothman N, Caporaso N. Population stratification in epidemiologic studies of common genetic variants and cancer: quantification of bias. J Natl Cancer Inst 2000;92:1151–8.

Supplementary data