Abstract
To date, most genetic association analyses of smoking behaviors have been conducted in populations of European ancestry and many of these studies focused on the phenotype that measures smoking quantity, that is, cigarettes per day. Additional association studies in diverse populations with different linkage disequilibrium patterns and an alternate phenotype, such as total tobacco exposure which accounts for intermittent periods of smoking cessation within a larger smoking period as measured in large cardiovascular risk studies, can aid the search for variants relevant to smoking behavior. For these reasons, we undertook an association analysis by using a genotyping array that includes 2,100 genes to analyze smoking persistence in unrelated African American participants from the Atherosclerosis Risk in Communities study. A locus located approximately 4 kb downstream from the 3′-UTR of the brain-derived neurotrophic factor (BDNF) significantly influenced smoking persistence. In addition, independent variants rs12915366 and rs12914385 in the cluster of genes encoding nicotinic acetylcholine receptor subunits (CHRNA5–CHRNA3–CHRNB4) on 15q25.1 were also associated with the phenotype in this sample of African American subjects. To our knowledge, this is the first study to more extensively evaluate the genome in the African American population, as a limited number of previous studies of smoking behavior in this population included evaluations of only single genomic regions. Cancer Prev Res; 4(5); 729–34. ©2011 AACR.
Introduction
Cigarette smoking continues to be the leading cause of preventable death in the United States because of its causative link to cancer, cardiovascular, and respiratory disease. The burden of lung cancer is greater in African Americans than in Caucasians (1), with average 1975–2007 annual age-adjusted incidence and mortality rates of lung cancer per 100,000 of 81.8 and 63.4, respectively, in African Americans, in comparison with 64.1 and 53.9 for Caucasians (2). Despite the increased risk, published studies evaluating genetic risk of tobacco dependence in African Americans with common single nucleotide polymorphisms (SNP) are limited to evaluations of individual genomic regions.
To this end, we evaluated smoking persistence in African American participants from the Atherosclerosis Risk in Communities (ARIC) study. To advance our understanding of tobacco addiction genetics, we used an innovative phenotype in genetic association analyses of smoking: pack-years, a measure of total nicotine exposure that accounts for intermittent periods of smoking cessation within a longer period of smoking.
Materials and Methods
Study participants
Participants were part of the National Heart, Lung, and Blood Institute's ARIC study. Our total sample consisted of 1,710 African American current or former smokers (44.1% of the total sample) who were 45 to 65 years of age. The mean age of smoking initiation for the participants was 19.5 years, and 50.2% of the total sample were male. Intermittency in smoking (>1 year) was reported by 28% of the subjects.
Phenotype
As tobacco use influences both the risk and progression of heart disease, information regarding individual smoking patterns is extensive in the ARIC study. The phenotype we investigated here is smoking persistence, as measured by pack-years of cigarette exposure. Because individual tobacco use varies over time, single timepoint measures like current cigarettes smoked per day may provide a less accurate estimate of overall smoking behavior. For this reason, we evaluated smoking persistence by using a method that accounts for the period(s) of abstaining from smoking within a longer overall period of smoking. The pack-year variable was calculated according to the following formulae:
Current smokers: PCKYR = AVGCIGDY/20 × [(CURAGE − AGEINIT) − NONSMK]
Former smokers: PCKYR = AVGCIGDY/20 × [(AGEQUIT − AGEINIT) − NONSMK]
where PCKYR is pack-years; AVGCIGDY is lifetime average cigarettes per day; CURAGE is current age; AGEINIT is age of smoking initiation; AGEQUIT is age at which smoking was quit; NONSMK is the intermittent nonsmoking period (i.e., total period of nonsmoking in the overall smoking period).
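The two formulae above differ only in the age that closes the smoking period, so they can be illustrated with a single function. The following is a minimal sketch, not the ARIC study code; the function name `pack_years`, its argument names, and the example values are hypothetical.

```python
def pack_years(avg_cig_per_day, age_init, age_end, nonsmoking_years=0.0):
    """Pack-years following the formulae in the text.

    age_end is CURAGE for current smokers and AGEQUIT for former smokers.
    nonsmoking_years is NONSMK, the total time spent not smoking inside
    the overall smoking period. All names here are illustrative.
    """
    smoking_years = (age_end - age_init) - nonsmoking_years
    if smoking_years < 0:
        raise ValueError("nonsmoking period exceeds the overall smoking period")
    # 20 cigarettes are counted as one pack
    return avg_cig_per_day / 20.0 * smoking_years


# Hypothetical current smoker: 20 cigarettes/day, started at 19, now 55,
# with 6 years of intermittent cessation.
current = pack_years(20, age_init=19, age_end=55, nonsmoking_years=6)

# Hypothetical former smoker: 10 cigarettes/day, started at 20, quit at 45,
# with 5 years of intermittent cessation.
former = pack_years(10, age_init=20, age_end=45, nonsmoking_years=5)

print(current, former)
```

Subtracting the nonsmoking years before multiplying is what distinguishes this measure from a simple duration-times-intensity estimate.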
Genotyping assay
Samples from the ARIC study were genotyped as part of the Candidate Gene Association Resource (CARe) project (3). The content of the genotyping array, ITMAT-Broad-CARe or “IBC chip,” is informed by genome-wide association studies (GWAS), expression quantitative trait loci, pathway-based approaches and comprehensive literature searching. It includes loci relevant to addiction. As an example, it contains densely spaced SNPs from 84 of the 130 genes from the “addiction array” (4) and additional genes that are not on the addiction array, but were found to be associated with addiction phenotypes in later genetic association studies.
The loci on the IBC chip are divided into 3 groups: group 1 (n = 435 loci), genes and regions with a high likelihood of functional significance (tag SNPs selected to capture known variation with minor allele frequency (MAF) > 0.02 and an r2 of at least 0.8 in HapMap populations); group 2 (n = 1,349 loci), candidate loci that are potentially involved in phenotypes of interest or established loci that required very large numbers of tagging SNPs (tag SNPs selected to capture known variation with MAF > 0.05 with an r2 of at least 0.5 in HapMap populations); and group 3 (n = 232 loci), comprised mainly of the larger genes (100 kb) which were of lower interest a priori to the investigators (includes only nonsynonymous SNPs and known functional variants). The average number of SNPs across the group 1 and group 2 loci of IBC was compared with Illumina and Affymetrix genotyping products. The average coverage for group 1 loci is approximately 36.5 SNPs per locus on the IBC chip. The Illumina Human1M and Affymetrix 6.0 platforms, for comparison, have an average of approximately 28.0 and 17.4 SNPs, respectively, across the equivalent IBC loci. The average number of SNPs observed for the group 2 loci is approximately 16.3 SNPs, which is comparable with the current Illumina and Affymetrix products.
Statistical analyses
The pack-year phenotype was Box-Cox transformed (λ = 0.2). Phenotype residuals were constructed with adjustment for age and gender. The standardized residual served as the phenotype in genotype–phenotype association analysis. Generation of residuals was carried out with the R statistical package (The R Foundation for Statistical Computing). Association analysis was done in PLINK (6) by using linear regression under the additive genetic model. To address concerns about population stratification, we conducted principal component analysis as implemented in EIGENSTRAT (7). The first 10 principal components were included as covariates in the genetic association analysis. All results were adjusted for residual inflation by using the genomic control method. The Bonferroni adjustment for multiple comparisons was set at an α-level of 0.05/50,000 = 1 × 10−6.
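As an illustration of two of the steps above, the sketch below applies the standard one-parameter Box-Cox transform with λ = 0.2 and computes the Bonferroni threshold. It is a minimal sketch in Python, whereas the study used R and PLINK; the function name `box_cox` and the example value of 32 pack-years are hypothetical, and the regression on age and gender is not reproduced here.

```python
import math


def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform for a positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


# Lambda reported in the text; 32 pack-years is a made-up example value.
transformed = box_cox(32.0, 0.2)

# Bonferroni threshold as stated in the text: alpha divided by 50,000 tests.
alpha = 0.05
n_tests = 50_000
threshold = alpha / n_tests

print(transformed, threshold)
```

The transform compresses the long right tail typical of pack-year distributions, which is why it precedes the linear regression.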
The imputation, resulting in 270,000 total SNPs, was done with a combined CEU + YRI reference panel including SNPs segregating in both CEU and YRI, as well as SNPs segregating in one panel and monomorphic and nonmissing in the other. For imputation of IBC individuals, the use of the CEU + YRI panel resulted in an allelic concordance rate of approximately 95.6%, calculated as 1 − 1/2 × (imputed_dosage − chip_dosage). This rate is comparable to rates calculated for individuals of African descent imputed with the HapMap 2 YRI individuals (8). In the first step of imputation, individuals with pedigree relatedness or cryptic relatedness (> 0.05) were filtered. Recombination and error rate estimates for the entire sample were calculated on the basis of a subset of random individuals. Next, these rates were used to impute all sample individuals across the entire reference panel. SNPs with poor imputation scores (< 0.6) and minor allele frequency of <0.01 were filtered out.
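The concordance formula above can be illustrated with a short sketch. It assumes that the dosage difference is taken in absolute value and that the reported rate is an average over genotypes; neither detail is stated in the text. The function names and dosage values are hypothetical.

```python
def allelic_concordance(imputed_dosage, chip_dosage):
    """Concordance for one genotype: 1 - 1/2 * |imputed - chip|.

    Dosages are on the 0-2 allele-count scale. The absolute value is an
    assumption; the text gives the difference without it.
    """
    return 1.0 - 0.5 * abs(imputed_dosage - chip_dosage)


def mean_concordance(pairs):
    """Average concordance over (imputed, chip) dosage pairs (assumed averaging)."""
    values = [allelic_concordance(i, c) for i, c in pairs]
    return sum(values) / len(values)


# Made-up dosages: imputed values against genotyped allele counts.
example_pairs = [(1.9, 2), (0.1, 0), (1.0, 1), (0.8, 1)]
print(round(mean_concordance(example_pairs), 3))
```

Under this reading, a perfectly imputed genotype scores 1 and a genotype that is off by one full allele scores 0.5.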
Results
Our analysis did not result in a significant inflation of the χ2 test statistic [genomic control inflation factor (λGC) = 1.00165]. Single nucleotide polymorphisms that were the most strongly associated with smoking persistence in the ARIC study are summarized in Table 1. Variants rs10767658 and rs925946 exhibited the strongest association, where each additional copy of the rs10767658*C minor allele and the rs925946*T minor allele corresponded to an increase in smoking persistence of approximately 3.95 (SE = 0.75; P = 1.6 × 10−7) pack-years (see Fig. 1 for a regional plot of 11p14.1). Linkage disequilibrium (LD) between these 2 variants is high (r2 = 0.997) and the conditional analysis shows a single signal in the region (see Supplementary Fig. 1, conditioning on the most significant SNP in our study).
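For readers unfamiliar with the genomic control inflation factor, the sketch below shows the usual way it is computed: the median of the observed 1-degree-of-freedom χ2 statistics divided by the median of the χ2 distribution with 1 degree of freedom (about 0.4549). This is a generic illustration, not the study's pipeline; the function names and the input statistics are hypothetical, and leaving statistics unchanged when λGC is below 1 is a common convention assumed here.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Lambda_GC: observed median chi-square over the expected median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN


def gc_adjust(chi2, lambda_gc):
    """Deflate one chi-square statistic; no change if lambda_gc <= 1 (assumed convention)."""
    return chi2 / max(lambda_gc, 1.0)


# Made-up test statistics, only to show the call pattern.
observed = [0.02, 0.31, 0.46, 1.10, 3.90]
lam = genomic_inflation(observed)
print(lam)

# With the factor reported in the text, the correction is very small.
print(gc_adjust(27.6, 1.00165))
```

A factor this close to 1 means the adjusted and unadjusted statistics are nearly identical, consistent with the statement that no significant inflation was observed.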
Figure 1. Association between chromosome 11p14.1 variants and smoking persistence. Several common variants, including rs10767658 (purple triangle) and rs925946 (red triangle), cluster on 11p14.1 and show associations with smoking persistence in African Americans.
Table 1. Association of IBC array variants with smoking persistence
SNP | Chr | Gene | L gene | R gene | EA | EA freq | Beta | SE | P |
---|---|---|---|---|---|---|---|---|---|
rs10767658 | 11 | BDNFOS | LIN7C | BDNF | C | 0.2629 | 3.94 | 0.75 | 1.550E-07 |
rs925946 | 11 | BDNFOS | LIN7C | BDNF | G | 0.7369 | −3.95 | 0.75 | 1.639E-07 |
rs1401635 | 11 | BDNF | BDNFOS | KIF18A | C | 0.2449 | 3.56 | 0.70 | 4.707E-07 |
rs11030108 | 11 | BDNF | BDNFOS | KIF18A | A | 0.254 | 3.52 | 0.71 | 7.805E-07 |
rs11030119 | 11 | BDNF | BDNFOS | KIF18A | A | 0.2931 | 3.25 | 0.67 | 1.352E-06 |
rs17309874 | 11 | BDNFOS | LIN7C | BDNF | A | 0.1055 | 4.39 | 1.02 | 1.891E-05 |
rs1013402 | 11 | BDNF | BDNFOS | KIF18A | A | 0.8528 | −3.56 | 0.84 | 2.490E-05 |
rs10835211 | 11 | BDNF | BDNFOS | KIF18A | A | 0.106 | 4.13 | 0.98 | 2.555E-05 |
rs11030107 | 11 | BDNF | BDNFOS | KIF18A | A | 0.8936 | −4.14 | 0.98 | 2.601E-05 |
rs11030102 | 11 | BDNF | BDNFOS | KIF18A | C | 0.8935 | −4.15 | 0.98 | 2.622E-05 |
rs17309930 | 11 | NAa | BDNF | KIF18A | A | 0.0946 | 4.62 | 1.11 | 3.167E-05 |
rs12273363 | 11 | BDNF | BDNF | KIF18A | C | 0.0944 | 4.48 | 1.07 | 3.214E-05 |
rs12288512 | 11 | NAa | BDNF | KIF18A | A | 0.0945 | 4.53 | 1.09 | 3.226E-05 |
rs12915366 | 15 | PSMA4 | AGPHD1 | PSMA4 | A | 0.158 | −3.47 | 1.63 | 3.578E-05 |
rs12910289 | 15 | AGPHD1 | IREB2 | PSMA4 | G | 0.1604 | −3.49 | 1.63 | 3.605E-05 |
rs1504546 | 15 | AGPHD1 | IREB2 | PSMA4 | C | 0.8396 | 3.48 | 1.34 | 3.608E-05 |
rs12906951 | 15 | AGPHD1 | IREB2 | PSMA4 | C | 0.8396 | 3.47 | 1.45 | 3.617E-05 |
rs3813572 | 15 | PSMA41 | AGPHD1 | PSMA4 | C | 0.1606 | −3.40 | 1.64 | 3.617E-05 |
rs11636131 | 15 | AGPHD1 | IREB2 | PSMA4 | C | 0.8395 | 3.50 | 1.64 | 3.625E-05 |
rs11632604 | 15 | AGPHD1 | IREB2 | PSMA4 | C | 0.1604 | −3.50 | 1.75 | 3.635E-05 |
rs12916483 | 15 | PSMA41 | AGPHD1 | PSMA4 | A | 0.1606 | −3.41 | 1.37 | 3.638E-05 |
rs3813571 | 15 | PSMA4 | AGPHD1 | CHRNA5 | G | 0.8393 | 3.39 | 1.30 | 4.143E-05 |
rs12916999 | 15 | PSMA4 | AGPHD1 | CHRNA5 | A | 0.8446 | 2.93 | 0.81 | 4.246E-05 |
rs952216 | 15 | AGPHD1 | IREB2 | PSMA4 | C | 0.8542 | 3.69 | 0.93 | 7.605E-05 |
rs12914385 | 15 | CHRNA3 | CHRNA5 | CHRNB4 | C | 0.7998 | −3.25 | 0.82 | 7.621E-05 |
Abbreviation: EA, effect allele.
aSNP not located in transcript or within 2 kb flanking the transcript of a gene.
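The Beta, SE, and P columns are linked through the test statistic. As a rough illustration, the sketch below computes a two-sided P value from Beta/SE for the top SNP under a normal approximation. This is an assumption made for illustration: the reported values come from PLINK linear regression followed by genomic control, so small differences are expected. The function name is hypothetical.

```python
import math


def wald_two_sided_p(beta, se):
    """Two-sided P value for z = beta / se under a standard normal approximation."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Beta and SE for rs10767658 as listed in the table.
p_top = wald_two_sided_p(3.94, 0.75)
print(f"z = {3.94 / 0.75:.2f}, P = {p_top:.2e}")
```

For this SNP the approximation gives a P value on the order of 10−7, the same magnitude as the tabulated value of 1.550E-07.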
We also observed associations between smoking persistence and 2 distinct loci on chromosome 15 near and in the CHRNA5–CHRNA3–CHRNB4 cluster, although these associations did not pass the Bonferroni correction for multiple comparisons (set at 1 × 10−6). The associated SNPs, representing the 2 distinct loci, rs12915366 in PSMA4 (proteasome subunit-α type 4) and rs12914385 in CHRNA3 (nicotinic receptor-α 3 subunit) have low LD between them (r2 = 0.028; Fig. 2). The minor A allele of rs12915366 was associated with a reduction of 3.47 (SE = 1.63; P = 3.58 × 10−5) pack-years per allele, whereas the major C allele of rs12914385 in CHRNA3 was associated with a decrease of 3.25 (SE = 0.82; P = 7.62 × 10−5) pack-years per allele.
Figure 2. Association between chromosome 15q25.1 variants and smoking persistence. Note low LD between rs12915366 (with the lowest P value in PSMA4) and rs12914385 (with the lowest P value in CHRNA3).
The total number of SNPs analyzed from the chromosome 11 region (27.6–27.75 million bp) was 79 (12 genotyped plus 67 imputed). The total number of SNPs analyzed from the chromosome 15 region (76.5–76.75 million bp) was 147 (12 genotyped plus 135 imputed). Please see Figures 1 and 2 for further information.
Discussion
We were able to advance our understanding of tobacco addiction genetics by using a richly phenotyped sample from a population of African ancestry, which is known to have reduced LD compared with the populations of European ancestry (9) in whom most smoking behavior genetics studies have been conducted. We found that variants located downstream of the brain-derived neurotrophic factor (BDNF) 3′ untranslated region (UTR) were the most strongly associated with smoking persistence. Interestingly, the statistically significant variant in our study, rs925946, has been associated with adult (10) and more recently childhood obesity (11) risk. This finding suggests that studies evaluating natural and drug reward may be modeled by this locus.
Recently it has been found that transcription of the BDNF gene (up to 11 exons and ∼70 kb long; refs. 12, 13) can be initiated from 9 distinct functional promoters in humans. Regardless of which promoter is used, all BDNF transcripts are processed at 2 alternative polyadenylation sites, producing 2 sets of mRNA that carry either the long or the short 3′-UTR, and both of these encode the same BDNF protein arising from the single, last “3′ exon” (12, 14). Variants found in our study are located immediately proximal to the BDNF 3′-UTR region, which has been shown to be central in BDNF transcription.
Our finding maps a locus on chromosome 11p14.1 for smoking persistence and for the first time shows the significance of BDNF when the genome is evaluated beyond a single locus analysis. Investigators in an earlier single candidate gene analysis (15) found that a haplotype block within BDNF (Chr11: 27,650,817–27,688,559) was associated with smoking behavior in European American men, but a haplotype block in a similar region (Chr11: 27,659,764–27,688,559) in African Americans was not (individual SNPs were not associated with smoking behavior in either population). The most significant SNP in our study, rs10767658, is located at Chr11: 27,628,828; therefore, future analyses of BDNF should extend to the region downstream of the BDNF 3′ exon. Beuten and colleagues (15) analyzed a single candidate gene and smoking behavior, but the question of BDNF's significance against a more extensive coverage of the genome remained open. Our analysis shows that the effect of BDNF on smoking persistence is significant when the genome is evaluated more extensively and that the locus significantly modulates the behavior in African Americans.
In addition to BDNF's influence on smoking persistence, in an analysis of an additional smoking behavior phenotype comparing individuals who never smoked versus those who ever smoked, a nonsynonymous BDNF variant, rs6265, genotyped in our study and in low LD (r2 = 0.012) with our top variant, has recently been shown to significantly mediate smoking initiation (16). This SNP was not associated with smoking persistence in our study. (Additional information regarding the analysis of this variant in our dataset can be found in Supplementary Figure 1.)
Since its discovery 3 decades ago, the role of BDNF in the differentiation and survival of neurons in the central nervous system has been firmly established. More recently it has been shown that the neurotrophin is critical for changes in synaptic strength that are important for information storage during memory formation (17). This may be relevant in tobacco addiction as relapse episodes occur following exposure to smoking cues because previously neutral stimuli acquire incentive motivational value when repeatedly paired with nicotine (18, 19). Because BDNF is known to elicit a plethora of functions in the brain via activation of the tropomyosin-related receptor tyrosine kinase B, mechanisms additional to the mnemonic processes important for smoking persistence may include the trophic effects of BDNF on the dopaminergic neurons that are at the center of the reward circuits activated by nicotine (20) and nicotine-induced BDNF expression changes (21).
Because our study was conducted in a cohort of African Americans, a population with lower levels of LD in comparison to populations of European descent in whom most studies of smoking behavior have been conducted, we were able to map the location of variants that are associated with smoking persistence in the CHRNA3–CHRNA5–CHRNB4 gene cluster on 15q25.1. We have identified 2 independent loci in this region. The top SNP rs12915366 is located in PSMA4. In studies conducted in Europeans and African Americans, SNPs in this region (i.e., downstream of 5′ CHRNA5) are thought to be associated with smoking (16, 22, 23) and lung cancer (24–26) because they tag a distal enhancer region (∼13 kb from CHRNA5), which regulates the expression of CHRNA5 (27). Our second locus, rs12914385, a SNP in low LD (r2 = 0.028) with the top SNP rs12915366 in this region, is located in CHRNA3 and has been associated with lung cancer in a 2-stage genome-wide case–control study of participants of self-reported European ancestry (28, 29). A literature search for SNPs near rs12914385 previously associated with either nicotine dependence or lung cancer in African Americans showed that a CHRNA5 SNP, rs16969968, was previously associated with nicotine dependence (23). This SNP is in low LD (r2 = 0.25) with rs12914385, our top SNP in CHRNA3, and, as in the previous study (23), rs16969968 showed a weak association with smoking persistence in our analysis (P = 0.012), possibly due to a low MAF of 0.07 in the African American population. Two CHRNA3 variants, rs578776 (30) and rs1051730 (25, 26), in the same CHRNA3 intron as rs12914385, have been associated with lung cancer risk in African Americans and are in moderate (r2 = 0.58) and low (r2 = 0.006) LD with rs12914385, respectively. Further fine mapping efforts should aid the search for functional variant(s) in this locus.
In conclusion, the results of our work support the idea of using alternate phenotypes and populations to efficiently map regions in the genome that influence smoking behavior. We have shown that a locus immediately downstream of the BDNF 3′-UTR is a mediator of smoking behavior, and our results on chromosome 15q25.1 show that distinct variants modulate smoking persistence in African Americans. We have more extensively evaluated the genome (∼2,100 genes) in the African American population and extended the work of previous studies, which evaluated a limited number of genomic loci in relation to smoking behavior in this United States population.
Disclosure of Potential Conflicts of Interest
Neal Benowitz has served as a consultant to Pfizer and other pharmaceutical companies that develop and/or market smoking cessation medications. He also has served as a paid expert witness in litigation against tobacco companies.
Acknowledgments
The authors thank the staff and participants of the ARIC study for their important contributions. We thank James Wilson, MD for his outstanding logistical support in the completion of this project, Deb Farlow for her important support of the working group and Cameron Palmer for his expert assistance in the analysis of genomic data.
Grant Support
MD Scientist Fellowship in Genetic Medicine (Northwestern Memorial Foundation; Principal Investigator: A. Hamidovic), National Research Service Award F32DA024920 (NIH/NIDA; Principal Investigator: A. Hamidovic), Dr. Bonnie Spring's Professional Account at Northwestern Feinberg School of Medicine, KL2 RR024130-02 (support for E. Jorgenson). CARe wishes to acknowledge the support of the National Heart, Lung and Blood Institute and the contributions of the research institutions, study investigators, field staff, and study participants in creating this resource for biomedical research (NHLBI contract number HHSN268200960009C). ARIC is carried out as a collaborative study supported by National Heart, Lung, and Blood Institute contracts N01-HC-55015, N01-HC-55016, N01-HC-55018, N01-HC-55019, N01-HC-55020, N01-HC-55021, and N01-HC-55022.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.