Abstract
Genetic polymorphisms in genes involved in processes that affect DNA damage may explain part of the large interindividual variation in DNA adduct levels in smokers. We investigated the effect of 19 polymorphisms in 12 genes involved in carcinogen metabolism, DNA repair, and oxidant metabolism on DNA adduct levels (determined by ³²P post-labeling) in lymphocytes of 63 healthy Caucasian smokers. The total number of alleles that were categorized as putatively high-risk alleles seemed associated with bulky DNA adduct levels (P = 0.001). Subsequently, to investigate which polymorphisms may have the highest contribution to DNA adduct levels in these smokers, discriminant analysis was done. In the investigated set of polymorphisms, GSTM1*0 (P < 0.001), mEH*2 (P = 0.001), and GPX1*1 (P < 0.001) in combination with the level of exposure (P < 0.001) were found to be key effectors. DNA adduct levels in subjects with a relatively high number of risk alleles of these three genes were >2-fold higher than in individuals not having these risk alleles. Notably, all three genes are involved in deactivation of reactive carcinogenic metabolites. This study shows that analysis of multiple genetic polymorphisms may predict the interindividual variation in DNA adduct levels upon exposure to cigarette smoke. It is concluded that discriminant analysis presents an important statistical tool for analyzing the effect of multiple genotypes on molecular biomarkers. (Cancer Epidemiol Biomarkers Prev 2006;15(4):624–9)
Introduction
Cigarette smoke is a complex mixture of hazardous chemicals, including many genotoxic carcinogens (1). These compounds are carcinogenic because they bind covalently to DNA to form so-called DNA adducts. It has been shown that the number of cigarettes smoked per day, bulky DNA adduct levels, and (lung) cancer risk are associated (2-7). Nevertheless, individuals with approximately similar exposures may have highly different DNA adduct levels. Part of this may be explained by polymorphisms in genes involved in the process of DNA adduct formation and repair (8).
Because of the complexity of processes involved in formation of DNA adducts, it is very unlikely that one single polymorphism accounts for interindividual differences in DNA adduct levels in smokers (9). Thus, studies investigating single polymorphisms in relation to DNA adduct levels may either overestimate or underestimate the involvement of such polymorphisms. To identify the most relevant genetic polymorphisms and to quantitate possible interactions between them, studies are required that analyze many polymorphisms simultaneously in a single exposed population.
We hypothesize that simultaneous assessment of multiple genotypes yields a better prediction of DNA adduct levels in peripheral lymphocytes of smokers compared with the analysis of single gene polymorphisms. In the present study, 19 polymorphisms in metabolic, DNA repair, and oxidant metabolizing genes were genotyped using a single base extension (SBE)–based method (10). To our knowledge, such a large number of polymorphisms has never been assessed in a single population in relation to the presence of carcinogen-DNA adducts. Here, we use discriminant analysis as a tool to identify the most relevant genetic polymorphisms and to classify subgroups of smokers (in terms of low, medium, and high responders regarding DNA adduct levels).
Materials and Methods
Study Population
Genotyping was done using lymphocytic DNA from 63 healthy smoking Caucasians (29 males and 34 females) with an average ± SD age of 43 ± 9 years. These individuals reported smoking between 5 and 50 cigarettes per day (overall mean ± SD, 26 ± 9 cigarettes per day) for at least 10 years. Because the half-life of lymphocytic DNA adducts is short (11 weeks; ref. 11), the number of cigarettes smoked per day, rather than pack-years, was used as the exposure variable. Informed consent was obtained from all individuals.
DNA Isolation and 32P Post-labeling
Selection of Polymorphisms
All single nucleotide polymorphisms (SNP) included in this study (Table 1) were selected based on (a) their association with cancer development and/or known effects on enzyme activity; (b) their expected influence on DNA adduct levels, based on literature review, as shown in Table 2; and (c) a frequency for occurrence in the population of >5%. DNA sequences and allele frequencies were obtained from the Cancer SNP 500 database (http://snp500cancer.nci.nih.gov).
Gene | Polymorphism | Frequency (%)* | dbSNP ID | PCR primers | Product (bp)† | SBE primer‡ | Length (bp)
---|---|---|---|---|---|---|---
CYP1A2 | *1F (−164A>C) | 6/43/51 | rs762551 | F: 5′-GAGGCTCCTTTCCAGCTCTC-3′; R: 5′-CTCCCAGCTGGATACCAGA-3′ | 106 (1) | 5′-AACTGACTAAACTAGGTGCCACTCAAAGGGTGAGCTCTGTGGGC-3′ | 44
GSTM1 | Deletion | 40/60 | | F: 5′-CTCCTGATTATGACAGAAGCC-3′; R: 5′-CTGGATTGTAGCAGATCATGC-3′ | 648 (1) | 5′-AACTGACTAAACTAGGTGCCACGCTGGAGAACCAGACCATGGACAACC-3′ | 48
GSTP1 | *2 (1404A>G) | 46/39/15 | rs947894 | F: 5′-TGGTGGACATGGTGAATGAC-3′; R: 5′-AGCCCCTTTCTTTGTTCAGC-3′ | 123 (1) | 5′-AACTCTGGAGGACCTCCGCTGCAAATAC-3′ | 28
GSTP1 | *3 (2294C>T) | 89/11/0 | rs1799811 | F: 5′-TGGGAGGGATGAGAGTAGGA-3′; R: 5′-CAGGGTCTCAAAAGGCTTCA-3′ | 106 (1) | 5′-AACTGACTAAACTAGGTGCCACGTCGTGACCATGGTGGTGTCTGGCAGGAGG-3′ | 52
GSTT1 | Deletion | 85/15 | | F: 5′-GTAGCCATCACGGAGCTGAT-3′; R: 5′-GGCAGCATAAGCAGGACTTC-3′ | 97 (1) | 5′-AACTGACTAAACTAGGTGCCACGTCGTGAAAGTCTGGGCAGGTGAACCCACTAGGC-3′ | 56
NAT2 | *5 (341T>C) | 39/49/12 | rs1801280 | F: 5′-CAAATACAGCACTGGCATGG-3′; R: 5′-GGCTGATCCTTCCCAGAAAT-3′ | 13 (1) | 5′-AACTGACTAAACTAGGTGCCACGTCGTGAAAGTCTATTCACCTTCTCCTGCAGGTGACCA-3′ | 60
NAT2 | *6 (590G>A) | 49/42/9 | rs1799930 | F: 5′-CCTGCCAAAGAAGAAACACC-3′; R: 5′-GGGTCTGCAAGGAACAAAAT-3′ | 143 (1) | 5′-CCTACCAAAAAATATACTTATTTACGCTTGAACCTC-3′ | 36
NAT2 | *7 (857G>A) | 92/8/0 | rs1799931 | F: 5′-TCCTTGGGGAGAAATCTCGT-3′; R: 5′-GGGTGATACATACACAAGGGTTT-3′ | 92 (1) | 5′-AACTGACTAAACTAGGTGCCACGTCGTGAAAGTCTGACAGCCCTCGTGCCCAAACCTGGTGATG-3′ | 64
mEH | *2 (C>T) | 48/47/5 | rs1051740 | F: 5′-CTCTCAACTTGGGGTCCTGA-3′; R: 5′-GGCGTTTTGCAAACATACCT-3′ | 231 (3) | 5′-AACTGACTAAACTAGGTGGAAGAAGCAGGTGGAGATTCTCAACAGA-3′ | 46
mEH | *3 (A>G) | 59/38/3 | rs2234922 | F: 5′-CGTGCAGGGTCTTCTCTCTC-3′; R: 5′-GTTCTTGGGGTCAGTCAGGA-3′ | 194 (3) | 5′-AACTGACTAAACTAGGTGCCACGTCGTGAAAGTCCAGCTGCCCGCAGGCC-3′ | 50
XRCC1 | *2 (26304C>T) | 89/11/0 | rs1799782 | F: 5′-TGAAGGAGGAGGATGAGAGC-3′; R: 5′-CTCTACCCTCAGACCCACGA-3′ | 147 (2) | 5′-CGGGGGCTCTCTTCTTCAGC-3′ | 21
XRCC1 | *3 (27466G>A) | 94/6/0 | rs25489 | F: 5′-CCCCAGTGGTGCTAACCTAA-3′; R: 5′-GGGGTTTGCCTGTCACTG-3′ | 116 (2) | 5′-TCTTCTCCAGTGCCAGCTCCAACTC-3′ | 25
XRCC1 | *4 (28152G>A) | 45/40/15 | rs25487 | F: 5′-TAAGGAGTGGGTGCTGGACT-3′; R: 5′-ATTGCCCAGCACAGGATAAG-3′ | 101 (2) | 5′-AACTGACTAAACTAGTTGGCGTGTGAGGCCTTACCTC-3′ | 37
XRCC3 | *1 (18067C>T) | 39/40/21 | rs861539 | F: 5′-GCCTGGTGGTCATCGACTC-3′; R: 5′-ACAGGGCTCTGGAAGGCA-3′ | 136 (2) | 5′-AACTGACTAAACTAGGTGCCACGTCGTGAAAGTCTGACATGCGCACTGCTCAGCTCACGCAGC-3′ | 63
XPD | *5 (35931A>C) | 35/49/15 | rs1052559 | F: 5′-TTCTCTGCAGGAGGATCAGC-3′; R: 5′-CTCAGGAGTCACCAGGAACC-3′ | 146 (2) | 5′-AACTGACTAAACTAGGTGGCTGCTGAGCAATCTGCTCTATCCTCT-3′ | 45
BRCA2 | *1 (−26G>A) | 60/34/6 | rs1799943 | F: 5′-AAATTTTCCAGCGCTTCTGA-3′; R: 5′-AATGTTGGCCTCTCTTTGGA-3′ | 159 (2) | 5′-AACTGACTAAACTAGGTGCCACGTCGAGGTCTTCTGTTTTGCAGACTTATTTACCAA-3′ | 57
BRCA2 | *3 (1342A>C) | 48/44/8 | rs144848 | F: 5′-AGCAAACGCTGATGAATGTG-3′; R: 5′-TTGGAGATTTTGTCACTTCCAC-3′ | 150 (2) | 5′-AACTGACTAAACTAGGTGTAAATGATACTGATCCATTAGATTCAAATGTAGCA-3′ | 53
NQO1 | *2 (609C>T) | 91/9/0 | rs1800566 | F: 5′-TGAACTCAGGAGGTGGAGGT-3′; R: 5′-CTGGTTTGAGCGAGTGTTCA-3′ | 240 (3) | 5′-AAGCATTCAGAACCATCCACCTACCC-3′ | 26
GPX1 | *1 (593C>T) | 57/34/9 | rs1050450 | F: 5′-ACTGGGATCAACAGGACCAG-3′; R: 5′-TTGACATCGAGCCTGACATC-3′ | 213 (3) | 5′-AAATAACTAAACTAGGTGCGGCGCCCTAGGCACAGCTG-3′ | 38
*Frequencies in the currently investigated population are shown as homozygous wild type/heterozygous/homozygous mutant. For GSTM1 and GSTT1, no distinction can be made between wild types (no deletions) and heterozygous gene deletions.
†(1) 8-plex PCR, (2) 7-plex PCR, (3) 4-plex PCR (see Materials and Methods).
‡Neutral nonbinding tails are in italics (see Materials and Methods).
Gene | Polymorphism | Effect on enzymatic function | Expected effect on DNA adduct level
---|---|---|---
CYP1A2 | *1F | Higher inducibility | Increased bioactivation, higher adduct levels
GSTM1 | *0 del | Deletion, no enzyme activity | Decreased detoxification, higher adduct levels
GSTP1 | *2 I105V | Decreased enzyme activity | Decreased detoxification, higher adduct levels
GSTP1 | *3 A114V | Decreased enzyme activity | Decreased detoxification, higher adduct levels
GSTT1 | *0 del | Deletion, no enzyme activity | Decreased detoxification, higher adduct levels
NAT2 | *5 I114T | Decreased enzyme activity | Less N-acetylation; decreased detoxification; higher adduct levels
NAT2 | *6 R197Q | Decreased enzyme activity | Less N-acetylation; decreased detoxification; higher adduct levels
NAT2 | *7 G286E | Decreased enzyme activity | Less N-acetylation; decreased detoxification; higher adduct levels
mEH | *2 Y113H | Decreased enzyme activity | Acts as phase II enzyme; decreased detoxification; increased adduct levels
mEH | *3 H139R | Increased enzyme activity | Acts as phase I enzyme; increased bioactivation; increased DNA adduct levels
XRCC1 | *2 R194W | Increased enzyme activity | Increased repair capacity, lower adduct levels
XRCC1 | *3 R280H | Decreased enzyme activity | Reduced repair capacity, higher adduct levels
XRCC1 | *4 Q399R | Decreased enzyme activity | Reduced repair capacity, higher adduct levels
XRCC3 | *1 T241M | Decreased enzyme activity | Reduced repair capacity, higher adduct levels
XPD | *5 K751Q | Decreased enzyme activity | Reduced repair capacity, higher adduct levels
BRCA2 | *1 D991N | Decreased enzyme activity | Reduced repair capacity, higher adduct levels
BRCA2 | *3 N372H | Decreased enzyme activity | Reduced repair capacity, higher adduct levels
NQO1 | *2 P187S | Reduced enzyme activity | Higher DNA adduct levels
GPX1 | *1 P198L | Less efficient final glutathione peroxidase complex | Higher DNA adduct levels
PCR Primer Design and Multiplex PCR Amplification
Primer 3 software (http://www.broad.mit.edu/cgi-bin/primer/primer3_www.cgi) and Netprimer software (http://www.premierbiosoft.com/netprimer/netprlaunch/netprlaunch.html) were used to design PCR primers [see Knaapen et al. (10) for more detailed information].
PCR was done in three separate multiplex PCR reactions: one 8-plex, one 7-plex, and one 4-plex reaction (indicated as 1-3, respectively, in Table 1). PCR was carried out in a Tgradient 96-well Thermal cycler (Biometra, Goettingen, Germany) in a 10 μL volume, containing PCR buffer (Invitrogen, Breda, the Netherlands), 0.2 mmol/L deoxynucleotide triphosphates (Invitrogen), 0.5 mmol/L MgCl2 (Invitrogen), 0.25 unit Platinum Taq-Polymerase (Invitrogen), and 40 ng template DNA. The final concentrations of the primers were 0.2 μmol/L. PCR conditions were 94°C for 3 minutes (denaturation); 30 cycles of 94°C for 30 seconds, 56°C for 30 seconds (for multiplex 1: 60°C for 2 seconds and 57°C for 30 seconds), and 72°C for 30 seconds; and a final extension for 5 minutes at 72°C. PCR products were subsequently incubated (37°C for 45 minutes) with 4 μL Exo-SAP-IT (Amersham, Roosendaal, the Netherlands) to digest contaminating deoxynucleotide triphosphates and PCR primers. Enzymes were deactivated at 75°C (15 minutes).
Multiplex Genotyping
Genotyping was done by SBE using SNaPshot (Applied Biosystems, Nieuwerkerk a.d. IJssel, the Netherlands) as described previously (10). SBE primers were designed using Primer 3 and Netprimer software to bind immediately adjacent 5′ to the specific SNP, with a template-specific part of 20 to 33 bp and a Tm of 66°C to 69°C (Table 1). After SBE, the samples were incubated at 37°C (1 hour) with 1 unit shrimp alkaline phosphatase (Amersham) to degrade the unincorporated dideoxynucleotide triphosphates. SBE reactions were done in three separate multiplex genotyping experiments on the products of the multiplex PCR reactions described above.
Subsequently, SBE products were diluted and mixed with deionized formamide containing Genescan 120 LIZ size standard and denatured at 95°C for 5 minutes and thereafter analyzed on an ABI Prism 3100 genetic analyzer using Genescan Analysis software (version 3.7; ref. 10).
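Within each multiplex, the SBE primers listed in Table 1 have different total lengths, which is presumably what lets the extension products be told apart by size on the genetic analyzer; the text does not state this explicitly, so it is treated here as an assumption. The minimal sketch below takes the four SBE primers of the 4-plex reaction as listed in Table 1 (with in-sequence line-break hyphens removed) and checks that each sequence matches its stated length and that no two lengths coincide. The function name `check_multiplex` is hypothetical.

```python
# SBE primers of the 4-plex reaction, transcribed from Table 1
# as (sequence, stated length in bp).
SBE_4PLEX = {
    "mEH*2": ("AACTGACTAAACTAGGTGGAAGAAGCAGGTGGAGATTCTCAACAGA", 46),
    "mEH*3": ("AACTGACTAAACTAGGTGCCACGTCGTGAAAGTCCAGCTGCCCGCAGGCC", 50),
    "NQO1*2": ("AAGCATTCAGAACCATCCACCTACCC", 26),
    "GPX1*1": ("AAATAACTAAACTAGGTGCGGCGCCCTAGGCACAGCTG", 38),
}


def check_multiplex(primers):
    """Return (mismatches, lengths_unique) for one multiplex.

    mismatches maps a primer name to (actual length, stated length)
    whenever the two disagree; lengths_unique is True when no two
    primers in the multiplex share the same length (assumed here to be
    the requirement for size-based separation).
    """
    mismatches = {
        name: (len(seq), stated)
        for name, (seq, stated) in primers.items()
        if len(seq) != stated
    }
    lengths = [len(seq) for seq, _ in primers.values()]
    return mismatches, len(set(lengths)) == len(lengths)


mismatches, lengths_unique = check_multiplex(SBE_4PLEX)
```

The same check can be run on the 8-plex and 7-plex primer sets by transcribing their rows from Table 1 in the same way.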
Statistical Analysis
Linear regression analysis was conducted to investigate the relationship between DNA adduct levels and the number of cigarettes smoked per day. To investigate which genetic polymorphisms contribute most to the interindividual variation within this relationship, the genotypes were coded based on the number of polymorphic alleles: 0 (two wild-type alleles), 1 (heterozygous, one polymorphic variant allele), and 2 (homozygous mutant, two polymorphic alleles). In case of deletions (GSTM1 and GSTT1), the wild type was coded 0, and the deletion was coded 2. Subsequently, SNPs in the same gene yielding the same phenotypic effect were merged into a single variable for that gene. This was done for NAT2, BRCA2, and GSTP1. Note that more than one SNP was also investigated for mEH and XRCC1; however, these SNPs have opposite phenotypic effects and therefore cannot be combined. This eventually led to 15 genotypes.
First, to evaluate the association between a single polymorphism and DNA adduct levels, conventional methods, such as Mann-Whitney U tests (for two groups) and Jonckheere-Terpstra tests (for more than two groups), were used. Second, to investigate the association between multiple polymorphisms and DNA adduct levels, individuals were divided into subgroups based on their response to exposure (number of cigarettes smoked per day) with respect to DNA adduct levels. Three subgroups (classes) were formed by using the regression line: class 1 represents an observed adduct level ≤0.66 times the expected value according to the regression line; class 3 represents an observed adduct level ≥1.5 times the expected value according to the regression line; and class 2 holds all other subjects (Fig. 1). Using discriminant analysis, each individual can be classified. Stepwise discriminant analysis was done with the dependent variable “class” as a grouping variable and all genotypes, age, gender, and cigarettes per day as independent variables.
Subsequently, results were cross-validated by the leave-one-out method. In all statistical tests, P < 0.05 was considered statistically significant. All statistics were done using SPSS for Windows (version 11.5). Results are expressed as mean ± SD.
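The division into three response classes can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study: it fits an ordinary least-squares line of adduct level on cigarettes per day and then applies the 0.66 and 1.5 thresholds given above. The function names and the four data points are hypothetical and are not taken from the study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def response_class(observed, expected):
    """Assign the response class from the observed/expected adduct ratio.

    Class 1: observed <= 0.66 x expected (low responder)
    Class 3: observed >= 1.5 x expected (high responder)
    Class 2: everything in between
    """
    ratio = observed / expected
    if ratio <= 0.66:
        return 1
    if ratio >= 1.5:
        return 3
    return 2


# Hypothetical example data (NOT from the study).
cigarettes_per_day = [10, 20, 30, 40]
adduct_levels = [0.4, 2.0, 0.9, 2.3]  # adducts per 10^8 nucleotides

intercept, slope = fit_line(cigarettes_per_day, adduct_levels)
classes = [
    response_class(a, intercept + slope * c)
    for c, a in zip(cigarettes_per_day, adduct_levels)
]
```

The resulting class labels would then serve as the grouping variable in the stepwise discriminant analysis, with the coded genotypes, age, gender, and cigarettes per day as candidate predictors.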
Results
Overall Analysis of DNA Adduct Levels
The mean DNA adduct level was 1.40 ± 0.79 adducts per 10⁸ nucleotides and ranged from <0.25 to 3.90 adducts per 10⁸ nucleotides. A significant relationship was observed between the self-reported number of cigarettes smoked per day (exposure) and DNA adduct level (Fig. 1). Large interindividual variations were observed within this relationship.
Multiplex SBE Genotyping
Clear signals were obtained for all polymorphisms in all individuals. No signals were detected in the negative control (using water as template). All observed genotype frequencies are shown in Table 1. The distributions of all genotypes were in Hardy-Weinberg equilibrium.
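The text does not say which test was used to assess Hardy-Weinberg equilibrium. One common choice, shown here purely as an assumption, is a chi-square goodness-of-fit test with 1 degree of freedom comparing observed genotype counts with those expected from the allele frequencies. The function name and the example counts are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_1DF_05 = 3.841


def hwe_chi_square(n_wt, n_het, n_mut):
    """Chi-square statistic for deviation from Hardy-Weinberg equilibrium.

    Arguments are the observed counts of homozygous wild-type,
    heterozygous, and homozygous variant individuals.
    """
    n = n_wt + n_het + n_mut
    p = (2 * n_wt + n_het) / (2 * n)  # wild-type allele frequency
    q = 1 - p                         # variant allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_wt, n_het, n_mut)
    # Skip genotype classes with zero expected count (monomorphic locus).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Hypothetical genotype counts for 63 individuals (NOT from the study).
statistic = hwe_chi_square(30, 25, 8)
in_equilibrium = statistic < CRITICAL_1DF_05
```

With only 63 subjects and some variant homozygotes absent, an exact test may be preferable to the chi-square approximation; the sketch is meant only to show the principle.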
Sum of Total Putative Risk Alleles in Relation to DNA Adduct Levels
As a first approach to investigate whether the genotype affects DNA adduct levels, the polymorphisms were a priori categorized as low-risk or high-risk alleles based on their expected modulating effect on DNA adduct levels (Table 2). Subsequently, the sum of risk alleles was computed for each individual, and linear regression showed a significant association between these sums and DNA adduct levels (P = 0.001; Fig. 2A), which was not due to differences in exposure (Fig. 2B).
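A minimal sketch of the risk-allele sum, using the 0/1/2 genotype coding from the Statistical Analysis section (deletions coded 0 or 2). How variants expected to lower adduct levels (such as XRCC1*2 in Table 2) entered the sum is not described; counting their wild-type alleles as the risk alleles, as done below, is an assumption. The function name and the example individual are hypothetical.

```python
def risk_allele_sum(genotypes, protective=()):
    """Sum the putative risk alleles for one individual.

    genotypes  -- dict mapping polymorphism name to its code:
                  0 = two wild-type alleles, 1 = heterozygous,
                  2 = homozygous variant (or homozygous deletion)
    protective -- names of polymorphisms whose variant allele is expected
                  to LOWER adduct levels; for these the wild-type alleles
                  are counted instead (assumption, not from the paper)
    """
    total = 0
    for name, code in genotypes.items():
        if code not in (0, 1, 2):
            raise ValueError(f"invalid genotype code for {name}: {code}")
        total += (2 - code) if name in protective else code
    return total


# Hypothetical individual (NOT from the study).
individual = {"GSTM1": 2, "mEH*2": 1, "GPX1*1": 1, "XRCC1*2": 0}
n_risk = risk_allele_sum(individual, protective={"XRCC1*2"})
```

The per-individual sums can then be regressed on DNA adduct level, as was done for Fig. 2A.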
Univariate and Multivariate Analysis
In the univariate analysis, DNA adduct levels were significantly higher only in GSTM1-null individuals (1.59 ± 0.80 per 10⁸ nucleotides) compared with GSTM1-positive subjects (1.05 ± 0.50; P < 0.01). To analyze the contribution of all genotypes simultaneously to the interindividual variation in DNA adduct levels, stepwise discriminant analysis was conducted. The predictors that seemed significant for this discrimination were GSTM1 (P < 0.001), mEH*2 (P = 0.001), GPX1 (P < 0.001), and exposure (P < 0.001). Classification results are shown in Table 3; 65.1% of the original grouped cases and 60.3% of the cross-validated grouped cases were correctly classified, whereas prediction by chance would have been only 34%.
Analysis | Actual class | Predicted class 1 | Predicted class 2 | Predicted class 3 | Total
---|---|---|---|---|---
Original (counts) | 1 | 13 | 3 | 1 | 17
Original (counts) | 2 | 3 | 17 | 6 | 26
Original (counts) | 3 | 2 | 7 | 11 | 20
Cross-validated (counts) | 1 | 11 | 5 | 1 | 17
Cross-validated (counts) | 2 | 3 | 17 | 6 | 26
Cross-validated (counts) | 3 | 2 | 8 | 10 | 20
NOTE: In cross-validation, each case is classified by the functions derived from all cases other than that particular case (n = 63 − 1, leave-one-out method); 65% of the original grouped cases and 60% of the cross-validated grouped cases were correctly classified.
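The percentages of correctly classified cases follow directly from the diagonals of the two classification matrices in Table 3. The sketch below recomputes them and also evaluates the proportional chance criterion (the sum of squared class proportions), which gives a value close to the 34% chance level quoted in the text; whether the authors derived their figure this way is an assumption. The function names are hypothetical.

```python
# Counts from Table 3: rows are actual classes 1-3,
# columns are predicted classes 1-3.
ORIGINAL = [[13, 3, 1], [3, 17, 6], [2, 7, 11]]
CROSS_VALIDATED = [[11, 5, 1], [3, 17, 6], [2, 8, 10]]


def accuracy(matrix):
    """Fraction of cases on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def proportional_chance(matrix):
    """Expected accuracy when cases are assigned at random in
    proportion to the actual class sizes (sum of squared proportions)."""
    total = sum(sum(row) for row in matrix)
    return sum((sum(row) / total) ** 2 for row in matrix)


original_pct = round(100 * accuracy(ORIGINAL), 1)
cross_validated_pct = round(100 * accuracy(CROSS_VALIDATED), 1)
chance_pct = round(100 * proportional_chance(ORIGINAL))
```

Here 41 of 63 cases lie on the diagonal of the original matrix and 38 of 63 on the cross-validated one, which reproduces the reported 65.1% and 60.3%.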
Effect of the Sum of GSTM1, mEH*2, and GPX1 on DNA Adduct Levels
The effect of the sum of GSTM1, mEH*2, and GPX1 risk alleles on DNA adduct level is shown in Fig. 2C, indicating that the sum of these risk alleles was associated with bulky DNA adduct level (P < 0.001). Individuals having four risk alleles for these three genes had higher DNA adduct levels (1.97 ± 1.026 per 10⁸ nucleotides) than individuals not possessing these particular risk alleles (0.79 ± 0.49).
Discussion
Several studies have shown relationships between cigarette smoking and DNA adduct levels (1, 5, 14). However, large interindividual differences in DNA adduct levels were observed between individuals with apparently similar exposures. Our results support the hypothesis that genetic polymorphisms explain part of this interindividual variation. The total sum of putatively high-risk alleles in 12 genes correlated with DNA adduct levels. Subsequently, we identified GSTM1*0, mEH*2, and GPX1*1 as the most relevant polymorphisms for lymphocytic DNA adduct levels in smokers. Notably, all three genes are involved in phase II metabolic processes. Furthermore, this is the first demonstration of the involvement of GPX1 in DNA adduct formation.
GSTM1, mEH, and GPX1 are enzymes involved in the biotransformation of carcinogenic compounds, including polycyclic aromatic hydrocarbons (15-18). GPX1 is involved in the detoxification of organic peroxides and in the conjugation of polycyclic aromatic hydrocarbon-diols to glutathione (1, 19, 20). In GPX1, the Pro198Leu substitution has been associated with a lower enzyme activity and increased lung cancer risk (20). This allelic variant would therefore be expected to lead to less detoxification and hence higher DNA adduct levels. Indeed, we observed that individuals carrying the slow allelic variant for GPX1 have higher adduct levels compared with wild-type individuals. Our data confirm previous studies, which described that individuals lacking the GSTM1 enzyme have higher DNA adduct levels compared with GSTM1-positive individuals (14, 21, 22). For mEH, the most intensively studied polymorphisms are the Tyr113His (mEH*2; exon 3) and His139Arg (mEH*3; exon 4) variants. The first variant results in a decreased mEH activity of ∼40%, whereas the latter results in increased enzymatic activity (17). In our study, an association was found for mEH*2, showing that individuals carrying the decreased activity variant had higher DNA adduct levels. In this context, mEH functions as a phase II enzyme, and the slow allelic variant (mEH*2) results in increased concentrations of epoxide intermediates and hence higher DNA adduct levels.
As described above, a relationship was found between the sum of risk alleles and bulky DNA adduct levels (Fig. 2A). This was also shown in a comparable approach by Matullo et al. (9). When focusing on the sum of risk alleles of the three polymorphisms that were identified by discriminant analysis (GSTM1, mEH*2, and GPX1*1), this relationship became stronger (Fig. 2C).
As stated in the Introduction, single genes (or polymorphisms) are unlikely to fully explain the interindividual variations in DNA adduct levels caused by cigarette smoking (9). We therefore focused on a combination of polymorphisms, using discriminant analysis. mEH*2 and GPX1*1, which were nonsignificant in the univariate analysis, may become significant when multiple polymorphisms are investigated simultaneously in the multivariate analysis, for instance because of interactions.
To conclude, our data indicate that assessing multiple genetic polymorphisms can explain part of the interindividual variations in DNA adduct levels and that the simultaneous analysis of many genotypes is important for gaining better insight into the mechanisms that modulate DNA adduct levels. Furthermore, for high-throughput genotyping studies, we consider discriminant analysis a meaningful statistical tool for investigating the effects of multiple genotypes.