Abstract
Theoretically, a haplotype has a higher level of heterozygosity than individual single nucleotide polymorphism (SNP) and the association study based on the haplotype may have an increased power for detecting disease associations compared with SNP-based analysis. In this study, we investigated the effects of four haplotype-tagging SNPs (htSNP) and the inferred haplotype pairs of the X-ray cross-complementing group 1 (XRCC1) gene on chromosome damage detected by the cytokinesis-block micronucleus assay. The study included 141 coke-oven workers with exposure to a high level of polycyclic aromatic hydrocarbons and 66 nonexposed controls. The frequencies of total MN and MNed cells were borderline associated with the Arg194Trp polymorphism (P = 0.053 and P = 0.050, respectively) but not associated with the Arg280His, Arg399Gln and Gln632Gln polymorphisms among coke-oven workers. Five haplotypes, including CGGG, TGGG, CAGG, CGAG, and CGGA, were inferred based on the four htSNPs of XRCC1 gene. The haplotype CGGG was associated with the decreased frequencies of total MN and MNed cells, and the haplotypes TGGG and CGAG were associated with the increased frequencies of total MN and MNed cells with adjustment for covariates among coke-oven workers. This study showed that the haplotypes derived from htSNPs in the XRCC1 gene were more likely than single SNPs to correlate with the polycyclic aromatic hydrocarbon–induced chromosome damage among coke-oven workers.
Introduction
There has been sufficient epidemiologic evidence suggesting an etiologic link between carcinogenic polycyclic aromatic hydrocarbon (PAH) exposure and increased lung cancer risk in coke-oven workers (1, 2). Besides bulky DNA adducts of carcinogenic PAHs, such as BPDE-DNA adduct, a wide variety of nonbulky base damages and single-strand breaks formed by free radicals during metabolic activation of PAHs are also involved in the PAH carcinogenesis (3-5). Therefore, genetic variation in an individual's ability to repair DNA base damage and single-strand breaks may confer differential risk for PAH-induced lung cancer. In humans, the base excision repair (BER) is responsible for the repair of oxidative DNA damages (6). Because the chromosome damage has been recognized as an important early biological event in chemical carcinogenesis, exploring the relationship between genetic polymorphisms of BER proteins and PAH-induced chromosomal damages is important for our understanding of PAH carcinogenesis.
The X-ray cross-complementing group 1 (XRCC1) protein acts as a facilitator or coordinator through its interaction with poly(ADP-ribose) polymerase (PARP), DNA polymerase β, and DNA ligase III in BER and single-strand break repair (7). Multiple XRCC1 gene variants exist due to several important coding single nucleotide polymorphisms (SNP), among which the Arg399Gln polymorphism has been extensively studied because of its relatively high variant frequency in humans. Most of the epidemiologic studies found an association between the Gln399 variant of the XRCC1 gene and decreased DNA repair capacity, with several exceptions (reviewed in ref. 8). In addition, an in vitro expression study found that the Arg399 and Gln399 alleles equally complemented both the single-strand break repair defect and the sensitivity to methyl methanesulfonate in XRCC1-deficient EM9 cells (9). One possible explanation for the apparent inconsistency on the functional significance of the Arg399Gln polymorphism of XRCC1 may be that the analysis based on a single SNP did not reflect the subtle interactions or linkage disequilibrium among many SNPs in the XRCC1 gene.
A haplotype is a group of several SNPs linked physically on a single chromosome. Theoretically, a haplotype has a higher level of heterozygosity than single SNP and the association study based on the haplotype may have an increased power to detect disease associations (10-12) compared with SNP-based analysis, which has been evidenced in association studies between the haplotypes of the cholesteryl ester transfer protein gene and lipid-modifying response to statin therapy (13) and between the haplotypes of β2-adrenergic receptor gene and receptor expression and in vivo responsiveness (14). Recently, four haplotype-tagging SNPs (htSNP), including Arg194Trp, Arg280His, Arg399Gln, and Gln632Gln, have been reconstructed based on the resequencing data of the XRCC1 gene from the NIH DNA Polymorphism Discovery Resource (15) and five common (>5%) haplotypes were inferred in this mixed population group. The delineations of htSNPs of XRCC1 gene provide us the opportunity to make comparisons between inferred haplotypes and individual SNPs for associations with the biomarkers of DNA damage/repair or risk of cancers in molecular epidemiology study.
Our previous findings showed that the cytokinesis-block micronucleus (CBMN) assay in lymphocytes was suitable to detect the PAH-induced chromosome damage in coke-oven workers with exposure to a high level of PAHs and that genetic polymorphism of the microsomal epoxide hydrolase (mEH) involved in benzo[a]pyrene dihydrodiol epoxide metabolism was a major inherited metabolic factor modulating chromosome damage among the coke-oven workers (16). Theoretically, the chromosome damage detected by the CBMN assay provides genotoxic end points of PAH exposure as a result of gene-environment interaction of both metabolic and DNA repair enzymes. Therefore, these findings lead to our determination of the role of single htSNP and the inferred haplotype pairs of the XRCC1 gene in chromosome damage detected by the CBMN assay among 141 coke-oven workers and 66 non–coke-oven workers. Our hypothesis is that the XRCC1 gene haplotypes composed of variants of multiple SNPs may be a more appropriate tool for assessing host environment disease associations.
Materials and Methods
Study Population and Sample Collection
This study was approved by the Research Ethic Committee of the National Institute of Occupational Health and Poison Control, Chinese Centre for Disease Control and Prevention, Beijing, China. The details of this cross-sectional study were described previously (16). In brief, 141 coke-oven workers were recruited into this study as the PAH-exposed group. Sixty-six medical staff members without work-related PAH exposure were recruited as nonexposed controls. Exclusion criteria for participation in the study included recent treatment with mutagenic agents (such as X-ray), chronic conditions (such as autoimmune diseases), and recent acute infections that required medications such as antibiotics. The demographic information, smoking history, alcohol status, history of occupational exposure, and personal medical history of all participants were collected by questionnaire after informed consent was obtained. Biological samples, including shift-end urine and venous blood, were obtained from each subject and coded after collection.
PAH Exposure Assessment
The air concentrations of benzene-soluble matter and particulate-phase benzo[a]pyrene in the working environment of coke-oven workers and non–coke-oven workers were sampled ∼1.5 months before urine and blood sample collection and were analyzed according to the OSHA method no. 58 (17). The excretion of urinary 1-hydroxypyrene as the internal dose of personal recent PAH exposure was measured according to the method of Jongeneelen et al. (18) with a few modifications (16). Measurements below the limit of detection (LOD) were replaced with
CBMN Assay Using Peripheral Blood Lymphocytes
The CBMN assay was done according to the standard method as previously described (20). Standard scoring criteria for selecting binucleated cells and identifying a micronucleus were adopted (21). All slides were coded and scored blindly by an experienced scorer. Total MN (the frequency of micronuclei per 1,000 binucleated lymphocytes) and MNed cells (the frequency of micronucleated cells per 1,000 binucleated lymphocytes) were scored as chromosomal damage indexes.
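The two indexes differ only in what is counted per binucleated cell. A minimal Python sketch, assuming the scorer records the number of micronuclei seen in each binucleated cell; the function name `cbmn_indexes` and the example scores are hypothetical and not taken from the study:

```python
def cbmn_indexes(mn_per_cell):
    """Return (total MN, MNed cells), both per 1,000 binucleated cells.

    mn_per_cell: one integer per scored binucleated cell, giving the
    number of micronuclei observed in that cell (hypothetical encoding).
    """
    n_cells = len(mn_per_cell)
    if n_cells == 0:
        raise ValueError("no binucleated cells were scored")
    # Total MN counts every micronucleus, so a cell with two counts twice.
    total_mn = sum(mn_per_cell)
    # MNed cells counts each cell with at least one micronucleus once.
    mned_cells = sum(1 for count in mn_per_cell if count > 0)
    return 1000 * total_mn / n_cells, 1000 * mned_cells / n_cells


# Hypothetical slide: 1,000 cells, 6 with one micronucleus, 2 with two.
scores = [0] * 992 + [1] * 6 + [2] * 2
total_mn, mned = cbmn_indexes(scores)
print(total_mn, mned)
```

Total MN is therefore always at least as large as MNed cells, which matches the pattern of the group means reported in the Results.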
XRCC1 Genotyping and Haplotype Modeling
DNA was isolated from whole blood by using the standard method (22). We genotyped for Arg194Trp, Arg280His, Arg399Gln, and Gln632Gln of the XRCC1 gene according to the published protocols (23-25). The Tyr113His and His139Arg polymorphisms of the mEH gene were also analyzed according to the published methods (26), and three mEH activity phenotypes (low, intermediate, and high) were assigned as previously described (16). All genotypes were evaluated and agreed upon by at least two persons independently. Ten percent of the DNA samples were genotyped a second time and the concordance was 100%.
The htSNPs of XRCC1 gene were ascertained in reference to studies of Han et al. (15) and Hao et al. (27). In Han et al.'s study, a multiethnic group of 90 samples (23 European Americans, 23 African Americans, 11 Mexican Americans, 11 Native Americans, and 22 Asian Americans) from the NIH DNA Polymorphism Discovery Resource available from the Coriell Institute for Medical Research was resequenced for the exons and adjacent intronic and noncoding regions of the XRCC1 gene and 17 SNPs in XRCC1 gene with >1% allele frequency were found. On the basis of these selected 17 SNPs, five common haplotypes (>5%) were estimated by the Partition-Ligation EM algorithm and only four htSNPs, including Arg194Trp, Arg280His, Arg399Gln, and Gln632Gln, were necessary to reconstruct the five common haplotypes in this mixed population. There was no recombination hotspot in the XRCC1 gene. In Hao et al.'s study, the exons and adjacent intronic and noncoding regions of the XRCC1 gene were resequenced in 27 apparently normal Chinese individuals, and 18 SNPs with >1% allele frequency were found. Four SNPs, including −77T/C at 5′ untranslated region, Arg194Trp, Arg280His, and Arg399Gln, were selected as the important tagging SNPs on the basis of their functional potentials and pairwise linkage disequilibrium with other SNPs in XRCC1 gene and five common haplotypes (>5%) were reconstructed. In our 207 Chinese participants, the Gln632Gln was in almost complete linkage disequilibrium with −77T/C (D′ = 0.96 and r² = 0.84). Therefore, the htSNPs of Arg194Trp, Arg280His, Arg399Gln, and Gln632Gln of XRCC1 gene were selected in the present study. XRCC1 haplotypes were estimated from the unphased genotypes by using an extension of Clark's algorithm (28). Haplotypes were assigned directly from individuals who were homozygous at all sites or heterozygous at only one site. 
The list of known haplotypes was then used to deconvolute the unphased genotypes in the remaining (multiple heterozygous) individuals. We further used a Bayesian statistical method (29) to confirm the haplotypes inferred for the XRCC1 gene by Clark's algorithm. The consistency between the Bayesian statistical method and Clark's algorithm was 100%.
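The two-step assignment described above can be sketched in Python. This is a simplified illustration of the basic Clark approach, not the extended algorithm (28) used in the study; the genotype encoding (one pair of alleles per site, in the order Arg194Trp, Arg280His, Arg399Gln, Gln632Gln), the tie-breaking rule, and all function names are assumptions made for the example.

```python
def unambiguous_pair(genotype):
    """Return the haplotype pair if at most one site is heterozygous, else None."""
    het_sites = [i for i, (a, b) in enumerate(genotype) if a != b]
    if len(het_sites) > 1:
        return None
    first = "".join(a for a, _ in genotype)
    second = "".join(b for _, b in genotype)
    return first, second


def complement(genotype, hap):
    """Return the haplotype that pairs with `hap` to give `genotype`, or None."""
    other = []
    for (a, b), allele in zip(genotype, hap):
        if allele == a:
            other.append(b)
        elif allele == b:
            other.append(a)
        else:
            return None  # hap is incompatible with this genotype
    return "".join(other)


def clark_phase(genotypes):
    """Resolve genotypes into haplotype pairs, keyed by subject index."""
    resolved, known = {}, set()
    # Step 1: subjects with zero or one heterozygous site are unambiguous.
    for idx, genotype in enumerate(genotypes):
        pair = unambiguous_pair(genotype)
        if pair is not None:
            resolved[idx] = pair
            known.update(pair)
    # Step 2: explain the remaining subjects with already-known haplotypes.
    progress = True
    while progress:
        progress = False
        for idx, genotype in enumerate(genotypes):
            if idx in resolved:
                continue
            candidates = []
            for hap in sorted(known):
                other = complement(genotype, hap)
                if other is not None:
                    candidates.append((hap, other))
            if not candidates:
                continue
            # Assumed tie-break: prefer pairs whose second haplotype is known too.
            candidates.sort(key=lambda pair: pair[1] not in known)
            resolved[idx] = candidates[0]
            known.add(candidates[0][1])
            progress = True
    return resolved


# Hypothetical subjects; the last one is heterozygous at two sites.
subjects = [
    [("C", "C"), ("G", "G"), ("G", "G"), ("G", "G")],
    [("T", "T"), ("G", "G"), ("G", "G"), ("G", "G")],
    [("C", "C"), ("G", "G"), ("A", "G"), ("G", "G")],
    [("C", "T"), ("G", "G"), ("A", "G"), ("G", "G")],
]
phased = clark_phase(subjects)
print(phased[3])
```

The double heterozygote is resolved as CGAG/TGGG because both haplotypes were already seen in unambiguous subjects; the alternative phase (CGGG/TGAG) would require a haplotype not yet observed.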
Statistical Analysis
The CBMN data (total MN and MNed cells) were ln-transformed to normalize the variance. The comparisons of CBMN data between different genotypes or haplotype pairs of XRCC1 gene were tested by multiple analysis of covariance with adjustment for ln-transformed urinary 1-hydroxypyrene, age, sex, and mEH phenotypes in exposed and nonexposed groups. If an overall F test for variables of genotypes or haplotype pairs showed significance, multiple comparisons were carried out by the Dunnett-Hsu method with wild genotypes or haplotype pairs as the reference groups. In the multiple analysis of covariance model, mEH phenotypes (low, intermediate, and high) were coded as dummy variables. The micronuclei formation in coke-oven workers was considered as the combined effect of environmental PAH exposure and genes involved in PAH metabolism and DNA repair; therefore, in the multivariate models, the urinary 1-hydroxypyrene levels and the mEH phenotypes were adjusted to remove the effects of PAH exposure and metabolism on micronuclei formation and to assess the effects of XRCC1 polymorphisms. Because urinary 1-hydroxypyrene was reported as an index for both occupational and environmental PAH exposure (30) and there was a significant association between cigarettes per day and urinary 1-hydroxypyrene found in this study population (16), the variable of cigarettes per day was not adjusted in the covariance analysis. Multiple regression analysis was used to assess the potential gene-dose effect for haplotypes CGGG, TGGG, and CGAG with ln-transformed CBMN data as the dependent variable. To give a convenient summary of the CBMN result in the tables, the means and SD of the nontransformed CBMN data were presented. For urinary 1-hydroxypyrene, geometric mean followed by 95% confidence interval was used. All statistical tests were two sided (α = 0.05) and done by using Statistical Analysis System software (version 8.0; SAS Institute, Inc., Cary, NC).
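As a rough illustration of how the covariates described above could be laid out for one subject, the following Python sketch builds a single design-matrix row. It is not the SAS code used in the study: the choice of "low" as the reference mEH phenotype, the 0/1 coding of sex, and the function names are assumptions, and the study does not state how zero micronucleus counts were handled before ln-transformation.

```python
import math

# Three mEH phenotypes need two dummy variables; "low" as the
# reference level is an assumption made for this sketch.
MEH_LEVELS = ("low", "intermediate", "high")


def design_row(urinary_1ohp, age, sex, meh_phenotype):
    """Return [intercept, ln(1-OHP), age, male, mEH_intermediate, mEH_high]."""
    if meh_phenotype not in MEH_LEVELS:
        raise ValueError(f"unknown mEH phenotype: {meh_phenotype}")
    return [
        1.0,
        math.log(urinary_1ohp),  # ln-transformed urinary 1-hydroxypyrene
        float(age),
        1.0 if sex == "male" else 0.0,
        1.0 if meh_phenotype == "intermediate" else 0.0,
        1.0 if meh_phenotype == "high" else 0.0,
    ]


def ln_response(frequency_per_1000):
    """ln-transform a CBMN frequency; undefined for zero, so reject it here."""
    if frequency_per_1000 <= 0:
        raise ValueError("ln is undefined for non-positive frequencies")
    return math.log(frequency_per_1000)


# Hypothetical subject: 1-OHP of 12.0 umol/mol creatinine, age 39, male, high mEH.
row = design_row(12.0, 39, "male", "high")
print(row, ln_response(9.0))
```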
Results
The demographic data of the study population were previously described in detail (16). In brief, the mean age of coke-oven workers was 39.0 years and that of non–coke-oven workers was 38.1 years. The sex ratio (women/men) was 13:128 in coke-oven workers and 9:57 in non–coke-oven workers. The percentages of current smokers and the number of cigarettes smoked per day were higher in coke-oven workers than in non–coke-oven workers (63.8% versus 36.4%, P = 0.001; 8.3 versus 5.1 cigarettes per day, P = 0.003).
The PAH exposure levels and the relationship between PAH exposure and CBMN frequencies among coke-oven workers and non–coke-oven workers have been described in detail previously (16). In brief, the medians of air benzene-soluble matter and particulate-phase benzo[a]pyrene were significantly higher in coke-oven (benzene-soluble matter, 328.6 μg/m3 and benzo[a]pyrene, 926.9 ng/m3; n = 30) than in non–coke-oven working environment (benzene-soluble matter, 97.8 μg/m3 and benzo[a]pyrene, 49.1 ng/m3; n = 10; P < 0.001 for both indexes). The geometric mean of urinary 1-hydroxypyrene was significantly higher in coke-oven workers than in non–coke-oven workers (12.0, 95% confidence interval, 10.4-13.9 versus 0.7, 95% confidence interval, 0.6-1.3; μmol/mol creatinine; P < 0.001). The mean frequencies of total MN and MNed cells were 9.54 ± 6.59‰ and 8.85 ± 5.96‰ in coke-oven workers, which were significantly higher than those in non–coke-oven workers (3.98 ± 3.59‰, 3.80 ± 3.27‰; P < 0.001 for both indexes).
The distributions of the genotypes of the four htSNPs of the XRCC1 gene in all subjects were in Hardy-Weinberg equilibrium. Five XRCC1 haplotypes were inferred from the four htSNPs and accounted for 100% of the chromosomes in the 207 Han Chinese study subjects (Table 1), with TGGG having the highest frequency (0.34). Among these five common inferred haplotypes, one haplotype (CGGG) carried the wild-type allele at all four SNPs; each of the other four haplotypes carried only one variant SNP of the four polymorphic sites exclusively, which meant that none of these four htSNPs was in linkage disequilibrium. The haplotypes were assembled as pairs; 14 haplotype pairs were found in these 207 subjects and are shown in Table 2.
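The study does not state which test was used to check Hardy-Weinberg equilibrium. A chi-square goodness-of-fit test is a common choice and is assumed in the Python sketch below; the function name is hypothetical. The Arg194Trp genotype counts are derived from the haplotype pairs reported in Table 2 (TT = TGGG/TGGG, CT = pairs with exactly one TGGG, CC = all remaining pairs).

```python
def hwe_chi_square(n_ref_hom, n_het, n_var_hom):
    """Chi-square statistic (1 df) for deviation from Hardy-Weinberg proportions."""
    n = n_ref_hom + n_het + n_var_hom
    q = (2 * n_var_hom + n_het) / (2 * n)  # variant allele frequency
    p = 1.0 - q
    if p == 0.0 or q == 0.0:
        raise ValueError("site is monomorphic; the test is not defined")
    observed = (n_ref_hom, n_het, n_var_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Arg194Trp in all 207 subjects, derived from the Table 2 pair counts:
# TT = 20; CT = 48 + 29 + 12 + 10 = 99; CC = 207 - 119 = 88.
chi2 = hwe_chi_square(88, 99, 20)
CRITICAL_1DF_ALPHA_05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05
print(round(chi2, 2), chi2 < CRITICAL_1DF_ALPHA_05)
```

A statistic below the critical value is consistent with the statement that the genotype distribution does not depart from Hardy-Weinberg equilibrium.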
Table 1. HtSNPs and the inferred XRCC1 haplotypes in the study subjects (n = 207)
Arg194Trp | Arg280His | Arg399Gln | Gln632Gln | Haplotype alleles | Frequency (%)
---|---|---|---|---|---
C | G | G | G | CGGG | 24.40
T | G | G | G | TGGG | 33.57
C | A | G | G | CAGG | 8.94
C | G | A | G | CGAG | 26.09
C | G | G | A | CGGA | 7.49
Table 2. Distribution of haplotype pairs of XRCC1 gene in the study subjects (n = 207)
Haplotype pairs | Chr A Arg194Trp | Chr A Arg280His | Chr A Arg399Gln | Chr A Gln632Gln | Chr B Arg194Trp | Chr B Arg280His | Chr B Arg399Gln | Chr B Gln632Gln | Total n (%)
---|---|---|---|---|---|---|---|---|---
TGGG/CGAG | T | G | G | G | C | G | A | G | 48 (23.2)
CGGG/TGGG | C | G | G | G | T | G | G | G | 29 (14.0)
CGGG/CGAG | C | G | G | G | C | G | A | G | 23 (11.1)
TGGG/TGGG | T | G | G | G | T | G | G | G | 20 (9.7)
CGGG/CGGG | C | G | G | G | C | G | G | G | 14 (6.8)
TGGG/CGGA | T | G | G | G | C | G | G | A | 12 (5.8)
CGAG/CGAG | C | G | A | G | C | G | A | G | 12 (5.8)
TGGG/CAGG | T | G | G | G | C | A | G | G | 10 (4.8)
CGGG/CAGG | C | G | G | G | C | A | G | G | 10 (4.8)
CGGG/CGGA | C | G | G | G | C | G | G | A | 9 (4.3)
CAGG/CGAG | C | A | G | G | C | G | A | G | 8 (3.9)
CAGG/CGGA | C | A | G | G | C | G | G | A | 5 (2.4)
CGAG/CGGA | C | G | A | G | C | G | G | A | 5 (2.4)
CAGG/CAGG | C | A | G | G | C | A | G | G | 2 (1.0)
NOTE: Chromosomes A and B are arbitrarily assigned.
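Because every subject contributes two chromosomes, haplotype frequencies follow directly from the haplotype pair counts. The Python sketch below recomputes them from the pair counts listed above; the function and variable names are hypothetical.

```python
from collections import Counter

# Haplotype pair counts as listed in the table above (n = 207 subjects).
PAIR_COUNTS = {
    "TGGG/CGAG": 48, "CGGG/TGGG": 29, "CGGG/CGAG": 23, "TGGG/TGGG": 20,
    "CGGG/CGGG": 14, "TGGG/CGGA": 12, "CGAG/CGAG": 12, "TGGG/CAGG": 10,
    "CGGG/CAGG": 10, "CGGG/CGGA": 9, "CAGG/CGAG": 8, "CAGG/CGGA": 5,
    "CGAG/CGGA": 5, "CAGG/CAGG": 2,
}


def haplotype_frequencies(pair_counts):
    """Return each haplotype's share of all chromosomes, in percent."""
    chromosomes = Counter()
    for pair, n_subjects in pair_counts.items():
        # A homozygous pair such as TGGG/TGGG adds two copies per subject.
        for hap in pair.split("/"):
            chromosomes[hap] += n_subjects
    total = sum(chromosomes.values())
    return {hap: 100 * count / total for hap, count in chromosomes.items()}


freqs = haplotype_frequencies(PAIR_COUNTS)
for hap in ("CGGG", "TGGG", "CAGG", "CGAG", "CGGA"):
    print(hap, round(freqs[hap], 2))
```

Computed this way, TGGG, CAGG, CGAG, and CGGA reproduce the percentages in Table 1; CGGG comes out at 23.91% (99 of 414 chromosomes), slightly below the 24.40% listed there.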
Table 3 shows the effects of XRCC1 gene htSNP genotypes on the frequencies of total MN and MNed cells among coke-oven workers. Borderline significant associations between the XRCC1 gene Arg194Trp polymorphism and the frequencies of total MN and MNed cells were found in coke-oven workers (P = 0.053 and P = 0.050, analysis of covariance with adjustment for covariates), and multiple comparisons using the Dunnett-Hsu method found that coke-oven workers with the CT genotype had significantly higher frequencies of total MN and MNed cells than those with the CC genotype (P = 0.038 and P = 0.041 with adjustment for covariates). The coke-oven workers with the TT genotype had the highest CBMN frequencies among the three genotypes of the Arg194Trp polymorphism; however, because of the limited number of subjects with the TT genotype (n = 13), the difference between the TT and CC genotypes did not reach the significance level. Borderline significance was also found for the difference in the frequencies of total MN and MNed cells by Gln632Gln genotypes among coke-oven workers. No association between the Arg280His and Arg399Gln polymorphisms and the frequencies of total MN and MNed cells was found in coke-oven workers, and no association between these four SNPs and the frequencies of total MN and MNed cells was found in non–coke-oven workers (data not shown).
Table 3. MNed cells (‰) and total MN frequencies (‰) by SNPs of XRCC1 gene in coke-oven workers
SNP genotypes | n | MNed cells (mean ± SD) | P* | Total MN (mean ± SD) | P*
---|---|---|---|---|---
Arg194Trp | | | | |
CC | 53 | 7.70 ± 5.78 | Reference† | 8.36 ± 6.59 | Reference†
CT | 75 | 9.32 ± 5.62 | 0.041 | 10.08 ± 6.23 | 0.038
TT | 13 | 10.85 ± 7.93 | 0.213 | 11.23 ± 8.30 | 0.290
Arg280His | | | | |
GG | 117 | 9.15 ± 6.09 | Reference† | 9.80 ± 6.72 | Reference†
GA | 22 | 7.09 ± 4.99 | 0.330 | 8.00 ± 5.88 | 0.481
AA | 2 | 11.00 ± 7.07 | 0.821 | 11.00 ± 7.07 | 0.888
Arg399Gln | | | | |
GG | 73 | 8.34 ± 5.50 | Reference† | 8.90 ± 5.96 | Reference†
GA | 60 | 9.17 ± 6.21 | 0.954 | 9.98 ± 6.92 | 0.972
AA | 8 | 11.13 ± 8.01 | 0.624 | 12.00 ± 9.35 | 0.660
Gln632Gln | | | | |
GG | 119 | 8.63 ± 6.15 | Reference† | 9.29 ± 6.80 | Reference†
GA | 22 | 10.05 ± 4.74 | 0.101 | 10.91 ± 5.28 | 0.098
AA | 0 | | | |
*Multiple analysis of covariance tests for differences in ln-transformed data between genotypes with adjustment for ln-transformed urinary 1-hydroxypyrene, age, sex, and mEH phenotypes in coke-oven workers.
†Multiple comparisons using Dunnett-Hsu method with wild genotypes as the reference.
The frequencies of total MN for individuals with each haplotype pair among coke-oven workers and non–coke-oven workers are shown in Fig. 1. To retain statistical power, we combined all haplotype pairs with a frequency of <0.05 as “others.” The frequencies of total MN were significantly associated with haplotype pairs of XRCC1 gene with adjustment for ln-transformed urinary 1-hydroxypyrene, age, sex, and mEH phenotypes (P = 0.051). Multiple comparisons using the Dunnett-Hsu method found that coke-oven workers with the wild haplotype pair CGGG/CGGG of XRCC1 gene had a significantly lower frequency of total MN (4.11 ± 2.09‰) than those with CGGG/TGGG (8.41 ± 5.48‰, P = 0.043), TGGG/TGGG (11.23 ± 8.30‰, P = 0.047), CGAG/CGAG (12.00 ± 9.35‰, P = 0.047), TGGG/CGAG (10.45 ± 6.96‰, P = 0.034), and TGGG/CGGA (11.60 ± 3.37‰, P = 0.006) and a nonsignificantly lower frequency of total MN than those with TGGG/CAGG (9.88 ± 6.77‰, P = 0.077), others (8.58 ± 6.10‰, P = 0.177), and CGGG/CGAG (8.67 ± 6.62‰, P = 0.529) with adjustment for covariates. No significant association between XRCC1 gene haplotype pairs and CBMN frequencies among non–coke-oven workers was found. The association between MNed cells and XRCC1 gene haplotype pairs was similar to that between total MN and XRCC1 gene haplotype pairs among coke-oven workers and non–coke-oven workers (data not shown).
Figure 1. Frequencies of total MN by haplotype pairs of XRCC1. The frequencies of total MN were significantly related to XRCC1 haplotype pairs among coke-oven workers but not among non–coke-oven workers. “Others” of coke-oven workers include CAGG/CAGG, CAGG/CGAG, CAGG/CGGA, CGAG/CGGA, CGGG/CGGA, and CGGG/CAGG. “Others” of non–coke-oven workers include CAGG/CAGG, CAGG/CGAG, CAGG/CGGA, CGAG/CGGA, TGGG/CGGA, and TGGG/CAGG. Columns, means; bars, SE. See text for additional statistical analysis.
Because zero, one, or two copies of haplotypes CGGG and TGGG were present in our population as haplotype pairs CGGG/CGGG, CGGG/TGGG, and TGGG/TGGG, a potential gene-dose effect could be assessed by multiple regression analysis. Such an analysis indeed showed an increasing trend of the frequencies of total MN and MNed cells with the copy number of haplotype TGGG (P = 0.011 and P = 0.009, n = 39; Table 4). A similar analysis with haplotypes CGGG and CGAG also showed an increasing trend of the frequencies of total MN and MNed cells with the copy number of haplotype CGAG (P = 0.024 and P = 0.027, n = 29; Table 5) among coke-oven workers. On the other hand, in non–coke-oven workers, no significant associations were found (data not shown). Because of the low frequencies of CAGG and CGGA in our study population, a similar analysis could not be carried out for these two haplotypes.
Table 4. MNed cells (‰) and total MN frequencies (‰) by XRCC1 gene haplotype pairs of CGGG and TGGG in coke-oven workers
Haplotype pairs | n | MNed cells (mean ± SD) | Total MN (mean ± SD)
---|---|---|---
CGGG/CGGG | 9 | 4.00 ± 2.06 | 4.11 ± 2.09
CGGG/TGGG | 17 | 7.82 ± 4.99 | 8.41 ± 5.48
TGGG/TGGG | 13 | 10.85 ± 7.93 | 11.23 ± 8.30
P* | | 0.009 | 0.011
*P value is the haplotype pairs term from a regression model for ln-transformed CBMN data with covariates of ln-transformed urinary 1-hydroxypyrene, age, sex, and mEH phenotypes and the haplotype pairs coded as an additive (0, 1, 2) term.
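The additive (0, 1, 2) coding can be illustrated with a short Python sketch. It only shows how a haplotype pair maps to a copy number and how an unadjusted trend slope on ln-transformed data could be computed; the covariates used in the study's model are deliberately omitted, the function names are hypothetical, and the individual records are invented for illustration.

```python
import math


def copy_number(pair, haplotype):
    """Number of copies (0, 1, or 2) of `haplotype` in a pair such as 'CGGG/TGGG'."""
    return pair.split("/").count(haplotype)


def ols_slope(x, y):
    """Least-squares slope of y on x (simple regression, no covariates)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical subjects: (haplotype pair, total MN per 1,000 cells).
records = [
    ("CGGG/CGGG", 3.0), ("CGGG/CGGG", 5.0),
    ("CGGG/TGGG", 7.0), ("CGGG/TGGG", 9.0),
    ("TGGG/TGGG", 10.0), ("TGGG/TGGG", 13.0),
]
x = [copy_number(pair, "TGGG") for pair, _ in records]
y = [math.log(mn) for _, mn in records]
slope = ols_slope(x, y)
print(x, round(slope, 3))
```

A positive slope corresponds to CBMN frequencies rising with each additional copy of the haplotype; in the study this term was estimated alongside urinary 1-hydroxypyrene, age, sex, and mEH phenotype.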
Table 5. MNed cells (‰) and total MN frequencies (‰) by XRCC1 gene haplotype pairs of CGGG and CGAG in coke-oven workers
Haplotype pairs | n | MNed cells (mean ± SD) | Total MN (mean ± SD)
---|---|---|---
CGGG/CGGG | 9 | 4.00 ± 2.06 | 4.11 ± 2.09
CGGG/CGAG | 12 | 7.83 ± 5.57 | 8.67 ± 6.62
CGAG/CGAG | 8 | 11.13 ± 8.01 | 12.00 ± 9.35
P* | | 0.027 | 0.024
*P value is the haplotype pairs term from a regression model for ln-transformed CBMN data with covariates of ln-transformed urinary 1-hydroxypyrene, age, sex, and mEH phenotypes and the haplotype pairs coded as an additive (0, 1, 2) term.
We further classified the study population into three groups based on the number of CGGG haplotype they carried and found that the frequencies of total MN and MNed cells decreased significantly with the number of haplotype CGGG in coke-oven workers (Table 6). On the other hand, in non–coke-oven workers, the frequencies of total MN and MNed cells decreased nonsignificantly with the number of haplotype CGGG (data not shown).
Table 6. MNed cells (‰) and total MN frequencies (‰) by number of CGGG haplotype of XRCC1 gene in coke-oven workers
Number of CGGG | n | MNed cells (mean ± SD) | P* | Total MN (mean ± SD) | P*
---|---|---|---|---|---
2 | 9 | 4.00 ± 2.06 | Reference† | 4.11 ± 2.09† | Reference†
1 | 41 | 7.71 ± 5.04 | 0.038 | 8.27 ± 5.64 | 0.034
0 | 91 | 9.85 ± 6.29 | 0.003 | 10.65 ± 6.94 | 0.003
*Multiple analysis of covariance tests for differences in ln-transformed data between haplotype pairs with adjustment for ln-transformed urinary 1-hydroxypyrene, age, sex, and mEH phenotypes in coke-oven workers.
†Multiple comparisons using Dunnett-Hsu method with wild haplotype pairs as the reference.
Discussion
In the present study, we genotyped four htSNPs, including Arg194Trp, Arg280His, Arg399Gln, and Gln632Gln, of the XRCC1 gene and inferred five common haplotype alleles (CGGG, TGGG, CAGG, CGAG, and CGGA) in 207 study subjects of Han Chinese. The association between haplotype CGGG of XRCC1 gene and decreased CBMN frequencies and between haplotypes TGGG and CGAG and increased CBMN frequencies suggested that the haplotype CGGG may be related to the increased DNA repair capacity to PAH-induced chromosome damage, whereas the haplotypes TGGG and CGAG may be related to the decreased DNA repair capacity.
The significant influence of XRCC1 polymorphisms on micronuclei formation is biologically plausible, although the chromosome breaks detected by the CBMN assay are derived from double-strand breaks. Studies showed that the XRCC1 mutant cell lines EM9 and EM11 had a significantly reduced rate of rejoining of single-strand breaks and double-strand breaks induced by ionizing radiation (31-33). However, these results did not necessarily mean that XRCC1 protein was directly involved in double-strand break repair (7), because it was evident that a longer lifetime of persistent single-strand breaks would increase the likelihood that the breaks created in close opposition existed simultaneously and were converted to double-strand breaks (34, 35). For PAH genotoxicity, the single-strand breaks can be formed directly by the attack of free radicals as well as indirectly by repair of unstable damaged bases in the BER pathway. Therefore, the defective or variant XRCC1 protein may decrease the ability for single-strand break repair and, in turn, increase the conversion of single-strand breaks to double-strand breaks.
The effect of the XRCC1 gene CGAG haplotype on the PAH-induced chromosome damage among coke-oven workers supported previous association studies based on single SNP that the Gln399 variant was associated with decreased DNA repair capacity (reviewed in ref. 8) and increased risk for several cancers (reviewed in ref. 12). At present, the mechanism responsible for the association of the XRCC1 Gln399 allele and decreased DNA repair capacity is still unclear. However, codon 399 is located within the central BRCA1 COOH-terminus domain of XRCC1, which contains a binding site for PARP. Studies have shown that the interaction of XRCC1 with PARP may serve to recruit XRCC1 protein complexes to sites of single-strand breakage because XRCC1 preferentially binds the activated form of PARP that arises once the latter polypeptide has bound a single-strand break (7). In Chinese hamster ovary cell lines with nonconserved amino acid substitutions within the BRCA1 COOH-terminus domain, reduced repair of single-strand breaks and hypersensitivity to ionizing radiation were observed (36). So, one speculation on the functional effect of the Arg399Gln change may be that the amino acid substitution from Arg (positively charged amino acid) to Gln (neutral amino acid) changes the conformation of the BRCA1 COOH-terminus domain of the XRCC1 protein, which would influence the interaction of XRCC1 and PARP and hamper the anchoring of XRCC1 protein and other BER proteins at sites of single-strand break formation. However, this speculation needs to be validated in molecular biological studies.
Because of the low frequency of the Trp194 allele of XRCC1 gene among Caucasians, few studies have investigated the effect of the Arg194Trp polymorphism on DNA damage/repair end points and cancer risk with sufficient statistical power, and the associations were conflicting in Caucasian population studies (reviewed in ref. 12). However, in an in vitro cytogenetic challenge assay using lymphocytes from Caucasians, the Trp194 allele of XRCC1 gene increased X-ray– or UV-induced chromosome aberrations (37). A large ethnic difference exists for the frequency of the Trp194 allele of XRCC1 gene between East Asian populations (0.21-0.36; refs. 38, 39) and Caucasians (<0.10), and the epidemiologic studies in East Asian populations were more likely to find significant associations between the Arg194Trp polymorphism and cancer risk. The Trp194 allele of XRCC1 gene was reported to be associated with a significantly increased risk for esophageal squamous cell carcinoma in a Chinese case-control study (38) and to be associated with a high clinical response rate to platin-based chemotherapy in Chinese patients with advanced non–small cell lung cancer (40). Tae et al. (39) detected a highly significant association of the Trp194 allele with the increased risk of squamous cell carcinoma of the head and neck in Koreans. Our results showed that the haplotype TGGG of XRCC1 gene was associated with high PAH-induced chromosome damage, which further supports the significant association between the Trp194 allele of XRCC1 gene and reduced DNA repair capacity. However, because codon 194 of XRCC1 gene is not located in the functional domain for the XRCC1 protein interacting with DNA polymerase β, PARP, and DNA ligase III (7), it is difficult to attribute this association merely to the amino acid substitution of Arg194Trp.
Thus, the wild-type haplotype pair CGGG/CGGG of the XRCC1 gene appeared to be protective against PAH-induced chromosome damage among the coke-oven workers, whereas no such association was found among non–coke-oven workers. This pattern likely reflects a gene-environment interaction between XRCC1 genotypes and PAH exposure: when PAH exposure was low, chromosome damage was similar and low for individuals with wild-type or variant XRCC1 genotypes; under high PAH exposure, chromosome damage increased much more for individuals with variant XRCC1 genotypes than for those with wild-type genotypes. In addition, the differences in micronuclei frequency among coke-oven workers with different XRCC1 haplotype pairs were unlikely to be due to differences in occupational PAH exposure, because urinary 1-hydroxypyrene levels and occupational exposure histories were similar across the haplotype pair groups.
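One common way to formalize the interaction described above is a model with a product term. The notation below is an illustrative assumption and is not the model fitted in this study: Y_i denotes the micronucleus frequency of individual i, G_i = 1 for a variant haplotype pair (0 for CGGG/CGGG), and E_i = 1 for a coke-oven worker (0 for a nonexposed control).

```latex
\mathrm{E}[Y_i] = \beta_0 + \beta_1 G_i + \beta_2 E_i + \beta_3 G_i E_i
```

Under this sketch, the genotype effect is \beta_1 among controls and \beta_1 + \beta_3 among coke-oven workers. The pattern described in the text corresponds to \beta_1 \approx 0 together with \beta_3 > 0.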
Our data also suggested that, as far as the XRCC1 gene is concerned, haplotypes composed of variants at multiple SNPs may be a more appropriate tool than single SNPs for assessing host-environment disease associations. A possible explanation is that the potential effects of the other three htSNPs of the XRCC1 gene are taken into account in haplotype-based analysis. For example, coke-oven workers with the CC or CT genotype of the Arg194Trp polymorphism may also carry the variant allele at one of the other three polymorphic sites, whose effect could bias or mask the genotype-phenotype correlation of the Arg194Trp polymorphism to a certain extent. In contrast, when we specifically analyzed the genotype-phenotype relationship among coke-oven workers with the haplotype pairs CGGG/CGGG, CGGG/TGGG, and TGGG/TGGG, this bias in the analysis of the Arg194Trp polymorphism was excluded. The same was true for the association analyses based on the Arg399Gln polymorphism and the CGAG haplotype. Therefore, our study provided evidence to support the hypothesis that analyses based on haplotypes may be statistically more informative than those based on a single htSNP in genotype/phenotype or genotype/cancer risk association studies (10, 11).
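The argument above can be illustrated with a minimal sketch. The haplotype pairs listed here are hypothetical examples, not study data, and the function names are invented for illustration. The sketch assumes that the four letters of each haplotype are ordered as Arg194Trp, Arg280His, Arg399Gln, Gln632Gln and that CGGG is the wild-type haplotype.

```python
# Illustrative sketch only: the pairs below are hypothetical, not study data.
# Assumed SNP order within a haplotype string:
# Arg194Trp, Arg280His, Arg399Gln, Gln632Gln (wild type = CGGG).
WILD = "CGGG"


def codon194_genotype(pair):
    """Return the Arg194Trp genotype (CC, CT or TT) implied by a haplotype pair."""
    return "".join(sorted(h[0] for h in pair))


def carries_other_variant(pair):
    """True if either haplotype has a variant allele at a site other than codon 194."""
    return any(h[1:] != WILD[1:] for h in pair)


pairs = [
    ("CGGG", "CGGG"),
    ("CGGG", "TGGG"),
    ("TGGG", "CGAG"),
    ("CGGG", "CGAG"),
    ("TGGG", "TGGG"),
    ("CAGG", "TGGG"),
]

# Single-SNP view: group individuals by Arg194Trp genotype only.
by_genotype = {}
for pair in pairs:
    by_genotype.setdefault(codon194_genotype(pair), []).append(pair)

# Within each genotype group, find pairs that also carry another variant.
mixed = {
    genotype: [p for p in members if carries_other_variant(p)]
    for genotype, members in by_genotype.items()
}

# Haplotype-pair view: keep only pairs that differ at codon 194 alone.
clean = [p for p in pairs if not carries_other_variant(p)]

for genotype in sorted(by_genotype):
    print(genotype, "total:", len(by_genotype[genotype]),
          "with another variant:", len(mixed[genotype]))
print("pairs retained for the codon 194 comparison:", clean)
```

In this toy list, the CC and CT genotype groups both contain individuals who carry a variant at codon 280 or 399, whereas the haplotype-pair comparison retains only CGGG/CGGG, CGGG/TGGG, and TGGG/TGGG.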
In summary, our study is the first to investigate the associations between XRCC1 polymorphisms and environmentally induced genetic damage on the basis of haplotype pairs in Chinese coke-oven workers. We found evidence that the CGGG haplotype of the XRCC1 gene may be protective against PAH-induced chromosome damage among coke-oven workers, whereas the TGGG and CGAG haplotypes may increase the risk for chromosome damage, possibly because of reduced DNA repair function. However, the relatively small sample size and the undetermined overall contribution of BER-repairable DNA damage to micronucleus formation make us cautious in extrapolating our results. Whether the Arg194Trp or Arg399Gln polymorphisms are the causative SNPs needs to be validated in future studies, because the TGGG or CGAG haplotypes may be in linkage disequilibrium with the truly causal SNPs in the XRCC1 gene or in adjacent genes within the same haplotype block.
Grant support: National Key Basic Research and Development Program (2002CB512903) and National Natural Science Foundation of China (30400348).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Acknowledgments
We thank Dr. Qingyi Wei (Department of Epidemiology, University of Texas M. D. Anderson Cancer Center, Houston, TX) for his critical review and scientific editing of the manuscript and Jian Qin (National Institute of Occupational Health, Chinese Center for Disease Control and Prevention, Beijing, China) for her help in editing the manuscript.