Many host factors or biomarkers are involved in the process of early DNA damage induced by occupational exposure to polycyclic aromatic hydrocarbons (PAH), as seen in coke-oven workers. This paper aimed to identify the complicated causal interrelationships among various biomarkers using path analysis. In this analysis, we included 235 subjects (166 coke-oven workers and 69 nonexposed controls) whose data on the comet assay (e.g., Olive tail moment) and cytogenetic analysis of peripheral blood lymphocytes as well as urinary 1-hydroxypyrene (1-OHP) were available. The path analysis showed that coke-oven exposure and tobacco smoke were both significant predictors of the concentrations of urinary 1-OHP (P < 0.05), with a coefficient of determination of 0.75. The factors with a significant influence on the Olive tail moment were, in order: urinary 1-OHP > XRCC1-exon 9 variant genotype > ERCC2-exon 10 variant genotype > XRCC1-exon 6 variant genotype, with a coefficient of determination of 0.22. The variables influencing cytokinesis-block micronucleus frequencies were, in order of relative importance: coke-oven exposure > urinary 1-OHP > age > mEH3 variant genotype > ERCC2-exon 10 variant genotype > XRCC1-exon 6 variant genotype, with a coefficient of determination of 0.27. These results indicated that exogenous agents, especially coke-oven exposure, played a more important role than the genotypes in the induction of early genetic damage. In conclusion, path analysis seems to be an alternative statistical approach for ascertaining complicated associations among related biomarkers in the assessment of occupational exposure. (Cancer Epidemiol Biomarkers Prev 2007;16(6):1193–9)

Polycyclic aromatic hydrocarbons (PAH) are established carcinogens that have been extensively investigated. Recently, there has been increasing interest in using biomarkers of PAH exposure and of early biological effects, such as DNA damage caused by carcinogens like PAHs, in risk assessment (1-5), thanks to the rapid development of molecular biological monitoring techniques. It is well known that many factors can contribute to carcinogenesis in humans. In molecular epidemiology studies, biological effects are more frequently measured by biomarkers, which are divided into three categories: measures of internal dose, early biological response, and host susceptibility (6, 7). Many studies have found associations between environmental PAH exposure, whose internal dose is usually quantified by 1-hydroxypyrene (1-OHP) in urine, and increased levels of carcinogen-DNA adducts, sister chromatid exchange, and chromosomal aberrations (8-12). In addition, environmental exposure interacts intricately with environmentally responsive genes and with known behavioral factors such as tobacco smoking and alcohol consumption (13-15).

Therefore, early genotoxic damage to human cells results from an interaction between extrinsic and endogenous factors. At the same time, there exist complicated interactions among various additional unknown factors. It has been recommended that combinations of biomarkers (e.g., those for internal dose, adducts, metabolic phenotype, and DNA repair capacity) be used to assess individual risk of disease (16). To address the complicated interrelationships among various biomarkers in response to PAH exposure, such as that seen in coke-oven workers, we used path analysis to reveal, in a clearer and more visual way, any causal associations between environmental exposure to PAHs and DNA damage or chromosomal aberrations in peripheral blood lymphocytes from coke-oven workers. To accomplish this, we used published data on the comet assay and the cytokinesis-block micronucleus (CBMN) assay to evaluate levels of DNA and chromosomal damage in peripheral blood lymphocytes, respectively. Both urinary 1-OHP and genetic damage were analyzed in relation to genotypes of metabolic genes or DNA repair genes. We included three kinds of biomarkers: urinary 1-OHP as the internal dose of exposure to PAHs; the Olive tail moment and micronucleus frequencies as the early biological effects; and genotype data on the CYP1A1, mEH3, mEH4, GSTT, GSTP1, XRCC1, and XPD genes as the susceptibility markers.

Path analysis was first developed by Sewall Wright in the 1920s for use in quantitative genetic studies and was later gradually adopted by the social sciences, ecology, psychology, and economics (17, 18), as well as the health sciences (19, 20), in settings where the interaction mechanisms were not well understood despite the presence of known risk factors, such as epidemiologic studies of chronic illness. Path analysis helps in understanding the comparative strengths of direct and indirect relationships among a set of variables of interest. In this analysis, a network of causes and effects is seen as a series of steps in a path, with a coefficient assigned to each step to quantify their interrelationships (21). With advances in path analysis methodology, the method has been used more extensively in different fields (22-26). Santos et al. first applied path analysis to elucidate a biochemical pathway of carotenoids and obtained some interesting results (25). Their work again showed the efficacy of path analysis for identifying key compounds in complex pathways.

In this report, we tested our hypothesis that path analysis may also be an effective tool for identifying the main risk factors for early biological effects induced by PAH exposure, and that the information produced by such an analysis could provide insights into the PAH metabolic pathway in humans. The hypothesized model was based on established findings from previous studies and was intended to reveal causal relationships among the selected variables rather than simple correlations.

Study Population and Sample Collection

The details of subject recruitment and blood sample procurement for this cross-sectional study were previously described (27). Briefly, 166 coke-oven workers exposed to PAHs at their workplaces and 69 medical staff members without work-related PAH exposure were enrolled in this study as the exposed and nonexposed groups, respectively. Exclusion criteria for participation in the study included recent treatment with mutagenic agents (such as X-ray), chronic conditions (such as autoimmune diseases), and recent acute infections that required medications (such as antibiotics). After informed consent was obtained, all participants were interviewed by an occupational physician using a questionnaire covering demographic information, smoking history, alcohol consumption, history of occupational exposure, and personal medical history. Individuals who had smoked >100 cigarettes in their lifetime were considered smokers. Among these smokers, individuals who still smoked when interviewed were classified as current smokers and were asked for further detailed smoking-related information, including age at onset of smoking and duration of smoking; the others were classified as former smokers, but their complete smoking history was not available. Individuals who drank more than twice a week in the last 6 months were classified as alcohol users. Biological samples, including shift-end urine and venous blood, were obtained from each subject.

PAH Exposure Assessment

PAH exposure assessment was also previously described (27). Briefly, the air concentrations of benzene-soluble matter and particulate-phase benzo(a)pyrene in the working environment of coke-oven workers and non–coke-oven workers were sampled ∼1.5 months before urine and blood sample collection and were analyzed according to the Occupational Safety and Health Administration method no. 58 (28). The excretion of urinary 1-OHP as the internal dose of personal recent PAH exposure was measured according to the method of Jongeneelen et al. (29), with some modifications (30). Measurements below the limit of detection (LOD) were replaced with \(\mathrm{LOD}/\sqrt{2}\) before statistical analysis (31). The urinary 1-OHP concentrations were corrected for urinary creatinine and presented as μmol mol⁻¹ creatinine.
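The substitution and creatinine correction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the units of the inputs, and the default divisor of √2 (the conventional choice for this kind of substitution) are assumptions.

```python
import math

def substitute_below_lod(values, lod, divisor=math.sqrt(2)):
    """Return a copy of `values` with entries below `lod` replaced by lod / divisor.

    The divisor defaults to sqrt(2), which is an assumption of this sketch.
    """
    fill = lod / divisor
    return [fill if v < lod else v for v in values]

def creatinine_corrected(analyte_umol_per_l, creatinine_mol_per_l):
    """Express a urinary concentration per mole of creatinine (umol/mol).

    Both input units are hypothetical; only the ratio matters here.
    """
    return analyte_umol_per_l / creatinine_mol_per_l

# Hypothetical readings with an LOD of 0.1: the first value is replaced.
cleaned = substitute_below_lod([0.05, 0.5], lod=0.1)
```

Keeping the divisor as a parameter makes it easy to rerun the analysis under a different substitution rule and see how sensitive the results are to it.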

Comet Assay in Peripheral Blood Lymphocyte

The detailed procedure of the comet assay was previously described (32). In short, the blood samples for the alkaline comet assay were stored at 4°C after venipuncture for no more than 1 h before the separation of lymphocytes. Lymphocytes were separated from ∼1 mL heparinized whole blood and suspended in 500 μL ice-cold PBS. The comet assay was done immediately after lymphocyte separation according to Singh et al. (33), with minor modifications. Slides were examined with an Olympus IX 50 microscope equipped with a 100-W mercury lamp and WG filter block. Measurements were made using an image analysis system (version 1.0, IMI comet analysis software, China). More than 100 cells per subject were scored (50 cells for each of the two replicate slides). The Olive tail moment was calculated. For each subject, the arithmetic mean of the Olive tail moment of 100 cells was used as the DNA damage level in the subsequent statistical analysis.

CBMN Assay Using Peripheral Blood Lymphocytes

The CBMN assay was done according to the standard method as previously described (34, 35). Standard scoring criteria for selecting binucleated cells and identifying a micronucleus were adopted (34, 36). All slides were coded and scored blindly by an experienced scorer. Total MN (the frequency of micronuclei per 1,000 binucleated lymphocytes) and MNed cells (the frequency of micronucleated cells per 1,000 binucleated lymphocytes) were scored as chromosomal damage indexes.

Genotyping

DNA was isolated from whole blood using a standard method (37). The Ile/Val polymorphism in exon 7 of CYP1A1 was analyzed according to the method of Hayashi et al. (38). Analysis of the GSTP1 polymorphism resulting in an Ile/Val substitution at residue 105 in exon 5 was done as described by Saarikoski et al. (39). Genotypes of GSTT1 were determined by a modified multiplex PCR method with β-globin as positive control (40, 41). The Tyr/His polymorphism at residue 113 in exon 3 and the His/Arg polymorphism at residue 139 in exon 4 of mEH were analyzed as described by Zhou et al. (42). Three single nucleotide polymorphisms in the XRCC1 gene, C26304T (Arg194Trp), G27466A (Arg280His), and G28152A (Arg399Gln), were detected using PCR–restriction fragment length polymorphism (43-45). The G23591A (Asp312Asn) and A35931C (Lys751Gln) polymorphisms of the ERCC2 gene were determined according to published protocols (43). All genotypes were evaluated and agreed upon by at least two persons independently. Ten percent of DNA samples were genotyped a second time, and the concordance was 100%.

Statistical Analysis

Path analysis, an extension of the regression model, was used in this study. A path coefficient, represented as \(p_{ij}\), is a standardized regression coefficient indicating the direct effect of variable i on variable j. Accordingly, in a multivariate regression system, it is a partial regression coefficient controlling for other prior variables. Path analysis is recommended whenever a causal, rather than spurious or coincidental, correlation among a set of variables is suspected, especially when the sequence of variables can be sorted out or when the spurious effect of an intervening factor must be distinguished from the observed relationships (46).
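To make the idea of a standardized regression coefficient concrete, the sketch below rescales an ordinary least-squares slope by the ratio of the standard deviations of predictor and outcome. The data and function name are hypothetical, and only the single-predictor case is shown, where the standardized slope equals the Pearson correlation; the study's models have several predictors and were fitted in SAS.

```python
import math

def standardized_slope(x, y):
    """Standardized simple-regression slope: b * sd(x) / sd(y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx                       # unstandardized OLS slope
    return slope * math.sqrt(sxx / syy)     # rescale by sd(x) / sd(y)

# Hypothetical data: a 0/1 exposure indicator and a log-scale outcome.
exposure = [0, 0, 1, 1]
outcome = [1.0, 2.0, 3.0, 4.0]
beta = standardized_slope(exposure, outcome)
```

Because the coefficient is unit-free, coefficients attached to different predictors can be compared in magnitude, which is what the rankings in the results rely on.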

The path analysis has several advantages. First of all, it allows for effect decomposition because the total causal effect is the sum of the values of all the paths from i to j. The indirect effect, measuring the effect of the intervening variables, is the total causal effect minus the direct effect. A second advantage of path analysis is shown by path diagram, which explains the hypothetical causality graphically. It runs through from the left to the right, with the exogenous variables or independent variables on the extreme left and the dependent or endogenous variables (the model tries to explain) on the right side. The relationship between two variables is represented by a straight line or curve with arrows in both extremities with the size of the correlation proportional to the width of the line.
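For the simplest case of a single intervening variable m between i and j, the decomposition described above can be written as follows. The symbols \(T_{ij}\) and \(I_{ij}\) for the total and indirect effects are notation assumed here for illustration:

\[T_{ij} = p_{ij} + p_{im}\,p_{mj}, \qquad I_{ij} = T_{ij} - p_{ij} = p_{im}\,p_{mj}\]

With more than one intervening route, each route contributes the product of the path coefficients along it, and the indirect effect is the sum of those products.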

Before performing the path analysis, a number of assumptions need to be checked (47). Among these, sample size and the criteria for including variables in the models are the most important. Kline (48) recommended 10 times as many cases as parameters to assess significance. If the ratio of the number of cases to the number of parameters is below 5, the estimated model is likely to be unstable (49). In this study, 24 parameters were to be estimated, so an adequate sample size would conceptually be no fewer than 240. The computation actually included 203 cases, somewhat fewer than this target but sufficient for estimating model effects (as supported by the good fit of the model and reasonable path coefficient estimates). Furthermore, the ratio of the number of cases to the number of parameters in this study was >8.
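The sample-size reasoning above is simple arithmetic and can be checked directly. The sketch below uses the figures given in the text (24 parameters, 203 cases); the function and constant names are illustrative.

```python
def cases_per_parameter(n_cases, n_parameters):
    """Ratio of cases to estimated parameters."""
    return n_cases / n_parameters

N_PARAMETERS = 24   # parameters to be estimated, from the text
N_CASES = 203       # cases actually used in the computation, from the text

recommended_cases = 10 * N_PARAMETERS              # the 10:1 guideline
ratio = cases_per_parameter(N_CASES, N_PARAMETERS)

meets_recommendation = N_CASES >= recommended_cases   # below the 10:1 target
above_instability_floor = ratio >= 5                  # but above the 5:1 floor
```

The ratio comes out at roughly 8.5, which sits between the instability floor of 5 and the recommended 10.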

In terms of variable selection criteria, a literature review was first conducted to make sure that each factor intended for inclusion in the models was biologically meaningful according to previous studies (46). Then, we used the χ² statistic and the relative χ² change to decide on the inclusion or exclusion of each variable (50). When a variable was entered, the χ² value of the model had to increase by 3.84 or more (46); otherwise, the variable was deleted for the sake of model parsimony. Additionally, the correct direction and size of the path coefficient also guided the selection of variables.
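The threshold of 3.84 corresponds to the 5% critical value of a χ² distribution with 1 degree of freedom, which is the square of the two-sided 5% critical value of the standard normal distribution. A short sketch, using only the Python standard library and an illustrative function name, reproduces it:

```python
from statistics import NormalDist

def chi2_critical_1df(alpha=0.05):
    """Critical value of chi-square with 1 df, via the squared normal quantile."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z

def passes_threshold(chi2_change, alpha=0.05):
    """True when the change in model chi-square reaches the 1-df critical value."""
    return chi2_change >= chi2_critical_1df(alpha)

threshold = chi2_critical_1df()   # close to 3.84
```

This only covers the single-degree-of-freedom case, which applies when one path is added at a time.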

In this study, the path analysis was done with the CALIS procedure in SAS. The maximum likelihood method was used to estimate the parameters of the path model, and approximate t tests were done to test individual path coefficients, following the t distribution with n − k − 1 degrees of freedom (k is the number of parameters of the path model; ref. 51). Several fit indices were available for the overall test of the model, and the frequently used ones were the χ² statistic, the goodness-of-fit index (GFI), the adjusted goodness-of-fit index (AGFI), the root mean square error of approximation (RMSEA), the normed fit index (NFI), and the ratio χ²/ν (ν is the number of degrees of freedom of the model). A nonsignificant χ² value indicates a good fit, whereas recommended values of the ratio χ²/ν vary between 1.0 and 2.0. The accepted standards for the other indicators are above 0.90 for GFI, AGFI, and NFI and below 0.05 for RMSEA (23, 46).

Urinary 1-OHP, Olive tail moment, and CBMN frequencies all followed an approximately log-normal distribution, as indicated in previous studies (27), and thus their ln-transformed values were used in the analysis. First, univariate analysis was done for these three biomarker measurements to screen for relevant factors. Second, the path analysis model was constructed with all of the selected factors. However, because of missing data, only 203 of the 235 subjects had complete data for the path analysis models, including urinary 1-OHP, frequencies of CBMN, Olive tail moment, genotypes of CYP1A1, GSTP1, mEH3, XRCC1-exon 6, XRCC1-exon 9, and ERCC2-exon 10, and smoking status. Smokers were divided into two groups according to their cigarette consumption per day: 1-19 cigarettes/day or ≥20 cigarettes/day. For this categorical variable, cigarette consumption, dummy variables were created and used in the path analysis. However, only the dummy variable representing the 1-19 cigarettes/day group was included in the final path equations, for the sake of model goodness of fit. In terms of age, all subjects were categorized into one of the following groups: <35 years, 35-44 years, or >44 years. For each gene, a dichotomized genotype was used based on a dominant genetic model; that is, genotypes with at least one variant allele, whether homozygous or heterozygous, were grouped into one category. In the path model, the logarithms of urinary 1-OHP, Olive tail moment, and CBMN frequencies were used as endogenous variables. All statistical tests were two sided, and the size of the test was specified at either 0.05 or 0.10. All statistical tests were done with Statistical Analysis System software (version 8.0; SAS Institute).
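The variable coding described above can be sketched as follows. This is an illustrative reading of the text, not the authors' SAS code: the function names are hypothetical, ages are assumed to be in whole years, and the genotype codes 0, 1, and 2 are assumed to mean wild-type homozygote, heterozygote, and variant homozygote.

```python
import math

def dominant_genotype(code):
    """Dichotomize a 0/1/2 genotype code under a dominant model."""
    if code not in (0, 1, 2):
        raise ValueError("genotype code must be 0, 1, or 2")
    return 0 if code == 0 else 1   # any variant allele -> 1

def cigarette_dummies(cigs_per_day):
    """Return (light, heavy) indicators; 0 cigarettes/day is the reference."""
    light = 1 if 1 <= cigs_per_day <= 19 else 0
    heavy = 1 if cigs_per_day >= 20 else 0
    return light, heavy

def age_group(age_years):
    """0 for <35 y, 1 for 35-44 y, 2 for >44 y (whole years assumed)."""
    if age_years < 35:
        return 0
    return 1 if age_years <= 44 else 2

def ln_transform(value):
    """Natural-log transform applied to the three biomarker measurements."""
    return math.log(value)
```

For example, a 40-year-old heterozygous subject smoking 10 cigarettes/day would be coded as age group 1, genotype 1, and dummies (1, 0).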

The basic characteristics of coke-oven workers and controls were very close to those described previously (27), and all variables used in this analysis are listed in Table 1. In brief, the distributions of age, sex, and alcohol consumption were similar between the two groups (P = 0.29, P = 0.16, and P = 0.19, respectively), except the proportion of current smokers and the number of cigarettes smoked per day, both of which were higher in coke-oven workers than in controls (64.46% versus 36.23%, P < 0.01 and 8.22 ± 7.18 cigarettes/day versus 4.88 ± 7.58 cigarettes/day; P < 0.01).

Table 1. List of variables included in this paper

| Variable name | Connotation | Value assignment |
| --- | --- | --- |
| Log_Hydcre | Ln-transformed urinary 1-OHP | Numerical type |
| Log_olive | Ln-transformed Olive tail moment | Numerical type |
| Log_mni | Ln-transformed CBMN frequencies | Numerical type |
| Cokework | Coke-oven worker or not | 0, non–coke-oven worker; 1, coke-oven worker |
| Tobacco smoke | Smoke status | 0, no; 1, yes |
| Cigarettes | Rating of cigarettes per day | 0 cigarettes/day; 1–19 cigarettes/day; ≥20 cigarettes/day |
| Age | Rating of age | 0, <35 y; 1, 35–44 y; 2, >44 y |
| mEH3 | His113Tyr polymorphism in exon 3 | 0, w/w*; 1, m/w; 2, m/m |
| CYP1A1 | Ile462Val polymorphism in exon 7 | 0, w/w; 1, m/w; 2, m/m |
| GSTP1 | Ile104Val polymorphism in exon 5 | 0, w/w; 1, m/w; 2, m/m |
| XRCC1-exon 6 | Arg194Trp polymorphism | 0, w/w; 1, m/w; 2, m/m |
| XRCC1-exon 9 | Arg280His polymorphism | 0, w/w; 1, m/w; 2, m/m |
| ERCC2-exon 10 | Asp312Asn polymorphism | 0, w/w; 1, m/w; 2, m/m |

*w, wild-type allele; m, mutation allele.

Indexes used for the overall test of the model were as follows: GFI = 0.9910; AGFI = 0.9320; χ² test, χ² = 12.0433; df = 12; P = 0.4422; RMSEA = 0.0042; 90% confidence interval = 0, 0.0718; NFI = 0.9835; and the ratio χ²/ν = 1.0. Taken together, these indexes showed a good fit of the model.
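Two of the reported indexes can be recomputed from the χ² statistic, its degrees of freedom, and the number of cases. The sketch below assumes the common point-estimate formula for RMSEA, sqrt(max(χ² − df, 0) / (df × (N − 1))), and N = 203; whether the software used exactly this form is an assumption, although the result agrees with the reported value to four decimals. The cutoffs are those quoted in the Statistical Analysis section.

```python
import math

def rmsea(chi2, df, n):
    """Point estimate of RMSEA from chi-square, degrees of freedom, and sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def acceptable_fit(chi2, df, n, indices):
    """Apply the cutoffs quoted in the text to the fit indices."""
    ratio_ok = 1.0 <= chi2 / df <= 2.0
    rmsea_ok = rmsea(chi2, df, n) < 0.05
    indices_ok = all(value > 0.90 for value in indices.values())
    return ratio_ok and rmsea_ok and indices_ok

CHI2, DF, N = 12.0433, 12, 203                        # values reported in the text
REPORTED = {"GFI": 0.9910, "AGFI": 0.9320, "NFI": 0.9835}

model_rmsea = rmsea(CHI2, DF, N)
model_ratio = CHI2 / DF
fit_ok = acceptable_fit(CHI2, DF, N, REPORTED)
```

Because χ² barely exceeds its degrees of freedom, the RMSEA is close to zero and the ratio χ²/ν is close to 1.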

Table 2 shows that the variables with statistically significant effects on urinary 1-OHP were coke-oven exposure and tobacco smoke. The variables included in this table were further ranked according to their effects on urinary 1-OHP as follows: coke-oven exposure > tobacco smoke > age > mEH3 variant genotypes > CYP1A1 variant genotypes. The R² of this model was 0.75, suggesting that the variables included in this model explained 75% of the total variance.

Table 2. Standardized coefficients and parameter test for the path analysis model of urinary 1-OHP (R² = 0.7489)

| Variable | Cokework | Tobacco smoke | Age | CYP1A1 | mEH3 |
| --- | --- | --- | --- | --- | --- |
| Standardized coefficient | 0.8253 | 0.1444 | −0.0582 | 0.0368 | 0.0453 |
| SE | 0.0364 | 0.0363 | 0.0359 | 0.0354 | 0.0354 |
| t value | 22.6971* | 3.9743* | −1.6216 | 1.0398 | 1.2768 |

*P < 0.05.
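The t values in Table 2 are, up to rounding of the printed figures, the ratio of each standardized coefficient to its SE. The sketch below recomputes them from the tabulated values and flags those exceeding 1.96, a large-sample approximation to the two-sided 5% critical value that is assumed here in place of the exact t distribution.

```python
# (coefficient, SE, reported t) copied from Table 2
TABLE2 = {
    "Cokework": (0.8253, 0.0364, 22.6971),
    "Tobacco smoke": (0.1444, 0.0363, 3.9743),
    "Age": (-0.0582, 0.0359, -1.6216),
    "CYP1A1": (0.0368, 0.0354, 1.0398),
    "mEH3": (0.0453, 0.0354, 1.2768),
}

def t_ratio(coefficient, se):
    """Approximate t statistic for a path coefficient."""
    return coefficient / se

def significant(table, critical=1.96):
    """Names of variables whose |t| exceeds the critical value."""
    return [name for name, (coef, se, _) in table.items()
            if abs(t_ratio(coef, se)) > critical]

flagged = significant(TABLE2)
```

Small differences from the reported t values are expected because the coefficients and SEs are printed to four decimals.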

Table 3 shows that the influence of urinary 1-OHP concentration, XRCC1-exon 9 variant genotypes, ERCC2-exon 10 variant genotypes, and XRCC1-exon 6 variant genotypes on Olive tail moment was statistically significant. Their importance was ranked as follows: urinary 1-OHP concentration > XRCC1-exon 9 variant genotypes > ERCC2-exon 10 variant genotypes > XRCC1-exon 6 variant genotypes. The R² of the equation was 0.22; that is, the variables in the model accounted for 22% of the total variance.

Table 3. Standardized coefficients and parameter test for the path analysis model of Olive tail moment (R² = 0.2158)

| Variable | Log_Hydcre | Cokework | Age | Cigarettes | mEH3 | XRCC1-exon 9 | ERCC2-exon 10 | XRCC1-exon 6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Standardized coefficient | 0.3266 | 0.0303 | 0.0335 | 0.0463 | −0.0315 | 0.2012 | 0.1195 | 0.1061 |
| SE | 0.1227 | 0.1206 | 0.0635 | 0.0670 | 0.0632 | 0.0640 | 0.0626 | 0.0649 |
| t value | 2.6620* | 0.2514 | 0.5276 | 0.6906 | −0.4985 | 3.1449* | 1.9075† | 1.6347† |

*P < 0.05.

†P < 0.10.

Table 4 shows that coke-oven exposure, age, mEH3 variant genotypes, XRCC1-exon 6 variant genotypes, and ERCC2-exon 10 variant genotypes significantly influenced CBMN frequency. The variables were ranked by importance as follows: coke-oven exposure > urinary 1-OHP concentration > age > mEH3 variant genotypes > ERCC2-exon 10 variant genotypes > XRCC1-exon 6 variant genotypes. The R² of the equation was 0.27, indicating that 27% of the total variance was explained by the variables in this model.

Table 4. Standardized coefficients and parameter test for the path analysis model of CBMN frequency (R² = 0.2719)

| Variable | Log_Hydcre | Cokework | Age | Cigarettes | mEH3 | GSTP1 | XRCC1-exon 6 | ERCC2-exon 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Standardized coefficient | 0.1909 | 0.2422 | 0.1453 | −0.0260 | −0.1327 | 0.0345 | 0.1135 | 0.1213 |
| SE | 0.1182 | 0.1162 | 0.0612 | 0.0645 | 0.0607 | 0.0609 | 0.0611 | 0.0610 |
| t value | 1.6157 | 2.0847* | 2.3752* | −0.4023 | −2.1851* | 0.5671 | 1.8589† | 1.9894† |

*P < 0.05.

†P < 0.10.

The effect decompositions of the variables associated with either Olive tail moment or frequencies of CBMN are presented in Table 5. As shown in Table 5, coke-oven exposure, age, and mEH3 variant genotypes not only had direct effects on Olive tail moment but also had indirect effects on Olive tail moment through the pathway of urinary 1-OHP. The total causal effect was decomposed into a direct effect and an indirect effect. For example, the direct effect of mEH3 variant genotypes on the Olive tail moment was −0.0315, whereas the total indirect effect was only +0.0148, so the total causal effect of mEH3 variant genotypes was −0.0315 + 0.0148 = −0.0167. Likewise, as listed in Table 5, the variables that exerted both direct and indirect effects on CBMN frequencies were coke-oven exposure, age, and mEH3 variant genotypes. The GSTP1 variant genotype had only a direct effect on CBMN frequencies.
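The indirect effects in this decomposition can be reproduced, to within rounding, as the product of the path coefficient from each exogenous variable to urinary 1-OHP (Table 2) and the path coefficient from urinary 1-OHP to the outcome. The sketch below does this with the tabulated values; the dictionary and function names are illustrative, and the 1-OHP coefficients used are 0.3266 for Olive tail moment and 0.1910 for CBMN.

```python
# Paths into ln urinary 1-OHP (Table 2)
TO_OHP = {"Cokework": 0.8253, "Smoke": 0.1444, "Age": -0.0582,
          "CYP1A1": 0.0368, "mEH3": 0.0453}

# Paths from ln urinary 1-OHP to each outcome
OHP_TO = {"olive": 0.3266, "cbmn": 0.1910}

def indirect_effect(variable, outcome):
    """Indirect effect through urinary 1-OHP: product of the two path coefficients."""
    return TO_OHP[variable] * OHP_TO[outcome]

def total_effect(direct, indirect):
    """Total causal effect as the sum of direct and indirect effects."""
    return direct + indirect

# Worked example from the text: mEH3 variant genotypes and Olive tail moment
meh3_indirect = indirect_effect("mEH3", "olive")         # about 0.0148
meh3_total = total_effect(-0.0315, round(meh3_indirect, 4))
```

The same product rule gives about 0.27 for the indirect effect of coke-oven exposure on Olive tail moment, far larger than its direct effect of 0.0303.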

Table 5. Effect decompositions for Olive tail moment and CBMN frequencies

| Variable | Olive tail moment: direct | Olive tail moment: indirect | Olive tail moment: total | CBMN: direct | CBMN: indirect | CBMN: total |
| --- | --- | --- | --- | --- | --- | --- |
| Cokework | 0.0303 | 0.2696 | 0.2999 | 0.2423 | 0.1576 | 0.3999 |
| Smoke | — | 0.0472 | 0.0472 | — | 0.0276 | 0.0276 |
| Cigarettes | 0.0463 | — | 0.0463 | −0.0260 | — | −0.0260 |
| Age | 0.0335 | −0.0190 | 0.0145 | 0.1453 | −0.0111 | 0.1342 |
| Log_Hydcre | 0.3266 | — | 0.3266 | 0.1910 | — | 0.1910 |
| CYP1A1 | — | 0.0120 | 0.0120 | — | 0.0070 | 0.0070 |
| mEH3 | −0.0309 | 0.0148 | −0.0167 | −0.1328 | 0.0086 | −0.1241 |
| GSTP1 | — | — | — | 0.0345 | — | 0.0345 |
| XRCC1-exon 6 | 0.1061 | — | 0.1061 | 0.1136 | — | 0.1136 |
| XRCC1-exon 9 | 0.2012 | — | 0.2012 | — | — | — |
| ERCC2-exon 10 | 0.1195 | — | 0.1195 | 0.1213 | — | 0.1213 |

NOTE: The total effect for each variable is decomposed into a direct effect and an indirect effect.

The path equations for the three endogenous variables (Log_Hydcre, Log_olive, and Log_mni) were as follows:

\[\text{Log\_Hydcre} = 0.8253 \times \text{cokework} + 0.1444 \times \text{smoke} + 0.0368 \times \text{CYP1A1} + 0.0453 \times \text{mEH3} - 0.0582 \times \text{age} + 0.5011\]
\[\text{Log\_olive} = 0.3266 \times \text{Log\_Hydcre} + 0.0303 \times \text{cokework} + 0.0463 \times \text{cigarettes} + 0.0335 \times \text{age} - 0.0315 \times \text{mEH3} + 0.2012 \times \text{XRCC1-exon 9} + 0.1195 \times \text{ERCC2-exon 10} + 0.1061 \times \text{XRCC1-exon 6} + 0.8855\]
\[\text{Log\_mni} = 0.1910 \times \text{Log\_Hydcre} + 0.2423 \times \text{cokework} + 0.1453 \times \text{age} - 0.0260 \times \text{cigarettes} - 0.1328 \times \text{mEH3} + 0.1136 \times \text{XRCC1-exon 6} + 0.1213 \times \text{ERCC2-exon 10} + 0.0345 \times \text{GSTP1} + 0.8533\]
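The trailing constants in the three equations (0.5011, 0.8855, and 0.8533) coincide numerically with sqrt(1 − R²) for the corresponding models, which in a standardized path model is the coefficient of the residual (unexplained) term. This reading is an inference from the reported numbers rather than something the text states; the sketch below simply checks the arithmetic using the R² values from Tables 2 to 4.

```python
import math

def residual_path_coefficient(r_squared):
    """Coefficient of the residual term in a standardized path equation."""
    return math.sqrt(1.0 - r_squared)

# R-squared values from Tables 2-4 and the constants printed in the equations
R_SQUARED = {"Log_Hydcre": 0.7489, "Log_olive": 0.2158, "Log_mni": 0.2719}
PRINTED_CONSTANT = {"Log_Hydcre": 0.5011, "Log_olive": 0.8855, "Log_mni": 0.8533}

residuals = {name: residual_path_coefficient(r2) for name, r2 in R_SQUARED.items()}
max_gap = max(abs(residuals[name] - PRINTED_CONSTANT[name]) for name in R_SQUARED)
```

Under this reading, the large constants for Log_olive and Log_mni reflect the substantial share of variance in those outcomes that the model leaves unexplained.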

Figure 1 shows a diagram relating various variables studied in this path analysis. The diagram showed a network of causes and effects, and this network was seen as a series of steps in a path with a coefficient assigned to each step to quantify their interrelationships.

Figure 1.

The path diagram reflected the hypotheses of the causation between the dependents (Olive tail moment and frequencies of CBMN) and exogenous variables (coke-oven exposure, smoke, age, and several primary genotypes) or intermediary variable (urinary 1-OHP). Each arrow was accompanied with a path coefficient, and the widths of the arrows were drawn proportionally to the absolute magnitude of the corresponding path coefficients. Dashed lines, negative causal relation.

The results of the path analysis showed that coke-oven exposure was the most influential factor for the levels of urinary 1-OHP and CBMN frequencies, and both path coefficients were statistically significant. A similar study conducted by Leng et al. showed that urinary 1-OHP levels, Olive tail moment, and the frequencies of CBMN in coke-oven workers were all significantly higher than in the unexposed controls (52). Wang et al. (1) reported an investigation in coking workers in which the positive rates of micronuclei and micronucleated lymphocytes in coking workers were both significantly higher than those in the controls.

Our path analysis further suggests that, among the various factors, two variables (i.e., coke-oven exposure and tobacco smoke) accounted for most of the effect on the levels of urinary 1-OHP. Compared with these variables, the metabolic enzyme genes had much less influence on the levels of urinary 1-OHP, with small path coefficients. This indicates that although the metabolic enzyme system did affect urinary 1-OHP levels, external exposure played a more important role in the formation of urinary 1-OHP. Leng et al. found a significant correlation between external exposure categories and urinary 1-OHP concentrations (Spearman's correlation coefficient = 0.535, P < 0.01); however, when adjusted for external exposure, smoking also significantly influenced urinary 1-OHP levels (P < 0.105; ref. 53).

When identifying the factors that influenced the comet assay results, we found that urinary 1-OHP ranked ahead of the other factors, including age, cigarette consumption, and mEH3 variant genotype. This might imply that the importance of the DNA repair system was almost equivalent to that of external exposure. We found similar results for the frequencies of CBMN. However, this is an inference from the path coefficients, and its biological plausibility still needs further verification.

Of particular interest is the effect of the mEH3 genotype on CBMN frequencies: the mEH3 variant genotype exerted a direct protective effect and an indirect risk effect through the pathway mediated by urinary 1-OHP. This was consistent with the conclusion drawn by Leng et al. that genetic polymorphism of the mEH gene was a susceptibility biomarker in the metabolic process of PAHs (54). In two other published studies by Leng et al. (27, 55), individuals with mEH3 variant genotypes had lower frequencies of CBMN. As we know, the microsomal epoxide hydrolase (mEH) can convert trans-7,8-diol of benzo(a)pyrene to more water-soluble trans-dihydrodiols, which are further activated by phase I enzymes to form the ultimate carcinogenic diol epoxide. When genetic variation takes place at the H113Y site of the mEH gene, the enzyme activity is reduced (56), which, on the one hand, reduces the chance for the epoxide to be transformed to dihydrodiols and, on the other hand, increases the probability for the epoxide to become 1-hydroxypyrene. In addition, because the enzyme is involved in the formation of ultimate carcinogens, the resulting reduction of ultimate carcinogens further lessens the genetic damage. This biological plausibility indicates that the proposed model is reasonably compatible with the data.

The path analysis also showed that, after repeated selection, the variant genotypes of three DNA repair enzyme genes (i.e., XRCC1-exon 6, XRCC1-exon 9, and ERCC2-exon 10) remained in the final path model, and they were all statistically significant. As shown in the study by Leng et al. (32), subjects with XRCC1-exon 9 heterozygous genotypes had a longer Olive tail moment compared with those with wild-type homozygous genotypes. The polymorphisms of XRCC1-exon 6 and ERCC2-exon 10 were also studied, but their relations to Olive tail moment were not evident. However, a study conducted by Cheng et al. (57) showed that the Olive tail moment was significantly higher in subjects with the XRCC1 His280 allele than in those with the Arg280 allele. Another study by Leng et al. (58) identified an association between the XRCC1-exon 6 polymorphism and the frequencies of CBMN. Therefore, the consistency among these studies also suggests that the path model fits the data, although our findings need to be verified in larger studies, and we should be aware of several conflicting studies on the XPD gene (59-61).

An indication of the appropriateness of causal diagrams is given by the coefficient of determination (R²) and path significance values. According to Hatcher (51), it is generally accepted that when R² is >60%, a relatively large percentage of the variance can be explained by a causal model. In our path analysis, the R² value for urinary 1-OHP as the dependent variable was 75%, suggesting that this model explained a considerable portion of the total variance of this dependent variable. However, the R² values for the Olive tail moment and CBMN frequencies were only 22% and 27%, respectively, indicating a poor fit of these models. This is a limitation of this study.

In summary, the path model in this report illuminated the relative importance of environmental factors and genetic polymorphisms in terms of their effects on DNA or chromosome damage, which was not described in previous studies (27, 32, 52-55). Either the coke-oven exposure or its internal dose, urinary 1-OHP, had a main effect on Olive tail moment and CBMN frequencies, suggesting the predominance of environmental factors in inducing genetic damage. Thus, more weight should be put on improving occupational environmental conditions to reduce the exposure. In addition, the most susceptible workers could be identified by screening for specific genotypes of candidate genes. Compared with previous studies (27, 32, 52-55), the path analysis helped clarify the interrelationships between different factors, exhibiting the multiple stages from PAH exposure to genetic damage. However, a more integrated and coherent way of analyzing the data is still desired. Perera et al. (12) detected an association between PAH environmental exposure and aromatic adducts, and the aromatic adducts were then found to correlate with chromosomal aberrations. In such a case, aromatic adducts could be thought of as a molecular link between environmental exposure and a genetic alteration relevant to cancer risk. As the progression of lung cancer must go through several stages, from PAH exposure to the clinical end point, it is critical to identify such links that could integrate the various stages as a whole. Due to the lack of sufficient and related data, we were unable to map a coherent picture of the development of lung cancer in these coke-oven workers at the present time.

However, the path analysis used in this study has several limitations. First, path analysis can neither prove causality nor establish the direction of causality. It is meant to be exploratory rather than confirmatory because it is possible that some potential confounding variables were not taken into account (24, 26, 62), as indicated by the large residuals for the Olive tail moment and CBMN frequencies. Nevertheless, this does not prohibit us from further exploring factors that have more influence on the Olive tail moment and CBMN frequencies. Second, path analysis cannot be applied when feedback loops are included in the hypotheses. Third, there were strong correlations among duration of smoking, age at onset of smoking, and age at interview (not presented in the results). To avoid multicollinearity, we focused on the most important smoking variable, cigarettes smoked per day; the effects of the other smoking variables on genetic damage might, in addition, be masked by recent smoking and occupational exposure in current smokers (63). Fourth, in terms of the measurement of PAH exposure, this cross-sectional investigation neither measured personal PAH exposure nor collected data on dietary PAH exposure.
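The multicollinearity concern among the smoking variables can be sketched with variance inflation factors (VIF), a common diagnostic that was not necessarily the one used in this study. The example assumes synthetic data in which duration of smoking is approximately age at interview minus age at onset; all variable names and numerical values are hypothetical.

```python
import numpy as np

def variance_inflation_factors(columns):
    """Return {name: VIF}, where VIF_j = 1 / (1 - R_j^2) and R_j^2 comes from
    regressing column j on all other columns plus an intercept."""
    names = list(columns)
    data = np.column_stack([np.asarray(columns[n], dtype=float) for n in names])
    vifs = {}
    for j, name in enumerate(names):
        y = data[:, j]
        others = np.delete(data, j, axis=1)
        X = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        r2 = 1.0 - resid.var() / y.var()
        vifs[name] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic smoking variables for illustration only (not study data).
rng = np.random.default_rng(1)
n = 1000
age_at_interview = rng.normal(40, 8, n)
age_at_onset = rng.normal(20, 3, n)
duration = age_at_interview - age_at_onset + rng.normal(0, 1, n)
cigarettes_per_day = rng.normal(15, 5, n)  # generated independently of the rest

vifs = variance_inflation_factors({
    "age_at_interview": age_at_interview,
    "age_at_onset": age_at_onset,
    "duration_of_smoking": duration,
    "cigarettes_per_day": cigarettes_per_day,
})
for name, value in vifs.items():
    print(f"{name}: VIF = {value:.1f}")
```

In such data the three time-related variables have strongly inflated VIFs because any one of them is nearly determined by the other two, whereas cigarettes smoked per day stays close to 1, which is one way to motivate retaining a single smoking variable in the model.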

Grant support: National Key Basic Research and Development Program (2002CB512903, 2002CB512910) and National Natural Science Foundation of China (30400348).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

We thank Dr. Zhiliang Zhu of the Baoan District Center for Disease Control and Prevention, Shenzhen, China for his technical assistance with version 1.0 of the IMI comet analysts software. We thank Dr. Zufei Pan of the Institute of Industrial Health, Benxi Steel Industrial Corp., Benxi, China for his critical contribution to the field study and Dr. Qingyi Wei (Department of Epidemiology, University of Texas M. D. Anderson Cancer Center, Houston, TX) for his critical review and scientific editing of the manuscript.

1. Wang G, Jia M, Zhao S, Li J. A study on the relationship between lymphocyte micronucleus rates and blood plasma benzo(a)pyrene levels in coking workers [in Chinese]. Zhonghua Yu Fang Yi Xue Za Zhi 1999;33:40–2.
2. Crebelli R, Carta P, Andreoli C, et al. Biomonitoring of primary aluminium industry workers: detection of micronuclei and repairable DNA lesions by alkaline SCGE. Mutat Res 2002;516:63–70.
3. Liu L, Lan P. Lymphocyte sister chromatid exchange frequencies and micronucleus in coke oven workers [in Chinese]. Gong Ye Wei Sheng Yu Zhi Ye Bing 1995;21:203–4.
4. Karahalil B, Burgaz S, Fisek G, Karakaya AE. Biological monitoring of young workers exposed to polycyclic aromatic hydrocarbons in engine repair workshops. Mutat Res 1998;412:261–9.
5. Burgaz S, Erdem O, Karahalil B, Karakaya AE. Cytogenetic biomonitoring of workers exposed to bitumen fumes. Mutat Res 1998;419:123–30.
6. Verma M. Biomarkers for risk assessment in molecular epidemiology of cancer. Technol Cancer Res Treat 2004;3:505–14.
7. Koh D, Seow A, Ong CN. Applications of new technology in molecular epidemiology and their relevance to occupational medicine. Occup Environ Med 1999;56:725–9.
8. Pavanello S, Pulliero A, Siwinska E, Mielzynska D, Clonfero E. Reduced nucleotide excision repair and GSTM1-null genotypes influence anti-B[a]PDE-DNA adduct levels in mononuclear white blood cells of highly PAH-exposed coke oven workers. Carcinogenesis 2005;26:169–75.
9. Strunk P, Ortlepp K, Heinz H, Rossbach B, Angerer J. Ambient and biological monitoring of coke plant workers: determination of exposure to polycyclic aromatic hydrocarbons. Int Arch Occup Environ Health 2002;75:354–8.
10. Perera F, Brenner D, Jeffrey A, et al. DNA adducts and related biomarkers in populations exposed to environmental carcinogens. Environ Health Perspect 1992;98:133–7.
11. Siwinska E, Mielzynska D, Kapka L. Association between urinary 1-hydroxypyrene and genotoxic effects in coke oven workers. Occup Environ Med 2004;61:10–7.
12. Perera FP, Hemminki K, Gryzbowska E, et al. Molecular and genetic damage in humans from environmental pollution in Poland. Nature 1992;360:256–8.
13. Bonassi S, Neri M, Lando C, et al. Effect of smoking habit on the frequency of micronuclei in human lymphocytes: results from the Human MicroNucleus project. Mutat Res 2003;543:155–66.
14. Duell EJ, Wiencke JK, Cheng TJ, et al. Polymorphisms in the DNA repair genes XRCC1 and ERCC2 and biomarkers of DNA damage in human blood mononuclear cells. Carcinogenesis 2000;21:965–71.
15. Zhang J, Ichiba M, Hara K, et al. Urinary 1-hydroxypyrene in coke oven workers relative to exposure, alcohol consumption, and metabolic enzymes. Occup Environ Med 2001;58:716–21.
16. Bonassi S, Au WW. Biomarkers in molecular epidemiology studies for health risk prediction. Mutat Res 2002;511:73–86.
17. Lynch M, Walsh B. Genetics and data analysis of quantitative traits. Sunderland (MA): Sinauer; 1998.
18. Shipley B. Exploratory path analysis with applications in ecology and evolution. Am Nat 1997;149:1113–38.
19. Goldsmith JR. Paths of association in epidemiological analysis: application to health effects of environmental exposures. Int J Epidemiol 1977;6:391–9.
20. Brooks CH. Path analysis of socioeconomic correlates of county infant mortality rates. Int J Health Serv 1975;5:499–514.
21. Wright S. Correlation and causation. J Agric Res 1921;20:557–85.
22. Mo C, Ni Z, Zhang Q. The path analysis on influent factors of inpatient cost [in Chinese]. Zhongguo Yi Yuan Tong Ji 1999;6:213–5.
23. Brekke J, Kay DD, Lee KS, Green MF. Biosocial pathways to functional outcome in schizophrenia. Schizophr Res 2005;80:213–25.
24. Kozak M, Kang MS. Note on modern path analysis in application to crop science. Commun Biometry Crop Sci 2006;1:32–4.
25. Santos CA, Senalik DA, Simon PW. Path analysis suggests phytoene accumulation is the key step limiting the carotenoid pathway in white carrot roots. Genet Mol Biol 2005;28:287–93.
26. Sulkes J, Fields S, Gabbay U, Hod M, Merlob P. Path analysis on the risk of mortality in very low birth weight infants. Eur J Epidemiol 2000;16:337–41.
27. Leng S, Dai Y, Niu Y, et al. Effects of genetic polymorphisms of metabolic enzymes on cytokinesis-block micronucleus in peripheral blood lymphocyte among coke-oven workers. Cancer Epidemiol Biomarkers Prev 2004;13:1631–9.
28. Organic Methods Evaluation Branch, OSHA Analytical Laboratory. Coal tar pitch volatiles (CTPVs), coke oven emissions (COE), and selected polynuclear aromatic hydrocarbons (PAH). OSHA Method 58. Salt Lake City (UT): OSHA; 1986.
29. Jongeneelen FJ, Bos RP, Anzion RB, Theuws JL, Henderson PT. Biological monitoring of polycyclic aromatic hydrocarbons. Metabolites in urine. Scand J Work Environ Health 1986;12:137–43.
30. Li XH, Leng SG, Guo J, Guan L, Zheng YX. An improved high performance liquid chromatography method for determination of 1-hydroxypyrene in urine [in Chinese]. Wei Sheng Yan Jiu 2003;32:616–7.
31. Hornung RW, Reed LD. Estimation of average concentration in the presence of nondetectable values. Appl Occup Environ Hyg 1990;5:46–51.
32. Leng S, Cheng J, Pan Z, et al. Associations between XRCC1 and ERCC2 polymorphisms and DNA damage in peripheral blood lymphocyte among coke oven workers. Biomarkers 2004;9:395–406.
33. Singh NP, McCoy MT, Tice RR, Schneider EL. A simple technique for quantitation of low levels of DNA damage in individual cells. Exp Cell Res 1988;175:184–91.
34. Fenech M. The cytokinesis-block micronucleus technique: a detailed description of the method and its application to genotoxicity studies in human populations. Mutat Res 1993;285:35–44.
35. Fenech M. The in vitro micronucleus technique. Mutat Res 2000;455:81–95.
36. Fenech M, Chang WP, Kirsch-Volders M, Holland N, Bonassi S, Zeiger E. Human MicroNucleus project. HUMN project: detailed description of the scoring criteria for the cytokinesis-block micronucleus assay using isolated human lymphocyte cultures. Mutat Res 2003;534:65–75.
37. Miller SA, Dykes DD, Polesky HF. A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Res 1988;16:1215.
38. Hayashi SI, Watanabe J, Nakachi K, Kawajiri K. PCR detection of an A/G polymorphism within exon 7 of the CYP1A1 gene. Nucleic Acids Res 1991;19:4797.
39. Saarikoski ST, Voho A, Reinikainen M, et al. Combined effect of polymorphic GST genes on individual susceptibility to lung cancer. Int J Cancer 1998;77:516–21.
40. Zhong S, Wyllie AH, Barnes D, Wolf CR, Spurr NK. Relationship between the GSTM1 genetic polymorphism and susceptibility to bladder, breast and colon cancer. Carcinogenesis 1993;14:1821–4.
41. Katoh T, Nagata N, Kuroda Y, et al. Glutathione S-transferase M1 (GSTM1) and T1 (GSTT1) genetic polymorphism and susceptibility to gastric and colorectal adenocarcinoma. Carcinogenesis 1996;17:1855–9.
42. Zhou W, Thurston SW, Liu G, et al. The interaction between microsomal epoxide hydrolase polymorphisms and cumulative cigarette smoking in different histological subtypes of lung cancer. Cancer Epidemiol Biomarkers Prev 2001;10:461–6.
43. Shen MR, Jones IM, Mohrenweiser H. Nonconservative amino acid substitution variants exist at polymorphic frequency in DNA repair genes in healthy humans. Cancer Res 1998;58:604–8.
44. Butkiewicz D, Rusin M, Enewold L, Shields PG, Chorazy M, Harris CC. Genetic polymorphisms in DNA repair genes and risk of lung cancer. Carcinogenesis 2001;22:593–7.
45. Mort R, Mo L, Mcewan C, Melton DW. Lack of involvement of nucleotide excision repair gene polymorphisms in colorectal cancer. Br J Cancer 2003;89:333–7.
46. Vasconcelos AG, Almeida RM, Nobre FF. The path analysis approach for the multivariate analysis of infant mortality data. Ann Epidemiol 1998;8:262–71.
47. Hawkes JM, Holm K. Causal modeling: a comparison of path analysis and LISREL. Nursing Res 1989;38:312–4.
48. Kline RB. Principles and practice of structural equation modeling. New York: Guilford Press; 1998.
49. Youngblut JM. A consumer's guide to causal modeling: part II. J Pediatr Nurs 1994;9:409–13.
50. Sava FA. Causes and effects of teacher conflict-inducing attitudes towards pupils: a path analysis model. Teaching Teacher Educ 2002;18:1007–21.
51. Hatcher L. A step-by-step approach to using the SAS system for factor analysis and structural equation modeling. Cary (NC): SAS Institute Inc.; 1994. p. 588.
52. Leng S, Zheng Y, Zhang W, et al. Genetic damage in peripheral blood lymphocyte of coke oven workers [in Chinese]. Zhonghua Lao Dong Wei Sheng Zhi Ye Bing Za Zhi 2004;22:29–32.
53. Leng S, Zheng Y, Li X, et al. Study on the relationship between external exposure categories and urinary 1-hydroxypyrene concentrations in coke oven workers [in Chinese]. Gong Ye Wei Sheng Yu Zhi Ye Bing 2003;29:288–91.
54. Leng S, Zheng Y, Huang C, et al. Effect of genetic polymorphisms of microsomal epoxide hydrolase on urinary 1-hydroxypyrene levels in coke oven workers [in Chinese]. Zhonghua Lao Dong Wei Sheng Zhi Ye Bing Za Zhi 2004;22:245–9.
55. Leng S, Zheng Y, Pan Z, et al. A study on the inherited susceptibility of chromosomal damage in peripheral blood lymphocytes among coke oven workers [in Chinese]. Zhonghua Yu Fang Yi Xue Za Zhi 2004;38:94–8.
56. Hassett C, Aicher L, Sidhu JS, Omiecinski CJ. Human microsomal epoxide hydrolase: genetic polymorphism and functional expression in vitro of amino acid variants. Hum Mol Genet 1994;3:421–8.
57. Cheng J, Leng S, Dai Y, et al. Association of metabolic and DNA repair enzyme gene polymorphisms and DNA damage in coke-oven workers [in Chinese]. Zhonghua Yu Fang Yi Xue Za Zhi 2005;39:164–7.
58. Leng S, Cheng J, Zhang L, et al. The association of XRCC1 haplotypes and chromosomal damage levels in peripheral blood lymphocyte among coke-oven workers. Cancer Epidemiol Biomarkers Prev 2005;14:1295–301.
59. Butkiewicz D, Rusin M, Enewold L, et al. Genetic polymorphisms in DNA repair genes and risk of lung cancer. Carcinogenesis 2001;22:593–7.
60. Seker H, Butkiewicz D, Bowman ED, et al. Functional significance of XPD polymorphic variants: attenuated apoptosis in human lymphoblastoid cells with the XPD 312 Asp/Asp genotype. Cancer Res 2001;61:7430–4.
61. Schabath MB, Delclos GL, Grossman HB, et al. Polymorphisms in XPD exons 10 and 23 and bladder cancer risk. Cancer Epidemiol Biomarkers Prev 2005;14:878–84.
62. Honjo K, Tsutsumi A, Kawachi I, Kawakami N. What accounts for the relationship between social class and smoking cessation? Results of a path analysis. Soc Sci Med 2006;62:317–28.
63. Wiencke JK, Thurston SW, Kelsey KT, et al. Early age at smoking initiation and tobacco carcinogen DNA damage in the lung. J Natl Cancer Inst 1999;91:614–9.