Polymorphisms in DNA repair genes may be associated with differences in the repair capacity of DNA damage and may influence an individual’s susceptibility to smoking-related cancer. We investigated the association between two polymorphisms of the DNA repair gene XPA and risk of lung cancer in the Korean population. Two XPA polymorphisms (A23G and G709A) were typed in 265 lung cancer patients and 185 healthy controls who were frequency-matched on age and sex. The XPA G709A polymorphism was not detected in cases and controls. The XPA 23 GG genotype was associated with a significantly decreased risk for lung cancer [odds ratio (OR), 0.56; 95% confidence interval (CI), 0.35–0.90] when the combined AA and AG genotype was used as the reference. The reduction in risk for the XPA 23 GG genotype was significant in males (OR, 0.51; 95% CI, 0.30–0.86), younger individuals (OR, 0.39; 95% CI, 0.19–0.80), and current smokers (OR, 0.46; 95% CI, 0.25–0.83). These results suggest that the XPA A23G polymorphism contributes to genetic susceptibility for lung cancer.
Lung cancer has been considered as a disease determined solely by exposure to environmental carcinogens. However, there is a growing realization that genetic constitution is of importance in determining an individual’s susceptibility to lung cancer (1, 2). This genetic susceptibility may result from inherited polymorphisms in genes involved in carcinogen metabolism and repair of DNA damage (3, 4).
The complex system of DNA repair enzymes has a vital role in protecting the genome from carcinogenic damage (5, 6). In humans, >70 genes are involved in the five major DNA repair pathways: direct repair, base excision repair, nucleotide excision repair (NER), mismatch repair, and double strand break repair (6, 7). Small base adducts produced by oxidation, methylation, and radiation are repaired through the base excision repair pathway, whereas bulky, helix-distorting adducts induced by chemical carcinogens in cigarette smoke are primarily repaired through the NER pathway (6, 7, 8). In human NER, 15–18 polypeptides in six repair factors act in concert to excise DNA damage in the form of 24–32-nucleotide-long oligomers (9, 10).
Molecular epidemiological studies have shown considerable interindividual variation in DNA repair capacity (DRC) in the general population. Individuals with suboptimal DRC are at increased risk of smoking-related cancers, such as lung cancer and squamous cell carcinoma of the head and neck (11, 12). The variation in DRC may be the result of functional polymorphisms in DNA repair genes.
One hypothesis is that genetic polymorphisms of DNA repair genes may modulate the susceptibility to lung cancer. To test this hypothesis, we previously studied the contribution of polymorphisms in the DNA repair genes X-ray cross-complementing group 1 and xeroderma pigmentosum group D to the risk of lung cancer in a Korean population (13, 14). XPA protein plays a central role in NER through its interaction with replication protein A, transcription factor IIH, and excision repair cross-complementing group 1-xeroderma pigmentosum group F (9, 10). Butkiewicz et al. (15) identified two polymorphisms in the XPA gene, one in the 5′ noncoding region (A23G, at position −4 from the ATG start codon) and one at codon 228 (G709A, in exon 6). Although the functional effects of these polymorphisms are not known, it is possible that they affect host capacity for removing bulky adducts caused by cigarette smoke and thus modulate the susceptibility to lung cancer. Despite the potential importance of the XPA gene in carcinogenesis, no study has investigated the role of XPA polymorphisms in relation to cancer. In the present study, we conducted a case-control study to evaluate the associations between these two XPA polymorphisms and lung cancer risk.
Materials and Methods
This case-control study included 265 lung cancer patients and 185 healthy controls. The cases were all patients with histopathologically confirmed primary lung cancer who were newly diagnosed between January 1998 and December 1998 in Kyungpook National University Hospital. There were no age, sex, histological, or stage restrictions, but patients with a prior history of cancer were excluded. The tumors of the 265 lung cancer patients were: 136 (51.3%) squamous cell carcinomas, 78 (29.5%) adenocarcinomas, 47 (17.7%) small cell carcinomas, and 4 (1.5%) large cell carcinomas. The demographics and clinical characteristics of cases were compatible with those of the nationwide lung cancer survey conducted by the Korean Academy of Tuberculosis and Respiratory disease in 1998 (16). Controls were randomly selected from a pool of healthy volunteers who visited the general health check-up center of Kyungpook National University Hospital during the same period. It was difficult to collect enough controls over age 65 because the health check-up center was visited mainly by younger persons. Among 927 male volunteers who visited the health check-up center and agreed to this study (927 of 1746 males; participation rate, 53.1%), only 57 (6.1%) were over age 65. This number was insufficient to match in a 1:1 ratio with cases because the number of male lung cancer patients over age 65 was 71. Therefore, controls were frequency (2:3) matched to cases on sex and age (±5 years). All cases and controls were residents of Taegu City and surrounding regions. A detailed questionnaire was completed for each case and control by a trained interviewer. The questionnaire included information on the average number of cigarettes smoked daily and the number of years the subjects had been smoking. For former smokers, the time elapsed since quitting was recorded.
Genomic DNA was extracted from peripheral blood lymphocytes by proteinase K digestion and phenol/chloroform extraction. XPA genotypes were determined by a PCR-RFLP assay. The PCR primers for the A23G polymorphism (GenBank accession no. U16815) were 5′-TTAACTGCGCAGGCGCTCTCACTC-3′ (bases 1689–1712 of XPA) and 5′-AAAGCCCCGTCGGCCGCCGCCAT-3′ (bases 1846–1824 of XPA), which generate a 158-bp fragment. The PCR primers for the codon 228 polymorphism in exon 6 (GenBank accession no. AL445531) were 5′-TTTTCAGAATTGCGTC-3′ (bases 5123–5108 of XPA; primer was mutated G→T at base 5109) and 5′-TTCATATGTCAGTTCATG-3′ (bases 4977–4994 of XPA), which generate a 143-bp fragment. PCR reactions were performed in a 20-μl reaction volume containing 200 ng of genomic DNA, 10 pmol of each primer, 0.2 mM each deoxynucleotide triphosphate, 1× PCR buffer [75 mM Tris-HCl (pH 9.0), 15 mM ammonium sulfate, and 0.1 μg/μl BSA], 2.5 mM MgCl2, and 1 unit of Taq polymerase (Takara Shuzo Co., Otsu, Shiga, Japan). The mixtures were amplified with a Perkin-Elmer GeneAmp PCR System 9600 (Perkin-Elmer, Foster City, CA). The PCR profile consisted of an initial melting step of 94°C for 5 min, followed by 36 cycles of denaturation at 94°C for 20 s; primer annealing, 20 s at 58°C for A23G and 20 s at 48°C for codon 228; and primer extension, 20 s at 72°C for A23G and 30 s at 72°C for codon 228. The cycles were followed by a final elongation step at 72°C for 5 min for A23G and 10 min for codon 228. The PCR products were checked on a 2% agarose gel, photographed using Polaroid film, and then subjected to RFLP analysis.
The restriction enzyme MspI (New England Biolabs, Beverly, MA) was used to distinguish the A23G polymorphism, in which the polymorphic allele gains an MspI restriction site. The wild-type (A) allele (i.e., 23A) yields a single band representing the entire 158-bp fragment, and the polymorphic (G) allele (i.e., 23G) yields two bands (132 and 26 bp; Fig. 1). The restriction enzyme TaqαI (New England Biolabs) was used to distinguish the codon 228 polymorphism, in which the polymorphic allele loses a TaqαI restriction site. The wild-type (G) allele (i.e., 709G), which has a TaqαI restriction site, yields two bands (127 and 16 bp), and the polymorphic (A) allele (i.e., 709A) yields only one band representing the entire 143-bp fragment. Digestion of the PCR product was carried out according to the manufacturer’s instructions (New England Biolabs). Five μl of the PCR products were digested overnight with 5 units of MspI at 37°C or 5 units of TaqαI at 65°C. The digestion products were separated on an 8% acrylamide gel. Genotyping was successful for all subjects. The A23G genotyping analysis was repeated twice for all subjects, and selected PCR-amplified DNA samples (n = 2, respectively, for 23 AA, AG, and GG genotypes) were examined by DNA sequencing to confirm genotyping results.
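The MspI banding pattern described above maps directly onto the three A23G genotypes. The following Python sketch illustrates that mapping only; it is not the procedure used in the study. The function name, the band-size tolerance, and the idea of calling genotypes from a list of estimated band sizes are assumptions made for illustration.

```python
# Expected MspI digest bands (bp) for the 158-bp A23G amplicon, taken from the text:
# the 23A allele stays uncut (158 bp); the 23G allele is cut into 132 + 26 bp.
EXPECTED_BANDS = {
    "AA": (158,),          # both alleles uncut
    "AG": (26, 132, 158),  # one cut allele and one uncut allele
    "GG": (26, 132),       # both alleles cut
}


def call_a23g_genotype(observed_bands, tolerance=3):
    """Return 'AA', 'AG', or 'GG' for a list of observed band sizes in bp.

    A genotype is called when the number of bands matches and each observed
    band lies within `tolerance` bp of the corresponding expected band.
    The tolerance value is a hypothetical choice, not from the study.
    Returns None when no pattern matches.
    """
    observed = sorted(observed_bands)
    for genotype, expected in EXPECTED_BANDS.items():
        if len(observed) == len(expected) and all(
            abs(obs - exp) <= tolerance for obs, exp in zip(observed, expected)
        ):
            return genotype
    return None


if __name__ == "__main__":
    for bands in ([158], [158, 132, 26], [131, 27], [100]):
        print(bands, "->", call_a23g_genotype(bands))
```

Returning None for an unrecognized pattern, rather than guessing, mirrors the practice of repeating or sequencing ambiguous samples.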
An ever-smoker was defined as an individual who had smoked at least once a day for >1 year in his or her lifetime. A former smoker was defined as one who had stopped smoking at least 1 year before diagnosis in the case of patients and 1 year before the study began in the case of controls. Cumulative cigarette dose (pack-years) was calculated by the following formula: pack-years = (packs/day) × (years smoked). Light and heavy smokers were categorized by the approximate 50th percentile pack-years value among controls, i.e., ≤30 pack-years and >30 pack-years. Cases and controls were compared using Student’s t test for continuous variables and the χ² test for categorical variables. Hardy-Weinberg equilibrium was tested by a goodness-of-fit χ² test comparing the observed genotype frequencies with the expected genotype frequencies among the cases and controls. The ORs and 95% CIs were obtained using unconditional logistic regression analysis. Crude ORs and ORs adjusted for age, sex, and pack-years were calculated. To analyze the association between genotype and lung cancer risk after stratification by age (median age, ≤62 years/>62 years), sex, smoking status, and cigarette consumption (≤30 pack-years/>30 pack-years), multiple logistic regression analyses were performed. All analyses were performed using Statistical Analysis Software for Windows, version 6.12 (SAS Institute, Cary, NC).
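The pack-years formula and the 30 pack-year cutoff can be written out as a short Python sketch. It is illustrative only. The questionnaire recorded cigarettes per day, so converting to packs requires a pack size; the value of 20 cigarettes per pack used here is an assumption not stated in the text, and the function names are hypothetical.

```python
def pack_years(cigarettes_per_day, years_smoked, cigarettes_per_pack=20):
    """Cumulative dose: (packs/day) x (years smoked).

    cigarettes_per_pack=20 is an assumed conversion factor, not from the study.
    """
    return (cigarettes_per_day / cigarettes_per_pack) * years_smoked


def dose_group(pack_year_value, cutoff=30):
    """Classify a smoker as 'light' (<= cutoff) or 'heavy' (> cutoff).

    The cutoff of 30 pack-years is the one given in the text.
    """
    return "light" if pack_year_value <= cutoff else "heavy"


if __name__ == "__main__":
    dose = pack_years(cigarettes_per_day=30, years_smoked=30)
    print(dose, dose_group(dose))
```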
The details of cases and controls enrolled in this study are shown in Table 1. There were no significant differences in the mean age and sex distribution between cases and controls, suggesting that matching on these two variables was adequate. Cases showed a higher prevalence of current smokers compared with controls (P < 0.01). The pack-years in smokers was significantly higher in cases than in controls (40.8 ± 21.5 versus 34.8 ± 15.6 pack-years; P < 0.01). These differences were controlled for in later multivariate analyses.
The XPA G709A polymorphism was not detected in cases and controls. The distributions of XPA A23G genotypes (AA, AG, and GG) among cases and controls are shown in Table 2. The distributions of the genotypes among controls were in Hardy-Weinberg equilibrium. The AA and AG genotypes were more frequent in cases (22.6 and 60.4%, respectively) than in controls (20.5 and 54.6%), whereas the GG homozygotes were less frequent in cases (17.0%) than in controls (24.9%). These findings suggested that the AA and AG genotypes might be risk genotypes for lung cancer. Because the risk of lung cancer for the AG genotype was similar to that for the AA genotype, we combined the AG genotype with the AA genotype into one susceptible group and compared it with the group with the GG genotype. When the combined AA and AG genotype was used as the reference group, the GG genotype was associated with a significantly decreased risk for lung cancer (adjusted OR, 0.56; 95% CI, 0.35–0.90).
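For orientation, a crude (unadjusted) OR for the GG genotype can be approximated from the genotype percentages above. The Python sketch below rests on two assumptions. First, the counts are reconstructed by rounding the reported percentages against 265 cases and 185 controls; they are not taken from Table 2. Second, the confidence interval uses the Woolf (log) method on a 2×2 table, whereas the study used logistic regression adjusted for age, sex, and pack-years.

```python
import math


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Crude OR with a Woolf (log-based) 95% CI from a 2x2 table."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of ln(OR): square root of the summed reciprocal cell counts
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Counts reconstructed from the reported percentages (an assumption):
    # cases: GG 17.0% of 265 -> 45, AA+AG -> 220
    # controls: GG 24.9% of 185 -> 46, AA+AG -> 139
    or_, lo, hi = crude_odds_ratio(45, 220, 46, 139)
    print(f"crude OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With these reconstructed counts the crude OR is about 0.62. It is expected to differ from the adjusted OR of 0.56 reported above, because no covariates enter the calculation.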
The association between the XPA A23G polymorphism and lung cancer was further examined after stratifying for the potential confounding variables such as age, sex, smoking status, and pack-years. The risk estimates for the GG genotype are presented in Table 3. When stratified by the median age, a significant reduction in risk was observed in younger individuals (≤62 years; adjusted OR, 0.39; 95% CI, 0.19–0.80), whereas there was no significant association in older individuals (>62 years; adjusted OR, 0.76; 95% CI, 0.39–1.48). When stratified by sex, a significant reduction in risk was observed in males (adjusted OR, 0.51; 95% CI, 0.30–0.86), whereas there was no significant association in females (adjusted OR, 0.82; 95% CI, 0.25–2.69). When stratified by smoking status, there was a significant protective effect (adjusted OR, 0.46; 95% CI, 0.25–0.83) for current smokers with the GG genotype but not for former or never smokers. When the ever-smokers were dichotomized by the pack-years of smoking, the protective effect of the GG genotype was similar in both light smokers (adjusted OR, 0.48; 95% CI, 0.21–1.08) and heavy smokers (adjusted OR, 0.58; 95% CI, 0.28–1.21). When the current smokers were dichotomized by the pack-years of smoking, the protective effect of the GG genotype was also similar in both light smokers and heavy smokers (data not shown).
The contribution of the XPA A23G polymorphism in each histological subcategory is shown in Table 4. The GG genotype was associated with significantly decreased risk for small cell lung cancer (OR, 0.23; 95% CI, 0.07–0.71) but with nonsignificant decreased risk for squamous cell carcinoma (OR, 0.63; 95% CI, 0.36–1.12) and adenocarcinoma (OR, 0.57; 95% CI, 0.28–1.16).
This is the first case-control study of XPA polymorphisms in relation to lung cancer. In our study, the XPA 23GG genotype was associated with a significantly decreased risk for lung cancer. The protective effects were evident in younger individuals, males, and current smokers. These findings suggest that the XPA A23G polymorphism may contribute to inherited genetic susceptibility to lung cancer.
Because the XPA G709A polymorphism was not detected in cases and controls, we analyzed only the association of the XPA A23G polymorphism with lung cancer risk. The frequency of the XPA 23G allele among the healthy controls in this study was 0.52, which was similar to that (0.57) observed in a Polish population (15).
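The 23G allele frequency and the Hardy-Weinberg check described in the Methods follow from the control genotype counts. The Python sketch below assumes control counts of 38 AA, 101 AG, and 46 GG. These are reconstructed by rounding the reported percentages (20.5, 54.6, and 24.9% of 185) and are not taken from Table 2. The function names are illustrative.

```python
def g_allele_frequency(aa, ag, gg):
    """Frequency of the G allele: each GG carries two copies, each AG one."""
    return (2 * gg + ag) / (2 * (aa + ag + gg))


def hardy_weinberg_chi_square(aa, ag, gg):
    """Goodness-of-fit chi-square statistic (1 degree of freedom) for HWE."""
    n = aa + ag + gg
    q = g_allele_frequency(aa, ag, gg)  # G allele frequency
    p = 1 - q                           # A allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (aa, ag, gg)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Reconstructed control genotype counts (an assumption, see lead-in)
    aa, ag, gg = 38, 101, 46
    print(f"G allele frequency: {g_allele_frequency(aa, ag, gg):.2f}")
    chi2 = hardy_weinberg_chi_square(aa, ag, gg)
    # 3.841 is the chi-square critical value for 1 df at alpha = 0.05
    print(f"HWE chi-square: {chi2:.2f} (critical value 3.841)")
```

Under these assumed counts the G allele frequency rounds to 0.52, matching the value reported above, and the statistic falls below the critical value, consistent with the statement that the control genotypes were in Hardy-Weinberg equilibrium.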
Although the mechanism responsible for the association between the XPA A23G polymorphism and lung cancer risk remains to be elucidated, several lines of evidence presented herein support the biological plausibility of this association:
(a) The XPA A23G polymorphism had a clearer effect on lung cancer risk in younger individuals than in older subjects. This finding conforms to the current theory that genetic susceptibility is more important in early-onset disease.
(b) The protective effect of the XPA 23GG genotype was greatest in current smokers, consistent with a marker of genetic susceptibility reflecting a gene-environment interaction. However, the failure to see a significant effect in both former and never-smokers may be attributable to the relatively few subjects in these groups. Our results therefore need to be confirmed by larger studies.
If the XPA genotype is indeed a marker of genetic susceptibility rather than a tumor marker, the frequencies of the various genotypes should not be associated with disease status or stage of disease. However, certain genotypes could confer a greater susceptibility to a particular histological type of lung cancer (17, 18, 19, 20, 21). In our study, the protective effect of the XPA 23GG genotype was more evident for small cell lung cancer than for squamous cell carcinoma or adenocarcinoma; this difference may be attributable to the differences in pathways of carcinogenesis among histological types of lung cancer (18, 22, 23). The histological type of lung cancer may be determined by the particular initiating agent to which an individual is exposed (24, 25, 26). Genetic susceptibility to small cell lung cancer is therefore probably different from genetic susceptibility to squamous cell lung cancer or adenocarcinoma. However, this finding should be interpreted with caution because of the relatively small numbers in the subgroups.
Whether the XPA A23G polymorphism itself alters transcription and/or translation or is in linkage disequilibrium with other polymorphisms that may affect them remains unknown. Because this polymorphism is located in the vicinity of the translation initiation codon, it may alter translation efficiency. The nucleotides proximal to the AUG initiation codon are important for initiation of translation because the 40S ribosomal subunit binds initially at the 5′-end of mRNA (27). The G at +4 as well as each of the Kozak consensus nucleotides (GCC(A/G)CCAUGG) from position −1 through −6 are important determinants of translation efficiency (28, 29). Recently, Afshar-Kharghan et al. (30) reported that the T-5C polymorphism in the glycoprotein Ibα gene was associated with a marked increase in the level of glycoprotein Ibα receptor on the platelet membrane. They attributed this result to the sequence containing C instead of T at position −5, which more closely approximates the consensus sequence and results in more efficient translation. The sequence (CCAGAGAUGG) around the predicted initiator methionine codon of the XPA gene agrees with the Kozak consensus sequence at positions −3 and +4 (31). Although neither the A nor the polymorphic variant G nucleotide at the −4 position of the XPA gene corresponds to the original Kozak consensus sequence, which contains C at position −4, it is possible that a nucleotide substitution of adenine to guanine at position −4 preceding the AUG codon may affect ribosomal binding and thus alter the efficiency of XPA protein synthesis. To investigate whether the transition from A to G changes the translation efficiency, an in vitro transcription/translation analysis (32) and a primer extension assay of the initiation complex (33) will be necessary in the future.
An alternative explanation could be that the protective XPA allele is in linkage disequilibrium with an allele from an adjacent gene, which is the true susceptibility gene.
One must consider potential biases that might influence the results of case-control studies, primarily selection bias and information bias (34, 35):
(a) There may be selection bias. Given that most lung cancer patients are treated at university hospitals in Korea, the demographics and clinical characteristics of our cases are compatible with those of the nationwide lung cancer survey (16). Because we included all lung cancer patients diagnosed at a national university hospital, it might be reasonable to assume that our case group represents lung cancer cases in our community.
(b) Another selection bias may derive from controls who did not participate in this study. However, because the age and sex distribution and the exposure (smoking status and pack-years) of nonparticipating controls were similar to those of the participating controls in our study (data not shown), nonparticipant bias is unlikely.
(c) Disease status may be misclassified. All of our cases were pathologically confirmed, and controls were confirmed to be healthy by health examination. Therefore, this type of bias is unlikely as well.
(d) Exposure may be misclassified because of differential recall between cases and controls during the interview. However, we interviewed the cases and controls with the same instrument and rechecked the questionnaires by randomly re-interviewing 10% of the subjects, which generated similar results. Therefore, recall bias is unlikely as well.
This is the first molecular epidemiological study of XPA polymorphisms in lung cancer. We found that the XPA A23G polymorphism was associated with lung cancer risk. The protective effects of the XPA 23 GG genotype were more evident in younger individuals, males, and current smokers. It is possible that our findings, particularly from the stratified analyses, are attributable to chance because of the relatively small numbers in the subgroups. Therefore, the functional relevance of this XPA polymorphism and its role in cancer susceptibility remain to be determined in larger epidemiological studies.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
This study was supported in part by the Korea Science and Engineering Foundation through the Biomolecular Engineering Center at Kyungpook National University.
The abbreviations used are: NER, nucleotide excision repair; DRC, DNA repair capacity; XPA, XP group A; OR, odds ratio; CI, confidence interval.