The repair of damaged DNA requires the function of multiple proteins in generally damage-specific, nonredundant pathways. The relationship of DNA repair to cancer susceptibility is obvious in “cancer families,” in which low frequency, high penetrance, loss-of-function variant alleles of genes with roles in the repair of damaged DNA have been associated with a high risk of disease. More important for the cancer incidence in the general population, many individuals exhibit reduced (60–75% of normal) repair capacity phenotypes that have been associated with several-fold increases in individual cancer risk. In a program to identify the molecular basis for the variation in repair capacity and the elevated cancer susceptibility, we have identified 127 amino acid substitution variants in resequencing 37 DNA repair genes in 36–164 unrelated individuals. Over 50% of the substitutions are exchanges of amino acid residues with dissimilar physical or chemical properties, at sites at which the common residue is identical in the human and mouse proteins. Five additional sequence changes resulting in proteins with altered termination of translation and one amino acid insertion variant were detected. The variant allele frequencies average 0.047, with individual variant allele frequencies ranging from <0.01 to 0.43. Homozygous variant individuals and individuals with multiple amino acid substitutions in a gene were observed. Most individuals exhibited variation in multiple genes in a repair pathway. Ten variant alleles accounted for 52% of the genetic variation among individuals, but a striking 23% of the total variation is associated with 108 variants with allele frequencies of less than 5%. 
Screening generally healthy individuals generates a catalogue of common variants. This catalogue is a resource for molecular epidemiology studies that use a genotype-to-phenotype paradigm to estimate the role of genetic variation and individual susceptibility in disease risk from environmental and lifestyle exposures in the general population of the United States.

Exposure of cells to environmental agents as well as the by-products of cellular metabolism results in extensive damage to DNA. The pattern of DNA damage is often complex but has characteristics associated with the damaging agent. Organisms have several, generally nonredundant, pathways for repairing different DNA lesions, e.g., strand breaks, adducts, and oxidized bases, resulting from these exposures. The nucleotide excision repair (NER) pathway removes UV-induced pyrimidine dimers and bulky DNA adducts associated with chemical exposures (1, 2). The double-strand break/recombination repair (DSB/RR) pathway repairs strand breaks that are often associated with exposure to ionizing radiation and radiomimetic drugs or are the result of incomplete repair of other damage (3, 4, 5). A third pathway, base excision repair (BER), directly processes damaged bases, such as oxidized bases, the most common spontaneous damage (6, 7). These lesions also result from exposure of cells to ionizing or UV radiation via elevations in the intracellular reactive oxygen levels (8). A fourth pathway, mismatch repair (MMR), repairs replication errors (9, 10). As recently summarized, DNA repair in mammalian cells involves more than 80 genes with direct roles in the repair of DNA damage (11, 12). At least 40 additional genes with roles in DNA damage recognition and cell cycle checkpoint processes have important, although more indirect, roles in the repair of DNA damage (13, 14).

The important role of DNA repair in the maintenance of a normal cellular genotype and a cancer-free state is most obvious in “cancer families,” in which the presence of rare but highly penetrant variant alleles at a number of loci is associated with a high risk of cancer. A classic example is xeroderma pigmentosum, a prototype cancer gene syndrome associated with the development of UV-induced skin cancers resulting from the loss of function of a gene of the NER pathway (15). The association of defects in MMR with colon cancer is another example of the critical role of DNA repair in cancer prevention (9, 10). Other genes with direct or indirect roles in DNA repair in which variant alleles are associated with elevated cancer risk include BRCA1, BRCA2, TP53, ATM, and NBS1 (16). Disruption of the function of genes with roles in the repair of damaged DNA is associated with increased sensitivity to DNA damaging agents and cancer proneness (17). These instances of inherited cancer predisposition have provided important models for increasing the understanding of DNA repair pathways and carcinogenic processes and the relationship of DNA repair to cancer risk. In terms of cancer incidence, these generally highly penetrant, low-activity disease-associated alleles are estimated to account for no more than 5% of the cancer cases in the general population. The remaining cases, the sporadic cancers, are suggested to occur in individuals carrying combinations of common polymorphisms that often have low penetrance, only marginally altered function, and weak effects, and that together increase susceptibility to disease from environmental exposures and lifestyle factors.

Reduced DNA repair capacity is a polymorphic phenotypic trait. At least 10% of the individuals in the population have a capacity to repair DNA damage, measured after in vitro exposure of lymphocytes to DNA-damaging agents, that is only 60–75% of the population mean (18). The reduced-repair-capacity phenotypes for damage induced by bleomycin, γ radiation, and benzo[a]pyrene-diolepoxide, classes of damage expected to be repaired by different pathways and, thus, different sets of genes, behave as independent traits (19). The genetic contribution to the interindividual differences in repair capacity ranges from 0.65 to 0.80 for these three traits (20, 21, 22). A link between the reduced-DNA-repair-capacity phenotypes and cancer susceptibility is supported by data from a series of epidemiology studies. These studies have demonstrated that a reduced-repair-capacity phenotype is associated with an increased risk (odds ratios of 2–10) of developing tumors at several sites, including breast, lung, skin, liver, or head/neck [see review of Berwick and Vineis (23) and references cited therein]. A limited number of individuals have reduced capacity to repair damage induced by both bleomycin and benzo[a]pyrene-diolepoxide, damage expected to be repaired by different pathways. The number of such individuals is consistent with the number expected, given the incidence of individuals with reduced capacity phenotypes for each agent. These individuals with reduced capacity in two pathways exhibit a higher risk of developing lung cancer than do individuals with a reduced capacity in only one pathway (24).
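The statement that the number of individuals with two reduced-capacity phenotypes matches expectation can be illustrated with a minimal sketch. It assumes the two traits are independent, so the joint frequency is the product of the single-trait frequencies. The function name, the 10% frequency assigned to each trait, and the cohort size are illustrative assumptions, not values reported for any specific cohort.

```python
def expected_double_reduced(freq_a, freq_b, n_individuals):
    """Expected count with both phenotypes if the two traits are independent."""
    return freq_a * freq_b * n_individuals

# ASSUMPTION: each reduced-capacity phenotype occurs in 10% of individuals,
# and the cohort size of 1000 is hypothetical.
expected = expected_double_reduced(0.10, 0.10, 1000)
print(expected)
```

Under these assumptions only about 1% of individuals would be expected to show reduced capacity in both pathways, which is why such individuals are described as limited in number.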

Accumulating evidence suggests that many diseases, including most cancer cases, result from low-level exposures and lifestyle factors in genetically susceptible individuals. The results presented here focus on identification of the common or polymorphic amino acid substitution variants existing in DNA repair genes in the general population. This extends the work of Shen et al. (25) to more genes in a larger number of individuals. This catalogue of variant alleles is a resource for molecular epidemiology studies that use a genotype-to-phenotype (or health consequence) paradigm to associate DNA repair gene variants with repair capacity and cancer susceptibility.

The resequencing strategy, which involves sequencing of the same genomic region in multiple individuals to identify DNA sequence variation in DNA repair genes, has been described previously (25). It involves the direct sequencing of PCR products containing the individual exons of a gene plus adjacent intronic and noncoding regions. The PCR products include the splice sites and the 5′ and 3′ regions of the genes.

PCR Amplification of Exons.

The PCR primers are designed using the Oligo Primer Analysis software (National Biosciences, Inc., Plymouth, MN). Appended to the 5′ end of each PCR primer is the primer binding site for the forward or reverse energy transfer (ET) DNA sequencing primer (Amersham Life Science, Inc., Cleveland, OH). PCR primers are matched so that the sense and the antisense PCR primers contain different sequencing primer binding sites. PCR conditions are optimized as necessary by addition of DMSO. Primers were obtained from Sigma Genosys (The Woodlands, TX).

PCR primers are positioned so that amplification of the genomic sequence is initiated at least 75 nucleotides from the intron-exon boundary. This is sufficient distance for high-quality sequence data to be obtained before reaching the intron/exon splice site. The PCR products are ∼400 bp (range, 300–450 bp), and, therefore, the entire fragment can be sequenced in both directions without developing new sequencing primers.

The GenBank accession numbers for the genomic sequences used for primer design are listed in Table 1. The locus designations are from the HUGO Gene Nomenclature Committee guidelines. Common aliases are included for a number of the genes.

DNA Sequencing.

After PCR amplification, PCR products are diluted and used as substrate in sequencing reactions. Dye primer cycle sequencing reactions are performed according to manufacturer’s instructions for the DYEnamic Direct cycle sequencing kit with the DYEnamic energy transfer primers (Amersham Life Science, Inc.) and loaded into an ABI Prism 377 stretch DNA sequencer (Applied BioSystems, Foster City, CA). The dye primer sequencing method yields generally uniform peak intensities, which facilitates identification of sequence variation in heterozygote individuals as the two comigrating peaks are of similar intensity, but ∼50% of the intensity of the neighboring peaks in the chromatogram. All of the PCR products are sequenced in both directions, with the identification of the variant nucleotide in the sequencing of the reverse read providing evidence for the authenticity of the initial observation of sequence variation.
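The peak-intensity signature described above can be expressed as a simple heuristic. The sketch below is not the PolyPhred algorithm and is not taken from the study. The function name, the tolerance of 0.25, and the intensity values are hypothetical and serve only to make the "two comigrating peaks of similar intensity, each about half the height of neighboring peaks" criterion concrete.

```python
def looks_heterozygous(primary, secondary, neighbor_mean, tol=0.25):
    """Heuristic: two comigrating peaks of similar height, each near half
    the mean height of the neighboring single-base peaks."""
    if neighbor_mean <= 0:
        return False
    # The two peaks should be of similar intensity.
    similar = abs(primary - secondary) <= tol * max(primary, secondary)
    # Each peak should be roughly 50% of the neighboring peak intensity.
    half_height = all(
        abs(peak / neighbor_mean - 0.5) <= tol for peak in (primary, secondary)
    )
    return similar and half_height

# HYPOTHETICAL intensities in arbitrary units.
print(looks_heterozygous(520, 480, 1000))   # balanced pair at about half height
print(looks_heterozygous(1000, 50, 1000))   # single full-height peak plus noise
```

In practice the call was confirmed by the reverse read and by manual inspection, as described in the text; a threshold rule of this kind would only be a first-pass filter.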

Sequence Analysis.

The initial data analysis (lane tracking and base calling) is performed with the ABI Prism DNA sequence analysis software (version 2.1.2). Chromatograms created by the ABI Prism DNA sequence analysis software are imported into a Sun Microsystems UNIX workstation (Sun Microsystems Inc., Mountain View, CA). The chromatograms are reanalyzed with Phred (bases called and quality of sequence values assigned, version 0.961028), assembled with Phrap (version 0.960213) and the resultant data viewed with Consed (version 4.1). “PolyPhred” (version 2.1), a software package that uses the output from Phred, Phrap, and Consed, is used to identify single nucleotide substitutions in heterozygote individuals (26). All of the sequence variants identified by PolyPhred and the immediately surrounding region were inspected to confirm the existence of high-quality sequence reads in the region, before “marking” the nucleotide substitution or other sequence alteration as a variant. The consensus sequence for each gene is derived from the samples sequenced in this study and may differ from the specific sequence(s) in GenBank. The numbering of the nucleotides in the genomic sequence is consistent with the numbering in GenBank or the public domain draft sequence that is available. The common or wild-type allele is defined as the most common allele in the sample set sequenced rather than by the reference GenBank sequence. The GenBank genomic sequences for ERCC1, LIG3, POLD1, RAD52, and XRCC1 are in descending numbers because the genomic sequence for each of these genes is in the reverse orientation of the cDNA sequence.

Samples for Variation Screening.

Four sets of samples were screened for variation. Table 1 relates the genes to the specific sample set screened and the number of unrelated individuals screened for variation for each gene. Table 1 also includes the GenBank number for the specific cDNA sequence used for assigning the location of the amino acid substitutions identified in these samples and listed in Table 2.

The majority of the genes have been screened for variation in 92 samples from the “DNA Polymorphism Discovery Resource” at the Coriell Institute for Medical Research (Group I). This resource was developed by the NIH to have a common set of samples available to investigators screening for common variants existing in the general population of the United States. The availability of lists of common polymorphisms in large numbers of candidate genes was expected to facilitate subsequent studies to relate genetic variation to disease risk (27). The samples (27) are from United States residents selected to represent the major ethnic groupings of the population, although the ethnic origin of specific individuals is unknown. The individuals in this sample set are from population groups as follows: European-American, 23; African-American, 23; Mexican-American, 11; Native-American, 11; Asian-American, 23. Because the ethnicity of specific individuals is unknown, the estimated allele frequency data will not suggest potential differences among ethnic groups in the distribution of alleles. These samples cannot be associated with specific donors and were deemed to be exempt by the LLNL-IRB for human subjects research.

Because this variation screening effort was initiated before the establishment of the Polymorphism Discovery Resource, two other sample sets were screened for variation in early studies. The initial resequencing screened DNA from 36 unrelated individuals (Group II). Group II included 12 samples from unidentified individuals for whom no characteristics are known, although they are presumed to have been healthy at the time of sample collection in Ann Arbor, MI and are probably Caucasian. These are the same samples as screened by Shen et al. (25) in a preliminary search for variation in DNA repair genes. Because the samples cannot be associated with a donor, they were deemed to be exempt by the LLNL-IRB. Twenty-four additional samples in Group II were from Caucasians enrolled in a lung cancer study conducted at Johns Hopkins University, Baltimore, MD. Informed consent to use these samples for the study of the possible relationship of variation in DNA repair and cancer had been obtained and the study approved by the Johns Hopkins University Institutional Review Board and the LLNL-IRB. Twelve of the samples are from cancer cases and 12 are from controls. Every variant identified in resequencing lymphocyte DNA from the lung cancer cases was also identified in other individuals.

A subsequent sample set, which was also a predecessor to the Polymorphism Discovery Resource, included 72 individuals of African, Asian, or Caucasian origin selected for geographical diversity (Ref. 28; Group III). These samples are available from the Coriell Institute for Medical Research (Camden, NJ) and were also deemed to be exempt by the LLNL-IRB.

A fourth set of 46 samples (Group IV) has been used to screen for variation in genes of the NER pathway that may be associated with risk of melanoma. These individuals are of Caucasian origin and were selected because of the previous diagnosis of recurrent melanoma. The screening of these samples for sequence variation was approved by the Institutional Review Board of the Memorial Sloan-Kettering Cancer Center (New York, NY) and the LLNL-IRB.

All of the repair genes screened for variation are located on autosomes and, therefore, the number of chromosomes screened is twice the number of samples. Thus, in summary, Group I is 184 chromosomes, Group II is 72 chromosomes, Group III is 144 chromosomes, and Group IV is 92 chromosomes. APEX, POLB, and XRCC1 were screened for variation in both the Group I and II sample sets or a total of 256 chromosomes, and LIG3 was screened in the samples of Groups I and III or 328 chromosomes (Table 1).
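The chromosome totals in this paragraph follow directly from the sample counts given for each group. A minimal sketch of the arithmetic is shown here; the dictionary and function names are illustrative and not part of the study.

```python
# Number of unrelated individuals in each sample set, as stated in the text.
SAMPLES = {"I": 92, "II": 36, "III": 72, "IV": 46}

def chromosomes_screened(groups):
    """Autosomal genes: two chromosomes per individual, summed over groups."""
    return sum(2 * SAMPLES[g] for g in groups)

per_group = {g: chromosomes_screened([g]) for g in SAMPLES}
apex_polb_xrcc1 = chromosomes_screened(["I", "II"])  # Groups I and II
lig3 = chromosomes_screened(["I", "III"])            # Groups I and III
print(per_group, apex_polb_xrcc1, lig3)
```

The same helper gives the denominator for any allele frequency estimate, since each frequency in Table 2 is a count of variant chromosomes divided by the number of chromosomes screened for that gene.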

Summary of Amino Acid Substitution Variants.

A total of 127 different single nucleotide polymorphisms resulting in amino acid substitutions were identified in the screening of 37 genes (Table 2). The sequence variation data and the individual genotypes for each individual for the genes screened in the Polymorphism Discovery Resource sample set (Group I) can be accessed on the Internet. Similar data for the Group III genes can also be accessed.

A large number of different amino acid substitution variants were identified in several genes, including 10 different variants for LIG1, 9 for XRCC1, and 8 for MLH1. Although none of the 10 LIG1 variants exists at a frequency of over 0.02, the total variant allele frequency for the gene is 0.11. Similarly, the seven low-frequency variants of both XRCC1 and MLH1 exist at total variant allele frequencies of 0.10 for each gene. No amino acid substitution variants were detected in RAD51, PCNA, FEN1, ERCC1, and ERCC3. With the exception of FEN1, a small intronless gene, nucleotide substitutions were identified in both the exons and introns of these genes. In the collection of 127 amino acid substitution variants, 22 variant alleles (17%) exist at estimated frequencies of greater than 0.05. Two variant alleles existing at frequencies of at least 0.10 were identified in five genes (ERCC2, XPC, MSH3, XRCC1, and POLD1). The estimates of allele frequencies must be considered tentative, as relatively small numbers of chromosomes were screened for variation, although the estimated allele frequencies for several of the common polymorphisms are similar to frequencies observed in subsequent molecular epidemiology studies (29, 30, 31). The two nucleotide substitutions in the codon for amino acid residue 618 (Lys) of MLH1 exist in the same individual. From the direct sequencing of the PCR product, it is not possible to differentiate between nucleotide substitutions on the same chromosome and substitutions on opposite chromosomes; thus, three possibilities for the variant amino acid residue exist, Ala if the substitutions are on the same chromosome or Thr and Glu if the substitutions are on different chromosomes.
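The phase ambiguity at MLH1 residue 618 can be made concrete by enumerating the codons that result when two substitutions lie on the same chromosome or on opposite chromosomes. The sketch below assumes that the Lys codon is AAG and that the substitutions are A to G at the first position and A to C at the second; the text does not state the codon or the nucleotide changes, so these are illustrative assumptions chosen to be consistent with the Ala, Thr, and Glu outcomes. The function names are hypothetical.

```python
# Only the four codons needed for this illustration (standard genetic code).
CODON_TO_AA = {"AAG": "Lys", "GAG": "Glu", "ACG": "Thr", "GCG": "Ala"}

def apply_subs(codon, subs):
    """Return the codon after applying {position: base} substitutions (0-based)."""
    bases = list(codon)
    for pos, base in subs.items():
        bases[pos] = base
    return "".join(bases)

def phase_outcomes(codon, sub_a, sub_b):
    """Residues encoded by the two chromosomes under each possible phase."""
    # Same chromosome: one unchanged copy plus one copy carrying both changes.
    cis = sorted(
        CODON_TO_AA[c] for c in (codon, apply_subs(codon, {**sub_a, **sub_b}))
    )
    # Opposite chromosomes: each copy carries exactly one change.
    trans = sorted(CODON_TO_AA[apply_subs(codon, s)] for s in (sub_a, sub_b))
    return {"cis": cis, "trans": trans}

# ASSUMPTION: Lys codon AAG, with A>G at position 1 and A>C at position 2.
outcomes = phase_outcomes("AAG", {0: "G"}, {1: "C"})
print(outcomes)
```

Direct sequencing of a diploid PCR product yields the same pair of heterozygous positions in both cases, which is why the two phases cannot be distinguished without cloning or allele-specific amplification.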

Six additional variants with potential to disrupt protein structure were identified (Table 3). Two of the variants were identified in RAD50. The first is an insertion of three nucleotides resulting in the addition of a Gln residue after amino acid residue 363, with the sequence changing from Gln-Glu-His-Ile to Gln-Glu-(Gln)-His-Ile. The second variant is a nucleotide substitution in the codon for Gln at 826, generating a termination signal and resulting in deletion of ∼30% of the protein. Each of these variants was observed only once, existing at an estimated allele frequency of <0.01. Nucleotide substitutions resulting in the generation of termination codons and synthesis of truncated proteins were identified at amino acid residue 333 of MRE11A (allele frequency of 0.009) and residues 346 (allele frequency of 0.033) and 415 (allele frequency of 0.041) of RAD52. A four-nucleotide duplication at the COOH terminus of MSH6 was identified (allele frequency of 0.02). This insertion changed the amino acid sequence from Thr-Leu-Ile-Lys-Glu-Leu-stop to the variant sequence Thr-Leu-Ile-Asp-stop. No nucleotide substitutions were observed in the critical splice site consensus sequences (splice site plus or minus two nucleotides). In total, 133 different nucleotide sequence alterations with potential to disrupt protein function were identified in the screening of 37 genes.

The variation identified in screening 37 genes is summarized by repair pathway in Table 4. An average of 3.6 different variants were identified per gene. The totals in Table 4 are the number of different genes screened and unique variants observed. They are not the sum of the numbers for the individual pathways because several genes have roles in more than one repair pathway. The average variant allele frequency is 0.047 (Table 4). No major differences in average allele frequency for the genes of the different pathways were noted.

In addition to the 133 nucleotide substitution and insertion/deletion variants described in Tables 2 and 3, 96 nucleotide substitutions that did not result in amino acid substitutions were detected within the exons. Thus, ∼60% of the nucleotide substitutions identified in the exons of these repair genes result in amino acid substitutions. In addition, 608 nucleotide substitutions have been identified in the adjacent intronic regions and the 3′- and 5′-UTRs of these genes. Initial analysis does not provide evidence that any of the nucleotide substitutions in the 5′-UTR disrupt known regulatory sites. These sequence variation data are also available on the Internet.
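The summary figures quoted in this section can be rechecked from the counts given in the text. The sketch below is a consistency check only. It assumes that the exonic fraction is computed as amino acid substitutions divided by amino acid substitutions plus silent exonic substitutions, which is one reasonable reading of the text; the variable names are illustrative.

```python
amino_acid_substitutions = 127  # Table 2
other_protein_altering = 6      # Table 3 (terminations and insertions)
silent_exonic = 96              # exonic substitutions with no residue change
genes_screened = 37

protein_altering = amino_acid_substitutions + other_protein_altering
variants_per_gene = protein_altering / genes_screened

# ASSUMPTION: denominator is amino acid substitutions plus silent exonic changes.
exonic_substitutions = amino_acid_substitutions + silent_exonic
fraction_altering = amino_acid_substitutions / exonic_substitutions

print(protein_altering, round(variants_per_gene, 1), round(fraction_altering, 2))
```

With these inputs the total is 133 variants, the per-gene average rounds to 3.6, and the exonic fraction is a little under 0.6, in line with the approximate 60% stated above.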

Characteristics of Variants.

Twenty-eight of the 127 amino acid substitutions (22%) occurred at Arg residues, i.e., Arg is the common allele. Twenty of the nucleotide substitutions in the Arg codon were at the G residue and 18 of the substitutions were G to A (CGX to CAX), resulting in replacement of an Arg with either His (10 variants) or Gln (8 variants), depending on the third position nucleotide. His and Gln are amino acid residues with properties that differ from Arg. Six substitutions occurred at Pro residues, a residue imparting significant constraints on protein structure.
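The dependence of the replacement residue on the third codon position follows from the standard genetic code and can be shown in a few lines. The function name is illustrative; the codon assignments are the standard ones.

```python
# Standard genetic code entries for the four CAN codons.
CAN_CODONS = {"CAT": "His", "CAC": "His", "CAA": "Gln", "CAG": "Gln"}

def arg_g_to_a(codon):
    """Residue produced by a G>A change at the second position of a CGN Arg codon."""
    if len(codon) != 3 or not codon.startswith("CG"):
        raise ValueError("expected a CGN arginine codon")
    return CAN_CODONS["CA" + codon[2]]

results = {codon: arg_g_to_a(codon) for codon in ("CGT", "CGC", "CGA", "CGG")}
print(results)
```

A pyrimidine in the third position therefore gives His and a purine gives Gln, which matches the split of the 18 G to A substitutions into the two classes reported above.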

The nature of the amino acid residues involved in the interchange can be defined by criteria reflective of the physical and chemical properties of the respective members of the amino acid pair (32). Using the groupings of amino acid residues by similar chemical and physical characteristics used by Smith and Smith (33), 101 of the 127 variants (80%) involve the interchange of amino acid residues with dissimilar physical or chemical properties.

The evolutionary conservation of the amino acid residues at the site of the substitution is another characteristic suggestive of the potential for an amino acid substitution to impact protein function. Ninety-five (75%) of the 127 amino acid substitution variants exist at amino acid residues at which the common allele in the human protein is identical to the amino acid residue in the mouse protein. Seventy-nine (62%) of the substitutions involve the exchange of amino acid residues with dissimilar physical or chemical properties at residues at which the common allele in humans encodes the same amino acid residue as observed in the mouse protein. This number is reduced to 56 (44%) when the properties of the exchanged amino acid residues are scored by the BLOSUM62 matrix (34). Over 50% of the 22 variants with allele frequencies of >0.05 involve the exchange of amino acid residues with dissimilar physical and/or chemical properties at residues at which the human and mouse proteins have the same amino acid residue, characteristics often associated with negative impact on protein function.
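The two criteria used above, dissimilar residue properties and identity of the common human residue with the mouse residue, can be combined in a simple filter. The sketch below uses an illustrative residue grouping that is not the Smith and Smith (33) classification and not the BLOSUM62 matrix. The grouping, the function names, and the mouse residues passed in the examples are assumptions made only to show the logic.

```python
# ILLUSTRATIVE grouping only; NOT the Smith and Smith (33) classes.
GROUPS = [
    {"Asp", "Glu"}, {"Lys", "Arg"}, {"His"}, {"Asn", "Gln"}, {"Ser", "Thr"},
    {"Ala", "Gly"}, {"Ile", "Leu", "Val", "Met"}, {"Phe", "Trp", "Tyr"},
    {"Cys"}, {"Pro"},
]

def dissimilar(common, variant):
    """True when the two residues do not share any property group."""
    return not any(common in g and variant in g for g in GROUPS)

def flag(common, variant, mouse):
    """True when the exchange is dissimilar AND the human common residue
    is identical to the residue in the mouse protein."""
    return dissimilar(common, variant) and common == mouse

# Residue pairs mentioned in the text; the mouse residues below are hypothetical.
print(dissimilar("Asp", "Glu"), dissimilar("Gly", "Ala"))
print(dissimilar("Arg", "His"), dissimilar("Arg", "Gln"))
print(flag("Arg", "His", mouse="Arg"), flag("Arg", "His", mouse="Lys"))
```

A different grouping or a matrix-based score will reclassify some exchanges, which is the effect described above when the count drops from 79 to 56 under BLOSUM62.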

Contributions of Individual Alleles to Total Variation and the Complexity of Individual Genotypes.

The relative contribution of variants existing at different allele frequencies to the total genetic variation among individuals is presented in Table 5. The seven variants existing at frequencies of >0.30 account for 41% of the total genetic variation among individuals for these genes, whereas the 116 variant alleles existing at individual frequencies of less than 10% account for 32% of the total variation in the population.
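One way to read the contribution of a frequency class to total variation is as the share of the summed variant allele frequencies that falls in that class. The text does not spell out how Table 5 was computed, so this definition is an assumption, and the frequencies below are hypothetical toy values rather than the study data. The function name and bin labels are illustrative.

```python
def contribution_by_class(freqs, bins):
    """Share of the summed variant allele frequency falling in each [low, high) bin."""
    total = sum(freqs)
    return {
        label: sum(f for f in freqs if low <= f < high) / total
        for label, (low, high) in bins.items()
    }

# HYPOTHETICAL frequencies, not the Table 5 data.
toy_freqs = [0.40, 0.35, 0.15, 0.04, 0.03, 0.02, 0.01]
bins = {"below 0.10": (0.0, 0.10), "0.10 to 0.30": (0.10, 0.30), "0.30 and up": (0.30, 1.01)}
shares = contribution_by_class(toy_freqs, bins)
print({label: round(share, 2) for label, share in shares.items()})
```

Even in this toy set the pattern described in the text appears: a few common variants dominate, while many rare variants together still contribute a noticeable share.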

Homozygous variant individuals were identified for 14 of the 17 variant alleles existing at variant allele frequencies of 0.10 or greater. Two of the variant alleles not observed in the homozygous state were at loci (XPC and RAD23B) at which only 46 individuals were screened for variation.
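The absence of homozygotes at loci screened in only 46 individuals is unsurprising if genotype proportions are roughly those expected under Hardy-Weinberg equilibrium. The sketch below assumes Hardy-Weinberg proportions and independent sampling, and it uses the threshold frequency of 0.10 rather than the actual frequencies of the XPC and RAD23B variants; the function names are illustrative.

```python
def expected_homozygotes(q, n_individuals):
    """Expected number of variant homozygotes under Hardy-Weinberg proportions."""
    return q * q * n_individuals

def prob_no_homozygote(q, n_individuals):
    """Probability that none of n independent individuals is a variant homozygote."""
    return (1 - q * q) ** n_individuals

# ASSUMPTION: variant allele frequency of 0.10 (the threshold used in the text).
q, n = 0.10, 46
print(round(expected_homozygotes(q, n), 2), round(prob_no_homozygote(q, n), 2))
```

With fewer than one homozygote expected, a sample of 46 individuals would more often than not contain none, so the observation does not by itself suggest selection against the homozygous state.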

Two or more different variant alleles were identified in 26 genes, and many individuals with two different variant alleles in a gene were observed. These individuals would express two different variant forms of a protein (if the substitutions were on different chromosomes) or a wild-type protein and a variant protein with two amino acid substitutions (if the substitutions were located on the same chromosome). For example, 14 of 36 individuals screened for variation in ERCC2 exhibited variation at more than one residue. One individual was homozygous variant at both of the highly polymorphic sites (312Asn/Asn and 751Gln/Gln) in ERCC2, seven individuals were heterozygous at one site and homozygous variant at the other and six were heterozygous at both sites. Among 128 individuals screened for variation in XRCC1, eight individuals were homozygous variant at residue 399 (399Gln/Gln), and three were homozygous variant at residue 194 (194Trp/Trp). No individuals were homozygous at one of these two sites in XRCC1 and also variant at the second site, which suggested that the variant alleles at XRCC1 194 and 399 were on different chromosomes. Twelve individuals were variant at one of these two sites and also at an additional site within the protein.

Complex genotypes were observed in compiling the variation for the genes of a repair pathway for an individual. The BER pathway, in which 49 different variants were identified in screening 12 of the 30 genes of this pathway, is illustrative. Only 3 of 90 individuals were homozygous wild-type at all 12 of the loci, whereas 5 individuals were identified with 5 variant alleles and 7 individuals had 6 variant alleles among the 12 BER genes screened thus far (Table 6). The average of 2.8 variant alleles per individual would extrapolate to ∼7 variants in the average individual, from a pool estimated to include ∼115 different variants, for the 30 genes of the BER pathway.
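The extrapolation to the full BER pathway is a simple proportional scaling. The sketch below assumes that the unscreened genes carry variation at the same per-gene rate as the 12 genes screened, which is the assumption implicit in the text; the variable names are illustrative.

```python
genes_screened = 12
genes_in_pathway = 30
mean_variant_alleles_screened = 2.8  # per individual, across the 12 genes

# ASSUMPTION: unscreened BER genes carry variation at the same per-gene rate.
per_gene_rate = mean_variant_alleles_screened / genes_screened
extrapolated = per_gene_rate * genes_in_pathway
print(round(extrapolated, 1))
```

The scaling is sensitive to which genes were screened first; if the best-characterized genes are also the most variable, the pathway-wide figure would be an overestimate.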

Seventy-five different combinations of alleles or pathway genotypes were observed in compiling the variation for these 12 BER genes. In addition to the three individuals who were wild-type at all of the 12 loci screened, five groups of either two or three individuals were heterozygous for the same single variant, whereas two pairs of individuals were heterozygous for the same 2 variant alleles. All of the 49 individuals with 3 or more variant alleles among these 12 genes had unique combinations of single nucleotide polymorphisms. Examples of the complexity of genotypes observed are presented for the 26 individuals with three variant alleles in Table 7. The data presented include only the 27 variant alleles existing in this specific subset of individuals.
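Counting distinct pathway genotypes amounts to reducing each individual to a canonical set of variant alleles and tallying identical sets. The sketch below shows one way to do this. The variant labels are taken from the text, but the individuals and their genotypes are hypothetical, and the function name is illustrative.

```python
from collections import Counter

def pathway_genotype(variant_copies):
    """Canonical, hashable genotype: sorted (variant, copies) pairs with copies > 0."""
    return tuple(sorted((v, n) for v, n in variant_copies.items() if n > 0))

# HYPOTHETICAL individuals; values are copies of each variant allele (1 or 2).
individuals = {
    "ind1": {},
    "ind2": {"XRCC1 399Gln": 1},
    "ind3": {"XRCC1 399Gln": 1},
    "ind4": {"XRCC1 399Gln": 2, "APEX 148Glu": 1},
    "ind5": {"XRCC1 194Trp": 1, "APEX 148Glu": 1, "XRCC1 399Gln": 1},
}

genotype_counts = Counter(pathway_genotype(g) for g in individuals.values())
n_distinct = len(genotype_counts)
n_unique = sum(1 for count in genotype_counts.values() if count == 1)
alleles_per_individual = {name: sum(g.values()) for name, g in individuals.items()}
print(n_distinct, n_unique, alleles_per_individual)
```

Treating a homozygous variant as two copies keeps the variant allele count per individual consistent with the tallies in Table 6, while still distinguishing it from a heterozygote when genotypes are compared.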

Similar complex pathway genotypes were observed for the other repair pathways, although the data sets were not as complete because not all of the genes screened in these pathways were resequenced in the same sample set. [The individual genotype data (except for the limited data for Group II individuals) are available at http://greengenes.llnl.gov/dpublic/secure/reseq/ or http://manuel.niehs.nih.gov/egsnp/home.htm.]

Understanding the molecular and biological basis for the reduced-DNA-repair-capacity phenotype and the associated elevation in individual cancer risk will require both that the genes involved in the repair of the different classes of DNA damage be known and that the common variants in these genes be cataloged. Thus, this study focused on well-characterized genes of the different DNA repair pathways and the associated DNA damage recognition and cell cycle checkpoint processes. The ultimate goal is the identification of the common variants segregating in the United States population for all of the genes with roles in the repair of damaged DNA. This catalogue will serve as a resource for future biochemical and molecular epidemiology studies. The strategy used here, screening a limited number of individuals selected to represent a larger population, will not provide the resources for the study of genetically distinct populations or subpopulations with specific variation and potentially unique risks, e.g., the Ashkenazi Jewish population (35). Molecular epidemiology of genetically isolated populations will require additional screening to ensure ascertainment of variants specific to these populations. Although the ethnic composition of the set of samples screened for variation in this study is known (27), the ethnicity of specific individuals is unknown; thus, the estimated allele frequencies do not address the issue of ethnic differences in allele distribution or frequency. Future molecular epidemiology studies that address questions regarding the role of these variants in disease susceptibility will undoubtedly obtain data regarding potential differences in allele frequencies among different ethnic groups or substructured populations.

Extensive variation, 3.6 different amino acid substitution and other protein sequence-altering variants per gene with an average variant allele frequency of 0.047, was identified in the resequencing of the 37 genes in this screening effort. This is consistent with previous, more limited or focused studies of variation in DNA repair genes in “non-cancer family” individuals (36, 37, 38). The common variants of ERCC2 (39), ERCC4 (40), RAD52 (41), XPC (42), MSH3 (43), and RAD54 (41) that were observed in this resequencing effort were identified in other studies also. The RAD51 gene appears to be highly invariant, because no amino acid substitution variants were identified in screening 92 individuals in this study or in 100 additional individuals (38) or in over 60 human tumor samples (44). Kato et al. (45) identified one amino acid substitution variant in RAD51 in 2 of 45 Japanese breast cancer patients, but this variant was not seen in genotyping 200 additional Japanese breast cancer and 100 colon cancer patients. As in this screen, no amino acid substitution variants were detected in screening another sample set for variants in FEN1 and PCNA (46). Although the present report focuses on nucleotide sequence variation that results in amino acid substitutions or that otherwise disrupts protein structure, extensive additional sequence variation that could potentially impact gene expression was identified in both exons and noncoding regions. The reported association of variation in the 5′-UTR of RAD51 with the risk of breast and ovarian cancer in BRCA1/2 mutation carriers is an example of the potential for variation in noncoding regions of genes to be important (47).

The average of 3.6 different variants per repair gene is slightly higher than the number of different variants observed in the systematic screening of other sets of candidate disease-susceptibility genes. For example, an average of 1.1–2.8 different amino acid substitution variants per gene were observed in sequencing 36 cardiovascular risk genes (48), 75 hypertension risk genes (49), or 106 common disease risk genes (50) in samples from multiethnic populations. In contrast to the extensive variation observed in screening multiethnic sample sets, only 0.4–0.5 different amino acid substitution variants per gene were identified in genes associated with the risk of ischemic heart disease (41 genes; Ref. 51) or rheumatoid arthritis (41 genes; Ref. 52) in the resequencing of DNA from 49 Japanese individuals. This lower level of variation in the Japanese sample could be related to a lower level of heterogeneity in that population. It is also consistent with the observation that the total number of different variants detected is lowest when screening individuals of only one ethnic origin and is highest when individuals of African origin are included in a multiethnic sample set (53, 54). Similar results were observed in the present study, in which two variants of APEX and five variants of XRCC1 that were not observed in the Caucasian samples of Group II were detected in screening the multiethnic Group I sample set. Thus, it is expected that the total number of different variants identified in these genes would have been even higher if the 11 repair genes that were screened for variation only in the Caucasian sample sets (Groups II and IV) had been screened in a multiethnic sample set.

The functional relevance of the extensive variation identified in DNA repair genes remains to be fully addressed, although initial data are becoming available. Seven variants of APEX identified in this study, in other reports (55, 56), or in GenBank DNA sequence databases have been characterized for activity in biochemical assays (57). Four of the variants retained only 10–60% of wild-type activity. Three of these reduced-activity variants involved the exchange of amino acid residues with dissimilar properties at sites that were identical in the human and mouse proteins. The fourth, which retained 60% of wild-type activity, was a replacement of Glu at residue 126 by Asp, residues with similar properties. Two of the variants retaining wild-type activity (Asp148Glu and Gly306Ala) were exchanges that would not be expected to disrupt structure by these limited criteria. The replacement of Gly at residue 241 by Arg (the mouse residue is Gly), residues with dissimilar characteristics, also retained wild-type activity. Using the characteristics of the amino acid residues, knowledge of evolutionary conservation, and the localization of the residues within the three-dimensional structure of the APEX protein, Hadi et al. (57) correctly predicted the impact of the substitutions for six of these seven variants. The exception was the Glu to Asp exchange at residue 126. The availability of the protein structure clearly enhances the potential to correctly predict the impact of a substitution on protein activity.

In other studies addressing functional relevance, several variants have been associated with altered DNA repair capacity or with the level of damage after an exposure. Spitz et al. (58) reported that variant alleles at amino acid residues 312 and 751 of ERCC2 were associated with a reduced capacity to repair damage induced by in vitro exposure of lymphocytes from individuals of a lung cancer cohort to benzo[a]pyrene diol epoxide. Homozygosity for a variant allele in either of two NER genes, XPC or ERCC2, was associated with a reduced capacity to repair UV-induced DNA damage, as assayed by the host-reactivation assay in lymphocytes from a cohort of healthy subjects (59). Hu et al. (31) reported that the APEX 148Glu allele was associated with prolonged mitotic delay in lymphocytes exposed to ionizing radiation. In addition, women with at least three variant alleles of APEX and XRCC1 were more likely to be sensitive to ionizing radiation. The XRCC1 399Gln variant has been associated with increased aflatoxin adducts (30) and with increased polyphenol adducts in the breast tissue of smokers (29). This variant has also been associated with increased levels of glycophorin A mutations in RBCs (30) and of sister chromatid exchanges in lymphocytes from smokers (29). The data suggest that the approximately 40% of the population with this variant allele of XRCC1 will have 20–50% more DNA damage than will individuals with the common allele after similar exposures.
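
The figure of approximately 40% of the population carrying the XRCC1 399Gln allele is consistent with the allele frequency of 0.24 listed in Table 2 if genotypes are assumed to be in Hardy-Weinberg proportions. That assumption is made here only for illustration and is not stated in the text; the function names are likewise hypothetical. A short sketch of the calculation:

```python
def genotype_fractions(q: float) -> tuple:
    """Expected (common/common, heterozygote, variant/variant) fractions
    for a variant allele frequency q, ASSUMING Hardy-Weinberg proportions."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    p = 1.0 - q
    return p * p, 2.0 * p * q, q * q


def carrier_fraction(q: float) -> float:
    """Fraction of individuals with at least one copy of the variant allele."""
    _, heterozygous, homozygous_variant = genotype_fractions(q)
    return heterozygous + homozygous_variant


# XRCC1 Arg399Gln allele frequency from Table 2.
xrcc1_399gln = 0.24
print(f"carriers: {carrier_fraction(xrcc1_399gln):.1%}")
```

Under this assumption, roughly 36% of individuals would be heterozygous and about 6% homozygous for the variant allele, which together account for the approximately 40% quoted above.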

A steadily increasing number of molecular epidemiology studies are reporting on the potential association of polymorphic DNA repair gene variants with cancer risk. Over 30 studies reporting on the cancer risk associated with one or more of the variants of APEX 148, ERCC2 312 and 751, XRCC1 194 and 399, or XRCC3 241 have been published. These reports can be accessed by using the appropriate gene symbol to search the National Center for Biotechnology Information Entrez search and retrieval system (footnote 9). These genotyping-risk-association studies can be summarized as follows: (a) the sample sizes are generally small, and the genotyping usually included only the variants in a single gene of a repair pathway and sometimes only a single variant in a gene; (b) most of these variant alleles have been reported to be associated with cancer risk in some studies; (c) other studies have reported an absence of association in cohorts with similar cancers or cohorts with other cancers; and (d) in some studies, the common or wild-type allele was associated with elevated risk.

The molecular epidemiology and biochemical characterization of individual variants are beginning to address the relationship of individual polymorphic variants to repair capacity and their relevance as cancer risk factors. The cataloguing of variants in the genes in pathways with roles in ameliorating the negative consequences of exposures is a step in initiating these studies. It is important that a majority of the variants in each gene in a pathway be identified and be used as reagents for molecular epidemiology and biochemical function studies, because it is expected that the impact of most of the variants will be individually small (60, 61). It is usually assumed that selection against the alleles associated with elevated risk of complex diseases is weak because they are expected to have only mildly deleterious impacts on function (62). As observed in the present study, the collective contribution of the low-frequency variants to the total variation among individuals is large, emphasizing the importance of using the full richness of this genetic variation across all relevant genes and pathways to estimate individual susceptibility. The development of approaches for relating complex genotypes, interacting with exposure (63), to disease risk (64) will be required. Ultimately, the refinement of estimates of cancer risk will require extensive genotyping in studies using large cohorts with well-documented exposure histories and disease status to account for the manifold and subtle effects of gene-gene and gene-gene-environment interactions in determining individual susceptibility.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1 Work performed under the auspices of the United States Department of Energy by the University of California, Lawrence Livermore National Laboratory (contract W-7405-ENG-48) and supported in part by Interagency Agreement Y1-ES-8054-05 from National Institute of Environmental Health Sciences and National Cancer Institute Grant 1 U-1 CA 83180-03.

3 The abbreviations used are: NER, nucleotide excision repair; DSB/RR, double-strand break/recombination repair; BER, base excision repair; MMR, mismatch repair; LLNL-IRB, Lawrence Livermore National Laboratory Institutional Review Board; UTR, untranslated region; UV, ultraviolet.

4 Internet address: http://www.gene.ucl.ac.uk/nomenclature/.

5 Description and documentation for Phred, Phrap, and Consed may be obtained at http://www.genome.washington.edu.

6 Internet address: http://greengenes.llnl.gov/dpublic/secure/reseq/.

7 Internet address: http://manuel.niehs.nih.gov/egsnp/home.htm.

8 Tong Xi, Johana Vázquez-Matías, and Harvey W. Mohrenweiser. Single nucleotide polymorphisms in the exonic and neighboring intronic and untranslated regions of 40 DNA repair and repair related genes in humans, manuscript in preparation.

9 Internet address: http://www.ncbi.nlm.nih.gov/Entrez/.

Table 1

Summary of genes screened for variation

Gene | Alias | Samples screened (a) | Genomic sequence | cDNA sequence
ADPRT | PARP |  | AC04143 | J03473
APEX | APE1 | I, II | M92444 | M80261
CDK4 |  | II | U37022 | M14505
CDKN2A | INK4; p16 | II | U12818–U12820 | NM_000077
ERCC1 |  | II | M63796 | M13194
ERCC2 | XPD | II | L47234 | X52221
ERCC3 | XPB | IV | AC027142 | M31899
ERCC4 | XPF | II | L76568 | NM_005236
ERCC5 | XPG | IV | D16305 | NM_000123
FANCG | XRCC9 |  | AC004472 | U70310
FEN1 |  | III | AC004770 | L37374
LIG1 |  |  | AC011466 | M36067
LIG3 |  | I, III | AC004223 | U40671, X84740
MLH1 | HNPCC | III | U40960TO–U40978 | U07418
MRE11A |  |  | AP000786 | NM_005590
MSH2 |  | III | U41206–U41221 | U03911
MSH3 |  | III | D61397–D61419 | U61981
MSH6 | GTBP | III | AC006509 | NM_000179
NBS1 |  |  | AF069291.1 | NM_002485
NTHL1 | NTH1 |  | AC005600 | U79718
NUDT1 | MTH1 |  | D38591 | D16581
PCNA |  | II | J04718 | M15796
POLB |  | I, II | AH00541, and AF170802 | NM_002690
POLD1 |  |  | AC020909, and CITB-E1_2545M3 | M80397
POLD2 |  |  | AC006454 | NM_006230
RAD23A | HHR23A | II | AD0000092 | NM_005053
RAD23B | HHR23B | IV | AL137852 | D21090
RAD50 |  | III | AC004042A | U63139
RAD51 |  |  | AF165088–AF165094 | D14134
RAD52 |  |  | AC004803 | U27516
RAD54L |  |  | AL360086 | NM_003579
XPA |  | II | AL442130, U16815 | NM_000380
XPC |  | IV | AH009651 | D21089
XRCC1 |  | I, II | L34079 | M36089
XRCC2 |  | III | AC003109 | Y08837
XRCC3 |  | II | AF037222 | NM_005432
XRCC4 |  |  | AC034211, AC022416 | NM_003401
(a) Sample sets are described in “Materials and Methods.”

Table 2

Amino acid substitution variants identified in DNA repair and repair-related genes

The common nucleotide followed by the variant nucleotide is enclosed in parentheses, and the codon for the amino acid residue is underlined and bold. The nine amino acid substitutions reported previously (25) are indicated with an asterisk.

Gene name | Exon | Codon | Common residue | Variant residue | Allele frequency | Mouse residue | cDNA sequence 5′→3′
BER
ADPRT |  | 188 | Ala | Thr | 0.006 | Ser | TCCTT(G/A)CTACA
ADPRT |  | 334 | Val | Ile | 0.011 | Val | AGTGG(G/A)TAACC
ADPRT |  | 383 | Ser | Tyr | 0.014 | Ser | CTCCT(C/A)TGCTT
ADPRT | 17 | 761 | Val | Ala | 0.18 | Val | CAAGG(T/C)GGAAA
ADPRT | 21 | 940 | Lys | Arg | 0.011 | Lys | CAGCA(A/G)GTTAC
APE1 |  | 51 | Gln | His | 0.03 | Gln | GATCA(G/C)AAAAC
APE1 |  | 64 | Ile | Val | 0.01 | Ile | TCAAG(A/G)TCTGC
APE1 |  | 148 | Asp | Glu | 0.33 | Glu | GGCGA(T/G)GAGGA
APE1 |  | 241 | Gly | Arg | 0.01 | Gly | GCTTC(G/A)GGGAA
FEN1 | No variants
LIG1 |  | 24 | Ala | Val | 0.01 | Thr | GGAGG(C/T)ATCCA
LIG1 |  | 62 | Arg | Trp | 0.01 | Gln | CGGCC(C/T)GGGTC
LIG1 |  | 249 | Gly | Glu | 0.01 | Gly | GCCAG(G/A)GGCTC
LIG1 | 10 | 267 | Asn | Ser | 0.02 | Asn | TTACA(A/G)TCCTG
LIG1 | 13 | 369 | Val | Ile | 0.01 | Ile | AGTCC(G/A)TCCGG
LIG1 | 13 | 409 | Arg | His | 0.01 | Cys | GTTCC(G/A)CGACA
LIG1 | 16 | 480 | Met | Val | 0.01 | Val | CAGCC(A/G)TGGTG
LIG1 | 20 | 614 | Thr | Ile | 0.01 | Thr | GGTCA(C/T)ATCCT
LIG1 | 22 | 673 | Glu | Asp | 0.01 | Gln | CGTGA(G/T)CCCCT
LIG1 | 22 | 677 | Arg | Leu | 0.01 | Arg | TTCCC(G/T)GCGCC
LIG3 | 18 | 780 | Arg | His | 0.03 | Cys | GTCCC(G/A)CAAGG
LIG3 | 19 | 811 | Lys | Thr | 0.01 | Lys | TGCAA(A/C)GCCTT
LIG3 | 21 | 899 | Pro | Ser | 0.01 | Thr | AGAAC(C/T)CTGCG
NTHL1 |  | 21 | Arg | Trp | 0.006 | Arg | GGAGC(C/T)GGAGC
NTHL1 |  | 31 | Gly | Val | 0.006 | Gly | GCGGG(G/T)GTGTA
NTHL1 |  | 33 | Arg | Lys | 0.006 | Arg | GTGTA(G/A)GGAGG
NTHL1 |  | 176 | Ile | Thr | 0.005 | Ile | GCTCA(T/C)CTACC
NTHL1 |  | 234 | Ser | Leu | 0.006 | Ser | TGTGT(C/T)AGGCA
NUDT1 |  | 83 | Val | Met | 0.006 | Val | TGGAC(G/A)TGCAT
NUDT1 |  | 135 | Gly | Trp | 0.006 | Gly | TCCAC(G/T)GGTAC
PCNA | No variants
POLB |  |  | Gln | Arg | 0.01 | Gln | GCCGC(A/G)GGAGA
POLB |  | 137 | Arg | Gln | 0.006 | Arg | TCAGC(G/A)AATTG
POLB | 12 | 242 | Pro | Arg | 0.005 | Pro | GCTTC(C/G)CAGTA
POLD1 |  | 19 | Arg | His | 0.12 | Arg | GGCCC(G/A)TGGGG
POLD1 |  | 30 | Arg | Trp | 0.006 | Ser | CACCT(C/T)GGCCA
POLD1 |  | 119 | Arg | His | 0.15 | Arg | ATCCC(G/A)CGGCT
POLD1 |  | 173 | Ser | Asn | 0.05 | Ser | CATCA(G/A)CCGGG
POLD1 |  | 177 | Arg | His | 0.003 | Arg | CAGTC(G/A)CGGGG
POLD1 | 19 | 849 | Arg | His | 0.011 | Arg | ACTGC(G/A)CCGCC
POLD1 | 26 | 1086 | Arg | Gln | 0.01 | Arg | GGTGC(G/A)GAAGG
POLD2 |  | 303 | Asn | Ser | 0.005 | Asn | CACCA(A/G)TTACA
XRCC1 |  | 72 | Val | Ala | 0.03 | Val | GCTGG(T/C)GGGCA
XRCC1 |  | 161 | Pro | Leu | 0.005 | Thr | GGCCC(C/T)GTCCC
XRCC1 |  | 173 | Phe | Leu | 0.005 | Phe | CAGTT(C/G)CGTGT
XRCC1 |  | 194* | Arg | Trp | 0.13 | Arg | TCAGC(C/T)GGATC
XRCC1 |  | 280* | Arg | His | 0.03 | Arg | AACTC(G/A)TACCC
XRCC1 |  | 309 | Pro | Ser | 0.01 | Ala | GACGA(C/T)CCCGA
XRCC1 | 10 | 399* | Arg | Gln | 0.24 | Arg | CTCCC(G/A)GAGGT
XRCC1 | 15 | 560 | Arg | Trp | 0.01 | Arg | AGCGG(C/T)GGAAA
XRCC1 | 16 | 576 | Tyr | Ser | 0.01 | Tyr | GGACT(A/C)TATGA
NER
ERCC1 | No variants
ERCC2 |  | 199* | Ile | Met | 0.01 | Ile | TCAAT(C/G)CTGCA
ERCC2 |  | 201* | His | Tyr | 0.01 | His | TCCTG(C/T)ATGCC
ERCC2 | 10 | 312* | Asp | Asn | 0.40 | Asp | TGCCC(G/A)ACGAA
ERCC2 | 20 | 616 | Arg | Pro | 0.01 | Arg | CGGGC(G/C)GGCCG
ERCC2 | 23 | 751* | Lys | Gln | 0.32 | Gln | CGCTG(A/C)AGAGG
ERCC3 | No variants
ERCC4 |  | 379* | Pro | Ser | 0.03 | Pro | GCAAC(C/T)CAAAG
ERCC4 |  | 415 | Arg | Gln | 0.06 | Arg | TGACC(G/A)AACAT
ERCC5 |  | 141 | Asn | Asp | 0.02 | Asp | GAGAA(A/G)ACGAC
ERCC5 |  | 254 | Met | Val | 0.012 | Met | AGGAA(A/G)TGAAT
ERCC5 |  | 529 | Cys | Ser | 0.03 | Arg | AACTT(G/C)TACAA
ERCC5 |  | 597 | Val | Leu | 0.011 | Met | AGGCA(G/C)TAGAT
ERCC5 | 10 | 761 | Ala | Thr | 0.011 | Ala | GGATC(G/A)CTGCT
ERCC5 | 15 | 1090 | Glu | Asp | 0.011 | Asp | GGGGA(G/C)ACCTG
ERCC5 | 15 | 1104 | Asp | His | 0.18 | Asp | GTGAA(G/C)ATGCT
LIG1 | See above
PCNA | See above
POLD1 | See above
POLD2 | See above
RAD23A |  | 200 | Thr | Met | 0.03 | Thr | GCTCA(C/T)GGGAA
RAD23B |  | 249 | Ala | Val | 0.10 | Ala | TGGGG(C/T)TCCTC
XPA |  | 256 | Met | Ile | 0.01 | Met | GACAT(G/C)TACCG
XPC |  | 16 | Leu | Val | 0.04 | none | GCGAA(C/G)TGCGC
XPC |  | 48 | Leu | Phe | 0.04 | Ser | GCCTT(C/T)TCTCC
XPC |  | 492 | Arg | His | 0.04 | Arg | CCATC(G/A)TAAGG
XPC |  | 499 | Ala | Val | 0.24 | Ala | GCCAG(C/T)GGCAT
XPC | 15 | 939 | Lys | Gln | 0.38 | Lys | TTGAG(A/C)AGCTG
DSB/RR
LIG3 | See above
NBS1 |  | 142 | Asn | Ser | 0.006 | Asn | AAACA(A/G)TTGGA
NBS1 |  | 185 | Gln | Glu | 0.34 | Glu | CAGTT(G/C)AGTCC
NBS1 |  | 196 | Phe | Val | 0.005 | Phe | AAAGT(T/G)TTTAC
NBS1 |  | 216 | Gln | Lys | 0.005 | His | GACGG(C/A)AGGAA
NBS1 | 14 | 716 | Asn | Asp | 0.013 | Asn | GAAAG(A/G)ATACA
RAD50 |  | 191 | Thr | Ile | 0.01 | Thr | AGAAA(C/T)ACTTC
RAD50 | 16 | 884 | Arg | His | 0.024 | Arg | ACGTC(G/A)TCAGC
RAD50 | 24 | 1239 | Arg | Gln | 0.01 | Arg | TGACC(G/A)AGAAA
RAD51 | No variants
XRCC2 |  | 16 | Ala | Ser | 0.01 | Ala | TCCTT(G/T)CCCGA
XRCC2 |  | 188 | Arg | His | 0.05 | Arg | CTATC(G/A)CCTGG
XRCC3 |  | 241* | Thr | Met | 0.43 | Thr | GGCCA(C/T)GCTGC
XRCC4 |  | 12 | Ser | Cys | 0.01 | Ser | TGTTT(C/G)TGAAC
XRCC4 |  | 75 | Leu | Ser | 0.006 | Val | ATTGT(T/C)GTCAG
XRCC4 |  | 134 | Ile | Thr | 0.016 | Ile | CACCA(T/C)TGCAG
XRCC4 |  | 137 | Asn | Cys | 0.005 | Lys | GAAAA(T/G)CAAGC
XRCC4 |  | 247 | Ala | Ser | 0.08 | Ala | GGTTG(G/T)CTTCA
Damage recognition and cell cycle checkpoints
CDK4 |  | 20 | Val | Leu | 0.03 | Val | GGACA(G/T)TGTAC
CDKN2A |  | 148 | Ala | Thr | 0.05 | Leu | ATGCC(G/A)CGGAA
FANCG |  | 297 | Thr | Ile | 0.01 | Ala | CACAA(C/T)AGCAG
FANCG |  | 330 | Pro | Ser | 0.01 | Pro | TACTG(C/T)CACCA
FANCG |  | 378 | Ser | Leu | 0.02 | Ser | TAGCT(C/T)GGAGC
FANCG | 10 | 464 | Val | Phe | 0.01 | Ser | CCTGG(G/T)TTCAA
RAD52 |  | 70 | Arg | Trp | 0.005 | Arg | GTCAT(C/T)GGGTA
RAD52 |  | 221 | Gln | Glu | 0.011 | Glu | TGCAG(C/G)AGGTG
RAD52 |  | 287 | Ser | Asn | 0.05 | His | GAAGA(G/A)TGAGG
RAD54L |  | 74 | Ile | Met | 0.006 | Ile | TTTAT(T/G)CGAAG
RAD54L |  | 202 | Arg | Cys | 0.006 | Arg | TTTTA(C/T)GCCAG
RAD54L | 10 | 380 | Arg | Gln | 0.011 | Arg | GGAGC(G/A)GCTGC
RAD54L | 16 | 583 | Ile | Thr | 0.01 | Ile | TCTCA(T/C)TGGGG
MLH1 |  | 213 | Val | Met | 0.04 | Val | CAACC(G/A)TGGGAC
MLH1 |  | 219 | Ile | Val | 0.12 | Ile | GCTCC(A/G)TCTTT
MLH1 | 11 | 325 | Arg | Gln | 0.01 | Arg | GGAGC(G/A)GGTGC
MLH1 | 11 | 326 | Val | Ala | 0.01 | Val | GCGGG(T/C)GCAGC
MLH1 | 12 | 452 | Thr | Ser | 0.01 | Leu | ATACA(A/T)CAAAG
MLH1 | 16 | 618 | Lys | Ala | 0.01 | Lys | AGAAG(A/G)&(A/C)GGCTG
MLH1 | 16 | 618 | Lys | Thr or Glu | 0.01 | Lys | AGAAG(A/G)or(A/C)GGCTG
MLH1 | 19 | 718 | His | Tyr | 0.01 | His | TGGAA(C/T)ACATTG
MSH2 |  | 127 | Asn | Ser | 0.006 | Asn | TGGCA(A/G)TCTCT
MSH2 |  | 170 | Gln | Glu | 0.006 | Gln | CCATA(C/G)AGAGG
MSH2 |  | 319 | Asp | Val | 0.009 | Asp | TGAAG(A/T)TACCA
MSH2 |  | 322 | Gly | Asp | 0.011 | Gly | CACTG(G/A)CTCTC
MSH2 |  | 390 | Leu | Phe | 0.005 | Leu | ACCGA(C/T)TTGCC
MMR
MSH2 | 13 | 735 | Ile | Val | 0.006 | Ile | CTTCT(A/G)TCCTC
MSH3 |  | 429 | Ala | Val | 0.04 | Met | AGAGG(C/T)GCTCA
MSH3 |  | 456 | Tyr | Cys | 0.006 | Tyr | TGAAT(A/G)CAGCC
MSH3 | 10 | 514 | Glu | Lys | 0.05 | Glu | AACCT(G/A)AGAAT
MSH3 | 13 | 597 | Ser | Asn | 0.014 | Ser | ATCTA(G/A)TGTGT
MSH3 | 15 | 700 | Phe | Leu | 0.005 | Phe | CTGAC(T/C)TCCCT
MSH3 | 21 | 931 | Gly | Cys | 0.006 | Gly | GGATG(G/T)GTGCT
MSH3 | 21 | 940 | Arg | Gln | 0.10 | Arg | AGGAC(G/A)GAGTA
MSH3 | 23 | 1036 | Thr | Ala | 0.30 | Asp | CAGGC(A/G)CAGCA
MSH6 |  | 25 | Ala | Val | 0.003 | Ala | CTCGG(C/T)CAGGG
MSH6 |  | 39 | Gly | Glu | 0.24 | Gly | CCCCG(G/A)GGCCT
MSH6 |  | 396 | Leu | Val | 0.009 | Leu | CTACA(C/G)TCTAT
MSH6 |  | 878 | Val | Ala | 0.016 | Val | AGAAG(T/C)TGCTG
MSH6 |  | 1152 | Val | Ile | 0.009 | Val | TAGCT(G/A)TAATG
Table 3

Summary of non-amino acid substitution variants disrupting protein structure

Gene name | Exon | Codon | Common residue | Variant residue | Allele frequency | Mouse residue | cDNA sequence 5′→3′
RAD52 | 10 | 346 | Ser | Stop | 0.033 | Leu | GCCCT(C/A)GTCTA
RAD52 | 11 | 415 | Tyr | Stop | 0.041 | Leu | AAATA(T/G)GATCC
RAD50 |  | 363 | QEHI (a) | QE(Q)HI | 0.005 | QEHI | TCAAG(AAC/AACAAC)ATATC
RAD50 | 15 | 826 | Gln | Stop | 0.005 | Gln | CTGTC(C/T)AACAA
MRE11A |  | 333 | Gln | Stop | 0.009 | Gln | CCATA(C/T)AAAGC
MSH6 | 16 | 1357 | TLIKEL(STOP) (b) | TLID(STOP) | 0.020 | ALINGL(STOP) | GACT(TTGA/TTGATTGA)TTAAG
(a) The wild-type human sequence is Gln-Glu-His-Ile, the variant sequence is Gln-Glu-Gln-His-Ile, and the mouse sequence is Gln-Glu-His-Ile.

(b) The human wild-type sequence is Thr-Leu-Ile-Lys-Glu-Leu, the variant sequence is Thr-Leu-Ile-Asp, and the mouse sequence is Ala-Leu-Ile-Asn-Gly-Leu.

Table 4

Summary of variants identified in genes in different DNA repair pathways

Pathway | No. of genes | No. of variants | Variants/gene (range) | Allele frequency (average)
BER | 12 | 49 | 0–10 | 0.033
NER | 13 | 40 | 0–9 | 0.061
DSB/RR | 8 | 22 | 0–5 | 0.054
MMR | 4 | 28 | 6–8 | 0.038
Damage recognition and cell cycle checkpoint | 5 | 15 | 1–5 | 0.02
Total (a) | 37 | 133 |  | 
Average |  |  | 3.6 | 0.047
(a) LIG1, POLD1, POLD2, and PCNA have functions in both the BER and the NER pathways. LIG3 functions in both BER and DSB/RR. These genes are counted in both pathways in the Table. The total is the number of different genes resequenced and the number of unique variants identified. The count includes both the protein truncation and the amino acid substitution variants.
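
The reconciliation described in this footnote can be checked with a few lines of arithmetic. The sketch below uses the per-pathway variant counts from Table 4 and per-gene counts for the five shared genes tallied from the rows of Table 2; the variable names are illustrative only.

```python
# Variants per pathway, as listed in Table 4 (shared genes counted twice).
pathway_variants = {
    "BER": 49,
    "NER": 40,
    "DSB/RR": 22,
    "MMR": 28,
    "Damage recognition and cell cycle checkpoint": 15,
}

# Variants in genes assigned to two pathways, tallied from Table 2.
# LIG1, POLD1, POLD2, and PCNA appear under BER and NER; LIG3 under BER and DSB/RR.
shared_gene_variants = {"LIG1": 10, "POLD1": 7, "POLD2": 1, "PCNA": 0, "LIG3": 3}

summed_over_pathways = sum(pathway_variants.values())
double_counted = sum(shared_gene_variants.values())

# Each shared gene belongs to exactly two pathways, so subtract its count once.
unique_variants = summed_over_pathways - double_counted
print(summed_over_pathways, double_counted, unique_variants)
```

The pathway counts sum to 154, of which 21 are double counted, leaving the 133 unique variants given in the Total row.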

Table 5

Contribution of alleles of different frequency to total variation

Allele frequency | Variant alleles (n) | Total variation (%) | Cumulative variation (%)
>0.40 |  | 14 | 14
0.30–0.399 |  | 27 | 41
0.20–0.299 |  | 11 | 52
0.10–0.199 |  | 16 | 68
0.05–0.099 |  | 8 | 76
0.02–0.049 | 22 | 11 | 88
<0.02 | 86 | 12 | 100
Table 6

Distribution of variant alleles per individual identified in screening 12 genes of the BER pathway in 90 individuals

No. of variant alleles/individual | No. of individuals
 | 19
 | 19
 | 26
 | 11
Table 7

Examples of genotypes observed in individuals with four variant alleles among 12 genes of the BER pathway

“1” designates the wild-type allele and “2” is the variant allele; heterozygote individuals are shaded light gray with a diagonal line, and homozygous variant individuals are shaded dark gray and boxed. The top line is the wild-type amino acid, line 2 is the variant residue, and the third line indicates the position of the substitution. Additional details regarding the substitutions are in Table 2.

We gratefully acknowledge the assistance of Suzanne Duarte, Arlene Gonzales, and Karolyn Burkhart-Schultz in the sequencing and of Linda Ott, Mimi Yeh, and Tom Slezak in the informatics support. We appreciate the many discussions with Dr. David Wilson, III, and the assistance of Dr. Gloria Petersen in supplying the samples from Johns Hopkins University and of Dr. James Selkirk and the Environmental Genome Program of the National Institute of Environmental Health Sciences.

1. Cleaver J. E., Karplus K., Kashani-Sabet M., Limoli C. L. Nucleotide excision repair, “a legacy of creativity.” Mutat. Res., 485: 23-36, 2001.
2. Batty D. P., Wood R. D. Damage recognition in nucleotide excision repair of DNA. Gene (Amst.), 241: 193-204, 2000.
3. Thompson L. H., Schild D. Homologous recombinational repair of DNA ensures mammalian chromosome stability. Mutat. Res., 477: 131-153, 2001.
4. Pfeiffer P., Goedecke W., Obe G. Mechanisms of DNA double-strand break repair and their potential to induce chromosomal aberrations. Mutagenesis, 15: 289-302, 2000.
5. Khanna K. K., Jackson S. P. DNA double-strand breaks: signaling, repair and the cancer connection. Nat. Genet., 27: 247-254, 2001.
6. Lindahl T. Suppression of spontaneous mutagenesis in human cells by DNA base excision-repair. Mutat. Res., 462: 129-135, 2000.
7. Wilson S. H. Mammalian base excision repair and DNA polymerase beta. Mutat. Res., 407: 203-215, 1998.
8. Demple B., Harrison L. Repair of oxidative damage to DNA: enzymology and biology. Annu. Rev. Biochem., 63: 915-948, 1994.
9. Kolodner R. D., Marsischky G. T. Eukaryotic DNA mismatch repair. Curr. Opin. Genet. Dev., 9: 89-96, 1999.
10. Hsieh P. Molecular mechanisms of DNA mismatch repair. Mutat. Res., 486: 71-87, 2001.
11. Wood R. D., Mitchell M., Sgouros J., Lindahl T. Human DNA repair genes. Science (Wash. DC), 291: 1284-1289, 2001.
12. Romen A., Glickman B. W. Human DNA repair genes. Environ. Mol. Mutagen., 37: 241-283, 2001.
13. Orr-Weaver T. L., Weinberg R. A. A checkpoint on the road to cancer. Nature (Lond.), 392: 223-224, 1998.
14. Weinert T. DNA damage checkpoints update: getting molecular. Curr. Opin. Genet. Dev., 8: 185-193, 1998.
15. Cleaver J. E. Xeroderma pigmentosum: the first of the cellular caretakers. Trends Biochem. Sci., 26: 398-401, 2001.
16. Hoeijmakers J. H. Genome maintenance mechanisms for preventing cancer. Nature (Lond.), 411: 366-374, 2001.
17. Ishikawa T., Ide F., Qin X., Zhang S., Takahashi Y., Sekiguchi M., Tanaka K., Nakatsuru Y. Importance of DNA repair in carcinogenesis: evidence from transgenic and gene targeting studies. Mutat. Res., 477: 41-49, 2001.
18. Grossman L., Matanoski G., Farmer E., Hedayati M., Ray S., Trock B., Hanfelt J., Roush G., Berwick, Hu J. J. DNA repair as a susceptibility factor in chronic diseases in human populations. Dizdaroglu M., Karakaya A. E. eds. Advances in DNA Damage and Repair, 149-167, Kluwer Academic/Plenum Publishers, New York, 1999.
19. Wu X., Gu J., Amos C. I., Jiang H., Hong W. K., Spitz M. R. A parallel study of in vitro sensitivity to benzo[a]pyrene diol epoxide and bleomycin in lung cancer cases and controls. Cancer (Phila.), 83: 1118-1127, 1998.
20. Wu X., Spitz M. R., de Andrade M., Benowitz N. L., Swan G. E. Genetic influence on mutagen sensitivity: a twin study. Proc. Am. Assoc. Cancer Res., 41: 437, 2000.
21. Cloos J., Nieuwenhuis E. J. C., Boomsma D. I., Kuik D. J., van der Sterre M. L. T., Arwert F., Snow G. B., Braakhuis B. J. M. Inherited susceptibility to bleomycin-induced chromatid breaks in cultured peripheral blood lymphocytes. J. Natl. Cancer Inst. (Bethesda), 91: 1125-1130, 1999.
22. Roberts S. A., Spreadborough A. R., Bulman B., Barber J. B., Evans D. G., Scott D. Heritability of cellular radiosensitivity: a marker of low-penetrance predisposition genes in breast cancer. Am. J. Hum. Genet., 65: 784-794, 1999.
23. Berwick M., Vineis P. Markers of DNA repair and susceptibility to cancer in humans: an epidemiologic review. J. Natl. Cancer Inst. (Bethesda), 92: 874-897, 2000.
24. Wu X., Gu J., Patt Y., Hassan M., Spitz M. R., Beasley R. P., Hwang L. Y. Mutagen sensitivity as a susceptibility marker for human hepatocellular carcinoma. Cancer Epidemiol. Biomark. Prev., 7: 567-570, 1998.
25. Shen M.-J., Jones I., Mohrenweiser H. W. Non-conservative amino acid substitutions exist at polymorphic frequency in DNA repair genes. Cancer Res., 58: 604-608, 1998.
26. Nickerson D. A., Tobe V. O., Taylor S. L. PolyPhred: automating the detection and genotyping of single nucleotide substitutions using fluorescence-based resequencing. Nucleic Acids Res., 25: 2745-2751, 1997.
27. Collins F. S., Brooks L. D., Chakravarti A. A DNA polymorphism discovery resource for research on human genetic variation. Genome Res., 8: 1229-1231, 1998.
28. Fritsche E., Pittman G. S., Bell D. A. Localization, sequence analysis and ethnic distribution of a 96-bp insertion in the promotor of the human CYP2E1 gene. Mutat. Res., 432: 1-5, 2000.
29. Duell E. J., Wiencke J. K., Cheng T. J., Varkonyi A., Zuo Z. F., Ashok T. D., Mark E. J., Wain J. C., Christiani D. C., Kelsey K. T. Polymorphisms in the DNA repair genes XRCC1 and ERCC2 and biomarkers of DNA damage in human blood mononuclear cells. Carcinogenesis (Lond.), 21: 965-971, 2000.
30. Lunn R. M., Langlois R. G., Hsieh L. L., Thompson C. L., Bell D. A. XRCC1 polymorphisms: effects on aflatoxin B1 DNA adducts and glycophorin A variant frequency. Cancer Res., 59: 2557-2561, 1999.
31
Hu J. J., Smith T. R., Miller M. S., Mohrenweiser H. W., Golden A., Case D. Amino acid variants of APE1 and XRCC1 genes associated with ionizing radiation sensitivity.
Carcinogenesis (Lond.)
,
22
:
917
-922,  
2001
.
32
Risler J. L., Delorme M. O., Delacroix H., Henaut A. Amino acid substitutions in structurally related proteins: a pattern recognition approach. J. Mol. Biol., 204: 1019-1029, 1988.
33
Smith R. A., Smith T. F. Automatic generation of primary sequence patterns from a set of related protein sequences. Proc. Natl. Acad. Sci. USA, 87: 118-122, 1990.
34
Henikoff S., Henikoff J. G. Amino acid substitution matrices. Adv. Protein Chem., 54: 73-97, 2000.
35
Struewing J. P., Hartge P., Wacholder S., Baker S. M., Berlin M., McAdams M., Timmerman M. M., Brody L. C., Tucker M. A. The risk of cancer associated with specific mutations of BRCA1 and BRCA2 among Ashkenazi Jews. N. Engl. J. Med., 336: 1401-1408, 1997.
36
Ford B. N., Ruttan C. C., Kyle V. L., Brackley M. E., Glickman B. W. Identification of single nucleotide polymorphisms in human DNA repair genes. Carcinogenesis (Lond.), 21: 1977-1981, 2000.
37
Butkiewicz D., Rusin M., Harris C. C., Chorazy M. Identification of four single nucleotide polymorphisms in DNA repair genes: XPA and XPB (ERCC3) in Polish population. Hum. Mutat., 15: 577-578, 2000.
38
Thorstenson Y. R., Shen P., Tusher V. G., Wayne T. L., Davis R. W., Chu G., Oefner P. J. Global analysis of ATM polymorphism reveals significant functional constraint. Am. J. Hum. Genet., 69: 396-412, 2001.
39
Broughton B. C., Steingrimsdottir H., Lehmann A. R. Five polymorphisms in the coding sequence of the xeroderma pigmentosum group D gene. Mutat. Res., 362: 209-211, 1996.
40
Fan F., Liu C., Tavare S., Arnheim N. Polymorphisms in the human DNA repair gene XPF. Mutat. Res., 406: 115-120, 1999.
41
Bell D. W., Wahrer D. C., Kang D. H., MacMahon M. S., FitzGerald M. G., Ishioka C., Isselbacher K. J., Krainer M., Haber D. A. Common nonsense mutations in RAD52. Cancer Res., 59: 3883-3888, 1999.
42
Khan S. G., Metter E. J., Tarone R. E., Bohr V. A., Grossman L., Hedayati M., Bale S. J., Emmert S., Kraemer K. H. A new xeroderma pigmentosum Group C poly(AT) insertion/deletion polymorphism. Carcinogenesis (Lond.), 21: 1821-1825, 2000.
43
Orimo H., Nakajima E., Yamamoto M., Ikejima M., Emi M., Shimada T. Association between single nucleotide polymorphisms in the hMSH3 gene and sporadic colon cancer with microsatellite instability. J. Hum. Genet., 45: 228-230, 2000.
44
Ma X., Qianren J., Forsti A., Hemminki K., Kumar R. Single nucleotide polymorphism analyses of the human proliferating cell nuclear antigen (PCNA) and flap endonuclease (FEN1) genes. Int. J. Cancer, 88: 938-942, 2000.
45
Schmutte C., Tombline G., Rhiem K., Sadoff M. M., Schmutzler R., von Deimling A., Fishel R. Characterization of the human RAD51 genomic locus and examination of tumors with 15q14-15 loss of heterozygosity (LOH). Cancer Res., 59: 4564-4569, 1999.
46
Kato M., Yano K., Matsuo F., Saito H., Katagiri T., Kurumizaka H., Yoshimoto M., Kasumi F., Akiyama F., Sakamoto G., Nagawa H., Nakamura Y., Miki Y. Identification of RAD51 alteration in patients with bilateral breast cancer. J. Hum. Genet., 45: 133-137, 2000.
47
Wang W. W., Spurdle A. B., Kolachana P., Bove B., Modan B., Ebbers S. M., Suthers G., Tucker M. A., Kaufman D. J., Doody M. M., Tarone R. E., Daly M., Levavi H., Pierce H., Chetrit A., Yechezkel G. H., Chenevix-Trench G., Offit K., Godwin A. K., Struewing J. P. A single nucleotide polymorphism in the 5′ untranslated region of RAD51 and risk of cancer among BRCA1/2 mutation carriers. Cancer Epidemiol. Biomark. Prev., 10: 955-960, 2001.
48
Cambien F., Poirier O., Nicaud V., Herrmann S. M., Mallet C., Ricard S., Behague I., Hallet V., Blanc H., Loukaci V., Thillet J., Evans A., Ruidavets J. B., Arveiler D., Luc G., Tiret L. Sequence diversity in 36 candidate genes for cardiovascular disorders. Am. J. Hum. Genet., 65: 183-191, 1999.
49
Halushka M. K., Fan J. B., Bentley K., Hsie L., Shen N., Weder A., Cooper R., Lipshutz R., Chakravarti A. Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat. Genet., 22: 239-247, 1999.
50
Cargill M., Altshuler D., Ireland J., Sklar P., Ardlie K., Patil N., Shaw N., Lane C. R., Lim E. P., Kalyanaraman N., Nemesh J., Ziaugra L., Friedland L., Rolfe A., Warrington J., Lipshutz R., Daley G. Q., Lander E. S. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet., 22: 231-238, 1999.
51
Ohnishi Y., Tanaka T., Yamada R., Suematsu K., Minami M., Fujii K., Hoki N., Kodama K., Nagata S., Hayashi T., Kinoshita N., Sato H., Kuzuya T., Takeda H., Hori M., Nakamura Y. Identification of 187 single nucleotide polymorphisms (SNPs) among 41 candidate genes for ischemic heart disease in the Japanese population. Hum. Genet., 106: 288-292, 2000.
52
Yamada R., Tanaka T., Ohnishi Y., Suematsu K., Minami M., Seki T., Yukioka M., Maeda A., Murata N., Saiki O., Teshima R., Kudo O., Ishikawa K., Ueyosi A., Tateishi H., Inaba M., Goto H., Nishizawa Y., Tohma S., Ochi T., Yamamoto K., Nakamura Y. Identification of 142 single nucleotide polymorphisms in 41 candidate genes for rheumatoid arthritis in the Japanese population. Hum. Genet., 106: 293-297, 2000.
53
Nickerson D. A., Taylor S. L., Weiss K. M., Clark A. G., Hutchinson R. G., Stengard J., Salomaa V., Vartiainen E., Boerwinkle E., Sing C. F. DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene. Nat. Genet., 19: 233-240, 1998.
54
Rieder M. J., Taylor S. L., Clark A. G., Nickerson D. A. Sequence variation in the human angiotensin converting enzyme. Nat. Genet., 22: 59-62, 1999.
55
Hayward C., Colville S., Swingler R. J., Brock D. J. Molecular genetic analysis of the APEX nuclease gene in amyotrophic lateral sclerosis. Neurology, 52: 1899-1901, 1999.
56
Olkowski Z. L. AP endonuclease in patients with amyotrophic lateral sclerosis. Neuroreport, 26: 239-242, 1998.
57
Hadi M., Coleman M., Fidelis K., Mohrenweiser H. W., Wilson D. M., III. Functional characterization of Ape1 variants identified in the human population. Nucleic Acids Res., 28: 3871-3879, 2000.
58
Spitz M. R., Wu X., Wang Y., Wang L. E., Shete E. S., Amos C. I., Guo Z., Lei L., Mohrenweiser H., Wei Q. Modulation of nucleotide excision repair capacity by XPD polymorphisms in lung cancer patients. Cancer Res., 61: 1354-1357, 2001.
59
Qiao Y. L., Spitz M. R., Sheng H., Guo Z., Shete S., Hedayati M., Grossman L., Kraemer K. H., Mohrenweiser H., Wei Q. Modulation of repair of ultraviolet damage in the host-cell reactivation assay by polymorphic XPC and XPD/ERCC2 genotypes. Carcinogenesis (Lond.), 23: 295-299, 2002.
60
Pritchard J. K. Are rare variants responsible for susceptibility to complex diseases? Am. J. Hum. Genet., 69: 124-137, 2001.
61
Tabor H. K., Risch N. J., Myers R. M. Candidate-gene approaches for studying complex genetic traits: practical considerations. Nat. Rev. Genet., 3: 1-7, 2002.
62
Sunyaev S. R., Lathe W. C., III, Ramensky V. E., Bork P. SNP frequencies in human genes: an excess of rare alleles and differing modes of selection. Trends Genet., 16: 335-337, 2000.
63
Nelson H. H., Kelsey K. T., Mott L. A., Karagas M. R. The XRCC1 Arg399Gln polymorphism, sunburn, and non-melanoma skin cancer: evidence for gene-environment interaction. Cancer Res., 62: 152-155, 2002.
64
Pharoah P. D. P., Antoniou A., Bobrow M., Zimmern R. L., Easton D. F., Ponder B. A. J. Polygenic susceptibility to breast cancer and implications for prevention. Nat. Genet., 31: 33-36, 2002.