Polymorphic arylamine N-acetyltransferase 2 (NAT2) status varies widely between individuals and ethnic groups and has been associated with susceptibility to several cancers. Few studies have reported the distribution of NAT2 status for Caucasian-American populations or evaluated the concordance between methods of assessment for cancer cases and controls. In our study, distribution of NAT2 status was classified by genotype and phenotype measurements in PANCAN, a population-based case-control study of pancreatic cancer, and concordance between measurements was evaluated for 33 cases and 222 controls. Major genotypes and alleles among controls were *5B/*6A, *5B/*5B, *4/*6A, and *5B/*4. One putative new allele was found in a single individual. Genotypes and phenotypes were classified as rapid or slow, according to a bimodal model. Presence of the *4 (wild-type) allele defined a NAT2 genotype as rapid. The NAT2 phenotype was analyzed by the caffeine assay. Ratios of 5-acetylamino-6-formylamino-3-methyluracil to 1-methylxanthine were determined, and individuals with values of ≥0.66 were identified as having a rapid phenotype. In our population, 58.1 and 59.5% of control subjects were classified as slow acetylators by phenotype and genotype, respectively. Concordance of NAT2 genotype and phenotype classification was 97.8% in the bimodal model. A similar analysis was completed for a trimodal model. Concordance of genotype and phenotype was high in cases (90.9%) and similar to controls; genotyping alone provided an efficient, accurate method of analysis for acetylator status. A comparison with two previous reports revealed subtle differences in genotype and allele distribution but exhibited overall similarity with other Caucasian-American populations.
Several well-known drug-metabolizing enzymes catalyze the activation and detoxification of xenobiotics (1) and are classified as phase I or II enzymes, respectively. Members of the phase I category include cytochrome P450-related enzymes and epoxide hydrolases. Phase II enzymes are N-acetyltransferases, glutathione S-transferases, UDP-glucuronosyltransferases, and sulfotransferases. The absolute and relative amounts of phase I and II enzyme activities differ between individuals and affect biological responses to xenobiotic exposure (2). Several adverse drug reactions and cancer susceptibilities have been attributed to interindividual differences in phase I and II enzyme activities (3, 4).
The N-acetyltransferases catalyze acetylation of aromatic amines and hydrazines, which include carcinogenic compounds and therapeutic drugs. Many of the drugs are commonly used and include isoniazid, dapsone, procainamide, and sulfamethazine (5). Carcinogens metabolized by N-acetyltransferase include 2-naphthylamine, 2-aminofluorene, 4-aminobiphenyl, and benzidine (6, 7, 8). Thus, N-acetylation can modulate drug activities and detoxify carcinogens (8).
A wide range of N-acetylation activity is observed in humans and has been characterized by some authors as a bimodal and by others as a trimodal distribution (9, 10, 11). These distributions of activity have resulted in the classification of individuals as rapid and slow (bimodal model) or rapid, intermediate, and slow acetylators (trimodal model). The proportion of slow and rapid acetylators differs between ethnic populations (12, 13). The slow acetylator phenotype is most prevalent in Northern Africans (90%) and Scandinavians (75–80%) and lowest in Canadian Eskimos and Japanese (5%); in Chinese populations, the proportion is 20%, and 40–60% of African-Americans and other Caucasians are slow acetylators (12, 13, 14, 15, 16, 17).
It is believed that N-acetyltransferase activity is primarily under genetic control, although variation in activity that is attributable to environmental factors has not been fully examined (18, 19). NAT1 and NAT2 are encoded by two 870-bp genes, NAT1 and NAT2, respectively (20). The genes are located on chromosome 8, and both are expressed in an autosomal codominant manner (20, 21, 22). NAT1 and NAT2 are both considered polymorphic. The NAT2 locus, the focus of this report, has alleles that contain one to four single-nt substitutions. In human populations, 26 alleles have been reported for NAT2. Polymorphisms of NAT2 are known to result in variation of acetylation activity and have been associated with the incidence of several diseases (5). Generally, single-nt substitutions in NAT2 result in low activity, decreased expression, and enzyme instability (23). A slow acetylator phenotype is generally observed for individuals with two germ-line copies of alleles that contain any of several single-nt substitutions. Rapid acetylators generally carry two wild-type alleles or one wild-type and one variant allele (an allele with a nt substitution) of NAT2. A high concordance between NAT2 genotype and phenotype has been reported in a number of populations (24, 25, 26, 27, 28, 29).
Both rapid and slow N-acetyltransferase phenotypes and genotypes have been associated with the risk of several cancers. Slow N-acetyltransferase phenotype has been associated with a higher risk of bladder cancer in arylamine-exposed workers (30). The rapid N-acetyltransferase phenotype and genotype have been associated with the risk of colon cancer in some (31, 32) but not other studies (15). A positive association with rapid NAT2 genotype has also been reported for lung cancer (33). An examination of the association between N-acetyltransferase genotype and breast cancer has produced conflicting results (34, 35, 36). In general, the associations of disease end points with NAT2 phenotype are stronger for smokers than nonsmokers (34, 37, 38). These observations, together with ethnic variation in the ratio of slow and rapid acetylators, suggest that the NAT2 genotype or a combination of environmental exposure and N-acetyltransferase genotype may partially explain cancer risk patterns in different populations.
Characterization of NAT2 genotypes and phenotypes in healthy and diseased individuals allows for evaluation of both NAT2 and cancer associations and evaluation of DNA-based methods for classifying N-acetylation phenotype. To date, no large studies of NAT2 genotype distribution and concordance have been reported for the Caucasian-American population. Here, we measured NAT2 genotype and phenotype in a mostly Caucasian Minnesota (United States) population. All major NAT2 genotypes were characterized, and metabolic phenotyping was measured by the caffeine assay. This study was undertaken as part of a population-based case-control study of pancreatic cancer, PANCAN. Ultimately, the case-control study enrolled ≈280 individuals with pancreatic cancer and 675 controls. Results from a subset of these subjects are reported here. The objectives of this study were: (a) to determine the distribution of NAT2 genotypes in a Caucasian-American population using the control subjects and (b) to determine the concordance of NAT2 genotype and acetylator phenotype in both cases and controls. In the course of the study, we found evidence of a new allele. Here, we describe the putative allele and propose a name for it. Also, we discuss the feasibility of applying NAT2 genotyping and phenotyping methods in a study of pancreatic cancer.
Subjects and Methods
Two hundred fifty-five individuals were recruited in the initial stages of a large case-control study of pancreatic cancer. These subjects included 33 cases diagnosed with pancreatic cancer (9 women and 24 men) and 222 population-based controls (103 women and 119 men). The cases were recruited from all hospitals in the seven-county metropolitan area of the Twin Cities, Minneapolis and St. Paul, Minnesota. The controls were drawn from driver’s license lists. The study was approved by the Institutional Review Board of the University of Minnesota, and informed consent was obtained from all subjects. Extensive medical, dietary, and demographic data were collected from each individual as part of the parent case-control study. The mean ages of the cases and controls were 66.0 and 67.7 years, respectively. There was one African-American among the cases. One control self-reported race as “other.” The remaining 32 cases and 221 controls were self-reported as “white.” Subjects who indicated they had a high sensitivity to caffeine or who had been told to avoid caffeine for medical reasons were excluded from the caffeine study. Individuals who refused blood collection or the caffeine test (for any reason) or had missing data (genotype or phenotype) were excluded from this study.
NAT2 Activity (Phenotype).
In vivo NAT2 activity was measured with a caffeine-based assay, as described by Butler et al. (22), with minor modifications. Subjects ingested two tablets, each containing 100 mg of caffeine, following an overnight fast. Subjects refrained from the consumption of caffeine- and methylxanthine-containing foods and beverages from midnight until 5 h after the dose of caffeine. Four and 5 h following caffeine ingestion, subjects voided their bladders, and a urine specimen was collected at the 5-h time point. Immediately following collection, the urine samples were frozen in a home freezer and collected by study personnel within 1 week. Samples were thawed, the pH was adjusted to 3.5, aliquots were taken, and the aliquots were stored at −70°C until analysis.
Prior to HPLC analysis, frozen samples were thawed, 200 μl of urine were saturated with 125 mg of ammonium sulfate, and 6.0 ml of chloroform:isopropanol (95:5) were added. Each sample was vortexed and centrifuged, and the organic phase was removed and evaporated to dryness. The residue was resuspended in 250 μl of 0.05% acetic acid, filtered, and frozen until analysis. In an HPLC-based analysis, 50 μl of the extract were injected onto a Beckman C18 Ultrasphere octadecylsilane column (25 cm in length, 4.6-mm diameter, 5-μm particle size) and eluted with a 0.05% acetic acid-methanol solvent (flow rate, 1.2 ml/min). The analysis detects metabolites of caffeine, including AFMU and 1X.
Acetylation phenotype was assigned on the basis of a molar AFMU/1X ratio. A ratio of <0.66 identified subjects as slow acetylators, and a ratio of ≥0.66 identified subjects as rapid acetylators (bimodal model). Alternatively, subjects with ratios from 0.66 up to and including 3.0 were classified as intermediate acetylators, those with ratios of >3.0 were classified as rapid acetylators, and those with ratios of <0.66 were classified as slow acetylators (trimodal model).
The cutoff of 0.66 was based on a kernel density function analysis, which objectively defined the nadir of the data distribution for a bimodal model (39). The 3.0 cutoff recognizes a natural nadir in the distribution of activities among individuals carrying at least one NAT2*4 allele. Data are presented as both bimodal and trimodal distributions.
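The phenotype assignment rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code used in the study; the function name `classify_phenotype` and the `model` argument are hypothetical, and only the 0.66 and 3.0 cutoffs come from the text.

```python
# Cutoffs for the molar AFMU/1X ratio, as stated in the text.
SLOW_CUTOFF = 0.66   # ratios below this value are "slow"
RAPID_CUTOFF = 3.0   # trimodal model only: ratios above this value are "rapid"


def classify_phenotype(afmu_1x_ratio, model="bimodal"):
    """Assign an acetylator phenotype from a molar AFMU/1X ratio.

    Hypothetical helper for illustration. `model` is "bimodal"
    (slow/rapid) or "trimodal" (slow/intermediate/rapid).
    """
    if afmu_1x_ratio < 0:
        raise ValueError("AFMU/1X ratio cannot be negative")
    if model not in ("bimodal", "trimodal"):
        raise ValueError("model must be 'bimodal' or 'trimodal'")

    if afmu_1x_ratio < SLOW_CUTOFF:
        return "slow"
    if model == "bimodal":
        return "rapid"
    # Trimodal: 0.66 up to and including 3.0 is intermediate.
    return "intermediate" if afmu_1x_ratio <= RAPID_CUTOFF else "rapid"
```

Note that the boundary values are handled as the text specifies: a ratio of exactly 0.66 is not slow, and a ratio of exactly 3.0 is intermediate rather than rapid.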
Isolation of Genomic DNA.
Thirty ml of venous blood were drawn from each patient into 10-ml EDTA tubes. Following centrifugation and removal of plasma, the buffy coat was isolated, stored at −70°C, and used for DNA isolation. DNA was isolated by a phenol-chloroform extraction method and precipitated with ethanol using an automated Applied Biosystems DNA extractor. Isolation procedures followed the manufacturer’s specifications, including an RNase step. DNA was dissolved in Tris-EDTA buffer (pH 7.4) and quantified spectrophotometrically by measurement of absorbance at 260 nm. Quality of the DNA sample was assessed by measurement of the A260 nm:A280 nm absorbance ratio. Typical yields of DNA were >10 μg/ml and A260 nm:A280 nm ratios were 1.7–1.8, indicating good yield and high-quality DNA. Samples were stored in Tris-EDTA buffer at −70°C until further analysis.
Determination of NAT2 Genotype.
Known alleles of the NAT2 gene contained one to four point substitutions at nt 191, 282, 341, 481, 590, 803, and 857. Many of the substitutions were linked within single alleles. The nt position of each substitution was based on the sequence reported by Blum et al. (20). The standardized designation for each allele is based on the report of Vatsis et al. (40), and designations are given in Table 1. The most prevalent alleles of N-acetyltransferase are (positions of their nt substitutions are given in parentheses): NAT2*5A (341/481), NAT2*5B (341/481/803), NAT2*6A (282/590), and NAT2*7B (857). Former designations for *5A, *6A, and *7B alleles were M1, M2 and M3, respectively, whereas the *4 allele was formerly referred to as wild type (41).
Polymorphisms of the NAT2 gene were determined using one of three methods. The first, based on the method of Blum et al. (20), involved two PCRs and use of two primer pairs for each DNA sample (reactions 1 and 2). Reaction 1 amplified a 1165-bp fragment using primer 1 (P1, CACGACGTTGTAAAACGACAGGGGATCATGGACATTGAA), which has a sequence that is complementary to a M13 vector (first 20 bases) and the NAT2 gene at nt −7 to 12 (last 19 bases), and primer 2 (P2, TCTAGCATGAATCACTCTGC), which is complementary to nt 1119–1138. This product was used for restriction enzyme digests that detected polymorphisms at nt 191, 481, and 857. Reaction 2 amplified a 652-bp fragment using primer 3 (P3, CACGACGTTGTAAAACGACATCAACTTCTGTACTGGGCT), which is complementary to a M13 vector (first 20 bases) and the NAT2 gene at nt 216–234, and primer 4 (P4, GGTTTGGGCACGAGATTTCT), which is complementary to nt 829–848. This product was used for restriction enzyme digests, which detected a polymorphism at nt 590. Each 50-μl PCR contained 100–500 ng of genomic DNA, 0.2 mm each dNTP, 0.2 μm each primer, 1× buffer [50 mm KCl and 10 mm Tris-HCl (pH 8.3)], 2.5 units of Taq DNA polymerase, and 2.0 mm (reaction 1) or 1.5 mm (reaction 2) MgCl2. The mixture was initially incubated for 4 min at 95°C. Thirty cycles of amplification (denaturation for 60 s at 95°C, annealing for 70 s at 58°C, and extension for 90 s at 72°C), followed by a final extension period of 4 min at 72°C, were performed using a programmable Perkin-Elmer Corp. thermocycler.
Products from reaction 1 were digested separately with MspI/HpaII, KpnI, and BamHI to detect the loss of restriction sites at nt 191, 481, and 857, respectively. The product of reaction 2 was digested with TaqI to detect the loss of a restriction site at nt 590. All reactions were carried out for 2 h at 37°C, with the exception of TaqI reactions, which were digested at 65°C. Five μl of PCR product were added to 1× enzyme buffer [10 mm Tris-HCl (pH 7.5), 5 mm MgCl2, 100 mm NaCl, and 1 mm 2-mercaptoethanol], 15 units of enzyme, and 0.1 mg/ml BSA and brought to a final volume of 20 μl. All digests were analyzed on ethidium bromide-stained 1% agarose gels, except the TaqI digests, which were examined on 2% agarose gels.
The second method (triple-digest method) for obtaining NAT2 genotypes was a modified procedure of Abe et al. (42). Briefly, 200–500 ng of genomic DNA were amplified by a PCR containing 5 pmol of each primer [TGACGGCCAGGAATTACATTGTC (P5, nt 342–362) and ACACAAGGGTTTATTTTGTTCC (P6, nt 879–900)], 0.2 mm each dNTP, 2.5 mm MgCl2, 1× buffer [50 mm KCl and 10 mm Tris-HCl (pH 8.3)], and 1 unit of Taq polymerase in a total volume of 50 μl. The mixture was initially incubated for 5 min at 94°C. Thirty-five cycles of amplification (denaturation for 50 s at 94°C, annealing for 50 s at 55°C, and extension for 50 s at 65°C), followed by a final extension period of 7 min at 72°C, were performed using a programmable Perkin-Elmer thermocycler (Norwalk, CT).
Following amplification, 10 μl of the PCR product were added to a mixture of 5 units each of BamHI, KpnI, and TaqI, 1× buffer [10 mm Tris-HCl (pH 7.5), 5 mm MgCl2, 100 mm NaCl, and 1 mm 2-mercaptoethanol], and 0.01 mg/ml BSA in a total reaction volume of 20 μl. The mixture was incubated at 37°C for 3 h and then incubated at 65°C for 1 h. Fifteen μl of digested product were electrophoresed at 100 V for 30 min and then at 75 V for 30 min on a 3% agarose gel stained with ethidium bromide.
Although the triple-digest method was an efficient means of analysis, an unanticipated band of ≈400 bp was found in DNA samples with a polymorphism at nt 590. The band was present in carriers of NAT2*6A alleles only and did not result from nonspecific amplification of DNA samples. Because alternative methods of analysis, including the use of single enzyme digests, did not produce the fragment, it appears that the band is an incomplete digest of the PCR product from individuals with *6A alleles; an incomplete digestion of the 481 site would yield a 396-bp band. Assignments of genotypes for all analyses producing the ≈400 bp band were verified by single-enzyme digests of the PCR product from the triple-digest method.
In a third method, a FokI restriction digest was used to detect substitutions at nt 282, and a DdeI digest was used to detect substitutions at nt 803. For this analysis, an additional PCR was performed using 0.2 μm of the first primer (P1) from the first method and P6 from the second method. The PCR was performed with 0.8 mm dNTPs, 2 mm MgCl2, 1× buffer [50 mm KCl and 10 mm Tris-HCl (pH 8.3)], and 1 unit of Taq polymerase. Thirty cycles of amplification were performed (denaturation for 60 s at 95°C, annealing for 70 s at 57°C, and extension for 180 s at 72°C), followed by a final extension period of 4 min at 72°C. This PCR amplified a 907-bp fragment. FokI digestions were performed for 1 h at 37°C using 20 μl of amplified product, 1 unit of FokI, 1× buffer, and 0.1 mg/ml BSA. DdeI reactions were performed for 2 h at 37°C, using 10 μl of amplified product, 5 units of DdeI, 1× buffer, and 0.1 mg/ml BSA. All digests were analyzed on a 3% agarose gel.
The first method of analysis was used for ≈150 subjects; all other DNA samples (105 total) were analyzed by the second method. The third method was applied to all DNA samples. Analysis for the polymorphism at nt 191 was completed for ≈140 subjects, and no polymorphism was found. This polymorphism generally has been found only in African-Americans. This study had only one African-American, who was negative for the polymorphism, and thus, we did not screen the remainder of our population.
Variant or substituted alleles were defined as alleles containing a substitution at any of the following nt: 191, 282, 481, 590, 803, and 857. Individuals with two substituted alleles were classified as slow acetylators. Heterozygotes with a wild-type allele and one copy of a variant allele or individuals with no substitutions (two wild-type alleles) were classified as rapid acetylators. Alternatively, the rapid genotype was subdivided into an intermediate group, including all heterozygotes (a wild-type and substituted allele), and a rapid group, including only homozygous wild-type genotypes.
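The genotype classification rule above amounts to counting wild-type (NAT2*4) alleles. A minimal Python sketch follows, assuming each genotype is recorded as a pair of allele names; the function name `classify_genotype` and the string format of the allele labels are assumptions made for illustration.

```python
WILD_TYPE = "NAT2*4"  # the only allele treated as wild type in this scheme


def classify_genotype(allele_a, allele_b, model="bimodal"):
    """Assign an acetylator genotype class from two allele names.

    Hypothetical helper for illustration. Any allele other than NAT2*4
    is treated as a variant (substituted) allele.
    """
    if model not in ("bimodal", "trimodal"):
        raise ValueError("model must be 'bimodal' or 'trimodal'")

    n_wild_type = sum(allele == WILD_TYPE for allele in (allele_a, allele_b))

    if n_wild_type == 0:
        return "slow"            # two substituted alleles
    if model == "bimodal":
        return "rapid"           # at least one wild-type allele
    # Trimodal: heterozygotes are intermediate, homozygous wild type is rapid.
    return "rapid" if n_wild_type == 2 else "intermediate"
```

For example, `classify_genotype("NAT2*4", "NAT2*6A", model="trimodal")` returns `"intermediate"`, whereas the same genotype is `"rapid"` under the bimodal model.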
Allele and genotype frequencies are reported as percentages. The Hardy-Weinberg equation was used to calculate the expected heterozygote genotype frequency, based on frequencies of wild-type or mutant alleles. For this assessment, all mutant alleles were grouped as a single category. Cutoff points for the phenotype distributions were determined through the use of probit plots (43) and a kernel density function (39). Concordance values (percentages) and κ statistics (44) are reported as measures of agreement between genotypes and phenotypes. ANOVA was used to compare the bimodal and trimodal models for genotype and phenotype. The difference between concordance for cases and controls was compared using the χ2 test. Mean values of the caffeine metabolite ratio (AFMU/1X) for the heterozygous and homozygous wild-type genotypes were compared using the Student’s t test.
All alleles and genotypes were identified by a combination of the first or second and third methods of analysis. For example, for the latter combination, unique bands were found at 226, 278, and 142 bp for alleles with substitutions at nt 481, 590, and 857, respectively, by the second analysis method. In the third method, mutations at nt 282 and 803 created unique bands with 717 and 101 bp, respectively.
Results are presented for the control subjects only, except where indicated. Ten NAT2 alleles were found in our population. A list and description of the alleles are shown in Table 1; the distribution of alleles is shown in Table 2. NAT2*5B was the major allele, accounting for 37.2% of the total. NAT2*6A and NAT2*4 (wild type) were the next major contributors, accounting for 25.0 and 23.2%, respectively, of the total. Each of the remaining alleles accounted for 0.2–6.8% of the total.
Nine of the alleles have been described previously. We found evidence for a new allele that contained substitutions at nt 282 and 481. Analysis of fragment patterns following a series of single and multiple restriction enzyme digests was consistent with this linkage. Tentatively, we have proposed the designation NAT2*13B on the basis of NAT2 nomenclature, as described by Vatsis et al. (40).
The control subjects contained 24 unique genotypes (Table 3). An additional genotype, NAT2*6B/NAT2*7A, was found only among the cases. The major genotypes were NAT2*5B/NAT2*6A and NAT2*5B/NAT2*5B, which accounted for 16.7 and 19.8% of the total genotypes, respectively. The next most frequent genotypes were NAT2*4/NAT2*6A and NAT2*4/NAT2*5B, which accounted for 14.0 and 11.1% of the total, respectively. Each remaining genotype accounted for 0.5–6.3% of the total. Thirteen individuals were homozygous for the wild-type (NAT2*4) allele. Seventy-seven individuals were heterozygous, carrying the wild-type allele and a substituted allele. The remaining 132 subjects had two variant alleles. The distribution of genotypes corresponded with predictions of the Hardy-Weinberg equilibrium (p² + 2pq + q²). With p as the frequency of the wild-type allele and q as the frequency of all substituted alleles combined, the predicted number of heterozygotes was 79, which corresponds to the actual number of 77.
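The Hardy-Weinberg check can be reproduced from the three genotype counts given above (13 homozygous wild type, 77 heterozygous, 132 with two variant alleles). The following Python sketch is illustrative; the function name `hardy_weinberg_expected` is hypothetical, and all variant alleles are pooled into a single category, as described in the methods.

```python
def hardy_weinberg_expected(n_wt_wt, n_wt_var, n_var_var):
    """Return expected genotype counts under Hardy-Weinberg equilibrium.

    Hypothetical helper for illustration. All variant alleles are pooled,
    so p is the wild-type allele frequency and q = 1 - p.
    """
    n_subjects = n_wt_wt + n_wt_var + n_var_var
    n_alleles = 2 * n_subjects

    # Each homozygote carries two wild-type alleles, each heterozygote one.
    p = (2 * n_wt_wt + n_wt_var) / n_alleles
    q = 1.0 - p

    return {
        "p": p,
        "q": q,
        "wt_wt": p * p * n_subjects,        # p^2
        "wt_var": 2 * p * q * n_subjects,   # 2pq
        "var_var": q * q * n_subjects,      # q^2
    }


# Control-subject counts reported in the text.
expected = hardy_weinberg_expected(13, 77, 132)
print(f"p = {expected['p']:.3f}, q = {expected['q']:.3f}")
print(f"expected heterozygotes = {expected['wt_var']:.1f}")
```

With these counts, the wild-type allele frequency is 103/444 (about 0.232, matching the 23.2% reported for NAT2*4), and the expected number of heterozygotes rounds to 79.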
On the basis of a dichotomous genotype, 132 (59.5%) individuals were slow and 90 (40.5%) were rapid acetylators. With slow, intermediate, and rapid classification, these categories contained 132 (59.5%), 77 (34.6%), and 13 (5.9%) of the individuals, respectively.
The major distribution of N-acetylator phenotypes for control subjects was bimodal, with a nadir at 0.66 for the AFMU/1X ratio (Fig. 1). The 0.66 cutoff, determined by kernel density function analysis (Fig. 2), identified 129 (58.1%) and 93 (41.9%) of individuals as slow and rapid N-acetylators, respectively. Alternatively, the phenotypes included a major and minor distribution with cutoffs at 0.66 and 3.0, creating a trimodal distribution. This division split the rapid acetylators into intermediate and rapid groups. In the trimodal distribution, there were 129 (58.1%), 87 (39.2%), and 6 (2.7%) individuals with slow, intermediate, and rapid phenotypes, respectively.
The distribution of phenotype within each genotype is shown in Table 3. Individuals with two substituted alleles generally had a slow phenotype, whereas heterozygous individuals containing a wild-type and a variant allele generally had a rapid phenotype (AFMU/1X ratio > 0.66). Although individuals with two wild-type alleles also had a rapid phenotype, this population had a significantly higher mean AFMU/1X ratio than the heterozygous population; the respective means and SEs were 2.41 ± 0.21 versus 1.68 ± 0.07 (P < 0.001).
Concordance and κ statistics were calculated as measures of agreement for both models of N-acetyltransferase phenotype and genotype. Concordance values were high (97.8 and 92.8%) for the bimodal and trimodal distribution of N-acetyltransferase activity, respectively. The κ statistics indicated good agreement between genotype and phenotype with κ = 0.95 (95% CI, 0.913–0.994) and 0.861 (95% CI, 0.798–0.923) for the bi- and trimodal models, respectively. With a bimodal model, concordance in the cases was 90.9% (κ = 0.81; 95% CI, 0.616–1.02). The difference in concordance values between cases and controls was not statistically significant; Fisher’s exact test yielded a P of 0.07. Although this P is suggestive of a difference, both values are high (90.9 versus 97.8; P = 0.07), which likely reflects the precision of the data rather than a large difference in concordance.
Self-reported medication use was evaluated as a possible explanation for discordance. It did not appear to be related to genotype-phenotype discordance. These observations indicated a strong genetic influence and little, if any, effect by medication use or disease status on NAT2 phenotype.
In our study population, 128 of 132 individuals with genotypes consisting of any combination of two variant alleles had a slow acetylator phenotype. Of the four individuals who were homozygous for substituted alleles and who were not phenotypically slow, three had AFMU/1X ratios that were only slightly higher than the 0.66 cutoff. These observations indicated that all of the variant alleles in our population affected NAT2 activity, including those with single changes at nt 590, 282, and 803.
We evaluated the use of NAT2 genotypes as the primary measure for the characterization of acetylator status. Using N-acetyltransferase activity (phenotype) as the gold standard, the specificity and sensitivity of genotype analysis for subject classification were 99.2 and 95.6%, respectively, for the rapid acetylator in the bimodal model. In the trimodal model of acetylator activity, we evaluated characterization of the rapid versus intermediate phenotypes. The specificity and sensitivity of genotype analysis for the rapid acetylator were 95.4 and 66.6%, respectively. In addition, an ANOVA indicated that genotype, characterized as homozygous wild-type, heterozygous, and homozygous substituted alleles (rapid, intermediate, and slow), accounted for 82.5% of the variance in N-acetyltransferase activity (R2 = 0.825). When individuals heterozygous and homozygous for the *4 allele are treated as one category, ANOVA showed that the dichotomy (rapid and slow) explained 81.6% of the variance in N-acetyltransferase activity.
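The bimodal agreement statistics for controls can be recomputed from a 2×2 table. The table used below is not given in the text; it is reconstructed here from the reported totals (132 slow and 90 rapid by genotype, 129 slow and 93 rapid by phenotype, and 128 of the 132 slow genotypes having a slow phenotype), so it should be read as an inference. The function name `agreement_stats` is hypothetical. Phenotype is treated as the gold standard and "rapid" as the positive class.

```python
def agreement_stats(both_rapid, geno_rapid_pheno_slow,
                    geno_slow_pheno_rapid, both_slow):
    """Concordance, Cohen's kappa, sensitivity, and specificity for a 2x2 table.

    Hypothetical helper for illustration. Phenotype is the reference
    classification; "rapid" is the positive class.
    """
    n = (both_rapid + geno_rapid_pheno_slow
         + geno_slow_pheno_rapid + both_slow)

    # Observed agreement (concordance).
    p_observed = (both_rapid + both_slow) / n

    # Chance agreement from the marginal totals.
    geno_rapid = both_rapid + geno_rapid_pheno_slow
    geno_slow = geno_slow_pheno_rapid + both_slow
    pheno_rapid = both_rapid + geno_slow_pheno_rapid
    pheno_slow = geno_rapid_pheno_slow + both_slow
    p_expected = (geno_rapid * pheno_rapid + geno_slow * pheno_slow) / n ** 2

    return {
        "concordance": p_observed,
        "kappa": (p_observed - p_expected) / (1 - p_expected),
        "sensitivity": both_rapid / pheno_rapid,   # rapid phenotypes called rapid
        "specificity": both_slow / pheno_slow,     # slow phenotypes called slow
    }


# Reconstructed control table (an inference from the reported totals).
stats = agreement_stats(both_rapid=89, geno_rapid_pheno_slow=1,
                        geno_slow_pheno_rapid=4, both_slow=128)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under this reconstruction the results agree with the reported concordance, κ, specificity, and sensitivity to within rounding, which supports the inferred cell counts but does not confirm them.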
A limited number of studies have characterized NAT2 distributions in Caucasian populations. We describe NAT2 distributions in a Caucasian-American population and compare our findings with those of similar populations. Data are presented for both bimodal and trimodal distributions of phenotypes and genotypes. The trimodal model is included because individuals classified as intermediate and rapid acetylators, on the basis of genotype, have an overlap of phenotype ranges but a statistically significant difference in the mean activities. The overlap is possibly due to the effects of unidentified genetic or environmental factors. For example, we have recently shown that hepatic NAT activity can be effectively down-regulated in rats treated with dietary kahweol and cafestol. Thus, the classification of individuals as intermediate and rapid may be important in the analysis of epidemiological data.
The nt substitutions of NAT2 occur at several positions, including 191, 282, 341, 481, 590, 803, and 857. All individuals were tested for all of these substitutions, except those at 191 and 341. Approximately 140 subjects were evaluated for the alteration at nt 191 in our study; it was not present in any of the subjects, confirming that this nt substitution is uncommon in Caucasians. We did not analyze subjects for a substitution at nt 341 because it is virtually always linked to alterations at nt 481 or 803 and, thus, it would be unlikely to affect our allele classification (28, 45).
The distribution of NAT2 alleles in our data is similar to that reported for Caucasian-American (46) and German-Caucasian (28) populations. The most prevalent alleles in our subjects were NAT2*5B, NAT2*6A, NAT2*4, NAT2*5A, and NAT2*7B, which accounted for 37.2, 25.0, 23.2, 6.8, and 2.5%, respectively (total = 94.7%), of all of the alleles. Similar frequencies (within ≤3%) were found in two other populations (46, 28). The major genotypes in our study were NAT2*5B/NAT2*6A, NAT2*5B/NAT2*5B, NAT2*4/NAT2*6A, and NAT2*4/NAT2*5B, which accounted for 16.7, 19.8, 14.0, and 11.7% of the total genotypes, respectively. These genotype frequencies are nearly identical to those reported by Bell et al. (46) in a Caucasian-American population. The major genotypes in a study by Cascorbi et al. (28) differed slightly from ours; the major genotypes are identical between the two populations, but the relative rank and percentage of the total population are different.
The proportion of rapid and slow acetylator genotypes is similar among the studies cited above, a study by Lin et al. (45), and this study. In the bimodal model, the Minnesota population had 58.1% slow acetylators. Lin et al. (45) reported 52.6% slow acetylators among Caucasian-Americans and 60% among German-Caucasians, when screening for 6 substituted nts. Cascorbi et al. (28) found 58.9% slow acetylators among German-Caucasians screened for substitutions at nt 191, 282, 341, 481, 590, 803, and 857, whereas Bell et al. (46) detected 55% slow acetylators in a Caucasian-American population. In the trimodal model, the proportions of intermediate and rapid acetylators in our population were 34.6 and 5.9%, respectively, whereas the corresponding values reported by Bell et al. (46) and Cascorbi et al. (28) were 39.0 and 6% and 36.8 and 4.3%, respectively.
Several NAT2 alleles such as NAT2*12A and NAT2*13 are rare in human populations, and assignment of their phenotype activities has been based on few observations. These assignments require further evaluation. In our population, 10 heterozygous subjects had a NAT2*12A allele. The second allele was NAT2*5B (six subjects), NAT2*6A (three subjects), or NAT2*4 (one subject). The subject with a NAT2*4/NAT2*12A genotype and one subject with a NAT2*6A/NAT2*12A genotype had rapid phenotypes, with the latter individual having an AFMU/1X ratio only slightly above the 0.66 cutoff for slow versus rapid phenotype. All other subjects (8 of 10) had a slow phenotype. On the basis of these observations and recognition that an amino acid change occurs with the nt substitution of NAT2*12A, we have considered the NAT2*12A allele to have a slow phenotype in our population. This phenotype assignment is in contrast with the rapid phenotype assigned to NAT2*12A on the basis of a recombinant N-acetyltransferase protein (23) and with data from two subjects from a single family with NAT2*12A/NAT2*5B genotypes (47). The discrepancy between the apparent phenotypes of NAT2*12A may be the result of gene-gene interactions, an unmeasured dietary or environmental factor, or an additional undetected nt substitution.
The NAT2*13 allele has a nt substitution that does not change an amino acid in the protein. Nonetheless, the apparent phenotype is slow. The slow phenotype may result from low mRNA stability or expression. A low in vivo activity is consistent with low-intermediate activity reported for a recombinant protein of the NAT2*13 allele (23) and a previous report of in vivo activity (28). It is apparent that the definitive assignment of phenotypic activity awaits the accumulation of additional data. NAT2*12A and NAT2*13 have been associated with a wide range of in vivo activity and may be influenced by several factors. These findings underscore the need for both phenotype and genotype measurement.
Our control population is likely to be representative of the Caucasian population in the Twin Cities area (Minneapolis and St. Paul, Minnesota). These subjects were randomly selected from the general adult population and frequency-matched by age and sex to cases (age and sex are unrelated to genotype; Ref. 5). The participation rate was 61.7% for controls. Among participants, refusal of specimens was 12% for blood and 17% for urine. The genotype distribution of subjects providing blood but refusing the urine assay was not significantly different from the genotype distribution of subjects who provided both blood and urine samples (χ2 = 1.16, P = 0.28), thus supporting the view that the decision to take part in the caffeine assay is unlikely to be related to NAT2 status. There were only two individuals who provided a urine sample without a blood sample.
One of the primary objectives of this study was an analysis of the concordance between NAT2 genotype-predicted acetylator status and acetylator phenotype. Our results show a high correlation between the two methods of estimating acetylator status. Concordance was 97.8 and 92.8% for the bimodal and trimodal models, respectively. The data confirm results of studies in other populations. Mrozikiewicz et al. (25) calculated a 95.8% concordance of genotype and phenotype in 120 Polish children. Graf et al. (24) cited a 95% concordance for 81 European individuals, and Cascorbi et al. (28) found a 93.3% concordance for 563 German subjects.
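Concordance in the bimodal model can be illustrated with a minimal sketch. It applies the two classification rules used in this study (presence of the NAT2*4 allele defines a rapid genotype; an AFMU/1X ratio of ≥0.66 defines a rapid phenotype) and reports the percentage of subjects for whom the two calls agree. The function names and the subject records are hypothetical and are not data from the study.

```python
RAPID_CUTOFF = 0.66  # AFMU/1X ratio at or above which the phenotype is rapid


def genotype_status(allele_1: str, allele_2: str) -> str:
    """Bimodal genotype call: rapid if either allele is wild-type NAT2*4."""
    return "rapid" if "*4" in (allele_1, allele_2) else "slow"


def phenotype_status(afmu_1x_ratio: float) -> str:
    """Bimodal phenotype call from the caffeine metabolite ratio."""
    return "rapid" if afmu_1x_ratio >= RAPID_CUTOFF else "slow"


def concordance(subjects) -> float:
    """Percentage of subjects whose genotype and phenotype calls agree."""
    if not subjects:
        raise ValueError("at least one subject is required")
    agree = sum(
        genotype_status(a1, a2) == phenotype_status(ratio)
        for a1, a2, ratio in subjects
    )
    return 100.0 * agree / len(subjects)


# Hypothetical records: (allele, allele, AFMU/1X ratio). Illustrative only.
subjects = [
    ("*5B", "*6A", 0.31),
    ("*5B", "*5B", 0.22),
    ("*4", "*6A", 1.40),
    ("*5B", "*4", 0.95),
    ("*6A", "*12A", 0.70),  # slow by genotype, rapid by ratio: discordant
]
print(f"Concordance: {concordance(subjects):.1f}%")
```

In this illustrative set, four of the five subjects agree; the fifth mimics a subject whose ratio falls just above the cutoff despite a genotype classified as slow.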
An important consideration in using the NAT2 phenotype assay in case-control studies is the possibility that the disease state might alter the results. In this study, we found that concordance in cases was comparable to the value in controls; the difference was not statistically significant.
There are a number of plausible explanations for the discrepancy, albeit small, between genotype and phenotype. Biological and intraindividual variability in phenotypic expression or procedural/laboratory variability of the caffeine test may explain the discordance in some of the subjects with ratios near the 0.66 cutoff point. The chemical analysis incorporates a quantitative HPLC analysis, which can have a within-day variation of up to 21% (22). Our genotyping methods tested for six single-nt substitutions within the coding region. It is possible that other mutations within and outside the coding region exist and may result in reduced enzyme activity. It is also possible that recombination may have linked substitutions that are not commonly found together and resulted in misclassification of an allele. Additional caffeine assays, DNA sequencing, and further mapping may reveal the basis for discordance.
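The effect of assay variability near the cutoff can be illustrated with a rough sketch. It assumes, purely for illustration, that the 21% within-day variation applies symmetrically and proportionally to the measured AFMU/1X ratio; the text does not specify how the variation propagates to the ratio. The function name `call_is_ambiguous` is hypothetical.

```python
RAPID_CUTOFF = 0.66          # AFMU/1X cutoff between slow and rapid phenotypes
WITHIN_DAY_VARIATION = 0.21  # upper bound quoted in the text for the HPLC analysis


def call_is_ambiguous(measured_ratio: float,
                      variation: float = WITHIN_DAY_VARIATION,
                      cutoff: float = RAPID_CUTOFF) -> bool:
    """Return True if the ratio, allowing +/- `variation`, straddles the cutoff.

    Assumption: the variation acts as a symmetric relative error on the ratio.
    """
    lower = measured_ratio * (1.0 - variation)
    upper = measured_ratio * (1.0 + variation)
    return lower < cutoff <= upper


# Hypothetical ratios, not data from the study.
for ratio in (0.31, 0.70, 1.40):
    flag = "ambiguous" if call_is_ambiguous(ratio) else "unambiguous"
    print(f"AFMU/1X = {ratio:.2f}: {flag}")
```

Under this assumption, a ratio of 0.70 could fall on either side of the cutoff on repeat measurement, whereas ratios of 0.31 or 1.40 could not.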
The caffeine metabolite assay is currently the gold standard for assigning acetylator status. However, a number of potential limitations of the assay warrant discussion. Many large-scale studies, such as our own, obtain only one phenotypic measurement. In theory, a single phenotypic measurement may be influenced by short-term dietary changes, medications, or the disease process itself, although we found no evidence for the latter two in the study; we did not address dietary influences in these analyses. The five-hour time commitment and medical status of the subject hinder compliance. In our study participants, there was a higher refusal rate for the caffeine test than for the blood draw.
In contrast, the genotyping method requires a single blood draw, minimizing participant time commitment. DNA could also be obtained from buccal cells or archived samples, allowing for a retrospective analysis. DNA is stable for long periods of time and is not influenced by short-term environmental factors, and genotyping is less expensive than phenotyping. As a consequence, genotyping may be preferable to phenotyping in large epidemiological studies.
The modified method of Abe et al. (42) is an efficient method for genotyping the most common nt substitutions at positions 481, 590, and 857, requiring only one PCR amplification and one restriction digest. One additional PCR and three additional digests are needed to screen for most restriction sites. This study demonstrates that the determination of the five mutations at nt 282, 481, 590, 803, and 857 is sufficient for the identification of the major slow alleles in this population. The reduced number of steps minimizes supply cost, sample handling time, and risk of contamination.
In conclusion, genotyping is an accurate and efficient means to determine acetylator status that can be used either alone or as a supplement to phenotyping data in large epidemiological studies.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
This work was supported in part by USPHS Grant RO1CA58697 (to K. A.).
The abbreviations used are: NAT1 and NAT2, N-acetyltransferases 1 and 2, respectively; nt, nucleotide; HPLC, high-pressure liquid chromatography; AFMU, 5-acetylamino-6-formylamino-3-methyluracil; 1X, 1-methylxanthine; CI, confidence interval.
Arylamine N-Acetyltransferase (EC 2.3.1.5) Nomenclature <www.louisville.edu/medschool/pharmacology/NAT.html>.
F. Kadlubar, unpublished data.
| Allele | Location of nt substitution | Allele frequency |
We acknowledge the excellent efforts of Michael Donndelinger in performing a portion of the genotyping experiments and thank Peter Hannan for the ANOVA. We also appreciate the efforts of Carol Hansen in the preparation of this manuscript. Special acknowledgment is given to Joyce P. Massengill for her performance of caffeine metabolite assays.