Human α-methylacyl-CoA racemase (AMACR) was overexpressed in prostate cancer compared with nonmalignant tissues. The Gene Logic Inc. BioExpress database containing Affymetrix U133 GeneChip expression profiles of 4400 human normal, benign, diseased, and tumor samples from >60 tissue types was examined to determine the specificity of AMACR mRNA expression. One particular AMACR probeset was derived from an alternatively spliced exon with 88% identity to a 521-bp sequence that spans four exons of fumarate hydratase. The predicted protein sequence revealed a novel GLGELIL peptide shared by both proteins. Whether the mitochondrial and peroxisomal AMACR described previously are distinct products from alternatively spliced transcripts remains to be determined. Determining the cellular location and function of the altered AMACR will be critical in elucidating the role of AMACR in prostate cancer diagnosis and pathogenesis.
AMACR3 (also termed P504S or 2-arylpropionyl-CoA epimerase) belongs to the CAIB-BAIF CoA transferase family (Ref. 1; InterPro IPR003673)4 and is involved in fatty acid and bile acid intermediate metabolism (2, 3) and the inversion metabolism of the anti-inflammatory cyclooxygenase inhibitor ibuprofen (4). AMACR activity was found predominantly in the mitochondria of rat liver (2, 5); it has dual compartmentalization in human liver and fibroblasts, with its enzyme activity mainly associated with peroxisomes but also with mitochondria (2, 3, 6), and is equally distributed in the two organelles in mouse cells (7, 8). AMACR overexpression in PC was first identified using cDNA library subtraction (9) and confirmed by microarray expression profiling, RT-PCR, and immunohistochemical staining studies (10, 11, 12, 13, 14). Immunohistochemical staining of tissue microarrays using a polyclonal antibody against AMACR identified overexpression of AMACR protein in a number of other cancer types (15). This result differs from the more PC-specific AMACR expression pattern seen when a monoclonal antibody (anti-P504S) was used (9, 10). We examined the Gene Logic Inc. BioExpress database containing Affymetrix U133 GeneChip® expression profiles of 4400 human samples from >60 tissue types to determine the specificity of AMACR mRNA expression. Our results showed a significant increase in AMACR expression in PC. AMACR expression is detectable in several other tissue types, mostly to a much lesser extent than in PC. Genome analysis further revealed the presence of an alternatively spliced AMACR transcript containing an extra exon with FH homology. Neurological alterations observed in germ-line deficiencies of AMACR (16) and FH (17, 18), overexpression of AMACR observed in a variety of tumors, including PC, and germ-line mutations of FH associated with renal and uterine tumors (19) highlight the potential parallels between these genes.
MATERIALS AND METHODS
Tissue Samples, RNA Preparation, and Affymetrix GeneChip Hybridization.
All normal, diseased, and tumor tissue samples obtained from various clinical sources were acquired according to the respective institutions’ Institutional Review Board guidelines with appropriate informed consent. Each tissue was accompanied by clinical information and pathology reports and histologically confirmed by pathologists. Sample preparation and processing, hybridization to the Human Genome U133 Set, and normalization were performed as described in the Affymetrix GeneChip Expression Analysis Manual (Santa Clara, CA). Information on the Human Genome U133 Set, which consists of two GeneChip arrays with ∼45,000 probesets representing >39,000 transcripts derived from ∼33,000 well-substantiated human genes, is available on the Internet.5
To compare expression intensities reported from different Affymetrix GeneChip experiments for such a large set of samples, and to reduce the effects of variability introduced by differences in sample preparation, hybridization conditions, staining, or array lots, two normalization methods were used: the Affymetrix normalization, which multiplies each expression intensity for a given experiment (chip) by a global scaling factor, and a standard curve or spike-in normalization. A description of the Affymetrix normalization is available on the Internet.6 For the standard curve normalization, known concentrations of particular gene fragments are spiked into the sample RNA mixture before hybridizing it to the chips (bacterial genes are used for the spike-ins, so there is no additional RNA contribution from the sample donor; Ref. 20). The concentration of transcript, expressed as a frequency in parts per million, is reported as the expression value. This frequency is a relative measure and estimates the equivalent amount of sample RNA in parts per million in the original hybridization mixture (assuming that a nominal transcript in the sample to be analyzed is 1000 bases long).
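The two normalization ideas described above can be sketched as follows. This is a minimal illustration, not the Affymetrix or Gene Logic implementation: the function names, the target intensity of 100, the use of a plain mean to derive the scaling factor, and the straight-line standard curve are all assumptions made for the example.

```python
from statistics import mean


def global_scale(intensities, target=100.0):
    """Multiply every intensity on one chip by a single scaling factor.

    The factor is chosen so the chip's mean intensity equals `target`
    (hypothetical value; a plain mean is used here for simplicity).
    """
    factor = target / mean(intensities)
    return [x * factor for x in intensities]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def intensity_to_ppm(intensity, spike_intensities, spike_ppm):
    """Convert an intensity to a frequency (ppm) using a standard curve.

    The curve is fitted to spike-in controls whose concentrations
    (spike_ppm) are known; a linear curve is an assumption of this sketch.
    """
    slope, intercept = fit_line(spike_intensities, spike_ppm)
    return slope * intensity + intercept
```

For instance, with hypothetical spike-ins measured at intensities 100, 200, and 400 for known frequencies of 10, 20, and 40 ppm, a probeset intensity of 300 would map to about 30 ppm.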
End Point and Real-Time Q-RT-PCR.
RNA from normal tissues was analyzed by standard RT-PCR reactions using the ABI PRISM 7700 Sequence Detection System and TaqMan EZ RT-PCR Kit (PE Applied Biosystems Protocol). Total RNA (15 ng) and 100 nM each of forward and reverse primers (Fig. 4) were used for each reaction in a program of 2 min at 50°C, 30 min at 60°C, 5 min at 95°C, followed by 40 cycles of 20 s at 94°C, and finally 1 min at 60°C. One primer set was designed to span the 5′ splice junction between exon 4 and alternate exon 5 (403F/533R), and two primer sets were designed to detect transcript within the alternate exon only (568F/717R, 749F/864R). The nucleotide designation for the alternate exon starts at 454 and ends at 966. The sizes of the products from the three primer sets were determined after 4% agarose gel electrophoresis against a 25-bp molecular weight ladder. The level of alternate AMACR expression using the 403F/533R primer set was normalized to β-actin expression.
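Normalization of a real-time PCR signal to β-actin can be sketched as below. The text does not state which calculation was used; this example assumes the common threshold-cycle (Ct) difference approach with perfect doubling per cycle, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, efficiency=2.0):
    """Expression of a target relative to a reference gene.

    ct_target    -- threshold cycle of the target amplicon
    ct_reference -- threshold cycle of the reference gene (e.g. beta-actin)
    efficiency   -- fold amplification per cycle (2.0 = perfect doubling,
                    an assumption of this sketch)
    """
    delta_ct = ct_target - ct_reference
    return efficiency ** (-delta_ct)


# Hypothetical Ct values: the target crosses threshold 5 cycles after
# the reference, i.e. it is 2**5 = 32-fold less abundant.
print(relative_expression(ct_target=25.0, ct_reference=20.0))
```

A lower Ct means more starting template, so a target that crosses the threshold later than the reference yields a relative expression below 1.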
Expression data were stored in the GeneExpress 2000 database and analyzed using the GeneExpress Software System. The data were viewed using either the Spotfire Pro visualization or Microsoft Excel tools. Genome analysis was performed using tools available on public Web sites.7
The Gene Logic Inc. BioExpress database containing Affymetrix U133 GeneChip expression profiles of 4400 human samples from >60 tissue types was examined to determine the expression profile of AMACR. Four independent AMACR probesets showed overexpression of AMACR in PC (Fig. 1). A fifth AMACR probeset derived from the antisense strand, 217113_at, showed expression at noise levels (data not shown). Three of four probesets, which hybridize to the known AMACR transcript sequence, show expression in prostate and other tissue types. In contrast, 217111_at showed extremely selective, though weaker, expression in prostate (P) only.
Sequence analysis using the Blast software8 revealed that the 217111_at probe sequence was derived from a single alternatively spliced AMACR exon inserted between coding exons 4 and 5 (Fig. 2a). Sequence alignment further shows that this exon is highly homologous, with 88% identity, to a 521-bp stretch of sequence that spans four spliced exons of a Krebs cycle enzyme, the mitochondrial FH (Fig. 2b). All 11 probes for 217111_at and 8 of the 11 probes for the 3′ portion of the FH fragment 203033_x_at are within the 521-bp sequence.9 The expression level obtained for 217111_at is attributable to the presence of the alternatively spliced AMACR transcript because none of the probes within AMACR perfectly matches the probes within FH. Expression profiles for FH 203033_x_at and other FH fragments for the 4400 samples show a pattern distinct from that of AMACR, further validating that the alternate AMACR expression is attributable to 217111_at hybridization and not to any FH probes.
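A percent-identity figure such as the 88% quoted above can be computed from a pairwise alignment as in the following sketch. It assumes two already-aligned sequences of equal length with "-" marking gaps, and divides matches by the full alignment length; the function name and the short example sequences are hypothetical, not the AMACR or FH sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns in which both sequences match.

    Both inputs must be aligned (same length, '-' for gaps). Gap columns
    count toward the alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical 8-column alignment with one mismatch: 7 of 8 columns match.
print(percent_identity("ACGTACGT", "ACGTACGA"))
```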
The predicted AMACR protein sequence from this alternative transcript retains the 5′ functional CAIB-BAIF domain but loses the 3′ peroxisome targeting signal (6). A novel peptide, GLGELIL, is present in the homologous region shared between the two proteins (Fig. 3). This peptide is immediately upstream (2 amino acid residues) of an FH point mutation identified in two siblings with progressive encephalopathy and fumarase deficiency (18). The significance of this novel peptide sequence is presently unknown.
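One way to locate a peptide shared by two protein sequences is a longest-common-substring search, sketched below. This is an illustration only and is not necessarily how the shared peptide was found in this study. The two input strings are made-up toy sequences: only the GLGELIL motif comes from the text, and the flanking residues are hypothetical.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous stretch present in both strings.

    Uses dynamic programming over matching suffix lengths, keeping only
    the previous row, so memory is proportional to len(b).
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Toy sequences (hypothetical flanks around the GLGELIL motif).
toy_amacr = "MKTAGLGELILPQR"
toy_fh = "WSVGLGELILDNE"
print(longest_common_substring(toy_amacr, toy_fh))
```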
AMACR and FH expression from a selected panel of tissues within the initial 4400 sample set was further examined in depth (Supplemental Section, Fig. 1). The expression profiles for 209425_at (derived from the major AMACR transcript) and 217111_at (derived from the alternate AMACR exon) are shown. The expression values from the chip experiments were obtained using a standard curve (spike-in) normalization method (20). Consistent with results shown in Fig. 1 (expression intensity based on Affymetrix 5.0 normalization), the alternate AMACR transcript is expressed selectively in PC, with a mean expression value equivalent to 15% of the overall AMACR expression. As shown by the SD and E-Northern visualization of various sample sets (Supplemental Section, Fig. 2), expression of the novel AMACR transcript varies widely among the PC samples. On the basis of the present/absent call algorithm provided by Affymetrix, only 50% of the PC samples express the alternate AMACR transcript. In several other tissue types, most notably colorectal, kidney, and liver, expression of the major AMACR transcript, but not the alternate transcript, is detected. Our AMACR expression data are in agreement with the EST information for AMACR reported by the Stanford Online Universal Resource for Clones and ESTs,10 which shows significant levels of AMACR in both normal and tumor samples from multiple tissue types. Overexpression of AMACR protein in multiple tumor types has been reported based on results from tumor tissue arrays (15). Data from our comprehensive gene expression database show moderate AMACR expression in some normal and tumor tissues, most notably in kidney and liver. It will be worthwhile to include both normal and tumor tissues in the tissue microarrays so that AMACR expression can be directly compared.
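The two summary figures quoted above (the share of samples called present, and the alternate transcript's mean expression as a fraction of the major transcript's) can be computed as in this sketch. The function names and all input values are hypothetical, and treating "overall AMACR expression" as the mean for the major-transcript probeset is an assumption of the example.

```python
from statistics import mean


def present_fraction(calls):
    """Fraction of samples whose detection call is 'P' (present).

    Calls are assumed to be single-letter codes: 'P' present,
    'M' marginal, 'A' absent. Only 'P' counts as expressed here.
    """
    return sum(1 for c in calls if c == "P") / len(calls)


def mean_ratio(alt_values, major_values):
    """Mean alternate-transcript expression divided by mean major-transcript expression."""
    return mean(alt_values) / mean(major_values)


# Hypothetical per-sample data for four PC samples.
calls_217111 = ["P", "A", "P", "A"]
alt_ppm = [15.0, 30.0, 0.0, 15.0]
major_ppm = [100.0, 100.0, 100.0, 100.0]

print(present_fraction(calls_217111))
print(mean_ratio(alt_ppm, major_ppm))
```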
To further validate the microarray expression results, we performed end point and real-time Q-RT-PCR using primer pairs to identify transcripts that either span the 5′ alternate splice junction (403F/533R) or are specific to the alternate AMACR exon (568F/717R, 749F/864R). PCR products of the predicted sizes were found in multiple tissues, including normal liver and colon (Fig. 4, top panel). It is worthwhile to note that expression of the alternate AMACR transcript in these tissues, as well as in normal brain and placenta (data not shown), is below the level of detection based on the present/absent call algorithm provided by Affymetrix. The Q-RT-PCR results using samples from colon (normal), liver (normal), PC with low AMACR expression levels based on U133 data (PCL1–3), and PC with high AMACR expression levels based on U133 data (PCH1–3) agree completely with the microarray results (Fig. 4, bottom panel).
We have described here an alternatively spliced AMACR transcript characterized based on microarray expression profiles and genome analysis. The function of this novel form of AMACR is unknown, but loss of the 3′ peroxisomal signal is of interest. Most proteins with different compartmentalization are the result of different localization signals generated by alternative transcripts or proteolytic modifications. In cells from patients with a generalized defect of peroxisome assembly (Zellweger syndrome), AMACR activity resides only in the mitochondria and is reduced to a level (10–20%) that corresponds to the mitochondrial fraction in normal cells (3). This suggests the presence of distinct organelle-specific AMACR with different targeting signals and independent fates. Studies using cells from liver and fibroblasts conclude that AMACR is from only one single gene product (2, 6). However, it is possible that the mitochondrial AMACR may actually be from the less abundant alternatively spliced mRNA species, detected in our study because of its overexpression in PC. The lack of a peroxisome-targeting signal in the predicted protein sequence further suggests that the alternative AMACR may reside only in the mitochondria, an organelle that is vital in cell growth and survival. In a series of transfected Chinese hamster ovary cell experiments using green fluorescent protein fused to the COOH terminus of the full-length racemase or racemase with deletions of the NH2 terminus, mitochondrial targeting information has been localized to a region that is upstream of the alternate exon, between amino acids 22 and 85 (8). Thus far, no mitochondrial targeting sequence has been identified. It will be of interest to examine whether the alternate AMACR exon contains the mitochondrial targeting signal. With the appropriate reagent, it should be possible to determine whether the mitochondrial AMACR is from a distinct transcript.
Further analysis of AMACR from patients with Zellweger syndrome should provide additional insights as to the distinction between the mitochondrial and peroxisomal AMACR.
In summary, we demonstrated here the utility of “expression genomics” and the feasibility of using a reference oligonucleotide microarray expression database and genome analysis to identify a novel AMACR transcript. Our PCR results show that the alternative transcript is found in normal and tumor cells, but its presence was detectable by microarray only in PC samples with substantial AMACR expression. The predicted AMACR protein product from this transcript differs from the AMACR previously characterized in liver cells and fibroblasts. We are in the process of generating reagents that can selectively identify the novel AMACR to further determine whether the two different forms of AMACR have distinct functions and cellular localization. The region of homology between FH and AMACR represents a single exon in the AMACR genomic locus on chromosome 5 but a contiguous sequence spanning four different correctly spliced exons in the FH genomic locus on chromosome 1. It will be of interest to determine whether this alternative AMACR exon, a partial pseudogene within a functional gene, is found in other species and whether this stretch of DNA sequence has any biological significance.
Additional studies will be required to determine whether the mitochondrial and peroxisomal AMACR described previously are distinct products from alternatively spliced transcripts. Neurological alterations observed in germ-line deficiencies of AMACR and FH, overexpression of AMACR observed in a variety of tumors, including PC, and germ-line mutations of FH associated with renal and uterine tumors highlight the potential parallels between these genes and their importance in multiple diseases. This is the first published study demonstrating that AMACR may have multiple products from a single gene, instead of being a single gene product with dual cellular locations.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org).
The abbreviations used are: AMACR, α-methylacyl-CoA racemase; RT-PCR, reverse transcription-PCR; CAIB-BAIF, carnitine dehydratase-bile acid-inducible family; PC, prostate cancer; FH, fumarate hydratase; Q-RT-PCR, quantitative-reverse transcription-PCR.
Internet address: http://www.ebi.ac.uk.
Internet address: http://www.affymetrix.com/products/arrays/specific/hgu133.affx.
Internet address: http://www.affymetrix.com/support/technical/technotes/statistical_reference_guide.pdf.
Internet addresses: http://www.ncbi.nlm.nih.gov/LocusLink/ and http://genome.ucsc.edu.
Internet address: http://www.ncbi.nlm.nih.gov/BLAST.
Internet address: http://affymetrix.com/index.affx.
Internet address: http://genome-www5.stanford.edu/cgi-bin/SMD/source/sourceSearch.