Abstract
α-Methylacyl CoA racemase (AMACR) is overexpressed in prostate cancer relative to benign prostatic tissue. AMACR expression is highest in localized prostate cancer and decreases in metastatic prostate cancer. Herein, we explored the use of AMACR as a biomarker for aggressive prostate cancer. AMACR protein expression was determined by immunohistochemistry using an image analysis system on two localized prostate cancer cohorts consisting of 204 men treated by radical prostatectomy and 188 men followed expectantly. The end points for the cohorts were time to prostate-specific antigen (PSA) failure (i.e., elevation >0.2 ng/mL) and time to prostate cancer death in the watchful waiting cohort. Using a regression tree method, optimal AMACR protein expression cutpoints were determined to best differentiate prostate cancer outcome in each of the cohorts separately. Cox proportional hazard models were then employed to examine the effect of the AMACR cutpoint on prostate cancer outcome, and adjusted for clinical variables. Lower AMACR tissue expression was associated with worse prostate cancer outcome, independent of clinical variables (hazard ratio, 3.7 for PSA failure; P = 0.018; hazard ratio, 4.1 for prostate cancer death, P = 0.0006). Among those with both low AMACR expression and high Gleason score, the risk of prostate cancer death was 18-fold higher (P = 0.006). The AMACR cutpoint developed using prostate cancer–specific death as the end point predicted PSA failures independent of Gleason score, PSA, and margin status. This is the first study to show that AMACR expression is significantly associated with prostate cancer progression and suggests that not all surrogate end points may be optimal to define biomarkers of aggressive prostate cancer.
Introduction
The prevalence of pathologic prostate cancer is extremely high and increases with age. Pathologic prostate cancer is seen in autopsy series in men in their 20s and 30s, and its prevalence rises to >80% in men over the age of 70 (1). Prostate-specific antigen (PSA) testing has facilitated the detection of prostate cancer, which may have contributed to the decline in mortality from prostate cancer over the past few years (2, 3). However, PSA testing may also have increased the detection of clinically insignificant disease (4-6). Moreover, the relationship between pretherapy clinical variables and outcomes has become obscured, in part due to an eroding relationship between serum PSA and the abundance of cancer and/or higher grade cancer in the gland. Given the incidence and prevalence of prostate cancer, the ease of diagnosis, the aging of the population, and the morbidity of treatment, the ability to distinguish aggressive from indolent forms of prostate cancer is critical.
Clinicians treating prostate cancer patients can assess the risk that a prostate cancer poses to the patient by using pretreatment nomograms that have been developed and validated to predict prostate cancer recurrence after treatment for localized disease (7-10). These nomograms account for serum PSA levels, prostate needle biopsy Gleason score, and clinical stage. At best, these clinical nomograms have an area under the receiver operating characteristic curve of 0.75. Kattan et al. have recently shown that the addition of two serum markers (IL6SR and TGF-β1) improves the area under the receiver operating characteristic curve to 0.83, suggesting that molecular markers in combination with clinical variables could improve on existing nomograms in their ability to predict recurrence after local therapy (11). Molecular biomarkers are being characterized to help refine these clinical nomograms and, moreover, to help distinguish aggressive from indolent prostate cancer in an effort to spare men from unnecessary treatment (12, 13).
α-Methylacyl CoA racemase (AMACR) is a biomarker that was identified by both differential display and expression array analysis as a gene abundantly expressed in prostate cancer relative to benign prostate epithelium (14-16). In a metaanalysis of four cDNA expression array data sets, AMACR was one of the genes most consistently overexpressed in prostate cancer (17). AMACR is a peroxisomal and mitochondrial enzyme that plays an important role in bile acid biosynthesis and β-oxidation of branched-chain fatty acids through the interconversion of (R)- and (S)-2-methyl-branched-chain fatty acyl-CoA fragments (18). Our group initially reported that AMACR expression is consistently lower both at the transcriptional (cDNA expression arrays and RT-PCR) and at the protein level (Western blot analysis and immunohistochemistry) in metastatic prostate cancer compared with localized prostate cancer (14, 19). More recently, a fluorescent-based measurement of AMACR in tissue samples confirmed these observations (20).
In the current study, we describe the development of a quantitative AMACR protein expression test to determine the risk of progression for men with clinically localized prostate cancer and for men treated for localized prostate cancer.
Materials and Methods
Patient Cohorts
Surgical Cohort. This cohort consisted of 204 patients from the University of Michigan (Ann Arbor, MI) who underwent radical retropubic prostatectomy as primary therapy (i.e., no preceding hormonal or radiation therapy) for clinically localized prostate cancer between 1994 and 1998. Clinical and pathologic data for all patients were acquired with approval from the Institutional Review Board at the University of Michigan. Clinical data regarding this cohort have been reported separately (21, 22). Disease progression was defined as a serum PSA increase >0.2 ng/mL after radical prostatectomy. Patients were considered censored if they had not had a PSA biochemical failure at the last follow-up time evaluated for that individual. Mean follow-up time in the cohort was 4.8 (2.4) years. In this cohort, 60% of cases were Gleason 7 or more. Mean age at diagnosis was 60 years.
Watchful Waiting Cohort. This cohort is the largest population-based watchful waiting cohort, and consists of patients from Örebro, Sweden with clinically localized prostate cancer, who underwent watchful waiting. This cohort, initially described in 1989 (23), consists of men who all presented with voiding symptoms referred to the urology department to rule out the diagnosis of prostate carcinoma. From March 1977 through September 1991, 1,230 patients were diagnosed with prostate cancer in Örebro County. Among these, 253 were diagnosed through transurethral resection of the prostate and these represent the study base for the watchful waiting cohort. None were diagnosed by PSA screening. For the current investigation, cases were excluded due to insufficient amount of tumor (n = 39), inadequate immunohistochemistry (n = 13), inability to confirm the original diagnosis of cancer (n = 9), or initially presenting for cystoprostatectomy due to bladder cancer (n = 1). Thus, data from 188 watchful waiting cases were included in this study.
The baseline evaluation of these patients at diagnosis included physical examination, chest radiography, i.v. pyelogram, bone scan, and skeletal radiography (if needed). Lymph node staging was not done. In accordance with standard practices at that time in Örebro, these patients were initially followed expectantly (“watchful waiting”). Patients were treated with androgen deprivation therapy only if they exhibited symptoms. Patient follow-up included clinical examinations, laboratory tests, and bone scans every 6 months during the first 2 years following the initial prostate cancer diagnosis and subsequently every 2 years. Medical records of all deceased patients have been reviewed to determine cause of death. As a validation, the classification of cause of death was compared with that recorded in the Swedish Death Register. Thus far, agreement on cause of death has been >90%, with no evidence of systematic over- or underestimation of prostate cancer as cause of death. Through March 2003, with up to 23 years of follow-up, 36 (19.2%) of the patients in this cohort died of prostate cancer. The remaining patients are considered censored, having either died of other causes (126 or 67.0%) or were still alive without disease at time of last follow-up (26 or 13.8%). No patients have been lost to follow-up.
In order to ensure a uniform review of the pathology, one of the study pathologists reviewed all cases from both series. Uniform pathology review included Gleason grading, an estimate of overall tumor involvement (tumor burden per tissue samples evaluated), and tumor type (peripheral zone versus transition zone). Although there are no strict criteria for distinguishing a transition zone tumor from a peripheral zone tumor that has invaded the transition zone, we defined the transition zone tumors for the sake of this analysis as tumors with Gleason score of 6 and below with a well-circumscribed growth pattern. For staging and grading of the tumors, the TNM classification from 1992 (6) and the WHO classification (8) were used. Of the 188 patients in the watchful waiting cohort, 75 (39.9%) were stage T1a and 113 (60.1%) were found to have T1b. The mean age at diagnosis was 73 years.
Tissue Microarray Construction
The tissue microarrays (TMA) from both patient cohorts were assembled using the manual tissue arrayer (Beecher Instruments, Silver Spring, MD) as previously described (24). Tissue cores from circled areas were targeted for transfer to the recipient array blocks. Three to five replicate tissue cores were sampled from each patient sample. In all cases, the dominant prostate cancer nodule or the nodule with the highest Gleason pattern was sampled for the TMA. The 0.6-mm diameter TMA cores were each spaced at 0.8 mm from core-center to core-center. Six TMA blocks with an average of 480 cores per block were used for this study. All blocks contained benign prostate tissue as well as prostate cancer. Each block was assembled without prior knowledge of associated clinical or pathology staging information. After construction, 4-μm sections were cut and stained with H&E on the initial slides to verify the histologic diagnosis. All data is maintained on a relational database as previously described (25).
Immunohistochemistry
Pretreatment conditions and incubations were worked out for AMACR immunostaining using a commercially available monoclonal antibody directed against AMACR (p504s, Zeta Co., Sierra Madre, CA). Pretreatment included placing the slide in a pH 6.0 citrate buffer and microwaving for 30 minutes. Primary p504s antibodies were incubated for 40 minutes at room temperature. Secondary anti-mouse antibodies were applied for 30 minutes, and the enzymatic reaction was completed using a streptavidin biotin detection kit (Dako Developing System, Dako, Carpinteria, CA) for 5 minutes. Optimal primary antibody concentration was determined by serial dilutions, optimizing for maximal signal without background immunostaining.
Manual Scoring of AMACR
All TMA cores were assigned a diagnosis (i.e., benign, atrophy, PIN, or prostate cancer) by the two study pathologists. Prostate cancer samples were only included in the analysis if both reviewers agreed that it was cancer. All manual scoring was done on an Internet-based image evaluation tool that employs zoomable TMA images generated by the BLISS Imaging System (Bacus Lab, Lombard, IL). The AMACR protein expression was evaluated using a categorical scoring method ranging from negative to strong staining intensity as previously reported (14).
Semiautomated Quantitative Image Analysis of AMACR
A semiautomated quantitative image analysis system, ACIS II (Chromavision, San Juan Capistrano, CA), was used to evaluate the same TMA slides from both cohorts. The ACIS II device consists of a microscope with a computer-controlled mechanical stage. Proprietary software is used to detect the brown stain intensity of the chromogen used for the immunohistochemical analysis and to compare this value with the blue counterstain used as background. Theoretical intensity levels range from 0 to 255 chromogen intensity units. In pilot experiments for this study, the reproducibility of the ACIS II system was tested and confirmed by scoring several TMAs on separate occasions. The correlation coefficient for these experiments was r² = 0.973. Because of tissue heterogeneity, one of the study pathologists digitally circled the areas of histologically recognizable prostate cancer using the ACIS II software for each TMA core. This process ensured that AMACR intensity measurements were from prostate cancer tissue only and not the surrounding benign glands or stroma.
Statistical Analysis
AMACR intensity readings were obtained for each of the TMA slides separately and were then normalized within each array before combining the data for analysis. Pilot studies (data not shown) indicated that normalization was necessary: despite using the same protocol for immunohistochemical staining, experiment-to-experiment variation was observed. Therefore, we normalized the AMACR intensity readings for each TMA core on a given array prior to merging all of the data for the final analysis. Critical for the normalization process was the presence of approximately equally distributed numbers of normal and cancer samples on each TMA. For each TMA core from a given array, the mean intensity for that array was subtracted from the core's AMACR staining intensity reading, and the result was divided by the array's SD:
$$ z_{ij} = \frac{x_{ij} - \bar{x}_i}{s_i} $$
where $x_{ij}$ is the AMACR intensity reading for core j on TMAi, $\bar{x}_i$ and $s_i$ are the mean and SD of the readings on TMAi, and j = 1, …, ni (ni is the total number of cores on TMAi). As a result, each of the normalized arrays has a mean score of 0 and an SD equal to 1. Data were then combined using this normalized scale.
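The per-array normalization described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the dictionary layout, and the intensity values are hypothetical, and the use of the sample SD (rather than the population SD) is an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev

def normalize_by_array(readings):
    """Z-score raw core intensities within each array.

    readings: dict mapping an array id to a list of raw intensities
    (hypothetical layout). Uses the sample SD, which is an assumption.
    """
    normalized = {}
    for array_id, values in readings.items():
        m = mean(values)
        s = stdev(values)
        normalized[array_id] = [(v - m) / s for v in values]
    return normalized

# Hypothetical raw intensities (0-255 scale) for two arrays.
raw = {
    "TMA1": [80.0, 95.0, 110.0, 125.0],
    "TMA2": [60.0, 70.0, 90.0, 100.0],
}
scaled = normalize_by_array(raw)
```

After this step, readings from different arrays share a common scale (mean 0, SD 1 within each array) and can be pooled.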
To determine an optimal cutpoint for AMACR, we used a modification of the method of regression trees (26) applied to censored data (27). The regression tree method is an estimation procedure that selects a cutpoint for AMACR by optimizing a discriminating measure based on the censored failure time outcome. The method employs a likelihood criterion to optimize the cutpoint and assumes that the costs of a false-positive and a false-negative are equal. We further did an adjusted analysis for determining a cutpoint, which involves obtaining Martingale residuals (28) at the first stage by adjusting for potential confounders and then applying the regression tree algorithm to find a cutpoint. The adjusted method allowed the cutpoint to be determined while accounting for clinical variables.
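The general idea of a single-split search on censored data can be illustrated as follows. This is a simplified sketch and not the method used in the study: it substitutes the two-group log-rank statistic for the likelihood criterion, omits the Martingale-residual adjustment, and all names (`logrank_chi2`, `best_cutpoint`, `min_group`) and data are hypothetical.

```python
def logrank_chi2(times, events, in_low):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    o_minus_e = 0.0
    variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)  # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, in_low) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, in_low)
                 if ti == t and e and g)
        o_minus_e += d1 - d * n1 / n  # observed minus expected, low group
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return o_minus_e ** 2 / variance if variance > 0 else 0.0

def best_cutpoint(scores, times, events, min_group=3):
    """Scan observed scores; 'low' means score < cutpoint."""
    best_cut, best_stat = None, -1.0
    for cut in sorted(set(scores)):
        low = [s < cut for s in scores]
        n_low = sum(low)
        if n_low < min_group or len(scores) - n_low < min_group:
            continue  # skip splits that leave a very small group
        stat = logrank_chi2(times, events, low)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat

# Hypothetical data: normalized scores, follow-up (years), event flags.
scores = [-2.0, -1.5, -1.2, -0.5, 0.0, 0.4, 0.9, 1.3]
times = [1, 2, 3, 8, 9, 10, 11, 12]
events = [1, 1, 1, 0, 0, 0, 0, 0]
cut, stat = best_cutpoint(scores, times, events)
```

Because the cutpoint is chosen to maximize the statistic, the resulting P value is optimistic when evaluated on the same data, which is one reason to test a cutpoint in an independent cohort.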
The cutpoints for the AMACR intensity scores had a theoretical range between 0 and 255 intensity units. Using the regression tree method, we determined the cutpoint that best differentiated PSA biochemical failure in the 204 patients from the PSA-screened surgical series. A similar process was repeated for the Örebro watchful waiting cohort (n = 188 cases) using cancer-specific death as the end point.
Once the cutpoints were determined for each cohort, we then applied the cutpoint to the other cohort. For example, we tested the optimal cutpoint derived using the surrogate end point (PSA failure) on the watchful waiting cohort to determine if it would predict a true end point (prostate cancer–specific death). We then tested the cutpoint derived using prostate cancer–specific death as the end point on the surgical series to see if it would predict PSA biochemical failure. We further employed Cox proportional hazards regression analysis to examine the association between the AMACR cutpoint and time to prostate cancer outcome, taking into account other clinical variables.
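For reference, the Cox proportional hazards model with a dichotomized marker has the general form below. The notation is illustrative and not taken from the original text: $x_A$ is assumed to equal 1 when normalized AMACR falls below the cutpoint and 0 otherwise, and $z_1, \dots, z_p$ stand for the clinical covariates.

$$ h(t \mid x_A, z) = h_0(t)\,\exp\bigl(\beta_A x_A + \beta_1 z_1 + \dots + \beta_p z_p\bigr), \qquad \mathrm{HR}_A = e^{\beta_A} $$

Under this model, $\mathrm{HR}_A$ is the ratio of the hazards for low versus high AMACR at any time $t$, with the clinical covariates held fixed.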
Results
Manual Evaluation of AMACR
AMACR protein expression was evaluated manually by the study pathologist and graded on a four-tiered scale. We found a significant difference in intensity between prostate cancer (mean score = 3.14/4) and benign prostate epithelium (mean score = 1.3/4) with a mean difference of 1.84 (ANOVA post hoc Scheffé analysis, P < 0.00001). In the surgical series, no significant associations between AMACR intensity scores and biochemical failure were observed, consistent with previous observations (14).
Semiautomated Quantitative AMACR Expression Analysis
The Surgery Cohort with Time to PSA Recurrence. In Table 1, we present clinical characteristics of the surgery cohort in relation to tertiles of AMACR expression. There were only small differences in clinical characteristics comparing men with lower and higher AMACR tissue expression. In contrast, men with the lowest levels of AMACR expression were more than twice as likely to experience PSA recurrence during follow-up compared to those with higher levels.
Characteristic | AMACR tertile 1 | AMACR tertile 2 | AMACR tertile 3
---|---|---|---
Mean (range) AMACR | −1.8 (−4.3 to −1.2) | −0.77 (−1.20 to −0.28) | +0.40 (−0.24 to +1.9)
Mean age (y) | 60.6 | 60.6 | 59.4
Gleason score (%) | | |
4/5 | 6.0 | 1.4 | 3.0
6 (3+3) | 31.3 | 42.9 | 35.8
7 (3+4) | 43.3 | 40.0 | 47.8
7 (4+3) | 13.4 | 12.9 | 10.5
8/9 | 6.0 | 2.9 | 3.0
Mean preoperative PSA | 9.3 | 7.5 | 8.7
Mean tumor weight (g) | 52.7 | 52.4 | 49.2
Surgical margin status (%) | | |
Negative | 67.2 | 72.9 | 71.6
Positive | 32.8 | 27.1 | 28.4
Extraprostatic extension (%) | 25.4 | 21.4 | 23.9
Abnormal digital rectal examination (%) | 41.8 | 42.9 | 43.3
PSA recurrence during follow-up (%) | 37.3 | 15.7 | 17.9
NOTE: Surgical margin status, tumor size, digital rectal examination results, and extraprostatic extension, by two-sample t test for pretreatment PSA serum levels, age, and gland weight.
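Grouping patients into tertiles of normalized AMACR expression can be sketched as follows. This is an illustration only: the function name and scores are hypothetical, and the quantile convention (the default of Python's `statistics.quantiles`) is an assumption, since the text does not specify how the tertile boundaries were computed.

```python
from statistics import quantiles

def tertile_labels(scores):
    """Label each score 1 (lowest), 2, or 3 (highest) by tertile."""
    q1, q2 = quantiles(scores, n=3)  # two boundaries split into thirds
    return [1 if s <= q1 else 2 if s <= q2 else 3 for s in scores]

# Hypothetical normalized AMACR scores.
scores = [-1.8, -1.5, -1.3, -0.9, -0.7, -0.4, 0.1, 0.4, 0.9]
labels = tertile_labels(scores)
```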
Using the adjusted regression tree method, we established a dichotomous cutpoint for AMACR intensity of −1.11 SD (i.e., 1.11 SD below the mean), wherein 37.5% of patients with AMACR intensity scores below the cutpoint had PSA biochemical failure compared with 14.5% of patients with AMACR intensity scores above the cutpoint (P = 0.0002). This univariate association can be visually appreciated by Kaplan-Meier analysis (Fig. 1). Tables 2 and 3 present the univariate and multivariate associations, respectively, between AMACR expression, clinical characteristics, and time to PSA recurrence. In multivariate analysis, patients with AMACR expression levels below the threshold were at a significantly higher risk of developing PSA recurrence [P = 0.04, hazard ratio (HR) 2.12; 95% confidence interval (CI), 1.04-4.32] after adjusting for preoperative PSA, Gleason score, and surgical margin status. In complementary analyses, we found that the likelihood of PSA recurrence increased 60% with each additional SD decrease in AMACR expression (P = 0.004). Comparing the lowest to highest tertiles of AMACR expression, the HR (95% CI) was 2.80 (1.37-5.72).
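For readers unfamiliar with the Kaplan-Meier estimator, the following minimal sketch shows how a survival curve is built from censored follow-up data. The function name and data are hypothetical, and this is not the software used in the study; in practice the estimator would be computed separately for the low and high AMACR groups.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up times; events: 1 if the end point occurred,
    0 if the patient was censored at that time.
    """
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        failed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - failed / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical follow-up (years) with two censored patients.
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
```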
Variable | Censored (n = 156) | Recurred (n = 48) | Univariate HR | 95% CI for HR | P
---|---|---|---|---|---
AMACR | | | | |
<−1.11 (low) | 32.1% | 62.5% | 2.89 | 1.47-5.68 | 0.002
≥−1.11 (high) | 67.9% | 37.5% | REF | |
Age, per year | 60 y | 61 y | 1.01 | 0.97-1.05 | 0.61
Gleason score | | | | |
8-9 | 1.9 | 10.5 | 10.7 | 3.40-33.9 | <0.0001
7 | 50.0 | 75.0 | 4.04 | 1.79-9.08 | 0.0007
<7 | 48.0 | 24.6 | REF | |
ln(PSA), per unit increase | 7.2 ng/mL | 12.7 ng/mL | 2.41 | 1.72-3.38 | <0.0001
Tumor size (cm) | | | | |
>2 | 89.7% | 66.7% | 3.49 | 1.91-6.37 | <0.0001
≤2 | 10.3% | 33.3% | REF | |
Surgical margin status | | | | |
Positive | 79.5% | 41.7% | 3.61 | 2.26-5.76 | <0.0001
Negative | 20.5% | 58.3% | REF | |
Extraprostatic extension | | | | |
Extension | 84.6% | 50.0% | 2.79 | 1.97-3.94 | <0.0001
No extension | 15.4% | 50.0% | REF | |
Digital rectal exam | | | | |
Positive | 61.5% | 43.8% | 1.90 | 1.07-3.36 | 0.03
Negative | 38.5% | 56.3% | REF | |
NOTE: ln(PSA), the natural logarithm of pretreatment PSA.
Variable | Multivariate adjusted HR | 95% CI | P
---|---|---|---
AMACR | | |
<−1.11 (low) | 2.12 | 1.04-4.32 | 0.039
≥−1.11 (high) | REF | |
Gleason score | | |
8+ | 2.97 | 0.44-20.0 | 0.26
7 | 3.31 | 0.94-11.6 | 0.06
5-6 | REF | |
Surgical margin status | | |
Positive | 2.92 | 1.68-5.08 | 0.0001
Negative | REF | |
ln(PSA) | 1.70 | 1.09-2.67 | 0.020
NOTE: Data in multivariate model are adjusted for AMACR cutpoint, Gleason score, surgical margin status, and natural log of PSA.
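The reported HRs, confidence intervals, and P values are linked through the Wald approximation on the log-hazard scale, which makes a quick consistency check possible. The sketch below is illustrative (the function name is hypothetical) and assumes a symmetric 95% interval on the log scale and a two-sided normal test.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover log-HR, its SE, the Wald z, and a two-sided P value."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return beta, se, z, p

# Multivariate AMACR estimate in the surgical cohort: HR 2.12 (1.04-4.32).
beta, se, z, p = wald_from_ci(2.12, 1.04, 4.32)
```

For the multivariate AMACR estimate, the recovered P value rounds to 0.039, in agreement with the value reported in the table.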
The Watchful Waiting Cohort with Time to Prostate Cancer Death. In the watchful waiting cohort, AMACR tissue expression correlated with clinical characteristics (Table 4). Cases with lowest AMACR expression were also more likely to be clinically stage T1b and to have Gleason > 7.
Characteristic | AMACR tertile 1 | AMACR tertile 2 | AMACR tertile 3
---|---|---|---
Mean (range) AMACR | −1.0 (−1.5 to −0.6) | −0.15 (−0.56 to +0.27) | +1.2 (+0.28 to +3.8)
Mean age (y) | 72.9 | 74.2 | 74.1
T stage (%) | | |
T1a | 47.6 | 46.8 | 25.4
T1b | 52.4 | 53.2 | 74.6
Gleason score (%) | | |
4/5 | 4.8 | 8.1 | 3.2
6 (3+3) | 60.3 | 61.3 | 38.1
7 (3+4) | 12.7 | 14.5 | 30.2
7 (4+3) | 7.9 | 8.1 | 11.1
8/9 | 14.3 | 8.1 | 17.5
Status in 2003 (%) | | |
Prostate cancer death | 19.1 | 21.0 | 17.5
Other death | 68.3 | 62.9 | 69.8
Alive | 12.7 | 16.1 | 12.7
Using the adjusted regression tree method, we determined an optimal cutpoint of +0.18 SD using time to prostate cancer death as the outcome. The univariate and multivariate analyses of the association between the +0.18 SD AMACR cutpoint and prostate cancer–specific survival are presented in Table 5. As shown in Table 5 and Fig. 2, there was a significant association between the AMACR cutpoint and prostate cancer death in the multivariable analysis. Adjusting for clinical characteristics, the HR for prostate cancer death after >20 years of follow-up was 4.06 (95% CI, 1.82-9.06; P = 0.0006) comparing those with low AMACR levels to those with high levels. The effect of AMACR was independent of age at diagnosis, Gleason score, and tumor stage, as evidenced in the multivariable analysis. The association between the AMACR cutpoint and prostate cancer death was constant across follow-up time (data available upon request). In complementary analyses, we observed that the likelihood of prostate cancer death increased 50% with each additional SD decrease in AMACR expression (P = 0.023). Men in the lowest tertile of AMACR expression were 3-fold more likely to die of prostate cancer over follow-up compared with those in the highest tertile.
Variable | Censored (n = 152) | Prostate cancer death (n = 36) | Unadjusted HR | Unadjusted 95% CI | Unadjusted P | Multivariate adjusted HR | Multivariate adjusted 95% CI | Multivariate adjusted P
---|---|---|---|---|---|---|---|---
AMACR | | | | | | | |
<0.18 (low) | 60.1 | 69.4 | 1.40 | 0.68-2.83 | 0.36 | 4.06 | 1.82-9.06 | 0.0006
≥0.18 (high) | 39.9 | 30.6 | REF | | | REF | |
Age (per year) | 74.1 y | 72.5 y | 1.03 | 0.99-1.08 | 0.17 | 1.01 | 0.97-1.06 | 0.58
T stage | | | | | | | |
T1b | 46.1 | 13.9 | 5.12 | 1.98-13.21 | 0.0007 | 3.61 | 1.30-9.97 | 0.013
T1a | 53.9 | 86.1 | REF | | | REF | |
Gleason score | | | | | | | |
8+ | 8.5 | 33.4 | 10.45 | 4.55-24.0 | <0.0001 | 13.03 | 4.88-34.8 | <0.0001
7 | 26.9 | 33.3 | 2.84 | 1.27-6.38 | 0.011 | 2.70 | 1.12-6.52 | 0.027
5-6 | 62.7 | 33.3 | REF | | | REF | |
NOTE: Data in multivariate model are adjusted for AMACR cutpoint, age, T stage, and Gleason score.
Cross-validation of the AMACR Cutpoints
After developing an optimal AMACR expression cutpoint to identify men at highest risk of PSA biochemical recurrence, we explored whether this cutpoint, developed using PSA relapse following radical prostatectomy, also predicted cancer-specific death among patients left without curative treatment. To do so, we applied the −1.11 SD cutoff to the watchful waiting cohort. Similarly, we applied the AMACR cutpoint determined in the watchful waiting cohort (0.18 SD) to the surgery cohort (Table 6). Using the surgical cutpoint of −1.11 SD, AMACR intensity did not predict prostate cancer–specific survival in univariate or multivariate analysis. The multivariate adjusted HR for prostate cancer death after 20 years of follow-up was 1.09 (95% CI, 0.33-3.62; P = 0.89). In contrast, AMACR levels below the cutpoint determined by prostate cancer death were significantly associated with time to PSA biochemical failure at the univariate level, as shown by the Kaplan-Meier analysis (Fig. 2). On multivariate analysis, AMACR expression below the 0.18 SD cutpoint was a significant predictor of biochemical failure, independent of Gleason score, surgical margin status, and pretreatment PSA, with an HR of 3.67 (95% CI, 1.25-10.78; P = 0.018; Table 6). Lymph node status, tumor stage, and seminal vesicle invasion were associated with a 50% greater likelihood of PSA failure but were not significant predictors of outcome in the final multivariate model. As this was a PSA-screened population, only 4 of 204 had positive lymph nodes and 10 of 204 had seminal vesicle invasion. After excluding these cases, there was no significant difference in the HRs for either cutpoint (data not shown).
AMACR cutpoint | Surgical cohort, PSA biochemical failure: HR* (95% CI) | P | Watchful waiting cohort, prostate cancer death: HR* (95% CI) | P
---|---|---|---|---
Prostate cancer death | | | |
Low (<0.18) | 3.67 (1.25-10.78) | 0.018 | 4.06 (1.82-9.06) | 0.0006
High (≥0.18) | REF | | REF |
Biochemical failure | | | |
Low (<−1.11) | 2.12 (1.04-4.32) | 0.039 | 1.09 (0.33-3.62) | 0.89
High (≥−1.11) | REF | | REF |
*Data were adjusted for Gleason score, surgical margin status, and the natural log of pretreatment serum PSA.
Cross-classifying AMACR Expression and Gleason Score
Clinical application of a biomarker would likely also take patients' clinical characteristics into account when assessing prostate cancer prognosis. In Table 7, we examine the joint association of AMACR expression and Gleason score with prostate cancer outcome in the two cohorts. For this analysis, we rely on the AMACR cutpoint of 0.18 SD. Compared to those with “better” biomarker and clinical measures (i.e., high AMACR expression and low Gleason score), those with both low AMACR expression and high Gleason score had an almost four times higher risk of PSA biochemical failure. In the watchful waiting cohort, however, individuals with the “poorer” measures had an 18-fold higher risk of prostate cancer death (P = 0.006). In fact, among the 12 prostate cancer deaths with Gleason score of 6 or less, the AMACR cutpoint correctly predicted 11 as deaths. Furthermore, among the four prostate cancer deaths with Gleason score 6 or less and tumor stage T1a, all were correctly predicted as deaths. These data indicate that the AMACR biomarker, in combination with clinical variables, can substantially predict prostate cancer mortality.
Gleason score | Surgical cohort, PSA biochemical failure: low AMACR | Surgical cohort, PSA biochemical failure: high AMACR | Watchful waiting cohort, prostate cancer death: low AMACR | Watchful waiting cohort, prostate cancer death: high AMACR
---|---|---|---|---
Gleason ≥ 7 | 3.9 (0.53-29.2) | 1.7 (0.21-14.5) | 18.0 (2.3-140.8) | 4.9 (0.63-37.9)
Gleason < 7 | 1.2 (0.12-10.9) | REF | 6.4 (0.8-51.2) | REF
Note: Data are adjusted for surgical margins and PSA (surgical cohort) or age and T stage (watchful waiting cohort). AMACR cutpoint = 0.18.
Discussion
This is the first demonstration that AMACR protein expression levels are associated with long-term clinical outcome in early prostate cancer. Our original observations suggested that AMACR expression decreases with prostate cancer progression (14, 19). However, standard immunohistochemical evaluation failed to identify clinically significant associations between AMACR expression and biochemical failure (14, 29, 30). Using a more sensitive technique to measure AMACR protein expression, we were able to identify an AMACR cutpoint that was associated with a three times greater chance of biochemical recurrence and a three- to sevenfold greater risk of cancer-specific death. These observations were independent of conventional clinical and pathologic variables in multivariable analysis.
The paradigm for developing prostate cancer biomarkers has focused on using PSA biochemical failure as the surrogate end point for the development of metastatic disease and cancer-specific death. Studies based on surrogate end points are inherently less reliable than studies with clinical end points that therapeutic intervention aims to prevent or delay, such as cancer-specific death (31). PSA failure is considered a surrogate for the progression of metastatic disease and prostate cancer–specific death (32, 33). Indeed, studies by Pound et al. support the potential role of PSA failure as a clinically relevant surrogate end point (33). For example, of 304 men who developed a PSA biochemical failure following radical prostatectomy for clinically localized disease, 103 (34%) developed metastatic disease, with a mean actuarial time to metastases of 8 years following the initial PSA elevation. The Pound study also showed that time from initial diagnosis to biochemical failure and PSA doubling time predicted metastatic disease. These findings suggest that biochemical failure per se is not a complete surrogate for clinical outcomes, in particular death, and that PSA kinetics should be taken into account as well (34). Additional support for the view that biochemical failure may not be an optimal end point was recently reported by D'Amico et al. (35). They evaluated surrogate end points for prostate cancer–specific mortality in two multi-institutional databases of 8,669 patients with prostate cancer treated with surgery (5,918 men) or radiation (2,751 men). The posttreatment PSA doubling time was significantly associated with time to prostate cancer–specific mortality and with time to all-cause mortality.
The current study is consistent with the view that PSA failure may not be the optimal end point for predicting prostate cancer–specific death and thus not optimal for the development of prostate cancer biomarkers. We observed that the AMACR cutpoints differed depending on whether they were derived from PSA failure or cancer-specific death. Biochemical failure may be due to causes unrelated to the biology of prostate cancer, such as local recurrence due to positive surgical margins. Moreover, given the slow progression of prostate cancer, even 8 years may not be sufficient to identify all patients who will die from prostate cancer. Therefore, censoring underestimates the number of men who will ultimately experience prostate cancer progression. The work from Pound et al. and D'Amico et al. would suggest that the surrogate end point needs further consideration.
However, we also need to be cautious, as there are other possible explanations as to why the cutpoint derived from biochemical failure did not predict prostate cancer–specific death. There are several important differences between the two populations, including the eras in which they were collected (i.e., pre- and post-PSA screening era), populations (i.e., Swedish versus U.S.), and treatment (i.e., watchful waiting versus surgery). Theoretically, a clinical trial such as the Scandinavian Prostate Cancer Group IV study (36), where patients were randomized to be either followed expectantly or treated by surgery, would be extremely useful in the development of prostate cancer biomarkers to predict cancer-specific death. Given the need for uniformly collected cohorts, there is currently an important effort by the National Cancer Institute–supported prostate cancer Specialized Programs of Research Excellence groups to develop such resources.
The use of prostate cancer–specific death to develop a biomarker test is likely a more valid strategy. In the current study, using cancer-specific death to determine the AMACR cutoff yielded a different cutoff from that derived from the surrogate end point. Using this revised cutoff on the patients treated for clinically localized disease, AMACR expression independently predicted PSA relapse, even accounting for pretreatment PSA, surgical margin status, and Gleason score. The high-risk group included 81% of the patients in this surgical cohort, suggesting that by using PSA recurrence as the surrogate end point, we are missing some patients who would have progressed. This finding is not what one would anticipate if biochemical failure also included “false-positives” due to other causes such as surgical technique. If these data are confirmed, then risk of progression given sufficient time is not being adequately measured using one elevation of PSA following treatment. The most recent review of the entire Örebro watchful waiting series identified a significant increase in prostate cancer–specific mortality beyond 15 years of follow-up (37). This would also further support the need for better prognostic markers. In the randomized Swedish study, survival benefits were observed for surgery over watchful waiting, but the absolute reduction in prostate cancer mortality was small (36). Longer follow-up might show even greater benefits from localized therapy given the significant percentage of men who developed metastatic disease over the 8 years of follow-up (36).
This study also highlights one of the technical limitations of prior biomarker work. Most studies to date, including our own work, used manual evaluation, dividing immunostaining results into a small number of categories. Although reasonable reproducibility may be achieved using a four-tiered (i.e., negative, weak, moderate, or strong), three-tiered, or dichotomous scale, the range of expression is compressed and the pathologist cannot distinguish, for example, various shades of moderate staining intensity. The current study illustrates how critical this might be in the development of prostate cancer biomarkers (Fig. 3). Several studies on AMACR failed to identify an association between decreased expression and prostate cancer progression (14, 19, 29, 30, 38). Our initial observations suggested that AMACR expression decreased with cancer progression, as AMACR was lower in metastatic prostate samples at both the transcriptional and protein levels (14, 19). However, using standard manual pathology review, no clinical associations were discovered. More recent work using a fluorescence-based approach confirmed lower AMACR expression in metastatic samples. This finding suggests that a quantitative method might be required to identify a critical cutoff to distinguish clinically localized tumors with subtle differences in AMACR expression levels (20). The current study supports this hypothesis. No significant cutoff could be determined using a standard evaluation of immunohistochemical markers by the pathologist. In contrast, the use of a continuous scale of protein measurement allowed us to identify a critical cutoff. This cutoff could then be translated into a specific test using an intensity level that was not appreciated despite numerous studies on AMACR to date.
The use of this semiautomated quantitative method for biomarker analysis is promising. However, the clinical application of such systems is still in its early phases. Results need to be validated and tested in different cohorts and standardization needs to be worked out. Specifically, can we determine a cutpoint that would be useful to apply to prostate needle biopsies in the clinical assessment of men with clinically localized prostate cancer? One future approach would be to evaluate prostate needle biopsies for AMACR staining intensity and determine if this information adds to currently employed clinical nomograms. As recently described by Kattan et al., one could explore if the addition of AMACR intensity improves the receiver operating characteristic area under the curve as compared with a traditional nomogram (11, 39). However, as seen in this current study using a regression tree method to identify the optimal cutpoints, different populations may end up having different significant cutpoints after adjusting for different covariates. Therefore, predictive models using molecular data will be no different than standard nomograms as both approaches require carefully defining the clinical cohort.
Another reasonable approach to developing cutpoints is to make them cohort-specific. The biomarkers could theoretically be tested in half of the cases in a cohort and validated in the remaining cases. If the testing phase took the clinical variables into account, one might predict that the model would stand a better chance of being predictive in the validation set, where clinical variables would also be required to show that the molecular biomarker improves on the current clinical model.
One can also use a metric called the concordance index to determine how much a predictive model with the molecular biomarker improves on one with the clinical variables alone. The concordance index is the probability that, given two randomly selected patients, the patient with the worse outcome is, in fact, predicted to have a worse outcome (40). This measure ranges from 0.5 (i.e., chance) to 1.0 (perfect ability to rank patients). This step is important because the ultimate goal of identifying molecular signatures for prostate cancer outcome is to improve the current clinical models (39). If the molecular signature is only informative in that it predicts the clinical markers, then the signature will not be useful in the clinical setting. One limitation of this study was that the relatively small size of the two groups precluded dividing them into testing and validation sets. We did, however, try a leave-one-out cross-validated analysis in which each sample in the study was held out, and various cutpoints of AMACR were considered. The predictive power of AMACR was assessed using the C-index (41), with the response being time to biochemical failure in the Michigan cohort and time to death in the Örebro cohort. The cross-validated values of the C-index for the two cohorts, across a variety of cutpoints, ranged between 0.5 and 0.6.
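As an illustration of the concordance index described above, the following is a minimal sketch in Python of a pairwise C-index for right-censored data. It is not the code used in this study: the function name `concordance_index`, its arguments, and the toy values are hypothetical, and the handling of tied follow-up times (such pairs are simply skipped) is a simplifying assumption.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Pairwise concordance index for right-censored outcomes.

    times       -- follow-up time for each patient
    events      -- 1 if the event (e.g., death) was observed, 0 if censored
    risk_scores -- model output; a higher score means a worse predicted outcome
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is usable only when the shorter time is an observed event;
        # pairs with tied times are skipped (simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # the earlier failure was ranked as higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable patient pairs")
    return concordant / usable


if __name__ == "__main__":
    # Hypothetical toy data: years of follow-up, event indicator, and a
    # dichotomized marker coded 1 = low expression (assumed higher risk).
    toy_times = [2, 4, 6, 8, 10]
    toy_events = [1, 1, 0, 1, 0]
    toy_marker = [1, 1, 0, 1, 0]
    print(concordance_index(toy_times, toy_events, toy_marker))
```

With a dichotomized marker, many pairs have tied predictions and contribute 0.5 each, which pulls the index toward 0.5; this is one reason a single binary cutpoint can yield a modest C-index even when its hazard ratio is large.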
The current study now extends the potential utility of AMACR as a prostate cancer biomarker. The AMACR (p504s) antibody is gaining wide acceptance in clinical practice as an adjunct tool in the workup of diagnostically challenging prostate lesions referred to as atypical small acinar proliferation or atypical suspicious for cancer (14, 19, 29, 38, 42-48). Pathologists can now use AMACR (p504s) in combination with a basal cell marker to help make a more definitive diagnosis. We also recently observed a measurable humoral response directed against the AMACR protein (49). Using an ELISA, this response could be measured in serum samples from men with known prostate cancer but not in age-matched controls. This early work suggests that AMACR, despite the fact that it is not a secreted protein, may potentially become the basis of a clinically useful serum test. A second tissue-based assay can also detect enzymatic activity in prostate needle biopsy samples. This assay may represent another means of determining AMACR protein expression more reproducibly (50).
In summary, we showed for the first time that AMACR expression is associated with prostate cancer progression after examining tumors from >440 men diagnosed with clinically localized prostate cancer. Selection of the optimal test cutpoint came from analysis using prostate cancer–specific death and not a surrogate end point, suggesting that the definition of surrogate end points, such as PSA biochemical failure, needs to be further evaluated. Future work will need to determine if this AMACR tissue test can be applied to prostate needle biopsies in a prospective manner to predict the risk of recurrence.
Grant support: Specialized Program of Research Excellence for Prostate Cancer, National Cancer Institute grant P50CA90381 (P.W. Kantoff and M.A. Rubin), P50CA69568 (M.A. Rubin and A.M. Chinnaiyan), National Cancer Institute grant CA 97063 (A.M. Chinnaiyan and M.A. Rubin), R01AG21404 (M.A. Rubin) and the American Cancer Society RSG-02-179-MGO (A.M. Chinnaiyan and M.A. Rubin), the Swedish Cancer Society (O. Andrén and J-E. Johansson), and by GMP companies in accordance with Partners Health Organization (BWH) and the University of Michigan.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: This represents an original research study that was presented in part at the National Cancer Institute–sponsored Specialized Program of Research Excellence meeting in Baltimore, MD, July 2004.
Acknowledgments
We thank Julia Elvin for constructive comments during the development of this project and Martina Storz-Schweizer and Lela Schumacher for their technical assistance.