Abstract
Although prostate cancer is a leading cause of cancer death, most men die with and not from their disease, underscoring the urgency to distinguish potentially lethal from indolent prostate cancer. We tested the prognostic value of a previously identified multigene signature of prostate cancer progression to predict cancer-specific death. The Örebro Watchful Waiting Cohort included 172 men with localized prostate cancer of whom 40 died of prostate cancer. We quantified protein expression of the markers in tumor tissue by immunohistochemistry and stratified the cohort by quintiles according to risk classification. We accounted for clinical variables (age, Gleason, nuclear grade, and tumor volume) using Cox regression and calculated receiver operator curves to compare discriminatory ability. The hazard ratio of prostate cancer death increased with increasing risk classification by the multigene model, with a 16-fold greater risk comparing highest-risk versus lowest-risk strata, and predicted outcome independent of clinical factors (P = 0.002). The best discrimination came from combining information from the multigene markers and clinical data, which perfectly classified the lowest-risk stratum where no one developed lethal disease; using the two lowest-risk groups as reference, the hazard ratio (95% confidence interval) was 11.3 (4.0-32.8) for the highest-risk group and difference in mortality at 15 years was 60% (50-70%). The combined model provided greater discriminatory ability (area under the curve = 0.78) than the clinical model alone (area under the curve = 0.71; P = 0.04). Molecular tumor markers can add to clinical variables to help distinguish lethal and indolent prostate cancer and hold promise to guide treatment decisions. (Cancer Epidemiol Biomarkers Prev 2008;17(7):1682–8)
Introduction
On diagnosis with localized prostate cancer, patients and clinicians are faced with the decision of whether to treat or to defer treatment. On one hand, prostate cancer is a leading cause of cancer death among men in westernized countries (1), and deaths occur even 20 years after diagnosis (2). On the other, treatment has adverse effects (3) and is often unneeded, as most men do not die from their cancer, and many harbor tumors that are indolent even in the absence of therapy (2, 4, 5).
Treatment of localized disease can reduce cancer-specific mortality, but in the only randomized trial of radical prostatectomy versus watchful waiting (6, 7), the number needed to treat to prevent one cancer death was 19. That trial predated screening by prostate-specific antigen (PSA); 30% to 60% of PSA-detected cancers have been characterized as overdiagnosed (8, 9). Therefore, the number needed to treat may be greater for a screened population.
There is a clear need for tools to distinguish potentially lethal from indolent disease at diagnosis to guide treatment decisions. Clinical nomograms characterize risk of progression using pretreatment clinical markers: PSA levels, biopsy Gleason scores, tumor extent, and clinical stage (10-13). These scoring systems have significant predictive power, but molecular tumor markers hold promise to improve prediction (14). A 12-gene molecular signature of advanced prostate cancer was recently identified through integration of proteomic and expression array data, comparing benign prostate, localized prostate cancer, and metastatic disease (15). A set of 36 markers that showed differential expression at both RNA and protein levels, plus 5 additional genes, was immunostained on a prostate cancer progression array. Through linear discriminant analysis, a multigene model was identified that was significantly associated with PSA failure after prostatectomy in a small cohort. However, most men with PSA recurrence do not develop lethal disease (16). We tested the prognostic value of this molecular signature in relation to cancer-specific death within a cohort of men diagnosed with clinically localized prostate cancer and followed prospectively over 28 years.
Materials and Methods
Study Population
The population-based Örebro Watchful Waiting Cohort (2, 17) comprises men with localized (T1a/T1b, NX, M0) prostate cancer diagnosed by transurethral resection of the prostate for symptomatic benign prostatic hyperplasia. Cases were diagnosed between 1977 and 1991, before the widespread use of PSA screening, in Örebro, Sweden, within the University Hospital's catchment area. In accordance with standard treatment, the men were initially followed expectantly with careful monitoring by clinical exams, laboratory tests, and bone scans every 6 months during the first 2 years postdiagnosis and yearly thereafter. Hormonal therapy was initiated on demonstrated progression to symptomatic disease.
During this time, 252 men were diagnosed with prostate cancer at the University Hospital by transurethral resection of the prostate and followed by watchful waiting. The current study was nested among the 172 men for whom tumor tissue was available. We noted no difference in the distribution of Gleason scores or incidence of prostate cancer death among those with and without tumor tissue. Follow-up of the cohort is 100% complete through March 2006. Metastases were diagnosed by bone scan. Deaths were identified using the Swedish Death Register, and medical records were reviewed by the study investigators to confirm cause.
Tissue Microarrays
We retrieved archival formalin-fixed, paraffin-embedded transurethral resection of the prostate specimens to construct tissue microarrays using a manual tissue arrayer (17). The study pathologists reviewed H&E slides for each case to provide uniform Gleason grading. We found a 30% discordance comparing Gleason scoring re-review with the initial pathology review, with generally lower scores in the initial reports. This grade migration has also been described by Albertsen et al. (4). The pathologist determined the dominant prostate cancer nodule or nodule with the highest Gleason pattern, and two 0.6-mm tissue cores from tumor areas were transferred to the recipient array blocks.
Tumor Biomarkers
We assayed protein expression of the markers (Table 1) in the multigene model on the Örebro Watchful Waiting tissue microarray using immunohistochemistry. The TPD52 antibody could not be obtained; thus, 11 markers were assessed. One 5-μm section of the tissue microarray block was cut for each protein. Incubations and dilutions for each antibody were optimized while minimizing background (Table 1). Secondary antibodies linked to streptavidin-biotin were used to visualize staining.
Table 1. Summary of biomarkers in the multigene model
Marker | Regulation* | Staining† | Clone | Dilution | Antigen retrieval | Source |
---|---|---|---|---|---|---|
AMACR | − | Cytoplasm | 13H4 (rabbit monoclonal) | 1:25 | Pressure cooking method | Zeta |
Itga-5 | − | Cytoplasm | 1 (mouse monoclonal) | 1:25 | Microwave | BD Biosciences |
ABP280 | − | Nucleus | 5 (mouse monoclonal) | 1:50 | Pressure cooking method | BD Biosciences |
CDK7 | − | Nucleus | 17 (mouse monoclonal) | 1:50 | Microwave | BD Biosciences |
PSA | − | Cytoplasm | (rabbit polyclonal) | 1:7,500 | No antigen retrieval needed | DakoCytomation |
P63 | − | Nucleus | 4A4 (mouse monoclonal) | 1:600 | Microwave | Lab Vision |
MTA1 | − | Nucleus | A-11 (mouse monoclonal) | 1:5 | Microwave | Santa Cruz Biotechnology |
Kanadaptin | − | Cytoplasm | 49 (mouse monoclonal) | 1:200 | Microwave | BD Biosciences |
Jagged1 | + | Cytoplasm | (rabbit polyclonal) | 1:50 | Pressure cooking method | Santa Cruz Biotechnology |
MIB1 | + | Nucleus | MIB-1 (mouse monoclonal) | 1:200 | Pressure cooking method | DakoCytomation |
MUC1 | + | Cytoplasm | VU4H5 (mouse monoclonal) | 1:50 | Microwave | Santa Cruz Biotechnology |
TPD52‡ | − | | | | | |
NOTE: Bismar et al. (15).
* All genes in the multigene model were similarly dysregulated at the proteomic and transcriptomic level: −, down-regulated; +, up-regulated.
† Staining stronger in the nucleus than cytoplasm.
‡ Polyclonal antibody for TPD52 not available.
Protein expression was determined on scanned digital images of tissue microarray cores (18) using a semiautomated image analysis system (Chromavision) with high reproducibility (19) that assessed staining intensity (0-255) and percent of positive stained area (0-100%). The study pathologist electronically circled areas of histologically recognizable prostate cancer to capture tumor expression.
Presence of the TMPRSS2:ERG fusion was evaluated previously on a subset of cases (n = 107) using a fluorescence in situ hybridization assay (20). In an earlier publication of the cohort, we showed that presence of the TMPRSS2:ERG fusion was associated with ∼3-fold increased risk of cancer death (20).
Tissue microarray blocks were constructed without prior knowledge of clinical outcomes, and the pathologist remained blinded to outcome during immunohistochemistry evaluation.
Statistical Analysis
A priori, we used protein intensity for markers that stained primarily in the cytoplasm and percent staining for those staining primarily in the nucleus (Table 1). For individuals missing data on specific markers, we imputed values using the k-nearest neighbor method, an algorithm that assigns missing data based on the majority vote of a case's neighbors, as defined by the other markers. We selected k = 3, so that each case with a missing value was compared with its three nearest neighbors and assigned the mean value of the three. More than 93% of our cases had complete markers, and 3% were missing expression for only one marker.
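The imputation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `knn_impute`, the use of a root-mean-square distance over the markers both cases have observed, and the toy data are all assumptions, since the text does not state which distance metric was used.

```python
import math

def knn_impute(rows, k=3):
    """Fill None entries with the mean of that marker among the k nearest cases.

    rows: list of per-case marker lists, with None marking a missing value.
    Distance (an assumption) is the root-mean-square difference over the
    markers that are observed in both cases.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        observed = [j for j, v in enumerate(row) if v is not None]
        missing = [j for j, v in enumerate(row) if v is None]
        for j in missing:
            candidates = []
            for m, other in enumerate(rows):
                # A neighbor must be a different case with marker j observed.
                if m == i or other[j] is None:
                    continue
                shared = [c for c in observed if other[c] is not None]
                if not shared:
                    continue
                sq = sum((row[c] - other[c]) ** 2 for c in shared)
                candidates.append((math.sqrt(sq / len(shared)), m))
            nearest = sorted(candidates)[:k]
            if nearest:
                # Assign the mean value of the k nearest neighbors.
                filled[i][j] = sum(rows[m][j] for _, m in nearest) / len(nearest)
    return filled
```

With k = 3, a case missing one marker receives the average of that marker in the three cases whose remaining markers look most like its own.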
To create the molecular signature score, we divided expression of each marker into quartiles based on the cohort distribution. For markers whose expression is up-regulated in metastatic versus localized cancer, based on prior data (Table 1), we assigned a score = 1 for those in the highest quartile of expression and 0 otherwise. For markers with down-regulated expression, a score = 1 was given for those in the lowest quartile of expression and 0 otherwise. We calculated a weighted risk score across the 11 markers, multiplying the protein expression coding values (0 or 1) for each gene by the coefficients from the linear discriminant analysis (15), thereby prioritizing genes that provided the greatest discrimination in the original article.
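A minimal sketch of this scoring rule follows. The quartile cutoffs use linear interpolation between order statistics, and the weights in the usage note are placeholders: the actual linear discriminant coefficients come from Bismar et al. (15) and are not reproduced in this article. The function names are hypothetical.

```python
def percentile(sorted_vals, p):
    """Percentile by linear interpolation between order statistics."""
    pos = (len(sorted_vals) - 1) * p
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def indicator_codes(values, direction):
    """Code 1 for the highest quartile of an up-regulated ('+') marker,
    or for the lowest quartile of a down-regulated ('-') marker; else 0."""
    ordered = sorted(values)
    if direction == "+":
        cut = percentile(ordered, 0.75)
        return [1 if v > cut else 0 for v in values]
    cut = percentile(ordered, 0.25)
    return [1 if v < cut else 0 for v in values]

def signature_scores(expression, directions, weights):
    """Weighted sum of the 0/1 codes across markers, one score per case.

    expression: {marker: [value per case]}
    directions: {marker: '+' or '-'} (regulation, as in Table 1)
    weights:    {marker: coefficient}; placeholder values in this sketch
    """
    n_cases = len(next(iter(expression.values())))
    scores = [0.0] * n_cases
    for marker, values in expression.items():
        codes = indicator_codes(values, directions[marker])
        for i, code in enumerate(codes):
            scores[i] += weights[marker] * code
    return scores
```

For example, with two markers and made-up weights, `signature_scores({"MIB1": [1, 2, 3, 4, 5, 6, 7, 8], "AMACR": [8, 7, 6, 5, 4, 3, 2, 1]}, {"MIB1": "+", "AMACR": "-"}, {"MIB1": 2.0, "AMACR": 1.0})` gives a nonzero score only to the last two cases, which fall in the top quartile of MIB1 and the bottom quartile of AMACR.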
As described previously (17), we generated a weighted risk score that incorporated clinical predictors available on the cohort: age at diagnosis (continuous), Gleason grade (categorically, 2-5, 6, 7, 8-10), nuclear grade (categorically, grade 1, 2, 3; ref. 21), and tumor extent (ref. 22; defined categorically as the proportion of chips with tumor, <5%, 5-24.9%, 25-49.9%, ≥50%). The cohort was assembled before the introduction of PSA screening; thus, PSA levels at diagnosis were not available. Finally, we created a combined risk score of the multigene molecular signature and clinical markers to examine their joint predictive value. Men were then classified as having high, intermediate, or low risk of lethal cancer based on their molecular signature score, their clinical risk score, or combined clinical/molecular risk score divided into quintiles. In separate analyses, we added the TMPRSS2:ERG fusion status to the molecular and molecular-clinical risk scores to evaluate the additional informativeness of the fusion, in combination with other markers, to predict poor cancer prognosis among the subset of men.
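The combination and stratification step can be illustrated with a short sketch. It assumes the combined score is the simple sum of the molecular and clinical weighted scores and that quintiles are assigned by rank; the function names and the tie-breaking rule (ties fall in input order) are assumptions for illustration.

```python
def combined_scores(molecular, clinical):
    """Add the molecular and clinical weighted risk scores case by case."""
    return [m + c for m, c in zip(molecular, clinical)]

def assign_quintiles(scores):
    """Return a quintile label per case: 1 (lowest risk) to 5 (highest risk).

    Cases are ranked by score and split into five near-equal groups.
    Tied scores keep their input order, which is an arbitrary choice.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    groups = [0] * n
    for rank, i in enumerate(order):
        groups[i] = rank * 5 // n + 1
    return groups
```

Rank-based splitting keeps group sizes nearly equal. Cutting at fixed score values instead can leave groups unbalanced when many men share the same score.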
We used time-to-event analyses to evaluate the ability of the gene signature to predict prostate cancer death during follow-up. Person-time was calculated from the date of cancer diagnosis to the date of development of metastases or cancer death, with censoring at death from other causes or at the end of follow-up (March 2006). Hazard ratios (HR) and cumulative incidence differences [with 95% confidence interval (95% CI)] were estimated from the Cox proportional hazards model and used as effect measures.
Competing causes of death could play an important role in prostate survival analyses. Men who died of another cause soon after diagnosis are less informative about prognosis, because some would have progressed had they lived longer. We categorized men as lethal phenotype (men who developed metastases during follow-up; n = 40), indolent phenotype (men who lived at least 10 years after diagnosis without metastases; n = 49), and indeterminate phenotype (men who died of competing causes within 10 years of diagnosis; n = 83) and compared clinical characteristics of the three groups. We estimated the cumulative incidence of lethal disease accounting for competing risks (23) using a publicly available SAS macro (24). Rather than treating other causes of death as censored observations, this method simultaneously analyzes multiple cause-specific hazards. We fit Cox models stratified by failure type (lethal cancer versus other cause of death) and adjusted for clinical covariates.
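To illustrate why competing causes are not simply censored, the sketch below computes a nonparametric cumulative incidence function. This is a different and simpler technique than the Cox-model-based SAS macro used in the study (24), and it adjusts for no covariates. The event coding (0 = censored, 1 = lethal cancer, 2 = death from another cause) and the function name are assumptions.

```python
def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence of `cause` with competing risks.

    times:  follow-up time per man
    events: 0 = censored, 1 = lethal cancer, 2 = other-cause death (assumed coding)
    Returns [(time, cumulative incidence)] at each time the cause occurs.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before the current time
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_here = d_any = d_cause = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_here += 1
            if data[i][1] != 0:
                d_any += 1
            if data[i][1] == cause:
                d_cause += 1
            i += 1
        if d_cause:
            # Only men still event-free just before t can fail from the cause at t.
            cif += event_free * d_cause / at_risk
            curve.append((t, cif))
        if d_any:
            event_free *= 1 - d_any / at_risk
        at_risk -= n_here
    return curve
```

Because a man who dies of another cause leaves the event-free pool rather than being treated as still at risk of cancer death, the estimate does not overstate the incidence of lethal disease the way one minus a Kaplan-Meier estimate would.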
In addition to estimating HR and cumulative incidence differences, we calculated receiver operator curves for the three models: the molecular signature, clinical model, and combined molecular-clinical model—plotting sensitivity versus 1 - specificity to predict lethal disease at 15 years. We compared the area under the curve (AUC), where a value of 1.0 indicates perfect discrimination and 0.5 is no better than chance alone (25).
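The AUC has a direct interpretation that a short sketch makes concrete: it is the probability that a randomly chosen man with lethal disease has a higher risk score than a randomly chosen man without it, counting ties as one half. The function below computes it by pairwise comparison. It treats the outcome at 15 years as a known 0/1 label and ignores censoring, which is a simplification; the article does not describe how censored men were handled in the curves.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison.

    scores: risk score per man (higher = higher predicted risk)
    labels: 1 if lethal disease by the time horizon, 0 otherwise
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case in each outcome group")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # lethal case ranked above non-lethal case
            elif p == q:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

For instance, `auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])` returns 0.75, because three of the four lethal/non-lethal pairs are ranked correctly.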
Analyses were undertaken using the SAS Statistical Analysis version 9.1. The research protocol was approved by the institutional review boards at the collaborating U.S. and Swedish institutions.
Results
Of 172 men with localized prostate cancer, 40% had high-grade tumors and 19% had tumor volume >25% (Table 2). During 28 years of follow-up, 40 men died of cancer (n = 39) or were alive with bone metastases (n = 1), 49 were long-term survivors who lived >10 years after their diagnosis without development of metastases, and 83 died of causes other than prostate cancer. Mean follow-up to development of metastatic disease was 7.6 years (range, 0.1-27.1) and mean follow-up from metastasis to death was 2.0 years.
Table 2. Characteristics of Örebro Watchful Waiting Cohort, 1977 to 2006
Characteristic | Overall | Prostate cancer prognosis | | |
---|---|---|---|---|
 | | Lethal outcome* | Indeterminate outcome† | Indolent cancer‡ |
n | 172 | 40 | 83 | 49 |
Mean age at diagnosis, y | 74.1 | 72.2 | 76.6 | 71.3 |
Mean follow-up, y | 7.6 | 5.2 | 4.4 | 14.9 |
Gleason score, n (%) | | | | |
4-5 | 10 (5.8) | 3 (7.5) | 1 (1.2) | 6 (12.2) |
6 | 88 (51.2) | 12 (30.0) | 44 (53.0) | 32 (65.3) |
7 | 50 (29.1) | 13 (32.5) | 27 (32.5) | 10 (20.4) |
8-9 | 24 (14.0) | 12 (30.0) | 11 (13.3) | 1 (2.0) |
Tumor extent (%), n (%) | | | | |
<5 | 67 (39.0) | 8 (20.0) | 30 (36.1) | 29 (59.2) |
5-24.9 | 72 (41.9) | 19 (47.5) | 34 (41.0) | 19 (38.8) |
25-49.9 | 9 (5.2) | 3 (7.5) | 5 (6.0) | 1 (2.0) |
50+ | 24 (14.0) | 10 (25.0) | 14 (16.9) | 0 (0.0) |
Nuclear grade, n (%) | | | | |
I | 120 (69.8) | 21 (52.5) | 58 (69.9) | 41 (83.7) |
II | 39 (22.7) | 13 (32.5) | 19 (22.9) | 7 (14.3) |
III | 13 (7.5) | 6 (15.0) | 6 (7.2) | 1 (2.0) |
TMPRSS2:ERG fusion, %§ | | | | |
Positive | 24.4 | 48.2 | 14.9 | 16.1 |
Negative | 75.6 | 51.8 | 85.1 | 84.9 |
* Lethal prostate cancer defined as men who developed distant metastases or died of cancer over follow-up.
† Indeterminate prostate cancer outcome based on short-term follow-up after diagnosis, defined as <10 y without development of distant metastases or death from prostate cancer.
‡ Indolent cancer based on long-term survival defined as ≥10 y without development of distant metastases or death from prostate cancer.
§ TMPRSS2:ERG fusion data were determined for 107 of the 172 men in our study.
Men classified as having lethal disease tended to have tumors with higher Gleason grade, higher nuclear grade, and greater tumor extent than men with the indolent phenotype (Table 2). Men with the lethal phenotype were also more likely to have fusion-positive tumors. Men classified as indeterminate had clinical characteristics between those of the lethal and indolent phenotypes, reflecting that this group is a mixture of men with indolent disease and men who would have developed lethal disease had they lived long enough.
Expression of Jagged1 and MTA1 was most strongly correlated with expression of other markers, showing correlation coefficients of 0.3 to 0.4 for positive correlations and -0.3 to -0.4 for inverse correlations; no one marker was correlated with all others. We evaluated each specific marker to predict prostate cancer death, adjusted for clinical variables. The strongest molecular predictors [HR (95% CI)] of prostate cancer death were MTA1 [3.4 (1.2-9.2)], p63 [1.8 (0.8-4.2)], Jagged1 [1.8 (0.7-4.5)], and ABP280 [1.6 (0.7-3.6)]. Interestingly, these markers were among the strongest discriminators of metastatic versus localized disease in the publication by Bismar et al. (15).
Using the molecular markers, the age-adjusted HR of prostate cancer death increased with increasing risk group classification, with a 16-fold increased risk of cancer death comparing the highest-risk versus lowest-risk groups (Table 3). The multigene signature remained a significant predictor of lethal prostate cancer even controlling for clinical variables: the HR (95% CI) of developing lethal disease was 12.3 (1.5-100.7) comparing extreme risk groups, and there was increased risk for all risk categories compared with the lowest (P for trend = 0.0015). Moreover, the molecular signature was a significant predictor of lethal disease among men with low-grade (Gleason score 4-6) tumors (HR, 16.9; P = 0.007).
Table 3. Multigene model as a predictor of lethal prostate cancer, alone and in combination with clinical data, Örebro Watchful Waiting Cohort, 1977 to 2006
Risk group | Total (n) | Lethal (n) | HR (95% CI) |
---|---|---|---|
Molecular model only* | | | |
Q1 (lowest risk) | 34 | 1 | Reference |
Q2 | 35 | 9 | 11.8 (1.5-93.8) |
Q3 (intermediate risk) | 34 | 7 | 12.3 (1.5-100.7) |
Q4 | 35 | 13 | 16.8 (2.2-129.6) |
Q5 (highest risk) | 34 | 10 | 16.9 (2.1-133.4) |
P for trend | | | 0.0015 |
Clinical variables only† | | | |
Q1 (lowest risk) | 51 | 4 | Reference |
Q2 | 11 | 2 | 3.3 (0.6-17.9) |
Q3 (intermediate risk) | 51 | 14 | 3.7 (1.2-11.3) |
Q4 | 25 | 4 | 4.0 (1.0-16.6) |
Q5 (highest risk) | 34 | 16 | 13.1 (4.3-40.5) |
P for trend | | | <0.0001 |
Molecular model and clinical variables‡ | | | |
Q1 (lowest risk) | 34 | 0 | No deaths |
Q2 | 35 | 5 | Reference§ |
Q3 (intermediate risk) | 34 | 11 | 6.3 (2.1-18.3) |
Q4 | 35 | 11 | 5.5 (1.9-15.9) |
Q5 (highest risk) | 34 | 13 | 11.3 (4.0-32.8) |
P for trend | | | <0.0001 |
* Derived from a linear combination of the indicator variables for protein expression multiplied by the variables from the linear discriminant model in Bismar et al. (15). HR and 95% CI are adjusted for clinical variables age at diagnosis, Gleason score, nuclear grade, and tumor extent.
† Derived from a linear combination of indicator variables for the clinical variables and multiplied by the variables from the proportional hazards model in Andren et al. (17). Clinical variables include Gleason grade, nuclear grade, and tumor volume.
‡ A linear combination adding together the weighted risk scores for the molecular and clinical models.
§ Reference group for this comparison includes men in the two lowest quintiles of risk, because there were no deaths in the lowest-risk group.
Gleason grade, tumor volume, and nuclear grade were each independent predictors of prostate cancer prognosis. Men classified as highest risk based on the clinical markers were 13 times (95% CI, 4.3-40.5) more likely to die of prostate cancer compared with the lowest-risk group (Table 3). Interestingly, among men characterized as low or intermediate risk based on clinical variables, the multigene signature could further stratify who would have good or bad prognosis (P for trend = 0.028).
Although both the molecular and clinical signatures independently predicted lethal phenotype, the best discrimination came from a score combining the multigene and clinical information. No man classified as lowest risk in the combined score developed metastasis or died of his disease (Table 3). As a result, we combined the two lowest-risk strata as the reference category to calculate HR. With this comparison, the HR of developing lethal prostate cancer was 11-fold higher (95% CI, 4.0-32.8).
Figure 1 shows cumulative incidence of lethal prostate cancer at 5, 10, 15, and 20 years of follow-up based on risk according to the combined multigene and clinical variables. Even at 5 years, higher-risk groups identified those who developed lethal disease (cumulative incidence difference, 28.7%; 95% CI, 17.4-40.0%). With continued follow-up, the difference in cumulative incidence of lethal cancer between lowest-risk and highest-risk groups increased. Although the greatest discrimination in prediction was in contrasting the highest-risk and lowest-risk groups, the intermediate-risk groups also were predictive of outcome.
Figure 1. Cumulative incidence of lethal prostate cancer during follow-up based on the combined multigene and clinical risk score. Men were stratified into risk groups based on the combined score, divided into quintiles. Cumulative incidence curves are produced from the proportional hazard models accounting for competing causes of death and age at diagnosis.
Receiver operator curves are presented in Fig. 2. At 15 years follow-up, the predictive ability of the molecular signature alone (AUC = 0.68) was similar to that of the clinical markers alone (AUC = 0.71). The model that combined the molecular and clinical variables provided the greatest discrimination (AUC = 0.78), with a 10% improvement over the clinical markers alone (P = 0.04). The highest risk score based on clinical variables was a better classifier (higher sensitivity) than the molecular signature of those who would develop lethal disease. However, 10% of the lowest-risk men based on clinical markers died of cancer or developed metastasis during follow-up compared with 3% classified as low risk by the molecular signature and 0% classified by the molecular-clinical model, suggesting the molecular data could improve classification of those who would have a good prognosis.
Figure 2. Receiver operator curves of the multigene and clinical models to predict development of lethal prostate cancer. The predictive value of the models was assessed by plotting sensitivity versus 1 - specificity to predict prostate cancer death at 15 y and calculating the AUC. The combined molecular signature provided the greatest prognostic discrimination.
Information was collected previously on the presence or absence of the TMPRSS2:ERG fusion on a subset of 107 men in the cohort (20). Information on TMPRSS2:ERG fusion status improved prognostication of the multigene model. At 15 years follow-up, the AUC for the multigene signature plus fusion data was 0.79, and the AUC for the combined molecular/clinical plus fusion data was 0.83.
Discussion
In this population-based cohort of men with initially untreated localized prostate cancer, we tested and validated a proposed multigene signature to predict lethal disease or long-term survival. The overall probability of developing lethal prostate cancer was 1 in 5. The multigene model was a significant predictor of cancer prognosis, independent of clinical variables, such that the probability of developing lethal disease was 1 in 20 for those classified as lowest risk but 1 in 2 for those classified as highest risk. The signature distinguished lethal and indolent disease even among men with tumors of Gleason <7. These data show that tumor markers at diagnosis can predict outcome >20 years hence and suggest that whether a prostate tumor follows a lethal or indolent course is, in part, set early in disease development.
The discriminatory ability of the molecular signature and clinical model were similar based on the receiver operator curves. However, the clinical model was a worse classifier for the low-risk group and misclassified a greater proportion of men as indolent who in reality developed metastasis or died of their disease. In assessing classification, one should consider misclassification of truly lethal disease to be a more hazardous occurrence.
The combination of molecular and clinical data provided the greatest outcome discrimination. None of the lowest-risk men (20% of the total) developed lethal disease, whereas by the end of follow-up almost three-quarters of those classified as highest risk had died of their cancer or developed metastasis. Although few would suggest active surveillance for a man diagnosed with Gleason ≥8 tumors, molecular markers may be most informative in guiding treatment decisions among men with Gleason 6 to 7 tumors or where other clinical variables are suggestive of low to middle risk. The improvement in the AUC for the combined multigene/clinical model compared with the clinical model alone suggests that prostate cancer prediction models may seek to combine both molecular and clinical data.
These data provide a proof of concept and show the potential utility of molecular signatures of lethal prostate cancer. The signature was imperfect, however; not all men with the multigene signature died of the disease. Moreover, the majority of deaths occurred in the middle-risk groups, with mixed discriminatory ability, reflecting the need for better markers to classify outcomes. Nonetheless, the ability to predict accurately a man's outcome from prostate cancer at the extreme quintiles could be of great clinical utility. Moreover, our data suggest that the recently identified TMPRSS2:ERG fusion may provide even greater improvement in prognostication in combination with other markers. A set of molecular markers has the added potential benefit of being developed into a standardized and objective test. Clinical variables such as Gleason grading involve a level of subjectivity, as shown in the apparent Gleason score reclassification, which has occurred over time (4).
For validation of biomarkers of prostate cancer prognosis, cancer-specific death or metastasis is the optimal outcome. Although PSA recurrence is associated with an increased risk of prostate cancer death, most men with recurrences do not die of cancer (26, 27), so studies based on intermediary measures may be misleading. Long-term and complete follow-up is critical, because prostate-specific deaths can occur even 20 years after diagnosis (2, 4). The Örebro cohort has been followed prospectively with careful clinical annotation (2).
The cohort was followed by watchful waiting, and thus initially treatment naive, which provides an opportunity to characterize a man's cancer as indolent even in the absence of therapy. Our study population derived from a well-defined catchment area, with similar clinical care for all patients, thus reducing potential selection biases. We applied a standardized histopathologic review for Gleason grading to avoid potential grade migration over time (4). Although the Örebro cohort was assembled in the pre-PSA era, the cancers were incidentally detected and likely resemble PSA-detected cases given the distribution of Gleason grade and stage.
These transurethral resection of the prostate–detected tumors tended to be in the transitional zone, as opposed to peripheral tumors, so there might be concern that transurethral resection of the prostate–detected and PSA-detected tumors have different molecular phenotypes, such that the current findings cannot be generalized to current clinical practice. However, there is little evidence to suggest meaningful differences in the biology of tumors in these zones and among different modes of presentation. Indeed, the multigene signature, developed on primarily peripheral zone specimens, was predictive of outcome among our cohort. We had no baseline PSA levels, a clinical predictor of outcome (28-30), but given that PSA level is not a strong prognostic predictor among men who opt for watchful waiting following diagnosis of localized prostate cancer (31), such information would likely provide a small improvement in the predictive probability of the multigene/clinical risk score.
Our findings suggest that evaluation of prostate tumor biomarkers at diagnosis can enhance prediction models to aid in counseling patients and guide clinical practice. The signature can identify men at lowest risk of progression, for whom active surveillance may be most appropriate. Although prediction of the middle-risk group is not perfect, the molecular tools can identify men for whom aggressive therapy would be indicated and thus substantially reduce the number needed to treat to avoid one prostate cancer death. The future challenge is to improve the molecular signature so that a greater proportion of men can be classified as low or high risk with similar or better discrimination.
Disclosure of Potential Conflicts of Interest
M.A. Rubin and F. Demichelis: 12-gene signature patent holders.
Grant support: NIH/National Cancer Institute Prostate SPORE at the Dana-Farber/Harvard Cancer Center grant NCI P50 CA090381, NIH T32 training grant CA009001 (L.A. Mucci), NIH grant R01AG21404 (M.A. Rubin and F. Demichelis), and Deutsche Forschungsgemeinschaft DFG PE1179/1-1 (S. Perner).
Acknowledgments
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank Kelly Lamb and Lela Schumacher for technical support critical to this study, Ryan Lee for expert advice in statistical programming, and David Havelick for expert editorial assistance. The tissue microarrays were constructed at the Dana-Farber/Harvard Cancer Center Tissue Microarray Core Facility.