Purpose: Clinically useful molecular markers predicting the clinical course of patients diagnosed with non–muscle-invasive bladder cancer are needed to improve treatment outcome. Here, we validated four previously reported gene expression signatures for molecular diagnosis of disease stage and carcinoma in situ (CIS) and for predicting disease recurrence and progression.
Experimental Design: We analyzed tumors from 404 patients diagnosed with bladder cancer in hospitals in Denmark, Sweden, England, Spain, and France using custom microarrays. Molecular classifications were compared with pathologic diagnosis and clinical outcome.
Results: Classification of disease stage using a 52-gene classifier was found to be highly significantly correlated with pathologic stage (P < 0.001). Furthermore, the classifier added information regarding disease progression of Ta or T1 tumors (P < 0.001). The molecular 88-gene progression classifier was highly significantly correlated with progression-free survival (P < 0.001) and cancer-specific survival (P = 0.001). Multivariate Cox regression analysis showed the progression classifier to be an independently significant variable associated with disease progression after adjustment for age, sex, stage, grade, and treatment (hazard ratio, 2.3; P = 0.007). The diagnosis of CIS using a 68-gene classifier showed a highly significant correlation with histopathologic CIS diagnosis (odds ratio, 5.8; P < 0.001) in multivariate logistic regression analysis.
Conclusion: This multicenter validation study confirms in an independent series the clinical utility of molecular classifiers to predict the outcome of patients initially diagnosed with non–muscle-invasive bladder cancer. This information may be useful to better guide patient treatment.
Bladder cancer is a common malignant disease with 357,000 new cases and 145,000 deaths worldwide annually (1). Its prevalence is 3- to 8-fold higher than its incidence, making bladder cancer one of the most prevalent neoplasms, and hence a major burden for health care systems. The overall cause-specific 5-year survival rate is about 65%. The disease presents in two different forms: non–muscle-invasive tumors (stages Ta and T1), usually treated with a local, organ-sparing approach, and muscle-invasive cancers (stages T2-T4), usually requiring cystectomy if cure is intended.
The non–muscle-invasive tumors account for ∼75% of newly diagnosed cases. A low proportion of patients are cured after tumor resection, but the tumors of more than 60% of these patients recur, and the frequency of recurrences has a significant effect on the patients' quality of life. Some of these patients also develop muscle-invasive tumors over time, the proportion ranging from very low for noninvasive papillary low-grade tumors to up to 60% progression for high-grade submucosa-invasive tumors (2, 3). Clinical risk factors for progression include invasion of the lamina propria, high grade, tumor size, occurrence of carcinoma in situ (CIS), and multiplicity or recurrence of high-risk tumors. The recurrence of non–muscle-invasive tumors may be prevented by intravesical instillations of Bacillus Calmette-Guerin (BCG) or, for example, mitomycin-C chemotherapy. BCG is also used to treat patients with CIS lesions that, if not treated, progress to muscle-invasive disease in 50% of cases (2). The treatment is effective in about 70% of the cases but has common side effects of local pain and dysuria. Cystectomy may be considered in selected cases of patients with very frequent recurrences, stage pT1, high-grade lesions, CIS, and failure of BCG treatment. Radical treatment is, however, a major surgical procedure, with postoperative morbidity and effect on the patients' quality of life. Presently, no molecular markers exist that can guide the clinicians in the selection of treatment regimens for patients with non–muscle-invasive bladder cancer.
The advent of genome-wide transcriptome profiling has had a major effect on the discovery rate of new molecular markers or gene expression signatures for classifying and predicting disease outcome in various cancers, including bladder cancer (4–8). Previously, we have used Affymetrix GeneChips to identify expression signatures of potential clinical value. First, we identified a gene expression signature for classifying tumor samples according to disease stage (Ta, T1, and T2-T4). That study also included the identification of a gene signature for predicting recurrence frequency in non–muscle-invasive tumors (9). In a later study, we identified a gene signature for classifying early-stage bladder tumors according to the presence of surrounding CIS (10). Finally, we have identified a gene signature for predicting disease progression (11). Only a few studies have thus far documented the clinical utility of identified gene expression signatures through large-scale validation in independent tumor series. In this study, we have validated the diagnostic and prognostic value of these four gene signatures in an independent series of tumors from a cohort of 404 patients diagnosed with bladder cancer in hospitals in Denmark, Sweden, France, England, and Spain. This study confirms in an independent series the clinical utility of molecular classifiers for diagnosis and prognosis of patients initially diagnosed with non–muscle-invasive bladder cancer.
Materials and Methods
Tumor specimens. Biological materials from incident and, in some cases, recurrent tumors were obtained directly from surgery after removal of the necessary amount of tissue for routine pathology examination. The tumors from Denmark were frozen at −80°C in a guanidinium thiocyanate solution. The tumors from Sweden and Spain were frozen at −80°C in Tissue-Tek OCT compound. The tumors from France and England were dry frozen at −80°C. Informed written consent was obtained from all patients, and research protocols were approved by institutional review boards or ethical committees in all involved countries. All diagnostic pathology slides were re-evaluated by one experienced uropathologist (N.M., Denmark) and graded according to the WHO 2004 guidelines (12). For 21 patients (outlined in Supplementary File 1), it was not possible to acquire the diagnostic sections for review; consequently, the original staging of the tumor was used together with a translation from the original grading to the WHO system (G1 + G2, low grade; G3 + G4, high grade).
Treatment and follow-up information. The samples were taken from patients who underwent surgery in the years 1987 to 2000 in hospitals in Denmark, Sweden, Spain, France, and England. Ninety-four patients received intravesical treatment with BCG or mitomycin-C. Progression of the disease was defined as (a) invasion into the bladder muscle, verified by microscopy; or (b) more distant metastases, also verified by microscopy except for rare cases where this was not possible but where scanning revealed an unambiguous metastasis. Twenty-one patients with non–muscle-invasive tumors were cystectomized before progression occurred and 15 patients after progression. Twenty patients with primary muscle-invasive cancer were cystectomized. Progression-free survival time was recorded from the sampling visit and censored at the time of the last control cystoscopy or at cystectomy. Disease-specific survival was recorded from the sampling visit and censored at the time of the last annotation of the patient being alive, and causes of death were obtained by a review of the hospital files. The follow-up was done retrospectively, and we included as many tumors from the different centers as possible, preferably high-risk tumors. Consequently, the material used in this study is selected and does not reflect the tumor group sizes that would be included in a prospective study. Details on clinical courses are outlined in Supplementary File 1. The high and low clinical risk groups used were defined as high clinical risk (stage T1 or high grade or CIS) and other tumors (low risk; ref. 13).
RNA extraction and quality control. RNA was extracted from the Danish and English samples using a standard Trizol RNA extraction method (Invitrogen). RNA from Swedish and Spanish samples was extracted using the RNeasy mini kit (Qiagen), and RNA from the French samples was extracted by cesium chloride density centrifugation (14). All RNA was quality controlled using an Agilent Bioanalyzer (criteria: 28S/18S > 1 and RIN > 5).
Gene expression profiling. For the validation of the gene signatures, a microarray platform was developed, including probes for the genes comprised in the four expression signatures previously described. Oligonucleotide design, microarray spotting procedure, sample labeling, array hybridization, and scanning were done as previously described (11). Briefly, all classifier genes were represented by one to four 60-mer oligonucleotides spotted in duplicate on CodeLink slides (GE Health Care), and all samples (N = 404) were assayed twice using the microarrays. One microgram of total RNA from each tumor was reverse transcribed to cDNA using an oligo-dT primer containing a T7 RNA polymerase promoter sequence. The cDNA was transcribed into cRNA with incorporation of aminoallyl-linked UTP nucleotides for Cy dye binding. All tumor cRNA samples were labeled with Cy3 and analyzed against a common reference sample (Universal Human Reference RNA, Stratagene) labeled with Cy5. TIGR spotfinder 2.23 software was used to generate raw-intensity data, which were Lowess (blockwise) normalized using TIGR MIDAS 2.19 software (15). Average log2 ratios were calculated from the normalized data based on the four measurements of each gene. Microarray data are available at GEO.
Probe selection and classification procedures. For stage classification, we previously identified 79 genes, and for recurrence classification, a set of 26 genes (9). For CIS classification, we previously identified a 16-gene signature and furthermore delineated the 100 best marker genes (10). For progression classification, we identified a 45-gene signature and, in addition, delineated the 200 best marker genes (11). In this study, we focused on all previously reported classifier genes and did not focus solely on the previously reported optimal signatures for the classifiers. The optimal gene expression signatures were identified previously based on cross-validation performance using a limited number of training samples. This procedure may result in very different numbers of “optimal” gene signatures, depending on the number of training samples used. We only excluded previously identified genes when the probes did not work on the new platform. Therefore, the gene selection approach applied in this work resulted in different optimal gene expression signatures than reported in the original studies, but no new genes were introduced. Genes used in previously reported optimal gene signatures are listed in Supplementary File 2. Oligonucleotides of sequences representing the genes were spotted on the microarray, and in this study, we included all genes in the classifiers that showed a Pearson correlation equal to or above 0.25 when expression levels were compared with previously generated Affymetrix GeneChip gene expression intensities for the same samples. The probe with the highest Pearson correlation was selected when several probes for the same gene were above the correlation threshold. This gene re-selection on the new array platform generated a 52-gene stage classifier, a 68-gene CIS classifier, a 20-gene recurrence classifier, and an 88-gene progression classifier.
Only samples previously used for generating the classifier gene sets were used in the selection of best-performing genes on the new platform. Samples used for probe selection were not used for validation purposes. The probes used in the classifiers are shown in Supplementary File 2.
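The probe selection rule described above (keep a gene only if a probe reaches a Pearson correlation of at least 0.25 with the Affymetrix measurements, and take the best-correlated probe when several qualify) can be illustrated with a minimal Python sketch. The function names, the data layout, and the example values are hypothetical and are not taken from the original analysis.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / sqrt(sxx * syy)

def select_probes(probe_profiles, reference_profiles, threshold=0.25):
    """Pick, per gene, the probe that best tracks the reference platform.

    probe_profiles: {gene: {probe_id: [log2 ratios across samples]}}
    reference_profiles: {gene: [reference intensities, same sample order]}
    Returns {gene: (probe_id, r)} for genes whose best probe has r >= threshold.
    """
    selected = {}
    for gene, probes in probe_profiles.items():
        if not probes:
            continue  # no working probe for this gene on the new platform
        ref = reference_profiles[gene]
        best = max(((pid, pearson(vals, ref)) for pid, vals in probes.items()),
                   key=lambda item: item[1])
        if best[1] >= threshold:
            selected[gene] = best
    return selected

# Hypothetical example: two probes for GENE_A, one for GENE_B, four samples.
probe_profiles = {
    "GENE_A": {"A_1": [0.1, 0.9, 1.8, 3.2], "A_2": [2.0, 0.5, 1.5, 0.4]},
    "GENE_B": {"B_1": [1.0, -1.0, 1.0, -1.0]},
}
reference_profiles = {
    "GENE_A": [5.0, 6.0, 7.0, 8.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0],
}
chosen = select_probes(probe_profiles, reference_profiles)
```

In this toy input, GENE_A is retained through probe A_1, whereas GENE_B is dropped because its only probe correlates negatively with the reference profile.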
Maximum likelihood classifiers were constructed as previously described (9). Because the classifiers were initially identified using the Affymetrix GeneChip platform, the classifier group mean values were regenerated by training the classifiers with samples previously profiled on the Affymetrix platform (stage classifier, n = 18; recurrence classifier, n = 19; CIS classifier, n = 16; progression classifier, n = 15; see Supplementary File 1 for details on samples used for training). Of importance, none of the samples used to generate the classifiers were included in the validation set. In total, we profiled 404 tumors that all were used as independent test samples for one or more classifier validations. All microarray measurements were done blinded to patient diagnosis and outcome. We used the following patient selection criteria for testing the different classifiers: Out of the 404 tumors, we included 386 for stage classifier validation. The remaining 18 samples that were used for training the stage classifier were used for validation of other classifiers. Samples used to validate the CIS classifier are pTa and pT1 tumors from patients with known CIS status from either routine random biopsies taken at all visits to the clinic (at least two biopsies with CIS), or when CIS was diagnosed in the diagnostic sections from the analyzed tumor. Samples used to validate the recurrence classifier are pTa tumors with short recurrence frequency (<8 months) or at least 24-month recurrence-free survival (criteria used in the original study). Only samples from Danish patients were used. Samples used to validate the progression classifier were only samples from patients with pTa and pT1 tumors with no previous or synchronous muscle-invasive tumors.
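As a rough illustration of how a two-group maximum likelihood classifier built on group mean values can work, the Python sketch below sums per-gene log-likelihood differences and assigns the group by the sign of the score. It assumes Gaussian expression values with a variance shared between the two groups; this is an assumption made for the example and is not necessarily the formulation used in ref. 9. All names and numbers are hypothetical.

```python
def train_group_means(samples, labels):
    """Return {label: [mean expression per gene]} from training samples."""
    means = {}
    for label in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def log_likelihood_ratio(sample, mean_pos, mean_neg, variances):
    """Sum over genes of log N(x | mean_pos, var) - log N(x | mean_neg, var).

    With a shared variance per gene (an assumption of this sketch), the
    Gaussian normalizing constants cancel and only squared distances remain.
    """
    score = 0.0
    for x, mp, mn, var in zip(sample, mean_pos, mean_neg, variances):
        score += ((x - mn) ** 2 - (x - mp) ** 2) / (2.0 * var)
    return score

# Hypothetical training data: four samples, two genes.
train_samples = [[2.0, -1.0], [1.0, -2.0], [-1.0, 1.0], [-2.0, 2.0]]
train_labels = ["progression", "progression", "no progression", "no progression"]
means = train_group_means(train_samples, train_labels)

# Hypothetical test sample; unit variances are assumed for simplicity.
score = log_likelihood_ratio([1.0, -1.0],
                             means["progression"],
                             means["no progression"],
                             variances=[1.0, 1.0])
predicted = "progression" if score > 0 else "no progression"
```

A score above zero favors the first group and a score below zero the second. The magnitude of the score can be read as a measure of how strongly a sample is classified.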
Statistical procedures. Kaplan-Meier estimates, univariate and multivariate Cox regression analyses, and logistic regression analyses were done using the STATA 8.0 statistical analysis software.
Clinical and histopathologic variables for the patients included in this study are listed in Table 1. The median follow-up time for all patients was 40 months (range, 0-243 months). The median follow-up time for patients with non–muscle-invasive, non-progressing tumors was 51 months (range, 1-185 months). The previously published stage, recurrence, progression, and CIS signatures were measured by microarray gene expression profiling, and classification results were compared with clinical outcome and histopathologic variables. We observed an average Pearson correlation between replicate measurements of samples of 0.91. No systematic difference in gene expression patterns between samples from different countries was observed (Supplementary Fig. S1). For details on all clinical data and classification results, see Supplementary File 1.
Stage and recurrence classifier validation. Molecular classification of disease stage based on the 52-gene classifier was carried out for all patients (n = 386, 18 samples used for training), and a significant correlation between pathologic disease stage and classified disease stage was found (P < 0.001, χ2 test). Tumors classified as muscle invasive (stage T2-T4) had significantly lower cancer-specific survival times (Fig. 1A; log-rank test, P < 0.001). The stage classifier's ability to predict future stage progression of non–muscle-invasive bladder tumors reported in our original study was confirmed as well (Fig. 1B; log-rank test, P < 0.001; ref. 9). The previously published 26-gene signature for recurrence prediction showed no significant correlation to clinical outcome (results not shown).
Progression classifier validation. Molecular prediction of progression was carried out for all patients with non–muscle-invasive tumors and available follow-up (n = 294) based on the 88-gene progression signature (Fig. 2A). Progression-free survival analysis showed a long-term cumulative probability of progression of 40% when the classifier predicted progression, in contrast to a probability of <15% when it predicted no progression (log-rank test, P < 0.001; Fig. 1C). Long-term cumulative probabilities of cancer-specific survival were 65% and 85%, respectively (log-rank test, P = 0.001, Fig. 1D). These results were independent of age, sex, stage, grade, and BCG/mitomycin-C treatment (hazard ratio, 2.29; 95% confidence interval, 1.26-4.18; P = 0.007) in multivariate Cox regression analysis (Table 2). The fact that we did not find stage T1 to be associated with progression in the multivariate analysis may be explained by the inclusion of a high proportion of progressing Ta tumors in this study.
We correctly predicted progression to muscle-invasive or higher stage (T2-T4) in 37 of 56 cases (66% sensitivity), and 158 of 238 non-progressing cases were correctly classified (66% specificity). Positive and negative predictive values of 32% and 89%, respectively, were observed. These numbers have to be compared with a risk estimation based on clinical variables as it is used today (see Materials and Methods). Forty-eight of 184 clinical high-risk tumors progressed (86% sensitivity, 26% positive predictive value), whereas 102 of 110 clinical low-risk tumors did not (43% specificity, 93% negative predictive value). To evaluate whether the molecular risk estimation adds prognostic value to clinical risk assessment, we examined progression-free survival as a function of a combination of the progression classifier results and clinical risk categories (Fig. 1E). Clinical and molecular high-risk tumors had a cumulative probability of progression of nearly 50%, with a positive predictive value of 35%; in contrast, <5% progressed when clinical and molecular risk were low, with a negative predictive value of 95%. When clinical and molecular classification differed, the cumulative probability of progression was ∼20%; this includes low-grade Ta tumors with high-risk molecular profile as well as T1 and grade 3 tumors with low-risk molecular profile.
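The sensitivity, specificity, and predictive values quoted above follow directly from the reported counts. The short Python sketch below recomputes them; the function name is chosen for illustration only.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from a 2x2 table of counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Molecular progression classifier: 37 of 56 progressing cases and
# 158 of 238 non-progressing cases were classified correctly.
molecular = diagnostic_metrics(tp=37, fn=56 - 37, tn=158, fp=238 - 158)

# Clinical risk groups: 48 of 184 high-risk tumors progressed, and
# 102 of 110 low-risk tumors did not progress.
clinical = diagnostic_metrics(tp=48, fn=110 - 102, tn=102, fp=184 - 48)

for name, result in (("molecular", molecular), ("clinical", clinical)):
    print(name, {key: f"{100 * value:.0f}%" for key, value in result.items()})
```

Rounded to whole percentages, this gives 66%, 66%, 32%, and 89% for the molecular classifier and 86%, 43%, 26%, and 93% for the clinical risk grouping, in agreement with the values in the text.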
Molecular classification could also be used to guide treatment decisions by including exclusively those patients with a high classification power and eliminating those where classification is weak. We calculated the sensitivity and specificity of the progression classifier for different tail percentiles (Supplementary Fig. S2) and found, for example, that for the 50% fraction of patients classified with highest strength, the sensitivity and specificity for progression prediction were 83% and 65%, respectively.
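The tail-percentile analysis can be sketched in Python as follows. The sketch assumes that classification strength is the absolute value of the classifier score and that a score above zero predicts progression; both points, as well as the scores and outcomes listed, are illustrative assumptions rather than study data.

```python
def strongest_fraction(scores, outcomes, fraction):
    """Keep the given fraction of cases with the largest absolute score."""
    ranked = sorted(zip(scores, outcomes), key=lambda p: abs(p[0]), reverse=True)
    keep = max(1, int(fraction * len(ranked)))
    return ranked[:keep]

def sensitivity_specificity(cases, cutoff=0.0):
    """Sensitivity and specificity for (score, progressed) pairs."""
    tp = sum(1 for s, progressed in cases if progressed and s > cutoff)
    fn = sum(1 for s, progressed in cases if progressed and s <= cutoff)
    tn = sum(1 for s, progressed in cases if not progressed and s <= cutoff)
    fp = sum(1 for s, progressed in cases if not progressed and s > cutoff)
    sensitivity = tp / (tp + fn) if tp + fn else None
    specificity = tn / (tn + fp) if tn + fp else None
    return sensitivity, specificity

# Hypothetical classifier scores and observed progression status.
scores = [3.0, -2.5, 2.0, -1.8, 0.4, -0.3, 0.2, -0.1]
progressed = [True, False, True, False, False, True, False, True]

all_cases = list(zip(scores, progressed))
top_half = strongest_fraction(scores, progressed, fraction=0.5)

overall = sensitivity_specificity(all_cases)
confident = sensitivity_specificity(top_half)
```

In this toy data set, the weakly classified cases account for all of the errors, so restricting the evaluation to the most confidently classified half raises both measures.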
CIS classifier validation. Molecular classification of CIS was done using a 68-gene signature for the 150 patients with known CIS status (Fig. 2B). The CIS classifier correctly classified 36 of 45 samples with CIS (80% sensitivity) and correctly classified 71 of 105 with no CIS (68% specificity). These results were independent of age, sex, disease stage, and grade in multivariate logistic regression analysis (odds ratio, 5.78; 95% confidence interval, 2.25-14.88; P < 0.001; Table 3). As CIS is associated with a high risk of disease progression, the signature can also be considered as a progression risk signature. When using the CIS signature to predict progression for all patients with non–muscle-invasive tumors and available follow-up (n = 294), we obtained a sensitivity of 75% and a specificity of 55% with positive and negative predictive values of 28% and 90%, respectively.
Progression and CIS classifiers in combination. We tested whether a combined classification scheme that included both the progression and the CIS signatures could improve progression prediction. When using this combined classifier, we observed a hazard ratio of 4.6 (95% confidence interval, 1.87-11.52; P = 0.001) for the high-risk group (both positive) compared with the low-risk group (both negative) in a multivariate Cox regression analysis when stratifying for sex, age, stage, grade, and intravesical treatment. Kaplan-Meier plot of progression-free survival analysis as a function of this combined classifier is shown in Fig. 1F.
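The combined scheme amounts to a simple three-way grouping of the two classifier calls. A minimal Python sketch follows; the function name and the label given to discordant results are illustrative choices.

```python
def combined_risk(progression_positive, cis_positive):
    """Combine the progression and CIS classifier calls into a risk group."""
    if progression_positive and cis_positive:
        return "high"          # both signatures positive
    if not progression_positive and not cis_positive:
        return "low"           # both signatures negative
    return "intermediate"      # the two signatures disagree

# All four possible combinations of the two classifier calls.
groups = {
    (p, c): combined_risk(p, c)
    for p in (True, False)
    for c in (True, False)
}
```

The hazard ratio reported above compares only the "high" group with the "low" group; discordant cases form the third category.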
Prediction of disease outcome for patients diagnosed with non–muscle-invasive bladder cancer is a major clinical challenge. In previous studies of bladder cancer, we identified gene expression profiles associated with disease stage, disease progression, disease recurrence, and the presence of CIS. In the present work, we have validated these gene expression signatures in 404 tumors from independent series of patients from Sweden, England, Spain, France, and Denmark. To our knowledge, this is the largest clinical validation study of microarray-based molecular classifiers ever done. We found a highly significant concordance between pathologic disease stage and a 52-gene stage signature and between the presence of CIS and a 68-gene CIS signature. Furthermore, we found that an 88-gene progression signature predicts progression independently of standard clinical risk variables. Unfortunately, a signature for disease recurrence prediction did not show any correlation to clinical outcome. This observation has previously been reported by another group (16). The molecular signatures were able to classify tumors correctly, although the tumors used in this study were sampled in different countries with different freezing and RNA extraction procedures. This underlines the robustness of the classifiers and the gene expression microarray platform used.
All patients in the study were treated with transurethral resection of tumors and, in many cases, intravesical BCG or mitomycin-C treatment. Consequently, predictive classifiers are superimposed with treatment, and this obscures the outcome and makes end point monitoring difficult. This fact may explain the relatively low specificity for the progression classifier. In this work, we have exclusively used zero as the cutoff point between the classifier groups. However, other cutoff points could be used, for example, to increase the sensitivity of the test.
The classifiers for progression and CIS were developed using different Affymetrix gene expression microarrays and with different individual tumors. This is probably the reason that the published gene profiles show little overlap, although outcome variables are highly correlated. This has also been stressed in previous studies (17–19). As we showed here, the classification results may be improved by combining different signatures. A three-way approach was shown to define a high-risk group, a low-risk group, and an intermediate-risk group of patients. This approach is more related to the conventional way of clinical decision making; however, it showed a very high specificity. Future work will hopefully further enhance the performance of molecular diagnosis and risk prediction.
We showed the clinical benefit of applying the classifiers because molecular classification proved capable of improving clinical diagnosis and risk prediction. Molecular CIS diagnosis is also of utmost importance because the diagnosis of CIS is difficult: not all clinicians take random biopsies for CIS diagnosis, the number of biopsies varies, and the diagnosis may be associated with sampling error. Hence, we believe that this study represents an important step towards the clinical use of molecular diagnosis in bladder cancer. Further evaluation of the molecular risk prediction in a prospective, standardized setting is warranted and will hopefully open up the way to broader use of these molecular signatures in routine clinical analysis. In a potential clinical use of the RNA-based signatures, it is imperative to establish good standard operating procedures to secure a good RNA quality. Finally, we conclude that the results of this European multicenter study for validation of previously reported gene expression signatures have proved the value of the classifiers, especially for progression and CIS classification. The study documents the clinical utility of applying molecular classifiers to guide the decision for treatment regimen for patients initially diagnosed with non–muscle-invasive bladder cancer.
Grant support: Danish Cancer Society, The John and Birthe Meyer Foundation, The Danish Research Council, The Nordic Centre of Excellence in Molecular Medicine, The University of Aarhus, Aarhus University Hospital, Centre National de la Recherche Scientifique, Institut Curie, La Ligue Contre le Cancer (laboratoire associé), INCa, and Instituto de Salud Carlos III grants FIS 00/0745, C03/009, C03/010, G03/160, and G03/174.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
We thank Gitte Høj, Inge Lis Thorsen, Bente Pytlich, Gitte Stougård, Hanne Steen, and Pamela Celis for excellent technical assistance; the staff at the Departments of Urology, Clinical Biochemistry, and Pathology at Aarhus University Hospital; Pascale Maille and Pascale Soyeux for excellent technical assistance at the Departments of Pathology and Urology, Henri Mondor Hospital; Jo Brown for excellent assistance with sample collection and managing patient information and staff in the Pyrah Department of Urology, St. James's University Hospital, Leeds; Christer Busch for help with histopathologic evaluations and Karolina Edlund for excellent technical assistance at University Hospital, Uppsala, Sweden; the study monitors and physicians at Hospital del Mar (Barcelona) and Hospital Universitario de Elche for help with sample collection; and Josep Lloreta and Manolis Kogevinas for valuable contributions.