Purpose: Cancer of the urinary bladder is a common malignant disease in the western countries. The majority of patients presents with superficial tumors with a high recurrence frequency, a minor fraction of these patients experience disease progression to a muscle invasive stage. No clinical useful molecular markers exist to identify patients showing later disease progression. The purpose of this study was to identify markers of disease progression using full-genome expression analysis.
Experimental Design: We did a full-genome expression analysis (59,619 genes and expressed sequence tags) of superficial bladder tumors from 29 bladder cancer patients (13 without later disease progression and 16 with later disease progression) using high-density oligonucleotide microarrays. We used supervised learning for identification of the optimal genes for predicting disease progression. The identified genes were validated on an independent test set (74 superficial tumor samples) using in house-fabricated 60-mer oligonucleotide microarrays.
Results: We identified a 45-gene signature of disease progression. By monitoring this progression signature in an independent test set, we found a significant correlation between our classifications and the clinical outcome (P < 0.03). The genes identified as differentially expressed were involved in regulating apoptosis, cell differentiation, and cell cycle and hence may represent potential therapeutic targets.
Conclusions: Our results indicate that it may be possible to identify patients with a high risk of disease progression at an early stage using a molecular signature present already in the superficial tumors. In this way, better treatment and follow-up regimens could be assigned to patients suffering from superficial bladder cancer.
Bladder cancer is the fourth most common malignancy in males in the western countries (1). The disease presents in two different forms: superficial tumors (stages Ta and T1), where an attempt to local, organ-sparing treatment may be made, and muscle invasive cancers (stage T2+), requiring immediate radical treatment, if cure is intended. The superficial tumors form a heterogeneous group, spanning from completely benign, noninvasive papillary tumors, which rarely progress, to papillary or flat lesions showing malignant transformation to a greater or lesser degree (invasion of the lamina propria and high grade of dysplasia). The latter group has a risk of progression to muscle invasive cancer of up to 60% at long-term follow-up (2, 3) and is naturally in the focus of clinical interest, particularly if signs of malignant field disease are present [multiple or recurrent lesions and widespread carcinoma in situ (CIS)], indicating that malignant cells still might be left in the bladder following local treatment. The clinical evaluation of the progression risk is complex and error prone. Therefore, extensive research has been carried out to find reliable molecular markers of progression. However, none of these markers has been accepted into clinical routine.
The completion of the Human Genome Project and the recent developments in DNA microarray technology have paved the way for making systematic full-genome searches for prognostic and predictive molecular cancer markers. Several microarray studies have shown the predictive power of using different sets of genes (expression signatures) for predicting disease outcome. Successful prediction of diffuse large B-cell lymphoma outcome has been shown recently (4–6), and prediction of metastatic potential and outcome in patients with breast cancer seems possible (7–9). Similar predictors of outcome have been reported on renal cell carcinoma (10), embryonal tumors of the central nervous system (11), early-stage lung carcinoma (12), and solid primary tumors in general (13). In a recent expression profiling study of bladder carcinoma using ∼5,000 genes, we constructed molecular classifiers for tumor stage and tumor recurrence. Notably, the stage classifier identified superficial bladder tumors with progression potential, as these seemingly benign papillary tumors were assigned to a higher-stage class (14). In this early study, the classifier was limited to the use of ∼5,000 genes and did not include a systematic search for genes predicting progression to a higher stage.
In this present study, we searched for a progression gene expression signature by delineating nonprogressing tumors from progressing tumors using high-density oligonucleotide microarrays. We profiled a total of 29 bladder tumor samples: 13 early-stage bladder tumor samples without subsequent stage progression (median follow-up time, 56 months) and 16 early-stage bladder tumor samples with subsequent stage progression (median time to progression, 8 months). The identified progression signature was monitored in an independent test set of 74 tumor samples, with a median follow-up time of 52 months for the nonprogressing tumors, and compared with the clinical outcome of these patients. We used two different clinical outcome end points, one was the progression of the disease as verified by a higher-stage tumor and another was assignment of the patient to a high-risk group based on clinical and pathologic variables that are documented to be associated with a high risk of disease progression. The latter was needed as the bladder tumor disease has a very long time course that may span >10 years.
Materials and Methods
Patient samples. Bladder tumor biopsies were obtained directly from surgery after removal of the necessary amount of tissue for routine pathology examination. All slides were scored by one experienced pathologist. The tumors were frozen at −80°C in a guanidinium thiocyanate solution for preservation of the RNA. Informed consent was obtained from all patients, and the protocols were approved by the scientific ethical committee of Aarhus County. The samples for the no progression group were selected by the following criteria: (a) superficial tumors with no prior muscle invasive tumors and (b) a minimum follow-up period of 12 months to the most recent routine cystoscopy examination of the bladder with no occurrence of tumors of higher stage. The samples for the progression group were selected by two criteria: (a) superficial tumors with no prior muscle invasive tumors and (b) subsequent progression to a higher-stage tumor.
Clinical risk scores. The estimation of clinical risk is complex and depends on a comprehensive individual evaluation of the clinical and pathologic findings and on the experience of both clinician and pathologist. Aware of this, we estimated the risk of progression by means of scores based on the available clinical data. The scores consist of the following clinical risk factors: stage (Ta versus T1), grade of dysplasia (grade 1-2 versus 3), tumor size (<3 versus >3 cm), urine cytology (no tumor cells versus tumor cells), preselected site biopsies (no dysplasia versus dysplasia), multiplicity (in high-grade disease), and, if relevant, previous history and the results of bimanual palpation in anesthesia. The scoring principles are detailed in Supplementary Table S1. The risk factors are partly dependent on each other. The grade of dysplasia and the results of the urine cytology and preselected site biopsies are overlapping, because tumor cells in the urine may stem from a high-grade tumor or CIS. The presence of concomitant CIS was assumed when at least one preselected site biopsy showed CIS or when low-grade intraurothelial neoplasia was found in connection with a grade 3 tumor or positive urine cytology. Otherwise, low-grade intraurothelial neoplasia was scored as suspicious for CIS. The maximum score was 5 for newly diagnosed tumors and 6 for tumors with previous history.
Risk scoring was done twice during the disease course: at the time of the microarray analysis and, as an outcome score, at the end of the follow-up period, considering all clinical data collected during the course. A tumor with a score of maximum 2 was considered “low risk” at microarray analysis time and a score of ≥4 was considered “high risk.” Those in between were scored as “medium risk.” Because the disease course may change in both directions, clinical data older than 60 months were not considered. At end of follow-up, a tumor was considered low risk if the score was <2.5 and high risk if the score was ≥4.
Customized GeneChip design, normalization, and expression measures. The biotinylated targets were hybridized to the Eos Hu03, a customized Affymetrix GeneChip (Santa Clara, CA) oligonucleotide array comprising 59,619 probe sets representing 46,000 unique sequences, including known genes, expressed sequence tag clusters, and FGENESH-predicted exons that were based on the first draft of the human genome. The Eos Hu03 probe sets consist of nonoverlapping perfect match probes only, most probe sets having six or seven probes. Due to the nonrandom, positive correlation between the perfect match and the mismatch probes, only perfect match probes were used on the Eos Hu03 array. Independent analysis has shown increased precision and higher sensitivity and specificity of perfect match–derived data (15). Normalization of the gene expression data was done as follows. The probe-level intensity data from each array were fitted to a fixed distribution using an inverse function to map the empirical cumulative distribution of intensities to the desired distribution. This procedure is akin to other per-chip normalization procedures, such as fixing the mean and SD of each chip to a standard value, except it is more stringent in that it fixes the entire distribution rather than one or two variables. The purpose of per-chip normalization is to remove between-chip variation on the assumption that it is attributable to nonbiological factors (i.e., technical noise). The scale variable for the γ distribution was chosen to yield a distribution with an arbitrary mean value of 300, and the shape variable of 0.81 was chosen to reproduce the typical shape of the empirical distribution seen in good samples.
A single measure of average intensity was calculated for each probe set using Tukey's trimean of the intensity of the constituent probes. The trimean is a measure of central tendency that is resistant to the effects of outliers. Finally, a background subtraction was applied to each average intensity measure to correct for nonspecific hybridization. The average intensity measure of a “null” probe set consisting of 491 probes with scrambled sequence was subtracted from all of the other probe sets on the chip.
cRNA preparation, array hybridization, and scanning. Preparation of cRNA from total RNA and subsequent hybridization and scanning of the customized GeneChip microarrays (Eos Hu03) were done as described previously (14).
Custom oligonucleotide microarray procedures. Three to four 60-mer oligonucleotides were designed for each of the 45 genes in the progression predictor and for each of the 32 genes in the previously published stage classifier using Array Designer 2.0. As template for the probe design, we used Ref-Seq sequences. The probes were spotted on Codelink slides (Amersham Biosciences, Inc., Hillerød, Denmark) using a VersArray Chipwriter from Bio-Rad (Hercules, CA). All samples were homogenized using a Bio101 Fastprep from Savant Instruments, Inc. (Holbrook, NY). Total RNA was extracted from all samples using standard protocols. Reverse transcription was done on 5 μg total RNA for 1 hour at 42°C using a T7 oligo(dT)24 primer and SuperScript II reverse transcriptase (Life Technologies, Gaithersburg, MD). Second-strand cDNA synthesis was done for 2 hours at 16°C using Escherichia coli DNA PolI, DNA ligase, and RNase H (Life Technologies) followed by incubation in NaOH/EDTA for 10 minutes at 65°C. In vitro transcription was done for 6 hours at 37°C using aminoallyl-UTP and T7 Megascript kit (Ambion, Austin, TX). The produced cRNA was purified using RNeasy spin columns (Qiagen, Valencia, CA) followed by coupling of Cy3 and Cy5 dyes for 2.5 hours at room temperature. Finally, the labeled cRNA was fragmented for 30 minutes at 60°C in a 50 mmol/L ZnCl2 solution. Each Cy3- and Cy5-labeled target sample (1 μg) was applied to the microarray slide for 16 hours at 42°C. Before scanning, all slides were washed as described previously (16) and scanned.
The probes were spotted in duplicate and all hybridizations were carried out twice. The samples were labeled with Cy3 and a common reference pool was labeled with Cy5. The reference pool was made by pooling the cRNA generated from the investigated samples and from universal human reference RNA (Stratagene, La Jolla, CA). Following scanning of the glass slides, the fluorescent intensities were quantified and background adjusted using SPOT 2.0 (17). Data were subsequently normalized using a LOWESS pen-normalization procedure implemented in the SMA package to R. Log-average expression ratios were calculated from the four measurements of each probe. To select the best performing oligonucleotide probe for each of the genes, seven of the samples from the progression predictor training set and nine of the samples from the stage classifier training set were reanalyzed on the custom oligonucleotide microarray platform. The obtained expression ratios were compared with the expression levels from Affymetrix GeneChip using Pearson correlations. The probes showing the highest correlations to Affymetrix GeneChip expression levels were selected to represent the classifier genes. For three of the genes in the 45-gene progression signature, it was not possible to select a probe with a positive correlation coefficient to the Affymetrix GeneChip profile. These genes were not included for predicting prognosis in the test samples.
Gene expression data analysis. Before analyzing the expression data from the Eos Hu03 GeneChip, all control probes were removed. Furthermore, we added 300 to all values to avoid negative values. Only probes with expression levels above 400 in at least eight experiments and with a maximum/minimum of ≥1.6 were selected. This filtering generated a gene set consisting of 6,647 probes for further analysis. Average linkage hierarchical cluster analysis of the tumor samples was carried out using a modified Pearson correlation as similarity metric (18). Genes and arrays were median centered and normalized to the magnitude of 1 before clustering. We used the GeneCluster 2.0 software for the supervised selection of markers and for performing permutation tests. The 45 genes for predicting progression were selected by t statistics and the leave-one-out cross-validation performance as described previously (14).
To make comparisons between the two microarray types, we rebuild the classifiers using the training samples analyzed on the custom microarray platform. In this way, we recalculated the group means and variances on the new platform for both progression and stage classifier. The samples in the test set were classified as described previously (14).
We analyzed gene expression changes between the two groups of tumors by hybridizing the labeled RNA samples to customized Affymetrix GeneChip with 59,619 probe sets (∼95% transcriptome coverage). See Supplementary Table S2 for description of patient disease courses. Low expressed and nonvarying probe sets were eliminated from the data set and the remaining 6,647 probe sets that showed variation across the tumor samples were subjected to further analysis.
Unsupervised hierarchical cluster analysis of tumor samples. Initially, we analyzed the gene expression differences between the tumor samples using unsupervised hierarchical cluster analysis (Fig. 1). The cluster dendrogram showed a notable distinction between the nonprogressing and the progressing tumors when using the 3,197 most varying probe sets (SD ≥ 75) for clustering. All tumors that later became muscle invasive were found in the same branch of the dendrogram, and a subcluster in this branch contained exclusively these tumors. Most of the Ta tumors that showed later upstaging to T1 were found in the progression branch as well. Only two of these samples (825-3 and 112-2) were found in the nonprogression branch of the cluster dendrogram and two of the nonprogressing samples (815-1 and 150-6) were found in the progression branch. χ2 test for independence between cluster group and progression gave a P of 0.0001. This distinct separation of the samples indicated a considerable biological difference between the two groups of tumors. Based on the finding that the progression event from Ta to T1 and from T1 to T2+ may share the same distinct gene expression patterns, we decided to look for a general signature of disease progression disregarding histologic staging of the tumors.
Construction of a molecular progression classifier. To identify the gene set that gives the best classification results using the lowest number of genes, we built a classifier using the leave-one-out cross-validation approach as described previously (19). Selecting the 100 best genes in each cross-validation loop gave the lowest number of prediction errors (five errors, 83% correct classification) in our training set consisting of the 29 tumors (see Supplementary Fig. S1). As in our previous study, we used a maximum likelihood classification approach. We selected a gene expression signature consisting of those 45 genes that were present in 75% of the cross-validation loops, and these represented our optimal gene set for prognosis classification (Fig. 2A; Supplementary Table S3). Permutation analysis revealed that for all of the 45 genes similar good markers were only found in 5% of the 500 permuted data sets.
Validation of the progression classifier using independent samples. The significance of the 45-gene expression signature was validated using an independent test set consisting of 74 superficial bladder tumors: 60 tumors from patients with no progression of the disease (median follow-up time, 56 months) and 14 tumors from patients with later disease stage progression (median time to progression, 13 months). A 60-mer oligonucleotide microarray was constructed for the test set validation with probes for the progression gene expression signature and our previously published 32-gene stage classifier. Total RNA from the test set samples were amplified, Cy3 labeled, and hybridized to the oligonucleotide microarray together with a Cy5-labeled common reference sample. Appropriate normalization and background adjustments of the microarray data were carried out following scanning of the microarray slides (see Materials and Methods).
To make robust classifications, we defined that a successful classification was when the difference between the distances to the two classes was >30%. Applying less stringent or no cutoff limits resulted in poorer classification performance, whereas more stringent cutoff limits gave better classification performance. However, the gain in classification performance simultaneously resulted in a decrease in the number of samples classified (see Supplementary Fig. S2). Using the 30% cutoff limit, we classified 51 of the test set samples and found a significant correlation between disease stage progression and our progression classifier predictions (χ2 test; P < 0.03; Fig. 2B). To compensate for differences in follow-up time and in time to progression, we calculated Kaplan-Maier progression estimates, which showed a log-rank test value of 0.03 (Supplementary Fig. S3). Using the 30% cutoff level, the progression classifier predicted the outcome correctly for 9 of 10 patients who showed subsequent disease progression to a higher stage. Of the 41 nonprogressing tumors, we classified 21 as progressing, and only 1 of 20 patients with the no progression signature showed later disease progression to stage T1. Based on these numbers, we obtained sensitivity and specificity values of 0.90 and 0.49, respectively. The positive and negative predictive values are 0.30 and 0.95, respectively. The low positive predictive value may partly reflect the fact that some high-risk patients were radically treated, and cured, by radiotherapy (two patients) or received intravesical Bacillus Calmette-Guerin instillations (nine patients). In other cases, the progression event may still occur later in the course (or may have occurred if the patient were not deceased of other causes), and the classifier thus failed due to limited follow-up time. The high negative predictive value is clinically interesting. Transferred to the clinic, it indicates that patients with Ta or T1 tumors classified as nonprogressors could be taken out of follow-up programs as the risk of disease progression would be as small as 5%. This would have both human and economic benefits.
Above we used a clinical outcome (stage progression) as end point, but one could also use the clinical risk. We therefore assigned each patient a risk score based on all available clinical and histopathologic risk variables, such as tumor stage, grade, size, concomitant CIS, tumor multiplicity, and recurrence pattern during the complete course of the disease (see Supplementary Table S1). Then, we compared the ability of our array classifier to identify the patients defined as high risk at end of follow-up based on classification of the early superficial tumor we analyzed several months before. Using this end point, we found a highly significant correlation between the high-risk group and our classifications (χ2 test; P < 0.00001), with a sensitivity of 0.95 and a specificity of 0.67. The positive and negative predictive values obtained were 0.67 and 0.95, respectively.
Clinical risk scores at time of microarray analysis were also calculated based on the clinical and histologic risk variables available at this early time point (see Fig. 2B; Supplementary Table S1). From these scores, it was evident that the gene expression classification adds predictive information at an early time point. Of the 30 samples classified as progression, 10 samples (33%) were scored as low or medium risk at time of microarray analysis and increased to high risk after a long follow-up period. For the patients classified as no progression, it was only 5% (1 of 21 patients) that showed this increase in risk score.
For comparison, we also classified the 51 samples analyzed with the progression classifier using our 32-gene stage classifier and found a significant correlation between outcome and stage classifications made (χ2 test; P < 0.004). The stage classifier identified 9 of the 10 tumors from patients who showed later disease stage progression and only 1 of the samples classified as no progression showed later progression. The sensitivity and specificity obtained using the stage classifier was 0.9 and 0.6, respectively. Furthermore, the obtained positive and negative prediction values were 0.36 and 0.96, respectively. However, we did not obtain a significant correlation between our stage classifier predictions and the clinical risk groups, indicating that the two classifiers provide independent information.
Delineation of the best disease progression markers. To acquire a broader view of the gene expression regulations between the progressing and the nonprogressing samples, we delineated the differences by selecting the 200 most significantly up-regulated genes in each group using t statistics (Fig. 3; Supplementary Table S4). We estimated the significance of the identified differentially expressed genes by performing 500 permutations of the sample labels. This permutation analysis revealed that it was highly unlikely to find as good markers by chance, as similar good markers were only found in 5% of the permutated data sets (see Supplementary Table S4).
Hierarchical cluster analysis of primary and secondary tumors. Expression profiling of the metachrone higher-stage tumors could provide important information on the degree of expression similarities between the primary and the secondary tumors. Tissues from secondary tumors were available from 14 of the patients with disease progression and these were hybridized to the customized Affymetrix GeneChip. Unsupervised hierarchical cluster analysis of all tumor samples based on the 3,213 most varying probe sets showed that tumors originating from the same patient in nine of the cases clustered tightly together, indicating a high degree of intraindividual similarity in expression profiles (Fig. 4). Notable, one tight clustering pair of tumors was a Ta and a T2+ tumor (patient 941). It was remarkable that Ta and T1 tumors and T1 or T2+ tumors from a single individual were more similar than, for example, Ta tumors from two individuals. There was no correlation between presence and absence of the tight clustering of samples from the same patient and time interval to tumor progression. The tight clustering of the nine tumor pairs probably reflects the monoclonal nature of many bladder tumors (20). A set of genomic abnormalities like chromosomal gains and losses (21) or epigenetic changes characterize bladder tumors of different stages from single individuals, and such physical abnormalities could be one of the causes of the strong similarity of metachronous tumors. The fact that five of the tumor pairs clustered apart could reflect an oligoclonal origin of these tumors.
The majority of patients suffering from bladder cancer present with multiple successive superficial (Ta and T1) tumors and the prognosis and treatment regimens depend on clinical and histologic characterization of the tumors. This procedure is difficult and error prone and, although followed tightly, many patients with superficial bladder cancer experience disease progression. In this study, we have identified a 45-gene expression signature, which may be used to identify patients at risk of disease progression.
The fact that several of the nonprogressing tumors that showed the progression signature did not progress may in part be explained by treatment and limited follow-up data. All patients involved in the study were continuously followed and newly diagnosed tumors were removed using transurethral resection. Several patients, for example, received intravesical Bacillus Calmette-Guerin installations, which reduces the risk of disease recurrence and progression (22). Consequently, many of these nonprogressing patients could, at least in theory, have experienced disease progression if they were not treated. This is supported by the fact that many of these patients were scored as high-risk patients based on all known clinical risk factors recorded during follow-up. We therefore used two end points for evaluation of our progression classifier, either de facto progression to a higher stage or assignment of the tumor to the high-risk group at end of follow-up. Such an approach is needed when conducting predictive studies, as the natural history of the individual disease cannot be established without leaving the patient untreated. Avoiding this approach could lead to erroneous discard of a well-working classifier that could be of practical clinical value.
Both the present progression classifier and the previously reported stage classifier (14) showed a predictive power, although they were trained on two different end points and although they contain completely different genes. Interestingly, only the progression classifier correlated with increased risk score. This indicates that certain sets of genes are related to the stage, the level of invasiveness, of the tumors, whereas others reflect grade, presence of CIS, multiplicity of tumors, etc.
Hierarchical cluster analysis of the tumors in the training set grouped the samples according to progression disregarding the stage of the tumor and the degree of progression (Ta-T1 or T1-T2+). Based on this unsupervised information, we decided to look for a general signature of progression present in both Ta and T1 tumors. This is remarkable as it is well known that T1 tumors have more genomic changes than Ta tumors and in many ways resemble muscle invasive tumors more than Ta tumors. However, recent data from breast cancer (8, 9) and studies of cancer in general (13) indicate that it is possible to find these signatures very early in the disease course.
Cluster analysis of the metachronous tumor samples showed a very close clustering of tumors from the same individual irrespective of these being of different stages. This is remarkable as the different stages otherwise cluster separately, when only looking at one from each individual. It may point to the monoclonal nature of bladder cancer and probably indicates that the chromosomal aberrations, such as gains and losses, as well as methylation events lead to an individual expression profile highly specific for a given tumor. Many of these changes may not be of biological importance in terms of being necessary for a certain malignant disease course.
The delineation of the most significantly differentially expressed genes identified several interesting markers. Among the genes up-regulated in the nonprogressing group, we found the SERPINB5 and FAT tumor suppressor genes as well as the FGFR3 gene, which is frequently mutated in superficial bladder tumors with low recurrence rates (23). Furthermore, we identified the BNIP3L, BIRC4, NCKAP1, and BIRC6 genes, which are involved in the apoptotic cell death pathway up-regulated in the nonprogressing group. Among the genes up-regulated in the progressing group, we found the CDC25B (24), CDC20 (25), and MCM7 (26) genes, which are involved in regulating cell cycle and cell proliferation. Furthermore, in this group, we identified the WHSC1 and GRB7 genes, which have been predicted/computed (Gene Ontology) to be involved in oncogenic transformation. The PPARD gene was also identified as up-regulated in the tumors that show later progression. Disruption of this gene was found to decrease tumorigenicity in colon cancer cells (27). Furthermore, PPARD regulates VEGF expression in bladder cancer cell lines (28). The 45 genes in the derived optimal prognosis signature were also found among the 200 best markers; however, the cross-validation approach also identified other interesting markers like BIRC5 (survivin), an apoptosis inhibitor that is up-regulated in the tumors that show later progression. BIRC5 has been reported to be expressed in most common cancers (29).
The ability to predict outcome of cancer using molecular classifiers as described in this and other studies is very promising and may soon be combined with traditional histopathologic examinations for offering better treatment regimens to patients suffering from cancer.
Grant support: Karen Elise Jensen Foundation, Danish Cancer Society, John and Birthe Meyer Foundation, The University and County of Aarhus, and Danish Research Council.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank the technicians (Inge-Lis Thorsen, Gitte Høj, Hanne Steen, Bente Pytlich, Birgitte Stougaard, and Bente Devantié) for excellent assistance and the staff at the Departments of Urology, Clinical Biochemistry, and Pathology at Aarhus University Hospital.