Multiple transcriptional events take place when normal urothelium is transformed into tumor tissue. These can now be monitored simultaneously by the use of oligonucleotide arrays, and expression patterns of superficial and invasive tumors can be established. Single-cell suspensions were prepared from bladder biopsies (36 normal, 29 tumor). Pools of cells were made from normal urothelium and from pTa grade I and II and pT2 grade III and IV bladder tumors. From these suspensions, and from 10 single-tumor biopsies, labeled cRNA was hybridized to oligonucleotide arrays carrying probes for 6500 genes. The obtained expression data were sorted according to a weighting scheme and were subjected to hierarchical cluster analysis of tissues and genes. Northern blotting was used to verify the array data, and immunohistology was used to correlate between RNA and protein levels. Hierarchical clustering of samples correctly identified the stage using both 4076 genes and a subset of 400 genes covarying with the stages and grades of tumors. Hierarchical clustering of gene expression levels identified several stage-characteristic, functionally related clusters, encoding proteins that were related to cell proliferation, oncogenes and growth factors, cell adhesion, immunology, transcription, proteinases, and ribosomes. Northern blotting correlated well with array data. Immunohistology showed a good concordance between transcript level and protein staining. The study indicates that gene expression patterns may be identified in bladder cancer by combining oligonucleotide arrays and cluster analysis. These patterns give new biological insight and may form a basis for the construction of molecular classifiers and for developing new therapy for bladder cancer.

Multiple molecular events take place when normal epithelia are transformed into tumor tissue and later acquire an invasive potential. Previously, these events have been examined at the single-gene level; however, the newly developed array technology has made it possible to simultaneously monitor thousands of genes during tumor evolution and progression (1). To generate meaningful information from such large datasets, new methods have been developed that cluster closely related genes or tissues based on their expression profiles (2, 3). The purpose is two-fold: to identify genes having a coordinated expression change and thereby obtain insight into possible regulatory patterns (4, 5, 6, 7), and to classify biological samples based on expression (5, 8). The latter may then form a basis for building molecular class predictors, potentially useful for tumor staging and prediction of clinical outcome, as recently demonstrated (5, 9, 10). Two recent studies (3, 4) that used two-way clustering of thousands of genes were able to separate biopsies into those originating from cancerous and noncancerous tissue, and cell lines from in vivo tumors. In both studies it was possible to identify clusters of functionally related genes, some associated with proliferation, IFN-regulation, ribosomes, or stromal components like lymphocytes and smooth muscle cells.

The bladder cancer disease is characterized by two basically different disease courses: one in which patients suffer from multiple recurrences of superficial tumors, and one in which the patients’ primary tumor is muscle invasive (11, 12). The group with superficial recurrences includes patients with the relatively benign superficial pTa tumors (∼5% of which will progress to muscle invasive cancer), and patients with superficially invasive pT1 tumors (∼25% of which will progress to muscle invasive disease; Ref. 13). At present, no marker can predict which of the three disease courses a patient with a superficial tumor will follow: no recurrence, superficial recurrence, or invasive cancer. With the present study, we have aimed at identifying gene clusters that characterize the superficial and the muscle invasive diseases, as the first step in a detailed gene expression-based characterization of bladder cancer.

DNA arrays have been used primarily on relatively homogeneous biological samples such as yeast cultures, cell lines, and leukemias. The application on complex mixtures of cells such as tissue biopsies may create data that are more difficult to interpret because of the many different cell types providing signals (3). One approach to this problem is immunohistological or in situ hybridization detection of the cell type providing a specific signal, as demonstrated with the STAT1 gene product in breast cancer (4). We used immunohistology to demonstrate the cellular origin of eight proteins and to correlate the level of protein to the level of transcript detected on the array. Furthermore, to reduce the number of transcripts from stromal tissue and to enrich for tumor cells, tumors were disintegrated into single-cell suspensions by a method previously used for preparation of bladder tumor cells for flow cytometry (14, 15).

Two different expression measures, based on fold change and numerical difference between transcript level in tumor and normal urothelium, were tested for the clustering of tissues and genes. Of these, fold change proved to be the most informative. With both methods, however, several important functionally related clusters of genes were identified. These include clusters related to cell proliferation, oncogenes and growth factors, cell adhesion, immunology, transcription, and proteinases, as well as a large ribosomal gene cluster.

Material.

Thirty-nine bladder tumor biopsies (29 formed the first tumor set, 10 the second) were sampled from patients after the removal of the necessary amount of tissue for routine pathology examination. Grading according to Bergkvist et al. (16) was made by one pathologist. In the first hypothesis-generating tumor set, single-cell suspensions were made by immediately disintegrating biopsies on ice with a scalpel and a syringe followed by filtering through a 100 μm filter, as described previously (14, 15). The cells were inspected and counted under a microscope, and if RBCs occurred frequently, the sample was discarded to reduce signals from leukocytes in peripheral blood. When pools were made, RNA corresponding to a similar number of cells was used from each of six tumors all having the same stage and grade. From the first tumor set, a total of four tumor pools were made (pTa grade I pool, pTa grade II pool, pT2+ grade III pool, and pT2+ grade IV pool), and the single tumors examined were 335 (stage pTa grade I), 837 (pTa grade II), 901 (pTa grade III), 320 (pT1 grade III), and 713 (pT2+ grade III). For generation of a second dataset to test the reproducibility of the first set on biopsies, the tumor biopsies examined were 709, 928, 930, 934, 968 (pTa grade II), and 875, 937, 1005, 1078, 1133 (pT2+ grade III).

Cells from normal bladder mucosa biopsies from 36 patients with prostatic hyperplasia or urinary incontinence were pooled to obtain a normal urothelial reference. Informed consent was obtained in all of the cases, and protocols were approved by the local scientific ethical committee.

cRNA Preparation.

Total RNA was isolated using the RNAzol B RNA isolation method (WAK-Chemie Medical GmbH). Poly(A) + RNA was isolated by an oligo-dT selection step (Oligotex mRNA kit; Qiagen). One μg of mRNA was used as starting material for the cDNA preparation. The first and second strand cDNA synthesis was performed using the SuperScript Choice System (Life Technologies) according to the manufacturer’s instructions, except using an oligo-dT primer containing a T7 RNA polymerase promoter site. Labeled cRNA was prepared using the MEGAscript In Vitro Transcription kit (Ambion). Biotin-labeled CTP and UTP (Enzo) were used in the reaction together with unlabeled NTPs. After the IVT reaction, the unincorporated nucleotides were removed using RNeasy columns (Qiagen).

Array Hybridization and Scanning.

On the basis of previously published methods (1, 17, 18, 19, 20), 10 μg of cRNA was fragmented at 94°C for 35 min in a fragmentation buffer containing 40 mM Tris-acetate (pH 8.1), 100 mM KOAc, and 30 mM MgOAc. Prior to hybridization, the fragmented cRNA in a 6× SSPE-T hybridization buffer (1 M NaCl, 10 mM Tris (pH 7.6), and 0.005% Triton) was heated to 95°C for 5 min and subsequently to 40°C for 5 min before loading onto the Affymetrix probe array cartridge. The probe array was then incubated for 16 h at 40°C at constant rotation (60 rpm). The washing and staining procedure was performed in the Affymetrix Fluidics Station. The probe array was exposed to 10 washes in 6× SSPE-T at 25°C followed by 4 washes in 0.5× SSPE-T at 50°C. The biotinylated cRNA was stained with a streptavidin-phycoerythrin conjugate (10 μg/ml; Molecular Probes, Eugene, OR) in 6× SSPE-T for 30 min at 25°C followed by 10 washes in 6× SSPE-T at 25°C. The probe arrays were scanned at 560 nm using a confocal laser-scanning microscope with an argon ion laser as the excitation source (Hewlett Packard GeneArray Scanner G2500A). The readings from the quantitative scanning were analyzed by the Affymetrix Gene Expression Analysis software. For comparison from array to array, these were scaled to a global intensity of 150, as published previously (17, 19, 20).
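The scaling step can be illustrated with a minimal sketch. It assumes that the global intensity is the plain arithmetic mean of all probe-set signals on an array; the exact averaging rule used by the vendor software is not described here, and the function name and example values are hypothetical.

```python
def scale_to_global_intensity(signals, target=150.0):
    """Rescale one array's signals so that their mean equals `target`.

    Assumption: "global intensity" is taken as the arithmetic mean of
    all probe-set signals; the vendor software may average differently.
    """
    mean_signal = sum(signals) / len(signals)
    if mean_signal == 0:
        raise ValueError("cannot scale an array whose mean signal is zero")
    factor = target / mean_signal
    return [s * factor for s in signals]


# Hypothetical raw signals from two arrays of different overall brightness.
array_a = [50.0, 100.0, 150.0]
array_b = [100.0, 200.0, 300.0]
scaled_a = scale_to_global_intensity(array_a)
scaled_b = scale_to_global_intensity(array_b)
```

After scaling, both hypothetical arrays have the same mean signal, so that a given probe set can be compared directly between them.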

Northern Blotting.

Total RNA, 0.5–4 μg per lane, was separated in 1.5% agarose-formaldehyde gels, transferred onto Zeta-Probe nylon membrane (Bio-Rad) by positive pressure (Posiblotter; Stratagene) and immobilized by baking for 20 min at 120°C. The filters were hybridized with digoxigenin-labeled RNA transcribed from 600–1000 bp PCR products containing a T7 promoter incorporated via the oligo-dT primers used for reverse transcription. Filters were hybridized with 10 ng of probe per ml of ULTRAhyb (Ambion) hybridization solution at 68°C for 16 h and were washed to a stringency of 0.1× SSC at 68°C. Specific hybridization was detected by reacting the membrane with monoclonal anti-digoxigenin antibodies conjugated with alkaline phosphatase, incubating with ECF chemifluorescence substrate (Amersham Pharmacia) and scanning on a Storm 840 (Molecular Dynamics). The hybridization signals were quantified with ImageQuant 5.0 software.

Immunohistochemistry.

Four-μm sections were cut from the paraffin-embedded tissue blocks, mounted, and deparaffinized by incubation at 80°C for 10 min, followed by immersion in heated oil (Estisol 312; Estichem A/S, Copenhagen, Denmark) at 60°C for 10 min and rehydration. Antigen retrieval was achieved in TEG buffer using microwaves at 900 W. The tissue sections were cooled in the buffer for 15 min before a brief rinse in tap water. Endogenous peroxidase activity was blocked by incubating the sections with 1% H2O2 for 20 min, followed by three rinses in tap water, 1 min each. Then the sections were soaked in PBS buffer for 2 min. The next steps were modified from the descriptions given by Oncogene Science Inc., in the Mouse Immunohistochemistry Detection System (XHCO1; UniTect, Uniondale, NY). Briefly, the tissue sections were incubated overnight at 4°C with primary antibody (against β2-microglobulin (Dako), Cytokeratin 8, Cystatin C (both from Europa), junB, CD59, E-cadherin, apo-E, Cathepsin E (all from Santa Cruz)), followed by three rinses in PBS buffer, 5 min each. Afterward, the sections were incubated with biotinylated secondary antibody for 30 min, rinsed three times with PBS buffer, and subsequently incubated with avidin-biotinylated horseradish peroxidase complex (AEC) for 30 min, followed by three rinses in PBS buffer. Staining was performed by incubation with 3-amino-ethylcarbazole for 10 min. The tissue sections were counterstained with Mayer’s hematoxylin, washed in tap water for 5 min, and mounted with glycerol-gelatin. Positive and negative controls were included in each staining round with all of the antibodies.

Cluster Analysis.

The scaled AvgDif measures (1) calculated by Affymetrix software were extracted from each array assay. Only the 4067 genes scored as present in at least one of the tissues were considered. For calculation, all of the AvgDif measures below 20 were set to 20. For each tumor tissue and each gene, the AvgDif from normal urothelium was either subtracted from the tumor AvgDif, to define the difference, or used as the divisor of the tumor AvgDif, with the natural logarithm of the ratio defining the “log-fold” relative measure.
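The two relative measures can be written out as a short sketch. The floor of 20 and the use of the natural logarithm are taken from the description above; the function name and the example values are hypothetical.

```python
import math

FLOOR = 20.0  # AvgDif values below 20 are set to 20 before calculation


def relative_measures(tumor_avgdif, normal_avgdif, floor=FLOOR):
    """Return (difference, log_fold) for one gene in one tumor tissue."""
    t = max(tumor_avgdif, floor)
    n = max(normal_avgdif, floor)
    difference = t - n          # numerical difference, tumor minus normal
    log_fold = math.log(t / n)  # natural logarithm of the fold change
    return difference, log_fold


# Hypothetical example: a transcript at 200 in tumor and 100 in normal.
diff, lf = relative_measures(200.0, 100.0)
```

The floor keeps the ratio defined for absent or negative AvgDif values, and it maps two transcripts that are both below the floor to a difference and a log-fold of zero.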

For tissue clustering, the relative expression measures for each tumor tissue (log-fold or difference) were used to cluster tissues by a hierarchical method using Euclidean distance between tissues X and Y [average-linkage, agglomerative hierarchical clustering (UPGMA), our UNIX implementation of the algorithm described in Ref. 2].

Euclidean distance was calculated as follows:

\[d(X, Y) = \sqrt{\sum_{i=1}^{n} (x_{i} - y_{i})^{2}}\]

where \(x_{i}\) and \(y_{i}\) are the components (one per gene) of the expression vectors X and Y, and n is the dimensionality of X and Y (number of genes).
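As a minimal sketch, the distance between two tissues can be computed directly from this definition; the function name and the toy vectors are hypothetical.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two expression vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("expression vectors must cover the same genes")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


# Hypothetical relative expression measures for two tissues over two genes.
tissue_x = [0.0, 0.0]
tissue_y = [3.0, 4.0]
d_xy = euclidean_distance(tissue_x, tissue_y)
```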

Tissue dendrograms were constructed with the PHYLIP program DRAWGRAM from the newick tree files generated by the clustering program.

As a new approach for gene clustering, a weighting scheme for the seven observed grades of cancer was used to select 200 genes positively covarying and 200 genes negatively covarying to increasing stage. Only genes with an AvgDif of ≥100 in at least one tissue were used for the selection of the 400 covarying genes. An integer value between 1 and 8 was assigned to each stage, incrementing by one for each grade within stages and incrementing by 2 for increasing stage.

Covariance measure between gene A and the increasing stage vector (or gene) B:

\[\mathrm{Covariance}(A,B) = \sum_{i=1}^{n} \frac{(a_{i} - \mathrm{mean}(A))\,(b_{i} - \mathrm{mean}(B))}{n - 1}\]

where n is the dimensionality of vectors A and B (number of tissues). Pearson correlation distance was also tested for this ranking but was found to be too sensitive to the specific shape of the arbitrary weight vector chosen. Furthermore, Pearson correlation distance selected vectors with small overall fold change (small vector length). The covariance factor for the top 200 positively covarying genes ranged from 5.7 down to 1.6, whereas the 200 most negatively covarying genes ranged from −6.3 up to −1.8.
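The ranking step can be sketched as follows. The covariance follows the definition given above, and the weight values are those listed for the six stage and grade categories in the Results; the gene names, the toy expression values, and the function names are hypothetical.

```python
def covariance(a, b):
    """Sample covariance between two vectors of equal length."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


# Weights for pTa grade I, II, III, pT1 grade III, pT2 grade III, pT2 grade IV.
STAGE_WEIGHTS = [1, 2, 3, 5, 7, 8]


def select_covarying(expression, weights, top_n):
    """Return (positive, negative) gene lists ranked by covariance.

    `expression` maps a gene name to its relative measures, one value
    per tissue, in the same tissue order as `weights`.
    """
    ranked = sorted(expression, key=lambda g: covariance(expression[g], weights))
    negative = ranked[:top_n]         # most negatively covarying first
    positive = ranked[-top_n:][::-1]  # most positively covarying first
    return positive, negative


# Hypothetical log-fold values for three genes across the six categories.
toy = {
    "gene_up": [0.0, 0.1, 0.2, 0.5, 0.9, 1.0],
    "gene_down": [1.0, 0.9, 0.7, 0.4, 0.1, 0.0],
    "gene_flat": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
}
positive, negative = select_covarying(toy, STAGE_WEIGHTS, top_n=1)
```

In the study itself the selection was made with top_n = 200 in each direction; the toy call uses 1 only to keep the example small. Because covariance is not normalized by vector length, genes with a small overall change rank low, which is consistent with the preference for covariance over Pearson correlation stated above.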

For gene clustering, the same hierarchical method (average-linkage agglomerative hierarchical clustering) as used for tissue clustering was applied, but in this case a normalized Euclidean distance (vector angle) was used to cluster the top 400 positively and negatively covarying genes for both relative expression measures.

The angle between expression vectors A and B:

\[d(A,B) = \cos^{-1}\left(\frac{\sum_{i=1}^{n} a_{i} b_{i}}{\mathrm{length}(A)\,\mathrm{length}(B)}\right)\]
\[\mathrm{where\ length}(X) = \sqrt{\sum_{i=1}^{n} x_{i}^{2}}\]
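A minimal sketch of the vector angle is given below. The clamping of the cosine to the interval from −1 to 1 is an added safeguard against floating-point rounding and is not part of the definition; the function names are hypothetical.

```python
import math


def vector_length(x):
    """Euclidean length of an expression vector."""
    return math.sqrt(sum(xi * xi for xi in x))


def vector_angle(a, b):
    """Angle in radians between expression vectors a and b.

    Raises ZeroDivisionError if either vector has zero length.
    """
    dot = sum(ai * bi for ai, bi in zip(a, b))
    cosine = dot / (vector_length(a) * vector_length(b))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.acos(cosine)


# Two hypothetical genes with the same pattern but different magnitude.
angle_same_shape = vector_angle([1.0, 2.0], [2.0, 4.0])
```

Because the angle ignores vector length, two genes whose expression changes in the same direction across the samples are placed close together even when one changes far more than the other.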

Gene dendrograms were constructed by the same method as the tissue dendrograms.

We performed scaling to a global chip intensity of 150 units per probe set. The scaling made it possible to compare individual experiments with each other. To verify the reproducibility, double determinations were made in selected cases and showed a good correlation (Fig. 1 A).

A scatter plot of the noninvasive pTa grade I tumor and the invasive, highly abnormal grade IV pT2+ tumor showed that only a minor subfraction of the gene transcripts deviated markedly from those in the normal urothelium. The large majority of transcripts were within a narrow range in both tumors and normal urothelium (Fig. 1, B and C). The number of deviating genes was higher in the most abnormal tumor. With the purpose of confirming the array data with another method, we performed Northern blotting of four genes on the same samples of RNA as used for array hybridization. A standardized amount of RNA was run in each lane, followed by blotting with a labeled RNA probe, and quantitation of the obtained band (Fig. 2). The oligonucleotide array and the Northern blot gave similar results (Fig. 2) corresponding to previous studies (19, 20).

Cluster Analysis.

The level of a gene transcript in different tumors can be thought of as a pattern that can be related to patterns of other gene transcripts. If the expression of one gene is very similar to the expression of another gene in several samples, they will have a high similarity measure. Gene expression patterns (vectors) that are similar are grouped together in clusters by the methods described in Ref. 2. Briefly, clustering is done by finding the most similar pair of vectors and merging them. The merged pair is replaced by a single vector that is the average of the two, and the distances from the new cluster vector to all of the other vectors are calculated. Because clusters can represent many genes, the averaging of cluster vectors is weighted by the size of the respective merged clusters or singletons. In this way, the cluster vector maintains the average of all of the vectors it represents (the cluster centroid). The same procedure can be applied to cluster tissues based on expression levels over the different genes considered. We based clustering analysis either on the 4067 transcripts scored as present in at least one of the samples, or on those 400 transcripts that covaried best with a simple weighting scheme adding increasing values to increasing grades of atypia, and sequentially increasing from stage to stage (pTa grade I, 1; pTa grade II, 2; pTa grade III, 3; pT1 grade III, 5; pT2 grade III, 7; pT2 grade IV, 8). Without a weighting scheme, the gene clusters formed lacked functionally related genes, a finding also seen when two-way clustering was used (data not shown).
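The merging procedure described above can be sketched in a few lines. This is an illustration of the size-weighted centroid merging as described in the text, not the implementation used in the study; the function names are hypothetical, Euclidean distance is used as the default, and the tree is written as a Newick string without branch lengths.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cluster(vectors, distance=euclidean):
    """Agglomerative clustering with size-weighted centroid merging.

    `vectors` maps a name to its expression vector. Returns the tree
    as a Newick string (topology only, no branch lengths).
    """
    # Each active cluster is (Newick label, centroid, number of members).
    clusters = [(name, list(vec), 1) for name, vec in vectors.items()]
    while len(clusters) > 1:
        # Find the most similar pair of cluster vectors.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i][1], clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        label_i, cen_i, n_i = clusters[i]
        label_j, cen_j, n_j = clusters[j]
        # Weight by cluster size so that the merged vector remains the
        # centroid of all of the original vectors it represents.
        centroid = [(n_i * a + n_j * b) / (n_i + n_j)
                    for a, b in zip(cen_i, cen_j)]
        merged = ("(%s,%s)" % (label_i, label_j), centroid, n_i + n_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0] + ";"


# Hypothetical one-gene example: A and B are close, C is far away.
tree = cluster({"A": [0.0], "B": [1.0], "C": [10.0]})
```

In the toy example A and B are merged first, and C joins last. Passing a different `distance` function, such as a vector angle, changes only the similarity measure and leaves the merging logic unchanged.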

Tissue Clusters.

Different algorithms, applied to either log-fold change or differences in transcript levels across the different samples, were applied to all of the transcripts or only those covarying with the weighting scheme (Fig. 3). The log-fold-change-based method, applied to the covarying genes, correctly identified the relationship between the samples (Fig. 3 A), clustering the superficial pools close together and far from the invasive pools. The use of the full dataset of 4076 genes led to a reduced separation of superficial and invasive tumors (Fig. 3, C and D), as did the use of the average difference method (Fig. 3, B and D). However, separation of superficial and invasive tumors was still possible. We repeated the tissue clustering based on the 400 covarying genes on a second set of tumor biopsy samples (5 samples of pTa and 5 of pT2+), and these were also correctly identified (Fig. 3 E), which indicated that the 400 covarying genes could be of general use for classification. This is remarkable because the second set of tumors consisted of tissue biopsies and not single-cell suspensions.

On the basis of these dendrograms, we conclude that the log-fold-change-clustering algorithm works well, that the dataset obtained from the oligonucleotide arrays reflects the stage of the tumors, and that objective information on the stage and grade of a tumor can be obtained from a mathematical analysis of gene expression data. Because the set of 400 covarying genes showed optimal performance in tissue clustering, we used this set of genes when analyzing gene clusters.

Gene Clusters.

The data obtained from gene cluster analysis are presented as colored images in which genes with similar expression patterns are clustered next to each other on the vertical axis, and the samples according to stage and grade on the horizontal axis (Fig. 4). The two different clusterings, log-fold change and difference, gave completely different clusters across the set of samples (Fig. 4). To validate the reproducibility of the selected 400 genes for log-fold-based gene clustering, we applied the same set of genes to a new set of 10 samples, 5 pTa and 5 pT2+ tumors. This new set of samples consisted of biopsies and not single-cell suspensions. As can be seen from the cluster (Fig. 4, second column from the left), a similar pattern of expression was detected in the set of biopsies, indicating the robustness of the selected 400 genes for clustering. The difference between suspensions of single cells and biopsies was reflected as a slightly higher variation from tumor to tumor in the biopsies compared with the suspensions.

In the log-fold-based cluster analysis, the top 200 positively covarying genes can be divided into five different clusters containing functionally related genes (Fig. 4 A). The cluster shown at the top (Fig. 4 A) contains genes related to cell proliferation such as cyclins A and E, PCTAIRE-1, and SWI/SNF. The next cluster mainly contains oncogenes and growth factors. Genes in both of these clusters are expressed at a level close to that seen in normal urothelium in superficial tumors (Fig. 4 A, black) and increase in higher stage tumors (yellow). The three clusters at the lower part show a reduced expression level in the superficial tumors compared with normal (cyan) and then, in most cases, increase above the normal urothelial level in invasive tumors (shades of yellow). These clusters contain a set of immunologically related genes, including different MHCs and immunoglobulins, cancer-related genes like src-like kinase and Fas/Apo-1, and finally another immunologically related cluster at the bottom of the figure (Fig. 4 A).

The 200 negatively covarying genes (Fig. 4 B) could be divided into three different clusters based on log-fold change and function of the genes. The upper cluster contains genes related to cell adhesion like laminins, integrins, and P-cadherin (Fig. 4 B). They all show a reduced level of expression in the invasive tumors as evidenced by the cyan coloring to the right. The small cluster in the middle contains four genes related to transcription, and, finally, the lowest cluster in the figure contains five proteinases, including cathepsin E (two different probe sets for the same gene) and metalloproteinase as well as a protease inhibitor. The lower clusters are characterized by an increase in level in superficial tumors (yellow) followed by a return to approximately normal levels in invasive tumors.

In the difference-based cluster analysis, the top 200 covarying genes that showed a positive covariance contained only a few clusters having a functional relation. The upper cluster (Fig. 4 C) contained five genes related to cell proliferation like the microtubule-associated protein and oncoprotein 18/stathmin. The next cluster was a set of immunology-related genes such as MHC and LERK-2. Both of these clusters showed an increased expression level in invasive tumors compared with normal urothelium. The cluster at the lower end showed a reduced level in superficial tumors and a return to normal or increased level in invasive tumors. This cluster contained many immunology-related genes like MHC, HLA, and immunoglobulin genes. Finally, among the genes that showed a negative covariance based on difference (Fig. 4 D), most of the clustering was attributable to ribosomal genes. The upper and lower ribosomal clusters show genes that are up-regulated in expression in superficial tumors and down-regulated or unaltered in invasive tumors. The middle ribosomal cluster is generally expressed at a lower level than in normal urothelium. Other genes that seemed to cluster were a closely related set of immunology-related genes, and two tumor inhibitors, TGF-β superfamily protein and Sui1, in the uppermost cluster.

We conclude that a characteristic pattern of gene expression identifies superficial and invasive tumors. This is clearly demonstrated by the shift in color from the left side of the images to the right side of the images. The pattern includes a number of functionally related gene clusters and is most pronounced when the log-fold method is applied to single-cell suspensions.

Correlation between mRNA Levels Detected on Arrays and Protein Levels Detected by Immunohistology.

From the single tumors examined on arrays, tissue sections were cut and used for immunostaining. It was then possible to correlate the level of transcript measured on the chip to the extent of immunostaining for the translation product (Fig. 5, genes marked with ∗ in Fig. 4). On the basis of the transcript levels, we selected a group of proteins for immunostaining that were expected to show variation from sample to sample and that covered a broad range of expression levels. Finally, an antibody had to be commercially available.

Several of the proteins were expressed not only by urothelial cells but also by leukocytes, endothelial cells, or histiocytes (Fig. 5). The level of protein identified by immunostaining, disregarding the cell type expressing the protein, seemed to correlate well with the transcript level measured on the microarray (Fig. 5).

The use of array technology for monitoring the expression of thousands of genes makes it possible to obtain a more complete understanding of the many events that characterize the different stages of a cancer disease. Furthermore, stage-characteristic expression patterns may be used for molecular classification of tumors. The basis for this is reproducible and simultaneous measurement of transcripts, which we demonstrate is possible with oligonucleotide arrays; the array data showed a good correlation with Northern blots.

To obtain general information on gene expression at superficial and invasive stages of bladder cancer, we chose to pool tumors representing different stages. In that way individual differences between samples were smoothed, and the most stage-characteristic patterns strengthened. We believe it is necessary to seek general characteristic patterns, because these may later form the basis for the building of molecular classifiers (5). In addition to the pools, we analyzed single superficial and invasive tumors. These showed an expression pattern similar to the pools representing their stage, and clustered close to these when tissue clustering was performed. It was interesting to observe that the pT1 tumor clustered close to the pT2+ tumors, indicating that pT1 tumors could be more closely related to muscle-invasive tumors than to superficial benign tumors.

Another approach that we used was the preparation of single cell suspensions to reduce the stromal component of the biopsies. The single-cell suspensions showed less expression of smooth muscle- and connective tissue-related genes (data not shown). The single-cell suspensions were prepared from cooled biopsies immediately after surgery using a procedure previously used for the preparation of bladder tumors for flow cytometry (14, 15). The single-cell suspensions, furthermore, have the advantages that the cells can be inspected under the microscope to ensure the presence of >90% urothelial cells and only a minimum of peripheral blood contamination, tumor pools can be made based on similar numbers of cells from each tumor, and the RNA-preserving guanidinium thiocyanate, used for storage, immediately disrupts the cells and inactivates RNases.

To reduce the number of genes used for clustering, we filtered the expression data based on different cutoff levels (data not shown). However, we found this approach to be much less informative than using the top 10% of genes covarying with a simple weighting scheme. The weighting scheme applied more weight to samples with increasing grade of atypia as well as to samples having a high stage. The top 10% covarying genes were then analyzed by clustering the tissues in the first dimension, then genes in the other dimension. Organizing genes with a similar expression pattern into clusters identified a number of functionally related genes in several clusters. The most significant of these were obtained by examining log-fold change of expression, which clustered genes related to cell cycle, oncogenes and growth factors, immunology, cell adhesion, transcription, and proteinases into separate clusters. In summary, this pattern suggests a high level of protein synthesis in superficial papillomas having increased transcription factor and ribosomal levels as well as potential tissue-degrading properties obtained by proteinase up-regulation. In the invasive tumors, an increased level of cell cycle-related transcripts was observed, which might partly reflect the increased level of growth factor and oncogene transcripts also seen in these. A loss of cellular adhesion proteins was found in invasive tumors and may be related to tissue invasion and metastasis. The invading tumor cells seem to be a challenge to the immune system as reflected by an increase in immunology-related proteins. Why we find a relatively lower level of proteinases in invasive tumors compared with superficial tumors is at present unexplained.

Because of space limitations, we will comment only on a few of the clusters in the following. A closely related cluster of positively covarying genes, reaching maximum expression in invasive tumors, consisted of genes related to the cell cycle. Most of these were genes that will increase cellular proliferation. Several genes normally up-regulated in the G1 phase and during entry into S phase were increased, such as cyclins A and E, c-myb (induces cyclin A1), hepatocyte growth factor, and MCM2/BM28 (21, 22, 23, 24, 25). Other genes, related to the spindle checkpoint, such as TTK (26, 27), and to sister chromatid separation, such as E2-C/UbcH10 (28), peaking at late G2-M phase, were also up-regulated, as was PCTAIRE-1 (29), which is a member of the CDK family with an unknown function. The only negative regulator in this cluster was the SWI/SNF complex, which represses the activation function of E2F1 in cooperation with Rb (30, 31). These data indicate that all of the components needed to direct the cell through the cell cycle are up-regulated in invasive bladder tumors, probably accounting for the increased number of cells in S phase that was previously observed in many flow cytometry studies (32). These up-regulated proteins could be targets for inhibitors aiming at slowing the cell cycle and thereby working as cytostatic pharmaceuticals.

We identified a cell adhesion cluster in the log-fold analysis that showed down-regulation of the transcripts in invasive tumors. The cluster contained two laminin and two integrin genes, a laminin receptor, two members of the cadherin family (P-cadherin 3 and FAT tumor suppressor), epican, factor H homologue, and the mucin MGC-24. A reduced cell adhesion has been reported to accompany progression of various epithelial cancers. Although still controversial, it seems that self-adhesion is reduced in invasive cancers, whereas other adhesive properties may be up-regulated simultaneously (33). The reduced expression of laminins and integrins has previously been described in invasive cancer (34) and both α3 and β4 integrins showed a progressive reduction with increasing neoplastic transformation in colorectal carcinomas (35). P-cadherin has shown reduced expression in poorly differentiated gastric carcinomas compared with well differentiated (36), and the E-cadherin is lost in invasive bladder tumors (37). The FAT gene is a new member of the human cadherin superfamily that closely resembles a Drosophila gene essential for controlling cell proliferation during Drosophila development (38). Loss of gene function in Drosophila causes hyperplastic tumor-like overgrowth of larval imaginal discs (39). Epican is a heparan/chondroitin sulfate proteoglycan form of CD44 that creates self-aggregation and adherence to keratinocytes when transfected into fibroblasts (40).

Factor H is a multifunctional protein that acts as a complement regulator but that also has functions outside the complement system because it binds to the cellular integrin receptor (CD11b/CD18) and interacts with cell surface glycosaminoglycans (41). It has several homologues with ill-defined functions. Finally, the MGC-24 gene encodes a mucin (MUC18/MCAM/CD146) that leads to increased homotypic adhesion in vitro when transfected into melanoma cell lines (42). In conclusion, it seems that muscle-invasive bladder cancer is characterized by loss of expression of several adhesion proteins, some of which are still poorly examined in relation to cancer.

The ribosomal gene cluster that was identified when using the difference calculation corresponded to a similar cluster recently identified by a deterministic-annealing algorithm in a set of colorectal tissue samples (3). In that case, the ribosomal transcripts were low in normal tissue and high in the colon tumor tissue. In the bladder, a high level was also detected in superficial tumors, whereas invasive bladder cancers had a normal or reduced level of ribosomal proteins. Eight of the ribosomal proteins clustering in the colon tumor tissue were identical to those in the bladder and may form a set of transcripts or proteins that change in cancer development.

Apart from the functionally related genes, most clusters also contained genes without any obvious relation. Why they occurred together is at present unknown. Coregulation attributable to utilization of the same transcription factors is one possible explanation. We did examine whether it was attributable to co-localization at the same chromosomal region but found no support for that hypothesis.

The level of a transcript determined in a tumor biopsy does not provide any information on the cellular origin of that transcript. Many transcripts, such as housekeeping transcripts, are produced in the majority of cells, whereas others, such as immunoglobulins, are confined to certain cell types. We used immunohistology to demonstrate the complex cellular origin of some of the transcripts. Because many thousands of transcripts are examined by array technology, it seems impossible to localize the cellular origin of each signal. That would be necessary if new information on the biology of urothelial tumor cells per se were sought; however, another approach would be to regard the tumor cells and their environment as providers of equally important information. For example, the increase in immune-related transcripts in invasive tumors might provide information on the stage of the disease. Such a global picture of tumor markers may provide important classifiers and predictive markers in the near future. Interestingly, a good correlation between mRNA level and protein level was observed in eight selected cases. Because the protein, not the mRNA, most often is the functional effector, this is of obvious importance.

The level of a transcript gives no information on the functional status of the translation product, the protein, which could be damaged by mutations. The application of representative genes from the identified clusters on medium-density arrays and the examination of a large group of bladder tumors would make it possible to evaluate the predictive information that can be obtained from these data. Pairing such an approach with high-throughput sequencing of key genes like p53, Rb, and other tumor inhibitors known to be subjected to mutations might lead to a genetic classification set that could be useful for the individual bladder cancer patient.

Fig. 1.

Plot of correlation between transcript levels scored as present in at least one sample. A, repetition of array analysis on pool from normal urothelium; 3.8% of transcripts deviated more than 3-fold. B, plot of superficial pTa tumor versus normal pool; 14.6% deviated more than 3-fold. C, plot of invasive pT2 tumor versus normal pool; 24.3% deviated more than 3-fold. The expression levels of the majority of transcripts present in both samples are not significantly different. The solid lines, a difference by a factor of three; the broken lines, a difference by a factor of ten.
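
The percentages in this legend are the fractions of transcripts whose levels differ by more than a factor of three between two samples. A minimal Python sketch of such a calculation is given here. The function name, the example values, and the floor used to clamp non-positive levels before forming a ratio are illustrative assumptions and are not taken from the study.

```python
def fraction_deviating(levels_a, levels_b, fold=3.0, floor=1.0):
    """Return the fraction of paired transcript levels differing by more than `fold`.

    `floor` is an assumed lower bound: levels below it are clamped to it
    so that a ratio can always be formed.
    """
    if len(levels_a) != len(levels_b):
        raise ValueError("both samples must cover the same transcripts")
    if not levels_a:
        return 0.0
    deviating = 0
    for a, b in zip(levels_a, levels_b):
        a, b = max(a, floor), max(b, floor)
        # Use the larger value as numerator so up- and down-regulation count alike.
        ratio = a / b if a >= b else b / a
        if ratio > fold:
            deviating += 1
    return deviating / len(levels_a)


# Hypothetical expression levels for five transcripts in two samples.
normal = [100.0, 250.0, 40.0, 900.0, 60.0]
tumor = [110.0, 900.0, 35.0, 280.0, 65.0]
print(fraction_deviating(normal, tumor))  # two of five transcripts exceed 3-fold
```

Applied to a replicate of the same pool, a value of this kind estimates the technical noise against which tumor versus normal comparisons can be judged.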

Fig. 2.

Comparison of Northern blots and oligonucleotide arrays. The samples analyzed were normal pool (Norm), superficial pTa grade I tumor (335), superficial pTa grade III (901), and invasive pT2 grade III (713). The Northern blots were scanned by densitometry (solid line) and plotted together with a plot of the level detected on the arrays (dotted line). The levels of expression are indicated on the figure.

Fig. 3.

Dendrograms of tissue samples based on clustering of different data sets. Clustering was based either on the log-fold change in expression level of genes (A, C, and E), or on the difference (B and D) when comparing tumor to a pool of normal samples. Genes used for clustering were either those 10% of the genes that covaried best with progression (A, B, and E), or all of the 4076 genes that were scored as present in at least one sample (C and D). Stage and grade of the tumors were 335 (pTa grade I); 709, 837, 928, 930, 934, 968 (pTa grade II); 901 (pTa grade III); 320 (pT1 grade III); 713, 875, 937, 1005, 1078, 1133 (pT2 grade III). In A, B, C, and D, single-cell suspensions were used; in E, tissue biopsies were used. The green colors, grades of superficial tumors; the red colors, grades of the invasive tumors. (The same color code is used in Fig. 4.)
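
The two measures contrasted in this legend, log-fold change and difference relative to the normal pool, can be illustrated with a small Python sketch that builds a sample dendrogram from either one. Average linkage on Euclidean distance is used here purely as an assumption for illustration, since the legend does not state the linkage rule for tissue clustering. The sample names, expression values, and the floor used before taking logarithms are likewise hypothetical.

```python
import math


def log_fold(tumor, normal, floor=1.0):
    """Natural logarithm of the tumor/normal ratio for each gene (levels clamped at `floor`)."""
    return [math.log(max(t, floor) / max(n, floor)) for t, n in zip(tumor, normal)]


def difference(tumor, normal):
    """Tumor level minus normal level for each gene."""
    return [t - n for t, n in zip(tumor, normal)]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(profiles):
    """Agglomerative clustering of samples; returns the dendrogram as nested tuples."""
    clusters = {name: [name] for name in profiles}  # subtree -> member sample names
    while len(clusters) > 1:
        trees = list(clusters)
        best = None
        for i in range(len(trees)):
            for j in range(i + 1, len(trees)):
                left, right = clusters[trees[i]], clusters[trees[j]]
                # Average pairwise distance between members of the two clusters.
                dist = sum(euclidean(profiles[a], profiles[b]) for a in left for b in right)
                dist /= len(left) * len(right)
                if best is None or dist < best[0]:
                    best = (dist, trees[i], trees[j])
        _, a, b = best
        clusters[(a, b)] = clusters.pop(a) + clusters.pop(b)
    return next(iter(clusters))


# Hypothetical levels for four genes in a normal pool and four tumor samples.
normal_pool = [100.0, 200.0, 50.0, 400.0]
samples = {
    "pTa_1": [120.0, 210.0, 45.0, 380.0],
    "pTa_2": [130.0, 190.0, 55.0, 420.0],
    "pT2_1": [300.0, 60.0, 50.0, 100.0],
    "pT2_2": [280.0, 70.0, 40.0, 120.0],
}
tree = average_linkage({n: log_fold(v, normal_pool) for n, v in samples.items()})
print(tree)
```

In this toy data set the superficial and invasive samples separate under both measures. Log-fold change weights proportional changes equally across genes, whereas the difference measure is dominated by highly expressed genes.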

Fig. 4.

Cluster diagram of 9 bladder tumor preparations (5 single tumors and 4 pools of six tumors) representing single-cell suspensions from invasive (far left) and noninvasive (far right) tumors, and of 10 bladder tumor biopsies from noninvasive (5, pTa) and invasive (5, pT2+) tumors (second from left). Each column, a tumor preparation; each row, a gene preparation. The diagrams show clustering based on log-fold change from normal urothelium (A and B) and based on difference from normal urothelium (C and D). The color saturation is directly proportional to the magnitude of the measured expression ratio or difference; cyan, the lowest value; yellow, the highest value. Black, a ratio of 1, a similar level of expression in tumor as in normal urothelium. The dendrograms at each side show the relationship between the different genes. In the middle, distinct functional clusters are identified and members of the clusters are annotated in brief. In an effort to identify those genes most indicative of cancer progression, a weighting scheme was used to select the 400 genes that covaried best with the different stages of bladder cancer, 200 positively covarying (A and C) and 200 negatively covarying (B and D). Gene clustering was based on normalized Euclidean distance (vector angle) calculated between genes or gene cluster centers. Numbers below the columns, range of changes: log-fold from −2 to +2; difference from −2910 to 2910. ∗, genes that are also analyzed in Fig. 5; numbered genes, standard names are in database.4
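
The gene distance named in this legend, a normalized Euclidean distance equated with the vector angle, can be read as follows: each gene's expression profile is scaled to unit length, and the Euclidean distance between the scaled profiles then depends only on the angle between them. The Python sketch below illustrates that reading; it is an assumption about what the legend means, and the function names and example profiles are hypothetical.

```python
import math


def unit(v):
    """Scale a profile to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("an all-zero profile has no direction")
    return [x / norm for x in v]


def normalized_euclidean(u, v):
    """Euclidean distance between unit-length versions of two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(unit(u), unit(v))))


def vector_angle(u, v):
    """Angle in radians between two profiles."""
    cos = sum(a * b for a, b in zip(unit(u), unit(v)))
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


# Hypothetical profiles across three samples.
gene_a = [1.0, 2.0, 3.0]
gene_b = [2.0, 4.0, 6.0]     # same pattern as gene_a, twice the magnitude
gene_c = [-1.0, -2.0, -3.0]  # opposite pattern

print(normalized_euclidean(gene_a, gene_b))  # close to 0: same direction
print(normalized_euclidean(gene_a, gene_c))  # close to 2: opposite direction
```

For unit vectors the distance equals 2 sin(θ/2), where θ is the angle between the profiles, so genes that vary in the same pattern group together regardless of their absolute expression level.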

Fig. 5.

Immunohistochemical staining (red) of the tissue sections used for expression analysis. Proteins were selected on the following criteria: (a) variation in expression level between samples; (b) general level of expression; and (c) antibody commercially available. On each section in the figure, the protein examined and the level measured on the oligonucleotide array. Arrows in cathepsin E, ApoE, and CD59 stainings, stained urothelial cells; arrows in β2-microglobulin and Cystatin C, stained stromal cells or leukocytes. The stages and grades of the tumors were: JunB [pTa grade (gr) II, 1253; pT2grIII, absent]; β2-microglobulin (pTagrII, 6932; pT1grIII, 2481); Keratin 8 (pTagrI, 5006; pT2grIV, 390); Cystatin C (pTagrII, 941; pT2grIII, 713); Cathepsin E (pTagrII, 297; pT2grIV, absent); ApoE (pTagrII, 389; pT2grIV, absent); CD59 (pTagrII, 260; pT1grIII, absent); E-cadherin (pTagrII, 433; pT2grIII, 236).


The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1 Supported by the Karen Elise Jensens Foundation, The Danish Cancer Society, Fraenkels Foundation, the Danish National Research Foundation, and Consul Meyers Foundation. Part of the work was performed at the Academic User Center at Affymetrix, Santa Clara, CA, funded in part by NIH Grant PO1 HG0132.

3 The abbreviations used are: pT2+, stages pT2, pT3, and pT4; log-fold, natural logarithm of fold changes; AvgDif, average difference between perfect match probes and mismatch probes on array.

4 For a list of standard gene names see data at: www.MDL.DK/sdata.html.

We thank Thomas Gingeras, Christine Harrington, Søren Brunak, and Anders Krogh for helpful discussions.

1. Lockhart D. J., Dong H., Byrne M. C., Follettie M. T., Gallo M. V., Chee M. S. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat. Biotechnol., 14: 1675-1680, 1996.
2. Eisen M. B., Spellman P. T., Brown P. O., Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc. Natl. Acad. Sci. USA, 95: 14863-14868, 1998.
3. Alon U., Barkai N., Notterman D. A., Gish K., Ybarra S., Mack D., Levine A. J. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA, 96: 6745-6750, 1999.
4. Perou C. M., Jeffrey S. S., van de Rijn M., Rees C. A., Eisen M. B., Ross D. T. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc. Natl. Acad. Sci. USA, 96: 9212-9217, 1999.
5. Nacht M., Ferguson A. T., Zhang W., Petroziello J. M., Cook B. P., Gao Y. H., Maguire S., Riley D., Coppola G., Landes G. M., Madden S. L., Sukumar S. Combining serial analysis of gene expression and array technologies to identify genes differentially expressed in breast cancer. Cancer Res., 59: 5464-5470, 1999.
6. Schummer M., Ng W. V., Bumgarner R. E., Nelson P. S., Schummer B., Bednarski D. W., Hassell L., Baldwin R. L., Karlan B. Y., Hood L. Comparative hybridization of an array of 21,500 ovarian cDNAs for the discovery of genes overexpressed in ovarian carcinomas. Gene, 1: 375-385, 1999.
7. Leethanakul C., Patel V., Gillespie J., Pallente M., Ensley J. F., Koontongkaew S., Liotta L. A., Emmert-Buck M., Gutkind J. S. Distinct pattern of expression of differentiation and growth-related genes in squamous cell carcinomas of the head and neck revealed by the use of laser capture microdissection and cDNA arrays. Oncogene, 29: 3220-3224, 2000.
8. Anbazhagan R., Tihan T., Bornman D. M., Johnston J. C., Saltz J. H., Weigering A., Piantadosi S., Gabrielson E. Classification of small cell lung cancer and pulmonary carcinoid by gene expression profiles. Cancer Res., 15: 5119-5122, 1999.
9. Martin K. J., Kritzman B. M., Price L. M., Koh B., Kwan C. P., Zhang X., Mackay A., O’Hare M. J., Kaelin C. M., Mutter G. L., Pardee A. B., Sager R. Linking gene expression patterns to therapeutic groups in breast cancer. Cancer Res., 59: 2232-2238, 2000.
10. Alizadeh A. A., Eisen M. B., Davis R. E., Ma C. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature (Lond.), 403: 503-511, 2000.
11. Cordon-Cardo C. Molecular alterations in bladder cancer. Cancer Surv., 32: 115-131, 1998.
12. Knowles M. A. The genetics of transitional cell carcinoma: progress and potential clinical application. BJU Int., 84: 412-427, 1999.
13. Wolf H., Kakizoe T., Smith P. H., Brosman S. A., Okajima E., Rubben H., Utz D. C. Bladder tumors. Treated natural history. Prog. Clin. Biol. Res., 221: 223-255, 1986.
14. Zarbo R. J., Visscher D. W., Crissman J. D. Two-color multiparametric method for flow cytometric DNA analysis of carcinomas using staining for cytokeratin and leukocyte-common antigen. Anal. Quant. Cytol. Histol., 11: 391-402, 1989.
15. Orntoft T. F., Petersen S. E., Wolf H. Dual-parameter flow cytometry of transitional cell carcinomas. Quantitation of DNA content and binding of carbohydrate ligands in cellular subpopulations. Cancer (Phila.), 61: 963-970, 1998.
16. Bergkvist A., Ljungqvist A., Moberger G. Classification of bladder tumours based on the cellular pattern. Preliminary report of a clinical-pathological study of 300 cases with a minimum follow-up of eight years. Acta Chir. Scand., 130: 371-378, 1965.
17. Kaminski N., Allard J. D., Pittet J. F., Zuo F., Griffiths M. J., Morris D. Global analysis of gene expression in pulmonary fibrosis reveals distinct programs regulating lung inflammation and fibrosis. Proc. Natl. Acad. Sci. USA, 97: 1778-1783, 2000.
18. Der S. D., Zhou A., Williams B. R., Silverman R. H. Identification of genes differentially regulated by interferon α, β, or γ using oligonucleotide arrays. Proc. Natl. Acad. Sci. USA, 95: 15623-15628, 1998.
19. Jelinsky S. A., Samson L. D. Global response of Saccharomyces cerevisiae to an alkylating agent. Proc. Natl. Acad. Sci. USA, 96: 1486-1491, 1999.
20. Zhu H., Cong J. P., Mamtora G., Gingeras T., Shenk T. Cellular gene expression altered by human cytomegalovirus: global monitoring with oligonucleotide arrays. Proc. Natl. Acad. Sci. USA, 95: 14470-14475, 1998.
21. Bastians H., Townsley F. M., Ruderman J. V. The cyclin-dependent kinase inhibitor p27(Kip1) induces N-terminal proteolytic cleavage of cyclin A. Proc. Natl. Acad. Sci. USA, 95: 15374-15381, 1998.
22. Leone G., DeGregori J., Jakoi L., Cook J. G., Nevins J. R. Collaborative role of E2F transcriptional activity and G1 cyclin-dependent kinase activity in the induction of S phase. Proc. Natl. Acad. Sci. USA, 96: 6626-6631, 1999.
23. Campanero M. R., Armstrong M., Flemington E. Distinct cellular factors regulate the c-myb promoter through its E2F element. Mol. Cell. Biol., 19: 8442-8450, 1999.
24. Tsubari M., Taipale J., Tiihonen E., Keski-Oja J., Laiho M. Hepatocyte growth factor releases mink epithelial cells from transforming growth factor β1-induced growth arrest by restoring Cdk6 expression and cyclin E-associated Cdk2 activity. Mol. Cell. Biol., 19: 3654-3663, 1999.
25. Dimitrova D. S., Todorov I. T., Melendy T., Gilbert D. M. Mcm2, but not RPA, is a component of the mammalian early G1-phase prereplication complex. J. Cell Biol., 146: 709-722, 1999.
26. Schmandt R., Hill M., Amendola A., Mills G. B., Hogg D. IL-2-induced expression of TTK, a serine, threonine, tyrosine kinase, correlates with cell cycle progression. J. Immunol., 152: 96-105, 1994.
27. Hogg D., Guidos C., Bailey D., Amendola A., Groves T., Davidson J. Cell cycle dependent regulation of the protein kinase TTK. Oncogene, 9: 89-96, 1994.
28. Townsley F. M., Aristarkhov A., Beck S., Hershko A., Ruderman J. V. Dominant-negative cyclin-selective ubiquitin carrier protein E2-C/UbcH10 blocks cells in metaphase. Cell Biol., 94: 2362-2367, 1997.
29. Charrasse S., Carena I., Hagmann J., Woods-Cook K., Ferrari S. PCTAIRE-1: characterization, subcellular distribution, and cell cycle-dependent kinase activity. Cell Growth Differ., 10: 611-620, 1999.
30. Trouche D., Le Chalony C., Muchardt C., Yaniv M., Kouzarides T. RB and hbrm cooperate to repress the activation functions of E2F1. Proc. Natl. Acad. Sci. USA, 94: 11268-11273, 1997.
31. Murphy D. J., Hardy S., Engel D. A. Human SWI-SNF component BRG1 represses transcription of the c-fos gene. Mol. Cell. Biol., 19: 2724-2733, 1999.
32. Tetu B., Allard P., Fradet Y., Roberge N., Bernard P. Prognostic significance of nuclear DNA content and S-phase fraction by flow cytometry in primary papillary superficial bladder cancer. Hum. Pathol., 27: 922-926, 1996.
33. Pfohler C., Fixemer T., Jung V., Dooley S., Remberger K., Bonkhoff H. In situ hybridization analysis of genes coding collagen IV α1 chain, laminin β1 chain, and S-laminin in prostate tissue and prostate cancer: increased basement membrane gene expression in high-grade and metastatic lesions. Prostate, 36: 143-150, 1998.
34. Liebert M., Washington R., Stein J., Wedemeyer G., Grossman H. B. Expression of the VLA β1 integrin family in bladder cancer. Am. J. Pathol., 144: 1016-1022, 1994.
35. Stallmach A., von Lampe B., Matthes H., Bornhoft G., Riecken E. O. Diminished expression of integrin adhesion molecules on human colonic epithelial cells during the benign to malign tumour transformation. Gut, 33: 342-346, 1992.
36. Yasui W., Sano T., Nishimura K., Kitadai Y., Ji Z. Q., Yokozaki H. Expression of P-cadherin in gastric carcinomas and its reduction in tumor progression. Int. J. Cancer, 54: 49-52, 1993.
37. Fujisawa M., Miyazaki J., Takechi Y., Arakawa S., Kamidono S. The significance of E-cadherin in transitional-cell carcinoma of the human urinary bladder. World J. Urol., 14 (Suppl. 1): 12-15, 1996.
38. Dunne J., Hanby A. M., Poulsom R., Jones T. A., Sheer D., Chin W. G. Molecular cloning and tissue expression of FAT, the human homologue of the Drosophila fat gene that is located on chromosome 4q34–q35 and encodes a putative adhesion molecule. Genomics, 30: 207-223, 1995.
39. Mahoney P. A., Weber U., Onofrechuk P., Biessmann H., Bryant P. J., Goodman C. S. The fat tumor suppressor gene in Drosophila encodes a novel member of the cadherin gene superfamily. Cell, 67: 853-868, 1991.
40. Milstone L. M., Hough-Monroe L., Kugelman L. C., Bender J. R., Haggerty J. G. Epican, a heparan/chondroitin sulfate proteoglycan form of CD44, mediates cell-cell adhesion. J. Cell Sci., 107: 3183-3190, 1994.
41. Zipfel P. F., Jokiranta T. S., Hellwage J., Koistinen V., Meri S. The factor H protein family. Immunopharmacology, 42: 53-60, 1999.
42. Schlagbauer-Wadl H., Jansen B., Muller M., Polterauer P., Wolff K., Eichler H. G. Influence of MUC18/MCAM/CD146 expression on human melanoma growth and metastasis in SCID mice. Int. J. Cancer, 81: 951-955, 1999.