In modern clinical neuro-oncology, histopathological diagnosis affects therapeutic decisions and prognostic estimation more than any other variable. Among high-grade gliomas, histologically classic glioblastomas and anaplastic oligodendrogliomas follow markedly different clinical courses. Unfortunately, many malignant gliomas are diagnostically challenging; these nonclassic lesions are difficult to classify by histological features, generating considerable interobserver variability and limited diagnostic reproducibility. The resulting tentative pathological diagnoses create significant clinical confusion. We investigated whether gene expression profiling, coupled with class prediction methodology, could be used to classify high-grade gliomas in a manner more objective, explicit, and consistent than standard pathology. Microarray analysis was used to determine the expression of ∼12,000 genes in a set of 50 gliomas, 28 glioblastomas and 22 anaplastic oligodendrogliomas. Supervised learning approaches were used to build a two-class prediction model based on a subset of 14 glioblastomas and 7 anaplastic oligodendrogliomas with classic histology. A 20-feature k-nearest neighbor model correctly classified 18 of the 21 classic cases in leave-one-out cross-validation when compared with pathological diagnoses. This model was then used to predict the classification of clinically common, histologically nonclassic samples. When tumors were classified according to pathology, the survival of patients with nonclassic glioblastoma and nonclassic anaplastic oligodendroglioma was not significantly different (P = 0.19). However, class distinctions according to the model were significantly associated with survival outcome (P = 0.05). This class prediction model was capable of classifying high-grade, nonclassic glial tumors objectively and reproducibly. Moreover, the model provided a more accurate predictor of prognosis in these nonclassic lesions than did pathological classification. 
These data suggest that class prediction models, based on defined molecular profiles, classify diagnostically challenging malignant gliomas in a manner that better correlates with clinical outcome than does standard pathology.

Malignant gliomas are the most common primary brain tumor and result in an estimated 13,000 deaths each year in the United States (footnote 3). Glial tumors are classified histologically, with pathological diagnosis affecting prognostic estimation and therapeutic decisions more than any other variable. Among high-grade gliomas, anaplastic oligodendrogliomas have a more favorable prognosis than glioblastomas (1). Moreover, although glioblastomas are resistant to most available therapies, anaplastic oligodendrogliomas are often chemosensitive, with approximately two-thirds of cases responding to procarbazine, 1-(2-chloroethyl)-3-cyclohexyl-1-nitrosourea, and vincristine (2, 3). Paradoxically, recognition of the clinical importance of diagnosing anaplastic oligodendroglioma has blurred the histopathological line separating glioblastoma and oligodendroglioma; to ensure that patients are not deprived of effective chemotherapy, pathologists have loosened their criteria for anaplastic oligodendroglioma. Indeed, this diagnostic promiscuity has recently been described as a “contagion” (4). As such, there is a critical need for an objective, clinically relevant method of glioma classification.

The most widely used histological system of brain tumor classification is that of the WHO (1). Gliomas are classified according to defined histological features characteristic of the presumed normal cell of origin. Tumors of classic histology clearly display these features and resemble typical depictions in standard textbooks (5, 6); these cases would be diagnosed similarly by nearly all pathologists. Unfortunately, there are situations in which the WHO classification system is problematic, primarily because pathological diagnosis remains subjective (7); intratumoral histological variability is common, and high-grade gliomas can display little cellular differentiation, thus lacking defining histological features. The diagnosis of tumors with such nonclassic histology is often controversial. Consequently, diagnostic accuracy and reproducibility are jeopardized, and significant interobserver variability can occur. Coons et al. (8) found that complete diagnostic concordance among four neuropathologists reviewing gliomas over four sessions peaked at 69%. Giannini et al. (9), in a study of seven neuropathologists and six surgical pathologists scoring histological features of oligodendroglioma, found that agreement for identifying features ranged from 0.05 to 0.8, confirming that numerous classification parameters are not easily reproduced.

To develop more objective approaches to glioma classification, recent investigations have focused on molecular genetic analyses. Sasaki et al. (10) demonstrated loss of chromosome 1p in 86% of oligodendrogliomas with classic histology and maintenance of both 1p alleles in 73% of “oligodendrogliomas” with astrocytic features. Interestingly, tumor genotypes more closely predicted chemosensitivity, demonstrating an ability of tumor genotype to augment standard pathology. Burger et al. (11) also demonstrated close correlation between classic low-grade oligodendroglioma appearance and allelic losses of 1p and 19q. In gene expression studies, Lu et al. (12) suggested that expression of oligodendrocyte lineage genes (Olig1 and 2) might augment identification of oligodendroglial tumors. Similarly, Popko et al. (13) found three of four myelin transcripts significantly more often in oligodendrogliomas than in astrocytomas.

The advent of expression microarray techniques now allows simultaneous analysis of thousands of genes. We hypothesized that this approach could identify molecular markers capable of refining the current method of malignant glioma classification. We therefore investigated whether gene expression profiling, coupled with the computational methodology of class prediction (14), could be used to define subgroups of high-grade glioma in a manner more objective, explicit, and consistent than standard pathology. To this end, a subset of gliomas with classic histology was used to build a class prediction model, and this model was then used to predict the classification of samples with nonclassic histology.

Glioma Tissue Samples.

These investigations have been approved by the Massachusetts General Hospital Institutional Review Board. Tissue samples were collected from the Canadian Brain Tumor Tissue Bank (London, Ontario, Canada), Massachusetts General Hospital (Boston, MA), Brigham and Women’s Hospital (Boston, MA), and Charité Hospital (Berlin, Germany). Samples were collected immediately after surgical resection, snap frozen, and stored at −80°C. H&E-stained frozen sections were reviewed histologically for every specimen (D. N. L.); samples containing significant regions of normal cell contamination (>10%) and/or excessively large amounts of necrotic material were excluded. Using these criteria, 50 high-grade glioma samples were selected (Table 1): 28 glioblastomas and 22 anaplastic oligodendrogliomas; all were primary tumors sampled before therapy. All cases had been diagnosed at the primary hospital by board-certified neuropathologists. Original pathology slides were obtained and reviewed centrally by two additional neuropathologists (M. E. M. and D. N. L.) for diagnostic confirmation and selection of the classic tumor subset. Anaplastic oligodendrogliomas designated as having classic histopathology exhibited relatively evenly distributed, uniform, and rounded nuclei and frequent perinuclear halos (10). In contrast, classic glioblastomas were characterized by irregularly distributed, pleomorphic, and hyperchromatic nuclei, sometimes with conspicuous eosinophilic cytoplasm. The classic subset comprised cases that were diagnosed similarly by all examining pathologists and that resembled typical depictions in standard textbooks (5, 6). A total of 21 classic tumors were selected, and the remaining 29 samples were considered nonclassic tumors, lesions for which diagnosis might be controversial. Of the 21 classic tumors, 14 were glioblastomas, and 7 were anaplastic oligodendrogliomas.

Gene Expression Profiling.

Tissues were homogenized in guanidinium isothiocyanate, and RNA was isolated using a CsCl gradient. RNA integrity was confirmed by gel electrophoresis. For each sample, 15 μg of total RNA were used to generate biotinylated cRNAs, which were hybridized overnight to Affymetrix U95Av2 GeneChips as described previously (14, 15). On the basis of previous experience, one array per sample provided reproducible results with a sample set of the size used in this study (14, 16). Arrays were scanned on Affymetrix scanners, and data were collected using GeneChip software (Affymetrix, Santa Clara, CA). Scan quality was assured based on a priori quality control criteria, which included the absence of visible microarray artifacts (e.g., scratches) and significant differences in microarray intensity, and the presence of >30% “present” calls for the ∼12,600 genes and expressed sequence tags on the U95Av2 GeneChips.

Class Prediction Methodology.

The subset of classic gliomas was used to build a class prediction model. This model was then used to predict the classification of the nonclassic samples. Raw expression values were normalized by linear scaling so that mean array intensity for active (present) genes was identical for all scans (footnote 4). Data filtration settings were based on previous studies (14, 16). Intensity thresholds were set at 20 and 16,000 units. Gene expression data were subjected to a variation filter that excluded genes showing minimal variation across the samples; genes whose expression levels varied <100 units between samples, and genes whose expression varied <3-fold between any two samples, were removed. The variation filters excluded two-thirds of the genes, leaving ∼3,900 genes for building class prediction models. Further feature (gene) selection was effected, as described previously (14, 16), using the signal-to-noise (S2N; footnote 5) statistic. The S2N ratio ranks genes based on their correlation to each of the two class distinctions (i.e., classic glioblastoma and anaplastic oligodendroglioma). In addition, the significance of the highly ranked genes was confirmed by random permutation testing; the sample classification labels were permuted, and the S2N ratio was recomputed to compare the true gene correlations to what would have been expected by chance. Five different k-nearest neighbor (k-NN) class prediction models were built, using different gene numbers (10, 20, 50, 100, and 250 genes), with GeneCluster (footnote 6). Training error (on the classic cases) for these k-NN models was determined using leave-one-out cross-validation, where one sample is withheld, and the class membership of this withheld sample is predicted using a model built on the remaining samples. Class prediction for the withheld sample was the majority class membership of the k (k = 3 in these experiments) closest “neighboring” samples based on the Euclidean distance between the sample under consideration and samples used in training the k-NN model. This process was repeated for each sample in the training set, and a cumulative training error was calculated. Finally, a k-NN model was built using all 21 classic cases (with no samples left out), which was then used to predict classification of the remaining gliomas based on the class labels of the k-NNs of each sample.
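The filtering and ranking steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GeneCluster implementation: the function names and the toy expression values are hypothetical, the S2N formula used is the common definition (difference of class means divided by the sum of class standard deviations), which this section does not spell out, and the even split of selected features between the two classes is inferred from Table 3.

```python
import math
from statistics import mean, stdev

FLOOR, CEILING = 20.0, 16000.0  # intensity thresholds quoted in the text


def clamp(value):
    """Clip one expression value to the [FLOOR, CEILING] range."""
    return max(FLOOR, min(CEILING, value))


def passes_variation_filter(values, min_delta=100.0, min_fold=3.0):
    """Keep a gene only if it varies by >= 100 units and >= 3-fold across samples.

    Values are expected to be clamped first, so the minimum is never zero.
    """
    lo, hi = min(values), max(values)
    return (hi - lo) >= min_delta and (hi / lo) >= min_fold


def signal_to_noise(class_a, class_b):
    """Assumed S2N definition: (mean_a - mean_b) / (sd_a + sd_b)."""
    diff = mean(class_a) - mean(class_b)
    spread = stdev(class_a) + stdev(class_b)
    if spread == 0:
        return math.copysign(math.inf, diff) if diff else 0.0
    return diff / spread


def select_features(expr, labels, class_a, n_features):
    """Return (genes highest in class_a, genes highest in the other class).

    Half of n_features is taken from each end of the S2N ranking
    (an assumption; n_features must be at least 2).
    """
    scores = {}
    for gene, raw in expr.items():
        values = [clamp(v) for v in raw]
        if not passes_variation_filter(values):
            continue
        in_a = [v for v, lab in zip(values, labels) if lab == class_a]
        in_b = [v for v, lab in zip(values, labels) if lab != class_a]
        scores[gene] = signal_to_noise(in_a, in_b)
    ranked = sorted(scores, key=scores.get, reverse=True)
    half = n_features // 2
    return ranked[:half], ranked[-half:][::-1]


# Hypothetical toy data: three genes measured in three GBM and three AO samples.
toy_expr = {
    "gene_up": [900, 1000, 1100, 100, 200, 300],
    "gene_down": [50, 60, 70, 500, 600, 700],
    "gene_flat": [100, 110, 120, 105, 115, 125],  # removed by the variation filter
}
toy_labels = ["GBM", "GBM", "GBM", "AO", "AO", "AO"]
print(select_features(toy_expr, toy_labels, "GBM", n_features=2))
```

The permutation check mentioned in the text would rerun `signal_to_noise` on shuffled labels and compare the resulting scores with the observed ones; it is omitted here for brevity.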

Survival Analyses: Statistical Methods.

Survival distributions were compared between groups defined by pathology or gene expression profiling using permutation log-rank tests, computed by drawing 50,000 samples from the relevant permutation distribution. The statistical programming language R (footnote 7) was used to compute permutation P values. Kaplan-Meier plots were generated with GraphPad Prism (Version 3.02; GraphPad Software, San Diego, CA).
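For readers unfamiliar with the procedure, a permutation log-rank test can be sketched as follows. The analysis reported here was run in R; the Python below is only an illustration, with hypothetical function names, a four-patient toy data set, and the P value taken as the plain proportion of permuted statistics at least as large as the observed one (the exact convention used in the study is not stated).

```python
import random


def logrank_chi2(times, events, in_group_a):
    """Two-group log-rank chi-square statistic.

    events[i] is 1 for a death and 0 for a censored observation;
    in_group_a[i] is True when patient i belongs to group A.
    """
    obs_minus_exp = 0.0
    variance = 0.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        n = sum(1 for ti in times if ti >= t)  # at risk, both groups
        n_a = sum(1 for ti, a in zip(times, in_group_a) if ti >= t and a)
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        d_a = sum(1 for ti, ei, a in zip(times, events, in_group_a)
                  if ti == t and ei and a)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance of the deaths in group A
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


def permutation_logrank_p(times, events, in_group_a, n_perm=50_000, seed=0):
    """Share of label permutations whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = logrank_chi2(times, events, in_group_a)
    shuffled = list(in_group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # small tolerance so exact ties are not lost to rounding
        if logrank_chi2(times, events, shuffled) >= observed - 1e-12:
            hits += 1
    return hits / n_perm


# Toy example: group A holds the two earliest deaths.
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 1, 1]
toy_group_a = [True, True, False, False]
print(logrank_chi2(toy_times, toy_events, toy_group_a))
print(permutation_logrank_p(toy_times, toy_events, toy_group_a, n_perm=2000, seed=1))
```

Because the labels are shuffled while survival times and vital status stay attached to each patient, the null distribution respects the observed censoring pattern, which is useful with groups as small as those studied here.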

Training of the k-NN Class Prediction Models.

We investigated whether gene expression profiling could be used to define subgroups of high-grade glioma more objectively and consistently than standard pathology. To this end, we examined the expression profile of 14 glioblastomas and 7 anaplastic oligodendrogliomas with classic histology (Fig. 1,A). Features (genes) correlating with each of the two class distinctions were ranked according to S2N as described; diagrammatic results for the top 50 features of each class are illustrated (Fig. 1,B; the complete list of genes is available online, footnote 4). Because the expression profiles demonstrated robust class distinctions, we proceeded to construct five k-NN class prediction models. The number of features used in the models was chosen to give a range of prediction accuracy; increasing the number of genes in a model can improve prediction accuracy by providing additional biologically relevant input and affording robust signals against noise, whereas using too many genes can increase inaccuracy by generating excess noise. Models were built using 10, 20, 50, 100, or 250 features, and the training error for each model was calculated using leave-one-out cross-validation (Table 2). Although accuracy of the models was comparable, the 20-feature k-NN model was chosen for further study because it predicted most accurately the class distinctions of the classic glioma training set (18 of 21 correct calls; 86% accuracy).
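Leave-one-out cross-validation of a k-NN classifier, and the choice of the model with the lowest training error, can be illustrated with the short sketch below. It is not the GeneCluster code: the function names and the two-dimensional toy samples are hypothetical, and the feature set is treated as fixed across folds, which is an assumption because the text does not say whether features were reselected for each withheld sample. The error counts at the end are those reported in Table 2.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Majority class among the k training samples closest to query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loocv_errors(samples, labels, k=3):
    """Withhold each sample in turn and count the misclassified ones."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, samples[i], k) != labels[i]:
            errors += 1
    return errors


# Hypothetical toy samples: two tight clusters plus one "AO" sitting among the "GBM".
toy_samples = [(0, 0), (0, 1), (1, 0), (1, 1),
               (5, 5), (5, 6), (6, 5), (6, 6), (0.5, 0.5)]
toy_labels = ["GBM"] * 4 + ["AO"] * 5
print(loocv_errors(toy_samples, toy_labels, k=3))

# Training errors out of 21 classic cases, as listed in Table 2.
errors_by_model = {10: 4, 20: 3, 50: 5, 100: 4, 250: 6}
best_model = min(errors_by_model, key=errors_by_model.get)
best_accuracy = (21 - errors_by_model[best_model]) / 21
print(best_model, round(100 * best_accuracy))
```

With k = 3 and two classes a vote can never tie, which is one practical reason to prefer an odd k in a two-class problem.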

The 20 features used for prediction in this model correspond to 19 genes because of the presence of redundant probe sets (Table 3). Genes highly correlated with glioblastoma included a mixture of metabolic, structural, and signaling proteins. In particular, Rho GTPases (ARHC) and mitogen-activated protein kinases are members of Ras signal transduction pathways known to play a role in tumorigenesis and cell migration (17, 18). A large proportion of genes highly correlated with anaplastic oligodendroglioma was found to be involved in protein translation and ribosome biogenesis; translation factors have been implicated previously as effectors of tumorigenesis (19). Paradoxically, ribosomal protein-encoding genes were found recently to be correlated with poor outcome in medulloblastoma (16). These models thus provide a substantial number of features that correlate with glioma class distinction, but determination of the biological and clinical significance of these genes requires additional studies.

Training “Errors” of the Class Prediction Model.

Although a class prediction was made for all 21 classic gliomas using the model, such techniques typically classify some samples with more confidence than others. For this reason, confidence values were calculated for all predictions (Table 4). Of the three errors within the classic training set, one prediction was made with relatively high confidence (“Brain_CO_4”; ranked 9 of 21), and two were classified as low confidence predictions (“Brain_CG_5” and “Brain_CG_10”; ranked 16 and 18, respectively). “Brain_CO_4,” a classic anaplastic oligodendroglioma, displayed a gene expression profile strikingly more similar to that of glioblastoma (Fig. 1 B) and was classified as a glioblastoma with relatively high confidence in all five k-NN models examined (mean confidence value of 0.17). Reexamination of reports from the initial diagnosis and slides from the central pathology review gave no justification for a histological classification of glioblastoma. Although some evidence of nuclear pleomorphism and hyperchromasia was noted in the original pathology report, the presence of prominent perinuclear halos and a fine capillary network indicated a classic anaplastic oligodendroglioma. Furthermore, glial fibrillary acidic protein, an astrocytic marker, was not expressed in the neoplastic cells. Notably, however, although the histological features of “Brain_CO_4” were consistent with anaplastic oligodendroglioma, clinical data suggested a course more characteristic of a glioblastoma, with survival of only 7 months from diagnosis.

Independent Validation of Class Prediction through Survival Analysis.

The prediction model classified 18 of 21 classic gliomas identically to the pathological classification during leave-one-out cross-validation. The discrepancies in tumor classification could be the result of a class prediction model error or a diagnostic error; preliminary examination of the clinical behavior of “Brain_CO_4” suggested that the class prediction model provided more pertinent tumor classification. Ideally, the designation of error requires independent validation. Differences in survival between patients with glioblastomas and those with anaplastic oligodendrogliomas have been well documented (1); consequently, as an independent validation of the gene expression prediction model, prediction model classifications were compared with pathological diagnoses with respect to survival. When the classic gliomas were sorted according to pathology, a clear distinction was found between survival of patients with glioblastoma and those with anaplastic oligodendroglioma (Fig. 2). Although this comparison was not statistically significant (n = 21, P = 0.21), most likely because of the small sample size and relatively short follow-up time on three of the seven anaplastic oligodendrogliomas, statistically significant differences in survival were seen within the pathologically defined classes when all glioblastomas and anaplastic oligodendrogliomas were compared (n = 50, P = 0.009; data not shown). Remarkably, however, when the classic gliomas were sorted using class distinctions according to the model, survival differences were statistically significant (n = 21, P = 0.031; Fig. 2). These results demonstrate that, even within high-grade gliomas of classic histology, the biologically and clinically relevant information afforded by the genetic profiles augments that provided by pathology alone. Furthermore, the clinical outcome data suggest that the discrepancies in tumor classification are more likely caused by a diagnostic error than a class prediction model error.
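The survival curves for the classic tumors sorted by pathology can be approximated directly from the survival times and vital status listed in Table 1. The sketch below is a hand-written Kaplan-Meier (product-limit) estimator, not the GraphPad Prism routine used for the published plots; the function names are hypothetical, and living patients are treated as censored at last follow-up, as the Table 1 legend implies.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed death time.

    events[i] is 1 for a death and 0 for a patient alive at last follow-up.
    """
    survival = 1.0
    curve = []
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which estimated survival falls to 50% or below, else None."""
    for t, s in curve:
        if s <= 0.5 + 1e-9:  # tolerance for floating-point rounding
            return t
    return None


# Classic tumors from Table 1: (survival in days, 1 = dead, 0 = alive).
classic_gbm = [(308, 1), (281, 1), (501, 1), (670, 1), (729, 0), (21, 1), (630, 0),
               (263, 1), (219, 1), (408, 1), (242, 1), (323, 1), (213, 1), (97, 1)]
classic_ao = [(231, 0), (1674, 0), (1604, 0), (215, 1), (359, 0), (171, 0), (272, 1)]

gbm_curve = kaplan_meier(*zip(*classic_gbm))
ao_curve = kaplan_meier(*zip(*classic_ao))
print(median_survival(gbm_curve), median_survival(ao_curve))
print(ao_curve)
```

Five of the seven classic anaplastic oligodendroglioma patients were alive at last follow-up, three of them after less than a year, so the estimated curve for that group rests on only two deaths. This is consistent with the explanation given above for the nonsignificant pathology-based comparison in the classic set.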

Class Prediction of Nonclassic High-grade Gliomas.

Next, we examined the ability of this model to classify the common, nonclassic high-grade gliomas that currently cause such clinical uncertainty regarding therapy and prognosis (Fig. 3,A). The ability to identify these lesions in a uniform and reproducible manner would facilitate more accurate therapeutic decisions and prognostic estimation, allowing for improved clinical management of individual patients. The prediction model classifications were compared with pathological diagnoses with respect to survival. When these diagnostically challenging tumors were classified according to pathology, survival of patients with nonclassic glioblastoma was not significantly different from that of patients with nonclassic anaplastic oligodendroglioma (n = 29, P = 0.194; Fig. 3,B). These results demonstrate clearly the difficulty in distinguishing these challenging cases in a clinically relevant manner based exclusively on histological parameters. In contrast, class distinctions according to the gene expression-based model trained on the classic gliomas were statistically significant (P = 0.051), giving much better separation between the anaplastic oligodendroglioma and glioblastoma survival curves (Fig. 3 B). Thus, gene expression profiles have a remarkable ability to distinguish histologically ambiguous glioblastomas and anaplastic oligodendrogliomas in a clinically relevant manner. Indeed, gene expression profiles provide a more objective and accurate predictor of prognosis in high-grade nonclassic gliomas than does traditional histology. In addition, the ability to distinguish histologically ambiguous gliomas enables appropriate therapies to be tailored to specific tumor subtypes, sparing patients who would not respond from unnecessary treatments. Moreover, uniform and reproducible classification of these nonclassic lesions would provide improved stratification of patients in clinical trials and molecular marker studies.

Summary.

We investigated whether gene expression profiling, coupled with the computational methodology of class prediction, could be used to define subgroups of high-grade glioma in a manner more objective, explicit, and consistent than standard pathology. Not only was this method effective at classifying high-grade gliomas objectively and reproducibly, it also appeared to provide a more accurate predictor of prognosis. Although the training sample sets for these models were selected based on classic histological features, the biologically and clinically relevant information afforded by the genetic profiles greatly augments that provided by pathology alone. These data therefore suggest that class prediction models, based on defined molecular profiles, classify diagnostically challenging malignant gliomas in a manner that better correlates with clinical outcome than does standard pathology.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1 Supported in part by NIH Grant CA57683 (D. N. L.), Affymetrix and Bristol-Myers Squibb (Whitehead Institute/MIT Center for Genome Research), NIH Grant NS35701 (S. L. P.), and Canadian Institutes of Health Research MOP37849 (J. G. C.).

3 Internet address: http://www.cbtrus.org.

4 Internet address: http://www-genome.wi.mit.edu/cancer/pub/glioma.

5 The abbreviations used are: S2N, signal-to-noise; k-NN, k-nearest neighbor.

6 Internet address: http://www-genome.wi.mit.edu/cancer/software/software.html.

7 Internet address: http://www.r-project.org.

Fig. 1.

Characterization of classic high-grade gliomas. A, histological features of classic high-grade gliomas. “Brain_CG_3” (top), classic glioblastoma featuring cells with copious eosinophilic cytoplasm and fibrillary processes; “Brain_CG_7” (middle), classic glioblastoma illustrating pleomorphic and spindled cells; “Brain_CO_1” (bottom), classic anaplastic oligodendroglioma illustrating monomorphic cells with rounded nuclei and perinuclear halos. B, classification of high-grade gliomas by gene expression. Genes were ranked by the S2N metric according to their correlation with the classic glioblastoma (GBM) versus classic anaplastic oligodendroglioma (AO) distinction. Results are shown for the top 50 genes of each distinction. Each column represents a single glioma sample, and each row represents a single gene. For each gene, red indicates a high level of expression relative to the mean; blue indicates a low level of expression relative to the mean. The SD from the mean is indicated (σ). ∗, “Brain_CO_4” sample.

Fig. 2.

Survival curves of patients with the 14 classic glioblastomas (dashed line) and 7 classic anaplastic oligodendrogliomas (solid line) used to train the 20-feature k-NN class prediction model. Survival curves were plotted according to classifications based on either traditional pathology or the class prediction model. When classic tumors were sorted according to pathology, a clear distinction was found between survival of patients with glioblastoma and those with anaplastic oligodendroglioma, although this comparison was not significantly different (P = 0.21). Survival curves generated using class distinctions according to the class prediction model were significantly different (P = 0.031).

Fig. 3.

Characterization of nonclassic, high-grade gliomas. A, histological features of nonclassic, high-grade gliomas. “Brain_NG_1” (top), nonclassic glioblastoma with a region having microgemistocytes that raise the differential diagnosis of anaplastic oligodendroglioma; “Brain_NG_3” (middle), nonclassic glioblastoma with an area of rounded cells that resembles oligodendroglioma and more spindled cells that resemble glioblastoma; “Brain_NO_14” (bottom), nonclassic anaplastic oligodendroglioma with a region displaying the typical branching vasculature and calcification (arrowhead) of oligodendroglioma but with more spindled cells. B, survival curves of patients with the 14 nonclassic glioblastomas (dashed line) and 15 nonclassic anaplastic oligodendrogliomas (solid line). Survival curves were plotted according to classifications based on either traditional pathology or the class prediction model trained on the classic gliomas. When tumors were classified according to pathology, survival of patients with nonclassic glioblastoma was not significantly different from that of patients with nonclassic anaplastic oligodendroglioma (P = 0.194). In contrast, class distinctions according to the class prediction model were significantly different (P = 0.051).

Table 1

Summary of clinical parameters for the high-grade glioma dataset

Pathological diagnosis and survival from date of initial diagnosis are given for all patients. For living patients, survival is given to time of last follow-up.

Sample name  Pathology        Vital status  Survival (days)
Brain_CG_1   Classic GBM (a)  Dead          308
Brain_CG_2   Classic GBM      Dead          281
Brain_CG_3   Classic GBM      Dead          501
Brain_CG_4   Classic GBM      Dead          670
Brain_CG_5   Classic GBM      Alive         729
Brain_CG_6   Classic GBM      Dead          21
Brain_CG_7   Classic GBM      Alive         630
Brain_CG_8   Classic GBM      Dead          263
Brain_CG_9   Classic GBM      Dead          219
Brain_CG_10  Classic GBM      Dead          408
Brain_CG_11  Classic GBM      Dead          242
Brain_CG_12  Classic GBM      Dead          323
Brain_CG_13  Classic GBM      Dead          213
Brain_CG_14  Classic GBM      Dead          97
Brain_NG_1   Nonclassic GBM   Dead          1375
Brain_NG_2   Nonclassic GBM   Alive         1644
Brain_NG_3   Nonclassic GBM   Dead          406
Brain_NG_4   Nonclassic GBM   Dead          308
Brain_NG_5   Nonclassic GBM   Dead          177
Brain_NG_6   Nonclassic GBM   Dead          103
Brain_NG_7   Nonclassic GBM   Alive         992
Brain_NG_8   Nonclassic GBM   Dead          41
Brain_NG_9   Nonclassic GBM   Alive         1354
Brain_NG_10  Nonclassic GBM   Dead          276
Brain_NG_11  Nonclassic GBM   Dead          519
Brain_NG_12  Nonclassic GBM   Dead          368
Brain_NG_13  Nonclassic GBM   Dead          157
Brain_NG_14  Nonclassic GBM   Dead          1162
Brain_CO_1   Classic AO       Alive         231
Brain_CO_2   Classic AO       Alive         1674
Brain_CO_3   Classic AO       Alive         1604
Brain_CO_4   Classic AO       Dead          215
Brain_CO_5   Classic AO       Alive         359
Brain_CO_6   Classic AO       Alive         171
Brain_CO_7   Classic AO       Dead          272
Brain_NO_1   Nonclassic AO    Dead          63
Brain_NO_2   Nonclassic AO    Alive         585
Brain_NO_3   Nonclassic AO    Alive         1804
Brain_NO_4   Nonclassic AO    Dead          916
Brain_NO_5   Nonclassic AO    Dead          793
Brain_NO_6   Nonclassic AO    Dead          803
Brain_NO_7   Nonclassic AO    Dead          559
Brain_NO_8   Nonclassic AO    Alive         1137
Brain_NO_9   Nonclassic AO    Alive         1100
Brain_NO_10  Nonclassic AO    Dead          498
Brain_NO_11  Nonclassic AO    Alive         795
Brain_NO_12  Nonclassic AO    Dead          790
Brain_NO_13  Nonclassic AO    Dead          789
Brain_NO_14  Nonclassic AO    Alive         439
Brain_NO_15  Nonclassic AO    Alive         638
(a) GBM, glioblastoma; AO, anaplastic oligodendroglioma.

Table 2

Training error of k-NN models

Class prediction models were built using 10, 20, 50, 100, or 250 features, and the training error for each model was calculated using leave-one-out cross-validation.

No. of features  Error
10 features      4/21
20 features      3/21
50 features      5/21
100 features     4/21
250 features     6/21
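Table 2 reports leave-one-out cross-validation error for models of different sizes. The following is a minimal sketch of that procedure, not the authors' implementation. It assumes a signal-to-noise ranking for feature selection, k = 3, Euclidean distance, and majority voting, none of which is specified in this section. The data are synthetic and all function names are hypothetical.

```python
import math
import random


def signal_to_noise(values_a, values_b):
    """(mean_a - mean_b) / (sd_a + sd_b) for one feature (assumed ranking statistic)."""
    def mean(v):
        return sum(v) / len(v)

    def sd(v):
        m = mean(v)
        return math.sqrt(sum((x - m) ** 2 for x in v) / (len(v) - 1))

    denom = sd(values_a) + sd(values_b)
    return (mean(values_a) - mean(values_b)) / denom if denom > 0 else 0.0


def select_features(X, y, n_features):
    """Pick roughly n_features/2 top-ranked markers per class, using training data only."""
    label_a, label_b = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        va = [row[j] for row, lab in zip(X, y) if lab == label_a]
        vb = [row[j] for row, lab in zip(X, y) if lab == label_b]
        scores.append((signal_to_noise(va, vb), j))
    scores.sort()
    half = n_features // 2
    low = [j for _, j in scores[:half]]                    # higher in label_b
    high = [j for _, j in scores[-(n_features - half):]]   # higher in label_a
    return low + high


def knn_predict(X_train, y_train, x, features, k):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    neighbours = sorted(
        (math.dist([row[j] for j in features], [x[j] for j in features]), lab)
        for row, lab in zip(X_train, y_train)
    )
    votes = [lab for _, lab in neighbours[:k]]
    return max(set(votes), key=votes.count)


def loocv_errors(X, y, n_features, k=3):
    """Hold out each sample once; reselect features and classify without it."""
    errors = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = select_features(X_train, y_train, n_features)
        if knn_predict(X_train, y_train, X[i], features, k) != y[i]:
            errors += 1
    return errors


def make_toy_data(seed=0, n_genes=60):
    """Synthetic stand-in for expression data: 14 'GBM' and 7 'AO' samples."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in (("GBM", 14), ("AO", 7)):
        for _ in range(n):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == "GBM":
                for j in range(10):  # ten artificially informative genes
                    row[j] += 5.0
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
for n in (10, 20, 50):
    print(f"{n} features: {loocv_errors(X, y, n)}/{len(X)} errors")
```

Reselecting features inside each fold keeps the held-out sample from influencing which genes are chosen, so the reported error is not optimistically biased.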
Table 3

Features of the 20-feature k-NN class prediction model

Genes highly correlated with the class distinction of either GBM^a or AO in the 20-feature k-NN class prediction model. Affymetrix feature numbers, fold increase in gene expression (GBM > AO; AO > GBM), accession numbers, and gene identifications are shown.

Class correlation  Feature no.  Fold increase  Accession no.  Gene description
GBM                34091_s_at   2.55           Z19554         VIM: vimentin
GBM                630_at       4.83           L39874         DCTD: dCMP deaminase
GBM                631_g_at     2.80           L39874         DCTD: dCMP deaminase
GBM                39691_at     1.80           AB007960       SH3GLB1: SH3-domain GRB2-like endophilin B1
GBM                160039_at    5.57           NM_002747      MAPK4: mitogen-activated protein kinase 4
GBM                35016_at     1.89           M13560         CD74: CD74 antigen (invariant polypeptide of major histocompatibility complex, class II antigen associated)
GBM                38791_at     1.78           D29643         DDOST: dolichyl-diphosphooligosaccharide protein glycosyltransferase
GBM                1395_at      2.10           L25081         ARHC: ras homologue gene family, member C
GBM                37542_at     2.41           D86961         LHFPL2: lipoma HMGIC fusion partner-like 2
GBM                935_at       1.49           L12168         CAP: adenylyl cyclase-associated protein
AO                 33619_at     2.20           L01124         RPS13: ribosomal protein S13
AO                 34679_at     2.64           X02596         BCR: breakpoint cluster region
AO                 37573_at     3.96           AF007150       ANGPTL2: angiopoietin-like 2
AO                 33677_at     1.81           M94314         RPL24: ribosomal protein L24
AO                 326_i_at     2.03           HG1800–HT1823  RPS20: ribosomal protein S20
AO                 41325_at     2.43           AF006823       KCNK3: potassium channel, subfamily K, member 3 (TASK-1)
AO                 38681_at     1.76           U62962         EIF3S6: eukaryotic translation initiation factor 3, subunit 6 (48kD)
AO                 41792_at     2.16           L78207         ABCC8: ATP-binding cassette, subfamily C (CFTR/MRP), member 8
AO                 37249_at     3.40           AF079529       PDE8B: phosphodiesterase 8B
AO                 37953_s_at   2.77           U78181         ACCN2: amiloride-sensitive cation channel 2, neuronal
^a GBM, glioblastoma; AO, anaplastic oligodendroglioma.

Table 4

Summary of training sample set class predictions

Set includes the 21 classic high-grade gliomas. The “call” is the classification given by the 20-feature k-NN model during leave-one-out cross-validation and appears along with the confidence value. Errors are those tumors whose classification differed from the pathological classification.

Sample name  Call   Confidence  Pathology  Error
Brain_CG_8   GBM^a  0.677       GBM
Brain_CG_11  GBM    0.610       GBM
Brain_CG_3   GBM    0.558       GBM
Brain_CG_4   GBM    0.524       GBM
Brain_CG_14  GBM    0.455       GBM
Brain_CG_2   GBM    0.445       GBM
Brain_CO_5   AO     0.377       AO
Brain_CO_1   AO     0.234       AO
Brain_CO_4   GBM    0.224       AO         *^b
Brain_CG_1   GBM    0.182       GBM
Brain_CO_6   AO     0.166       AO
Brain_CG_9   GBM    0.158       GBM
Brain_CO_2   AO     0.143       AO
Brain_CO_7   AO     0.141       AO
Brain_CG_6   GBM    0.101       GBM
Brain_CG_5   AO     0.028       GBM        *
Brain_CO_3   AO     0.023       AO
Brain_CG_10  AO     0.021       GBM        *
Brain_CG_13  GBM    0.008       GBM
Brain_CG_12  GBM    0.006       GBM
Brain_CG_7   GBM    0.000       GBM
^a GBM, glioblastoma; AO, anaplastic oligodendroglioma.

^b *, discrepancies between class prediction model and pathological classification.
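The error count implied by Table 4 can be checked by comparing each call with the pathological diagnosis. The sketch below is illustrative only: the rows are transcribed from Table 4 with sample names written in the Brain_XX_n form used in Table 1, and the variable and function names are hypothetical.

```python
# (sample, model call, confidence, pathology) transcribed from Table 4.
TABLE_4 = [
    ("Brain_CG_8", "GBM", 0.677, "GBM"),
    ("Brain_CG_11", "GBM", 0.610, "GBM"),
    ("Brain_CG_3", "GBM", 0.558, "GBM"),
    ("Brain_CG_4", "GBM", 0.524, "GBM"),
    ("Brain_CG_14", "GBM", 0.455, "GBM"),
    ("Brain_CG_2", "GBM", 0.445, "GBM"),
    ("Brain_CO_5", "AO", 0.377, "AO"),
    ("Brain_CO_1", "AO", 0.234, "AO"),
    ("Brain_CO_4", "GBM", 0.224, "AO"),
    ("Brain_CG_1", "GBM", 0.182, "GBM"),
    ("Brain_CO_6", "AO", 0.166, "AO"),
    ("Brain_CG_9", "GBM", 0.158, "GBM"),
    ("Brain_CO_2", "AO", 0.143, "AO"),
    ("Brain_CO_7", "AO", 0.141, "AO"),
    # Row with pathology GBM and confidence 0.101; assumed to be Brain_CG_6,
    # the only classic GBM sample from Table 1 not otherwise listed.
    ("Brain_CG_6", "GBM", 0.101, "GBM"),
    ("Brain_CG_5", "AO", 0.028, "GBM"),
    ("Brain_CO_3", "AO", 0.023, "AO"),
    ("Brain_CG_10", "AO", 0.021, "GBM"),
    ("Brain_CG_13", "GBM", 0.008, "GBM"),
    ("Brain_CG_12", "GBM", 0.006, "GBM"),
    ("Brain_CG_7", "GBM", 0.000, "GBM"),
]


def tally(rows):
    """Return (number of correct calls, names of samples whose call differs from pathology)."""
    discrepancies = [name for name, call, _conf, path in rows if call != path]
    return len(rows) - len(discrepancies), discrepancies


correct, discrepancies = tally(TABLE_4)
print(f"{correct}/{len(TABLE_4)} correct; discrepancies: {discrepancies}")
# prints: 18/21 correct; discrepancies: ['Brain_CO_4', 'Brain_CG_5', 'Brain_CG_10']
```

The three discrepancies correspond to the 3/21 training error reported for the 20-feature model in Table 2.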

We thank Magdalena Zlatescu and Loc Pham for valuable assistance with collecting patient data, Marcela White and Jennifer Roy for accessing tissue samples and information, Lisa Sturla for technical assistance, members of the Program in Cancer Genomics at the Whitehead Institute/Massachusetts Institute of Technology Center for Genome Research for valuable discussions, and Anat Stemmer-Rachamimov for critical review of this manuscript.

1. Kleihues P., Cavenee W. K. World Health Organization Classification of Tumours of the Nervous System, WHO/IARC, Lyon, France, 2000.
2. Cairncross J. G., Macdonald D. R. Successful chemotherapy for malignant oligodendroglioma. Ann. Neurol., 23: 360-364, 1988.
3. Cairncross J. G., Ueki K., Zlatescu M. C., Lisle D. K., Finkelstein D. M., Hammond R. R., Silver J. S., Stark P. C., Macdonald D. R., Ino Y., Ramsay D. A., Louis D. N. Specific chromosomal losses predict chemotherapeutic response and survival in patients with anaplastic oligodendrogliomas. J. Natl. Cancer Inst. (Bethesda), 90: 1473-1479, 1998.
4. Burger P. C. What is an oligodendroglioma? Brain Pathol., 12: 257-259, 2002.
5. Ironside J. W., Moss T. H., Louis D. N., Lowe J. S., Weller R. O. Diagnostic Pathology of Nervous System Tumours, Churchill Livingstone, London, 2002.
6. Burger P. C., Scheithauer B. W., Vogel F. S. Surgical Pathology of the Nervous System and its Coverings, Ed. 4, Churchill Livingstone, London, 2002.
7. Louis D. N., Holland E. C., Cairncross J. G. Glioma classification: a molecular reappraisal. Am. J. Pathol., 159: 779-786, 2001.
8. Coons S. W., Johnson P. C., Scheithauer B. W., Yates A. J., Pearl D. K. Improving diagnostic accuracy and interobserver concordance in the classification and grading of primary gliomas. Cancer (Phila.), 79: 1381-1393, 1997.
9. Giannini C., Scheithauer B. W., Weaver A. L., Burger P. C., Kros J. M., Mork S., Graeber M. B., Bauserman S., Buckner J. C., Burton J., Riepe R., Tazelaar H. D., Nascimento A. G., Crotty T., Keeney G. L., Pernicone P., Altermatt H. Oligodendrogliomas: reproducibility and prognostic value of histologic diagnosis and grading. J. Neuropathol. Exp. Neurol., 60: 248-262, 2001.
10. Sasaki H., Zlatescu M. C., Betensky R. A., Johnk L., Cutone A., Cairncross J. G., Louis D. N. Histopathological-molecular genetic correlations in referral pathologist-diagnosed low-grade “oligodendroglioma.” J. Neuropathol. Exp. Neurol., 61: 58-63, 2002.
11. Burger P. C., Minn A. Y., Smith J. S., Borell T. J., Jedlicka A. E., Huntley B. K., Goldthwaite P. T., Jenkins R. B., Feuerstein B. G. Losses of chromosomal arms 1p and 19q in the diagnosis of oligodendroglioma. A study of paraffin-embedded sections. Mod. Pathol., 14: 842-853, 2001.
12. Lu Q. R., Park J. K., Noll E., Chan J. A., Alberta J., Yuk D., Alzamora M. G., Louis D. N., Stiles C. D., Rowitch D. H., Black P. M. Oligodendrocyte lineage genes (OLIG) as molecular markers for human glial brain tumors. Proc. Natl. Acad. Sci. USA, 98: 10851-10856, 2001.
13. Popko B., Pearl D. K., Walker D. M., Comas T. C., Baerwald K. D., Burger P. C., Scheithauer B. W., Yates A. J. Molecular markers that identify human astrocytomas and oligodendrogliomas. J. Neuropathol. Exp. Neurol., 61: 329-338, 2002.
14. Golub T. R., Slonim D. K., Tamayo P., Huard C., Gaasenbeek M., Mesirov J. P., Coller H., Loh M. L., Downing J. R., Caligiuri M. A., Bloomfield C. D., Lander E. S. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science (Wash. DC), 286: 531-537, 1999.
15. Bhattacharjee A., Richards W. G., Staunton J., Li C., Monti S., Vasa P., Ladd C., Beheshti J., Bueno R., Gillette M., Loda M., Weber G., Mark E. J., Lander E. S., Wong W., Johnson B. E., Golub T. R., Sugarbaker D. J., Meyerson M. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc. Natl. Acad. Sci. USA, 98: 13790-13795, 2001.
16. Pomeroy S. L., Tamayo P., Gaasenbeek M., Sturla L. M., Angelo M., McLaughlin M. E., Kim J. Y. H., Goumnerova L. C., Black P. M., Lau C., Allen J. C., Zagzag D., Olson J. M., Curran T., Wetmore C., Biegel J. A., Poggio T., Mukherjee S., Rifkin R., Califano A., Stolovitzky G., Louis D. N., Mesirov J. P., Lander E. S., Golub T. R. Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature (Lond.), 415: 436-442, 2002.
17. Boettner B., Van Aelst L. The role of Rho GTPases in disease development. Gene, 286: 155-174, 2002.
18. Ridley A. J. Rho GTPases and cell migration. J. Cell Sci., 114: 2713-2722, 2001.
19. Clemens M. J., Bommer U-A. Translational control: the cancer connection. Int. J. Biochem. Cell Biol., 31: 1-23, 1999.