Abstract
Purpose: To predict individual survival times for neuroblastoma patients from gene expression data using the cancer survival prediction using automatic relevance determination (CASPAR) algorithm.
Experimental Design: A first set of oligonucleotide microarray gene expression profiles comprising 256 neuroblastoma patients was generated. Then, CASPAR was combined with a leave-one-out cross-validation to predict individual survival times for both the whole cohort and subgroups of patients with unfavorable markers, including stage 4 disease (n = 67), unfavorable genetic alterations, intermediate-risk or high-risk stratification by the German neuroblastoma trial, and patients predicted as unfavorable by a recently described gene expression classifier (n = 83). Prediction accuracy of individual survival times was assessed by Kaplan-Meier analyses and time-dependent receiver operator characteristics curve analyses. Subsequently, classification results were validated in an independent cohort (n = 120).
Results: CASPAR separated patients with divergent outcome in both the initial and the validation cohort [initial set, 5y-OS 0.94 ± 0.04 (predicted long survival) versus 0.38 ± 0.17 (predicted short survival), P < 0.0001; validation cohort, 5y-OS 0.94 ± 0.07 (long) versus 0.40 ± 0.13 (short), P < 0.0001]. Time-dependent receiver operator characteristics analyses showed that CASPAR-predicted individual survival times were highly accurate (initial set, mean area under the curve for first 10 years of overall survival prediction 0.92 ± 0.04; validation set, 0.81 ± 0.05). Furthermore, CASPAR significantly discriminated short (<5 years) from long survivors (>5 years) in subgroups of patients with unfavorable markers with the exception of MYCN-amplified patients (initial set). Confirmatory results with high significance were observed in the validation cohort [stage 4 disease (P = 0.0049), NB2004 intermediate-risk or high-risk stratification (P = 0.0017), and unfavorable gene expression prediction (P = 0.0017)].
Conclusions: CASPAR accurately forecasts individual survival times for neuroblastoma patients from gene expression data.
Current clinical trials stratify neuroblastoma patients into risk groups with distinctive outcome. However, the individual courses within these risk groups, in particular those of advanced-risk patients, still vary considerably, covering both patients who are cured by present treatment concepts, or possibly even overtreated, and others who present as nonresponders to current therapeutic strategies. Similar problems also remain when categorizing neuroblastoma patients by means of gene expression–based approaches.
In the present study, we applied a novel algorithm, termed cancer survival prediction using automatic relevance determination (CASPAR), to predict patients' individual survival time as a continuous variable based on gene expression data of each patient's tumor. Because CASPAR-predicted values for patients' individual event-free and overall survival times were highly accurate, our approach may be helpful in optimizing both clinical monitoring and therapeutic management of neuroblastoma patients during first-line treatment and aftercare. Furthermore, we observed that our strategy was particularly accurate in identifying patients with an ultra-poor outcome in subgroups with unfavorable markers or current advanced-risk groups. We therefore feel confident that CASPAR-based prediction presents a valuable tool to identify patients for whom current treatment is insufficient and who should be considered for alternative therapeutic approaches.
Neuroblastoma is a malignant pediatric tumor originating from migrating neural crest cells that accounts for ∼8% of all childhood cancers (1). One of the hallmarks of the disease is its contrasting biological behavior, which results in diverse clinical courses ranging from spontaneous regression to rapid and fatal tumor progression despite intensive treatment. In recent years, several markers have been reported to offer valuable prognostic information. Of these, tumor stage (2), patients' age at diagnosis (3), genomic amplification of the MYCN oncogene on chromosome 2p24.1 (4), and allelic loss of the short arm of chromosome 1 (del 1p; ref. 5) are routinely determined by the current German neuroblastoma trial NB2004 to stratify patients into groups of low risk (∼50% of patients), intermediate risk (∼10%), and high risk (∼40%) of disease. Therapeutic strategies vary according to these risk categories and range from a wait-and-see approach (low risk) to intensive cytotoxic treatment, including myeloablative therapy with subsequent autologous stem cell transplantation (high risk).
As common clinical experience suggests that such risk classification is still suboptimal for a substantial number of patients, various other markers were proposed to provide additional prognostic information. These comprise genetic alterations of the chromosomal regions 3p (del 3p), 11q (del 11q; ref. 6), and 17q (gain 17q; ref. 7) or determination of expression levels of certain candidate genes (e.g., NTRK1, ref. 8; CD44, ref. 9). More recently, studies supported the implementation of complex gene expression signatures to more precisely reflect the underlying biological phenotype of the tumor (10–13). However, grouping of patients according to both current and proposed markers still results in sizable subgroups in which the observed individual clinical courses remain diverse. Patients classified as unfavorable, for instance, may show event-free survival after therapy (i.e., patients cured by therapy), survival after relapse of disease and salvage therapy, or fatal tumor progression despite treatment.
In this study, to overcome this limitation, we used the recently described cancer survival prediction using automatic relevance determination (CASPAR; ref. 14) algorithm to predict individual survival time as a continuous variable for neuroblastoma patients from gene expression profiles of their tumors. To comprehensively evaluate the performance of this algorithm, we first tested CASPAR on a considerable cohort of 256 initial pretreatment neuroblastoma tumors of all stages. Subsequently, we applied the algorithm to conduct a systematic gene expression–based classification of subgroups of the disease carrying unfavorable markers, as particularly these patients may benefit from a refined prediction of their disease's course. Finally, we used gene expression profiles of an independent test cohort of 120 initial, pretreatment neuroblastoma samples of all stages to validate the accuracy of CASPAR in predicting individual survival times from gene expression data.
Patients and Methods
Patients. The first set of the study comprised 256 patients of the German Neuroblastoma Trials NB90-NB2004, diagnosed between 1989 and 2004. Informed consent was obtained from all patients before this study. Patients' age at diagnosis ranged from 0 to 296 mo (median, 15 mo). Median follow-up for patients without fatal events was 4.5 y (0.8-15.6 y). Stage was classified according to the International Neuroblastoma Staging System (3); response to treatment was defined according to the revised criteria of the International Neuroblastoma Response Criteria (3). Analysis of chromosomal alterations was done by fluorescent in situ hybridization, as described (15, 16), and aberrations were defined according to the guidelines of the European Neuroblastoma Quality Assessment Group (17). Stratification of neuroblastoma patients was done according to the criteria of the current German neuroblastoma trial (NB2004), as described (11, 18).
As an independent second set, gene expression profiles of another 120 patients from centers in several countries were generated. In this set, 29 samples were obtained from German patients enrolled in German neuroblastoma trials (NB90-NB2004). Remaining samples were obtained from patients enrolled in national trials in the United States (n = 26), France (n = 26), Spain (n = 12), Italy (n = 11), Belgium (n = 6), the United Kingdom (n = 5), and Israel (n = 5). Informed consent was obtained from all patients before this study. In the complete test set, age of patients at diagnosis ranged from 0 to 125 mo (median, 15 mo) and median follow-up for patients without fatal events was 4.4 y (0.4-18.1 y). Stage, response to treatment, and chromosomal aberrations were classified as described above, as was risk stratification for the current German neuroblastoma trial NB2004 whenever possible.
Sample preparation. Tumor samples were checked by a pathologist for ≥60% tumor content. Total RNA was isolated from 30 to 60 mg of snap-frozen tissue obtained before cytotoxic treatment using the FastPrep FP120 cell disruptor (Qbiogene, Inc.) and the TRIzol reagent (Invitrogen) according to the manufacturer's protocol. RNA integrity was assessed using the 2100 Bioanalyzer (Agilent Technologies), considering only samples with RNA integrity numbers of ≥7.5.
Gene expression profiling. In this study, customized neuroblastoma-related oligonucleotide microarrays were used as described (11). Microarrays were produced by Agilent Technologies. Profiles from neuroblastoma tumors were generated as dye-flipped duplicates in dual-color experiments. For each sample, 1 μg of linearly amplified Cy3-labeled and Cy5-labeled cRNA, respectively, was hybridized with 1 μg of reverse-color Cy-labeled cRNA of a total RNA pool of 100 neuroblastoma tumor samples using Agilent's Low RNA Input Fluor Linear Amp kit and In situ Hyb kit Plus. After washing and scanning, raw microarray data were preprocessed using software packages from the R-project and Bioconductor (19). Quality control was done using the package ArrayMagic (20). Subsequently, expression data and annotations were stored in a relational database, iCHIP, developed at the German Cancer Research Center, which complies with the MIAME (minimal information about a microarray experiment) guidelines. Samples were normalized using the variance stabilization algorithm (21), and data from dye-flipped chip pairs were averaged, yielding one intensity value for every microarray probe of each patient. All microarray data generated in this study are available for download at the ArrayExpress database (accession no. E-MTAB-16).
Supervised class prediction analysis using CASPAR and statistical analysis. In this study, the CASPAR algorithm was used to predict individual survival times as continuous variable for neuroblastoma patients based on gene expression measurements. In comparison to other classification algorithms, CASPAR works directly with (possibly censored) time-to-event data and does not need a manual classification of training samples into discrete classes. A detailed description of the CASPAR algorithm is given in ref. 14. In brief, CASPAR is based on a multivariate Cox regression model that is embedded in a Bayesian framework. The Cox regression is done simultaneously for all genes in the Bayesian statistical framework, which is made possible by using a special prior distribution on the regression coefficients. CASPAR weighs all genes on a continuous scale, and the prior distribution used by CASPAR drives the regression to solutions where most genes have weights very close to 0. Due to this penalty scheme, the Cox model defines solutions with only few, relevant genes from the high-dimensional gene expression data. Genes receiving the highest weights (top-ranked positive and negative genes) are indicated in the manuscript.
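The idea of a Cox regression whose coefficients are pulled toward 0 can be illustrated with a minimal sketch. It is not the CASPAR implementation: the automatic relevance determination prior of ref. 14 is replaced here by a plain L1 penalty, chosen only because it also favors solutions with few nonzero gene weights. The function names, the `strength` parameter, and the toy data are hypothetical, and tied event times are assumed to be absent.

```python
import math

def neg_log_partial_likelihood(weights, expression, times, events):
    """Negative Cox partial log-likelihood, assuming no tied event times."""
    # Risk score per patient: weighted sum of that patient's expression values.
    scores = [sum(w * x for w, x in zip(weights, row)) for row in expression]
    total = 0.0
    for i, (t_i, had_event) in enumerate(zip(times, events)):
        if not had_event:
            continue  # censored patients enter only through the risk sets
        # Risk set: every patient still under observation at time t_i.
        log_risk = math.log(sum(math.exp(scores[j])
                                for j, t_j in enumerate(times) if t_j >= t_i))
        total += scores[i] - log_risk
    return -total

def penalized_objective(weights, expression, times, events, strength=1.0):
    """Hypothetical objective: partial likelihood plus an L1 sparsity penalty
    (a stand-in for, not a copy of, the CASPAR prior)."""
    penalty = strength * sum(abs(w) for w in weights)
    return neg_log_partial_likelihood(weights, expression, times, events) + penalty

# Toy data (made up): one gene, three patients, the last one censored.
expression = [[1.0], [0.0], [-1.0]]
times = [1.0, 2.0, 3.0]
events = [1, 1, 0]
```

In this toy cohort, high expression goes with early events, so a positive weight lowers the unpenalized objective relative to a weight of 0; the penalty term then decides whether that gain is large enough to keep the gene in the model.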
In this study, CASPAR was combined with a leave-one-out cross-validation to predict event-free survival (EFS) and overall survival (OS; with time as continuous variable) for the respective patient cohort under consideration, based on the gene expression data only.
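The leave-one-out scheme can be sketched generically as follows. The `fit` and `predict` callables are placeholders for any survival model; the mean-based stand-in used here is hypothetical and only serves to show that the held-out patient never contributes to the model that predicts that patient.

```python
def leave_one_out_predictions(samples, fit, predict):
    """Predict each sample with a model trained on all other samples."""
    predictions = []
    for i, held_out in enumerate(samples):
        training = samples[:i] + samples[i + 1:]  # the held-out patient is excluded
        model = fit(training)
        predictions.append(predict(model, held_out))
    return predictions

# Hypothetical stand-in model: predicts the mean survival time of the training set.
def fit_mean(training):
    return sum(time for _, time in training) / len(training)

def predict_mean(model, sample):
    return model

# Toy cohort (made up): (expression profile, survival time in years).
cohort = [([0.2, 1.1], 2.0), ([0.4, 0.9], 4.0), ([0.1, 1.3], 6.0)]
loo = leave_one_out_predictions(cohort, fit_mean, predict_mean)
```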
Due to censoring of the survival times for some patients, prediction errors cannot readily be computed. Therefore, patients were subdivided into predicted long and short survivor subgroups, based on the CASPAR predictions, using the median true survival time of the patient cohort under consideration as threshold defining the two groups. Kaplan-Meier estimates for EFS and OS were then calculated for the predicted subgroups and compared using the log-rank test. Recurrence, progression, and death from disease were considered as events.
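The grouping and comparison described above can be sketched in a few lines. This is an illustrative implementation of the standard Kaplan-Meier estimator and two-group log-rank test, not the code used in the study; function names are hypothetical, and `events` is assumed to hold 1 for an observed event and 0 for a censored observation.

```python
import math

def split_by_threshold(predicted, threshold):
    """Indices of predicted short (< threshold) and long (>= threshold) survivors."""
    short = [i for i, p in enumerate(predicted) if p < threshold]
    long_ = [i for i, p in enumerate(predicted) if p >= threshold]
    return short, long_

def kaplan_meier(times, events, horizon):
    """Kaplan-Meier estimate of the survival probability at `horizon`."""
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e and t <= horizon}):
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1.0 - failed / at_risk
    return survival

def log_rank_p(times_a, events_a, times_b, events_b):
    """Two-sided log-rank P value for two groups (chi-square, 1 df)."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    diff = 0.0      # observed minus expected events in group a
    variance = 0.0
    for t in sorted({t for t, e, _ in pooled if e}):
        n = sum(1 for u, _, _ in pooled if u >= t)
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)
        d = sum(1 for u, e, _ in pooled if e and u == t)
        d_a = sum(1 for u, e, g in pooled if e and u == t and g == 0)
        diff += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 1.0  # no information to separate the groups
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(diff ** 2 / variance / 2.0))
```

With survival times in years, the 5-year probabilities reported for each predicted subgroup would correspond to `kaplan_meier(times, events, 5.0)` evaluated on that subgroup.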
Receiver operator characteristics (ROC) analysis was carried out by varying the threshold variable separating the prediction groups and computing sensitivity and specificity in each case. ROC curves were generated and summarized by computing the area under the curve (AUC). To directly evaluate correctness of predicted survival times for all time points and, thus, to reflect the accuracy of the predicted survival times, we calculated time-dependent ROC, as suggested by Heagerty (22), and summarized this information by computing and plotting the AUC against time.
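A simplified view of an AUC evaluated at a fixed time point is sketched below. It is not the estimator of ref. 22, which handles censoring through weighting; as a labeled simplification, patients censored before the time point are simply dropped. Cases are patients with an event up to time `t`, controls are patients still under observation after `t`, and the AUC is computed in its rank form, which equals the area under the empirical ROC curve obtained by sweeping the threshold. All names are hypothetical.

```python
def auc_at_time(predicted, times, events, t):
    """Naive AUC at time t: chance that a case has a shorter predicted
    survival time than a control (ties count one half)."""
    cases = [p for p, u, e in zip(predicted, times, events) if e and u <= t]
    controls = [p for p, u in zip(predicted, times) if u > t]
    if not cases or not controls:
        return None  # AUC is undefined without both groups
    concordant = sum(1.0 if c < k else 0.5 if c == k else 0.0
                     for c in cases for k in controls)
    return concordant / (len(cases) * len(controls))

def mean_auc(predicted, times, events, grid):
    """Average of the defined AUC values over a grid of time points."""
    values = [auc_at_time(predicted, times, events, t) for t in grid]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None
```

Plotting `auc_at_time` over a grid of time points gives an AUC-versus-time curve of the kind described above, and `mean_auc` condenses it into a single summary value.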
After evaluating CASPAR's performance in the leave-one-out analyses on the first set, the accuracy of the algorithm in predicting survival times for independent data sets was also assessed. To this end, we first trained the algorithm on all data of the training set for each particular cohort (e.g., stage 4 patients of the first set). Then, CASPAR was applied to predict survival times for patients of the corresponding validation cohort (e.g., stage 4 patients of the validation cohort) using the gene expression information of each feature in combination with the weight the feature was given by CASPAR in the initial cohort.
Hierarchical cluster analysis. Hierarchical cluster analysis was done using the heatmap function of the R software for statistical computing with Euclidean distance and complete linkage as parameters. To reflect the weighting of the gene relevances assigned by CASPAR, gene expression values were rescaled according to the CASPAR weights before clustering.
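The effect of rescaling before clustering can be illustrated with a small sketch. The exact rescaling used in the study is not specified here, so multiplying each expression value by its gene weight is an assumption, and the pure-Python complete-linkage routine merely stands in for the R function; all names are hypothetical.

```python
import math

def rescale(profile, weights):
    """Assumed rescaling: multiply each expression value by its gene weight."""
    return [w * x for w, x in zip(weights, profile)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(profiles, n_clusters):
    """Agglomerative clustering; returns lists of sample indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                d = max(euclidean(profiles[i], profiles[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy data (made up): gene 1 is noisy but irrelevant, gene 2 carries the signal.
raw = [[10.0, 0.0], [-10.0, 0.5], [10.0, 4.0], [-10.0, 4.5]]
weights = [0.0, 1.0]
weighted = [rescale(row, weights) for row in raw]
```

On the raw toy profiles the noisy gene dominates the distances, whereas after rescaling the samples group by the gene that received a nonzero weight.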
Results
EFS and OS time prediction on the whole cohort of 256 neuroblastoma patients using CASPAR. First, the CASPAR algorithm was applied to gene expression information of the complete set of 256 neuroblastoma patients, and survival times for both EFS and OS were predicted by a leave-one-out cross-validation (results of all CASPAR predictions for the first cohort of 256 neuroblastoma patients are summarized in Supplementary Table S3). To visualize classification results, Kaplan-Meier analyses for subcohorts of patients with predicted short and long EFS and OS, respectively, were done using a 5-year cutoff, as most patients with neuroblastoma experience relapse or fatal outcome of disease within this period of time [5y-EFS probability in the predicted short EFS group (n = 67) 0.33 ± 0.14 versus predicted long EFS group (n = 189) 0.82 ± 0.06, P < 0.0001 (Fig. 1A), and 5y-OS predicted short OS group (n = 53) 0.38 ± 0.17 versus predicted long OS group (n = 203) 0.94 ± 0.04, P < 0.0001 (Fig. 1B)]. In addition to the significant discrimination of divergent courses of the disease, it was observed that CASPAR-predicted short and long survival times were correlated with other clinical markers, such as tumor stage, age at diagnosis, or MYCN amplification (MNA; Table 1).
EFS | CASPAR predicted >5 y (n = 189) | CASPAR predicted <5 y (n = 67) | P
---|---|---|---
Mean age at diagnosis (d) | 504 | 1,395 |
MNA | 3 (1.5%) | 29 (43.4%) |
Stage 4 disease | 29 (15.3%) | 38 (56.7%) |
5-y EFS | 0.82 ± 0.06 | 0.33 ± 0.14 | <0.001
OS | CASPAR predicted >5 y (n = 203) | CASPAR predicted <5 y (n = 53) | P
Mean age at diagnosis (d) | 500 | 1,645 |
MNA | 1 (0.5%) | 31 (58.5%) |
Stage 4 disease | 36 (17.7%) | 31 (58.5%) |
5-y OS | 0.94 ± 0.04 | 0.38 ± 0.17 | <0.001
NOTE: Estimated 5-y EFS and OS probability according to Kaplan-Meier and clinical covariates for the subgroups of CASPAR-predicted long (>5 y) and short (<5 y) event-free and overall survivors of the initial cohort.
Subsequently, we sought to assess the accuracy of the algorithm in identifying contrasting courses of the disease and did ROC analyses for these classifications. As highlighted in Fig. 1C and D, CASPAR predicted the 5y-EFS and 5y-OS of patients with both balanced and high sensitivity and specificity. To further evaluate CASPAR's accuracy in predicting patients' individual EFS and OS time, we implemented time-dependent ROC analyses, as proposed by Heagerty and colleagues (22). Following this approach, we calculated the AUC (ROC) continuously over a time interval of 20 years and plotted AUC values against time. As higher AUC values represent higher performance, these analyses showed that CASPAR-predicted times for both individual EFS (mean AUC values over first 10 years, 0.82 ± 0.06; range, 0.57-0.87) and OS (mean AUC values over first 10 years, 0.92 ± 0.04; range, 0.65-0.996) were highly accurate within our patient cohort (Fig. 1E and F). The 50 top-ranked genes that were identified by CASPAR as being most highly associated with EFS and OS times are listed in Supplementary Table S1A (EFS) and B (OS).
Survival time prediction of neuroblastoma patients with unfavorable single markers. From a clinical perspective, accurate prediction of individual survival times seems most desirable for patients with known unfavorable tumor biology, who are considered to require cytotoxic treatment, to discriminate long-term survivors from those patients in whom cytotoxic treatment will fail. Thus, we next tested CASPAR on subgroups of neuroblastoma patients who were characterized as unfavorable by current single markers. To this end, we identified subcohorts of patients with (a) stage 4 disease (n = 67), (b) MNA (n = 32), (c) 1p deletion (n = 53), (d) 11q deletion (n = 56), and (e) 17q gain (n = 78) from the whole set and applied CASPAR to predict individual survival times of patients for OS from gene expression information of these subcohorts (calculations for EFS yielded similar results, which are not shown in this manuscript). Again, a predicted survival time of ≥5 years was chosen as cutoff to perform Kaplan-Meier analyses of predicted short (<5 y) and long (>5 y) survivors. As indicated in Fig. 2A-E, CASPAR significantly discriminated subgroups with divergent outcome within the cohorts of patients with stage 4 disease (P < 0.0001), 11q deletion (P < 0.0001), and 17q gain (P < 0.0001) but did less well in the cohorts of patients with 1p deletion (P = 0.0306) or MNA (P = 0.301). Consistently, we observed high AUC values in the time-dependent ROC analyses of the prediction for stage 4 patients (mean AUC over the first 10 years, 0.78 ± 0.04; range, 0.7-0.99), patients with 11q deletion (mean AUC, 0.82 ± 0.05; range, 0.67-0.98), and patients with 17q gain (mean AUC, 0.81 ± 0.07; range, 0.73-0.99; data not shown), indicating that CASPAR forecasted individual survival times with high accuracy within these groups.
In contrast, time-dependent ROC analyses showed poorer performance for CASPAR's predictions in the cohorts with MNA and 1p deletion (mean AUC for MNA 0.5 ± 0.09, range 0.45-0.97; mean AUC for 1p deletion 0.7 ± 0.05, range 0.59-0.98). The distribution of clinical covariates within the CASPAR-predicted subgroups of these cohorts is shown in Table 2A.
A. Covariates and OS predictions for subgroups of the first cohort of 256 patients* | | |
---|---|---|---
 | CASPAR predicted OS >5 y | CASPAR predicted OS <5 y | P
Patients with stage 4 disease (n = 67) | | |
 | n = 47 | n = 20 |
Age at diagnosis (d) | 1,388 | 1,831 |
MNA | 1 | 14 |
5y-OS | 0.66 ± 0.16 | 0.2 ± 0.22 | 0.00000127
Patients with MNA (n = 32) | | |
 | n = 16 | n = 16 |
Age at diagnosis (d) | 1,353 | 1,311 |
Stage 4 disease | 9 | 6 |
5y-OS | 0.44 ± 0.32 | 0.37 ± 0.29 | 0.301
Patients with del1p (n = 53) | | |
 | n = 31 | n = 22 |
Age at diagnosis (d) | 926 | 1,410 |
Stage 4 disease | 18 | 10 |
MNA | 11 | 19 |
5y-OS | 0.59 ± 0.21 | 0.38 ± 0.24 | 0.0306
Patients with del11q (n = 56) | | |
 | n = 46 | n = 10 |
Age at diagnosis (d) | 2,452 | 917 |
Stage 4 disease | 30 | 6 |
MNA | 0 | 6 |
5y-OS | 0.76 ± 0.15 | <0.18 | <0.000001
Patients with gain17q (n = 78) | | |
 | n = 39 | n = 39 |
Age at diagnosis (d) | 456 | 1,510 |
Stage 4 disease | 10 | 24 |
MNA | 0 | 11 |
5y-OS | 0.93 ± 0.1 | 0.48 ± 0.18 | 0.0000725
NB2004 HR (n = 79) | | |
 | n = 50 | n = 29 |
Age at diagnosis (d) | 1,275 | 1,955 |
Stage 4 disease | 46 | 16 |
MNA | 9 | 23 |
5y-OS | 0.59 ± 0.17 | 0.34 ± 0.2 | 0.0000918
NB2004 IR or HR (n = 94) | | |
 | n = 57 | n = 37 |
Age at diagnosis (d) | 1,248 | 1,668 |
Stage 4 disease | 45 | 22 |
MNA | 5 | 27 |
IR patients | 11 | 4 |
5y-OS | 0.69 ± 0.15 | 0.34 ± 0.19 | 0.00000398
Unfavorable PAM 144-gene prediction (n = 83) | | |
 | n = 35 | n = 48 |
Age at diagnosis (d) | 1,163 | 1,561 |
Stage 4 disease | 24 | 27 |
MNA | 2 | 29 |
5y-OS | 0.62 ± 0.21 | 0.33 ± 0.18 | 0.000386
B. Covariates and survival time predictions for neuroblastoma patients of the validation cohort† | | |
 | CASPAR predicted EFS >5 y | CASPAR predicted EFS <5 y | P
Whole cohort (n = 120) | | |
 | n = 67 | n = 53 |
Age at diagnosis (d) | 485 | 1,080 |
MNA | 1 | 22 |
Stage 4 disease | 16 | 41 |
5-y EFS | 0.83 ± 0.10 | 0.26 ± 0.13 | <0.0001
 | CASPAR predicted OS >5 y | CASPAR predicted OS <5 y |
Whole cohort (n = 120) | | |
 | n = 59 | n = 61 |
Age at diagnosis (d) | 437 | 1,048 |
MNA | 0 | 23 |
Stage 4 disease | 15 | 42 |
5-y OS | 0.94 ± 0.07 | 0.40 ± 0.13 | <0.0001
Patients with stage 4 disease (n = 57) | | |
 | n = 36 | n = 21 |
Age at diagnosis (d) | 962 | 1,272 |
MNA | 1 | 16 |
5y-OS | 0.54 ± 0.17 | 0.25 ± 0.22 | 0.0049
NB2004 IR or HR (n = 71) | | |
 | n = 39 | n = 32 |
Age at diagnosis (d) | 947 | 1,115 |
Stage 4 disease | 29 | 28 |
MNA | 2 | 21 |
5y-OS | 0.62 ± 0.16 | 0.27 ± 0.18 | 0.0017
Unfavorable PAM 144-gene prediction (n = 66) | | |
 | n = 31 | n = 35 |
Age at diagnosis (d) | 858 | 1,106 |
Stage 4 disease | 16 | 29 |
MNA | 1 | 21 |
5y-OS | 0.51 ± 0.18 | 0.23 ± 0.17 | 0.0017
*Estimated 5-y OS probability according to Kaplan-Meier and clinical covariates for CASPAR-predicted long (>5 y) and short survivors (<5 y) for subgroups of the disease with unfavorable markers.
†Estimated 5-y EFS and OS probability according to Kaplan-Meier and clinical covariates for CASPAR-predicted long (>5 y) and short (<5 y) event-free and overall survivors in the validation cohort.
Survival time prediction of neuroblastoma patients with unfavorable risk-stratification or gene expression–based classification. To optimize risk assessment of neuroblastoma patients, current trials combine several clinical and genetic markers to categorize patients into different risk groups with assumed similar tumor behavior. To test whether CASPAR is able to identify contrasting courses of disease within current trial risk groups, we applied the algorithm on two categories of patients: (a) patients stratified as high risk by the German NB2004 neuroblastoma trial (n = 79), reflecting those patients who qualify for most aggressive treatment regimens, and (b) patients stratified as either intermediate or high risk by the NB2004 trial (n = 94), as this group covers all patients currently considered to require cytotoxic treatment (low-risk patients follow a watch-and-wait approach). As indicated by Kaplan-Meier analyses in Fig. 3A and B, CASPAR significantly discriminated patients with divergent outcome in both subgroups (clinical covariates of the predicted subgroups are shown in Table 2A).
Then, as the most challenging analysis, we applied CASPAR to a cohort of 83 patients who were defined as unfavorable by our recently published, highly specific 144-gene expression classifier that was reported to outperform risk classification of current clinical trials, including the NB2004 trial (11). Using CASPAR, we were able to significantly discriminate patients with divergent courses of the disease within this unequivocally unfavorable group (Table 2A). As indicated in Fig. 3C, the 5-year OS probability in the CASPAR-predicted long survivor group was 0.62 ± 0.21, whereas predicted short survivors clearly formed an ultra-poor outcome group, in which patients have to be anticipated as insufficiently responding to current therapeutic strategies (5y-OS, 0.33 ± 0.18; P = 0.000386). Time-dependent ROC analyses yielded mean AUC values of 0.71 ± 0.095 (range, 0.61-0.99) for this extremely difficult subgroup of patients (data not shown). In addition to these calculations, we did a hierarchical cluster analysis of the 83 patients of this cohort using gene expression information of all 459 genes that were selected by CASPAR as being correlated with survival in this cohort (Fig. 3D; top 25 positively and negatively correlated genes are summarized in Supplementary Table S2). As indicated by the color bar beside the clustering, CASPAR-selected genes differentiate between groups of patients with discriminative outcome in this unfavorably classified cohort, underscoring that CASPAR can be used to detect specific, survival time–dependent gene expression patterns that further subdivide homogeneous patient cohorts.
Performance of CASPAR in an independent cohort. To validate the results of CASPAR-based individual survival time prediction for neuroblastoma patients, we generated gene expression profiles of 120 additional samples covering all stages and clinical courses of the disease. Subsequently, CASPAR was applied to predict survival times for EFS and OS considering gene weights as defined by the CASPAR calculation for EFS and OS, respectively, in the initial cohort (results of all CASPAR predictions for the validation cohort of 120 neuroblastoma patients are summarized in Supplementary Table S4). Based on the CASPAR prediction, Kaplan-Meier analyses for subcohorts of patients with predicted short and long EFS and OS, respectively, were done using a 5-year cutoff. With respect to EFS, CASPAR separated a predicted short EFS group (n = 53) with a 5y-EFS probability of 0.26 ± 0.13 from a predicted long EFS group (n = 67) with a 5y-EFS probability of 0.83 ± 0.10, P < 0.0001 (Fig. 4A). Likewise, CASPAR discriminated a predicted short OS group (n = 61) with poor outcome (5y-OS probability, 0.40 ± 0.13) from a predicted long OS group (n = 59) characterized by a significantly better 5y-OS probability of 0.94 ± 0.07, P < 0.0001 (Fig. 4B). ROC curve analyses for both EFS and OS prediction are shown in Supplementary Fig. S1A and B. Furthermore, in the time-dependent ROC analyses for the survival time prediction of the independent validation set, observed mean AUC values were similar to those obtained for the initial set of 256 patients [EFS validation cohort: mean AUC values over first 10 years 0.85 ± 0.04, range 0.66-0.90 (Supplementary Fig. S1C); OS validation cohort: mean AUC values over first 10 years 0.81 ± 0.05, range 0.42-0.96 (Supplementary Fig. S1D)], indicating proper prediction of individual survival time of patients by CASPAR.
Subsequently, CASPAR's performance in forecasting individual survival times for neuroblastoma patients with unfavorable markers was also evaluated. To this end, we applied the algorithm to predict OS time for those subgroups that comprised a reasonable number of patients. These were (a) patients with stage 4 disease (n = 57), (b) patients with NB2004 IR or HR disease (n = 71), and (c) patients who had received an unfavorable PAM prediction (n = 66). The remaining subgroups were not analyzed, as the small number of patients in the validation cohort impeded a reasonable assessment. As highlighted in Fig. 4C-E, CASPAR significantly discriminated subgroups with divergent OS within these three cohorts [stage 4: 5y-OS predicted short OS group (n = 21) 0.25 ± 0.22 versus predicted long OS group (n = 36) 0.54 ± 0.17, P = 0.0049 (Fig. 4C); NB2004 HR+IR: 5y-OS predicted short OS group (n = 32) 0.27 ± 0.18 versus predicted long OS group (n = 39) 0.62 ± 0.16, P = 0.0017 (Fig. 4D); PAM UF: 5y-OS predicted short OS group (n = 35) 0.23 ± 0.17 versus predicted long OS group (n = 31) 0.51 ± 0.18, P = 0.0017 (Fig. 4E)]. AUC values for the time-dependent ROC analyses were as follows: stage 4 patients: mean AUC over the first 10 years 0.69 ± 0.08, range 0.36-0.99; patients with NB2004 HR or IR stratification: mean AUC 0.74 ± 0.07, range 0.51-0.99; and patients with PAM UF classification: mean AUC 0.61 ± 0.10, range 0.15-0.98 (data not shown). Distribution of clinical covariates within the CASPAR-predicted subgroups of the validation cohort is shown in Table 2B.
Discussion
In this study, we report on the prediction of individual patient survival time as a continuous variable for neuroblastoma patients, based on the gene expression profile of each patient's tumor, by using the recently described CASPAR algorithm (14).
To derive a reliable estimate of the predictive power of CASPAR for neuroblastoma patients, we applied the algorithm to gene expression profiles of a total cohort of 376 neuroblastoma patients (first set, n = 256; second set, n = 120) comprising all stages and all clinical courses of the disease, thereby presenting the largest data set classified by this algorithm to date (14). Although both Kaplan-Meier calculations and ROC analyses indicated that CASPAR classification results were comparable with other gene expression–based classification approaches for neuroblastoma patients (10–12, 23), our approach clearly differs from these works by using individual survival time of patients as both the main training and the continuous predictive variable. This feature has two major advantages. First, in contrast to other clinical or genetic markers for neuroblastoma, CASPAR provides a more specific measure of outcome of patients by predicting an individual continuous EFS or OS time for each patient. When precise, such information obtained at the time of diagnosis could help to optimize therapeutic strategies or clinical monitoring of patients both during first-line treatment and aftercare. Evidence for the necessary high accuracy of CASPAR-predicted individual EFS and OS times was derived from the time-dependent ROC analyses, in which the observed high AUC values (mean AUC values of >0.8 for both EFS and OS) clearly warrant further testing of this algorithm to predict individual survival times in a clinical setting.
As a second major advantage, the CASPAR algorithm does not depend on an initial supervised categorization of patients for training of the algorithm, as it simply uses survival time as the main variable. This permits the utilization of gene expression information of all patients for model selection, including those for whom an initial supervised group assignment is impractical. The questions “how should event-free surviving MYCN-amplified or stage 4 patients be categorized in a classifier training set?” or “how should patients who eventually survived after relapse of disease be classified for algorithm training purposes?” remain challenging, as the effects of cytotoxic treatment impede a clear-cut supervised classification in these cases, and more subjective criteria of the investigator might therefore distort the analyses. As CASPAR does not require a hand-selected classification of such difficult cases, this tool might be of greatest value for classification of patients with unfavorable markers. Intriguingly, studies focusing on gene expression–based classification of unfavorable subgroups of neuroblastoma patients are still rare to date (12, 24), despite a clear indication for such analyses. The difficulty of categorizing patients with unfavorable markers other than by choosing survivors versus nonsurvivors after a subjectively adequate period of observation probably accounts for this paucity and underlines the relevance of our approach, which represents the first systematic study aiming at specifically differentiating between contrasting courses of the disease within subgroups of patients with consistent unfavorable markers.
Regarding the subgroup classification, the CASPAR predictions for patients with stage 4 disease are comparable with those by Asgharzadeh et al., who reported on a 55–gene expression signature discriminating MYCN nonamplified stage 4 patients with ultra-poor outcome from those with a high-survival probability (12). Therefore, these two works clearly support that gene expression–based tools could serve to further subclassify patients with unfavorable markers. In contrast, CASPAR achieved inferior results in patients with MNA or 1p deletion. Several factors might be responsible for this finding. First, the small number of patients with MNA (n = 32) or Del1p (n = 53) might indicate that the CASPAR algorithm could need a critical cohort size of >50 patients for an accurate prediction, as almost every other subcohort had a larger number of patients. This hypothesis is also supported by the observation that, in contrast to the 16 patients predicted to survive longer than 5 years in the CASPAR analysis of the MYCN subcohort, all but one patient with MNA was predicted to have an OS of <5 years by CASPAR when the algorithm was applied to the total cohort (Supplementary Table S1). Second, patients with MNA could represent a very homogeneous subgroup of neuroblastoma tumors, both clinically and with respect to their gene expression profiles, an assumption that is supported by the fact that this marker has proved to be highly specific for unfavorable courses of disease in a large number of studies (25) and also by integrative genomic studies of neuroblastoma tumors (26). Moreover, this hypothesis is also supported by the fact that MYCN-amplified tumors were predominantly classified into the short survivor groups in all other CASPAR analyses in this study.
The rather poor accuracy of CASPAR in predicting outcome of patients with 1p deletion is very likely caused by both the small number of cases (n = 53) and the high proportion of MYCN-amplified patients in this group, as both markers are highly correlated (25, 27, 28). In contrast, the convincing CASPAR results for patients carrying the more frequent alterations of 11q deletion (n = 56) and 17q gain (n = 78) might suggest that these cohorts exhibit less consistent biological phenotypes of the tumor. It therefore seems reasonable to hypothesize that, despite the growing evidence supporting the effect of these markers on patients' outcome (7, 16, 28–30), comparative breakpoint positioning analyses (31) or analyses of the gene expression profiles of these patients could aid in further defining patients with divergent outcome within these cohorts. One could imagine that, in future risk stratification systems, genetic markers identify subgroups of the disease, which are then further refined by data on gene expression profiles of patients, an approach that might be supported by our study.
Similarly, it can also be envisioned that gene expression–based classification can be used to substratify patients within current clinical trial risk groups, as indicated by the CASPAR results in the cohorts of NB2004 HR patients and NB2004 HR + IR patients. In this context, we observed that CASPAR's survival time predictions were particularly accurate for the NB2004 IR patients, a subgroup that is among the most challenging for clinicians in terms of risk stratification. Therefore, our data show that such clinically difficult subgroups may benefit the most from gene expression–based analyses and that these tools could greatly enhance our ability to avoid both undertreatment and overtreatment. It even seems feasible to apply nested rounds of gene expression–based classification of patients using different algorithms for an optimal categorization of patients, an idea that is supported by the CASPAR prediction of 83 (first set) and 66 (second set) patients exhibiting an unfavorable gene expression pattern when applying a 144-gene classifier (11). Whereas the highly specific 144-gene PAM classifier could serve to identify all patients who need to receive chemotherapeutic treatment, CASPAR could then be used to define the therapeutic dose intensity for these patients. Thereby, patients who have an adequate probability of survival after current treatment can be delineated from those patients with an ultra-poor outcome, who should probably be considered for alternative treatment options. The feasibility of such a consecutive gene expression approach is underscored by the hierarchical clustering analyses highlighted in Fig. 4. Whereas all 83 patients were classified as unfavorable by the PAM algorithm, hierarchical cluster analysis with weighted gene expression information using CASPAR-selected genes clearly outlines differences in the gene expression patterns of patients with ultra-poor outcome compared with those with longer survival times.
However, it also has to be recognized that CASPAR showed convincing predictions for the total cohort and could therefore be used in a stand-alone approach for classification.
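The consecutive use of two classifiers discussed above can be sketched as a simple two-step rule. This is a hedged illustration only: the function name, the group labels, and the reuse of the 5-year cutoff at the second step are assumptions made for this example, and the sketch is not a clinical decision rule or part of the study's methods. It assumes that a binary PAM call and a CASPAR-predicted OS time in years are already available for each patient.

```python
def nested_stratification(pam_unfavorable: bool,
                          predicted_os_years: float,
                          cutoff_years: float = 5.0) -> str:
    """Two-step stratification: PAM call first, then predicted OS time.

    All labels are hypothetical and serve only to illustrate the nesting.
    """
    # Step 1: patients with a favorable PAM call are not refined further.
    if not pam_unfavorable:
        return "PAM favorable"
    # Step 2: refine PAM-unfavorable patients by CASPAR-predicted OS time.
    if predicted_os_years < cutoff_years:
        return "PAM unfavorable, predicted short OS"
    return "PAM unfavorable, predicted long OS"


# Hypothetical patients: (PAM unfavorable?, CASPAR-predicted OS in years).
patients = [(False, 11.0), (True, 2.5), (True, 8.0)]
labels = [nested_stratification(pam, os_years) for pam, os_years in patients]
print(labels)
```

Keeping the two steps separate reflects the idea that the first classifier selects the patients and the second one refines only that selected subgroup.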
Gene discovery capacity of CASPAR. As the CASPAR algorithm is assumed to automatically identify those genes which are most relevant for survival of patients (14), we also reviewed candidates which the algorithm weighted as being highly associated with survival times. Because a detailed analysis of all CASPAR-selected genes of our calculations was beyond the scope of this study, we focused on the top 50 weighted features (of 459 features receiving a weight ≠ 0) of the OS prediction of the cohort of 83 patients, who were classified as unfavorable by our previously described PAM predictor. In this cohort, it was observed that CASPAR selected several candidate genes, which had been reported to be involved in the processes of angiogenesis and metastasis [e.g., DEFA3 (ref. 32), IL8 (ref. 33), ANXA2 (ref. 34)], apoptosis [e.g., IFI6 (ref. 35), GSTP1 (ref. 36), GATA6 (ref. 37)], retinoic acid signaling, and neuronal differentiation [e.g., NR0B1 (ref. 38), FEZ1 (refs. 39, 40), S100A6 (ref. 41)]. Furthermore, CASPAR selected genes which were found to have similar expression patterns in other types of malignant disease [e.g., HRK (ref. 42), VIM (ref. 43), SPARC (refs. 44–46), GATA6 (ref. 47), GSTP1 (ref. 48), or IL8 (ref. 49)] or which had been associated with outcome of neuroblastoma patients in previous studies [e.g., PRAME (ref. 50), ALK (refs. 51, 52), PTN (refs. 53, 54)]. Regarding the latter category, two interacting partners, ALK (anaplastic lymphoma kinase) and one of its ligands, the secreted growth factor PTN (pleiotrophin), were both selected to indicate outcome of neuroblastoma patients. PTN was negatively correlated with poor survival (i.e., higher expression values correlate with better outcome), which is in line with reports by Calvet et al. (53), who showed that PTN expression was down-regulated in neuroblastoma resistant to certain cytostatic drugs, and also with a study by Nakagawara and colleagues, who showed that PTN expression is correlated with lower tumor stages and a better prognosis (54). ALK, one of the receptors of the PTN protein, has a specific role in the development of the embryonic nervous system (55) and was found to be positively correlated with poor outcome (i.e., higher expression values correlate with poor outcome). Although past studies reported that down-regulation of ALK expression did not correlate with known prognostic factors in neuroblastoma (52), the finding that two interacting partners were identified as highly associated with different outcome in a subcohort of neuroblastoma patients with unfavorable outcome supports further research of this pathway with focus on high-risk neuroblastoma.
In summary, our data support further application of the CASPAR algorithm for both gene expression–based classification approaches, either as a stand-alone approach or in extension to other markers, and gene discovery analyses.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: Deutsche Krebshilfe grant 50-2719, Bundesministerium für Bildung und Forschung through the National Genome Network 2 grants 01GS0456, 01GS0457, and 01GR0450, European Union 6th Framework Programme grant LSHC-CT-2005-018911, German Competence Network Pediatric Hematology and Oncology, and Fördergesellschaft Kinderkrebs-Neuroblastom-Forschung e.V.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).
A. Oberthuer and L. Kaderali contributed equally to this manuscript.
Acknowledgments
We thank all institutes that contributed tumor material or RNA and corresponding clinical data to this study and Dr. Karen Ernestus for assessment of tumor cell content.