Abstract
Purpose: Given their accessibility, surrogate tissues, such as peripheral blood mononuclear cells (PBMC), may provide potential predictive biomarkers in clinical pharmacogenomic studies. In leukemias and lymphomas, the prognostic value of peripheral blast expression profiles is clear; however, it is unclear whether circulating mononuclear cells of patients with solid tumors might yield profiles with similar prognostic associations.
Experimental Design: In this study, we evaluated the association of expression profiles in PBMCs with clinical outcomes in patients with advanced renal cell cancer. Transcriptional patterns in PBMCs of 45 renal cell cancer patients were compared with clinical outcome data at the conclusion of a phase II study of the mTOR kinase inhibitor CCI-779 to determine whether pretreatment transcriptional patterns in PBMCs were correlated with eventual patient outcomes.
Results: Unsupervised hierarchical clustering of the PBMC profiles using all expressed genes identified clusters of patients with significant differences in survival. Cox proportional hazards modeling showed that the expression levels of many PBMC transcripts were predictors for the patient outcomes of time to progression and overall survival (time to death). Supervised class prediction approaches identified multivariate expression patterns in PBMCs capable of assigning favorable outcomes of time to death and time to progression in a test set of renal cancer patients, with overall performance accuracies of 72% and 85%, respectively.
Conclusions: The present study provides the first example of gene expression profiling in peripheral blood, a clinically accessible surrogate tissue, for identifying patterns of gene expression associated with higher likelihoods of positive outcome in patients with a solid tumor.
INTRODUCTION
Gene expression profiling studies in primary tumors have repeatedly demonstrated differences between normal and malignant tissues (1,2). Expression profiles within tumors also appear to be correlated with overall survival (3–7), and a recent study suggests that expression profiling of primary tumor biopsies yields prognostic “signatures” that rival or may even outperform currently accepted standard measures of risk in cancer patients (8).
Because of their greater accessibility, expression profiles in surrogate tissues, such as peripheral blood mononuclear cells (PBMC), are also of interest for determining whether expression patterns may predict clinical outcomes in cancer patients. We have reported recently that baseline expression profiles of PBMC from renal cell cancer (RCC) patients are significantly distinct from those of disease-free subjects (9). Many of the expressed transcripts were nonetheless highly variable and heterogeneously expressed across the RCC PBMCs, suggesting that subsets of patients with distinct transcriptional profiles may exist in this disease setting.
Following the conclusion of the present clinical trial (10), we compared expression patterns in PBMCs of these RCC patients with various clinical variables to determine whether expression patterns in circulating mononuclear cells were correlated with eventual patient outcomes in this study. The results of both unsupervised and supervised analyses suggest that transcriptional profiles in PBMCs from patients with advanced RCC are not only distinct from profiles in disease-free individuals but also significantly correlated with overall survival and progression-free survival in this disease setting.
PATIENTS AND METHODS
Clinical Variables, Demographics, and Inclusion/Exclusion Criteria for Renal Cell Carcinoma Patients in the Present Study. Forty-five advanced RCC patients (18 females and 27 males) participated in the pharmacogenomic study of the phase II clinical trial. The efficacy, safety, and pharmacokinetics of CCI-779 in the full cohort of 110 patients enrolled in this clinical trial were reported recently (10). Written informed consent for the pharmacogenomic portion of this study was obtained from the participating individuals, and the project was approved by the local institutional review boards at the participating clinical sites. RCC tumors of patients were classified at the clinical sites as conventional (clear cell) carcinomas (25), granular (1), papillary (3), or mixed subtypes (7). Classifications for the remaining 9 tumors were unknown, and all tumors in the study were, by entry criteria, classified as stage IV. The 45 patients who signed informed consent for pharmacogenomic analysis of baseline PBMC expression profiles were also classified by a clinical multivariate prognostic score (11). Of the consented patients enrolled in this study, 6 were assigned a favorable risk assessment, 17 patients possessed an intermediate risk score, and 22 patients received a poor prognosis classification. The characteristics of this group of patients were similar to those observed in the overall study (Table 1).
Characteristic | PG consented, 25 mg (n = 14) | PG consented, 75 mg (n = 14) | PG consented, 250 mg (n = 15) | Overall study, 25 mg (n = 36) | Overall study, 75 mg (n = 38) | Overall study, 250 mg (n = 37)
---|---|---|---|---|---|---
Age, y | | | | | |
Median | 55 | 61 | 62 | 55 | 58 | 57
Minimum | 40 | 44 | 40 | 40 | 17 | 40
Maximum | 68 | 78 | 76 | 79 | 78 | 81
Sex, % | | | | | |
Male | 64 | 64 | 53 | 67 | 84 | 57
Female | 36 | 36 | 47 | 33 | 16 | 43
Prior therapy, % | | | | | |
Immunotherapy or chemotherapy | 93 | 93 | 87 | 89 | 95 | 89
Radiotherapy | 36 | 36 | 40 | 39 | 32 | 35
Outcomes, mo | | | | | |
TTP (median) | 7.7 | 6.9 | 7.7 | 6.3 | 6.7 | 5.2
TTD (median) | 18.5 | 11.2 | 19.6 | 13.8 | 11 | 17.5
RCC patients were primarily of Caucasian descent (44 Caucasian and 1 African American) and had a mean age of 58 years (range, 40-78 years). Included were patients with histologically confirmed advanced renal cancer who had received prior therapy for advanced disease or who had not received prior therapy for advanced disease but were not appropriate candidates to receive high doses of interleukin-2 therapy. Exclusion criteria were the presence of known central nervous system metastases; surgery or radiotherapy within 3 weeks of start of dosing; or chemotherapy, biological therapy, or treatment with a prior investigational agent within 4 weeks of start of dosing.
Clinical Procedures. Patients with advanced cases of RCC were randomized to receive treatment with one of three doses of CCI-779 (25, 75, and 250 mg) administered as a 30-minute i.v. infusion once weekly for the duration of the trial. Clinical staging and extent of residual, recurrent, or metastatic disease were recorded before treatment and every 8 weeks following initiation of CCI-779 therapy. Tumor size was measured (in cm) and reported as the product of the longest diameter and its perpendicular. Measurable disease was defined as any bidimensionally measurable lesion with both diameters >1.0 cm by computed tomography scan, X-ray, or palpation. Tumor responses (complete response, partial response, minor response, stable disease, or progressive disease) were determined by the sum of the products of the perpendicular diameters of all measurable lesions, with progression defined as a 25% increase in tumor size over baseline or nadir or the presence of new lesions. The two main clinical outcome measures used in the pharmacogenomic analysis were time to progression (TTP) and survival or time to death (TTD). TTP was defined as the interval from the date of initial CCI-779 treatment until the first day of measurement of progressive disease or censored at the last date known as progression free. Survival or TTD was defined as the interval from date of initial CCI-779 treatment to the time of death or censored at the last date known alive. In this clinical trial, three doses of CCI-779 were evaluated (25, 75, and 250 mg), but tumor response rates and median survival times were comparable among the three dose groups (10). For these reasons, dose level was not considered in subsequent pharmacogenomic analyses.
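The bidimensional measurement rule described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, the choice of reference size (baseline or nadir) is left to the caller, and treating an increase of exactly 25% as progression is an assumption, because the text does not state whether the threshold is inclusive.

```python
def sum_of_products(lesions):
    """Sum of (longest diameter x perpendicular diameter) over lesions, in cm^2."""
    return sum(longest * perpendicular for longest, perpendicular in lesions)


def is_progression(current_lesions, reference_size, new_lesions=False, threshold=0.25):
    """Return True if new lesions appeared or if the summed lesion size rose
    by at least `threshold` over the reference (baseline or nadir) size.

    Assumption: the 25% boundary is inclusive; the source does not say.
    """
    if new_lesions:
        return True
    return sum_of_products(current_lesions) >= (1.0 + threshold) * reference_size


# Two measurable lesions at baseline, each given as (longest, perpendicular) in cm.
baseline = sum_of_products([(2.0, 1.5), (3.0, 2.0)])
print(baseline, is_progression([(2.5, 2.0), (3.0, 2.5)], baseline))
```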
Peripheral Blood Mononuclear Cell Preparation, Isolation of RNA, and Hybridization of Targets to Microarrays. Before initiation of therapy, peripheral blood samples (8 mL) were collected into Vacutainer sodium citrate cell purification tubes and PBMCs were isolated according to the manufacturer's protocol (Becton Dickinson, Franklin Lakes, NJ). All blood samples were shipped in cell purification tubes overnight before PBMC processing. Total RNA was isolated from PBMC pellets using the RNeasy mini kit (Qiagen, Valencia, CA) and biotinylated cRNA was prepared using a modification of the procedure described by Lockhart et al. (12). Labeled probes were hybridized to oligonucleotide arrays composed of >12,600 human sequences (HgU95A, Affymetrix, Santa Clara, CA) according to the Affymetrix Expression Analysis Technical Manual (Affymetrix).
Gene Expression Data Reduction. Data analysis and absent/present call determination were done on raw fluorescent intensity values using Genechip 3.2 software (Affymetrix). “Present” calls were calculated using Genechip 3.2 software by estimating whether a transcript is detected in a sample based on the strength of the signal of the gene compared with background. The “average difference” values for each transcript were normalized to “frequency” values using the scaled frequency normalization method (13) in which the average differences for 11 control cRNAs with known abundance spiked into each hybridization solution were used to generate a global calibration curve. This calibration was then used to convert average difference values for all transcripts to frequency estimates (in ppm) ranging from 1:300,000 (∼3 ppm) to 1:1,000 (1,000 ppm).
Statistical Analyses. Cox proportional hazards models, which account for censoring in time-to-event outcomes, were used to evaluate the relationships between expression and TTP and between expression and TTD. The models were calculated using log-transformed expression data, and separate models were fit for each transcript. Survival data were assessed by Kaplan-Meier analysis, and statistical significance was established using a Wilcoxon test.
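For readers less familiar with censored survival data, the Kaplan-Meier product-limit estimate can be sketched as follows. The function name and the toy follow-up times are hypothetical, and the sketch covers only the survival curve itself, not the Wilcoxon test used to compare groups.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed event.

    times  : follow-up durations (e.g., TTD in months)
    events : 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events at time t
        leaving = 0    # events plus censorings at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            # Patients censored at t still count as at risk at t.
            survival *= 1.0 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy data: five patients, two of them censored.
for t, s in kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

Censored patients never lower the curve directly; they only shrink the risk set used at later event times.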
Unsupervised hierarchical clustering of genes and/or arrays based on similarity of their expression profiles was done using the procedure of Eisen et al. (14). In these analyses, 5,424 transcripts meeting a nonstringent data reduction filter were used (at least one present call and at least one frequency >10 ppm). Expression data were log transformed and standardized to have a mean value of 0 and a variance of 1, and hierarchical clustering results were generated using average linkage clustering with an uncentered correlation similarity metric.
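The filtering, standardization, and similarity steps described above can be sketched as follows. The function names are hypothetical, the logarithm base (2) and the use of the sample variance (n - 1 denominator) are assumptions not stated for this step, and the clustering itself (average linkage) is not reproduced here.

```python
import math


def passes_nonstringent_filter(present_calls, frequencies, min_ppm=10.0):
    """At least one present call and at least one frequency above min_ppm."""
    return any(present_calls) and any(f > min_ppm for f in frequencies)


def standardize(frequencies):
    """Log-transform, then rescale to mean 0 and variance 1.

    Assumptions: base-2 logarithm, sample variance, strictly positive inputs.
    """
    logs = [math.log2(f) for f in frequencies]
    mean = sum(logs) / len(logs)
    variance = sum((x - mean) ** 2 for x in logs) / (len(logs) - 1)
    sd = math.sqrt(variance)
    return [(x - mean) / sd for x in logs]


def uncentered_correlation(x, y):
    """Correlation computed without subtracting the means (cosine similarity)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


profile = standardize([4.0, 8.0, 16.0, 64.0])
print(uncentered_correlation(profile, profile))
```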
Gene selection and supervised class prediction were done using GeneCluster version 2.0, which has been described previously (15) and is available from http://www.genome.wi.mit.edu/cancer/software/genecluster2.html. In these analyses, only the 4,022 transcripts meeting a more stringent data reduction filter were used (at least 25% present calls and an average frequency >5 ppm). This more stringent filter was used to avoid including low-level and unreliably detected transcripts in the predictive models. For gene selection, all expression data in training sets and test sets were log transformed before analysis. In training sets of data, models containing increasing numbers of features (transcript sequences) were built using a two-sided approach (equal numbers of features in each class) with a S2N similarity metric that used median values for the class estimate. All comparisons were binary, and predictive gene classifiers containing between 2 and 60 genes in steps of 2 (and 60-200 genes in steps of 10) were evaluated by leave-one-out cross-validation (LOOCV) to identify the smallest predictive model yielding the most accurate class assignments. Prediction of class membership was done using a k nearest-neighbor algorithm also in GeneCluster (16). In these predictions, the number of neighbors was set to k = 3, the cosine distance measure was used, and all k neighbors were given equal weights.
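The classification scheme described above can be illustrated with a minimal sketch. It does not reproduce GeneCluster: the function names are hypothetical, the signal-to-noise score shown (difference of class medians divided by the sum of class standard deviations) is one plausible reading of "a S2N similarity metric that used median values for the class estimate", and feature selection is left out of the cross-validation loop for brevity, although in practice it should be repeated within each fold.

```python
import math
from collections import Counter
from statistics import median, stdev


def s2n_median(class_a, class_b):
    """Signal-to-noise score using class medians (assumed form, see lead-in)."""
    return (median(class_a) - median(class_b)) / (stdev(class_a) + stdev(class_b))


def cosine_distance(x, y):
    """1 minus the cosine of the angle between two expression vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples, equally weighted."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: cosine_distance(train_x[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def loocv_accuracy(samples, labels, k=3):
    """Leave each sample out in turn and predict it from the remaining ones."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, samples[i], k) == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy two-gene profiles for six hypothetical patients.
samples = [[1.0, 0.1], [1.0, 0.2], [0.9, 0.1], [0.1, 1.0], [0.2, 1.0], [0.1, 0.9]]
labels = ["good", "good", "good", "poor", "poor", "poor"]
print(loocv_accuracy(samples, labels))
```

With two classes and k = 3, the vote can never tie, which is one practical reason to prefer an odd k in binary comparisons.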
RESULTS
Identification of Renal Cell Carcinoma Patient Subpopulations Associated with Clinical Outcome. In our initial analysis, we employed an unsupervised hierarchical clustering approach using all genes passing the main filtering criteria to identify subpopulations of PBMC samples with similar expression profiles. Of the 12,626 genes on the HgU95A chip, 5,424 genes met the initial criteria for further analysis (at least one present call and at least one frequency >10 ppm). The dendrogram describing sample relationships grouped the RCC PBMCs (n = 45) into four roughly equivalent sized subclusters designated A to D (Fig. 1A). Kaplan-Meier analysis showed that patients in the four subclusters possessed significant differences in survival (P = 0.021, Wilcoxon test; Fig. 1B). In particular, survival curves for patients in cluster A (designated “poor outcome cluster”) and cluster C (designated “good outcome cluster”) were significantly distinct (P = 0.0025, Wilcoxon test).
These findings suggested that PBMC expression patterns correlated with survival might reflect a molecular subclassification of patients with RCC. The identification of expression profiles correlating with survival using this unsupervised approach prompted additional supervised analyses (a) to identify the individual transcripts in PBMCs most strongly associated with poor and favorable outcomes and (b) to determine whether transcriptional patterns in PBMCs might predict clinical outcome in patients with metastatic RCC.
Identification of Pretreatment Transcript Levels Associated with Patient Outcome. To identify specific transcripts in PBMCs that were correlated with patient outcome, we employed a Cox proportional hazards regression to model outcome as a function of log2-transformed expression levels (in ppm). Cox regression analyses were done on two clinical outcome measures (TTD and TTP) for each of the 5,424 qualifiers that passed the initial filtering criteria (at least one “present” call across the data set and at least one transcript with a frequency of ≥10 ppm). Of the 45 RCC patients with baseline PBMC expression levels, 10 had censored data for TTD and 4 had censored data for TTP. In the Cox proportional hazards analysis, the risk coefficient associated with each transcript indicates the likelihood of a favorable or nonfavorable outcome, where a risk coefficient <1.0 indicates less risk and a risk coefficient >1.0 indicates higher risk.
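The interpretation of the risk coefficient can be made explicit with the standard single-covariate Cox model. The notation below is introduced here for illustration and assumes that the reported risk coefficient is the hazard ratio per unit increase in log2 expression:

```latex
\begin{align}
  h(t \mid x) &= h_0(t)\, e^{\beta x}, \qquad x = \log_2(\text{frequency in ppm}) \\
  \mathrm{HR} &= \frac{h(t \mid x + 1)}{h(t \mid x)} = e^{\beta} \\
  H_0 &: \beta = 0 \iff \mathrm{HR} = 1
\end{align}
```

Under this reading, a hazard ratio of 0.5 would mean that each doubling of a transcript's baseline level is associated with half the hazard, and a hazard ratio of 2 with twice the hazard.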
For each transcript and outcome measure, the risk coefficient was calculated together with the P value for the null hypothesis that the risk coefficient equals 1 (i.e., expression is not associated with risk). For each outcome measure, the number of nominally significant tests among the 5,424 done was tallied at five type I (i.e., false-positive) error levels. Because the 5,424 tests were not independent, a permutation-based approach was then employed to evaluate how often the observed number of significant tests would be found under the null hypothesis of no association.
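The logic of such a permutation analysis can be sketched generically. In this illustration the per-transcript testing is delegated to a caller-supplied function (`count_significant`, a hypothetical name standing in for fitting 5,424 Cox models and counting P values below a chosen level), and the number of permutations is an arbitrary choice, since the text does not give one.

```python
import random


def permutation_p_value(expression, outcomes, count_significant,
                        n_permutations=1000, seed=0):
    """Estimate how often a count of significant tests at least as large as
    the observed one arises when outcomes are shuffled across patients.

    expression        : list of per-transcript lists, one value per patient
    outcomes          : list of per-patient outcome records, e.g. (time, event)
    count_significant : callable(expression, outcomes) -> int (assumed helper)
    """
    rng = random.Random(seed)
    observed = count_significant(expression, outcomes)
    shuffled = list(outcomes)
    at_least_as_many = 0
    for _ in range(n_permutations):
        # Shuffling whole outcome records keeps each (time, event) pair intact
        # and leaves the correlation structure among transcripts untouched.
        rng.shuffle(shuffled)
        if count_significant(expression, shuffled) >= observed:
            at_least_as_many += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least_as_many + 1) / (n_permutations + 1)
```

Because only the patient-to-outcome assignment is permuted, correlated transcripts stay correlated in every permuted data set, which is why this approach remains valid when the individual tests are not independent.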
The Cox proportional hazards regressions identified transcripts significantly correlated with progression and survival. Permutation analyses confirmed that more genes had statistically significant correlations of gene expression with survival than had significant correlations with disease progression. Among the transcripts in PBMCs reaching a minimal level of significance (P < 0.05), the 20 most strongly associated with low risk (risk coefficient <1.0) and the 20 most strongly associated with high risk (risk coefficient >1.0) are presented for survival (Table 2) and for disease progression (Table 3).
Transcript | Unigene | Hazard ratio | P
---|---|---|---
Elevated expression at baseline = low risk for death | ||||||
Developmentally regulated GTP-binding protein 1 | Hs.115242 | 0.0322 | <0.00001 | |||
Heterogeneous nuclear ribonucleoprotein D | Hs.303627 | 0.0547 | 0.00026 | |||
Nucleoporin, 62 kDa | Hs.9877 | 0.1030 | 0.00038 | |||
Interleukin enhancer binding factor 2, 45 kDa | Hs.75117 | 0.1100 | 0.00285 | |||
Proteasome (prosome, macropain) 26S subunit, ATPase, 2 | Hs.61153 | 0.1140 | 0.00003 | |||
Murine leukemia viral (bmi-1) oncogene homologue | Hs.431 | 0.1250 | 0.00009 | |||
HIV-1 rev binding protein 2 | Hs.154762 | 0.1265 | 0.00025 | |||
Female sterile homeotic-related gene 1 (mouse homologue) | Hs.75243 | 0.1287 | 0.00029 | |||
AFG3 (ATPase family gene 3, yeast)–like 2 | Hs.29385 | 0.1288 | 0.00276 | |||
DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 5 (RNA helicase, 68 kDa) | Hs.76053 | 0.1295 | 0.00012 | |||
Clone 24781 mRNA sequence | Hs.108112 | 0.1333 | 0.00002 | |||
Heterogeneous nuclear ribonucleoprotein K | Hs.129548 | 0.1428 | 0.00186 | |||
Chromosome 9, P1 clone 11659 | Hs.106357 | 0.1433 | 0.00111 | |||
Cytokine receptor-related protein 4 (CYTOR4) mRNA | Hs.7120 | 0.1447 | 0.00007 | |||
Ribosomal protein L6 | Hs.349961 | 0.1466 | 0.00254 | |||
Clk-associating RS-cyclophilin | Hs.77965 | 0.1538 | 0.00001 | |||
Ribosomal protein L4 | Hs.286 | 0.1591 | 0.00100 | |||
Dendritic cell protein | Hs.250581 | 0.1620 | 0.00132 | |||
Nucleotide binding protein 1 (Escherichia coli MinD like) | Hs.81469 | 0.1625 | 0.00035 | |||
DKFZP566C134 protein | Hs.20237 | 0.1675 | 0.00118 | |||
Elevated expression at baseline = high risk for death | ||||||
Moesin | Hs.170328 | 9.6763 | 0.01218 | |||
Homo sapiens chromosome 19, cosmid R26445 | Hs.108847 | 8.0370 | 0.01492 | |||
γ-Aminobutyric acidA receptor-associated protein | Hs.7719 | 7.6453 | 0.00209 | |||
Hypothetical protein | Hs.84359 | 6.7764 | 0.00006 | |||
Excision repair cross-complementing rodent repair deficiency, comp group 1 | Hs.59544 | 6.1122 | 0.00040 | |||
Myosin, light polypeptide 6, alkali, smooth muscle and nonmuscle | Hs.77385 | 4.9451 | 0.00094 | |||
Actin, β | Hs.288061 | 4.9169 | 0.00266 | |||
Capping protein (actin filament) muscle Z-line, β | Hs.333417 | 4.8396 | 0.00574 | |||
Eukaryotic translation initiation factor 4A, isoform 1 | Hs.129673 | 4.7016 | 0.01027 | |||
Capping protein (actin filament) muscle Z-line, α2 | Hs.75546 | 4.5981 | 0.00417 | |||
Actin, γ1 | Hs.14376 | 4.5693 | 0.00855 | |||
Vimentin | Hs.297753 | 4.4114 | 0.01584 | |||
H2A histone family, member O | Hs.795 | 4.2492 | <0.00001 | |||
ATPase, H+ transporting, lysosomal (vacuolar proton pump), subunit 1 | Hs.6551 | 4.1617 | 0.00834 | |||
Guanine nucleotide binding protein (G protein), β polypeptide 1 | Hs.215595 | 4.0632 | 0.01016 | |||
Cofilin 1 (nonmuscle) | Hs.180370 | 4.0505 | 0.00745 | |||
Adenylyl cyclase–associated protein | Hs.104125 | 4.0159 | 0.00155 | |||
ATP synthase, H+ transporting, mitochon F0 complex, subunit f, isoform 2 | Hs.155751 | 3.8316 | 0.00431 | |||
ADP-ribosylation factor 5 | Hs.77541 | 3.8205 | 0.01258 | |||
Lymphocyte cytosolic protein 1 (L-plastin) | Hs.16488 | 3.8170 | 0.00588 |
Transcript | Unigene | Hazard ratio | P
---|---|---|---
Elevated expression at baseline = low risk for disease progression | ||||||
Heterogeneous nuclear ribonucleoprotein K | Hs.129548 | 0.0818 | 0.0002 | |||
U5 small nuclear ribonucleoprotein-specific protein (220 kDa), orthologue of Saccharomyces cerevisiae Prp8p | Hs.181368 | 0.1608 | 0.0001 | |||
Heterogeneous nuclear ribonucleoprotein H1 (H) | Hs.245710 | 0.1657 | 0.0024 | |||
RNA-binding protein S1, serine-rich domain | Hs.75104 | 0.1661 | 0.0040 | |||
Eukaryotic translation initiation factor 4A, isoform 2 | Hs.173912 | 0.1662 | 0.0009 | |||
Polyadenylic acid binding protein, cytoplasmic 1 | Hs.172182 | 0.1724 | 0.0071 | |||
DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 5 (RNA helicase, 68 kDa) | Hs.76053 | 0.1831 | 0.0010 | |||
UDP-N-acetylglucosamine-2-epimerase/N-acetylmannosamine kinase | Hs.5920 | 0.2094 | 0.0002 | |||
Splicing factor, arginine/serine-rich 2 | Hs.73965 | 0.2147 | 0.0031 | |||
Fusion, derived from t(12;16) malignant liposarcoma | Hs.99969 | 0.2154 | 0.0009 | |||
RAE1 (RNA export 1, Schizosaccharomyces pombe) homologue | Hs.196209 | 0.2186 | 0.0010 | |||
Ribosomal protein L6 | Hs.349961 | 0.2211 | 0.0076 | |||
Non-Pou domain-containing octamer (ATGCAAAT) binding protein | Hs.172207 | 0.2258 | 0.0016 | |||
Translocase of inner mitochondrial membrane 17 (yeast) homologue A | Hs.20716 | 0.2298 | 0.0006 | |||
Nucleotide binding protein 1 (E. coli MinD like) | Hs.81469 | 0.2321 | 0.0016 | |||
Dendritic cell protein | Hs.250581 | 0.2330 | 0.0035 | |||
v-abl Abelson murine leukemia viral oncogene homologue 1 | Hs.146355 | 0.2331 | 0.0005 | |||
Splicing factor, arginine/serine-rich 6 | NA | 0.2385 | 0.0037 | |||
H. sapiens clone 23711 unknown mRNA, partial cds | Hs.256583 | 0.2386 | 0.0013 | |||
Serine/arginine–related nuclear matrix protein (plenty of prolines 101-like) | Hs.18192 | 0.2393 | 0.0023 | |||
Elevated expression at baseline = high risk for disease progression | ||||||
Lymphocyte cytosolic protein 1 (L-plastin) | Hs.16488 | 6.1066 | 0.0001 | |||
Adenylyl cyclase–associated protein | Hs.104125 | 5.8829 | <0.0001 | |||
γ-Aminobutyric acidA receptor-associated protein | Hs.7719 | 4.6595 | 0.0046 | |||
Hematopoietic cell-specific Lyn substrate 1 | Hs.14601 | 4.2099 | 0.0061 | |||
IFN-induced transmembrane protein 1 (9-27) | Hs.146360 | 4.1051 | 0.0016 | |||
Sjogren's syndrome/scleroderma autoantigen 1 | Hs.25723 | 3.9750 | 0.0106 | |||
Expressed sequence tags, highly similar to HSPC022 (H. sapiens) | Hs.367740 | 3.8093 | 0.0013 | |||
Cargo selection protein (mannose-6-phosphate receptor binding protein) | Hs.140452 | 3.5692 | 0.0243 | |||
Proteasome (prosome, macropain) subunit, β type, 3 | Hs.82793 | 3.3680 | 0.0053 | |||
H. sapiens mRNA; cDNA DKFZp564H1664 (from clone DKFZp564H1664) | Hs.109201 | 3.2703 | 0.0029 | |||
Cluster Incl AF053356: H. sapiens chromosome 7q22 sequence+A36 | Hs.91299 | 3.0853 | 0.0092 | |||
RAB, member of RAS oncogene family-like | Hs.479 | 2.9842 | 0.0140 | |||
H. sapiens cDNA clone IMAGE:2409932 | Hs.5947 | 2.9149 | 0.0060 | |||
Nuclear domain 10 protein | Hs.154230 | 2.9000 | 0.0059 | |||
Neural precursor cell expressed, developmentally down-regulated 8 | Hs.75512 | 2.8913 | 0.0224 | |||
Guanine nucleotide binding protein (G protein), β polypeptide 1 | Hs.215595 | 2.7878 | 0.0407 | |||
CD53 antigen | Hs.82212 | 2.7807 | 0.0035 | |||
ATP synthase, H+ transporting, mitochondrial F0 complex, subunit f isoform 2 | Hs.155751 | 2.7701 | 0.0105 | |||
ATP synthase, H+ transporting, lysosomal (vacuolar proton pump) subunit 1 | Hs.6551 | 2.7308 | 0.0362 | |||
Leukocyte immunoglobulin-like receptor, subfamily B member 4 | Hs.67846 | 2.6110 | 0.0089 |
Class Prediction Approach for Identification of Multivariate Expression Patterns Correlated with Clinical Outcome. The Cox proportional hazards regression suggested an association between gene expression and time until disease progression and an even stronger association between gene expression and survival. Based on these findings, we next employed a class prediction algorithm to identify expression patterns in PBMCs that could possibly be used to predict patient outcome. In these analyses, we searched for pretreatment expression patterns correlated with the clinical outcomes of TTD and TTP.
To evaluate the predictive utility of the profiles correlated with clinical outcomes, we randomly selected 70% of the patient PBMC profiles as a training set, and the remaining 30% of the samples formed the test set. This strategy allows the test set of samples to function similarly to a set of future samples on which the classifications of interest could be predicted. The main benefit of this approach is that it ensures that the test samples are not used in gene selection and it therefore allows a truly independent evaluation of the predictive model discovered in the training set (17,18).
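A random 70%/30% partition of this kind can be sketched as follows. The function name, the seed, and the rounding of the training-set size are assumptions; the exact numbers of training and test samples are not given in this passage.

```python
import random


def split_train_test(sample_ids, train_fraction=0.7, seed=0):
    """Randomly partition sample identifiers into training and test sets.

    The test set is set aside before any gene selection so that it plays
    no part in building the classifier.
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 45 hypothetical patient identifiers.
train_ids, test_ids = split_train_test(range(1, 46))
print(len(train_ids), len(test_ids))
```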
For each outcome measure, we stratified the profiles as originating from patients with poor or favorable outcomes. In this process, we attempted to discover models that could predict either (a) long-term survival or (b) rapid times to disease progression. We established yearlong survival as a favorable outcome for overall survival because this approximated the median survival across all three dose groups for the 45 patients in the pharmacogenomic portion of the trial. We established 106 days as the cutoff for a nonfavorable TTP outcome, because this represented the lower quartile value of disease-free survival for the 45 patients in the pharmacogenomic portion of the trial.
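The dichotomization described above can be sketched as a pair of labeling functions. Several details are assumptions made only for illustration: the function names, the use of 365 days for "yearlong", the handling of values that fall exactly on a cutoff, and returning `None` for patients censored before a cutoff, whose class cannot be determined from follow-up alone. The text does not say how such cases were treated.

```python
def label_survival(days, died, cutoff_days=365):
    """Favorable if the patient was still alive after the cutoff."""
    if days > cutoff_days:
        return "favorable"
    if died:
        return "poor"
    return None  # censored before the cutoff: class unknown (assumption)


def label_progression(days, progressed, cutoff_days=106):
    """Nonfavorable (poor) if progression was recorded before the cutoff."""
    if progressed and days < cutoff_days:
        return "poor"
    if days >= cutoff_days:
        return "favorable"
    return None  # censored before the cutoff: class unknown (assumption)


print(label_survival(400, True), label_progression(90, True))
```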
Because it was possible that the observed differences in expression between PBMCs from patients classified into the good and poor outcome categories might be confounded by other differences, such as patient demographics or technical variables, we first compared these characteristics between the poor outcome and the good outcome patient groups defined based on TTD or TTP (Supplementary Data). Groups were tested for differences in continuous variables using a Student's t test or for categorical differences using a likelihood test. Comparison of technical chip variables (raw Q, glyceraldehyde-3-phosphate dehydrogenase 3′:5′ ratio, scale factors, average frequency, and present calls), demographics (gender, age, and ethnicity), or clinical variables (histologic tumor types, previous surgeries, previous nephrectomies, numbers of metastatic sites, and dose levels) indicated no significant differences between patients in the good and poor outcome categories that might contribute to observed differences in PBMC gene expression. Prognosis by the Motzer-based risk assessment was significantly associated with the groups in the survival comparison as expected but was not significantly associated with the groups in the TTP comparison.
Because these studies used PBMCs as the tissue of interest, we also examined the distribution of cell types (neutrophils, eosinophils, lymphocytes, and monocytes) in the samples of the various groups to determine whether differences in cell populations might be responsible for any observed differences in expression. This analysis showed that the distributions of the various cell subtypes between PBMCs of patients assigned to either good or poor outcome categories for survival and TTP were not significantly different (Supplementary Data). This supports the hypothesis that any observed transcriptional differences between the groups were not the result of altered cell compositions but seem to reflect distinct expression patterns in the PBMCs.
In subsequent analyses, we used a nearest-neighbor prediction algorithm to generate gene classifiers correlated with the groups in the training sets and selected the classifiers that gave the highest accuracy of class assignment by LOOCV. The results of these analyses are depicted for classification based on yearlong survival in Fig. 2 and for classification based on short times to disease progression in Fig. 3. The 20-gene classifier in PBMCs that gave the highest accuracy of class assignment by LOOCV (73%) in the TTD comparison and the 30-gene classifier in PBMCs that displayed the highest accuracy for class assignment by LOOCV in the TTP analysis (74%) are provided in Supplementary Data.
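To illustrate how leave-one-out cross-validation (LOOCV) scores a nearest-neighbor classifier, the sketch below holds out each training profile in turn, predicts its class from the remaining profiles, and reports the fraction predicted correctly. It is not the software used in the study: the Euclidean distance, the value of k, majority voting, and the toy two-gene profiles are all assumptions, and the gene selection step that precedes classification is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_profiles, train_labels, query, k=3):
    """Predict the class of `query` by majority vote of its k nearest neighbors."""
    ranked = sorted(range(len(train_profiles)),
                    key=lambda i: euclidean(train_profiles[i], query))
    votes = [train_labels[i] for i in ranked[:k]]
    return max(set(votes), key=votes.count)


def loocv_accuracy(profiles, labels, k=3):
    """Hold out each sample once and return the fraction classified correctly."""
    correct = 0
    for i in range(len(profiles)):
        rest_profiles = profiles[:i] + profiles[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_predict(rest_profiles, rest_labels, profiles[i], k) == labels[i]:
            correct += 1
    return correct / len(profiles)


# Toy two-gene profiles (hypothetical values, not study data).
profiles = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
            [5.0, 5.1], [4.8, 5.2], [5.1, 4.9]]
labels = ["poor"] * 3 + ["favorable"] * 3
accuracy = loocv_accuracy(profiles, labels, k=3)
```

Repeating this calculation for classifiers of different sizes and keeping the size with the highest LOOCV accuracy corresponds to the selection step described in the text.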
Finally, we evaluated the optimally sized classifiers based on LOOCV of the training set on an independent test set of samples. We defined sensitivity of the PBMC expression-based assays as the correct identification of patients with favorable outcomes and specificity as the correct identification of patients with unfavorable outcomes. In the test set, the PBMC-based gene classifier for TTD showed moderate overall accuracy (72%) with high sensitivity (100%) but poor specificity (33%) due to a high false-positive rate. The PBMC-based gene classifier for TTP showed good overall accuracy (85%) with high sensitivity (80%) and specificity (100%) due to both low false-positive and low false-negative rates.
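Under the definitions given above (favorable outcome as the positive class), sensitivity, specificity, and overall accuracy can be computed from true and predicted labels as in the following sketch. The function name and the example labels are hypothetical and do not reproduce the study's test set.

```python
def classifier_metrics(true_labels, predicted_labels, positive="favorable"):
    """Compute sensitivity, specificity, and accuracy with `positive` as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    # Note: this sketch assumes both classes are present in true_labels.
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical example: every favorable case is found, one poor case is missed.
truth = ["favorable"] * 4 + ["poor"] * 2
predicted = ["favorable"] * 4 + ["favorable", "poor"]
metrics = classifier_metrics(truth, predicted)
```

In this toy case, the misclassified poor-outcome sample is a false positive, so sensitivity stays at its maximum while specificity drops, the same pattern reported for the TTD classifier.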
DISCUSSION
In this study, we defined our pharmacogenomic objective as the identification of patients with good or poor outcome based on pretreatment expression profiles in PBMCs. In an initial analysis, an unsupervised hierarchical clustering algorithm segregated patients solely based on the similarity in their global expression profiles in PBMCs and identified clusters of patients with differences in clinical outcomes. It is encouraging that the Kaplan-Meier–based differences in survival curves for the subsets of patients in the good versus poor prognosis gene expression clusters were more distinct than the differences in survival for those same patients as predicted by their associated clinical risk classifications (data not shown). Although not yet externally validated on an independent set of samples, this is an intriguing finding that supports the continued exploration of surrogate tissue profiling for identification of gene expression patterns predictive of outcome.
Several supervised approaches further strengthened the hypothesis that transcriptional levels of select genes in PBMC profiles of RCC patients are significantly correlated with the clinical outcomes of disease progression and overall survival. Parametric (Cox proportional hazards modeling) univariate analyses identified individual transcripts in PBMCs that were significantly correlated with both disease progression and survival. GeneCluster-based gene selection methods also identified multigene signatures in PBMCs that were predictive of progression and survival.
It should be noted that we did not limit the present analyses to the RCC disease-associated transcripts reported previously (9) and in so doing found that the transcripts most strongly correlated with disease outcome in RCC patients were not necessarily the transcripts most strongly associated with the presence of disease in the comparison of RCC PBMCs and normal volunteers. It seems that some of the differences between transcriptional profiles in PBMCs of RCC patients and healthy subjects are driven by differences in cell populations; yet, cell populations were not significantly distinct in poor and favorable outcome groups in the present analysis. This may be one of several plausible reasons that those transcriptional differences in PBMCs that define the presence of RCC relative to the healthy condition are not necessarily those that are most significantly correlated with clinical outcomes within the population of RCC patients evaluated in the present study.
The overall accuracy of the predictive models for TTP and overall survival on test sets of patients was encouraging (85% and 72%, respectively), and overall accuracies in both training set cross-validation and test-set predictions were similar. The model based on TTP displayed higher accuracy mainly because the model based on yearlong survival had a higher false-positive rate: several of the shorter-term survivors were incorrectly classified as long-term survivors. Nonetheless, the gene classifier for yearlong survival exhibited high sensitivity for the patients tested (i.e., correctly assigned the majority of yearlong survivors).
The PBMC-based gene classifier was able to classify short versus long TTP with relatively high accuracy, but the Motzer risk assessment scores, although very well correlated with TTD, were not significantly distinct for patients who exhibited short-term and long-term times to disease progression as defined in this study (see Supplementary Data). This is not surprising because the Motzer risk assessment was developed for overall survival (11). As additional cytostatic cancer therapies are developed, TTP will continue to be an important diagnostic end point in oncology trials. The finding that the PBMC profiles of patients in this study seem prognostic of short versus long TTP provides additional support for continued evaluations of PBMC transcriptional profiles in the context of clinical outcomes in patients with other solid tumors. The results from supervised classification analyses in the present study suggest that it may be possible to use transcriptional profiles in the surrogate tissue of peripheral blood to identify cancer patients with greater chances for long or short times to progression and/or long or short survival.
The results further imply that the circulating mononuclear cells of peripheral blood may serve as a sensitive monitor of the organism's physiologic state. As these cells pass through various tissues, their reaction to the microenvironment is captured in a complex transcriptional response measured through profiling. Surprisingly, such patterns not only seem to be diagnostic of disease state (e.g., RCC) but also may reflect differential responses to variations in clinically indistinguishable disease states (e.g., advanced RCC with different degrees of aggressiveness). This suggests that the PBMCs, due to their transit through the body, may serve as an accessible surrogate monitor of tissues and systems that are not easily obtained by routine biopsies. These changes may be reflective of an ongoing physiologic host response to the tumor, such as an immune or inflammatory response. Concerning this possibility, the elevated expression of immune-associated genes in PBMCs from patients in the good prognosis category is of particular interest.
The functional categories of transcripts in PBMCs associated with low or high risk displayed several interesting trends. First, transcripts elevated in PBMCs of patients with shorter TTP or survival included those involved in cytoskeletal organization/cell motility, associated small GTPases, general pathways of proteasome-dependent catabolism, and general pathways of metabolism. In contrast, transcripts elevated in PBMCs of patients with longer TTP or survival included those involved in mRNA transport, mRNA processing/splicing, and ribosomal protein subunits. Different eukaryotic translation initiation factor isoforms were elevated in patients with poor or favorable outcomes. Because the drug evaluated in this study is a well-characterized inhibitor of eukaryotic translation (19, 20), it is tempting to speculate that elevated transcript levels of certain eukaryotic translation initiation factor isoforms in PBMCs may represent potential biomarkers of poor or favorable response to treatment with CCI-779, but this is currently unproven.
It is important to note that the present study cannot distinguish whether the profiles in PBMCs discovered here are simply prognostic of outcome in these patients regardless of therapy or whether they are specific to impending treatment with CCI-779. In the absence of a placebo or active control arm, we were unable to determine whether there are patterns in pretreatment PBMCs that are predictive of clinical outcome only in the context of CCI-779 therapy. To address this, we are currently collecting whole blood samples at baseline before patient entry into a phase III oncology trial of CCI-779 in RCC, which includes several different treatment arms. Surrogate tissue analysis in the phase III trial will make it possible to discriminate between pretreatment transcriptional profiles that are specific to the therapies in question and those that are simply prognostic of disease outcome regardless of therapy. Analysis of the larger phase III trial of CCI-779, currently under way in a completely independent set of RCC patients, will allow validation and/or further refinement of the predictive models identified in the present study. The findings reported in the present phase II trial support the continued evaluation of surrogate tissue expression profiles to enhance the prediction of clinical outcomes in cancer populations.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org).
Acknowledgments
We thank the many patients who donated clinical samples for this study; Christine Reilly for expert technical assistance; and Andrew Hill, William Mounts, Maryann Whitley, Charles Zacharchuk, and John Ryan for many thoughtful discussions throughout this study.