Abstract
Purpose: Better understanding of the underlying biology of primary central nervous system lymphomas (PCNSL) is critical for the development of early detection strategies, molecular markers, and new therapeutics. This study aimed to define genes associated with survival of patients with PCNSL.
Experimental Design: Expression profiling was conducted on 32 PCNSLs. A gene classifier was developed using the random survival forests model. On this basis, a prognosis prediction score (PPS) using immunohistochemical analysis was also developed and validated in another data set of 43 PCNSLs.
Results: We identified 23 genes whose expression was strongly and consistently related to patient survival. A PPS was developed for overall survival (OS) using a univariate Cox model. Survival analyses using the selected 23-gene classifier revealed a prognostic value both in patients treated with high-dose methotrexate (HD-MTX) and in those treated with HD-MTX–containing polychemotherapy regimens. Patients predicted to have good outcomes by the PPS showed significantly longer survival than those with poor predicted outcomes (P < 0.0001). The PPS using immunohistochemical analysis was also significant in the test (P = 0.0004) and validation (P = 0.0281) data sets. The gene-based predictor was an independent prognostic factor in a multivariate model that included clinical risk stratification (P < 0.0001). Among the genes, BRCA1 protein expression was most strongly associated with patient survival.
Conclusion: We have identified gene expression signatures that can accurately predict survival in patients with PCNSL. These predictive genes should be useful as molecular biomarkers and they could provide novel targets for therapeutic interventions. Clin Cancer Res; 18(20); 5672–81. ©2012 AACR.
In this study, we report the development and validation of a risk-score model based on the expression of 23 genes. This 23-gene risk score is highly associated with the outcome of patients with newly diagnosed primary central nervous system lymphoma (PCNSL). These results suggest the importance of this multimarker panel as a stratification factor for the design of future comparative therapeutic trials.
Introduction
A primary central nervous system lymphoma (PCNSL) is an extranodal form of non-Hodgkin lymphoma arising in the craniospinal axis. For many years, PCNSLs were reported to represent 3% to 5% of all primary central nervous system (CNS) tumors (1). However, PCNSL seems to be increasing in incidence (2–4). The tumor manifestation is often diffuse and multifocal, and most frequently affects the supratentorial brain parenchyma, with periventricular lesions involving the corpus callosum, basal ganglia, or thalamus. The absence of systemic lymphadenopathies and other extracranial localizations of disease should be confirmed. Most PCNSLs belong to the diffuse large B-cell lymphomas (DLBCL) but differ from systemic DLBCLs by their less favorable prognosis.
The systemic use of high-dose methotrexate (HD-MTX)–based chemotherapy with radiotherapy for newly diagnosed PCNSL has improved the median overall survival (OS) from 20 to 36 months (5–8). However, outcomes still vary widely among individual patients within the same diagnostic and prognostic categories, and patients with differing outcomes cannot yet be recognized prospectively, resulting in a need for additional biomarkers. Although the clinical scoring model using age, Karnofsky performance status (KPS), and lactate dehydrogenase (LDH) level has prognostic value for PCNSL (9–11), it has not been used successfully to stratify patients for therapeutic trials. Molecular markers could improve outcome prediction, reveal potential targets for therapeutic intervention, and elucidate mechanisms that result in resistance to chemotherapy. A comprehensive molecular approach to predicting the prognosis is awaited. In the present study, we carried out an expression profiling analysis in patients with PCNSL to identify genes that are predictive of OS.
Materials and Methods
Samples and study population
Patients were diagnosed and treated at Niigata University Hospital (Niigata, Japan), Chiba University Hospital (Chiba, Japan), Yamaguchi University Hospital (Ube, Yamaguchi, Japan), and Toyama Prefectural Central Hospital (Toyama, Japan) between 2000 and 2010. Clinical data were obtained through a registered database and chart review. Inclusion criteria were a histology-proven CNS lymphoma without evidence of systemic lymphoma, and no evidence of HIV-1 infection, opportunistic infections, or other immunodeficiency. Patients were selected on the basis of the availability of tumor specimens without regard to the clinical outcome. All patients underwent brain imaging with either computed tomography (CT) or magnetic resonance imaging (MRI). After the diagnostic biopsy, a detailed history and physical examination, complete blood count, screening blood tests of hepatic and renal function, serum protein electrophoresis, and chest radiographs were obtained. CT or MRI of the thorax, abdomen, and pelvis was conducted for all patients. Ophthalmologic consultations and slit-lamp examinations were used to rule out ocular involvement. Bone marrow biopsy was not routinely conducted unless CNS involvement was suspected to be part of a systemic lymphoma. Lumbar puncture for cerebrospinal fluid (CSF) evaluation was routinely conducted. Tissues were snap-frozen in liquid nitrogen within 5 minutes of harvesting and stored at −80°C thereafter. All specimens were centrally reviewed by a board-certified pathologist, who examined sections of paraffin-embedded tissue that were adjacent or in close proximity to the frozen sample from which the RNA was subsequently extracted. The cut-offs for normal CSF protein and serum LDH levels were 45 mg/dL and 216 IU/L, respectively. Informed consent was obtained from all patients for the use of their samples, in accordance with the guidelines of the respective Ethical Committees on Human Research.
RNA extraction and array hybridization
Approximately 100 mg of tissue from each tumor was subjected to total RNA extraction using Isogen (Nippon Gene) in accordance with the manufacturer's instructions. These tissues contained more than 95% tumor cells. The quality of the obtained RNA was verified using a Bioanalyzer System (Agilent Technologies) and RNA Pico Chips (Agilent Technologies). Subsequently, 1 μg of RNA was processed for hybridization to a GeneChip Human Genome U133 Plus 2.0 Expression Array (Affymetrix Inc.), which interrogates approximately 47,000 transcripts. After hybridization, the chips were processed using a Fluidics Station 450, High-Resolution Microarray Scanner 3000, and GCOS Workstation Version 1.3 (Affymetrix Inc.).
Validation of differential expression by real-time qPCR
Quantitative PCR (qPCR) was conducted using a StepOne Real-Time PCR System (Applied Biosystems) and TaqMan Universal PCR Master Mix (Applied Biosystems) according to the manufacturer's protocol. The Assays-on-Demand probe/primer sets (Applied Biosystems) used were as follows: ATAD1, Hs00907773_g1; BRCA1, Hs01556193_m1; FANCA, Hs01116668_m1; GAPDH, Hs99999905_m1; GGH, Hs00914163_m1; GNASAS, Hs00294858_m1; PGAM1, Hs01652468_g1; PPP3R1, Hs01547793_m1; RBBP8, Hs00161222_m1; ROCK1, Hs01127699_m1; STIL, Hs00161700_m1; TRMT6, Hs00210942_m1; and ZNF681, Hs01862022_s1. Total RNA (1 μg) was reverse-transcribed into cDNA using SuperScript II (Invitrogen), and 1 μL of the resulting cDNA was used for qPCR. Validation was conducted on a subset of the tumors in the original data set. Assays were carried out in duplicate. The raw qPCR data were the numbers of cycles (CT) required for the reactions to reach the exponential phase. Expression of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used for normalization of the qPCR data. The mean fold change in expression between tumor groups was calculated using the $2^{-\Delta\Delta C_T}$ method (12).
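The normalization and fold-change calculation described above can be illustrated with a minimal sketch. The function names and CT values below are hypothetical and chosen only for illustration; they are not data from this study. The sketch assumes that duplicate CT readings are averaged before normalization to the reference gene (GAPDH).

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean CT of the target gene minus mean CT of the reference gene (e.g. GAPDH)."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(dct_group_a, dct_group_b):
    """Relative expression of group A versus group B: 2^-(dCT_A - dCT_B)."""
    return 2.0 ** -(dct_group_a - dct_group_b)


# Hypothetical duplicate CT readings (not study data).
dct_short = delta_ct([24.5, 23.5], [18.0, 18.0])  # target vs. GAPDH, group A
dct_long = delta_ct([26.5, 25.5], [18.5, 17.5])   # target vs. GAPDH, group B
fc = fold_change(dct_short, dct_long)
print(f"fold change (A vs. B): {fc:.2f}")
```

A lower normalized CT in group A than in group B gives a fold change above 1, that is, higher relative expression in group A.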
Immunohistochemistry
Three antibodies for immunophenotype determination and 5 commercially available antibodies for proteins encoded by genes associated with patient survival were selected for immunohistochemistry (IHC). Sections (5 μm) of the formalin-fixed, paraffin-embedded tissue specimens were evaluated. The primary antibodies recognized BCL6 (DAKO; 1:200 dilution), BRCA1 (Abcam; 1:200 dilution), CD10 (Nichirei; 1:1 dilution), CD79a (DAKO; 1:50 dilution), FANCA (Abcam; 1:3,000 dilution), MUM1 (DAKO; 1:50 dilution), PPP3R1 (Abcam; 1:100 dilution), ROCK1 (Sigma–Aldrich; 1:125 dilution), and RBBP8 (Abnova; 1:200 dilution). Anti-goat or anti-mouse secondary antibodies (Nichirei) were also applied. The staining intensity was classified as none or weakly positive (0 points), moderately positive (1 point), or strongly positive (2 points). Cases in which more than 20% of lymphoma cells were positive were considered positive. The averages of 3 independent measurements were calculated to the first decimal place. The observers were blinded to the case numbers. For double immunostaining, sections were first immunostained for BRCA1 and then washed with glycine buffer (pH 2.2). The sections were then incubated with anti-CD79a antibody and visualized with HistoGreen (Linaris Biologische Produkte).
Bioinformatics analysis
The primary outcome was OS, defined as the time from first diagnosis to death or last follow-up. All statistical analyses were conducted using R software (13) and Bioconductor (14). The Affymetrix GeneChip probe-level data were preprocessed using MAS 5.0 (Affymetrix Inc.) for background adjustment and log-transformation (base 2). Each array was then quantile-normalized to impose the same empirical distribution of intensities on every array. Genes that passed the filter criteria were considered for further analysis. To select predictors (genes) for OS, we applied the random survival forests-variable hunting (RSF-VH) algorithm (15) to the filtered gene expression data. Among the parameters of the algorithm, the number of Monte Carlo iterations (nrep) and the value controlling the step size used in the forward process (nstep) were set to nrep = 100 and nstep = 5, following Ishwaran and colleagues (15). For other parameters, such as the number of trees and the number of variables selected randomly at each node, we used the default settings of the varSel function within the randomSurvivalForest package for the selection. We then classified the samples into 2 survival groups by Ward minimum variance cluster analysis, using as inputs the ensemble cumulative hazard functions of each individual at all unique death time points, estimated from the random survival forests model fitted to the selected genes.
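As an illustration of the quantile normalization step, the following minimal sketch imposes a common empirical distribution on every array. It is a simplified stand-in, not the Bioconductor implementation used in the study: ties are broken by position rather than averaged, and the function name and toy intensities are assumptions made for the example.

```python
def quantile_normalize(arrays):
    """Give every array the same empirical distribution of intensities.

    `arrays` is a list of equally long lists, one list of (log2)
    intensities per array. Each value is replaced by the mean, across
    arrays, of the values that share its rank.
    """
    n_probes = len(arrays[0])
    if any(len(a) != n_probes for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    # Reference distribution: mean of the k-th smallest value over arrays.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [
        sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n_probes)
    ]

    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])  # probe indices by rank
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = reference[rank]
        normalized.append(out)
    return normalized


# Toy log2 intensities for two arrays and three probes (not study data).
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(toy)
print(normalized)
```

After normalization each array contains the same set of values, while the rank order of the probes within each array is preserved.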
The 2 classified survival groups were used to compute the prognosis prediction score (PPS) in a simple form (a linear combination of gene expressions). To do this, we used principal component analysis and receiver operating characteristic analysis. Briefly, we computed the first principal component of the expressions of the genes selected by the RSF-VH algorithm as a risk score, and then searched for the optimal cut-off point to predict the survival groups with maximum accuracy according to the Youden index (16). This method was validated using 10-fold cross-validation. The predictive accuracy of the PPS was assessed by Harrell's concordance index.
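The cut-off search can be sketched as follows, assuming that a risk score has already been computed for each tumor and that each tumor has been assigned to a survival group. The function name, the convention that a score above the cut-off predicts the poor group, and the toy data are assumptions for illustration only. The Youden index is J = sensitivity + specificity − 1.

```python
def youden_cutoff(scores, is_poor):
    """Return (cutoff, J) maximizing the Youden index.

    A tumor is predicted 'poor' when its score is strictly above the
    cut-off. Both groups must contain at least one tumor.
    """
    n_poor = sum(1 for y in is_poor if y)
    n_good = len(is_poor) - n_poor
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):  # every observed score is a candidate
        true_pos = sum(1 for s, y in zip(scores, is_poor) if y and s > cutoff)
        true_neg = sum(1 for s, y in zip(scores, is_poor) if not y and s <= cutoff)
        j = true_pos / n_poor + true_neg / n_good - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy risk scores and survival groups (not study data).
toy_scores = [-3.0, -1.0, 0.5, 1.0, 2.5, 4.0]
toy_is_poor = [False, False, False, False, True, True]
cutoff, j_index = youden_cutoff(toy_scores, toy_is_poor)
print(cutoff, j_index)
```

In practice the cut-off would be chosen inside each cross-validation fold, so that the held-out tumors do not influence the threshold.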
The Kaplan–Meier method was used to estimate the survival distribution for each group. A log-rank test was used to test the differences between the survival groups. The association of the PPS with OS was evaluated by multivariate analyses with clinical characteristics as other predictors using the Cox proportional hazards regression model. A value of P < 0.05 was considered to indicate statistical significance.
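For readers unfamiliar with the Kaplan–Meier estimate, a minimal sketch is given below. It is not the R code used in the study; the function names and the toy follow-up data are assumptions for illustration. At each distinct death time, the survival probability is multiplied by (1 − deaths / number at risk), and censored patients leave the risk set without contributing a death.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    `events[i]` is 1 if patient i died at `times[i]` and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:  # group tied times
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which the estimated survival falls to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached


# Toy follow-up in days (not study data): 1 = death, 0 = censored.
toy_times = [100, 200, 200, 300, 400]
toy_events = [1, 0, 1, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
print(km_curve, median_survival(km_curve))
```

The log-rank test then compares the observed and expected numbers of deaths between two such curves at each death time.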
Results
Patient characteristics
The baseline characteristics are shown in Table 1. Expression profiling was conducted on 32 PCNSLs in the test set. The median age of the patients was 64.1 years (range, 44–76 years). Seventeen patients (53%) were males and 15 (47%) were females. The median preoperative KPS was 70. All the patients had histology-proven DLBCL. Twenty-one patients (66%) had a single lesion and 11 (34%) had multiple lesions. Deep structures of the brain, that is, the periventricular region, basal ganglia, corpus callosum, brain stem, and/or cerebellum, were involved in 13 patients (41%). Ocular involvement was detected in 4 patients (12.5%), and tumor cells were detected in the CSF in 2 patients (6.2%). An elevated serum LDH level was detected in 15 patients (46.8%), and an elevated concentration of CSF protein was detected in 7 of the 12 patients (58.3%) assessed. The samples were obtained by stereotactic or open biopsy in 15 patients (46.8%) and by surgical resection in 17 patients (53.1%). The treatments were HD-MTX at 3 g/m² per course for 3 or more cycles in 16 cases (50%) and HD-MTX–containing polychemotherapy (cyclophosphamide, pirarubicin, etoposide, vincristine, procarbazine with or without rituximab; ref. 17) in 16 cases (50%). Chemotherapy alone was used in 8 patients (25%) and chemotherapy followed by radiotherapy in 24 patients (75%). Radiotherapy was administered to the whole brain at 30 Gy and local brain at 20 Gy in 12 patients (37.5%), whole brain at 40 Gy in 9 patients (28.1%) and 20 Gy in 3 patients (9.3%). Relapse after response to the first-line therapy occurred in 19 patients (relapse rate, 59.3%). The second-line treatment was at the physician's choice. The median OS was 1,626 days in all patients. Ten patients (31.2%) remained alive (10 with no evidence of disease) after a median follow-up of 48 months (range, 3.8–135.1 months).
The causes of death were lymphoma in 16 patients (72.7%), unrelated causes with no evidence of disease in 3 patients (13.6%), and unknown causes with no evidence of disease in 3 patients (13.6%). The patient characteristics in the validation set were similar to those in the test set, except that more patients in the validation set had deep lesions (Table 1). The patients were monitored for tumor recurrence during the initial and maintenance therapy by MRI or CT.
Table 1. Patient characteristics in the training and validation sets

Characteristic | Training set (n = 32), N (%) | Validation set (n = 43), N (%) | P
---|---|---|---
Age, y | | | 0.949
Average | 64.18 | 64.34 |
Range | 44–76 | 17–84 |
Gender | | | 0.141
Male | 17 (53) | 30 (70) |
Female | 15 (47) | 13 (30) |
KPS at diagnosis | | | 0.392
Median | 70 | 70 |
≥70 | 19 (59) | 24 (56) |
<70 | 13 (41) | 19 (44) |
No. of lesions | | | 0.647
Single | 21 (66) | 26 (60) |
Multiple | 11 (34) | 17 (40) |
Deep lesions | | | <0.05
Yes | 13 (41) | 30 (70) |
No | 19 (59) | 13 (30) |
Histology | | |
DLBCL | 32 (100) | 43 (100) |
Other | 0 (0) | 0 (0) |
Chemotherapy | | | 0.14
HD-MTX | 16 (50) | 17 (40) |
Polychemo | 16 (50) | 23 (53) |
Other | 0 (0) | 3 (7) |
NOTE: Polychemo, HD-MTX–containing polychemotherapy.
Selection of predictive genes
Microarray data have been deposited in Gene Expression Omnibus (accession number GSE34771). Twenty-three genes were selected as the predictors. Table 2 lists the genes with their variable importance values. Variable importance measures the increase (or decrease) in the prediction error of the random forests model when a variable is randomly "noised up." That is, if the prediction error of the model becomes worse when the effect of one variable on the prediction is intentionally destroyed, that variable is important in the model. The scatter plot in Supplementary Fig. S1 shows the relationships between the estimated ensemble mortalities and the expressions of 6 selected genes (BRCA1, FANCA, PPP3R1, RBBP8, ROCK1, and ZNF681). The microarray results were validated using qPCR. The 12 genes assayed by qPCR were also found to be differentially expressed between short-term survivors (survival time, ≤2.5 years; n = 12) and long-term survivors (survival time, ≥4 years; n = 9; Supplementary Table S2).
Table 2. Selected genes and their variable importance values

Probe | Symbol | Description | VI
---|---|---|---
209092_s_at | GLOD4 | Glyoxalase domain containing 4 | 0.0368
238962_at | ZNF681 | Zinc finger protein 681 | 0.0255
223779_at | AFAP1AS | AFAP1 antisense RNA (nonprotein coding) | 0.0227
203344_s_at | RBBP8 | Retinoblastoma binding protein 8 | 0.0170
201839_s_at | EPCAM | Epithelial cell adhesion molecule | 0.0170
236976_at | FANCA | Fanconi anemia, complementation group A | 0.0113
200886_s_at | PGAM1 | Phosphoglycerate mutase 1 (brain) | 0.0113
213044_at | ROCK1 | Rho-associated, coiled-coil containing protein kinase 1 | 0.0085
224874_at | POLR1D | Polymerase (RNA) I polypeptide D, 16 kDa | 0.0085
209146_at | SC4MOL | Sterol-C4-methyl oxidase-like | 0.0085
239233_at | CCDC88A | Coiled-coil domain containing 88A | 0.0085
224850_at | ATAD1 | ATPase family, AAA domain containing 1 | 0.0057
236302_at | PPM1E | Protein phosphatase, Mg2+/Mn2+ dependent, 1E | 0.0057
220176_at | NUBPL | Nucleotide binding protein-like | 0.0057
204531_s_at | BRCA1 | Breast cancer 1, early onset | 0.0028
203560_at | GGH | γ-Glutamyl hydrolase (conjugase, folylpolygammaglutamyl hydrolase) | 0.0028
226103_at | NEXN | Nexilin (F-actin binding protein) | 0.0028
217398_x_at | GAPDH | Glyceraldehyde-3-phosphate dehydrogenase | 0.0000
232881_at | GNASAS | GNAS antisense RNA (nonprotein coding) | −0.0028
223721_s_at | DNAJC12 | DnaJ (Hsp40) homolog, subfamily C, member 12 | −0.0028
204507_s_at | PPP3R1 | Protein phosphatase 3, regulatory subunit B, α | −0.0057
205339_at | STIL | SCL/TAL1 interrupting locus | −0.0085
233970_s_at | TRMT6 | tRNA methyltransferase 6 homolog (S. cerevisiae) | −0.0142
Abbreviation: VI, variable importance.
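The idea behind the variable importance (VI) values in Table 2 can be illustrated with a generic permutation sketch. This is a simplification and not the procedure implemented in the random survival forests software, which works on out-of-bag data within the forest. The function names, the toy model, and the toy data are all hypothetical.

```python
import random


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def permutation_importance(predict, error, rows, outcomes, var_index, seed=0):
    """Change in prediction error after one variable is 'noised up'.

    The values of variable `var_index` are shuffled across samples, which
    destroys its relationship with the outcome. A positive result means
    the error got worse, so the model relied on that variable.
    """
    rng = random.Random(seed)
    baseline = error(outcomes, [predict(r) for r in rows])
    shuffled = [r[var_index] for r in rows]
    rng.shuffle(shuffled)
    noised = [
        r[:var_index] + [v] + r[var_index + 1:] for r, v in zip(rows, shuffled)
    ]
    return error(outcomes, [predict(r) for r in noised]) - baseline


# Toy model that uses only the first variable (not study data).
toy_rows = [[1.0, 5.0], [2.0, 7.0], [3.0, 6.0], [4.0, 8.0]]
toy_outcomes = [1.0, 2.0, 3.0, 4.0]


def toy_predict(row):
    return row[0]


vi_used = permutation_importance(toy_predict, mean_squared_error, toy_rows, toy_outcomes, 0)
vi_unused = permutation_importance(toy_predict, mean_squared_error, toy_rows, toy_outcomes, 1)
print(vi_used, vi_unused)
```

Under this reading, a negative VI such as those at the bottom of Table 2 means that the prediction error decreased slightly when the variable was noised up.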
Survival analysis using the selected 23-gene classifiers reveals a prognostic value
Kaplan–Meier curves were drawn for groups classified by clustering analyses based on the gene expressions selected by the significance analysis of microarrays (SAM; ref. 18) with the false discovery rates (Fig. 1A) and by the random survival forests model (Fig. 1B). The corresponding P values by the log-rank test were P = 0.038 for the SAM and P < 0.0001 for the random survival forests model using the 23-gene set. These results show that the random survival forests model is more useful than direct use of the gene expressions.
Survival analysis using the selected 23-gene classifiers reveals a prognostic value independent of HD-MTX–based chemotherapy regimens
The 23-gene profile was tested for the prediction of outcome in the HD-MTX and HD-MTX–containing polychemotherapy groups using Kaplan–Meier curves. The corresponding P values for the log-rank test were P = 0.0001 for the HD-MTX chemotherapy group (Fig. 1C) and P < 0.0001 for the HD-MTX–containing polychemotherapy group (Fig. 1D). These results show that the random survival forests model is useful for predicting survival irrespective of the HD-MTX–based chemotherapy regimens.
Identification of a PPS associated with survival
The PPS was computed from a linear combination of the 23 genes and calculated for each tumor as follows: Z1 = 0.18 × ZNF681 + 0.03 × GNASAS + 0.15 × FANCA + 0.06 × GAPDH + 0.28 × TRMT6 + 0.24 × PGAM1 + 0.28 × PPP3R1 + 0.23 × RBBP8 + 0.27 × ROCK1 + 0.26 × STIL + 0.28 × BRCA1 + 0.26 × ATAD1 + 0.27 × GGH + 0.22 × GLOD4 + 0.15 × EPCAM + 0.14 × AFAP1AS + 0.23 × POLR1D + 0.1 × NEXN + 0.14 × PPM1E + 0.15 × SC4MOL + 0.22 × NUBPL + 0.27 × CCDC88A + 0.04 × DNAJC12.
The standardized (z-score) expression value of each individual gene was used in this formula. The Z1 scores ranged from −10.0 to 4.43, with a high score associated with a poor outcome. The optimal cut-off was a Z1 score of 1.82. As expected, the predictor performed well in terms of the prognosis: the good prognosis group (Z1 ≤ 1.82) had a median survival time of 2,271 days, whereas the poor prognosis group (Z1 > 1.82) had a median survival time of 640 days (P < 0.0001; Fig. 1E). The 10-fold cross-validated c-index was 0.856 [95% confidence interval (CI), 0.824–0.887; P < 0.0001], indicating a significant predictive accuracy.
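The Z1 calculation and the concordance index can be sketched as follows. The weights and the cut-off of 1.82 are taken from the formula and text above; everything else (function names, the assumption that the inputs are already standardized expression values, and the toy survival data) is illustrative. The c-index shown is a plain, non-cross-validated version of Harrell's statistic that skips pairs with tied times.

```python
# Weights from the Z1 formula above.
Z1_WEIGHTS = {
    "ZNF681": 0.18, "GNASAS": 0.03, "FANCA": 0.15, "GAPDH": 0.06,
    "TRMT6": 0.28, "PGAM1": 0.24, "PPP3R1": 0.28, "RBBP8": 0.23,
    "ROCK1": 0.27, "STIL": 0.26, "BRCA1": 0.28, "ATAD1": 0.26,
    "GGH": 0.27, "GLOD4": 0.22, "EPCAM": 0.15, "AFAP1AS": 0.14,
    "POLR1D": 0.23, "NEXN": 0.10, "PPM1E": 0.14, "SC4MOL": 0.15,
    "NUBPL": 0.22, "CCDC88A": 0.27, "DNAJC12": 0.04,
}
Z1_CUTOFF = 1.82  # optimal cut-off reported in the text


def z1_score(z_expression):
    """Weighted sum of standardized expression values, keyed by gene symbol."""
    return sum(w * z_expression[gene] for gene, w in Z1_WEIGHTS.items())


def prognosis_group(z1):
    """Scores above the cut-off fall in the poor prognosis group."""
    return "poor" if z1 > Z1_CUTOFF else "good"


def harrell_c(times, events, risks):
    """Fraction of comparable pairs in which the higher risk died earlier.

    A pair (i, j) is comparable when patient i died before time j.
    Ties in risk count as half.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical tumor with every gene one standard deviation above the mean.
example_z1 = z1_score({gene: 1.0 for gene in Z1_WEIGHTS})
print(round(example_z1, 2), prognosis_group(example_z1))
```

A c-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering of survival times by the score.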
The potential extension of the microarray-based outcome prediction to the clinical setting was further explored using IHC detection methods. For practical purposes, we tried to incorporate the IHC data and clinical parameters to compute another PPS. The following formula was constructed by the Cox proportional hazards regression model and backward selection:
where KPS was scored as 1 for 70 to 100 and 0 for 10 to 60, and PPP3R1, RBBP8, and BRCA1 were scored as 1 for immunohistochemical scores of 1 or more and 0 for immunohistochemical score of 0.
The Z2 scores ranged from 2.15 to 4.99, with a high score associated with a poor outcome. The optimal cut-off, which gave the highest log-rank test statistic, was a Z2 score of 3.48. As expected, the predictor performed well in terms of the prognosis: the good prognosis group (Z2 ≤ 3.48) had a median survival time of 2,271 days, whereas the poor prognosis group (Z2 > 3.48) had a median survival time of 721 days (P = 0.0004; Fig. 1F). It should be noted that a different method was used to compute the PPS that includes clinical characteristics, owing to their poor significance.
The gene expression predictor is the most significant feature
The performance of the gene expression predictor (PPS) was compared with those of traditional individual features. As shown in Table 3, Z1 and Z2 were significantly associated with OS in the univariate analyses. Other prognostic scoring systems, such as the IELSG (9) or MSKCC (10) prognostic risk group classification, were not significant in our series. Table 4 shows the results of the multivariate analyses, in which the clinical characteristics were either represented by Z2 (Table 4A) or selected by the stepwise procedure (Table 4B). As shown in Table 4, the gene expression predictor Z1 was significantly associated with OS in the multivariate analyses. It should be noted that BRCA1 IHC was associated with progression-free survival (PFS) in the univariate analyses and was close to significantly associated with OS in the multivariate analyses (P = 0.052).
Table 3. Univariate analyses of OS and PFS

Variable | Median OS, wks | P | Median PFS, wks | P
---|---|---|---|---
Age, y | | 0.581 | | 0.777
≥65 | 286 | | 178 |
<65 | 184 | | 73 |
Gender | | 0.324 | | 0.596
Male | 311 | | 100 |
Female | 232 | | 123 |
KPS | | 0.256 | | 0.547
≥70 | 286 | | 123 |
<70 | 232 | | 78 |
No. of lesions | | 0.419 | | 0.782
Single | 286 | | 135 |
Multiple | 122 | | 78 |
Deep lesions | | 0.704 | | 0.777
Yes | 135 | | 68 |
No | 286 | | 147 |
LDH serum level | | 0.646 | | 0.199
≥216 | 135 | | 43 |
<216 | 311 | | 178 |
CSF protein level | | 0.24 | | 0.442
Elevated | 191 | | 178 |
Normal | 324 | | 147 |
Operation method | | 0.26 | | 0.564
Biopsy | 163 | | 73 |
Removal | 286 | | 123 |
Chemotherapy | | 0.733 | | 0.514
HD-MTX | 136 | | 123 |
Polychemo | 298 | | 123 |
Immunophenotype (microarray) | | 0.026 | | 0.098
GCB | 232 | | 78 |
ABC | 79 | | 34 |
Immunophenotype (IHC) | | 0.256 | | 0.578
GCB | 363 | | 128 |
ABC | 135 | | 100 |
BRCA1 (IHC) | | 0.168 | | 0.016
Positive | 96 | | 35 |
Negative | 311 | | 178 |
RBBP8 (IHC) | | 0.232 | | 0.141
Positive | 135 | | 68 |
Negative | 286 | | 147 |
ROCK1 (IHC) | | 0.931 | | 0.94
Positive | 136 | | 123 |
Negative | 271 | | 89 |
FANCA (IHC) | | 0.333 | | 0.197
Positive | 232 | | 123 |
Negative | 122 | | 39 |
PPP3R1 (IHC) | | 0.693 | | 0.401
Positive | 136 | | 68 |
Negative | 311 | | 298 |
MSKCC | | 0.557 | | 0.819
Age ≤ 50 | 495 | | 248 |
Age > 50, KPS ≥ 70 | 191 | | 123 |
Age > 50, KPS < 70 | 232 | | 47 |
IELSG | | 0.781 | | 0.827
0–1 | 230 | | 111 |
2–3 | 232 | | 113 |
3–4 | 223 | | 145 |
Z1 | | <0.0001 | | <0.0001
>1.82 | 91 | | 34 |
≤1.82 | 324 | | 298 |
Z2 | | 0.0004 | | 0.0059
>3.48 | 103 | | 41 |
≤3.48 | 324 | | 248 |
NOTE: Polychemo, HD-MTX–containing polychemotherapy.
Table 4. Multivariate analyses of OS in the entire series (n = 32)

Variable | Subgroup | HR (95% CI) | P
---|---|---|---
(A) | | |
Z1 | Continuous variable | 1.73 (1.08–3.52) | 0.017
Z2 | Continuous variable | 2.49 (0.23–39.4) | 0.434
(B) | | |
Age | Continuous variable | 0.99 (0.93–1.05) | 0.775
KPS | ≥70 vs. <70 | 0.75 (0.28–2.06) | 0.573
BRCA1 (IHC) | Positive/negative | 2.94 (0.98–8.86) | 0.052
Z1 | Continuous variable | 1.45 (1.18–1.91) | <0.0001
Z2 formula was validated in the independent sample set
Because validation of the gene expression signature in another independent set is difficult, the Z2 score was validated in the validation set (Table 1). The Z2 scores ranged from 0.97 to 6.36. As expected, there was a significant difference in OS between the good prognosis group (Z2 ≤ 3.48) and the poor prognosis group (Z2 > 3.48; P = 0.0281; Fig. 2A).
Classification by cell-of-origin is not associated with survival
We classified our cases into germinal center B-cell–like (GCB) and activated B-cell–like (ABC) subgroups using a gene expression–based method according to Wright and colleagues (19). Ten cases were classified as GCB and 9 were classified as ABC (Supplementary Fig. S3A). There was a difference in survival between these 2 groups in univariate analyses (P = 0.026; Supplementary Fig. S3B), but no difference in multivariate analyses (Table 4). We also immunophenotyped the cases into the GCB and ABC subgroups by CD10, BCL-6, and MUM1 IHC according to Camilleri-Broët and colleagues (20). Six cases were classified as GCB and 21 were classified as ABC. There was no difference in survival between these 2 groups (P = 0.256; data not shown). Thus, in our cases, cell-of-origin classification by microarray was not significantly associated with patient survival in multivariate analyses.
High BRCA1 expression is associated with poor survival
BRCA1 mRNA determined by qPCR was found to be differentially expressed between short-term and long-term survivors (P = 0.027; Supplementary Table S2). Examples of BRCA1 IHC are shown in Fig. 2B. Double immunostaining for BRCA1 and CD79a (a B-cell marker) showed that BRCA1-positive signals were detected in the nuclei of CD79a-positive lymphoma cells, which had enlarged nuclei. The staining pattern for BRCA1 was predominantly nuclear in 8 cases, cytoplasmic in 7 cases, and both nuclear and cytoplasmic in 4 cases. There was no significant difference in OS between the nuclear and cytoplasmic patterns (data not shown). However, in univariate analyses there was a significant difference in PFS between the BRCA1-positive and BRCA1-negative groups in both data sets (Fig. 2C and D). PFS was defined as the time from first diagnosis to disease recurrence or death. Overexpression of BRCA1 mRNA or protein was strongly associated with poor survival in patients with PCNSL. However, FANCA, PPP3R1, ROCK1, and RBBP8 IHC findings were not significantly associated with patient survival (Table 3).
Discussion
Little progress has been made so far in the molecular analysis of PCNSL because only very small amounts of tissue are obtained for genetic analyses. Although our study still involves a small number of patients, it is the largest series to date and the first study to use gene expression for prognostic classification in patients with PCNSL. A better understanding of PCNSL biology is crucial to improve its prognosis. However, only a few studies have been reported on gene expression profiles of PCNSLs. Rubenstein and colleagues (21) compared the gene expression signature of 23 patients with PCNSL with that of 9 patients with nodal large B-cell lymphoma. They showed that individual cases of PCNSL were classified as GCB cell, ABC cell, or type III large B-cell lymphoma based on the cell-of-origin classification described by Alizadeh and colleagues (22). In addition, PCNSLs were distinguished from nodal B-cell lymphoma by high expression of regulators of the unfolded protein response signaling pathway and of c-Myc and Pim-1. The IL-4 signaling pathway was associated with tumorigenesis and adverse prognosis in patients with PCNSL (21). Montesinos-Rongen and colleagues (23) reported the gene expression profile of 21 PCNSLs. They showed that PCNSLs resembled late GCB cells in their gene expression pattern, and that PCNSLs were distributed among the spectrum of systemic DLBCLs. Tun and colleagues (24) reported a gene expression comparison between 13 PCNSLs and 30 non-CNS DLBCLs. PCNSL was characterized by significant expression of multiple extracellular matrix- and adhesion-related pathways. Sung and colleagues (25) evaluated 12 patients with PCNSL by comparative genomic hybridization and 7 of the 12 patients by expression profiling. They selected 8 candidate genes in which expression changes were associated with copy number changes.
Systemic DLBCLs comprise several diseases that differ in responsiveness to chemotherapy (26, 27). The GCB cell–like subgroup expressed genes characteristic of normal GCB cells and was associated with a good outcome, whereas the ABC cell–like subgroup expressed genes characteristic of activated B cells and was associated with a poor outcome. Gene expression analyses of PCNSLs have largely focused on normal lymphocyte development, and the cell-of-origin classification method was not associated with significant survival differences in multivariate analyses. Moreover, prognostic scoring systems, such as the IELSG (9) or MSKCC (10) prognostic risk group classifications, were not significant in our series. Therefore, we developed a novel scoring system based on molecular markers.
We assessed the relationships between gene expression and survival time using the random survival forests model, which provided better classification than SAM and gene expression subgroups. As discussed by Cordell (28), the functional form should contain gene-by-gene interaction terms. The random forests method is a tree-based method, which has an advantage in detecting interactions. It was developed for data in which the number of variables (genes) is much larger than the number of patients. A random forests framework that handles this setting was therefore necessary for our analysis. Genes were selected by applying the RSF-VH algorithm. The advantage of this method is that no prior screening of the genes is necessary. Many microarray studies use univariate analyses for screening, in which genes that act through interactions with other genes may be dropped from the analyses. In this regard, the RSF-VH algorithm is more desirable.
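The concern about univariate screening can be illustrated with a small toy example. The sketch below uses invented, uncensored survival times for two hypothetical binary markers whose effect appears only in combination; it is not the RSF-VH algorithm and does not use study data. The function names `marginal_gap` and `nested_gap` are illustrative assumptions.

```python
from statistics import mean

# Hypothetical toy data: (gene_a_high, gene_b_high, survival_months).
# Survival is short only when exactly one of the two genes is high.
patients = [
    (0, 0, 60.0), (0, 0, 58.0),
    (0, 1, 12.0), (0, 1, 10.0),
    (1, 0, 12.0), (1, 0, 10.0),
    (1, 1, 60.0), (1, 1, 58.0),
]

def marginal_gap(rows, gene_index):
    """Absolute difference in mean survival between high and low expressors
    of a single gene (a stand-in for a univariate screening statistic)."""
    high = [r[2] for r in rows if r[gene_index] == 1]
    low = [r[2] for r in rows if r[gene_index] == 0]
    return abs(mean(high) - mean(low))

def nested_gap(rows, first, second):
    """Partition on `first`, then compare `second` within each branch,
    as a depth-2 tree would; return the largest gap found in any branch."""
    gaps = []
    for level in (0, 1):
        branch = [r for r in rows if r[first] == level]
        gaps.append(marginal_gap(branch, second))
    return max(gaps)

gap_a = marginal_gap(patients, 0)      # gene A alone
gap_b = marginal_gap(patients, 1)      # gene B alone
gap_tree = nested_gap(patients, 0, 1)  # gene B within each level of gene A

print(gap_a, gap_b, gap_tree)
```

In this constructed case each gene shows no marginal difference in mean survival, so a univariate screen would discard both, whereas the nested partition exposes a large difference. Real survival data are censored and noisy, so this only conveys the intuition behind preferring a method that needs no prior screening.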
Among the selected genes, PPP3R1 is a calmodulin-regulated protein phosphatase that plays an important role in signal transduction (29), although there are no reports about its role in cancer development. BRCA1 seems to promote cell survival after DNA damage by preventing apoptosis and participating in repair pathways (30). In addition, a role of BRCA1 in the cellular response to chemotherapy has been discussed (31, 32). The expression levels of BRCA1 mRNA or protein predicted survival after chemotherapy for patients with sporadic ovarian (33), breast (34, 35), prostate (36), and non–small cell lung cancer (37–40). Low levels of BRCA1 expression resulted in increased sensitivity to platinum therapy and decreased sensitivity to taxane therapy (33, 35, 37–40). Silencing of the BRCA1 gene by promoter hypermethylation was reported in sporadic breast and ovarian tumors, especially in the presence of loss of heterozygosity (41). We have provided evidence that BRCA1 expression may represent a predictive biomarker of survival in patients with PCNSL. We are currently investigating the associations between BRCA1 expression and the chemotherapeutic response to MTX. Furthermore, as BRCA1 shows promise as a prognostic and predictive marker in PCNSL, patients identified as high expressors could be treated with agents that downregulate BRCA1, thereby sensitizing them to standard therapies. RBBP8 (also known as CtIP) is a BRCA1-interacting protein (42) and has been implicated in the development of tamoxifen resistance in breast cancer (43). Further work, including both in vitro studies and clinical trials, is needed to assess the correlations between BRCA1-RBBP8 complex expression levels and responses to potential targeted therapies.
Because these analyses are retrospective, they are subject to the limitations inherent in a retrospective design, and the results should be investigated further in future studies. Our PPS may help to identify the patients with PCNSL who are unlikely to be cured by standard therapy. Our PPS involves a small number of genes, and thus quantitative reverse transcriptase PCR assays or customized DNA microarrays could be developed for clinical applications. Much more aggressive therapies, such as high-dose chemotherapy with stem cell transplantation (44) or molecular-targeted therapies that specifically target disabled pathways, might be tailored to those patients with a poor prognosis. In this regard, the expression profiles might not only predict the likelihood of short-term survival, but also yield clues on individual genes involved in tumor development, progression, and response to therapy. Moreover, the ability to distinguish PCNSL subtypes will enable appropriate therapies to be tailored to specific tumor subtypes. Class prediction models based on defined molecular profiles allow classification of PCNSLs in a manner that will be better correlated with clinical outcomes. Therefore, identification of these molecular subclasses of PCNSLs could greatly facilitate prognosis prediction and our ability to develop effective treatment protocols. In conclusion, our profiling results will help to construct a new classification scheme that better assesses these clinical malignancies.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: R. Yamanaka
Development of methodology: A. Kawaguchi, Y. Komohara, N. Tsuchiya, R. Yamanaka
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Y. Iwadate, M. Sano, K. Kajiwara, N. Yajima, N. Tsuchiya, J. Homma, H. Aoki, T. Kobayashi, Y. Sakai, H. Hondoh, Y. Fujii, R. Yamanaka
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): A. Kawaguchi, Y. Iwadate, T. Kakuma, R. Yamanaka
Writing, review, and/or revision of the manuscript: A. Kawaguchi, R. Yamanaka
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): A. Kawaguchi, Y. Sakai, Y. Fujii, R. Yamanaka
Study supervision: R. Yamanaka
Pathological diagnosis, analysis, and interpretation of the immunohistochemical data: Y. Komohara
Carrying out experiments and analyzing data: Y. Sakai
Acknowledgments
The authors thank Akiyoshi Kakita of Resource Branch for Brain Disease Research, Brain Research Institute, Niigata University for preparing specimens.
Grant Support
This work was supported in part by JSPS KAKENHI grant number 21700312 to A. Kawaguchi and 20390392 to R. Yamanaka, and by the Collaborative Research Project grant number 2010–2022 to R. Yamanaka of the Brain Research Institute, Niigata University.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.