Purpose: Glioblastoma is a devastating, incurable disease with few known prognostic factors. Here, we present the first genome-wide survival and validation study for glioblastoma.

Experimental Design: Cox regressions for survival with 314,635 inherited autosomal single-nucleotide polymorphisms (SNP) among 315 San Francisco Adult Glioma Study patients for discovery and three independent validation data sets [87 Mayo Clinic, 232 glioma patients recruited from several medical centers in Southeastern United States (GliomaSE), and 115 The Cancer Genome Atlas patients] were used to identify SNPs associated with overall survival for Caucasian glioblastoma patients treated with the current standard of care, resection, radiation, and temozolomide (total n = 749). Tumor expression of the gene that contained the identified prognostic SNP was examined in three separate data sets (total n = 619). Genotype imputation was used to estimate hazard ratios (HR) for SNPs that had not been directly genotyped.

Results: From the discovery and validation analyses, we identified a variant in single-stranded DNA-binding protein 2 (SSBP2) on 5q14.1 associated with overall survival in combined analyses (HR, 1.64; P = $1.3 \times 10^{-6}$). Expression of SSBP2 in tumors from three independent data sets also was significantly related to patient survival (P = $5.3 \times 10^{-4}$). Using genotype imputation, the SSBP2 SNP rs17296479 had the strongest statistically significant genome-wide association with poorer overall patient survival (HR, 1.79; 95% CI, 1.45–2.22; P = $1.0 \times 10^{-7}$).

Conclusion: The minor allele of SSBP2 SNP rs17296479 and the increased tumor expression of SSBP2 were statistically significantly associated with poorer overall survival among glioblastoma patients. With further confirmation, previously unrecognized inherited variations influencing survival may warrant inclusion in clinical trials to improve randomization. Unaccounted for genetic influence on survival could produce unwanted bias in such studies. Clin Cancer Res; 18(11); 3154–62. ©2012 AACR.

Translational Relevance

Glioblastoma is the most fatal form of primary brain cancer and only a few prognostic factors, age, initial Karnofsky performance status, and some treatments, are known. Reliable genetic prognostic markers are still not well established. We present the first genome-wide survival and validation study for glioblastoma patients treated with the current standard of care, resection, radiation, and temozolomide. Using Cox regressions for genome-wide survival analysis, followed by functional validation in tumor expression and genotype imputation, we identified a variant in single-stranded DNA-binding protein 2 (SSBP2) and the tumor expression of SSBP2 to be significantly associated with patient survival. Identification and characterization of the role of genetic variation in predicting glioblastoma patient survival may help optimize clinical trial study design and individualize patient treatment plans.

Glioblastoma is a rapidly fatal form of primary brain cancer with few known prognostic factors. Major challenges of achieving complete patient follow-up, treatment heterogeneity, and changing patterns of patient care over time have limited the feasibility of genome-wide cancer survival discovery with very few such studies published for any cancer site (1) and none thus far for glioblastoma. Moreover, candidate gene studies for glioblastoma survival have provided equivocal results (2–9) possibly due to the factors above or to inadequate gene coverage. To minimize these challenges, we focused this first genome-wide discovery and validation study for glioblastoma patient survival on carefully selected glioblastoma patient groups with follow-up and initial treatment with current standard of care.

### Study subjects

Informed consent was obtained from each subject. The subject recruitment and studies were conducted after approval was obtained from the Institutional Review Boards at each participating site in accordance with assurances filed with and approved by the U.S. Department of Health and Human Services (10, 11).

#### Discovery study.

Details of subject ascertainment for the San Francisco Adult Glioma Study (AGS) have been previously described (10, 12, 13). The 315 glioblastoma patients in this study are the subset who had received current standard-of-care treatment (resection, radiation, and temozolomide) of the 525 glioblastoma patients whose results were used in the genome-wide association study reported by Wrensch and colleagues (10) after stringent sample quality control filtering. Among these patients, tumor characteristics [IDH1 (n = 173) and TP53 (n = 151) mutation status and EGFR copy number (n = 173)] were available from ongoing studies (14–16).

#### Validation study.

The Mayo Clinic study included 87 glioblastoma patients newly diagnosed between 2005 and 2008. Most cases were identified within 24 hours of diagnosis; some were initially diagnosed elsewhere and later had their diagnosis verified at the Mayo Clinic. Pathologic diagnosis was confirmed for all cases by 2 Mayo Clinic neuropathologists, who reviewed the primary surgically resected material.

The GliomaSE study (glioma patients recruited from several medical centers in the Southeastern United States) included glioblastoma patients enrolled in a case–control study conducted at medical centers in the Southeast and diagnosed with a primary (i.e., nonrecurrent) glioma between 2005 and 2010 (11). Patients were enrolled a median of 1 month following glioblastoma diagnosis (and a maximum of 4 months according to study protocol). The glioblastoma diagnosis was based on diagnostic pathology reports available for all patients in the study.

The Cancer Genome Atlas (TCGA) data set was downloaded from http://cancergenome.nih.gov/ (17). At the time of data retrieval from TCGA, alignment of sample identifiers yielded 181 glioblastoma patients with both genotype and clinical data, 115 of whom had resection, radiation, and temozolomide treatment. The subject IDs of these 115 TCGA patients are listed in Supplementary Table S1.

### Genotyping

Genotyping for the AGS discovery subjects was conducted by deCODE Genetics using Illumina's HumanCNV370-duo BeadChip as previously described (10). After excluding single-nucleotide polymorphisms (SNP) with P < $10^{-5}$ for Hardy–Weinberg equilibrium in the AGS control samples (AGS participants who did not have glioma), a minor allele frequency less than 5%, or more than 5% missing genotyping data in the case groups, there were 314,635 autosomal SNPs to consider in the survival tests. Genotyping for the Mayo Clinic study subjects was carried out with Illumina 610Quad SNP arrays as previously described (10). Genotyping for the GliomaSE study subjects was conducted with the Illumina Goldengate assay as previously described (11). Genotyping for the TCGA study subjects was conducted with the Illumina 550 platform (17).
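The three SNP exclusion criteria above can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the pipeline used in the study: the Hardy–Weinberg test here is a 1-degree-of-freedom chi-square test (the text does not say which test was used), the minor allele frequency is computed in the cases, and the function names `hwe_chi2_p` and `passes_qc` are hypothetical.

```python
import math

def hwe_chi2_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # For 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))

def passes_qc(case_genotypes, control_counts,
              maf_min=0.05, missing_max=0.05, hwe_p_min=1e-5):
    """case_genotypes: minor-allele counts (0, 1, 2) or None when the call is missing.
    control_counts: (n_aa, n_ab, n_bb) genotype counts in controls."""
    called = [g for g in case_genotypes if g is not None]
    if not called:
        return False
    missing_rate = 1 - len(called) / len(case_genotypes)
    freq = sum(called) / (2 * len(called))
    maf = min(freq, 1 - freq)                # assumption: MAF evaluated in cases
    return (maf >= maf_min
            and missing_rate <= missing_max
            and hwe_chi2_p(*control_counts) >= hwe_p_min)
```

A SNP is kept only if all three conditions hold; the thresholds default to the values quoted in the text.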

### Statistical analysis

Supplementary Fig. S1 provides an overview of the 3 types of analyses conducted: (i) genome-wide constitutive discovery and validation of SNPs associated with glioblastoma patient overall survival, (ii) functional validation of survival loci (association of gene expression in tumors with glioblastoma overall patient survival), and (iii) fine mapping via genotype imputation.

#### Genome-wide survival and validation analyses.

Due to human subject IRB constraints, analyses of the raw genotype data were carried out separately at the AGS, Mayo Clinic, and GliomaSE sites (TCGA data were analyzed at the AGS site). Summary statistics were then submitted to the AGS site for combined analysis. For the AGS discovery study, we fitted Cox proportional hazards regression models to assess the association of each of the 314,635 SNPs with overall survival, adjusting for age (on a continuous scale) and sex. The SNP variable in the model was coded as a count of the number of minor alleles (0, 1, or 2), following the additive genetic model. Per-allele HR and 95% CI were obtained for each SNP. Statistical significance for each SNP was assessed with the Wald test. The same Cox proportional hazards models were used for all ensuing analyses of the validation data sets. The genomic inflation factor based on the genome-wide P values for the AGS discovery study was 1.04, indicating that systematic inflation of our survival association signals due to model misspecification, undetected genotyping error, or hidden ancestry relationships was highly unlikely. The proportional hazards assumption for validated SNPs with a 4-site combined P ≤ $10^{-5}$ was tested within each site with Schoenfeld residuals, and SNPs with evidence for nonproportionality were removed from further consideration. Results for the nonproportionality test for rs7732320 are shown in Supplementary Table S2. Heterogeneity across the 4 studies for rs7732320 was assessed by Cochran's Q statistic (18). As no significant heterogeneity across study sites was observed, a fixed effect model, which weights the contribution of each study by the inverse of the variance of its study-specific log(HR) estimate (19), was used to summarize results across studies. Specifically,

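$$\hat \beta = \frac{\sum_i \hat \beta _i / \hat \nu _i}{\sum_i 1 / \hat \nu _i}, \qquad \widehat{\operatorname{Var}}(\hat \beta) = \frac{1}{\sum_i 1 / \hat \nu _i},$$

where $\hat \beta _i$ and $\hat \nu _i$ are the log HR estimate and its variance for the ith study, respectively.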
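As an illustration of the inverse-variance fixed effect combination and Cochran's Q, the following Python sketch pools study-specific HRs. It assumes each variance can be recovered from the reported 95% CI on the log scale, and it uses the three expression-study estimates reported in Table 3 as example inputs; the function names are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal

def log_hr_and_variance(hr, lower, upper):
    """Recover (beta_i, nu_i) from an HR and its 95% CI (assumes a Wald-type CI)."""
    beta = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * Z95)
    return beta, se ** 2

def fixed_effect(estimates):
    """estimates: list of (beta_i, nu_i); returns pooled HR, CI, Wald P, and Cochran's Q."""
    weights = [1 / nu for _, nu in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    p = math.erfc(abs(beta / se) / math.sqrt(2))   # two-sided Wald P value
    q = sum(w * (b - beta) ** 2 for w, (b, _) in zip(weights, estimates))
    return {
        "hr": math.exp(beta),
        "lower": math.exp(beta - Z95 * se),
        "upper": math.exp(beta + Z95 * se),
        "p": p,
        "q": q,   # compare with a chi-square on (number of studies - 1) df
    }

# HR (95% CI) for SSBP2 expression, all subtypes, as reported in Table 3.
studies = [(1.18, 1.01, 1.38), (1.48, 0.88, 2.51), (1.24, 1.05, 1.47)]
result = fixed_effect([log_hr_and_variance(*s) for s in studies])
print({k: round(v, 5) for k, v in result.items()})
```

With these rounded inputs the pooled estimate agrees with the combined HR of 1.22 (1.09–1.36) reported in Table 3, which suggests the sketch matches the described weighting; small differences in later decimals are expected because the inputs are rounded.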

#### Functional validation of survival loci.

To examine the association of candidate gene expression with survival, we assembled data from 619 primary glioblastoma samples from 3 published studies (20–22). The Lee and colleagues (20) data set described 218 glioblastoma expression samples, including 132 samples from 3 previously published data sets as well as 86 new samples, assembled into a single, unified data set with Affymetrix U133A. The Murat and colleagues (21) data set contains 75 glioblastoma expression samples using the Affymetrix U133. Normalized expression values using the standard RMA method for the Lee and Murat data sets were downloaded from the National Center for Biotechnology Information Gene Expression Omnibus database (GSE13041 and GSE7696). The TCGA data set (22) has 326 primary glioblastoma expression samples using the Affymetrix U133A expression platform. Transcriptional class labels were obtained from the TCGA Advanced Working Group (23). The updated labeling extends the original labeled set presented in Verhaak and colleagues (22) to previously unclassified samples. In total, we obtained 74 proneurals, 45 neurals, 93 mesenchymals, and 91 classicals. For each of the 3 expression data sets, we carried out age- and sex-adjusted, study-specific survival analyses using Cox models relating continuous gene expression data to patient survival and then combined the study-specific HR estimates with a fixed effect model using the inverse variance approach (19). Within the TCGA expression data set, we also conducted survival analyses stratified by expression subtype (proneural, neural, classical, and mesenchymal) using a Cox model with the same specification. As treatment data were either missing or incomplete for these patients, we did not restrict the tumor gene expression analyses to patients with the current standard of care.

#### Fine mapping via imputation.

Using MACH version 1.0.16 (24) and release 22 phase II CEU HapMap data, we imputed SNPs separately within each of the 3 studies with sufficient tagging SNPs (AGS, Mayo, and TCGA). MACH implements a Markov chain–based algorithm to infer possible pairs of haplotypes for each individual's genotypes (including untyped genotypes). We ran MACH with the default parameter values, with the number of iterations of the Markov chain set to 50 and the “greedy” option turned on. We then carried out study-specific Cox survival analysis with expected allele counts as the predictor for a total of 159 SNPs whose variance ratios were larger than 0.5 in all 3 studies, thereby excluding SNPs with poor-quality imputed genotypes. Meta-analysis of the imputed data was carried out in the same way as described above. To obtain survival signals independent of the most significant (imputed) SNP in the region, we included its expected counts in the Cox model as an additional covariate, along with the other covariates (age and sex). All analyses were conducted with the R statistical package.
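The expected allele count and the variance-ratio filter can be sketched in Python as follows. This is an illustration only, not MACH's code: it assumes the variance ratio is the observed variance of the expected allele counts divided by the variance expected under Hardy–Weinberg equilibrium, $2p(1-p)$, and the function names and example probabilities are hypothetical.

```python
from statistics import mean, pvariance

def expected_allele_count(p_het, p_hom_minor):
    """Expected number of minor alleles given posterior genotype probabilities."""
    return p_het + 2.0 * p_hom_minor

def variance_ratio(dosages):
    """Observed variance of dosages over the binomial variance 2p(1-p) (assumed definition)."""
    freq = mean(dosages) / 2.0               # estimated allele frequency
    expected_var = 2.0 * freq * (1.0 - freq)
    if expected_var == 0:
        return 0.0
    return pvariance(dosages) / expected_var

# Hypothetical posterior probabilities (P(het), P(hom minor)) for 4 individuals.
posteriors = [(0.05, 0.00), (0.90, 0.05), (0.80, 0.10), (0.10, 0.88)]
dosages = [expected_allele_count(h, m) for h, m in posteriors]
keep_snp = variance_ratio(dosages) > 0.5     # threshold quoted in the text
```

Confidently imputed genotypes give dosages close to 0, 1, or 2 and a ratio near 1, whereas uncertain imputation shrinks dosages toward the mean and pushes the ratio toward 0.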

Patient characteristics (age, sex, and median survival) for the 4 data sets (AGS, the Mayo Clinic, GliomaSE, and TCGA) are described in Table 1. The majority of the observed survival Cox regression P values for 314,635 SNPs from the AGS discovery data set conformed to the identity line in the Q-Q plot, whereas 90 SNPs showed significant deviation from expectation at P = $10^{-4}$ (Supplementary Fig. S2). We submitted these 90 SNPs for validation in Mayo Clinic patients, of which 78 passed quality control. Ten of these SNPs had P < $10^{-5}$ in the combined analysis using a fixed effect model (25). Examination of these 10 SNPs in 2 additional patient groups, GliomaSE and TCGA patients, yielded one SNP, rs7732320, that had a combined discovery and validation P < $10^{-5}$ for survival and had proportionality of hazards in all 4 data sets (Table 2 and Supplementary Table S3). The associations of this SNP with patient survival were in the same direction across the studies and had a combined validation P = 0.008 and a combined discovery and validation P = $1.3 \times 10^{-6}$. There was no evidence of heterogeneity of the HR estimates across the 4 studies (Table 2). Effect modification by age at diagnosis for rs7732320 was evaluated in the AGS discovery data by the significance of the interaction term between age at diagnosis and the SNP; no statistically significant interaction was detected. In the AGS discovery data, the median survival times for the 3 groups of patients with 0, 1, and 2 adverse alleles of rs7732320 were 17.8, 13.4, and 10.6 months, respectively.
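The genomic inflation factor, which summarizes how closely the genome-wide P values follow the Q-Q identity line, can be computed as in this Python sketch. It assumes the common definition (median 1-df chi-square statistic implied by the P values, divided by the median of the null 1-df chi-square distribution, about 0.4549); the text does not state which variant was used, and the P values below are simulated rather than study data.

```python
import random
from statistics import NormalDist, median

MEDIAN_CHI2_1DF = 0.4549364  # median of a chi-square distribution with 1 df

def genomic_inflation(p_values):
    """Lambda = median chi-square (1 df) implied by two-sided P values / null median."""
    nd = NormalDist()
    # A two-sided P value p corresponds to z = inv_cdf(p / 2); chi-square = z squared.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / MEDIAN_CHI2_1DF

# Simulated null P values in (0, 1]; real input would be the genome-wide P values.
random.seed(0)
null_p = [1 - random.random() for _ in range(20000)]
print(round(genomic_inflation(null_p), 3))
```

Under the null, the value is close to 1; values well above 1 would point to systematic inflation such as population stratification.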

Table 1.

Characteristics of glioblastoma patients used in discovery (University of California, San Francisco, 1997–2008) and validation sets (Mayo Clinic, GliomaSE, and TCGA)

| Characteristic | Discovery: AGS | Validation I: Mayo | Validation II: GliomaSE | Validation III: TCGA |
|---|---|---|---|---|
| Total: N (events/total); median survival (mo) | 270/315; 17.1 | 64/87; 16.3 | 137/232; 16.4 | 78/115; 17.8 |
| Age at diagnosis: median (interquartile range) | 55 (17.3) | 54 (15.0) | 59 (17.4) | 57 (18.0) |
| Age at diagnosis: HR (95% CI), Pᵃ | 1.03 (1.02–1.04), 7.3E-09 | 1.03 (1.01–1.08), 4.3E-03 | 1.02 (1.01–1.04), 3.0E-04 | 1.03 (1.01–1.05), 5.7E-04 |
| Sex, female: N (events/total); median survival (mo) | 92/101; 15.4 | 22/35; 16.3 | 55/85; 16.4 | 34/48; 20.4 |
| Sex, male: N (events/total); median survival (mo) | 178/214; 17.2 | 42/52; 16.8 | 82/146; 17.1 | 44/67; 16.5 |
| Sex: HR (95% CI), Pᵃ | 0.82 (0.64–1.05), 0.12 | 1.10 (0.66–1.85), 0.72 | 0.98 (0.73–1.33), 0.92 | 1.07 (0.68–1.68), 0.77 |
| Race, white: N (%) | 315 (100%) | 87 (100%) | 232 (100%) | 115 (100%) |

Abbreviations: GliomaSE, glioma patients recruited from several medical centers in Southeastern United States; TCGA, The Cancer Genome Atlas.

ᵃ P values from log-additive Cox Proportional Hazards model adjusted for age at diagnosis (on a continuous scale) and sex.

Table 2.

Association of rs7732320 genotype with overall survival for glioblastoma multiforme patients with initial standard of care (resection, radiation, and temozolomide) treatment

| SNP | Discovery AGS: HR (95% CI) | Discovery AGS: Pᵃ | Combined validation (3 sites): HR (95% CI) | Combined validation (3 sites): Pᵇ | Heterogeneity test (4 sites): Q | Heterogeneity test (4 sites): P | Combined statistics (4 sites): HR (95% CI) | Combined statistics (4 sites): Pᶜ |
|---|---|---|---|---|---|---|---|---|
| rs7732320 (SSBP2, Chr 5, MA = T, MAF = 0.11) | 1.80 (1.36–2.30) | 3.07E-05 | 1.48 (1.11–1.99) | 0.008 | 1.58 | 0.66 | 1.64 (1.34–2.00) | 1.30E-06 |

NOTE: SNP discovered in a genome-wide survival association study [University of California, San Francisco, 1997–2008, AGS (10)] and validated in 3 independent studies [Mayo Clinic (10), GliomaSE (11), and TCGA (17)] based on combined P < 1E-5.

Abbreviations: GliomaSE, glioma patients recruited from several medical centers in Southeastern United States; MA, minor allele; MAF, minor allele frequency.

ᵃ P values from log-additive Cox Proportional Hazards model adjusted for age at diagnosis (on a continuous scale) and sex.

ᵇ P values based on combining summary statistics from the 3 validation studies of Mayo, GliomaSE, and TCGA using a fixed effect model with inverse variance weights.

ᶜ P values based on combining summary statistics from all 4 study sites (AGS, Mayo, GliomaSE, and TCGA) using a fixed effect model with inverse variance weights.

Rs7732320 is located in the intronic region of SSBP2; we therefore investigated whether patient survival was associated with the transcript levels of SSBP2 among 619 patients in 3 publicly available glioblastoma gene expression data sets [Lee and colleagues (20), Murat and colleagues (21), and TCGA (22); see Methods and Supplementary Fig. S1]. We observed a strong and significant association of SSBP2 expression with poorer overall survival (HR, 1.22; 95% CI, 1.09–1.36; P = $5.3 \times 10^{-4}$), and the association was consistent across the 3 expression data sets (Table 3). No effect modification by age at diagnosis was found for the association of SSBP2 tumor expression with survival in any of the 3 expression data sets. In addition, among TCGA glioblastoma patients, the HR for patient survival associated with tumor SSBP2 expression was highest and statistically significant only among patients with the previously described (22) proneural signature (HR, 1.44; 95% CI, 1.10–1.89; P = 0.007; Table 3). Consistent with this finding, we found that proneural glioblastoma patients expressed the lowest amount of SSBP2 compared with the other subtypes (Wilcoxon P = $2.16 \times 10^{-12}$; Fig. 1A). Intriguingly, even though the overall survival for patients of the proneural subtype was not significantly different from the other gene expression subtypes (log-rank P = 0.21; Fig. 1B), significant survival differences were observed for the proneural SSBP2-negative patients (Fig. 1C), arbitrarily defined as the subset of patients with SSBP2 expression below the 25th percentile of the proneural group. We observed significantly better survival for proneural SSBP2-negative patients (median survival time, 28.8 months) than proneural SSBP2-positive patients (median survival time, 12.4 months) and all other nonproneural glioblastoma patients (median survival time, 13.8 months). Proneural SSBP2-negative status remained a significant prognostic factor for longer survival (Cox P = $9.7 \times 10^{-3}$) in Cox multivariate analysis after adjusting for patient age at diagnosis and sex.
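The quartile-based grouping and the median survival comparison can be sketched in Python. This is a minimal illustration with made-up numbers, not the study's analysis: it assumes the 25th percentile cutoff as computed by Python's `statistics.quantiles` (the percentile method used in the study is not stated) and a plain Kaplan–Meier estimator that takes the median as the first time the survival curve reaches 0.5 or below. The function names are hypothetical.

```python
from fractions import Fraction
from statistics import quantiles

def lowest_quartile_flags(expression):
    """True for samples whose expression is below the 25th percentile of the group."""
    cutoff = quantiles(expression, n=4)[0]
    return [value < cutoff for value in expression]

def km_median(times, events):
    """Kaplan-Meier median survival; events are 1 (death) or 0 (censored).
    Returns None if the estimated survival never drops to 0.5."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = Fraction(1)                 # exact arithmetic avoids rounding at 0.5
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:   # handle tied times together
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= removed
    return None

# Made-up example: expression values with survival times (months) and event flags.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
flags = lowest_quartile_flags(expression)
print(flags, km_median([2, 3, 4, 5], [1, 0, 1, 1]))
```

In practice each group defined by the flags would get its own `km_median` call, and group differences would be tested with a log-rank test or a Cox model as described in the text.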

Figure 1.

A, Boxplots of SSBP2 tumor expression by previously assigned TCGA expression groups in 303 glioblastomas: C, classical; M, mesenchymal; N, neural; and P, proneural. B, Kaplan–Meier survival curves for the 4 TCGA expression groups. C, Kaplan–Meier survival curves based on SSBP2 expression and TCGA expression groups. The “Proneural SSBP2-” group is the subset of 20 patients with SSBP2 expression below the 25th percentile of the proneural group, compared with the rest of the TCGA patients.

Table 3.

Association of gene expression and survival in glioblastoma multiforme cases with data from 3 different sources

| SSBP2 expression | Lee et al. (20), N = 218: HR (95% CI) | Lee et al.: P | Murat et al. (21), N = 75: HR (95% CI) | Murat et al.: P | TCGA (22), N = 326: HR (95% CI) | TCGA: P | TCGA: Events/N | TCGA: MST | Combined, N = 619: HR (95% CI) | Combined: P |
|---|---|---|---|---|---|---|---|---|---|---|
| All subtypes | 1.18 (1.01–1.38) | 0.034 | 1.48 (0.88–2.51) | 0.14 | 1.24 (1.05–1.47) | 0.013 | | | 1.22 (1.09–1.36) | 0.00053 |
| Proneural | | | | | 1.44 (1.10–1.89) | 0.007 | 65/74 | 14.7 (11.3–23.0) | | |
| Neural | | | | | 1.19 (0.58–2.46) | 0.63 | 40/45 | 14.3 (10.7–19.8) | | |
| Mesenchymal | | | | | 1.27 (0.77–2.07) | 0.35 | 86/93 | 11.9 (10.4–15.4) | | |
| Classical | | | | | 1.25 (0.72–2.17) | 0.43 | 78/91 | 13.9 (12.1–17.6) | | |

Abbreviations: MST, median survival time (in mo).

The proneural expression subtype has recently been linked to a subset of tumors exhibiting a glioma-CpG island methylator phenotype (G-CIMP; ref. 26). To understand the relationship between SSBP2 and the G-CIMP signature, we compared the SSBP2 genotype and tumor expression in the set of TCGA glioblastoma samples with available G-CIMP status. Of the 241 TCGA samples with concomitant tumor expression and G-CIMP information, 24 were G-CIMP positive and they expressed a much lower level of SSBP2 than the 217 G-CIMP–negative tumors (Wilcoxon P = $3.54 \times 10^{-4}$). Of the 151 TCGA samples with attendant SSBP2 genotype and G-CIMP information, 2 out of the 16 (12.5%) G-CIMP–positive glioblastoma patients belonged to the group with at least one copy of the adverse allele T, in contrast to a much higher proportion (28.4%, 38 of 135) in the G-CIMP–negative glioblastomas. Because of small sample sizes, validating the relationship between SSBP2 genotype, expression and G-CIMP status will require further studies.

Interestingly, IDH1 mutation status was not found to be associated with the SSBP2 genotype in either the AGS or the TCGA data set, nor was it linked to SSBP2 tumor expression in the TCGA data set. For TP53 mutation, we detected an increased frequency of the risk allele T of SSBP2 in TP53-mutated glioblastoma patients (OR, 2.35; 95% CI, 1.06–5.19; P = 0.03) in the TCGA data set. However, this association was not found in the AGS data set. Next, to conduct a multivariate analysis incorporating both patient genotypes and tumor markers that are related to survival in glioblastoma patients, we used the AGS data set, in which 143 of the 315 patients with standard-of-care treatment had data on TP53 and IDH1 mutation status and EGFR amplification. Unfortunately, only 35 of the 115 TCGA patients with standard-of-care treatment had both IDH1 and TP53 mutation data. In a Cox multivariate regression including age, sex, IDH1 mutation status, EGFR copy number, TP53 mutation, and SSBP2 rs7732320 genotype, SSBP2 genotype remained an independent predictor of poorer survival (HR, 1.99; 95% CI, 1.32–3.00; P = 0.001; n = 143).

Taken together, the findings above present a consistent picture: both the adverse SSBP2 inherited variant and increased SSBP2 expression in tumors are associated with shorter survival time in glioblastoma patients, and the relationship is most evident among patients with the proneural expression signature. A test for statistical interaction between the SSBP2 SNP rs7732320 and SSBP2 tumor expression was carried out in the TCGA data set by including the cross-product term in the Cox model and was assessed with the likelihood ratio test. No statistically significant interaction effect (P = 0.66) was observed.

To further localize the association with survival in the 5q14.1 region around rs7732320, we imputed nongenotyped SNPs in the entire genomic locus of SSBP2 with a 100 kb extension at its 3′ end, from 80,680,000 to 80,980,000 on chromosome 5. The HapMap II CEU data set (27) contained 217 SNPs in this region (the AGS data set had 31 SNPs). Out of the 186 (217 minus 31) imputed SNPs, 159 had good imputation quality for AGS, Mayo, and TCGA. Meta-analysis using a fixed effect model to combine study-specific HR estimates from age- and sex-adjusted Cox models showed a genome-wide statistically significant association of patient survival with the SNP rs17296479 (P = $1.0 \times 10^{-7}$; see Fig. 2 and Supplementary Table S4), which is located approximately 8 kb centromeric of rs7732320. Two SNPs, rs12187089 and rs11738172, located between these 2 markers, also displayed strong associations with patient survival, with P = $1.2 \times 10^{-7}$ and $2.3 \times 10^{-7}$, respectively. These 4 SNPs are highly linked with each other ($r^2 > 0.8$). The smallest combined nominal P value from multivariate Cox models of patient survival with the remaining SNPs, adjusting for rs17296479 genotype, was 0.061, suggesting that there were no residual independent survival signals remaining.

Figure 2.

Association of genetic variants near SSBP2 with survival using data from uniformly treated glioblastoma patients. We used data from the San Francisco AGS, the Mayo Clinic, and The Cancer Genome Atlas studies for imputation. Evidence for association at each SNP, measured as the combined -log10P value, is represented along the y-axis. The x-axis represents the placement of each SNP on chromosome 5 in genome build 36. Results for directly genotyped SNPs are marked with squares, and imputed SNPs with triangles. Association results are superimposed on a black line that summarizes the local recombination rate map. The upper panel indicates known RefSeq and mRNA coding sequences in the region.


Major strengths of this study include: (i) a large group of glioblastoma patients in the discovery study (AGS) with initial standard-of-care treatment of resection, radiation, and temozolomide; (ii) three independent validation studies restricted to patients also treated with standard of care; (iii) direct functional analysis of tumor gene expression at discovered loci at different levels; and (iv) imputation to localize the SNPs most strongly associated with patient survival. Limitations of this study include the lack of detailed temozolomide dosing or timing information, and the fact that subsequent treatments at patient relapse are not included in the analysis. Another limitation is that tumor expression data were not available for most of the patients for whom constitutive genotyping data were available, but TCGA data did provide one group of patients with both tumor expression and constitutive genotyping. Recently, Colman and colleagues (28) found an approximately 3-fold HR for overall survival for glioblastoma associated with a 9-gene tumor expression signature among patients treated with temozolomide. In our analysis, we have identified a distinct subset of proneural patients with low SSBP2 expression with a median survival time that was more than twice as long as that of the other glioblastoma patients. In addition, the SSBP2 risk allele conferred a 1.64-fold increase in rate of death. As our survival analyses were done using a single SNP covariate, the inclusion of additional SNPs in combination with tumor markers may lead to improved prognostic ability. Such an undertaking is an important future direction for research.

Despite assembling the largest sample size yet available of standard-of-care–treated primary glioblastoma patients with genome-wide SNPs and survival data, our study is still exploratory with relatively small sample size compared with case–control genome-wide studies. The observed associations between the SSBP2 SNP and glioblastoma patient overall survival did not reach nominal genome-wide significance in the discovery study. However, genotype imputation identified an untyped SNP (rs17296479) in SSBP2 achieving genome-wide significance (Bonferroni corrected P = $1.0 \times 10^{-7} \times 314{,}635 = 0.03$). Nevertheless, preventing false positive discoveries is a pertinent issue in such a large-scale study involving so many statistical tests. Consequently, we sought additional functional validation of the discovered loci by assessing their tumor gene expression association with survival. We believe these additional exercises improved our chances of deriving results that can be replicated in future studies as well as inform future functional studies.
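The Bonferroni arithmetic quoted above can be checked with a few lines of Python. The sketch assumes a family-wise significance level of 0.05, which the text implies but does not state, and the helper name `bonferroni` is hypothetical.

```python
N_TESTS = 314_635   # number of autosomal SNPs tested in the discovery scan
ALPHA = 0.05        # assumed family-wise significance level

def bonferroni(p, n_tests=N_TESTS):
    """Bonferroni-adjusted P value, capped at 1."""
    return min(1.0, p * n_tests)

per_test_threshold = ALPHA / N_TESTS          # roughly 1.6e-7

adjusted_imputed = bonferroni(1.0e-7)         # rs17296479, imputed meta-analysis
adjusted_genotyped = bonferroni(1.3e-6)       # rs7732320, 4-site combined P
print(round(adjusted_imputed, 2), round(adjusted_genotyped, 2), per_test_threshold)
```

The imputed SNP's adjusted P value falls below 0.05, whereas the directly genotyped rs7732320 (combined P = $1.3 \times 10^{-6}$) does not, which is consistent with the statement that only the imputed SNP reached genome-wide significance.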

We report here persuasive evidence for the genotypic and transcriptional association of the SSBP2 locus with patient survival. However, establishing the nature of the regulatory relationship between the 2 awaits further in-depth experimental investigation. It is also possible that the variant is associated with the natural history of the disease, leading to differences in time of diagnosis for carriers versus noncarriers. As yet, the variant has not been associated with glioma risk.

Using imputation for fine mapping, we identified 4 linked SNPs (rs17296479, rs12187089, rs11738172, and rs7732320), spanning approximately 12 kb at the 3′ end of SSBP2, that are strongly associated with patient survival. Although all 4 SNPs are noncoding, their immediate proximity to the gene and the ample evidence for epigenetic modifications within the region support a possible role in transcriptional regulation of SSBP2. First, the histone methylation marker H3K4Me1 for enhancer elements has a broad peak encompassing 3 of the 4 variants (see Supplementary Fig. S3). Second, there are 3 unannotated human transcripts (AK024171, AK054959, and CR608789) located in the same region, just downstream of SSBP2, suggestive of a transcriptionally active genomic interval. Last and most importantly, the direct functional evidence relating the variant rs7732320 to SSBP2 expression in glioblastomas and the unequivocal associations of patient survival with SSBP2 inherited variants and SSBP2 expression levels in tumors point to a cis effect of the variant(s), with disruption of the transcriptional control of SSBP2 as the likely functional mechanism. The genotyped and imputed variants could either tag the principal association with survival attributable to this 5q14.1 locus or they themselves could be the principal culprits. Comprehensive resequencing efforts and further functional analysis will be required to unambiguously identify the causal variants.

As further evidence of the biologic plausibility of these findings, SSBP2 has been reported to be involved in the maintenance of genome stability (29) and has been implicated in transcriptional signatures in several cancers including leukemia (30), pancreatic cancer (31), oligodendroglial tumors (32), and esophageal squamous cell carcinoma (29). A direct confirmation of the link between SSBP2 and survival in brain cancer is further proffered by Shaw and colleagues (32), in which the expression of SSBP2 was shown to be associated with response to chemotherapy in patients with oligodendroglial tumors. Evidence that the genotypic association of SSBP2 with patient survival seems to be independent of tumor IDH1 mutation status and strongest among patients with a proneural/G-CIMP expression signature suggests SSBP2 may contribute to glioblastoma pathogenesis.

With further confirmation, these previously unrecognized inherited variations influencing survival may warrant inclusion in clinical trials to improve randomization and validate new therapeutic approaches. The genes identified here by SNP tags may represent potential targets for developing new drug therapies.

M.S. Berger: consultant/advisory board, IVIVI Health Services and Pharmaco-Kinesis Corp. L.B. Nabors has an uncompensated position at Merck KGaA. The other authors disclosed no potential conflicts of interest.

Conception and design: Y. Xiao, J.L. Wiemels, R.C. Thompson, L.B. Nabors, J.K. Wiencke, R.B. Jenkins, M.R. Wrensch

Development of methodology: Y. Xiao, J.L. Wiemels, M.R. Wrensch

Acquisition of data: T. Rice, H.M. Hansen, J.L. Wiemels, M.D. Prados, S.M. Chang, D.H. LaChance, R.C. Thompson, L.B. Nabors, J.J. Olson, S. Brem, J.K. Wiencke, K.M. Egan, R.B. Jenkins

Analysis and interpretation of data: Y. Xiao, P.A. Decker, T. Rice, T. Tihan, M.L. Kosel, B.L. Fridley, B.P. O'Neill, J.E. Browning, K.M. Egan, R.B. Jenkins, M.R. Wrensch

Writing, review, and/or revision of the manuscript: Y. Xiao, P.A. Decker, L.S. McCoy, H.M. Hansen, J.L. Wiemels, M.D. Prados, S.M. Chang, M.S. Berger, D.H. LaChance, B.P. O'Neill, J.C. Buckner, R.C. Thompson, J.J. Olson, S. Brem, M.H. Madden, J.E. Browning, J.K. Wiencke, K.M. Egan, M.R. Wrensch

Administrative, technical, or material support: P.A. Decker, T. Rice, I. Smirnov, J. Patoka, T. Tihan, D.H. LaChance, B.P. O'Neill, R.C. Thompson, L.B. Nabors, J.J. Olson, M.H. Madden, J.E. Browning, K.M. Egan, R.B. Jenkins

Study supervision: B.L. Fridley, R.C. Thompson, J.J. Olson, M.H. Madden, K.M. Egan, R.B. Jenkins, M.R. Wrensch

The authors thank study participants, the clinicians, and research staffs at participating medical centers, Kenneth Aldape, deCODE genetics, the late Dr. Bernd Scheithauer, Dr. Caterina Gianinni, the Mayo Clinic Comprehensive Cancer Center Biospecimens and Processing, Celia Sigua, Marek Wloch, and Ms. Anna Konidari.

The ideas and opinions expressed herein are those of the author(s) and endorsement by the State of California Department of Public Health, the National Cancer Institute, and the Centers for Disease Control and Prevention or their Contractors and Subcontractors is not intended nor should be inferred. The results published here are in part based upon data generated by The Cancer Genome Atlas pilot project established by the National Cancer Institute and National Human Genome Research Institute. Information about TCGA and the investigators and institutions that constitute the TCGA research network can be found at “http://cancergenome.nih.gov”.

The work at the University of California, San Francisco was supported by the NIH (grant numbers R01CA52689 and P50CA097257), as well as the National Brain Tumor Foundation, the UCSF Lewis Chair in Brain Tumor Research, and by donations from families and friends of John Berardi, Helen Glaser, Elvera Olsen, Raymond E. Cooper, and William Martinusen. The work at the Mayo Clinic was supported by the NIH (grant numbers P50CA108961 and P30 CA15083) and the Bernie and Edith Waterman Foundation. The work at Moffitt Cancer Center for the GliomaSE study was supported by the NIH (CAR01116174) as well as institutional funding from the Moffitt Cancer Center, Tampa, FL and the Vanderbilt-Ingram Comprehensive Cancer Center, Nashville, TN. The collection of cancer incidence data used in this study was supported by the California Department of Public Health as part of the statewide cancer reporting program mandated by California Health and Safety Code Section 103885; the National Cancer Institute's Surveillance, Epidemiology and End Results Program under contract HHSN261201000036C awarded to the Cancer Prevention Institute of California, contract HHSN261201000035C awarded to the University of Southern California, and contract HHSN261201000034C awarded to the Public Health Institute; and the Centers for Disease Control and Prevention's National Program of Cancer Registries, under agreement #1U58 DP000807-01 awarded to the Public Health Institute.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Wu X, Ye Y, Rosell R, Amos CI, Stewart DJ, Hildebrandt MA, et al. Genome-wide association study of survival in non-small cell lung cancer patients receiving platinum-based chemotherapy. J Natl Cancer Inst 2011;103:817–25.
2. U, Osterman P, Sjostrom S, Johansen C, Henriksson R, Brannstrom T, et al. MNS16A minisatellite genotypes in relation to risk of glioma and meningioma and to glioblastoma outcome. Int J Cancer 2009;125:968–72.
3. Bhowmick DA, Zhuang Z, Wait SD, Weil RJ. A functional polymorphism in the EGF gene is found with increased frequency in glioblastoma multiforme patients and is associated with more aggressive disease. Cancer Res 2004;64:1220–3.
4. Liu Y, Shete S, Etzel CJ, Scheurer M, Alexiou G, Armstrong G, et al. Polymorphisms of LIG4, BTBD2, HMGA2, and RTEL1 genes involved in the double-strand break repair pathway predict glioblastoma survival. J Clin Oncol 2010;28:2467–74.
5. Okcu MF, Selvan M, Wang LE, Stout L, Erana R, Airewele G, et al. Glutathione S-transferase polymorphisms and survival in primary malignant glioma. Clin Cancer Res 2004;10:2618–25.
6. Scheurer ME, Amirian E, Cao Y, Gilbert MR, Aldape KD, Kornguth DG, et al. Polymorphisms in the interleukin-4 receptor gene are associated with better survival in patients with glioblastoma. Clin Cancer Res 2008;14:6640–6.
7. Sjostrom S, U, Liu Y, Brannstrom T, Broholm H, Johansen C, et al. Genetic variations in EGF and EGFR and glioblastoma outcome. Neuro Oncol 2010;12:815–21.
8. Sjostrom S, Wibom C, U, Brannstrom T, Broholm H, Johansen C, et al. Genetic variations in VEGF and VEGFR2 and glioblastoma outcome. J Neurooncol 2010;104:523–7.
9. Wang L, Wei Q, Wang LE, Aldape KD, Cao Y, Okcu MF, et al. Survival prediction in patients with glioblastoma multiforme by human telomerase genetic variation. J Clin Oncol 2006;24:1627–32.
10. Wrensch M, Jenkins RB, Chang JS, Yeh RF, Xiao Y, Decker PA, et al. Variants in the CDKN2B and RTEL1 regions are associated with high-grade glioma susceptibility. Nat Genet 2009;41:905–8.
11. Egan KM, Thompson RC, Nabors LB, Olson JJ, Brat DJ, Larocca RV, et al. Cancer susceptibility variants and the risk of adult glioma in a US case-control study. J Neurooncol 2011;104:535–42.
12. Felini MJ, Olshan AF, Schroeder JC, Carozza SE, Miike R, Rice T, et al. Reproductive factors and hormone use and risk of adult gliomas. Cancer Causes Control 2009;20:87–96.
13. Wrensch M, McMillan A, Wiencke J, Wiemels J, Kelsey K, Patoka J, et al. Nonsynonymous coding single-nucleotide polymorphisms spanning the genome in relation to glioblastoma survival and age at diagnosis. Clin Cancer Res 2007;13:197–205.
14. Christensen BC, Smith AA, Zheng S, Koestler DC, Houseman EA, Marsit CJ, et al. DNA methylation, isocitrate dehydrogenase mutation, and survival in glioma. J Natl Cancer Inst 2011;103:143–53.
15. Wiencke JK, Aldape K, McMillan A, Wiemels J, M, Miike R, et al. Molecular features of adult glioma associated with patient race/ethnicity, age, and a polymorphism in O6-methylguanine-DNA-methyltransferase. Cancer Epidemiol Biomarkers Prev 2005;14:1774–83.
16. Wrensch M, Wiencke JK, Wiemels J, Miike R, Patoka J, M, et al. Serum IgE, tumor epidermal growth factor receptor expression, and inherited polymorphisms associated with glioma survival. Cancer Res 2006;66:4531–41.
17. Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 2008;455:1061–8.
18. Cochran WG. The combination of estimates from different experiments. Biometrics. Washington, DC: International Biometric Society; 1954. p. 101–29.
19. Normand SL. Meta-analysis: formulating, evaluating, combining, and reporting. Stat Med 1999;18:321–59.
20. Lee Y, Scheck AC, Cloughesy TF, Lai A, Dong J, Farooqi HK, et al. Gene expression analysis of glioblastomas identifies the major molecular basis for the prognostic benefit of younger age. BMC Med Genomics 2008;1:52.
21. Murat A, Migliavacca E, Gorlia T, Lambiv WL, Shay T, Hamou MF, et al. Stem cell-related "self-renewal" signature and high epidermal growth factor receptor expression associated with resistance to concomitant chemoradiotherapy in glioblastoma. J Clin Oncol 2008;26:3015–24.
22. Verhaak RG, KA, Purdom E, Wang V, Qi Y, Wilkerson MD, et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 2010;17:98–110.
23. Huse JT, Phillips HS, Brennan CW. Molecular subclassification of diffuse gliomas: seeing order in the chaos. Glia 2011;59:1190–9.
24. Li Y, Willer CJ, Ding J, Scheet P, Abecasis GR. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes. Genet Epidemiol 2010;34:816–34.
25. Parmar MK, Torri V, Stewart L. Extracting summary statistics to perform meta-analyses of the published literature for survival endpoints. Stat Med 1998;17:2815–34.
26. Noushmehr H, Weisenberger DJ, Diefes K, Phillips HS, Pujara K, Berman BP, et al. Identification of a CpG island methylator phenotype that defines a distinct subgroup of glioma. Cancer Cell 2010;17:510–22.
27. Sabeti PC, Varilly P, Fry B, Lohmueller J, Hostetter E, Cotsapas C, et al. Genome-wide detection and characterization of positive selection in human populations. Nature 2007;449:913–8.
28. Colman H, Zhang L, Sulman EP, McDonald JM, Shooshtari NL, Rivera A, et al. A multigene predictor of outcome in glioblastoma. Neuro Oncol 2010;12:49–57.
29. Huang Y, Chang X, Lee J, Cho YG, Zhong X, Park IS, et al. Cigarette smoke induces promoter methylation of single-stranded DNA-binding protein 2 in human esophageal squamous cell carcinoma. Int J Cancer 2011;128:2261–73.
30. Liang H, Samanta S, Nagarajan L. SSBP2, a candidate tumor suppressor gene, induces growth arrest and differentiation of myeloid leukemia cells. Oncogene 2005;24:2625–34.
31. Baine MJ, Chakraborty S, Smith LM, Mallya K, Sasson AR, Brand RE, et al. Transcriptional profiling of peripheral blood mononuclear cells in pancreatic cancer patients identifies novel genes with potential diagnostic utility. PLoS One 2011;6:e17014.
32. Shaw EJ, Haylock B, Husband D, du Plessis D, Sibson DR, Warnke PC, et al. Gene expression in oligodendroglial tumors. Anal Cell Pathol (Amst) 2010;33:81–94.