Background:

A previous genome-wide association study identified several loci with genetic variants associated with prostate cancer survival time in two cohorts from Sweden. Whether these variants have an effect in other populations, and whether their effect is homogeneous across the course of disease, is unknown.

Methods:

These variants were genotyped in a cohort of 1,298 patients. Samples were linked with age, PSA level, Gleason score, and cancer stage at surgery, and with times from surgery to biochemical recurrence and to death from prostate cancer. SNPs rs2702185 and rs73055188 were tested for association with prostate cancer–specific survival time using a multivariate Cox proportional hazards model. SNP rs2702185 was further tested for association with time to biochemical recurrence and time from biochemical recurrence to death with a multi-state model.

Results:

SNP rs2702185 at SMG7 was associated with prostate cancer–specific survival time, specifically the time from biochemical recurrence to prostate cancer death (HR, 2.5; 95% confidence interval, 1.4–4.5; P = 0.0014). Nine variants were in linkage disequilibrium (LD) with rs2702185; one, rs10737246, was found to be most likely to be functional based on LD patterns and overlap with open chromatin. Patterns of open chromatin and correlation with gene expression suggest that this SNP may affect expression of SMG7 in T cells.

Conclusions:

The SNP rs2702185 at the SMG7 locus is associated with time from biochemical recurrence to prostate cancer death, and its LD partner rs10737246 is predicted to be functional.

Impact:

These results suggest that future association studies of prostate cancer survival should consider various intervals over the course of disease.

Prostate cancer is the second leading cause of cancer-related death among men in the United States, though most men diagnosed with prostate cancer will not die from it (1). This apparent paradox is due to the relatively indolent nature of many prostate cancers. Left untreated, most prostate tumors will grow slowly and not be a clinical threat to a man's health. However, for a subset of tumors, progression to metastatic disease, cytotoxic and antiandrogen treatment, and death occur. This has led to a contentious debate on the value of screening for prostate cancer. Although it has been widely accepted that prostate-specific antigen (PSA) screening does reduce prostate cancer mortality, it remains an open question whether current screening modalities result in more harm than good due to the adverse effects of treatment compared with the number of fatal cancers prevented (2–4).

Over the past decade, one approach to improve identification of men at increased risk for prostate cancer has been germline genomic studies. These studies rest on the observation that prostate cancer is highly heritable (5, 6). By one measure, such studies have been highly successful; a recent study notes over 269 independent variants that influence the risk of prostate cancer (7). When these variants are combined into a polygenic risk score, men in the highest 10% of risk are 5 times more likely to be diagnosed with prostate cancer than men at average risk (7). However, these variants generally are not associated with prostate cancer survival after diagnosis of disease (8, 9). Therefore, incorporation of these variants into screening algorithms could actually exacerbate overdiagnosis.

The extent to which inherited genetic variation influences prostate cancer outcome in men who are diagnosed with disease is unclear. Prostate cancer outcome, like prostate cancer risk, shows familial aggregation, suggesting a heritable component (10, 11). Recently, we conducted a genome-wide association study of prostate cancer survival and identified several loci for which there was evidence that genetic variation influences prostate cancer survival time (12). One locus, tagged by the SNP rs73055188 at the AOX1 gene, reached a strict threshold for genome-wide significance; a second locus, tagged by the SNP rs2702185 at the SMG7 gene, was also strongly associated with survival time.

This previous study was limited by its exclusive examination of individuals from Sweden. Furthermore, it is not clear whether these SNPs influence survival across the entire course of the disease or only at a particular stage. To examine these questions, we looked at the association of these SNPs with prostate cancer survival in a well-annotated, hospital-based cohort of prostate cancer cases (8, 9, 13, 14). We demonstrate that rs2702185 is associated with prostate cancer survival specifically by altering the time from biochemical recurrence to death.

Patient cohort

Our cohort consists of 1,298 individuals who underwent radical prostatectomy at Memorial Sloan Kettering Cancer Center (MSKCC) from January 1990 through January 2006, and from whom de-identified DNA samples are available (8, 9, 13, 14). This study was approved by the MSKCC Institutional Review Board in accordance with the U.S. Common Rule. Consistent with the patient population at MSKCC, this cohort is primarily of European ancestry. Over half of the men self-identify as Ashkenazi Jewish (AJ), which we define as reporting all four grandparents as being Jewish and from Eastern Europe. These data were linked with abstracted clinical records, which record, among other variables, age, PSA level, Gleason score and cancer stage at surgery as well as time from surgery to endpoints including biochemical recurrence and death from prostate cancer.

SNP genotyping

DNA previously extracted from blood was used in these analyses (13). To genotype rs2702185 and rs73055188, we used predesigned TaqMan assays (ref. 15; Thermo Fisher Scientific Inc, MA). A master mix dilution was prepared and 3 μL was aliquoted to a 384-well plate using a multichannel pipette. Two μL of high molecular weight DNA diluted to 3 ng/μL was added to the plate, spun down, and subjected to thermocycling for 40 cycles on an ABI GeneAmp PCR System 9600. The fluorescent signal was quantified and alleles were assigned using the ABI Prism 7900HT instrument and SDS software. Allele frequencies were calculated and Hardy–Weinberg equilibrium was checked for quality control.
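The allele-frequency and Hardy–Weinberg check described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not say which test was used, so a 1-degree-of-freedom chi-square test is assumed here, and the function name `hwe_chi_square` is hypothetical. The example counts are the non-AJ rs2702185 genotype counts (CC, CT, TT) from Table 1.

```python
import math


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium from genotype counts.

    Returns (minor allele frequency, chi-square statistic, p-value).
    """
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1.0 - p                              # minor allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return q, chi2, p_value


# Non-AJ genotype counts for rs2702185 (CC, CT, TT), taken from Table 1
maf, chi2, p_value = hwe_chi_square(494, 75, 3)
print(f"MAF = {maf:.3f}, chi2 = {chi2:.4f}, P = {p_value:.3f}")
```

With an expected count below 5 for the rare homozygote, the chi-square approximation is rough, and an exact test would be a reasonable alternative in practice.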

Association with prostate cancer survival

The two SNPs rs2702185 and rs73055188 were tested for association with prostate cancer–specific survival using a Cox proportional hazards model (16), fitted with the R package survival (17). Survival time was measured from the date of prostate cancer surgery until the date of prostate cancer–specific death or last follow-up (mean follow-up, 10 years). Data on death from other causes were not available in this cohort, which is based at a cancer care hospital. Genotypes were coded under an additive model, and a log-additive effect (log HR per additional minor allele) was estimated. Interaction effects between each SNP and study population (AJ and non-AJ) were included to capture population-specific effects. Stratified analyses of the AJ and non-AJ populations were also performed to evaluate the sensitivity of the estimated parameters. Covariates adjusted for include age, PSA level, Gleason score category (<7, 7, and 8–10), and World Health Organization stage category (1–2, 3–4) at surgery. Associations with P < 0.05 in the Wald test and an HR in the same direction as in the discovery cohort were considered validated. Kaplan–Meier curves for the validated associations were plotted, and P values from the log-rank test were reported.
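To make the additive (log-additive) coding concrete, the following is a minimal sketch of a one-covariate Cox partial-likelihood fit in plain Python. It is not the analysis pipeline used in the study, which relied on the R package survival and adjusted for clinical covariates; the function names, the Breslow handling of ties, and the toy data are all assumptions made for illustration.

```python
import math


def score_and_information(beta, times, events, x):
    """Score and observed information of the Cox partial likelihood (Breslow ties)."""
    score, info = 0.0, 0.0
    for t_i, d_i, x_i in zip(times, events, x):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        mean = s1 / s0
        score += x_i - mean
        info += s2 / s0 - mean * mean
    return score, info


def fit_cox_additive(times, events, x, max_iter=50, tol=1e-10):
    """Newton-Raphson estimate of the log HR per additional minor allele."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, events, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta, times, events, x)
    return beta, 1.0 / math.sqrt(info)  # estimate and its standard error


# Hypothetical toy data: years from surgery, event indicator
# (1 = prostate cancer death, 0 = censored), and minor-allele count (0, 1, 2)
times = [2, 3, 5, 7, 8, 10, 12, 15]
events = [1, 1, 0, 1, 1, 0, 1, 0]
minor_alleles = [1, 0, 1, 2, 0, 0, 1, 0]

beta, se = fit_cox_additive(times, events, minor_alleles)
hazard_ratio = math.exp(beta)  # per-allele HR
wald_z = beta / se             # Wald statistic used for the P value
print(f"per-allele HR = {hazard_ratio:.2f}, Wald z = {wald_z:.2f}")
```

Coding the genotype as an allele count means a single coefficient describes the change in log hazard for each additional minor allele, which is what "per-allele HR" refers to in the results.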

Decomposing the effects of prostate cancer survival associated SNP(s) on two stages of prostate cancer progression

To test whether the SNP rs2702185 influences different stages of prostate cancer progression in the non-AJ population, we divided disease progression into two distinct time intervals: from surgery to biochemical recurrence, and from biochemical recurrence to prostate cancer–specific death (Fig. 1). Biochemical recurrence was defined as PSA of ≥0.2 ng/mL after radical prostatectomy and a value of “nadir + 2” after other therapy (14). This model assumes that patients who died of prostate cancer previously experienced biochemical recurrence. We tested for SNP association with these intervals using a multi-state model under an additive model. We considered three stages: surgery (stage 1), biochemical recurrence (stage 2), and death from disease (stage 3). We defined the transition matrix such that diagnosed patients could only transition to death from disease by passing through biochemical recurrence. Age, PSA level, Gleason score, and stage at surgery were included as covariates.
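The three-stage structure described above can be sketched as a data-preparation step that expands each patient into one row per allowed transition. This is an illustrative sketch only: the record fields (`bcr_time`, `death_time`, `last_followup`) and the function name are hypothetical, times are assumed to be in years from surgery, and rejecting a death without a recorded recurrence is an assumption about how the stated model constraint might be enforced.

```python
def to_transition_rows(patient):
    """Expand one patient record into rows for the allowed transitions.

    Allowed transitions: surgery (1) -> biochemical recurrence (2) -> death (3).
    A direct 1 -> 3 transition is not permitted by the model.
    """
    bcr = patient["bcr_time"]        # None if no biochemical recurrence
    death = patient["death_time"]    # None if no prostate cancer death
    end = patient["last_followup"]   # last observed time

    if death is not None and bcr is None:
        raise ValueError("death from disease requires prior biochemical recurrence")

    if bcr is None:
        # Censored for the first transition; never at risk for the second
        return [{"trans": "1->2", "start": 0.0, "stop": end, "status": 0}]

    rows = [{"trans": "1->2", "start": 0.0, "stop": bcr, "status": 1}]
    if death is not None:
        rows.append({"trans": "2->3", "start": bcr, "stop": death, "status": 1})
    else:
        rows.append({"trans": "2->3", "start": bcr, "stop": end, "status": 0})
    return rows


# Hypothetical patients
no_recurrence = {"bcr_time": None, "death_time": None, "last_followup": 10.0}
recurrence_only = {"bcr_time": 3.0, "death_time": None, "last_followup": 9.0}
recurrence_then_death = {"bcr_time": 2.0, "death_time": 6.5, "last_followup": 6.5}

for p in (no_recurrence, recurrence_only, recurrence_then_death):
    print(to_transition_rows(p))
```

Each transition can then be analyzed with its own hazard model, so a SNP may have different HRs for recurrence and for death after recurrence.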

Figure 1.

Stages of prostate cancer progression. Our conceptualization of the stages of prostate cancer progression, with biochemical recurrence in between primary treatment (surgery) and death from disease, is shown.


Linkage disequilibrium analysis

To examine linkage disequilibrium (LD) in these populations, we combined data from the 1000 Genomes Project (18) with that from the Ashkenazi Genome Consortium (TAGC; ref. 19). We used VCFtools (20) to compute LD between rs2702185 and all SNPs within one megabase of it. For the 1000 Genomes data, we focused on individuals of Northwestern European ancestry (GBR and CEU); we used the complete TAGC data.
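As an illustration of the LD measure, r² between two SNPs can be computed as the squared Pearson correlation of genotype dosages. This is a sketch under assumptions: the text does not state whether phased haplotypes or unphased genotypes were used, so unphased 0/1/2 dosages are assumed, and the function name and toy genotypes are hypothetical.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors (0, 1, 2)."""
    n = len(g1)
    if n != len(g2) or n == 0:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    m1 = sum(g1) / n
    m2 = sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    if v1 == 0 or v2 == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (v1 * v2)


# Hypothetical dosages for a tag SNP and two neighbouring SNPs in eight individuals
tag = [0, 1, 2, 0, 1, 0, 0, 1]
neighbours = {
    "snp_a": [0, 1, 2, 0, 1, 0, 0, 1],  # identical pattern to the tag SNP
    "snp_b": [1, 0, 0, 1, 1, 0, 2, 0],
}

# Keep SNPs above an r2 threshold, as is done when listing LD partners
in_ld = {name: genotype_r2(tag, g) for name, g in neighbours.items()
         if genotype_r2(tag, g) > 0.2}
print(in_ld)
```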

Functional effect and eQTL analyses

To examine the functional effects, we explored the predicted probabilities of being promoters and enhancers in prostate tissue of 9 SNPs in LD with rs2702185. The predicted functional probabilities were obtained from FUN-LDA (21), a Latent Dirichlet Allocation (LDA) based model that predicts functional effects of noncoding genetic variants by integrating diverse epigenetic annotations for specific cell types and tissues from large scale genomics projects.

To investigate the chromatin accessibility of these SNPs, we used data from the ENCODE project (http://www.encodeproject.org), including the SCREEN browser at http://screen.encodeproject.org (22). To look at chromatin accessibility in the TCGA data, we used the UCSC Xena browser (http://xenabrowser.net/; ref. 23).

To examine eQTLs, we used data from the eQTL Catalog (24), downloaded on October 9, 2020. For gene expression datasets generated on the Illumina microarray platform, we considered each probe for a given gene as a separate test in cases where more than one probe maps to the same HGNC gene id.

Data availability statement

The data generated in this study are not publicly available due to containing Protected Health Information but are available upon reasonable request from the corresponding author.

Demographics of prostate cancer cases in the MSKCC cohort

We included 1,298 men with incident diagnosis of prostate cancer from MSKCC in our study (Table 1). A total of 162 (12.5%) died from prostate cancer and 632 (48.6%) had biochemical recurrence. The median age at prostate cancer surgery was 68 (IQR, 62–72) and 63 (IQR, 58–70) years old for AJ and non-AJ men, respectively.

Table 1.

Cohort characteristics.

| Characteristics | AJ | Non-AJ |
| --- | --- | --- |
| N | 723 | 575 |
| Age at surgery, median (IQR) | 68 (62, 73) | 63 (58, 70) |
| PSA, median (IQR) | 6.9 (4.6, 11) | 7.6 (5.27, 14.2) |
| Gleason score category, n (%) | | |
| 0–6 | 279 (37.9) | 213 (37.0) |
| 7 | 278 (40.1) | 243 (42.3) |
| 8–10 | 166 (22.0) | 119 (20.7) |
| Clinical stage, n (%) | | |
| Nonadvanced (c1–c2) | 568 (78.6) | 476 (82.8) |
| Advanced (c3–c4) | 155 (21.4) | 99 (17.2) |
| Dead of disease, n (%) | | |
| No | 627 (86.7) | 509 (88.5) |
| Yes | 96 (13.3) | 66 (11.5) |
| Biochemical recurrence, n (%) | | |
| No | 400 (55.3) | 266 (46.3) |
| Yes | 323 (44.7) | 309 (53.7) |
| rs2702185, n (%) | | |
| C C | 568 (78.4) | 494 (85.9) |
| C T | 128 (17.7) | 75 (13.0) |
| T T | 1 (0.1) | 3 (0.5) |
| Missing | 26 (3.6) | 3 (0.5) |
| rs73055188, n (%) | | |
| G G | 577 (79.8) | 489 (85.0) |
| A G | 109 (15.1) | 57 (10.0) |
| A A | 6 (0.8) | 4 (0.7) |
| Missing | 31 (4.3) | 25 (4.3) |

SNPs associated with prostate cancer survival in AJ and non-AJ populations

In univariate analysis, SNP rs2702185 was significantly associated with prostate cancer–specific survival (log-rank test P = 0.0013) in the non-AJ population (Fig. 2A) but not in the AJ population (log-rank test P = 0.58). The population-level heterogeneity of this association was significant after adjusting for age, PSA, Gleason score, and stage at surgery when modeling the interaction between rs2702185 and population (AJ vs. non-AJ) in all samples (interaction effect: P = 0.010; per-allele HR = 2.696; 95% confidence interval (CI), 1.271–5.715). In this covariate-adjusted model, rs2702185 remained significantly associated with prostate cancer–specific survival (P = 0.001; per-allele HR = 2.518; 95% CI, 1.468–4.321) in the non-AJ population only; the association in the AJ population was not significant (P = 0.799; per-allele HR = 0.934; 95% CI, 0.553–1.577; Table 2). In this model, age at surgery, PSA, and advanced clinical stage were no longer associated with prostate cancer–specific survival after surgery (P > 0.05), whereas Gleason score remained highly significant (P = 0.0015 for Gleason score 7 vs. <7 and P = 1.6E-10 for Gleason score 8–10 vs. <7; Supplementary Table S1). The second SNP, rs73055188, was not significantly associated with prostate cancer–specific survival in either the AJ or the non-AJ population of our cohort. In a covariate-adjusted interaction model, rs73055188 showed a trend toward an increased HR (P = 0.432; per-allele HR = 1.275; 95% CI, 0.695–2.340) in the non-AJ population, which may reflect a small effect that the sample size was too small to detect. In the AJ population, there was no trend of association (per-allele HR = 1.030) with prostate cancer–specific survival. The same patterns were observed for these two SNPs in covariate-adjusted stratified analysis of AJ and non-AJ populations (Supplementary Table S2).
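As a reading aid, the reported per-allele HRs, confidence intervals, and Wald P values for rs2702185 can be checked against one another. The sketch below assumes the intervals are standard Wald intervals on the log-HR scale, which the text does not state explicitly; the function name is hypothetical.

```python
import math


def wald_from_ci(hr, lcl, ucl, z_crit=1.959964):
    """Recover the log-HR standard error, Wald z, and two-sided P from a 95% CI."""
    log_hr = math.log(hr)
    se = (math.log(ucl) - math.log(lcl)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p


# rs2702185 in the non-AJ population: HR 2.518 (95% CI, 1.468-4.321), reported P = 0.001
se_non_aj, z_non_aj, p_non_aj = wald_from_ci(2.518, 1.468, 4.321)

# rs2702185 in the AJ population: HR 0.934 (95% CI, 0.553-1.577), reported P = 0.799
se_aj, z_aj, p_aj = wald_from_ci(0.934, 0.553, 1.577)

print(f"non-AJ: z = {z_non_aj:.2f}, P = {p_non_aj:.4f}")
print(f"AJ:     z = {z_aj:.2f}, P = {p_aj:.3f}")
```

Under this assumption, the recomputed P values agree with the reported ones to within rounding.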

Figure 2.

The Kaplan–Meier curves for three time-to-event outcomes by rs2702185 in non-AJ populations. A, The Kaplan–Meier curve for time from prostate cancer surgery to death of the disease by rs2702185 in non-AJ populations. B, The Kaplan–Meier curve for time from prostate cancer surgery to biochemical recurrence by rs2702185 in non-AJ populations. C, The Kaplan–Meier curve for time from biochemical recurrence to death of the disease by rs2702185 in non-AJ populations. The log rank P values testing the similarity of the two rs2702185 groups are presented in the figure.

Table 2.

Association of the two SNPs with survival time among AJ and non-AJ individuals in the study.

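| SNP | Population | HR | P | HR LCL | HR UCL |
| --- | --- | --- | --- | --- | --- |
| rs2702185 | Non-AJ | 2.519 | 0.001 | 1.468 | 4.321 |
| rs2702185 | AJ | 0.934 | 0.799 | 0.553 | 1.577 |
| rs73055188 | Non-AJ | 1.275 | 0.432 | 0.695 | 2.340 |
| rs73055188 | AJ | 1.030 | 0.906 | 0.633 | 1.677 |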

LCL: 95% lower confidence limit.

UCL: 95% upper confidence limit.

The SNP rs2702185 associates with cancer progression stages

Because rs2702185 was associated with prostate cancer survival in the non-AJ population in our analysis, we took a closer look at its effects on cancer progression by considering two time periods (Table 3; Fig. 2B and C). Interestingly, rs2702185 was not associated with time from surgery to biochemical recurrence in the covariate-adjusted multi-state model (P = 0.491; per-allele HR = 1.112; 95% CI, 0.822–1.503; Kaplan–Meier curve in Fig. 2B), but was significantly associated with time from biochemical recurrence to prostate cancer–specific death, with an HR similar to that for prostate cancer–specific survival (P = 0.0014; per-allele HR = 2.537; 95% CI, 1.435–4.485; Kaplan–Meier curve in Fig. 2C) in the non-AJ population. Details on the subset of individuals with biochemical recurrence are provided in Supplementary Table S3. This suggests that the effect of rs2702185 does not physiologically manifest itself until later stages of disease. It is worth noting that all covariates were significantly associated with time from surgery to biochemical recurrence (P < 0.05), but only Gleason score was also associated with time from biochemical recurrence to death (P = 0.001 for Gleason score 8–10 vs. <7).

Table 3.

Association of rs2702185 with survival across different intervals of disease progression.

| Covariate | Surgery to biochemical recurrence: HR | P | HR LCL | HR UCL | Biochemical recurrence to death of disease: HR | P | HR LCL | HR UCL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age at surgery | 0.949 | 1.96E-13 | 0.935 | 0.962 | 1.008 | 6.39E-01 | 0.974 | 1.043 |
| PSA | 1.008 | 1.05E-03 | 1.003 | 1.013 | 0.991 | 1.56E-01 | 0.979 | 1.003 |
| Stage 3–4 (Ref: Stage 1–2) | 1.424 | 1.25E-02 | 1.079 | 1.879 | 1.196 | 5.06E-01 | 0.706 | 2.028 |
| Gleason score 7 (Ref: Gleason score <7) | 3.242 | 8.02E-14 | 2.381 | 4.414 | 1.571 | 2.76E-01 | 0.697 | 3.541 |
| Gleason score 8–10 (Ref: Gleason score <7) | 4.012 | 4.24E-15 | 2.836 | 5.676 | 3.891 | 1.05E-03 | 1.727 | 8.769 |
| rs2702185 | 1.112 | 4.91E-01 | 0.822 | 1.503 | 2.537 | 1.36E-03 | 1.435 | 4.485 |

LCL: 95% lower confidence limit.

UCL: 95% upper confidence limit.

LD analysis and functional annotation suggest a causal SNP at the rs2702185 locus

We next wished to identify candidate causal variants at this locus. As we observed an interaction between the effect of rs2702185 and Ashkenazi ancestry, we examined LD in both non-Ashkenazi European and Ashkenazi populations. Using reference data from the 1000 Genomes Consortium, in individuals of Northwestern European ancestry, we identified 9 variants that were in LD with rs2702185 (r2 > 0.2; Table 4). In contrast, in data from Ashkenazi individuals from TAGC, LD is lower for six of the variants and one variant found in the 1000 Genomes data is not present in the TAGC data.

Table 4.

SNPs that are highly correlated (r² > 0.2) with rs2702185 in Northwestern Europeans along with their functional annotations.

| SNP | Position(a) | Non-AJ r² | AJ r² | Functional region in FUN-LDA | Promoter prob | (Weak) Enhancer prob |
| --- | --- | --- | --- | --- | --- | --- |
| rs7517641 | 183406643 | 0.68 | 0.50 | 183406626 - 183406650 | 0.000 | 0.268 |
| rs6678117 | 183410168 | 0.30 | 0.46 | 183410151 - 183410175 | 0.000 | 0.175 |
| rs10737246 | 183443785 | 0.75 | 0.50 | 183443776 - 183443800 | 0.874 | 0.000 |
| rs4652803 | 183446620 | 0.75 | N.D. | 183446601 - 183446625 | 0.492 | 0.047 |
| rs2782412 | 183501661 | 0.28 | 0.50 | 183501651 - 183501675 | 0.000 | 0.268 |
| rs17434335 | 183505044 | 0.75 | 0.50 | 183505026 - 183505050 | 0.004 | 0.076 |
| rs34823167 | 183534681 | 0.61 | 0.50 | 183534676 - 183534700 | 0.001 | 0.166 |
| rs80008821 | 183563520 | 0.75 | 0.37 | 183563501 - 183563525 | 0.001 | 0.216 |

(a) Positions are on chromosome 1, build 37.

To identify candidate functional SNPs among these 9, we investigated the predicted functional probabilities of being in promoter and enhancer regions in prostate tissue from FUN-LDA, a prediction tool built upon large scale functional genomics databases. We observed predicted function for 8 of them, and the highest functional probability is for rs10737246 (87% chance of being in a promoter; Table 4). Consistent with the FUN-LDA result, this SNP is in a candidate cis regulatory element (EH38E1403945) as annotated by the ENCODE project (22).

We next investigated in which cell type(s) this regulatory element may be active. Across all cell types in the ENCODE data, the epigenetic marks observed are annotated as being most consistent with a distal enhancer-like signature. Among the ENCODE samples with DNase-Seq data indicating regions of open chromatin, this region was only found to have evidence of DNase hypersensitivity (open chromatin) in bronchial and esophageal epithelial cells. For those cell types without DNase-Seq data, there was no evidence for H3K27Ac peaks, which are correlated with enhancer activity. In contrast, several cell types showed evidence for H3K4me3 marks, indicative of promoter activity. Two of the four top cell types with this mark were derived from T cells. We additionally asked whether rs10737246 overlaps regions of open chromatin in samples from The Cancer Genome Atlas (TCGA; ref. 12); we find that it does (Supplementary Fig. S1).

We had previously found that rs2702185 was not associated with expression of SMG7 in prostate tissue but was weakly associated with NCF2 and ARPC5 instead (12). As the functional genomic evidence suggests that rs10737246 may lie in both a distal enhancer in epithelial cells and a promoter element in T cells, we wished to investigate the association of both rs2702185 and rs10737246 with gene expression across a broader range of cell types. To do so, we used data from the eQTL Catalog, a compendium of cis-eQTL results from numerous studies (24). Across all 112 studies in the catalog, there were 2,086 tests for association between either rs2702185 or rs10737246 and expression of a nearby gene; for each tissue-gene pair both SNPs were tested. At a Bonferroni significance level of P < 1.2e-05, 16 of these tissue-gene pairs were significant for at least one SNP (Supplementary Table S4). All but one of these hits were with SMG7 in cells of the immune system, consistent with the epigenetic evidence and supporting the hypothesis that rs10737246 functions in immune cells rather than prostate epithelium.
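The stated significance level can be reproduced with a short calculation. This rests on an inference rather than on anything the text spells out: the threshold of 1.2e-05 is consistent with a family-wise alpha of 0.05 divided by 2 × 2,086 tests (two SNPs for each of 2,086 tissue-gene pairs), and both the alpha and that multiplication are assumptions.

```python
# Assumed inputs: family-wise alpha of 0.05 and two SNPs tested per tissue-gene pair
alpha = 0.05
n_tissue_gene_pairs = 2086
n_snps = 2

n_tests = n_tissue_gene_pairs * n_snps
bonferroni_threshold = alpha / n_tests

print(f"{n_tests} tests, threshold = {bonferroni_threshold:.1e}")
```

Dividing by 2,086 alone would give roughly 2.4e-05, which does not match the reported level; this is why the doubled count is assumed here.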

We performed a validation analysis for two genetic variants that had been previously identified to be associated with prostate cancer–specific mortality (12). We replicated the association of rs2702185 with prostate cancer–specific survival in a cohort of non-Ashkenazi individuals of European ancestry in the U.S. The result for the second SNP, rs73055188, trended in a direction consistent with previous reports but did not reach statistical significance. Intriguingly, when we separately examined the effect of these SNPs on the intervals from surgery to biochemical recurrence, and from biochemical recurrence to death, we found effect heterogeneity; SNP rs2702185 was specifically associated with the interval from biochemical recurrence to death. This suggests that a genetic epidemiologic approach can be used to identify variants that influence different stages of prostate cancer progression. Neither SNP appears to be associated with prostate cancer survival in individuals of AJ ancestry. This population heterogeneity may explain why larger international consortia were unable to find SNPs associated with prostate cancer survival while our initial study focused on Sweden did (25). We note that germline BRCA2 mutations are observed in 2.4% of AJ prostate cancer cases, and are associated with worse outcome (14), compared with less than 1% in prostate cancer cases in general (26, 27). In contrast, at least one copy of the minor allele of rs2702185 is found in 19% of AJ and 14% of non-AJ individuals in this study. Thus, we find it unlikely that differences in germline BRCA2 mutation status explain this difference.
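The carrier frequencies quoted above can be recovered from the genotype counts in Table 1. The sketch below assumes that individuals with missing genotypes are excluded from the denominator, which is an assumption that happens to reproduce the rounded percentages; the function name is hypothetical.

```python
def carrier_fraction(n_major_hom, n_het, n_minor_hom):
    """Fraction of genotyped individuals carrying at least one minor allele."""
    genotyped = n_major_hom + n_het + n_minor_hom
    return (n_het + n_minor_hom) / genotyped


# rs2702185 genotype counts (CC, CT, TT) from Table 1, missing genotypes excluded
aj_carriers = carrier_fraction(568, 128, 1)
non_aj_carriers = carrier_fraction(494, 75, 3)

print(f"AJ: {aj_carriers:.1%}, non-AJ: {non_aj_carriers:.1%}")
```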

Several other important differences may contribute to the different effects of the SNPs on survival observed in the current study and our previous report (12). For instance, the current study was hospital-based while the prior studies in Sweden were population-based. Furthermore, at the time the studies were undertaken in Sweden, PSA screening was not widely used, while the current study focuses on a hospital-based cohort in a setting where PSA screening was common. The overall follow-up time was longer in our prior study (12). Finally, we note that due to the “winner's curse” we would expect to observe attenuated effect sizes in this replication study (28).

It is worth noting that assembly of detailed clinical data is a key limiting factor in these kinds of studies of SNPs associated with cancer survival. In our earlier population-based studies, we used a national cancer registry combined with a national death index (12). This allowed us to have a complete view of the specified time points (diagnosis and death from disease), but no details on the intermediate course of disease. In contrast, here we used a hospital-based cohort. To extract detailed information on the course of disease, two urology fellows reviewed each individual medical record. Such an approach is not scalable, and restricts our ability to examine additional clinical variables that were not abstracted initially. With the advent of large biobanks linked to electronic medical records (29), it may be possible to design algorithms to extract some data on the course of prostate cancer, though these databases are often limited to fields like ICD-10 codes, laboratory values, and prescribed medicines. For instance, rising PSA or prescription of antiandrogen or chemotherapeutic agents is likely easy to identify in EMR. Details on Gleason grade or metastatic site are not included in these data. Databases created especially for following patients with prostate cancer prospectively may be especially useful for such studies (30).

We identified a candidate causal variant, rs10737246, which is located in the first intron of SMG7. We found functional genomic evidence that this variant lies in a regulatory region. Illustrative of the complexity of interpreting functional genomic data, there is evidence to support both its role as an enhancer in epithelial tissue and as a promoter-like element in T cells. Expression QTL analysis of blood supports the hypothesis that rs10737246 alters gene expression of SMG7 in immune cells, while we could not find evidence for a cis-eQTL effect on nearby genes in prostate. This raises the intriguing possibility that SNPs associated with prostate cancer survival could exert their effect via the microenvironment rather than the tumor cell itself.

Our results suggest that lower levels of SMG7 are associated with shorter survival times in prostate cancer. SMG7 is known to play a role in nonsense-mediated decay (NMD), the process by which mRNA molecules containing premature termination codons (PTC) are degraded before they can be translated. Specifically, it is thought that SMG7 links the recognition of a PTC to the mRNA degradation machinery (31). Decreased NMD activity has been linked to a tumor immune response (32). However, such a mechanism of action presupposes action in the cancer cells themselves and, because a tumor immune response is thought to be beneficial, would suggest that decreased expression of SMG7 should be associated with longer survival times. A role for SMG7 in immune cells, as suggested by our data, is less clear. SNPs at SMG7 have been associated with risk of systemic lupus erythematosus, in which the risk allele is associated with lower expression of SMG7 in peripheral blood mononuclear cells (33). Decreased NMD could lead to the accumulation of proteins with premature stop codons in immune cells, thereby altering the immune response to the tumor.

This study has several limitations. Though it expanded beyond individuals from Scandinavia (12), it still only examined people of European ancestry. Given the heterogeneity we observe, it remains an open question as to whether these SNPs have an influence on survival in non-European populations. Patients treated at tertiary care centers may not be representative of the general population. We only abstracted a subset of the clinical data that can be found in the medical record and therefore cannot say if these SNPs associate with specific metastatic sites or show different effects depending on adjuvant radiation therapy. Resource availability prevented us from performing a genome-wide scan of these individuals and asking if variants at additional loci are associated with time from biochemical recurrence to mortality. Finally, with longer follow-up time, this study may have greater power to identify additional associations. For future study, we propose to consider larger studies, such as through integration of multiple cohorts, employ alternative strategies to switch focus from SNPs to targeted genes to boost study power, and consider experiments for validation.

X. Song reports grants from NCI during the conduct of the study; grants from NCI; and grants from National Institute of Aging outside the submitted work. R.P. Kopp reports other support from Hoffman La Roche, Merck Sharp & Dohme; grants and personal fees from AstraZeneca; and other support from Seer, Inc outside the submitted work. Z.H. Gümüş reports grants from NCI (R33 CA263705-01) during the conduct of the study. K. Offit reports being a founder of AnaNeo Therapeutics; this poses no conflict of interest to the current work. R.J. Klein reports grants from NCI during the conduct of the study. No disclosures were reported by the other authors.

The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

X. Song: Formal analysis, writing–original draft. M. Ru: Formal analysis, writing–review and editing. Z. Steinsnyder: Investigation, writing–review and editing. K. Tkachuk: Resources, data curation. R.P. Kopp: Data curation. J. Sullivan: Data curation. Z.H. Gümüş: Resources, writing–review and editing. K. Offit: Resources, supervision, writing–review and editing. V. Joseph: Resources, supervision, writing–review and editing. R.J. Klein: Conceptualization, supervision, funding acquisition, writing–review and editing.

This work was supported by R01 CA224948 (to R.J. Klein, Z. H. Gümüş, and X. Song) and Cancer Center Support Grant P30 CA196521 (X. Song and M. Ru). This work was also supported in part through the computational resources and staff expertise provided by Scientific Computing at the Icahn School of Medicine at Mount Sinai. Research reported in this paper was supported by the Office of Research Infrastructure of the NIH under award number S10OD026880 (to R.J. Klein).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2019. CA Cancer J Clin 2019;69:7–34.
2. Wilt TJ, Vo TN, Langsetmo L, Dahm P, Wheeler T, Aronson WJ, et al. Radical prostatectomy or observation for clinically localized prostate cancer: extended follow-up of the prostate cancer intervention versus observation trial (PIVOT). Eur Urol 2020;77:713–24.
3. Pinsky PF, Miller E, Prorok P, Grubb R, Crawford ED, Andriole G. Extended follow-up for prostate cancer incidence and mortality among participants in the prostate, lung, colorectal and ovarian randomized cancer screening trial. BJU Int 2019;123:854–60.
4. Schröder FH, Hugosson J, Roobol MJ, Tammela TLJ, Zappa M, Nelen V, et al. Screening and prostate cancer mortality: results of the European Randomized Study of Screening for Prostate Cancer (ERSPC) at 13 years of follow-up. Lancet Lond Engl 2014;384:2027–35.
5. Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, et al. Environmental and heritable factors in the causation of cancer–analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med 2000;343:78–85.
6. Mucci LA, Hjelmborg JB, Harris JR, Czene K, Havelick DJ, Scheike T, et al. Familial risk and heritability of cancer among twins in Nordic Countries. JAMA 2016;315:68–76.
7. Conti DV, Darst BF, Moss LC, Saunders EJ, Sheng X, Chou A, et al. Trans-ancestry genome-wide association meta-analysis of prostate cancer identifies new susceptibility loci and informs genetic risk prediction. Nat Genet 2021;53:65–75.
8. Sullivan J, Kopp R, Stratton K, Manschreck C, Corines M, Rau-Murthy R, et al. An analysis of the association between prostate cancer risk loci, PSA levels, disease aggressiveness and disease-specific mortality. Br J Cancer 2015;113:166–72.
9. Gallagher DJ, Vijai J, Cronin AM, Bhatia J, Vickers AJ, Gaudet MM, et al. Susceptibility loci associated with prostate cancer progression and mortality. Clin Cancer Res 2010;16:2819–32.
10. Lindström LS, Hall P, Hartman M, Wiklund F, Grönberg H, Czene K. Familial concordance in cancer survival: a Swedish population-based study. Lancet Oncol 2007;8:1001–6.
11. Hemminki K. Familial risk and familial survival in prostate cancer. World J Urol 2012;30:143–8.
12. Li W, Middha M, Bicak M, Sjoberg DD, Vertosick E, Dahlin A, et al. Genome-wide scan identifies role for AOX1 in prostate cancer survival. Eur Urol 2018;74:710–9.
13. Vijai J, Kirchhoff T, Gallagher D, Hamel N, Guha S, Darvasi A, et al. Genetic architecture of prostate cancer in the Ashkenazi Jewish population. Br J Cancer 2011;105:864–9.
14. Gallagher DJ, Gaudet MM, Pal P, Kirchhoff T, Balistreri L, Vora K, et al. Germline BRCA mutations denote a clinicopathologic subset of prostate cancer. Clin Cancer Res 2010;16:2115–21.
15. Woodward J. Bi-allelic SNP genotyping using the TaqMan® assay. Methods Mol Biol 2014;1145:67–74.
16. Cox DR. Regression models and life-tables. J Roy Stat Soc Ser B Methodol 1972;34:187–202.
17. Therneau T, Lumley T, Elizabeth A, Cynthia C. A package for survival analysis in R [Internet]. Available from: https://CRAN.R-project.org/package=survival.
18. 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature 2015;526:68–74.
19. Lencz T, Yu J, Palmer C, Carmi S, Ben-Avraham D, Barzilai N, et al. High-depth whole genome sequencing of an Ashkenazi Jewish reference panel: enhancing sensitivity, accuracy, and imputation. Hum Genet 2018;137:343–55.
20. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinforma Oxf Engl 2011;27:2156–8.
21. Backenroth D, He Z, Kiryluk K, Boeva V, Pethukova L, Khurana E, et al. FUN-LDA: a Latent Dirichlet Allocation model for predicting tissue-specific functional effects of noncoding variation: methods and applications. Am J Hum Genet 2018;102:920–42.
22. ENCODE Project Consortium, Moore JE, Purcaro MJ, Pratt HE, Epstein CB, Shoresh N, et al. Expanded encyclopedias of DNA elements in the human and mouse genomes. Nature 2020;583:699–710.
23. Corces MR, Granja JM, Shams S, Louie BH, Seoane JA, Zhou W, et al. The chromatin accessibility landscape of primary human cancers. Science 2018;362:eaav1898.
24. Kerimov N, Hayhurst JD, Peikova K, Manning JR, Walter P, Kolberg L, et al. eQTL Catalog: a compendium of uniformly processed human gene expression and splicing QTLs. bioRxiv 2021.
25. Cooney KA, Beebe-Dimmer JL. Finding a needle in the Haystack: The search for germline variants associated with prostate cancer clinical outcomes. Eur Urol 2018;74:720–1.
26. Pritchard CC, Mateo J, Walsh MF, De Sarkar N, Abida W, Beltran H, et al. Inherited DNA-repair gene mutations in men with metastatic prostate cancer. N Engl J Med 2016;375:443–53.
27. Darst BF, Sheng X, Eeles RA, Kote-Jarai Z, Conti DV, Haiman CA. Combined effect of a polygenic risk score and rare genetic variants on prostate cancer risk. Eur Urol 2021;80:134–8.
28. Kraft P. Curses–winner's and otherwise–in genetic epidemiology. Epidemiology 2008;19:649–51.
29. Abul-Husn NS, Kenny EE. Personalized medicine and the power of electronic health records. Cell 2019;177:58–69.
30. Koshkin VS, Patel VG, Ali A, Bilen MA, Ravindranathan D, Park JJ, et al. PROMISE: a real-world clinical-genomic database to address knowledge gaps in prostate cancer. Prostate Cancer Prostatic Dis 2021 Aug 6 [Epub ahead of print].
31. Unterholzner L, Izaurralde E. SMG7 acts as a molecular link between mRNA surveillance and mRNA decay. Mol Cell 2004;16:587–96.
32. Pastor F, Kolonias D, Giangrande PH, Gilboa E. Induction of tumor immunity by targeted inhibition of nonsense-mediated mRNA decay. Nature 2010;465:227–30.
33. Deng Y, Zhao J, Sakurai D, Sestak AL, Osadchiy V, Langefeld CD, et al. Decreased SMG7 expression associates with lupus-risk variants and elevated antinuclear antibody production. Ann Rheum Dis 2016;75:2007–13.
