Abstract
Men engaged in high physical activity have lower risks of advanced and fatal prostate cancer. Mechanisms underlying this association are not well understood but may include systemic and tumor-specific effects. We investigated potential mechanisms linking physical activity and gene expression in prostate tissue from men with prostate cancer.
We included a subset of 118 men in the Health Professionals Follow-up Study diagnosed with prostate cancer between 1986 and 2005 with whole-transcriptome gene expression profiling on tumor and adjacent normal prostate tissue and physical activity data. Long-term vigorous physical activity was self-reported as the average time spent engaged in various forms of recreational physical activity at baseline and biennially until prostate cancer diagnosis. Gene set enrichment analysis was performed among KEGG and Hallmark gene sets to identify pathways with differential expression based on vigorous physical activity.
In adjacent normal tissue, we identified 25 KEGG gene sets enriched (downregulated) in the highest compared with lowest quintile of vigorous physical activity at an FDR <0.10, including a number of cancer- and immune-related pathways. Although no gene sets reached statistical significance in tumor tissue, top gene sets differentially expressed included TGF beta, apoptosis, and p53 signaling pathways.
These findings suggest that physical activity may influence the tumor microenvironment. Future studies are needed to confirm these findings and further investigate potential mechanisms linking physical activity to lethal prostate cancer.
Identification of gene expression alterations in the prostate associated with physical activity can improve our understanding of prostate cancer etiology.
This article is featured in Highlights of This Issue, p. 585
Introduction
Physical activity is associated with lower risk of several cancers (1), but much remains unknown regarding the biological mechanisms involved. Epidemiologic studies support a link between physical activity and reduced prostate cancer incidence and progression (2, 3). In a prospective study, men who engaged in higher amounts of vigorous physical activity had a lower risk of clinically significant and TMPRSS2:ERG-positive prostate cancer (4). Physical activity influences several biological processes that may be involved in prostate cancer (5, 6), and may exert both systemic and local effects (7). Specifically, it alters endogenous hormone levels, including testosterone, insulin, and insulin-like growth factor (IGF)-1 (5–7). Furthermore, physical activity of different types and duration may have different effects on the tumor microenvironment (8).
Although the effects of physical activity on the prostate remain unclear, there is emerging evidence that it is linked to epigenetic changes in prostate tissue (9, 10). Epigenetic modifications such as DNA methylation and histone modifications are known to disrupt key biological processes in cancers, such as tumor growth, tissue invasion, and metastasis, and contribute to the occurrence of genetic mutations (11). There is growing evidence that lifestyle factors, such as diet and physical activity, have an important role in influencing epigenetic processes (12). Additional studies by our group and others have shown that activity at higher intensity and over the long term may be most relevant to decreasing prostate cancer development and improving outcomes (4, 13, 14). However, there is limited evidence regarding the relationship between physical activity and gene expression in prostate cancer. Identifying gene expression alterations in the prostate that are associated with physical activity would strengthen the biologic plausibility of these associations and may facilitate identification of biomarkers related to prostate cancer development.
The present study aimed to investigate the potential relationship between long-term, pre-diagnosis vigorous activity and gene expression alterations in prostate tumor and adjacent normal tissue. We broadly hypothesized that physical activity may alter gene expression in prostate tissue that may contribute to the beneficial association between physical activity and prostate cancer risk.
Materials and Methods
Study population
The Health Professionals Follow-up Study (HPFS) is an ongoing prospective cohort of 51,529 US male health professionals ages 40 to 75 years at baseline in 1986. Participants completed questionnaires at baseline and were mailed questionnaires every two years thereafter to ascertain lifestyle, health-related factors, and disease outcomes. The study protocol was approved by institutional review boards of the Brigham and Women's Hospital and Harvard T.H. Chan School of Public Health, and those of participating registries as required.
For HPFS participants diagnosed with prostate cancer during follow-up, we contacted hospital pathology departments to retrieve archival formalin-fixed paraffin-embedded (FFPE) prostate tumor tissue obtained through either radical prostatectomy or transurethral resection of the prostate. Gene expression profiling was performed on a subset of cases diagnosed between 1986 and 2005 using an extreme case design with the aim of identifying expression signatures to differentiate indolent and lethal disease (15). These cases included men with prostate cancer who had a metastatic event or died of prostate cancer during follow-up through 2016 (lethal cases), and men who survived at least 8 years without any evidence of metastases during follow-up (indolent cases). Gene expression profiling of tumor tissue was obtained for a total of 254 men in HPFS. Of these men, a subset of 120 had profiling of both tumor tissue and adjacent normal tissue. To improve the comparability between our analyses of tumor and normal tissue, only men with both tumor tissue and matched adjacent normal tissue were included. After exclusion of 2 men who did not provide baseline physical activity data, our primary analysis included 118 cases (44 lethal and 74 indolent) with tumor and adjacent normal tissue.
Gene expression profiling
Using archival FFPE tissues from this subset of cases, cores were taken from foci highly enriched for cancer and mRNA was extracted from areas selected by study pathologists, as previously described (16, 17). Adjacent normal tissue was defined as prostatic tissue with histologic features closest to normal prostatic glands and stroma and clearly separated (at least 5 mm) from the cancer counterpart. In cases with inflammation, the area with less inflammation was selected. We performed whole-transcriptome amplification (WT-Ovation FFPE System V2; NuGEN) followed by hybridization to the GeneChip Human Gene 1.0 ST Array (Affymetrix; refs. 18, 19). For the expression profiles, we regressed out technical variables, including mRNA concentration, age of the block, batch (96-well plate), percentage of probes detectable above the background, log-transformed average background signal, and the median of the perfect match probes for each probe intensity of the raw data. The residuals were shifted to the original mean expression values and normalized using the robust multichip average method (20, 21). We used NetAffx annotations to map gene names to Affymetrix transcript cluster IDs as implemented in Bioconductor annotation package pd.hugene.1.0.st.v1; this resulted in 20,254 unique gene names. Gene expression data are available through Gene Expression Omnibus (GSE79021).
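As an illustration only, the adjustment for technical variables described above can be written as a per-probe linear model. The notation is assumed here and does not appear in the original text: $y_{pi}$ is the raw intensity of probe $p$ in sample $i$, $\mathbf{z}_i$ is the vector of technical covariates for sample $i$, and $\bar{y}_p$ is the mean raw intensity of probe $p$ across samples.

```latex
y_{pi} = \alpha_p + \mathbf{z}_i^{\top}\boldsymbol{\beta}_p + \varepsilon_{pi},
\qquad
\hat{\varepsilon}_{pi} = y_{pi} - \hat{\alpha}_p - \mathbf{z}_i^{\top}\hat{\boldsymbol{\beta}}_p,
\qquad
\tilde{y}_{pi} = \bar{y}_p + \hat{\varepsilon}_{pi}
```

Under this sketch, the adjusted value $\tilde{y}_{pi}$ keeps the probe's original mean level while removing the variation explained by the technical covariates.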
Clinical data ascertainment
Incident prostate cancer was ascertained initially by self-report on questionnaires. Medical records and pathology reports were used to confirm prostate cancer diagnosis and to extract clinical and treatment information. Since 2000, participants diagnosed with prostate cancer were followed through biennial disease-specific questionnaires for development of metastases, treatment, and PSA levels. Prostate cancer–specific death was determined by review of death certificates and medical records by an endpoint committee of physicians. The study pathologists reviewed hematoxylin and eosin slides to provide uniform Gleason grade and histopathologic review. Lethal prostate cancer was defined as distant metastasis or death due to the disease with follow-up through December 2016.
Physical activity assessment
In the HPFS, physical activity was assessed through validated questionnaires beginning at baseline in 1986 and every two years thereafter (22). Participants were asked to report in categories the average total time per week engaged in specific recreational activities during the past year. Specific activities included walking or hiking outdoors, jogging (>10 minutes/mile), running (≤10 minutes/mile), bicycling, lap swimming, tennis, squash or racquetball, and calisthenics or rowing. Participants also reported their usual walking pace and the number of flights of stairs climbed daily. Additional specific activities were included on the questionnaire in subsequent cycles: heavy outdoor work (e.g., digging or chopping) from 1988, weightlifting from 1990, and moderate outdoor work (e.g., yard work or gardening) from 2004.
To quantify intensity of activity, each specific activity was assigned a metabolic equivalent of task (MET) value based on a compendium of physical activities (23). A measure of MET-hours per week was derived for each specific activity by multiplying the MET value assigned for that activity by the average number of hours per week reported by the participant. Vigorous activity was restricted to activities with a MET value of 6 or greater: walking at a pace of 4 miles per hour or faster, jogging, running, bicycling, lap swimming, tennis, squash or racquetball, calisthenics or rowing, and stair climbing. A validation study in the HPFS showed that vigorous activity may be measured with better validity than lower intensity activity (22).
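The MET-hour derivation described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the activity names and MET values are placeholders chosen for the example rather than values from the compendium, and only the threshold of 6 METs for vigorous activity comes from the text.

```python
# Placeholder MET values for illustration only; the study assigned values
# from a published compendium, which are not reproduced here.
MET_VALUES = {
    "running": 12.0,
    "bicycling": 7.0,
    "walking_normal_pace": 3.0,
}

VIGOROUS_MET_THRESHOLD = 6.0  # from the text: vigorous means MET >= 6


def met_hours_per_week(hours_per_week, met_values):
    """MET-hours/week per activity: MET value times reported hours/week."""
    return {a: met_values[a] * h for a, h in hours_per_week.items()}


def vigorous_met_hours(hours_per_week, met_values,
                       threshold=VIGOROUS_MET_THRESHOLD):
    """Sum MET-hours/week over activities at or above the MET threshold."""
    per_activity = met_hours_per_week(hours_per_week, met_values)
    return sum(v for a, v in per_activity.items()
               if met_values[a] >= threshold)


# Hypothetical participant: average hours per week for each activity.
reported = {"running": 1.0, "bicycling": 2.0, "walking_normal_pace": 3.0}
total = sum(met_hours_per_week(reported, MET_VALUES).values())
vigorous = vigorous_met_hours(reported, MET_VALUES)
print(total, vigorous)
```

In this hypothetical example, walking at a normal pace contributes to total activity but not to vigorous activity because its placeholder MET value is below 6.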
To examine the long-term association with physical activity in our primary analyses, we used cumulative average physical activity, using the average of all available questionnaire data from baseline until the time of prostate cancer diagnosis. In our primary analysis, we considered physical activity categorized into quintiles and compared the highest with the lowest category of activity. This was based on previous findings by our group of association with prostate cancer risk comparing extreme quintiles of vigorous activity (4, 13). In analyses stratified by lethal status, we used continuous vigorous physical activity to maximize the statistical power of this analysis. In secondary analyses, we also evaluated the association with recent physical activity, using only physical activity reported on the questionnaire completed most recently before cancer diagnosis.
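The exposure definitions above (cumulative average, most recent report, and quintile categories) can be sketched as follows. This is an illustrative sketch under stated assumptions: questionnaire cycles are keyed by year, only cycles dated before the diagnosis year are used, missing reports are stored as `None`, and quintiles are assigned by rank with ties broken by input order. None of these implementation details are specified in the text, and the data are invented.

```python
from statistics import mean


def _prior_reports(reports, diagnosis_year):
    """Non-missing (year, value) pairs from cycles before diagnosis."""
    return [(year, v) for year, v in reports.items()
            if year < diagnosis_year and v is not None]


def cumulative_average(reports, diagnosis_year):
    """Average of all available reports from baseline until diagnosis."""
    prior = _prior_reports(reports, diagnosis_year)
    if not prior:
        raise ValueError("no physical activity data before diagnosis")
    return mean(v for _, v in prior)


def most_recent(reports, diagnosis_year):
    """Report from the last available cycle before diagnosis."""
    prior = _prior_reports(reports, diagnosis_year)
    if not prior:
        raise ValueError("no physical activity data before diagnosis")
    return max(prior)[1]  # years are unique, so max picks the latest cycle


def assign_quintiles(values):
    """Rank-based quintile (1 = lowest, 5 = highest) for each value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    quintile = [0] * len(values)
    for rank, i in enumerate(order):
        quintile[i] = rank * 5 // len(values) + 1
    return quintile


# Hypothetical vigorous MET-h/wk by questionnaire year; 1990 is missing.
reports = {1986: 10.0, 1988: 14.0, 1990: None, 1992: 6.0, 1994: 20.0}
print(cumulative_average(reports, 1993), most_recent(reports, 1993))
print(assign_quintiles([0.1, 28.9, 4.4, 1.3, 11.3, 0.0, 30.0, 5.0, 2.0, 12.0]))
```

An extreme-quintile comparison would then contrast the men assigned to category 5 with those assigned to category 1.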
Statistical analysis
To identify predefined sets of functionally related genes associated with long-term, pre-diagnosis vigorous activity, gene set enrichment analysis (GSEA) was performed on mRNA expression profiles of tumor and adjacent normal tissue (24). Before performing GSEA, we filtered genes to exclude those with low expression across 50% or more samples, separately for tumor and normal tissue. A total of 6,167 genes were included in the tumor tissue analysis, and 6,215 genes in the adjacent normal tissue analysis. Age is a potential confounder in this study because it may affect gene expression as well as vigorous activity. To remove variation in gene expression due to age, we obtained gene expression residuals from linear regression on age at diagnosis. As secondary analyses, we performed GSEA among lethal cases and among indolent cases, separately.
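The two preprocessing steps above, filtering low-expressed genes and removing age-related variation, can be sketched for a single gene as follows. This is a minimal sketch under stated assumptions: the cutoff that defines "low expression" is not given in the text, so `low_cutoff` is a hypothetical parameter, and the residuals come from a simple least-squares fit of expression on age at diagnosis. The example values are invented.

```python
from statistics import mean


def is_low_expressed(values, low_cutoff, max_low_fraction=0.5):
    """True if the gene is below `low_cutoff` in >= 50% of samples.

    `low_cutoff` is a hypothetical threshold; the text does not state one.
    """
    n_low = sum(v < low_cutoff for v in values)
    return n_low / len(values) >= max_low_fraction


def age_residuals(expression, ages):
    """Residuals from simple linear regression of expression on age."""
    mean_x, mean_y = mean(ages), mean(expression)
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(ages, expression)]


# Hypothetical data for one gene in four men.
ages = [60, 65, 70, 75]
expression = [5.0, 5.4, 6.1, 6.5]
residuals = age_residuals(expression, ages)
print(is_low_expressed(expression, low_cutoff=3.0), residuals)
```

By construction, the residuals sum to zero and are uncorrelated with age, so any remaining association with vigorous activity cannot be explained by a linear effect of age.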
As determined a priori, we included 186 Kyoto Encyclopedia of Genes and Genomes (KEGG) and 50 Hallmark gene sets from the Molecular Signatures Database version 7.0 with software from the Broad Institute (http://software.broadinstitute.org/gsea/index.jsp). Gene sets with fewer than 15 or more than 500 genes (KEGG n = 79; Hallmark n = 3) were excluded. The signal-to-noise metric was used to rank the genes in analyses comparing the highest with lowest quintile of vigorous activity. Pearson correlations were used to rank genes in analyses with continuous vigorous activity. An enrichment score (ES) was calculated for each gene set using a weighted Kolmogorov–Smirnov statistic. The ES represents how much the gene set is overrepresented at the top or bottom of the ranked list of genes. A positive ES indicated gene set enrichment at the top of the ranked list (upregulated in high vs. low activity) whereas a negative ES indicated gene set enrichment at the bottom of the ranked list (downregulated in high vs. low activity). The top-ranked subset of genes contributing to the ES were considered the leading-edge genes. The significance level was estimated using 10,000 phenotype-based permutations. The ES was normalized to account for variable size of gene sets, and multiple testing was accounted for by calculating the FDR. Cytoscape version 3.7.1 (www.cytoscape.org) was used to visualize results from GSEA. To evaluate differential expression of individual genes by activity level, we used linear regression as implemented in the limma Bioconductor package. All other analyses were performed in R version 3.1.0. Gene sets with FDR <0.10 were considered statistically significant.
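To make the ranking metric and enrichment score concrete, the sketch below implements a simplified signal-to-noise ratio and a weighted Kolmogorov–Smirnov-style running sum in plain Python. It is an illustration under stated assumptions, not the GSEA software: it omits the software's minimum-variance adjustment, permutation testing, normalization, and FDR calculation, and the gene names and scores are invented.

```python
from statistics import mean, stdev


def signal_to_noise(high, low):
    """(mean_high - mean_low) / (sd_high + sd_low), using sample SDs.

    Simplified: no minimum-variance adjustment is applied.
    """
    return (mean(high) - mean(low)) / (stdev(high) + stdev(low))


def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Maximum deviation from zero of a weighted running-sum statistic.

    `ranked_genes` must be sorted by descending score. Genes in the set
    add |score|**weight (scaled to sum to 1); other genes subtract an
    equal share of 1.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(scores[g]) ** weight
                   for g, h in zip(ranked_genes, hits) if h)
    if hit_norm == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, genes "
                         "and have nonzero scores")
    running, best = 0.0, 0.0
    for g, h in zip(ranked_genes, hits):
        if h:
            running += abs(scores[g]) ** weight / hit_norm
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking metric for six genes (high vs. low activity).
scores = {"A": 2.0, "B": 1.0, "C": 0.5, "D": -0.5, "E": -1.0, "F": -2.0}
ranked = sorted(scores, key=scores.get, reverse=True)
es_top = enrichment_score(ranked, scores, {"A", "B"})
es_bottom = enrichment_score(ranked, scores, {"E", "F"})
print(es_top, es_bottom)
```

In this toy example, the set concentrated at the top of the ranked list receives a positive score and the set concentrated at the bottom receives a negative score, matching the sign convention described above.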
Results
Table 1 shows the characteristics of the subset of prostate cancer cases at diagnosis among whom gene expression profiling was performed. Men in the lowest quintile of vigorous activity were somewhat older, had a slightly higher PSA level at diagnosis, and were more likely to be diagnosed with advanced-stage disease and to have lethal disease compared with men in the highest quintile. Other clinical characteristics were similar among men in the highest compared with the lowest quintile of vigorous activity. Results from the individual gene analysis were largely non-significant; therefore, we focused on the pathway analysis (Supplementary Table S1).
Characteristics | All cases | Quintile 1 | Quintile 2 | Quintile 3 | Quintile 4 | Quintile 5 |
---|---|---|---|---|---|---|
Number | 118 | 24 | 23 | 24 | 23 | 24 |
Age at diagnosis, years (mean, SD) | 65.1 (6.7) | 66.6 (6.1) | 64.7 (6.8) | 65.2 (6.7) | 64.9 (7.0) | 63.9 (7.2) |
Year of diagnosis, N (%) | | | | | | |
Before 1990 | 9 (8) | 4 (17) | 1 (4) | 1 (4) | 1 (4) | 2 (8) |
1990–1993 | 20 (17) | 4 (17) | 3 (13) | 5 (21) | 8 (35) | 0 (0) |
After 1993 | 89 (75) | 16 (67) | 19 (83) | 18 (75) | 14 (61) | 22 (92) |
PSA level, ng/mL, median (Q1, Q3) [a] | 7.0 (5.3, 12.7) | 8.8 (5.8, 18.9) | 8.3 (6.3, 10.5) | 7.2 (4.8, 11.1) | 6.0 (5.3, 14.7) | 5.2 (4.0, 7.0) |
Pathologic Gleason score, N (%) | | | | | | |
6 | 8 (7) | 0 (0) | 0 (0) | 4 (17) | 3 (13) | 1 (4) |
7 | 81 (69) | 20 (83) | 15 (65) | 14 (58) | 13 (57) | 19 (79) |
8–10 | 29 (25) | 4 (17) | 8 (35) | 6 (25) | 7 (30) | 4 (17) |
Clinical stage, N (%) [b] | | | | | | |
T1/T2 | 92 (81) | 17 (71) | 19 (83) | 21 (88) | 13 (57) | 22 (92) |
T3 | 12 (11) | 3 (13) | 2 (9) | 2 (8) | 4 (17) | 1 (4) |
T4/N1/M1 | 9 (8) | 1 (4) | 1 (4) | 1 (4) | 5 (22) | 1 (4) |
Lethal, N (%) | 44 (37) | 12 (50) | 8 (35) | 7 (29) | 11 (48) | 6 (25) |
Total activity, MET-h/wk, median (Q1, Q3) [c] | 26.3 (15.2, 41.2) | 14.0 (7.5, 25.2) | 19.4 (7.4, 26.0) | 26.9 (16.4, 36.3) | 28.0 (19.3, 41.2) | 44.6 (35.7, 57.8) |
Vigorous activity, MET-h/wk, median (Q1, Q3) [c] | 4.4 (0.7, 12.8) | 0.1 (0.0, 0.1) | 1.3 (0.7, 2.4) | 4.4 (4.1, 5.0) | 11.3 (9.0, 12.7) | 28.9 (25.3, 34.5) |
Note: Quintile columns show cases according to vigorous activity quintile. Percentages may not sum to 100 due to rounding.
Abbreviation: PSA, prostate-specific antigen.
[a] PSA at diagnosis was missing for 21 men.
[b] Clinical TNM stage was missing for 5 men.
[c] Values are cumulative averages updated from baseline to prostate cancer diagnosis.
Overall, we observed a signal in both tumor and normal tissue as illustrated by enrichment of nominally significant GSEA P values in both tissue types (Supplementary Fig. S1). These P values indicate that while some pathways are common between tumor and normal tissue, others are not (Fig. 1). Controlling for multiple testing using the FDR, we did not identify any significant gene sets in tumor tissue at an FDR <0.10.
In normal tissue, we identified 25 KEGG gene sets downregulated among men in the highest versus lowest quintile of vigorous activity (FDR <0.10; Fig. 2; Supplementary Table S2). Among the sets identified were eight cancer-defined pathways and several pathways related to cellular immune response, such as B- and T-cell receptor signaling, and signal transduction. The extensive overlap of genes across these pathways is depicted in Fig. 2. Further analysis of the leading-edge genes showed that MAPK1, RAF1, PIK3R3, HRAS, AKT2, SOS1 and SOS2 were present in more than half of the 25 gene sets identified in normal tissue (Fig. 3).
In analyses stratified by lethal status, we identified 25 significant gene sets in normal tissue of lethal cases but none in that of indolent cases (Supplementary Tables S3 and S4). Of the 25 gene sets, 12 were identified in the analysis of adjacent normal tissue overall, whereas 13 were unique to the lethal cases. When we examined recent vigorous activity, we identified 11 gene sets significantly enriched (FDR <0.10) in adjacent normal tissue, fewer gene sets than we identified in association with long-term activity (Supplementary Table S5).
We conducted a sensitivity analysis in adjacent normal tissue excluding 15 cases with gene expression assayed from transurethral resection of the prostate (TURP) specimens. Fewer gene sets (21) reached significance at FDR <0.10, likely due to the decreased sample size. The top gene sets were largely the same as when both TURP and radical prostatectomy (RP) specimens were included (Supplementary Table S6). None of the Hallmark gene sets reached statistical significance at FDR <0.10 in tumor or adjacent normal tissue (Supplementary Table S7).
Discussion
Our study evaluated the associations between long-term vigorous activity before prostate cancer diagnosis and gene expression alterations in tumor and adjacent normal tissue. We identified 25 KEGG gene sets that were downregulated in men who reported high compared with low vigorous activity in the tumor-adjacent normal tissue. These gene sets included several cancer-related, immune system, and signal transduction pathways. To our knowledge, our study is the first to examine the associations between long-term, pre-diagnostic vigorous activity and gene expression alterations in prostate tissue.
Physical activity, particularly long-term activity, has been linked to epigenetic changes in breast, gastric, and colorectal cancers (12). There is limited evidence regarding the relationship between physical activity and gene expression in prostate cancer. One study examining prostate gene expression and post-diagnostic vigorous activity in low-risk prostate cancer identified associations with genes related to cell signaling, metabolism, and DNA repair pathways (10). In a cohort of men with localized prostate cancer, pre-diagnostic vigorous activity was associated with DNA methylation in the CRACR2A gene (9).
We identified several cancer-related pathways that are downregulated in adjacent normal tissue among men who performed long-term high compared with low amounts of vigorous activity. These findings suggest that physical activity may affect biological pathways that are common across different cancers. For example, the MAPK, PI3K-Akt, and p53 signaling pathways are common to many of the cancer gene sets identified in our study. The present study also identified several immune system pathways downregulated in men engaged in high versus low vigorous activity. This is consistent with prior studies showing that physical activity may reduce oxidative stress and improve immune functions (7).
Our leading-edge analysis showed that many of the gene sets identified by GSEA had leading-edge genes in common. Consideration of these individual genes provides additional understanding of the potential biological effects of physical activity. MAPK1, a member of the MAPK signaling pathway, and PIK3R3, a member of the PI3K pathway, were identified as leading-edge genes in our analysis. Upregulation of the MAPK pathway is associated with reduced survival in castration-resistant prostate cancer (25). The PI3K pathway regulates cell growth, survival, proliferation, metabolism, and angiogenesis, and is altered in metastatic prostate cancer (26). TP53, another leading-edge gene, encodes the tumor protein p53, a well-known tumor suppressor. Regular exercise may alter the IGF-axis and reduce cell proliferation and increase apoptosis through expression of p53 in prostate cancer cells (27).
We identified differentially expressed gene sets in adjacent normal tissue whereas enriched gene sets in tumor tissue did not reach significance. Given its proximity to the tumor, adjacent normal tissue may exhibit similar characteristics or undergo processes similar to those in the tumor. It is possible that, by the time that cancer is detectable, cellular activity is dysregulated to such an extent that relatively small differences in gene expression due to lifestyle factors become more difficult to detect. There may also be an interaction between normal and tumor tissue that affects disease progression.
Strengths of this study include its long-term, prospective physical activity assessment and ability to examine tumor and normal tissue. Compared with individual gene analysis, GSEA improves power when effects of individual genes are small and facilitates interpretation of biological mechanisms (24). There are limitations to this study. Our sample size was moderate and may have limited our power to detect significant associations. This was an observational study; therefore, we cannot conclude that differences in vigorous activity are causally related to gene expression changes in prostate tissue. Although physical activity was assessed by self-report using a validated questionnaire, there may be misclassification of men according to their physical activity level. In addition, because participants in the HPFS are primarily Caucasian, our findings may not be generalizable to other racial/ethnic groups.
In conclusion, our study identified sets of functionally related genes differentially expressed in adjacent normal prostate tissue among men who engage in high compared with low vigorous activity; no gene sets reached statistical significance in tumor tissue. This exploratory study suggests that vigorous activity may influence prostate tissue through changes in cancer-related, immune system, cell signaling, and other pathways in men with prostate cancer.
Authors' Disclosures
G. Parmigiani reports equity in Phaeno Biotechnologies and CRA Health, and currently consults for the Foundation Medicine Institute and for Delphi Diagnostics. S.P. Finn reports personal fees from Roche Pharmaceutical, MSD, and Pfizer outside the submitted work. M.G. Vander Heiden reports personal fees from Agios Pharmaceuticals, Aeglea Biotherapeutics, iTeos Therapeutics, and Auron Therapeutics outside the submitted work. No disclosures were reported by the other authors.
Disclaimer
The authors assume full responsibility for analyses and interpretation of these data.
Authors' Contributions
C.H. Pernar: Conceptualization, formal analysis, methodology, writing–original draft, project administration. G. Parmigiani: Conceptualization, methodology, writing–review and editing. E.L. Giovannucci: Conceptualization, methodology, writing–review and editing. E.B. Rimm: Conceptualization, methodology, writing–review and editing. S. Tyekucheva: Resources, methodology, writing–review and editing. M. Loda: Resources, methodology, writing–review and editing. S.P. Finn: Resources, methodology, writing–review and editing. M.G. Vander Heiden: Methodology, writing–review and editing. M. Fiorentino: Resources, methodology, writing–review and editing. E.M. Ebot: Conceptualization, supervision, methodology, writing–review and editing. L.A. Mucci: Conceptualization, resources, supervision, methodology, writing–review and editing.
Acknowledgments
This work was supported in part by National Institutes of Health (T32ES007069 and T32CA009001, to C.H. Pernar; P50CA090381 and R01CA136578, to L.A. Mucci; R01CA174206, to G. Parmigiani; 4P30CA006516-51, to L.A. Mucci and G. Parmigiani); the Prostate Cancer Foundation Young Investigator Awards (to L.A. Mucci and S.P. Finn); and the Prostate Cancer Foundation Challenge Award (to L.A. Mucci). The Health Professionals Follow-up Study is supported by U01CA167552 from the National Cancer Institute. We would like to thank the participants and staff of the HPFS for their valuable contributions as well as the following state cancer registries for their help: AL, AZ, AR, CA, CO, CT, DE, FL, GA, ID, IL, IN, IA, KY, LA, ME, MD, MA, MI, NE, NH, NJ, NY, NC, ND, OH, OK, OR, PA, RI, SC, TN, TX, VA, WA, WY. In particular, we would like to acknowledge Elizabeth Frost-Hawes, Sioban Saint-Surin, Eleni Konstantis, Liza Gazeeva, Robert Sheahan, Ann Fischer, and Ruifeng Li.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.