Background: The androgen receptor (AR) is an essential gene in prostate cancer pathogenesis and progression. Genetic variation in AR exists, including a polymorphic CAG repeat sequence that is inversely associated with transcriptional activity. Experimental data suggest that heightened AR activity facilitates formation of TMPRSS2:ERG, a gene fusion present in approximately 50% of tumors of patients with prostate cancer.

Methods: We undertook a nested case–control study to investigate the hypothesis that shorter CAG repeat length would be associated with prostate cancer risk defined by TMPRSS2:ERG status. The study included 291 men with prostate cancer (147 ERG-positive) and 1,221 cancer-free controls. ORs and 95% confidence intervals (CI) were calculated using logistic regression.

Results: Median CAG repeat length (interquartile range) among controls was 22 (20–24). Men with shorter CAG repeats had an increased risk of ERG-positive (OR, 1.07 per 1 repeat decrease; 95% CI, 1.00–1.14), but not ERG-negative prostate cancer (OR, 0.99 per 1 repeat decrease; 95% CI, 0.93–1.05).

Conclusions: These data suggest that shorter CAG repeats are specifically associated with development of TMPRSS2:ERG–positive prostate cancer.

Impact: Our results provide supportive evidence that androgen signaling underlies the development of prostate tumors that harbor TMPRSS2:ERG. Moreover, these results suggest that TMPRSS2:ERG may represent a unique molecular subtype of prostate cancer with an etiology distinct from TMPRSS2:ERG–negative disease. Cancer Epidemiol Biomarkers Prev; 23(10); 2027–31. ©2014 AACR.

The androgen receptor (AR) is a nuclear transcription factor that mediates the actions of testosterone and dihydrotestosterone. AR signaling is critical for prostate growth and maintenance. In the context of cancer, altered AR signaling is implicated in prostate cancer development, and almost all prostate tumors depend on androgens for growth. Located on the X chromosome, AR contains multiple genetic variants, including a polymorphic CAG repeat sequence of variable length in exon 1. This CAG repeat encodes a polyglutamine tract in the N-terminal domain of the protein (1). Variability in the trinucleotide repeat length has functional consequences and is inversely correlated with the transcriptional activity of AR (2).

Shorter versus longer CAG repeats have been associated with higher risk of prostate cancer overall in some but not all studies (3–7). The role of the AR CAG repeat is of interest in the context of TMPRSS2:ERG, the gene fusion present in the tumors of about half of Caucasian patients with prostate cancer. This common fusion event involves the androgen-regulated promoter of TMPRSS2 and the erythroblast transformation-specific (ETS) transcription factor family member ERG (8). Prior data suggest that TMPRSS2:ERG is an early event in prostate cancer development and that fusion-positive cancer represents a molecularly distinct subgroup (9). In experimental models, androgen signaling induces spatial proximity of the TMPRSS2 and ERG genomic loci, both located on chromosome 21q22 (10). Subsequent treatment with γ irradiation, which causes DNA double-strand breaks, leads to formation of the TMPRSS2:ERG gene fusion, suggesting that heightened AR signaling can facilitate TMPRSS2:ERG formation. Supporting this notion, in a case-only study of 40 men with prostate cancer, Bastus and colleagues found suggestive evidence that the number of CAG repeats is lower in TMPRSS2:ERG–positive versus -negative prostate cancer (11).

We sought to extend this work to investigate whether CAG repeat length in AR is related to the risk of TMPRSS2:ERG–positive or -negative prostate cancer in a nested case–control study among 291 men with prostate cancer and 1,221 cancer-free controls. We further tested whether six SNPs capturing variation across AR are associated with risk of TMPRSS2:ERG–positive or -negative prostate cancer.

Study population

We included men from previously conducted, prospective case–control studies nested within the Physicians' Health Study (PHS) and the Health Professionals Follow-up Study (HPFS) cohorts for whom genotyping and tumor data were available (4). The PHS was initiated in 1982 as a randomized trial of 22,071 U.S. male physicians age 40 to 84 years who were free of cardiovascular disease and cancer. HPFS is an ongoing prospective study of 51,529 male health professionals age 40 to 75 years initiated in 1986. In both studies, blood specimens were collected from a subset of the men before cancer diagnosis (N = 14,500 in PHS; N = 18,000 in HPFS), and DNA was extracted from whole blood and stored.

The original case–control studies that measured genetic variants in AR included 1,423 incident prostate cancer cases diagnosed from 1982 to 2004 and 1,467 matched controls (4–6). In the PHS samples, we previously reported that shorter CAG repeats were related to higher overall risk of prostate cancer (5) but found no association in the HPFS (6). For 291 of the 1,423 cases, archival tumor tissue was also available for characterization of TMPRSS2:ERG. Because more than 96% of participants in the PHS and HPFS are Caucasian, the study was restricted to men of European ancestry. The final sample for this investigation included 291 men with prostate cancer and known TMPRSS2:ERG status, and 1,221 cancer-free controls. Written consent had previously been obtained from all participants, and the Institutional Review Boards of Partners HealthCare and the Harvard School of Public Health approved the study.

Genotyping

CAG repeat length in AR was determined by PCR, running the amplified fragments on a denaturing polyacrylamide gel with automated fluorescence detection of the fragments and sizing (Genescan) at the Dana-Farber Cancer Institute (5). Data on the CAG repeat length were available for 269 of the prostate cancer cases and 1,154 of controls. Six polymorphic AR variants (rs6152, rs962458, rs1204038, rs2361634, rs1337080, and rs1337082) were selected to capture the haplotype variation in the study population using the Tagger program. Genotyping was done using the fluorogenic 5′-endonuclease assay (TaqMan) with the ABI Prism 7900 (Applied Biosystems) at the Harvard School of Public Health (4). Replicate samples were included in all genotyping assays to assess quality control and showed excellent concordance of genotyping data.

TMPRSS2:ERG status in tumors

The presence or absence of TMPRSS2:ERG was characterized on tumor tissue available from a biorepository of archival radical prostatectomy (95%) and transurethral resection of the prostate (5%) tumor specimens for men with prostate cancer in PHS and HPFS. For each case, the pathology team reviewed hematoxylin and eosin slides to confirm the presence of prostate cancer, assign standardized Gleason grade, and to identify areas of tumor for construction of tissue microarrays. Tissue microarrays were constructed by sampling at least three 0.6-mm cores of tumor per case from the dominant nodule or nodule with the highest Gleason pattern. To characterize TMPRSS2:ERG status, we performed IHC on 5-μm sections of tissue microarrays to assess ERG protein tumor expression (12), which has been shown to be strongly correlated with fusion status assessed by FISH (13). Briefly, 5-μm sections were deparaffinized and treated with citrate buffer for antigen retrieval. We applied ERG antibody (Clone ID: EPR3864; Epitomics) at 1:100 for 1 hour, followed by the BioGenex SS Multilink secondary antibody, and visualized using the DAB substrate Kit (Vector Laboratories). We classified tumors as ERG-positive (i.e., carrying the TMPRSS2:ERG gene fusion) if at least one core stained positive for ERG and ERG-negative if all cores stained negative for ERG. Eighty-five percent of the ERG-positive cases were positive on all tissue microarray cores. On a subset of cases in the PHS for whom TMPRSS2:ERG status was also assessed by FISH, we saw high agreement (>93%) with immunohistochemical evaluation.
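The per-case classification rule described above can be sketched in a few lines of Python. This is an illustration only, not the study's actual pipeline: the function name `classify_erg` and the handling of cases with no evaluable cores are assumptions introduced here.

```python
from typing import Optional, Sequence


def classify_erg(core_stains: Sequence[bool]) -> Optional[str]:
    """Classify one tumor from its per-core ERG staining results.

    Each element is True if that tissue microarray core stained
    positive for ERG and False if it stained negative.
    """
    if not core_stains:
        # Assumption: a case with no evaluable cores is left unclassified.
        return None
    # ERG-positive if at least one core is positive; ERG-negative only
    # if every core is negative.
    return "ERG-positive" if any(core_stains) else "ERG-negative"


print(classify_erg([False, True, False]))
print(classify_erg([False, False, False]))
```

Under this rule a single positive core is sufficient, so heterogeneous tumors are counted as ERG-positive.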

Statistical analysis

Unconditional logistic regression analysis was used to calculate ORs and 95% confidence intervals (CI) of the association between CAG repeat length (continuous, and categorical: ≤19, 20–21, 22–23, and ≥24 as the reference) and prostate cancer risk by ERG status, comparing ERG-positive cases with cancer-free controls and, separately, ERG-negative cases with cancer-free controls. In addition, we assessed associations with the six AR SNPs (binary, minor allele as referent). All analyses were adjusted for age at blood draw (continuous). We used the method described by Altman and Bland (14) to test whether the association between CAG repeat length (continuous) and prostate cancer risk differed by ERG tumor status.
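The Altman and Bland approach compares two estimates on the log scale by dividing their difference by the standard error of that difference. The sketch below shows the calculation in Python for two ORs with 95% CIs. It is a minimal illustration, not the authors' code: the function names are hypothetical, standard errors are recovered from the reported CI bounds, and the two estimates are treated as independent, which is only an approximation when both case groups are compared with the same controls.

```python
import math


def se_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Standard error of a log OR, recovered from its 95% CI bounds."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def altman_bland_test(or1, ci1, or2, ci2):
    """Return (z, two-sided P) for the difference between two log ORs.

    Assumes the two estimates are independent (hypothetical helper).
    """
    diff = math.log(or1) - math.log(or2)
    se_diff = math.sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    z = diff / se_diff
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Rounded per-repeat estimates as given in the abstract.
z, p = altman_bland_test(1.07, (1.00, 1.14), 0.99, (0.93, 1.05))
print(f"z = {z:.2f}, P = {p:.3f}")
```

Because published ORs and CIs are rounded to two decimals, a P value recomputed this way need not match one calculated from the unrounded model estimates.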

Clinical characteristics of the 291 prostate cancer cases by ERG status are shown in Table 1. The prevalence of TMPRSS2:ERG among the cases was 51%. Mean age at cancer diagnosis was 66 years, and 18% of men had pathologic Gleason 8–10 tumors. In this set of cases, the clinical characteristics were similar for men who had ERG-positive compared with ERG-negative prostate cancer. The mean (SD) age at blood draw for the controls was 63.2 (5.0) years. Among cases, the mean follow-up from blood draw to diagnosis was 6.4 years, and was similar for ERG-positive (6.5 years) and ERG-negative (6.4 years) prostate cancer.

The median CAG repeat length among controls was 22 repeats (interquartile range, IQR, 20–24), compared with 21 repeats (IQR, 20–23) for ERG-positive prostate cancer and 22 repeats (IQR, 20–24) for ERG-negative disease. Men with shorter AR CAG repeats had a higher risk of developing ERG-positive prostate cancer (OR, 1.07 per 1 repeat decrease; 95% CI, 1.00–1.14), whereas there was no association between CAG repeat length and risk of ERG-negative prostate cancer (OR, 0.99 per 1 repeat decrease; 95% CI, 0.93–1.05; Table 2). The test for heterogeneity of the CAG repeat association by ERG tumor status was of borderline significance (P = 0.06). Compared with men with the longest (≥24) CAG repeats, men in the category of shortest (≤19) CAG repeats had a 40% higher risk of ERG-positive prostate cancer, although this was not statistically significant (OR, 1.40; 95% CI, 0.82–2.39); the corresponding OR for ERG-negative prostate cancer was 0.96 (95% CI, 0.57–1.61). In the case-only comparison, the OR for ERG-positive versus -negative prostate cancer was 1.08 (95% CI, 0.99–1.20) per 1 shorter CAG repeat.
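To put the per-repeat estimate on a more intuitive scale, a per-unit OR from a logistic model can be rescaled to a difference of several repeats by multiplying the log OR by that difference. The short Python sketch below illustrates the arithmetic; the function name `scale_or` is hypothetical, and the calculation assumes the log-linear dose-response implied by modeling repeat length as a continuous term.

```python
import math


def scale_or(or_per_unit: float, units: float) -> float:
    """Rescale a per-unit OR to a difference of `units`,
    assuming a log-linear relation (hypothetical helper)."""
    return math.exp(units * math.log(or_per_unit))


# OR for ERG-positive disease implied by a 5-repeat decrease,
# starting from the reported OR of 1.07 per 1 repeat decrease.
print(round(scale_or(1.07, 5), 2))
```

This rescaling is descriptive only; it does not replace the categorical estimates, which are fitted separately and carry their own confidence intervals.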

We observed no significant association between any of the six polymorphic AR genetic variants (rs962458, rs6152, rs1204038, rs2361634, rs1337080, and rs1337082) and risk of either ERG-positive or ERG-negative prostate cancer (Supplementary Table S1).

In this integrative patho-epidemiology study, we found shorter germline CAG repeat length in AR to be associated with higher risk of ERG-positive prostate cancer, whereas there was no association between CAG repeat length and risk of ERG-negative prostate cancer. This finding confirms the prior study by Bastus and colleagues (11), who found that mean CAG repeat length was shorter in ERG-positive (mean length = 20 repeats) compared with ERG-negative (mean length = 21 repeats) cancer in a study of 40 patients with prostate cancer. Moreover, the results are in line with experimental evidence that heightened AR signaling induces TMPRSS2:ERG formation (10). One of the proposed mechanisms of fusion formation is that AR signaling induces spatial proximity, leading to colocalization of the 5′ and 3′ ends of TMPRSS2 and ERG, which may increase the probability of a fusion event occurring. In prostate cancer and noncancer cell lines, TMPRSS2:ERG formation is androgen dose dependent and may be the result of long-term androgen exposure (11). Thus, if shorter CAG repeat length drives increased transactivation of AR and is a proxy for long-term androgen exposure, there may be an increased likelihood for the fusion to occur and fusion-positive prostate cancer to result.

We found no association between haplotype tagging SNPs in AR and risk of ERG-positive or ERG-negative prostate cancer. This is in line with a prior study that showed no association between these variants and prostate cancer risk overall (4).

There are strengths and limitations of our investigation to consider. This is the first genetic epidemiology study of variants in AR and risk of prostate cancer defined by TMPRSS2:ERG status. The study integrates data on inherited susceptibility and tumor biomarkers within well-defined and prospective cohorts of men. We comprehensively investigated genetic variation in AR, including the CAG repeat polymorphism and common SNPs to capture inherited susceptibility across the gene. This analysis was limited to men for whom tumor tissue was available, primarily radical prostatectomy specimens. Among the patients with prostate cancer in the two cohorts who had surgery, there were no differences in clinical features between those for whom we did and did not have tissue available. However, the prostate cancer cases in this study tended to be slightly younger at diagnosis, have lower PSA levels, and be somewhat less likely to have T3 or higher stage disease than the cases without tissue, who had primarily undergone radiation therapy or received androgen deprivation therapy. Although the differences in clinical features are not large, the generalizability of these findings to all men with prostate cancer remains to be established. Although our study was based on a roughly 7-fold greater number of cases than that of Bastus and colleagues and also included controls, we are somewhat limited in statistical power and may not have detected small associations with inherited AR variants. Future epidemiologic studies may require consortium efforts to investigate further the association between CAG repeats and TMPRSS2:ERG formation in prostate cancer.

The CAG repeat lies in a domain critical for full in vivo transcriptional activation activity of the receptor. The length of this polymorphic repeat is inversely associated with the activity of an androgen-responsive reporter in androgen-dependent prostate cell lines (15). Within the NCI Breast and Prostate Cancer Cohort Consortium (BPC3), longer CAG repeat length was intriguingly associated with higher levels of both testosterone and estradiol in the circulation (4), and the age-related decline in testosterone is partly determined by CAG repeat length (16). Given the reductions in AR activity associated with longer repeat length, the elevated hormone levels may represent a compensatory mechanism to achieve a balance in hormone signaling.

In the BPC3 study of 5,777 prostate cancer cases and 6,402 controls (4), there was no association between CAG repeat length and prostate cancer risk overall. Given the differing prevalence of both the fusion and average CAG repeat length across ethnicities/study populations (17, 18), including a lower prevalence of the fusion among Asian (Japanese) populations who also have longer CAG repeats on average, but also a lower prevalence of the fusion among African Americans who have shorter CAG repeats on average, our findings may partly explain why the association between CAG repeats in AR and total prostate cancer risk varies between studies. Further studies are needed to explore this notion.

In summary, data from this epidemiologic study provide supportive evidence that androgen signaling underlies the development of prostate tumors that harbor TMPRSS2:ERG. Moreover, these results suggest that TMPRSS2:ERG may represent a unique molecular subtype of prostate cancer with an etiology distinct from TMPRSS2:ERG–negative disease.

No potential conflicts of interest were disclosed.

Conception and design: S. Yoo, A. Pettersson, M.J. Stampfer, M. Brown, P.W. Kantoff, L.A. Mucci

Development of methodology: P.W. Kantoff, L.A. Mucci

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): R.T. Lis, S. Lindstrom, M.J. Stampfer, M. Loda, E.L. Giovannucci, L.A. Mucci

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): S. Yoo, A. Pettersson, K.M. Jordahl, R.T. Lis, A. Meisner, E.C. Stack, P. Kraft, M. Loda, E.L. Giovannucci, L.A. Mucci

Writing, review, and/or revision of the manuscript: S. Yoo, A. Pettersson, K.M. Jordahl, R.T. Lis, S. Lindstrom, E.J. Nuttall, E.C. Stack, M.J. Stampfer, P. Kraft, M. Brown, M. Loda, E.L. Giovannucci, P.W. Kantoff, L.A. Mucci

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): S. Yoo, K.M. Jordahl, R.T. Lis, E.J. Nuttall

Study supervision: L.A. Mucci

The authors thank participants in the PHS and HPFS for their dedicated commitment to these research studies. They also acknowledge the contributions of several members of the research team for their efforts: Li Moy, Elizabeth Frost-Hawes, Lauren McLaughlin, and Chungdak Li. The tissue microarrays were constructed by the Tissue Microarray Core Facility at the Dana-Farber/Harvard Cancer Center.

This work was financially supported by the Dana-Farber/Harvard Cancer Center Specialized Programs of Research Excellence (SPORE) in Prostate Cancer P50CA090381-08 (to P.W. Kantoff and E.L. Giovannucci), and the NCI CA136578 (to L.A. Mucci and E.L. Giovannucci), CA141298 (to M.J. Stampfer), CA097193 (to J.M. Gaziano), PO1 CA055075 (to W. Willett and E.L. Giovannucci), UM1CA167552 (to W. Willett), and U01CA098233 (to D.J. Hunter, P. Kraft, and S. Lindstrom). L.A. Mucci is supported by the Prostate Cancer Foundation. A. Pettersson is supported by the Swedish Research Council 2009-7309 and the Royal Physiographic Society in Lund, Sweden.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Faber PW, Kuiper GG, van Rooij HC, van der Korput JA, Brinkmann AO, Trapman J. The N-terminal domain of the human androgen receptor is encoded by one, large exon. Mol Cell Endocrinol 1989;61:257–62.
2. Chamberlain NL, Driver ED, Miesfeld RL. The length and location of CAG trinucleotide repeats in the androgen receptor N-terminal domain affect transactivation function. Nucleic Acids Res 1994;22:3181–6.
3. Gu M, Dong X, Zhang X, Niu W. The CAG repeat polymorphism of androgen receptor gene and prostate cancer: a meta-analysis. Mol Biol Rep 2012;39:2615–24.
4. Lindstrom S, Ma J, Altshuler D, Giovannucci E, Riboli E, Albanes D, et al. A large study of androgen receptor germline variants and their relation to sex hormone levels and prostate cancer risk. Results from the National Cancer Institute Breast and Prostate Cancer Cohort Consortium. J Clin Endocrinol Metab 2010;95:E121–7.
5. Giovannucci E, Stampfer MJ, Krithivas K, Brown M, Dahl D, Brufsky A, et al. The CAG repeat within the androgen receptor gene and its relationship to prostate cancer. Proc Natl Acad Sci U S A 1997;94:3320–3.
6. Platz EA, Leitzmann MF, Rifai N, Kantoff PW, Chen YC, Stampfer MJ, et al. Sex steroid hormones and the androgen receptor gene CAG repeat and subsequent risk of prostate cancer in the prostate-specific antigen era. Cancer Epidemiol Biomarkers Prev 2005;14:1262–9.
7. Gilligan T, Manola J, Sartor O, Weinrich SP, Moul JW, Kantoff PW. Absence of a correlation of androgen receptor gene CAG repeat length and prostate cancer risk in an African-American population. Clin Prostate Cancer 2004;3:98–103.
8. Tomlins SA, Rhodes DR, Perner S, Dhanasekaran SM, Mehra R, Sun XW, et al. Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science 2005;310:644–8.
9. Rubin MA, Maher CA, Chinnaiyan AM. Common gene rearrangements in prostate cancer. J Clin Oncol 2011;29:3659–68.
10. Mani RS, Tomlins SA, Callahan K, Ghosh A, Nyati MK, Varambally S, et al. Induced chromosomal proximity and gene fusions in prostate cancer. Science 2009;326:1230.
11. Bastus NC, Boyd LK, Mao X, Stankiewicz E, Kudahetti SC, Oliver RT, et al. Androgen-induced TMPRSS2:ERG fusion in nonmalignant prostate epithelial cells. Cancer Res 2010;70:9544–8.
12. Pettersson A, Graff RE, Bauer SR, Pitt MJ, Lis RT, Stack EC, et al. The TMPRSS2:ERG rearrangement, ERG expression, and prostate cancer outcomes: a cohort study and meta-analysis. Cancer Epidemiol Biomarkers Prev 2012;21:1497–509.
13. Park K, Tomlins SA, Mudaliar KM, Chiu YL, Esgueva R, Mehra R, et al. Antibody-based detection of ERG rearrangement-positive prostate cancer. Neoplasia 2010;12:590–8.
14. Altman DG, Bland JM. Interaction revisited: the difference between two estimates. BMJ 2003;326:219.
15. Beilin J, Ball EM, Favaloro JM, Zajac JD. Effect of the androgen receptor CAG repeat polymorphism on transcriptional activity: specificity in prostate and non-prostate cell lines. J Mol Endocrinol 2000;25:85–96.
16. Krithivas K, Yurgalevitch SM, Mohr BA, Wilcox CJ, Batter SJ, Brown M, et al. Evidence that the CAG repeat in the androgen receptor gene is associated with the age-related decline in serum androgen levels in men. J Endocrinol 1999;162:137–42.
17. Magi-Galluzzi C, Tsusuki T, Elson P, Simmerman K, LaFargue C, Esgueva R, et al. TMPRSS2-ERG gene fusion prevalence and class are significantly different in prostate cancer of Caucasian, African-American and Japanese patients. Prostate 2011;71:489–97.
18. Buchanan G, Yang M, Cheong A, Harris JM, Irvine RA, Lambert PF, et al. Structural and functional consequences of glutamine tract variation in the androgen receptor. Hum Mol Genet 2004;13:1677–92.