Abstract
Prostate cancer is the second most common cancer among men worldwide. Alterations in the DNA methylation pattern can be one of the leading causes for prostate cancer formation. This study is the first high-throughput sequencing study investigating genome-wide DNA methylation patterns in a large cohort of 51 tumor and 53 benign prostate samples using methylated DNA immunoprecipitation sequencing. Comparative analyses identified more than 147,000 cancer-associated epigenetic alterations. In addition, global methylation patterns show significant differences based on the TMPRSS2–ERG rearrangement status. We propose the hypermethylation of miR-26a as an alternative pathway of ERG rearrangement-independent EZH2 activation. The observed increase in differential methylation events in fusion–negative tumors can explain the tumorigenic process in the absence of genomic rearrangements.
Significance: In contrast to TMPRSS2–ERG-rearranged tumors, the pathomechanism for gene fusion–negative tumors is completely unclear. Using a sequencing-based approach, our work uncovers significant global epigenetic alterations in TMPRSS2–ERG gene fusion–negative tumors and provides a mechanistic explanation for the tumor formation process. Cancer Discov; 2(11); 1024–35. ©2012 AACR.
Read the Commentary on this article by Alumkal and Herman, p. 979.
This article is featured in Highlights of This Issue, p. 961
Introduction
More than 900,000 men are diagnosed with prostate cancer each year, making it the second most common cancer among men worldwide (1). The clinical course of prostate cancer is heterogeneous, ranging from indolent tumors requiring no therapy during the patient's lifetime to highly aggressive prostate cancer developing into a metastatic disease. Despite its high prevalence, the clinical management of prostate cancer is limited by the low specificity of the existing diagnostic and prognostic tools and the lack of effective systemic therapeutic strategies.
Recent years have brought about a marked extension of our understanding of the somatic basis of prostate cancer. With 0.33 somatic protein altering mutations per megabase (Mb), the mutation frequency in prostate cancer is significantly lower than in breast (∼1/Mb) or lung cancer (3.8/Mb) and lies within the lowest range of cancer-associated mutations (2, 3). A large proportion of prostate cancers harbor gene fusions involving members of the ETS family and the androgen-regulated transmembrane protease serine 2 (TMPRSS2) gene (4), most commonly involving the v-ets erythroblastosis virus E26 oncogene homolog (ERG) that is observed in approximately 50% of all prostate cancer cases (5). The overexpression of ERG is thought to be sufficient for the initiation of prostate intraepithelial neoplasia lesions, a precursor of prostate cancer (6). Other rearrangements are less frequent and, interestingly, tend to be present in prostate cancers harboring the TMPRSS2–ERG gene fusion (FUS+; refs. 7, 8). Recent work has uncovered frequent somatic deletions at 5q21 and 6q21 including CHD1 and FOXO3 as well as recurrent SPOP mutations within ETS gene fusion–negative prostate cancers, suggesting distinct subclasses within gene fusion–negative tumors (9, 10).
During prostate cancer progression, one of the most significantly upregulated genes is enhancer of zeste homolog 2(EZH2), which is also associated with invasiveness and high malignancy of several other tumor entities (11, 12). The polycomb group protein EZH2 is responsible for silencing of homeobox (HOX) genes through H3K27 methylation during tissue development (13–15), and it links histone modifications to DNA methylation as EZH2 target genes may consecutively become hypermethylated (16, 17). In prostate cancer, aberrant DNA methylation is found to be associated with altered EZH2 expression, correlates with tumor progression, and is proposed to be one of the earliest events in oncogenesis (12, 13, 15, 18–20). Hypermethylation of the glutathione S-transferase pi 1 (GSTP1) is considered to be a cardinal gatekeeper event in early prostate cancer development (21).
Most epigenetic studies conducted so far are focused on specific gene regions and interrogate cancer-associated differential methylation events in general without asking if epigenetic alterations might differ in specific subgroups of prostate cancer. This might be particularly important for TMPRSS2–ERG-negative tumors, in which the pathomechanism of oncogenesis is so far unclear. Recent identifications of different patterns of DNA methylation in promoters of benign, cancerous, and metastatic prostate tissues as well as decreased levels of LINE-1 methylation in fusion-negative prostate cancer revealed a more detailed picture of epigenetic alterations in prostate cancer (22). However, the epigenetic landscapes of prostate cancer subgroups are still not sufficiently investigated to draw any conclusions. To understand alterations in the ERG fusion–negative class of prostate cancer, we used a deep sequencing readout of methylated DNA immunoprecipitation sequencing (MeDIP-Seq) to screen 51 tumor and 53 benign prostate tissues (23–26). We integrated the results with gene and microRNA (miRNA) expression analyses and proposed a model for the development of aberrant DNA methylation patterns in ERG fusion–negative (FUS−) prostate cancers.
Results
Catalogs of Tumor-Specific Epigenetic Alterations
For the analysis of genome-wide methylation patterns in 51 prostate cancers and 53 normal prostate tissues, we used MeDIP followed by high-throughput sequencing by oligonucleotide ligation and detection (SOLiD) (26). All tumors selected for this study were staged pT2–pT4 and had Gleason scores ranging from 6 to 9. The TMPRSS2–ERG fusion was present in 17 tumors; 20 tumors were TMPRSS2–ERG-negative as determined by PCR (Supplementary Table S1).
For a validation of the genome-wide MeDIP-Seq data we used a focused, bisulfite-based, mass spectrometry approach [BS-MS (Epityper)] in 79 heterogeneous regions covering promoters, high and low CpG content areas, and endo- and exogenous regions (Supplementary Table S2 and Supplementary Fig. S1A and S1B). Pearson correlation coefficients greater than 0.8 were achieved. To assess whether the identified loci of differential methylations are tumor cell–specific alterations, and are not due to different cellular compositions of tumor and normal samples, we also applied BS-MS analyses (EpiTyper) to investigate 36 differentially methylated regions on matched micro- and macrodissected samples from two additional patients and achieved correlation values greater than 0.91 (Supplementary Fig. S1B and S1C).
Principal component analyses (PCA) of global methylation revealed comprehensive differences between tumor and normal samples (Supplementary Fig. S2A). Similar results were obtained after restricting the analyses to regions outside of somatic copy number alterations (sCNA; Supplementary Fig. S2A). Within the genome-wide MeDIP-Seq data sets, we identified approximately 147,000 cancer-specific differentially methylated regions (cDMR): 85,406 hypermethylated and 61,308 hypomethylated [Benjamini–Hochberg (BH) corrected Mann–Whitney test P < 0.05; Supplementary Table S3A], including the well-characterized differentially methylated region in GSTP1 (Supplementary Fig. S2B). We identified additional biomarkers that were equally, or even more, significantly differentially methylated than previously described ones. From a recently published list by Kobayashi and colleagues (27) containing 87 potential biomarker regions, we confirmed 84 regions with our high-throughput sequencing approach. Notably, within our list of significantly differentially methylated regions, their top candidate was ranked 36th and GSTP1 108th. Of the 110 most significant differentially methylated regions, 887 combinations of 2 regions already allowed a perfect separation of tumor and normal samples and might be used as prostate cancer biomarkers (Supplementary Fig. S2C and S2D).
Hypermethylated regions were more homogeneous among the tumor samples (lower P values) than hypomethylated regions. This might indicate a specific and site-directed methylation process as compared with a more unspecific global loss of methylation.
Although most hypomethylated regions were detected in intergenic regions, only a quarter of the hypermethylated bins were located outside of genes and promoters (Supplementary Fig. S2E). We found a peak of hypermethylation within ±2 kb windows around the transcription start sites (TSS). In addition, we detected that hypermethylation is enriched within conserved [OR = 1.47, confidence interval (95% CI) = 1.43–1.50, P = 1.08 × 10−209] and miRNA (OR = 2.05, CI = 1.83–2.29, P = 8.79 × 10−32) regions, whereas hypomethylation of these sites is lower than expected (ORcons = 0.42, CI = 0.4–0.44, P = 0 and ORmiRNA = 0.63, CI = 0.51–0.77, P = 4.37 × 10−6; Supplementary Fig. S2F). Pathway analyses revealed the developmental gene group to be the most significantly differentially methylated. Of 231 homeobox genes (28), we identified 175 with a significant TSS-associated hypermethylation. The differential methylation of homeobox genes also resulted in a more stringent impact on transcription: In the complete data set, although we found significant negative and positive correlations for 1,143 and 839 genes, respectively, we found 53 negative and 8 positive correlations for homeobox genes. Consistent with the literature, we found hypomethylations predominantly localized within repeat regions mainly within long interspersed nuclear elements (LINE) and long terminal repeats (LTR) (20, 24).
Methylation Patterns DifferentiateTMPRSS2–ERG FUS1 and FUS2 Tumors
Although methylation patterns clearly distinguished normal from tumor tissues, we also found FUS− and FUS+ samples separated in a PCA (Fig. 1A, Supplementary Fig. S3A). In line with previous reports on gene expression (29, 30), the PCA analyses of the MeDIP-Seq values suggested that FUS+ samples exhibit a higher homogeneity than FUS− samples. In addition, the FUS+ class seems more similar to normal tissues than to FUS− samples, implying that the DNA methylation patterns in FUS− prostate cancer are considerably more altered. In testing the extend of the global methylation profiles in normal, FUS+, and FUS− samples, we found a significantly higher number of methylation events in FUS− compared with normal and FUS+ tissues (P < 0.011 and P < 0.021, respectively; Fig. 1B), with FUS+ samples showing similar counts as normal samples. These results were also valid after excluding regions with somatic copy number alterations (Supplementary Fig. S3B). We calculated the patient-wise number of differential methylations in the 147,000 cDMR regions identified previously in the tumor to normal comparison and found FUS− samples enriched for differential methylation (Supplementary Fig. S3C).
FUS− and FUS+ samples significantly differ from one another at 27,500 regions encompassing cancer census genes (31), homeobox (28), and miRNA genes [OR = 1.99 (P = 7.16 × 10−12, 95% CI = 1.65–2.38), OR = 1.56 (P = 0.004, 95% CI = 1.14 − 2.09), and OR = 1.82 (P = 4.87 × 10−7, 95% CI = 1.45–2.26), respectively; Fig. 1C and Supplementary Table S3)]. We also detected more hypomethylated regions in FUS− samples than in FUS+ and normal samples. In particular, in LINE L1 elements, we found significant hypomethylations (Fig. 1D), consistent with Kim and colleagues (22). From our data set, the top 2 marker regions (TMP1: chr1:149033001–149033500 and TMP2: chr16:46413501–46414500) were sufficient to distinguish between FUS+ and FUS− samples with 2 samples not correctly assigned (Fig. 1E). This finding was validated with BS-MS using 46 of the tumor samples already investigated by MeDIP-Seq (Supplementary Fig. S3D and Supplementary Table S2D).
EZH2 Overexpression Correlates with Altered Methylation Patterns
To identify genes that might be responsible for the increased methylations in FUS− tumors, we investigated the gene expression levels of DNA and histone methyltransferases (DNMT1, DNMT3A, DNMT3B, EZH1, EZH2, Fig. 2A). In our tumor samples, DNMT1, DNMT3A, and EZH2 levels were increased. In addition, for EZH2 we could observe a significant increase in the expression level in FUS− in comparison with FUS+ tissues (Mann–Whitney P = 0.013). Upregulation of EZH2 in FUS+ tissues can be explained by increased ERG levels in this tumor subgroup (Fig. 2B; ref. 32).
EZH2 is a polycomb group protein with H3K27 methyltransferase activity that functions as a transcriptional repressor in prostate cells (12, 32). Its function depends on an intact SET domain as well as an endogenous HDAC activity and is thought to induce a DNA hypermethylation of its target genes during cancer development (12, 16). We confirmed a positive correlation between the expression of EZH2 and promoter methylation for 77 of 83 polycomb group target genes that are associated with metastatic prostate cancer (33). Thirty-five of these also showed a significant correlation between gene expression and EZH2 expression (Supplementary Fig. S4 and Supplementary Table S4). An analysis of 204 HOX genes (28), the primary targets of EZH2-mediated cellular effects, revealed a positive correlation for 189 (92%) of them between EZH2 expression and HOX gene promoter methylation (Fig. 3A and B and Supplementary Table S4). Of these, 117 (62%) were significantly downregulated. Interestingly, for HOXC6, we find a hypermethylation and upregulation of the short transcript forms, whereas for the long isoforms, no differential methylation and no change in the expression level could be detected.
EZH2 Expression Is Regulated by ERG and miR-26a in FUS+ and FUS− Prostate Cancers, Respectively
EZH2 and MYC are known targets of the ERG transcription factor and are significantly overexpressed in TMPRSS2–ERG FUS+ samples (34, 35). As expected, we found increased EZH2 and MYC expression in FUS+ samples and observed ERG/EZH2 and ERG/MYC expression correlation in the FUS+ sample set (Spearman correlation = 0.52, P < 1 × 10−5 and correlation = 0.79, P < 2.2 × 10−6, respectively). Consistent with this finding, a knockdown of ERG in VCaP prostate cancer cells revealed a downregulation of EZH2 and MYC (Fig. 4A–C).
In contrast, in FUS− samples the ERG expression level is low, but MYC and EZH2 are still increased. MYC is significantly increased in tumor samples (P < 1.33 × 10−21) with no significant difference between FUS− and FUS+ tumors (P = 0.267). The increase in MYC might be due to decreased miRNA-34c levels in our samples, a predicted regulator of MYC. In addition, we found hypermethylation of the miRNA-34 region, suggesting a suppression of the miRNA transcription. For EZH2, the situation is even more pronounced because the expression of EZH2 is even higher than in FUS+ samples (Fig. 2A). We therefore looked at EZH2 promoter methylation, but found no association of enhanced expression of EZH2 with promoter methylation in FUS− samples; rather, the methylation levels were found to be comparable in both tumor subsets (minimal P value after BH correction = 0.29). We next turned our attention to the expression levels of miRNAs, as we had found miRNA genes to be preferred sites of differential methylation in FUS− samples. In particular, we investigated the expression levels of 6 miRNAs (miR-26a, -101, -138, -124, -214, and let-7b), suggested regulators of EZH2 (11, 36). For the quantitative PCR (qPCR) analyses, we used the same set of primary prostate tumors that had previously been used for the methylation analyses. For miRNA-124, -214 and -138, we found significant differences between normal and tumor tissues (Fig. 5A). However, these miRNAs showed no differential expression between FUS− and FUS+ samples. In contrast, for miR-26a, a miRNA whose downregulation is associated with EZH2 upregulation in nasopharyngeal carcinoma (11), breast cancer (37), as well as in a murine lymphoma model (38), we found significant expression differences between FUS+ and FUS− samples. We corroborated this finding in prostate cancer cell lines (Fig. 5B). In line with the in vivo data, the FUS− cells DU145 and PC3 express lower levels of miR-26a and more EZH2 than the FUS+ cells VCaP and NCI-H660. We also transfected DU145 cells with miR-26a mimics and found a more than 2-fold decrease of EZH2 gene expression level (Fig. 5C).
Hypermethylation of a region of 2 kb in FUS− samples (chr12:58217001–58219000) in the vicinity of miR-26a was found to be negatively correlated with miR-26a expression in FUS− (correlation = −0.47; P < 4.8 × 10−5), but this region was not found to be hypermethylated in FUS+ samples (Fig. 2B and Supplementary Fig. S5A). Furthermore, expression of miR-26a was negatively correlated to EZH2 expression in FUS− samples but not in FUS+ samples (correlation = −0.37, P < 3 × 10−3 and correlation = 0.09, P = 0.48, respectively). A luciferase reporter assay with the region of the differentially methylated region (DMR) at either the miR-26a locus or the positive control dual oxidase 1 (DUOX1) promoter region upstream of the luciferase gene resulted in significantly lower luciferase signals in DUOX1 and the miR-26a system after in vitro methylation (Fig. 5D). The miR-26a region is methylated in FUS− DU145 and PC3 cell lines, with less methylation in DU145 cells (Fig. 5E and Supplementary Fig. S5A). We validated the methylation in the miR-26a region with a bisulfite sequencing approach and found the region in DU145 less methylated than in FUS+ VCaP cells (Supplementary Fig. S5B). We thus tested if a treatment of fusion gene–negative cells with 5-aza-2′-deoxycytidine results in an increased miR-26a level. Indeed, we determined an increased expression of miR-26a and accompanying decrease in EZH2 after treatment in PC3 cells, with less strong effects in DU145 cells (Fig. 5F and data not shown).
Discussion
Our analyses provide a molecular mechanism for the development of fusion gene–negative prostate cancer. Comparing FUS− and FUS+ tumor samples on a genome-wide scale, we found significantly more differentially methylated regions in FUS− than in FUS+ tumors and normal samples. Among the hypermethylated regions, we found a significant enrichment of homeobox gene promoters (28). Homeobox genes are involved in embryonic development and cell lineage determination and a deregulation of these genes might partly explain the tumor dedifferentiation phenotype. A candidate for the deregulation of homeobox genes, EZH2, is significantly overexpressed in our tumor datasets and even more highly expressed in FUS− tumors compared with FUS+; cell line data support these findings. An increase in EZH2 expression induces epigenetic alterations and contributes to an erroneous expression of developmental genes, further promoting tumor formation (12, 16, 33). EZH2 has been associated with high-grade prostate cancer (15, 33, 39, 40) and is also found to be overexpressed in many other tumor types (11, 37).
Hypomethylations are predominating within repeat regions in FUS− tumors (Fig. 1C and D). This is in line with a recent study by Kim and colleagues (22) showing that fusion gene-negative prostate cancers in comparison with fusion gene-positive and normal harbor decreased levels of methylations within LINE L1 regions. Notably, they not only showed lower numbers of methylations within LINE L1 regions in ETS− tumors, but their data also implicate lower overall numbers of DMRs in ETS− tumors compared with FUS+ tumors and adjacent benign tissues. This is in contrast to our results because we found FUS+ tumors containing increased genome-wide numbers of DMRs. The discrepancy is most likely due to the different technologies used. Even though the MeDIP-Seq technology is an affinity-based approach, and as such high CpG regions are preferentially enriched, we do not find this to be an explanation for the discrepancy between our study and the one from Kim and colleagues (22). When we restrict our data sets to similar regions, we still find more cDMRs in FUS− than in FUS+ tumors. The study from Kim and colleagues is based on a MethylPlex-next generation sequencing (M-NGS) approach concentrating on high guanine and cytosine (GC) content regions. A comparison of both technologies resulted in concordance rates of 89% in CpG island regions and 62% outside of CpG islands, with more than 3 times more DMRs detected in MeDIP-Seq experiments (3,928 in MeDIP-Seq and 1,274 in M-NGS in CpG containing promoter regions; ref. 22). This low number of detected DMRs within M-NGS might be due to a specific GC enrichment step within the M-NGS protocol, which resulted in a loss of low CpG density DMRs. Thus, on an extended global scale, we find a higher degree of deregulated methylations in FUS− tumors with higher numbers of differential methylation events.
Our finding of extensive differences between the methylation patterns of FUS+ and FUS− samples suggests 2 pathomechanistic models (Fig. 6).
EZH2 overexpression in TMPRSS2–ERG FUS+ samples can be explained by ERG overexpression, as EZH2 is a target gene of the ERG transcription factor (34). In contrast, EZH2 overexpression in FUS− tumors seems to be regulated in a different manner, as ERG is not increased in these tumors, but EZH2 levels are even higher. Increased MYC expression in prostate cancer alone does not suffice as an explanation for the regulation of EZH2 because MYC expression levels are comparable between FUS+ and FUS− tumors and thus do not explain the increased EZH2 and differential methylation levels in FUS− cancers. Interestingly, by comparing FUS+ and FUS− tumors, we found significantly more DMRs in FUS− than in FUS+ and normal samples, suggesting an epigenetic mechanism to play a predominant role in FUS− tumors.
In the course of our study, we noticed miRNA regions to be significantly hypermethylated. We have shown here that one of the miRNAs targeting EZH2, miR-26a, is suppressed in FUS− prostate cancer. This finding is supported by reports describing miR-26a to be suppressed in nasopharyngeal carcinoma (13, 41), lymphomas (38), and breast cancer (37). We show here that the hypermethylation of the miR-26a locus is associated with high EZH2 levels in primary FUS− prostate cancers as well as in PC cell lines, whereas miR-26a mimics led to a reduction of EZH2. Thus, we suggest that in FUS− samples, a suppression of miR-26a caused by a hypermethylation leads to a decreased inhibition of EZH2, thereby adding further perturbations to the global DNA methylation profile.
Recurrent gene fusions are also found in other tumors including leukemia, lymphoma, sarcoma, non–small cell lung cancer, and other less common tumor entities (42). Thus, the pathomechanistic model of a deregulated epigenome in tumors without gene fusions might also be applicable to these. First evidence may be seen in studies on acute myeloid leukemia (AML) where mutations in DNMT3A appear exclusively in AML samples without translocations (43) and where AML tumors with somatic mutations of isocitrate dehydrogenase 1 or 2 (IDH1 or 2) reveal different DNA methylation patterns than patients with mixed lineage leukemia (MLL) translocations, suggesting subgroup-specific methylation events (44).
Gaining insight into tumor formation processes as described in this study opens the way for personalized targeted therapeutics and for novel drug developments. In this regard, DNA methyltransferase inhibitors, like the well-known but toxic nucleoside DNMT inhibitor 5-aza-2′-deoxycytidine (45) or the direct DNMT1 inhibitor procaine (46), might preferentially be useful as treatment for patients with FUS− prostate tumors (47). In contrast, FUS+ patients might benefit from PARP1 inhibitors, as suggested by Brenner and colleagues (48) who showed that PARP1 is essential for TMPRSS2–ERG target gene transcription and directly interacts with ETS genes. Given the central role of EZH2 in prostate cancer, a combination therapy with 3-deazaneplanocin A (49) might be considered. However, additional studies are required to further elucidate the ideal treatment strategies within the different prostate cancer subtypes.
Methods
Biological Samples
Prostate tissue samples were obtained from the University Medical Centre Hamburg-Eppendorf (Hamburg, Germany). Approval for the study was obtained from the local ethics committee and all patients agreed to provide additional tissue sampling for scientific purposes. Tissue samples from 51 prostate cancer and 53 normal prostate tissues were included (Supplementary Table S1). None of the patients had been treated with neoadjuvant radiation, cytotoxic, or endocrine therapy. To confirm the presence of tumor, all punches were sectioned, and tumor cell content was determined in every 10th section. Only sections containing at least 70% tumor cells were included in the study. Normal prostate tissue samples were obtained from 53 patients who underwent radical prostatectomy for prostate cancer. Only sections containing exclusively normal tissue material with epithelial cell content between 20% and 40% were included in the study. For Laser Capture Microdissection (LCM; Zeiss) of epithelial cells, 16-μm tissue sections were mounted on special LCM slides and briefly stained with hematoxylin and eosin to facilitate the localization of epithelial cells. Epithelial cells were collected by LCM from 10 tissue sections each. DNA was isolated using the DNA Mini Kit (Qiagen) according to the manufacturer's instructions. The TMPRSS2–ERG fusion status was determined with real-time PCR (RT-PCR) following protocols from Jhavar and colleagues (50) and Mertz and colleagues (51).
Methylation Profiling by MeDIP-Seq
In brief, 2.5 μg of genomic DNA were fragmented to 100 to 200 bp using the Covaris S2 system and end repaired with End Repair mix (Enzymatics) followed by a purification step (Qiagen DNA Purification Kit). Barcoded sequencing adapters (Supplementary Table S2) were ligated followed by nick translation with DNA polymerase I (NEB, 10 U).
For the enrichment step of the methylated DNA immunoprecipitation (MeDIP), 5 μg of an anti-5-methyl cytosine antibody (Eurogentec) coupled to magnetic beads was used. Coupling was conducted by incubation overnight in 1 × PBS + 0.5% bovine serum albumin (BSA). Sequencing libraries were generated before the enrichment and incubated with the beads for 4 hours in IP Buffer (10 mmol/L sodium phosphate buffer pH 7, 140 mmol/L NaCl, 0.25% Triton X100). Beads were washed 3 times with IP buffer and DNA was eluted in elution buffer (50 mmol/L Tris-HCl pH 7.5, 10 mmol/L EDTA, 1% SDS) by incubation for 15 minutes at 65°C. After 2 hours of incubation with proteinase K, the DNA was phenol/chloroform extracted and ammonium acetate/ethanol precipitated. Enrichment controls were conducted with RT-PCR targeting methylated as well as unmethylated regions.
SOLiD sequencing libraries were prepared following the SOLiD V3 fragment multiplex library preparation protocol (Life Technologies) with slight modifications. Following MeDIP, enrichment libraries were amplified with multiplex library PCR primers 1 and 2 and size-selected and quantified using qPCR with library PCR primer 1 and 2 (Supplementary Table S2). Dilutions of a prequantified SOLiD fragment library control (DH10B) were used to create the standard. Samples were diluted to 100 pg/μL using 1x Low TE buffer (ABI) and qPCR was repeated. Identical amounts of up to 8 barcoded libraries were pooled. Libraries were fixed to sequencing beads by emulsion PCR following the templated bead preparation protocol for SOLiD V3. For each pool of 8 samples, approximately 4 to 5 emulsion PCRs were needed to fill a full slide with up to 600 million beads. Sequencing was conducted on a SOLiD 3+ using barcode sequencing chemistry (5 + 35 bp; Lifetech). The sequencing statistics are listed in Supplementary Table S1.
Alignment and Peak Detection
Reads were aligned to HG19 using Applied Biosystem's Bioscope Alignment module in seed and extend mode taking the first 25 bp of the reads as seeds allowing 2 mismatches and a mismatch penalty score of −2 for extension. The aligned reads were elongated to 200 bp in a strand-oriented manner. Redundant reads and reads with no CpGs in the elongated sequence were excluded from further analyses. Next, the HG19 reference genome was split into adjacent 500-bp bins, and the number of reads per bin was counted. Reads were assigned to a bin when their center was located within the bin. For sample-wise normalization, binwise read counts were related to the total read count of each sample (reads per million = rpm). Statistical analyses were conducted using R (version 2.9.2). A binomial distribution of the reads (null hypothesis) was assumed and, thus, a probability value for each read count was assigned. Differential methylations between normal and tumor tissues were calculated with the Mann–Whitney test with corrections for multiple testing using the Benjamini–Hochberg approach. Bins with corrected P values of <0.05 are called differentially methylated.
Primary sequencing data (read sequence files *.csfasta and read quality files *.qual) and secondary analysis data (bed files) for all prostate cancers are available under the GEO accession number GSE35342. Bed files contain only extended, 0-CpG, and duplicate depleted reads.
PCA and Hierarchical Clustering
PCA were conducted with the prcomp-function in R using all bins methylated in normal or tumor samples. Hierarchical clusterings were conducted with the heatmap.2-function included in the R package “gplots” using the Euclidian distances and the agglomerative algorithms “complete” and “ward.”
Annotations
Additional information of the following databases was annotated to the 500-bp bins for functional and topographic analyses: CpG-island (cpgIslandExt) and repeat masker data (rmsk) were extracted from the UCSC database version GRCh37/HG19. Bins containing at least one base of an annotated CpG island were annotated as “CGI” and the ±2 kb surrounding bins as CGI-shore.
Genic, exonic, and transcription start sites were used from Biomart Ensembl59, miRNA regions from miRBASE v16, homeobox genes from Holland and colleagues (28). The cancer gene census list was obtained from Futreal and colleagues (31). Regions of ±2 kb around a TSS were annotated as “promoter region.” Conserved regions were annotated as follows: the nucleotide-wise conservation scores of a set of 46 species were extracted from UCSC (50) and the sums of these scores for 100-bp sized consecutive bins were calculated. The highest scoring 1% quantile consisting of 281,450 100-bp bins were assigned to 176,060 different genomic 500 bp bins that are annotated as “conserved.” For enrichment analyses, the log2-ratios were calculated from the ORs using a Benjamini–Hochberg corrected P value cut-off of 0.05 for differential methylation.
Bisulfite Mass Spectrometry
For bisulfite analyses, 1 μg of DNA was bisulfite-converted using Epitect bisulfite Conversion Kit (Qiagen) and subsequently amplified with specific primer pairs (Supplementary Table S2) carrying a T7 promoter that was designed using the Epidesigner tool (51) with standard criteria (amplicon length: 400–600 bp). In vitro transcription was conducted, and the transcripts were cleaved and subsequently analyzed using MALDI-TOF mass spectrometry on a MassARRAY Analyser 4.
Bisulfite Illumina Sequencing of the miR-26a Region
Enrichment and sequencing of the miR-26a region was conducted according to Agilent's SureSelect Methyl-Seq Target Enrichment Protocol. Mapping of 100-bp paired end reads generated on a Illumina HiSeq device was conducted with BSMAP with default parameters except the mismatch number (-v 10) and the quality trimming threshold (-p 2). Duplicates were removed with Picard Tools' MarkDuplicate module using default parameters. Methylation status of each CpG was calculated by the ratio mC/(C+mC) and visualized with R.
Gene Expression Analyses
The Affymetrix GeneChip Whole Transcript Sense Target Labeling Assay was used to generate amplified and labeled sense DNA. Briefly, 1 μg of total RNA was initially used for rRNA reduction. Following the manufacturer's instructions, cDNA was hybridized to the Affymetrix 1.0 Human Exon ST arrays. The raw data files were preprocessed and normalized using Affymetrix powertools. The Affymetrix data was MIAMI-compliant submitted to GEO database (NCBI, GEO, GSE29079). Expression values of the HOX (28) and EZH2 target (33) genes were correlated with MeDIP-Seq data of all corresponding bins in a region of 2 kb around the TSS.
Determination of Copy Number Variations
DNA from 41 tumor samples was hybridized on Affymetrix Genome-Wide SNP Arrays 6.0. The raw data were preprocessed and copy number number alterations were estimated with CRMA v2 (52). The median of all tumor samples was used as reference sample.
Statistical Analyses
Statistical analyses were conducted using R (version 2.9.2). For determining methylated bins, a binomial distribution of the reads (null hypothesis) was assumed, and thus a sample-wise probability value for each read count was assigned. Bins with P values <0.001 were called “significantly methylated.” PCA were conducted with the prcomp-function in R using all bins methylated in normal or tumor samples.
For determining DMRs between tumor (n = 51) and normal samples (n = 54), or FUS+ (n = 17) and FUS− (n = 20) samples, multiple 2-tailed Mann–Whitney tests were used and the P values corrected for multiple testing with the Benjamini–Hochberg approach. Corrected P values below 0.05 were called “significant.” Analyses of differentially expressed genes and miRNAs were conducted similarly.
For correlation analyses, only samples with gene expression and methylation data were considered (gene expression data: nnormal = 48, ntumor = 47, nFUS+ = 17, nFUS− = 20; miRNA expression data: nnormal = 48, ntumor = 51, nFUS+ = 17, nFUS− = 20) and Spearman correlation analyses were conducted. Correlations with Benjamini–Hochberg corrected P values of <0.05 were called “significant.”
miR-26a and EZH2, MYC, and ERG RT-PCR Quantification
Total RNA, including miRNA, was extracted from tissue sections as described by Brase and colleagues (53). Quantitative RT-PCR amplification of miRNAs was conducted using low-density TaqMan arrays v2.0, the TaqMan MicroRNA Reverse Transcription Kit, and TaqMan MicroRNA Assays (Applied Biosystems; ref. 53). For RT-PCR measurements of EZH2, MYC, ERG, and TBP, the following primers were used: EZH2-forward: 5′-tgt gga tac tcc tcc aag gaa, EZH2-reverse: 5′-gag gag ccg tcc ttt ttc a, ERG-forward: 5′-aag tag ccg cct tgc aaa t, ERG-reverse: 5′-cag ctg gag ttg gag ctg t, TBP-forward: 5′-cgg ctg ttt aac ttc gct tc, and TBP-reverse: 5′-cac acg cca aga aac agt ga.
Cell Lines and Knockdown of ERG in VCaP Cells, Transfection of miR-26a Mimics, and 5-aza-2′-deoxycytidine Treatment
VCaP, PC-3, DU145, and NCI-H660 cells were obtained from American Type Culture Collection (ATCC). All cells were maintained and propagated according to the recommendations of ATCC; PC-3 cells were, in addition to ATCC, authenticated at the German Cancer Research Center (DKFZ). For the knockdown of ERG, VCaP cells were transfected with 50 nmol/L of ON-TARGETplus Non-Targeting Pool or ERG custom siRNA (35; CGACAUCCUUCUCUCACA duplexes with UU overhang; both from Dharmacon) using Lipofectamine RNAiMAX (Invitrogen) as recommended by the supplier. Cells were harvested for DNA and RNA extraction 96 hours after transfection.
DU145 cells were transfected with miR-26a mimics constructs with Lipofectamine 2000. After 24, 48, 72, 96, and 120 hours, incubation RNA was isolated and EZH2 levels were determined by qPCR. For the 5-aza-2′-deoxycytidine (Sigma) treatment, PC-3 cells were incubated for 96 hours with 0.5 μmol/L, 2 μmol/L, or a corresponding amount of dimethyl sulfoxide (DMSO; vehicle control). The medium was changed every 24 hours. Cells were then harvested for DNA and RNA extraction.
In Vitro Methylation and Luciferase Reporter Assay
Target regions were cloned in the pGL3_basic vector (Promega) using the following primer sequences: miR26a_Forward: 5′-ggg tga cag gag agg aga ca, miR26a_Reverse: 5′-tgg tca ttg agg gga aaa ag, DUOX1-short-Forward: 5′-gca ccg acg gaa cat ctc ta, and DUOX1-short-Reverse: 5′-ctc tcg tcc ggt gcc tct. In vitro plasmid methylation was conducted with SssI-CpG-Methyltransferase (NEB) following the manufacturer's instruction. Methylation was confirmed by methylation-sensitive restriction enzyme digest (ApaL1, NEB) and sequencing. Methylated and unmethylated plasmids were transfected into HEK293 cells using PEI (polyethyleneimine) transfection reagents. After 24 hours, Dual-Luciferase Reporter assays (Promega) were conducted according to the manufacturer's instructions, and luciferase signals were detected using the Promega Multi Detection System.
Disclosure of Potential Conflicts of Interest
S.T. Börno, T. Schlomm, and M.-R. Schweiger have ownership interest (including patents) in pending patent. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: S.T. Börno, T. Schlomm, H. Sültmann, H. Lehrach, M.-R. Schweiger
Development of methodology: S.T. Börno, A. Dahl, M.-R. Schweiger
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): S.T. Börno, M. Laible, J.C. Brase, R. Kuner, A. Dahl, B. Timmermann, R. Claus, M. Graefen, R. Simon, M.A. Rubin, G. Sauter, T. Schlomm, H. Sültmann, M.-R. Schweiger
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): S.T. Börno, A. Fischer, M. Kerick, M. Fälth, M. Laible, J.C. Brase, R. Kuner, C. Grimm, B. Sayanjali, M. Isau, C. Röhr, A. Wunderlich, B. Timmermann, R. Claus, C. Plass, M. Graefen, F. Demichelis, M.A. Rubin, G. Sauter, H. Sültmann, M.-R. Schweiger
Writing, review, and/or revision of the manuscript: S.T. Börno, A. Fischer, M. Kerick, M. Fälth, M. Laible, R. Kuner, C. Grimm, M. Isau, C. Röhr, A. Wunderlich, B. Timmermann, R. Claus, C. Plass, M. Graefen, R. Simon, F. Demichelis, M.A. Rubin, G. Sauter, T. Schlomm, H. Sültmann, M.-R. Schweiger
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): S.T. Börno, A. Fischer, A. Dahl, T. Schlomm, H. Sültmann, M.-R. Schweiger
Study supervision: T. Schlomm, H. Sültmann, H. Lehrach, M.-R. Schweiger
Performed MeDIP experiments: S.T. Börno
Provided bioinformatic analyses: S.T. Börno, A. Fischer, M. Kerick, M. Fälth
Provided RNA and microRNA data and bioinformatics analyses: M. Fälth, J.C. Brase, R. Kuner, H. Sültmann
Generated sequence data: S.T. Börno, A. Dahl, B. Timmermann
Contributed to the procurement of tumor tissue, preparation of DNA and RNA, clinical data, copy number, TMPRSS2–ERG status, and gene expression analyses:M. Laible, R. Kuner, M. Graefen, R Simon, G. Sauter, T. Schlomm, H. Sültmann
Performed cell culture experiments: S.T. Börno, M. Laible, B. Sayanjali
Performed BS-MS experiments: S.T. Börno, R. Claus, C. Plass
Acknowledgments
This work is dedicated to Professor Manfred Schweiger who was a passionate scientist. The authors thank Anna Kosiura, Nada Kumer, Uta Marchfelder, and Michael Zinke for excellent technical assistance, and Julia Liep for providing RNA samples.
Grant Support
This work has been supported by grants from the Federal Ministry of Education and Research “Proceed,” “Mutanom,” “Intestinal Modifier,” and “Predict” (BMBF 01GS0890 to H. Sültmann, M. Fälth, and R. Kuner; 01GS0891 to S.T. Börno and A. Wunderlich; 01GS089105 to M. Graefen; 01GS08111 to C. Grimm and M.R. Schweiger; Predict to M. Isau).