Abstract
Ovarian clear cell carcinoma (OCCC) is a rare ovarian cancer histotype that tends to be resistant to standard platinum-based chemotherapeutics. We sought to better understand the role of DNA methylation in clinical and biological subclassification of OCCC.
We interrogated genome-wide methylation using DNA from fresh frozen tumors from 271 cases, applied nonsmooth nonnegative matrix factorization (nsNMF) clustering, and evaluated clinical associations and biological pathways.
Two approximately equally sized clusters that associated with several clinical features were identified. Compared with Cluster 2 (N = 137), Cluster 1 cases (N = 134) presented at a more advanced stage, were less likely to be of Asian ancestry, and tended to have poorer outcomes including macroscopic residual disease following primary debulking surgery (P < 0.10). Subset analyses of targeted tumor sequencing and IHC data revealed that Cluster 1 tumors showed TP53 mutation and abnormal p53 expression, and Cluster 2 tumors showed aneuploidy and ARID1A/PIK3CA mutation (P < 0.05). Cluster-defining CpGs included 1,388 CpGs residing within 200 bp of the transcription start sites of 977 genes; 38% of these genes (N = 369 genes) were differentially expressed across clusters in transcriptomic subset analysis (P < 10⁻⁴). Differentially expressed genes were enriched for six immune-related pathways, including IFNα and IFNγ responses (P < 10⁻⁶).
DNA methylation clusters in OCCC correlate with disease features and gene expression patterns among immune pathways.
This work serves as a foundation for integrative analyses aimed at better understanding the complex biology of OCCC, in an effort to improve the potential for development of targeted therapeutics.
Introduction
Ovarian clear cell carcinoma (OCCC) remains an enigmatic histotype of epithelial ovarian cancer (EOC; ref. 1). When diagnosed at an advanced stage, it has a worse outcome than the more common high-grade serous histotype (2–4), and it tends to present at a younger age, showing a poorer response to platinum-based therapy, the mainstay treatment for EOC. As reviewed previously (5, 6), relatively small studies suggest that OCCC possesses some unique features. Like endometrioid EOC, it can arise from endometriotic lesions; OCCC is generally TP53 wild-type with recurrent somatic mutations in PIK3CA and ARID1A and with a relatively low frequency of structural rearrangements (7–10). Although we and others have shown that tumor DNA methylation profiles differ between OCCC and other histotypes (10–12), methylation profiles among OCCC tumors have not been comprehensively evaluated.
OCCC with ARID1A mutations display dysregulation of chromatin remodeling (9) and frequently overexpress HNF1B via hypomethylation, which has been reported to be associated with a methylated phenotype (12). Mismatch repair deficiency, resulting from DNA mismatch repair gene mutation or hypermethylation, has also been reported in OCCC, albeit at a relatively low frequency and predominantly in older patients (13). In addition to the paucity of studies on methylation and OCCC, there have been few gene expression studies. Upregulation of genes in the IL6–STAT3–HIF and glycogen pathways suggests a response to persistent oxidative stress and inflammation (14, 15). Tan and colleagues (16) described two groups of OCCC: a mesenchymal-like subtype, with increased proliferation, tumor-infiltrating lymphocytes (TIL), and poorer outcome, and an epithelial-like tumor subtype, which presented at an earlier stage and with mutations in SWI/SNF genes. The tumor microenvironment may also contribute to an immune-suppressive state, suggesting a role for immunotherapeutics such as checkpoint inhibitors (17). Programmed death ligand 1 (PD-L1) expression is common in OCCC (∼45%) and is more common in more advanced disease (18), supporting the tenet of an immunosuppressive microenvironment. OCCCs are known to express hypoxia-related genes (19), which also influence the tumor microenvironment and potentially T-cell responses. Despite such promising avenues, it has been challenging to find ideal molecular targets (20).
Epigenome-wide OCCC studies have generally been limited in sample size, with the largest including fewer than 20 cases (10, 11, 21). To date, none have had statistical power for evaluation of genome-wide DNA methylation in the context of other genomic and clinical features comparable to that of The Cancer Genome Atlas (TCGA) high-grade serous EOC study (9, 22). In this article, we evaluate the hypothesis that epigenomic profiling of a relatively large collection of OCCC tumors can identify subclasses that may provide biological insight and show distinct clinical behavior patterns.
Materials and Methods
Study participants
Clinical data and chemo-naïve fresh frozen tumor material were examined from women diagnosed with invasive OCCC and enrolled into research studies from the following sites: Memorial Sloan Kettering Cancer Center Gynecology Tissue Bank (New York, NY; ref. 23), Mayo Clinic (Rochester, MN; ref. 10), University of Cambridge (Cambridge, United Kingdom; ref. 24), Cedars-Sinai Medical Center (Los Angeles, CA; ref. 25), University of Pittsburgh (Pittsburgh, PA), Gynaecological Oncology Biobank (GynBiobank) at Westmead Hospital (Sydney, Australia; ref. 26), University of Edinburgh (Edinburgh, Scotland; ref. 27), Canadian Ovarian Experimental Unified Resource (COEUR; Vancouver, British Columbia, Canada; refs. 28, 29), Brigham and Women's Hospital (Boston, MA; ref. 30), and University of Pennsylvania (Philadelphia, PA; ref. 31). Participants provided written informed consent to institutional review board–approved protocols. To confirm histotype, tumor sections were reviewed by an expert gynecological pathologist (M. Kobel) using Napsin A, p53, and WT1 IHC data (32).
DNA methylation arrays
Following bisulfite modification, Illumina Infinium MethylationEPIC Beadchips were run on DNA samples arrayed on three 96-well plates including extracted DNA from tissues containing >70% tumor from 239 participants, eight laboratory control DNAs (Human Methylated and Non-Methylated Control DNA Set; catalog no. D5014, Zymo Research), and four participant duplicates using a standard operating procedure based on the Illumina protocol. Following scanning, intensity data were imported into the Genome Studio Methylation Module for analysis. Data were normalized, and detection P values (reflecting the likelihood that the signal is distinguishable from the internal negative controls) were calculated for each CpG; call rate reflects the percentage of CpGs detected (33, 34). Sample-independent controls included those for bisulfite conversion, allowing identification of samples with incomplete conversion; positive and negative controls were included to determine whether any probes should be excluded due to poor performance. The methylation status of each target CpG site was summarized as the beta value, the ratio of the fluorescent signal from the methylated allele to the sum of the fluorescent signals from the methylated and unmethylated alleles. Beta values range from 0 (unmethylated) to 1 (fully methylated). Laboratory controls and participant duplicates indicated excellent assay performance (e.g., r² = 0.99 for beta values of participant duplicates). Nine participant samples showed poor performance and were excluded, including eight with call rate <95% and one outlier for median methylation intensity; no samples revealed sex error or mean or median detection P > 0.05.
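The beta-value definition can be illustrated with a minimal Python sketch. The function name, the example intensities, and the optional `offset` argument are illustrative assumptions and are not taken from the study pipeline; with `offset=0` the function is exactly the ratio described in the text.

```python
def beta_value(methylated, unmethylated, offset=0.0):
    """Beta value: methylated signal over total (methylated + unmethylated) signal.

    `offset` is a hypothetical stabilizing constant; it defaults to 0 so the
    result is the plain ratio described in the text.
    """
    total = methylated + unmethylated + offset
    if total <= 0:
        raise ValueError("total signal intensity must be positive")
    return methylated / total


# Hypothetical intensities for one CpG in one sample
example = beta_value(3000.0, 1000.0)
print(example)
```

A fully unmethylated site (methylated signal of zero) yields 0, and a site with no unmethylated signal yields 1, matching the stated range.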
For quality control (QC), CpG probes were excluded if they were located at a SNP location, failed in more than 10% of samples, were located on the Y chromosome, were determined to be cross-reactive, or overlapped genetic variants (33, 34); this resulted in 707,744 probes passing QC on the EPIC array. For an additional 41 patients from the Mayo Clinic, published data derived from the Illumina Infinium HumanMethylation450k Beadchip were used (10), and the CpG probes analyzed were those shared by the EPIC and 450K datasets after QC. In combination, the resulting analytical set consisted of 344,914 CpG probes and 271 cases.
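The sample-level call-rate filter and the probe-level failure filter described above can be sketched as follows. This is a simplified illustration only: the detection P threshold of 0.05 used to call a probe "detected", the function names, and the toy matrix are assumptions, and the annotation-based exclusions (SNP, Y chromosome, cross-reactive probes) are not shown.

```python
def sample_call_rate(detection_p, threshold=0.05):
    """Fraction of probes in one sample with detection P below `threshold`."""
    detected = sum(1 for p in detection_p if p < threshold)
    return detected / len(detection_p)


def failing_probes(detection_p_by_sample, threshold=0.05, max_fail_fraction=0.10):
    """Indices of probes that fail detection in more than `max_fail_fraction` of samples."""
    n_samples = len(detection_p_by_sample)
    n_probes = len(detection_p_by_sample[0])
    bad = []
    for j in range(n_probes):
        fails = sum(1 for sample in detection_p_by_sample if sample[j] >= threshold)
        if fails / n_samples > max_fail_fraction:
            bad.append(j)
    return bad


# Toy detection P matrix: rows are samples, columns are probes (hypothetical values)
toy = [
    [0.01, 0.20, 0.01],
    [0.01, 0.30, 0.01],
    [0.01, 0.01, 0.90],
]
low_call_rate_samples = [i for i, s in enumerate(toy) if sample_call_rate(s) < 0.95]
print(low_call_rate_samples, failing_probes(toy))
```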
Methylation clusters and annotation
We analyzed 344,914 CpGs representing the intersection of CpG sites included on the Illumina Infinium HumanMethylation450k and MethylationEPIC Beadchips. Data were normalized separately for the two platforms, and batch correction across the two platforms was performed via ComBat (33). On the 1% most variable CpGs as defined by median absolute deviation (3,450 probes), we evaluated three clustering methods [Brunet (35), Lee (36), and nonsmooth nonnegative matrix factorization, nsNMF (37)], as implemented in the R package “NMF” (https://cran.r-project.org/web/packages/NMF/index.html). Consensus clustering with 100 runs was performed; nsNMF resulted in the most stable cluster assignment and was chosen as the most appropriate method. The optimal number of clusters was determined by cophenetic correlation coefficient assessment (35), which showed the largest drop at two clusters over the span of two to seven clusters. We used 2,000 bootstrap samples to estimate confidence intervals (CI) for the cophenetic coefficient for k = 2 through k = 7 clusters (38). As shown in the cophenetic correlation plots along with the consensus map, estimates were highly reproducible with k = 2 (narrow CI) and highly variable for k > 2 (wide CIs), supporting k = 2 as the optimal number of clusters for subsequent analysis (Fig. 1). Feature extraction was used to determine which CpGs had the greatest impact on the derived clusters (2,437 CpGs). Cluster 1 CpGs were hyper-methylated in Cluster 1 versus Cluster 2 and vice versa. We characterized CpGs with annotation derived from Illumina Corporation and Ensembl (v78, GRCh38), limiting to CpGs in loci likely to be cis regulatory regions, defined as within 5′ UTR and 200 base pairs (bp) of a transcriptional start site (TSS).
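Selection of the most variable CpGs by median absolute deviation (MAD) can be sketched in a few lines of Python. The function names and toy beta values are hypothetical, and rounding the 1% cutoff up to a whole probe is an assumption; it happens to agree with the 3,450 probes reported for 344,914 CpGs.

```python
import math
import statistics


def median_absolute_deviation(values):
    """Median of absolute deviations from the median."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)


def most_variable_probes(beta_by_probe, fraction=0.01):
    """Return probe IDs with the highest MAD, keeping the top `fraction`."""
    scores = {probe: median_absolute_deviation(betas)
              for probe, betas in beta_by_probe.items()}
    n_keep = max(1, math.ceil(fraction * len(scores)))  # round up (assumption)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep]


# Toy beta values across four samples for three hypothetical probes
toy_betas = {
    "cg_a": [0.10, 0.10, 0.10, 0.10],
    "cg_b": [0.10, 0.90, 0.20, 0.80],
    "cg_c": [0.40, 0.50, 0.45, 0.50],
}
print(most_variable_probes(toy_betas, fraction=0.3))
```

The selected probes would then be passed to the clustering step; the NMF fitting itself is not reproduced here.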
Gene set enrichment analysis (GSEA) of methylation data (differentially methylated genes) was used to assess the extent of enrichment of cluster-defining CpGs within cancer hallmark gene sets (39) using the Bioconductor package ‘missMethyl’, which was designed specifically for methylation data (40).
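As a rough illustration of what an over-representation test measures, the sketch below computes a plain hypergeometric upper-tail probability. This is a simplified stand-in, not the method implemented in missMethyl, which is designed to account for biases specific to methylation arrays; the function name and the toy numbers are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(n_universe, n_in_set, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without
    replacement from n_universe genes, n_in_set of which belong to the gene set."""
    total = comb(n_universe, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example: 20 genes in total, 5 in the gene set, 4 selected, 2 overlapping
p_enrich = hypergeom_upper_tail(20, 5, 4, 2)
print(round(p_enrich, 4))
```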
Association testing
For the clinical characteristics, somatic mutation, and IHC features described below, association testing used a Kruskal–Wallis test for quantitative measures and Pearson χ2 test for categorical variables, unless any cell count was less than five, in which case Fisher exact testing with simulated P value based on 2,000 replicates was used. Exploratory analyses examined larger numbers of clusters and excluded Illumina Infinium HumanMethylation450k Beadchip data.
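The decision rule for categorical variables can be sketched for the 2 × 2 case as follows. The function names are hypothetical, no continuity correction is applied (an assumption, since the correction setting is not stated), and the exact test for sparse tables is only flagged, not implemented. The example counts are the FIGO stage by cluster counts from Table 2; the P value computed here need not match the published one exactly.

```python
import math


def needs_exact_test(table, minimum=5):
    """True if any observed cell count is below `minimum`."""
    return any(cell < minimum for row in table for cell in row)


def pearson_chi2_2x2(table):
    """Pearson chi-square statistic and P value for a 2 x 2 table (1 df, uncorrected)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Rows: early, advanced stage; columns: Cluster 1, Cluster 2
stage_by_cluster = [[73, 93], [54, 37]]
if needs_exact_test(stage_by_cluster):
    print("use an exact test instead")
else:
    chi2_stat, chi2_p = pearson_chi2_2x2(stage_by_cluster)
    print(round(chi2_stat, 2), round(chi2_p, 3))
```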
Clinical characteristics
We examined associations between cluster and baseline clinical features including age at diagnosis (continuous), stage (early, advanced), study continent (North America, Europe, Australia), self-reported race (white non-Hispanic, Asian, other), extent of residual disease following primary debulking surgery (no macroscopic, macroscopic), presence of adjacent endometriosis (yes, no), menopause status (postmenopausal, pre/perimenopausal), and prior endometriosis (yes, no).
Somatic mutation data
We also examined the association between cluster and somatic mutation derived from targeted DNA screening on a subset of 234 (87%) OCCC tumors across study sites with adequate tissue. DNA sequencing used a custom Nimblegen capture-based panel of 166 putative OCCC driver genes based on pilot studies and COSMIC Cancer Gene Census (5). Median coverage was 539x. Raw sequence data were aligned to the human genome (NCBI build 37) using BWA, with variant calling for single nucleotide variants via Mutect2, Strelka, and Caveman and for insertions/deletions via Pindel, Mutect2, and Strelka. Mutations were classified as pathogenic based upon their annotation in OncoKB (8), frequency of occurrence in COSMIC and our combined OCCC database of previously published sequencing data, predicted pathogenicity based on PolyPhen (9) and SIFT (10), and literature review. Analyzed features included aneuploidies (continuous number of chromosomal or chromosomal arm level events), microsatellite instability (MSI) score (continuous; ref. 41), single gene somatic mutation status (for ARID1A, TP53, PIK3CA, BRCA1, BRCA2), paired gene somatic mutation status (for ARID1A/PIK3CA), and a hierarchical somatic mutation classification [ARID1A mutation with one other mutation in PIK3CA, PIK3R1, KRAS, PPP2R1A, SPOP, or TERT (Group A); multiple ARID1A mutations with one other mutation in PIK3CA or PIK3R1 (Group B); single ARID1A mutation (Group C); multiple ARID1A mutations without mutations in PIK3CA or PIK3R1 (Group D); mutation in PIK3CA, PIK3R1, KRAS, PPP2R1A, SPOP, or TERT (Group E); TP53 mutation without mutations in ARID1A or SMARCA4 (Group F); SMARCA4 mutation not in combination with a mutation described above (Group G); and remaining tumors (Group H)]. Tumor mutation burden or mutation number was calculated as the sum of the presence or absence of a mutation in all targeted genes.
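One possible reading of the hierarchical classification is sketched below as a rule-based function. Several points are assumptions because the text does not spell them out: Group A is taken to mean exactly one ARID1A mutation plus a partner-gene mutation, the rules are evaluated in the order shown with the first match winning, and the input format (a mapping from gene symbol to number of pathogenic mutations) and function name are hypothetical.

```python
# Partner genes named in the Group A and Group E definitions
PARTNER_GENES = {"PIK3CA", "PIK3R1", "KRAS", "PPP2R1A", "SPOP", "TERT"}
PI3K_GENES = {"PIK3CA", "PIK3R1"}


def mutation_group(mutation_counts):
    """Assign a tumor to Group A-H from per-gene pathogenic mutation counts."""
    mutated = {gene for gene, count in mutation_counts.items() if count > 0}
    n_arid1a = mutation_counts.get("ARID1A", 0)
    if n_arid1a >= 2:
        # Multiple ARID1A mutations: split on PIK3CA/PIK3R1 status
        return "B" if mutated & PI3K_GENES else "D"
    if n_arid1a == 1:
        # Single ARID1A mutation: with or without a partner-gene mutation (assumption)
        return "A" if mutated & PARTNER_GENES else "C"
    if mutated & PARTNER_GENES:
        return "E"
    if "TP53" in mutated and "SMARCA4" not in mutated:
        return "F"
    if "SMARCA4" in mutated:
        return "G"
    return "H"


print(mutation_group({"ARID1A": 2, "PIK3CA": 1}))
```

Under these assumptions, a tumor carrying only a TP53 mutation falls in Group F, and one with no mutation in any of the named genes falls in Group H.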
Tumor IHC
On small subsets of up to 38 cases, IHC data were used to evaluate association of tumor methylation cluster with levels of CD8+ TILs [negative (none), low (1–2 per field), moderate (3–19 per field), high (≥20 per field); ref. 42] and protein expression categories for ARID1A [absent (internal control retained), present, subclonal loss (distinct area of absence with internal control and presence in the same core)], HNF1β [absent, any score less than score 2, diffuse (>50%) at least moderate intensity], and p53 [complete absence with internal control, wild-type pattern (variable intensity 1%–90% of nuclei), overexpression (strong intensity >90% of nuclei); ref. 32].
Outcome analyses
Overall survival analyses were restricted to a subset of 253 cases with vital status and survival time data from date of diagnosis and allowed for left truncation with censoring at five years from diagnosis. As covariates, we included race (white non-Hispanic, Asian, other), study continent (North America, Europe, Australia), and age at diagnosis (continuous and quadratic, assigned as site median for three cases), and we stratified by disease stage (FIGO stage I+II, III+IV, unknown), and extent of residual disease (no macroscopic, macroscopic, unknown). Proportionality of hazards was examined using Schoenfeld residuals. In addition, contingency analysis was done on cluster with primary treatment response (complete response, partial response, stable disease, progressive disease) and vital status at five years using chi-square testing. Analyses were also conducted for progression-free survival available on a subset of 248 cases.
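Administrative censoring at five years can be illustrated with a short sketch that truncates follow-up at 60 months and then computes a Kaplan–Meier curve. This is a simplified illustration with hypothetical function names and toy data; it does not handle left truncation, covariates, or stratification, all of which were part of the models described above.

```python
def censor_at(records, horizon=60.0):
    """Censor follow-up at `horizon` months; records are (time, event) pairs."""
    censored = []
    for time, event in records:
        if time > horizon:
            censored.append((horizon, False))  # event after the horizon is not counted
        else:
            censored.append((time, event))
    return censored


def kaplan_meier(records):
    """Return [(time, survival probability)] at each distinct event time."""
    ordered = sorted(records)
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        time = ordered[i][0]
        deaths = 0
        leaving = 0
        while i < len(ordered) and ordered[i][0] == time:
            deaths += 1 if ordered[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((time, survival))
        at_risk -= leaving
    return curve


# Toy follow-up data in months: (time, died)
toy_followup = [(10, True), (20, False), (30, True), (70, True), (80, False)]
km_curve = kaplan_meier(censor_at(toy_followup))
print(km_curve)
```

In the toy data, the death at 70 months is treated as censored at 60 months, so it does not contribute an event to the five-year curve.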
Gene expression analyses
To further understand CpGs of interest, we used tumor RNA-sequencing (RNA-seq) data on a subset of N = 116 patients with OCCC across multiple contributing sites with sufficient tissue available for total RNA extraction. RNA-seq libraries were prepared using poly(A) enrichment with sequencing of 100 bp paired-end libraries on Illumina's HiSeq at a targeted depth of 40 million reads per sample. Reads were aligned using STAR (version STAR_2.5.1b) against the reference genome hg38 (GENCODE v26) and summarized using featureCounts (version 1.5.0-p1). Because gene expression data can often be skewed, a van der Waerden rank transformation (43) was applied. We assessed the differential gene expression between tumor methylation clusters using a moderated t test as implemented in the R package “limma”, with a false discovery rate (FDR) threshold of 0.05 to correct for multiple testing. To assess pathway enrichment of genes differentially expressed by feature cluster, GSEA was performed with cancer hallmark gene sets (39) using the “goseq” R Bioconductor package (44). For CpGs driving feature clusters and within likely cis-regulatory regions as defined above, we assessed the correlation between tumor methylation and cis gene expression using a generalized linear model, with gene expression as the response variable and CpG methylation beta value as the predictor variable.
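The van der Waerden rank transformation replaces each value with the standard normal quantile of its rank divided by n + 1. A minimal sketch follows; the function name is hypothetical, and assigning average ranks to ties is an assumption, since tie handling is not described in the text.

```python
from statistics import NormalDist


def van_der_waerden(values):
    """Map values to normal scores: inverse normal CDF of rank / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda idx: values[idx])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(rank / (n + 1)) for rank in ranks]


# Toy expression values for one gene in three samples, with one extreme value
scores = van_der_waerden([5.0, 100.0, 7.0])
print([round(s, 3) for s in scores])
```

Because only ranks enter the transformation, the extreme value of 100 has no more influence than any other largest observation would.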
Results
Study participants and methylation clustering
A total of 271 women from ten study sites were included in this large-scale analysis of genome-wide OCCC tumor DNA methylation data (Table 1); key characteristics did not vary by study site other than race; Asian ancestry was more common in participants from GynBiobank at Westmead Hospital (31%) and Memorial Sloan Kettering Cancer Center Gynecology Tissue Bank (23%) than other study sites. Thirty-five percent of participants were diagnosed at advanced stage (FIGO III, IV), 23% reported prior endometriosis, and, following primary therapy, 37% were deceased at last follow-up within five years.
Characteristic | N (%) |
---|---|
Study site (country) | |
Memorial Sloan Kettering Cancer Center (USA) | 64 (24%) |
Mayo Clinic (USA) | 56 (21%) |
Cedars Sinai Medical Center (USA) | 31 (11%) |
University of Cambridge (United Kingdom) | 27 (10%) |
University of Pittsburgh (USA) | 23 (9%) |
Westmead Hospital (Australia) | 22 (8%) |
Edinburgh (United Kingdom) | 20 (7%) |
COEUR (Canada) | 13 (5%) |
Brigham and Women's Hospital (USA) | 9 (3%) |
University of Pennsylvania (USA) | 6 (2%) |
Race, self-reported | |
White non-Hispanic | 180 (86%) |
Asian | 25 (12%) |
Black | 4 (2%) |
Other/Unknown | 62 |
Age at diagnosis | |
Mean (range), N | 58.1 (31–88), 268 |
FIGO stage | |
Early (I, II) | 166 (65%) |
Advanced (III, IV) | 91 (35%) |
Unknown | 14 |
Tumor primary site | |
Ovary | 209 (88%) |
Omentum | 4 (2%) |
Pelvis | 4 (2%) |
Peritoneum | 3 (1%) |
Fallopian tube | 1 (<1%) |
Other | 17 (7%) |
Unknown | 33 |
Residual disease | |
No macroscopic disease | 181 (76%) |
Macroscopic disease | 57 (24%) |
Unknown | 33 |
Prior endometriosis, self-reported | |
Yes | 33 (23%) |
No | 108 (77%) |
Unknown | 130 |
Primary therapy outcome | |
Complete response | 157 (80%) |
Partial response | 7 (4%) |
Stable disease | 9 (5%) |
Progressive disease | 23 (12%) |
Unknown | 75 |
Progression within five years | |
Yes | 137 (55%) |
No | 111 (45%) |
Unknown | 23 |
Time to progression among progressors, months | |
Mean (range), n | 18.8 (0.03–59.7), 137 |
Time to last follow-up among nonprogressors, months | |
Mean (range), n | 51.0 (0.40–60.0), 111 |
Vital status at five years | |
Alive | 159 (63%) |
Deceased | 94 (37%) |
Unknown | 18 |
Time to last follow-up among living, months | |
Mean (range), n | 45.7 (0.40–60.0), 159 |
Note: Residual disease following primary debulking surgery.
Abbreviations: COEUR, Canadian Ovarian Experimental Unified Resource; FIGO, International Federation of Gynecology and Obstetrics.
nsNMF clustering of the 1% most variable CpGs (3,450 CpGs) intersecting Illumina Infinium HumanMethylation450k and MethylationEPIC Beadchips resulted in 134 (49%) OCCC cases in methylation Cluster 1 and 137 (51%) OCCC cases in Cluster 2. A heat map using the 2,437 CpGs contributing most significantly to the clustering (as determined by the feature extraction) is shown in Supplementary Fig. S1A, and the basis matrix (matrix W or the metagenes) heat map is shown in Supplementary Fig. S1B. Among the CpGs contributing to clustering, a total of 1,388 reside within 200 bp of the TSS of a gene (N = 977 genes); these genes did not fall into particular cancer hallmark pathways (39). Clustering was consistent when excluding Illumina Infinium HumanMethylation450k Beadchip data.
Clinical, tumor mutation, and IHC associations
Table 2 shows clinical, molecular, and pathologic covariates that differed by methylation cluster at P < 0.10 (full results in Supplementary Table S1). Cluster 1 included OCCC that tended to be TP53 mutation positive (P < 0.001), have abnormal p53 protein expression (P = 0.013), be of advanced FIGO stage (P = 0.022), and have macroscopic residual disease. Cluster 2 tumors were more likely to be ARID1A mutation positive (P < 0.001), PIK3CA mutation positive (P = 0.002), of Asian race, and early stage (P = 0.022) with increased total aneuploidy (P < 0.001). While ARID1A and PIK3CA mutations were common in these OCCC tumors (47% and 43%, respectively, in this study), we expected TP53 mutations to be less common (∼11%–13%), yet found them in 20% of cases overall and significantly more often in Cluster 1 cases (31%). These particular cases were re-reviewed to confirm their histology. Supplementary Figure S2 shows the distribution across tumors of the presence of mutations in the five genes that define the mutation clusters (ARID1A, PIK3CA, TP53, BRCA1, BRCA2).
Characteristic | Cluster 1 (n = 134) | Cluster 2 (n = 137) | P |
---|---|---|---|
Illumina Infinium Methylation BeadChip | | | 0.042 |
MethylationEPIC | 120 (90%) | 110 (80%) | |
HumanMethylation450k | 14 (10%) | 27 (20%) | |
Self-reported race | | | 0.012 |
White non-Hispanic | 95 (92%) | 85 (80%) | |
Asian | 6 (6%) | 19 (18%) | |
Black | 2 (2%) | 2 (2%) | |
Missing/other | 31 | 31 | |
FIGO stage | | | 0.022 |
Early (I, II) | 73 (57%) | 93 (72%) | |
Advanced (III, IV) | 54 (42%) | 37 (28%) | |
Unknown | 7 | 7 | |
Residual disease | | | 0.065 |
No macroscopic | 85 (71%) | 96 (81%) | |
Macroscopic | 35 (29%) | 22 (19%) | |
Unknown | 14 | 19 | |
p53 expression | | | 0.013 |
Wild-type pattern: variable intensity 1–90% of nuclei | 13 (72%) | 22 (100%) | |
Complete absence with internal control | 2 (11%) | 0 | |
Overexpression, strong intensity >90% of nuclei | 3 (17%) | 0 | |
Unknown | 116 | 115 | |
TP53 mutation | | | <0.001 |
Yes | 37 (31%) | 10 (9%) | |
No | 82 (69%) | 105 (91%) | |
Unknown | 15 | 22 | |
ARID1A mutation | | | <0.001 |
Yes | 38 (32%) | 73 (63%) | |
No | 81 (68%) | 42 (37%) | |
Unknown | 15 | 22 | |
PIK3CA mutation | | | 0.002 |
Yes | 39 (33%) | 62 (54%) | |
No | 80 (67%) | 53 (46%) | |
Unknown | 15 | 22 | |
ARID1A/PIK3CA mutation | | | <0.001 |
Yes/yes | 18 (15%) | 48 (42%) | |
Yes/no | 20 (17%) | 25 (22%) | |
No/yes | 21 (18%) | 14 (12%) | |
No/no | 60 (50%) | 28 (24%) | |
Unknown | 15 | 22 | |
Total aneuploidy | | | <0.001 |
Mean (range) | 7.1 (0–27) | 11.1 (0–28) | |
Unknown | 15 | 22 | |
Somatic mutation group | | | <0.001 |
ARID1A mutation with one other mutation in PIK3CA, PIK3R1, KRAS, PPP2R1A, SPOP, or TERT (Group A) | 21 (18%) | 24 (21%) | |
Multiple ARID1A mutations with one other mutation in PIK3CA or PIK3R1 (Group B) | 7 (6%) | 33 (28%) | |
Single ARID1A mutation (Group C) | 8 (7%) | 3 (3%) | |
Multiple ARID1A mutations without mutations in PIK3CA or PIK3R1 (Group D) | 2 (2%) | 13 (11%) | |
Mutation in PIK3CA, PIK3R1, KRAS, PPP2R1A, SPOP, or TERT (Group E) | 29 (24%) | 27 (23%) | |
TP53 mutation without mutations in ARID1A or SMARCA4 (Group F) | 28 (24%) | 4 (4%) | |
SMARCA4 mutation (Group G) | 6 (5%) | 2 (2%) | |
Undefined (Group H) | 18 (15%) | 9 (8%) | |
Unknown | 15 | 22 | |
Vital status | | | 0.01 |
Alive | 69 (55%) | 90 (70%) | |
Deceased | 56 (45%) | 38 (30%) | |
Unknown | 9 | 9 | |
Median survival, months | 58.7 | NA | 0.01 |
Time to progression among progressors, months; mean (range), N | 16.6 (0.03–50.3), 68 | 20.9 (0.16–59.7), 69 | 0.07 |
Note: Pearson χ2 test was used for categorical variables, unless any cell count was less than five, in which case Fisher exact test with simulated P value based on 2,000 replicates was used; Kruskal–Wallis rank sum test was used for quantitative measures. Total aneuploidy: number of chromosomal or chromosomal arm level events.
Consistent with single gene mutation results, multi-gene mutation groups were found to associate with methylation clusters (P < 0.001; Table 2). Cluster 1 tumors tended to be mutation Group F (TP53 mutation positive with no mutation in ARID1A or SMARCA4), mutation Group C (single ARID1A mutation), or mutation Group G (SMARCA4 mutation). Cluster 2 tumors were more likely to be mutation Group B (multiple ARID1A mutations with one other mutation in PIK3CA or PIK3R1) or mutation Group D (multiple ARID1A mutations without mutations in PIK3CA or PIK3R1). No association was seen between clusters and study continent, age at diagnosis, menopause status, history of or presence of endometriosis, MSI score, extent of whole genome duplication, or tumor mutation burden (Supplementary Table S1). The distributions of the clinical and molecular features significantly differing between Cluster 1 and Cluster 2 are shown in Fig. 2, and an overview of these characteristics for each methylation cluster is provided in Fig. 3.
Clinical outcomes
Vital status at five years was also associated with methylation clusters, with 55% of Cluster 1 cases and 70% of Cluster 2 cases alive at time of follow-up (Table 2, P = 0.01); similarly, time to disease progression was shorter on average by 4.3 months in Cluster 1 cases compared to Cluster 2 cases (Table 2, P = 0.07). Consistent with these observations and published literature on ARID1A- and PIK3CA-mutant OCCC, univariate analysis of overall survival time revealed an apparent association with Cluster 2 having longer survival (Supplementary Fig. S3; Cluster 1 vs. Cluster 2: HR, 1.70; 95% CI, 1.13–2.57; P = 0.015). However, the proportional hazards assumption for the cluster association was violated, with an attenuation of risk difference toward five years (P = 0.037). Covariate adjustment for age, continent, and race with stratification by stage and residual disease attenuated the estimated cluster-associated risk (Cluster 1 vs. Cluster 2: HR, 1.48; 95% CI, 0.97–2.27; P = 0.067); proportional hazards remained in violation (P = 0.027). Subset analyses of cases by disease stage suggested that methylation cluster may associate with overall survival time only among women diagnosed at advanced stage. However, as proportional hazards again were violated, survival analysis results should be considered suggestive at most, and larger studies with time-dependent analyses are needed. There was no association between methylation cluster and primary therapy outcome (partial, stable disease, progressive disease, no evidence of disease).
Transcriptomic analyses
From among the 1,388 cluster-defining CpGs that lie within 200 bp of transcription start sites (N = 977 genes, from methylation clustering above), we further analyzed 971 CpGs mapped to 700 genes in RNA sequence data. At the CpG level, among the cluster-driving CpGs determined from the feature extraction, we observed cis correlations between methylation and gene expression at 113 CpGs (46 CpGs for Cluster 1 and 67 for Cluster 2; Q value < 10⁻⁴, Supplementary Table S2). Among the top cluster-driving CpGs (top 100 from feature extraction), the most statistically significant were 13 CpGs associated with decreased expression in twelve genes (AP2A2, ACKR2, RXFP1, CTH, CEP44, R11–141M3.5, ANKS4B, FAM149A, LMF1, TNS2, WIPF1, ATOH8; Supplementary Fig. S4).
In a subset of 116 cases with RNA-seq data, we also evaluated differential RNA expression by methylation cluster (Supplementary Table S2); expression of 5,854 genes differed at an FDR < 0.05. At the gene level, among 977 genes with 1,388 cluster-defining CpGs residing within 200 bp of transcription start sites, we found that 369 genes (38%) were differentially expressed across methylation clusters (P < 10⁻⁴, Supplementary Table S3). Through GSEA, we observed that these 369 differentially expressed genes were significantly overrepresented in nine of the 50 hallmark gene sets (39). Six of the pathways (67%) are categorized as immune-related, including inflammation and IFNα and IFNγ responses (Table 3). Genes contributing to these pathways that were significantly differentially expressed across clusters are provided in Supplementary Table S4 and include the non-receptor tyrosine kinase JAK2, complement factor H, and toll-like receptor 2. No enrichment of differentially expressed genes was seen in gene sets related to other pathways, cellular components, or functions (39).
Process category | Description (Gene set ID) | N (%) | P |
---|---|---|---|
Immune | Interferon gamma response (17) | 107 (55%) | 1.32 × 10−11 |
Development | Epithelial mesenchymal transition (6) | 105 (54%) | 1.02 × 10−9 |
Immune | Allograft rejection (13) | 93 (53%) | 9.66 × 10−9 |
Immune | Interferon alpha response (16) | 55 (57%) | 1.80 × 10−7 |
Immune | Inflammation (19) | 88 (47%) | 1.59 × 10−5 |
Immune | Complement cascade (15) | 85 (46%) | 7.53 × 10−5 |
Signaling | KRAS signaling, downregulated genes (43) | 82 (44%) | 6.08 × 10−4 |
DNA damage | UV response: downregulated genes (11) | 63 (44%) | 7.91 × 10−3 |
Immune | IL6 STAT3 signaling during acute phase response (18) | 37 (46%) | 6.10 × 10−3 |
Note: P < 10−2; N (%) represent number of genes differentially expressed by OCCC methylation Cluster 1 vs. Cluster 2 and % of genes in each overall gene set.
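To make the notion of overrepresentation concrete, the sketch below computes a hypergeometric upper-tail probability for the overlap between a list of selected genes and a single gene set. This is a simplified stand-in and not the GSEA procedure used in the study; the gene universe, gene set size, and overlap are hypothetical toy numbers chosen so the result can be checked by hand.

```python
from math import comb


def hypergeom_upper_tail(universe, set_size, n_selected, overlap):
    """P(X >= overlap) when n_selected genes are drawn without replacement
    from a universe containing set_size genes that belong to the gene set."""
    total = comb(universe, n_selected)
    upper = min(set_size, n_selected)
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, n_selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 genes in total, 4 in the gene set, 3 selected, 2 overlapping.
# By hand: (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40 / 120
p_toy = hypergeom_upper_tail(universe=10, set_size=4, n_selected=3, overlap=2)
print(p_toy)
```

A small tail probability indicates that the observed overlap is larger than expected if genes were selected at random from the universe.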
Discussion
We report on examination of genome-wide tumor methylation in 271 women with OCCC, a collaborative effort involving ten institutions across five countries. Clustering algorithms were applied to discern whether there existed methylation subgroups with distinct clinical, molecular, or prognostic characteristics. Quantitative molecular analyses sought to highlight pathways that may bridge epigenomic and clinical associations.
Comparing diagnostic results of three clustering approaches revealed nsNMF with rank k = 2 to be the most stable method. Subsequent nsNMF methylation clustering of tumors produced two broad groups: Cluster 1 with TP53 mutations, later stage, and residual disease, and Cluster 2 with ARID1A/PIK3CA mutations, early stage, and aneuploidy. Mutational cluster analysis revealed that multiple ARID1A mutations were almost exclusively in Cluster 2. ARID1A deficiency impairs DNA double strand break repair (21) and limits chromatin access, impairing IFN expression and promoting an immunosuppressive environment (45).
While OCCC tumors are thought to have low levels of genomic instability, a recent study (11) reported moderate levels of chromosomal gains and losses in OCCC. In this study, we note that those OCCC with ARID1A/PIK3CA mutations had higher levels of chromosomal aneuploidy, while TP53 mutations were more common than previously reported and at more advanced stages of disease (22). That Cluster 2 with multiple ARID1A mutations appears to be ARID1A deficient may explain the greater genomic instability associated with this cluster, as assessed by aneuploidies.
Pathway analyses support a role for immune-related pathways in OCCC, whether arising from the tumor microenvironment or from the tumor cells themselves. Comparing our results with genes identified in previous expression-based clustering analyses of OCCC (14–16), we saw little overlap between genes significantly associated with methylation clusters and those reported by Anglesio and colleagues (15) or the cytokine genes examined by Yanaihara and colleagues (15), although the numbers of OCCC cases were relatively small in these two studies. Tan and colleagues (16) reported gene expression in 222 ovarian clear cell carcinomas, noting two clusters, epithelial-like and mesenchymal-like. However, there was little overlap between those genes and the genes whose expression was associated with our methylation clusters.
Strengths of this study include utilization of the largest OCCC sample size to date, use of multiple study sites, consideration of genome-wide epigenomics, and incorporation of tumor molecular results where possible. Analysis of outcome differences between the methylation clusters suggested improved prognosis in Cluster 2, but this was complicated by potential changes in survival relationships over time. Although modeling prognostic analyses will require further consideration of potentially time-dependent survival patterns in larger patient collections, current results suggest that immune-related methylation factors may provide an avenue for focused development of potential future therapeutics. A potential weakness of this report is the difference in resolution between the two methylation arrays. However, we found that results were consistent when analyses were restricted to the Illumina Infinium MethylationEPIC BeadChip (85% of cases). Because sample size was limited in the subset analyses presented here, more complete somatic data are needed to further clarify relationships between tumor mutations, DNA methylation, gene expression, and proteomics in OCCC. Greater overall sample size with follow-up data will also enable appropriate statistical evaluation of a variety of interactions and improved assessment of overall and progression-free survival. As the most extensive OCCC methylation study to date, this study represents a foundation on which to build future clinical, molecular, and epidemiologic investigation.
Authors' Disclosures
B. Weigelt reports personal fees from Repare Therapeutics outside the submitted work. Y. Chiew reports grants from National Health and Medical Research Council of Australia and The Cancer Institute NSW grants during the conduct of the study; and grants from AstraZeneca outside the submitted work. C. Gourley reports grants from Nicola Murray Foundation, Aprea, Novartis, and Cancer Research UK during the conduct of the study; grants and personal fees from AstraZeneca, MSD, GSK, Clovis, Nucana, and Tesaro; personal fees from Foundation One, Chugai, and Cor2Ed; grants and personal fees from Sierra Oncology, personal fees from Takeda, grants from BerGenBio and Medannexin outside the submitted work; in addition, C. Gourley has a patent for Molecular Diagnostic Test for Cancer PCT Patent Application No. PCT/US12/40805 pending and issued, a patent for Molecular Diagnostic Test for Cancer PCT Patent Application No. GB2013/053202 issued, and a patent for Molecular Diagnostic Test for Cancer PCT Patent Application No. PCT/GB2015/050352 issued. C.J. Kennedy reports grants from National Health and Medical Research Council of Australia and The Cancer Institute New South Wales during the conduct of the study. J.D. Brenton reports grants from Cancer Research UK during the conduct of the study; personal fees from AstraZeneca and GSK outside the submitted work. A. DeFazio reports grants from National Health and Medical Research Council of Australia and The Cancer Institute NSW during the conduct of the study; grants from AstraZeneca outside the submitted work. R. Drapkin reports personal fees from Repare Therapeutics, Boehringer Laboratories, and TwoXAR, Inc. outside the submitted work. D.G. Huntsman is a founder and shareholder of Canexia Health. B.Y. Karlan reports grants from American Cancer Society during the conduct of the study. J. Konner reports advisory board memberships for Merck, AstraZeneca, and Clovis Oncology. E. Papaemmanuil reports other support from Isabl, Inc. outside the submitted work. K.L. Bolton reports grants from Bristol Myers Squibb outside the submitted work. No disclosures were reported by the other authors.
Authors' Contributions
J.M. Cunningham: Resources, methodology, writing–original draft, writing–review and editing. S.J. Winham: Resources, formal analysis, supervision. C. Wang: Resources. B. Weigelt: Resources. Z. Fu: Resources. S.M. Armasu: Resources, formal analysis, writing–review and editing. B.M. McCauley: Formal analysis. A.H. Brand: Resources. Y.-E. Chiew: Resources. E. Elishaev: Resources. C. Gourley: Resources. C.J. Kennedy: Resources. A. Laslvic: Resources. J. Lester: Resources. A. Piskorz: Resources. M. Sekowska: Resources. J.D. Brenton: Resources. M. Churchman: Resources. A. DeFazio: Resources. R. Drapkin: Resources. K.M. Elias: Resources. D.G. Huntsman: Resources. B.Y. Karlan: Resources. M. Köbel: Resources. J. Konner: Resources. K. Lawrenson: Resources. E. Papaemmanuil: Resources. K.L. Bolton: Conceptualization, investigation, project administration. F. Modugno: Resources, writing–review and editing. E.L. Goode: Conceptualization.
Acknowledgments
This work was supported by Brigham and Women's Hospital: Foundation for Women's Cancer as part of the Reproductive Scientist Development Program, Honorable Tina Brozman Foundation, Minnesota Ovarian Cancer Alliance, Deborah and Robert First Family Foundation, Greg and Peggy Strakosch, the Saltonstall Foundation, the Potter Foundation, and the Brigham Ovarian Cancer Research Fund.
Cedars Sinai Medical Center: The work was supported in part by the American Cancer Society SIOP-06-258-01-COUN (K. Lawrenson).
Mayo Clinic: R21-CA222867 (E.L. Goode and J.M. Cunningham), R01-CA248288 (E.L. Goode), and P50-CA136393 (E.L. Goode).
Memorial Sloan Kettering Cancer Center: This work was funded in part by the NCI Cancer Center Core Grant No. P30-CA008748 (MSK: B. Weigelt, J. Konner, and E. Papaemmanuil). B. Weigelt is funded in part by Cycle for Survival and Breast Cancer Research Foundation grants.
University of Edinburgh: We thank the Nicola Murray Foundation for their generous support of the laboratory and the NRS Lothian Bioresource volunteers for their participation. We also acknowledge the NHS Trusts and staff for their contribution to this work.
University of Pittsburgh: NIH SPORE in ovarian cancer (P50 CA228991, to E. Elishaev and F. Modugno), the Penn Medicine Translational Center of Excellence in Ovarian Cancer, the Dr. Miriam and Sheldon G. Adelson Medical Research Foundation, Tina's Wish Foundation, and the Claneil Foundation.
University of Pittsburgh: This project used the UPMC Hillman Cancer Center and Tissue and Research Pathology/Pitt Biospecimen Core shared resource which is supported in part by award P30CA047904 (to F. Modugno).
Westmead Hospital: We thank all the women who participated in the GynBiobank, and gratefully acknowledge the Departments of Gynaecological Oncology, Medical Oncology and Anatomical Pathology at Westmead Hospital, Sydney. The Gynaecological Oncology Biobank at Westmead (GynBiobank), a member of the Australasian Biospecimen Network-Oncology group, was funded by the National Health and Medical Research Council of Australia (ID 310670 and ID 628903, to A. DeFazio) and the Cancer Institute NSW (12/RIG/1-17 & 15/RIG/1-16, to A. DeFazio).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.