Abstract
Despite advances in diagnostics, fewer than 5% of patients with periampullary tumors experience an overall survival of five years or more. Periampullary tumors are neoplasms that arise in the vicinity of the ampulla of Vater, an enlargement of liver and pancreas ducts where they join and enter the small intestine. In this study, we analyzed copy number aberrations using Affymetrix SNP 6.0 arrays in 60 periampullary adenocarcinomas from Oslo University Hospital to identify genome-wide copy number aberrations, putative driver genes, deregulated pathways, and potential prognostic markers. Results were validated in a separate cohort derived from The Cancer Genome Atlas Consortium (n = 127). In contrast to many other solid tumors, periampullary adenocarcinomas exhibited more frequent genomic deletions than gains. Genes in the frequently codeleted regions 17p13 and 18q21/22 were associated with cell cycle, apoptosis, and p53 and Wnt signaling. By integrating genomics and transcriptomics data from the same patients, we identified CCNE1 and ERBB2 as candidate driver genes. Morphologic subtypes of periampullary adenocarcinomas (i.e., pancreatobiliary or intestinal) harbor many common genomic aberrations. However, gains of 13q and 3q and deletions of 5q were specific to the intestinal subtype. Our study also suggests that the PAM50 classifier can identify a subgroup of patients with a high proliferation rate and impaired survival. Furthermore, gain of 18p11 (18p11.21-23, 18p11.31-32) and 19q13 (19q13.2, 19q13.31-32) and subsequent overexpression of the genes in these loci were associated with impaired survival. Our work identifies potential prognostic markers for periampullary tumors, the genetic characterization of which has lagged. Cancer Res; 76(17); 5092–102. ©2016 AACR.
Introduction
Pancreatic cancer is the fourth most common cause of cancer-related deaths in Western countries, and it is projected to be the second leading cause of cancer-related death by 2030 (1). The incidence and mortality rate for pancreatic cancer are almost equal and the 5-year survival rate is <5%. Across tumor types, tumor evolution is driven either by mutations or by copy number aberrations (CNA; ref. 2). CNAs play a critical role in activating oncogenes and inactivating tumor suppressor genes, thereby targeting the hallmarks of cancer (3, 4). Studies have shown that driver alterations in pancreatic cancer include both single-nucleotide variants and large-scale rearrangements (5, 6). In the field of cancer genomics, the focus has been on identifying altered genomic regions and pathways by high-throughput technologies, and relating these to phenotypic effects. This knowledge has already led to substantial advances in diagnostics and therapeutics in other cancers such as targeting the HER2 oncogene in breast cancer patients using the mAb trastuzumab (7).
A number of studies of pancreatic ductal adenocarcinomas (PDAC) using high-throughput data analysis have been published (8–14). Previous studies on relatively small cohorts of PDAC have documented homozygous deletions of 1p, 3p, 6p, 9p, 12q, 13q, 14q, 17p, and 18q, and amplifications of 1q, 2q, 3q, 5q, 7p, 7q, 8q, 11p, 14q, 17q, and 20q (11–13). Recently, 75 PDACs and 25 cell lines derived from PDAC patients were analyzed using Illumina SNP arrays and whole genome SOLID sequencing (5). The results showed that genomic alterations in PDACs are dominated by structural alterations, and the tumors were classified by the number and distribution of structural variation events. Another recent publication identified four subtypes of PDAC, namely squamous, pancreatic progenitor, immunogenic, and aberrantly differentiated endocrine exocrine (14); except for the immunogenic subtype, these overlapped with the Collisson subtypes, namely quasi-mesenchymal, exocrine, and classical (8, 14). Despite these studies, our knowledge about pancreatic cancer subtypes is limited, partly due to small sample sizes and lack of validation in different cohorts.
Here we analyzed the copy number profile of 60 tumors from Oslo University Hospital (OUH; Oslo, Norway) and 127 tumors from The Cancer Genome Atlas (TCGA) cohort using SNP arrays. Because of the low tumor purity frequently observed in pancreatic cancer biopsies, copy number alterations may present as subtle changes in copy number signals. The Battenberg analysis pipeline applied herein (ref. 15; doi: 10.5281/zenodo.16107) performs phasing of both parental haplotypes to increase sensitivity. CNAs were identified in tumors originating from the pancreatic ducts, the bile duct, the ampulla and the duodenum, collectively called periampullary adenocarcinomas in the OUH cohort of 60 patients. Several regions of recurrent gain or loss were identified in the OUH cohort and validated in TCGA cohort, providing a set of putative driver genes and deregulated pathways in periampullary adenocarcinomas. The frequent gain and overexpression of genes was further associated with poor patient prognosis.
Materials and Methods
DNA extraction
DNA was extracted from tumor tissue using the Maxwell Tissue DNA kit on the Maxwell 16 Instrument (Promega). Briefly, five 20-μm sections were homogenized in 300-μL lysis buffer and added to the cartridge. The method is based on purification using paramagnetic particles as a mobile solid phase for capturing, washing, and elution of genomic DNA. Elution volume was 200 μL. DNA was extracted from 6-mL EDTA blood using the QIAamp DNA Blood BioRobot MDx Kit on the BioRobot MDx (Qiagen). This was done at Aros Applied Biotechnology AS and at the Department of Medical Genetics, Oslo University Hospital (Oslo, Norway). The method is based on lysis of the sample using protease, followed by binding of the genomic DNA to a silica-based membrane, washing, and elution in 200-μL buffer AE. DNA from normal tissue was extracted at Aros Applied Biotechnology AS according to their standard operating procedures (SOP) for extraction with a column-based technology (Qiagen). Tissue specimens were homogenized in a Qiagen TissueLyser homogenizer. The amount of tumor cells in the sections used for DNA isolation was estimated on HE-stained sections cut before and after the sections used for DNA isolation.
Tumor and matching normal samples
The OUH cohort comprised 60 samples of fresh frozen tumor tissue originating in the four periampullary locations, with corresponding normal DNA samples from EDTA blood: 28 from pancreas, 4 from bile duct, 6 from ampulla of pancreatobiliary type, 7 from ampulla of intestinal type, 9 from duodenum, 3 intraductal papillary mucinous neoplasia (IPMN) samples, and 3 samples from xenograft cell lines generated from PDAC patients. All samples were analyzed using Affymetrix SNP 6.0 arrays. The xenograft cell lines were generated at Oslo University Hospital (OUH) between 2010 and 2012. Fresh, surgically excised primary pancreatic adenocarcinoma material was kept on ice and transported directly to the animal facility within 2 hours. STR fingerprinting of the cell lines and the corresponding primary tumors was performed at the genotyping core facility at OUH. The latest mycoplasma test for all three cell lines was done on 09.02.15; all tested negative (16). Furthermore, for validation of the results, 127 PDAC samples from the TCGA cohort (https://tcga-data.nci.nih.gov/tcga/) were analyzed.
Affymetrix SNP 6.0 arrays
The Affymetrix SNP 6.0 arrays contain 1.8 million genetic markers, comprising 906,600 SNPs and 946,000 copy number probes. DNA digestion, labeling, and hybridization were performed according to the manufacturer's instructions (Affymetrix).
Statistical analysis
Copy number aberration profiles were generated for the OUH (n = 60) and the TCGA (n = 127) cohorts. Segmental copy number information was derived for each sample using the Battenberg pipeline (https://github.com/cancerit/cgpBattenberg/) as described previously (15) to estimate tumor cell fraction, tumor ploidy, and copy numbers. The Battenberg pipeline has high sensitivity for samples with low cellularity, which is frequently observed in pancreatic tumors. Briefly, the tool phases heterozygous SNPs using Impute2 (17) with the 1000 Genomes genotypes as a reference panel, and corrects phasing errors in regions with copy number changes through segmentation (18). After segmentation of the resulting B-allele frequency (BAF) values, t tests are performed on the BAFs of each copy number segment to determine whether they correspond to the value expected from a fully clonal copy number change. If not, the copy number segment is represented as a mixture of two different copy number states, with the fraction of cells bearing each copy number state estimated from the average BAF of the heterozygous SNPs in that segment. The genome instability index (GII) was calculated for each sample in both cohorts as the fraction of aberrant probes throughout the genome, that is, probes with copy number above or below the ploidy. Correlation analysis was carried out to identify any association between GII and tumor ploidy.
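The GII definition can be illustrated with a short sketch. This is not the authors' code: it assumes that each probe already carries a total copy number, and that "aberrant" means differing from the tumor ploidy rounded to the nearest integer. The rounding rule and the function name are assumptions made for illustration.

```python
def genome_instability_index(probe_copy_numbers, ploidy):
    """Fraction of probes whose total copy number is above or below the ploidy."""
    if not probe_copy_numbers:
        raise ValueError("no probes supplied")
    # Assumption: the comparison baseline is the ploidy rounded to an integer.
    baseline = round(ploidy)
    aberrant = sum(1 for cn in probe_copy_numbers if cn != baseline)
    return aberrant / len(probe_copy_numbers)


# Hypothetical toy profile: 10 probes from a near-diploid tumor.
toy_probes = [2, 2, 3, 1, 2, 2, 2, 4, 2, 2]
toy_gii = genome_instability_index(toy_probes, ploidy=2.1)
```

In the toy profile, three of ten probes deviate from the baseline of 2, so the index is 0.3.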
Frequency plots
For each tumor, an aberration score was calculated per copy number segment. The aberration score was set to 1 if the total copy number of the segment was larger than the ploidy of the tumor, corresponding to a copy number gain, and to −1 if it was smaller than the ploidy of the tumor, corresponding to a deletion. Remaining segments were scored as zero. The frequency plots were generated on the basis of the aberration scores of all samples per segment. The whole genome allelic aberration frequency plots for the OUH cohort, based on the four anatomical locations (pancreatic ducts, bile duct, ampulla, and duodenum) and the two morphologies (pancreatobiliary and intestinal), and for validation in the TCGA cohort, were plotted using the ggplot2 library in R version 3.1.2. The radial plots were drawn for regions that differed significantly (P < 0.001, χ2 test) between the samples under comparison. Hierarchical clustering of the OUH and the TCGA cohort samples was done on cytobands using Spearman's distance measure and the complete linkage method, where gain was given a score of 1, deletion a score of −1, and all other cytobands a score of 0.
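The scoring and frequency calculation can be sketched as follows. This is a minimal illustration under stated assumptions: segments are already aligned to a common set of regions across tumors, and the ploidy is rounded to the nearest integer before comparison (the text does not state how non-integer ploidy is handled). All names and the toy data are hypothetical.

```python
def aberration_score(total_copy_number, ploidy):
    """Return 1 for a gain, -1 for a deletion, and 0 otherwise."""
    baseline = round(ploidy)  # assumption: compare against integer ploidy
    if total_copy_number > baseline:
        return 1
    if total_copy_number < baseline:
        return -1
    return 0


def aberration_frequencies(score_matrix):
    """Fraction of samples with a gain and with a deletion in each region.

    score_matrix holds one list of scores per sample, aligned on the same regions.
    """
    n_samples = len(score_matrix)
    n_regions = len(score_matrix[0])
    gains = [sum(row[i] == 1 for row in score_matrix) / n_samples
             for i in range(n_regions)]
    deletions = [sum(row[i] == -1 for row in score_matrix) / n_samples
                 for i in range(n_regions)]
    return gains, deletions


# Hypothetical toy data: 4 tumors, 3 regions (total copy number per region).
copy_numbers = [[3, 2, 1], [2, 2, 1], [4, 2, 2], [2, 1, 1]]
ploidies = [2.0, 2.1, 3.2, 1.9]
scores = [[aberration_score(cn, p) for cn in row]
          for row, p in zip(copy_numbers, ploidies)]
gain_freq, deletion_freq = aberration_frequencies(scores)
```

The two returned lists correspond to the upper (gain) and lower (deletion) tracks of a frequency plot.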
mRNA expression analysis
The mRNA expression data for the OUH cohort, with GEO accession numbers GSE60979 and GSE58561, have been published previously (9, 16). The data were background corrected and quantile normalized. For the TCGA cohort, gene expression levels were assayed by RNA sequencing and normalized per gene by RNA-Seq by Expectation-Maximization (RSEM). The PAM50 gene signature was used for hierarchical clustering of periampullary adenocarcinomas using Spearman's correlation as the distance measure and the complete linkage method. The proliferation score was calculated as the average gene expression of 11 proliferation genes, namely CCNB1, UBE2C, BIRC5, CDC20, PTTG1, RRM2, MKI67, TYMS, CEP55, KNTC2, and CDCA1 (19). A two-way ANOVA test was used to assess differences in proliferation score between the groups.
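The proliferation score is a plain average over the 11 listed genes. A minimal sketch, assuming one sample's normalized expression values are available as a gene-to-value mapping (the data structure and function name are assumptions):

```python
# The 11 proliferation genes listed in the text.
PROLIFERATION_GENES = [
    "CCNB1", "UBE2C", "BIRC5", "CDC20", "PTTG1", "RRM2",
    "MKI67", "TYMS", "CEP55", "KNTC2", "CDCA1",
]


def proliferation_score(expression):
    """Average normalized expression of the proliferation genes in one sample."""
    missing = [g for g in PROLIFERATION_GENES if g not in expression]
    if missing:
        raise KeyError(f"missing proliferation genes: {missing}")
    total = sum(expression[g] for g in PROLIFERATION_GENES)
    return total / len(PROLIFERATION_GENES)


# Hypothetical sample: expression values 1.0 through 11.0, one per gene.
toy_sample = {g: float(i) for i, g in enumerate(PROLIFERATION_GENES, start=1)}
toy_score = proliferation_score(toy_sample)
```

Raising an error on missing genes, rather than averaging over the genes that happen to be present, is a design choice made here so that scores stay comparable between samples.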
Driver genes in amplified and deleted regions
The genes located in the amplified and deleted regions were mapped using the ENSEMBL (GRCh37 genome assembly; ref. 20) genome annotation for SNP6 arrays. Genes in amplified and deleted chromosomal locations were selected based on two criteria. First, the aberration had to occur in >25% of the samples. Second, the gene had to be present in one of the following lists: the COSMIC cancer gene census (21) of the most frequently mutated genes in cancer, the census of amplified and overexpressed genes in cancer (n = 77; ref. 22), or the tumor suppressor gene list (n = 718; ref. 23).
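The two selection criteria amount to a frequency filter followed by a membership check. The sketch below assumes that per-gene aberration frequencies have already been computed and that the three reference lists have been merged into one set. The gene names and frequencies are placeholders, not data from the study.

```python
def candidate_driver_genes(aberration_frequency, reference_genes, threshold=0.25):
    """Genes aberrant in more than `threshold` of samples that are also on a reference list."""
    return sorted(
        gene
        for gene, freq in aberration_frequency.items()
        if freq > threshold and gene in reference_genes
    )


# Hypothetical inputs for illustration only.
toy_frequencies = {"GENE_A": 0.40, "GENE_B": 0.25, "GENE_C": 0.60, "GENE_D": 0.10}
toy_reference = {"GENE_A", "GENE_B", "GENE_D"}
toy_drivers = candidate_driver_genes(toy_frequencies, toy_reference)
```

Here only GENE_A passes: GENE_B sits exactly at the threshold, which is strict (>25%); GENE_C is not on the reference list; and GENE_D is too infrequent.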
Correlation analysis of copy number aberrations and gene expression data
The Pearson correlation coefficient was calculated to estimate the correlation between the copy number state (the difference between the total copy number and the ploidy of the sample) and the expression data for 52 periampullary adenocarcinomas and three cell lines from the OUH cohort, and for 120 PDACs in the TCGA cohort. Expression data were unavailable for the three IPMNs and two periampullary adenocarcinomas in the OUH cohort and for seven PDACs in the TCGA cohort. The quantile-normalized gene expression values for the OUH cohort and the RSEM-normalized per-gene values for the TCGA cohort were used for the correlation analysis. P values are reported for associations between copy number state and gene expression that were significant at P < 0.05.
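For a single gene, the in cis correlation can be sketched as below. The sign convention (total copy number minus ploidy, so that gains are positive) is an assumption, and the sketch computes only the coefficient, not the P value. The toy values are hypothetical.

```python
from math import sqrt


def copy_number_state(total_copy_number, ploidy):
    """Copy number relative to ploidy (assumed convention: gains are positive)."""
    return total_copy_number - ploidy


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical gene measured in five tumors.
toy_total_cn = [2, 3, 4, 2, 1]
toy_ploidy = [2.0, 2.0, 2.0, 2.0, 2.0]
toy_expression = [5.0, 6.1, 7.2, 5.2, 3.9]
toy_states = [copy_number_state(cn, p) for cn, p in zip(toy_total_cn, toy_ploidy)]
toy_r = pearson_r(toy_states, toy_expression)
```

A gene whose expression tracks its copy number, as in this toy example, gives a coefficient close to 1.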
Gene-set enrichment analysis
We performed KEGG pathway–based analysis using the Web-based Gene Set Analysis Toolkit (WebGestalt; refs. 24, 25) to identify biological pathways enriched for genes amplified or deleted in the OUH and the TCGA cohorts. WebGestalt uses the hypergeometric test to evaluate enrichment; pathways were considered significant at P < 0.05 after Benjamini and Hochberg's correction, and the minimum number of genes required for a pathway to be considered was set to 10.
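The two statistical steps named here, a hypergeometric enrichment test and the Benjamini and Hochberg adjustment, can be sketched with the standard library alone. This illustrates the general technique and is not WebGestalt's implementation; the function names and toy numbers are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without
    replacement from n_background genes, n_pathway of which are in the pathway."""
    upper = min(n_pathway, n_selected)
    favorable = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return favorable / comb(n_background, n_selected)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: four pathways tested.
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_adjusted = benjamini_hochberg(toy_p)
```

A pathway would then be reported when its adjusted P value is below 0.05 and it meets the minimum gene count.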
Survival analysis
Survival analysis was performed using the Kaplan–Meier estimator as implemented in the KMsurv package and the log-rank test in R version 3.1.2. Overall survival (OS) time was calculated from date of surgery to date of death. OS data were obtained from the National Population Registry in Norway. Three patients with distant metastases (M1) at time of resection and one patient who died from cardiac arrest were excluded from the analysis. Disease-free survival (DFS) time was calculated from date of surgery to date of recurrence of disease. Recurrence was defined as radiologic evidence of intra-abdominal soft tissue around the surgical site or of distant metastasis.
Kaplan–Meier survival curves were plotted for focally amplified regions in the periampullary adenocarcinoma samples and for genes located in these regions. For each gene, a sample was designated as high if its expression was above the median expression and as low otherwise. The P value from the log-rank test is reported for significant findings.
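The median split and the Kaplan–Meier estimator can be sketched as follows. This is a simplified illustration rather than the R workflow used in the study: the log-rank test is omitted, and the function names and toy data are hypothetical.

```python
from statistics import median


def split_by_median(expression_by_sample):
    """Label each sample 'high' if above the median expression, else 'low'."""
    cutoff = median(expression_by_sample.values())
    return {s: ("high" if v > cutoff else "low")
            for s, v in expression_by_sample.items()}


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed event.

    events[i] is True for an observed death and False for a censored patient.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up (days) for four patients; the second is censored.
toy_curve = kaplan_meier([100, 200, 300, 400], [True, False, True, True])
toy_groups = split_by_median({"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0})
```

Censored patients leave the risk set without lowering the curve, which is why the estimate drops from 0.75 to 0.375 at day 300 rather than to 0.5.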
Results
Genomic aberrations in periampullary adenocarcinomas
The most frequent genomic aberrations in the periampullary adenocarcinomas were identified in both the OUH and the TCGA cohort. In the OUH cohort, deletion of chromosome 18q was the most frequent event, occurring in 77% of the tumors. Focal deletions of 9p21 and 9p23 were found in 70% of the tumors, loss of 17p13 and 17p12 in 68% of the tumors and loss of 6q and 8p in more than 50% of the tumors. Focal gains were observed for the following locations: 8q24.21 (32%), 18q11.2 (33%), 13q33.3 (30%), 3q25.31 (30%), 7p21.3 (28%), 19q13.2 (25%), 1q25.3 (25%) and 1q31 (25%; Fig. 1).
In the TCGA cohort, deletion of 18q was also the most frequent event, occurring in 78% of the tumors. Focal deletions of 9p21 and 9p23 were found in 62% of the tumors, loss of 17p12 and 17p13 in 74%, and loss of 6q and 8p in 61% and 43% of the tumors, respectively. Focal gains of the chromosomal locations 8q24.21 (43%), 18q11.2 (28%), 13q33.3 (22%), 3q25.31 (20%), 7p21.3 (31%), 19q13.2 (27%), 1q25.3 (42%), and 1q31 (16%) were also observed (Fig. 1).
The frequencies of chromosomal aberrations in the OUH and TCGA cohorts are plotted in Fig. 1. Deletions and gains at the individual tumor level are plotted as a heatmap in Supplementary Fig. S1. The aberration patterns of the periampullary adenocarcinomas were largely similar in the two cohorts. Approximately 30% of the tumors in both cohorts had acquired copy number gains, and most (>75%) of the tumors carried one or more deletions.
Clinicopathologic characteristics of periampullary adenocarcinomas
The clinicopathologic characteristics of the OUH and TCGA cohorts were similar, with the majority of the tumors being stage T3 and grade G2 in both cohorts (Table 1). The average genome instability indices (GII) for the OUH and the TCGA cohorts were 0.33 and 0.37, respectively. The clinicopathologic characteristics of the tumors (excluding three IPMNs and two xenograft cell lines) are presented in Table 1. The clinical data for the cell lines are available elsewhere (16). The three IPMNs included in the study represent benign lesions; hence, their clinical features are not presented with the malignant adenocarcinomas in Table 1. CNA profiles of 2 of the 3 xenograft cell lines were compared with those of their original tumors. One of the cell lines had a lower ploidy than its corresponding tumor (Supplementary Fig. S2). No tissue from the original tumor corresponding to the third xenograft cell line was available for CNA profiling. The three IPMN samples were more normal-like, with few chromosomal aberrations, such as deletions of chromosome 9p and 10q and amplification of 1q (Supplementary Fig. S3). These aberrations could be early events in periampullary adenocarcinomas.
Clinical features | OUH (n = 55), frequency | TCGA (n = 127), frequency |
---|---|---|
Gender | ||
Female | 30 (55%) | 52 (41%) |
Male | 25 (45%) | 75 (59%) |
Type | ||
PDAC | 29 (52%) | 111 (87%) |
PDAC-other subtypes | — | 16 (13%) |
Bile duct | 4 (7%) | — |
Ampulla pancreatobiliary subtype | 6 (10%) | — |
Ampulla intestinal subtype | 7 (12%) | — |
Duodenum | 9 (16%) | — |
pT | ||
T1 | 4 (7%) | 2 (1%) |
T2 | 9 (16%) | 11 (9%) |
T3 | 36 (65%) | 110 (87%) |
T4 | 6 (11%) | 3 (2%) |
TX | — | 1 (1%) |
N | ||
N0 | 19 (35%) | 34 (27%) |
N1 | 35 (64%) | 91 (72%) |
N2 | 1 (2%) | 0 (0%) |
NX | — | 2 (1%) |
M | ||
M0 | 52 (95%) | 59 (47%) |
M1 | 3 (5%) | 4 (3%) |
MX | — | 64 (50%) |
R | ||
R0 | 37 (67%) | 68 (54%) |
R1 | 18 (33%) | 42 (33%) |
R2 | — | 4 (3%) |
RX | — | 4 (3%) |
Not available | — | 9 (7%) |
Differentiation/grade | ||
Well (G1) | 18 (33%) | 14 (11%) |
Moderately (G2) | 37 (67%) | 72 (57%) |
Poor (G3) | — | 39 (31%) |
Undetermined (GX) | — | 2 (1%) |
Mean overall survival | 578 days | 247 days |
Median disease free survival | 259 days | — |
Median age | 65 years | 66 years |
NOTE: The original primary tumors of two cell lines were present in 55 periampullary adenocarcinomas and three IPMNs were benign lesions, hence the clinical features of these five samples are not presented in the table. For comparative reasons, the OS and DFS in patients from the OUH cohort were only calculated for PDAC as the TCGA cohort is primarily composed of PDAC tumors.
Abbreviations: M, metastasis status; N, nodal status; pT, tumor size; R, resection margin.
Correlation analysis between ploidy and GII showed a positive correlation for both the OUH cohort (Pearson correlation 0.48; P < 0.0001) and the TCGA cohort (Pearson correlation 0.72; P < 0.0001; Supplementary Fig. S4).
Copy number changes in periampullary adenocarcinomas stratified by morphology
To identify CNAs that had different prevalence between subtypes of periampullary adenocarcinomas, frequencies of copy number gains and deletions were calculated for morphologic subgroups (Fig. 2A) and sites of origin (Supplementary Fig. S5). The aberrations specific to morphologic subgroups and sites of origin (P < 0.05, χ2 test) are plotted in Fig. 2B and C.
Genomic aberrations specific to the intestinal subtype include losses of 4q and 5q and gains of 3q and 13q chromosomal loci. Also, gains at chromosomes 13q14.3/22.1/32.1/34 and deletions of loci in chromosomes 5q11.2/13.3/21.3, 18p11.22/11.23/11.31, and 18q12.3 are more evident in the intestinal than in the pancreatobiliary subtype (Fig. 2A and B). Despite small sample sizes (n = 41 for the pancreatobiliary and n = 16 for the intestinal subtype), these differences are highly significant (P < 0.001; χ2 test). In contrast, deletions of 6q, 8p, 9p, 17p, and 18q21 were more common in the pancreatobiliary than in the intestinal subtype (Fig. 2B). Focal deletion of 18q11 was more frequent in the ampulla of the intestinal subtype and the duodenum than in the tumors of the pancreatobiliary subtype (P < 0.001; χ2 test; Fig. 2C). The deletions of 4q (57%) and 5q (57%) were observed in the duodenum and the ampulla of intestinal subtype, respectively.
Because of the strikingly different aberration patterns in the two morphologic subtypes of periampullary adenocarcinomas, we hypothesized that the PAM50 gene signature, initially developed for classification of breast cancer subtypes (19) and also used to retrieve prognostic information in non–small cell lung cancers (26), may also provide prognostic information in periampullary adenocarcinomas. Thus, we clustered the periampullary adenocarcinomas using the PAM50 gene signature. The PAM50 gene signature clustered the samples broadly into intestinal and pancreatobiliary subtypes, and the latter clustered further into basal-like and classical groups (Fig. 3A). Clustering our data using the gene signatures defined by Moffitt (n = 50; ref. 27) and Collisson (n = 62; ref. 8) showed that the PAM50 groups overlapped with Moffitt's basal and classical subtypes, and that the PAM50 basal-like subtype overlapped with Collisson's quasi-mesenchymal subtype. Collisson's exocrine subtype was absent in the OUH cohort, and Collisson's classical subtype was mixed with both the PAM50 basal-like and classical subtypes (Fig. 3A). The tumors in the basal-like cluster were poorly differentiated and highly proliferative compared with the classical subtype (P = 5.9e−05, two-way ANOVA test; Fig. 3A and B).
Driver genes in periampullary adenocarcinomas
Putative driver genes were identified in the most frequently deleted and amplified chromosomal regions by mapping the genes to known oncogene and tumor suppressor gene lists as described in Materials and Methods. We identified putative driver genes in each chromosomal locus, and the average frequencies of putative driver genes in deleted and amplified genomic loci for both cohorts are presented in Fig. 4. Genes located in frequently deleted regions include RUNX3 and EPHB2 on chromosome 1p; PBRM1 and LTF on chromosome 3p; MYB and PRDM1 on chromosome 6; CDKN2A and CDKN2B on chromosome 9; and MAP2K4 and PIK3R5 on chromosome 17. Multiple genes, including MAPK4, SMAD2, SMAD4, DCC, and BCL2, were found on chromosome 18, deletions of which were the most frequent events in both cohorts. The genes AKT3 on chromosome 1q; EGFR and PIK3CG on chromosome 7; MYC and PTK2 on chromosome 8; ERCC5 on chromosome 13; and CCNE1 on chromosome 19 were typically amplified. Supplementary Table S1 shows the aberration frequencies of amplified or deleted genes in periampullary adenocarcinomas in both cohorts. Putative driver genes in the regions that were exclusively aberrant in the intestinal subtype were KLF5, RAP2A, and IRS2 on chromosome 13, which were amplified, and PIK3R1, PLK2, and PPAP2A on chromosome 5 and PTPRM on chromosome 18p, which were deleted. These genes are known tumor suppressors or oncogenes and were frequently deleted or amplified in our intestinal subtype tumors.
Integrated analysis of copy number and gene expression data
To determine genomic hotspots of periampullary adenocarcinomas, we carried out correlation analysis of copy number and gene expression data. Gains or losses of several chromosomal loci were associated with gene expression levels in both cohorts. Upregulation of 974 genes and downregulation of 1,060 genes in the OUH cohort, and upregulation of 3,566 genes and downregulation of 4,953 genes in the TCGA cohort, were associated with CNA in cis. Of the 2,034 genes identified in the OUH cohort, 795 were validated in the TCGA cohort (Supplementary Table S2A–S2D). Expression of genes such as FBXL20 and MED1 on 17q12 (ERBB2 amplicon) and POP4, CCNE1, C19orf12, and UQCRFS1 on 19q12 was highly correlated with copy number in cis (P < 0.001; Pearson's correlation test) in the OUH cohort. The association between expression and copy number of POP4, CCNE1, C19orf12, and UQCRFS1 on 19q12 was validated in the TCGA cohort. In both cohorts, deletions of various chromosomal loci were associated with downregulation of tumor suppressor genes including SMAD2, PBRM1, and TNFRSF10A, and gains were associated with overexpression of oncogenes including JAK2 and FAS (Supplementary Table S2). Several of our reported driver genes were significantly correlated with CNA in cis in both cohorts; examples include NEK3, GRAMD3, PHLPP1, PINX1, MLLT3, and CD274.
Co-occurrence of chromosomal aberrations in periampullary adenocarcinomas
Co-occurrences of the deletion or amplification events were identified in both cohorts to investigate coinvolvement of aberration events in deregulating pathways of periampullary adenocarcinomas. Loss of 17p13 occurred in 63% of the OUH samples and 74% of the TCGA samples, while loss of 18q21/18q22 occurred in 70% of the OUH samples and 79% of the TCGA samples. Simultaneous loss of 17p13 and 18q21/18q22 occurred in 60% of the OUH samples and in 62% of the TCGA samples, a significant level of co-occurrence in both cohorts (P < 0.05; Fisher exact test). The candidate genes located on chromosome 17p13 and 18q21 are involved in cell-cycle regulation [TP53 and YWHAE (17p13), SMAD2 and SMAD4 (18q21)], p53 signaling [TP53 (17p13) and SERPINB5 and PMAIP1 (18q21)], apoptosis [TP53 and PIK3R5 (17p13) and BCL2 (18q21)], and Wnt signaling [TP53 (17p13) and SMAD4 (18q21)].
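Co-occurrence of two deletion events can be tested on a 2 × 2 table of per-sample counts. The sketch below computes a one-sided Fisher exact P value for enrichment of co-occurrence; whether the study used a one-sided or two-sided test is not stated, so the sidedness is an assumption, and the counts are toy values rather than study data.

```python
from math import comb


def fisher_exact_greater(both, only_a, only_b, neither):
    """One-sided Fisher exact P value: probability of observing at least
    `both` co-occurrences given the fixed margins of the 2x2 table."""
    n = both + only_a + only_b + neither
    with_a = both + only_a  # samples carrying event A
    with_b = both + only_b  # samples carrying event B
    upper = min(with_a, with_b)
    favorable = sum(
        comb(with_a, k) * comb(n - with_a, with_b - k)
        for k in range(both, upper + 1)
    )
    return favorable / comb(n, with_b)


# Hypothetical counts: 3 samples with both deletions, 3 with neither.
toy_p_value = fisher_exact_greater(both=3, only_a=0, only_b=0, neither=3)
```

With these toy counts the only table at least as extreme as the observed one is the observed table itself, giving P = 1/20.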
Gene-set enrichment analysis
To identify pathways deregulated in periampullary adenocarcinomas, gene-set enrichment analysis was performed using the WebGestalt tool (24, 25). The pathways significantly enriched in both OUH and TCGA (FDR < 0.05) and genes deregulated in more than 20% of the samples in each pathway are reported in Supplementary Table S3. The top pathways deregulated in both cohorts and pathways significantly enriched in both gene expression and copy number data are reported in Table 2. The pathways associated with frequent codeletions of 17p and 18q (cell cycle, apoptosis, and p53 signaling) are also reported in Table 2.
Pathways | OUH cohort: number of genes | OUH cohort: P | OUH cohort: Padj | TCGA cohort: number of genes | TCGA cohort: P | TCGA cohort: Padj |
---|---|---|---|---|---|---|
MAPK Signaling | 44 | 6.83E−07 | 5.18E−06 | 65 | 2.2E−13 | 3.1E−12 |
Jak–STAT Signaling | 29 | 3.86E−06 | 1.95E−05 | 44 | 6.0E−12 | 5.2E−11 |
Cell cycle | 22 | 0.0001 | 0.0003 | 27 | 1.7E−05 | 3.8E−05 |
p53 Signaling | 13 | 0.0014 | 0.0026 | 20 | 1.8E−06 | 5.9E−06 |
Apoptosis | 14 | 0.005 | 0.0081 | 25 | 1.5E−07 | 6.9E−07 |
Insulin signaling | 19 | 0.0071 | 0.0104 | 22 | 0.007 | 0.008 |
TGF-β signaling | 12 | 0.0219 | 0.0269 | 16 | 0.003 | 0.004 |
Wnt Signaling | 18 | 0.0312 | 0.0374 | 34 | 5.8E−07 | 2.2E−06 |
NOTE: The table shows the enriched pathways, number of genes enriched in the pathway, P values, and FDR.
Prognostic implications of copy number gain
To determine the prognostic implications of copy number gains, survival analysis was performed for the focally amplified regions and for genes located within these regions. We focused on the prognostic value of focal amplicons because an oncogene or tumor suppressor gene is more likely to drive focal rather than large-scale aberrations. Kaplan–Meier survival analyses showed that gain of the chromosomal region 18p11 (18p11.21-23, 18p11.31-32) was associated with decreased DFS and OS at P < 0.01 (log-rank test). Amplifications of the genes RAB12 and COLEC12, located on 18p11.22 and 18p11.32, respectively, were associated with decreased DFS (Fig. 5A–C). Gain of 19q13 (19q13.2, 19q13.31-32) and amplification of the genes SERTAD3 and ERCC1, located on 19q13.2 and 19q13.32, respectively, were associated with decreased OS at P < 0.05 (Fig. 5D–F).
The prognostic relevance of PAM50 classification into intestinal, basal-like, and classical subtypes was assessed by plotting Kaplan–Meier survival curves. Patients with basal-like tumors had the worst survival (P = 0.006; Fig. 3C).
Discussion
The frequencies of gains and losses were comparable between the OUH and TCGA cohorts: in both, approximately 30% of the samples had copy number gains and approximately 75% had deletions as the most frequent events. The validation using the TCGA cohort supports the findings in the OUH cohort with respect to genomic aberration patterns, candidate driver genes, and pathways. The correlation analysis between gene expression and copy number data identified genes that play an important role in pancreatic cancer tumor biology. Of the 9 genes in the 19q12 amplicon, the expression of 4 (CCNE1, POP4, UQCRFS1, and C19orf12) was significantly correlated with gain of 19q12. Studies have identified 19q12 gain in ER-negative grade III breast cancers and have shown that silencing of POP4 and CCNE1 reduces cell viability in cancer cells harboring this amplification (28). Amplification of CCNE1 is typically found in basal-like breast cancers and is associated with increased proliferation (29). We found relatively higher expression of CCNE1 in the basal-like subtype compared with the classical subtype (P < 0.01; t test). Furthermore, gain of 17q12 was associated with overexpression of the FBXL20 and MED1 genes within the ERBB2 amplicon. These coamplified genes are important contributors to cancer progression (30).
Recently, Yachida and colleagues showed similarities between the CNA profiles of ampullary tumors of pancreatobiliary and intestinal types (amplification of 1p13.1, 3q26.2, 8q24.21, and 12q15 and deletion of 18q21.2 and 9p21.3; ref. 31). We found the same deletion events as reported by Yachida and colleagues. However, the reported amplification events were not frequent in our cohort, except for amplifications of 3q26.2 and 8q24.21, which were more frequently observed in the intestinal subtype of ampullary adenocarcinomas. Furthermore, deletion of 5q and amplification of 13q are novel to the intestinal subtype in our study when comparing ampullary tumors of pancreatobiliary and intestinal subtype. We further extended the comparison of pancreatobiliary and intestinal adenocarcinomas to all periampullary locations. We found that intestinal subtype tumors from either the ampulla or the duodenum shared more similarities in their CNA profiles with each other than with the pancreatobiliary type. Gingras and colleagues compared the CNA profiles of bile duct, ampullary, and duodenal periampullary adenocarcinomas and reported unique site-of-origin–specific aberrations such as 5p amplification in the ampulla, 3q amplification in the bile duct, and 13p amplification in the duodenum (32). These site-of-origin–specific aberrations were not unique in our cohort, where the 13p amplification reported for the duodenum was frequently observed in the ampulla of intestinal type. Furthermore, the 3q amplification reported for the bile duct was very common in the intestinal type, and the 5p amplification reported for the ampulla was specific to the pancreatobiliary type. The difference observed could be related to the fact that Gingras and colleagues did not stratify samples based on morphology and focused only on site of origin.
The CNA analysis using the Battenberg pipeline is advantageous for low cellularity samples like periampullary adenocarcinomas. As a consequence, we have high power to detect aberrations specific to individual subtypes. One of the important findings is that the periampullary adenocarcinomas are more distinct at the level of morphology than at the site of origin, which is consistent with our previously published data of mRNA and miRNA expression profiling of the same tumors (9), suggesting the importance of morphology-specific subtyping of periampullary adenocarcinomas. The putative driver genes PLK2, PIK3R1, and PTPRM, associated with morphologic subtypes, were also differentially expressed between the pancreatobiliary and intestinal subtypes in our previous study of the same cohort. In addition, the prognostic relevance of genes like PIK3R1, NEK3, SASH1, and RASA3 on chromosome 17p13 was also defined for the intestinal subtype (9). The intestinal subtype–specific aberration patterns, including deletion of 5q, gain of 3q21-26, overexpression of CCNL1 and KLF5, and downregulation of PLK2 and PPP3CA, were also identified in other tumor types (29, 33–35). Furthermore, subtyping of periampullary adenocarcinomas using the PAM50 classifier identified a subgroup of pancreatobiliary tumors that are basal-like with worse prognosis, as also identified by Moffitt and colleagues; however, the Collisson subtypes were not prognostically different in our cohort. The basal-like subtype is identified in all the classification systems but is named differently: squamous type, quasi-mesenchymal type, and PAM50 basal-type (8, 14, 27). The classifications of periampullary adenocarcinoma samples based on PAM50 and “Moffitt's gene signature” are highly correlated, showing the usefulness of PAM50 in subtyping tumor types other than breast cancer.
The PAM50 gene signature classifies tumors based on degree of differentiation, as it includes genes associated with proliferation, cell cycle, keratins, and cell adhesion (19). This difference in expression pattern between the two subgroups is unlikely to be driven by CNAs, as the aberration patterns between the basal-like and classical subgroups did not significantly differ in either of the two cohorts (data not shown).
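One way to compare the frequency of a given aberration between two subgroups, such as basal-like and classical tumors, is a two-sided Fisher exact test on a 2x2 table of aberrant versus non-aberrant samples. The sketch below is illustrative only: the test used for the comparison reported above is not stated in this section, and the function name `fisher_exact_two_sided` and the counts in the usage line are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Rows are subgroups; columns are samples with and without the aberration.
    The p-value sums the hypergeometric probabilities of all tables with the
    same margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(k):
        # Probability that k of the col1 aberrant samples fall in row 1.
        return comb(row1, k) * comb(row2, col1 - k) / total_tables

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts: 8 of 10 samples aberrant in one subgroup, 1 of 6 in the other.
p_example = fisher_exact_two_sided(8, 2, 1, 5)
```

Applied region by region across the genome, such a test would require correction for multiple testing before any difference between subgroups is called significant.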
Gains of 18p11 in periampullary adenocarcinomas have been linked to poor DFS and OS, and gain of 19q13 to poor DFS. The 19q13 chromosomal locus is commonly amplified in various cancers, including ovarian cancer, breast cancer, pancreatic cancer, and non–small cell lung cancer (36–39). The gain of 19q13 is associated with higher grade, stage, and outcome in pancreatic cancer (38). The overexpression of the genes SERTAD3 and ERCC1, located on 19q13.2 and 19q13.32, respectively, was found to be associated with impaired overall survival in this study. SERTAD3 is a putative oncogene that induces E2F activity and promotes tumor growth (40). The DNA excision repair protein gene ERCC1 is a known prognostic biomarker of head and neck and non–small cell lung cancer (41, 42).
Deletion of 18p11.22 is specific to the intestinal subtype, whereas gain of this locus occurs in ten of the pancreatobiliary samples. 18p11.22 has also been suggested as a novel lung cancer susceptibility locus in never smokers (43). High expression of RAB12 on chromosome 18p11.22 is associated with poor DFS. RAB12 is a Ras oncogene family member and is overexpressed in colorectal cancers (44). Furthermore, the COLEC12 gene is a known prognostic marker in anaplastic thyroid cancer and brain tumors (45, 46). Because of limited clinical annotation of the TCGA data, we were unable to validate these as prognostic markers in the TCGA cohort.
The current study has a relatively large sample size in both cohorts. The results provide new knowledge of the genomic changes characteristic of pancreatic cancer and may prove useful for better understanding the molecular basis of this devastating disease.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: V. Sandhu, O. Myklebost, A.-L. Børressen-Dale, T. Ikdahl, P. Van Loo, E.H. Kure
Development of methodology: D.C. Wedge, O.C. Lingjærde, P. Van Loo, S. Nord
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): I.M. Bowitz Lothe, K.J. Labori, T. Buanes, M.L. Skrede, E. Munthe, O. Myklebost, T. Ikdahl, E.H. Kure
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): V. Sandhu, D.C. Wedge, K.J. Labori, S.C. Dentro, T. Buanes, O. Myklebost, O.C. Lingjærde, A.-L. Børressen-Dale, P. Van Loo, S. Nord, E.H. Kure
Writing, review, and/or revision of the manuscript: V. Sandhu, D.C. Wedge, K.J. Labori, S.C. Dentro, T. Buanes, M.L. Skrede, E. Munthe, O. Myklebost, O.C. Lingjærde, A.-L. Børressen-Dale, T. Ikdahl, P. Van Loo, S. Nord, E.H. Kure
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M.L. Skrede, A.M. Dalsgaard, T. Ikdahl, E.H. Kure
Study supervision: A.-L. Børressen-Dale, S. Nord, E.H. Kure
Acknowledgments
We thank all the patients who participated in the study.
Grant Support
This research was supported by grants from the South-Eastern Regional Health Authority, Hole's Foundation, The Radium Hospital Foundation, Oslo University Hospital, and University College of Southeast Norway. S. Nord was supported by a career grant from the Norwegian Regional Health Authorities (grant number 2014061).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.