Abstract
Purpose: Esophageal adenocarcinoma (EAC) is a lethal malignancy that can develop from the premalignant condition, Barrett's esophagus (BE). Currently, there are no validated simple methods to predict which patients will progress to EAC. A better understanding of the genetic mechanisms driving EAC tumorigenesis is needed to identify new therapeutic targets and develop biomarkers capable of identifying high-risk patients that would benefit from aggressive neoadjuvant therapy. We employed an integrative genomics approach to identify novel genes involved in EAC biology that may serve as useful clinical markers.
Experimental Design: Whole genome tiling-path array comparative genomic hybridization was used to identify significant regions of copy number alteration in 20 EACs and 10 matching BE tissues. Copy number and gene expression data were integrated to identify candidate oncogenes within regions of amplification and multiple additional sample cohorts were assessed to validate candidate genes.
Results: We identified RFC3 as a novel, candidate oncogene activated by amplification in approximately 25% of EAC samples. RFC3 was also amplified in BE from a patient whose EAC harbored amplification and was differentially expressed between nonmalignant and EAC tissues. Copy number gains were detected in other cancer types and RFC3 knockdown inhibited proliferation and anchorage-independent growth of cancer cells with increased copy number but had little effect on those without. Moreover, high RFC3 expression was associated with poor patient outcome in multiple cancer types.
Conclusions: RFC3 is a candidate oncogene amplified in EAC. RFC3 DNA amplification is also prevalent in other epithelial cancer types and RFC3 expression could serve as a prognostic marker. Clin Cancer Res; 18(7); 1936–46. ©2012 AACR.
Esophageal adenocarcinoma (EAC) is a lethal cancer that can develop from the premalignant condition, Barrett's esophagus. Little progress has been made in improving the survival rate for EAC patients; thus, an improved understanding of the genetic alterations driving EAC development is essential to improve patient prognosis.
We employed an integrative genomics approach to identify novel genes involved in EAC tumorigenesis that may serve as clinical markers. We identified RFC3 as a candidate oncogene activated by DNA amplification in approximately 25% of EAC patients. Knockdown of this gene inhibited cancer cell lines with amplification but had little effect on cancer cell lines without the alteration or nonmalignant cells. Furthermore, high RFC3 expression was associated with poor patient outcome in multiple epithelial cancer types. Our findings suggest that activation of RFC3 may play a role in EAC development and that its expression could serve as a prognostic marker.
Introduction
Over the past 3 decades there has been a dramatic rise in the incidence of esophageal adenocarcinoma (EAC), especially in Western countries (1, 2). However, progress in early detection and treatment strategies has been unable to improve the poor 5-year survival rate of 16% (3). Barrett's esophagus (BE) is a premalignant condition that can give rise to EAC. Histologically, BE is characterized by the replacement of the stratified squamous epithelium of the lower esophagus with a metaplastic columnar epithelium (4). A longer Barrett's segment may be associated with higher risk of progression to EAC, and there is preliminary evidence that genomic instability is a marker for BE progression (4, 5). Improved knowledge of EAC cancer biology and the identification of markers of progression in BE could significantly improve patient prognosis by enabling early intervention.
To discover genes involved in cancer progression, we identified recurrent genetic alterations in EAC tumors and assessed their presence in BE. Analysis of more than 80 EAC tumors revealed RFC3 (Replication Factor C-3 at chromosome 13q13), a gene involved in DNA replication and cell proliferation, to be frequently amplified in EAC tumors. RFC3 amplification was also detected in the BE of a patient, suggesting it could be an early event in tumorigenesis. Furthermore, we showed that RFC3 gain is prevalent in several types of cancer, knockdown of RFC3 has an antiproliferative effect, and that RFC3 expression is associated with poor prognosis in multiple patient cohorts, which collectively suggests an oncogenic role for RFC3.
Materials and Methods
Discovery set (University of Michigan) tissue accrual and processing
The discovery set includes 20 EACs, 10 matched BE tissues, and 6 nonmalignant esophagus tissue samples corresponding to the 10 BEs (Supplementary Table S1). An additional set of 65 EAC tumor specimens was collected for validation (Supplementary Table S1). All samples were collected (fresh frozen in liquid nitrogen and stored at −80°C) with written patient consent and according to the ethics guidelines of the University of Michigan Institutional Review Board from radiation and chemotherapy-naive patients who underwent esophagectomy for adenocarcinoma at the University of Michigan Health System between 1991 and 2004. DNA and RNA were isolated from microdissected tissue that contained at least 70% tumor cell or Barrett's metaplasia content (for more details, see Supplementary Methods).
Array comparative genomic hybridization and GISTIC analysis for altered regions
A total of 20 EACs, 10 matched BE cases, and 6 nonmalignant esophageal tissues were profiled as previously described (6, 7). The EAC profiles were analyzed using the GISTIC (Genomic identification of significant targets in cancer) algorithm for copy number alteration frequency and amplitude, using the following parameters: q-value threshold of 0.10, refgene file Hg18, amplification threshold 0.1, deletion threshold 0.1, join segment size = 2 (8). CNAs were filtered for natural copy number variants within the GISTIC analysis (9). High-level copy number amplifications and deletions were defined as DNA segments with log2 ratios more than 0.8 or log2 ratios less than −1.3, respectively.
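The log2 ratio cutoffs quoted above can be illustrated with a minimal sketch. The function name and labels are hypothetical, and the sketch assumes the GISTIC deletion threshold of 0.1 is applied as a log2 ratio of −0.1; it is not the GISTIC algorithm itself, which also scores frequency and amplitude across samples.

```python
def classify_segment(log2_ratio, amp_high=0.8, del_high=-1.3, gain=0.1, loss=-0.1):
    """Label one DNA segment by its log2 ratio.

    amp_high and del_high are the high-level cutoffs stated in the text.
    gain and loss mirror the GISTIC amplification/deletion thresholds
    (the sign convention for the deletion threshold is an assumption).
    """
    if log2_ratio > amp_high:
        return "high-level amplification"
    if log2_ratio < del_high:
        return "high-level deletion"
    if log2_ratio > gain:
        return "gain"
    if log2_ratio < loss:
        return "loss"
    return "neutral"


# Hypothetical segment values, for illustration only.
for value in (0.95, 0.4, 0.0, -0.5, -1.6):
    print(value, classify_segment(value))
```

Comparisons are strict, so a segment with a log2 ratio of exactly 0.8 is labeled as a gain rather than a high-level amplification.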
Quantitative real-time PCR validation of 13q13 genes
Genomic DNA for the 85 EAC tumors acquired at the University of Michigan was analyzed using quantitative real-time PCR (qPCR) to validate the DNA copy number status of 2 genes situated in the 13q13 amplicon identified by GISTIC (STARD13 and RFC3). Primer sequences and analysis details are provided in Supplementary Methods and Supplementary Table S2. RFC3 copy number status was also assessed by genomic DNA qPCR for the esophageal cell lines OE33, Flo-1, and Het-1a.
Gene expression profiling
Gene expression profiles for 11 of the 20 discovery set EAC tumors were generated using Affymetrix U133A expression arrays as previously described (Supplementary Tables S1 and S3; ref. 10). Expression arrays were carried out by the University of Michigan Microarray Core. The probe displaying the maximum average intensity across the 11 samples was used to assess gene expression when multiple probes for the same gene were present as previously described (10).
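The probe selection rule described above (use the probe with the maximum average intensity when a gene has several probes) can be sketched as follows. The function name, probe identifiers, and intensity values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def select_probe(probe_intensities):
    """Return the probe ID with the highest mean intensity across samples.

    probe_intensities maps a probe ID to a list of intensities,
    one value per tumor sample.
    """
    return max(probe_intensities, key=lambda probe: mean(probe_intensities[probe]))


# Hypothetical probes for one gene measured in three samples.
example = {
    "probe_a": [120.0, 95.0, 110.0],
    "probe_b": [300.0, 280.0, 310.0],
}
print(select_probe(example))
```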
Externally generated SNP, CGH, and expression data
Additional array data were accessed from publicly available sources to investigate RFC3 in external cohorts (Supplementary Fig. S1 and Supplementary Table S3). Data analyses performed on these datasets are described in Supplementary Methods.
Integration of DNA copy number and gene expression data
Spearman's correlation and Mann–Whitney U tests, conducted using MATLAB software, were used to investigate whether increased gene dosage influenced expression for the four 13q13 amplicon genes. A correlation coefficient greater than 0.6 and P values less than 0.05 were considered significant.
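For readers who want to see what the two statistics compute, the following is a minimal standard-library sketch of Spearman's rank correlation and the Mann–Whitney U statistic. It is not the MATLAB code used in the study, all function names are hypothetical, and P values are not computed here.

```python
from statistics import mean


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data.

    Inputs must not be constant, otherwise the denominator is zero.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two Mann-Whitney U statistics."""
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[: len(group_a)])
    u_a = rank_sum_a - len(group_a) * (len(group_a) + 1) / 2
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)
```

Under the criterion stated above, a gene would be flagged when `spearman_rho(copy_number, expression)` exceeds 0.6 and the accompanying P value is below 0.05.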
Cell culture and shRNA-mediated RFC3 knockdown
EAC cell lines, OE33 (Sigma-Aldrich 96070808-1VL) and OE19 (Sigma-Aldrich 96071721-1VL), and breast ductal adenocarcinoma cell line, HCC1395 (ATCC CRL-2324), were cultured in RPMI-1640 media supplemented with 10% FBS and 0.1% Penicillin–Streptomycin (Invitrogen). Flo-1, also an EAC line, and Het-1a, a nonmalignant esophageal cell line, were cultured in Dulbecco's modified Eagle's medium supplemented with 10% FBS and 10× Antibiotic–Antimycotic (Invitrogen 15240-096). DNA was isolated using standard phenol:chloroform extractions and RNA was extracted using TRIzol reagent (Invitrogen). PLKO plasmid constructs containing short hairpin RNAs (shRNA) targeting RFC3 were purchased from Open Biosystems (Catalog RHS4533-NM_181558). Lentiviral production and infections were carried out as previously described (11). RFC3 knockdown was quantified by quantitative reverse transcriptase PCR (qRT-PCR) using TaqMan gene expression assays for RFC3 (Hs00161357_m1) with 18S rRNA (Hs99999901_s1) as an endogenous control (11). Four shRNAs designed to target RFC3 were tested and the one giving the greatest RFC3 mRNA knockdown (R2–TRCN0000072649) was used for cell model experiments. Knockdowns were also measured by Western blotting, as described in Supplementary Methods.
MTT cell proliferation assay
OE33, OE19, HCC1395, Flo-1, and Het-1a cell lines were stably transduced with shRNA targeting RFC3 and were investigated for differences in cell proliferation rates using the MTT assay (Trevigen) as previously described (11). Vector-only lines (OE33-PLKO, OE19-PLKO, HCC1395-PLKO, Flo-1-PLKO, and Het-1a-PLKO) served as controls to assess proliferation in the knockdown lines (OE33-R2, OE19-R2, HCC1395-R2, Flo-1-R2, and Het-1a-R2). Assay details are further described in Supplementary Methods.
Colony formation assay
Anchorage-independent growth was assessed in stably transfected OE33, OE19, HCC1395, Flo-1 and Het-1a PLKO and RFC3 knockdown cell lines by the soft agar method as previously described (11). Cells were plated at a density of 1,000 cells per well in supplemented media containing 0.3% low melting point agarose. Each cell line was seeded in triplicate and cultured for 2 (OE33 and Het-1a) or 3 weeks (HCC1395, OE19, and Flo-1) at 37°C in 5% CO2. Colonies were stained with MTT, counted, and the mean ± SEM were normalized to the average of control (PLKO) cells.
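The normalization step described above (mean ± SEM of triplicate colony counts, expressed relative to the average of the PLKO control) might look like the sketch below. The function name and the counts are hypothetical, and the sketch assumes each replicate is divided by the control mean before the mean and SEM are taken.

```python
from statistics import mean, stdev


def normalized_mean_sem(counts, control_counts):
    """Normalize replicate colony counts to the control mean.

    Returns (mean, SEM) of the normalized replicates. SEM is the sample
    standard deviation divided by the square root of the replicate number.
    """
    control_mean = mean(control_counts)
    normalized = [count / control_mean for count in counts]
    sem = stdev(normalized) / len(normalized) ** 0.5
    return mean(normalized), sem


# Hypothetical triplicate counts for a control and a knockdown line.
control = [100, 100, 100]
knockdown = [50, 50, 50]
print(normalized_mean_sem(knockdown, control))
```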
Survival analysis
Gene expression and patient survival data for 6 independent tumor datasets were obtained including 1 EAC, 1 breast cancer, and 4 lung cancer datasets (Supplementary Table S3). For each dataset, patients were grouped into tertiles based on expression levels (i.e., low, middle, and high) and survival times of patients with expression values in the lowest tertile were compared with those of patients whose expression ranked in the highest tertile. Survival associations were tested for RFC3, PDS5B, KL, and STARD13. Log-rank Mantel–Cox tests were done and Kaplan–Meier survival curves were generated using GraphPad Prism 5 software to assess RFC3 expression associations with patient survival. A P value less than 0.05 was considered significant. Multivariate analyses were carried out using the robust likelihood–based survival modeling package in R (rbsurv; ref. 12) as described in Supplementary Methods.
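The tertile comparison described above can be sketched with the standard library alone. This is an illustrative implementation of a rank-based tertile split and a two-group log-rank (Mantel–Cox) test, not the GraphPad Prism procedure used in the study; the function names are hypothetical, and the handling of ties and of cohort sizes not divisible by 3 is an assumption.

```python
from math import erfc, sqrt


def split_tertiles(expression):
    """Split patient IDs into (low, middle, high) groups by expression rank.

    expression maps a patient ID to an expression value. The low and high
    groups each hold floor(n / 3) patients; the remainder goes to the middle.
    """
    ordered = sorted(expression, key=expression.get)
    n = len(ordered)
    cut_low, cut_high = n // 3, n - n // 3
    return ordered[:cut_low], ordered[cut_low:cut_high], ordered[cut_high:]


def logrank_test(group_a, group_b):
    """Two-group log-rank test.

    Each group is a list of (time, event) pairs, with event = 1 for an
    observed death and 0 for a censored observation.
    Returns (chi-square statistic, P value with 1 degree of freedom).
    """
    event_times = sorted({t for t, e in group_a + group_b if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for time, _ in group_a if time >= t)  # at risk in A
        n_b = sum(1 for time, _ in group_b if time >= t)  # at risk in B
        d_a = sum(1 for time, e in group_a if time == t and e)
        d_b = sum(1 for time, e in group_b if time == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))
```

In use, the low and high groups returned by `split_tertiles` would each be converted to lists of (time, event) pairs and passed to `logrank_test`, with the middle tertile left out of the comparison as described in the text.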
Results
Identification of significant regions of copy number alteration in EAC
We generated tiling resolution comparative genomic hybridization (CGH) profiles for a discovery set of 20 EAC tumors. A total of 11 significant regions of amplification (Fig. 1) and 8 significantly deleted regions (Supplementary Fig. S2) were detected. The characteristics of each region are provided in Fig. 1 and Supplementary Table S4. We identified several frequently reported EAC alterations, including gains of 7p, 8q, and 17q as well as losses of 3p, 5q, and 17p, showing our tumors are consistent with the EAC genomic landscape (5, 13–21).
Significant regions of DNA amplification in EAC. A, GISTIC analysis revealed 11 significantly amplified regions in the discovery set of 20 EAC tumors. Chromosomes are displayed vertically with boundaries indicated by alternating white and black segments. Red bars represent DNA amplification peaks identified by GISTIC. The green vertical line marks the q value threshold, 0.10. Amplification peaks extending past the green threshold line are considered significant. B, genomic coordinates (March 2006, hg18 genome build), size, frequency, and candidate driver genes for the 11 significant regions of DNA amplification identified by GISTIC in the discovery set of EAC tumors (n = 20).
Investigation of amplified regions without known oncogenes
Although both amplifications and deletions are known to be important in cancer development, the gain-of-function effect of gene amplifications makes them ideal targets for the development of biomarkers and strategies for therapeutic intervention. Thus, we chose to focus our study on DNA amplifications, as they are a prominent mechanism of oncogene activation (7). In total, 9 of the 11 regions in Fig. 1 can be attributed to known oncogenes in EAC or other tumor types, such as MYC, ERBB2, and EGFR. Driver oncogenes for the 7q21.11 and 13q13.1 regions remain elusive despite frequent reporting of DNA amplifications on 7q and 13q in EAC copy number studies. In our discovery set, high-level amplifications of 13q13 were more prevalent than amplifications of 7q21; thus, we focused on the 13q13 amplicon which contained only 4 genes: PDS5B, KL, STARD13, and RFC3 (Fig. 2).
DNA amplification of 13q13 in EAC tumors. 13q13 DNA amplification is a recurrent event in the discovery set of EAC tumors. Array CGH copy number profiles for 6 different tumors harboring the 13q13 amplicon are shown (A). Orange shading marks the 13q13 region boundaries identified by GISTIC. Each blue dot is an individual array element, and those shifted to the right (toward +1) of the yellow neutral line (0) exemplify copy number gains whereas those shifted to the left (toward −1) exemplify copy number loss. The numbers −1, 0, and +1 indicate the log2 ratio scale. The 13q13 amplicon is shown in the UCSC genome browser using the March 2006 hg18 build (C). The focal region of amplification is approximately 1.3 Mb in size and encompasses 4 RefSeq genes: PDS5B, KL, STARD13, and RFC3.
Copy number and gene expression integration for 13q13 genes
Next, we sought to identify the driver gene of 13q13 amplification in EAC. Matched copy number and gene expression data for 11 of the 20 discovery set EAC tumors were integrated to determine whether PDS5B, KL, STARD13, or RFC3 exhibited elevated gene expression as a consequence of DNA amplification. Correlation analysis revealed a positive correlation between DNA copy number (using the moving average of log2 ratios) and gene expression for STARD13 (r = 0.65) and RFC3 (r = 0.69). Expression of both genes was significantly higher in tumors with DNA copy number gain than in those without (P = 0.01), providing further evidence that amplification of STARD13 and RFC3 is a mechanism of gene deregulation in EAC tumors (Fig. 3A). To determine whether any of the 4 candidate genes were differentially expressed between EAC and nonmalignant tissue, we interrogated expression levels of PDS5B, KL, STARD13, and RFC3 throughout the histologic progression of EAC in a publicly available gene expression dataset consisting of 8 sets of nonmalignant, BE, and EAC cases, each originating from the same patient (Fig. 3B–E; ref. 22). RFC3 was the only gene differentially expressed between nonmalignant and EAC tumor tissues (P < 0.05). Analyses of our own and externally generated data revealed RFC3 as the only 13q13-amplified gene that exhibited both a strong correlation between gene dosage and expression and differential expression in tumors relative to nonmalignant tissue. Thus, we concluded that RFC3 is the driver gene of the 13q13 amplicon and a novel candidate oncogene in EAC development.
Integration of DNA copy number and gene expression identifies RFC3 as the driver oncogene of the 13q13 amplicon in EACs. Gene expression of the 4 genes located within the 13q13 amplicon (PDS5B, KL, STARD13, and RFC3) was compared in EAC tumors with 13q13 DNA amplification or gain (green) versus tumors without amplification or gain of 13q13 (blue; A). Only 2 of the genes in the 13q13 amplicon, STARD13 and RFC3, had significantly higher expression in tumors with amplification and gain (U test, P < 0.05), suggesting increased dosage of these genes drives their overexpression. Gene expression values are scaled across samples from 0 to 100. Gene expression levels for the 4 genes were assessed in a publicly available data set of 8 groups of matched normal, BE, and EAC tumor samples from the same patient (B–E). Only RFC3 is differentially expressed between tumor and normal samples (U test, P < 0.05).
RFC3 amplification is a prevalent feature of EAC tumors
To validate our array CGH data, we conducted qPCR on genomic DNA for each EAC tumor in the discovery set. A strong correlation between qPCR gene dosage and moving average array CGH log2 ratios was observed for RFC3 (r = 0.81, P < 0.05), validating our array CGH findings (Supplementary Table S5). To investigate the prevalence of RFC3 amplification, we carried out genomic qPCR to assess gene copy number in an additional set of 65 EAC tumors. DNA copy number gain, defined as greater than 1.6-fold increase in gene dosage relative to normal (diploid) tissue, was observed in 22 of 85 (26%) samples. High-level amplification of RFC3, defined as greater than 2-fold increase in gene dosage relative to normal (diploid) tissue, was observed in 19% of tumors. We also assessed RFC3 copy number status in a publicly available dataset (GSE22524), which showed RFC3 was gained or amplified in 2 of 7 EACs. Lastly, we looked for RFC3 amplification in BE samples obtained from the same patients in the discovery set whose tumors we profiled with array CGH (Supplementary Table S1). In the 10 EAC cases that had corresponding BE samples examined, one (10%) harbored RFC3 amplification. RFC3 was amplified in both the tumor and corresponding BE tissue, indicating it may play a role in the progression of BE to EAC (Fig. 4). These findings showed RFC3 DNA amplification is prevalent in a broad spectrum of EAC tumors and is not limited to our study.
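The qPCR dosage cutoffs above can be expressed as a short sketch. The function names and the example fold changes are hypothetical, and the sketch assumes that the reported gain frequency (22 of 85) counts every sample above 1.6-fold, including those that also exceed the 2-fold amplification cutoff.

```python
def classify_dosage(fold_change, gain_cutoff=1.6, amp_cutoff=2.0):
    """Label a sample by qPCR gene dosage relative to diploid tissue."""
    if fold_change > amp_cutoff:
        return "high-level amplification"
    if fold_change > gain_cutoff:
        return "gain"
    return "no gain"


def gain_prevalence(fold_changes, gain_cutoff=1.6):
    """Fraction of samples above the gain cutoff (amplifications included)."""
    return sum(fc > gain_cutoff for fc in fold_changes) / len(fold_changes)


# Hypothetical fold changes for four samples.
samples = [1.0, 1.8, 2.4, 0.9]
print([classify_dosage(fc) for fc in samples])
print(gain_prevalence(samples))

# The frequency reported in the text: 22 of 85 samples, about 26%.
print(round(22 / 85 * 100))
```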
RFC3 DNA amplification may be an early event in EAC tumorigenesis. Comparison of copy number profiles generated from nonmalignant BE and associated EAC tissues, each set derived from individual patients in the Michigan discovery set, revealed 13q13 amplification was present in a BE and retained in the corresponding EAC tumor.
RFC3 is a broad-spectrum oncogene amplified in cancers from tissues of diverse origin
Because a number of established oncogenes are activated in a broad spectrum of cancer tissues, we investigated whether RFC3 was altered in cancers other than EAC. To address this question we analyzed copy number data for more than 700 cancer cell lines spanning 31 different tissue types. RFC3 copy number gain (defined as more than 2 copies) was detected in 14% of all cases in 23 tissue types, whereas 6% of all malignancies harbored gains of more than 3 copies. A variety of cancer tissues, most of which were epithelial in origin, displayed elevated RFC3 gene dosage (Supplementary Fig. S3A). Interestingly, RFC3 gains were most prevalent in cancers of the large intestine and esophagus, further supporting our findings. Similar to EAC tumors, cell lines harboring RFC3 copy number gains showed higher expression than those without gain (P < 0.0001; Supplementary Fig. S3B).
Knockdown of RFC3 inhibits proliferation and anchorage-independent growth
To assess the functional role of RFC3 in esophageal tumorigenesis, we conducted shRNA-mediated RFC3 knockdown. Array CGH for the commonly used EAC model cell line, OE33, revealed a 13q13 DNA amplification encompassing RFC3 (Supplementary Fig. S4A). We confirmed the presence of RFC3 DNA amplification in OE33 by analyzing additional copy number profiles independently generated using single-nucleotide polymorphism (SNP) arrays by the Wellcome Trust Sanger Institute and GlaxoSmithKline (Supplementary Fig. S4A), and by genomic DNA qPCR (data not shown). Knockdowns were also done in additional EAC cell lines, OE19 (RFC3 gain, data not shown) and Flo-1 (no gain of RFC3, data not shown), as well as Het-1a, a nonmalignant esophageal line (no gain of RFC3, data not shown). Consistent with the copy number status, cell lines with increased gene dosage also displayed higher expression of RFC3 as determined by qPCR (Supplementary Fig. S5). We hypothesized that if RFC3 plays an oncogenic role, OE33 and OE19 (lines harboring RFC3 gains) would be dependent on RFC3 expression, and that knockdown would result in a reduction in cell proliferation and anchorage-independent growth, whereas Flo-1 and Het-1a, which lack RFC3 copy number gains, would not be affected.
Four separate shRNAs designed to target RFC3 were tested, and the one that consistently showed the greatest knockdown (R2) was used for all subsequent experiments (Methods, Supplementary Fig. S6). Indeed, we observed a significant reduction in cell proliferation and anchorage-independent growth in both OE33 and OE19 R2-knockdown lines relative to the vector-only control lines (Fig. 5A–D, Supplementary Fig. S7 and S8) and minimal to no effect in Flo-1 EAC and nonmalignant Het-1a cells (Fig. 5A, B, F, and G, Supplementary Fig. S7 and S8). Growth inhibition in OE33 was also seen using the second best hairpin (R3)—albeit to a lesser degree than R2, consistent with the lower level of knockdown—reducing the probability of off-target effects (Supplementary Fig. S9). To determine whether RFC3 functions as a putative oncogene in other cancer types, we also inhibited RFC3 in the breast cancer cell line, HCC1395, which harbored RFC3 DNA copy number gain (Supplementary Fig. S4B). Knockdown of RFC3 resulted in similar reductions in expression, cell proliferation, and colony formation in HCC1395 (Figs. 5A, B, and E, Supplementary Fig. S8). These cell experiments supported our hypothesis, showing a significant biologic effect of RFC3 knockdown in cancer cells, consistent with an oncogenic role for RFC3 in cancer.
RFC3 confers a growth advantage to cancer cells, suggesting it may be a broad-spectrum oncogene. shRNA-mediated knockdown of RFC3 was done in the EAC cell lines, OE33, OE19 and the breast ductal carcinoma cell line, HCC1395, all of which harbor RFC3 DNA amplifications; knockdown was also done in the EAC line Flo-1 and nonmalignant esophageal line Het-1a which do not harbor RFC3 gains. RFC3 knockdowns were verified by qRT-PCR and relative expression in the control (PLKO) and knockdown lines (R2) are shown (A). Knockdown of RFC3 caused a prominent reduction in anchorage-independent growth (as measured by colony formation in soft agar) in cancer cell lines with RFC3 amplification compared with those without RFC3 amplification (B). Knockdown of RFC3 in all cancer cell lines with RFC3 gain caused a decrease in cell proliferation, showing the oncogenic effect of RFC3 activation (C–E), whereas knockdown in EAC cells and nonmalignant esophageal cells lacking RFC3 gain had no significant effect on proliferation (F and G). Asterisks indicate significant differences in proliferation and colony formation measurements between PLKO and knockdown lines (Student t test, P < 0.05).
Overexpression of RFC3 is associated with increased cell growth and proliferation pathways
To elucidate the gene networks and biologic pathways to which RFC3 belongs, we conducted a significance analysis of microarrays (SAM) on a panel of EAC tumors (Supplementary Methods and Supplementary Table S3). Expression of 167 genes was positively or negatively correlated with that of RFC3 (Supplementary Table S6). These genes were investigated using Ingenuity Pathway Analysis software to determine whether they were enriched for any annotated cell pathways or functions. Applying this strategy, we found that these genes were enriched in cell growth and proliferation and in DNA replication, recombination, and repair networks (Supplementary Table S7). Moreover, these genes were enriched for involvement in the Wnt/β-catenin signaling pathway. These findings validated the known functional involvement of RFC3 in DNA replication and implicated RFC3 in cancer-promoting pathways and networks, providing additional evidence to support our hypothesis that RFC3 is an important oncogene in cancer.
High RFC3 mRNA levels are associated with poor patient survival
As a final assessment of the importance of increased RFC3 copy number in EAC biology, we sought to determine whether RFC3 gene expression was associated with patient survival in publicly available expression data for 48 EAC tumors (23). Because RFC3 was gained/amplified in approximately 25% to 35% of tumors (approximately 25% from qPCR of all 85 samples, approximately 35% from CGH in the initial discovery set), and this subset of tumors also showed the highest expression of the gene, we segregated the samples by RFC3 expression into tertiles based on the assumption that the highest expressers would likely represent the cases with RFC3 gain/amplification. We then compared survival between the top and bottom patient tertiles (see Methods). Although the association did not reach statistical significance, a trend toward poorer survival in patients with higher RFC3 expression was evident (P = 0.1565; Fig. 6A). We also assessed associations between STARD13, KL, and PDS5B expression and patient survival in this EAC cohort. Consistent with our data that RFC3 is the driver of 13q13 amplification, RFC3 expression had the strongest association with patient survival in EAC. Because the small sample size of EAC tumors may have prevented a significant association with patient survival in EAC and we observed RFC3 amplification in other cancer types including lung and breast, we interrogated 5 additional datasets to determine whether RFC3 expression was associated with patient survival. Comparison of survival times in patients with high versus low expression in 295 breast cancers and 4 independent sets of lung adenocarcinoma (n = 58, n = 82, n = 107, and n = 92) revealed survival was significantly worse in patients with high RFC3 expression in the breast cancer (P = 0.0027) and 2 of 4 lung adenocarcinoma sets (P = 0.0079 and P = 0.0102; Fig. 6B–D). The same trend was evident in the 2 other lung datasets (P = 0.0603 and P = 0.0653).
Of the 4 genes we identified in the 13q13 amplicon, RFC3 had the strongest association with patient survival in each cohort assessed, except for one lung adenocarcinoma dataset, in which STARD13 had the strongest association with patient survival (P = 0.0136).
High RFC3 expression is associated with poor patient survival in multiple cancer types. A Mantel–Cox test was used to assess the association between RFC3 gene expression and patient survival time in publicly available datasets: Peters and colleagues EAC (n = 48; A), Van DeVijver and colleagues breast cancer (n = 295; B), Bild and colleagues lung adenocarcinoma (n = 58; C), and Director's Challenge Consortium (DCC) lung adenocarcinoma (n = 92; D). Survival time among patients having RFC3 expression in the top tertile (blue curves) was compared with that in patients with expression in the bottom tertile (red curves). High RFC3 expression was significantly associated with poor patient survival in the breast and lung cancers (B and D). A trend toward poorer survival in EAC patients with high RFC3 expression was also observed, although it was nonsignificant (A).
Finally, we aimed to determine the prognostic performance of RFC3 in relation to other known oncogenes targeted by amplification in EAC. For this purpose, we conducted robust likelihood-based survival modeling (ref. 12; see Supplementary Methods) using RFC3 along with the known driver genes of 9 other significantly amplified regions identified in the EAC samples (MYB, EGFR, MET, MYC, CCND1, KRAS, ERBB2, GATA6, and CCNE1). This method selects survival-associated genes based on the partial likelihood of the Cox model and can discover multiple sets of genes by iterative forward selection. When applied to the same 5 datasets described above, we found that RFC3 was part of a selected model for 3 of these datasets (Supplementary Table S8). RFC3 and MYC were the amplified oncogenes that most consistently contributed to a prognostic model across the datasets, suggesting that RFC3 may indeed be useful from a prognostic standpoint. Together, the association of RFC3 expression with survival in multiple datasets and cancer types further implies the significance of this gene and suggests RFC3 could potentially be developed as a pan-cancer marker for patient prognosis.
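The idea of selecting genes by the Cox partial likelihood with forward selection can be illustrated with the simplified sketch below. It does not reproduce the robust likelihood-based procedure of ref. 12. It is a stagewise approximation in which previously selected genes enter as a fixed offset and only the new gene's coefficient is fitted. The function names, the Breslow treatment of ties, the capped Newton steps, and the `min_gain` threshold (half the 0.05 critical value of a chi-square with 1 degree of freedom) are all assumptions made for illustration.

```python
import math


def cox_partial_loglik(times, events, eta):
    """Cox partial log-likelihood (Breslow ties) for linear predictor eta."""
    ll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue
        risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        ll += eta[i] - math.log(risk)
    return ll


def fit_one_coefficient(times, events, x, offset, n_iter=20, max_step=1.0):
    """Newton-Raphson fit of a single coefficient b in eta = offset + b * x.
    Steps are capped at max_step to keep the iteration stable."""
    b = 0.0
    for _ in range(n_iter):
        grad, info = 0.0, 0.0
        for i, (t_i, e_i) in enumerate(zip(times, events)):
            if not e_i:
                continue
            s0 = s1 = s2 = 0.0
            for j, t_j in enumerate(times):
                if t_j >= t_i:
                    w = math.exp(offset[j] + b * x[j])
                    s0 += w
                    s1 += w * x[j]
                    s2 += w * x[j] ** 2
            mean = s1 / s0
            grad += x[i] - mean          # score contribution
            info += s2 / s0 - mean ** 2  # observed information contribution
        if info <= 1e-12:
            break
        step = max(-max_step, min(max_step, grad / info))
        b += step
        if abs(step) < 1e-8:
            break
    return b


def forward_select(times, events, genes, max_genes=3, min_gain=1.92):
    """Greedy stagewise selection: at each round add the gene giving the
    largest partial log-likelihood; stop when the gain is below min_gain."""
    offset = [0.0] * len(times)
    current = cox_partial_loglik(times, events, offset)
    remaining = dict(genes)
    selected = []
    while remaining and len(selected) < max_genes:
        best = None
        for name, x in remaining.items():
            b = fit_one_coefficient(times, events, x, offset)
            eta = [o + b * v for o, v in zip(offset, x)]
            ll = cox_partial_loglik(times, events, eta)
            if best is None or ll > best[1]:
                best = (name, ll, eta)
        if best[1] - current < min_gain:
            break
        selected.append(best[0])
        current, offset = best[1], best[2]
        del remaining[best[0]]
    return selected
```

Here `genes` maps a gene name to a list of per-patient expression values in the same order as `times` and `events`. A full multivariable refit at each step, as in a standard Cox forward selection, would replace the fixed offset.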
Discussion
DNA CNAs can play significant roles in tumorigenesis and provide insight into gene deregulation and cancer biology, as exemplified by DNA amplifications of the oncogenes HER-2 and EGFR (24, 25). CNAs in EAC tumors have been studied using a variety of detection methods with various resolutions including conventional CGH, microsatellite mapping, array CGH, and SNP arrays (4). The goal of these studies has generally been to identify the targeted driver genes of CNAs, as they are postulated to have causal roles in EAC tumorigenesis. Findings from copy number profiling studies in EAC have resulted in the compilation of a well-established genomic landscape of EAC; recurrent genomic events include gains of 7p, 8q, and 17q and losses of 3p, 4p, 5q, and 17p (26–31). Most recurrently altered regions in EAC contain candidate or proven oncogenes or tumor suppressors such as MYC, EGFR, CDKN2A, and TP53; however, the driver genes for some frequently altered regions have yet to be discovered.
In this study, we identified and investigated a frequently amplified region of DNA located on chromosome 13q13. Recurrent DNA amplification of this region was initially identified in a discovery set of 20 EAC tumors and was validated in the discovery set and an additional 65 EAC tumors. 13q copy number gains are frequently reported in EACs, including several recent high-resolution studies, although no driver gene has been conclusively identified (5, 21, 26–28, 30, 32). This could reflect the fact that 13q DNA gains are often large, segmental alterations encompassing hundreds of genes, making the elucidation of target genes difficult. Our analysis detected a discrete, minimal common region of DNA amplification in an EAC tumor and matched BE sample encompassing only 4 genes. Integration of copy number and gene expression data revealed RFC3 as the candidate oncogene responsible for driving recurrent 13q13 amplification.
RFC3, replication factor C 3, is one of 5 replication factor subunits that form a multiprotein complex known as activator 1, which is required for DNA replication and repair by the DNA polymerases ϵ and δ. Collectively, the RFC complex acts as a clamp loader, enabling binding of the proliferating cell nuclear antigen (PCNA) clamp onto primed DNA (33). PCNA is required for processive elongation during DNA synthesis, emphasizing the importance of RFC3 in cell proliferation because of its role in DNA replication (33). Human replication factor C genes are highly homologous to RFC genes in yeast, and RFC3 is essential for growth in fission yeast (34, 35). In human cells, RFC3 protein was found associated with the mitogenic transcription factor, MYC, which suggests RFC3 could have an integral role in driving cell proliferation in human cells as well and could explain the antiproliferative knockdown effect we observed in cancer cells (36).
Interestingly, RFC3 expression is regulated by several transcription factors encoded by genes with established roles in cancer. MYCN and SHH, two proteins that promote proliferation in cancer cells, increase RFC3 expression levels whereas tumor suppressors, TP53 and CDKN2A, suppress RFC3 expression (37–40). Expression analysis in a cohort of 52 EACs corroborated these findings, as RFC3 expression was negatively correlated with P53 expression and positively correlated with SHH expression, although the fold change in P53 and SHH expression between high and low expressing RFC3 tumors did not reach our 2-fold criterion for pathway analysis. The transcription factors E2F1 and E2F4, which induce transition of cells from G1 to S-phase in the cell cycle, thereby promoting DNA replication, are also known to bind the RFC3 gene promoter, suggesting they could positively regulate RFC3 (41). Pathway analysis on genes whose expression was associated with RFC3 expression revealed an enrichment of genes in the Wnt/β-catenin signaling pathway, which stimulates the transcription of proliferation-stimulating genes, as well as gene networks involved in cellular growth and proliferation and DNA replication, recombination, and repair. Induction by oncogenes and negative regulation by tumor suppressors, in addition to the involvement of RFC3 in cell pathways and networks with roles in cell proliferation, are consistent with a cancer-promoting role for RFC3. Considering these findings, we hypothesize that RFC3 functions as an oncogene by driving proliferation and enabling continuous expansion of cancer cells.
Other replication factor C genes have also been implicated in tumorigenesis. RFC4 upregulation was discovered in hepatocellular carcinoma (HCC) tissues relative to nonmalignant liver tissue, and knockdown of endogenous RFC4 decreased cellular proliferation, increased apoptosis, and enhanced the chemosensitivity of the HCC cell line HepG2 (42). Moreover, RFC4 seems to be overexpressed in cancer cells from the lung, prostate, colon, stomach, and skin relative to tissue-matched nonmalignant cells (43). Importantly, RFC4 was one of several genes whose expression was upregulated in response to SV40 immortalization in lung fibroblasts, implicating the replication factor C complex in one of the earliest and essential steps in malignant transformation (43). RFC2 is overexpressed in nasopharyngeal carcinoma and was also found associated with the oncogenic protein c-MYC, whereas RFC5 is upregulated in human papillomavirus–positive squamous cell carcinomas of the head and neck (36, 44, 45). Thus, there is abundant evidence implicating replication factor C genes in tumorigenesis, and our study has identified a novel cancer-promoting role for replication factor C3 in EAC. Interestingly, a recent study reported frequent loss-of-function mutations and downregulation of RFC3 in gastric and colorectal cancers (46). The findings of this study and ours suggest RFC3 may act in a tissue-specific manner in the context of cancer.
We validated the biologic effect of RFC3 upregulation in cell model experiments by showing that RFC3 suppression impedes cell proliferation and anchorage-independent growth exclusively in cancer cell lines with copy gain, an observation consistent with our oncogene hypothesis. Although knockdown of RFC3 was less pronounced in the nonmalignant esophageal line Het-1a as compared with all other lines (50% vs. ∼75%–90%), even a 50% reduction in mRNA expression would be expected to cause a phenotypic effect. We are therefore confident in our observation that knockdown of RFC3 has no significant effect on the growth of nonmalignant esophageal cells.
We were also interested in when RFC3 becomes involved during EAC development. Our unique dataset allowed us to screen preneoplastic BE tissues from individuals who progressed to develop EAC. We asked whether RFC3 DNA amplification was present in any of the BE tissues of patients who developed EAC, based on the hypothesis that if present in preneoplastic tissue, RFC3 amplification could be an early, causal event in EAC development. Intriguingly, we observed a high-level, focal DNA amplification encompassing RFC3 in one of the BE samples whose corresponding tumor also harbored the DNA alteration (Fig. 4). However, areas of dysplasia are present in many BE tissues from patients with EAC, and we cannot confirm whether the area showing increased copy number in this specimen consisted of dysplastic or nondysplastic cells. Therefore, we interpret this finding with caution and conclude that further investigation in BE samples is required to confirm whether or not RFC3 DNA amplification is an early event in EAC tumorigenesis. Although genomic data on premalignant BE samples are rare, a segmental copy number gain of 13q was also identified in a BE case in the study by Akagi and colleagues (21), showing the 13q amplicon we observed may be a recurrent event not only in EAC but also in its BE precursors. The observation of RFC3 amplification and 13q copy number gain in BE samples of our own and from external datasets points to the significance of this alteration. These findings suggest that RFC3 amplification may have a role early in the development of EAC and that amplification could potentially be a marker of BE progression to EAC; however, we emphasize that a large prospective cohort is needed to investigate the significance of RFC3 amplification in BE samples.
Finally, we showed that RFC3 DNA copy number alterations are not restricted to EAC but are also prevalent in other cancers (Supplementary Fig. S3A). Moreover, high RFC3 expression was associated with poor patient survival in multiple lung and breast adenocarcinoma datasets, as well as in EACs. Taken together, our data provide multifaceted evidence that RFC3 is a candidate oncogene with potential predictive and prognostic implications in addition to its biologic significance in EAC development, and perhaps, in other malignancies as well.
Transcript Profiling
GEO Accession Numbers: GSE22524, GSE1420, GSE19417, GSE2845, GSE3141. caArray: jacob-00182. caBIG (https://cabig.nci.nih.gov/tools/caArray_GSKdata). Wellcome Trust Sanger Institute CGP Data Archive (http://www.sanger.ac.uk/genetics/CGP/Archive/).
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Acknowledgments
SNP 6.0 cell line data were downloaded from the Wellcome Trust Sanger Institute CGP Data Archive (http://www.sanger.ac.uk/genetics/CGP/Archive/). Authors declare that those who carried out the original analysis and collection of the data bear no responsibility for the further analysis or interpretation of it by the authors. The authors thank Jennifer Kennett and Ivy Tsui for technical assistance.
Grant Support
This work was supported by the Canadian Institutes of Health Research (W.L. Lam), Canadian Cancer Society (W.L. Lam), Canadian Institutes of Health Research Vanier Canada Graduate Scholarships (K.L. Thu and L.A. Pikor), and CIHR Jean-Francois Saint Denis Fellowship in Cancer Research (W.W. Lockwood).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.