Abstract
Cancer genomes maintain a complex array of somatic alterations required for maintenance and progression of the disease, which makes it challenging to identify driver genes amid this genetic disorder. Toward this end, we mapped regions of recurrent amplification in a large collection (n = 392) of primary human cancers and selected 620 genes whose expression is elevated in tumors. An RNAi loss-of-function screen targeting these genes across a panel of 32 cancer cell lines identified potential driver genes. Subsequent functional assays identified SHMT2, a key enzyme in the serine/glycine synthesis pathway, as necessary for tumor cell survival but insufficient for transformation. The 26S proteasomal subunit, PSMB4, was identified as the first proteasomal subunit with oncogenic properties promoting cancer cell survival and tumor growth in vivo. Elevated expression of SHMT2 and PSMB4 was found to be associated with poor prognosis in human cancer, supporting the development of molecular therapies targeting these genes or components of their pathways. Cancer Res; 74(11); 3114–26. ©2014 AACR.
Introduction
Genome-wide documentation of tumor gene copy number, gene expression, and somatic mutations has become standard practice, driven by remarkable technologic advances over the last decade (1, 2). By elucidating the “omic” landscape of tumor cells, we are only now realizing the vast complexity of the diseases that are classed as cancer (3). Assigning function to any gene in this morass of observed changes can be a challenge, and years of collective study have begun to home in on core functionalities regulating specific hallmarks of cancer (4). However, if the long-term goal of diagnostic-driven, individualized therapy is to be realized, we need to begin to understand how these genetic changes sustain tumor growth, progression, and survival. Oncogenes and tumor suppressors are the classical drivers of oncogenesis, but it is still an open question as to which genes are the drivers in any single tumor when as much as 10% to 15% of the genome can be altered by recurrent copy-number abnormalities (5). The task of finding “driver genes” is further hampered by the fact that the majority of copy-number abnormalities are low-level gains, often encompassing large regions of the genome, which contrasts with high-level, focal copy-number gains as exemplified by HER2 (6). Evidence also suggests that individual amplicons contain multiple genes that collectively alter processes important for tumor growth and survival (7) and that there may be equivalent cooperativity between genes in separate but coamplified genomic regions (8, 9). The distinct challenge is to separate passenger genes from driver genes (10).
In the quest to find new driver genes and, hence, potential therapeutic targets, a number of groups have used either a reductionist approach by identifying a region of copy-number gain/loss of clinical interest and testing the functionality of each gene within that region (11, 12), or a holistic approach using RNA interference library screens in a single cell line to find driver genes (13, 14).
The primary goal of this study was to develop a genome-wide oncogenomic screening strategy to identify candidate oncogenic driver genes. Our approach was to identify recurrent regions of copy-number gain and increased expression across multiple solid tumor types, with the hypothesis that these genes represent fundamental processes for sustaining solid tumor growth and survival. We selected 620 genes from commonly amplified regions whose expression is elevated in tumor versus normal and determined the effect on cell viability by RNAi targeting in 32 cancer cell lines. Here, we report the findings of this study, which represents one of the most comprehensive functional oncogenomic screens to date.
Materials and Methods
Cell lines
See Supplementary Table S7 for the full list of cell lines, cell line short tandem repeat (STR) profiles, growth conditions, and transfection conditions used. All cell lines are tested for mycoplasma and cross-contamination and are genetically fingerprinted when new stocks are generated to ensure quality and confirm ancestry.
Cell line fingerprinting: SNP fingerprinting.
Single-nucleotide polymorphism (SNP) genotyping is performed each time new stocks are expanded for cryopreservation. Cell line identity is verified by high-throughput SNP genotyping using Fluidigm multiplexed assays. SNPs were selected on the basis of minor allele frequency and presence on commercial genotyping platforms. SNP profiles are compared with SNP calls from internal and external data (when available) to determine or confirm ancestry. In cases in which data are unavailable or cell line ancestry is questionable, DNA or cell lines are repurchased to perform profiling to confirm cell line ancestry.
SNPs.
The following SNPs were used: rs11746396, rs16928965, rs2172614, rs10050093, rs10828176, rs16888998, rs16999576, rs1912640, rs2355988, rs3125842, rs10018359, rs10410468, rs10834627, rs11083145, rs11100847, rs11638893, rs12537, rs1956898, rs2069492, rs10740186, rs12486048, rs13032222, rs1635191, rs17174920, rs2590442, rs2714679, rs2928432, rs2999156, rs10461909, rs11180435, rs1784232, rs3783412, rs10885378, rs1726254, rs2391691, rs3739422, rs10108245, rs1425916, rs1325922, rs1709795, rs1934395, rs2280916, rs2563263, rs10755578, rs1529192, rs2927899, rs2848745, and rs10977980.
STR profiling.
STR profiles are determined for each line using the Promega PowerPlex 16 System. This is performed once and compared with external STR profiles of cell lines (when available) to determine cell line ancestry.
Loci analyzed.
Sixteen loci were analyzed (fifteen STR loci and Amelogenin for gender identification), including D3S1358, TH01, D21S11, D18S51, Penta E, D5S818, D13S317, D7S820, D16S539, CSF1PO, Penta D, AMEL, vWA, D8S1179, and TPOX.
Cloning
Full-length open reading frame clones were obtained from Invitrogen and Genentech. C-terminal FLAG tags were added by PCR, and the tagged open reading frames were then subcloned into the pLPCX retroviral expression vector (Clontech). NIH/3T3 and MCF10A stable cell lines were generated by infection with retroviral particles expressing FLAG-tagged constructs and selection with puromycin. See Supplementary Table S7 for a full list of clones and primers used.
Gene selection for RNAi screen
An independent set of tumor expression data (source, GeneLogic, Inc.; ref. 15) was used to identify genes upregulated in tumor versus normal samples (breast, 24 normal, 91 tumors; lung, 80 normal, 105 tumors; ovary, 101 normal, 82 tumors; and prostate, 27 normal, 70 tumors). Percentile analysis for differential gene expression (16), a statistical tool that compares both the magnitude and the variability between two sample groups, was used to identify differential gene expression between normal and cancer samples in the GeneLogic (15) database. Genes found within an amplicon that showed a 1.5-fold increase in cancer samples versus normal were selected for the screen.
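As a simplified illustration of the final fold-change filter only, the sketch below keeps amplicon genes whose mean tumor expression is at least 1.5-fold the mean normal expression. It does not implement the percentile analysis (16) used in the study; the function name, the dictionary layout, and the use of group means on a linear scale are assumptions made for illustration.

```python
from statistics import mean

FOLD_CUTOFF = 1.5  # fold increase in cancer versus normal, from the text


def select_overexpressed(expression, amplicon_genes, cutoff=FOLD_CUTOFF):
    """Return amplicon genes whose tumor/normal mean ratio is >= cutoff.

    expression maps gene -> {"normal": [...], "tumor": [...]} holding
    linear-scale (not log) expression values; this layout is hypothetical.
    """
    selected = []
    for gene in amplicon_genes:
        groups = expression.get(gene)
        if groups is None:
            continue  # no expression data for this gene
        normal_mean = mean(groups["normal"])
        if normal_mean <= 0:
            continue  # ratio undefined
        if mean(groups["tumor"]) / normal_mean >= cutoff:
            selected.append(gene)
    return selected


# Toy values for illustration, not data from the study
toy = {
    "GENE_A": {"normal": [100, 120, 110], "tumor": [200, 180, 220]},
    "GENE_B": {"normal": [100, 100, 100], "tumor": [110, 120, 130]},
}
print(select_overexpressed(toy, ["GENE_A", "GENE_B", "GENE_C"]))
```

Genes without expression data (here the placeholder GENE_C) are skipped rather than treated as failures, which is one possible design choice.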
Tumor samples, DNA preparation, and copy-number arrays
Tumor samples had a tumor percentage of >80% as assessed by an expert pathologist. DNA was extracted from frozen tissues and cell lines by a standard protocol using the DNA/RNA Extraction Kit (Qiagen). DNA was then hybridized to the Agilent Human Genome CGH Microarray Kit 244A and genomic copy number was calculated by comparing the intensity ratio between individual tumors and the averaged normal DNA content.
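The ratio calculation can be sketched as follows: each tumor probe intensity is divided by the mean intensity of the normal references and expressed on a log2 scale. The function name, the list-based layout, and the toy intensities are assumptions; array normalization and segmentation are not shown.

```python
import math
from statistics import mean


def log2_ratios(tumor_intensities, normal_intensity_sets):
    """Per-probe log2(tumor / averaged normal).

    tumor_intensities: one intensity per probe for a single tumor.
    normal_intensity_sets: per-probe intensity lists, one list per normal sample.
    """
    ratios = []
    for i, tumor in enumerate(tumor_intensities):
        # Average the normal references at this probe
        normal_avg = mean(sample[i] for sample in normal_intensity_sets)
        ratios.append(math.log2(tumor / normal_avg))
    return ratios


# Toy intensities for three probes, not data from the study
tumor = [400.0, 100.0, 50.0]
normals = [[90.0, 110.0, 100.0], [110.0, 90.0, 100.0]]
print(log2_ratios(tumor, normals))
```

A log2 ratio of 0 corresponds to the normal (two-copy) state, positive values to gain, and negative values to loss.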
RNAi screen
Cells were reverse lipid transfected with pools of four siRNAs; siGENOME siRNAs were used in the primary screen and ON-TARGETplus siRNA was used in the secondary screen and all follow-up studies (Dharmacon). siRNA pools were used at 50 nmol/L and singles at 25 nmol/L. Pooled siRNAs targeting PLK1 and TOX transfection reagent were used as positive controls at 25 nmol/L. Lipids used were Dharmafects 1, 2, 3, and 4 (Dharmacon) and Lipofectamine RNAiMAX (Invitrogen). See Supplementary Table S7 for transfection conditions for each cell line. siRNA–lipid complexes were formed for 1 hour, then cells were seeded on top of complexes. After 5 days, cell viability was assayed using CellTiter-Glo (Promega) and luminescence readout was performed using Envision plate reader (PerkinElmer). Gene selection criteria: Genes were selected from the screen by one of three criteria: (i) correlation of RNAi response with total RNA expression (expression, value >0.3 Pearson correlation); (ii) correlation of RNAi response with copy-number [comparative genomic hybridization (CGH), value >0.3 Pearson correlation]; (iii) genes that affected cell viability across the panel of cell lines regardless of copy number or gene expression (multiple). Genes were classified as a multiple hit if six or more cell lines had an RNAi response of greater than 0.65 (65% cell death).
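The three selection criteria can be sketched as below. This is an illustrative reading of the criteria, not the analysis code used in the study: the RNAi response is assumed to be the fractional loss of viability relative to control (so 0.65 means 65% cell death), and the function names and input layout are hypothetical.

```python
import math

CORR_CUTOFF = 0.3       # Pearson correlation threshold, from the text
RESPONSE_CUTOFF = 0.65  # RNAi response (fraction cell death), from the text
MIN_LINES = 6           # cell lines required for a "multiple" hit, from the text


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined for constant input; treated here as no correlation
    return sxy / math.sqrt(sxx * syy)


def classify_gene(response, expression, copy_number):
    """Return the selection criteria met by one gene (possibly none).

    Each argument holds one value per cell line, in the same order.
    """
    criteria = []
    if pearson(response, expression) > CORR_CUTOFF:
        criteria.append("expression")
    if pearson(response, copy_number) > CORR_CUTOFF:
        criteria.append("CGH")
    if sum(r > RESPONSE_CUTOFF for r in response) >= MIN_LINES:
        criteria.append("multiple")
    return criteria


# Toy data for eight cell lines, not data from the study
resp = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
expr = [1, 2, 3, 4, 5, 6, 7, 8]
flat = [2, 2, 2, 2, 2, 2, 2, 2]
print(classify_gene(resp, expr, flat))
```

A gene can satisfy more than one criterion in this sketch; how such overlaps were assigned to a single category in Table 2 is not specified here.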
Gene expression analysis
Patient tissue samples with the appropriate Institutional Review Board approval and patient-informed consent were obtained from commercial sources (Supplementary Table S1). The human tissue samples used in the study were deidentified (double-coded) before their use, and, hence, the study using these samples is not considered human subject research under the US Department of Health and Human Services regulations and related guidance (45 CFR Part 46). All tumor tissues were subject to pathology review. Tumor DNA and RNA were extracted using the Qiagen AllPrep DNA/RNA Kit (Qiagen).
Total RNA was harvested from cells using the RNeasy Kit with on-column DNase digestion (Qiagen). RNA for validation of siRNA function was acquired 3 days after transfection unless otherwise noted. RNA was quantified using a Nanodrop spectrophotometer, amplified with TaqMan One-Step RT-PCR Master Mix, assayed with TaqMan gene expression assays, and then analyzed using a 7900HT Fast Real-Time PCR System (Applied Biosystems). See Supplementary Table S7 for full list of gene expression assays used. Normalization control assays were HPRT1 (human) and ACTB (mouse).
Immunoblotting
Protein was harvested from cells with RIPA buffer, passed through a syringe, and cleared by centrifugation. Protein for validation of siRNA function was acquired 3 days after transfection unless otherwise noted. Protein was quantified using bicinchoninic acid protein assay (Pierce). Protein was separated on 4% to 12% Bis-Tris gels (Invitrogen), transferred to nitrocellulose membranes, blocked with 5% bovine serum albumin or milk in TBST for 30 minutes, then blotted with primary antibody overnight at 4°C. See Supplementary Table S7 for list of primary antibodies used. Membranes were then washed and incubated with appropriate horseradish peroxidase–conjugated secondary antibodies for 1 hour, washed and detected with SuperSignal West Femto Chemiluminescent Substrate (Pierce). Luminescence signal was acquired with FluorChem Q (Alpha Innotech).
3D assays
Matrigel drip cultures.
MCF10A cells were cultured in three-dimensional (3D) culture as previously described (17). Of note, 125,000 cells were seeded in a 6-well plate on top of a layer of polymerized Matrigel (BD Biosciences), overlaid with a solution of 5% Matrigel, and cultured for 21 days.
Soft agar assay.
NIH/3T3 cells were cultured in soft agar colony formation assay with a base layer of 1.2% agarose and an assay layer of 0.4% agarose (BD Biosciences). Of note, 4,000 cells were seeded per well in a 12-well plate. Plates were scanned and analyzed after 21 days using Gelcount scanner and software (Oxford Optronics).
Xenograft studies
One million NIH/3T3 cells were implanted subcutaneously in the right flank of NCr nude mice (Taconic), with 5 mice per group. Tumor dimensions were measured by caliper. At end of assay, tumors were harvested and flash-frozen in liquid nitrogen. RNA was extracted from tumors using a TissueLyser II and RNeasy extraction (Qiagen).
Cell-cycle analysis
For steady-state assay, NIH/3T3 cells were cultured for 48 hours and labeled with 10 μmol/L bromodeoxyuridine (BrdUrd) for 30 minutes. Cells were then detached and stained using the BrdUrd Flow Kit (BD Biosciences). Flow cytometry data were acquired with FACSCalibur (BD Biosciences) and analyzed with FlowJo software. For cell synchronization assay (double thymidine block), NIH/3T3 cells were cultured for 48 hours then blocked with 2 mmol/L thymidine (Sigma) for 15 hours. Cells were released into full media for 10 hours and reblocked with 2 mmol/L thymidine for 15 hours. Cells were then released and pulsed at indicated time points for 30 minutes with 10 μmol/L BrdUrd in the presence or absence of 100 ng/mL nocodazole (Sigma). Cells were then analyzed for cell-cycle status as described above.
Gene ontology analysis
Gene sets were classified by Gene Ontology using the Database for Annotation, Visualization and Integrated Discovery (DAVID) pathway analysis online tool and standard parameters (18).
Cell-based assays
Apoptosis was assayed using Caspase-Glo 3/7 Assay (Promega). Proteasome activity was assayed using Proteasome-Glo Chymotrypsin-Like Cell-Based Assays (Promega).
Kaplan–Meier analysis
Data were generated with KMPlotter (http://kmplot.com/analysis/) using a median cutoff for gene expression. Probesets used for analysis: 202244_at (PSMB4); 214095_at (SHMT2).
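For readers unfamiliar with the approach, the sketch below shows the general idea of a median split followed by a Kaplan–Meier estimate. It is not KMPlotter's implementation; the function names are hypothetical, and assigning samples exactly at the median to the low group is an assumption.

```python
from statistics import median


def split_by_median(expression):
    """Return (high, low) sample-index lists; 'high' is strictly above the median."""
    cut = median(expression)
    high = [i for i, v in enumerate(expression) if v > cut]
    low = [i for i, v in enumerate(expression) if v <= cut]
    return high, low


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) steps.

    times: follow-up time per patient; events: 1 = event observed, 0 = censored.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Toy follow-up data, not data from the study
print(split_by_median([5, 1, 9, 3]))
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

In practice, one curve would be estimated for each of the high- and low-expression groups and the two compared, for example with a log-rank test.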
Results
DNA copy-number analysis of human tumors
Anonymous tumor specimens were collected and appraised for tumor content and pathology by expert pathologists. The collection consisted of 161 breast tumors (HER2-positive, n = 53; hormone receptor–positive, n = 54; and triple-negative, n = 54), 51 ovarian tumors (serous, n = 37; papillary serous, n = 14), 52 lung tumors (adenocarcinoma, n = 5; squamous cell carcinoma, n = 47), 51 melanomas, and 57 prostate tumors (Supplementary Table S1). DNA copy number was measured using Agilent arrays (see Materials and Methods). Gain and Loss Analysis of DNA (R package from Bioconductor; ref. 19) was applied to infer the segmented copy numbers for each sample normalized to two copies. Figure 1A compares the frequency of recurrent abnormalities across all five indications. The genomic identification of significant targets in cancer (GISTIC) algorithm (20) was applied to calculate significant regions of copy-number change and identify the genes within each region. Table 1 contains the summary of this analysis; gene lists are supplied in Supplementary Table S2.
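The per-position frequencies plotted in Fig. 1A can be sketched as follows, assuming segmented log2 copy-number ratios and thresholds of ±0.3 as read from the Fig. 1 legend. The function name and matrix layout are hypothetical, and segmentation itself is not shown.

```python
GAIN_CUTOFF = 0.3   # log2 ratio above this counts as a gain (Fig. 1 legend)
LOSS_CUTOFF = -0.3  # log2 ratio below this counts as a loss (assumed symmetric)


def aberration_frequencies(log2_matrix):
    """Per-position (gain_freq, loss_freq) across samples.

    log2_matrix: list of samples, each a list of segmented log2 ratios at
    the same genomic positions. Loss frequency is returned as a negative
    number, matching the sign convention described for Fig. 1A.
    """
    n_samples = len(log2_matrix)
    freqs = []
    for values in zip(*log2_matrix):  # iterate over positions
        gains = sum(v > GAIN_CUTOFF for v in values)
        losses = sum(v < LOSS_CUTOFF for v in values)
        freqs.append((gains / n_samples, -losses / n_samples))
    return freqs


# Toy data: four samples, three positions; not data from the study
samples = [
    [0.5, 0.0, -0.5],
    [0.4, 0.1, -0.2],
    [0.0, 0.0, -0.6],
    [0.35, -0.4, 0.0],
]
print(aberration_frequencies(samples))
```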
Figure 1. Recurrent gene copy-number changes in breast, ovarian, lung, melanoma, and prostate primary tumors. A, frequencies of gene copy gain (red) and loss (green) are plotted as a function of genomic location for each tumor type. Positive values indicate frequencies of samples showing copy-number increases [log2 (copy number) > 0.3], and negative values indicate frequencies of samples showing copy-number decreases [log2 (copy number) < −0.3]. Vertical solid lines indicate boundaries between chromosomes. Vertical dotted lines indicate positions of centromeres. Sample number is indicated to the right of each graph. B, analysis of significant regions of copy-number gain by GISTIC. Q-scores for gene copy-number gains are overlaid on a single plot for breast, ovarian, lung, and melanoma samples. C, averaged Q values across all four tumor types identify the most significant regions for the entire dataset.
Table 1. Summary of GISTIC analysis of copy-number aberrations for each tumor type
Tumor type | Gain: No. of peaks | Gain: Genes | Gain: OIC genes | Loss: No. of peaks | Loss: Genes | Loss: UIC genes
---|---|---|---|---|---|---
Breast (HER2) | 27 | 250 | 69 | 25 | 327 | 193
Breast (TN) | 22 | 372 | 103 | 21 | 375 | 237
Breast (HR) | 20 | 308 | 85 | 18 | 180 | 131
Ovarian | 26 | 486 | 130 | 25 | 330 | 233
Melanoma | 36 | 1,019 | 197 | 32 | 119 | 65
NSCLC | 26 | 636 | 145 | 24 | 274 | 187
Total | | 3,071 | 729 | | 1,605 | 1,046
NOTE: Number of significant peaks (as determined by GISTIC), total gene, and miRNA count within those regions are listed for both copy-number gain and loss. Genes identified as overexpressed in cancer (OIC) and underexpressed in cancer (UIC) compared with normal controls are also listed.
Abbreviations: HER2, HER2-positive; HR, hormone receptor–positive; TN, triple-negative.
Common regions of gene amplification and deletion across tumor types
Our data provided a direct comparison among the genomes of five solid tumor types analyzed on a single platform and it was evident that the frequency of genomic aberrations found in breast, ovarian, lung, and melanoma tumors had strikingly similar architecture in several chromosomal regions (Fig. 1). In contrast, the prostate dataset shared few common features with the other tumor types, showing greater frequency of copy-number loss overall and few regions of significant copy-number gain. To select genes for an RNAi-based screen to identify oncogenic drivers, we focused on genomic regions that exhibited recurrent amplification across the breast, ovarian, lung, and melanoma datasets. Figure 1B shows an overlay of the GISTIC Q-scores for these four tumor types and Fig. 1C shows peaks derived from a cross-analysis of these significant peaks. Genes within these 86 peaks (Supplementary Table S2) were filtered by gene expression levels (see Materials and Methods) to select genes within amplicons that are overexpressed in cancer compared with normal tissues, resulting in a final list of 620 genes (Supplementary Table S3). Gene Ontology analysis (18) of these genes showed enrichment for several cancer-related functions such as regulation of DNA, protein, and glucose metabolism (Supplementary Table S4).
Loss-of-function RNAi screen to identify oncogenic driver genes
To identify amplified and overexpressed genes required for initiation and/or maintenance of tumor cell proliferation, we performed an RNAi loss-of-function screen targeting 620 genes across a panel of 32 tumor cell lines representing the four solid tumor types used to generate the gene list as outlined in Fig. 2A. Each of the 86 amplified regions identified by GISTIC was represented by at least three cell lines. Two rounds of RNAi screening were conducted, the results of which are summarized in Fig. 2B; Supplementary Table S3. Three independent criteria were used to select genes; (i) correlation of RNAi response with copy number (CGH; Fig. 2C), (ii) correlation of RNAi response with total RNA level (expression; Fig. 2D), and (iii) genes that affect multiple lines (multiple; Fig. 2E) independent of copy number or expression correlations (Supplementary Table S3). Using these selection criteria, 105 genes passed and moved to second round screening, of which 25 were validated (Table 2). This list was significantly enriched for genes with functions associated with the proteasome, spliceosome, DNA replication, cell cycle, and metabolism (Supplementary Table S4). ERBB2 and MYC were among the genes meeting the validation criteria, confirming the screen was able to identify amplified/overexpressed driver oncogenes, although these were both selected on the basis of a correlation between phenotype and total RNA levels, not phenotype and copy number. Overall correlation of phenotype with expression levels yielded more hits than that with copy number, perhaps due to the noted lack of linearity between copy number/mRNA levels/protein levels for the majority of genes (21, 22).
Figure 2. RNAi screen design and results. A, schematic outline of the screening approach used in this study. B, summary of all data for primary RNAi screen (top) and secondary RNAi screen (bottom). Details of genes and cell lines can be found in Supplementary Tables S3 and S7, respectively. For each graph, individual targeted genes and controls are plotted on the horizontal axis, and cell growth is represented as the percentage of control plotted on the vertical axis. There are 32 data points for each gene or control, representing a data point for each cell line in the screen. Bottom, genes are ordered by the criteria by which the genes were selected for the secondary screen (indicated above the data points). Examples of how genes were selected for further analysis are shown for individual genes: C, correlation with copy number, C3ORF62 (Pearson correlation = 0.53); D, correlation with expression, ERBB2 (Pearson correlation = 0.52); and E, multiple hit, EIF4A3 [genes were classified as a multiple hit if six or more cell lines had an RNAi response of greater than 0.65 (65% cell death), indicated by dotted line]. See Materials and Methods for detailed selection criteria.
Table 2. Genes passing selection criteria after two rounds of RNAi screens
Gene | Chr | Cytoband | Description | Analysis
---|---|---|---|---
ELOVL1 | 1 | 1p34.2 | CGI-88 PROTEIN | CGH
C3ORF62 | 3 | 3p21.31 | CHROMOSOME 3 OPEN READING FRAME 62 | CGH
ENO3 | 17 | 17pter-p11 | ENOLASE 1, (ALPHA) | CGH
ERBB2 | 17 | 17q11.2-q12, 17q21 | V-ERB-B2 ERYTHROBLASTIC LEUKEMIA VIRAL ONCOGENE HOMOLOG 2 | EXPRESSION
ACTN4 | 19 | 19q13 | ACTININ, ALPHA 4 | EXPRESSION
MYC | 8 | 8q24.21 | V-MYC MYELOCYTOMATOSIS VIRAL ONCOGENE HOMOLOG (AVIAN) | EXPRESSION
EMILIN1 | 2 | 2p23.3-p23.2 | ELASTIN MICROFIBRIL INTERFACER 1 | EXPRESSION
WARS | 14 | 14q32.31 | INTERFERON-INDUCED PROTEIN 53 | EXPRESSION
EIF2S2 | 20 | 20pter-q12 | EUKARYOTIC TRANSLATION INITIATION FACTOR 2, SUBUNIT 2 BETA, 38KDA | EXPRESSION
BOP1 | 8 | 8q24.3 | BLOCK OF PROLIFERATION 1 | EXPRESSION
SHMT2 | 12 | 12q12-q14 | SERINE HYDROXYMETHYLTRANSFERASE 2 (MITOCHONDRIAL) | EXPRESSION
EIF6 | 20 | 20q12 | EUKARYOTIC TRANSLATION INITIATION FACTOR 6 | EXPRESSION
EDEM2 | 20 | 20q11.22 | CHROMOSOME 20 OPEN READING FRAME 31 | EXPRESSION
ALDOA | 16 | 16p11.2 | ALDOLASE A, FRUCTOSE-BISPHOSPHATE | MULTIPLE
EIF3S8 | 16 | 16p11.2 | EUKARYOTIC TRANSLATION INITIATION FACTOR 3, SUBUNIT 8, 110KDA | MULTIPLE
EIF4A3 | 17 | 17q25.3 | DEAD (ASP-GLU-ALA-ASP) BOX POLYPEPTIDE 48 | MULTIPLE
PHF5A | 22 | 22q13.2 | PHD FINGER PROTEIN 5A | MULTIPLE
PSMA6 | 14 | 14q13 | PROTEASOME (PROSOME, MACROPAIN) SUBUNIT, ALPHA TYPE, 6 | MULTIPLE
PSMB4 | 1 | 1q21 | PROTEASOME (PROSOME, MACROPAIN) SUBUNIT, BETA TYPE, 4 | MULTIPLE
PSMD11 | 17 | 17q11.2 | PROTEASOME (PROSOME, MACROPAIN) 26S SUBUNIT, NON-ATPASE, 11 | MULTIPLE
PSMD2 | 3 | 3q27.1 | PROTEASOME (PROSOME, MACROPAIN) 26S SUBUNIT, NON-ATPASE, 2 | MULTIPLE
PSMD3 | 17 | 17q12-q21.1 | PROTEASOME (PROSOME, MACROPAIN) 26S SUBUNIT, NON-ATPASE, 3 | MULTIPLE
PSMD8 | 19 | 19q13.2 | PROTEASOME (PROSOME, MACROPAIN) 26S SUBUNIT, NON-ATPASE, 8 | MULTIPLE
SNRPD2 | 19 | 19q13.2 | SMALL NUCLEAR RIBONUCLEOPROTEIN D2 POLYPEPTIDE 16.5KDA | MULTIPLE
THOC4 | 17 | 17q25.3 | THO COMPLEX 4 | MULTIPLE
NOTE: Analysis indicates the criteria by which the genes were selected; correlation of phenotype with copy number (CGH), correlation of phenotype with mRNA level (expression), and effect on multiple lines (multiple). Genes highlighted in bold were selected for follow-up functional studies. Cell line copy number and gene mRNA baseline levels are shown in Supplementary Table S5.
Experimental and functional validation of RNAi screen candidates
Fifteen of the 25 hits from the RNAi screen were selected for further validation based on function and/or potential druggability (Table 2; see Materials and Methods). Protein levels of these genes were assessed in the cell line panel by Western blot analysis (Supplementary Fig. S1). Of the eight candidate genes for which specific antibodies were found, only three, ACTN4, ERBB2, and EMILIN1, showed correlation between mRNA (detected by Affymetrix expression arrays or TaqMan) and protein (Supplementary Table S5). RNAi-mediated knockdown was confirmed for 13 of the 15 target genes. Levels of two genes, ENO3 and WARS, were unaffected by RNAi, and these genes were dropped from further evaluation (Supplementary Fig. S2). To assess the transforming activity of the remaining 13 genes, each was cloned and expressed in NIH/3T3 cells to perform soft agar colony formation assays (Fig. 3A). Of these, ELOVL1 could not be expressed, PSMA6 expression was only detected by TaqMan, and all other genes were expressed at or above physiologic levels. Two of the 13 genes (PSMB4 and SHMT2) promoted colony formation, as did the HRAS.G12V-positive control (Fig. 3B). Overall proliferation rates of PSMB4- and SHMT2-expressing cells were higher than those of parental cells (Supplementary Fig. S3A). A more detailed analysis of the cell-cycle effects of these genes, both at steady state and after release from synchronization, revealed that the expression of PSMB4, but not SHMT2, resulted in aberrations in the cell-cycle profile. In the steady-state condition, the cell-cycle profile of SHMT2-expressing cells was indistinguishable from that of control cells, whereas PSMB4-expressing cells spent a significantly lower proportion of time in G0–G1 phase in favor of more time in both S and G2–M phases (Supplementary Fig. S3B).
Synchronization of the cells using a double thymidine block and release into media with or without nocodazole revealed that PSMB4-expressing cells have a cell-cycle profile similar to that of HRAS.G12V-expressing cells, with accelerated transition through the G2–M phase and bypass of the G2 DNA damage and mitotic spindle checkpoints (Supplementary Fig. S3C and S3D; ref. 23). PSMB4 and SHMT2 were expressed in MCF10A cells to assess effects on epithelial cell morphogenesis in 3D cultures. These cells exhibited no morphologic differences compared with control (Fig. 3C), in contrast with the highly invasive phenotype of the HRAS.G12V control, suggesting these genes have minimal effect on polarity, migration, and invasion. Knockdown efficiency for the individual siRNAs within pools was confirmed by TaqMan and Western blot analysis for PSMB4 and SHMT2 (Supplementary Fig. S4A). In vivo assessment of oncogenic properties of PSMB4 and SHMT2 demonstrated that PSMB4 but not SHMT2 expression was sufficient to promote NIH/3T3 xenograft tumor growth (Fig. 3D). Expression of each transgene in the tumors was confirmed using human-specific TaqMan probes on mRNA extracted from the tumors (Supplementary Fig. S4B).
Figure 3. Effect of PSMB4 and SHMT2 on anchorage-independent growth and cell cycle. A, Western blot analysis of FLAG-tagged target genes stably expressed in NIH/3T3 cells. Dotted line, lanes placed next to each other from different regions of the same gel. B, soft agar assays to assess effect of target genes on anchorage-independent growth. Total number of colonies and average colony diameter are shown in bar graphs below; scale bar, 1 mm. C, effect of target gene expression on MCF10A acinar morphogenesis. Selected target genes were expressed in the nontumorigenic breast epithelial cell line to assay gross acinar morphogenesis. Oncogenic HRAS.G12V is shown as a positive control at a 4-day time point; scale bar, 100 μm. D, effect of PSMB4 and SHMT2 on tumor growth in vivo. Nude mice bearing NIH/3T3 cell lines stably expressing PSMB4, SHMT2, and HRAS.G12V (positive control) were monitored for xenograft growth. The mean tumor volume (+ SEM) for each group (n = 5) is shown.
Relevance of SHMT2 and PSMB4 in human cancer
SHMT2 and PSMB4 were primarily selected for our screen as amplified in lung and ovarian cancer, respectively. To investigate the broader clinical relevance, we compared expression of SHMT2 and PSMB4 in cancer and normal tissue in two published datasets (15, 24). For the five primary indications surveyed in this study, SHMT2 expression is significantly increased in cancer samples compared with normal tissue in all five indications in both datasets (Fig. 4A, left). Normal samples were not available for ovary and skin in The Cancer Genome Atlas (TCGA) RNAseq dataset (24); however, Agilent expression array data from the same group contained three normal ovary samples and showed an increase in expression in cancer (P = 0.0226; Supplementary Table S6). In comparison, the cytosolic form of serine hydroxymethyltransferase, SHMT1, was found to be overexpressed in only four tumor types and showed decreased expression in several others. PSMB4 showed significant upregulation in breast, lung, ovarian, and skin tumors (Fig. 4A and B, right). Overexpression of the three proteasome β-ring catalytic subunits was also observed (Supplementary Table S6). PSMB5 overexpression followed a similar pattern to PSMB4, whereas levels of PSMB6 and PSMB7 were unrelated and less frequently elevated in cancer. Increases in expression of SHMT2 and PSMB4 were observed in a wide range of other tumor types, supporting the involvement of these genes in the etiology of multiple cancers (Supplementary Table S6).
Expression of SHMT2 and PSMB4 in normal and cancer tissues. A, expression of SHMT2 is shown for breast, lung, ovary, skin, and prostate tissues from GeneLogic (left) and TCGA (right) data sources. B, expression of PSMB4 is shown for breast, lung, ovary, skin, and prostate tissues from GeneLogic (left) and TCGA (right) data sources; N, normal; C, cancer; numbers above normal/cancer pairings represent P values of a nonparametric Mann–Whitney test. C, Kaplan–Meier survival plots showing the prognostic effect on RFS in breast cancer for SHMT2 (left) and PSMB4 (right). D, Kaplan–Meier survival plots showing the prognostic effect on OS in lung cancer for SHMT2 (left) and PSMB4 (right). Data were generated with KMPlotter (25) using a median cutoff for gene expression.
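As an illustration of the normal-versus-cancer comparison reported in the figure legend, the following is a minimal sketch of a two-sided Mann–Whitney test. It is not the authors' analysis code. The function names and the example expression values are hypothetical, and the P value uses the large-sample normal approximation without tie or continuity correction, so it is only a rough guide for small groups.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(normal, cancer):
    """Return (U, two-sided P) using the normal approximation.

    Assumption: no tie or continuity correction is applied.
    """
    n1, n2 = len(normal), len(cancer)
    ranks = average_ranks(list(normal) + list(cancer))
    u_normal = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u_cancer = n1 * n2 - u_normal
    u = min(u_normal, u_cancer)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u, p_two_sided


if __name__ == "__main__":
    # Hypothetical log2 expression values, for illustration only.
    normal_tissue = [6.1, 6.4, 6.0, 6.3, 6.2]
    cancer_tissue = [7.5, 7.9, 6.8, 8.1, 7.2]
    u_stat, p_value = mann_whitney_u(normal_tissue, cancer_tissue)
    print(f"U = {u_stat}, P = {p_value:.4f}")
```

A rank-based test is a reasonable choice here because expression values across tumor and normal samples are often not normally distributed.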
Kaplan–Meier analysis (25) indicated high expression of SHMT2 is associated with worse relapse-free survival (RFS) and distant metastasis-free survival and overall survival (OS) in breast cancer, and increased time to first progression and decreased OS in lung cancer (Fig. 4C and D, left; Supplementary Table S6). Increased PSMB4 expression was associated with worse RFS in breast cancer and decreased OS in ovarian cancer (Fig. 4C, right; Supplementary Table S6).
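The survival comparison above was generated with KMPlotter using a median expression cutoff. The sketch below is not that tool. It only illustrates the two underlying steps under simple assumptions: splitting patients at the median expression value, then computing a Kaplan–Meier survival estimate for each group. All function names and the example data are hypothetical, and assigning samples equal to the median to the low group is an arbitrary choice made for this sketch.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events[i] is True when the event (e.g., relapse or death) was observed
    and False when the patient was censored at times[i].
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                observed += 1
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def split_by_median(expression, times, events):
    """Split patients into low (<= median) and high (> median) expression."""
    cutoff = median(expression)
    low = [(t, e) for x, t, e in zip(expression, times, events) if x <= cutoff]
    high = [(t, e) for x, t, e in zip(expression, times, events) if x > cutoff]
    return cutoff, low, high


if __name__ == "__main__":
    # Hypothetical expression values, follow-up times (months), and events.
    expression = [2.0, 5.0, 3.0, 8.0, 7.0, 1.0]
    months = [60, 24, 48, 12, 30, 72]
    relapsed = [False, True, True, True, True, False]
    cutoff, low, high = split_by_median(expression, months, relapsed)
    for label, group in (("low", low), ("high", high)):
        group_times = [t for t, _ in group]
        group_events = [e for _, e in group]
        print(label, kaplan_meier(group_times, group_events))
```

A formal comparison of the two curves would additionally require a log-rank test or a hazard ratio estimate, which this sketch does not attempt.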
Suppression of PSMB4 inhibits proteasomal activity and processing of β-ring catalytic subunits
Inhibition of proteasome activity is an established anticancer therapeutic strategy (26, 27). Although PSMB4 does not contain intrinsic hydrolytic activity, it is rate-limiting for 20S proteasome assembly (28). Cell viability and protease activity were assessed after PSMB4 knockdown, revealing a time-dependent decrease in both cell number and protease activity (Fig. 5A, top). Cells treated with the proteasome inhibitor, MG132, showed a similar effect in the cells tested (Fig. 5A, bottom). Expression of PSMB5, -6, and -7 mRNA increased over time in response to PSMB4 knockdown and MG132 treatment (Fig. 5B), and an accumulation of polyubiquitinated (K48-linked) protein products was observed (Fig. 5C). This did not result in an obvious increase in the total protein levels of PSMB5, -6, and -7; instead, we observed a decrease in the processed (lower molecular weight, higher mobility) forms with a concomitant accumulation of the precursor forms of these subunits (Fig. 5C). These data indicate that loss of PSMB4 disrupts the formation of the 20S proteasome, because assembly of 20S half-mers requires PSMB4 and propeptide removal precedes maturation of the 20S proteasome (29). This effect seems to be specific to loss of PSMB4, because knockdown of PSMA6 had little or no effect on the protein levels of the core β-ring catalytic subunits (Supplementary Fig. S5A). To compare the effects of PSMB4 loss with proteasome inhibitor activity, we selected five cell lines with a range of sensitivities to bortezomib (PS-341, Velcade) and compared the effect on cell viability of bortezomib against PSMB4 knockdown. Although one line (NCI-H1838) was exceptionally resistant to both bortezomib treatment and loss of PSMB4, other lines tested showed variable sensitivities to both treatments with little correlation between the two results (Fig. 5D). A similar result was observed with 20 of the lines used in the RNAi screen (Supplementary Fig. S5C).
Together, these data support a central role for PSMB4 in maintaining a functional 20S proteasome and suggest that targeting PSMB4 may offer an alternative therapeutic option to existing proteasome inhibitors.
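The text does not state how the relationship between bortezomib sensitivity and PSMB4 knockdown sensitivity was quantified. As one possible approach, and purely as an assumption, a Spearman rank correlation across cell lines could be computed as sketched below. The function names and the viability values are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical relative viability (fraction of control) per cell line.
    viability_bortezomib = [0.15, 0.40, 0.95, 0.30, 0.60]
    viability_psmb4_rnai = [0.55, 0.20, 0.90, 0.70, 0.25]
    rho = spearman(viability_bortezomib, viability_psmb4_rnai)
    print(f"Spearman rho = {rho:.2f}")
```

With only five cell lines, any such coefficient would carry wide uncertainty, so it is best read as descriptive rather than as a formal test.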
Effects of PSMB4 loss on proteasome activity and protein expression of the proteasome catalytic β subunits. A, siRNA knockdown of PSMB4 in NCI-H1299 and ES-2 cells over a 72-hour time course measuring cell viability (top left) and chymotrypsin-like proteasome activity normalized to cell number (top right). Effects of the proteasome inhibitor, MG132, on cell viability and chymotrypsin-like proteasome activity are shown at the bottom. B, measurement of RNA levels of the proteasome β-ring subunits PSMB4, PSMB5, PSMB6, and PSMB7 in cells treated with PSMB4 RNAi for the stated times or 1 μmol/L MG132 for 24 hours. C, protein levels (Western blots) of the proteasome β-ring subunits PSMB4, PSMB5, PSMB6, and PSMB7 in cell lines transfected with PSMB4 RNAi for the stated times or 1 μmol/L MG132 for 24 hours. D, comparison of cell line sensitivity with 10 nmol/L bortezomib and PSMB4 RNAi in a panel of cell lines. RNAi knockdown of PSMB4 was confirmed in Supplementary Fig. S5B.
Discussion
In this report, we used genomics-based selection and a loss-of-function screen to identify genes required for the survival and growth of human cancer cells. Two rounds of RNAi screens reduced our initial pool of 620 candidates to 25 genes, 15 of which were put through a stringent series of follow-up screens to assess oncogenic potential. Known oncogenes ERBB2 and MYC were identified using this approach, confirming the validity of this screen. A number of shared biologic functions were enriched in the 25 candidate genes; SHMT2 and ALDOA have glycolytic/metabolic functions; PHF5A, THOC4, and SNRPD2 are involved in mRNA processing; and 40% (6/15) of the genes are components of the 26S proteasome (four from the 19S regulatory particle and one each from the 20S α- and β-rings). We used stringent criteria to identify potential targets, and, in addition to SHMT2 and PSMB4, our data revealed a number of genes on which cancer cells from multiple tissues rely for continued proliferation and that warrant further investigation (Table 2).
On the basis of our results, we established that the mitochondrial serine hydroxymethyltransferase gene (SHMT2) is required for cancer cell survival, but is not sufficient to promote tumorigenesis. SHMT enzymes catalyze the conversion of serine to glycine by transferring the β-carbon of serine to tetrahydrofolate (THF), generating 5,10-methylene-THF and glycine (30, 31). Phosphoglycerate dehydrogenase (PHGDH), another enzyme in the pathway of generating glycine from glucose, was recently shown to be amplified in melanoma (32) and breast cancer (33), and the next enzyme in this pathway, phosphoserine aminotransferase, is overexpressed in colorectal cancers, associated with chemoresistance (34), and required for serine pathway flux in breast cancer (33). More recently, a delicate interdependency between glucose and amino acid metabolism has been elucidated, indicating that serine is an allosteric regulator of pyruvate kinase M2 (35, 36). Our data show that cell lines with high SHMT2 require the gene for survival, but exogenous expression of the gene does not alter the acinar development of MCF10A cells grown in Matrigel (in contrast with PHGDH; 32), whereas it is sufficient to promote anchorage-independent growth in colony formation assays but not to drive growth of tumors in vivo. Furthermore, SHMT2 is found overexpressed in many tumors and is associated with worse clinical outcome in breast and lung cancer. Therefore, our findings contribute to an evolving paradigm that serine/glycine metabolism is of critical importance to the development and/or maintenance of cancer.
Our studies also provided the first evidence for the requirement of a proteasomal subunit, PSMB4, as a potential driver oncogene in multiple tumor types. PSMB4 and five other proteasome subunits were found to be essential for the survival of a broad range of tumor types (breast, lung, skin, and ovary; Supplementary Table S3). PSMB4 has also been identified as a gene required for the survival of human glioblastoma cells (37), although there is little evidence for elevated levels in brain cancer versus normal brain in the datasets we studied. Of the two proteasome subunits that underwent functional screening, PSMB4 was the only one that promoted both anchorage-independent growth and tumorigenesis and, to the authors' knowledge, is the first proteasomal subunit shown to possess oncogenic properties.
The two FDA-approved proteasome inhibitors, bortezomib and carfilzomib, both target the core proteolytic subunits PSMB5, PSMB6, and PSMB7 (38). The noncatalytic subunit PSMB4 represents a novel potential target as it plays an important role in regulating the assembly of the proteasome (28, 29), and, thus, by inhibiting its function and proteasome assembly one could potentially prevent the catalytic activity of all three proteolytic subunits. There is precedent that such an approach is feasible, as CRBN has been shown to directly interact with PSMB4, negatively regulating proteasome activity in vivo, possibly through interfering with the assembly of the 20S proteasome (39). Because proteasomal assembly requires the 15-amino acid C-terminal tail of PSMB4 to intercalate into a groove between the PSMB6 and PSMB7 subunits, interfering with this interaction offers a potential therapeutic opportunity (40). Despite PSMB4 being amplified and overexpressed in cancer (a prerequisite for being screened in this project), tumor cell sensitivity to loss of PSMB4 did not correlate with gene copy number or gene expression. Evidence suggests that regulation of proteins in the proteasomal complex occurs at the protein level, rather than at the copy number/mRNA level (21). Paradoxically, in vitro and patient-derived data suggest that elevated expression of certain proteasome subunits may contribute to bortezomib resistance (41, 42). Although it is plausible that increased copy number and expression of PSMB4 are required in certain tumors to maintain PSMB4 transcript and/or protein levels, the exact mechanism of PSMB4 deregulation remains to be evaluated, and it is likely a complex mechanism tied to the intrinsic turnover rate of the proteasome.
Overall, our approach has led to the discovery of potential therapeutic targets, but has also underscored the complexity of cancer biology. HER2 is the paradigm for a gene driven by amplification of its locus, generating high RNA and protein expression, resulting in oncogenic activity. Although our screen was predicated on this hypothesis, very few of the targets were selected on the basis of copy number, highlighting the biologic complexity evident in a cancer cell. A number of studies have indicated that there is ever decreasing linearity in the relationship of gene copy number to RNA to protein (43) due to a number of regulatory mechanisms such as methylation, RNA editing and stability, and protein translation and ubiquitination. Amplicons may also contain more than one potential driver gene that cooperates in tumor etiology and maintenance (44, 45), whereas many genes within any given amplicon are likely to be passengers and do not contribute to the survival/homeostasis for a given cancer cell. Therefore, expanding the strategy used in this study by selecting genes based on multiple biologic parameters (copy number, mutation, RNA, epigenetics, and protein expression) may improve our ability to identify bona fide cancer targets. As technologies advance to generate accurate high-density datasets, the use of high-content screening approaches to assess multiple hallmarks of cancer in relevant cancer models will greatly improve our ability to identify relevant targets (46, 47).
Disclosure of Potential Conflicts of Interest
R.M. Neve is a scientist, has received commercial research support, and has ownership interest (including patents) from Genentech. S. Seshagiri is a scientist in Roche/Genentech. F.J. de Sauvage has ownership interest (including patents) in Roche. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: G.Y. Lee, H. Stern, D. Stokoe, F.J. de Sauvage, R.M. Neve
Development of methodology: G.Y. Lee, L. Li, J. Lee, D. Davis, D. Stokoe, R.M. Neve
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J. Lee, Z. Modrusan, S. Seshagiri, R.M. Neve
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): G.Y. Lee, P.M. Haverty, L. Li, N.M. Kljavin, R. Bourgon, Z. Zhang, J. Settleman, R.M. Neve
Writing, review, and/or revision of the manuscript: G.Y. Lee, L. Li, H. Stern, D. Stokoe, J. Settleman, R.M. Neve
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): G.Y. Lee, L. Li, R.M. Neve
Study supervision: D. Davis, J. Settleman, R.M. Neve
Acknowledgments
The authors thank Mamie Yu and Suresh Selvaraj for maintaining and providing cell lines for this study. The authors also thank Don Kirkpatrick for technical assistance and insightful discussions.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.