Abstract
Identification of genes with cancer-specific overexpression offers the potential to efficiently discover cancer-specific activities in an unbiased manner. We apply this paradigm to study mesothelin (MSLN) overexpression, a nearly ubiquitous, diagnostically and therapeutically useful characteristic of pancreatic cancer. We identified an 18-bp upstream enhancer, termed CanScript, strongly activating transcription from an otherwise weak tissue-nonspecific promoter and operating selectively in cells having aberrantly elevated cancer-specific MSLN transcription. Introducing mutations into CanScript showed two functionally distinct sites: an Sp1-like site and an MCAT element. Gel retardation and chromatin immunoprecipitation assays showed the MCAT element to be bound by transcription enhancer factor (TEF)-1 (TEAD1) in vitro and in vivo. The presence of TEF-1 was required for MSLN protein overexpression as determined by TEF-1 knockdown experiments. The cancer specificity seemed to be provided by a putative limiting cofactor of TEF-1 that could be outcompeted by exogenous TEF-1 only in a MSLN-overexpressing cell line. A CanScript concatemer offered enhanced activity. These results identify a TEF family member as a major regulator of MSLN overexpression, a fundamental characteristic of pancreatic and other cancers, perhaps due to an upstream and highly frequent aberrant cellular activity. The CanScript sequence represents a modular element for cancer-specific targeting, potentially suitable for nearly a third of human malignancies. [Cancer Res 2007;67(19):9055–65]
Introduction
Overactive cellular pathways play a crucial role in tumorigenesis, as they represent fundamental and causative features of cancers. In some cases, unambiguous and high levels of cancer-specific pathway activation can be reliably detected by reporter constructs and are specific for defined DNA motifs at which transcription factors bind. Determining the components of activated pathways using reporter constructs was essential to specifying some mutationally targeted genes. For example, SMO/GLI and β-catenin were so defined after biochemical definition of the pathway activated by PTC and adenomatous polyposis coli (APC) mutations, respectively (1–3). Identifying altered cellular pathways by directly searching for mutations in individual genes has become difficult. These difficulties can potentially be overcome by a renewed attention to the pathways having activation in cancers by studying them using biochemical methods. The search can start with the identification of a gene having cancer-specific overexpression. Mesothelin (MSLN) is such a gene, whose overexpression characterizes pancreatic cancer.
The MSLN gene, localized to 16p13.3, encodes a precursor protein of 69 kDa, which is cleaved to a 40-kDa membrane-bound protein, termed MSLN, and a soluble 31-kDa protein, megakaryocyte-potentiating factor. The protein is glycosylated and glycosylphosphatidylinositol-anchored to the membrane. Its expression in normal tissue is limited mainly to mesothelia of peritoneal, pleural, and pericardial cavities (4). Overexpression of MSLN RNA and protein was found in several human malignancies, with the highest expression in ovarian, pancreatic, bronchial, gastroesophageal, cervical, endometrial, and biliary carcinomas (4–7), totaling up to nearly a third of human malignancies.
The biological function of MSLN is unknown; knockout mice have no distinguishable phenotype (8). Cancer-specific overexpression and its limited expression in normal tissue make MSLN a good candidate as a diagnostic marker and target in immunotherapy (9). Various methods of detecting MSLN are already used to aid the diagnosis of mesotheliomas and ovarian and pancreatic cancer (10, 11).
Investigators have repeatedly confirmed MSLN overexpression to affect the vast majority of ductal adenocarcinomas of the pancreas by serial analysis of gene expression (SAGE; ref. 12), oligonucleotide array (13), in situ hybridization, reverse transcription-PCR (RT-PCR), and immunohistochemistry (14). Strong immunolabeling was detected in virtually all ductal carcinomas, whereas the adjacent normal pancreas did not label (14). Interestingly, of the patients vaccinated with granulocyte macrophage colony-stimulating factor–transduced autologous pancreatic cells who developed systemic antitumor immunity, all had a strong MSLN-specific CD8+ T-cell immune response (15). A targeted therapy using anti-MSLN monoclonal antibodies has entered a clinical trial.
The nearly ubiquitous overexpression of MSLN in pancreatic cancer hints that its dysregulation relates to a fundamental biological activity of the neoplasm. To improve our understanding of this dysregulation, we identified regulatory cis-elements and trans-acting factors involved in the control of MSLN cancer-specific overexpression. We found evidence that an upstream enhancer element was responsible for the aberrant cancer-specific MSLN overexpression. The activity of this enhancer was dependent on the binding of transcription enhancer factor TEF-1 to an MCAT sequence. We propose that the specificity of MSLN aberrant expression in cancer was provided by a cofactor of TEF that binds to specific sequences immediately flanking the core MCAT motif.
Materials and Methods
Cell lines and cell culture. Cell lines were obtained from the American Type Culture Collection except for the lymphoblastoid cell line BJAB, which was kindly provided by B. Barnes (Johns Hopkins University, Baltimore, MD), and the mesothelioma cell line H-513, which was kindly provided by R. Salgia (University of Chicago, Chicago, IL). Cells were cultured in conventional medium supplemented with 10% FCS, L-glutamine, and penicillin/streptomycin.
RNA isolation and RT-PCR. RNA was isolated from cells using RNeasy Mini kit (Qiagen) and converted to cDNA using SuperScript II (Invitrogen) following the manufacturers' protocols. PCR conditions and primer sequences are available on request.
5′-rapid amplification of cDNA ends. RNA was treated with DNase I (Qiagen). The first cDNA strand was synthesized using a MSLN-specific primer, treated with RNases H and T1, purified using QIAquick PCR Purification kit (Qiagen), and tailed with 200 μmol/L dGTP using 1 μL of terminal deoxytransferase (U.S. Biochemical). The dG-tailed cDNA was amplified and subsequently nested amplified by PCR. The obtained products were separated on 1% lithium borate agarose gels (Faster Better Media LLC; ref. 16), purified, and sequenced at the Johns Hopkins Sequencing Core Facility.
Immunoblotting. Equal amounts of proteins were separated on 10% Bis-Tris gels and transferred onto polyvinylidene difluoride membranes. After blocking, the membranes were incubated with a mouse anti-MSLN antibody MN (1 μg/mL; kindly provided by I. Pastan, National Cancer Institute, Bethesda, MD; ref. 17), a mouse anti-TEF-1 antibody (1:250; BD Biosciences), a goat anti-TEF-4 antibody (1:200; Santa Cruz Biotechnology), or a goat anti-actin antibody (1:200; Santa Cruz Biotechnology) for 1 h to overnight. Testing other commercially available anti-MSLN antibodies (5B2, Vector Laboratories; V-16, Santa Cruz Biotechnology) proved unsatisfactory. Membranes were washed and incubated with a secondary anti-mouse antibody (1:20,000; GE Healthcare) or a secondary anti-goat antibody (1:20,000; Santa Cruz Biotechnology) for 90 min. Detection was done using SuperSignal Pico reagents (Pierce).
In silico analysis. Sequence analysis for potential transcription binding sites was done using MatInspector software (18).
Plasmids. All plasmids for reporter assays were constructed by cloning of PCR-amplified inserts (High Fidelity Platinum Taq DNA Polymerase, Invitrogen) into plasmids pGL3-Basic or pGL3-Promoter (Promega). The structures and fidelity of the resulting constructs were confirmed by restriction mapping and sequencing. Plasmids were purified using the Plasmid Midi kit (Qiagen). At least two independent plasmid preparations of each construct were used in reporter assays. Mutated constructs were generated using the QuikChange Site-Directed Mutagenesis kit (Stratagene).
The Gli1 expression plasmid was a kind gift of A. Maitra (Johns Hopkins University).
Transfections and reporter assays. Transient transfections were done using Lipofectamine or Lipofectamine 2000 (Invitrogen). Briefly, 2.5 × 10^5 to 4 × 10^5 cells were plated in six-well plates 24 h before transfection. The next day, cells were transfected using 1.5 μg of reporter construct and 0.5 μg of pRSVβgal vector and incubated for 5 h, after which the medium was exchanged. In cotransfection experiments, corresponding amounts of empty vector were used to equalize the amounts of transfected DNA. After 24 to 48 h, cells were washed twice with PBS, lysed in reporter lysis buffer (Promega), and subjected to a cycle of freeze/thaw. Luciferase activity was determined using the Steady-Glo System (Promega). To control for transfection efficiency, β-galactosidase activity was determined using standard protocols.
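The normalization implied by this protocol can be illustrated with a minimal Python sketch. It assumes that each well's luciferase reading is divided by its β-galactosidase reading and that replicate wells are averaged before the construct is expressed relative to a reference vector such as pGL3-Basic. The function names and the readings are hypothetical and are not taken from the study; the exact arithmetic used by the authors is not stated in the text.

```python
from statistics import mean


def normalized_activity(luciferase, beta_gal):
    """Luciferase signal corrected for transfection efficiency (beta-gal)."""
    if beta_gal <= 0:
        raise ValueError("beta-galactosidase reading must be positive")
    return luciferase / beta_gal


def relative_activity(construct_wells, reference_wells):
    """Mean normalized activity of a construct relative to a reference vector.

    Each argument is a list of (luciferase, beta_gal) readings, one per well.
    """
    construct = mean(normalized_activity(l, b) for l, b in construct_wells)
    reference = mean(normalized_activity(l, b) for l, b in reference_wells)
    return construct / reference


# Hypothetical duplicate-well readings in arbitrary units (not study data).
b99_wells = [(8000.0, 2.0), (9000.0, 3.0)]   # enhancer-containing construct
basic_wells = [(500.0, 1.0), (900.0, 3.0)]   # reference vector
fold = relative_activity(b99_wells, basic_wells)
print(f"fold activity over reference: {fold:.2f}")
```

The same function could express one construct relative to another (for example B-135 relative to B-49) by passing the second construct's wells as the reference.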
Electrophoretic mobility shift assay. Single-stranded high-performance liquid chromatography–purified oligonucleotides corresponding to the enhancer element (5′-GGTCTCCACCCACACATTCCTGGGGCGTG-3′) were purchased (Integrated DNA Technology), as were oligonucleotides carrying the same mutations as certain reporter constructs (M4 and M6). Annealed oligonucleotides were end labeled with 10 μCi/μL [α-32P]dATP using Klenow polymerase (New England Biolabs). Radiolabeled probes (20,000 cpm) were used per electrophoretic mobility shift assay (EMSA) binding reaction. Nuclear extracts were prepared following standard protocols. The binding reaction consisted of 5 μg of nuclear extracts, 2.5 μg poly(deoxyinosinic-deoxycytidylic acid) [poly(dI-dC)], radiolabeled probe, and binding buffer in a total volume of 20 μL. Free probes and probes bound by proteins were separated on a 10% nondenaturing polyacrylamide gel in 0.5× Tris-borate EDTA, vacuum dried, and exposed to film at −80°C. For competition experiments, the protein extracts were first incubated with up to a 100-fold molar excess of unlabeled competing fragments for 10 min before adding the radioactive fragment. For supershift experiments, mouse anti-TEF-1, goat anti-TEF-4, or control mouse IgG (both from Santa Cruz Biotechnology) antibodies were used. His-tagged recombinant TEF-1 was grown in Escherichia coli and purified as described previously (19).
Gene silencing. For transient transfections, cells were transfected twice sequentially using Lipofectamine 2000 with 100 nmol/L small interfering RNA (siRNA; Ambion). The oligonucleotide sequences used in transfections were as follows: T1a, GCCCUGUUUCUAAUUGUGG; T1b, GGUUAACACUAAUCUCCUA; T1c, CGGAGUAUGCAAGGUUUGA; and T4, GGACGGCAGAUUUGUGUAC.
The TEF-1 cDNA was released from the pXJ40-TEF-1A plasmid using EcoRI and BglII and subcloned into pcDNA3.1+Zeo (Invitrogen) precut with EcoRI and BamHI, resulting in an antisense orientation of the insert with respect to the promoter. For stable transfections, AsPc1 and HeLa cells were transfected at ∼60% confluency in T75 culture flasks with 10 μg of plasmid using Lipofectamine. Cells were selected in zeocin for 2 to 3 weeks, lysed, and analyzed for the presence of plasmid and TEF-1 protein expression. No viable clones with a significant reduction of TEF-1 expression were generated, possibly due to detrimental effects of down-regulation of expression of the multimember family of TEF transcription factors.
Chromatin immunoprecipitation. The assay was done as previously described (20) with no antibody, a mouse monoclonal anti-TEF-1 antibody, a goat anti-TEF-4 antibody, a rabbit TEF-5 antibody (21), or a control mouse IgG. DNA was recovered (QIAquick PCR Purification kit) and amplified by PCR using primers spanning and flanking (230 bp upstream and 294 bp downstream) the CanScript sequence.
Immunohistochemistry and immunocytochemistry. Immunohistochemistry was done on paraffin-embedded slides as previously described (14) using a mouse monoclonal anti-MSLN antibody (5B2). Immunocytochemistry was done using the LSAB+System-HRP kit (Dako) after incubation with a primary anti-MSLN antibody MORAb-009 (1:500; kindly provided by N. Nicolaides, Morphotek, Exton, PA) for 90 min.
Results
MSLN RNA and protein overexpression in pancreatic cancer tissues and representative cell lines. We reproduced the reported differential expression of MSLN in cancer cells of tissues by immunohistochemistry (Fig. 1A). MSLN was localized mainly to cell membranes (Fig. 1B). Using RT-PCR and immunoblotting, we established an appropriate comparative cell line panel in which regulation of the cancer-specific expression of the MSLN gene could be studied (Fig. 1C).
MSLN RNA and protein overexpression in tissues and representative cell lines. A and B, MSLN is labeled immunochemically. A, MSLN protein overexpression characterizes most pancreatic adenocarcinomas. Arrows, a cancerous duct and a normal duct (inset) from the same patient. B.1, MSLN protein overexpression localizes to cellular membranes. The AsPc1 cell line is depicted as a representative example. B.2, the MiaPaCa2 cell line is immunocytochemically negative for MSLN. C, MSLN overexpression is restricted to certain cell lines. AsPc1 and CAPAN2 represent pancreatic, and HeLa represents nonpancreatic cancer-specific MSLN overexpression. OVCAR3 represents overexpression in a cancer arising in a tissue normally expressing MSLN (ovary). HEK293 (human embryonic kidney), BJAB (Burkitt's lymphoma), and HG261 (human noncancerous fibroblasts) represent cancerous and noncancerous cells not expressing MSLN. The glycosylated (slower) and deglycosylated (faster) forms of MSLN protein can be distinguished. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and actin are loading controls. RNA, RT-PCR detection of the MSLN and GAPDH gene. Protein, immunoblots using specific antibodies against MSLN and actin. NC, no template PCR. D, identification of promoter and transcriptional start sites. Sequence of the 5′-end of the MSLN gene, including the MSLN promoter identified here. Sequencing of 5′-RACE products revealed 10 different possible transcriptional start sites, 3 corresponding to the consensus initiator sequence (overlying arrows), the rest to nonconsensus start sites (arrowheads). Underlined, a cryptic intron excised at cryptic splice sites that define the borders of alternate exons 1A and 1B. The translation start site lies in exon 2, not depicted. Lower case letters, start of intron 1.
Transcriptional start site determination by 5′-rapid amplification of cDNA ends. Sequencing of products generated by 5′-rapid amplification of cDNA ends (5′-RACE) revealed multiple transcript ends spanning a 400-bp region (Fig. 1D). No TATA box was identified. We designated the 5′-end of the longest transcript as nucleotide +1 and submitted this previously unreported transcript to GenBank under number EF420155. Three of the start sites matched the consensus initiator sequence, whereas the others did not and may represent partially degraded transcripts (Fig. 1D). Exons 1A and 1B were formed by splicing at cryptic splice sites located at positions +86 and +336. This splicing was observed in our transcripts and in previously reported cDNA sequences (Vega transcript OTTHUMT00000155614). Sequence analysis of available transcripts from public databases revealed the presence of a putative polyadenylation signal, poly A tail, and absence of potential destabilization signals. These features suggest that the MSLN transcript would be stable.
Transcriptional regulation of MSLN from a response element specific for aberrant expression in cancer. We used promoter-reporter assays to analyze 8 kb of genomic sequence surrounding nucleotide +1 (6 kb upstream and 2 kb downstream, including intron 1; Fig. 2A) in our panel of representative cell lines. In transient transfection assays, we identified a region between −135 and −49 bp that activated transcription from a minimal promoter in all tested MSLN-overexpressing cell lines AsPc1, CAPAN2, HeLa, HS766, and PL45 but did not activate the minimal promoter in MSLN-nonexpressing lines MiaPaCa2, RKO, HEK293, and BJAB (Fig. 2B; data not shown). In cell lines having cancer-specific MSLN overexpression, the region produced a 4- to 10-fold increase in reporter activity over the core promoter. In nonexpressing lines, the basal core promoter activity remained low and unchanged across the tested region. Plasmid P2F, which contained the sequences downstream from the start of transcription, had no enhancing activity (Fig. 2C).
Transcriptional regulation of MSLN overexpression. A, reporter constructs from vectors pGL3-Basic and pGL3-Promoter, engineered regions shown. Lines, MSLN gene fragments, numbered relative to transcriptional start site (+1). SV40, SV40 minimal promoter; LUC, luciferase gene. Construct labels are given to the right of the luciferase element. Arrows, orientation of the fragments within the vector. B, reporter activity of indicated constructs in MSLN-expressing and MSLN-nonexpressing cell lines. AsPc1, CAPAN2, and HeLa cells have high reporter activity originating in the region between −135 and −49 bp, whereas this activity is absent in MiaPaCa2, RKO, and HEK293 cells. Activities relative to that of vector pGL3-Basic are presented. C, the region downstream from the transcriptional start site, including intron 1 inserted in pGL3-Promoter, does not have enhancer activity (construct P2F). D, cell lines derived from tissues having expression of MSLN in the nonneoplastic state have very little or no enhancer activity of the region −135/−49. Activities of the construct B-135 relative to B-49 are presented. Columns, average data from two to six experiments done in duplicates; bars, SEM.
No significant increase in reporter activity in the −135/−49 region was seen in cell lines derived from tissues that are known to express MSLN even in a nonneoplastic state (ovarian carcinoma cell line OVCAR3 and mesothelioma cell lines MSTO-211H and H-513; Fig. 2D; ref. 22). Those lines are thought to exhibit a tissue-specific, but not cancer-specific, MSLN expression.
Once the longest transcript extending to nucleotide +1 was determined, we generated constructs extending only from +27 (constructs B-67b and B-49b; Fig. 2A). Reporter activity of these constructs did not differ from that of constructs extending from +413 (data not shown). To rule out the possibility that the short transcripts were generated from an alternative promoter, the region immediately upstream of the most downstream transcriptional start site (nucleotide +369) was also examined in a promoterless vector (B+183 and B+92; Fig. 2A) and showed no enhanced activity in any cell line tested (data not shown).
We next attempted to further narrow the response element and found all the cancer-specific activity to originate from a region defined by plasmids B-67 and B-49 (Fig. 3A and B). The region acted like an enhancer, in that it functioned in either orientation, activated transcription independently of location, and also enhanced a heterologous SV40 promoter (Fig. 3C).
The response element localized to the region between −67 and −49 and acted as an enhancer. A, the response element containing the transcriptional activity in MSLN-expressing cell line AsPc1 was localized to the region defined by plasmids B-67 and B-49. B, RKO cells were used as a negative control. Activities relative to pGL3-Basic are presented. C, the response element acted as an enhancer. It activated transcription from a heterologous promoter (construct P1) and was independent of orientation (PE1F versus PE1R, PE3F versus PE3R) and location (PE1F versus PE3F). The construct P3F, which did not contain the enhancer element, and RKO cells were used as negative controls.
Mutations defining the required sequence. To rigorously define the required sequence, variants created by site-directed mutagenesis were tested. Transfections of mutated constructs revealed two independent functional sites [termed site 1 (−53 to −46) and site 2 (−65 to −56); Fig. 4A] of the enhancer element that were both required for reporter activity in MSLN-overexpressing lines AsPc1 and HeLa, whereas no significant changes of basal reporter activity in the MSLN-nonexpressing line RKO were observed (Fig. 4B; data not shown).
Two distinct functional sites of the enhancer element were determined by site-directed mutagenesis. A, mutated constructs were based on the wild-type B-99 construct. The location of proposed sites 1 and 2 is depicted. The bases mutated in various constructs are in lower case and underlined, with constructs labeled at the left. B, reporter assays done in cell line AsPc1 indicated the presence of two distinct sites within the enhancer that were both required for reporter activity in AsPc1. No significant effect of the mutations was observed in RKO. C, pCanScript3.luc containing three concatemerized copies of CanScript produced a 30-fold enhancement over a SV40 minimal promoter.
Site 1, located at the 3′-end of the enhancer element, contained a 100% match to an MCAT element, a binding site of the TEF family of transcription factors (23). The mutations in constructs M6, M6b, and M6c completely abolished enhancer activity, reducing it to the basal activity of the core promoter construct B-49 (Fig. 4B).
Site 2 at the 5′-end of the enhancer contained a putative binding site for the Sp1 and Sp1-like Krüppel family of transcription factors (24). Construct M4 removed 40% of reporter activity, construct M5 removed 65% of reporter activity, whereas construct M1, combining only one base from the substitutions of both M4 and M5, produced a 75% reduction in reporter activity (Fig. 4B).
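The arrangement of the two sites can be illustrated by scanning the EMSA probe sequence given in Materials and Methods for candidate motifs. The minimal Python sketch below is not the MatInspector analysis used in the study. It assumes `CATTCCT` as the MCAT core and `CACCC` as a core recognized by Sp1-like and Krüppel-like factors; both strings are assumptions introduced here for illustration, and positions are 0-based offsets within the oligonucleotide rather than genomic coordinates.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (start, strand) hits for a motif on either strand.

    Start positions are 0-based and always refer to the given sequence.
    """
    hits = set()
    for strand, target in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(target)
        while start != -1:
            hits.add((start, strand))
            start = seq.find(target, start + 1)
    return sorted(hits)


# EMSA probe from Materials and Methods (enhancer element, top strand).
ENHANCER_OLIGO = "GGTCTCCACCCACACATTCCTGGGGCGTG"
MCAT_CORE = "CATTCCT"  # assumed MCAT core, not stated in the text
CACCC_BOX = "CACCC"    # assumed Sp1-like/Kruppel-like core, not stated in the text

mcat_hits = find_motif(ENHANCER_OLIGO, MCAT_CORE)
cacc_hits = find_motif(ENHANCER_OLIGO, CACCC_BOX)
print("MCAT hits:", mcat_hits)
print("CACCC hits:", cacc_hits)
```

Under these assumptions the CACCC-type match lies 5′ of the MCAT match within the probe, which agrees with the relative placement of site 2 and site 1 described above.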
The defined requisite sequence for cancer-specific enhancement of transcription was thus −62 to −45 and was termed the CanScript sequence. When three copies of the CanScript sequence were inserted in front of a minimal promoter (pCanScript3.luc), the enhancing activity increased to 30-fold in AsPc1 cells (Fig. 4C). The CanScript sequence is partially conserved among mammals. The human sequence is identical to the chimpanzee and rhesus monkey sequences and has one mismatch with the cow sequence. Three mismatches with the mouse sequence and four mismatches with the rat and dog sequences affect both CanScript sites.
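The length of a region given in coordinates relative to the transcriptional start site can be checked with a small Python sketch. It assumes the common convention, not stated explicitly in the text, that numbering runs ..., −2, −1, +1, +2, ... with no position 0. The function name is hypothetical.

```python
def span_length(start, end):
    """Number of bases from start to end (inclusive) in TSS-relative numbering.

    Assumes there is no position 0: -1 is immediately followed by +1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not lie downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the span crosses the -1/+1 boundary, so skip the missing 0
    return length


canscript_length = span_length(-62, -45)
print("CanScript length (bp):", canscript_length)
```

For −62 to −45 this gives 18 bases, consistent with the 18-bp enhancer described in the Abstract.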
Binding of TEF-1 to the MCAT motif of the MSLN enhancer in vitro. EMSAs using synthetic radiolabeled oligonucleotides representing the enhancer element identified one specific DNA-protein complex, complex 1 (Fig. 5A, lane 3), which (a) was outcompeted by an excess of unlabeled probe (Fig. 5A, lane 5), (b) did not form when a radiolabeled site 1–mutated probe was used (Fig. 5A, lane 4), and (c) was not outcompeted by an excess of mutated unlabeled probes based on sites 1 and 2 (Fig. 5A; data not shown). Complex 1 and its pattern of responses to competition experiments did not differ between MSLN-overexpressing lines with high reporter activity of the enhancer, AsPc1 (Fig. 5A) and HeLa (Fig. 5B, lanes 1–4), the MSLN-expressing cell line without enhancer activity, OVCAR3 (Fig. 5B, lanes 5–8), and the MSLN-nonexpressing line with no enhancer activity, RKO (Fig. 5A, lanes 7–10). The presence of complex 1 was restricted to cells expressing TEF-1 (TEAD1) irrespective of MSLN expression (Fig. 5C, lanes 1–10). No complex was present in the TEF-1–nonexpressing line BJAB and human fibroblasts having minimal TEF-1 expression HG261 (Fig. 5C, lanes 7 and 9). The specific complex 1 was supershifted by an anti-TEF-1 antibody (Fig. 5D.1, lanes 2–4) but not by an anti-TEF-4 (TEAD2) or a control mouse IgG (Fig. 5D.1, lanes 5–7). When a mutated probe was used, no supershift occurred (Fig. 5D.1, lanes 8–10). Complex 1 was reduced on knockdown of TEF-1, whereas no reduction was observed on knockdown of TEF-4 or an irrelevant protein FANCD2 (Fig. 5D.2, lanes 1–3). When a recombinant TEF-1 was added to an MCAT wild-type but not mutated probe, a complex formed that comigrated with complex 1 (native TEF-1; Fig. 5D.2, lanes 4 and 5). Complex 1 was enhanced when a recombinant TEF-1 was added to cell nuclear extracts (Fig. 5D.2, lane 6).
Binding of TEF-1 to site 1 in vitro. EMSAs used nuclear extracts from various cell lines. The constituents of the reaction assayed in each lane are depicted above the gel. PWT, probe wild-type; Pm1, probe mutated in site 1; CWT, competitor wild-type; Cm1, competitor mutated in site 1. A, lane 3, a specific retarded band termed complex 1 (C1) was present when nuclear extracts of AsPc1 were incubated with an oligonucleotide corresponding to the enhancer element; lane 4, this complex did not form when site 1 was mutated. It was outcompeted by a 100-fold molar excess of unlabeled wild-type probe (lane 5) but not site 1–mutated probe (lane 6). Lanes 7 to 10, the same pattern was observed in RKO cells. B, replication in HeLa (lanes 1–4) and OVCAR3 (lanes 5–8) cells gave the same results. C, the specific complex 1 was present in various cell lines irrespective of MSLN expression and required TEF-1. MSLN-expressing cell lines HeLa, CAPAN2, and MSTO-211H (lanes 1, 2, and 4) and MSLN-nonexpressing cell lines LNCaP (prostate cancer), MiaPaCa2, HEK293, BJAB, Cal27 (oral squamous cell cancer), HG261, and MDA-MB-480 (breast cancer; lanes 3, 5, 6, 7, 8, 9, and 10) were tested. Complex 1 formed in all of these lines except the cell line having no expression of TEF-1 (BJAB) and the noncancerous cell line HG261. D.1, complex 1 was supershifted (SS) by increasing concentrations of TEF-1 antibody (1, 2, and 5 μg; lanes 2–4, αTEF-1) but was not supershifted by increasing concentration of TEF-4 antibody (1.5 μg; lanes 5 and 6, αTEF-4) or control mouse IgG (2 μg; lane 7, αIgG). Lanes 8 to 10, the nonspecific complex 2 (C2) was not shifted by any of the antibodies (2 μg). D.2, lanes 1 to 3, in HeLa cells, complex 1 was reduced on siRNA knockdown of TEF-1 (T1a used) but not TEF-4 or FANCD2; lanes 4 and 5, recombinant TEF-1 protein formed a retarded band that comigrated with complex 1 using a wild-type but not a mutated MCAT probe; lane 6, native complex 1 was enhanced by recombinant TEF-1.
A discrete larger complex (complex 2) above complex 1 was also identified (Fig. 5A–D). Complex 2 was not fully diminished when an excess of wild-type unlabeled competitor, site 1–mutated competitor, or site 2–mutated competitor was used. It was, however, diminished on addition of an excess of poly(dI-dC) and was therefore considered nonspecific. The apparent enhancement of complex 2 that appeared when a site 1–mutated probe was used was also considered nonspecific (Fig. 5A, lane 8; data not shown).
When a radiolabeled oligonucleotide having a wild-type site 1 and a mutation in site 2 was used as a probe, complex 1 was preserved and no other specific complex could be identified (data not shown).
TEF-1 is required but not sufficient for MSLN overexpression. The graded levels of RNA and protein expression of the members of the TEF family of transcription factors in cancer cell lines (Fig. 6A) did not correspond to MSLN expression (Fig. 1C). None of the TEF members was selectively expressed only in cell lines having cancer-specific MSLN overexpression.
TEF-1 was required but not sufficient for MSLN cancer-specific expression. A, the members of the TEF family of transcription factors are widely expressed in cancer cell lines, and their expression does not correspond to expression of MSLN (compare with Fig. 1D). Left, gene and type of assay. GAPDH and actin were used as loading controls. B, down-regulation of TEF-1 was achieved by three nonoverlapping siRNAs (T1a, T1b, and T1c) and resulted in reduction of MSLN protein expression in HeLa cells. Scrambled siRNA (SCR) and siRNA directed against an irrelevant gene FANCD2 (D2) were used as negative controls. Down-regulation of TEF-4 (T4a and T4b) did not result in reduction of MSLN protein expression in HeLa cells. C, ChIP with a TEF-1 antibody (10 μg) resulted in amplification of a sequence containing the CanScript sequence (CS) in AsPc1 cells but not in TEF-1–nonexpressing BJAB cells. Immunoprecipitations serving as negative controls were done with normal mouse IgG and without antibody (NoIg) or used sequences immediately upstream (5′) and downstream (3′) of the CanScript sequence. Representative results of at least two independent immunoprecipitation experiments and multiple independent PCR analyses are shown. Inp, a constant fraction of the input; NC, control lacking template DNA. D, squelching in AsPc1 and MiaPaCa2. Left, exogenous TEF-1 expression caused a decrease in activity of the MSLN enhancer-containing reporter B-99 in MSLN-expressing cell line AsPc1 but not in MSLN-nonexpressing cell line MiaPaCa2. Much smaller effects were seen when the parental reporter containing the SV40 promoter, pGL3-Promoter, was used (middle) or the irrelevant protein Gli1 was expressed (right).
Three nonoverlapping siRNAs targeting specifically the TEF-1 transcript caused decreases in MSLN protein expression (Fig. 6B). Selective knockdown of TEF-4 resulted in down-regulation of TEF-4 but MSLN protein expression was not reduced (Fig. 6B).
Binding of TEF-1 to MSLN enhancer in vivo. Having shown binding of TEF-1 to the MCAT element of the CanScript sequence in vitro, we sought to confirm this finding in living cells. Chromatin immunoprecipitation (ChIP) assays showed that a DNA sequence, including the enhancer element (100 bp tested), was bound by TEF-1 in vivo in AsPc1 cells but not in BJAB cells, which expressed no TEF-1 (Fig. 6C). Sequences surrounding the enhancer element immediately upstream and downstream were not bound by TEF-1 (Fig. 6C). No binding to any of the DNA sequences tested was detected when antibodies against TEF-4 or TEF-5 (TEAD3) were used, although no positive control could be established for these two antibodies in the ChIP assay (data not shown).
Cancer specificity of the enhancer mediated by a cofactor. It has been suggested that the transactivation function of TEF-1 is mediated by a highly limiting, cell-specific, titratable transcriptional intermediary factor (a cofactor) that is reliably detected by a squelching assay (25). To test this possibility in our system, exogenous TEF-1 driven by the cytomegalovirus promoter was introduced in increasing concentrations into MSLN-overexpressing and MSLN-nonexpressing cell lines together with either a reporter containing the CanScript sequence or heterologous SV40 promoter-driven luciferase. In the MSLN-overexpressing cell line AsPc1, CanScript activity diminished with increasing concentrations of TEF-1 expression plasmid (Fig. 6D, left), which we interpreted as squelching. In contrast, in the MSLN-nonexpressing cell line MiaPaCa2, the background reporter activity was neither induced nor reduced when TEF-1 plasmid was added (Fig. 6D, left).
No significant decrease in reporter activity was observed in the cell lines when the heterologous SV40 promoter-driven luciferase (pGL3-Promoter) was used as a reporter (Fig. 6D, middle), suggesting that the squelching effect observed in AsPc1 was specific to the CanScript sequence. A Gli1-expressing plasmid, used as a negative control expressing an irrelevant protein, led only to a slight reduction of reporter activity in the cell lines when transfected with the MSLN reporter (Fig. 6D, right).
We evaluated the expression pattern of genes previously suggested by others as cofactors of TEF [the vestigial-like proteins VGLL2, VGLL3, and VGLL4; the related WW domain-containing YAP1 (YAP65), WWTR1 (TAZ), and PARP; and interacting transcription factors MAX and MEF2A; refs. 19, 26–30] by analyzing the publicly available SAGE database (footnote 4) as well as by RT-PCR in our panel of cell lines. The expression pattern of any of the proposed TEF-1 cofactors did not correspond to the differential expression of MSLN in pancreatic cancer or in our cell line model (data not shown).
Discussion
We investigated the mechanism producing cancer-specific overexpression of MSLN, a property common to nearly a third of human cancers and almost ubiquitous in pancreatic ductal adenocarcinoma. An 18-bp upstream enhancer, termed CanScript, seemed responsible for this overexpression, with a transcription factor of the TEF family acting as a major regulator.
We identified multiple transcriptional start sites of the MSLN gene spanning hundreds of base pairs. Nevertheless, the transcription seemed to be regulated from one weak tissue-nonspecific TATA-less promoter located upstream of nucleotide +1. The low activity of the isolated MSLN promoter may explain the absence of MSLN expression in most tissues.
In a group of cell lines having aberrant overexpression of MSLN, the core promoter activity was strongly enhanced by an 18-bp sequence, CanScript, located ∼60 bp from the transcriptional start site. In pancreatic and cervical carcinoma cells, the enhancer produced a 4- to 10-fold induction of promoter activity. In contrast, in cell lines originating from tissues having physiologic expression of MSLN (mesothelium, ovarian epithelium, and cortical inclusion cysts; ref. 22), insignificant effects were seen. This absence of enhancement in our analysis agrees with the previously published study by Urwin and Lake (31), who analyzed transcriptional regulation of MSLN in the mesothelioma cell line JU77 and did not observe any transcriptional activity of the CanScript-containing region.
Mutational analysis revealed the presence of two functionally distinguishable putative binding sites in the enhancer. The distal site 1 perfectly matched the MCAT element. The MCAT element was first identified by Mar and Ordahl (32) in the cardiac troponin gene (cTNT) and has been since implicated in the regulation of several cardiac and skeletal muscle-specific genes (α and β myosin heavy chains, skeletal α-actin, and β acetylcholine receptor; refs. 33–35). In addition to muscle-specific expression, MCAT-dependent expression was described in the SV40 enhancer (36) and human papillomavirus (HPV) 16 (37) oncogenes in a variety of cell lines and in the chorionic somatomammotropin gene in placenta (38). The MCAT element is bound by members of the TEF family of transcription factors [TEF-1 (TEAD1), TEF-3 (TEAD4), TEF-4 (TEAD2), and TEF-5 (TEAD3); ref. 23]. TEF-1 was first identified as a binding factor of the SV40 enhancer in HeLa cells (25). The members of the TEF family are broadly expressed, in tissue-specific patterns, suggesting a unique function for each of them (39).
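As an illustration of how a candidate MCAT site might be located computationally, the following minimal Python sketch scans a DNA string for a motif on both strands. It is not the method used in this study. The core motif CATTCC (reverse strand GGAATG) is assumed here as a commonly cited MCAT core, and the example sequence is a hypothetical placeholder, not the CanScript enhancer.

```python
# Minimal sketch: find a short motif on either strand of a DNA sequence.
# MCAT_CORE is an assumed core motif; the example sequence below is a
# made-up placeholder and NOT the CanScript sequence.

MCAT_CORE = "CATTCC"  # assumption; its reverse complement is GGAATG

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str = MCAT_CORE) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for each occurrence of motif in seq.

    "+" means the motif itself was found; "-" means its reverse
    complement was found at that position on the given strand.
    """
    seq = seq.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


if __name__ == "__main__":
    # Hypothetical placeholder sequence for demonstration only.
    example = "TTGCATTCCAGGTTGGAATGAA"
    print(find_motif(example))
```

A scan of this kind identifies only sequence matches; whether a match is functional still has to be tested by mutation and binding assays such as those described here.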
Here, we show that the MCAT element of the MSLN enhancer is bound by TEF-1. Several pieces of evidence support this conclusion. First, the 18-bp CanScript sequence, which acted as an enhancer, lost all of its activity when the MCAT site was mutated. Similarly, the EMSA-defined complex 1 did not form when mutations in the MCAT site were introduced. Second, complex 1 was supershifted by a TEF-1 antibody and did not form in a cell line having no expression of TEF-1. Third, selective knockdown of TEF-1 resulted in reductions of complex 1 and of MSLN protein expression. Fourth, a complex formed of sequences encompassing the MSLN enhancer and TEF-1 protein was detected in ChIP experiments.
Our studies suggested that TEF-1 but not TEF-4 or TEF-5 interacts with the MSLN enhancer. Previous in vitro studies showed that all TEF proteins had almost identical DNA-binding domains, recognizing the MCAT motif with essentially the same affinity (39). However, disruption of TEF-1 in mice resulted in embryonic lethality due to cardiac abnormalities, suggesting that TEF members are not redundant in function (40). The selective role of individual members of the TEF family may be explained by differences in tissue expression, target accessibility, or the specificity of cofactor/TEF interactions. Recently, Mahoney et al. (19) showed differential interaction of the TEF cofactor TAZ with TEF-1 and TEF-3. Our studies did not examine the effects of all the members of the TEF family, and it is possible that TEF-1 is not the only member of the family transactivating through the CanScript sequence.
Our data confirmed the results of others showing that tissue-specific expression of TEFs is by itself insufficient to direct tissue-specific expression of TEF-driven genes. Cofactors have been proposed to direct tissue specificity (39). They were first suggested by Xiao et al. (25) from experiments on the SV40 enhancer. We did experiments analogous to those of Xiao et al. and observed that adding exogenous TEF-1 into a cell line lacking MSLN did not result in an induction of reporter activity. Furthermore, in a MSLN-expressing cell line, the overexpression of TEF-1 resulted in a decrease in reporter activity. Such behavior is presumed to be attributable to squelching of cell-specific cofactors (25). Thus, our experiments suggested the cancer specificity to be provided by a titratable cofactor of TEF.
The cofactor controlling the activity of the MSLN enhancer remains to be identified. The expression pattern of proteins previously suggested as TEF cofactors (19, 26–30) did not correspond to MSLN overexpression in our model. The tissue specificity of the enhancer might also be provided by splicing isoforms of TEF-1 (41). Furthermore, TEF-1 might act as a selector protein recruiting a transcription factor responsive to specific signaling events, selectively present only in MSLN-overexpressing cells and binding independently to the enhancer (42).
The exact role of site 2 is not certain. Mutational studies suggested that it was functional yet not to the same extent as site 1. Furthermore, EMSA experiments failed to show a retarded complex corresponding to this site. Such absence may have been caused by a discrete and short-lived protein-DNA interaction that was undetectable under our experimental conditions. It is also possible that alterations in site 2 could cause conformational changes altering the binding of TEFs to site 1. None of the mutations introduced to site 2 resulted in an increase in reporter activity in MSLN-nonexpressing cell lines, ruling out the possibility of a repressor mechanism selectively absent in MSLN-overexpressing cell lines.
The binding protein of site 2 may be the presumptive TEF cofactor. Sequences immediately flanking the MCAT element were shown to direct tissue specificity in the cTNT promoter (43). The site has a high similarity to the consensus binding site for transcription factors of the Sp1 family (8 of 10 bases; ref. 24). Sp1 and Krüppel-like factor proteins have been proposed to play an important role in the growth and metastasis of many tumor types, including pancreatic cancer (44). Recently, a GC box bound by Sp1 was shown to be required for the activity of an MCAT-dependent cTNT promoter (45). Sp1 is also a common regulator of TATA-less promoters and might function as a nonspecific transactivating factor of the TATA-less MSLN promoter. The slight decrease in the background reporter activity in a MSLN-nonexpressing cell line RKO, observed when site 2–mutated constructs were used, might support this explanation.
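A similarity statement such as "8 of 10 bases" can be made concrete by counting the positions at which a candidate site agrees with a consensus. The sketch below is one way to do this in Python, allowing IUPAC degenerate codes in the consensus. The consensus string KGGGCGGRRY is assumed here as one commonly cited form of the Sp1 GC box, and the candidate site is a hypothetical placeholder, not the actual site 2 sequence.

```python
# Minimal sketch: count positions at which a site matches a consensus.
# The consensus and the candidate site used in the demo are assumptions
# for illustration; neither is taken from this study.

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_matches(site: str, consensus: str) -> int:
    """Return the number of positions where site agrees with consensus.

    site must contain plain bases; consensus may contain IUPAC codes.
    """
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(
        base in IUPAC[code]
        for base, code in zip(site.upper(), consensus.upper())
    )


if __name__ == "__main__":
    sp1_consensus = "KGGGCGGRRY"   # assumed Sp1 GC-box consensus
    candidate = "GGGGAGGAGA"       # hypothetical site, not site 2
    n = count_matches(candidate, sp1_consensus)
    print(f"{n} of {len(sp1_consensus)} bases match")
```

A simple match count treats every position equally; a position weight matrix would weight positions by how strongly they are conserved, at the cost of needing a trained matrix.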
It is intriguing to consider possible similarities of the cancer-specific cofactor/TEF-driven MSLN overexpression and the cancer-specific alterations of β-catenin/T-cell factor (TCF). The first members of the TCF family, TCF1 and LEF1, were identified for their ability to drive tissue-specific expression of CD3ε and HIV long terminal repeat genes in T lymphocytes (46, 47). Because TCF factors could not directly activate transcription in reporter assays, the appropriate molecular context, in which the factors acted, was not determined until a binding cofactor, β-catenin, was discovered. Cotransfection with β-catenin strongly induced transcription from multimerized TCF-binding motifs linked to a reporter gene (48). This evidence led to completion of the diagram of the gatekeeper APC-β-catenin-TCF pathway, ubiquitously activated in colon cancer by mutations in APC or β-catenin (49). It is tempting to speculate that the cofactor/TEF regulation of MSLN aberrant expression may result from a genetic alteration, a highly prevalent mutation or amplification, yet unknown in pancreatic and other cancers, producing constitutive activation of a downstream pathway. The search for the cofactor thus represents an attractive goal as the final step of this alternative approach to identify highly frequent aberrant cellular activities in cancer.
In the past two decades, gene therapy has been frequently attempted as a potentially powerful therapeutic technique in the treatment of cancer. The recent development of new technologies in gene therapy offers hope for further clinical improvement (50). However, the weak point of virtually all available delivery strategies is the lack of specific targeting. One way to overcome this obstacle is to use highly cancer-specific promoter elements to selectively drive gene expression. The MSLN enhancer with its high specific activity in a variety of cancers may serve this purpose well.
Our findings expand the role of TEF family of transcription factors in cancer. TEF-1 was first identified in cancer cells and is important for expression of oncogenic viruses SV40 and HPV16. HPV plays a causative role in cervical cancer. Interestingly, cervical cancers express MSLN and the expression seems TEF-1 dependent. Our data suggest a role for nonviral TEF-driven gene expression in cancer that may participate in pancreatic tumorigenesis. According to the proposed model, TEFs, similarly to TCF, are likely to represent only a downstream tool to direct transcriptional activation from a specific upstream pathway. Identifying other biologically relevant targets of TEF-driven expression represents another way to uncover the exact role of TEFs and the processes upstream of them.
In conclusion, by analyzing a large region of genomic sequences surrounding the MSLN gene in a representative panel of cell lines, we discovered an upstream enhancer that seemed responsible for cancer-specific MSLN overexpression. The activity of the enhancer was dependent on the binding of TEF-1 and its specificity on a yet unknown cofactor of TEF. A modular 18-bp responsive sequence, CanScript, has enhanced activity when arranged in a tandem repeat and represents an optimal candidate for cancer-specific targeting, potentially suitable for a third of human malignancies.
Acknowledgments
Grant support: National Cancer Institute grant CA62924 and the Marjorie Kovler Fund.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank M. Gorospe and J.M. Winter for helpful suggestions.