PAX3 and PAX7 are closely related paired box family members expressed during early neural and myogenic development. Assay of PAX3 and PAX7 mRNA expression in embryonal rhabdomyosarcoma, neuroblastoma, Ewing’s sarcoma, and melanoma cell lines revealed tumor-specific expression patterns similar to the corresponding embryonic lineages. Although the mammalian PAX3 and PAX7 genes were reported to contain eight exons, we found that the predominant PAX3 and PAX7 transcripts in these tumor lines contain previously uncharacterized ninth exons. These splicing events alter the COOH-terminal coding regions of the encoded products but do not alter the transcriptional activity as assayed using a reporter gene with a model PAX3/PAX7 binding site. However, the findings of nearly identical COOH-terminal regions within the corresponding genes of the avian and fish genomes suggest conserved functional roles for these regions that require further investigation.
Within the paired box family of transcription factors, PAX3 and PAX7 constitute a subfamily characterized by a highly conserved NH2-terminal region consisting of paired box, octapeptide, and homeobox motifs (1). The known transcription units of the two corresponding genes consist of eight exons and have similar structural and functional organization (Fig. 1A; Refs. 2, 3). In particular, the paired box is encoded by exons 2, 3, and 4, whereas the homeobox is encoded by exons 5 and 6. These NH2-terminal PAX3 and PAX7 regions contain DNA binding domains with similar, if not identical, target specificity (4). Within the NH2-terminal regions, there are also transcriptional repression domains overlapping the DNA binding domains (5, 6). Distal to the homeobox, there are several regions with sequence similarity between PAX3 and PAX7 followed by marked divergence at the COOH-terminal ends (Fig. 1B; Refs. 3, 4, 7). Functional analyses of exons 6, 7, and 8 indicate that they encode transcriptional activation domains with activity differences that are attributed to the regions of sequence divergence and differing interactions with the NH2-terminal repression domains (6, 8).
PAX3 and PAX7 are expressed during early development with overlapping patterns (7, 9). In situ hybridization analysis of murine embryos revealed Pax3 expression in the developing central and peripheral nervous system as well as in mesoderm compartments that give rise to skeletal muscle progenitors. Although Pax7 is expressed in similar areas, spatial and temporal differences have been noted between the expression patterns of the two genes. For example, Pax7 expression is activated later and persists longer than the expression of Pax3. In addition, Pax3 but not Pax7 is expressed in the neural tube region from which the neural crest cells arise, in the migrating subpopulations of neural crest cells, and in some developing neural crest derivatives. Another difference is the localization of Pax3 expression to the lateral dermomyotome and myogenic progenitors that migrate to the limbs in contrast with the more prominent Pax7 expression in the medial dermomyotome and absence in the myogenic progenitors migrating to the limbs (10).
The available expression and functional data support a model in which a similar set of target genes is regulated by PAX3 and PAX7 during development with differing efficiencies, at different times, and in different places. We are investigating these issues in tumors related to the myogenic and neural lineages, which serve as biological systems in which expression and function of these genes can be analyzed and manipulated. In previous work, we studied the PAX3-FKHR and PAX7-FKHR fusion genes that are generated by chromosomal translocations in ARMS (footnote 5) and demonstrated increased transcriptional function of the fusion products and gene-specific mechanisms for fusion gene overexpression (11). In this report, we focus on the wild-type PAX3 and PAX7 genes and demonstrate distinct expression patterns in ERMS, ES, NB, and MEL cell lines. Using these cell lines for detailed characterization of the RNA expression products, we find that alternative splicing forms of PAX3 and PAX7 involving previously uncharacterized ninth exons are the predominant transcripts.
Materials and Methods
The RD ERMS line and A673 ES line were obtained from the American Type Culture Collection. The ERMS lines SMS-CTR, Birch, TTC442, and TTC516 and the ES line TTC466 were provided by Dr. T. Triche (Children’s Hospital of Los Angeles, Los Angeles, CA). The ES lines TW, N1000, and TC32 were provided by Dr. J. Biegel (Children’s Hospital of Philadelphia, Philadelphia, PA). The NB lines were provided by Dr. G. Brodeur (Children’s Hospital of Philadelphia, Philadelphia, PA), and the MEL lines were established as described previously (12). RNA was isolated from these lines as described previously (13). In addition, RT-PCR assays were performed to ensure that ES lines expressed either the EWS-FLI1 or EWS-ERG fusion transcript characteristic of ES and that ERMS lines did not express the fusion transcripts characteristic of ARMS (PAX3-FKHR or PAX7-FKHR) or ES (13).
RNase protection assays were performed as described previously (14). Sizes of protected fragments were calculated from plots comparing the mobility and size of the unprotected and protected PAX3-P2 and PAX7-P2 fragments and the protected GAPDH or β-actin fragment. The PAX3-P2 and PAX7-P2 riboprobe plasmids contain PvuII fragments (365 and 353 bp, respectively) from exons 6–8 cloned into the PvuII site of pSP72 (Promega; Ref. 14). The PAX3-AHc and PAX7-BHBX plasmids consist of a 483-bp ApaI-HincII fragment and a 447-bp BspHI-BstXI fragment, respectively, from exons 7 and 8 cloned into the PvuII site of pSP72. The PAX3-APf plasmid consists of a 492-bp ApaI-PflMI fragment from exons 7–9 cloned into the PvuII and XhoI sites of pSP72. The integrity of selected constructs was verified by automated cycle sequencing (University of Pennsylvania DNA Sequencing Facility).
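The size calculation mentioned above can be illustrated with a short sketch. It assumes the commonly used semi-logarithmic relationship between fragment size and gel mobility; the text does not state which fitting procedure was used. The function names and the mobility values are hypothetical and are not taken from the original experiments.

```python
import math

def fit_size_calibration(standards):
    """Least-squares fit of log10(size in bp) against gel mobility.

    `standards` is a list of (mobility, size_bp) pairs for fragments of
    known size, such as unprotected probes and protected control fragments.
    Returns (slope, intercept) of the fitted line.
    """
    xs = [mobility for mobility, _ in standards]
    ys = [math.log10(size) for _, size in standards]
    n = len(standards)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def estimate_size(mobility, slope, intercept):
    """Fragment size (bp) predicted for a band at the given mobility."""
    return 10 ** (slope * mobility + intercept)

# Hypothetical calibration points (mobility in mm, size in bp).
example_standards = [(20.0, 500.0), (35.0, 350.0), (60.0, 200.0)]
slope, intercept = fit_size_calibration(example_standards)
print(round(estimate_size(45.0, slope, intercept)))
```

Fitting on the logarithm of size rather than on size itself keeps the calibration approximately linear over the few-hundred-bp range of these probes, which is the only reason for that choice here.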
The alternative 3′-PAX7 end was isolated by RT-PCR with forward primer CACAGCTTCTCCAGCTACTCTG and reverse primer GAAGCTCAGGGGTCAGTTAGG. The 515-bp RT-PCR product was subcloned into plasmid pCR2.1 (Invitrogen). The PAX7-BH9R2 riboprobe plasmid was then prepared by subcloning a BspHI-XhoI fragment consisting of 40 bp of pCR2.1 polylinker and a 487-bp PAX7 cDNA insert (corresponding to exons 7–9) into the PvuII and XhoI sites of pSP72.
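As a quick sanity check on the two primers listed above, the following sketch reports their length, GC content, and a rough melting temperature. The Wallace rule used here is an assumption chosen for simplicity; it is intended for short oligonucleotides and gives only a coarse estimate for primers of this length. The helper names are illustrative and the original report does not describe how the primers were evaluated.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

FORWARD = "CACAGCTTCTCCAGCTACTCTG"  # forward primer from the text
REVERSE = "GAAGCTCAGGGGTCAGTTAGG"   # reverse primer from the text

for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), round(gc_fraction(primer), 2), wallace_tm(primer))

# The reverse primer is complementary to the sense strand, so its reverse
# complement is the sequence expected in the sense orientation of the cDNA.
print(reverse_complement(REVERSE))
```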
PAX7-containing PAC clones (394P16, 394P21, and 365K6) were identified in the Sanger Center Chromosome 1 database (footnote 6) and were recovered from the Roswell Park Cancer Institute human PAC library. Aliquots of PAC DNA (1 μg) were digested for 2–4 h with 5–20 units of restriction endonuclease, electrophoresed in 0.75% agarose gels in Tris-borate buffer, and blotted to nylon membranes (Hybond N+, Amersham). After the isolation of DNA fragments from restriction endonuclease-cleaved plasmids by silica gel particle adsorption and elution (Qiaex, Qiagen), fragments were labeled by the random primer technique and hybridized to blots by standard techniques.
Expression constructs for the original PAX3 and PAX7 clones as well as PAX3-FKHR have been described previously (5, 8). The full-length alternative PAX3 cDNA expression construct was constructed from the original PAX3 clone and IMAGE cDNA clone 249038 (American Type Culture Collection), and consists of 24 bp 5′ untranslated region, 1452 bp coding region, and 167 bp 3′ untranslated region cloned into the EcoRI and XhoI sites of pcDNA3 (Invitrogen). The full-length alternative PAX7 cDNA expression construct was constructed from the original PAX7 clone and the subcloned alternative PAX7 RT-PCR product described above and consists of 102 bp 5′ untranslated region, 1515 bp coding region, and 5 bp 3′ untranslated region cloned into the HindIII and ApaI sites of pcDNA3. The transcriptional activity of these expression constructs was assayed by transient transfection with the 6xPRS-9/E1b CAT reporter plasmid as described previously (5, 8).
PAX3 and PAX7 Expression in Tumor Cell Lines.
On the basis of the finding of murine Pax3 and Pax7 expression in myogenic progenitors and the developing nervous system (7, 9), we investigated the expression of wild-type PAX3 and PAX7 in human tumor cell lines related to these lineages. In particular, we examined expression in NB and MEL, tumors related to the peripheral sympathetic and melanocytic lineages, respectively, and which are both derived from the neural crest. We also analyzed ERMS, a tumor related to the striated muscle lineage, and ES, a peripheral tumor that is related to a poorly defined neuroectodermal lineage.
We used the RNase protection assay to quantify PAX3 and PAX7 transcript expression in five ERMS lines, five ES lines, six NB lines, and five MEL lines. The riboprobes PAX3-P2 and PAX7-P2 contain 40 bp from exon 6, 215 bp (PAX3) or 203 bp (PAX7) from exon 7, and 110 bp from exon 8 (Fig. 1A). The expression results revealed tumor-specific patterns of PAX3 and PAX7 expression, with some heterogeneity within the tumor groups (Fig. 2). Among the neural-related tumors, both PAX3 and PAX7 were expressed at moderate-to-high levels in ES lines, whereas there was moderate-to-high PAX3 expression but no detectable PAX7 expression in MEL lines. In contrast, neither PAX3 nor PAX7 was expressed at detectable levels in NB lines. In the ERMS lines, PAX3 was expressed at low-to-moderate levels, whereas PAX7 was expressed at higher levels. To corroborate these findings in tumors, we analyzed a panel of ERMS specimens and found generally low or undetectable levels of PAX3 expression and significantly higher levels of PAX7 expression (data not shown).
Expression Analysis of PAX3 and PAX7 Exon 8.
Using these lines as model systems for wild-type PAX expression, we studied the specific composition of the expressed PAX3 and PAX7 transcripts. We focused on exon 8, for which previous reports had shown marked sequence dissimilarity between the extreme 3′ portions of the PAX3 and PAX7 coding regions (3, 7). In particular, after regions of moderate and marked sequence similarity (26 of 48 and 32 of 35 identity, respectively, at the amino acid level), there is a 5-amino-acid COOH-terminal PAX3 portion and an unrelated 52-amino-acid COOH-terminal PAX7 portion (PAX3A and PAX7A, Fig. 1B).
Hybridization of a Northern blot of RD and A673 RNA with a PAX7 exon 8 fragment from the region of sequence similarity (240BsH, Fig. 3A) detected the 6.25-kb wild-type PAX7 transcript (data not shown). In contrast, an adjacent PAX7 cDNA fragment corresponding to the region of sequence dissimilarity (200SfB, Fig. 3A) showed no detectable hybridization to any RNA species. To corroborate this finding, we constructed a riboprobe (PAX7-BHBX, Fig. 1A) consisting of 62 bp of PAX7 exon 7 and 385 bp of exon 8. RNase protection analysis of RD, SMS-CTR, Birch, A673, and TC32 RNA with this riboprobe demonstrated a major 310-bp protected fragment that constitutes only a portion of the 447-bp full-length probe (Fig. 2B, arrow 2; some of the data are not shown). This finding is consistent with expression of a PAX7 transcript that diverges from the previously cloned PAX7 cDNA at a point approximately 250 bp into exon 8; this point corresponds to the previously reported region in exon 8 where the PAX3 and PAX7 coding regions diverge.
A corresponding analysis of the 3′ end of the PAX3 transcript was similarly performed. Using a riboprobe construct (PAX3-AHc, Fig. 1A) consisting of 46 bp of PAX3 exon 7 and 437 bp of exon 8, we assayed expression in our panel of ERMS, ES, and MEL cell lines. Although protected fragments near the full-length size of 483 bp were detected only in two ES lines (TC32 and TTC466), this assay revealed a prominent 295-bp protected fragment in all of the lines (Fig. 2C, arrow 2). This finding indicates that an abundant PAX3 transcript diverges from the previously cloned PAX3 cDNA at a point approximately 250 bp into PAX3 exon 8; this point coincides with the point of PAX3/PAX7 sequence divergence. Therefore, for both PAX3 and PAX7, the original cDNA clones have 3′ portions that are not part of the predominant transcripts expressed by these cell lines. The transcripts expressed by the cell lines diverge from these sequences at a conserved point, which suggests the existence of an additional ninth exon situated distal to this divergence point that represents a conserved splice site. This hypothesis is supported by the correspondence of the sequences at this point to the consensus for a donor splice site (Fig. 1B).
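The reasoning that links the protected fragment sizes to a divergence point in exon 8 is simple arithmetic, sketched below. The probe compositions and observed fragment sizes are taken from the text; the value of 250 bp is the approximate divergence point stated there. The sketch assumes that the exon 8 portion of each probe begins at the first base of exon 8, and the function and variable names are illustrative only.

```python
def protected_length(exon7_bp, exon8_bp_in_probe, divergence_bp):
    """Probe length protected by a transcript that matches the probe through
    exon 7 and only the first `divergence_bp` of exon 8."""
    return exon7_bp + min(exon8_bp_in_probe, divergence_bp)

# Probe composition (bp) and the major protected fragment observed (bp).
PROBES = {
    "PAX7-BHBX": {"exon7": 62, "exon8": 385, "observed": 310},
    "PAX3-AHc": {"exon7": 46, "exon8": 437, "observed": 295},
}
DIVERGENCE_BP = 250  # approximate point of divergence within exon 8

for name, probe in PROBES.items():
    full_length = probe["exon7"] + probe["exon8"]
    predicted = protected_length(probe["exon7"], probe["exon8"], DIVERGENCE_BP)
    print(name, full_length, predicted, probe["observed"])
```

For both probes the predicted fragment falls within a few base pairs of the observed band, which is the basis for placing the divergence point about 250 bp into exon 8.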
In addition to the predominant fragments described above, this analysis of PAX3 and PAX7 transcripts revealed several fragments that were less abundant than the predominant fragment (Fig. 2, B and C). One of these fragments in each experiment may represent unspliced exon 8 of PAX3 or PAX7. In addition, some of these less abundant protected fragments may also represent other, less frequent, alternative splices involving portions of exon 8.
Identification of PAX3 Exon 9.
In a GenBank search for PAX3 transcripts with alternative 3′ ends, we found two published expressed sequence tags (IMAGE cDNA clones 249038 and 251555) from a human foreskin melanocyte cDNA library (GenBank H82467 and H97691). These entries contain exon 8 sequence until the divergence point and then novel sequence encoding a 10-amino-acid COOH-terminal extension (PAX3B, Fig. 1B). A sequence encoding an identical COOH-terminal extension was also identified in a cloned quail Pax3 cDNA (GenBank AF000673). To determine whether this human sequence corresponds to the predominant PAX3 transcript in our tumor lines, we prepared a riboprobe construct (PAX3-APf, Fig. 1A) consisting of 46 bp of PAX3 exon 7, 247 bp of exon 8, and 199 bp of putative exon 9. When this riboprobe was hybridized to RNA from the various PAX3-expressing tumor cell lines, we detected a large protected fragment consistent with the full-length size of 492 bp (Fig. 2C, arrow 1). This finding indicates that the 3′ extension represents a frequently used ninth exon. Several smaller fragments were also detected and were most prominent in ES lines TC32 and TTC466; these bands may represent splices of exons 8 and 9 only or other alternative splices.
To localize the ninth exon and determine the size of the intervening intron, we sequenced a plasmid containing the genomic region 3′ to PAX3 exon 8 (Fig. 3B; Ref. 2). The sequence corresponding to the ninth exon was found in this plasmid and was separated from exon 8 by a 501-bp intron. In the sequence directly 5′ of exon 9, there is a pyrimidine-rich stretch and the trinucleotide CAG that corresponds to the consensus sequence for a splice acceptor site.
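The two features noted above, a pyrimidine-rich stretch followed by a terminal AG (here CAG), can be checked mechanically. The sketch below is a simplified heuristic under stated assumptions: the 15-nucleotide window and the 70% pyrimidine threshold are arbitrary illustrative choices, and the example sequence is hypothetical because the genomic sequence upstream of PAX3 exon 9 is not reproduced in the text.

```python
def pyrimidine_fraction(seq):
    """Fraction of C and T bases in the sequence."""
    seq = seq.upper()
    return sum(base in "CT" for base in seq) / len(seq)

def looks_like_splice_acceptor(intron_3prime, tract_len=15, min_pyrimidine=0.7):
    """Heuristic test of the 3' end of an intron (written 5' to 3').

    Requires the intron to end in AG and the `tract_len` nucleotides just
    upstream of the final trinucleotide to be pyrimidine-rich.
    """
    seq = intron_3prime.upper()
    if len(seq) < tract_len + 3 or not seq.endswith("AG"):
        return False
    tract = seq[-(tract_len + 3):-3]
    return pyrimidine_fraction(tract) >= min_pyrimidine

# Hypothetical intron end: a 15-nt pyrimidine tract followed by CAG.
example_intron_end = "TTTCTCTCCTTTCCTCAG"
print(looks_like_splice_acceptor(example_intron_end))
```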
Identification of PAX7 Exon 9.
Although a similar search of the GenBank database did not reveal any expressed sequence tags for PAX7, the search identified zebrafish and chicken Pax7 cDNA clones with a coding sequence that is similar to human PAX7 exon 8 up to the PAX3/PAX7 divergence point (GenBank AF014368 and D87838). After this point, the zebrafish and chicken clones encode a highly related 37-amino-acid region that does not show any similarity to the 52-amino-acid human PAX7 COOH-terminal extension. In addition to these cDNA clones, the GenBank database contains a sequence of over 100 kb from a human genomic clone (PAC 394P21) containing the PAX7 locus (GenBank AL021528). A search of the possible translation reading frames of this clone with the zebrafish and chicken Pax7 protein sequence identified a human sequence that encodes a nearly identical 37-amino-acid region. Within the PAC sequence, this region is situated 3′ of PAX7 exon 8 and in the same transcriptional orientation with respect to the other eight exons in the compiled PAC sequence.
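The search of translated reading frames described above can be illustrated with a toy six-frame scan. This is a minimal stand-in under stated assumptions, not the search tool that was used (the text does not name it): it uses the standard genetic code, exact window comparison with a mismatch allowance, and a hypothetical short DNA string and peptide in place of the PAC sequence and the 37-amino-acid query.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    return dna.upper().translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_hits(dna, peptide, max_mismatches=0):
    """Return (strand, frame, protein_position, mismatches) for each window
    in any of the six reading frames that matches `peptide`."""
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            for pos in range(len(protein) - len(peptide) + 1):
                window = protein[pos:pos + len(peptide)]
                mismatches = sum(a != b for a, b in zip(window, peptide))
                if mismatches <= max_mismatches:
                    hits.append((strand, frame, pos, mismatches))
    return hits

# Hypothetical example: the peptide MKW is encoded in forward frame 1.
print(six_frame_hits("CATGAAATGGTT", "MKW"))
```

Reporting the strand along with the frame mirrors the point made in the text that the matching region lies in the same transcriptional orientation as the other exons.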
To determine whether the human region within the PAC clone represents a ninth exon, we designed a reverse primer from this region and used it in conjunction with a forward primer from PAX7 exon 8 in an RT-PCR assay of A673 and RD RNA. This RT-PCR assay produced a single fragment corresponding to the predicted 515-bp product (data not shown). Sequence analysis demonstrated an in-frame fusion of PAX7 exon 8 to putative exon 9 at the PAX3/PAX7 divergence point (PAX7B, Fig. 1B). Examination of the genomic sequence 5′ to this putative exon confirmed the presence of a splice acceptor site. After subcloning the RT-PCR product, we prepared a riboprobe construct (PAX7-BH9R2, Fig. 1A) consisting of 62 bp of PAX7 exon 7, 246 bp of exon 8, and 179 bp of putative exon 9. Hybridization of this riboprobe to RNA from PAX7-expressing tumor lines demonstrated a large protected band that corresponds to the predicted full-length 487-bp fragment (Fig. 2B, arrow 1). Therefore, splicing of PAX7 exon 8 to this novel exon represents the predominant PAX7 transcript in these tumor lines.
To situate this ninth exon on a genomic map, we used Southern blot analysis to examine a series of genomic clones containing the PAX7 locus. Although exon 9 was not present within our contig of human bacteriophage and cosmid clones from the 3′ end of PAX7 (footnote 7), this exon was present in a series of three PAC clones including PAC 394P21, whose compiled sequence was described above. Southern blot analysis of these PAC clones demonstrated that PAX7 exons 8 and 9 are both contained within a 12-kb HindIII fragment (Fig. 3A). Fine-mapping revealed that these two exons are separated by a 9-kb intron. These mapping results are consistent with the intronic size of 8.9 kb predicted by the compiled PAC sequence.
Transcriptional Activity of PAX3 and PAX7 Products.
In previous transcriptional studies, we used a reporter gene with a model PAX3/PAX7 binding site in transient transfection assays and found a high activity for the PAX3-FKHR and PAX7-FKHR fusion proteins, low activity for the wild-type PAX3 protein, and undetectable activity for the wild-type PAX7 protein (5, 8). To compare the activity of the alternative PAX3 and PAX7 translation products with the original constructs, we assembled full-length cDNA expression constructs containing exon 9-encoded sequences. As shown, the alternative PAX3 and PAX7 products function similarly to the original products (Fig. 4). Therefore, any functional differences associated with the alternative COOH-terminal regions are not detected in this model assay system.
The previous finding of murine Pax3 and Pax7 expression in the developing neural and myogenic lineages provided the impetus to characterize expression of these transcripts in tumors related to these lineages. Our analysis of a panel of human tumor cell lines revealed concordances between the PAX3 and PAX7 expression patterns in the tumors and the corresponding embryonic lineages. Our finding of higher PAX7 expression in ERMS lines indicates a relationship to the subpopulation of myogenic precursors that preferentially express PAX7 (9, 10). The PAX3 and PAX7 expression patterns in MEL and NB are consistent with developmental expression and phenotypic findings that indicate a role for PAX3 in the developing melanocytic lineage and not in the developing sympathetic neural lineage (1, 7, 9). In particular, Pax3 but not Pax7 is expressed in some subsets of the developing murine neural crest. In addition, the melanocytic lineage is affected, and the sympathetic neural lineage is unaffected in mice and humans with heterozygous germline Pax3/PAX3 mutations. The finding of both PAX3 and PAX7 expression in most ES cell lines agrees with a previous report of PAX3 expression in this tumor category and is consistent with the proposed neuroectodermal derivation of this tumor (15). Finally, although medulloblastoma, a pediatric brain tumor, was not examined in this study, a previous analysis of tumor specimens reported occasional PAX3 expression but not PAX7 expression in this neural tumor category (16).
In addition to the relatively striking differences in PAX3 and PAX7 expression among the different tumor cell line categories, there was also variation in expression levels within the tumor cell line categories. For example, PAX7 expression varied over a 10-fold range in the ERMS cell lines. These variations in expression can be attributed to differences in the tumor cell transcriptional environment—differences due to variability in (a) the cell of origin; (b) extracellular environment; or (c) acquired genetic alterations in the tumor cells. The clinical significance of the variation is difficult to assess because the tumor cell lines were generally derived from advanced tumors and often developed additional changes during establishment and passage. Investigations of the relationship of clinical and pathological variables to the PAX3 and PAX7 expression levels within a given tumor category, therefore, will require analysis of a well-characterized group of primary tumor specimens representing the full spectrum of lesions within the category.
Using tumor cell lines, we determined that the predominant PAX3 and PAX7 transcripts do not correspond to the previously cloned versions consisting of eight exons. Instead, we found that the predominant PAX3 and PAX7 transcripts splice the eighth exons at a conserved point to previously uncharacterized ninth exons, generating alternative products with structurally distinct COOH-terminal ends. Because the published studies of Pax3 and Pax7 expression during murine development used in situ hybridization probes from cDNA regions 5′ of this splice site (7, 9, 10), the predominant transcripts expressed during myogenic and neural development remain to be determined. Our finding of expression of these alternative forms in different tumor types and the previous finding of similar transcripts in normal tissues of other species suggest that the exon 9-containing transcripts may be predominant in general. Still, another possible explanation is that there may be a relationship of these alternative transcripts to tumorigenesis in these lineages.
The functional significance of the regions encoded by the ninth exons of PAX3 and PAX7 was not revealed by standard transient transfection assays of transcriptional function. However, the finding of identical COOH-terminal regions encoded by the quail Pax3 gene and the chicken and zebrafish Pax7 genes suggests that these regions have an evolutionarily conserved function. One possibility is that these regions modulate interactions with other proteins. In this case, their functional roles may be elucidated in protein-binding screens and transcriptional assays with more complex target gene regulatory elements than the model PAX3/PAX7 binding sites used in our assays. An alternative hypothesis is that these COOH-terminal regions, and possibly the adjacent 3′ untranslated regions, may influence expression of the PAX3 and PAX7 products.
Alternative splicing has been reported for many of the genes in the mammalian paired box family. In addition to alternative splices that modulate the structure of the NH2-terminal DNA binding domains, splicing events that alter the COOH-terminal regions of the corresponding PAX proteins have been described previously (17). In the PAX3/PAX7 subfamily, alternative splicing within the paired box-encoding region has been reported in both the murine Pax3 and Pax7 genes (18). These splicing events involve the splice acceptor sites of exon 3 and either include or exclude a glutamine residue at this point. DNA binding studies revealed that the isoforms without the additional glutamine have a higher affinity for a subset of DNA binding sites that require interactions with both the NH2- and COOH-terminal portions of the paired domain. In addition to these splicing events that were assessed by quantitative expression studies, other, nonquantitative, studies identified additional alternative PAX3 and PAX7 transcripts whose abundance and significance are unknown (19, 20).
In summary, the studies presented in this report have indicated the potential utility of human tumor cell lines as model systems for investigating PAX3 and PAX7 expression and function. As evidence of this experimental utility, we have demonstrated that alternative splicing events generate the predominant PAX3 and PAX7 transcripts in these tumor lines. Additional studies are now needed to track the expression of the alternative and originally cloned transcripts during embryogenesis and to determine the functional significance of the conserved COOH-terminal regions introduced by these splicing events.
This work was supported by funds from NIH Grant CA71838 and the Dr. Louis Sklarow Memorial Fund.
5 The abbreviations used are: ARMS, alveolar rhabdomyosarcoma; ERMS, embryonal rhabdomyosarcoma; ES, Ewing’s sarcoma; NB, neuroblastoma; MEL, melanoma; RT, reverse transcription; GAPDH, glyceraldehyde 3-phosphate dehydrogenase.
6 Database can be found at www.sanger.ac.uk/HGP/Chr1/.
7 J. C. Fitzgerald, A. H. Scherr, and F. G. Barr. Structural analysis of PAX7 rearrangements in alveolar rhabdomyosarcoma. Cancer Genet. Cytogenet., in press.
We are grateful for the technical assistance provided by Michelle Macris, Donna Strzelecki, Oana Tomescu, and Dr. Peter White.