Abstract
Studies of alternative RNA splicing (ARS) have the potential to provide an abundance of novel targets for development of new biomarkers and therapeutics in oncology, which will be necessary to improve outcomes for patients with cancer and mitigate cancer disparities. ARS, a key step in gene expression enabling individual genes to encode multiple proteins, is emerging as a major driver of abnormal phenotypic heterogeneity. Recent studies have begun to identify RNA splicing–related genetic and genomic variation in tumors, oncogenes dysregulated by ARS, RNA splice variants driving race-related cancer aggressiveness and drug response, spliceosome-dependent transformation, and RNA splicing–related immunogenic epitopes in cancer. In addition, recent studies have begun to identify and test, preclinically and clinically, approaches to modulate and exploit ARS for therapeutic application, including splice-switching oligonucleotides, small molecules targeting RNA splicing or RNA splice variants, and combination regimens with immunotherapies. Although ARS data hold such promise for precision oncology, inclusion of studies of ARS in translational and clinical cancer research remains limited. Technologic developments in sequencing and bioinformatics that permit investigation of clinically relevant ARS events are being routinely incorporated into clinical oncology, yet ARS remains largely overlooked, either because of a lack of awareness within the clinical oncology community or because of the perceived technical complexity of analyzing ARS. This Perspective aims to increase such awareness, propose immediate opportunities to improve identification and analysis of ARS, and call for bioinformaticians and cancer researchers to work together to address the urgent need to incorporate ARS into cancer biology and precision oncology.
Translational Relevance
The path forward for translational cancer research and clinical practice in oncology is promising, as drivers of tumor biological diversity remain underexplored. One of the underexplored mechanisms, for which there is emerging evidence that it plays a critical role in cancer heterogeneity, aggressiveness, and therapeutic response, is alternative RNA splicing (ARS). There is also emerging evidence for agents to target and exploit ARS for therapeutic application. Despite the indications that ARS plays such critical roles in cancer, most translational and clinical cancer research focuses on mutation and aggregate gene expression. Increasing awareness of the significance of ARS to cancer and coalescence of ARS bioinformatics and cancer biology have the potential to increase incorporation of ARS into biomarker and drug development in oncology. Ultimately, this has the potential to lead to new precision medicine interventions that are likely to improve outcomes for patients with cancer and mitigate cancer disparities among racial groups.
The widespread adoption of genomic profiling of human tumors is now providing information to researchers, patients, and providers, and influencing translational research and clinical practice (1, 2). However, studies to date have largely focused on actionable mutations and aggregate gene expression and have predominantly included patients of European ancestry (3, 4). As a result, these efforts may have missed drivers of cancer biological and clinical heterogeneity among patients of different ancestries that have the potential to aid in the development of new diagnostic and therapeutic interventions.
As our understanding of the molecular etiology of cancer has evolved past the “initiation–promotion” paradigm, we are increasingly appreciating the importance of transcriptional reprogramming in early- and late-stage tumor evolution (5). For cancers with a long developmental history, such as breast, colorectal, and prostate cancer, the mutation burden reflects mostly late accumulation events, raising the question of whether mutations or other genetic alterations are the early oncogenic drivers. Interestingly, it is for these same cancers that some of the most striking disparities in incidence and outcome among patients of different ancestries have been repeatedly demonstrated. Here, we draw attention to the emergence of novel aspects of another level of clinically relevant genomic complexity that has the potential to explain more clearly the dynamic diversity in human tumor biology: alternative RNA splicing (ARS; recently reviewed by Urbanski and colleagues in ref. 7).
ARS is a key step in gene expression in higher eukaryotes. Humans share 99% similarity with chimpanzees by DNA sequence, but less than 60% by alternatively spliced exons (8). The current theory as to how such striking diversity can exist so late in evolution is the unique ability of ARS to provide a modular, low-risk mechanism of protein diversification in risk-averse higher organisms (9). Given the importance of ARS to evolutionary biological diversity, it could also be reasonably speculated that ARS likely drives tumor-related biological diversity. Indeed, oncogenes dysregulated by ARS, but not by mutation, have been identified (e.g., BARD1; ref. 10).
ARS is the physiologic process that creates different RNA variants from the same sequence of DNA (11). It is regulated by cis-acting splicing elements (nucleotide sequences or motifs) that recruit trans-acting splicing factors (proteins or RNAs) that enhance or silence the use of splice sites. Variation in cis-acting splicing elements, differential expression of trans-acting splicing factors, or mutation in genes encoding components of the RNA splicing machinery can all alter ARS and result in disease, including cancer (12). In addition, noncanonical RNA splicing events can result in aberrant RNAs (i.e., not normally expressed in healthy tissues or cells) in pathophysiologic states (13).
Analyses of tumors highlight the magnitude of putative actionable ARS alterations that have yet to undergo characterization in patients, as half of such tumors harbor ARS-altering single-nucleotide variants (14). The frequency of these alterations raises the question of whether “mutations of unknown significance” might drive changes in ARS. Several examples of the role of ARS in tumor biology have been recently reviewed (7). We have shown that discrepant probe set changes within the same gene, thought to be “noise” on microarrays intended to measure aggregate gene expression, are often a signal of changes in ARS (15, 16). The ability to detect isoform-specific mRNA changes within expression data suggests that any physiologic state characterized by significant differences in gene expression is likely to exhibit comparable changes in more nuanced metrics of alternative mRNA processing and pre-mRNA splicing.
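The idea that within-gene probe set discrepancies can carry signal rather than noise can be illustrated with a minimal Python sketch. It is not the method used in refs. 15 and 16; the function name `flag_discordant_genes`, the gene labels, the log2 fold-change values, and the threshold of 1.0 are all hypothetical and chosen only for illustration. A gene is flagged when its probe sets disagree with one another by more than the threshold, which would be averaged away in an aggregate expression measure.

```python
from typing import Dict, List


def flag_discordant_genes(
    log2fc_by_gene: Dict[str, List[float]],
    threshold: float = 1.0,  # hypothetical cutoff, not taken from the source
) -> List[str]:
    """Return genes whose probe sets change discordantly between two conditions.

    Each gene maps to the log2 fold changes of its individual probe sets.
    A large spread between probe sets of the same gene is treated here as a
    candidate splicing change rather than as measurement noise.
    """
    flagged = []
    for gene, changes in log2fc_by_gene.items():
        if len(changes) < 2:
            continue  # a single probe set cannot be discordant with itself
        spread = max(changes) - min(changes)
        if spread >= threshold:
            flagged.append(gene)
    return sorted(flagged)


# Hypothetical data: three genes with per-probe-set log2 fold changes.
example = {
    "GENE_A": [1.2, 1.1, 1.3],   # probe sets move together: aggregate change
    "GENE_B": [0.1, -1.4, 0.2],  # one probe set diverges: candidate ARS event
    "GENE_C": [0.05],            # only one probe set: not assessable
}

candidates = flag_discordant_genes(example)
print(candidates)
```

In this toy input only `GENE_B` is flagged, even though its average change is small, whereas `GENE_A` shows a large but uniform change that an aggregate expression analysis would already capture.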
Work from our laboratories and others has begun to highlight the importance of ARS in cancer biology and cancer disparities (19, 20) and demonstrates that dysregulation of ARS may be a principal feature differentiating cancers from their host tissues of origin (21). In prostate cancer, a role of ARS is emerging in association with local (e.g., SRPK1, which regulates ARS of VEGF, associates with local prostate cancer stage and invasion; ref. 22) and distant (e.g., transcriptome-wide changes in ARS associated with metastatic colonization; ref. 23) disease progression. Our team participated in a multi-institutional study demonstrating differences in expression of RNA splice variants between prostate cancer in African American and White patients. Approximately one-third of the variants enriched in prostate cancer in African American patients were likewise present in patient-matched normal prostate specimens, indicating germline origin and potential clinical significance as biomarkers (19). The number of differentially expressed, ancestry-related RNA splice variants far exceeded the aggregate gene expression differences in the same tissues. Ancestry-specific prostate cancer cell lines and xenografts were used to demonstrate the functional significance of these RNA splice variants in driving ancestry-related prostate cancer aggressiveness and influencing drug responses to targeted therapeutics. As one example of the power of this comparative spliceomics (24) approach, Phosphatidylinositol-4,5-bisphosphate 3-Kinase delta (PI3Kδ) was identified as a novel driver of prostate cancer aggressiveness and RNA splice variants of PI3Kδ were discovered with distinct functions that serve as biomarkers of drug response. Studies in metastatic prostate cancer suggest that aberrant RNA splicing may play roles in progression (25) and studies have identified high-frequency tumor-associated differences in ARS in breast, liver, and lung cancer (26).
Furthermore, the androgen receptor (AR), a driver of prostate cancer progression and treatment target, undergoes aberrant RNA splicing with predictive and prognostic treatment implications in castration-resistant disease (27). Additional examples of the role of ARS in cancer are emerging in the dysregulation of tumor-suppressive genes and oncogenes, including TP53, BARD1, AR, and BCL2 (10), and oncogenes, including MYC, appear to rely on the spliceosome to drive transformation (28). In fact, ARS has been causally demonstrated across all of the hallmarks of cancer (20). Recently, the plastic nature of ARS and the bridge between ARS and therapeutic effect have been demonstrated with the discovery that ionizing radiation induces senescence through ARS of TP53 (29), and that hypoxia, a fundamental driver of both chemotherapy and radiation resistance, regulates ARS of genes involved in the hallmarks of cancer in breast cancer cells (30).
Germline or somatic genetic variation in cis-acting splicing elements has also been found to associate with cancer risk and prognosis. We have identified associations between germline single-nucleotide polymorphisms predicted to regulate RNA splicing of stemness genes and disparities in prostate cancer risk and prostate cancer survival (31, 32). Work focusing on somatic mutations in BRCA1 has shown that, in African American women, 24% of mutations are associated with cis-acting splicing elements, a greater proportion than in women of other ancestries (33). In addition, others have observed higher rates of germline “variants of uncertain significance” in African Americans as compared with Whites with early onset breast cancer (34), suggesting that ARS might be relevant to disease as a function of ancestry. Somatic mutations in genes encoding core units of the spliceosome have been identified in cancers (35). Dysregulated trans-acting splicing factors have also been identified; these factors have roles in genomic stability (via inhibition of destabilizing RNA:DNA complexes; ref. 36) and are overexpressed in breast, colon, and lung tumors (37). In breast cancer, an appreciation of trans-acting splicing factors as drivers of progression is emerging, with such factors being differentially expressed during progression (38).
Therapeutic approaches to manipulate ARS, correct aberrant RNA splicing, or produce novel RNA splice variants are being developed and tested in human clinical trials. Splice-switching oligonucleotides (SSO) can modulate pre-mRNA splicing by binding to target pre-mRNAs and blocking access of the RNA splicing machinery to a particular splice site (39). Thus, SSOs can simultaneously limit production of pathogenic variants and induce expression of variants with therapeutic value, as reported in spinal muscular atrophy (40), leading to the first FDA-approved splicing-targeted therapy (Spinraza) in December 2016. Additional SSOs exhibit therapeutic potential in mouse models of disease, including cancer (41). These successes dovetail with advances in RNA therapeutic delivery (42). In addition to SSO-based approaches, studies have used phenotypic screens and splicing-specific reporters to conduct high-throughput screens of small molecules and have identified modulators of RNA splicing, including those with activity in cancer cells (43, 44). A small-molecule modulator of RNA splicing is in clinical trials for spinal muscular atrophy (45). Despite such proofs of principle, relatively limited effort focuses on adopting these technologies in cancer drug development. Much as in current targeted therapy approaches, it is likely that the ultimate efficacy of any proposed “splice targeted” therapy will strongly depend on the hallmark of cancer (5) and gene-specific splicing profile under consideration.
ARS is also likely a mechanism generating immunogenic epitopes in cancer and a predictive indicator of immunogenic diversity. Examples of ARS driving immunogenic potential date back 20 years, but further pursuit has not occurred in the immune checkpoint therapy era (46). Molecular analyses of melanoma support the potential for ARS to affect immunotherapy; for example, melanomas that have mutations in the RNA splicing regulator RNA Binding Motif protein, X-linked Like 1 (RBMXL1) may have corresponding widespread ARS (47), although the prevalence of mutated RBMXL1 may be low (∼8%; ref. 48). It has been confirmed that novel alternatively spliced gene fusion products may provide novel immunogenic epitopes (49, 50). Furthermore, interventions to drive ARS may synergize with immune checkpoint inhibitors. Small-molecule and drug screens have identified both new and existing RNA splicing modulators, such as digoxin (51), although the efficacy of such agents in combination with immunotherapies remains untested.
Despite the significance of ARS to cancer, clinically oriented reviews of cancer biomarkers, therapeutics, and profiling of tumor heterogeneity often fail to mention or only peripherally reference RNA splicing (52–54), suggesting that this aspect of genomic regulation has remained outside the mainstream of discussions of clinical cancer genomics. We are only now starting to appreciate the translational importance of ARS in cancer; for example, patients having exon 14 splice site alterations in MET exhibit positive clinical response to MET inhibitors (55). These examples of missed “hits” suggest that many RNA splice variants with potential as targets in precision oncology have yet to be discovered. ARS can yield targets relevant to all aspects of precision oncology. As described herein and shown in Fig. 1, RNA splice variants can preexist in normal cells and persist following transformation or can be expressed de novo in cancer cells. Such RNA splice variants and variation in cis-acting splicing elements can serve as biomarkers. RNA splice variants can serve as targets for RNA-targeted therapeutics, including SSOs and RNA-targeted small molecules. The proteins encoded by RNA splice variants and trans-acting splicing factors can serve as targets for protein-targeted therapeutics, including protein-targeted small molecules. RNA splice variants and their encoded proteins can also serve as neoantigens.
Roles of RNA splicing events and RNA splice variants in precision oncology. Genetic variation in cis-acting splicing elements in different populations can result in expression of alternative RNA splice variants, as exemplified by pre-mRNA #1. Some of these can be oncogenic RNA splice variants that preexist in normal cells and persist in cancer cells, as exemplified by pre-mRNA #1. Alterations that occur during transformation, for example, differential expression of trans-acting splicing factors, can result in oncogenic RNA splice variants that arise de novo in cancer cells, as exemplified by pre-mRNA #2. Such RNA splicing events and RNA splice variants can be biomarkers, therapeutic targets, and/or neoantigens. Ultimately, such RNA splicing events and RNA splice variants can influence cancer aggressiveness and drug response. Solid lines within pre-mRNAs, RNA splicing patterns. E, exon; I, intron; Joined Es depict RNA splice variants and schematics below joined Es depict corresponding encoded protein isoforms; Gray oval, nucleus; Red letters, single-nucleotide polymorphism in cis-acting splicing element; SF, trans-acting splicing factor.
There are several likely reasons that ARS has not risen to the forefront of translational research, despite its enormous potential. ARS is complex, and related analyses must specify details of the structures of the events and reference this information with respect to the relative abundance of one RNA variant to another within the same gene. Exon-level annotation is highly variable by data source. Quantifying ARS requires RNA splice variant ratios or other metrics whose definitions are not standardized. Finally, the distinction between RNA splice variant–specific expression and overall expression is not always made, and changes may in some circumstances be more accurately described as mRNA transcript–specific changes in abundance.
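To make the notion of a splice variant ratio concrete, the following Python sketch computes one widely used ratio, percent spliced in (PSI), for a single cassette exon. PSI is offered here as an assumption for illustration; the text does not name a specific metric, and other definitions exist. The function name `percent_spliced_in`, the junction read counts, and the minimum-coverage cutoff are hypothetical. Inclusion support is taken as the mean of the two exon-flanking junction counts, and exclusion support as the count of the exon-skipping junction.

```python
from typing import Optional


def percent_spliced_in(
    inc_upstream: int,
    inc_downstream: int,
    skipping: int,
    min_reads: int = 10,  # hypothetical coverage cutoff
) -> Optional[float]:
    """Estimate PSI for a cassette exon from junction-spanning read counts.

    inc_upstream / inc_downstream: reads across the two junctions that join
    the cassette exon to its neighbors (evidence for inclusion).
    skipping: reads across the junction that bypasses the exon (evidence
    for exclusion).
    Returns None when coverage is too low for a meaningful ratio.
    """
    inclusion = (inc_upstream + inc_downstream) / 2.0
    total = inclusion + skipping
    if total < min_reads:
        return None
    return inclusion / total


# Hypothetical counts: total junction support is the same in both samples,
# so an aggregate expression measure would see little change.
psi_normal = percent_spliced_in(80, 80, 20)
psi_tumor = percent_spliced_in(20, 20, 80)
delta_psi = psi_tumor - psi_normal

print(f"PSI normal: {psi_normal:.2f}")
print(f"PSI tumor:  {psi_tumor:.2f}")
print(f"delta PSI:  {delta_psi:.2f}")
```

The two hypothetical samples have identical total junction support, yet the ratio shifts from mostly inclusion to mostly exclusion, which is the distinction between variant-specific and overall expression drawn above.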
The technical challenges of analyzing ARS are not trivial. Standardized computational approaches to analyzing these data do not exist. Sequence-based approaches are typically described as structural or count-based (56). Count-based approaches require selecting a database to provide the coordinates or “bins” with which to quantify exon abundance, and can produce variable results depending on bin definition. Thus, the same software, using a different reference genome or alignment, can produce different results. Liu and colleagues compared the ability of current RNA-seq–based methods to detect ARS within a heat shock dataset in plants (56). Not a single gene was detected as alternatively spliced by all seven programs included in the analysis, underscoring the need to understand the relative strengths and limitations of various ARS analysis methods. The application of novel bioinformatics techniques to existing data with an ARS focus is resulting in substantial advances in understanding tumor genomic heterogeneity (57, 58), and efforts are underway to better understand how ARS interrelates to other genomic phenomena including long noncoding RNAs, miRNAs, and protein translation (59). Although we focused on the role of ARS of mRNAs, it is important to note that long noncoding RNAs have been demonstrated to undergo, as well as regulate, ARS (60, 61). Finally, it should be noted that emerging technologies such as single-molecule real-time isoform sequencing (i.e., “third generation sequencing”) are being used in conjunction with the commercial RNA-seq platforms. This technology and companion software permit comprehensive analysis of entire molecules and variants of RNA (messenger, noncoding, circular, etc.; ref. 62).
This technology holds much potential for the future of ARS analyses; however, its present utility in clinical oncology remains limited, given that it is not incorporated in clinically used genomic assays in oncology and its analytic performance in this setting remains to be confirmed.
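The dependence of count-based results on bin definition, noted above, can be shown with a small Python sketch. It does not reproduce any of the programs compared in ref. 56; the function `count_reads_per_bin`, the read coordinates, and the two annotations are hypothetical. The same aligned reads are counted against two annotations that describe the same region, one of which splits the first exon into two bins.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)


def count_reads_per_bin(
    reads: List[Interval], bins: Dict[str, Interval]
) -> Dict[str, int]:
    """Count reads overlapping each exon bin.

    A read that spans a bin boundary is counted toward every bin it
    overlaps, which is one of several possible counting conventions.
    """
    counts = {name: 0 for name in bins}
    for r_start, r_end in reads:
        for name, (b_start, b_end) in bins.items():
            if r_start < b_end and b_start < r_end:  # intervals overlap
                counts[name] += 1
    return counts


# Hypothetical aligned reads for one gene.
reads = [(100, 150), (140, 190), (180, 230), (260, 310), (300, 350)]

# Two hypothetical annotations of the same region: annotation B splits
# exon1 at position 160 because it records an alternative splice site.
annotation_a = {"exon1": (100, 200), "exon2": (250, 350)}
annotation_b = {"exon1": (100, 160), "exon1b": (160, 200), "exon2": (250, 350)}

counts_a = count_reads_per_bin(reads, annotation_a)
counts_b = count_reads_per_bin(reads, annotation_b)
print(counts_a)
print(counts_b)
```

With identical reads, the count for `exon1` differs between the two annotations and a new bin appears, so any downstream ratio or differential test would start from different numbers depending solely on the annotation chosen.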
We suggest that the key factors that have limited incorporation of ARS in genome-wide studies within the clinical oncology community are lack of awareness, cost, and the technical complexity of analysis and interpretation. We hope that this Perspective and ongoing research will increase awareness. Fortunately, the cost of such analyses continues to decrease. The largest barrier is the technical complexity of analysis and interpretation. We call for attention to spliceomics and the need for increased collaboration between bioinformaticians and cancer biologists to develop improved methods to identify and analyze ARS. Of particular value would be the expansion of RNA-Seq software to include analyses of ARS in parallel to standard gene expression pipelines, which would greatly reduce the current time and technical barriers that investigators face in examining RNA splicing. Such software should also provide pathway analysis and analysis of factors that regulate ARS, and should be accessible without sophisticated bioinformatics expertise (63). Finally, there are immediate opportunities to standardize variant names, exon descriptions and numbering, and the approaches that report RNA splicing events.
In summary, ARS is a principal driver of biological diversity and plays a role in every hallmark of cancer, yet is rarely examined in profiling of tumors and is largely overlooked in biomarker and drug development in oncology. We believe the primary barrier to taking advantage of this plethora of potentially actionable data is the difficulty of analyzing ARS data, and we call for a partnership between bioinformaticians and cancer researchers to address this need. Although these analyses require substantial time and have a steep learning curve, such efforts are likely to address unmet challenges in cancer biology, cancer disparities, and patient care.
Disclosure of Potential Conflicts of Interest
T.J. Robinson, J.A. Freedman, B. LaCroix, B.M. Patierno, D.J. George, and S.R. Patierno are listed as co-inventors on a patent application involving RNA splicing targeted therapeutics that is owned by Duke University. It has not been licensed at the time of this disclosure.
Authors' Contributions
Conception and design: T.J. Robinson, J.A. Freedman, M. Al Abo, D.J. George, S.R. Patierno
Development of methodology: T.J. Robinson
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): T.J. Robinson, J.A. Freedman, B. LaCroix, B.M. Patierno
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): T.J. Robinson, J.A. Freedman, B.M. Patierno, S.R. Patierno
Writing, review, and/or revision of the manuscript: T.J. Robinson, J.A. Freedman, M. Al Abo, A.E. Deveaux, B.M. Patierno, D.J. George, S.R. Patierno
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J.A. Freedman, B. LaCroix, D.J. George
Study supervision: J.A. Freedman, S.R. Patierno
Acknowledgments
This work was partially supported by an RSNA Resident Research Grant (to T.J. Robinson, principal investigator and S.R. Patierno, mentor), a DoD Prostate Cancer Research Program Health Disparity Research Award (PC131972, to S.R. Patierno, principal investigator and J.A. Freedman, coinvestigator), an NIH Feasibility Studies to Build Collaborative Partnerships in Cancer Research P20 Award (1P20-CA202925-01A1, to S.R. Patierno, overall principal investigator and J.A. Freedman, principal investigator of Pilot Project One), and an NIH Basic Research in Cancer Health Disparities R01 Award (R01CA220314, to S.R. Patierno, principal investigator and J.A. Freedman, coinvestigator).