Nine percent of the disease-causing mutations classified in HGMD have the potential to affect RNA splicing. Splicing variants called from DNA sequence data, especially those outside the canonical splice sites, are difficult to evaluate clinically because they often have no effect on protein expression. A method was developed that uses RNA-Seq data to classify splice site variants. RNA reads aligned to the region of a splice site variant are examined for RNA read-through, which indicates that normal splicing at the site is disrupted.
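The read-through check described above can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything here is an assumption made for illustration: reads are represented as lists of aligned blocks in 0-based half-open coordinates, the site is a donor-style exon/intron boundary, and the names `classify_read`, `read_through_fraction`, and the `min_overhang` threshold are hypothetical.

```python
from typing import List, Optional, Tuple

# One aligned block of a read, 0-based half-open [start, end).
# A spliced read has two or more blocks separated by an intron gap.
Block = Tuple[int, int]


def classify_read(blocks: List[Block], boundary: int, min_overhang: int = 3) -> str:
    """Classify a read relative to an exon/intron boundary (assumed donor site).

    `boundary` is the coordinate of the first intronic base, so an exonic
    block that is spliced normally ends exactly at `boundary`.
    `min_overhang` is a hypothetical threshold, not taken from the abstract.
    """
    for i, (start, end) in enumerate(blocks):
        # A single contiguous block covering bases on both sides of the
        # boundary means the intron was not removed in this read.
        if start <= boundary - min_overhang and end >= boundary + min_overhang:
            return "read_through"
        # A block ending exactly at the boundary, followed by another block,
        # means the read was spliced at the expected site.
        if end == boundary and i + 1 < len(blocks):
            return "spliced"
    return "uninformative"


def read_through_fraction(
    reads: List[List[Block]], boundary: int, min_overhang: int = 3
) -> Optional[float]:
    """Fraction of informative reads showing read-through, or None if none are informative."""
    counts = {"read_through": 0, "spliced": 0, "uninformative": 0}
    for blocks in reads:
        counts[classify_read(blocks, boundary, min_overhang)] += 1
    informative = counts["read_through"] + counts["spliced"]
    if informative == 0:
        return None
    return counts["read_through"] / informative
```

In practice the blocks would come from an aligner's output, and a real implementation would also need to handle acceptor sites, strand, and base quality; those are left out to keep the sketch short.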

Thousands of paired tumor and normal TCGA DNA and RNA samples were used to evaluate the algorithm and to rule out alternative splicing in the tissue as an explanation for RNA read-through. Splicing variants can also alter exon usage and thereby RNA isoform expression, which can be detected by examining isoform expression levels. An analysis of splicing variants in the TCGA project datasets generated sets of confirmed functional and non-disruptive splicing variants. This splicing dataset can be used to test and refine splicing defect prediction algorithms.
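One way to picture the tissue comparison is to ask how unusual a variant carrier's read-through level is relative to samples of the same tissue that lack the variant. The sketch below is only an assumed illustration, not the authors' procedure: it takes the fraction of informative reads showing read-through at the site in the carrier, compares it with the same fraction in non-carrier samples, and returns an empirical p-value. The function name and the example values are hypothetical.

```python
from typing import Sequence


def empirical_p_value(carrier_fraction: float, background: Sequence[float]) -> float:
    """Empirical one-sided p-value for a carrier's read-through fraction.

    `background` holds read-through fractions at the same site in samples of
    the same tissue that do not carry the variant (assumed input). A small
    value suggests that the read-through is not explained by alternative
    splicing normally present in that tissue.
    """
    # Count background samples with read-through at least as high as the carrier.
    exceed = sum(1 for f in background if f >= carrier_fraction)
    # Add-one correction so the estimate is never exactly zero.
    return (1 + exceed) / (1 + len(background))


# Hypothetical values for illustration only.
background_fractions = [0.0, 0.02, 0.05, 0.01]
p = empirical_p_value(0.4, background_fractions)
```

With only a handful of background samples the smallest attainable p-value is large, which is one reason a cohort with thousands of samples is useful for this kind of comparison.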

Citation Format: Jim Lund, Shannon Bailey, Sharvari Gujja, Jeffrey Gulcher. Using RNA expression data for splicing variant assessment and confirmation [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 5308.