Background: Detecting gene fusions (GFs) from RNA with sequencing technologies shows promise for cancer diagnosis and treatment. Major obstacles to this approach include target design and the lack of well-curated databases of RNA breakpoints. Current off-the-shelf designs target full transcripts, which generates large and costly amounts of data. Here we study an alternative to whole-exome sequencing (WES): directly targeting known GFs from RNA with probes designed against the fusion junction sequence. In particular, we present a novel algorithm that designs probes to accurately target the desired fusions from RNA.

Methods: For a given GF detected from either DNA or RNA, the algorithm proceeds as follows: (1) collect gene and isoform information for both partners from seven public databases; (2) for each candidate pair of isoforms, locate where the breakpoints will be observed and assign a score based on criteria such as sequence completion, coding information, transcript support level, and % identity with and % visible on hg38; (3) select the top-scoring pair of transcripts and extract the chimeric probe sequence. Two sets of probes designed with this protocol, targeting 524 and 1632 known GFs, were synthesized and tested on several samples (Table 1). The Agilent SureSelect Human All Exon V6 capture kit was used to compare targeting efficiency against WES.
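Steps (2) and (3) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Isoform` fields, the scoring weights, and the 120-nt default probe length are all assumptions chosen for illustration, and the abstract does not state how the individual criteria are combined.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Isoform:
    """Hypothetical record for one candidate transcript of a fusion partner."""
    transcript_id: str
    sequence: str         # spliced transcript sequence, 5' to 3'
    breakpoint: int       # offset in `sequence` where the junction falls
    complete: bool        # sequence completion
    coding: bool          # coding information
    support_level: int    # transcript support level, 1 (best) to 5 (worst)
    identity_hg38: float  # % identity with hg38, 0 to 100
    visible_hg38: float   # % visible on hg38, 0 to 100


def isoform_score(iso: Isoform) -> float:
    """Combine the criteria into one number (weights are illustrative only)."""
    score = 2.0 if iso.complete else 0.0
    score += 1.0 if iso.coding else 0.0
    score += (5 - iso.support_level) / 4.0   # 1.0 for level 1, 0.0 for level 5
    score += iso.identity_hg38 / 100.0
    score += iso.visible_hg38 / 100.0
    return score


def select_best_pair(five_prime, three_prime):
    """Score every (5' isoform, 3' isoform) pair and return the top-scoring one."""
    return max(
        product(five_prime, three_prime),
        key=lambda pair: isoform_score(pair[0]) + isoform_score(pair[1]),
    )


def chimeric_probe(five: Isoform, three: Isoform, length: int = 120) -> str:
    """Build a probe spanning the junction: the end of the 5' partner up to its
    breakpoint, followed by the 3' partner from its breakpoint onward."""
    half = length // 2
    left = five.sequence[max(0, five.breakpoint - half):five.breakpoint]
    # If the 5' side is shorter than half, take more bases from the 3' side.
    right = three.sequence[three.breakpoint:three.breakpoint + length - len(left)]
    return left + right
```

Centering the junction in the probe is one plausible design choice, since it leaves comparable sequence on each side for hybridization; the actual placement used by the authors is not described in the abstract.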

Results: Targeted enrichment of a SeraSeq control showed a 5- to 20-fold increase in supporting evidence over WES. On 10 clinical samples, we observed a 10- to 30-fold increase in supporting reads. Sensitivity was higher in both cases.
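The fold increases quoted above are ratios of normalized supporting-read counts. A small Python sketch of that arithmetic follows, using three SeraSeq rows from Table 1 whose six values are all present; the function name and the choice of the WES (A) replicate as baseline are assumptions made for illustration.

```python
def fold_enrichment(targeted: float, wes: float) -> float:
    """Ratio of supporting reads per million: targeted capture over WES."""
    if wes <= 0:
        raise ValueError("WES baseline must be positive to compute a ratio")
    return targeted / wes


# (WES (A), 524 TF (A), 1632 TF (A)) taken from complete rows of Table 1.
table_rows = {
    "CD74→ROS1": (35, 592, 246),
    "EGFR→SEPTIN14": (18, 234, 91),
    "TFG→NTRK1": (35, 275, 128),
}

for fusion, (wes, tf524, tf1632) in table_rows.items():
    print(
        f"{fusion}: 524-probe set {fold_enrichment(tf524, wes):.1f}x, "
        f"1632-probe set {fold_enrichment(tf1632, wes):.1f}x"
    )
```

Rows with a missing WES value cannot be handled this way, which is why the function rejects a zero or negative baseline rather than returning an infinite ratio.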

Conclusion: We developed a novel algorithm capable of accurately identifying the most likely location of an RNA fusion junction and generating the probe sequences for oligo synthesis. This method not only enriches for more supporting data but also reduces the associated costs.

Table 1. Average number of supporting reads per fusion per million reads for WES and direct targeting

Sample | Known Fusion | WES (A) | WES (B) | 524 TF (A) | 524 TF (B) | 1632 TF (A) | 1632 TF (B)
SeraSeq 0710-0496 CCDC6→RET 10 270 355 81 78 
 CD74→ROS1 35 81 592 832 246 250 
 EGFR→SEPTIN14 18 17 234 315 91 80 
 FGFR3→BAIAP2L1 14 428 326 125 72 
 FGFR3→TACC3 23 861 879 270 203 
 LMNA→NTRK1 23 11 215 280 71 67 
 PAX8→PPARG 29 19 193 246 66 62 
 SLC34A2→ROS1 10 22 176 425 89 142 
 SLC45A3→BRAF 11 15 433 420 141 105 
 TFG→NTRK1 35 40 275 377 128 132 
 TMPRSS2→ERG 559 348 170 79 
 TPM3→NTRK1 15 23 246 359 106 117 
Avg. SeraSeq 12 Fusions 18 21 373 430 132 116 
Clinical S1 EML4→ALK 25 15 344 NA 81 NA 
Clinical S2 EWSR1→FLI1 57 38 514 NA 111 NA 
Clinical S3 TES→MET 20 30 115 NA 75 NA 
Clinical S4 EZR→ROS1 31 1,059 NA 266 NA 
Clinical S5 SDC4→ROS1 3,354 NA 1,082 NA 
Clinical S6 SH3BP5→PPARG NA NA 
Clinical S7 H2BC21→NTRK1 18 NA NA 
Clinical S8 COL1A1→PDGFB 209 250 5,530 NA 2,289 NA 
Clinical S9 KIF5B→RET 24 24 437 NA 174 NA 
Clinical S10 POC1B→GLI1 15 416 NA 161 NA 
Avg. Clinical 10 Fusions 36 40 1,179 NA 424 NA 

Citation Format: Christophe N. Magnan, Steven P. Rivera, Fernando J. Lopez-Diaz, Chen-Yin Ou, Kenneth B. Thomas, Hyunjun Nam, Lawrence M. Weiss, Segun C. Jung, Vincent A. Funari. An efficient probe design algorithm for direct fusion targeting from RNA [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 241.