Abstract
Background: Sequencing-based detection of gene fusions (GFs) from RNA shows promise for cancer diagnosis and treatment. Major obstacles to this approach include target design and the lack of well-curated databases of RNA breakpoints. Current off-the-shelf designs rely on full-transcript targeting, which generates large and costly volumes of data. Here we study direct targeting of known GFs from RNA, using probes designed against the fusion junction sequence, as an alternative to whole-exome sequencing (WES). In particular, we present a novel algorithm that designs probes to accurately target the desired fusions from RNA.
Methods: For a given GF detected from either DNA or RNA, the algorithm proceeds as follows: (1) collect gene and isoform information for both partners from seven public databases; (2) for each candidate pair of isoforms, locate where the breakpoints will be observed and assign a score based on criteria such as sequence completion, coding information, transcript support level, percent identity with hg38, and percent visible on hg38; (3) select the top-scoring pair of transcripts and extract the chimeric probe sequence. Two sets of probes extracted with this protocol, targeting 524 and 1632 known GFs, were synthesized and tested on several samples (Table 1). The Agilent SureSelect Human All Exon V6 capture kit was used to compare targeting efficiency against WES.
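Steps (2) and (3) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Isoform` fields, the scoring weights, the additive pair score, and the default probe length of 120 bases are all assumptions chosen for illustration, and database collection in step (1) is replaced by hand-written toy records.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Isoform:
    """Hypothetical record for one candidate transcript of a fusion partner."""
    transcript_id: str   # illustrative identifier, not a real accession
    sequence: str        # spliced transcript sequence, 5' to 3'
    junction: int        # 5' partner: bases kept from the start;
                         # 3' partner: index of the first base kept
    complete: bool       # sequence completion
    coding: bool         # coding information available
    tsl: int             # transcript support level, 1 (best) to 5 (worst)
    pct_identity: float  # percent identity with hg38, 0-100
    pct_visible: float   # percent visible on hg38, 0-100


def isoform_score(iso: Isoform) -> float:
    """Combine the criteria into one number (weights are assumptions)."""
    score = 2.0 if iso.complete else 0.0
    score += 2.0 if iso.coding else 0.0
    score += (5 - iso.tsl) / 4.0          # maps TSL 1..5 onto 1.0..0.0
    score += iso.pct_identity / 100.0
    score += iso.pct_visible / 100.0
    return score


def pair_score(five: Isoform, three: Isoform) -> float:
    """Score a candidate isoform pair as the sum of its two isoform scores."""
    return isoform_score(five) + isoform_score(three)


def chimeric_probe(five: Isoform, three: Isoform, length: int = 120) -> str:
    """Join the bases flanking the junction into one probe sequence."""
    half = length // 2
    left = five.sequence[max(0, five.junction - half):five.junction]
    right = three.sequence[three.junction:three.junction + (length - half)]
    return left + right


def design_probe(five_isoforms, three_isoforms, length: int = 120):
    """Pick the top-scoring pair and return it with its probe sequence."""
    best = max(product(five_isoforms, three_isoforms),
               key=lambda pair: pair_score(*pair))
    return best, chimeric_probe(*best, length=length)


# Toy input: two candidate isoforms for the 5' partner, one for the 3' partner.
five = [
    Isoform("five_A", "A" * 10 + "C" * 10, 20, True, True, 1, 100.0, 100.0),
    Isoform("five_B", "G" * 20, 20, False, False, 5, 90.0, 80.0),
]
three = [
    Isoform("three_X", "T" * 5 + "G" * 15, 5, True, True, 2, 100.0, 100.0),
]
(best_five, best_three), probe = design_probe(five, three, length=8)
print(best_five.transcript_id, best_three.transcript_id, probe)
```

Centering the probe on the junction gives each partner an equal share of the probe length; a real design would likely also need to handle partners whose retained sequence is shorter than half a probe.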
Results: Targeted enrichment of a SeraSeq control showed a 5- to 20-fold increase in supporting evidence over WES. On 10 clinical samples, we observed a 10- to 30-fold increase in supporting reads. Sensitivity was higher than with WES in both cases.
Conclusion: We developed a novel algorithm that accurately identifies the most likely location of an RNA fusion junction and generates the probe sequences for oligo synthesis. This method both yields more supporting data per fusion and reduces the associated costs.
Table 1. Average number of supporting reads per fusion per million reads for WES and direct targeting
Sample | Known Fusion | WES (A) | WES (B) | 524 TF (A) | 524 TF (B) | 1632 TF (A) | 1632 TF (B)
---|---|---|---|---|---|---|---
SeraSeq 0710-0496 | CCDC6→RET | 8 | 10 | 270 | 355 | 81 | 78
SeraSeq 0710-0496 | CD74→ROS1 | 35 | 81 | 592 | 832 | 246 | 250
SeraSeq 0710-0496 | EGFR→SEPTIN14 | 18 | 17 | 234 | 315 | 91 | 80
SeraSeq 0710-0496 | FGFR3→BAIAP2L1 | 14 | 5 | 428 | 326 | 125 | 72
SeraSeq 0710-0496 | FGFR3→TACC3 | 23 | 9 | 861 | 879 | 270 | 203
SeraSeq 0710-0496 | LMNA→NTRK1 | 23 | 11 | 215 | 280 | 71 | 67
SeraSeq 0710-0496 | PAX8→PPARG | 29 | 19 | 193 | 246 | 66 | 62
SeraSeq 0710-0496 | SLC34A2→ROS1 | 10 | 22 | 176 | 425 | 89 | 142
SeraSeq 0710-0496 | SLC45A3→BRAF | 11 | 15 | 433 | 420 | 141 | 105
SeraSeq 0710-0496 | TFG→NTRK1 | 35 | 40 | 275 | 377 | 128 | 132
SeraSeq 0710-0496 | TMPRSS2→ERG | 0 | 0 | 559 | 348 | 170 | 79
SeraSeq 0710-0496 | TPM3→NTRK1 | 15 | 23 | 246 | 359 | 106 | 117
Avg. SeraSeq | 12 Fusions | 18 | 21 | 373 | 430 | 132 | 116
Clinical S1 | EML4→ALK | 25 | 15 | 344 | NA | 81 | NA
Clinical S2 | EWSR1→FLI1 | 57 | 38 | 514 | NA | 111 | NA
Clinical S3 | TES→MET | 20 | 30 | 115 | NA | 75 | NA
Clinical S4 | EZR→ROS1 | 4 | 31 | 1,059 | NA | 266 | NA
Clinical S5 | SDC4→ROS1 | 5 | 5 | 3,354 | NA | 1,082 | NA
Clinical S6 | SH3BP5→PPARG | 0 | 0 | 4 | NA | 1 | NA
Clinical S7 | H2BC21→NTRK1 | 1 | 1 | 18 | NA | 4 | NA
Clinical S8 | COL1A1→PDGFB | 209 | 250 | 5,530 | NA | 2,289 | NA
Clinical S9 | KIF5B→RET | 24 | 24 | 437 | NA | 174 | NA
Clinical S10 | POC1B→GLI1 | 15 | 8 | 416 | NA | 161 | NA
Avg. Clinical | 10 Fusions | 36 | 40 | 1,179 | NA | 424 | NA
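
The fold increases quoted in Results can be approximated from the two "Avg." rows of the table. The sketch below is one plausible way to do that arithmetic, not necessarily how the authors computed it: it assumes the (A) and (B) columns are replicate runs that can be averaged, and that fold enrichment is the ratio of the targeted-panel average to the WES average. The names `AVERAGES`, `replicate_mean`, and `fold_enrichment` are illustrative.

```python
from statistics import mean

# Average supporting reads per fusion per million reads, copied from the
# "Avg." rows of the table. None marks a column reported as NA.
# Treating (A) and (B) as replicates is an assumption.
AVERAGES = {
    "SeraSeq": {"WES": (18, 21), "524 TF": (373, 430), "1632 TF": (132, 116)},
    "Clinical": {"WES": (36, 40), "524 TF": (1179, None), "1632 TF": (424, None)},
}


def replicate_mean(values):
    """Mean over the columns that were actually measured (skips None)."""
    return mean(v for v in values if v is not None)


def fold_enrichment(group, panel):
    """Ratio of targeted-panel support to WES support for one sample group."""
    row = AVERAGES[group]
    return replicate_mean(row[panel]) / replicate_mean(row["WES"])


for group in AVERAGES:
    for panel in ("524 TF", "1632 TF"):
        print(f"{group:8s} {panel:8s} {fold_enrichment(group, panel):5.1f}x")
```

Under these assumptions the ratios come out at roughly 6x to 21x for the SeraSeq control and 11x to 31x for the clinical samples, broadly in line with the ranges stated in Results. Per-fusion ratios would vary more widely, and fusions with zero WES support (for example TMPRSS2→ERG) have no finite ratio.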
Citation Format: Christophe N. Magnan, Steven P. Rivera, Fernando J. Lopez-Diaz, Chen-Yin Ou, Kenneth B. Thomas, Hyunjun Nam, Lawrence M. Weiss, Segun C. Jung, Vincent A. Funari. An efficient probe design algorithm for direct fusion targeting from RNA [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 241.