Abstract
Purpose: To systematically identify unknown kinase fusions in an unbiased manner from minimal amounts of tumor-derived genomic DNA.
Background: Tyrosine kinase (TK) fusion proteins drive tumorigenesis of many different cancers through deregulated kinase activity. Since cancer cells are dependent on such aberrant signaling, TK fusions are attractive drug targets, and effective therapies targeting these lesions have translated rapidly into the clinic. Unfortunately, only a limited number of TK fusions have been found, because current methods for their identification lack sufficient throughput to allow for systematic analyses of hundreds of tumors. We hypothesized that tumors contain as yet unidentified TK fusions whose discovery has been hindered by these experimental limitations.
Experimental Procedures: To overcome these challenges, we performed an in silico analysis of the protein and genomic breakpoint sequences of known cancer-derived TK fusions (n=59). All TK fusions identified to date contain an intact GXGXXG kinase motif. Fusion points at the genomic level occurred within a defined region upstream of this motif, suggesting a high-throughput screening strategy. We therefore applied DNA capture (Agilent SureSelect) technology to specifically target the regions where breakpoints were likely to occur. Our custom DNA capture platform included all 90 human TKs (plus AKT-1, -2, -3 and BRAF) and should capture ∼92% of known TK fusions. The captured DNA was then subjected to 454 massively parallel sequencing. We chose 454 sequencing because its long read length (∼200 nt) would allow identification of breakpoints that occurred far upstream of the GXGXXG motif. To validate this platform, we used DNA (1.5 µg per sample) from thyroid cancer cells (TPC-1) and acute myeloid leukemia cells (KG-1) with known fusions at the mRNA level but unknown fusion points at the genomic level. The recovered sequences were analyzed using two independently derived novel computational algorithms designed specifically for this application.
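As an illustration of the motif criterion described above, the following minimal Python sketch locates GXGXXG motifs in a protein sequence and checks whether a given fusion point leaves at least one motif intact. It is not the analysis pipeline used in this study; the function names, the 0-based residue indexing, and the toy sequence in the usage note are assumptions made only for this example.

```python
import re
from typing import List, Tuple

# G-x-G-x-x-G, where x is any residue. The lookahead lets overlapping
# occurrences be reported as separate hits.
GXGXXG = re.compile(r"(?=(G.G..G))")


def find_gxgxxg(protein: str) -> List[Tuple[int, str]]:
    """Return (0-based start index, matched residues) for every GXGXXG hit."""
    return [(m.start(), m.group(1)) for m in GXGXXG.finditer(protein.upper())]


def retains_motif(protein: str, fusion_point: int) -> bool:
    """True if a complete GXGXXG motif starts at or after fusion_point.

    fusion_point is a hypothetical 0-based index of the first kinase-derived
    residue retained in the fusion protein.
    """
    return any(start >= fusion_point for start, _ in find_gxgxxg(protein))
```

For a toy sequence such as "MKLGEGAFGEVAAA", find_gxgxxg reports a single hit starting at index 3, so a fusion point at index 0 retains the motif and a fusion point at index 4 does not.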
Results: Approximately 60,000 and 100,000 captured 454 sequences were recovered from TPC-1 and KG-1 cells, respectively. TK-containing sequences were enriched ∼776-fold across both samples, indicating the efficiency of our capture method. Candidate fusion sequences were validated by PCR with breakpoint-spanning primers. Among the 15 candidate fusions identified from computational analyses of the TPC-1 sequences, only the CCDC6-RET fusion was validated by direct sequencing of the PCR products. Whereas the CCDC6-RET sequence juxtaposed intronic elements from the fusion partners, the FGFR1OP2-FGFR1 fusion sequence in KG-1 cells proved more complex. The only validated fusion sequence among the 30 candidates contained two non-contiguous portions of FGFR1. Further analysis with long-range PCR revealed that this sequence was actually part of the FGFR1OP2-FGFR1 fusion sequence, which contained inverted exonic regions between the sequences of the two fusion partners. No fusion sequences were recovered from three additional control cell lines without known fusions.
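The abstract does not state how fold enrichment was calculated. One common definition, shown here only as a sketch under that assumption, divides the fraction of reads that fall on target by the fraction of the genome that the target regions occupy. The function and parameter names are hypothetical, and the numbers in the usage note are placeholders rather than data from this study.

```python
def fold_enrichment(on_target_reads: int, total_reads: int,
                    target_bp: int, genome_bp: int) -> float:
    """Fold enrichment of captured reads over random sampling of the genome.

    Assumed definition (not taken from the abstract):
        (on_target_reads / total_reads) / (target_bp / genome_bp)
    """
    if min(total_reads, target_bp, genome_bp) <= 0:
        raise ValueError("total_reads, target_bp and genome_bp must be positive")
    observed_fraction = on_target_reads / total_reads  # share of reads on target
    expected_fraction = target_bp / genome_bp          # share expected by chance
    return observed_fraction / expected_fraction
```

With placeholder inputs, 50 on-target reads out of 100 total against a target covering 1,000 bp of a 1,000,000 bp genome gives an enrichment of about 500-fold.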
Conclusions: We have developed a unique high-throughput platform to map genomic fusion points involving TKs in an unbiased manner. Using this technology, we were able to readily identify the genomic sequences of 2 TK fusions mapped previously at the mRNA level. This novel method is distinct from other similar efforts because it focuses specifically on targets with therapeutic potential, uses only 1.5 µg of DNA per sample, and circumvents the need for complex computational sequence analysis. We now plan to use this screening method to search for novel TK fusions in highly annotated tumor samples.
Citation Information: Clin Cancer Res 2010;16(7 Suppl):A18