Structural rearrangements of chromosomes are found in most human cancer cells. These rearrangements often create fusion genes, which can be responsible for cancer pathogenesis. Currently, more than 2 million expressed sequence tags (ESTs) isolated from cancer cells are deposited in the EST database (dbEST). If a chromosomal translocation produces a fusion gene, it should be represented in dbEST as chimeric transcripts. Such transcripts can be identified because they are made up of portions of transcripts from two genes and map to two different locations in the human genome sequence. They can be distinguished from artificial chimeras, which are created by accidental ligation of different cDNAs during the cloning procedure, by examining the sequence at the fusion point. In chimeras from true fusion genes, the fusion point will usually coincide with a canonical exon boundary because the genes are likely to break within an intron. In contrast, the fusion point of an artificial chimera will usually lie within an exon of each gene because the fusion occurs between two cDNAs. We have developed a semi-automatic procedure to identify fusion gene transcripts by filtering mRNA- and EST-to-genome alignment data. Using this procedure, we identified 347 fusion cases. Among them, 76 were previously reported as fusion gene transcripts, demonstrating the reliability of the method. The presence of the newly predicted IRA1/RGS17 fusion has been experimentally verified by RT-PCR and fluorescence in situ hybridization in the breast cancer cell line MCF7. The fusion gene may encode the full-length RGS17 protein under the control of the IRA1 gene promoter. We have also developed a fusion gene transcript database to expedite characterization of the predicted fusion genes. 
Combining computational prediction with experimental verification should yield a large collection of chromosomal rearrangements, which will present a new opportunity to uncover the molecular mechanisms of tumor pathogenesis.
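
The exon-boundary criterion described above can be illustrated with a minimal sketch. This is not the procedure used in the study; it only shows the idea of testing whether the fusion point of a chimeric transcript falls on an exon edge in both partner genes. The record type `AlignedPart`, the function names, the `tolerance` parameter, the "ambiguous" category for mixed cases, and the genes and coordinates in `EXONS` are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Exon = Tuple[int, int]  # (start, end) genomic coordinates, inclusive


@dataclass(frozen=True)
class AlignedPart:
    """One half of a chimeric transcript aligned to the genome (hypothetical record)."""
    gene: str
    chrom: str
    breakpoint: int  # genomic coordinate of the aligned base next to the fusion point


def at_exon_boundary(pos: int, exons: List[Exon], tolerance: int = 2) -> bool:
    """True if pos lies within `tolerance` bases of any exon start or end."""
    return any(abs(pos - edge) <= tolerance for exon in exons for edge in exon)


def classify_chimera(
    left: AlignedPart,
    right: AlignedPart,
    exons_by_gene: Dict[str, List[Exon]],
    tolerance: int = 2,
) -> str:
    """Label a two-part alignment using the exon-boundary rule (assumed labels)."""
    if left.gene == right.gene:
        return "not_chimeric"
    hits = [
        at_exon_boundary(part.breakpoint, exons_by_gene[part.gene], tolerance)
        for part in (left, right)
    ]
    if all(hits):
        return "candidate_fusion"   # both halves end at an exon edge
    if not any(hits):
        return "likely_artifact"    # both halves end inside an exon
    return "ambiguous"              # mixed evidence; needs manual review


# Hypothetical exon annotation for two made-up genes.
EXONS: Dict[str, List[Exon]] = {
    "GENE_A": [(100, 200), (500, 600)],
    "GENE_B": [(1000, 1100), (1500, 1600)],
}

if __name__ == "__main__":
    fused = classify_chimera(
        AlignedPart("GENE_A", "chr1", 200),   # end of the first GENE_A exon
        AlignedPart("GENE_B", "chr6", 1500),  # start of the second GENE_B exon
        EXONS,
    )
    print(fused)  # candidate_fusion
```

A small tolerance is used here because alignment ends can shift by a base or two around splice junctions; the appropriate value, and how mixed cases should be handled, would depend on the alignment data actually used.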

[Proc Amer Assoc Cancer Res, Volume 46, 2005]