Researchers analyzed exomes from 4,742 tumor samples, along with matched normal tissue, and found most of the roughly 130 genes known to be recurrently mutated in cancer, as well as 33 mutated genes not previously associated with cancer. The analysis advances the effort to compile a comprehensive catalog of cancer genes.

Researchers at the Broad Institute in Cambridge, MA, analyzed the exomes of 4,742 tumor samples from 21 cancer types, along with matched normal tissue. They found most of the roughly 130 genes known to be recurrently mutated in cancer, as well as 33 mutated genes not previously associated with the disease. Further studies may illuminate which, if any, of the newly found mutations have potential as therapeutic targets or biomarkers.

The analysis, recently published online in Nature, advances the effort to compile a comprehensive catalog of cancer genes. Such a catalog, the study's authors say, could help clinicians match the biology of a person's tumor with a targeted therapy. At the same time, the authors note, the study shows that such a catalog is far from finished.

Recurrently mutated genes can be grouped according to how frequently they are mutated across tumors. High-frequency genes, such as TP53, are mutated in more than 20% of tumors.

“For the most common tumor types, like breast and ovarian cancer, we're very close to knowing the complete catalog of high-frequency genes,” says computational biologist Michael Lawrence, PhD, the study's lead author, who is also co-senior author with Gad Getz, PhD, and Eric Lander, PhD. Research in the last decade, he says, has focused primarily on intermediate-frequency genes, found in 2% to 20% of tumors.

“There, in the intermediate range, we still have a lot more to discover,” he says. “If we want to get down to the level of 2% or 3%, we have quite a ways to go.”

Lawrence and his colleagues estimate that building a comprehensive catalog of high- and intermediate-frequency recurrently mutated genes will require analyzing about 100,000 tumors: 2,000, on average, for each of at least 50 tumor types. Such a catalog, Lawrence estimates, will include dozens of high-frequency and probably hundreds of intermediate-frequency genes.
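
As a rough illustration of the figures quoted above, the short Python sketch below reproduces the tumor-count arithmetic and sorts a gene into a frequency band. It is not the method used in the study. The function and variable names are hypothetical, the "low" label for genes below 2% is an assumption, and so is the treatment of a gene at exactly 20% as intermediate.

```python
# Illustrative sketch only. The thresholds are the ones quoted in the article;
# the names and the "low" label are hypothetical.

def frequency_tier(fraction_of_tumors: float) -> str:
    """Classify a recurrently mutated gene by the share of tumors carrying it."""
    if not 0.0 <= fraction_of_tumors <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if fraction_of_tumors > 0.20:       # "more than 20% of tumors"
        return "high"
    if fraction_of_tumors >= 0.02:      # "2% to 20% of tumors"
        return "intermediate"
    return "low"                        # assumed label for anything under 2%


TUMOR_TYPES = 50           # "at least 50 tumor types"
TUMORS_PER_TYPE = 2_000    # "2,000, on average"
total_tumors = TUMOR_TYPES * TUMORS_PER_TYPE

print(f"Tumors needed: {total_tumors:,}")
print(f"A gene mutated in 5% of tumors is {frequency_tier(0.05)}-frequency")
```

Because the article gives 50 as a minimum number of tumor types, the product is best read as a lower bound on the estimate.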

The 33 newly identified genes have biological roles within the “hallmarks” of cancer, including cell proliferation, cell death, genome stability, chromatin regulation, immune evasion, RNA processing, and protein homeostasis. Fifteen of those occurred in more than 5% of patients, making them of intermediate frequency. The rest were found at lower frequencies. Lawrence says he wasn't surprised by the percentages: The vast majority of genetic mutations occur in a small number of tumors.

“The lower you go in frequency, the more genes become involved and the more mutations you see in the data set,” he says.

That inverse relationship presents researchers like Lawrence with an open question: How deep must a catalog go?

“Do we want to get down to the 0.1 percent level? For lung cancer, that fraction would affect many thousands of people,” he says. At such a low frequency, researchers cannot confidently predict the total number of genes they'd find. Any effort to create a catalog will eventually require its authors to pick a stopping point, he adds.

Broad researchers are following up on the study by working with larger data sets to look for more genes and by creating cell lines with the newly identified mutations to search for potential therapeutic targets.