Five years into the giant mapping initiative, what have we learned?
The most comprehensive analysis of genomic changes in ovarian cancer arrived in June and brought striking insights into the disease. Researchers found that mutations in a single gene that normally prevents cancer formation, TP53, were present in more than 96% of the 489 high-grade tumors analyzed (Nature 2011;474:609–15). Two other well-known genes, BRCA1 and BRCA2, were mutated in 22% of the tumors. The scientists identified 4 distinct subtypes of the disease, detected patterns of gene expression that predict patient survival, and pointed out that Food and Drug Administration–approved compounds can target some of the dysfunctional genes they uncovered.
Credit The Cancer Genome Atlas (TCGA) Network, which joins more than 150 researchers at dozens of institutions, for those discoveries. Launched as a 3-year pilot project in 2006 with $100 million from the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), TCGA set out to characterize genomic changes in glioblastoma multiforme (GBM), ovarian cancer, and lung cancer. After the project produced dramatic results in GBM in 2008, NCI, NHGRI, NIH, and the Department of Health and Human Services announced that, as an American Recovery and Reinvestment Act Signature Initiative, TCGA would receive approximately $275 million to produce genomic maps of 20 types of cancer over the next 5 years.
Dr. Anna Barker, former deputy director of NCI, with Dr. Francis Collins and others, helped conceive, launch, and oversee TCGA. Cancer Discovery recently asked her to highlight some of the project's ambitions and accomplishments.
What unanticipated challenges did you encounter in launching TCGA?
We knew we were going to face several challenges because the project was unprecedented in size and scope. But one challenge we didn't anticipate involved biospecimens. There were a lot of samples. But unfortunately, for a whole range of reasons, there weren't a lot of high-quality samples. We had set a goal of 500 samples for each tumor type, but only about 30% of the existing samples in biobanks were acceptable for the assays the TCGA Centers needed to perform. When we published our manuscript on GBM, we had characterized only 206 samples. We realized that we had to upgrade the quality of the samples for TCGA, which was an expensive undertaking.
Did the goals of the program change significantly between the start in 2006 and phase II in 2009?
No, but the challenges changed. The rapid evolution of sequencing technologies was a big one. We had started our work using PCR-based sequencing, but we soon moved to next-generation sequencing, and the technology is continuing to change. The ovarian cancer manuscript is the first good example of using next-generation sequencing on a statistically robust set of samples, enough that we could subtype the tumors. In the ovarian cancer manuscript, researchers found a broad range of disrupted genes and their associated pathways. They then took clinical data, which are pretty comprehensive in TCGA, and mapped them back to the genes and pathways to establish subtypes. That will allow the communities to look at different drugs and combinations of drugs to inhibit the various pathways in each of the subtypes.
What do you see as the biggest challenge now?
We've generated a tsunami of data, but we're ill-equipped to handle it. We don't have a lot of scientists trained in areas such as computational biology, algorithm development, game theory, and artificial intelligence who can make sense of these very deep multidimensional databases. Young people coming into the field and individuals with knowledge of these areas will find huge opportunities in translating the information into knowledge of cancer biology.
Are you pleased with the pace of the project?
I was probably one of the most vocal people in calling for the project to move more quickly. But as I began to look closely at what had to be done to sustain a high-quality project of this scope—for example, attracting the best people and figuring out how to store, move, and analyze large amounts of complex data—I realized that doing it right and getting the best answers for the community was much more important than speed. But now that a solid foundation has been built and teams are working simultaneously on a range of cancers, I think the pace of discovery will be relatively rapid.
What might surprise people about TCGA?
When people look at the ovarian cancer manuscript, they probably won't realize the amount of work by so many that went into it. They may not see the extent to which major problems have been solved by TCGA, such as improving the quality of biospecimens, undertaking large-scale whole-genome sequencing, and devising innovative approaches to manage massive amounts of data. People also may not realize that on a fundamental level TCGA is quite an altruistic enterprise. There are literally hundreds of investigators working together quite effectively as a team. I think that they all know that the results they produce are unprecedented because their goal is to produce the best outcomes for the community.
If someone asks you whether it's really worth spending so much money on TCGA given the current economic climate, what do you say?
A simple, emphatic “yes.” If we had to wait for individual investigators to figure out the composition of every dysregulated genome in cancer, it would take far too long for patients.