An Institute of Medicine report offers best-practice recommendations for developing omics clinical tools.
The march toward individually tailored therapeutic plans for patients with cancer and many other diseases has been hamstrung by a lack of suitably designed analytical tools that make sense of huge collections of genomic, proteomic, and other omics sets of molecular data. That's a key message from “Evolution of Translational Omics,” a report released in March by the Institute of Medicine (IOM) that lays out a roadmap for developing and deploying these molecular tests.
The IOM, the health arm of the National Academy of Sciences, says the National Cancer Institute (NCI) asked it to develop recommendations to clarify and improve the pathway from the discovery of omics-based tests to their use in clinical trials, following the recent case of inappropriate gene expression testing in clinical trials at Duke University.
Many of the issues in the Duke case “stemmed from problems that may exist at other institutions,” the report notes. Among potential issues are unclear lines of accountability, lack of consistently strong data management, and failures to analytically, clinically, and biologically validate omics-based tests before launching clinical trials.
Measures against Mistakes
Keith Baggerly, PhD, and Kevin Coombes, PhD, biostatisticians at The MD Anderson Cancer Center in Houston, struggled for years to bring to light the problems with Duke's data.
“The Duke case is undoubtedly an aberration,” Baggerly says now. “The thing that has us worried is, simple mistakes are common.”
When it comes to developing a statistically sound omics algorithm and making it into a clinical test, simple mistakes can add up to major problems, the report emphasizes.
Many of the difficulties arise from the sheer complexity of molecular tests. Among other causes, errors can arise from a lack of understanding of proper statistical methods. “Because omics-based tests rely on interpretation of high-dimensional datasets, it is important to guard against overfitting the data throughout the test development process,” the report emphasizes. “Overfitting due to lack of proper statistical methods can lead to a model that fits the training samples well, even though the model might perform poorly on independent samples not used in test development.”
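The overfitting the report warns about can be shown with a small simulation. The sketch below is illustrative only and is not taken from the report: it assumes 20 training samples, 2,000 measurements that are pure noise, and a hypothetical 20-feature signature built by picking the features that happen to separate the two classes best in the training data. All function names are made up for the example.

```python
import random


def make_noise_samples(n_samples, n_features, rng):
    """Simulate samples whose measurements carry no information about the label."""
    labels = [i % 2 for i in range(n_samples)]  # alternate the two classes
    features = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
                for _ in range(n_samples)]
    return features, labels


def score(row, weights):
    """Signed sum of the selected features for one sample."""
    return sum(w * row[j] for j, w in weights.items())


def fit_signature(features, labels, n_selected):
    """Select the features whose class means differ most on the training data."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    diffs = []
    for j in range(len(features[0])):
        pos_mean = sum(r[j] for r, y in zip(features, labels) if y == 1) / n_pos
        neg_mean = sum(r[j] for r, y in zip(features, labels) if y == 0) / n_neg
        diffs.append((pos_mean - neg_mean, j))
    top = sorted(diffs, key=lambda d: abs(d[0]), reverse=True)[:n_selected]
    # Each selected feature votes in the direction it pointed during training.
    weights = {j: (1.0 if diff > 0 else -1.0) for diff, j in top}
    scores = [score(r, weights) for r in features]
    pos_score = sum(s for s, y in zip(scores, labels) if y == 1) / n_pos
    neg_score = sum(s for s, y in zip(scores, labels) if y == 0) / n_neg
    # Decision threshold: midpoint between the two class-mean scores.
    return weights, (pos_score + neg_score) / 2.0


def accuracy(features, labels, weights, threshold):
    """Fraction of samples whose thresholded score matches the label."""
    hits = sum(int(score(r, weights) > threshold) == y
               for r, y in zip(features, labels))
    return hits / len(labels)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_noise_samples(20, 2000, rng)
new_x, new_y = make_noise_samples(200, 2000, rng)  # independent samples
weights, threshold = fit_signature(train_x, train_y, n_selected=20)
print("training accuracy:   ", accuracy(train_x, train_y, weights, threshold))
print("independent accuracy:", accuracy(new_x, new_y, weights, threshold))
```

Because the measurements contain no real signal, the signature should look highly accurate on the samples used to build it and perform at roughly coin-flip level on the independent ones, which is the pattern the report describes.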
Once a test algorithm is ready, it's supposed to be “locked down” and tested on a validation set, a totally new group of samples. Sometimes researchers don't follow this lockdown requirement, thus tainting subsequent analyses.
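One way to make a lockdown checkable, offered here as an assumption rather than a procedure the report prescribes, is to record a cryptographic fingerprint of the finished model and refuse to run validation if the model no longer matches it. In the sketch below the model format, the gene names, and the functions `fingerprint` and `validate_locked` are all hypothetical.

```python
import hashlib
import json


def fingerprint(model_spec):
    """SHA-256 digest of a canonical JSON encoding of the model."""
    canonical = json.dumps(model_spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def predict(model_spec, sample):
    """Classify one sample with a weighted sum and a fixed threshold."""
    total = sum(w * sample[gene] for gene, w in model_spec["weights"].items())
    return int(total > model_spec["threshold"])


def validate_locked(model_spec, locked_digest, samples, labels):
    """Score the validation set only if the model matches its locked digest."""
    if fingerprint(model_spec) != locked_digest:
        raise ValueError("model differs from the locked version")
    hits = sum(predict(model_spec, s) == y for s, y in zip(samples, labels))
    return hits / len(labels)


# Hypothetical two-gene model, locked once development is finished.
model = {"weights": {"GENE_A": 1.0, "GENE_B": -0.5}, "threshold": 0.2}
locked_digest = fingerprint(model)

# Hypothetical validation samples that played no part in development.
validation_samples = [{"GENE_A": 1.0, "GENE_B": 0.4},
                      {"GENE_A": 0.1, "GENE_B": 0.6}]
validation_labels = [1, 0]
print(validate_locked(model, locked_digest, validation_samples, validation_labels))
```

Any change to a weight or the threshold after lockdown alters the digest, so a later tweak shows up as an error instead of silently tainting the validation result.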
Moreover, as data become more complicated, errors may become less obvious. When looking at an expression signature for 100 genes, intuitive judgments may break down. Baggerly says that in the Duke case, gene-expression data were indexed incorrectly and riddled with errors, yet biologists could provide an apparently sensible biological narrative about the patterns they thought they saw.
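Indexing mistakes of this kind are easier to catch when measurements and labels are matched by sample identifier instead of by row position. The sketch below illustrates that general safeguard under made-up sample IDs and values; it does not reproduce the Duke data or the checks Baggerly and Coombes ran, and `align_by_id` is a hypothetical helper.

```python
def align_by_id(expression_by_id, label_by_id):
    """Pair each sample's measurements with its label using the sample ID.

    Raises KeyError if the two tables do not cover exactly the same IDs,
    instead of silently pairing rows by position.
    """
    expression_ids = set(expression_by_id)
    label_ids = set(label_by_id)
    if expression_ids != label_ids:
        unmatched = sorted(expression_ids ^ label_ids)
        raise KeyError(f"IDs not present in both tables: {unmatched}")
    return [(sample_id, expression_by_id[sample_id], label_by_id[sample_id])
            for sample_id in sorted(expression_ids)]


# Hypothetical tables; note that the two are listed in different orders.
expression = {"S02": [0.4, 1.9], "S01": [2.1, 0.3], "S03": [1.0, 1.1]}
labels = {"S01": "sensitive", "S02": "resistant", "S03": "sensitive"}

for sample_id, values, label in align_by_id(expression, labels):
    print(sample_id, values, label)
```

Matching on an explicit key means a shifted or reordered table either lines up correctly or fails loudly, so the error cannot hide behind a plausible biological story.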
“It's a fact of big data that it's hard to check your work,” says Daniela Witten, PhD, an IOM committee member and assistant professor of biostatistics at the University of Washington. “So we need a transparent system for sharing data and software, especially if the test is going to be used in a clinical setting.”
The sheer scale of omics research can make sharing of the complex data sets and computational models difficult. “Data sharing is not routine and … replication and verification are more difficult than for single biomarker tests,” the report notes.
Additionally, the report points out that some researchers and institutional review boards aren't following key government regulations about clinical testing, simply because they aren't aware of them.
“There's a degree of ignorance up and down the academic chain” about tests whose outcome will direct patient care, comments Larry Kessler, ScD, a member of the IOM committee and chair of the department of health services at the University of Washington's School of Public Health and Community Medicine in Seattle.
Two big quality-assurance issues arise for tests whose outcome will direct patient care. First, any such test must be performed in a Clinical Laboratory Improvement Amendments (CLIA)–certified facility, the report emphasizes. Second, any such test requires U.S. Food and Drug Administration (FDA) review.
Research labs that lack CLIA certification may create so-called laboratory-developed tests (LDTs). “While the FDA has the authority for regulatory oversight of all tests used in patient care, the FDA has not defined a regulatory framework that includes oversight of LDTs,” the report points out. “It is precisely this LDT pathway that … places a new and mostly unrecognized demand on academic institutions to provide proper oversight for omics-based test development, validation, and clinical implementation.”
“Doing omics research has proven more challenging than we thought it would be when the human genome was sequenced,” sums up Kessler. “We hope the IOM report will provide a roadmap for people to do this right.” —Katherine Bourzac
Here are selected Institute of Medicine recommendations.
Discovery phase
Candidate tests should be confirmed with an independent set of samples not used in generation of the computational model.
Data and metadata used for test development should be made available in an independently managed database.
The test should be defined precisely, including its molecular measurements, computational procedures, and intended clinical use.
Validation phase
The test and its intended use should be discussed with the FDA before starting validation studies.
Validation should be done in a CLIA-certified clinical laboratory.
For more news on cancer research, visit Cancer Discovery online at http://CDnews.aacrjournals.org.