To the Editors: Johnstone et al. recently reported that COMT genotype modified the effectiveness of a transdermal nicotine patch for smoking abstinence in a randomized clinical trial conducted in the 1990s (1, 2). Several previous publications have described outcomes of this trial in relation to other genotypes (DRD2, DBH, 5-HTTLPR, DRD4, and OPRM1; refs. 3-7). All but one reported positive findings, and none mentioned whether other relevant polymorphisms had been examined.
Randomized clinical trials are the acme of epidemiologic study design when the objective is an unbiased measure of effect. “Intention-to-treat analysis” helps protect against bias introduced post-randomization (e.g., by lack of compliance; ref. 8). Together, these approaches reduce the likelihood of false-positive results. However, when clinical trial results are analyzed for multiple interactions, they are just as vulnerable as other epidemiologic studies to type 1 error, which must then be considered as an alternative explanation of positive findings. Selective publication of positive results, or “significance-chasing bias,” may propagate and amplify this type of error (9).
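The arithmetic behind this concern can be sketched briefly. The example below is illustrative only and is not drawn from the trial data: it assumes that every tested interaction is truly null, that the tests are independent, and that each uses a nominal significance level of 0.05. The count of six tests simply tallies the genotypes named in this letter, and the function names are hypothetical.

```python
# Illustrative sketch: assumes independent tests of true null hypotheses,
# each performed at the same nominal significance level (alpha).

def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false-positive result among n_tests tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance level that holds the familywise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    # 6 = number of genotypes named in this letter (an assumption for illustration)
    for k in (1, 6, 20):
        fwer = familywise_error_rate(k)
        cutoff = bonferroni_threshold(k)
        print(f"{k:>2} tests: P(at least one false positive) = {fwer:.3f}; "
              f"Bonferroni per-test threshold = {cutoff:.4f}")
```

Under these assumptions, six tests at the 0.05 level carry roughly a one-in-four chance of at least one false-positive finding. Tests of several genotypes in the same trial participants are unlikely to be fully independent, so the figure should be read as a rough guide rather than an estimate for any particular study.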
An a priori research hypothesis declares the “intention to analyze” data from a clinical trial or other epidemiologic study. Examining such data for questions in addition to the primary research hypothesis is often interesting and useful, but epidemiologists are trained to be wary of false associations discovered on “fishing expeditions.” Currently, many investigators are taking advantage of increasingly affordable genotyping to supplement their data sets from completed studies with genotypes measured from stored samples. Like other such research, analysis of gene-drug interactions in preexisting clinical trials operates at the interface between hypothesis testing and hypothesis generation.
What does the intention to analyze mean in the era of genome-wide association studies? These studies are becoming increasingly popular for their efficient, apparently hypothesis-free interrogation of genetic markers across the genome. Comprehensive genome-wide association study databases, such as dbGaP, will help make the results of single studies more transparent (10). However, analyzing associations for statistical significance, no matter how small the P value, is only the starting point for deciding which polymorphisms to pursue in further epidemiologic, laboratory, and clinical studies. Thus, the intention to analyze is a result rather than a premise of genome-wide association and other exploratory studies, whose principal contribution is to identify new candidate genes. Such studies represent a new “discovery engine,” which fuels (rather than replaces) the “risk engine” (11). A systematic process for evaluation, synthesis, and integration of research results builds the cumulative evidence base needed to move the field forward.