Genetic and other host characteristics that have the potential to influence susceptibility to disease occurrence may exert that influence only in the presence of one or more environmental (or other genetic) exposures, and so epidemiologic studies of disease susceptibility are well advised to conduct analyses to explore such a possibility. However, the interpretation of observed variation in the size of the association across subgroups of the study population is subjective: the number of subjects in the subgroups, the magnitude of the variation in the association across subgroups, and the prior likelihood of such variation all bear on that interpretation. This commentary considers the special case in which there is subgroup variation in the direction of an exposure-disease association with little or no association when all subgroups are combined. It concludes that, in this situation, stringent criteria are appropriate for an interpretation of anything other than chance as the explanation for that variation.
This “special case” is not actually all that special: it is a common occurrence in epidemiologic studies. For example, Landi et al. (1) compared 132 persons with cutaneous melanoma and 145 controls for the capacity of their peripheral blood lymphocytes to repair UV radiation–damaged DNA, and the fraction of cases and controls having DNA repair capacity that was “low” (that is, below the median for controls) was identical (odds ratio, 1.0; 95% confidence interval, 0.5-2.0). However, a low DNA repair capacity was present in 26 of the 45 cases and in only 4 of the 18 controls who described their ability to tan as limited or absent (odds ratio, 4.9; 95% confidence interval, 1.4-17.3). On the other hand, among persons with medium or high tanning ability, there was a suggestion of reduced risk of melanoma associated with a low DNA repair capacity (odds ratio, 0.6; 95% confidence interval, 0.3-1.1).
As another example, a pooled analysis of studies of lung cancer (2) found no overall association with the null glutathione S-transferase θ1 genotype, but a suggestion of a reduced risk associated with this characteristic (odds ratio, 0.73) in persons who had a history of a chemical exposure at work and, correspondingly, a (very small) positive genotype-disease association (odds ratio, 1.06) among persons with no such occupational exposure.
In the absence of confounding, the observation of an exposure-disease association in a subgroup of the study population, combined with the absence of such an association in the population as a whole, necessitates the presence of an association in the opposite direction among persons who are not members of that subgroup. The magnitude of the opposite association is a function of the strength of the association in the initial subgroup and of the size of the subgroup relative to the rest of the study population. For example (Fig. 1), if an odds ratio of 3.0 is seen for an exposure-disease relationship in a subgroup that constitutes half of the population (subgroup A), yet the odds ratio is 1.0 in the overall study population, the odds ratio in the remainder of the population (subgroup B) will be 0.33. The same exposure-disease odds ratio (in this instance, 3.0) present in a less common subgroup A implies a more modest reduction in risk related to exposure in subgroup B (again, when the overall odds ratio is 1.0). For example, if subgroup A comprised but 10% of the population, then the odds ratio in the much larger subgroup B would be 0.88.
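The arithmetic behind these figures can be approximated with a short sketch. It assumes (this is an illustrative simplification, not a method stated in this commentary) that the overall log odds ratio is a weighted average of the subgroup log odds ratios, with weights equal to each subgroup's share of the study population. The function name `implied_odds_ratio` is hypothetical. In real data the weights also depend on the numbers of cases and controls and on the prevalence of exposure in each subgroup, so the results are only approximate.

```python
import math

def implied_odds_ratio(or_a, share_a, or_overall=1.0):
    """Odds ratio in subgroup B implied by the odds ratio in subgroup A.

    Assumes ln(OR_overall) = share_a * ln(OR_A) + (1 - share_a) * ln(OR_B),
    a simplification adopted for illustration only.
    """
    if not 0.0 < share_a < 1.0:
        raise ValueError("share_a must lie strictly between 0 and 1")
    log_or_b = (math.log(or_overall) - share_a * math.log(or_a)) / (1.0 - share_a)
    return math.exp(log_or_b)

# Subgroup A has an odds ratio of 3.0; the overall odds ratio is 1.0.
for share in (0.5, 0.1):
    or_b = implied_odds_ratio(3.0, share)
    print(f"subgroup A share {share:.0%}: OR in subgroup B = {or_b:.3f}")
```

Under this assumption the implied odds ratios in subgroup B are about 0.33 when subgroup A is half of the population and about 0.885 when it is 10%, in line with the values quoted above.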
Some investigators, observing no overall association between a given exposure and disease, go on to ask whether the presence or size of an association between a second exposure and disease is influenced by the first one. For example, Rebbeck et al. (3) observed the incidence of breast cancer to be identical between women who did and did not have one or more G331A alleles of the progesterone receptor, but their data also suggested that the deleterious influence of postmenopausal combined hormone therapy on risk differed by progesterone receptor genotype. However, if such effect modification were genuine, it would imply an association between genotype and cancer that would be in opposite directions, depending on hormone use. Table 1 illustrates this with hypothetical data. The example has been constructed so that the incidence of disease in persons with genotype A is 10 per 1,000 person-years, a rate identical to that in other persons (who have genotype B). As is seen in part 1 of the example, the environmental factor is associated with an increased risk of disease (relative risk, 2.33) only in persons with genotype A. In part 2, the data in part 1 are rearranged to examine the association between genotype and disease within strata of the environmental exposure. They document that genotype A is positively related to disease incidence (relative risk, 1.4) when the environmental exposure has been present and is negatively related (relative risk, 0.6) when the environmental exposure has been absent. If one were to interpret the apparently different relation of the environmental exposure to disease incidence according to genotype as indicative of a genuine interaction, that interpretation would be obliged to specifically address the observation of a genotype-disease association that differed in direction depending on the presence or absence of the environmental exposure.
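One set of rates consistent with the relative risks quoted above can be sketched as follows. The stratum-specific rates (14 and 6 per 1,000 person-years for genotype A, 10 per 1,000 person-years for genotype B) and the equal split of person-time between exposed and unexposed persons are assumptions chosen to match the quoted figures; they are not necessarily the entries of Table 1.

```python
# Hypothetical incidence rates per 1,000 person-years, chosen so that they
# reproduce the relative risks quoted in the text (assumed, not from Table 1).
rates = {
    ("A", "exposed"): 14.0,
    ("A", "unexposed"): 6.0,
    ("B", "exposed"): 10.0,
    ("B", "unexposed"): 10.0,
}
# Assumed: half of the person-time in each genotype is exposed.
exposed_fraction = 0.5

def overall_rate(genotype):
    """Person-time-weighted incidence rate for one genotype."""
    return (exposed_fraction * rates[(genotype, "exposed")]
            + (1 - exposed_fraction) * rates[(genotype, "unexposed")])

# Part 1: environmental exposure vs. disease within each genotype.
rr_exposure = {g: rates[(g, "exposed")] / rates[(g, "unexposed")]
               for g in ("A", "B")}
# Part 2: genotype A vs. genotype B within each exposure stratum.
rr_genotype = {e: rates[("A", e)] / rates[("B", e)]
               for e in ("exposed", "unexposed")}

print({g: overall_rate(g) for g in ("A", "B")})
print({k: round(v, 2) for k, v in rr_exposure.items()})
print({k: round(v, 2) for k, v in rr_genotype.items()})
```

With these assumed rates, both genotypes have an overall incidence of 10 per 1,000 person-years, yet rearranging the same four rates yields a genotype-disease relative risk above 1 among the exposed and below 1 among the unexposed.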
Sometimes, an overall association is present between exposure and outcome along with variation in the size or presence of this association across subgroups of the study population. In this situation, the potential role of chance as the basis for the observed variation across subgroups can be gauged by a formal test of statistical interaction. (Often complicating the interpretation of such a test is that, in a given study, there can be many ways of forming these subgroups. This important issue will not be discussed further here.) To the extent that chance is unlikely to be responsible, that variation can be important both in trying to understand the means by which the disease can be produced and, in a practical way, in helping to decide if there are persons particularly susceptible or insusceptible to the influence of the exposure on disease risk. However, when an exposure-disease association is present in one subgroup and this is balanced by an opposite association in the remainder of the study population, the question of interest is not the likelihood that chance could explain the difference in the direction of the association between members of the subgroup and other persons. Rather, it is the likelihood that, in either of these two groups, chance is responsible for the exposure-disease associations differing from the null.
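The distinction between the two questions can be made concrete with a rough sketch that uses the subgroup odds ratios quoted earlier from Landi et al. (1). It relies on generic Wald-type approximations rather than on any analysis reported in that study: standard errors are recovered from the published 95% confidence intervals on the assumption that they are symmetric on the log scale, and the two subgroups are treated as independent. The helper names are hypothetical.

```python
import math

def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Log odds ratio and its standard error, recovered from a 95% CI.

    Assumes the interval is symmetric on the log scale (Wald-type).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    return math.log(odds_ratio), se

def two_sided_p(z):
    """Two-sided P value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

# Odds ratios for low DNA repair capacity, by tanning ability.
b_poor, se_poor = log_or_and_se(4.9, 1.4, 17.3)   # limited or absent tanning
b_good, se_good = log_or_and_se(0.6, 0.3, 1.1)    # medium or high tanning

# Test of interaction: do the two subgroup odds ratios differ from each other?
z_interaction = (b_poor - b_good) / math.sqrt(se_poor**2 + se_good**2)
# Tests against the null: does each subgroup odds ratio differ from 1.0?
z_poor = b_poor / se_poor
z_good = b_good / se_good

for label, z in [("interaction", z_interaction),
                 ("limited/absent tanning vs null", z_poor),
                 ("medium/high tanning vs null", z_good)]:
    print(f"{label}: z = {z:.2f}, P = {two_sided_p(z):.3f}")
```

Under these assumptions the z statistic for the contrast between subgroups (roughly 2.9) is larger than that for either subgroup against the null (roughly 2.5 and -1.5), which shows how a test of interaction can appear more persuasive than the evidence that either subgroup-specific association departs from 1.0.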
I believe it almost always will be true that an exposure that has the capacity to act to cause an illness in one segment of the population will not have the capacity to protect against that same illness in other persons. This belief probably is not idiosyncratic. For example, commentators on the results of randomized trials of therapies have stated that “so-called qualitative interactions in which the treatment effect is in the opposite direction in different subgroups are thought to be rare and highly implausible,” and indeed, no example of a qualitative interaction was seen in their examination of 50 large trials reported in major biomedical journals during 1997 (4). If my belief is correct, then when there is no overall exposure-disease association, an inference that chance (or bias, or a combination of the two) is unlikely to be responsible for a positive association in a given subgroup A must simultaneously posit that the obligatory negative association observed among other persons (subgroup B) is due to chance or bias. This additional inferential “hurdle” can sometimes be cleared, at least if the positive association is large in size and confined to a subgroup A that proportionately is quite small. Nonetheless, I recommend that the size of the hurdle be treated with a great deal of respect and that heed be paid to the admonition of yet another set of commentators (5) who addressed this issue in a simulation study: “It is generally recognized that subgroup analysis can produce spurious results, [and] the extent of the problem is almost certainly under-estimated.”
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: Established Investigator Award (K05 CA092002) from the National Cancer Institute.
Acknowledgments
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
I thank Thomas Koepsell, Jennifer Doherty, and Norman Breslow for useful suggestions on an earlier draft of this manuscript, and Clara Bodelon for assistance in preparing Fig. 1.