Abstract
Cancer risk associations with commonly prescribed medications have mainly been evaluated in hypothesis-driven studies that focus on one drug at a time. Agnostic drug-wide association studies (DWAS) offer an alternative approach that simultaneously evaluates associations between a large number of drugs and one or more cancers using large-scale electronic health records. Although cancer DWAS approaches are promising, a number of challenges limit their applicability. These include a high likelihood of false-positive findings; a lack of biological considerations; and methodological shortcomings, such as the inability to tightly control for confounders. As such, the value of DWAS is currently restricted to hypothesis generation, with detected signals needing further evaluation. In this commentary, we discuss these challenges in more detail and summarize approaches to overcome them, drawing on published cancer DWAS, including the accompanying article by Støer and colleagues. Despite current concerns, the future of DWAS is filled with opportunities for developing innovative analytic methods and techniques that incorporate pharmacology, epidemiology, cancer biology, and genetics.
See related article by Støer et al., p. 682
Detection of potential adverse effects associated with pharmaceutical agents is a critical element of improving the quality and safety of patient care. Although safety is an essential component of the drug-approval process, events such as cancer (the focus of this commentary) are rarely captured even in large clinical trials, mainly because of their long latency and relative rarity. Herein lies the value of post-marketing observational research that evaluates pharmaceutical agents not only for their potential carcinogenic effects but also for possible chemopreventive action. Hypothesis-driven studies of cancer risk associations with commonly prescribed medications, such as metformin, statins, and hormone-replacement therapy, have provided significant insights into cancer etiology and prevention, but results are sometimes inconsistent (1). The availability of large-scale electronic medical records and national patient registries provides opportunities for agnostic, hypothesis-generating explorations using drug-wide association studies (DWAS).
In the current issue of CEBP, Støer and colleagues (2) report on a DWAS that screened for associations between prescription drugs and the risk of 15 common cancer types. The authors analyzed data from the Norwegian Prescription Database and the Cancer Registry of Norway using a nested case–control design. The study identified several prescription drug classes [based on the anatomical therapeutic chemical (ATC) classification system] with carcinogenic or chemopreventive associations. At the level of therapeutic subgroups (ATC 2nd level), the cancer sites with the highest number of carcinogenic associations were lung and kidney, and the site with the highest number of chemopreventive associations was prostate. Among drug classes, associations with cancer were most frequently detected for antibiotics, analgesics, and antidiabetics. The study replicated some of the findings from previously published cancer DWAS reports (3, 4) and identified other expected associations, such as that of menopausal hormone therapy with breast cancer. Unexpected findings were also reported, including positive associations of propulsive drugs and antiemetic medications with lung cancer. A visual interactive tool presenting the study findings is available at https://pharmacoepi.shinyapps.io/drugwas/.
Interest in hypothesis-free evaluation of drug–cancer associations started decades ago. Several previously published reports used the power of the computer-based prescription records in the Kaiser-Permanente Medical system (5–8). Results from these studies were used by the International Agency for Research on Cancer in its assessment of drug carcinogenesis (9, 10). A follow-up to the original study, with additional drugs, more patients, and longer follow-up, has been published (11). Modeling their work on genome- and phenome-wide association studies, Ryan and colleagues (12) proposed a new DWAS approach (also known as a medication-wide association study, MWAS) and evaluated its value in assessing the safety profile of a large number of pharmaceutical agents with respect to four acute severe adverse events (acute liver failure, acute kidney failure, acute myocardial infarction, and gastrointestinal ulcer). Methods for data mining in drug safety programs are summarized elsewhere (13). As for cancer DWAS, two population-based studies were recently published. Patel and colleagues (3) published a study of more than 9 million individuals that used the Swedish Prescribed Drug Register and Cancer Register to evaluate cancer associations with 552 drugs. The study used both cohort and case-crossover designs and identified cancer association signals for 26% and 7% of all drugs, respectively. The second study used a nested case–control design (1:10 case–control ratio) with 278,485 cancer cases from the Danish nationwide health registries. The study identified 1,020 cancer–drug association signals among 22,125 drug–cancer pairs (4). DWAS offer efficiency and cost savings: available electronic medical records can be used to simultaneously test the relationship between a large number of drugs and one or more cancer sites, or even multiple different outcomes.
However, the method is still limited by the high likelihood of false-positive results, confounding by indication, and reverse causation. These limitations, together with others, challenge our ability to draw causal inferences and restrict DWAS to the role of a hypothesis-generating method.
The discussion around “association versus causation” is not new to epidemiology. In the early 1960s, in addressing the relationship between smoking and lung cancer, the Surgeon General's report presented the first framework for assessing causation in observational data. This included consistency, strength, specificity, temporality, and coherence (14). Shortly after, Sir Austin Bradford Hill expanded the list to also include biological gradient, plausibility, experiment, and analogy (15). Having those criteria facilitated the development of formal procedures to classify the strength of the available evidence for causality. The usefulness of applying Hill's criteria of causality in pharmacoepidemiology research has been reviewed elsewhere (16). Published cancer DWAS have, to date, used different combinations of these standards to narrow the identified drug signals to those that are more likely to be true and actionable. These included adjustment for multiple comparisons and the use of several causality criteria, including temporality (by applying a lag time between drug exposure and cancer diagnosis), biological gradient (through dose–response analyses), consistency (by using a two-stage analysis or multiple analytic approaches), specificity (by comparing signals across cancer types), and strength of association (by specifying a minimum actionable effect size; summarized in Table 1). Despite this effort, the number of detected signals was still overwhelmingly high (e.g., >1,000 signals in about 22,000 drug–cancer pairs; ref. 4). This suggests a lack of precision.
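The lag-time filter used to address temporality can be illustrated with a minimal Python sketch. It assumes that each person has a list of prescription dates and an index date (the cancer diagnosis date for cases, or the matched date for controls); the function name `lagged_exposures`, the 365-day default, and the decision to keep a prescription filled exactly on the cutoff are illustrative assumptions, not details taken from the published studies.

```python
from datetime import date, timedelta


def lagged_exposures(prescription_dates, index_date, lag_days=365):
    """Return only the prescriptions that count as exposure under a lag time.

    Prescriptions filled within `lag_days` before the index date are
    disregarded. Keeping a prescription filled exactly on the cutoff is an
    assumption of this sketch.
    """
    cutoff = index_date - timedelta(days=lag_days)
    return [d for d in prescription_dates if d <= cutoff]


# Hypothetical person with four prescriptions and a diagnosis on 2019-06-01.
fills = [date(2017, 1, 15), date(2018, 6, 1), date(2018, 12, 1), date(2019, 5, 20)]
counted = lagged_exposures(fills, index_date=date(2019, 6, 1))
```

In this hypothetical example, the two prescriptions filled in the 12 months before diagnosis are ignored. These are the ones most likely to reflect early symptoms of a not-yet-diagnosed cancer.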
Criterion | Patel et al., 2016 (3) | Pottegård et al., 2016 (4) | Støer et al., 2020 (2) |
---|---|---|---|
Temporality | Applied 1-year (time-to-event) and 1–2-year (case-crossover) lag times | Applied 1-year lag time | Applied 1-year lag time |
Biologic gradient | Not considered | Used dose–response patterns as requirement for signal selection | Evaluated dose–response relationship |
Consistency | Used two approaches: (i) test and validation datasets; (ii) time-to-event and case-crossover designs | Not considered | Not considered |
Specificity | Not used to select signals, but analysis was completed for all cancers combined and for cancers of the breast, colon, and prostate | Addressed in a quantitative way: A ratio of odds ratio (OR) for a specific drug–cancer association to that of the same drug with all cancers between 0.83 and 1.20 indicates nonspecificity | Not used to select signals, but analysis was completed for different cancers and for selected histologic subtypes |
Strength of association | Presented but was not a signal selection criterion | Was used as a signal selection criterion: signals with an OR >1.5 or <0.67, or lower limit of 95% confidence interval >1.2 or a higher limit <0.83 were selected | Presented but was not a signal selection criterion |
Plausibility | Not considered but discussed | Not considered but discussed | Not considered but discussed |
Coherence | Compared their results with previously published associations | Not considered but discussed | Compared their results with known drug–cancer associations and the other two DWAS summarized in this table |
Experimental evidence | Not considered | Not considered | Not considered but discussed in some cases |
Analogy | Not considered | Not considered | Not considered |
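The strength and specificity thresholds listed in the table for Pottegård and colleagues (4) can be written as simple rules. The Python sketch below is illustrative only: the `Estimate` container, the function names, the inclusive bounds on the ratio, and the way the two rules are combined in `keep_signal` are assumptions made for this example, and the published study also required a dose–response pattern, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """Odds ratio with its 95% confidence interval (hypothetical container)."""
    odds_ratio: float
    ci_low: float
    ci_high: float


def meets_strength(est: Estimate) -> bool:
    """Strength thresholds as quoted in the table for reference 4."""
    strong_point = est.odds_ratio > 1.5 or est.odds_ratio < 0.67
    strong_interval = est.ci_low > 1.2 or est.ci_high < 0.83
    return strong_point or strong_interval


def is_nonspecific(or_specific: float, or_all_cancers: float) -> bool:
    """A ratio of ORs between 0.83 and 1.20 is read as nonspecific.

    Treating both bounds as inclusive is an assumption of this sketch.
    """
    ratio = or_specific / or_all_cancers
    return 0.83 <= ratio <= 1.20


def keep_signal(est: Estimate, or_all_cancers: float) -> bool:
    """Keep a signal that is strong and specific (combination is assumed)."""
    return meets_strength(est) and not is_nonspecific(est.odds_ratio, or_all_cancers)


# Hypothetical drug-cancer pairs.
specific_signal = keep_signal(Estimate(1.8, 1.3, 2.5), or_all_cancers=1.1)
nonspecific_signal = keep_signal(Estimate(1.6, 1.1, 2.3), or_all_cancers=1.5)
```

Under these assumptions, the first hypothetical pair is retained. The second is dropped because its OR is close to the OR of the same drug with all cancers combined.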
Here, we present some factors that may pose challenges for cancer DWAS, using examples from the accompanying article and the published literature.
(i) Confounding: Special attention to confounders is needed in DWAS because of the complex interplay among the underlying disease and its risk factors, other comorbid conditions and coprescribed medications, and the factors driving physicians' prescribing decisions. The accompanying report is not free from this limitation, as discussed extensively by the authors; one example is the association found between antibiotics and lung cancer, which may be explained by patients' smoking behavior. Another example is the protective association between anti-cholinesterases (a drug class used for dementia) and several cancers, which may be explained, at least in part, by the underlying dementia diagnosis, which several studies have found to be inversely associated with cancer (17–19). The lack of adjustment for potential confounders makes the current DWAS approach less informative.
(ii) Temporality: This is defined as the time between exposure and cancer development. Støer and colleagues (2) used a prespecified 12-month lag time between drug prescription and cancer diagnosis; lag times of 12–24 months have been used across published cancer DWAS (2–4). A relatively short lag time that is identical for all cancers may have increased the likelihood of reverse causation. This may explain, at least in part, the high number of drug signals detected for lung cancer (latency ranging between 5 and 19 years) compared with leukemia (minimum latency of around two years for chronic lymphocytic leukemia): 30 versus 9 signals, respectively, in one study using a 12-month lag time (2).
(iii) Dose–response analysis: This was considered in the accompanying report and in previously published DWAS, and was used as a filtering technique in some. Yet this analysis can be complicated by differences in associations that depend on how dose is defined (e.g., cumulative therapeutic duration, defined daily dose, or relative dose intensity), by non-linear relationships, or by confounded associations (e.g., drug dose may correlate with the severity of a cancer, or with a cancer risk factor such as body mass index). An example from the report by Støer and colleagues is the observed associations between antibiotics and cancers of the lung, bladder, and kidney, all with significant dose–response relationships. These observations carry a high possibility of confounding, either by indication, where chronic exposure to infection leads to inflammation that increases the patient's risk of cancer, or by lifestyle factors such as smoking, which predisposes to both infection and cancer.
(iv) Specificity: In most studies, the focus was on specificity by cancer type. The example presented above, in which antibiotics were associated with three different cancers in the accompanying report, can therefore indicate non-specificity and could disqualify those signals. The reverse, where different classes of antibiotics are associated with one cancer, can also be a sign of non-specificity. It is worth noting that the use of dose–response patterns and specificity as combined criteria for prioritizing drug–cancer signals filtered out 78% of the initially detected signals in one study (4).
(v) Non-exchangeability: This is a form of bias that can occur in cohort or case–control designs when the distribution of factors associated with the outcome differs by exposure status. It can be addressed with a case-only design such as the case-crossover design. One published study (3) that compared cancer DWAS results from cohort and case-crossover analyses showed a substantial reduction in the number of detected signals in the case-crossover design; surprisingly, however, that design detected only drug signals with a protective effect. The use of a case-only design in cancer DWAS may be questionable because such designs are intended for evaluating transient exposures and the risk of acute outcomes, such as work hours and traffic accidents (20).
(vi) Multiple testing: The large number of exposures (drugs) and outcomes (different cancers) tested in DWAS increases the probability of finding associations by chance and, therefore, of false-positive results. Ryan and colleagues (12) showed that false-positive results in DWAS can persist even at small thresholds of adjusted P values and suggested that confounders are to blame.
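The multiple-testing problem can be made concrete with a short Python sketch that compares two widely used adjustments: the Bonferroni correction and the Benjamini–Hochberg false discovery rate procedure. These two procedures are chosen here for illustration only; this commentary does not state which adjustment each published DWAS applied, and the P values below are made up.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag tests whose P value is at most alpha divided by the number of tests."""
    m = len(p_values)
    if m == 0:
        return []
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Flag tests rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Made-up P values for ten hypothetical drug-cancer pairs.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
n_unadjusted = sum(p <= 0.05 for p in p_vals)
n_bonferroni = sum(bonferroni(p_vals))
n_bh = sum(benjamini_hochberg(p_vals))
```

In this made-up example, five pairs pass an unadjusted threshold of 0.05, two survive the Benjamini–Hochberg procedure, and one survives the Bonferroni correction. Neither adjustment addresses confounding, which is the concern raised by Ryan and colleagues (12).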
In conclusion, the use of DWAS in cancer research holds promise for drug safety and cancer etiology research. The method efficiently expands the framework for cancer–drug association screening by using available large-scale healthcare databases. The accompanying study, together with previously published DWAS, highlights the need for innovative analytic methods that can filter out false signals and prioritize promising ones worthy of further investigation. One example is the work by Vilar and colleagues (21), who proposed similarity-based modeling techniques that incorporate digital molecular structure into existing methods. In this era of artificial intelligence, the application of natural language processing and machine learning algorithms may improve DWAS precision (22). The integration of drug molecular pathway analysis into this framework has shown great promise (23). These advancements need to be incorporated into a framework that takes into account the basic pharmacoepidemiologic sources of bias and confounding. In addition, incorporating biological information, including inherited genetic variation that modifies a patient's drug response, may shed light on causality.
Authors' Disclosures
No disclosures were reported.
Acknowledgments
This work was supported by the Intramural Research Program of the Division of Cancer Epidemiology and Genetics, National Cancer Institute (NCI), National Institutes of Health (NIH).