Drug combination therapy has become a promising therapeutic strategy for cancer treatment. While high-throughput drug combination screening is effective for identifying synergistic drug combinations, measuring all possible combinations is impractical due to the vast space of therapeutic agents and cell lines. In this study, we propose a biologically-motivated deep learning approach to identify pathway-level features from drug and cell lines' molecular data for predicting drug synergy and quantifying the interactions in synergistic drug pairs. This method obtained an MSE of 70.6 ± 6.4, significantly surpassing previous approaches while providing potential candidate pathways to explain the prediction. We further demonstrate that drug combinations tend to be more synergistic when their top contributing pathways are closer to each other on a protein interaction network, suggesting a potential strategy for combination therapy with topologically interacting pathways. Our computational approach can thus be utilized both for prescreening of potential drug combinations and for designing new combinations based on proximity of pathways associated with drug targets and cell lines.

Implications:

Our computational framework may be translated in the future to clinical scenarios where synergistic drugs are tailored to the patient and additionally, drug development could benefit from designing drugs that target topologically close pathways.

Drug combination therapy has been a promising approach to treat cancers, especially in the fight against drug resistance (1). Inhibition of cancer cells with multiple agents can have a stronger and more durable effect than a single agent (1) and interfering simultaneously with multiple pathways might be more effective than single target agents (2). Thus, drug combinations provide more treatment options, especially for patients who are resistant to monotherapy or cannot be treated with surgery (3).

Conventional network-based approaches have demonstrated promising ways to discover combinatorial drugs through integrating various interaction networks between subjects of interest. For example, Peng and colleagues proposed a balanced solution to find optimal combinatorial therapy with maximum coverage of a given disease gene set and minimum off-target hits in a drug-target network (4). Huang and colleagues integrated genomic profiles into drug–gene and disease–gene networks, and then prioritized synergistic drug pairs by searching target genes that are significantly enriched in the disease signaling network (5). Finally, Cheng and colleagues used network proximity to quantify the relationship between target gene network and disease-causing gene network (6). These approaches, however, are limited to association of multiple drugs to one disease or type of cancer and lack dose information.

High-throughput drug screening is a useful platform for detecting synergistic drug combinations. However, due to the vast space of drug combinations and cell lines, measuring all possible combinations is impractical. Computational approaches thus provide an opportunity to facilitate the discovery of novel drug combination therapy by utilizing drug combination data generated from experimental approaches. Such drug combination data is collected in available databases from multiple studies, such as DrugComb database (7).

Recent studies have demonstrated that deep neural networks can achieve good performance using genomic data and drug chemical structure. For example, Preuer and colleagues fitted a deep learning model using cell line gene expression of a preselected gene set and drug chemical information, outperforming other machine learning algorithms (8). To address the high-dimensionality of the data, Zhang and colleagues proposed the AuDNNSynergy model (9) that predicts drug synergy with encoded omics data by applying the techniques of autoencoder, while Kim and colleagues utilized the multitask multimodal approach to predict drug synergism in understudied cancer types (10). Although autoencoders can efficiently compress data, they are not designed to interpret predictions. Furthermore, as genes do not work in isolation, it is often useful to look at higher structures that involve multiple genes and their interactions such as molecular pathways. Indeed, the first-place winner of the DREAM challenge for drug synergy prediction in 2018 showed that neighboring genes of drug targets in the protein–protein interaction network are predictive of drug synergy (11). Recently, the TranSynergy model by Liu and Xie connected between drug target genes via network propagation and looked for enrichment of the most predictive drug targets in an attempt to address model interpretability (12). However, their model showed only small improvement over the available state of the art methods and since it included only drug target genes, it missed potential downstream effects. Thus, the biological interpretation of their predictions remains challenging.

There is still a need for models with better explainability without sacrificing model performance. Interpretable prediction models allow clinicians to understand prediction results and thus are critical for translation of these models into the clinic (13). Molecular pathways have been the cornerstone of our understanding of cellular processes and means toward biological interpretability. Moreover, functional similarities at the pathway level have been shown to be useful for drug repurposing more than the single-gene level (14). We thus propose that associating molecular pathways with drug synergy could bridge the gap of explainability. Indeed, in our previous work, we demonstrated that pathway-level information can accurately predict drug sensitivity and provide biological insights into drug responses in cancer cell lines as well (15). In this study, we hypothesize that pathway-level interaction of two drugs can reflect drug synergistic effects on cancer cell lines. We describe a new deep learning model we term SynPathy, which incorporates molecular pathways originating from drug targets to predict drug synergy on cancer cell lines. We further utilize our method to test whether synergistic drugs are associated with close or distant pathways, showing that top associated pathways in synergistic drug pairs are closer to each other and to the cell line related pathways than either additive or antagonistic drug sets.

Data retrieval

The drug combination screening data were retrieved from the DrugComb database (7). Eligible combination data are drug-drug-cell line triplets that have canonical SMILE structure strings (used for model comparison), drug target information, and omics data available. We obtained SMILE strings via DrugComb's API (http://drugcomb.fimm.fi/api/drugs), while drug targets were collected from DrugBank (16), STITCH (17), or Therapeutic Target Database (TTD; ref. 18) by drug names or identifiers used by DrugComb database. We used the Loewe additive score as the target variable for the regression task.

Genomic data of cell lines were downloaded from Genomics of Drug Sensitivity in Cancer (GDSC) (19). Cancer cell lines were excluded if expression data, somatic mutation and copy-number variation data are not accessible from the download page (https://www.cancerrxgene.org/downloads/bulk_download). The 196 PID cancer signaling pathways and 1554 REACTOME drug metabolism pathways were retrieved from the C2 collection of MSigdb (https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp#C2; ref. 20). The human PPI network was downloaded from the STRING database (https://stringdb-static.org/download/protein.links.detailed.v11.5/9606.protein.links.detailed.v11.5.txt.gz; ref. 21).

Drug feature engineering

We implemented two types of drug features: pathway-based and based on chemical structure. For the pathway-based features, drug information was represented by closeness of target genes to pathway genes in the space of protein–protein interaction (PPI) network. For each drug, we measured closeness of its target genes to each of 1,750 pathways using the network-based pathway enrichment analysis (NetPEA; ref. 22), which consists of the following steps: (i) performed the random walk with restart algorithm to obtain a related probability for every PPI genes; (ii) quantified the enrichment of pathway relative to target genes by averaging probability scores on pathway gene nodes; and (iii) for each pathway, we ran a permutation test by comparing the enrichment score obtained from the step (ii) to the sample mean aggregated from the 1,000 randomly selected sets of genes with the same size of the pathway gene set. The resulting z-statistic thus indicates the level of closeness to a pathway relative to a drug target. Due to the nature of the random walk algorithm, higher node values mean higher probabilities of revisiting to query nodes. Therefore, the higher the enrichment score the closer they are to each other in the space of PPI network. For clarity, we coined our pathway features represented for the first drug of a drug–drug–cell triplet as DGNet1, DGNet2 for pathway features of the second drug.

For features based on chemical structure, drug information was represented by Morgan fingerprint. We used RDKit (http://www.rdkit.org), an open-source cheminformatics toolkit to convert a canonical SMILE string into a vector of size 256. The binary vector consists of 0 representing absence of a substructure and 1 stand for presence of a substructure. We note that converting to a vector of 1024 reduced slightly the performance from MSE of 70.6 to MSE of 71.1.

Cell line feature engineering

We applied the single-sample gene set enrichment (ssGSEA) algorithm (23) to estimate which set of pathway genes in PID and REACTOME pathways are significantly up- or downregulated within a cell line. We used the Python package, GSEApy (https://gseapy.readthedocs.io/en/master/gseapy_example.html), and set the permutation test to 1,000 times, which generates an enrichment matrix with 81 cancer cell lines by 1,750 pathways, and we termed these pathway features as EXP. We used the same procedures of DGNet feature engineering to generate per-sample pathway enrichment for somatic mutation and copy-number variation data as mutation data are collections of mutated genes for each cell line and copy-number variation (genes with abnormal gene copies) data, determined by the Genomic Identification of Significant Targets in Cancer (GISTIC) score.

Drug combination modeling with deep learning

We implemented the deep learning model with Keras (https://keras.io). The Python code is available at Github repository (https://github.com/TangYiChing/SynPathy). We concatenated our pathway-level features encoded from drug target and gene expression with chemical structure feature as it has been demonstrated that it can improve performance (8). We tested different normalization method to scale input data, including the conventional standardization, min-max, as well tanh and tanh with standard normalization (i.e., tanh_norm) suggested by previous study (8), selecting tanh, as it achieved the best performance comparing to other normalization approaches (also suggested by Preuer and colleagues, see Supplementary Table S1). For each drug combination, we used the two possible orderings of each drug pair in order to be independent of drug order. We then averaged the predicted synergy outcome of the pair of drugs (i.e., the synergy prediction of drug 1 + drug 2 with drug 2 + drug 1). We performed hyperparameter tuning using 10-fold cross validation with 70% of training data, 20% of validation data, and 10% of test data. We used the Adam optimizer and trained for 1,000 epochs with an early stopping mechanism at 25 epochs (after detecting a decrease in the validation loss). Mean squared error (MSE) was used to evaluate performance and select the optimal model. Hyperparameters were optimized using grid-search based on the performance on the validation set, including learning rate (0.00001, 0.0001, 0.001, 0.1), dropout (no dropout, 0.2, 0.5, 0.8) and batch size (32, 64, 128). The deep learning architecture and the best set of hyperparameter are listed in Supplementary Table S2.

Drug combination modeling with machine learning algorithms

We used scikit-learn API (24) to build other machine learning algorithms, including ElasticNet, Support Vector Machine (SVM; ref. 25) and XGBoost (26). Parameters of each model were optimized using Bayesian optimization with scikit-optimize package (https://scikit-optimize.github.io/stable/index.html). After search, we trained each model (ElasticNet: alpha = 0.01, l1_ratio = 0.4, XGBoost: gamma = 0.25, learning rate = 0.1, max depth = 5, reg_lambda = 10.0, SVM: C = 10.0, kernel = rbf) and reported average MSE across the ten folds after 10-fold cross validation for comparison.

Feature importance

We calculated Shapley values (27) with the SHAP package (28) to rank input pathway features based on their contributions (i.e., absolute Shapley values) to drug synergy prediction. Similar to the performance calculation, we averaged the Shapley values for each pair of drugs over the two different orderings (drug1+drug2 and drug2+drug1). For each drug–drug–cell line triplet, we identified the top pathway from the subset of DGNet1 features (i.e., top DGNet1), top pathway from the subset of DGNet2 features (i.e., top DGNet2), and correspondingly top pathway from the subset of EXP features (i.e., top EXP).

Pathway distance measure

We mapped pathway genes to protein interaction network to quantify pathway proximity using shortest path distance. Given gene networks GA of pathway A and GB of pathway B the proximity is defined as follows:

Where d(i,j) is the length of the shortest path between i and j.

Statistical analysis

After calculating pathway distances described above, we used the Wilcoxon rank-sum test to compare the differences in pathway-pathway closeness for synergistic samples and nonsynergistic samples which were stratified by a range of thresholds from 30 to 60 based on the Loewe score, where 30 was the minimal threshold suggested by DeepSynergy and 60 is the highest range of Loewe scores. The mean of each group is reported, as well as rank-sum statistics and P values. We set the FDR-adjusted significance level at 0.05.

Data availability

All datasets are publicly available and detailed in the data section of the manuscript.

Drug synergy prediction with drug target pathway networks

We used the DrugComb database, a large collection of experimental drug combination data of cancer cell lines which is integrated from eight studies (7), including 157,532 drug–drug–cell line combinations, comprising 7,787 drug pairs, 2,287 unique drugs with target information and 81 cancer cell lines with gene expression information (Materials and Methods, Fig. 1). We used the Loewe score as measure for drug synergy to compare with previous studies. Under the hypothesis that pathway-level information can be useful for prediction of drug combination effects on a cancer cell, our proposed model leverages enrichment of well-studied biological pathways (Materials and Methods, Fig. 1).

Figure 1.

Schematic view of the SynPathy framework. There are four main components. A, Collection of drug combination screening dataset, genomic data of cancer cell lines, and pathway gene sets. B, Drug feature representation by chemical structure, and pathway-level feature engineering using drug targets and gene expression. C, Drug combination modeling with deep learning model. D, Identification of top contributing pathways and comparison of pathway-pathway distance among synergistic, additive, antagonistic drug pairs.

Figure 1.

Schematic view of the SynPathy framework. There are four main components. A, Collection of drug combination screening dataset, genomic data of cancer cell lines, and pathway gene sets. B, Drug feature representation by chemical structure, and pathway-level feature engineering using drug targets and gene expression. C, Drug combination modeling with deep learning model. D, Identification of top contributing pathways and comparison of pathway-pathway distance among synergistic, additive, antagonistic drug pairs.

Close modal

As model features, we used 196 signaling pathways from the PID database and 1,750 metabolic pathways from the REACTOME database (Materials and Methods). We compared the performance of several machine learning methods, where the best-performing and final model used a 2-hidden layers deep neural network (Materials and Methods; Supplementary Tables S1–S3). The performance of our model from a 10-fold cross-validation obtained an MSE of 70.6 ± 6.4. Using only one type of pathways obtained decreased performance (MSE of 74.3 ± 5.8 and 128.2 ± 6.2 when using only PID and REACTOME pathways, respectively).

We tested adding additional cell line associated data types. Contrary to evidence from recent studies (9, 29), we found that including somatic mutation and copy-number variation had minor effect on performance (MSE of 70.7 ± 5.9; Supplementary Table S4). Therefore, we resolved to using only one type of cell line features (i.e., gene expression, termed EXP) along with the two types of drug features (Chemical structure and Drug-Gene Network, or DGNet, see Methods) and for further comparison.

Model comparison and generalizability

Previously published methods trained and tested their model only on the O'Neil dataset (30), which is part of the DrugComb dataset. To compare our performance to previously published methods, we applied our model to the O'Neil dataset, this time using repeated 5-fold cross validation to match the choice of other methods. SynPathy obtained an MSE of 64.7 ± 4.2 on the O'Neil dataset, substantially outperforming DeepSynergy (MSE = 255.49; ref. 8), TranSynergy (MSE = 231 ± 21; ref. 12), Kim and colleagues multitask multimodal learning approach (MSE = 181.7; ref. 10), and Zhang and colleagues omics-integrated autoencoders (MSE = 241.12; ref. 9).

The DrugComb dataset is comprised of drug combination data from 15 datasets, eight of which include drugs with known targets and gene expression that could be incorporated into our model. The distribution of Loewe score varies between these eight studies owing to difference in their experimental designs (Fig. 2; ref. 7). We thus tested the generalizability of our algorithm across drug synergy sources by training our model with all datasets except one dataset, and then testing on the hold-out study dataset in a scenario we call Leave-One Study-Out (LOSO; Table 1). Out of previously published methods, only Kim and colleagues attempted a similar LOSO scenario on the O'Neil study as the left-out dataset. As they focused only on a subset of the tissues and their training set included also small datasets that include only drugs without target information or cell lines without OMICs data, it is not an exact comparison but does show large difference in performance (Kim and colleagues, MSE = 742.1 versus our results of MSE of 279.4).

Figure 2.

Distribution of Loewe score for drug combination data from eight studies.

Figure 2.

Distribution of Loewe score for drug combination data from eight studies.

Close modal
Table 1.

Prediction performance of our proposed model in external validation in a LOSO cross validation (total samples in all datasets is 157,532).

Train setTest set# of samples in the test studyMSE
All other datasets ALMANAC 139,759 273.1 
All other datasets FORCINA 1,423 383.6 
All other datasets NCATS_2D_3D 29.9 
All other datasets NCATS_ES(FAKi/AURKi) 1,441 71.0 
All other datasets NCATS_ES(Nampt+PARP) 36 39.6 
All other datasets NCATS_HL 1,869 347.5 
All other datasets O'NEIL 12,936 279.4 
All other datasets Yohe 65 91.9 
Train setTest set# of samples in the test studyMSE
All other datasets ALMANAC 139,759 273.1 
All other datasets FORCINA 1,423 383.6 
All other datasets NCATS_2D_3D 29.9 
All other datasets NCATS_ES(FAKi/AURKi) 1,441 71.0 
All other datasets NCATS_ES(Nampt+PARP) 36 39.6 
All other datasets NCATS_HL 1,869 347.5 
All other datasets O'NEIL 12,936 279.4 
All other datasets Yohe 65 91.9 

Pathways of synergistic drugs are closer to each other

We proceeded to leverage the explainability of computed pathway to investigate the relationship between synergistic drugs and their most predictive pathways. We asked whether there are underlying common molecular traits for synergistic combinations. Specifically, we tested whether drug-associated pathways lie closer in synergistic pairs than additive or antagonistic drugs on the protein network, and whether the distance is also shorter between the synergistic drug-associated pathways and the cell-associated pathway. We selected the top-ranking pathway with highest Shapley value from each drug- and cell-line triplet features, notated as top1 DGNet1, top1 DGNet2, and top1 EXP, respectively. We then measured pathway–pathway closeness for each pair based on shortest path distance. Several shortest path-based measures have been explored in a previous study (6), but they did not account for the number of genes in each pathway, which can skew the results if not done in a symmetrical manner. To address this, we took the average of shortest distance from both sides (Materials and Methods). Samples were stratified into synergistic, additive, and antagonistic groups based on a range of thresholds on the real Loewe synergy scores (Materials and Methods).

We observed that the synergistic group had significantly smaller shortest path distance between drug-associated pathways compared to the additive group or the antagonistic group across a range of thresholds, starting with the DeepSynergy threshold of Loewe score of 30 up to the top values of Loewe scores in our dataset (overall, average values of 20.2 ± 1.6, 35.7 ± 0.1, and 30.3 ± 0.6 for the synergistic, additive and antagonistic groups, respectively. Wilcoxon rank-sum test (Benjamini–Hochberg FDR-adjusted; ref. 31) P < 0.001 between synergistic and additive and FDR-adjusted P < 0.04 between synergistic and antagonistic, Fig. 3). Notably, antagonistic drugs are even more distinct from additive in terms of distance between drug-associated pathways (P < E–21). When computing the distance between drugs and cell line enriched pathways, synergistic versus additive combinations remained significant (Wilcoxon rank-sum test FDR-adjusted P <0.05, Fig. 3) while synergistic versus antagonistic combinations are not. Again, antagonistic drugs are highly different from additive drugs also when considering distance to the cell line (FDR-adjusted P < 5E–31).

Figure 3.

Line plots of pathway closeness using top pathways across thresholds from 30 to 60. Synergy group is defined as samples with Loewe score equal or larger than the cutoff, antagonism group is the samples with Loewe score equal or lower than the negative cutoff, while the rest of samples are defined as the additive group.

Figure 3.

Line plots of pathway closeness using top pathways across thresholds from 30 to 60. Synergy group is defined as samples with Loewe score equal or larger than the cutoff, antagonism group is the samples with Loewe score equal or lower than the negative cutoff, while the rest of samples are defined as the additive group.

Close modal

Case study – the ovarian cancer cell line

We provide a proof of concept for the applicability of SynPathy for precision medicine settings. To simulate a case of a new patient tumor, we performed a leave-one-cell-out (LOCO) procedure to validate cell line-specific performance when the cell line was not part of the training set. We compared to the state-of-art model TranSynergy based on the metric of Pearson Correlation Coefficient (PCC) that they used instead of MSE (12). Our model obtained better performance (Pearson ρ = 0.59 ± 0.2) than the state-of-art model (Pearson ρ = 0.51 ± 0.02).

As a case study, we selected the best performing cell line, A2780, an ovarian cancer cell line, with an MSE of 104.8 (Pearson ρ = 0.8), to demonstrate the use of pathways to provide potential explanation for drug synergy. While the full list is provided in Supplementary Tables S5 and S6, we highlight the two best performing examples of synergistic drug pairs on A2780.

The first example includes the drugs ridaforolimus (AP23573; MK-8669) and dactolisib (BEZ235). Ridaforolimus is a mTOR inhibitor and dactolisib is a mTOR/PI3K dual inhibitor (32, 33). The mTOR/PI3K/AKT signaling pathway is a well-known oncogenic pathway that can promote tumor cell growth and division, and has been previously implicated in ovarian cancer (34, 35). While both drugs share the same target, the two top contributing pathways to our model are different, but functionally related to the mTOR signaling pathway: Reactome REACTOME_SODIUM_ COUPLED_PHOSPHATE_COTRANSPORTERS (ridaforolimus, AP23573; MK-8669) and REACTOME_SARS_COV_1_GENOME_ REPLICATION_AND_TRANSCRIPTION (formerly REACTOME_ GENOME_REPLICATION_AND_TRANSCRIPTION; dactolisib). REACTOME_SODIUM_COUPLED_PHOSPHATE_COTRANSPORTERS includes solute carrier family 20 and 34 members (SLC34A and SLC20A protein families). Two key proteins of the pathway, Solute Carrier Family 34 Member 2 (SLC34A2) and Solute Carrier Family 20 Member 1 (SLC20A1), are both implicated in the Wnt/catenin signaling pathway involved in cell growth through cross talk with the mTOR pathway (36, 37). REACTOME_SARS_COV_1_ GENOME_REPLICATION_AND_TRANSCRIPTION includes glycogen synthase kinase 3 (GSK3; IPR039192, FDR-adjusted P < 9E–5), which plays a critical role in the Akt signaling pathway controlling cell proliferation, survival, and metabolism (38). Interestingly, the top EXP pathway, Reactome's BH3_ONLY_PROTEINS_ASSOCIATE_WITH_ A ND_INACTIVATE_ANTI_APOPTOTIC_BCL_2_MEMBERS, includes apoptosis regulator, the B-cell lymphoma-2 (Bcl-2) that is negatively regulated by the mTOR signal (39). Our observation is thus aligned with the existing evidence that a dual PI3K/mTOR inhibitor can be effective for the ovarian cancer cell line A2780 due to disruption of the balance of Bcl-2 family proteins (40).

The second example is the combination of a WEE1 G2 Checkpoint Kinase (Wee1) inhibitor (adavosertib; AZD1775; MK-1775) and the dual mTOR/PI3K inhibitor ridaforolimus with A2780. While the two drugs target different pathways, the top pathways in the synergy prediction are PID's REELIN_PATHWAY (adavosertib) and Reactome ACTIVATED_NTRK2_SIGNALS_THROUGH_RAS (ridaforolimus) and are enriched in protein kinase domains (IPR011009, IPR000719 and IPR017441, FDR-adjusted P < 0.001) composed of proteins that are involved in cell cycle, including AKT Serine/Threonine Kinase 1 (AKT1; refs. 41, 42), Phosphatidylinositol-4,5-Bisphosphate 3-Kinase Catalytic Subunit Alpha (PIK3CA), Phosphoinositide-3-Kinase Regulatory Subunit 1 (PIK3R1; refs. 43, 44), MAPK phosphatase (MPK; ref. 45), KRAS (46), and Reelin (RELN) (ref. 47; Supplementary Fig. S1). The effectiveness of the synergistic combination of adavosertib and ridaforolimus for inhibiting cell growth of ovarian cancer was previously demonstrated in vitro and in vivo (O'Neil and colleagues, 2016), where the underlying mechanism within the context of cell-cycle regulation has only recently been elucidated with the discovery of the expanded role of PI3K in DNA damage and genome instability: A recent study has demonstrated that inhibiting PI3K signal can enhance replication stress and thus inhibition of cell-cycle checkpoint such as WEE1, a key checkpoint at the G2–M transition, can have synergistic effect (48). Similarly, the combination of mTOR/Wee1 inhibition also has a demonstrated role in augmenting replication stress that can synergistically induce apoptosis in ovarian cancer cells (49).

We described SynPathy: a deep learning approach incorporating pathway information of target genes and cell line expression profile for predicting drug synergy. While our pathway-based model obtained better performance compared with existing methods, the main contribution of this study is that our approach attempts to address the explainability of drug synergy prediction models. Using this approach, we demonstrated that the top influencing pathways for synergistic drugs tend to be closer than additive and antagonistic drug pairs on a protein interaction network. In addition, we described two examples that demonstrate how two drugs can be synergistic when their top associated pathways relate to the top pathway enriched with the cancer cell line expression profile. This approach shed lights on the influence of off-target networks to drug synergy, which is different from traditional network-based approaches that search for overlapping genes between primary drug targets and disease-associated genes (6, 11). In addition, it is a proof of concept for future expansion into a prescreening tool for prioritizing drug combinations for appropriate patients based on similarity of genomic profiles between cancer cell lines and the patient tumor.

When training and testing the method on independent datasets, we identified that the performance declines relative to training and testing on the datasets from the same source. This is likely due to the low overlap of drug combination between these studies and different distribution of the Loewe scores (50). While applying transfer learning techniques can mitigate this problem (10), their results demonstrate that their utility is still limited.

The top contributing pathways of synergistic drugs were significantly closer to each other than in additive or antagonistic drugs. However, top contributing pathways in synergistic and antagonistic drug pair prediction were significantly closer to the cell line only relative to additive drug pairs, but not relative to each other. This suggests that while nonadditive effect occurs when the drug-associated pathways lie close to the overexpressed genes of the cell line, the directionality of effect (i.e., synergistic or antagonistic) can differ based on the drugs involved.

We provide two examples involving ovarian cancer cell line A2780 and different combinations of drugs. Interestingly, the two of the drugs in one of our examples, dactolisib and ridaforolimus, are also described in TransSynergy as drugs that are part of a synergistic drug combination that are well predicted by their algorithm. While we identified other combinations involving these two drugs from the ones predicted by TransSynergy, it is worthwhile to note that by providing pathway information instead of just drug target, we potentially improve the ability to offer mechanistic interpretation.

A limitation of our method is the requirement to know the drug target, information which is unavailable for a subset of experimental drugs. We have utilized multiple publicly available drug target databases to reduce this effect. Alternatively, we can resort to using only chemical structure for drug features with a small reduction in performance (MSE = 72.6 ± 5.4, Supplementary Table S4), but we then lose much of the explainability of the model. While more prone to false positives, a possible alternative will be to incorporate inferred drug targets and is subject to further research.

By providing explainable pathways for the synergy prediction, our method could be a first step towards bridging the translational gap between computational methods and clinical utility. Furthermore, it could help in drug development in designing drug pairs that target close but complementary pathways.

No disclosures were reported.

Y. Tang: Conceptualization, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. A. Gottlieb: Formal analysis, supervision, funding acquisition, investigation, writing–original draft, project administration, writing–review and editing.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1.
Humphrey
RW
,
Brockway-Lunardi
LM
,
Bonk
DT
,
Dohoney
KM
,
Doroshow
JH
,
Meech
SJ
, et al
.
Opportunities and challenges in the development of experimental drug combinations for cancer
.
J Natl Cancer Inst
2011
;
103
:
1222
6
.
2.
Petrelli
A
,
Giordano
S
.
From single- to multi-target drugs in cancer therapy: when aspecificity becomes an advantage
.
Curr Med Chem
2008
;
15
:
422
32
.
3.
Bayat Mokhtari
R
,
Homayouni
T
,
Baluch
N
,
Morgatskaya
E
,
Kumar
S
,
Das
B
.
Combination therapy in combating cancer
.
Oncotarget
2017
;
8
:
38022
43
.
4.
Pang
K
,
Wan
YW
,
Choi
WT
,
Donehower
LA
,
Sun
J
,
Pant
D
, et al
.
Combinatorial therapy discovery using mixed integer linear programming
.
Bioinformatics
2014
;
30
:
1456
63
.
5.
Huang
L
,
Li
F
,
Sheng
J
,
Xia
X
,
Ma
J
,
Zhan
M
, et al
.
DrugComboRanker: drug combination discovery based on target network analysis
.
Bioinformatics
2014
;
30
:
i228
36
.
6.
Cheng
F
,
Kovács
IA
,
Barabási
AL
.
Network-based prediction of drug combinations
.
Nat Commun
2019
;
10
;
1197
.
7.
Zheng
S
,
Aldahdooh
J
,
Shadbahr
T
,
Wang
Y
,
Aldahdooh
D
,
Bao
J
, et al
.
DrugComb update: a more comprehensive drug sensitivity data repository and analysis portal
.
Nucleic Acids Res
2021
;
49
:
W174
84
.
8.
Preuer
K
,
Lewis
RPI
,
Hochreiter
S
,
Bender
A
,
Bulusu
KC
,
Klambauer
G
.
DeepSynergy: predicting anti-cancer drug synergy with deep learning
.
Bioinformatics
2018
;
34
:
1538
46
.
9.
Zhang
T
,
Zhang
L
,
Payne
PRO
,
Li
F
.
Synergistic drug combination prediction by integrating multiomics data in deep learning models
. In:
Markowitz
J
,
editor
.
Translational Bioinformatics for Therapeutic Development
.
New York, NY
:
Springer US
;
2021
:
223
38
.
10.
Kim
Y
,
Zheng
S
,
Tang
J
,
Zheng
WJ
,
Li
Z
,
Jiang
X
.
Anticancer drug synergy prediction in understudied tissues using transfer learning
.
J Am Med Inform Assoc
2021
;
28
:
42
51
.
11.
Li
H
,
Li
T
,
Quang
D
,
Guan
Y
.
Network propagation predicts drug synergy in cancers
.
Cancer Res
2018
;
78
:
5446
57
.
12.
Liu
Q
,
Xie
L
.
TranSynergy: mechanism-driven interpretable deep neural network for the synergistic prediction and pathway deconvolution of drug combinations
.
PLOS Comput Biol
2021
;
17
:
e1008653
.
13.
Azuaje
F
.
Computational models for predicting drug responses in cancer research
.
Brief Bioinform
2017
;
18
:
820
9
.
14.
Di
J
,
Zheng
B
,
Kong
Q
,
Jiang
Y
,
Liu
S
,
Yang
Y
, et al
.
Prioritization of candidate cancer drugs based on a drug functional similarity network constructed by integrating pathway activities and drug activities
.
Mol Oncol
2019
;
13
:
2259
77
.
15.
Tang
YC
,
Gottlieb
A
.
Explainable drug sensitivity prediction through cancer pathway enrichment
.
Sci Rep
2021
;
11
:
3128
.
16.
Wishart
DS
,
Feunang
YD
,
Guo
AC
,
Lo
EJ
,
Marcu
A
,
Grant
JR
, et al
.
DrugBank 5.0: a major update to the DrugBank database for 2018
.
Nucleic Acids Res
2018
;
46
:
D1074
82
.
17.
Szklarczyk
D
,
Santos
A
,
von Mering
C
,
Jensen
LJ
,
Bork
P
,
Kuhn
M
.
STITCH 5: augmenting protein–chemical interaction networks with tissue and affinity data
.
Nucleic Acids Res
2016
;
44
:
D380
4
.
18.
Wang
Y
,
Zhang
S
,
Li
F
,
Zhou
Y
,
Zhang
Y
,
Wang
Z
, et al
.
Therapeutic target database 2020: enriched resource for facilitating research and early development of targeted therapeutics
.
Nucleic Acids Res
2020
;
48
:
D1031
41
.
19.
Yang
W
,
Soares
J
,
Greninger
P
,
Edelman
EJ
,
Lightfoot
H
,
Forbes
S
, et al
.
Genomics of drug sensitivity in cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells
.
Nucleic Acids Res
2012
;
41
:
955
61
.
20.
Liberzon
A
,
Subramanian
A
,
Pinchback
R
,
Thorvaldsdóttir
H
,
Tamayo
P
,
Mesirov
JP
.
Molecular signatures database (MSigDB) 3.0
.
Bioinformatics
2011
;
27
:
1739
40
.
21.
Szklarczyk
D
,
Gable
AL
,
Lyon
D
,
Junge
A
,
Wyder
S
,
Huerta-Cepas
J
, et al
.
STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets
.
Nucleic Acids Res
2019
;
47
:
D607
13
.
22.
Liu
L
,
Ruan
J
.
Network-based pathway enrichment analysis
.
IEEE Int Conf Bioinforma Biomed
2013
:
218
21
.
23.
Barbie
DA
,
Tamayo
P
,
Boehm
JS
,
Kim
SY
,
Moody
SE
,
Dunn
IF
, et al
.
Systematic RNA interference reveals that oncogenic KRAS-driven cancers require TBK1
.
Nature
2009
;
462
:
108
12
.
24.
Buitinck
L
,
Louppe
G
,
Blondel
M
,
Pedregosa
F
,
Mueller
A
,
Grisel
O
, et al
.
API design for machine learning software: experiences from the scikit-learn project
.
Eur Conf Mach Learn Princ Pract Knowl Discov Databases
.
2013
.
25.
Chang
CC
,
Lin
CJ
.
LIBSVM: a library for support vector machines
.
ACM Trans Intell Syst Technol
2011
;
2
:
27
.
26.
Chen
T
,
Guestrin
C
.
XGBoost: A Scalable Tree Boosting System
.
Proc 22nd ACM SIGKDD Int Conf Knowl Discov Data Min
.
2016
:
785
94
.
27.
Shapley
LS
.
17. A value for n-person games
.
1953
;
307
18
.
28.
Lundberg
SM
,
Lee
SI
.
A unified approach to interpreting model predictions
.
NIPS17 Proc 31st Int Conf Neural Inf Process Syst
.
2017
:
4768
77
.
29.
Celebi
R
,
Bear Don't Walk
O
,
Movva
R
,
Alpsoy
S
,
Dumontier
M
.
In-silico prediction of synergistic anti-cancer drug combinations using multi-omics data
.
Sci Rep
2019
;
9
:
8949
.
30.
O'Neil
J
,
Benita
Y
,
Feldman
I
,
Chenard
M
,
Roberts
B
,
Liu
Y
, et al
.
An unbiased oncology compound screen to identify novel combination strategies
.
Mol Cancer Ther.
2016
;
15
:
1155
62
.
31.
Benjamini
Y
,
Hochberg
Y
.
Controlling the false discovery rate: a practical and powerful approach to multiple testing
.
J R Stat Soc Ser B Methodol
1995
;
57
:
289
300
.
32.
Hsu
CM
,
Lin
PM
,
Tsai
YT
,
Tsai
MS
,
Tseng
CH
,
Lin
SF
, et al
.
NVP-BEZ235, a dual PI3K-mTOR inhibitor, suppresses the growth of FaDu hypopharyngeal squamous cell carcinoma and has a synergistic effect with cisplatin
.
Cell Death Discov
2018
;
4
:
57
.
33.
Rivera
VM
,
Squillace
RM
,
Miller
D
,
Berk
L
,
Wardwell
SD
,
Ning
Y
, et al
.
Ridaforolimus (AP23573; MK-8669), a potent mTOR inhibitor, has broad antitumor activity and can be optimally administered using intermittent dosing regimens
.
Mol Cancer Ther
2011
;
10
:
1059
71
.
34.
Ediriweera
MK
,
Tennekoon
KH
,
Samarakoon
SR
.
Role of the PI3K/AKT/mTOR signaling pathway in ovarian cancer: biological and therapeutic significance
.
Semin Cancer Biol
2019
;
59
:
147
60
.
35.
Ghoneum
A
,
Said
N
.
PI3K-AKT-mTOR and NFκB pathways in ovarian cancer: implications for targeted therapeutics
.
Cancers
2019
;
11
:
949
.
36.
Arend
RC
,
Londoño-Joshi
AI
,
Straughn
JM
,
Buchsbaum
DJ
.
The Wnt/β-catenin pathway in ovarian cancer: a review
.
Gynecol Oncol
2013
;
131
:
772
9
.
37.
Pizzagalli
MD
,
Bensimon
A
,
Superti-Furga
G
.
A guide to plasma membrane solute carrier proteins
.
FEBS J
2021
;
288
:
2784
835
.
38.
Evangelisti
C
,
Chiarini
F
,
Paganelli
F
,
Marmiroli
S
,
Martelli
AM
.
Crosstalks of GSK3 signaling with the mTOR network and effects on targeted therapy of cancer
.
Biochim Biophys Acta BBA Mol Cell Res
2020
;
1867
:
118635
.
39.
Long
HZ
,
Cheng
Y
,
Zhou
ZW
,
Luo
HY
,
Wen
DD
,
Gao
LC
.
PI3K/AKT signal pathway: a target of natural products in the prevention and treatment of Alzheimer's disease and Parkinson's disease
.
Front Pharmacol
2021
;
12
:
648636
.
40.
Hu
X
,
Xia
M
,
Wang
J
,
Yu
H
,
Chai
J
,
Zhang
Z
, et al
.
Dual PI3K/mTOR inhibitor PKI-402 suppresses the growth of ovarian cancer cells by degradation of Mcl-1 through autophagy
.
Biomed Pharmacother
2020
;
129
:
110397
.
41.
Kandel
ES
,
Skeen
J
,
Majewski
N
,
Di Cristofano
A
,
Pandolfi
PP
,
Feliciano
CS
, et al
.
Activation of Akt/protein kinase B overcomes a G2/M cell cycle checkpoint induced by DNA damage
.
Mol Cell Biol
2002
;
22
:
7831
41
.
42.
Xu
N
,
Lao
Y
,
Zhang
Y
,
Gillespie
DA
.
Akt: A double-edged sword in cell proliferation and genome stability
.
J Oncol Hindawi
2012
;
2012
:
e951724
.
43.
García
Z
,
Kumar
A
,
Marqués
M
,
Cortés
I
,
Carrera
AC
.
Phosphoinositide 3-kinase controls early and late events in mammalian cell division
.
EMBO J
2006
;
25
:
655
61
.
44.
Liang
J
,
Slingerland
JM
.
Multiple roles of the PI3K/PKB (Akt) pathway in cell cycle progression
.
Cell Cycle
2003
;
2
:
339
45
.
45.
MacCorkle
RA
,
Tan
TH
.
Mitogen-activated protein kinases in cell-cycle control
.
Cell Biochem Biophys
2005
;
43
:
451
61
.
46.
Fan
J
,
Bertino
JR
.
K-ras modulates the cell cycle via both positive and negative regulatory pathways
.
Oncogene
1997
;
14
:
2595
607
.
47.
Qin
X
,
Lin
L
,
Cao
L
,
Zhang
X
,
Song
X
,
Hao
J
, et al
.
Extracellular matrix protein Reelin promotes myeloma progression by facilitating tumor cell proliferation and glycolysis
.
Sci Rep
2017
;
7
:
45305
.
48.
Huang
TT
,
Lampert
EJ
,
Coots
C
,
Lee
JM
.
Targeting the PI3K pathway and DNA damage response as a therapeutic strategy in ovarian cancer
.
Cancer Treat Rev
2020
;
86
:
102021
.
49.
Li
F
,
Guo
E
,
Huang
J
,
Lu
F
,
Yang
B
,
Xiao
R
, et al
.
mTOR inhibition overcomes primary and acquired resistance to Wee1 inhibition by augmenting replication stress in epithelial ovarian cancers
.
Am J Cancer Res
2020
;
10
:
908
24
.
50.
Menden
MP
,
Wang
D
,
Mason
MJ
,
Szalai
B
,
Bulusu
KC
,
Guan
Y
, et al
.
Community assessment to advance computational prediction of cancer drug combinations in a pharmacogenomic screen
.
Nat Commun
2019
;
10
:
2674
.

Supplementary data