Identifying biomarkers that predict cancer cells’ response to drug treatment is one of the main challenges in precision oncology. Recent large-scale cancer pharmacogenomic studies have accelerated the search for predictive biomarkers by profiling thousands of human cancer cell lines at the molecular level and screening them with hundreds of approved drugs and experimental chemical compounds. Many studies have leveraged these data to build predictive models of response using various statistical and machine learning methods. However, a common limitation of these methods is their lack of interpretability: it is unclear how they make their predictions and which features are most associated with response, which hinders the clinical translation of these models. To alleviate this issue, we developed a new machine learning pipeline based on the recent LOBICO approach that explores the space of bimodally expressed genes in multiple large in vitro pharmacogenomic studies and builds multivariate, nonlinear, yet interpretable logic-based models predictive of drug response. Using a compendium of three of the largest pharmacogenomic data sets, we applied this pipeline to build robust and interpretable models for 101 drugs spanning 17 drug classes, with a high validation rate in independent data sets.
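
To make the idea of bimodal features feeding a logic-based model concrete, the following is a minimal sketch and not the authors' pipeline or LOBICO itself. It assumes that a bimodally expressed gene can be split into "low" and "high" states with a simple one-dimensional 2-means threshold, and that a drug-response rule is a hand-written Boolean formula over those states. The gene names, expression values, and the rule are all hypothetical; the actual method learns its formulas from data.

```python
from statistics import mean


def two_means_threshold(values, iters=100):
    """Split 1-D values into a low and a high group with 2-means.

    Returns the midpoint between the two group means. This is an
    illustrative stand-in for a bimodality analysis, not the method
    used in the abstract.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("need at least two distinct values")
    for _ in range(iters):
        mid = (lo + hi) / 2
        low = [v for v in values if v <= mid]
        high = [v for v in values if v > mid]
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # group means stopped moving: converged
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


def binarize(values, threshold):
    """Map each expression value to True (high state) or False (low state)."""
    return [v > threshold for v in values]


# Hypothetical expression values for six cell lines (assumed data).
expression = {
    "GENE_A": [0.1, 0.3, 5.2, 4.9, 0.2, 5.1],
    "GENE_B": [3.8, 0.4, 0.5, 4.1, 0.3, 0.2],
}

# One Boolean state per gene per cell line.
states = {
    gene: binarize(vals, two_means_threshold(vals))
    for gene, vals in expression.items()
}

# Hypothetical interpretable rule: sensitive if GENE_A is high OR GENE_B is high.
predicted_sensitive = [
    a or b for a, b in zip(states["GENE_A"], states["GENE_B"])
]
```

A rule of this form can be read directly as a biological statement ("cell lines with high GENE_A or high GENE_B are predicted sensitive"), which is the kind of interpretability the abstract contrasts with black-box models.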

Citation Format: Wail Ba-alawi, Sisira Kadambat Nair, Bo Li, Anthony Mammoliti, Petr Smirnov, Arvind Singh Mer, Linda Penn, Benjamin Haibe-Kains. Bimodality of gene expression in cancer patient tumors as interpretable biomarkers for drug sensitivity [abstract]. In: Proceedings of the AACR Virtual Special Conference on Artificial Intelligence, Diagnosis, and Imaging; 2021 Jan 13-14. Philadelphia (PA): AACR; Clin Cancer Res 2021;27(5_Suppl):Abstract nr PO-070.