Background: Machine learning models that predict drug sensitivity from a single omics data type frequently fail in precision medicine settings. Proteomics, for example, reflects systems biology and regulatory networks better than genomics, but it is often unavailable for preclinical and in vivo datasets, mainly because of the cost of data generation. By contrast, large amounts of genomic data are available, such as CNV, mutation, and RNA-seq profiles. These data do not capture post-translational modifications of proteins, which limits their utility for biomarker discovery.

Objective: The main objective of this study is to bridge the proteomics data gap. We propose a novel modeling approach that improves the accuracy of drug sensitivity prediction by combining genomic and proteomic signatures. In addition to genomic and transcriptomic data, the model uses protein activity inferred from gene expression with VIPER in in vitro data, an approach that is extendable to in vivo settings.

Materials and Methods: Using the PharmacoGx package developed in our lab, we downloaded, curated, and annotated the genomic and pharmacologic data of the Cancer Cell Line Encyclopedia (CCLE), as well as the Cancer Therapeutics Response Portal (CTRPv2) dataset, a continuation of the CTRP project and the largest pharmacologic screen conducted to date, containing several hundred thousand drug dose-response curves. From the CCLE dataset we extracted the following signatures: RPPA, RNASeq, Mutation, and CNV; we then inferred VIPER protein activity from RNASeq. Finally, we built a model that a) evaluates the different omics combinations, b) applies random forest with ten-fold cross-validation to predict sensitivity in CTRPv2, and c) assesses the significance of each model using the concordance index (CI) package developed in our lab.
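
As an illustration of steps a) and c), the following minimal Python sketch enumerates candidate omics combinations and scores a set of predictions with a pairwise concordance index. It is a sketch under stated assumptions, not the implementation used in the study: the function names `omics_combinations` and `concordance_index` are hypothetical, the abstract does not state whether every subset of data types was evaluated, and the CI shown is the standard pairwise definition rather than the lab's CI package. The random forest fitting in step b) is omitted, and the numbers in the usage lines are toy values, not study data.

```python
from itertools import combinations

# Data types named in the abstract; treating each as one feature block
# is an assumption made for this sketch.
OMICS = ["RPPA", "RNASeq", "Mutation", "CNV", "VIPER"]


def omics_combinations(blocks):
    """Return every non-empty subset of the given blocks, smallest first."""
    return [
        combo
        for size in range(1, len(blocks) + 1)
        for combo in combinations(blocks, size)
    ]


def concordance_index(observed, predicted):
    """Fraction of comparable pairs that both vectors rank in the same order.

    Pairs tied in the observed values are skipped; pairs tied only in the
    predictions count as half concordant (a common convention, assumed here).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(observed)), 2):
        d_obs = observed[i] - observed[j]
        if d_obs == 0:
            continue  # tied observations carry no ranking information
        comparable += 1
        d_pred = predicted[i] - predicted[j]
        if d_pred == 0:
            concordant += 0.5
        elif (d_obs > 0) == (d_pred > 0):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Five data types give 2**5 - 1 non-empty combinations.
    print(len(omics_combinations(OMICS)))
    # Toy sensitivity values for four cell lines (illustrative only).
    print(concordance_index([0.1, 0.4, 0.35, 0.8], [0.2, 0.3, 0.5, 0.9]))
```

Under this definition, a CI of 0.5 corresponds to random ranking and 1.0 to perfect agreement between predicted and observed sensitivity, which is how the CI values reported below can be read.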

Results: The proposed model was tested in vitro using the cell lines common to CCLE and CTRPv2. We tested the model on drugs with known biomarkers in the pharmacogenomics literature. For lapatinib and its biomarker ERBB2, combining the CNV and VIPER models gave the best CI of 0.95, compared with 0.77 and 0.9 for CNV and VIPER alone, respectively. Similarly, for crizotinib and its biomarker MET, integrating RNASeq, Mutation, RPPA, and VIPER gave the best CI of 0.89, compared with 0.45, 0.5, 0.85, and 0.5 for the individual models, respectively.

Conclusion: Omics integration improved drug sensitivity prediction compared with single-omics models. Applying the proposed model in vivo is expected to improve drug development and the quality of predictions in precision medicine.

Citation Format: Hassan Mahmoud, Benjamin Haibe-Kains. Drug sensitivity prediction modeling from genomics, transcriptomics and inferred protein activity [abstract]. In: Proceedings of the AACR Special Conference on Advancing Precision Medicine Drug Development: Incorporation of Real-World Data and Other Novel Strategies; Jan 9-12, 2020; San Diego, CA. Philadelphia (PA): AACR; Clin Cancer Res 2020;26(12_Suppl_1):Abstract nr 33.