A panel of more than 600 cell lines from 17 tumor types has been profiled, and its sensitivity to a set of FDA-approved compounds with different mechanisms of action has been tested. Comparison of the gene expression profiles with an overlapping set of publicly available profiles showed 100% accuracy of cell line identity prediction using a nearest neighbor classifier. A similar analysis of copy number variation (CNV) data had 80% accuracy, owing to relatively little CNV perturbation in some of the cell lines. Significant gene expression signatures were detected for 80% of the compounds. De novo agnostic classification based on a 50% train/test split and a linear classifier gave a significant prediction on the test set for about 40% of the compounds, such as dasatinib, 5-FU, and paclitaxel, but failed to produce a significant prediction for others, such as doxorubicin, irinotecan, and vinblastine. For most of the compounds, the prediction of response is complex, with multiple distinct molecular features contributing to a classification algorithm. This inherent complexity requires integration of gene expression, CNV, and mutation data, as well as large cell line sets, for the development of accurate classification algorithms. We defined functional CNV and single nucleotide variant (SNV) events using gene expression-based modules as a functional readout. Predictive models that incorporate prior knowledge of the compounds' mechanisms of action and rely on functional SNV and CNV events outperform completely agnostic methods of prediction.
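
The nearest neighbor identity check described above can be illustrated with a minimal sketch. The abstract does not state which similarity measure was used, so Pearson correlation is an assumption here. The function names (`match_cell_lines`, `identity_accuracy`) are hypothetical, and the data are synthetic stand-ins for expression profiles, not the study's data.

```python
import numpy as np

def match_cell_lines(query, reference):
    """Return, for each query profile (row), the index of the nearest reference profile.

    Similarity is Pearson correlation across genes (an assumption; the
    source does not name the metric).
    """
    def standardize(x):
        # Center each profile and scale it to unit length, so that the dot
        # product of two standardized profiles is their Pearson correlation.
        x = x - x.mean(axis=1, keepdims=True)
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    corr = standardize(query) @ standardize(reference).T
    return corr.argmax(axis=1)

def identity_accuracy(query, reference, query_labels, reference_labels):
    """Fraction of query profiles whose nearest reference has the same label."""
    nearest = match_cell_lines(query, reference)
    predicted = np.asarray(reference_labels)[nearest]
    return float(np.mean(predicted == np.asarray(query_labels)))

# Synthetic demo: 5 hypothetical cell lines, 200 genes, and a noisy
# re-measurement of each line standing in for the public profiles.
rng = np.random.default_rng(0)
reference = rng.normal(size=(5, 200))
query = reference + rng.normal(scale=0.3, size=reference.shape)
labels = np.array(["line_a", "line_b", "line_c", "line_d", "line_e"])

accuracy = identity_accuracy(query, reference, labels, labels)
print(f"identity prediction accuracy: {accuracy:.2f}")
```

Because each profile is matched on its overall correlation pattern, a data type with little variation between cell lines gives the classifier less signal to separate them, which is consistent with the lower accuracy reported for CNV data.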

