Deconvolution of individual reference cell type profiles from their mixture in a bulk sample can yield biological insights into the cellular heterogeneity of tumors and their microenvironment. Deconvolution can reveal both the proportions of individual cell types and the activity levels of each gene and regulatory region in the constituent cell types, thus reducing the need for single-cell profiling, which can be prohibitively costly and is still not systematically applicable to epigenomic profiles. Current methods for mixture deconvolution use regression to calculate the composition from a predefined set of reference expression signatures, an approach with several shortcomings, including: (1) incorporation of only one data type (either epigenomic or transcriptomic, but not both); (2) inability to incorporate prior knowledge regarding the mixture composition; (3) lack of accounting for the variability in the data sets that are used to generate the reference signatures; and (4) variability in prediction due to the choice of genes used to populate the signatures matrix. Here, we present two computational approaches that overcome these limitations by deconvolving cell type mixture profiles jointly across both transcriptomic and epigenomic datasets. The first method, bDeconvolve, is a hierarchical Bayesian model that jointly models the epigenomic and transcriptomic composition of a cellular mixture. The model also allows for incorporation of empirical priors regarding the composition of the mixture, incorporation of variability in the data used to generate the signatures, and inference of the signatures of unknown cell types. We foresee this model being used when the signatures matrix is generally well established. The second method, DC-NMF, jointly learns the reference signatures and the composition of the bulk sample. By using sparse non-negative matrix factorization of previous reference datasets, we are able to perform deconvolution on a reduced-rank representation of key transcriptomic and epigenomic signatures. We envision that this approach will be used when the signatures matrix is unknown. We apply these methods to both simulated mixtures and complex mixtures from TCGA melanoma datasets. We demonstrate that both of our approaches are efficient and accurate in joint transcriptomic and epigenomic deconvolution, and we show that the deconvolved profiles can be used to yield informative clusters and highlight important signatures at the tumor-immune interface. Overall, these methods can play a key role in the development of scalable and personalized approaches to understand tumor immunology in rapid and cost-effective ways.
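
For orientation, the regression-based baseline described above can be illustrated with a minimal sketch. This is not bDeconvolve or DC-NMF, whose details are not given in this abstract; it only shows the general idea of estimating non-negative cell type fractions from a known signatures matrix, here extended naively to two data types by stacking transcriptomic and epigenomic features. The function names (`nnls_multiplicative`, `joint_deconvolve`), the multiplicative-update solver, and all numeric values are assumptions chosen for illustration.

```python
# Illustrative sketch only: regression-style deconvolution with a known
# signatures matrix, applied jointly to two data types by stacking features.
# All names and values below are hypothetical toy choices.

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def nnls_multiplicative(S, b, n_iter=2000, eps=1e-12):
    """Approximately minimize ||b - S x||^2 subject to x >= 0.

    Uses multiplicative updates, which keep x non-negative as long as
    S, b, and the starting point are non-negative.
    """
    St = transpose(S)
    Stb = matvec(St, b)
    x = [1.0 / len(St)] * len(St)  # positive initial guess
    for _ in range(n_iter):
        StSx = matvec(St, matvec(S, x))
        x = [xi * num / (den + eps) for xi, num, den in zip(x, Stb, StSx)]
    return x

def joint_deconvolve(S_rna, b_rna, S_epi, b_epi):
    """Estimate cell type fractions shared by both data types."""
    S = S_rna + S_epi  # stack feature rows from both modalities
    b = b_rna + b_epi
    x = nnls_multiplicative(S, b)
    total = sum(x)
    return [xi / total for xi in x]  # normalize to fractions

# Toy signatures: rows are features, columns are three cell types.
S_rna = [[10.0, 1.0, 1.0], [1.0, 8.0, 1.0], [1.0, 1.0, 9.0], [5.0, 5.0, 5.0]]
S_epi = [[6.0, 0.5, 0.5], [0.5, 7.0, 0.5], [0.5, 0.5, 5.0]]

# Simulate a noise-free bulk sample from known fractions.
true_fractions = [0.5, 0.3, 0.2]
b_rna = matvec(S_rna, true_fractions)
b_epi = matvec(S_epi, true_fractions)

estimated = joint_deconvolve(S_rna, b_rna, S_epi, b_epi)
print([round(f, 3) for f in estimated])
```

In this noise-free toy setting the estimated fractions should closely match the simulated ones. A plain regression of this kind has no mechanism for priors on the composition, for variability in the reference data, or for unknown cell types, which are the limitations the two proposed methods are intended to address.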

Citation Format: Alvin H. Shi, Yue Li, Karthik Murugadoss, Manolis Kellis. Deconvolution of diverse cell types in the tumor microenvironment by jointly modeling transcriptomic and epigenomic information. [abstract]. In: Proceedings of the AACR Special Conference on Tumor Immunology and Immunotherapy; 2016 Oct 20-23; Boston, MA. Philadelphia (PA): AACR; Cancer Immunol Res 2017;5(3 Suppl):Abstract nr A15.