Introduction: Immune cell infiltration in solid tumors correlates with patient outcome and therapeutic response. While cell-type-specific infiltration can be elucidated by single-cell transcriptomic techniques, these suffer from biases and limitations of scale. To instead leverage existing large repositories of bulk gene expression data with clinical outcomes, computational methods have been developed to deconvolve sample profiles into their stromal (including immune) and malignant cell components. However, the performance of these methods has yet to be compared within an unbiased, objective framework. To assess methods and catalyze development of new approaches, we are organizing a community-wide DREAM Challenge.

Methods: The Challenge consists of: (1) an open phase, during which methods are trained on publicly available transcriptomic profiles of cell populations; (2) a leaderboard phase, during which methods are submitted, assessed, and revised using bulk expression data having ground truth (e.g., ratios from mixing experiments or from flow cytometry); and (3) a validation phase, during which final submissions are assessed using independent expression profiles of known admixtures. These validation admixtures are generated in vitro by mixing RNA from multiple types of purified stromal and cancer cells. To assess sensitivity, we also provide in silico admixtures generated from expression profiles of purified populations with a range of tumor “contamination.” Models will be submitted as Docker containers, executed in the cloud, and assessed based on their ability to predict levels of an individual cell type across samples. In sub-Challenge 1, models will predict coarsely defined stromal populations, while in sub-Challenge 2, models will further dissect these into subsets (e.g., T-cell subtypes).
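
The in silico admixtures and the per-cell-type assessment described above can be illustrated with a minimal Python sketch. It is not the Challenge's actual pipeline: the linear mixing model, the choice of Pearson correlation as the score, and every name and number in it (mix_profiles, score_by_cell_type, the three-gene toy profiles, the tumor fractions) are assumptions made purely for illustration.

```python
from statistics import mean


def mix_profiles(profiles, fractions):
    """Assumed linear admixture: weighted sum of purified expression profiles."""
    n_genes = len(next(iter(profiles.values())))
    return [
        sum(fractions[c] * profiles[c][g] for c in fractions)
        for g in range(n_genes)
    ]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def score_by_cell_type(truth, predicted):
    """Score each cell type separately, across samples (assumed metric: Pearson)."""
    return {
        c: pearson([s[c] for s in truth], [s[c] for s in predicted])
        for c in truth[0]
    }


# Hypothetical purified profiles: three genes per cell type, arbitrary units.
profiles = {
    "tumor": [10.0, 1.0, 0.5],
    "t_cell": [0.5, 8.0, 1.0],
    "b_cell": [1.0, 0.5, 9.0],
}

# Ground-truth fractions over a range of tumor "contamination".
truth = []
for tumor_fraction in (0.1, 0.5, 0.9):
    rest = 1.0 - tumor_fraction
    truth.append(
        {"tumor": tumor_fraction, "t_cell": 0.7 * rest, "b_cell": 0.3 * rest}
    )

# Simulated bulk profiles that a deconvolution method would receive as input.
bulk = [mix_profiles(profiles, f) for f in truth]

# Stand-in predictions: a positive linear rescaling of the truth, which a
# correlation-based score does not penalize.
predicted = [{c: 2.0 * v + 0.01 for c, v in f.items()} for f in truth]
scores = score_by_cell_type(truth, predicted)
```

Scoring each cell type across samples, rather than all cell types within a sample, means a method is rewarded for tracking relative changes in one population even if its outputs are on an arbitrary scale; whether that matches the Challenge's final metric is not stated in the abstract.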

Results: We have isolated cell populations of interest and performed quality control on datasets for use in the leaderboard phase. We have defined mixing proportions to assess sensitivity and specificity of deconvolution algorithms. The infrastructure for conducting the Challenge is in place. There is robust interest, with 240 participants pre-registered. The active phase will launch in early 2019 and run for seven weeks. At its completion, we will identify features of the best-performing models and provide guidance on where improvements are needed.

Discussion: We expect methods to have difficulty deconvolving cell types with closely correlated expression profiles and detecting low-frequency populations. An assessment of these limits will enable appropriate use of deconvolution methods in prognostic and predictive models. Our synthetic admixtures model a diverse range of stromal microenvironments that mimic realistic tumors as far as possible. In a subsequent in vivo Challenge, methods will be tested against a future dataset in which patient samples are profiled in bulk for deconvolution and at the single-cell level to establish ground truth.

Citation Format: Brian S. White, Andrew J. Gentles, Aurélien de Reyniès, Aaron M. Newman, Andrew Lamb, Laura Heiser, Joshua J. Waterfall, Thomas Yu, Justin Guinney. A tumor deconvolution DREAM Challenge: Inferring immune infiltration from bulk gene expression data [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2019; 2019 Mar 29-Apr 3; Atlanta, GA. Philadelphia (PA): AACR; Cancer Res 2019;79(13 Suppl):Abstract nr 1690.