Evidence points towards the differentiation state of cells as a marker of cancer risk and progression. Measuring the differentiation state of single cells in a preneoplastic population could thus enable novel strategies for early detection and risk prediction. Recent maps of somatic mutagenesis in normal tissues from young healthy individuals have revealed cancer driver mutations, indicating that these do not correlate well with differentiation state and that other molecular events also contribute to cancer development. We hypothesized that the differentiation state of single cells can be measured by estimating the regulatory activity of the transcription factors (TFs) that control differentiation within that cell lineage. To this end, we present a novel computational method called CancerStemID that estimates a stemness index of cells from single-cell RNA-Seq data. CancerStemID is validated in two human esophageal squamous cell carcinoma (ESCC) cohorts, demonstrating how it can identify undifferentiated preneoplastic cells whose transcriptomic state is overrepresented in invasive cancer. Spatial transcriptomics and whole-genome bisulfite sequencing demonstrated that differentiation activity of tissue-specific TFs was decreased in cancer cells compared to the basal cell-of-origin layer and established that differentiation state correlated with differential DNA methylation at the promoters of these TFs, independently of underlying NOTCH1 and TP53 mutations. The findings were replicated in a mouse model of ESCC development, and the broad applicability of CancerStemID to other cancer-types was demonstrated. In summary, these data support an epigenetic stem-cell model of oncogenesis and highlight a novel computational strategy to identify stem-like preneoplastic cells that undergo positive selection.

This content is only available via PDF.

Article PDF first page preview

Article PDF first page preview
You do not currently have access to this content.