Tissue profiling technologies present opportunities for understanding the transition from precancerous lesions to malignancy, which may impact risk stratification, prevention, and even cancer treatment. A human precancer atlas building effort is ongoing to tackle the significant challenge of decoding the heterogeneity among cells, specimens, and patients. Here, we discuss the findings resulting from atlases built across precancer types, including those found in colon, breast, lung, stomach, cervix, and skin, using bulk, single-cell, and spatial profiling strategies. We highlight two main themes that emerge across precancer types: the ordering of molecular events that occur during tumor progression and the fluctuation of microenvironmental response during precancer progression. We further highlight the key challenges of data integration across large cohorts of patients, and the need for computational tools to reliably annotate and quality-control high-volume, high-dimensional data.

Cancers are dynamic and complex diseases that undergo constant alteration in the genome, epigenome, and tumor microenvironment (TME). More than 200 types of cancer have been described, each of which can be further classified into subtypes. Modern technologies enable subtypes to be defined through molecular data describing the genome, epigenome, and TME, which provide characterizations not revealed by histologic classification. These technologies fueled The Cancer Genome Atlas (TCGA), and subsequently, the Clinical Proteomic Tumor Analysis Consortium (CPTAC), which generated large-scale molecular data on cancer specimens for 33 human cancer types (1). These rich datasets are transformative in cancer research, in that they define key molecular aberrations with clinical implications across various cancer types, such as TP53 mutations, IDH1/2 mutations, and 1p and 19q deletion, which subtype diffuse gliomas into distinct diagnoses (2–4). These data also enable assessment of the druggability of recurrent fusion events across cancer types (5). The identification of ubiquitous driver genes, such as TP53 in breast, head and neck, and ovarian cancers, makes it possible to expand the application of therapies to a myriad of cancer types (1, 6). Systematic examinations of cancers at the molecular level rapidly expand our knowledge of the underlying mechanisms driving cancers, ultimately contributing to patient stratification and intervention strategies.

Despite the tremendous progress in understanding cancer, successful control of advanced cancers remains a significant challenge, in part due to tumor heterogeneity (7). Tumor heterogeneity manifests as molecular and cellular diversity at various levels, including within a tumor, between different tumors, and over time. In cancer, tumor heterogeneity is thought to arise mostly from stochastic genomic alterations, with selection favoring tumor growth and plasticity (8, 9). Nongenetic, but heritable, epigenetic mechanisms can also play a role (10). Molecular subtyping efforts essentially attempt to find generalizable and actionable principles among heterogeneous cancers that arose from independent, complex evolutionary trajectories. An illustrative example of this challenge can be found in colorectal cancer, where collective efforts have been made to establish consensus molecular subtypes that integrate multidimensional bulk-profiling data (11). More recently, single-cell resolution data established a new classification system for colorectal cancer (12). However, these data also show that individual cells exhibit more similarity within a tumor than across tumors, demonstrating that each tumor can potentially be classified as its own subtype. This can be attributed to individualized genetic variation and somatic evolutionary patterns that give rise to intertumoral variability in advanced cancers. Clinical outcomes also demonstrate the impact of tumor heterogeneity. Almost all advanced cancers develop resistance to targeted therapies, even those that exhibit a drastic initial response. Evidence points to intratumoral and temporal heterogeneity as the main forces fostering therapy resistance (13). These instances highlight the significant obstacles that tumor heterogeneity poses to decoding cancers with individual variability.

Lesions at premalignant stages can provide alternative windows for cancer intervention. Precancerous lesions often precede the development of invasive carcinoma (14). Precancers are usually less heterogeneous than cancers (13, 15), as premalignant lesions (PML) usually start with mutations in well-known driver genes that initiate neoplastic growth, as opposed to mutations arising later in tumor evolution that are harbored by subclones and contribute to intratumoral heterogeneity and patient-to-patient variation (16, 17). Given that higher levels of tumor heterogeneity are associated with worse outcomes of anticancer treatments (13), it is conceivable that early intervention is the key to controlling cancer.

Studying precancers also benefits our understanding of the disease from the perspective of prevention. Extrinsic factors, such as diet, smoking, alcohol consumption, and microbiome differences, have profound influences on cancer development (18–20). Understanding the mechanisms involved, such as the pathways disrupted by extrinsic factors, would offer valuable information for developing preventative strategies. For example, intestinal stem cell adaptation induced by a high-fat diet through the peroxisome proliferator-activated receptor signaling pathway is associated with colorectal cancer risk, and the inhibition of key proteins in this pathway can reduce the protumorigenic effects of a high-fat diet on tumor initiation (21). This understanding of the high-fat diet-induced tumorigenesis pathway opens new opportunities for diet-mediated cancer prevention. Furthermore, because not all PMLs will progress to cancer, it is also crucial to understand the features distinguishing lesions at high risk of progression (16, 17) from those that are likely to regress (14). Understanding precancer biology may, in some ways, be more effective in controlling cancer, because this knowledge is key to delivering timely intervention, guiding appropriate follow-up screening, preventing overtreatment, and identifying potential targets for early intervention.

Given the measurable value of understanding PMLs, many groups are now engaged in constructing human precancer atlases to characterize molecular changes during the early stages towards malignancy. We provide an overview of the molecular and spatial profiling technologies (Fig. 1) applied in precancer atlas studies of lesions originating in the colon, breast, lung, stomach, cervix, and skin and, more importantly, of the insights gained through these atlasing efforts.

Figure 1.

Bulk, single-cell, and spatial profiling of precancers. (A) Bulk, (B) single-cell, and (C) spatial profiling technologies can interrogate different aspects of a sample. Bulk technologies have low resolution and are more suited to deciphering intersample differences, but they can be applied in a high-throughput manner and have the most mature operation pipelines. Single-cell technologies can evaluate individual cells, and thus, can characterize intra-sample heterogeneity, enabling differences between cell types or cell states to be investigated. Single-cell technologies are more susceptible to data quality issues. Spatial profiling technologies uncover an additional layer of information within a sample, that of spatial organization, enabling examination of spatial heterogeneity and macro-structures. The newer technologies are more challenging to apply in a high-throughput manner due to the demands of downstream analysis and prohibitive cost.

As important and prevalent pillars of molecular profiling, bulk sequencing analyses have been extensively used in characterizing advanced human cancers (Fig. 1A; refs. 1, 22). Likewise, these approaches have been applied to PMLs of several cancer types, although at a lower rate compared with malignant lesions. Colonic precancerous lesions are among the most profiled human precancers. Outside of canonical APC mutations (23, 24), a myriad of mutations in WNT pathway genes such as LRP1B, SOX9, FAT4, TCF7L2, FBXW7, ARID1A, and CTNNB1 (25–28), as well as early onset epigenetic silencing of WNT negative regulators, were also observed in conventional colonic adenomas (25, 29–32). In contrast to conventional adenomas that are mainly found in the descending colon, serrated colonic polyps that arise from the ascending colon lack APC mutations and frequently exhibit CpG island methylator phenotype (CIMP) and epigenetic silencing of CDKN2A/p16 (23, 33–37), highlighting diverging genetic events initiating conventional and serrated polyps. Notably, the high mutational burden characteristic of microsatellite-instable (MSI) colorectal cancers was not yet observed in sessile serrated lesion precursors (15). New technologies such as multirestriction enzyme-digested Hi-C (mHi-C) revealed progressive decay of chromatin microstructures, such as strips and loops originating from active cis-regulatory elements, particularly at promoters, in a progression spectrum of familial adenomatous polyposis (FAP) adenomas (bioRxiv 2022.08.26.505505). Another recently developed technology, Ultra High-throughput Whole Genome Methylation Sequencing (WGMS), enabled comprehensive and untargeted measurements of methylation status in large-scale studies. Its application to the same specimen set revealed a gradual decrease in average genome-wide methylation, although specific regulatory elements had divergent patterns of methylation alterations, highlighting the complex and dynamic nature of the methylation landscape during precancer progression (bioRxiv 2022.05.30.494076). Bulk microbiome profiling (16S rRNA-sequencing and metagenomic sequencing) of human colorectal cancer slurries has also identified tumor initiation roles of key bacterial species, including colibactin-expressing Escherichia coli, enterotoxigenic Bacteroides fragilis, and more recently, Clostridioides difficile (38, 39). In parallel, bacterial FISH is used to probe the presence of microorganisms in tissues and lesions (40–42). The high data quality and cost-effectiveness of bulk profiling render a global view of genetic, epigenetic, and microbial alterations between different samples representing colonic polyp progression.

Molecular profiling applied to lung adenocarcinoma (LUAD) research has shed light on precursor lesions at various stages, including atypical adenomatous hyperplasia (AAH), adenocarcinoma in situ (AIS), and minimally invasive adenocarcinoma (MIA). With whole-exome sequencing and more refined multiregional exome sequencing, key mutations in driver genes such as EGFR, RBM10, BRAF, ERBB2, TP53, KRAS, MAP2K1, and MET are already observed in PMLs. Although these well-known cancer driver events appear early, a rather gradual increment of other events such as APOBEC-associated mutations, total mutational burden, arm and focal copy-number alteration, allelic imbalance, TP53, and HLA loss of heterozygosity accompanies malignant progression (43, 44). Using reduced representation bisulfite sequencing, gradual increases in methylation aberrations and intratumoral heterogeneity in the methylation landscape were also observed from early to late lesions (45). These tumor cell-specific changes were accompanied by a gradual decrease in antitumor response and an increase of suppressive or dysfunctional immune cell subtypes along the progression axis (46). Combining targeted immune-related mRNA profiling and TCR profiling with genomic profiling, an association of summed effects of genomic/epigenetic changes and immune alterations was established, although the association of the immune contexture with any single tumor feature was weak, indicating heterogeneous mechanisms that converge on the progression towards immune evasion (46). Dysregulated immunity was also observed in the precursor bronchial PMLs to lung squamous cell cancer (LUSC) through large-scale sequencing (47). Examination of TCR diversity in PML samples revealed a decreased diversity associated with regressed and non-progressed PMLs within the proliferative subtype, indicative of the expansion of a few dominant T-cell clones that suppress transformation (48). TCR diversity was negatively associated with IFN signaling scores and antigen processing signatures. Bulk molecular characterization of lung precancer lesions clarified genetic events that occur early versus later in malignant transformation and implicated tumor microenvironmental changes occurring during tumor progression.
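
As a rough illustration of how TCR repertoire diversity can be quantified, the sketch below computes Shannon entropy and a normalized clonality score from clonotype read counts. This is an assumption made for illustration: the studies cited above are not stated here to use this particular metric, and the function names and read counts are hypothetical.

```python
import math


def shannon_diversity(clone_counts):
    """Shannon entropy (in nats) of a clonotype count distribution."""
    total = sum(clone_counts)
    if total == 0:
        raise ValueError("no TCR reads supplied")
    return -sum((c / total) * math.log(c / total) for c in clone_counts if c > 0)


def clonality(clone_counts):
    """1 minus normalized entropy: 0 = perfectly even, 1 = a single clone."""
    n_clones = sum(1 for c in clone_counts if c > 0)
    if n_clones == 0:
        raise ValueError("no TCR reads supplied")
    if n_clones == 1:
        return 1.0
    return 1.0 - shannon_diversity(clone_counts) / math.log(n_clones)


# Hypothetical clonotype read counts (illustrative only, not from any study).
diverse_repertoire = [10] * 50           # 50 equally abundant clones
dominated_repertoire = [455] + [5] * 9   # one dominant clone plus 9 rare clones

for name, counts in [("diverse", diverse_repertoire),
                     ("dominated", dominated_repertoire)]:
    print(name, round(shannon_diversity(counts), 3), round(clonality(counts), 3))
```

Under this toy definition, a repertoire dominated by a few expanded clones has lower entropy and higher clonality than an even repertoire, which is the sense in which "decreased diversity" is used in the paragraph above.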

In breast cancer, the value of bulk characterization has been demonstrated in characterizing ductal carcinoma in situ (DCIS) pre-invasive lesions. The profiling of two well-annotated cohorts of DCIS specimens with matched recurrence/invasion data identified 812 genes and an early onset copy-number profile that can be used as a classifier for recurrence and invasive progression with high accuracy. Biological programs underlying these molecular profiles include cell-cycle progression, growth factor signaling, increased metabolism, and elevated immune response. The latter was further probed by multiplexed ion beam imaging (MIBI) to reveal associations between CD4+ T cells, myeloid and plasmacytoid dendritic cells, monocytes, and macrophages, with DCIS recurrence (49). Classification efforts in other precancer types produce similar translational value. Transcriptomic profiling produced two nevi/melanoma subtypes: type 1, characterized by a pigmentation-type and MITF gene signature, and type 2, characterized by an inflammatory-type and AXL gene signature, with the former predicted to confer resistance to BRAF/MEK inhibitors and the latter to anti-PD1 treatment (50). In cervical cancer, integration of premalignant cervical intraepithelial neoplasia sequencing with HPV status classified high- versus low-risk subtypes (51). These multidimensional precancer characterization efforts not only enhance our understanding of the biology of tumor progression but also lay a solid foundation for translational applications such as risk prediction.

Technologies with single-cell and spatial resolution enable several levels of heterogeneity within precancerous lesions to be assessed, including heterogeneity between tumor and microenvironmental cells, and between tumor cells of different states (Fig. 1B). Although the number of samples profiled for each study has been historically small, larger-scale studies of these types are beginning to emerge. Chen and colleagues performed scRNA-seq on 128 human colonic specimens, identifying two different precancerous cell states within polyps (15). High-level gene program analysis of these cell states revealed that conventional adenomas originate from aberrant WNT-driven stem cell expansion and that serrated lesions originate from pyloric metaplasia of non-stem cells, potentially due to damage. Analysis of the TME using scRNA-seq and spatially resolved multiplex immunofluorescence revealed a cytotoxic immune microenvironment preceding high tumor mutational burden in serrated polyps that are thought to be precursors of MSI-H colorectal cancers. These human data are supported by a proximal serrated tumorigenesis mouse model driven by mutant Braf and enterotoxigenic Bacteroides fragilis (52). The earliest events of the enterotoxigenic response occur in epithelial cells at the colonic mucosal surface prior to tumor formation, and resultant tumors are hypermethylated, infiltrated with CD8+ cytotoxic T cells, but again, not hypermutated. Association studies of human carcinogenic microbes in mice combined with scRNA-seq also showed a serrated-like damage response in differentiated colonic cells compared with WNT-activation in stem/progenitor cells (39). Microbial influence in serrated tumorigenesis is evidenced by the detection of polymicrobial biofilms in ∼90% of right-sided colorectal cancers, which are enriched for serrated tumors (42). 
Remarkably, the transition of serrated precancerous lesions to MSI-H colorectal cancers maintained some metaplastic character, but portions of the tumor gained stem properties accompanied by non-APC WNT pathway mutations; these regions were devoid of cytotoxic T cells, further demonstrating the heterogeneous nature of both tumor cells and associated immune cells (15). Cellular plasticity between regenerative and stem states in colonic tumors modulates cytotoxic immunity, an observation supported mechanistically by mouse models. The mapping of these transitions proposes the reclassification of the consensus molecular subtypes (CMS) to only include two tumor cell subtypes based on their adenomatous (iCMS2) or serrated origins (iCMS3; ref. 12). These two pathways dichotomize WNT-dependent and WNT-independent mouse models of tumorigenesis that produce different tumor cell populations downstream (53). Single-cell characterization specifically of serrated lesions further uncovered two subgroups of tumor cells: the sessile serrated lesion subtype featuring Notch signaling and the traditional serrated adenoma subtype featuring Paneth cell metaplasia (54). In addition, the damage response was further characterized as stimulating abnormal ROS, and a subset of CD103+ CD8+ tissue-resident memory T cells was found to be associated with enhanced cytotoxicity in serrated lesions. Single-nucleus transcriptomic and chromatin accessibility analyses on a progression spectrum of FAP samples, which are largely driven by APC mutations, further revealed a coordinated epigenomic and transcriptional trajectory during progression (55). Consistent with the single-cell studies above, tumor progression is associated with a gain in stem features and with enrichment of T-regs, exhausted T cells, and precancer-associated fibroblasts, all pointing towards immunosuppression. Key cell compositional changes were validated with co-detection by indexing (CODEX) multiplex imaging, and potential biomarkers for progression staging were proposed (GPX2 expression; LEK, TCF, and HNF4A accessibility). Mid-scale single-cell and spatial studies of human tissues with confirmatory mouse experiments are richly delineating different subtypes of tumor cell states, tumor-immune modulation mechanisms, and alterations associated with progression in colonic precancerous lesions.

Single-cell transcriptomic profiling has also been applied to gastric precancerous lesions spanning a spectrum of malignant stages. Pyloric metaplasia towards intestinal stem-like cells from glandular secretory cells was observed, followed by two groups of goblet cells that emerge from intestinal metaplasia (IM), one featuring metabolic alteration and the other proliferation (56). HES6, a transcription factor of early goblet differentiation, as well as the early gastric markers KLK10 and SULT2B1, were proposed as markers indicating the risk of IM progression. Another study, which performed trajectory analyses on matched malignant lesions and PMLs from patients diagnosed with either intestinal-type (IGC) or diffuse-type (DGC) gastric cancer, revealed a potential path of neoplastic progression from IM to IGC and a de novo path toward DGC characterized by higher expression of stemness and inflammatory CAF interaction (57).

In other atlasing efforts, more emphasis has been placed on spatially resolved profiling (Fig. 1C). Cell densities, cell neighborhoods, and collagen structure were identified as key predictors of DCIS progression using multiplexed ion beam imaging by time of flight (MIBI-TOF) data generated from DCIS specimens with matched progression to invasive breast cancer (IBC). Coupled with laser-capture microdissection RNA-seq, myoepithelial and stromal cell features were determined to be more predictive of progression than cancer cell features, emphasizing the important role of TME interactions. The hypothesis is that a compromised myoepithelial barrier surrounding the precancer enables infiltration of cells, such as immune cells and CAFs, which restrain tumor progression and protect against the incidence of invasive relapse (58). On the other hand, invasive transformation of DCIS to IBC generates hypoxia, marked by CA9 in tumor cells, which plays a role in recruiting FOXP3+ T-reg cells to promote DCIS progression (59). Single-cell transcriptomic profiling also uncovered differential expression of the long noncoding RNAs SNHG6 and SNHG29 in invasive breast cancer and DCIS (60). These atlasing studies demonstrate the value of experimental designs that profile both pure DCIS precancer lesions and synchronous DCIS and IBC specimens.

Lastly, high-plex immunofluorescence imaging (CyCIF), 3D high-resolution microscopy, and spatially resolved microregion transcriptomics revealed molecular evidence of progression both within and across specimens of cutaneous melanoma spanning various degrees of malignancy (61). Imaging analyses revealed higher order cellular patterns including immune synapses, PD1-PDL1 colocalization, and other juxtacrine ligand–receptor interactions; they also revealed complex patterns of cytokine and receptor expression leading to immune cell polarization. These discoveries, together with recurrent neighborhood analysis, revealed a number of immunosuppression mechanisms along the progression spectrum, especially near invasive borders of lesions. Moreover, expression gradients of melanoma-related proteins such as MITF, S100A, and S100B were observed across invasive regions, highlighting the potential of morphogen gradients acting in tumor progression (61).

Several common themes have emerged from profiling efforts across precancer types (Fig. 2). The first is the clarification of the ordering of molecular events that occur during tumor progression. Although substantial effort has been devoted to studying the effects of high-frequency driver genes, several precancer studies showed that these highly recurrent mutations mainly exert their effects at the precancer stage. Although these mutations may play a synergistic role with other events during later stages of cancer, progression is usually triggered by less prevalent molecular events conferring fitness advantages, which are more diverse among individuals. High-frequency driver events are targets of study because they are found in a large majority of different cancers. However, these events are shown to be already prevalent in premalignancy, although very few PMLs actually progress to cancer. These observations call into question the nature of the events that determine paths of progression towards malignancy, cancer heterogeneity, and cancer cell plasticity in response to treatments.

Figure 2.

Common themes that emerge from precancer atlasing studies. Through the multidimensional characterizations of precancerous lesions along the progression axis (the red arrow), a continuum of molecular and cellular alterations and increased heterogeneity were found. Many driver events can already be observed in premalignancies, implying their roles in tumor initiation, whereas less prevalent and more diverse sporadic alterations are the key players fueling progression. A cytotoxic immune environment is usually present in precancers and is gradually replaced by an immunosuppressive environment. Atlasing efforts (in yellow bubbles) identify biomarkers and elucidate biological mechanisms in tumor progression that can potentially predict a precancer's risk of progression (in yellow rectangles). They can also provide valuable targets for cancer intervention and prevention (in yellow rectangles). Abbreviations: NK, natural killer cell; TH, T helper cell; CTL, cytotoxic T lymphocyte; M1&2, M1&M2 macrophages; mDC, mature dendritic cell; T-reg, regulatory T cell; MDSC, myeloid-derived suppressor cell; iDC, immature dendritic cell.

The second major theme is the fluctuation of microenvironmental response during the course of precancer progression. Although cancers are classified as “immune hot” or “immune cold,” precancer profiling offers a glimpse into the pathway by which a TME arrives at this final state. A cytotoxic environment is already established in precancers despite their lower mutation burden compared with later stage tumors. Other extrinsic factors such as damage might have facilitated this favorable microenvironment to enable immune surveillance. A cytotoxic immune microenvironment restrains tumor progression, but this environment is progressively replaced by an immunosuppressive one as a precancer evolves into cancer. The transition is modulated by tumor-dependent microenvironmental changes, such as hypoxia, tissue physical architecture that excludes immune cells, and altered cytokine and fibroblast profiles. For instance, recent characterization of cervical lesions of various stages unveiled a low but active immune surveillance response featuring infiltration of CD8+ T cells, effector NK cells, and M1-like macrophages in PMLs. This is in contrast to malignant lesions, where the TME is characterized by enrichment of immunosuppressive cell types including exhausted T cells and M2-like macrophages (62). These findings suggest the potential of boosting immune cytotoxicity at early stages of malignancy as a preventative approach against tumor progression, although the application of such strategies needs to be carefully monitored to prevent immune-related adverse events such as autoimmunity (63).

Heterogeneity among cells, specimens, and patients is a substantial obstacle to understanding and treating cancer. For understanding tumor progression, variability among individual specimens or patients within a stage can dampen the differential signals across stages. In addition, many PMLs do not progress, leading to false signals when they are pooled together with progressing lesions. Using a longitudinal experimental design that samples within the same patient can somewhat alleviate interpatient heterogeneity (Fig. 1, yellow axes). Yet, this is tremendously challenging for large-scale studies given patient compliance, the unpredictability of disease progression, and the long time scale of cancer development. In a majority of cases, precancers are completely removed to prevent cancer development, leaving even less chance for additional sampling from the same lesion over time. In the colorectal precancer space, every polyp arises from a unique sequence of genetic events. Thus, polyps metachronously collected from the same patient share only a limited degree of similarity in their progression histories and cannot be safely referenced as precursors to later tumors (13, 64). Mapping an accurate progression path from a collection of independent precancer specimens characterized in bulk therefore presents significant challenges, although metachronous tumor characterization can shed light on the effects of shared genetic predisposition and environmental exposure.

Precancer atlas building requires datasets from large numbers of human specimens, which also brings logistical challenges. Precancerous lesions are usually very limited in size compared with surgical resections of cancers, which makes customization of next-generation assays to work well with small numbers of cells all the more important. Many assays also generate the highest quality data with freshly collected samples, which puts increased demands on a coordinated tissue collection/processing pipeline. Furthermore, many high-resolution data types still require meticulous manual annotation to manage technical and batch variation, making such analyses challenging to perform in high throughput.

On the bright side, newer technologies with increasing resolution (single-cell/spatial) present interesting opportunities (Fig. 1, green axes). Single-cell technologies enable the investigation of differences between cell types and states within a sample, which are usually masked by signal averaging in bulk profiling technologies. Spatially resolved profiling technologies enable the evaluation of differences between regions within a lesion, changes in the organization of different cell types, and alterations in cellular neighborhoods and tissue architecture (bioRxiv 2023.03.09.530832). These additional layers of information facilitate deconvolution of cellular heterogeneity within precancers and cancers. Using these technologies, it is possible to identify cells or regions featuring various degrees of malignancy within a tumor, opening new ways to dissect molecular and cellular alterations along the tumor progression axis. The generalizability of these types of analyses across tumors should be kept in mind, and the importance of data integration across large cohorts of patients should be emphasized as technologies advance. Even though much work remains to develop computational tools that reliably integrate, annotate, and quality-control high-volume and high-dimensional data, three-dimensional single-cell atlases are now emerging (65), representing a promising next step in precancer characterization.
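
The point about signal averaging can be made concrete with a toy calculation. The sketch below uses invented per-cell expression values for a single marker gene (all numbers, names, and the threshold are hypothetical) to show how two samples with very different cellular composition can yield the same bulk-level signal.

```python
from statistics import mean

# Hypothetical per-cell expression of one marker gene (arbitrary units).
# Sample A: every cell expresses the gene at a moderate level.
sample_a = [5.0] * 100
# Sample B: 10% of cells express it highly; the other 90% are nearly silent.
sample_b = [41.0] * 10 + [1.0] * 90


def bulk_signal(cells):
    """Approximate a bulk measurement as the average over all cells."""
    return mean(cells)


def fraction_high(cells, threshold=20.0):
    """Single-cell view: fraction of cells above an illustrative threshold."""
    return sum(1 for x in cells if x > threshold) / len(cells)


for name, cells in [("A", sample_a), ("B", sample_b)]:
    print(name, bulk_signal(cells), fraction_high(cells))
```

Both samples give the same averaged signal, while only the per-cell view reveals the small, highly expressing subpopulation in sample B. Real bulk assays involve additional technical effects that this simplification ignores.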

K.S. Lau reports grants from NIDDK and NCI during the conduct of the study. No disclosures were reported by the other authors.

Z. Chen and K.S. Lau are partly funded by R01DK103831 from the National Institute of Diabetes and Digestive and Kidney Diseases and P50CA236733 and U54CA274367 from the NCI.

1. Tomczak K, Czerwińska P, Wiznerowicz M. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol 2015;19:A68–77.
2. Network CGAR, Brat DJ, Verhaak RGW, Aldape KD, Yung WKA, Salama SR, et al. Comprehensive, integrative genomic analysis of diffuse lower-grade gliomas. N Engl J Med 2015;372:2481–98.
3.
Hai
Y
,
Williams
PD
,
Genglin
J
,
Roger
M
,
Ahmed
RB
,
Weishi
Y
, et al
.
IDH1 and IDH2 mutations in gliomas
.
New Engl J Med
2009
;
360
:
765
73
.
4. Parsons DW, Jones S, Zhang X, Lin JC-H, Leary RJ, Angenendt P, et al. An integrated genomic analysis of human glioblastoma multiforme. Science 2008;321:1807–12.
5. Gao Q, Liang W-W, Foltz SM, Mutharasu G, Jayasinghe RG, Cao S, et al. Driver fusions and their implications in the development and treatment of human cancers. Cell Rep 2018;23:227–38.
6. Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, et al. Mutational landscape and significance across 12 major cancer types. Nature 2013;502:333–9.
7. Greaves M, Maley CC. Clonal evolution in cancer. Nature 2012;481:306–13.
8. Marjanovic ND, Weinberg RA, Chaffer CL. Cell plasticity and heterogeneity in cancer. Clin Chem 2013;59:168–79.
9. Merlo LMF, Pepper JW, Reid BJ, Maley CC. Cancer as an evolutionary and ecological process. Nat Rev Cancer 2006;6:924–35.
10. Hanahan D. Hallmarks of cancer: new dimensions. Cancer Discov 2022;12:31–46.
11. Guinney J, Dienstmann R, Wang X, de Reyniès A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med 2015;21:1350–6.
12. Joanito I, Wirapati P, Zhao N, Nawaz Z, Yeo G, Lee F, et al. Single-cell and bulk transcriptome sequencing identifies two epithelial tumor cell states and refines the consensus molecular classification of colorectal cancer. Nat Genet 2022;54:963–75.
13. Dagogo-Jack I, Shaw AT. Tumour heterogeneity and resistance to cancer therapies. Nat Rev Clin Oncol 2018;15:81–94.
14. Srivastava S, Ghosh S, Kagan J, Mazurchuk R. The PreCancer atlas (PCA). Trends Cancer 2018;4:513–4.
15. Chen B, Scurrah CR, McKinley ET, Simmons AJ, Ramirez-Solano MA, Zhu X, et al. Differential pre-malignant programs and microenvironment chart distinct paths to malignancy in human colorectal polyps. Cell 2021;184:6262–80.
16. Biswas A, De S. Drivers of dynamic intratumor heterogeneity and phenotypic plasticity. Am J Physiol-cell Ph 2021;320:C750–60.
17. Gerstung M, Jolly C, Leshchiner I, Dentro SC, Gonzalez S, Rosebrock D, et al. The evolutionary history of 2,658 cancers. Nature 2020;578:122–8.
18. Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, et al. Environmental and heritable factors in the causation of cancer: analyses of cohorts of twins from Sweden, Denmark, and Finland. New Engl J Med 2000;343:78–85.
19. Mucci LA, Wedren S, Tamimi RM, Trichopoulos D, Adami H-O. The role of gene–environment interaction in the aetiology of human cancer: examples from cancers of the large bowel, lung and breast. J Intern Med 2001;249:477–93.
20. Czene K, Hemminki K. Kidney cancer in the Swedish family cancer database: familial risks and second primary malignancies. Kidney Int 2002;61:1806–13.
21. Mana MD, Hussey AM, Tzouanas CN, Imada S, Millan YB, Bahceci D, et al. High-fat diet-activated fatty acid oxidation mediates intestinal stemness and tumorigenicity. Cell Rep 2021;35:109212.
22. Rodriguez H, Zenklusen JC, Staudt LM, Doroshow JH, Lowy DR. The next horizon in precision oncology: proteogenomics to inform cancer diagnosis and treatment. Cell 2021;184:1661–70.
23. Tse BCY, Welham Z, Engel AF, Molloy MP. Genomic, microbial and immunological microenvironment of colorectal polyps. Cancers 2021;13:3382.
24. Dekker E, Tanis PJ, Vleugels JLA, Kasi PM, Wallace MB. Colorectal cancer. Lancet 2019;394:1467–80.
25. Parmar S, Easwaran H. Genetic and epigenetic dependencies in colorectal cancer development. Gastroenterology Rep 2022;10:goac035.
26. Muzny DM, Bainbridge MN, Chang K, Dinh HH, Drummond JA, Fowler G, et al. Comprehensive molecular characterization of human colon and rectal cancer. Nature 2012;487:330–7.
27. Zhou D, Yang L, Zheng L, Ge W, Li D, Zhang Y, et al. Exome capture sequencing of adenoma reveals genetic alterations in multiple cellular pathways at the early stage of colorectal tumorigenesis. PLoS One 2013;8:e53310.
28. Borras E, Lucas FAS, Chang K, Zhou R, Masand G, Fowler J, et al. Genomic landscape of colorectal mucosa and adenomas. Cancer Prev Res 2016;9:417–27.
29. Bormann F, Rodríguez-Paredes M, Lasitschka F, Edelmann D, Musch T, Benner A, et al. Cell-of-origin DNA methylation signatures are maintained during colorectal carcinogenesis. Cell Rep 2018;23:3407–18.
30. Luo Y, Wong C-J, Kaz AM, Dzieciatkowski S, Carter KT, Morris SM, et al. Differences in DNA methylation signatures reveal multiple pathways of progression from adenoma to colorectal cancer. Gastroenterology 2014;147:418–29.
31. Koestler DC, Li J, Baron JA, Tsongalis GJ, Butterly LF, Goodrich M, et al. Distinct patterns of DNA methylation in conventional adenomas involving the right and left colon. Modern Pathol 2014;27:145–55.
32. Hanley MP, Hahn MA, Li AX, Wu X, Lin J, Wang J, et al. Genome-wide DNA methylation profiling reveals cancer-associated changes within early colonic neoplasia. Oncogene 2017;36:5035–44.
33. Pai RK, Bettington M, Srivastava A, Rosty C. An update on the morphology and molecular pathology of serrated colorectal polyps and associated carcinomas. Modern Pathol 2019;32:1390–415.
34. Leggett B, Whitehall V. Role of the serrated pathway in colorectal cancer pathogenesis. Gastroenterology 2010;138:2088–100.
35. Bettington M, Walker N, Clouston A, Brown I, Leggett B, Whitehall V. The serrated pathway to colorectal carcinoma: current concepts and challenges. Histopathology 2013;62:367–86.
36. Palma FDED, D'Argenio V, Pol J, Kroemer G, Maiuri MC, Salvatore F. The molecular hallmarks of the serrated pathway in colorectal cancer. Cancers 2019;11:1017.
37. Kriegl L, Neumann J, Vieth M, Greten FR, Reu S, Jung A, et al. Up and downregulation of p16Ink4a expression in BRAF-mutated polyps/adenomas indicates a senescence barrier in the serrated route to colon cancer. Modern Pathol 2011;24:1015–22.
38. Yu J, Feng Q, Wong SH, Zhang D, Liang QY, Qin Y, et al. Metagenomic analysis of faecal microbiome as a tool towards targeted non-invasive biomarkers for colorectal cancer. Gut 2017;66:70.
39. Drewes JL, Chen J, Markham NO, Knippel RJ, Domingue JC, Tam AJ, et al. Human colon cancer–derived Clostridioides difficile strains drive colonic tumorigenesis in mice. Cancer Discov 2022;12:OF1–13.
40. Dejea CM, Fathi P, Craig JM, Boleij A, Taddese R, Geis AL, et al. Patients with familial adenomatous polyposis harbor colonic biofilms containing tumorigenic bacteria. Science 2018;359:592–7.
41. Pushalkar S, Hundeyin M, Daley D, Zambirinis CP, Kurz E, Mishra A, et al. The pancreatic cancer microbiome promotes oncogenesis by induction of innate and adaptive immune suppression. Cancer Discov 2018;8:403–16.
42. Dejea CM, Wick EC, Hechenbleikner EM, White JR, Welch JLM, Rossetti BJ, et al. Microbiota organization is a distinct feature of proximal colorectal cancers. Proc National Acad Sci 2014;111:18321–6.
43. Chen H, Carrot-Zhang J, Zhao Y, Hu H, Freeman SS, Yu S, et al. Genomic and immune profiling of pre-invasive lung adenocarcinoma. Nat Commun 2019;10:5472.
44. Hu X, Fujimoto J, Ying L, Fukuoka J, Ashizawa K, Sun W, et al. Multi-region exome sequencing reveals genomic evolution from preneoplasia to lung adenocarcinoma. Nat Commun 2019;10:2978.
45. Hu X, Estecio MR, Chen R, Reuben A, Wang L, Fujimoto J, et al. Evolution of DNA methylome from precancerous lesions to invasive lung adenocarcinomas. Nat Commun 2021;12:687.
46. Dejima H, Hu X, Chen R, Zhang J, Fujimoto J, Parra ER, et al. Immune evolution from preneoplasia to invasive lung adenocarcinomas and underlying molecular features. Nat Commun 2021;12:2722.
47. Beane J, Mazzilli SA, Tassinari AM, Liu G, Zhang X, Liu H, et al. Detecting the presence and progression of premalignant lung lesions via airway gene expression. Clin Cancer Res 2017;23:5091–100.
48. Maoz A, Merenstein C, Koga Y, Potter A, Gower AC, Liu G, et al. Elevated T cell repertoire diversity is associated with progression of lung squamous cell premalignant lesions. J Immunother Cancer 2021;9:e002647.
49. Strand SH, Rivero-Gutiérrez B, Houlahan KE, Seoane JA, King LM, Risom T, et al. Molecular classification and biomarkers of clinical outcome in breast ductal carcinoma in situ: Analysis of TBCRC 038 and RAHBT cohorts. Cancer Cell 2022;40:1521–36.e7.
50. Kunz M, Löffler-Wirth H, Dannemann M, Willscher E, Doose G, Kelso J, et al. RNA-seq analysis identifies different transcriptomic types and developmental trajectories of primary melanomas. Oncogene 2018;37:6136–51.
51. Huang J, Qian Z, Gong Y, Wang Y, Guan Y, Han Y, et al. Comprehensive genomic variation profiling of cervical intraepithelial neoplasia and cervical cancer identifies potential targets for cervical cancer early warning. J Med Genet 2019;56:186.
52. Shields CED, White JR, Chung L, Wenzel A, Hicks JL, Tam AJ, et al. Bacterial-driven inflammation and mutant BRAF expression combine to promote murine colon tumorigenesis that is sensitive to immune checkpoint therapy. Cancer Discov 2021;11:1792–807.
53. Kleeman SO, Leedham SJ. Not all Wnt activation is equal: ligand-dependent versus ligand-independent Wnt activation in colorectal cancer. Cancers 2020;12:3355.
54. Zhou Y-J, Lu X-F, Chen H, Wang X-Y, Cheng W, Zhang Q-W, et al. Single-cell transcriptomics reveals early molecular and immune alterations underlying the serrated neoplasia pathway toward colorectal cancer. Cell Mol Gastroenterology Hepatology 2023;15:393–424.
55. Becker WR, Nevins SA, Chen DC, Chiu R, Horning AM, Guha TK, et al. Single-cell analyses define a continuum of cell state and composition changes in the malignant transformation of polyps to colorectal cancer. Nat Genet 2022;54:985–95.
56. Zhang P, Yang M, Zhang Y, Xiao S, Lai X, Tan A, et al. Dissecting the single-cell transcriptome network underlying gastric premalignant lesions and early gastric cancer. Cell Rep 2019;27:1934–47.
57. Kim J, Park C, Kim KH, Kim EH, Kim H, Woo JK, et al. Single-cell analysis of gastric pre-cancerous and cancer lesions reveals cell lineage diversity and intratumoral heterogeneity. Npj Precis Oncol 2022;6:9.
58. Risom T, Glass DR, Averbukh I, Liu CC, Baranski A, Kagel A, et al. Transition to invasive breast cancer is associated with progressive changes in the structure and composition of tumor stroma. Cell 2022;185:299–310.
59. Sobhani F, Muralidhar S, Hamidinekoo A, Hall AH, King LM, Marks JR, et al. Spatial interplay of tissue hypoxia and T-cell regulation in ductal carcinoma in situ. Npj Breast Cancer 2022;8:105.
60. Tokura M, Nakayama J, Prieto-Vila M, Shiino S, Yoshida M, Yamamoto T, et al. Single-cell transcriptome profiling reveals intratumoral heterogeneity and molecular features of ductal carcinoma in situ. Cancer Res 2022;82:3236–48.
61. Nirmal AJ, Maliga Z, Vallius T, Quattrochi B, Chen AA, Jacobson CA, et al. The spatial landscape of progression and immunoediting in primary melanoma at single-cell resolution. Cancer Discov 2022;12:1518–41.
62. Li C, Hua K. Dissecting the single-cell transcriptome network of immune environment underlying cervical premalignant lesion, cervical cancer and metastatic lymph nodes. Front Immunol 2022;13:897366.
63. Young A, Quandt Z, Bluestone JA. The balancing act between cancer immunity and autoimmunity in response to immunotherapy. Cancer Immunol Res 2018;6:1445–52.
64. Molinari C, Marisi G, Passardi A, Matteucci L, Maio GD, Ulivi P. Heterogeneity in colorectal cancer: a challenge for personalized medicine? Int J Mol Sci 2018;19:3733.
65. Lin J-R, Wang S, Coy S, Chen Y-A, Yapp C, Tyler M, et al. Multiplexed 3D atlas of state transitions and immune interaction in colorectal cancer. Cell 2023;186:363–81.