Precancer atlases have the potential to revolutionize how we think about the topographic and morphologic structures of precancerous lesions in relation to cellular, molecular, genetic, and pathophysiologic states. This mini review uses the Human Tumor Atlas Network (HTAN), established by the National Cancer Institute (NCI), to illustrate the construction of cellular and molecular three-dimensional atlases of human cancers as they evolve from precancerous lesions to advanced disease. We describe the collaborative nature of the network and the research to determine how and when premalignant lesions progress to invasive cancer, regress, or reach a state of equilibrium. We highlight progress made by HTAN in building precancer atlases and discuss possible future directions. We hope that the lessons from our experience with HTAN will help other investigators engaged in the construction of precancer atlases to crystallize their thoughts on logistics, rationale, and implementation.

A major limitation to improving early detection and prevention of cancer is the lack of understanding of the sequence of molecular alterations in the cells, and of changes in the associated microenvironment, that cause initiation and progression of premalignant lesions to invasive cancer. Premalignant lesions are regions of histologically abnormal tissue that often precede the development of invasive carcinoma (1). Some of these lesions will progress to invasive carcinoma while others will stabilize or regress (2, 3). Although the histologic progression of these lesions has been characterized for many cancers, the initial molecular events responsible for causing normal tissue to enter a premalignant state and for causing premalignant lesions to progress to invasive carcinoma have not been well defined.

In 2016, NCI's Blue Ribbon Panel concluded that understanding the mechanisms by which cells and tissues progress from normal to precancerous lesions, to localized invasive cancer, and finally to metastatic cancer has the potential to identify targets to prevent the development of cancer, to identify biomarkers of risk and early detection, and to inform clinical decision-making and treatment options. In 2017, NCI's Board of Scientific Advisors approved the formation of the Human Tumor Atlas Network (HTAN). The HTAN consists of five Human Tumor Atlas (HTA) Research Centers, five PreCancer Atlas (PCA) Research Centers, and a Data Coordinating Center (HTAN–DCC). The HTA Research Centers are constructing three-dimensional (3D) atlases of tumors that describe the transition from locally invasive to metastatic cancer, the dynamics of patient response to therapy, and/or the development of resistance to therapy. The PCA Research Centers are constructing 3D atlases describing how and when premalignant lesions progress to invasive cancer, regress, or reach a state of equilibrium (Fig. 1). The study of molecular and cellular features in a 3D environment should lead to the identification of valuable tumor features that 2D data cannot provide. The HTAN–DCC works closely with the NCI Cancer Research Data Commons to manage data storage and sharing, compile HTAN-constructed atlases, and disseminate data and resources to the wider scientific community through a variety of data sharing and visualization platforms, including the Seven Bridges Genomics and Institute for Systems Biology Cancer Genomics Clouds.

Figure 1.

Schematic representation of precancer atlases and characterization methodologies. HTAN PCA Research Centers are constructing multimodal atlases of precancerous lesions for breast, colon, lung, and skin cancers using a variety of methodologies, including genomics, proteomics, and imaging, together with modeling and visualization of data using commercially available software.


The impetus behind the development of the PCA Research Centers was that a deeper and more comprehensive understanding of the molecular, cellular, and tissue alterations and the interactions of the various cell types that drive tumor development and progression, especially the progression from premalignant lesions to invasive cancer, could result in improvements in risk stratification, early detection, and development of cancer prevention strategies (4, 5). Precancerous lesions within each organ type are diverse, and while some progress to invasive carcinoma, others remain stable or even regress. The ability to distinguish lesions that progress to cancer from those that do not could help reduce unnecessary treatment. Although some molecular data and histologic features of these lesions have been determined, the genomic, transcriptomic, and epigenomic profiles of precancerous lesions and their functional interactions with different cell types and the microenvironment, including the immune milieu and the microbiome, are not well defined. This gap makes it difficult to develop risk stratification, precision prevention, and therapeutic intervention strategies.

This mini review provides an overview of the research being conducted by the five PCA Research Centers. The HTAN defines a precancer atlas as a 3D cellular and molecular map of a human premalignant tumor, complemented with critical spatial information that facilitates visualization of the structure, composition, and multiscale interactions over time. The development of atlases by the PCA Research Centers is not an end in itself but rather a means to (i) gain an understanding of the molecular and cellular events that drive the transition of precancerous lesions to cancer, (ii) identify targets for preventive agents, and (iii) identify markers for risk of progression and early detection. The PCA Research Centers are collecting data at single-cell resolution and integrating these data with spatial information and clinical data. Each PCA Research Center consists of an Administrative Core and three highly integrated research units: a Biospecimen Acquisition, Processing, and Classification Unit; a Molecular, Cellular, and Tissue Characterization Unit; and a Data Processing, Analysis, and Modeling Unit.

As of January 2023, the HTAN teams have published and shared data for 12 tumor atlases. These spatio-genomics atlases provide new biological insights on cancer initiation and progression and on cellular and structural heterogeneity within the tumor microenvironment. The vision of deriving greater information from 3D atlases is also coming to fruition because of ongoing efforts within HTAN (6). Precancer atlases have been reported for sporadic colorectal cancer (7), familial adenomatous polyposis (FAP)-related colorectal cancer (8), melanoma (9), and breast cancer (10, 11). In addition, identification and characterization of precancerous lesions found in pancreas resections that also contain advanced cancer have been reported by the Washington University in St. Louis HTA Research Center (Saint Louis, MO; ref. 12). As a consortium, HTAN has generated and curated a substantial number of shared community resources, which are publicly browsable through the HTAN Data Portal. Open access data and all clinical, assay, and biospecimen metadata can be downloaded directly from the portal. To facilitate secondary data analysis and to broaden the impact of the NCI investment, HTAN investigators have developed and adopted several data and metadata standards, such as the Minimum Information about Highly Multiplexed Tissue Imaging (MITI) standard (13).

The PCA Research Centers have made substantial progress that provides insights into the transition from precancer to invasive cancer and the role of the microenvironment, including immune cells, in tumor progression or regression. Most Centers focus on single-cell and multiplex imaging modalities. Although a complete description of the research conducted by the five PCA Research Centers is beyond the scope of this overview, a few notable results from each center are described.

Precancer atlas of colorectal cancer

Investigators under the leadership of Robert Coffey, Martha Shrubsole, and Ken Lau, Vanderbilt University Medical Center (Nashville, TN), are constructing precancer atlases of sporadic colorectal adenoma progression that depict the spatial landscape of the tumor ecosystem, including the stroma and biofilm-associated microbiome, using single-cell RNA sequencing (scRNA-seq), multi-region whole-exome sequencing (WES), multiplex immunofluorescence (IF), spatial transcriptomics, and species-specific bacterial fluorescence in situ hybridization. Thus far, they have built atlases of the two most common forms of human colorectal polyps, conventional adenomas and serrated polyps, and the resulting colorectal cancers. Analysis of data from 62 participants revealed that, whereas adenomas arise from Wnt pathway dysregulation in colon crypt stem cells, serrated polyps arise from metaplastic reprogramming of differentiated cells at the crypt surface. Their single-cell resolution atlas showed distinct paths for precancer to cancer transformation, which are accompanied by differential immune microenvironments (7). This work provides insights into malignant progression of colorectal polyps and could provide a framework for precision surveillance and prevention. A unique component of this PCA Research Center is a focus on the role of the microbial environment in colorectal carcinogenesis (14).

Precancer atlas of breast cancer

Investigators under the leadership of Shelley Hwang, Duke University (Durham, NC), Robert West, Stanford University (Stanford, CA), and Carlo Maley, Arizona State University (Tempe, AZ), are building atlases of ductal carcinoma in situ (DCIS) to further enhance the understanding of the disease and to better stratify the risk of progression to invasive breast cancer (IBC). They used multiplexed ion beam imaging by time of flight (MIBI–TOF) and stromal-targeted transcriptomics to generate a spatially resolved atlas of breast precancers, which allowed complementary modalities to be directly compared and correlated with conventional pathology findings, disease states, and clinical outcomes. The team performed detailed characterization of the tumor microenvironment (TME) accompanying the transition from DCIS to IBC using a 37-plex antibody panel to interrogate 79 clinically annotated surgical resections (10). Comparison of normal breast with patient-matched DCIS and IBC revealed coordinated transitions between four TME states that were delineated on the basis of the location and function of myoepithelium, fibroblasts, and immune cells. Surprisingly, myoepithelial disruption was more advanced in patients with DCIS who did not develop IBC, suggesting this process could be protective against recurrence by allowing immune infiltration (11).

Precancer atlas of FAP

Investigators under the leadership of Michael Snyder, James Ford, Christina Curtis, and William Greenleaf, Stanford University (Stanford, CA), are developing a precancer atlas for hereditary colorectal adenocarcinoma using samples from patients living with FAP. The team is characterizing tissue samples using multi-omic and imaging technologies, including whole-genome sequencing (WGS) and methylation; bulk, single-cell, and spatial transcriptomics; bulk and single-cell epigenomics; mass spectrometry (MS)-based proteomics and metabolomics; and multiplexed protein imaging. The team has built a precancer atlas of colorectal cancer by profiling multiple polyps from the same patient across ten modalities to characterize the early events driving colorectal cancer development. In an analysis of 81 polyps collected from eight patients with FAP and seven patients without FAP, they identified overlapping transcriptional, proteomic, metabolomic, and lipidomic molecular trajectories for key pathways, including a progressive increase of Wnt2 signaling along the progression from normal mucosa through benign polyp, dysplastic polyp, and adenocarcinoma. WGS of these samples supported a model of polyclonal origin and spreading for precancer lesion development. In a complementary analysis that integrated single-cell transcriptional and epigenetic data, they found that a hallmark of premalignant progression is increased stemness in the epithelial cell population (8) and that stromal fibroblasts exhibit a precancer-associated fibroblast signature that portends progression to colorectal cancer.

Precancer atlas of cutaneous origin

Investigators under the leadership of Peter Sorger, Sandro Santagata, and Jon Aster, Harvard Medical School (Boston, MA), are constructing a spatial precancer atlas focused on progression of premelanoma lesions to invasive cancer. They are integrating single-cell transcriptomics and highly multiplexed imaging using tissue-based cyclic IF (t-CyCIF) to construct 3D atlases of disease progression. They combined high-resolution spatial mapping of whole-slide tissue samples with micro-region sequencing in 13 patients with melanoma to discover recurrent cellular patterns and spatial gradients of signaling molecules (reminiscent of developmental processes) as lesions progressed from premalignancy to invasive tumors. Their first precancer atlas comprised conventional and high-resolution multiplexed imaging of 70 distinct histologic regions in early primary melanoma, complemented by spatial transcript profiling (9). They identified recurrent cellular neighborhoods involving tumor, immune, and stromal cells that change significantly along a progression axis from precursor states to melanoma in situ to invasive tumor. Hallmarks of immunosuppression were detectable by the precursor stage, and when tumors become locally invasive, a consolidated and spatially restricted suppressive environment forms along the tumor–stromal boundary. They have also contributed to the development of the MITI guidelines for genomics, highly multiplexed tissue images, and traditional histology (13). MITI covers biospecimens, reagents, data acquisition, and data and metadata analysis, as well as data for imaging with antibodies, aptamers, peptides, dyes, and similar detection reagents.

Precancer atlas of lung cancer

Investigators under the leadership of Avrum Spira, Boston University Medical Center (Boston, MA), and Steven Dubinett, University of California, Los Angeles (Los Angeles, CA), are developing multidimensional atlases of precancerous lung lesions and their surrounding microenvironment. The atlases cover precancerous lesions for lung squamous cell carcinoma (LUSC) and lung adenocarcinoma (LUAD), the two most common subtypes of lung cancer. The team is using single-cell and bulk sequencing and spatial proteomic approaches, combined with the development of advanced computational tools (15), to characterize the lung lesions and has begun to identify mutational heterogeneity in premalignant lesions that is reflective of smoking and advanced disease. They are currently identifying common somatic alterations in multiple pathways (genome integrity, immune regulation, cell cycle, transcriptional regulation, and Wnt/β-catenin signaling). By comparing progressive lesions that eventually become cancer with indolent lesions, they propose to identify frequent alterations in cancer-related and immune-related genes. These alterations could implicate these genes in later stages of progression of preinvasive lesions.

Although there is general agreement as to what constitutes cancer, the definition of precancer differs, even among experts for a specific type of lesion, contributing to confusion as to precisely what constitutes a precancerous lesion. In the context of precancer atlas construction, where biospecimens may come from multiple collection sites that may use different criteria and collection protocols, it is imperative to address the relatively fluid definition of precancers. In the future, identification of a precancerous lesion will be based more on molecular characteristics than on histopathology. However, this will require a better understanding of the molecular properties that distinguish precancers from other types of lesions and the molecular properties that distinguish precancerous lesions that are likely to progress from those that are not. Other points to consider when defining precancers and considering progression trajectories are (i) somatic clonal evolution can occur in normal tissues, (ii) truly benign growths can share driver mutations with cancers, and (iii) lesions can be identified in tissue resections where there is no evidence of cancer development.

The scientific community needs to address the following complex questions so that atlases can be employed for multiple purposes, including identification of biomarkers for early detection and targets for intercepting the development of invasive cancers.

  • What biological, molecular, radiologic, and pathologic features should be considered when defining precancerous lesions?

  • In the case of high-risk individuals, because of familial or genetic predispositions, what special considerations should be included in defining precancerous lesions?

  • In the context of cancer prevention, what additional kinds of molecular or cellular data should be collected in building precancer atlases?

Screening methods exist for the cancers currently being investigated by the PCA Research Centers (breast, colon, lung, and melanoma), which allows precancers to be detected and biopsied. For cancers that currently lack screening methods permitting biopsy, new methods for collecting precancers will need to be developed.

Published atlases, including those arising from HTAN, are spatially resolved single-cell atlases of tumor development, progression, drug resistance, and metastasis that provide valuable information on the molecular and architectural features of cancer. To gain additional information on tumor initiation and identification of sites for prevention, one could consider the inclusion of inflammatory conditions that are associated with an increased risk of cancer (Barrett's esophagus, liver cirrhosis, chronic obstructive pulmonary disease, Crohn disease, chronic pancreatitis, and monoclonal gammopathy of undetermined significance). Those at high risk because of identified genetic predisposition or family history may also need to be included. Another consideration would be to prioritize cancer types where serial sampling is feasible or serial samples are available through existing biobanks.

Two areas of interest to NCI are (i) whether these atlases have provided, or have a high potential to provide, significant insights into cancer biology, tumor initiation and progression, and the longitudinal trajectory from the precancerous to the cancerous state and (ii) whether investigators can query the atlases to identify and understand hallmark events that drive oncogenesis, find molecular markers for early detection, and identify targets to intercept and prevent the development of cancer. The results reported to date indicate that these maps can provide insight into the biology of tumor initiation and progression at the cellular and molecular levels. However, because of a myriad of factors and continuing challenges (Fig. 2), there are limited data on the “longitudinal trajectory” of these precancerous lesions. Without these data, it will be difficult to develop biomarkers for risk assessment and early detection or to identify actionable targets to prevent cancer development. The latter is particularly challenging given the heterogeneity of precancerous lesions and the difficulty of developing effective immunopreventive and chemopreventive agents that are acceptable to patients.

Figure 2.

Challenges in harmonizing and integrating data across multiple technological platforms, patient populations, and cell types. The PCA Research Centers are developing predictive models for risk stratification and to identify biomarkers for early cancer detection and potential prevention targets including neoantigens and neoepitopes (Adapted from Cell, 2020 April 16; 181(2): 236–249).


To increase the likelihood that precancer atlases can be used to identify actionable targets, the following questions should be considered.

  • What modalities will give the most valuable information?

  • Should the focus be on epithelial cancers with a distinct basement membrane and then expanded to other cancer types?

  • Should germline mutations and other familial risk information be collected on each subject or a select few?

  • In precancer atlas building, what should be the targeted endpoints: identification of neoantigens, neoepitopes, or actionable targets for interception?

  • Are there alternatives to requiring longitudinal samples? Are lesions that can be sampled longitudinally of true clinical concern?

  • How accurately does the clonal evolution observed in longitudinally collected samples from the same lesion reflect what would occur in an uninterrupted lesion? Similarly, can data collected longitudinally from lesions at different spatial coordinates in the same organ be used to develop precise progression models?

  • Recognizing the difficulties in collecting serial longitudinal samples for precancerous lesions and early cancers, can engineered and/or animal models be used to supplement the spatial data with longitudinal data, especially for prevention where intervention is likely to be more effective at an early stage?

In recognition of the importance of understanding early lesions, their microenvironments, and the reciprocal interactions that drive early lesion fate and clinical outcomes, NCI recently established the Translational and Basic Science Research in Early Lesions (TBEL) program. The goal of TBEL is to understand the biological and pathophysiologic mechanisms driving or restraining precancers and early cancers and to facilitate biology-backed precision prevention approaches. Another NCI-supported program related to early lesions is the Cancer Prevention-Interception Targeted Agent Discovery Program (CAP-IT), which aims to discover targeted agents that prevent or intercept the oncogenic process in high-risk populations. Cancer interception refers to the disruption of the oncogenic process during the precursor or precancer stage.

Although data generated by the PCA Research Centers have provided insight into the biology of tumor initiation and progression, opinions differ as to the optimal approach. The current approach taken by the PCA Research Centers relies heavily on single-cell sequencing to reveal relationships among genetics, somatic alterations, and phenotypes. Although this approach has the potential to identify pathways for targeting, some researchers are concerned that it will only identify pathways that are already well known and that nuanced interactions among pathways will be difficult to discern via genetics, because current knowledge of gene–environment and gene–gene interactions is limited. These researchers believe that breakthroughs at a fundamental level in understanding the risks of genetic variation are needed as a first step toward identifying targets for prevention.

To make precancer atlases more targeted toward cancer prevention, risk assessment, and early detection, future efforts might include:

  • Correlative studies or secondary analyses specifically designed to improve prevention by (i) identifying biomarkers of early-stage cancers or precancerous lesions that are likely to progress and (ii) identifying molecular targets to intercept progression through vaccines or other modalities;

  • Creation of additional atlases of precancerous lesions that include more longitudinal data, and collection of additional data that are more directly applicable for identifying potential targets for interception, risk assessment and early detection;

  • Extending the scope of research to characterize at-risk tissues, including adjacent regions due to field cancerization;

  • Pairing high-risk with average-risk samples for each cancer type;

  • Collection of biospecimens from diverse populations to help advance health equity;

  • Use of animal models or engineered tissues to validate prevention targets.

The development of precancer atlases with clinical implications is a challenging task because (i) precancerous lesions are in a state of flux, (ii) lesions are usually small and inadequate for molecular analyses, (iii) longitudinally followed lesions are rare, and (iv) the natural history of these lesions is altered once they are resected or biopsied. Extensive characterization of these lesions requires multifaceted efforts and often long-term commitments, as deriving meaningful data on clinical outcomes is not easy when surrogate endpoints are unknown. Some cancers may not have a well-defined precancerous stage and may present clinically at a very advanced stage. Nevertheless, it is necessary to take steps toward resolving these issues and to begin addressing the gaps in the field.

No disclosures were reported.

The opinions expressed by the authors are their own and this material should not be interpreted as representing the official viewpoint of the U.S. Department of Health and Human Services, the NIH, or the NCI.

The authors would like to thank PCA Research Center investigators Robert Coffey, Vanderbilt University Medical Center; Shelley Hwang, Duke University; Michael Snyder, Stanford University; Peter Sorger, Harvard Medical School; Avrum Spira, Boston University Medical Center; and Sarah Mazzilli, Boston University Medical Center, for providing updates on their research, and Tracy Lively, National Cancer Institute, for reading and commenting on this manuscript. The authors would also like to acknowledge the members of the NCI's HTAN Implementation Team for their tireless efforts toward the building of the Human Tumor Atlas.

1. Wacholder S. Precursors in cancer epidemiology: aligning definition and function. Cancer Epidemiol Biomarkers Prev 2013;22:521–7.

2. Nasiell K, Nasiell M, Vaćlavinková V. Behavior of moderate cervical dysplasia during long-term follow-up. Obstet Gynecol 1983;61:609–14.

3. Merrick DT, Gao D, Miller YE, Keith RL, Baron AE, Kennedy TC, et al. Persistence of bronchial dysplasia is associated with development of invasive squamous cell carcinoma. Cancer Prev Res (Phila) 2016;9:96–104.

4. Campbell JD, Mazzilli SA, Reid ME, Dhillon SS, Platero S, Beane J, et al. The case for a Pre-Cancer Genome Atlas (PCGA). Cancer Prev Res (Phila) 2016;9:119–24.

5. Srivastava S, Ghosh S, Kagan J, Mazurchuk R. National Cancer Institute's HTAN Implementation Team. The making of a precancer atlas: promises, challenges, and opportunities. Trends Cancer 2018;4:523–36.

6. Lin JR, Wang S, Coy S, Chen YA, Yapp C, Tyler M, et al. Multiplexed 3D atlas of state transitions and immune interaction in colorectal cancer. Cell 2023;186:363–81.

7. Chen B, Scurrah CR, McKinley ET, Simmons AJ, Ramirez-Solano MA, Zhu X, et al. Differential pre-malignant programs and microenvironment chart distinct paths to malignancy in human colorectal polyps. Cell 2021;184:6262–80.

8. Becker WR, Nevins SA, Chen DC, Chiu R, Horning AM, Guha TK, et al. Single-cell analyses define a continuum of cell state and composition changes in the malignant transformation of polyps to colorectal cancer. Nat Genet 2022;54:985–95.

9. Nirmal AJ, Maliga Z, Vallius T, Quattrochi B, Chen AA, Jacobson CA, et al. The spatial landscape of progression and immunoediting in primary melanoma at single-cell resolution. Cancer Discov 2022;12:1518–41.

10. Risom T, Glass DR, Averbukh I, Liu CC, Baranski A, Kagel A, et al. Transition to invasive breast cancer is associated with progressive changes in the structure and composition of tumor stroma. Cell 2022;185:299–310.

11. Strand SH, Rivero-Gutiérrez B, Houlahan KE, Seoane JA, King LM, Risom T, et al. Molecular classification and biomarkers of clinical outcome in breast ductal carcinoma in situ: analysis of TBCRC 038 and RAHBT cohorts. Cancer Cell 2022;40:1521–36.

12. Cui Zhou D, Jayasinghe RG, Chen S, Herndon JM, Iglesia MD, Navale P, Wendl MC, et al. Spatially restricted drivers and transitional cell populations cooperate with the microenvironment in untreated and chemo-resistant pancreatic cancer. Nat Genet 2022;54:1390–405.

13. Schapiro D, Yapp C, Sokolov A, Reynolds SM, Chen YA, Sudar D, et al. MITI minimum information guidelines for highly multiplexed tissue images. Nat Methods 2022;19:262–7.

14. Drewes JL, Chen J, Markham NO, Knippel RJ, Domingue JC, Tam AJ, et al. Human colon cancer-derived Clostridioides difficile strains drive colonic tumorigenesis in mice. Cancer Discov 2022;12:1873–85.

15. Hong R, Koga Y, Bandyadka S, Leshchyk A, Wang Y, Akavoor V, et al. Comprehensive generation, visualization, and reporting of quality control metrics for single-cell RNA sequencing data. Nat Commun 2022;13:1688.