T cells are central players in cancer immunotherapy. Despite much concentrated effort on the study of tumor-infiltrating lymphocytes (TIL), such as T cells, a series of fundamental properties, including heterogeneity, clonal expansion, migration, and functional state transition, remain elusive. Advances in single-cell sequencing have enabled the detailed characterization of immune cells in tumors and have vastly improved our understanding of less-defined cell subsets. Here, we discuss the current strategies for uncovering the heterogeneity of TILs, and how the deep transcriptome coupled with T-cell receptor analysis enhances the understanding of detailed properties of T-cell subsets. We further discuss the identification of novel T-cell markers with therapeutic or prognostic potential, and highlight distinct T-cell properties among different cancer indications.
Cancer immunotherapies, such as immune-checkpoint blockade, are clinically effective for multiple cancer indications, such as melanoma and non–small cell lung cancer, although the clinical benefit is not uniform (1, 2), and T cells have been a focal point in these immunotherapies. The activation of T cells is mediated through antigen recognition by TCRs and is regulated by a balance between costimulatory and inhibitory signals (3). However, tumor-infiltrating lymphocytes (TIL) are a heterogeneous population of cells regarding their cell-type composition, gene and protein expression, and functional properties (4). These characteristics might contribute to different responses to immune-checkpoint blockade, and it is imperative to decipher the fundamental properties of such diverse T cells in the complex tumor microenvironment.
Although expression profiling is important for the functional interpretation for various T-cell types, a series of additional properties such as clonal expansion, migration, and functional state transition are also potentially important for antitumor immunity by T cells. For example, because T-cell exhaustion or dysfunction represents a distinct epigenetic state (5, 6), the understanding of the transition to and from such a state might inspire new strategies for T-cell revitalization. Systemic antitumor immune responses require the migration of T cells, especially effector T cells, and elevated T-cell migration into tumors could enhance cancer immunotherapy responses (7).
Detailed understanding of such T-cell properties at the single-cell level in the context of different diseases has been challenging due to the limitations of previous technologies. Emerging advances in single-cell sequencing technologies provide new opportunities to unearth such hidden characteristics (8, 9). Besides the T cells themselves, tumor cells may also affect how T cells are shaped in the tumor microenvironment. Single-cell studies focusing on revealing the evolution and heterogeneity of tumor cells have been reviewed elsewhere (10). In this perspective, we discuss different strategies used in the characterization of TILs by single-cell sequencing, describe key findings from studies, and highlight the importance of performing such analysis for different cancer indications. We also provide our views of future directions that might aid tumor immunotherapies.
Technologies Used to Uncover the Broad Heterogeneity of Immune Cells
Advances in single-cell RNA-sequencing (scRNA-seq) technologies
Over the years, single-cell technologies have leveraged advances in cytometry, microscopy, and next-generation sequencing to categorize immune cells (8). Based on traditional technologies, such as flow cytometry, advances in mass cytometry have enabled the simultaneous measurement of over 40 proteins in millions of single cells by detecting metal-conjugated antibodies (11). These technologies have been applied to reveal the heterogeneity of immune cells and to capture their functional states. Two single-cell mass cytometry studies of immune cells from 32 patients with non–small cell lung carcinoma (NSCLC) and 76 patients with clear-cell renal cell carcinoma reveal diverse immune cell subsets with different phenotypic markers (12, 13). Because the pertinent antibodies only covered dozens of genes, mostly cell-surface markers selected in advance, this strategy is best at analyzing the distribution of immune cells with known phenotypes across different samples but is limited at finding novel markers or subtypes with uncharacterized properties.
scRNA-seq provides an unbiased profiling of immune cells without prior gene selection (14–25), which enables the classification of different subsets and the identification of novel markers or regulators for each subset (Fig. 1A). Since its inception (26), scRNA-seq has evolved rapidly in its detection accuracy and scale. Among plate-based protocols for full-length mRNA detection, SMART-Seq2 (27) is commonly used and has the advantages of a high RNA recovery rate, high detection accuracy, and the capability to capture rare transcripts. However, its drawbacks include relatively high cost and laboriousness. Subsequently, pooled approaches (e.g., CEL-seq and MARS-seq) and droplet-based, massively parallel approaches (e.g., Drop-seq, inDrop, and 10× Genomics) were developed (28). Given its advantages of low cost, easy handling, and capability to capture more cells, 10× Genomics is gaining momentum as a widely used commercial solution for surveying multiple cell subtypes (20–23) and for constructing the human cell atlas. However, due to the nature of 3′-end sequencing, this approach has lower RNA capture efficiency and higher drop-out rates (frequency of undetected genes) than full-length sequencing, which might hamper its utility to explore subtle differences among cell subsets, recover splicing patterns, and reconstruct the various receptors of immune cells. Comprehensive comparisons of the strengths and weaknesses of these scRNA-seq technologies have been provided in reviews by us and others (8, 28, 29) and will not be the focus of this perspective.
Strategies to dissect TIL heterogeneity
One critical issue to consider when designing the study of single TILs is how to balance the size of the patient cohort and the number of cells sequenced. The objectives of individual studies may dictate the choice of appropriate strategies to achieve the sensitivity, depth, and scale within the reach of a fixed budget. For instance, studies of CD45+ cells from melanoma (14) and head and neck tumors (18) have revealed multiple immune cell subtypes, including three T-cell subsets: CD8+ T cells, CD4+ T-helper (TH) cells, and regulatory T cells (Tregs). Although such uses of the pan-immune sorting strategy cover more cell types, the small number of cells acquired for each cell type limits the ability to evaluate their more detailed characteristics.
In contrast to the above broader survey of immune cells across many donors, more concentrated efforts can be applied to a specific category of cells (e.g., T cells) to capture rare populations and obtain detailed characteristics. Using this strategy, our laboratory has applied SMART-Seq2 to analyze around 1,000 T cells per donor for patients with hepatocellular carcinoma (HCC), NSCLC, or colorectal cancer (Fig. 1A; refs. 4, 19, 24). High sequencing depth allowed detection of genes with lower expression and the assembly of full-length TCRs, which led to identifying TCRαβ pairs and, thus, the precise clonal lineages of the T cells.
Focusing on depth instead of breadth can be more effective for addressing questions that are independent of patient numbers (Fig. 1B). For example, single Tregs isolated from human HCC, NSCLC, and colorectal cancer tumors exhibited a stronger suppression signature compared with those isolated from peripheral blood, in line with what would be expected for this class of TILs (4, 19, 24). The larger number of cells obtained per patient enabled the detection of rare cell populations (Fig. 1B), such as CD8+FOXP3+ T cells in HCC tumors (4). Given the diverse nature of TCR repertoires, sequencing small numbers of T cells underestimates the abundances of the clonal population (Fig. 1B). Because T cells can migrate or differentiate into different subsets, higher T-cell numbers per patient could maximize the finding of clonal cells across the spectrum of phenotypic clusters.
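The effect of cell number on clone detection can be illustrated with a simple calculation. The sketch below is not taken from the cited studies; it assumes that cells are sampled independently from a large pool (a binomial model) and that a clone counts as "detected as clonal" when at least two of its cells are sequenced. The function name and the threshold are illustrative choices.

```python
from math import comb


def prob_clone_detected(clone_freq: float, n_cells: int, min_copies: int = 2) -> float:
    """Probability that a clone present at frequency `clone_freq` is seen at
    least `min_copies` times among `n_cells` sequenced cells.

    Assumption: each sequenced cell is an independent draw from a large
    population, so the number of cells observed from the clone is binomial.
    """
    # Probability of seeing fewer than `min_copies` cells from the clone.
    p_too_few = sum(
        comb(n_cells, k) * clone_freq**k * (1.0 - clone_freq) ** (n_cells - k)
        for k in range(min_copies)
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    # Compare a shallow and a deep sampling of a clone at 0.5% frequency.
    for n in (100, 1000):
        print(n, round(prob_clone_detected(0.005, n), 3))
```

Under this model, a clone at a fixed low frequency is much less likely to be recognized as expanded when only a small number of cells is sequenced per patient, which is one way to read the underestimation described above.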
Limitations of scRNA-seq technologies
Despite the intensive improvements of scRNA-seq technologies, substantial limitations and challenges remain. Current scRNA-seq methods are limited to capturing polyadenylated RNA transcripts, resulting in the loss of nonpolyadenylated noncoding RNAs. Another limitation is the lack of information on the spatial distribution of single cells and their interactions in the tumor microenvironment, although various experimental and computational methods are being developed to resolve this limitation, as reviewed elsewhere (30). Finally, scRNA-seq only captures transcriptomic properties, whereas other properties, such as proteomic and epigenetic features, are not captured. To address this, multiomic single-cell technologies or other approaches that integrate these features with the transcriptomic data could be used (reviewed in ref. 31). For example, cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) provides a more detailed characterization of cellular phenotypes of cord blood mononuclear cells than transcriptomic measurement alone (32). Transcript-indexed assay for transposase-accessible chromatin with sequencing (T-ATAC-seq) allows for the identification of epigenomic landscapes in clonal T cells, as one study demonstrated in healthy donors and leukemia patients (33). These technologies could enhance our understanding of immune environments in the future.
Characterizing T-cell Subsets and Identifying Targetable Markers
Exhausted/dysfunctional CD8+ T cells
T cells with prolonged exposure to antigens can enter a state of exhaustion or dysfunction, which is characterized by the elevated expression of inhibitory receptors (e.g., PD-1, CTLA-4, TIM-3, TIGIT, and LAG3) and a hierarchical loss of effector functions (34, 35). Although it is still debatable how exhaustion is related to various properties, such as cell proliferation and effector functions, single-cell studies have enhanced our understanding of these cells in tumors. It has been reported that during chronic infections, the EomeshiPD-1hi terminally exhausted CD8+ T cells are more proliferative than T-bethiPD-1int precursor cells (36). However, human tumors present a more complex picture. On the one hand, clonal expansion and proliferation capability are consistently observed in exhausted T cells or CD8+PD1hi tissue-resident memory T (TRM) cells in tumors isolated from colorectal cancer, NSCLC, breast cancer, and melanoma patients (22, 24, 25, 37). On the other hand, the dysfunctional status of these cells cannot be separated by their high or low proliferation as it can in chronic infections (24, 37). A single-cell study using MARS-seq and TCR-seq in 25 melanoma patients shows that proliferation is a key feature in early-stage CD8+ T-cell dysfunction (25). Although exhausted T cells have loss or low expression of specific cytotoxic molecules, they exhibit a unique cytokine and cytotoxicity signature, suggesting that these exhausted T cells might not have completely lost their antitumor effector functions in vivo (24, 25).
A focal point of characterizing exhausted T cells is the lineage connection between these cells and others in the tumor microenvironment. The chronic LCMV murine infection model reveals that the T-bethiPD-1int progenitor cells give rise to the EomeshiPD-1hi terminally differentiated exhausted T cells (36). However, evidence for such relationships has been scarce in human cancers due to technical limitations. Clonal T-cell analysis provides an alternative to evaluating such relationships. Our TCR sharing analysis of HCC and colorectal cancer CD8+ TILs reveals a developmental connection between exhausted T cells and GZMK+ T cells, whereas the latter also exhibits a connection to effector T cells (4, 24). The NSCLC T-cell analysis gives a more complicated scenario, with two different “preexhausted” T-cell populations (CD8+GZMK+ and CD8+ZNF683+ T cells) having a connection with exhausted T cells and only GZMK+ T cells exhibiting a connection to effector T cells (19). Further investigation of the TCR specificity with emerging techniques may unmask developmental trajectory regulators and the tumor reactivity of these T cells. Although the specific regulators controlling the functional transition to T-cell exhaustion have not been fully reported, transcription factors with high expression in these cells need to be thoroughly investigated (24).
Given the clinical importance of reinvigorating T-cell functions, different checkpoint blockade agents, such as the PD-1 or PD-L1 antibodies, have been successfully developed (1, 2). Tumors refractory to checkpoint blockade are often thought to signal via other inhibitory pathways (35), pointing to the need to further characterize and identify novel markers on TILs. Our HCC T-cell study identified a number of markers in exhausted T cells, including LAYN, PHLDA1, and SNAP47 (4). Functional analysis revealed that the mutually exclusive pattern of LAYN and LAG3, as well as the overexpression of LAYN, in primary CD8+ T cells resulted in the inhibition of IFNγ production, suggesting a regulatory function of LAYN in T-cell exhaustion. The high expression of these genes was also confirmed in our NSCLC and colorectal cancer study of CD8+ exhausted T cells (19, 24) and by another independent study of NSCLC CD8+PD-1hi T cells (37), which demonstrated significant alterations of transcriptional and metabolic profiles, as well as a high capacity for tumor recognition. Further characterization of other signature genes of CD8+ exhausted T cells may also reveal novel therapeutic targets.
Tregs and their heterogeneity
Treg subtypes have also been heavily investigated in the context of immunotherapy. Among the aforementioned signature genes, LAYN and PHLDA1 were also highly expressed in tumor-infiltrating Tregs (4, 19, 24). Phenotypic and functional analyses revealed that LAYN is preferentially upregulated in FOXP3+Helios+ Tregs, suggesting the association of LAYN expression with more repressive and stable Tregs (4). CCR8 and IL1R2 also exhibit specific expression in HCC, NSCLC, and colorectal cancer Tregs (4, 19, 24). IL1R2 has high expression in activated Tregs, characterized by high expression of TNFRSF9 (also known as 4-1BB) in NSCLC, and its high expression correlates with poor prognosis in lung adenocarcinoma (19). These genes, together with LAYN, should be investigated as potential therapeutic targets.
Understanding the different origins of Tregs and their interactions with conventional T cells will help understand the heterogeneity of Tregs and how they are shaped in the tumor microenvironment. A study using bulk TCRβ sequencing of breast cancer-associated Tregs revealed that these Tregs have little TCR sharing with conventional T cells, suggesting that Tregs in tumors are mainly generated through local expansion (38). Consistent with this study, the HCC-, NSCLC-, and colorectal cancer–infiltrating Tregs also show a large proportion of unique TCR clonotypes expanded in tumors (4, 19, 24), suggesting their local expansion characteristics and potential for recognizing tumor-associated antigens. Nevertheless, a small portion of tumor-infiltrating Tregs has been observed to share TCRs with conventional TH cells (4, 19, 24). For example, certain colorectal cancer–infiltrating Tregs exhibit TCR sharing with TH17 and CD4+CXCL13+ TH cells, suggesting the potential for trans-differentiation between Tregs and different TH subsets inside colorectal cancer (24). It would be intriguing to identify the different transcriptional programs driving Tregs into distinct functional groups within tumors.
TH1-like T cells, the newly defined T cells enriched in MSI colorectal cancer
Aside from exhausted T cells and Tregs, the single-cell colorectal cancer study also revealed a less-characterized T-cell subset, CD4+CXCL13+ TH cells, which are defined as TH1-like cells and preferentially enriched in microsatellite instable (MSI) colorectal cancer tumors (24). Clinical trials have demonstrated that anti–PD-1 response is associated with the MSI but not MSS status of colorectal cancer patients (39). However, the underlying differences of T cells among these patients remain unclear. Although previous studies have revealed the elevated IFNG expression in MSI patients (40, 41), the detailed T-cell types were undercharacterized due to technical limitations. Indeed, the colorectal cancer single T-cell study demonstrated that although two IFNG-expressing CD4+ TH cell types were identified, only the CXCL13+ TH1-like cells were enriched in MSI colorectal cancer (24). Deep transcriptome data enabled the finding of potential novel regulatory genes, such as IGFLR1 and BHLHE40 in this T-cell subset. BHLHE40, a basic helix–loop–helix family member E40 that can promote IFNγ production independent of T-bet and repress IL10 production from TH1 cells (42, 43), may help define and differentiate this subset from conventional TH1 cells.
IGFLR1 (IGF-like family receptor 1) is a putative TNFR family member with uncharacterized function (44) and is preferentially expressed in the newly defined CXCL13+BHLHE40+ TH1-like subset (24). IGFLR1 has three potential ligands, with IGFL1 and IGFL3 showing high interaction affinity (44). Functional analysis reveals that the expression of IGFLR1 is upregulated in CD4+ memory T cells upon activation and that the IGFL3/IGFLR1 axis can enhance IFNγ production, suggesting a costimulatory function of IGFLR1 (24). Given the clinical significance of costimulatory molecules in T cells (45), it is tempting to speculate that activation of T cells through IGFLR1 could provide a new avenue for antitumor immunotherapy.
Besides the aforementioned molecular characteristics, the CXCL13+BHLHE40+ TH1-like cells also exhibit high clonal expansion and proliferation, indicating their active status in tumors. Because CD4+ T cells in peripheral blood have been shown to recognize neoantigens to mediate tumor killing of melanomas (46), further illustration of the phenotypic and functional relevance of CXCL13+BHLHE40+ TH1-like cells in multiple tumors will be of great interest.
Dynamic relationships of T cells
Single-cell lineage tracing can be achieved by genetic labeling or transcriptome-based trajectory prediction and has been widely used in delineating tissue development in zebrafish, mouse, and human embryos (reviewed in ref. 47). However, for the dynamic relationships of T cells, tracking them in human tissues remains challenging. Although traditional TCRα or TCRβ sequencing has been widely used for determining the clonality of T cells from bulk samples (48), such methods are limited in unmasking the phenotypic differences of T cells with the same TCR clonotypes, hindering the assessment of T-cell relationships. In contrast, the high diversity of TCRs revealed by scRNA-seq allows us to use these TCRαβ pairs as unique identifiers to track the dynamics of T cells based on the assumption that T cells with identical TCRs arise from a common T-cell clone (49).
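The idea of using TCRαβ pairs as clonal identifiers can be sketched in a few lines. This is a minimal illustration, not the pipeline of any cited study: the `Cell` record, its field names, and the use of one α and one β sequence string per cell are assumptions made for the example, and the sequences in the usage snippet are made up.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple

Clonotype = Tuple[str, str]


class Cell(NamedTuple):
    """Hypothetical per-cell record: phenotypic cluster plus paired TCR chains."""
    cell_id: str
    cluster: str
    tcr_alpha: str
    tcr_beta: str


def clonotype_of(cell: Cell) -> Clonotype:
    # Cells with identical alpha/beta pairs are assumed to share a common ancestor.
    return (cell.tcr_alpha, cell.tcr_beta)


def expanded_clonotypes(cells: Iterable[Cell]) -> Dict[Clonotype, int]:
    """Clonotypes observed in at least two cells, with their cell counts."""
    sizes = Counter(clonotype_of(c) for c in cells)
    return {clone: n for clone, n in sizes.items() if n >= 2}


def shared_clonotypes(cells: Iterable[Cell], cluster_a: str, cluster_b: str) -> Set[Clonotype]:
    """Clonotypes found in both clusters, hinting at a lineage connection."""
    by_cluster: Dict[str, Set[Clonotype]] = defaultdict(set)
    for c in cells:
        by_cluster[c.cluster].add(clonotype_of(c))
    return by_cluster[cluster_a] & by_cluster[cluster_b]


if __name__ == "__main__":
    demo = [
        Cell("c1", "exhausted", "ALPHA_1", "BETA_1"),
        Cell("c2", "GZMK", "ALPHA_1", "BETA_1"),
        Cell("c3", "exhausted", "ALPHA_2", "BETA_2"),
    ]
    print(expanded_clonotypes(demo))
    print(shared_clonotypes(demo, "exhausted", "GZMK"))
```

Because the same clonotype key is attached to each cell's transcriptome-derived cluster, clonal cells with different phenotypes can be compared directly, which is what bulk TCRα or TCRβ sequencing cannot provide.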
A proof-of-concept single-cell study revealed a distinct T-cell clonal expansion in HCC and demonstrated their developmental trajectory using integrated transcriptome and TCR analyses (4). This approach has also been used in an NSCLC study to reveal the intertissue and intratissue clonal expansion of effector T cells, which sheds light on TIL mobility (19). To eliminate the possible influence of sampling bias in such nonquantitative analyses, a new analysis framework, STARTRAC, was developed in our colorectal cancer study to quantitatively and systematically analyze the clonal expansion, tissue migration, and developmental transition of T cells (24). STARTRAC analyses provide additional insights into the properties of T cells. For example, this approach showed that four subsets of T cells, namely CD8+ exhausted T cells, Tregs, TH1-like cells, and TH17 cells, were all preferentially enriched and clonally expanded in colorectal cancer tumors. It is of interest to identify the antigens that these T cells recognize and to determine the interactions of these T cells. Another example is the high mobility of both CD8+ and CD4+ effector T cells, measured by the migration index. As supported by the expression of migration-related molecules, these cells show a tissue-homing potential. Finally, the state transition index demonstrated that CXCL13+ TH1-like cells exhibit developmental connections with CD4+GZMK+ memory T cells, indicating the potential origins of these CXCL13+ TH1-like cells (24). One limitation of these analyses is the lack of rigorous methods to define the migration direction and the precursor–progeny relationships of these T cells. Nevertheless, with the accumulating data generated by massively parallel methods integrated with TCR V(D)J sequencing (20), the STARTRAC approach can be applied to track the dynamic relationships of T cells in other diseases.
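The text above does not give the formulas behind the expansion and migration indices, so the following is only a sketch of how such quantities could be computed from clonotype labels using Shannon entropy. The function names, the input layout, and the exact definitions are assumptions made for illustration and should not be read as the published STARTRAC implementation.

```python
from collections import Counter, defaultdict
from math import log2, nan
from typing import Iterable, List, Tuple


def shannon_entropy(counts: Iterable[int]) -> float:
    """Shannon entropy (bits) of a distribution given as raw counts."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return sum((c / total) * log2(total / c) for c in counts)


def expansion_index(clonotypes: List[str]) -> float:
    """Assumed definition: 1 - normalized entropy of clonotype sizes in one cluster.

    0 means every cell carries a different TCR; values closer to 1 mean a
    few clonotypes dominate. Undefined (NaN) with fewer than two clonotypes.
    """
    sizes = Counter(clonotypes)
    if len(sizes) < 2:
        return nan
    return 1.0 - shannon_entropy(sizes.values()) / log2(len(sizes))


def migration_index(cells: List[Tuple[str, str]]) -> float:
    """Assumed definition: for the cells of one cluster, given as
    (clonotype, tissue) pairs, average the per-clonotype entropy across
    tissues, weighting each clonotype by its share of cells.
    """
    tissues_by_clone = defaultdict(Counter)
    for clonotype, tissue in cells:
        tissues_by_clone[clonotype][tissue] += 1
    n_cells = len(cells)
    return sum(
        (sum(tissue_counts.values()) / n_cells) * shannon_entropy(tissue_counts.values())
        for tissue_counts in tissues_by_clone.values()
    )


if __name__ == "__main__":
    print(expansion_index(["t1", "t1", "t1", "t2"]))
    print(migration_index([("t1", "tumor"), ("t1", "blood"), ("t2", "tumor")]))
```

A transition-type index could be sketched in the same way by replacing the tissue label with the phenotypic cluster label. As the paragraph above notes, entropy-based measures of this kind quantify sharing but do not indicate the direction of migration or differentiation.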
Distinct T-cell patterns in different cancers
Findings on TILs from one cancer type may not directly apply to others, and different cancer types also exhibit distinct responses to checkpoint blockade (2, 39), further supporting potential differences of tumor-infiltrating immune cells among cancer types. Much as separate Cancer Genome Atlas studies are needed to understand genomic characteristics in various cancer types, the composition, function, and clonal properties of TILs are expected to vary among distinct cancer types. A meta-analysis of single T cells from HCC, NSCLC, and colorectal cancer elucidated a similar composition of T-cell subsets based on reclustering of the combined data set, indicating that combining data could help identify robust and stable T-cell subsets (24). In contrast to the similar pattern of T cells in blood from different cancer patients, T cells from different tumors and from different adjacent normal tissues exhibit distinct patterns. For T cells in tumor tissues, the main CD8+ T-cell subtypes in HCC and colorectal cancer are effector memory T (TEM) cells and exhausted T (TEX) cells, whereas NSCLC exhibits a high abundance of CD8+ ZNF683+ TRM cells (Fig. 1C). For CD4+ T cells, TH17 cells are uniquely enriched in a subpopulation of colorectal cancer patients. Although the CD4+CXCL13+ TH cells can be detected in all three cancer types, these cells exhibit significant enrichment in MSI colorectal cancer patients.
The underlying mechanisms of different T-cell patterns in tumors are still unknown. One possible explanation is that these T-cell patterns are attributed to their tissue origins. Indeed, three T-cell subtypes, including CD8+ TRM, CD8+ intraepithelial lymphocytes (IEL), and CD4+ follicular helper T (TFH) cells, are uniquely enriched in the normal mucosa of colorectal cancer patients, whereas mucosa-associated invariant T (MAIT) cells and effector T (TEFF) cells are much more abundant in the normal tissues of HCC and NSCLC patients, respectively (24). These analyses demonstrate that certain T-cell subtypes are tissue-specific, and cancer types can also shape T-cell subtypes and states. It is also conceivable that the distinct T-cell composition and abundance may contribute, at least in part, to how different cancers respond to immunotherapies. As technologies and analysis strategies are updated, it may become preferable to perform “pan-cancer” single-cell immune profiling, which will allow systemic comparison and comprehensive understanding of T cells across a wide spectrum of cancers.
Here, we have primarily discussed studies highlighting how integrated transcriptome and TCR analyses at the single-cell level can aid the discovery of T-cell subsets, distinct cell states, heterogeneity in certain subsets, and their potential utilities. Such approaches can potentially be applied to other cell types or diseases. With the rapid accumulation of single-cell data, we anticipate that many rare cell types will be discovered and characterized at a resolution previously unseen.
The mechanisms of checkpoint resistance need to be urgently addressed in immunotherapy. One appealing approach to address this problem is to collect TILs for single-cell analysis before and after treatment. Another approach is to compare T cells in a given cancer type with diverse treatment outcomes. For example, for gastric, endometrial, and colorectal cancers, patients with or without mismatch-repair deficiency exhibit distinct responses to anti–PD-1 treatment (39), and the specific increase of CXCL13+BHLHE40+ TH1-like cells in MSI colorectal cancer patients may explain why these patients respond better to checkpoint blockade.
Also of interest is to identify tumor-specific T cells that recognize specific neoantigens. Future studies could potentially identify specific tumor antigens or neoantigens that are recognized by expanded CD8+ T cells or Tregs. Tumor-recognizing TCR pairs should be explored for their potential in T-cell–based immunotherapies, such as the adoptive cell therapy involving transferring TCR-engineered T cells into tumors, which complements the widely popular CAR-T approaches (50).
As single-cell technologies advance, we anticipate that the interrogation of single-cell RNA-seq and other omic technologies will allow not only the depiction of different functional states of diverse immune cells, but also the identification of key regulators of cell-fate decisions and the understanding of cell–cell interactions. Such rich knowledge of the “tumor immune atlas” should provide important guidance to future immunotherapy design and clinical practice.
Disclosure of Potential Conflicts of Interest
Z. Zhang is a founder of Analytical BioSciences Inc., reports receiving commercial research grant from Amgen, Bayer, and Boehringer Ingelheim, has ownership interest (including stock, patents, etc.), and is a consultant/advisory board member for GennLife Inc. No potential conflicts of interest were disclosed by the other author.
Conception and design: L. Zhang, Z. Zhang
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Z. Zhang
Writing, review, and/or revision of the manuscript: L. Zhang, Z. Zhang
Study supervision: L. Zhang, Z. Zhang
This work was supported by grants from the Beijing Advanced Innovation Center for Genomics at Peking University, Key Technologies R&D Program (2016YFC0900100 and 2016YFC0902300), and the National Natural Science Foundation of China (31530036 and 91742203). Dr. Lei Zhang was supported by the Postdoctoral Foundation of Center for Life Sciences at Peking University–Tsinghua University.