Abstract
Immune checkpoint inhibitors (ICI) represent the cornerstone for the treatment of patients with metastatic clear cell renal cell carcinoma (ccRCC). Despite a favorable response for a subset of patients, others experience primary progressive disease, highlighting the need to precisely understand the plasticity of cancer cells and their cross-talk with the microenvironment to better predict therapeutic response and personalize treatment. Single-cell RNA sequencing of ccRCC at different disease stages and normal adjacent tissue (NAT) from patients identified 46 cell populations, including 5 tumor subpopulations, characterized by distinct transcriptional signatures representing an epithelial-to-mesenchymal transition gradient and a novel inflamed state. Deconvolution of the tumor and microenvironment signatures in public data sets and data from the BIONIKK clinical trial (NCT02960906) revealed a strong correlation between mesenchymal-like ccRCC cells and myofibroblastic cancer-associated fibroblasts (myCAF), which are both enriched in metastases and correlate with poor patient survival. Spatial transcriptomics and multiplex immune staining uncovered the spatial proximity of mesenchymal-like ccRCC cells and myCAFs at the tumor–NAT interface. Moreover, enrichment in myCAFs was associated with primary resistance to ICI therapy in the BIONIKK clinical trial. These data highlight the epithelial–mesenchymal plasticity of ccRCC cancer cells and their relationship with myCAFs, a critical component of the microenvironment associated with poor outcome and ICI resistance.
Single-cell and spatial transcriptomics reveal the proximity of mesenchymal tumor cells to myofibroblastic cancer-associated fibroblasts and their association with disease outcome and immune checkpoint inhibitor response in clear cell renal cell carcinoma.
Introduction
Renal cell carcinomas (RCC) are the most frequent malignant neoplasms arising from the kidney. Clear cell renal cell carcinoma (ccRCC) is the most common RCC representing 75% of all cases, with an estimated 315,000 new patients worldwide per year (1, 2). Extensive studies identified genetic and epigenetic alterations driving oncogenesis, with frequent mutations including VHL inactivation followed by mutations in various genes involved in chromatin remodeling complexes (PBRM1, SETD2, and BAP1; refs. 3–6). Recent single-cell transcriptomics described the ccRCC tumor microenvironment (TME) identifying co-occurrence between exhausted CD8+ T cells and M2-like tumor-associated macrophages (TAM) in advanced forms of the disease and their association with immune checkpoint inhibitor (ICI) resistance (7–9). Furthermore, the association between lack of ICI response and enrichment in endothelial cells was also described (10). Nevertheless, the potential plasticity of ccRCC cancer cells and their interactions with the TME have not been investigated. Similarly, the role of cancer-associated fibroblasts (CAF) and their potential cell of origin are still poorly described in ccRCC as compared with various other cancer types (11, 12).
To address these questions, we performed single-cell RNA sequencing (scRNA-seq) of 56,421 cells identifying 5 tumor cell states forming an epithelial-to-mesenchymal gradient together with a novel inflamed population expressing MHC class II genes. We further identified that myofibroblastic (my)CAFs strongly associate with mesenchymal-like ccRCC tumor cells reflecting their spatial proximity within tumors. Notably, these populations were selectively enriched at metastatic ccRCC sites, and the presence of myCAFs strongly associated with resistance to ICI in the BIONIKK clinical trial.
Materials and Methods
Patient samples
The seven ccRCC samples and the two normal adjacent tissues (NAT) subjected to scRNA-seq were collected from the pathology department of Strasbourg University Hospital. Sample collection for further research analysis was approved by the local ethics committee, and all patients signed an informed written consent for the use of the material for research. The seven ccRCC samples represented various stages/grades of tumor evolution. NAT sample N2 is the associated normal adjacent tissue of tumor sample T4, whereas N1 comprises 3 different NAT samples, one of which matched to T3 and 2 from patients where no corresponding tumor samples were obtained. N1 and N2 were combined to provide more tubule diversity.
Single-cell sample preparation and sequencing
Following resection, samples from the tumor and normal adjacent tissue were conserved in 1 mL of MACS Tissue Storage solution (Miltenyi Biotech) at 4°C for up to 24 hours. Single-cell suspensions were then prepared using gentleMACS dissociator and human tumor kit dissociation (Miltenyi Biotech) according to the manufacturer's instructions. Briefly, samples were transferred to C tubes containing 4.7 mL of DMEM at room temperature and minced to segments of <5 mm3. 100 μL enzyme H, 50 μL enzyme R, and 12.5 μL enzyme A were added to each C tube and loaded on the dissociator using program h_tumour_01. Tubes were then incubated for 30 minutes at 37°C at 130 rpm and loaded again on the dissociator and program h_tumour_02 was run followed by another identical period of incubation. Finally, tubes were loaded a third time on the dissociator and program h_tumour_03 was run. The cell suspension was applied on MACS SmartStrainer 70 μm placed on a 15-mL Falcon tube, and 10 mL DMEM was used to wash C tubes and the filter. Following centrifugation of 10 minutes at 300 × g, 4°C, cells were resuspended in 40% DMEM, 50% FCS, and 10% DMSO, placed in Cryotubes and gradually frozen to −80°C. On the day of the single-cell experiment, cells were quickly thawed at 37°C until only a tiny ice crystal remained and dropwise transferred to 10 mL of prewarmed DMEM + 10% FCS media. The cryotube vial was washed with 2 mL of DMEM + 10% FCS and added to the main 10-mL solution. Samples were centrifuged for 10 minutes at 300 × g, 4°C. Dead cells were removed using a dead cell removal kit (Miltenyi Biotech) by resuspending in the correct volume of dead cell removal microbeads incubation for 15 minutes at room temperature and passaged through an MS column placed in a separator magnetic field. The column was rinsed with binding buffer, and unlabeled cells were collected in a 15-mL Falcon tube placed in ice as live-cell fraction. Cells were centrifuged again to be resuspended in smaller volume. Cell count and viability were assessed by mixing cells with trypan blue (1:1 ratio) and using a Malassez counting chamber. 15,000 cells were then loaded on the Chromium 10X Genomics to be captured for subsequent preparation of the 3′-mRNA single-cell libraries following the manufacturer's instructions. Libraries were sequenced 2 × 100 bp on either NextSeq550 or HiSeq4000 sequencers.
Individual sample processing of scRNA-seq data
After sequencing, raw reads were processed using CellRanger (v3.1) to align on the hg19 human genome, remove unexpressed genes and quantify barcodes and UMIs. Data were then analyzed in R (v4.0.2). Samples are first processed individually using Seurat (v3.2.0) with the recommended workflow. Cells were filtered to keep only cells with feature counts ranging from 200 to 4,500 and a percentage of mitochondrial reads <20%. Potential doublets were removed using the DoubletFinder package assuming a doublet formation rate of 0.8% per 1,000 cells reported by CellRanger. Counts were normalized with the “LogNormalize” method and data were scaled to remove unwanted sources of variation (number of features and mitochondrial reads). Highly variable genes were determined by the “VariableFeatures()” and PCA was performed on these genes with the “RunPCA()” function. The number of principal components to use was determined from the plots generated from the “JackStraw()” function and generally ranged from 20 to 30. Clustering was performed using functions “FindNeighbors()” and “FindClusters()” with a resolution of 0.8 on the most significant principal components. Cluster markers are computed with the function “FindAllMarkers()” and cells were broadly labeled by manually validating automated labeling using R package clustifyr (v1.6.0).
Analysis of merged samples
In order to correct batch effect when merging samples, 3 different methods were tested: “SCTransform(),” “IntegrateData()” following the Seurat integration vignette, and the R package Harmony using sample names as the batch variable. In our data set, the linear correction approach with “SCTransform()” produced a better result than Seurat integrate and Harmony that showed excessive bias probably due to the similar level of gene capture in samples and their asymmetric compositions. All 9 prefiltered samples with broad cell labels were merged using the “merge.Assay” function of Seurat. Data were normalized and scaled with batch effect corrected according to sample name using “SCTransform()” then an initial clustering was performed using a resolution of 0.8 and the 30 most significant principal components. Based on cluster marker genes, cells from 6 compartments (cancer cells, normal epithelium, fibroblasts, endothelium, lymphoid cells, and myeloid cells) were extracted with the “subset()” function and submitted to reclustering. Misclustered cells were removed based on their broad labeling and then data were normalized, scaled, and reclustered by using varying principal components and resolutions to find optimal parameters for each compartment. Cells were annotated according to their marker genes in the different subclustering analysis and these new cluster labels were transferred to the global analysis.
Cell typing and gene signatures
To phenotype clusters, both automatic (Clustifyr) and manual techniques were used. Known lineage markers for immune populations [CD45 (pan-immune), CD3E (pan T cells), CD4 (CD4+ T cells), CD8A (CD8+ T cells), CD79A (B cells), NKG7 (NK cells), CPA3 (Mast cells), CD68/LYZ (macrophages), CD16/CD14 (monocytes)], endothelial population (PLVAP), mesangial population (ACTA2, RGS5), epithelial population (EPCAM, KRT8, CDH16), and tumor population (CA9, ANGPTL4) were projected on the Uniform Manifold Approximation and Projection (UMAP). Subsequently, each major cell type was reclustered to identify specific subpopulations, and differentially expressed genes for each subcluster were determined. Cell identification was based on a comparison of the differentially expressed genes of each subcluster with markers previously described in the literature (Supplementary Data Set S1).
Analysis of public scRNA-seq data
For reanalysis of the Krishna and colleagues (8) data set (SRA, accession number: PRJNA705464), we retrieved the available precomputed RDS Seurat object and set it to v3 architecture with the “UpdateSeuratObject()” function. First, epithelial cells were reclustered by subsetting clusters 9 and 16 (labeled as “PAX8+ epithelium” and “CA9+ ccRCC”) with a resolution of 0.5 and the 30 most significant components, resulting in 10 new clusters. Based on marker genes, some clusters could clearly be identified as normal renal tubules, and the remainder marked by CA9 expression were then isolated and reclustered with a resolution of 0.5 to obtain the final 7 ccRCC clusters.
For reanalysis of the Braun and colleagues (13) data set (dbGaP, accession number phs002252.v1.p1), we retrieved raw counts and cell annotations in CSV format and used it to create a new Seurat object. We applied an analysis similar to our data set with normalization and scaling with batch effect correction according to the “Batch” column present in the annotation file. Global clustering was performed with a resolution of 0.8 using the 30 most significant components that yielded 35 clusters. We found a high visual overlap between these new clusters and the originals by plotting the labels “ClusterName_AllCells” on the UMAP. Epithelial clusters with marked expression of CA9 were reclustered with a resolution of 0.4 to obtain the final 8 ccRCC clusters.
Functional analysis of scRNA-seq data
Data were represented as single-cell heat maps with the “DoHeatmap()” function, as custom heat maps with the pheatmap package and as bubble plots with the “DotPlot()” function. Data were also visualized using the scope website (https://scope.aertslab.org) after converting the Seurat object to loom format using the SCopeLoomR package. To represent gene expression on the global UMAP and avoid overlapping signal, we used the schex package (https://github.com/SaskiaFreytag/schex). Regulome analyses of active transcription factors were performed using the SCENIC v1.1.2.2 package. Gene signatures were computed and visualized on UMAPs using the R package VISION (https://github.com/YosefLab/VISION). Gene set variation analyses were performed using the r-package gene set variation analysis (GSVA), and gene set enrichment analyses (GSEA) were done with the GSEA software v3.0 using the hallmark gene sets of Molecular Signature Database v6.2. Gene Ontology analysis was done using DAVID (https://david.ncifcrf.gov/). Gene list intersections and Venn diagrams were computed by the web-tool Venny (https://bioinfogp.cnb.csic.es/tools/venny/). Cell state trajectories were defined using the R package SWNE v0.6.18 (https://yanwu2014.github.io/swne/; ref. 14).
Ligand–receptor interactions were inferred using the CellPhoneDB python package v2.0 (https://github.com/Teichlab/cellphonedb) with the “statistical_analysis” argument and default parameters (15). In addition, for predicting ligand–receptor interactions regulating specific programs we used NicheNet (16). First, we used cluster MSG as “receiver,” cluster ccRCC.mes as “sender” and the list of upregulated genes (log2FC>1, adj. P < 5%) from the myCAF versus MSG clusters as “geneset_oi” arguments. Then we used cluster ccRCC.mes as “receiver,” cluster “myCAF” as “sender” and the list of 442 ccRCC.mes marker genes from the tumor cell reclustering as “geneset_oi” arguments.
Generation of the BIONIKK data set
RNA from 50 samples from the BIONIKK trial was received for sequencing. In addition, frozen tissues (n = 47) were also received and processed with the AllPrep DNA/RNA kit following the manufacturer's instructions (Qiagen, ref 80204). Briefly, around 30 mg of frozen tissue was plunged in 600 μL of RLT Plus buffer supplemented with 40 mmol/L of DTT and disrupted using an UltraTurrax for 30 seconds. Tubes were centrifuged at 10,000 × g, at RT for 3 minutes and the supernatant was transferred to an AllPrep DNA spin column. After centrifugation at 10,000 × g, room temperature for 30 seconds, 1 volume of 70% EtOH was added to the flow-through, mixed by pipetting and transferred to an RNeasy spin column. After centrifugation, the column was sequentially washed once with RW1 buffer and twice with RPE buffer. After drying, the RNA was eluted twice with 20 μL of RNAse-free water, and sample quality was assessed by BioAnalyzer. Libraries for total rRNA-depleted RNA were prepared for the 97 qualified samples and paired-end 2×100 bp sequencing was performed on a HiSeq4000 sequencer.
Analysis of bulk RNA sequencing
RNA-seq data from the BIONIKK clinical trial were generated by total rRNA-depleted paired-end RNA-seq. After sequencing, raw reads were preprocessed in order to remove the adapter and low-quality sequences (Phred quality score below 20) using cutadapt version 1.10, and reads shorter than 40 bases were discarded. Residual reads mapping to rRNA sequences using bowtie version 2.2.8 were also removed. Reads were mapped onto the hg19 assembly using STAR version 2.5.3a. Gene-expression quantification was performed from uniquely aligned reads using htseq-count version 0.6.1p1, with annotations from Ensembl version 75 and “union” mode. Only nonambiguously assigned reads were retained for further analyses. Read counts were normalized across samples with the median-of-ratios method and across genes by using the median transcript length. For KIRC-TCGA, the raw data files were downloaded and renormalized using the Bioconductor package DESeq2 version 1.16.1. Sample compositions were estimated by deconvolution from our single-cell data using the CIBERSORTx algorithm.
Survival analysis
Relationships between cell signatures and patient survival were computed in R using survival version 3.2 and survminer version 0.4.9. Patients were stratified using “surv_cutpoint(),” and Kaplan–Meier curves were represented with the “survfit()” and “ggsurvplot()” functions. Hazard ratios were determined by a univariate Cox proportional-hazards model with the “coxph()” function using continuous values from CIBERSORTx absolute scores. Scores with significant P values in univariate models were adjusted with the independent clinical covariates age and stage in multiple regression analyses. Overall response rate was defined as the proportion of patients with partial response (PR) and complete response (CR), whereas clinical benefit rate was defined as the proportion of patients with PR, CR, and stable disease over 6 months. In the BIONIKK data set, there is one case where two tumors samples were taken from the same patient, one of the two samples was arbitrarily removed for survival analysis.
Analysis of spatial transcriptomics data
Spatial transcriptomics was performed on formalin-fixed paraffin-embedded (FFPE) tissue sections using the 10X Genomics Visium Spatial Gene Expression platform (10X Genomics). Using a Leica RM2235 microtome (Leica), 5-μm tissue sections were cut from FFPE tissue blocks and placed within the etched frames of the capture areas on the active surface of the Visium Spatial Slide. Slide was dried in an oven overnight at 37°C and then at 60°C during 25 minutes before dehydration in ethanol and hematoxylin and eosin staining. The slide was finally transferred to water and mounted in 87% glycerol before image acquisition on the same day. Image mosaic of each section was acquired at 5× and 20× magnification (Zeiss EC Plan-Neofluar 5×/0.16, Zeiss Plan-Apochromat 20×/0.8) with transmitted white light and a Zeiss color camera Zeiss Axiocam Color on a Zeiss Axio Observer microscope to generate a brightfield image of each section including the fiducial markers frame. The final stitched mosaic was created with the Zeiss ZEN Stitching module. Subsequently, stained sections were strictly processed as outlined in the demonstrated protocol “Visium Spatial Gene-Expression Reagent kits for FFPE, UserGuide” (Part number CG000407, RevC). Quantification and quality control of the final libraries were performed using Bioanalyzer 2100 (Agilent Technologies). Libraries were sequenced on an Illumina HiSeq 4000 platform as paired-end 2×100 bp reads and trimmed at 28 + 50 bp reads. Image analysis and base calling were performed using RTA version 2.7.7 and Spaceranger mkfastq v1.1.0. Image and read processing were performed using Spaceranger v1.3.1 count.
Data were then analyzed in R using Seurat. UMI counts were normalized, and scaled and the most variable features were determined by the function “SCTransform()” and then dimension reduction was performed with “RunPCA().” For unsupervised clustering analysis, Seurat objects from both samples were merged before running “RunPCA(),” “FindNeighbors(),” “FindClusters(),” and “FindAllMarkers()” functions using default parameters and the 30 most significant principal components. For integration with scRNA-seq data, the previously analyzed Seurat object was provided as a reference with the spatial Seurat object as a query to the “FindTransferAnchors()” function and then predictions from “TransferData()” were stored as an assay in the spatial Seurat object. Both predictions and gene expression were visualized on the slides using the “SpatialFeaturePlot()” function. To better illustrate ccRCC.mes and myCAF proximity, we generated a multicolored representation highlighting spots with a prediction score above 0.1 for myCAF in green, above 0.1 (or 0.01 for T1) for ccRCC.mes in red, whereas spots with double-positive signal (above the thresholds for both cell types) are in yellow. For analysis of Gene-Expression Omnibus (GEO) public data (GSE175540) from Meylan and colleagues (15), we retrieved three samples corresponding to one of each available clinical stage: pT1 (GSM5924035), pT3 (GSM5924037), and pT4 (GSM5924031) and applied the same analysis pipeline.
Multiplex immunostaining
Tissues were prepared using standard fixation and paraffin embedding techniques. Tissues were sliced at 5 μm to have a slide for specific antibody staining and another for isotype antibody control. Multiplex immunostaining was performed using the OPAL 4-color kit (Akoya Biosciences) according to the manufacturer's instructions. Briefly, slides were baked in the oven for 1 hour, dewaxed for 30 minutes with Sub-X, a xylene substitute, and rehydrated through a graded series of ethanol baths (100%, 10 minutes; 95% 10 minutes; rinse in 70% and rinse in distilled water). Slides were fixed for 40 minutes in 10% neutral buffered formalin, rinsed in distilled water, and in appropriate AR buffer. Slides were placed in AR buffer bath, microwaved for 45 seconds at 900 W and for 15 minutes at 350 W. After the slides cooled down and were rinsed in water and TBST buffer (25 mmol/L Tris-HCl, pH 7.5; 150 mmol/L NaCl; 0.05% Tween20), tissue sections were blocked for 10 minutes in a humidified chamber using OPAL kit buffer. Sections were incubated with the primary antibody working solution (antibody diluted in the blocking buffer) for various times or temperatures. After washes, slides were incubated in Polymer HRP Ms+Rb for 10min, washed again, and incubated in Opal Fluorophore working solution for 10 minutes. Microwave treatment was again applied to strip the primary–secondary–HRP complex. The entire process was repeated starting at the blocking step to label with the next antibody. First, FN1 primary antibody (1/100, Proteintech) was introduced using pH6 AR buffer, a 2-hour incubation at room temperature, and the OPAL570 as a secondary antibody. Then, CA9 primary antibody (1/1000, Abcam) was added with pH9 AR buffer, an overnight incubation at 4°C and the OPAL690. The third primary antibody was TAGLN (1/1000, Proteintech) with pH6 AR buffer, a 2-hour incubation at room temperature and OPAL520. Finally, DAPI working solution was applied for 5 minutes. Slides were washed and mounted using ProLong Gold. We also performed negative controls using the same procedure, but omitting the primary antibodies. Immunofluorescence images were acquired on a wide field microscope Leica DM6B using a Hamamatsu ORCA-Flash 4.0LT sCmos camera. Representative images were taken at 20× magnification using the Leica LasX software for acquisition. Data were obtained in .lif format and opened with FIJI to adjust brightness/contrast and convert merged and individual channel images into .tiff format files.
Data availability
The single-cell and spatial RNA-seq data generated in this study are publicly available in GEO at GSE210042. The data that support the findings of the BIONIKK trial are not publicly available due to patient privacy requirements but are available upon reasonable request from the study sponsor ([email protected]). The public scRNA-seq data analyzed in this study were obtained from SRA (accession number: PRJNA705464) and dbGaP (accession number: phs002252.v1.p1).
Results
Single-cell RNA-seq profiling of ccRCC tumors
We performed scRNA-seq on 7 in situ ccRCC primary tumor samples (T1–T7) and NATs (N1, a mix of 3 NAT one associated with T3; N2 associated with T4; Fig. 1A and B; Supplementary Fig. S1A). Patients were selected to cover a diverse ccRCC spectrum with early to advanced stages and grades at time of surgery, 5 of which were treatment-naïve and 2 with metastatic disease underwent delayed nephrectomies after optimal response of distant metastatic lesions to either ICI (ipilimumab–nivolumab combination) or tyrosine kinase inhibitor (sunitinib; Fig. 1B; Supplementary Fig. S1A). Sequencing data were merged to obtain an integrated UMAP projection of 56,421 single cells broadly categorized by marker gene expression as lymphoid cells, myeloid cells, endothelium, mesangium, renal tubule, and tumor cells. (Fig. 1C; Supplementary Fig. S1B; Data Set S1). Immune cells represented 50% of the total, with a majority of lymphoid cells classified as natural killer-like T cells (NKT; CD3D, CD8A/B, and KLRD1), T cells (CD3D, CD3E, and CD4 or CD8A/B) and NK cells (KLRD1; Fig. 1C; Supplementary Fig. S1B). Other immune populations included myeloid cell populations including LYZ-expressing monocytes and TAM, neutrophils (FCGR3B), plasma or follicular B cells (CD79A and MS4A1), and mast cells (CPA3). MKI67-expressing cycling cells were mainly lymphocytes and a smaller number of tumor cells. Stromal cells included endothelial cells (EMCN), mesangial cells (RGS5), and CAF (POSTN) along with proximal and distal renal tubule cells (CDH16). Overall, 41,537 cells were derived from tumor samples and 14,884 cells from NAT, the unique source of tubule cells (Fig. 1D and E; Supplementary Fig. S1C and Data Set S2).
The tumor, stromal and immune cell subtypes were individually reclustered to obtain 45 subclusters (Supplementary Fig. S2, Data Set S2), 14 from lymphoid cells (Supplementary Fig. S2A–S2B), 6 from endothelial cells (Supplementary Fig. S2C–S2E), 4 from myeloid cells including classic (MONO.cl) S100A8high and atypical monocytes (MONO.at) along with TAM populations including the recently described CD1C-expressing macrophages (Supplementary Fig. S2F; ref. 17). Classic monocytes were characterized as CD14+ and CD16− (FCGR3A−), whereas atypical monocytes were CD14−/CD16+. TAMs expressed both M1 (IL1A, CXCL9, CD86) or M2 (APOE, C1QA, MRC1) polarization markers and formed an M1–M2 gradient (Supplementary Fig. S2F–S2H; refs. 13, 17). The M1 signature was represented in the MONO.at population in agreement with reports that they can differentiate into TAMs (13) and account for the residual expression of some genes from the MONO.at signature in the TAM subcluster (Supplementary Fig. S2G).
Plasticity of ccRCC tumor cells
We captured 1,763 cancer cells from 6 tumors with none captured from the sunitinib-treated T3 sample, consistent with its good response to TKI therapy (Fig. 1F and G; Supplementary Fig. S1C). Although all tumor cells were characterized by CA9 and VEGFA expression, reclustering identified 5 populations with differentially expressed genes and pathways: epithelial-like (ccRCC.epi; n = 186 cells, 10.5%), expressing PDZK1IP1, LGALS2, and GPX4, mesenchymal-like (ccRCC.mes; n = 116 cells, 6.5%) expressing mesenchymal markers COL1A1, COL1A2, COL6A3, fibronectin 1 (FN1) and SERPINE1, two intermediate states (ccRCCint1/ccRCCint2; n = 864/536 cells, 49%/30%, respectively) expressing stress markers (ATF3, DNAJB1) or metallothionines (MT1A, MT1M), a novel inflamed state (ccRCC.inf, n = 61 cells, 3.4%) characterized by IL6/JAK/STAT3, IL2/STAT5, interferon signaling pathway and expression of CD74 and MHC-II genes (HLA-DRA, HLA-DRB1, HLA-DQB1:Fig. 1F; Supplementary Fig. S3A, Data Set S2). Each tumor comprised varying numbers of cells of each phenotype and all phenotypes were present in multiple tumors (Fig. 1G). Acquisition of mesenchymal markers in ccRCC.mes was accompanied by loss of epithelial genes in an epithelial-to-mesenchymal transition (EMT) gradient (Supplementary Fig. S3B) confirmed by SWNE trajectory analysis that traced transformation from epithelial-like to mesenchymal-like states via intermediate states (Fig. 1H). GSVA of Hallmark pathways showed that ccRCC.epi was enriched for oxidative phosphorylation and fatty acid metabolism, whereas ccRCC.mes was enriched for EMT, hypoxia, TGFβ, and Notch signaling (Fig. 1I). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis recapitulated the GSVA with ccRCC.epi enriched in oxidative phosphorylation and metabolic pathways and ccRCC.mes in ECM-receptor interaction, focal adhesion, and the PI3K–AKT pathway (Supplementary Fig. S3C).
In contrast, trajectory analysis did not distinguish whether the inflamed state was an intermediate in the EMT gradient or a distinct state adopted by cells from one of the intermediate populations (Fig. 1H). Analogous “interferon response” tumor cells were described in a multicancer assessment of tumor cell states (18) and in an MAPKi-treated melanoma patient-derived xenograft (19). VISION projection of a 129-gene “interferon response” signature from melanoma (19, 20) showed that ccRCC.inf cluster had the highest score, confirming the existence of an inflamed/immune-like phenotype in these two different cancers (Supplementary Fig. S3D).
Characterization of the ccRCC.int subclusters was challenging due to low numbers of marker genes compared with other populations (95 for ccRCC.int1, 24 for ccRCC.int2, 351 for ccRCC.epi, 442 for ccRCC.mes, and 117 for ccRCC.inf; Data Set S2) and lack of specifically enriched pathways in GSVA and KEGG ontology analyses (Fig. 1I; Supplementary Fig. S3C). 137 genes were enriched in ccRCC.int1, that appeared more epithelial with higher expression of EPCAM, PDZK1IP1, and PAX8, compared with 52 genes in ccRCC.int2 that appeared more mesenchymal with higher expression of TGFΒI, SERPINE1, some collagens and metallothionine genes (Data Set S2). These observations were confirmed by GSEA (Supplementary Fig. S3E), showing enrichment of mesenchymal phenotype in ccRCC.int2 (epithelial-to-mesenchymal transition and hypoxia pathways) or stress pathways for ccRCC.int1 (e.g., UV response) represented by genes such as ATF3, JUNB, DNAJB, and DNAJA1. Despite the heterogeneous aspect of both ccRCC.int1 and ccRCC.int2 visible upon further subclustering (Supplementary Fig. S3F–S3H), we retained the ccRCC.int1 and ccRCC.int2 populations as they displayed signatures analogous to recently described stress response and metal-response populations in other solid tumors (18).
To validate the identification of these ccRCC tumor cell types, we reanalyzed data sets reported by Krishna and colleagues (Supplementary Fig. S4A–S4G; ref. 8) and Braun and colleagues (Supplementary Fig. S5A–S5G; ref. 13). We reclustered CA9-expressing cells into 6 subclusters (CK1–6) in the Krishna data set (Supplementary Fig. S4A–S4B) and 8 subclusters (CB1–8) in the Braun data set (Supplementary Fig. S5A–S5B). Despite the large number of samples, tumor cells were principally derived from few patients in both cohorts (Supplementary Figs. S4C–S4D and S5C–S5D). VISION projections of our ccRCC tumor signatures identified the ccRCC.epi, ccRCC.mes, and ccRCC.inf populations in both data sets (Supplementary Figs. S4E and S5E). Cluster identification was supported by GSVAs and heat maps showing expression of specific signature genes in each cluster (Supplementary Figs. S4F–S4G and S5F–S5G). Additional ccRCC states with cycling signatures (clusters CK4 and CB5), hypoxia signature (CB6), translation signature (CB3), or apoptosis (CB7) were identified (Supplementary Figs. S4B and S5B). A small number of ccRCC cells were present in the “Cycling” cluster of our data set. Cells with hypoxia signature in the Braun and colleagues data set were derived from a single metastasis sample accounting for its absence in our data set coming exclusively from primary tumors (13). The small apoptosis cluster was possibly filtered out in our quality control due to high expression of mitochondrial genes. Otherwise, these analyses confirmed the existence of the major ccRCC tumor cell types in public data sets.
Comparison of tumor cell states to proximal straight tubules
To compare tumor cell states with their putative cells of origin, we reclustered 6,585 renal tubule epithelial cells into 7 populations representing different nephron segments (Fig. 2A): 5 subclusters from the proximal and distal tubules, 1 subcluster of connecting tubules, and 1 subcluster of collecting duct cells (Fig. 2A and B; Supplementary Fig. S6A–S6B). We defined proximal straight tubules (PST) expressing GPX3, MIOX as well as segment 3 markers PDZK1IP1, MT1G, and MT1H. We could not divide the PST cells into 3 segments using the previously defined markers (21, 22), preventing a more precise annotation of the PST cluster (Supplementary Fig. S6C–S6D). Descending thin limb cells displayed the highest expression of SLPI and ascending thin limb cells highest expression of ITGB6. Two groups of thick ascending limb cells expressed SLC12A1 and ENOX1 and could be divided into cortical (cTAL) and medullar (mTAL) based on differential UMOD expression. We identified a group of connecting tubules (CNT) marked by the expression of SLC8A1 and AQP2 and a group of collecting ducts marked by expression of ATP6V0D2 and SLC4A1 (Fig. 2B; Supplementary Data Set S2).
PST cells most closely resembled ccRCC.epi based on their proximity on the UMAP plot (Fig. 2C) and their gene signature was most represented in ccRCC.epi, but progressively lost in the EMT gradient (Fig. 2D). Trajectory analysis traced PST to ccRCC.epi followed by the EMT gradient (Fig. 2E). A PST origin was also supported by the prominent gluconeogenesis signature (FBP1, FBP2, PCK1, PCK2, PC, GPI, SGLT1, and SGLT2), the preferential mode of glucose production in proximal renal tubules (23), that persisted in ccRCC.epi and some intermediate cells. Similarly, the OXPHOS signature, prominent in PST cells also persisted in ccRCC.epi cells before diminishing during EMT. In contrast, HIF1a and glycolysis signatures were poorly represented in the PST cells, but were strong in all tumor cells including the ccRCC.epi population. Oncogenic transformation of PST cells therefore involved the acquisition of a strong hypoxia signature and a metabolic switch from OXPHOS to glycolysis, with both signatures found in the ccRCC.epi population, but with glycolysis predominant in the ccRCC.mes and ccRCC.inf populations (Supplementary Fig. S6E).
To define the transcriptional regulatory networks underlying the transformation of PST cells into ccRCC and those characterizing the different ccRCC cell states, we performed SCENIC analysis (Fig. 2F). PST cells displayed high activity of several TFs associated with metabolism, cell identity and tumor suppression, either specific to PST (TP53 and PPARG), or gradually lost (TFEC and NR1H4). The transformation was accompanied by activation of multiple regulons linked with stress response, inflammation, and angiogenesis shared across all ccRCC clusters (MYC and ATF4), whereas others were either shared across ccRCC.int clusters (ARNT and RXRA), enriched in ccRCC.int1 (CHD2 and STAT3) or enriched in ccRCC.int2 (THAP11 and ZBTB33). The mesenchymal state was marked by regulons linked with inflammation, Wnt signaling, and EMT either gradually activated (STAT1 or FOXO3) or specific to ccRCC.mes (SNAI1 and ZEB1). Further, IRF1 and NFKB2 regulons were enriched in ccRCC.inf consistent with their inflamed/immune-like signature.
These analyses defined how the transcription program of PST cells governed by epithelial identity regulators such as PAX8 and PPARG was replaced by a pan-ccRCC program involving MYC and ATF4 as well progressively acquired or specific programs involving AP1, the SOX and POU family of transcription factors (BRN1 and OCT4) as well as EMT regulators (ZEB and SNAI) defining different cell states (Supplementary Fig. S6F).
Heterogeneity of the mesangial and CAF populations
The mesangial compartment could be sorted into 5 subclusters (Fig. 3A and B). A subcluster designated SMC represented smooth muscle rather than bone fide fibroblasts with high expression of contractile muscle marker genes such as ACTA2, MYH11, or MYL9 and CNN1, a smooth muscle marker. Classic RGS5- and PDGFRB-expressing pericytes designated as mesangial cells (MSG) in the kidney were observed along with RGS5+ cells expressing stress markers, inflammation-associated cytokines such as IL6, CXCL1, or TNF and showing expression of MHC class II genes (MSG.inf). We further identified two clusters of RGS5low/POSTNhigh and PDGFRBlow/PDGFRAhigh CAFs that segregated into previously described phenotypes of inflammatory/antigen-presenting CAFs (iCAFs/apCAFs) and myofibroblastic CAFs (myCAF) with myofibroblast markers and TGFβ activated genes such as FAP, FN1, MMP11, and multiple collagens (Fig. 3B). ApCAFs were distinguished from MSG.inf by loss of RGS5, gain of POSTN, and higher expression of MHC class II genes (Fig. 3B). All tumors comprised varying proportions of MSG and myCAF populations (Fig. 3C), whereas apCAFs were mainly derived from NAT as observed in other tumors such as bladder cancer (24). Trajectory analyses suggested MSG.inf as precursors to apCAFs, whereas myCAFs developed from RGS5+ pericytes via a distinct trajectory (Fig. 3D). SCENIC regulon analysis also differentiated pericytes from CAFs with the activity of myofibroblast-related transcription factors such as MLXIP, TEAD1, or TEAD3 in myCAFs, whereas apCAFs showed activity for GATA3, NFKB1 and as reported (25) the NFE transcription factors (Fig. 3E). Gain of activity of these regulons was accompanied by the loss of those active in pericytes, although MSG.inf shared reduced activity of 3 regulons strongly active in apCAFs, again consistent with their being a precursor population.
Association between cancer cell and stromal cell states with patient outcome
To investigate the presence of the above-described cell populations in the TCGA-KIRC patient cohort, we used our scRNA-seq data set as a reference for the CIBERSORTx algorithm to perform deconvolution of bulk RNA-seq data. To facilitate interpretation, ccRCC.int1 and ccRCC.int2 were merged as an intermediate signature designated ccRCC.int.
Unsupervised clustering of TCGA ccRCC cohort (n = 495) using deconvolution scores of tumor cell populations, CAF and TAM states, revealed enrichment of ccRCC.mes and myCAF or ccRCC.epi in subsets of tumors in an anticorrelated manner (Fig. 4A and B; Supplementary Fig. S7A–S7B; Data Set S3). Spearman correlations confirmed strong anticorrelation of ccRCC.mes and ccRCC.epi (Spearman coefficient r = −0.527; P = 8.5e−37), whereas ccRCC.mes and myCAF strongly correlated (r = 0.498; P = 2.6e−32), and ccRCC.inf showed significant correlation with TAMs (r = 0.367; P = 2.9e−17; Fig. 4B; Supplementary Fig. S7C) in agreement with recent observations (18). MyCAF, ccRCC.mes, and ccRCC.inf were strongly enriched in grade (G4) ccRCC (p(myCAF) = 3.3e−9; p(ccRCC.inf) = 1.7e−5; p(ccRCC.mes) = 4.7e−7) along with exhausted CD8.ex and TAM populations as previously described (Fig. 4C; ref. 7). In contrast, ccRCC.epi was strongly enriched in G1. Similarly, myCAF, ccRCC.mes, and ccRCC.inf were strongly enriched in TCGA m3 molecular subtype, the transcriptional signature in the TCGA classification most strongly associated with poor outcome (Supplementary Fig. S7D; ref. 26).
We analyzed the relationship between the tumor, CAF and TAM populations with patient survival by stratification using the optimal cut-point value method and computation of a univariate Cox regression model using continuous values. Using both methods, only ccRCC.mes [hazard ratio (HR) = 6.07, P < 0.001], ccRCC.inf (HR = 9.5, P < 0.001), and myCAF (HR = 6.6, P < 0.001) were significantly associated with poor survival, whereas ccRCC.epi was significantly associated with better survival (HR = 0.38, P = 0.002; Fig. 4D; Data Set S3). Although not significant using continuous values in the univariate Cox regression model, apCAF showed a trend toward better survival and ccRCC.int and TAM signatures a trend with poor survival using the optimal cut-point value. After Cox multiple regression analysis to correct clinical covariates including age and stage, the ccRCC.mes, myCAFs, and ccRCC.inf remained strong negative predictors of survival, with ccRCC.epi a strong positive predictor of survival (Data Set S3).
Spatial association of ccRCC.mes and myCAFs
Given the strong ccRCC.mes–myCAFs correlation, we asked if they were spatially associated within ccRCC tumors and thus potentially signaling to one another. We performed 10X visium spatial transcriptomics (Supplementary Figs. S8A–S8F and S9A–S9B) on sections from tumors T1 and T7 and applied the transcriptional signatures derived from the scRNA-seq to localize CD4+ T cell, CD8+ T cell, NK cell, B cell, monocytes, neutrophils, mast, CAF, and tumor cell populations.
Histology and transcriptional analyses of the T1 section revealed several distinct areas. Area A, high-grade ccRCC, displayed many ccRCC.epi and ccRCC.int1/2 cells, which, although intermixed, were enriched at different locations. CcRCC.int1 cells were enriched at the interface with fibrotic area B strongly enriched in myCAFs and TAMs (Fig. 5). Strikingly, ccRCC.mes cells localized along the A–B interface adjacent to myCAFs. In contrast, ccRCC.inf cells localized adjacent to ccRCC.int2 in area A. Area E was strongly enriched in B and T cells and separated from the rest of the tumor and NAT by fibrotic area C, strongly enriched in myCAFs, with a small number of apCAFs at their interface (Fig. 5; Supplementary Fig. S9A). Area F corresponded to low-grade ccRCC, but was more strongly marked by mesangial signature than by tumor signature perhaps due to the abundant capillary vessels around the tumor cells in this region. Region D, separated from the tumor by the fibrotic myCAF-containing area C, corresponded to NAT with the visible presence of tubules and PST signature.
In the T7 section, ccRCC.inf cells located in a specific region of area A with dispersed ccRCC.epi and ccRCC.int1/2 cells in areas A, C, and E separated by a fibrotic area B running diagonally through the section and strongly enriched in myCAFs with a small number of adjacent apCAFs (Supplementary Figs. S8B–S8D and S9B). In contrast, ccRCC.mes cells were specifically localized adjacent to myCAFs in area C and small groups of ccRCC.mes closely associated with clusters of myCAFs in area E. Area D, separated from the tumor by the myCAF-containing fibrotic area B, corresponded to NAT, with PST and POD signatures. Although in T1 lymphoid cells were separated from the tumor, areas A, C, and E of T7 all showed dispersed lymphoid infiltration (Supplementary Fig. S9A–S9B).
To consolidate the identification of ccRCC.mes and myCAFs and illustrate their colocalization, we colored spots with a prediction score above 0.1 for myCAF in green, above 0.1 (or 0.01 for T1) for ccRCC.mes in red and spots with double-positive signal in yellow. Most red spots were close to green spots and a smaller number of yellow spots showed the presence of both cell types in close proximity (Fig. 5D; Supplementary Fig. S8E). Thus, in contrast to ccRCC.epi/int1/int2 cells localized widely throughout the tumors, ccRCC.mes cells localized adjacent to myCAFs that formed fibrotic areas at the interface with NAT.
These findings were consolidated by analyses of the public data set of Myelan and colleagues (27). Based on the provided clinical annotations, we selected one case for each available stage (C20-pT1, C34-pT3, and C3-pT4; Supplementary Fig. S10A–S10C, left). The ccRCC.epi signature was more widely represented in C20, whereas ccRCC.mes was more abundant in the more advanced-stage tumors C34 and C3. However, in all tumors, a spatial proximity of ccRCC.mes and myCAFs was observed (illustrated by colored spots in the dual prediction panel in Supplementary Fig. S10A–S10C). Moreover, as seen in T1 and T7, myCAFs formed an interface between the tumor and NAT labeled with the PST signature. In C34, both myCAFs and ccRCC.mes were seen invading the NAT (Supplementary Fig. S10C) similar to myCAFs invading NAT in T1 and T7.
We further performed multiplex immunostaining on mirrored sections of T1 and T7 used for spatial transcriptomics using FN1 for ccRCC.mes, TAGLN for myCAFs, CA9 for tumor cells, and DAP1 staining for nuclei. On T1, CA9 staining delineated the tumor-containing regions A and F previously defined by spatial transcriptomics, TAGLN labeled the fibrotic myCAF regions B and C, whereas ccRCC.mes cells labeled by both CA9 and FN1 were located at the border of the myCAF/TAGLN-labeled region B (Fig. 6A). For T7, CA9 stained tumor regions A and E as defined by spatial transcriptomics, whereas TAGLN labeled the fibrotic myCAF region B invading the NAT in regions D1 and D2 (Fig. 6B). As observed by spatial transcriptomics, CA9 and FN1 labeled ccRCC.mes cells were abundant in region C also infiltrated by dispersed TAGLN expressing myCAFs and bordering myCAF-rich region B. Similarly, dispersed groups of FN1 labeled ccRCC.mes cells localized with TAGLN-labeled myCAFs in region E. Immunostaining, therefore, confirmed the results of spatial transcriptomics showing the proximity of myCAFs and ccRCC.mes within these tumors as well as the propensity of myCAFs to invade the adjacent NAT.
Signaling between ccRCC.mes and myCAFs reveals potential therapeutic targets
We used CellPhoneDB and NicheNet software to predict ligand–receptor interactions between myCAF and ccRCC.mes that may contribute to their communication and constitute therapeutic targets. CellPhoneDB revealed a collection of ligand–receptor interactions using either ccRCC.mes ligands with myCAF receptors (Supplementary Fig. S11A) or vice versa (Supplementary Fig. S11B). In both configurations, multiple high-confidence interactions were predicted involving secreted (e.g., MDK–LRP1 or VEGFA–LRP1) or juxtacrine (JAG1–NOTCH3) ligand–receptor pairs. Other bidirectional interactions were predicted involving NRP1 known to promote invasion in cancer cells (28) but also interactions involving MDK, EGFR, and AXL.
We used NicheNet to identify ccRCC.mes ligands involved in promoting the conversion of pericytes into myCAFs (Supplementary Fig. S11C). CcRCC.mes-secreted TGFβ was predicted as a principal pathway, but IL6–IL6R, GAS6–AXL, and PDK1–PDK2 pathways were further observed. NicheNet also predicted how myCAF-secreted TGFβ may promote or maintain EMT of ccRCC.mes together with GAS6–AXL and CXCL12–CXCR4 (Supplementary Fig. S11D). A complex network of positive feedback signaling between ccRCC.mes and myCAFs may therefore promote and sustain the phenotypes of these closely associated cell populations.
MyCAFs associate with resistance to first-line nivolumab ± ipilimumab treatment
To ask if the ccRCC.mes–myCAF correlation is seen in an independent data set, we used CIBERSORTx to deconvolute 97 RNA-seq samples from patients before first-line treatment with either sunitinib, nivolumab or the nivolumab/ipilimumab combination from the BIONIKK clinical trial (29, 30). Tumors were again heterogeneous with all ccRCC tumor signatures represented and a strong ccRCC.mes–myCAF association (r = 0.671; P = 5.3e−14; Fig. 7A and B; Supplementary Fig. S12A–S12C; Data Set S4). BIONIKK was based on tumor classification in 4 groups (ccRCC1–4): ccRCC2 and ccRCC3 designated “pro-angiogenic” compared with less differentiated and immune-rich ccRCC1 and ccRCC4 (31). Computing the average score for each group, with the exception of ccRCC3 not well represented in the sequenced samples, indicated that the “angiogenic” ccRCC2 group was enriched in ccRCC.epi and endothelial ED.cor, whereas ccRCC.mes, myCAFs, and TAMs were strongly enriched in the ccRCC4 group and ccRCC.inf in ccRCC1 (Data Set S4).
We then associated cell populations with progression-free survival (PFS) and overall survival (OS) of patients treated with either nivolumab or ipilimumab + nivolumab combination (n = 77). Patients with high myCAF signature (n = 14) displayed a significantly shorter PFS of 3.7 months (95% CI, 2.1 to NE), compared with 13.1 months (95% CI, 10.4 to NE) for the remaining patients (HR for myCAF-high group, 2.62; 95% CI, 1.32–5.12; P = 0.005; Fig. 7C). Furthermore, those patients displayed a shorter median OS of 15.9 months (95% CI, 13.6–NE) compared with myCAF-low where the median was not reached (HR for myCAF-high group, 15.26; 95% CI, 3.91–59.67; P < 0.001; Fig. 7D). Notably, the myCAF-high group showed lower overall response rate (ORR) and clinical benefit rate (CBR) both being 35.71% (5 partial responses, 9 progressive diseases) compared with myCAF-low displaying an ORR of 73.02% and a CBR of 52.38% (8 complete responses, 25 partial responses, 16 stable diseases, 14 progressive diseases).
Despite the association of the ccRCC.mes and myCAF populations, no significant association of high ccRCC.mes score with OS was seen (Supplementary Fig. S12D). Nevertheless, 11 of the 14 myCAF-high tumors associated with high ccRCC.mes score and represented the 5 mortality events in this group, with no events seen among the 17 myCAF-low patients (Supplementary Fig. S12E). Thus, although there was association of the ccRCC.mes and myCAF populations, only a high myCAF score showed strong association with ICI resistance.
Enrichment for ccRCC.mes and myCAFs at metastatic sites
BIONIKK samples comprise RNA-seq data from metastatic tumor sites, including liver (n = 5), lymph node (n = 5), lung (n = 4), bone (n = 3), brain (n = 2), skin (n = 1), digestive system (n = 1), and unknown locations (n = 5). Comparison of primary (n = 70) and metastatic tumors (n = 26) revealed selective enrichment of only the ccRCC.mes and myCAF populations at the metastatic sites (Fig. 7E). This observation highlighted the importance of these populations and of the ccRCC.mes invasive profile in local infiltration and metastatic dissemination (Fig. 7F).
Discussion
Tumor cell state association with ccRCC disease outcome
We established a comprehensive characterization of ccRCC and the associated TME defining up to 46 cell populations including 5 tumor cell populations. The ccRCC.epi population, enriched in epithelial markers, showed the highest signature similarity to the PST population, the proposed cell of ccRCC origin (10, 32, 33). We defined tumor cell plasticity by characterizing hybrid EMT and fully mesenchymal cells with high activity of EMT-related transcription factors and a population expressing MHC class II genes with transcription factors associated with interferon and inflammatory responses. This latter subtype was observed in other cancers such as melanoma (20, 33, 34, 35). Deconvolution of the TCGA-KIRC and BIONIKK data sets defined the contribution of the ccRCC cell states to tumor heterogeneity. CcRCC.mes and ccRCC.inf cells were highly enriched in advanced-grade tumors and associated with poor prognosis, whereas ccRCC.epi was enriched in low-grade tumors with better outcomes. Moreover, in line with previous studies linking EMT to ccRCC metastasis (36), ccRCC.mes was the only ccRCC population enriched at metastatic sites. Our results therefore highlight how tumor cell states associate with disease progression, with the mesenchymal and inflamed states predicting poor patient prognosis.
Our analyses further defined the transcriptional basis for ccRCC plasticity identifying transcription factors and associated programs involved in oncogenic transformation and EMT. We identified potential tumor suppressors active in PST but lost in ccRCC such as TP53 (37), FOXP4, previously reported as expressed in kidney but diminished in tumors (38), and PPARG shown to maintain epithelial phenotype in renal fibrogenesis (39). Some factors linked to hypoxia response including MYC and CEBPD (40) were active in all ccRCC cells, likely resulting from VHL inactivation. In contrast, others were linked to a specific cell state, for example, ZEB and SNAI families active in ccRCC.mes, or SOX4, SOX13, BRN1 (POU3F3), and OCT4 (POU5F1) active in dedifferentiating ccRCC cells. These results show how the loss of factors maintaining terminal differentiation results in MYC activation and loss of epithelial identity in line with a previously proposed model (41).
Although several previous ccRCC scRNA-seq studies focused mainly on the immune and myeloid compartments (7, 8, 14), Bi and colleagues (7) described 2 tumor cell populations, designated TP1 and TP2, whose transcriptional signatures partially overlapped with the ccRCC.epi/int signatures described here. More recently, while this manuscript was under review, Li and colleagues (42) identified several tumor cell meta-programs. The stress response–related (MP1) resembles ccRCC.int1 characterized by expression of JUNB, FOSB, DNAJB1, and ATF3. MP3 displaying a mesenchymal program is analogous to ccRCC.mes and was found enriched in the TCGA m3 group and characterized by CEBPB, which we also identified as a key regulon of undifferentiated states. Proximal tubule MP2 is similar to ccRCC.epi with MP2 and MP3 anticorrelated analogous to ccRCC.epi and ccRCC.mes. MP5 shows the characteristics of ccRCC.inf with expression of the MHC-II genes, confirming the existence of this cell type in ccRCC. These two complementary studies thus converge on ccRCC cell plasticity and how it contributes to ccRCC heterogeneity and disease outcome.
MyCAF enrichment at the tumor/NAT interface and association with resistance to ICI therapy
We defined 2 RGS5-expressing pericyte populations, one displaying a weak apCAF-type signature. Trajectory analyses suggested that pericytes gave rise to the MSG.inf population, a precursor to POSTN-expressing apCAFs, via IL1α response pathway or to the myCAF population via a TGFβ-driven pathway as described in pancreatic cancer (43). Computational analyses predicted ligand–receptor interactions whereby ccRCC.mes may induce pericyte conversion to myCAF, via the TGFβ, IL6–IL6R, PKD1–PKD2, and the AXL–GAS6 pathways. MyCAFs may therefore be derived mainly from the resident pericytes lining the glomeruli and peritubular capillaries of the normal kidney upon interactions with dedifferentiated tumor cells.
We identified myCAFs as a critical component of the ccRCC TME that strongly associated with ccRCC.mes in primary and metastatic tumors and robustly associated with poor ccRCC prognosis. MyCAFs seem also to play a critical role in ccRCC tumor organization. Although ccRCC.epi/int cells were widespread within tumors, spatial transcriptomics and multiplex immune staining revealed a proximity of ccRCC.mes and myCAFs that may account for their strong association and facilitate their signaling to one another. MyCAFs form fibrotic structures seen at the tumor–NAT boundary and sometimes invaded the NAT. CcRCC.mes were also preferentially localized in these regions in agreement with Li and colleagues (41), who localized these cells at the leading edge of the tumor. Li and colleagues did not, however, analyze the presence of myCAFs at the tumor–NAT interface, but rather noted enrichment in a subtype of IL1β-high macrophages. Thus, both studies underscore important characteristics of ccRCC tumor spatial organization highlighting the potential roles of mesenchymal tumor cells, myCAFs, and IL1β-high macrophages in the invasive process.
Computational analyses identified signaling pathways that could mutually reinforce the myCAF and ccRCC.mes transcriptional programs with a prominent role of the TGFβ pathway with ligands and receptors expressed in both cell types. However, other pathways were identified, such as JAG1–NOTCH interaction. NOTCH signaling has been associated with EMT, stemness, and cell plasticity (44). In breast cancer, different CAFs promote EMT via TGFβ and CXCL12 or invasion by NOTCH signaling (45). MyCAFs may signal to ccRCC.mes through TGFβ and CXCL12 as well as by NOTCH, suggesting they promote both EMT and invasion in agreement with their association with metastases and poor prognosis.
The AXL-GAS6 pathway is potentially a key pathway in myCAF–ccRCC.mes cross-talk and a novel potential therapeutic target. High tumor cell AXL expression is associated with resistance to anti–PD-L1 ICI in ccRCC (46). These observations underscore the potential role of this pathway in ccRCC.mes–myCAF communication.
A key conclusion of our analyses was the robust association of myCAFs, but not ccRCC.mes or other tumor cell states, with resistance to nivolumab or ipilimumab + nivolumab ICI both in terms of OS and PFS in the BIONNIK clinical trial. Although the sunitinib-treatment arm of the BIONIKK trial was too small to provide a suitable control, the selective association with poor OS and PFS indicated that a high myCAF score was likely predictive of patient response to ICI therapy. Moreover, our data suggest that patients with myCAF-high tumors that are more likely to show ICI resistance may benefit more from a combinatorial TKI-ICI treatment to target the myCAF–ccRCC.mes cross-talk, ideally one targeting both AXL and PDGFR such as sunitinib or axitinib, lenvatinib, cabozantinib, or sorafenib. Additional targetable ligand–receptor couples may be relevant, such as the cell surface receptor NRP1 targeted by monoclonal antibody ASP1948 under evaluation in combination with a PD-1 inhibitor in clinical trial NCT03565445 and by the receptor antagonist EG00229 that showed efficacy in a ccRCC xenograft model when combined with everolimus (47). Other potential targets are MDK that has been successfully inhibited by an MDK inhibitor (48) or EGFR targeted by TKIs such as afatinib or by monoclonal antibodies cetuximab and panitumumab.
MyCAFs have been associated with immunosuppressive microenvironment and ICI resistance in other tumors. Analyses of myCAF populations revealed a positive association with PD-1 and CTLA4-expression and primary resistance to ICI therapy in several cancers perhaps forming a barrier excluding immune infiltration (49). Alternatively, myCAFs express high levels of TGFβ known to attenuate anti–PD-L1 response (50). Our data revealed a novel association of myCAFs with ICI resistance in ccRCC further expanding the role of these cells as important modulators of ICI therapy.
Authors' Disclosures
Y.A. Vano reports grants, personal fees, and nonfinancial support from BMS, Ipsen, Pfizer, MSD, Eisai, and Roche outside the submitted work. P. Barthélémy reports other support from BMS, Pfizer, Merck, MSD, GILEAD, IPSEN, AstraZeneca, Astellas, Janssen Cilag, Bayer, and Novartis outside the submitted work. G.G. Malouf reports grants from ARC during the conduct of the study and other support from MSD AVENIR, MSD, Ipsen, and BMS outside the submitted work. No disclosures were reported by the other authors.
Authors' Contributions
G. Davidson: Conceptualization, data curation, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft. A. Helleux: Conceptualization, data curation, formal analysis, validation, investigation, visualization, writing–original draft. Y.A. Vano: Conceptualization, data curation, funding acquisition, validation, investigation, visualization, writing–review and editing. V. Lindner: Resources, data curation, validation, investigation, visualization, writing–review and editing. A. Fattori: Resources, validation, investigation, methodology, writing–review and editing. M. Cerciat: Resources, investigation, visualization, writing–review and editing. R.T. Elaidi: Resources, data curation, investigation, visualization, project administration, writing–review and editing. V. Verkarre: Resources, visualization, writing–review and editing. C. Sun: Resources, validation, investigation, writing–review and editing. C. Chevreau: Resources, data curation, investigation, writing–review and editing. M. Bennamoun: Resources, data curation, writing–review and editing. H. Lang: Resources, data curation, funding acquisition, investigation, writing–review and editing. T. Tricard: Resources, data curation, investigation, writing–review and editing. W.H. Fridman: Resources, data curation, validation, investigation, writing–review and editing. C. Sautes-Fridman: Data curation, validation, investigation, writing–review and editing. X. Su: Resources, software, investigation, writing–review and editing. D. Plassard: Resources, software, writing–review and editing. C. Keime: Resources, software, investigation, writing–review and editing. C. Thibault-Carpentier: Resources, validation, investigation, writing–review and editing. P. Barthélémy: Resources, data curation, writing–review and editing. S.M. Oudard: Resources, data curation, funding acquisition, writing–review and editing. I. Davidson: Conceptualization, resources, data curation, supervision, validation, investigation, visualization, writing–original draft. G.G. Malouf: Conceptualization, resources, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing–original draft, project administration.
Acknowledgments
The authors thank the staff of the IGBMC common facilities, especially Dr. C. Carpentier and Ms. M. Cerciat for their precious help with single-cell techniques, and Dr. E. Guiot for her great guidance within the microscope facility. This work was supported by grants from the ARC Foundation (SIGNIT) and ARC-2020-PJA32020070002319, the Ligue Nationale contre le Cancer, the Institut National du Cancer, the ANR-10-LABX-0030, and ANR-10-IDEX-0002-02. I. Davidson is an “équipe labellisée” of the Ligue Nationale contre le Cancer. G. Davidson was supported by the Région Grand Est. A. Helleux was supported by the Ministère de la Recherche. Sequencing was performed by the GenomEast platform, a member of the “France Génomique” consortium (ANR-10-INBS-0009). The authors would also like to thank Drs V. Debien, J. Gantzer, and P. Baltzinger for their insightful comments.
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).