With only a fraction of patients responding to cancer immunotherapy, a better understanding of the entire tumor microenvironment is needed. Using single-cell transcriptomics, we chart the fibroblastic landscape during pancreatic ductal adenocarcinoma (PDAC) progression in animal models. We identify a population of carcinoma-associated fibroblasts (CAF) that are programmed by TGFβ and express the leucine-rich repeat containing 15 (LRRC15) protein. These LRRC15+ CAFs surround tumor islets and are absent from normal pancreatic tissue. The presence of LRRC15+ CAFs in human patients was confirmed in >80,000 single cells from 22 patients with PDAC as well as by using IHC on samples from 70 patients. Furthermore, immunotherapy clinical trials comprising more than 600 patients across six cancer types revealed elevated levels of the LRRC15+ CAF signature correlated with poor response to anti–PD-L1 therapy. This work has important implications for targeting nonimmune elements of the tumor microenvironment to boost responses of patients with cancer to immune checkpoint blockade therapy.
This study describes the single-cell landscape of CAFs in pancreatic cancer during in vivo tumor evolution. A TGFβ-driven, LRRC15+ CAF lineage is associated with poor outcome in immunotherapy trial data comprising multiple solid-tumor entities and represents a target for combinatorial therapy.
This article is highlighted in the In This Issue feature, p. 161
Pancreatic ductal adenocarcinoma (PDAC) remains a devastating disease, with a 5-year survival rate of 7% (1). One of the hallmarks of this aggressive cancer is a dramatic desmoplasia driven by carcinoma-associated fibroblasts (CAF). CAFs not only deposit the extracellular matrix (ECM) that characterizes desmoplasia, but also produce factors that promote tumor growth. Subsequently, CAFs have been targeted in efforts to improve PDAC outcomes, with conflicting results (2–6). The discrepancy in outcomes might be explained by CAF heterogeneity, with different fibroblast populations having separate, perhaps even opposing, functions (7, 8). Smooth muscle actin (SMA) and fibroblast-activating protein (FAP) have been described as showing heterogenous expression on CAF populations (8, 9), and SMA-high CAFs have further been identified as a tumor-adjacent TGFβ-driven population with different inflammatory properties from SMA-low CAFs.
Intriguingly, despite the conflicting results of targeting CAFs as a single therapy, modulating CAFs in combination with immunotherapies improved outcomes in several preclinical models (2, 4, 10). As these studies model cancers that show resistance to immunotherapies alone, they suggest that elucidating CAF functions may provide the understanding needed to design more efficacious immunotherapeutic approaches and address the unmet clinical need in devastating cancers like PDAC. The full scope of CAF functions in the context of cancer immunotherapy remains to be determined, but will necessarily be influenced by the fibroblast state at the tissue–tumor interface.
We sought to provide an unbiased assessment of fibroblast heterogeneity in normal as well as PDAC tissues by using a combination of bulk and single-cell RNA sequencing (RNA-seq) of stromal cells. Normal tissues, nonmalignant adjacent tissue, and early and advanced tumors from genetically engineered mouse models (GEMM) were utilized in this study. We hypothesized that the changing microenvironment during tumor progression affects the phenotype of tissue-resident fibroblasts, resulting in their development into multiple CAF subsets. Our analyses revealed that preexisting fibroblast heterogeneity in normal tissue dictated the developmental trajectories of murine CAFs. These data enabled identification of the transcriptional profiles of individual CAF populations, and revealed a TGFβ-programmed CAF, identifiable by expression of leucine-rich repeat containing 15 (LRRC15), that became the dominant fibroblast in advanced tumors. Combining publicly available human sequencing data with newly acquired IHC of tissues from 70 patients with PDAC, we confirmed the identification of these LRRC15+ CAFs in human patients. The LRRC15+ CAF signature was used to evaluate their impact on anti–PD-L1 immunotherapy response in large patient cohorts and revealed that high expression of the LRRC15+ CAF signature was associated with poor response to anti–PD-L1 therapy in immune-excluded tumors.
PDPN+ Cells Are the Dominant Fibroblast Population in Normal and PDAC Murine Pancreas
To characterize the stromal compartment in PDAC, we began by optimizing digestion conditions for stromal cell phenotyping from murine pancreas, starting with protocols to isolate the known dominant stromal cell in the pancreas, the stellate cell. Standard stellate cell pronase-based digestion (11) was observed to cleave many surface markers, whereas our novel digestion method preserved podoplanin (PDPN) and platelet endothelial cell adhesion molecule 1 (PECAM1/CD31) expression (Fig. 1A). To model PDAC, we used the Pdx1cre/+;LSL-KrasG12D/+;p16/p19flox/flox (KPP) mice that form aggressive tumors within 12 weeks (12). Although tumors from these mice often show several different carcinoma types, including sarcomatoid, acinar, and mucinous subtypes, we observed that up to 88% of a given cohort developed substantial regions of PDAC as has been reported previously (refs. 13, 14; Supplementary Fig. S1A). Flow cytometry of dissociated pancreases from the KPP mice and normal mice from the same albino B6 background [B6(Cg)-Tyrc-2J/J] revealed three major populations of stromal cells with similar composition between the two states (Fig. 1B; Supplementary Fig. S1B). CD31+ stromal cells were predominantly PDPN− blood endothelial cells with very few lymphatic endothelial cells (15). The remaining CD31− cells were largely PDPN+ with fibroblast and stellate cell characteristics (Fig. 1B; Supplementary Fig. S1C-S1F). Immunofluorescence microscopy confirmed the presence of PDPN+ cells around structures in the normal pancreas including acinar clusters, ducts, and islets as well as a single-cell layer of mesothelial cells encapsulating the pancreas (Fig. 1C, left; Supplementary Fig. S1G). The KPP mice exhibited increased PDPN expression, most dramatically bordering tumor islets, but also in some areas distal to tumors (Fig. 1C, middle and right; Supplementary Fig. S1H). To better characterize changes in the nonendothelial stroma with tumor progression, we harvested tissues from normal mice, mice with early tumors (<5 mm), and mice with advanced disease (tumors >5 mm). Pancreatic stromal cells were sorted on CD31−PDPN+PDGFRA+ and CD31−PDPN− (DN) cells for RNA-seq of these two populations. This strategy was used to exclude mesothelial cells, which are also PDPN+ but negative for PDGFRA (ref. 16; Supplementary Fig. S1I). The transcriptional profiles confirmed that PDPN+ stromal cells are enriched for fibroblast signature genes and the DN population for pericyte signature genes (ref. 17; Fig. 1D). Several CAF-associated genes were enriched in the PDPN+ PDAC population, although in mice Fap, Sma (Acta2), Fsp1, and Pdgfrb, often described as CAF marker genes, were detected to some degree in both stromal populations in normal and PDAC pancreas. Particularly, we found that Acta2 is highest in normal pericytes and Fap is equally high in normal fibroblasts (Fig. 1E). Although the pericyte-enriched population also showed changes between normal and tumor tissues, we focused on the PDPN+ populations as they represent the major CAF constituent.
Single-Cell RNA-seq Identifies Several Populations of PDPN+ Cells
Although PDPN+ stromal cells constituted the majority of CAFs, they expressed individual CAF markers at variable levels between replicates [e.g., Il6 levels ranged from 60 to 240 reads per kilobase of transcript per million mapped reads (RPKM)], and furthermore they appeared to simultaneously express markers reported to separate CAF subsets, that is, Acta2 and Il6 (Fig. 1E). This implied a significant heterogeneity within the PDPN+ stroma. To resolve this heterogeneity, we performed single-cell RNA-seq (scRNA-seq) of viable PDPN+ stromal cells from the pancreas of KPP and normal mice. To better capture changes that occur with tumor progression, we divided the KPP samples into tumor-adjacent tissue as well as small (1–4 mm) and large (5–10 mm) tumor samples for scRNA-seq (Fig. 2A). Five animals were pooled per condition in each of two biological replicates, and scRNA-seq was performed (Fig. 2A). After quality control and batch correction (Supplementary Fig. S2A; described in Methods), we obtained 13,454 high-quality cells for downstream analysis (replicate 1: n = 3,315; replicate 2: n = 10,139). Graph-based clustering of cells after dimensionality reduction with t-Distributed Stochastic Neighbor Embedding (t-SNE; Fig. 2B) or Uniform Manifold Approximation and Projection (UMAP; Supplementary Fig. S2B) identified 12 robust groups of cells (Supplementary Table S1).
Endothelial, myeloid, and acinar cell clusters represent contaminating cells as anti-CD31, anti-CD45, and anti-EPCAM, respectively, were used to gate out those populations in our flow protocol prior to sequencing (3% of all cells, Fig. 2A–C; Supplementary Fig. S2C). Within the remaining 97% of cells, 83.5% of cells were identified as fibroblasts, 11.5% were classified as tumor cells undergoing epithelial-to-mesenchymal transition (EMT), and 5.1% were mesothelial cells. All clusters were represented in both replicates (Fig. 2B–D). Clusters 5 and 7 had lost Epcam expression but retained higher expression of several keratins and genes associated with an epithelial origin. This suggested they might be EMT tumor populations (Supplementary Fig. S2D). To confirm this assignment, we identified Alcam as a gene uniquely expressed by these clusters (Fig. 2C). Flow cytometry confirmed ALCAM protein was expressed by a subset of cells found only in some large tumors. Isolation and sequencing of ALCAM+ cells revealed expression of the KrasG12 allele with 98% variant allele frequency (Supplementary Fig. S2E), confirming their identity as tumor cells. Cluster 6 was identified as mesothelial cells based on previous work from Buechler and colleagues, who transcriptionally profiled these cells and their transcriptional differences to fibroblasts using bulk RNA-seq, as well as Xie and colleagues, who identified their signature genes with scRNA-seq (16, 18). Genes identified by both studies as mesothelial cell markers were strongly enriched in cluster 6; conversely, 18 of the 20 most enriched genes in cluster 6 were also upregulated in mesothelial cells compared with fibroblasts in the Buechler and colleagues dataset (Fig. 2C and E; Supplementary Table S2). We primarily observed mesothelial cells in normal and normal-adjacent tissues (Fig. 2D).
Clusters 0 to 4, 8, and 9 were identified as fibroblasts by their expression of signature fibroblast genes (Supplementary Fig. S2D). Two clusters of normal tissue fibroblasts (ntFib) derived from normal mice (c3 and c4), as well as five clusters of CAFs, were identified (Fig. 2B; Supplementary Fig. S2B; c0, c1, c2, c8, and c9). c0 and c1 were most abundant in tissue adjacent to tumors (∼88% of CAFs; Fig. 2F). Meanwhile, the frequency of cells from c8 and especially c2 increased with tumor progression and dominated in late-stage tumors (>70% of all CAFs; Fig. 2F). Given the disappearance of ntFib with tumor progression, but proximity of normal fibroblast and CAF clusters in t-SNE and UMAP space, we hypothesized that heterogeneity at baseline might play a role in subsequent CAF development.
In Mice, Two Separate Fibroblast Lineages Coevolve during Tumor Progression Driven by TGFβ and IL1
UMAP dimensionality reduction of ntFib alone confirmed two major Pdpn+Pdgfra+ cell populations (Fig. 3A), and we identified their transcriptional profiles (Supplementary Fig. S3A; Supplementary Table S3). The c3 population expressed ECM genes associated with elastin fibrils and ECM attachment (e.g., Emilin2, Mfap5, Fbn1) whereas the c4 population was characterized by high expression of ECM proteins that suggested a predominant role in structural support through the production and maintenance of collagen networks and basement membranes (e.g., Col4a1, Col6a6, Plc). Consequently, c4 exhibited a significantly higher overall expression of collagens compared with c3 (Wilcoxon rank-sum test <0.001; Supplementary Fig. S3B). A similar trend was observed with respect to immune regulation, with c4 showing enrichment of several immune chemoattractants (including Ccl11, Cxcl14, and Cxcl16), whereas c3 showed enrichment of different immunoregulatory genes [Thbd, Cd55, Il33, Dpp4, and Ackr3 (Cxcr7)]. Their expression of nonoverlapping genes indicates complementary activities of c3 and c4. Flow cytometry for DPP4 and Ly6C, markers for c3, and ENG, a c4 marker (Supplementary Table S3), confirmed that both these populations can be phenotypically identified (Fig. 3B).
To assess the transcriptional changes during tumor progression, we first performed an unbiased principal component analysis (PCA) comprising all CAF and ntFib cells. PC1, the component explaining the strongest variance in the dataset, clearly separated c4 ntFib and c1 and c2 CAFs from c3 ntFib and c0 and c8 CAFs (Fig. 3C, top). Genes driving this unbiased separation were the same found in the supervised differential expression test between the two ntFibs c3 and c4 (Fig. 3C, bottom). This analysis strongly suggests a lineage relationship between CAFs and preexisting fibroblasts in the tissue. To investigate this further, we calculated a score for each CAF cell based on the normal fibroblast ontogeny signature genes (Supplementary Fig. S3C), which enabled tracing of the CAF populations back to their nonmalignant ancestor (Fig. 3D; Supplementary Fig. S3C). c3 and c4 ntFibs have separate differentiation trajectories during tumor progression, with c4 giving rise to c1 CAFs, which predominantly give rise to c2 CAFs; meanwhile, c3 ntFibs give rise to c0 CAFs, which then predominantly progress into c8 CAFs (Fig. 3D, right; Fig. 3E). We find c9, strongly characterized by high expression of proliferation markers (Mki67, Top2a), splits into two clusters in UMAP space (Supplementary Fig. S2B), one aligning with EMT tumor cells, the other one aligning with c2 CAFs. The proliferating, CAF-proximal cells also exhibit a higher c4 ntFIB score, explaining the observed expansion of descendants of this lineage with tumor progression (Supplementary Fig. S3D). We thus conclude that nontumor cells from c9 are mostly a proliferating subset of c2 CAF.
The trajectories for the two separate fibroblast populations were confirmed with pseudotime analysis for each of the lineages (ref. 19; Fig. 3F). Comparing the expression of ECM genes and selected immune-regulatory genes across all of the CAF clusters revealed a sharp transcriptional shift in the programming of c2 and c8 (Fig. 3G). In the transition from c4 ntFib, there is a significant loss of basement membrane components (e.g., type IV and type VI collagens) with a drastic increase in levels of several fibrillar collagens in c2 (Fig. 3G), indicating an increase and reorganization of fibrillar collagen deposition (20, 21). CAFs originating from c3 also increase expression of ECM genes, particularly fibrillar collagens (Fig. 3G, top), but the most dramatic changes observed with tumor progression are in chemokine and cytokine expression (Fig. 3G, bottom) such as the upregulation of Cxcl9/10, Cxcl1, and Ccl2, which likely recruit myeloid populations through CXCR2 and CCR2 as well as pleiotropic cytokines such as IL6 (22–24), particularly in late-stage fibroblasts (Fig. 3G). Interestingly, although there are differences in expression levels between c2 and c8 CAFs, we also see increased expression of genes encoding factors that are known to support tumor cell survival and metastasis, such as Timp1, Vegf, Il11, Lif, and Pdgf (25–29), that are increased in both lineages (Fig. 3G). Moreover, with tumor progression both lineages acquire potential immune-regulatory gene expression such as increased expression of Mif and Timp1, which can drive infiltration of myeloid-derived suppressor cells and regulatory T cells in PDAC (30, 31). Thus, although each lineage does appear to have specialized functions reflected by differences in their gene signatures, they also share some common programming.
To further investigate the signals driving the differences between the two lineages, we queried the promoters of their signature genes (Supplementary Table S1) for transcription factor binding sites. This analysis revealed a strong enrichment of NFκB binding sites in the promoters of c8-specific genes, whereas c2-specific genes showed SMAD3 binding site enrichment (Fig. 3H). Pathway enrichment analysis supported these predictions, suggesting signaling through IL1 and TNFα as a driver of the c8 transcriptional signature and TGFβ-driven activation of c2 (Fig. 3H). Furthermore, we observed a strong enrichment of a TGFβ fibroblast gene signature (32) in c2 cells, further validating TGFβ as a key driver of the c2 phenotype (Fig. 3I, left). Interestingly, their transcriptional signatures suggest these populations may promote their own programming; whereas c8 cells express IL1α and their chemotactic profile suggests paracrine interactions with myeloid cells, which can also be a primary source of IL1 and TNF (Fig. 3I, right; refs. 33–35), c2 shows expression of Tgfb1 and Tgfb3 (Fig. 3I).
To confirm the validity of our fibroblast evolution model, we compared our expression signatures to the previously published fibroblast-enriched data from KPC mice (36). UMAP clustering identified a group of Pdpn+ Pdgfra+ cells that could be dissected into three different subclusters (Supplementary Fig. S3E): one cluster of Ly6a/c1+ cells, previously described as “iCAFs,” one cluster of Col15a1+ cells, previously described as “myCAFs,” and one cluster with high levels of Cd74, H2-Ab1, and Saa3, previously described as “apCAFs.” When we compared the average expression levels of our two normal fibroblast lineage programs to those of these clusters, we found that myCAFs clearly clustered with our c4 fibroblasts and iCAFs with c3 fibroblasts, confirming that the lineage hierarchy is similarly present in the KPC model (Fig. 3J). Notably, apCAFs clustered with c6 mesothelial cells from normal pancreas (markers, Supplementary Table S3). Accordingly, the IL1 c8 CAFs exhibited most similar expression profiles to iCAFs, the TGFβ c2 CAFs clustered with myCAFs, and the c6 mesothelial cells clustered with apCAFs (Supplementary Fig. S3F).
Mouse Models of PDAC Identify LRRC15 as a Marker of TGFβ-Driven c2 CAFs
We were particularly interested in further characterizing c2 as it increased with tumor progression, dominating the CAF compartment in late-stage tumors (Fig. 2F), and because TGFβ-associated stroma is correlated with poor prognosis (32, 37). Therefore, we sought to identify markers that distinguish TGFβ-driven c2 CAFs from the other fibroblast stromal subsets in PDAC. Bulk RNA-seq data from early- and late-stage tumor PDPN+ CAFs identified Lrrc15 to be one of the most differentially expressed genes between CAFs and ntFIBs (Fig. 4A). Lrrc15 encodes a transmembrane domain–containing molecule expressed in the stroma of several human tumors and upregulated by TGFβ (38). Cross-referencing genes enriched in the TGFβ-driven c2 to an atlas of proteins experimentally identified to be on the cell surface (39) further validated Lrrc15 to be a strongly enriched c2 gene encoding a surface protein (Supplementary Fig. S4A).
The presence of LRRC15+ PDPN+ cells with fibroblast morphology in tumor-bearing pancreases was confirmed by immunofluorescence microscopy; LRRC15+ cells were usually found in nests throughout the tumor-bearing pancreas surrounding tumor cells in KPP GEMMs (Fig. 4B). We further employed subcutaneous models of PDAC, using a cell line derived from KPP GEMMs (KPP14388). Characterization of flank injections of 100K tumor cells showed tumors with similar stromal composition to the KPP mice (Supplementary Fig. S4B). Immunofluorescence microscopy revealed abundant LRRC15+ PDPN+ cells in the subcutaneous tumors derived from KPP PDAC lines (Fig. 4C). Flow cytometry analysis showed that LRRC15 marked a significant portion of PDPN+ stromal cells, and was largely absent from other cell populations in the tumor microenvironment (TME; Fig. 4D). Notably, LRRC15, unlike many other CAF markers, is also absent in lymph-node stroma as well as normal pancreas (Supplementary Fig. S4C).
Because of their proximity to tumor islets, we decided to assess whether LRRC15+ CAFs can directly enhance tumor growth. We generated a KPP line expressing diphtheria toxin receptor (DTR) that allowed us to remove residual tumor cells and culture isolated CAFs with the addition of diphtheria toxin. Two thousand KPP-mApple tumor cells were grown alone or in coculture with LRRC15+ CAFs compared with LRRC15− Ly6C+ CAFs or c3 and c4 ntFIBs and assessed for their ability to promote spheroid growth in 3-D culture. Tumor spheroids cultured with any fibroblast population grew larger than those in medium alone. This demonstrated that all the fibroblasts tested can directly enhance tumor growth (Fig. 4E). Although we cannot rule out that the spheroids themselves reprogrammed the fibroblasts as has been previously reported (40), it suggests that the specific in situ positioning of LRRC15+ CAFs next to tumor islets might be one of the keys to their role in a protumor niche, rather than a unique ability to promote tumor growth. This experiment also represents a single functional test; given the unique transcriptional differences between the fibroblast populations, we might expect functional differences in other areas, such as immune regulation.
LRRC15+ CAFs Are Present in Human PDAC Samples
To translate our findings from mouse models into human cancer, we reanalyzed data from a recently published study of scRNA-seq of human patients with PDAC, by Peng and colleagues (41). After quality control and filtering, we retained 84,276 cells from 22 patients for downstream analysis. Clustering in dimensionality-reduced space revealed 12 clusters of 11 main cell types (Fig. 5A; Supplementary Table S4). All clusters were comprised of cells from more than 8 patients (Fig. 5B). To confirm our tumor-cell assignment and by extension ensure that our fibroblast assignment did not include tumor cells that had undergone EMT, we collected three independent lines of evidence. First, we confirmed only cells in cluster 0 were positive for mRNA with the KRASG12 missense mutation (Fig. 5A). Second, reanalysis of a separate publicly available microdissection study (42) showed that only markers for tumor cells (cluster 0) are enriched in bulk RNA-seq of microdissected tumor samples compared with samples from microdissected stroma or nonmalignant control pancreas (Fig. 5C). Third, we identified large-scale copy-number variants in cells from cluster 0, but not cells from the CAF cluster (Supplementary Fig. S5A).
After confirming the nontumor origin of the population identified as CAF cells, we specifically focused on this cluster. Subclustering of the 8,931 fibroblasts revealed three distinct subsets (Supplementary Table S5): 52% of cells expressed high levels of the TGFβ c2 CAF markers TAGLN and LRRC15, 3% were strongly enriched in the IL1 c8 CAF markers HAS1 and CCL2 (Fig. 5D), and 44% of cells expressed high levels of C7 and CFD. LRRC15 was also highly expressed in the bulk RNA-seq from microdissected stroma compared with tumor or normal control samples, suggesting that this population is prominent in PDAC stroma (Fig. 5E). Conversely, we found HAS1 lowly expressed across tumor, stroma, and nonmalignant samples and enriched in only a small fraction of microdissected stroma samples, likely representing only a minor population in PDAC stroma. These trends were confirmed when the average expression of signature genes for each of the human fibroblast single-cell clusters was compared within the microdissected bulk-sequencing samples (Fig. 5F). To confirm protein expression of LRRC15 in human CAFs, we performed dual IHC on tissue from 70 patients with PDAC (Supplementary Table S6) for LRRC15 and CD8. We found 100% of patients showed LRRC15 staining in nonnormal areas of pancreas and LRRC15 appeared fibrillar and was largely excluded from tumor islets. Mostly, it was found surrounding them; additionally, it was frequently seen in proximity to CD8+ T cells in the area (Fig. 5G; Supplementary Fig. S5B). Flow cytometry on 4 patient samples further confirmed LRRC15 was largely restricted to the EPCAM− CD45− stromal gate and marked the majority of CAFs (Fig. 5H). Altogether, we find that LRRC15+ TGFβ cluster 0 (hC0) CAFs are the most prominent fibroblast population in multiple human PDAC datasets, confirming our findings from the mouse model.
Although human fibroblast clusters 0 and 2 (hC0 and hC2) exhibited overlapping genes of mouse TGFβ c2 and IL1 c8 CAFs in a cross-species comparison, respectively (Fig. 5I), cluster 1 (hC1) did not obviously match the early CAF populations observed in mouse. Although HC1 was characterized by high levels of mouse c4 relative to c3 genes (Supplementary Fig. S5C), individual cells did not show a clear phenotype of one or the other population. To test if, in contrast to mice, human pancreatic fibroblasts are a homogeneous population, we performed in silico isolation of single fibroblast cells from 11 nonmalignant pancreatic tissues, published as part of Peng and colleagues (41). Dimensionality reduction with UMAP and clustering in reduced space revealed a population of 1,407 fibroblasts (DCN, LUM; Supplementary Fig. S5D; Supplementary Table S7). Subclustering of these cells identified two clusters, of which the minor one (<5% of cells) exhibited high levels of EPCAM and other epithelial markers and likely represents an artifact of our in silico isolation (Supplementary Fig. S5E). The other cluster was characterized by strong expression of C7 and CFD. The strong similarity of markers of these nonmalignant fibroblasts to hC1 suggested that hC1 have not undergone extensive transcriptional changes relative to nonmalignant fibroblasts. This was confirmed by comparing the average expression profile of these two cell types (Pearson correlation: 0.97), where only a few genes changed expression (Fig. 5J). On the basis of this analysis, we were able to identify genes enriched in nonmalignant fibroblasts and hC1 CAFs, but not hC0 TGFβ CAFs or hC2 IL1 CAFs, as well as genes that are induced in all three CAF populations compared with nonmalignant fibroblasts (Supplementary Fig. S5F). Together, the results suggested that hC1 CAFs are early CAFs (eCAFs) and the predecessor to both hC2 IL1 and hC0 TGFβ CAFs. To test this hypothesis, we performed PCA of all fibroblasts including those from nonmalignant pancreas. Strikingly, PC1 separated cells in the order of nonmalignant fibroblasts, hC1 eCAFs, hC2 IL1 CAFs, and hC0 TGFβ CAFs (Fig. 5K). The minimum spanning tree fit to this dimensionality-reduced data supported that starting from nonmalignant fibroblasts cells undergo a transformation into C1 eCAFs, upregulating type I collagen, SPARC, and other ECM proteins. From this intermediate state, cells become either hC2 IL1 CAFs or hC0 TGFβ CAFs. Interestingly, both the hC1 eCAFs and the hC0 TGFβ CAFs make up almost all fibroblasts in the single-cell dataset under investigation. We did not see evidence of any cells with a mesothelial or apCAF signature in these data; however, we find that all human CAFs expressed CD74 and HLA-DRA (Supplementary Fig. S5G), which is consistent with data found in Elyada and colleagues (36). We have summarized our findings in both human and mouse in a model (Fig. 5L).
An LRRC15+ CAF Signature Can Be Found across Several Human Cancer Indications
As it had been previously reported that stromal LRRC15 expression could be observed in several tumor types (38), we performed a pan-cancer analysis across tumors in The Cancer Genome Atlas (TCGA; n = 9,736) and compared these data to matched nonmalignant tissues from the GTEx database (n = 8,587). We found that LRRC15 expression was consistently low/absent across normal tissues, but upregulated in a variety of tumors including but not limited to pancreatic, breast, and head and neck cancers (Fig. 6A). To verify that the LRRC15 signal was derived from TGFβ-activated CAFs in tumor types other than PDAC, we first identified a more robust expression signature of TGFβ CAFs from our human PDAC scRNA-seq analysis. We focused on genes significantly enriched in TGFβ CAFs compared with all other fibroblast populations that showed no/low expression by any other cell type in the full dataset (Fig. 6B). The gene set was strongly enriched in microdissected PDAC bulk stroma versus tumor samples (Fig. 6C), suggesting that their combined signal allows conclusions about the presence/absence of TGFβ fibroblasts in bulk RNA-seq data.
To next confirm the presence of this population in other tumor types, we exemplarily reanalyzed single-cell RNA-seq data from 18 patients with head and neck squamous cell carcinoma (HNSCC; ref. 43). Dimensionality reduction with UMAP of 3,363 cells from the TME free of somatic mutations confirmed the cell-type annotations provided by the authors (Fig. 6D, top). Clustering of the mesenchymal cells revealed 5 subclusters, of which two were pericytes (25% of all cells; PDGFRB, MCAM, RGS5, ACTA2), two were fibroblasts (15% of all cells; LUM, DCN), and one was myoblastic-like (2% of all cells). Confirming the results from the pancreatic cancer single-cell data, expression of our TGFβ CAF marker gene set was almost entirely restricted to one of the two fibroblast clusters (Fig. 6D, bottom right; cluster 2, 60% of all fibroblasts). This underscores the presence of LRRC15+ TGFβ CAFs also in head and neck cancer. Furthermore, the majority of genes showed low/no expression by tumor cells (Supplementary Fig. S6A), indicating that the main signal of these genes in bulk RNA-seq data is, as we have shown in pancreatic cancer, likely primarily derived from TGFβ CAFs and less so from EMT tumor cells.
On the basis of these results, we used an 11-gene signature (MMP11, COL11A1, C1QTNF3, CTHRC1, COL12A1, COL10A1, COL5A2, THBS2, AEBP1, LRRC15, ITGA11) to infer the presence of TGFβ CAFs across different cancer types from bulk RNA-seq TCGA data. In this pan-cancer analysis comprising 31 different cancer types from TCGA, we found a positive correlation between the average expression and the average gene-wise correlation of our core signature across cancer types (Fig. 6E). The positive correlation indicates that in cancer types where the TGFβ CAF is present (high average expression), there is a true signal in the bulk coming from this population that leads to a high gene-wise correlation. In cancer types lacking TGFβ CAFs (low average expression), there is just a “noise” signal and the genes are uncorrelated. Besides pancreatic adenocarcinoma (PAAD), the analysis points to TGFβ CAFs also playing a strong role in breast cancer (BRCA), lung cancer (LUSC, LUAD), ovarian cancer (OV), colon cancer (COAD), renal cancer (READ), esophageal cancer (ESCA), stomach adenocarcinoma (STAD), and bladder cancer (BLCA; Supplementary Fig. S6B), as well as HNSCC. In summary, these data suggest that LRRC15+ CAFs are a prominent population across multiple human cancer types that emerges from an LRRC15− fibroblast population.
An LRRC15+ CAF Signature Predicts Poor Clinical Response to Checkpoint Blockade
Having shown that Lrrc15+ CAFs are present in human cancers, we next sought to test the clinical impact of this population. Because of the known roles of TGFβ in modulating immunotherapy (32, 44), we evaluated the clinical significance of the newly identified LRRC15+ CAFs in response to cancer immunotherapy. Multiple reports have shown that the molecular makeup of bladder cancer is similar to that of pancreatic cancer with shared subtypes (45, 46). Furthermore, we have shown in our previous analysis that LRRC15+ CAFs are also a frequent population in bladder cancer (Fig. 6E). We found that the markers for LRRC15+ CAFs identified in PDAC were also significantly coexpressed in RNA-seq data from our recent bladder cancer immunotherapy trial (ref. 32; Supplementary Fig. S7A). Moreover, the signature was associated with worse outcome for patients receiving anti–PD-L1 (atezolizumab) therapy (P = 0.03, HR = 1.4; Supplementary Fig. S7B). This effect is explained by the increased expression of the signature in patients who fail to respond to anti–PD-L1 therapy exclusively in immune-excluded tumors, but not in tumors with inflamed or immune-desert phenotype (Fig. 7A). Consequently, we observe a significant association of the LRRC15+ CAF signature with worse outcome specifically in patients with immune-excluded tumors (P < 0.001, HR = 2.3; Fig. 7B). Our result represents an improvement over the fibroblast TGFβ response signature obtained through in vitro activation of fibroblasts with TGFβ in regard to their cell-type specificity (Fig. 7C) and predictive power (Fig. 7D). Immunoscoring of patients with PDAC revealed that the majority showed an immune-excluded phenotype, suggesting these findings might also apply to patients with PDAC (Supplementary Fig. S7C). Importantly, the LRRC15+ CAF signature was also predictive of response to atezolizumab in a second trial comprising multiple other cancer types, such as renal cell carcinoma, head and neck cancer, and non–small cell lung cancer (P = 0.01, HR = 2.01; Fig. 7E; Supplementary Fig. S7D). This effect (HR > 1.5) was apparent across several individual cancer indications, despite their small individual sample sizes (Supplementary Fig. S7E). It remains unclear what the nature of LRRC15+ CAF immunosuppression might be, but these data provide a strong basis to further elucidate the functions of these cells. The correlation between LRRC15+ CAFs and poor outcome in immunotherapy treatment suggests that multiple tumor indications may benefit from LRRC15+ CAF reprogramming combined with immunotherapy.
Here, we took advantage of our novel digestion method to profile PDPN+ CAFs ex vivo using scRNA-seq. Our approach identified two separate populations of ntFibs in mouse pancreas. Their expression signatures suggest disparate functions, with one more primed to provide structural support and another appearing more immunoregulatory. These two separate lineages evolve separately into IL1- and TGFβ-driven CAFs in the context of PDAC. Further work on localization and lineage tracking will allow us to distinguish whether there is a physical compartmentalization or niche that results in differential exposure to IL1 and TGFβ or whether the ntFibs fundamentally have differing potential to respond to IL1 and TGFβ that results in the two separate CAF trajectories we observe.
Our findings from murine late-stage tumors support previous observations identifying IL1-driven “iCAFs” and TGFβ-driven “myCAFs” (40, 47). Furthermore, we provide new insights into how resting fibroblast heterogeneity predetermines the fate of stromal cells in the TME. It also seems from our cross comparison that “iCAFs” include both our early CAF1 and IL1 CAFs whereas “myCAFs” include both our early CAF2 and TGFβ CAFs. We do observe a difference with our identification of a population previously designated as “apCAF” (36), which we identify as a mesothelial cell population. This discrepancy might be due to mesothelial cells acquiring expression of some fibroblast genes in the KPC system, as mesothelial-to-mesenchymal transition has been described in some tissues (48). In fact, in the KPC system, a relationship between the “apCAFs” and “myCAFs” is apparent in UMAP space, and the “apCAFs” express Pdgfra, suggesting changes from the normal tissue state. However, the dominant mesothelial genes driving the clustering of that population, including the antigen-presenting genes Cd74 and H2-Ab1, were present in normal pancreas. We also did not observe these cells in the tumor, which may be a consequence of our dissection method where the majority of the mesothelium would be included in the adjacent normal tissue, or it could reflect a difference between the KPC and KPP models.
Comparison with several different human patient cohorts revealed both similarities and differences with the mouse, which we have modeled (Fig. 5L). Although the conservation of IL1 and TGFβ CAFs was quite obvious, we were struck by several differences between the mouse models and human patient data. First, we do not observe baseline heterogeneity in the human nonmalignant tissue fibroblasts. Rather, human fibroblasts from nonmalignant tissue show a transcriptional profile that combines the mouse ntFIB signatures. Subsequently, nonmalignant human fibroblasts transition to a single early CAF, which then gives rise to either TGFβ- or IL1-programmed CAFs. However, given that the human tissues we analyzed were not truly normal, we cannot rule out that nonmalignant fibroblasts had already undergone changes that masked baseline heterogeneity. We also found that CD74 and HLA-DRA are expressed by all human CAF populations, revealing a potentially important functional difference between human and mouse CAFs. We do not observe a specific population with a similar transcriptional signature to “apCAF” or mesothelial cells, suggesting this population is absent or very rare in the TME of PDAC.
We identify the TGFβ-driven cell population as the most prevalent CAFs in late-stage tumors, and show that surface expression of LRRC15 enables experimental isolation and manipulation of these CAFs in both mouse models and human patient samples. Furthermore, we find that the LRRC15+ CAF signature correlates with poor response to immune checkpoint blockade in several different human tumor types. These cells have myofibroblastic properties and a dominant ECM gene signature. We find they constitute the majority of CAFs in patients with PDAC, which are dominantly of an immune-excluded phenotype. This suggests an immunoregulatory role for these cells. It will be valuable to further explore whether early CAFs can be prevented from adopting the protumorigenic fate of the LRRC15+ CAFs or whether the LRRC15+ CAF phenotype can be reverted to improve immunotherapy efficacy.
Although we chose to focus on the LRRC15+ CAFs due to their prevalence in PDAC, IL1 CAFs have a transcriptional program clearly suggesting immune regulation of the TME. Inhibition of JAK signaling in PDAC has been shown to be associated with both a reduction in IL1 CAFs and reduced tumor burden (47), although it is difficult to distinguish direct effects of these inhibitors on the tumor cells (28, 49–51) from the effects of IL1 CAF loss in the TME. It is also important to note that although both of these CAF types have many transcriptional differences, they also both express genes associated with myofibroblast characteristics and both express various immune-regulatory and even inflammatory mediators. Thus, we describe them by their most conserved characteristics, their major transcriptional programming; IL1 CAF and TGFβ CAF. For the TGFβ CAF, we have identified LRRC15 expression as a good proxy across several cancer indications.
We chose to focus on fibroblasts to generate a robust dataset that would be a good representation of the heterogeneity of a somewhat rare population. Our work also identifies other populations with fibroblast properties: PDPN− CD31− cells, which were enriched for pericytes but adopted expression of some CAF genes, as well as two populations of PDPN+ cells, which we classified as tumor cells undergoing EMT. There are also various stromal and nonstromal cells that we chose to exclude from the focus this study. These cells are all part of the TME, and future research into their functions is sure to yield a more complete understanding of their interactions and contributions to tumor progression and response to therapy.
WT B57BL/6 mice (colony 000664) and albino WT B6(Cg)-Tyrc-2J/J mice (colony 000058) were purchased from Jackson Laboratory. We licensed KrasLSLG12D from Tyler Jacks (Massachusetts Institute of Technology, Boston, MA), p16/p19fl/fl from Anton Berns (NKI, Amsterdam, the Netherlands), and Pdx1.Cre from Andy Lowy (University of Ohio, Columbus, OH). KPP mice were generated as described previously (12). Age- and sex-matched mice were used for experiments. The mice were housed at Genentech in standard rodent microisolator cages and were acclimated to study conditions for at least 3 days before tumor cell implantation. Animals were 6 to 12 weeks old.
KPP GEMM mice were euthanized at a median age of 9 weeks. This age reflects the disease state with high-penetrance adenocarcinoma and at which GEMMs with moderate and large tumors are observed.
All animals were monitored according to guidelines from the Institutional Animal Care and Use Committee at Genentech, Inc.
The KPP14388 murine pancreatic adenocarcinoma cell line was generated by the Junttila group at Genentech from the KPP GEMMs. Transgenic lines were created as follows: mApple MSCV retrovirus was transfected into KPP14388 and a single clone was grown; KPP-DTR was obtained by transducing KPP14388 with DTR-efp lentivirus and a single cell clone was selected. Cancer cells were cultured in high-glucose DMEM plus 2 mmol/L l-glutamine with 10% FBS (HyClone). All cell lines were tested for Mycoplasma by qPCR. For all injected tumors, cells were used within the first three passages.
Tissue Digestion, Cell Isolation, and Flow Cytometry of Murine Tissues
To isolate pancreases, first the omentum was removed and then the pancreas collected with careful exclusion of draining lymph nodes. The pancreas was then minced. For stellate cell enrichment, tissue was digested as described previously (11). Enzymatic digestion was used with 0.02% Pronase (Roche catalog no. 10165921001), 0.05% Collagenase P (Roche, catalog no. 11249002001), and 0.1% DNase (Roche, catalog no. 10104159001) in Gey balanced salt solution (GBSS; Sigma, catalog no. G9779) for 50 minutes. Digested tissue was then filtered through a 100-μm nylon mesh. Cells were centrifuged at 1,300 rpm for 5 minutes, washed, and then resuspended in GBSS containing 0.3% BSA (Sigma Aldrich, catalog no. A2153). The cell suspension was centrifuged, decanted, and resuspended into 8 mL 28.7% (wt/vol) solution of Nycodenz (Sigma, no longer available; also used 16.7% Optiprep; Sigma-Aldrich D1556) overlaid with 6 mL 0.3% BSA GBSS, and then centrifuged at 1,400 × g with no break for 20 minutes. The cells of interest separated into a fuzzy band just above the interface of the Nycodenz cushion and the GBSS. This band was harvested and the cells were washed and resuspended in MACS buffer.
For our new digest, which was modified from Fletcher and colleagues (52), 20 μg/mL anti-trypsin (Sigma Aldrich, catalog no. 10109886001) was used in the first round of digest incubations, which were prepared as follows. Pancreases were enzymatically digested using 800 μg/mL Dispase, 400 μg/mL collagenase P, and 100 μg/mL DNase I at 37°C. Fractions were collected into MACS buffer and digest media was refreshed 2 more times after 15-, 10-, and 5-minute incubations. At this point either: (i) for comparison with stellate cell enrichment, the cell suspension was centrifuged at 1,300 rpm for 5 minutes, decanted, and subjected to a density gradient centrifugation as described above or (ii) for subsequent digests without gradient enrichment, samples underwent RBC lysis and were spun for 4 minutes at 50 × g to pellet debris, supernatant was collected, spun down and resuspended in media, and cells were counted using a Vi-CELL XR (Beckman Coulter).
Pancreatic tumors were similarly treated with the addition of 2 U/mL hyaluronidase (Worthington, catalog no. LS002592) and 20 U/mL purified collagenase (Worthington, catalog no. LS005273) and 100 μg/mL Collagenase P. In PDAC experiments, control normal pancreases were digested using same enzymatic cocktail as tumors. Tumors also often required 1 to 2 additional digest incubations to break down tissue.
Subcutaneous tumors were collected with care to avoid draining lymph node and epidermis. Subcutaneous tumors were weighed and enzymatically digested using the same enzymatic mix as the previously described normal pancreas samples, without the trypsin inhibitor addition.
Cells were labeled with mAbs purchased from eBioscience, BioLegend, or BD Biosciences at 1:200 for 20 to 30 minutes, unless otherwise noted. Prior to cell surface staining with the following fluorescently labeled antibodies, cells were blocked with Fc block (2.4G2; 1:500). Surface staining for experiments was performed using antibodies described in Supplementary Table S8 for 25 minutes at 4°C, washed 2.5 times with MACS buffer, then either fixed (BioLegend, catalog no. 420801) or resuspended in 7-AAD (1:50; BD Biosciences, catalog no. 559925) and Calcein Blue (1:1,000; Invitrogen, catalog no. C1429) for cytometry analysis. For intracellular staining, cells were surface stained as above, washed, and then fixed and permeabilized using the FoxP3 ICS Kit (eBioscience, catalog no. 00-5523-00) as per the manufacturer's directions. Cells were then incubated with antibodies described in Supplementary Table S8 for 1 hour in permeabilization buffer. Data were acquired on a Fortessa, Symphony, or LSR II (BD Biosciences) and analyzed using FlowJo (Tree Star).
Cells were isolated as above. Once a single-cell suspension was obtained, cells were plated and resuspended in Aldefluor assay buffer, as part of the Aldefluor Kit (StemCell Technologies, catalog no. 01700) in FACS tubes (500 μL) or 96-well U-bottom plates (200 μL) with Fc block. Then, 10 μL (FACS tubes) or 4 μL (96-well plate) of assay buffer was aliquoted into quench tubes. 2.5 μL (FACS tubes) or 1 μL (96-well plates) of DEAB buffer was aliquoted into quench tubes. Five microliter (FACS tubes) or 2 μL (96-well plate) Aldefluor reagent was then added to cells as rapidly as possible. Cells were then mixed, and half of the volume was added to quench tubes with DEAB. Samples were incubated at 37°C for 15 to 20 minutes, then spun down and surface-stained as above on ice for 20 to 30 minutes before FACS analysis.
Anti-LRRC15 Antibodies for Flow Cytometry and Murine Imaging
Gene synthesis and cloning was performed following reverse translation and codon optimization for Chinese hamster ovary (CHO) cells of the amino acid sequences encoding the variable heavy (VH) and variable light (VL) domains of huM25 (flow) and huAD208.4.1 (imaging) anti-LRRC15 clones, as published via patent 10195209 (53). For huM25, expression constructs of human–mouse chimeric antibodies were generated by subcloning the VL and VH sequences into mammalian expression vectors containing mouse kappa light-chain and mouse IgG2a heavy-chain frameworks, respectively. For huAD208.4.1, expression constructs of human antibodies were generated by subcloning the VL and VH sequences into mammalian expression vectors containing human kappa light-chain and human IgG1 heavy-chain frameworks, respectively. Both antibodies were produced by transiently transfected CHO cells and purified using standard antibody purification methods and were confirmed to be human/mouse LRRC15 cross-reactive by surface plasmon resonance (data not shown).
Immunofluorescence and Image Analysis of Mouse Tissues
Mouse pancreas, PDAC tumor tissue, and subcutaneous KPP tumors were fixed overnight in 1% paraformaldehyde (PFA), embedded in optimal cutting temperature medium (Sakura Finetek) and frozen for storage at −80°C. Five to 12 μm–thick sections were cryosectioned, immune-stained, and imaged with confocal microscope Leica TCS SP8. Images were processed with Fiji software (ImageJ v2.0.0-rc-69/1.52i). For staining, slides were fixed for 5 minutes in 4% PFA, blocked, and permeabilized in stain buffer (2% BSA, 5% goat serum in PBS) with 0.3% Triton for 1 hour. Primary antibodies were added for 1 hour at room temperature to overnight at 4°C, secondaries were added for 30 minutes to 2 hours at room temperature; slides were mounted in hardening media (Dako S3023). Details of the antibodies used can be found in Supplementary Table S8.
Cells between passage 1 and 2 were trypsinized, filtered, counted, and resuspended in 50% PBS and 50% Matrigel (Corning, catalog no. 356231) at a concentration of 1 × 106 cells/mL for injection into mice. The mice were housed at Genentech in standard rodent microisolator cages and were acclimated to study conditions for at least 3 days before tumor cell implantation. Animals were 6 to 10 weeks old. Only animals that appeared to be healthy and free of obvious abnormalities were used for studies. Mice were inoculated in the right flank with 1 × 105 cancer cells in 100 μL of PBS:Matrigel (1:1). Sixteen to 24 days after tumor injection, mice were euthanized and tumors collected for either immunofluorescence or flow cytometry analysis.
Donor B6 mice were injected with KPP-DTR cancer cells as described above. Tumors were collected and digested. Single-cell suspensions were spun down at 1,300 rpm for 3 minutes, decanted, and resuspended in fibroblast media [10% FBS αMEM, supplemented with 1× L-glutamine, 1× penicillin/streptomycin, and 1× HEPES (all from Gibco) and 10% batch tested low IgG FCS (Gemini)]. Bead depletion was performed with anti-CD45-biotin, anti-CD24-biotin, and anti-CD31-biotin in conjunction with the EasySep Biotin Selection Kit (StemCell Technologies, catalog no. 17655). Cells were then sorted for LRRC15+ CAFs and LRRC15− Ly6C+ CAFs, as control normal pancreas was sorted for ENG+Ly6C− and ENG−LY6C+ fibroblasts as described in flow cytometry panels. Sorted cells were cultured in fibroblast media, and 25 ng of diphtheria toxin (DT; Enzo catalog no. BML-G135-0001) was added to kill tumor cells; this was done in 100 mm tissue culture–treated dishes with 20 mL media. Plates were rinsed the next day with PBS and media with DT was replaced. Cells were cultured for 7 to 10 days, with one passage at confluence, to expand fibroblasts. Cultured fibroblast cells were trypsinized, counted, and then combined with KPP-mApple cancer cells.
For spheroids, 2,000 KPP-mApple cancer cells were seeded in 100 μL Matrigel (Corning, catalog no. 356231) with or without different fibroblast populations (15K). Before plating cells, 100 μL of Matrigel was spread on each well of a 24-well glass-bottom plate (Mattek) and allowed to polymerize, to keep cells from directly colonizing glass. Two milliliters of fibroblast media were added to each well. The plate was imaged 2 to 4 times a week with a 4× Plan Fluor objective (NA: 0.13, Nikon) on a Nikon Ti-E inverted microscope equipped with a Neo scMOS camera (Andor, Oxford Instruments), a linear encoded automated stage (Applied Scientific Instrumentation), 37°C/5% CO2 environmental chamber (Okolab), all run by NIS Elements software (Nikon). Image sets in TRITC and brightfield of the Matrigel bubble were stitched and focused into one image projection with an extended depth of focus module (EDF; Nikon). The resulting TRITC EDF image was analyzed in Matlab (vR2018a, Mathworks) to measure total mApple spheroid area.
Tissue Digestion, Cell Isolation, and Flow Cytometry of Human Tissues
Human PDAC tumor samples were obtained and digested using a previously published protocol (9). PDAC samples were fragmented into small pieces (around 1 mm3) and digested in CO2-independent medium (Gibco, catalog no. 18045-054) supplemented with 5% FBS (PAA, catalog no. A11-151), 2 mg/mL collagenase I (Sigma-Aldrich, catalog no. C0130), 2 mg/mL hyaluronidase (Sigma-Aldrich, catalog no. H3506), and 25 mg/mL DNase I (Roche, catalog no. 11284932001) for 45 minutes at 37°C with shaking (180–200 rpm). After tissue digestion, cells were filtered using a cell strainer (40 mm, Thermo Fisher Scientific, catalog no. 223635447) and resuspended in PBS+ solution supplemented with 2 mmol/L EDTA and 1% human serum (Sigma P2918) to a final concentration at approximately 5 × 105 cells in 50 μL. Tissue single-cell suspension was stained with antibody cocktail described in Supplementary Table S8.
IHC and Image Analysis of Human Tissues
Tissues were fixed in 10% neutral buffered formalin for 24 hours, then dehydrated and paraffin embedded, and sectioned into 4-μm slices. Slides were deparaffinized and antigen retrieved in CC1 buffer (TRIS-EDTA pH 8.1) at 95°C for 64 minutes. Sequential staining with elution step after first antibody detection was completed. First antibody was anti-LRRC15 (Abcam ab150376) used at 2.5 μg/mL. Elution was done with CC2 buffer (citrate-acetate based with SDS 0.3%; pH 6.0; time: 8 minutes at 100°C). Second antibody was anti-CD8 (Abcam ab101500) used at 1:200 and the isotype control for both was naïve rabbit monoclonal antibody (Cell Signaling Technology, catalog no. 3900S). The detection system used was OmniMap-Rbt-HRP with DAB for CD8 and OmniMap-Rbt-HRP with Discovery purple for LRRC15.
Patient slides were immunoscored by a pathologist with expertise in the field (H. Koeppen). Brightfield images were acquired by a Hamamatsu Nanozoomer automated slide-scanning platform at a final magnification of 200×. The images were analyzed with the 2019a version of the Matlab software package (MathWorks). LRRC15+ fibroblasts and CD8 cell nuclei were segmented by intensity thresholding and simple morphologic filtering of the image.
Generation of Bulk-Sorted RNA-seq
Single-cell suspensions were isolated as described above. For sorting described in Fig. 1, 4 individual animals/group were sorted from the following: healthy albino B6 pancreas, KPP pancreas bearing tumors <4 mm, or KPP pancreas with tumors > 10 mm. Samples were stained and sorted for EPCAM−, CD45−, TER119−, ITGA6−, CD31−, viable (Life Technologies) and then PDPN+ PDGFRa+, or PDPN− populations were sorted directly into TRIzol and purity was assessed on a small aliquot sorted into PBS as 95% or higher. RNA was isolated according to Universal RNeasy Kit (Qiagen). RNA was profiled with the Bioanalyzer Pico RNA Kit (Agilent Technologies). Low-input RNA kit (ClonTech) was used to generate cDNA libraries. RNA-seq libraries were multiplexed and sequenced using HiSeq 4000 to generate 30 million single-end 50-bp reads per library.
Generation of Single-Cell Sequencing Libraries
For single-cell sequencing, albino B6 and KPP age and sex matched mice were sacrificed for each of two replicates; 5 albino B6 pancreases were used for “normal”; 5 KPP animals were used for other samples with tissues being divided into “adjacent” (no masses), “small tumors” (tumors <4 mm), and “large tumors” (tumors 5–10 mm). Single-cell suspensions were isolated as described above. Samples were stained and sorted for CD45−, TER119−, CD24a−, CD31−, 7AAD−, Calcein Violet+, PDPN+. Sorted single-cell suspensions were converted to barcoded scRNA-seq libraries by using the Chromium Single Cell 3′ Library, Gel Bead & Multiplex Kit and Chip Kit (10x Genomics), loading an estimated 6,000 cells per library and following the manufacturer's instructions. Samples were processed using kits pertaining to either the V2 barcoding chemistry of 10x Genomics. Single samples were processed in a single well of a PCR plate, allowing all cells from a sample to be treated with the same master mix and in the same reaction vessel. For each replicate, all samples (nonmalignant and tumor) were processed in parallel in the same thermal cycler. The final libraries were profiled using the Bioanalyzer High Sensitivity DNA Kit (Agilent Technologies) and quantified using the Kapa Library Quantification Kit (Kapa Biosystems). Each single-cell RNA-seq library was sequenced twice in two lanes of HiSeq 4000 (Illumina) to obtain single-end, 98 bp, approximately 500 million reads per library.
Bioinformatic Processing of Mouse scRNA-seq Data
Single-cell RNA-seq data for each replicate were processed with cellranger count (CellRanger 2.1.0; 10x Genomics) using a custom reference package based on mouse reference genome GRCm38 and GENCODE gene models. Individual count tables were merged using cellranger aggr to reduce batch effects. Subsequent data analysis was carried out in R 3.5.1 and the Seurat package (v 2.3.4). From an initial set of 14,916 cells, counts of transcripts measured as unique molecular identifiers (UMI) in each cell were normalized and log transformed to log(CPM/100+1; CPM = UMI counts per million. Cells with at least 1,200 measured transcribed genes per cell were considered for analysis. To remove noise from droplets containing more than one cell, we focused on cells with at most 5,000 measured genes. Dead cells were excluded by retaining cells with less than 3% mitochondrial reads, leaving 13,454 cells for final analysis. Genes induced due to dissociation stress of single cells published previously (54) were used to score the dissociation stress in each cell with the AddModuleScore function in Seurat (see the section “Calculation of Single-Cell Scores” below for details). Subsequently, normalized data was scaled to regress out the number of distinct UMIs and the stress signature score.
Prior to dimensionality reduction we performed batch correction with Harmony (55) version 0.0.0.9000 as described in the tutorial at http://htmlpreview.github.io/?https://github.com/immunogenomics/harmony/blob/master/docs/SeuratV2.html. We adjustedthe cluster membership penalty parameter theta to 1, in order to put a less strong force on combining cells across replicates. Dimensionality reduction was carried out with the Seurat package (56). Prior to PCA, we identified the 1,000 most variable genes (Seurat, FindVariableGenes using the mean of log-transformed values and the variance to mean ratio in non-logspace) and applied PCA to cells in this gene space. Principal components 1 to 20 were provided as an input for dimensionality reduction via t-SNE and UMAP with default parameters in Seurat. Clusters of cells were identified on the basis of a shared-nearest neighbor graph between cells and the smart moving (SLM) algorithm (k = 40, resolution = 0.7). Markers for each cluster were identified by reducing the number of candidate genes to those genes which were (i) at least log(0.25) fold higher expressed in the cluster under consideration compared with all other clusters and (ii) expressed in at least 10% of cells in the cluster under consideration. For genes passing those criteria, significance between cells in the cluster versus all other cells was calculated using Wilcoxon rank sum test and adjusted with the Benjamini–Hochberg method.
Calculation of Single-Cell Scores
Scores for single cells were calculated as the average relative expression of a gene set of interest, minus the average relative expression of a control gene set to account for technical differences between cells, as described by Tirosh and colleagues (57) and implemented in the Seurat AddModuleScore function. To obtain the geneset for lineages P3 and P4, we used the 20 most significantly upregulated (Padj < 0.00001, sorted by average logFC) genes in each of the two normal populations to score all cells for these two expression programs. To obtain a Collagen signature, we used all Collagen-encoding genes expressed in the full dataset to score each cell.
Single-cell pseudotime trajectories were constructed with Monocle version 2.8.0 independent for each of the two fibroblast lineages, as Monocle 2 does not support trajectories with multiple roots. For each trajectory we collected a set of 400 ordering genes that defined CAF progression by testing each gene for differential expression between normal fibroblasts from the respective lineage and fibroblasts from late-stage tumors (Padj < 0.001, sorted by logFC, 200 most upregulated and 200 downregulated genes). Expression profiles were reduced to two dimensions using the DDRTree algorithm included with Monocle 2 via the reduceDimension method and cells ordered along the trajectory using the orderCells method, both with default parameters.
Pathway and Gene Ontology enrichments for cluster-specific genes were calculated using ConsensusPathDB (58) and DAVID (59), respectively. Genes with an Padj < 0.00001, log(0.25) fold higher expressed in the cluster and expressed in >10% of cells served as an input for analysis. Transcription factor binding site (TFBS) enrichment was calculated using the OPOSSUM web service (60). Pathways with Padj < 0.05 were considered significantly enriched for pathway analysis; TF motifs with a z-score >10 were considered enriched. The z-score calculation in OPOSSUM uses the normal approximation to the binomial distribution to compare the rate of occurrence of a TFBS in the target set of genes to the expected rate estimated from a precomputed background set.
Bioinformatic Processing of Bulk RNA-seq Data
Bulk RNA-seq data from GEMM and subcutaneous mouse models was processed as described previously (32). Briefly, reads were aligned to the mouse reference genome (mm10) using GSNAP version “2013-10-10,” allowing a maximum of two mismatches per 75 base sequence (parameters: “-M 2 -n 10 -B 2 -i 1 -N 1 -w 200000 -E 1 –pairmax-rna = 200000 –clip-overlap”). To quantify gene expression levels, the number of reads mapped to the exons of each gene was calculated in a strand-specific manner using the functionality provided by the R package GenomicAlignments. For heat-map visualizations, per-gene counts were normalized to Reads Per Kilobase Million (RPKM) within each sample to account for differences in transcript length and sequencing depth.
Differentially expressed genes between groups were determined using the R package limma (61) after trimmed mean of M-values normalization, which implements an empirical Bayesian approach to estimate gene-expression changes using moderated t tests.
Bioinformatic Processing of Human PDAC scRNA-seq Data
Data from 24 patients with PDAC and 11 control pancreas tissues was obtained from the Genome Sequence Archive under project PRJCA001063 in FASTQ format. Single-cell RNA-seq data for each patient was processed with cellranger count (Cell Ranger 3.0.2; 10x Genomics) using standard parameters and supplying a custom reference package based on human reference genome GRCh38 and GENCODE gene models. Samples from 22 patients and 11 control tissues for which the correct chemistry was detected by Cell Ranger from the sequencing data were used for downstream analysis.
Subsequent data analysis was carried out in R 3.5.1 and the Seurat package (v 3.0.2). Cells with at least 300 measured transcribed genes per cell were considered for analysis. To remove noise from droplets containing more than one cell, we focused on cells with at most 6,000 measured genes. Dead cells were excluded by retaining cells with less than 15% mitochondrial reads leaving 84,276 cells for final analysis. Subsequently, data was normalized to log(CPM/100+1) and scaled regressing out the number of distinct UMIs and the fraction of mitochondrial reads during scaling.
Dimensionality reduction was carried out with the Seurat package. Prior to PCA, we identified the 2,000 most variable genes and applied PCA to cells in this gene space. Principal components 1 to 20 were provided as an input for dimensionality reduction via UMAP with default parameters. Clusters of cells were identified on the basis of a shared-nearest neighbor graph between cells and the smart moving (SLM) algorithm (resolution = 0.1). Markers for each cluster were identified by reducing the number of candidate genes to those genes which were (i) at least log(0.25)-fold higher expressed in the cluster under consideration compared to all other clusters and (ii) expressed in at least 10% of cells in the cluster under consideration. For genes passing those criteria, significance between cells in the cluster versus all other cells was calculated using Wilcoxon rank sum test and adjusted with the Benjamini–Hochberg method (62). Average expression within individual clusters was calculated with the AverageExpression function in Seurat and subsequently z-score transformed for each gene. The minimum spanning to infer global lineage structure of CAFs was calculated using Slingshot (19) with default parameters and defining normal fibroblasts (leftmost population of PC1) as starting and TGFβ CAFs (rightmost population of PC1) as end point. KRASG12X mutations in each cell were manually identified from individual reads in BAM alignment files visualized via Integrative Genomics Viewer (63) and assigned to a cell via the CB tag in the BAM file. Copy-number alterations were inferred from single-cell RNA-seq data with the CONICS R package (64) using nonmalignant acinar cells as reference cells. The default filtering and normalization procedures were followed, as outlined in https://goo.gl/tFYLEh.
Bioinformatic Processing of Human HNSCC scRNA-seq Data
Normalized data from 18 patients with HNSCC (43) was obtained from the Gene Expression Omnibus (GEO; GSE103322) as log(CPM/10+1) transformed gene-by-cell count matrix. Annotations of cell types (malignant/nonmalignant, as well as immune and stromal cell types for nontumor cells) for each cell were downloaded from the same GEO repository. Data was scaled regressing out the number of distinct UMIs and the different usage of enzymes for scRNA-seq library preparation during scaling. Dimensionality reduction was carried out with the Seurat package. Prior to PCA, we identified the 2,000 most variable genes and applied PCA to cells in this gene space. Principal components 1 to 30 were provided as an input for dimensionality reduction via UMAP with default parameters in Seurat (v3.0.2). Clusters of cells were identified based on a shared-nearest neighbor graph between cells and the smart moving (SLM) algorithm (resolution = 0.4).
Bioinformatic Processing of Mouse KPC scRNA-seq Data
Normalized fibroblast-enriched data from 4 KPC mice (36) was obtained from GEO (GSE129455) as log (number of UMIs in each cell is equal to the median UMI count across the dataset) transformed gene-by-cell count matrix. Ensembl IDs were converted to gene names using the biomart R package (65). Data were scaled, regressing out the number of distinct UMIs during scaling. Dimensionality reduction was carried out with the Seurat package. Prior to PCA, we identified the 3,000 most variable genes and applied PCA to cells in this gene space. Principal components 1 to 20 were provided as an input for dimensionality reduction via UMAP with default parameters in Seurat (v3.0.2). Clusters of cells were identified on the basis of a shared-nearest neighbor graph between cells and the smart moving (SLM) algorithm (resolution = 0.2).
Bioinformatic Processing of Human Bulk RNA-seq Data
Comparisons of normalized LRRC15 levels between tumors from TCGA and their host tissues from GTEx were retrieved from the GEPIA (66) web platform and filtered for tumor types with significant differences between normal and tumor tissue. Raw expression counts per sample of microdissected stroma (n = 122) and tumor (n = 65) samples (42) were downloaded from GEO (GSE93326). Raw expression counts per sample of normal pancreas RNA-seq (n = 247) was downloaded from GTEx (67). Both datasets were normalized to Log2(CPM+1) and heat maps were generated using the pheatmap R package (https://cran.r-project.org/web/packages/pheatmap/) using complete linkage clustering and with Euclidean distance as distance measure.
For pan-cancer TCGA data analysis, the TCGA Pan-Cancer (PANCAN) batch effects normalized mRNA data was downloaded from the UCSC XenaBrowser (https://xenabrowser.net) providing a gene by samples matrix of log2(norm_value+1) counts table and patient metadata. From the initial set of n = 11,060 samples, we only utilized those samples that were annotated as “Primary Tumor” or “Additional - New Primary” in the metadata table resulting in 9,712 samples from 31 different cancer types.
Analysis of Immunotherapy Trial Data
Whole-transcriptome data from patients enrolled in anti–PD-L1 (atezolizumab) immunotherapy trial IMvigor210 (NCT02951767, NCT02108652; ref. 68), were generated as described previously (32). Data from anti–PD-L1 (atezolizumab) immunotherapy trial PCD (NCT01375842) were generated as given in ref. 69. For calculation of LRRC15 CAF scores from expression data, the signature was computed by using the eigenWeightedMean method from the MultiGSEA R package (https://github.com/lianos/multiGSEA). Briefly, the expression of each gene in a signature is first z-score transformed. Then, a PCA was performed; weights for the genes were calculated by the percent of which they contribute to the first principal component indicated by eigengene. Finally, a weighted average per sample is calculated as the final score. This approach has the advantage of focusing the score for the set on the largest block of well-correlated (or anticorrelated) genes in the set, while downweighting contributions from genes that do not track with other set members.
For survival analysis, patients were split into low and high expression groups by median. Kaplan–Meier curves were generated using the RMS R package (http://biostat.mc.vanderbilt.edu/wiki/Main/Rrms).
The single-cell RNA-seq data from mouse PDAC KPP GEMMs are available from the ArrayExpress database (http://www.ebi.ac.uk/arrayexpress) under accession number E-MTAB-8483.
Disclosure of Potential Conflicts of Interest
All authors except for C. Klijn are employees of Genentech. C. Klijn is an employee of Genmab.
Conception and design: C.X. Dominguez, S.J. Turley
Development of methodology: C.X. Dominguez, H. Koeppen
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C.X. Dominguez, S. Keerthivasan, H. Koeppen, S. Gierke, B. Breart, O. Foreman, T.W. Bainbridge, A. Castiglioni, Z. Madrusan, Y. Liang, M.R. Junttila
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): C.X. Dominguez, S. Muller, S. Keerthivasan, H. Koeppen, J. Hung, S. Gierke, O. Foreman, Y. Senbabaoglu, M.R. Junttila, R. Bourgon, S.J. Turley
Writing, review, and/or revision of the manuscript: C.X. Dominguez, S. Muller, S. Keerthivasan, H. Koeppen, J. Hung, A. Castiglioni, M.R. Junttila, C. Klijn, R. Bourgon, S.J. Turley
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): S. Keerthivasan, A. Castiglioni
Study supervision: M.R. Junttila, C. Klijn, S.J. Turley
Our flow cytometry core group performed all the cell sorting described in this study. Our in-house genotyping and murine reproductive technology core groups were invaluable for the work done in this study. Thanks to Alex T. Ritter for his advice on the construction of the retroviruses used in this project. Thanks to Jeffrey Eastham-Anderson for his expertise and advice in automated quantification of imaging data. Thanks to members of the Turley laboratory for their scientific input and discussion.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.