A PRRX1 Signature Identifies TIM-3 and VISTA as Potential Immune Checkpoint Targets in a Subgroup of Microsatellite Stable Colorectal Cancer Liver Metastases

Disease recurrence and drug resistance are major challenges in the clinical management of patients with colorectal cancer liver metastases (CLM), and because tumors are generally microsatellite stable (MSS), responses to immune therapies are poor. The mesenchymal phenotype is overrepresented in treatment-resistant cancers and is associated with an immunosuppressed microenvironment. The aim of this work was to molecularly identify and characterize a mesenchymal subgroup of MSS CLM to identify novel therapeutic approaches. We here generated a mesenchymal gene expression signature by analysis of resection specimens from 38 patients with CLM using ranked expression level of the epithelial-to-mesenchymal transition–related transcription factor PRRX1. Downstream pathway analysis based on the resulting gene signature was performed and independent, publicly available datasets were used to validate the findings. A subgroup comprising 16% of the analyzed CLM samples were classified as mesenchymal, or belonging to the PRRX1high group. Analysis of the PRRX1 signature genes revealed a distinct immunosuppressive phenotype with high expression of immune checkpoints HAVCR2/TIM-3 and VISTA, in addition to the M2 macrophage marker CD163. The findings were convincingly validated in datasets from three external CLM cohorts. Upregulation of immune checkpoints HAVCR2/TIM-3 and VISTA in the PRRX1high subgroup is a novel finding, and suggests immune evasion beyond the PD-1/PD-L1 axis, which may contribute to poor response to PD-1/PD-L1–directed immune therapy in MSS colorectal cancer. Importantly, these checkpoints represent potential novel opportunities for immune-based therapy approaches in a subset of MSS CLM. Significance: CLM is an important cause of colorectal cancer mortality where the majority of patients have yet to benefit from immunotherapies. In this study of gene expression profiling analyses, we uncovered novel immune checkpoint targets in a subgroup of patients with MSS CLMs harboring a mesenchymal phenotype.

Introduction mesenchymal phenotype has not been well characterized, and a more detailed understanding of this subgroup might reveal novel therapeutic opportunities.
In this work, we aimed to investigate whether a mesenchymal phenotype subgroup could be identified by analysis of tumor samples from a cohort of patients with resectable CLM. An intrinsic molecular signature was generated on the basis of gene expression levels of the EMT-related transcription factor PRRX. A CLM subgroup that exhibited high expression of PRRX signature genes was identified and the findings were validated in independent CLM cohorts.
The signature was associated with an immune-inflamed phenotype, harboring features of immune activation and suppression, and revealed potential novel targets for immune based therapies in patients with microsatellite stable (MSS) CLM.

Patients and CLM Samples
Metastatic tumor samples were collected at the time of CLM surgery from the first 71 patients enrolled in the Oslo-COMET trial (NCT01516710; refs. 7,8). Of these, 33 cases were excluded from gene expression analysis for the following reasons: unresectable tumors (n = 2), benign lesions (n = 4; 2 hemangiomas, 1 focal nodular hyperplasia, and 1 fatty infiltration), no tissue for biobanking (n = 9), not analyzed (n = 1), inadequate tumor content (<10%; n = 9), and inadequate RNA quality (n = 8), leaving 38 CLM cases for analysis of which two metastatic lesions were available from six cases. The study was approved by the Regional Committee for Health and Research Ethics in Norway (2011/1285/REK Sør-Øst B), and written informed consent was required for participation. Median follow-up time was 66 months (95% confidence interval, 65-69) from CLM resection. Tumor tissue samples were snap frozen in liquid nitrogen immediately after resection and stored at −80°C. Two frozen sections per tumor sample were assessed for tumor content by the study pathologist (K. Grzyb) using routine diagnostic hematoxylin and eosin stains. Samples with tumor content 10%-100% (median 63%) were homogenized and aliquoted for further analysis. C-reactive protein (CRP) was measured in patient plasma as part of preoperative routine analysis. Date of death was obtained from the Norwegian National Registry.

Mutation Analysis
Targeted next-generation sequencing has previously been described for this cohort (8). In brief, DNA was isolated by AllPrep DNA/RNA MiniKit (Qiagen) using the QiaCube system according to the manufacturer's instructions and quantified by NanoDrop ND-1000 (Thermo Fisher Scientific). Sequencing was conducted by using the Ion AmpliSeq Cancer Hotspot Panel (v2) covering mutational hotspots in 50 cancer-related genes and following the manufacturers' protocol (Life Technologies). Data from the PGM runs were processed by The Torrent Suite Variant Caller using panel customized parameters as provided by Life Technologies and variants considered true passed quality control measures of minimum 500× coverage and at least 2% frequency.

Microarray Gene Expression Analysis
Generation of microarray data has been described previously (8). Briefly, total RNA from fresh-frozen samples was isolated using TRIzol reagent (Invitrogen). All subsequent experimental procedures, including labeling, hybridization, and scanning, were processed according to the standard Affymetrix protocols associated with application of Agilent SurePrint G3 Human Gene Expression 8 × 60K arrays. The gene expression data were preprocessed with Agilent's Feature Extraction Software (v10.7.3.1) and quantile normalized and log 2 transformed with R software. Probe sets representing unique genes were kept for analysis. When there were multiple probes per gene, the probe with the highest expression level was chosen.

Construction of the PRRX1 Signature
The variance in gene expression of the EMT-related transcription factors ZEB, ZEB, SNAI, SNAI, TWIST, TWIST, and PRRX was analyzed to identify a factor with high variance which would enable stratification of the cohort by expression rank. Upon identifying PRRX as the transcription factor exhibiting the highest variance, a quantile-based selection of the top and bottom 25% of the samples from unique patients was performed to form two groups (n = 10/group) for differential gene expression analysis. A significance analysis of microarrays (SAM) using J-Express software (http://www.molmine.com/ JexpressMain.php) was performed contrasting these two groups and the resulting differentially expressed genes (DEG) with a FDR ≤ 0.001 were included in the mesenchymal signature. Hierarchical clustering was performed in R using the "heatmap.plus" package and using average linkage method with Euclidean distance. A per-sample signature score was generated by calculating the average log 2 expression of all upregulated genes and used for subsequent correlation analysis with expression of key immune checkpoint genes. Subgroups of CLM samples were defined by major subbranches of the cluster dendogram and applied in subgroup statistical comparisons.

Pathway Analysis and Estimation of Immune Cell Infiltration
Upregulated and downregulated genes with fold change data from the SAM analysis were uploaded into Ingenuity Pathway Analysis (IPA) software (Ingenuity Systems, www.ingenuity.com) for pathway enrichment analysis and functional annotation. Significance of each pathway and functional group was assessed by IPA using the Fisher's exact tests (P ≤ 0.05). Upstream transcriptional regulation was predicted by IPA through the activation z-score statistic where the predicted regulatory relationships are associated with a direction of change that is either activating (z-score ≥ 2) or inhibiting (z-score ≤ −2).
The functional annotation in IPA was run selecting "immune cells, liver and colorectal cancer cell lines" in the tissue/cell parameter setting to focus on the immunologic consequences of the mesenchymal phenotype. TIMER, a webbased open access resource (https://cistrome.shinyapps.io/timer/) was used to conduct deconvolution of infiltrating immune cell types based on transcriptomic data. TIMER estimated the abundances of six tumor-infiltrating cell populations including CD4 + T cells, CD8 + T cells, B cells, macrophages, neutrophils, and dendritic cells (DC).

Reverse Phase Protein Array Analysis
Profiling of 295 cancer relevant proteins of which 63 were in a phosphorylated state (Supplementary Data S9) was performed at the reverse phase protein array (RPPA) core facility at MD Anderson Cancer Center (Houston, TX). Tissue from CLM (n = 30) was lysed then adjusted to 1 mg/mL concentration as assessed by bicinchonic acid assay and boiled with 1% SDS and 2-mercaptoethanol. Supernatants were manually diluted in five 2fold serial dilutions with lysis buffer. The samples were spotted onto and immobilized on nitrocellulose-coated slides. The slides were probed with antibodies using a tyramide-based signal amplification approach and visualized by 3,3ʹ diaminobenzidine tetrahydrochloride (DAB) colometric reaction. Slides were scanned, analyzed, and spots quantitated using MicroVigene software (VigeneTech Inc.). Relative protein concentrations were derived from the supercurve for each sample by curve fitting using the R package "SuperCurve" (version 1.01). All the values were log 2 transformed and median centered across each antibody. Differential protein expression analysis was performed as described for transcriptomic data.

Validation of the PRRX1 Signature
First, we validated that the PRRX signature captures mesenchymal biology; second, the stratification performance of the signature and main biological findings were validated in three independent datasets. For the first approach, three public EMT/mesenchymal signature gene sets generated from meta-analyses across cancer types were curated (https://www.gsea-msigdb. org/gsea/msigdb/cards/HALLMARK_EPITHELIAL_MESENCHYMAL_ TRANSITION.html; refs. 9, 10) and used to validate the presence of a subgroup with mesenchymal biology in the COMET cohort as identified by the PRRX signature. From the meta-analysis published by Tan and colleagues (10), the gene subset exclusive to the mesenchymal phenotype was applied. The public gene sets are listed in Supplementary Data S5.
For external validation of the gene signature, the GSE41258, GSE10961, and GSE41568 datasets were downloaded from the NCBI Gene Expression Omnibus (GEO) using R's GEOquery package. GSE41258 consists of Rosetta/Merck Human RSTA Custom Affymetrix 2.0 microarray data from primary and mCRC samples, of which 21 CLM samples (11) were selected for our validation analyses. The GSE10961 dataset consisted of 18 CLM profiled on the Affymetrix Human Genome U133 Plus 2.0 Array (12). From GSE41568, we extracted 80 CLM (Affymetrix Human Genome U133 Plus 2.0 Array; ref. 13). For all three datasets, probes were matched to gene symbols using the annotation files provided by the manufacturer. The probe with highest expression was chosen if multiple probes matched to the same gene symbol. Hierarchical clustering was performed for all three datasets using the PRRX signatures genes present in the external data which was array type dependent. From GSE10961 and GSE41568 expression data, PRRX signature score per sample was calculated as described above.

Statistical Analyses
Statistical analyses were conducted using the R Statistical Computing environment v.3.4.1 with exception of survival analysis where IBM SPSS statistics software was applied. Multiple groups were compared by one-way ANOVA.
Unpaired two-tailed t test was used for pairwise comparisons. Pearson correlation was computed to correlate between PRRX signature score and expression of immune markers. The Fisher's exact test was applied for comparison of categorical clinical and mutational data between groups. A value of P ≤ 0.05 was considered statistically significant. Overall survival was calculated from CLM resection to date of death or censor date (December 31, 2017). HR was calculated using Cox proportional hazards analysis and was reported with 95% confidence interval.

Data Availability
The data generated in this study are available upon request. The data that further support the findings of this study were obtained from NCBI GEO at GSE41568, GSE10961, and GSE41568.

Patients
The 38 patients included in the analysis had a mean age of 66 years (min-max 46-81 years), 16 (42%) were women and 22 (58%) men. The primary tumor was located in the right colon in 10 cases (26%), in the left colon/rectum in 28 (74%). Twelve patients had received neoadjuvant chemotherapy. The frequency of colorectal cancer relevant mutations across the cohort has been published previously (8). To exclude systemic inflammation as a confounding factor contributing to the immune profile of the PRRX high group, serum levels of CRP at the time of CLM resection were evaluated. Generally, the levels were low (median = 2.8 mg/L; min-max 0.6-45 mg/L; where nine cases had CRP levels above the 5 mg/L threshold value). All included patients had MSS tumors (14).

The PRRX1 Signature and Cohort Subgroups
The PRRX transcription factor displayed the highest expression variance of the investigated EMT-related transcription factors with a continuous distribution across the cohort between extreme high/low values (Fig. 1A) and was applied to distinguish high and low expression subgroups for differential expression analysis assuming enrichment of the mesenchymal phenotype in the high expression subgroup. When comparing the high PRRX gene expression group (Q4 = top 25% of the samples) with the low PRRX expressing group (Q1 = bottom 25% of the samples; Fig. 1B), 405 DEGs were identified (FDR of <0.1%; Supplementary Data S1). Hierarchical clustering using the PRRX DEGs split the samples into two distinct clusters defined by two major subbranches of the dendrogram (Fig. 1C). Cluster 1 (right major subbranch) showed higher expression of the PRRX signature genes (more mesenchymal) and Cluster 2 (left major subbranch) showed low expression (more epithelial). Each cluster showed further subbranching into two subclusters, reflecting an overall distribution of PRRX high and inverse PRRX low phenotypes at each end of a spectrum of intermediate phenotypes (Fig. 1C). Four subgroups were defined for downstream analyses based on the dendogram: PRRX high (n = 7), PRRX int1 (n = 13), PRRX int2 (n = 10), and PRRX low (n = 14; Supplementary Data S2). Three outlier samples were merged into the PRRX low subgroup as their signature scores matched the score range of this subgroup. The cohort included CLM pairs from 6 patients and no pairs were split into separate subgroups defined by the clustering. The PRRX high subgroup, constituting 16% of the cohort samples, displayed the highest expression of signature genes (mesenchymal phenotype; Fig. 1C) and was the focus group of downstream statistical comparisons.

AACRJournals.org
Cancer Res Commun; 3(2) February 2023 (FC range from −1.5 to −2.9) was TRAP, a member of the HSP protein family, which is associated with induction of EMT in ovarian cancer (21). Three zinc finger protein genes and SLCA (ZIP) which all have been associated with zinc homeostasis were also among the downregulated genes.

Validation of the PRRX1 Signature by Comparison with Other EMT/Mesenchymal Signatures
Three public EMT/mesenchymal signatures generated independently from meta-analyses across cancer types (  Fig. S1A-S1D).

Validation of the PRRX1 Signature in Independent Datasets
The discriminatory power of the PRRX signature was further validated in three independent cohorts of CLM patient samples ( Supplementary Fig. S2A    When Cluster 1 and 2 were compared using TIMER analysis in GSE10961 and GSE41568, similar enrichment of immune cell subsets was observed in Cluster 1 as in the PRRX high subgroup in the COMET cohort ( Supplementary Fig. S2D and S2G).

Validation of the PRRX1 Signature by Protein Expression Data
Limited overlap existed between RPPA antibody targets and PRRX signature genes; reflecting the available version of RPPA. Eight PRRX signature genes were represented by RPPA data, of which five proteins showed significant variance and higher expression in the PRRX high subgroup in accordance with gene expression analysis (Fig. 4A).

Associations between the PRRX1 high Signature and Clinicopathologic Features and Long-term Outcome
All patients in the PRRX high subgroup were women (P = 0.004), and the

Discussion
In this work, taking an unbiased approach, the expression of a number of EMTrelated transcription factors was analyzed and PRRX was identified as the factor exhibiting the largest expression variance in our cohort of CLM cases. Differential gene expression analysis was then applied to identify PRRX coexpressed genes which were used to generate the PRRX signature and define the PRRX high subgroup. Importantly, the signature was validated by applying public pan cancer EMT signatures to our data (reidentifying the PRRX high subgroup), by applying the PRRX signature to three independent CLM datasets, and by protein expression analysis, supporting the reproducibility and relevance of the signature. Not only was a PRRX high subgroup validated in all the investigated CLM datasets; the identified subgroup exhibited high expression of the same checkpoint molecules and a similar enrichment of immune cells across the analyzed datasets. The signature validation lends confidence in the analytic approach and supports the clinical validity of the PRRX signature.
The mesenchymal phenotype is a fundamental feature of primary colorectal cancer classification, and is strongly associated with poor prognosis, tumor recurrence, and drug resistance (5,22,23  that the mesenchymal phenotype can also be detected in the metastatic setting in CLM samples. Acquisition of a mesenchymal phenotype has been associated with T-cell dysfunction through increased expression of checkpoint molecules (24)(25)(26), and a pan-cancer EMT signature was previously suggested as a tool to select patients that might benefit from immune checkpoint inhibition (26). Therefore, identification of an equivalent subgroup in mCRC could be of clinical relevance with respect to new target discovery within immune checkpoint-directed therapy which could particularly benefit patients that have developed resistance to chemotherapy.
Functional annotation of the PRRX signature genes revealed a strong association with immune-related genes and processes, displaying both antitumor (Th1) and tumor-tolerant (Th2) responses, which was consistent with the predicted enrichment of immune cells in the PRRX high subgroup. Applying the TIMER algorithm, recruitment of several distinct immune cell types was estimated, with a higher relative abundance of DCs, neutrophils, macrophages, and T cells (CD4 + and CD8 + ) in the PRRX high subgroup. Balancing the immune activation driven by IFNγ signaling, TGFβ was predicted as upstream regulator of DEGs representing immune suppressive molecules upregulated in the PRRX high subgroup. Interestingly, the PRRX signature was positively correlated with expression of CD, an established marker of the immune suppressive M2-polarized macrophage population, which has been associated with induction of EMT (27,28). Despite the apparent immune suppression, the evidence of a preexisting immune activation is of importance as it could in principle be reactivated by appropriate immune modulating therapies.
Tumor immune escape from an ongoing immune activation can upregulate immune checkpoint expression. Targeting immune tolerance via coinhibitory checkpoint molecules to restore cytotoxic T-cell function forms the basis of current immune therapies. The immune profile associated with the PRRX signature suggested the presence of dysfunctional T cells with checkpoint molecule involvement, in addition to an imbalance between protumor and antitumor immune cells as mentioned above. The mechanistic link between the PRRX-driven mesenchymal phenotype and checkpoint molecule upregulation remains to be identified as this cannot be established from our gene expression dataset alone. Methods such as spatial transcriptomics and single-cell analysis would be logical steps to validate the signature and further investigate signature contribution from both tumor and stromal components. However, the strong correlation between the PRRX signature and expression of immune checkpoint molecules is in striking accordance with the findings of a recent study of patients with metastatic gastric adenocarcinoma where the mesenchymal-like tumor subgroup had high expression of HAVCR/TIM- and VISTA (29). Furthermore, TIM- gene expression was identified as a top contributing factor to the distinct clustering of an EMT-high colorectal subgroup in a pan-cancer study (30). The clinical significance of identifying HAVCR/TIM- and VISTA associated with the PRRX signature score is thus dual. These checkpoints represent potential resistance markers to the widely applied immune therapies targeting the PD-1/PD-L1 axis, and their high expression could contribute to explain inherent resistance to this approach in MSS mCRC. HAVCR2/TIM-3 is an emerging clinical target in the cancer immune landscape along with VISTA. HAVCR/TIM- has been reported to be upregulated in response to PD-1 blockade in various cancer models (31,32), and overexpression of HAVCR/TIM- and VISTA has been associated with lack of response to anti-PD-1/PD-L1-based therapies (33). HAVCR2/TIM-3 and VISTA therefore represent alternative candidate immune targets that based on our results could be explored in patients of the PRRX high subgroup. There are currently a number of signal-seeking early phase clinical trials evaluating anti-TIM3 antibodies in advanced cancer patients as monotherapy or in combination (e.g., NCT03489343, NCT03680508, and NCT02817633). Similarly, VISTA, which also has entered early trial phase (e.g., NCT04475523, NCT04564417), is a particularly interesting cotarget due to its expression on both exhausted T cells and infiltrating suppressive myeloid cells which may differentiate into tumor-associated macrophages (34). Targeting VISTA may potentially reduce populations of infiltrated immune suppressive cells which may be required for restoring T-cell effecter function by anti-HAVCR2/TIM-3 blockade.
An important limitation of this work is related to a limited sample size, but this is partly compensated by the validation analyses. Also, the selective analysis of resected liver metastases in the COMET trial may have implications for the representativeness of the findings. However, for ethical reasons, larger series of biopsy samples from nonresectable cases are not available, leaving researchers to base analyses on resected samples. In addition, the analyses were performed using transcriptional data generated from bulk tissue, limiting analysis of which cell type has contributed to a specific profile to the application of deconvolution methods, thereby limiting interpretation regarding underlying biological mechanisms. In future studies, analyses including all PRRX subgroups could be extended to include more advanced antibody-based cytometry analyses using live cell suspensions or intact tissues to gain further resolution of functional immune subtypes present in the CLM mesenchymal phenotype. Furthermore, multi-marker detection methods could be used to validate protein expression and place TIM-3 and VISTA expression into context of relevant immune markers for a more comprehensive understanding.
Although the starting point for this work was an explorative study of limited sample size, the results regarding checkpoint molecule expression and predicted immune cell enrichment in the PRRX high subgroup were convincingly and reproducibly validated in three independent CLM cohorts (totaling 119 cases). The uncovered biology provides rationale for incorporating immune modulating therapy tailored to a specific CLM patient subgroup defined by the PRRX signature and suggests further exploration of the novel immune checkpoints HAVCR2/TIM-3 and VISTA in MSS mCRC. The PRRX signature may help identify patients with CLM likely to benefit from immune-based therapies directed at these targets and points to an opportunity for expanding the use of immune therapy strategies to patients with mCRC beyond the MSI subgroup. Our next goal will be to further develop the PRRX signature for clinical utility as a predictive biomarker, using feature reduction tools and test in biopsy samples from our ongoing immune therapy trial in MSS mCRC (NCT03388190).