Abstract
Purpose: Protein expression in formalin-fixed, paraffin-embedded tissue is routinely measured by IHC or quantitative immunofluorescence (QIF) on a handful of markers on a single section. Digital spatial profiling (DSP) allows spatially informed simultaneous assessment of multiple biomarkers. Here we demonstrate the DSP technology using a 44-plex antibody cocktail to find protein expression that could potentially be used to predict response to immune therapy in melanoma.
Experimental Design: The NanoString GeoMx DSP technology is compared with automated QIF (AQUA) for immune marker compartment-specific measurement and prognostic value in non–small cell lung cancer (NSCLC). Then we use this tool to search for novel predictive markers in a cohort of 60 patients with immunotherapy-treated melanoma on a tissue microarray using a 44-plex immune marker panel measured in three compartments (macrophage, leukocyte, and melanocyte) generating 132 quantitative variables.
Results: Spatially informed measurement by DSP was validated against QIF by both regression and prognostic assessment for stromal CD3, CD4, CD8, CD20, and PD-L1 in NSCLC. From the 132 variables, 11 and 15 immune markers were associated with prolonged progression-free survival (PFS) and overall survival (OS), respectively. Notably, we find that PD-L1 expression in CD68-positive cells (macrophages), and not in tumor cells, was a predictive marker for PFS, OS, and response.
Conclusions: DSP technology shows high concordance with QIF and is validated by both regression and outcome assessment. Using the high-plex capacity, we found a series of expression patterns associated with outcome, including an association between PD-L1 expression in macrophages and response.
This article is featured in Highlights of This Issue, p. 5429
The NanoString digital spatial profiler (DSP) is a novel platform that offers high-plex quantitative measurement of immune markers within specific regions of interest. In this study, the DSP technology was shown to have high concordance with the AQUA method of quantitative immunofluorescence in non–small cell lung cancer for most markers. The platform was then used to profile pretreatment biopsies from immunotherapy-treated melanomas by measuring 44 immune markers simultaneously in macrophage, leukocyte, and tumor compartments, from which 26 predictive markers of response and survival were identified, demonstrating the discovery potential of the platform. Most notably, we found that PD-L1 in macrophages, but not melanocytes, shows association with response to immunotherapy. While further work is needed to validate this observation and to translate the assay for use in a CLIA laboratory, this biomarker could improve the sensitivity and specificity of the current PD-L1 IHC companion diagnostic test for immunotherapy.
Introduction
Immune checkpoint inhibitors (ICI) have dramatically changed the treatment landscape of many tumor types and altered therapeutic paradigms after the discovery of the immune checkpoint receptor programmed death 1 (PD-1) and its activator ligand, programmed death ligand-1 (PD-L1). PD-1 is expressed on the surface of tumor-infiltrating lymphocytes (TIL) and engages PD-L1 on tumor cells and/or other immune cells, and this interaction has been shown to be a major immune inhibitory mechanism in the tumor microenvironment (TME; refs. 1, 2). Although regulation of the PD-1/PD-L1 pathway is a well-characterized immune evasion mechanism, it has been reported that PD-1 checkpoint blockade mediates immune resistance in less than 40% of malignancies (3–5). Tumor PD-L1 expression has been shown to predict response to immunotherapy (6, 7), although even with selection, the majority of patients fail to respond to PD-1 inhibitors (8). Furthermore, patients with low tumor PD-L1 expression have also been reported to have durable responses (9).
IHC analysis of formalin-fixed, paraffin-embedded (FFPE) patient tissue is currently the only companion diagnostic test in clinical practice. Despite its widespread use, its sensitivity, specificity, and reproducibility are suboptimal, and it offers limited information about the complexity of the TME. In light of the toxicity and high cost of checkpoint inhibitors, there is a need for biomarkers that can more accurately select patients who will benefit from immunotherapy (7, 10–14). To optimize patient stratification, some recent studies have focused on the assessment of multiple variables to create signatures or a scoring system that takes into account transcriptomic data, tumor mutational burden, and/or TIL infiltration (15–17). Furthermore, assays that measure the expression of multiple immune markers can be used to reveal underlying mechanisms of tumor immune evasion in the TME, which may lead to the development of novel therapeutic strategies.
Quantitative immunofluorescence (QIF) is a technique that enables spatially resolved multiplexed target measurement on a single FFPE tissue slide but is limited by the number of fluorescence channels that can be utilized (18). NanoString DSP is a novel platform that offers nondestructive simultaneous high-plex quantitative measurement of biomarkers on a single FFPE tissue section within specific regions of interest. Regions of interest can be manually or molecularly defined. These features make the DSP platform well suited to discovery of single biomarkers or multiplexed signature development where target localization is important. In this study, we validate the quantitative localized measurement capability of NanoString DSP using automated QIF (AQUA) as a criterion standard for immune marker compartment-specific measurement. In addition, we assess their agreement on the prognostic value of CD3, a known prognostic biomarker of survival in non–small cell lung cancer (NSCLC; ref. 19). Then we explore the predictive value of a 44-plex panel of immune markers in a cohort of immunotherapy-treated melanoma patients. We identify PD-L1 expression in the macrophages, but not in the tumor, as the parameter that is most predictive of outcome. Furthermore, we identify an additional 26 potential biomarkers, illustrating the potential of the platform for discovery of novel biology and biomarkers.
Materials and Methods
Tissue microarray and patient cohorts
Tissue specimens were prepared in a tissue microarray (TMA) format as described previously (20). Representative tumor areas were obtained from FFPE specimens and 0.6-mm cores from each tumor block were arrayed in a recipient block. FFPE cell line pellets, tonsil, and placenta were used as controls. YTMA 356 (Yale cohort A) is an NSCLC array that consists of primary tumors resected between 2010 and 2014 from 43 patient cases that received EGFR tyrosine kinase inhibitors (TKI) at some point after resection. It includes 36 EGFR mutant, six EGFR wild-type, and one of unknown mutation status patient tumors, as well as 14 cell line cores (H820, H1648, H1993, H441, H1299, A431, A549, H2882, HCC193, HT29, PC9, MCF7, SKBR3, and H1355). A previously described cohort (21), YTMA 79 (Yale cohort B), consists of 202 FFPE primary NSCLC tumors from patients seen at the Pathology Department of Yale University (New Haven, CT) between 1988 and 2003. YTMA 376 (Yale cohort C) consists of 60 FFPE melanoma patient tumors resected between 2011 and 2016, initially described by Wong and colleagues (22). The data cutoff date was September 1, 2017, and the median follow-up time was 20.1 months. All patients in this cohort received ICIs after specimen collection. All cohorts consist of retrospectively, serially collected tumors without stratification or matching, and clinicopathologic information was collected from clinical records and pathology reports. Detailed characteristics of each cohort are presented in Supplementary Tables S1–S3.
All tissue was used in accordance with U.S. Common Rule after approval from the Yale Human Investigation Committee protocol #9505008219 with an assurance filed with and approved by the U.S. Department of Health and Human Services. Approval includes informed written consent or in some cases waiver of consent.
QIF
Quantitative measurement of PD-L1 and TIL markers was performed using the AQUA method (Navigate Biopharma), quantifying fluorescent signal within subcellular compartments, as described previously (23). A tumor mask was created by binarizing the cytokeratin signal and creating an epithelial/melanocyte compartment. When this compartment size was less than 2% of the area of the TMA spot, the spot was excluded. Stroma was defined as the remaining area with positive DAPI staining. The QIF score was calculated by dividing the target pixel intensity by the area of the compartment. QIF scores were normalized to the exposure time and bit depth at which the images were captured, allowing scores collected at different exposure times to be comparable.
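The scoring and exclusion rules described above can be illustrated with a minimal sketch. It assumes a TMA spot is represented as nested lists of pixel intensities with a matching binary compartment mask. The function names (`compartment_area`, `qif_score`, `passes_tumor_area_filter`) are hypothetical and are not part of the AQUA software, and the exposure-time and bit-depth normalization is left out because its exact form is not given here.

```python
def compartment_area(mask):
    """Number of pixels assigned to the compartment (truthy mask values)."""
    return sum(1 for row in mask for inside in row if inside)


def qif_score(target, mask):
    """Summed target intensity inside the compartment divided by its area.

    Returns None for an empty compartment so the caller can skip the spot.
    """
    area = compartment_area(mask)
    if area == 0:
        return None
    total = sum(
        value
        for target_row, mask_row in zip(target, mask)
        for value, inside in zip(target_row, mask_row)
        if inside
    )
    return total / area


def passes_tumor_area_filter(tumor_mask, min_fraction=0.02):
    """Keep a spot only if the tumor mask covers at least 2% of its pixels.

    Assumption: the spot area is taken to be the full pixel grid.
    """
    spot_area = sum(len(row) for row in tumor_mask)
    return compartment_area(tumor_mask) / spot_area >= min_fraction
```

In this sketch the stroma score would be obtained by calling `qif_score` with a second mask covering the DAPI-positive pixels outside the tumor mask.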
DSP
The NanoString DSP technology allows spatially defined collection of oligonucleotide tags that are cleaved from specific validated antibodies. The regions of interest may be user defined (drawn on an image) or molecularly defined using a fluorescence image of the same slide prior to collection. Here, FFPE tissue slides were incubated with cocktails of up to 44 unique oligonucleotide-conjugated antibodies (Supplementary Table S4). The compartments were identified by fluorescent imaging with antibodies targeting cytokeratin to detect the NSCLC tumor compartment, S100 with HMB45 for melanocytes, CD68 for macrophages, and CD45 for leukocytes. Target immune markers were measured by sequential compartment assignment of the macrophage, leukocyte, and finally tumor compartment. The selected compartments were chosen for high-resolution multiplex profiling, and oligos from the selected region were released upon exposure to UV light. Photocleaved oligos were then collected via microcapillary tube aspiration using an early version of the DSP platform (NanoString) robotic system and transferred into a microwell plate with a spatial resolution of approximately 10 μm. Photocleaved oligos from the spatially resolved compartments in the microplate were then hybridized to 4-color, 6-spot optical barcodes in the nCounter platform, enabling up to 800 distinctly labeled counts per compartment of the protein targets representing the antibodies to which the tags were originally conjugated. Digital counts from barcodes corresponding to protein probes were first normalized with internal spike-in controls to account for system variation, and then normalized to the area of their compartment.
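The two-step normalization described above (spike-in controls first, then compartment area) might be sketched as follows. The exact scaling used in the study is not specified, so this example assumes a scale factor equal to a reference level divided by the geometric mean of the spike-in counts for that collection. The names `normalize_counts` and `reference_level` and the example values are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive counts."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalize_counts(raw_counts, spike_in_counts, reference_level, area_um2):
    """Scale raw digital counts by spike-in controls, then divide by area.

    Assumption: the system-variation factor is reference_level divided by
    the geometric mean of this collection's spike-in control counts.
    """
    scale = reference_level / geometric_mean(spike_in_counts)
    return {
        target: count * scale / area_um2
        for target, count in raw_counts.items()
    }


# Hypothetical collection: spike-ins read at half the reference level,
# so raw counts are doubled before dividing by the compartment area.
normalized = normalize_counts(
    raw_counts={"CD8": 50, "PD-L1": 120},
    spike_in_counts=[100, 400],
    reference_level=400,
    area_um2=10.0,
)
```

Dividing by area after the spike-in scaling keeps the two corrections independent, so compartments of different sizes collected in the same run share one scale factor.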
Multiplexed TILs and TILs activation immunofluorescence staining
The multiplexing TIL protocol has been published previously (24). Briefly, tissue sections were subjected to the same deparaffinization, antigen retrieval, and blocking protocol mentioned above. Staining for pan-cytokeratin, CD4, CD8, and CD20 was performed using a sequential multiplexed immunofluorescence protocol with isotype-specific primary antibodies to detect epithelial tumor cells (cytokeratin: clone Z0622, Agilent), helper T cells (CD4 IgG, 1:100, clone SP35, Spring Bioscience), cytotoxic T cells (CD8 IgG1, 1:250, clone C8/144B, Agilent), and B lymphocytes (CD20 IgG2a, 1:150, clone L26, Agilent). Nuclei were highlighted using 4′,6-diamidino-2-phenylindole (DAPI). Secondary antibodies and fluorescent reagents used were goat anti-rabbit Alexa546 (Molecular Probes), anti-rabbit Envision (K4009, Agilent) with biotinylated tyramide/streptavidin-Alexa750 conjugate (PerkinElmer), anti-mouse IgG1 antibody (1:100, eBioscience) with fluorescein-tyramide (PerkinElmer), and anti-mouse IgG2a antibody (1:200, Abcam) with Cy5-tyramide (PerkinElmer). Residual horseradish peroxidase activity between incubations with secondary antibodies was eliminated by exposing the slides twice for 7 minutes to a solution containing benzoic hydrazide (0.136 g) and hydrogen peroxide (50 μL).
Staining for T-cell activation panel (25) included pan-cytokeratin, CD3, Ki67, and Granzyme B and was performed using a similar sequential multiplexed immunofluorescence protocol with isotype-specific primary antibodies to detect epithelial tumor cells (cytokeratin, clone Z0622, 1:100, Agilent), T lymphocytes (CD3 IgG, 1:100, clone SP7, Novus Biologicals), Ki67 (IgG1, 1:100, clone MIB-1, Agilent), and Granzyme B (IgG2a, 1:2,000, clone 4E6, Abcam). Fresh control slides from morphologically normal human tonsil were included in each staining batch as positive controls and to ensure reproducibility.
PD-L1 immunofluorescence staining
Tissue sections were subjected to the same deparaffinization, antigen retrieval, and blocking protocol mentioned above and incubated overnight with a cocktail of the primary target antibody, PD-L1 (9A11, Cell Signaling Technology) mouse mAb, and a cytokeratin antibody, rabbit polyclonal antibody (Z0622, Agilent). Next, sections were incubated for 60 minutes at room temperature with Alexa 546–conjugated goat anti-rabbit secondary antibody (Molecular Probes) diluted 1:100 in mouse EnVision Amplification Reagent (K4001, Agilent). Cyanine 5 directly conjugated to tyramide (FP1117, PerkinElmer), at a 1:50 dilution for 10 minutes, was used for target detection and ProLong Gold Mounting Medium (Molecular Probes) containing DAPI was used to stain nuclei. Control slides were run for reproducibility alongside each experimental slide-staining run.
Statistical analysis
The Pearson correlation coefficient (R) was used to assess the agreement between QIF scores and DSP counts from near serial sections of Yale cohort A (YTMA 356). Overall survival (OS) and progression-free survival (PFS) curves were constructed using Kaplan–Meier analysis with a follow-up of 60 months, and statistical significance was determined using the log-rank test. For the statistical analysis, the average of the NanoString counts from the two available cores of each case was used. All P values were based on two-sided tests, and P < 0.05 was considered statistically significant for median stratification. For markers stratified by any other cut-off point, the significance cutoff was set after Bonferroni correction for multiple comparisons. Specifically, for markers stratified by tertiles, P < 0.0167 was considered significant, while for quartile stratification a difference was considered statistically significant if P < 0.0083. Statistical analyses were performed using IBM SPSS Version 20 (IBM Corp.), JMP Pro software (version 11.2.0, 2014, SAS Institute Inc), and GraphPad Prism v6.0 for Windows (GraphPad Software, Inc). All tumor spots were visually evaluated, and cases with staining artifacts or less than 2% tumor compartment area were systematically excluded.
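The stated significance cutoffs are numerically consistent with dividing α = 0.05 by the number of pairwise comparisons among the stratification groups (one for a median split, three for tertiles, six for quartiles). The divisor is an inference from the reported thresholds and is not stated explicitly in the text. The short sketch below reproduces the arithmetic; the function name `bonferroni_threshold` is hypothetical.

```python
from math import comb


def bonferroni_threshold(alpha, n_groups):
    """Alpha divided by the number of pairwise group comparisons.

    Assumption: the correction counts every pair among n_groups strata,
    which matches the cutoffs reported for tertiles and quartiles.
    """
    return alpha / comb(n_groups, 2)


median_cutoff = bonferroni_threshold(0.05, 2)    # one comparison
tertile_cutoff = bonferroni_threshold(0.05, 3)   # three comparisons
quartile_cutoff = bonferroni_threshold(0.05, 4)  # six comparisons
```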
Results
DSP standardization to QIF and validation
To validate the NanoString DSP platform, we used the AQUA method of QIF as a comparison standard in Yale cohort A, which contains patients with NSCLC treated with EGFR TKIs. The compartment assignment method of the two assays is similar, as both use positive immunofluorescence signal to create compartments within a region of interest in which multiple targets are measured. Briefly, imaging of fluorophore-conjugated cytokeratin-specific antibody is used to create a binary mask, which directs UV light to only the tumor compartment within a field of view. DNA oligos are released from the oligo-conjugated antibodies via cleavage of the UV photocleavable linker, collected, hybridized to reporter probes, and counted as tumor markers on the NanoString nCounter System. The stroma compartment is collected by inverting the mask and collecting the remaining oligos within the compartment. Visually, the compartments created by both assays were found to be comparable (Supplementary Fig. S1). Regression of counts and QIF scores for multiple immune markers in tumor and stroma regions showed a high concordance between the two assays. Specifically, for CD3 (R² = 0.68), CD4 (R² = 0.55), CD20 (R² = 0.74), and CD8 (R² = 0.54) there was a strong agreement when those markers were measured in the stroma compartment in near serial section TMAs (Fig. 1A–D). PD-L1 counts by NanoString DSP in the tumor compartment had a higher degree of agreement with PD-L1 tumor QIF scores (R² = 0.53) than stroma measurements did (R² = 0.13), which can be attributed to a higher heterogeneity of immune cells expressing PD-L1 across sections (Fig. 1E and F) or to the variance in compartment assignment between the two methods. As Yale cohort A is an EGFR TKI–treated cohort of patients with NSCLC, we also investigated whether any of the immune markers measured by NanoString DSP had a predictive role in response, PFS, and OS, but none of them was found to be associated with favorable outcome (data not shown).
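As an illustration of the agreement metric, the following sketch computes the Pearson correlation coefficient and its square for paired per-spot measurements. The input values are made-up placeholders, not study data, and the helper name `pearson_r` is hypothetical. The study itself used commercial statistics packages.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values standing in for per-spot DSP counts and QIF scores.
dsp_counts = [1.0, 2.0, 3.0]
qif_scores = [1.0, 3.0, 2.0]
r = pearson_r(dsp_counts, qif_scores)
r_squared = r ** 2
```

For a simple linear regression with one predictor, the coefficient of determination equals the square of the Pearson coefficient, which is why the two are reported interchangeably here.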
Figure 1. NanoString DSP to QIF comparison in Yale NSCLC cohort A. Regression of NanoString DSP counts to QIF scores for CD3 in stroma (A), CD4 in stroma (B), CD20 in stroma (C), CD8 in stroma (D), PD-L1 in tumor (E), PD-L1 in stroma (F), CD3 in tumor (G), CD4 in tumor (H), and CD8 in tumor (I).
As a further validation step, we used a second NSCLC cohort (Yale cohort B) to test whether stratification of patients by stromal CD3 counts measured by NanoString DSP technology reproduced the prognostic significance found by QIF. Stratification of patients by median QIF-measured CD3 in the tumor compartment (Fig. 2A) showed statistically significantly longer OS in the CD3-high samples [P = 0.0019; HR, 0.41; 95% confidence interval (CI), 0.24–0.72]. Similarly, high CD3 in stroma by QIF (Fig. 2B) was associated with favorable prognosis (P = 0.036; HR, 0.55; 95% CI, 0.32–0.96). Measurement of CD3 in the tumor by DSP (Fig. 2C) had a prognostic value similar to that of QIF (P = 0.034; HR, 0.54; 95% CI, 0.30–0.95), but for stroma counts (Fig. 2D) the difference did not reach statistical significance (P = 0.26; HR, 0.73; 95% CI, 0.42–1.27), perhaps because of the lower-resolution definition of stroma. This comparison of prognostic significance further supports the high concordance between the two assays when the measurements are performed in the same compartments on a field-of-view-averaged basis.
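Median stratification of this kind can be sketched in a few lines. The example averages the available cores for each case, as described in the statistical methods, and splits cases at the cohort median. Assigning values equal to the median to the low group is an assumption made for the sketch, and the case identifiers, counts, and function name `median_split` are hypothetical.

```python
from statistics import mean, median


def median_split(core_counts):
    """Average each case's cores, then split cases at the cohort median.

    Assumption: cases exactly at the median are placed in the low group.
    Returns the cutoff and the sorted case IDs of the high and low groups.
    """
    per_case = {case: mean(cores) for case, cores in core_counts.items()}
    cutoff = median(per_case.values())
    high = sorted(case for case, value in per_case.items() if value > cutoff)
    low = sorted(case for case, value in per_case.items() if value <= cutoff)
    return cutoff, high, low


# Hypothetical CD3 counts from two cores per case.
cutoff, cd3_high, cd3_low = median_split(
    {"p1": [10, 20], "p2": [30, 50], "p3": [5, 5], "p4": [100, 60]}
)
```

The resulting high and low groups are what a Kaplan–Meier estimate and log-rank test would then compare.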
Figure 2. Prognostic value of NanoString DSP and QIF in Yale NSCLC cohort B. Kaplan–Meier 5-year survival curves of patients with NSCLC in Yale cohort B, stratified by median tumor and stroma CD3 expression measured by QIF and NanoString DSP. A, CD3 in tumor by QIF, P = 0.0019; HR, 0.41; 95% CI, 0.24–0.72. B, CD3 in stroma by QIF, P = 0.036; HR, 0.55; 95% CI, 0.32–0.96. C, CD3 in tumor by NanoString DSP, P = 0.034; HR, 0.54; 95% CI, 0.30–0.95. D, CD3 in stroma by NanoString DSP, P = 0.26; HR, 0.73; 95% CI, 0.42–1.27. Survival analysis by log-rank (Mantel–Cox) test.
Discovery of immunotherapy predictive markers for melanoma
To test the capacity of the DSP technology to discover multiple or novel immune-related biomarkers associated with response and survival, we used a melanoma immunotherapy–treated cohort of patients (Yale cohort C) and measured 44 markers simultaneously in three different compartments. Representative images of a TMA spot and the compartments are shown in Fig. 3A and B. The three molecular compartments were defined by the detection of fluorescence-labeled primary antibodies targeting CD68 for macrophages (Fig. 3C), CD45 for leukocytes (Fig. 3D), and S100 plus HMB45 for melanocyte detection (Fig. 3E). Target immune markers were measured by sequential compartment assignment of the macrophage, leukocyte, and finally tumor compartment. The remaining DNA-positive area (the fourth compartment) was inadequate for further assessment (Fig. 3F). As NanoString DSP utilizes molecular definition of compartment assignment and not cell segmentation, measurement of marker coexpression on a per-cell basis was not generated. Molecular definition of compartments is more similar to the AQUA method of quantitative fluorescence, and thus is not subject to the reproducibility errors that are more common with software-based cell segmentation. However, a limitation of this approach is that it does not allow measurements on a "per cell" basis. Each patient case was represented by two nonadjacent TMA cores collected in separate runs from two independent TMA master blocks. As validation of the reproducibility of DSP, the agreement of target count measurement for all markers between the two cores from separate TMA blocks, on different days, was assessed. The reproducibility for CD8 and CD68 (R² = 0.49 and R² = 0.7, respectively) across the two cores in two independent experiments is shown in Supplementary Fig. S2A and S2B. This level of reproducibility is comparable with that seen by QIF, where the lower R² values are a function of tissue heterogeneity, not lack of analytic reproducibility.
Figure 3. Representative images of NanoString DSP compartment selection in melanoma cohort C. A, Hematoxylin and eosin image of a representative TMA spot. B, Low resolution immunofluorescence image of the markers that define the selected compartments. Melanocytes are in green, CD68+ is in blue, and CD45+ is in purple. Selection of CD68+ compartment (C), CD45+ compartment (D), tumor compartment (E), and remaining DNA+ compartment (F). G, Histogram of CD8 counts per patient in each compartment, as illustrated by color.
Furthermore, as the collection of the DNA oligonucleotides for marker measurement was performed sequentially and the resolution for compartment separation was approximately 10 μm in this version of the instrumentation, we observed that the detection of a given target was affected by target abundance, as well as by the order of compartment collection. In overlapping or closely adjacent compartments, measurement of a marker can appear to be associated with a nonexpressing cell type. For example, because CD68+ compartments were collected first, followed by CD45+ compartments, we observed counts for CD8, a well-characterized marker of cytotoxic T cells, in both CD68+ and CD45+ compartments (Fig. 3G). Interestingly, high CD8 counts in the CD68+ compartment (Fig. 4A–C) were found to be associated with prolonged OS (P = 0.0119; HR, 0.33; 95% CI, 0.14–0.78) and PFS (P = 0.0082; HR, 0.42; 95% CI, 0.22–0.83), as well as response to immunotherapy (P = 0.014), while CD8 in the CD45+ compartment was not. Similarly, CD8 in the melanocyte compartment was predictive of favorable outcome. Rather than mis-assignment, these observations may be seen as a low-resolution molecular proximity assay. While proximity assays usually include enzymatic activation steps for the detection of proximity between two markers, here the assignment of a marker's expression to a neighboring compartment at 10 μm resolution can serve as an indirect indication of proximity.
Figure 4. Candidate predictive markers in immunotherapy-treated melanoma cohort C by NanoString DSP. Kaplan–Meier 5-year survival and PFS curves of immunotherapy-treated melanoma patients in Yale cohort C. A, OS by CD8 counts in CD68+ compartment, P = 0.0119. B, PFS by CD8 counts in CD68+ compartment, P = 0.0082. C, Response to ICIs by CD8 counts in CD68+ compartment, P = 0.014. D, OS by PD-L1 in CD68+ compartment, P = 0.0032. E, PFS by PD-L1 in CD68+ compartment, P = 0.0072. F, Response to ICIs by PD-L1 in CD68+ compartment, P = 0.011. G, OS by PD-L1 in tumor compartment, P = 0.072. H, PFS by PD-L1 in tumor compartment, P = 0.054. I, Response to ICIs by PD-L1 in tumor compartment. Survival analysis by log-rank (Mantel–Cox) test. Two-tailed Mann–Whitney U test, bars represent means with SD (*, statistical significance P < 0.05).
Overall, by unadjusted univariable analysis, we found 11 markers associated with longer PFS (Table 1). As this was an exploratory study, we tested multiple cut-off points (median, tertiles, and quartiles) for significance. In the tumor compartment, high CD8, CD3, TIM3, HLADR, IDO1 (tertiles), and CD11c were predictive for PFS; in macrophages, high CD8, beta-2-microglobulin (B2M), PD-L1 (tertiles), and TIM3 were predictive; and in lymphocytes, high B2M was predictive. Fifteen markers were found to have a statistically significant univariate association with longer OS (Table 2). In tumor, high CD8, B2M, CD20, IDO1 (tertiles), and HLADR were predictive; in macrophages, high CD8, CD4, B2M, PD-L1 (tertiles), and CD3, but low PMS2 and MYC, were predictive; and in lymphocytes, high B2M but low PMS2 and MSH2 (tertiles) were predictive. Low PD-1 expression in lymphocytes (1st, 2nd, and 3rd vs. 4th quartile) showed a trend toward prolonged PFS and OS (P = 0.0084 and P = 0.44, respectively; Supplementary Fig. S3B and S3C). Multivariate analysis for PFS and OS (Supplementary Tables S5 and S6) by each compartment showed that only PD-L1 in macrophages remained statistically significant for OS.
Table 1. Summary of all candidate predictive markers for PFS in tumor, CD68+, and CD45+ compartments, log-rank (Mantel–Cox) test

| Tumor ROI | Cut-off point | HR (95% CI) | P | CD68+ ROI | Cut-off point | HR (95% CI) | P | CD45+ ROI | Cut-off point | HR (95% CI) | P |
|---|---|---|---|---|---|---|---|---|---|---|---|
| IDO1 | Low (1st and 2nd tertile) vs. high (3rd tertile) | 0.3205 (0.1705–0.6026) | 0.0021 | PD-L1 | Low (1st and 2nd tertile) vs. high (3rd tertile) | 0.3641 (0.1899–0.6979) | 0.0072 | B2M | Median | 0.3869 (0.1998–0.7494) | 0.0033 |
| CD3 | Median | 0.4409 (0.2331–0.8338) | 0.0099 | CD8 | Median | 0.4283 (0.2209–0.8302) | 0.0082 | | | | |
| CD8 | Median | 0.4532 (0.2385–0.8614) | 0.0119 | B2M | Median | 0.469 (0.2434–0.9036) | 0.0195 | | | | |
| CD11c | Median | 0.4669 (0.2476–0.8804) | 0.0157 | TIM3 | Median | 0.4687 (0.2447–0.8977) | 0.0209 | | | | |
| HLADR | Median | 0.4998 (0.2645–0.9443) | 0.0281 | | | | | | | | |
| TIM3 | Median | 0.5298 (0.2822–0.9946) | 0.0482 | | | | | | | | |
Table 2. Summary of all candidate predictive markers for OS in tumor, CD68+, and CD45+ compartments, log-rank (Mantel–Cox) test

| Tumor ROI | Cut-off point | HR (95% CI) | P | CD68+ ROI | Cut-off point | HR (95% CI) | P | CD45+ ROI | Cut-off point | HR (95% CI) | P |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CD20 | Median | 0.2581 (0.1128–0.5904) | 0.0019 | PD-L1 | Low (1st and 2nd tertile) vs. high (3rd tertile) | 0.153 (0.0651–0.3598) | 0.0032 | MSH2 | Low (1st and 2nd tertile) vs. high (3rd tertile) | 3.277 (1.222–8.783) | 0.0035 |
| HLADR | Median | 0.2762 (0.1212–0.6294) | 0.0035 | CD8 | Median | 0.3377 (0.1448–0.7876) | 0.0119 | B2M | Median | 0.28 (0.11–0.67) | 0.0049 |
| CD8 | Median | 0.3321 (0.1455–0.7578) | 0.01 | B2M | Median | 0.34 (0.14–0.80) | 0.0138 | PMS2 | Median | 2.763 (1.162–6.574) | 0.0211 |
| IDO1 | Low (1st and 2nd tertile) vs. high (3rd tertile) | 0.2392 (0.1033–0.554) | 0.0117 | CD3 | Median | 0.373 (0.1609–0.8648) | 0.0243 | | | | |
| B2M | Median | 0.42 (0.18–0.96) | 0.0429 | CD4 | Median | 0.3741 (0.1613–0.8672) | 0.0244 | | | | |
| | | | | PMS2 | Median | 2.466 (1.067–5.699) | 0.0405 | | | | |
| | | | | MYC | Median | 2.466 (1.067–5.7) | 0.0407 | | | | |
Notably, PD-L1 expression in macrophages was associated with prolonged OS (P = 0.0032; HR, 0.15; 95% CI, 0.065–0.35) and PFS (P = 0.0072; HR, 0.36; 95% CI, 0.18–0.69), while PD-L1 expressed in lymphocytes and melanocytes did not have any statistically significant predictive value. PD-L1 expression in macrophages could also distinguish responders from nonresponders to immunotherapy regardless of tumor PD-L1 expression (P = 0.0011; Fig. 4D–I). PD-L1 expression in tumor was found to be modestly correlated with macrophage PD-L1, suggesting an adaptive upregulation mechanism in response to immune pressure (Supplementary Fig. S3A). Further subgroup analysis of lymphocytes by PD-L1 expression revealed that high PD-L1 expression was associated with higher levels of B2M (P < 0.0001), HLADR (P = 0.0004), IDO1 (P < 0.0001), TIM3 (P = 0.0001), and B7H4 (P = 0.0069), while high PD-1–expressing lymphocytes had significantly higher expression of BIM (P = 0.0124), GZMB (P = 0.0091), and BCL6 (P = 0.01).
Discussion
In this study, we benchmarked the novel DSP technology against an established platform, and then used it to identify novel candidate predictors of response to immunotherapy. First, we used the AQUA method of QIF, a widely used method that has previously been compared with mass spectrometry (26), for validation of the technology. We found a high correlation between measurements by the two assays in a large number of patient cases with multiple markers (CD3, CD4, CD8, CD20, and PD-L1) measured in tumor and stroma compartments. In addition, CD3 measurement by DSP showed prognostic value similar to that seen using AQUA, as described previously (19).
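The regression-based comparison of the two assays can be illustrated with a short sketch that computes the Pearson correlation and a least-squares line for paired measurements of one marker in one compartment. The numbers are hypothetical and the function names are ours. Whether the measurements were transformed before regression is not stated in this section, so no transformation is applied here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ols_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical paired values, one pair per TMA spot (not data from the study).
dsp_counts = [12.0, 25.0, 31.0, 48.0, 60.0, 75.0]
aqua_scores = [150.0, 290.0, 410.0, 560.0, 700.0, 910.0]

r = pearson_r(dsp_counts, aqua_scores)
slope, intercept = ols_line(dsp_counts, aqua_scores)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}, "
      f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A high correlation shows that the two platforms rank samples similarly. The slope and intercept depend on the arbitrary units of each assay and are not expected to be 1 and 0.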
To use the high-plex capacity of DSP to identify novel candidate biomarkers, we profiled a TMA of 60 pretreatment biopsies from patients with melanoma treated with immunotherapy and determined the clinical significance of a panel of 44 immune-related markers measured simultaneously in three different compartments (macrophages, lymphocytes, and melanocytes). A total of 11 and 15 immune markers were found to be associated with PFS and OS, respectively, many of which have not been described before in this spatial context. For example, HLA-DR expression has been described on macrophages as a marker of activation and antigen presentation (27–29), but not on CTLs and melanoma cells (30). Here, HLA-DR expression in melanoma cells is associated with outcome in ICI-treated patients, representing a potential new finding that needs further validation.
This study complements two recent reports that used DSP to investigate predictive biomarkers of survival in the adjuvant/neoadjuvant therapeutic setting for melanoma. In the first study, DSP profiling was conducted on core needle biopsies from metastasis-containing lymph nodes of patients with advanced melanoma (31), and the compartment profiled was defined by a geometric area of the tissue. PD-L1 was observed to be associated with relapse-free survival following either adjuvant or neoadjuvant combination therapy with ipilimumab and nivolumab. In the second study, DSP was performed on tumor tissue taken either at baseline or on treatment with neoadjuvant nivolumab or the neoadjuvant combination of nivolumab and ipilimumab, and CD45+ cells were profiled as a compartment (32). Here, expression of a number of targets was associated with relapse-free survival in either arm, including PD-1, B2M, MS4A1, CD8A, CD45RO, GZMB, CD3, CD19, KI-67, VISTA, and CD4. These studies found in common a role for B2M, CD3, CD4, CD8A, PD-1, and PD-L1 in melanoma response to immunotherapy. The present study further identified TIM3, MSH2, and MYC as potential biomarkers of response in the immune cell compartments, and nine additional potential biomarkers in the tumor compartment.
In our study, CD3 and CD8 in the macrophage compartment were also found to be associated with prolonged survival. CD3+ and CD8+ cell infiltration has been previously reported to correlate with favorable outcome to immunotherapy treatment (33, 34). Similarly, the role of macrophage and CD8+ cell interaction has been described previously (35), as macrophages mediate lymphocytic trapping and the blockade of colony stimulating factor 1 receptor increases responsiveness to anti-PD-1 treatment. A recent study (36) on 104 primary stage II/III melanoma tumors showed that a low CTL/macrophage ratio correlated with shortened OS and that close distance to macrophages also indicated poor prognosis. In our study, CD3 and CD8 expression assigned to the CD68 compartment was associated with better outcome in immunotherapy-treated melanoma tumors. This highlights the importance of spatial information when measuring immune targets: the CD8 signal assigned to the macrophage or tumor compartments because of proximity carried the predictive value of the marker, whereas no clinical significance was found for expression in the CD45+ compartment, which approximates CD8 cells at a greater distance from macrophage or tumor cells. Again, this finding needs to be validated in other cohorts, but it supports the concept that CD8 close to macrophages is more important than total CD8.
Arguably the most interesting finding in the study was the association of PD-L1 expression in macrophages with OS. Although there was a trend toward prolonged PFS and OS for tumor PD-L1 expression, it did not reach statistical significance (P = 0.054 and P = 0.072, respectively; Fig. 4G–I). Although this could be an artifact of imperfect compartmentalization, in which part of the tumor PD-L1 expression was measured in the macrophage compartment, emerging evidence supports the predictive value of macrophage over tumor PD-L1 expression. Tumor PD-L1 expression is the most commonly used predictive marker for response to ICIs and represents the only currently approved companion diagnostic. It is predictive in both tumor cells (in lung cancer) and immune cells (in breast, gastric, cervical, and bladder cancer and in head and neck squamous cell carcinoma; refs. 37–40). While immune cells are not specifically classified, they are considered to predominantly include lymphocytes and macrophages, along with smaller numbers of other immune effector cells (myeloid-derived suppressor cells, natural killer cells, and others). There is growing evidence, however, that PD-L1 expression by macrophages may be a key element driving response to PD-L1 antibody treatment. Previous studies have shown that targeting the PD-1/PD-L1 axis can still be effective regardless of PD-L1 tumor expression (3, 41, 42), with 83% of NSCLC and 46% of all tumor types with a tumor-infiltrating immune cell IHC score of 3 responding to treatment (42). A recent study (43) also showed that treatment of mouse and human macrophages with PD-L1 antibodies increased macrophage proliferation, survival, and antitumor activity and that PD-L1 antibody treatment exerted antitumor activity in mice lacking T cells, findings that are consistent with T-cell–independent, macrophage-dependent antitumor activity. Two mechanistic studies in mouse models also support macrophages as the key effector cell in the PD-1 axis mechanism of inhibition. Lin and colleagues found that neither knockout nor overexpression of PD-L1 in tumor cells had an effect on PD-L1 blockade efficacy in mice with expression of PD-L1 in macrophages (44). Similarly, Tang and colleagues found that PD-1 axis drug efficacy was absent in myeloid PD-L1 −/− mice and was restored by transplantation of myeloid PD-L1 wild-type/wild-type cells (45). Further studies are underway to determine the role of macrophage PD-L1 expression in antitumor immune response.
There are a number of limitations to this study. First, both DSP and AQUA assays were performed on TMAs, which are not currently used in the clinical setting. For immune markers that often have a high level of heterogeneity, accurate representation of the tumor and the TME is essential. For our study, we used two nonadjacent TMA cores for each patient to minimize sampling errors, but we recognize that this still represents a very small percentage of a standard tissue section. However, the use of TMAs allowed the assessment of the 44 immune markers included in the DSP panel in a large number of patients in a pilot study setting. Another limitation of the study is that the melanoma cohort consisted of patients who received a variety of immunotherapies, including combination therapies, but were all analyzed together. Response to the different therapies may be driven by distinct biology, which may require unique signatures to achieve the greatest predictive power. Another limitation of this work is inherent in the DSP method. This method is limited to about 10-μm resolution, which means that some immune cells that infiltrate the tumors may be missed and misassigned to the tumor compartment. This issue may be addressed in the future as the DSP resolution is increased. Finally, only a single cohort of immunotherapy-treated patients was available for this study, which precludes validation of the biomarkers in an independent cohort or the use of more stringent statistical analysis and biomarker stratification. Future studies with additional samples from multiple institutions, conducted on a validated platform such as QIF, will be required to evaluate these biomarkers and develop diagnostic signatures. Similarly, subclassification of macrophages, lymphocytes, or other cells (e.g., NKT cells), so that immune markers are measured by immune cell type, will provide more information about their role in antitumor immunity. These studies are beyond the scope of this pilot work, and we look forward to assessing more compartments in future efforts.
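As one illustration of what a more stringent statistical analysis could look like when many variables are tested at once, the sketch below applies the Benjamini-Hochberg step-up procedure to a list of P values. This is our assumption of a commonly used approach, not a method described in this study, and the P values shown are hypothetical rather than taken from the cohort.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    # Indices of the P values from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Hypothetical P values for ten marker/compartment variables (not study data).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
toy_rejected = benjamini_hochberg(toy_p, alpha=0.05)
print(sum(toy_rejected), "of", len(toy_p), "variables pass at FDR 0.05")
```

In this toy list, five values fall below the nominal 0.05 threshold, but only the two smallest pass the adjusted thresholds, which shows how the number of variables tested affects which associations are retained.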
In conclusion, we validated NanoString DSP technology as a new method for high-plex measurement of immune markers in multiple compartments and used it to discover over 20 potentially predictive markers of response to immunotherapy in patients with melanoma. Among them, the association of PD-L1 expression in macrophages with OS confirms in human tissue the mechanistic findings from mouse models and provides new insight into the clinical significance of macrophages in the antitumor effect after PD-1/PD-L1 pathway blockade in patients with melanoma. This study illustrates the potential to leverage high-plex profiling on DSP to characterize tumor biology, elucidate drug mechanisms of action, and identify novel biomarkers associated with clinical response to therapy.
Disclosure of Potential Conflicts of Interest
C.R. Merritt, S.E. Warren, and J.M. Beechem hold ownership interest (including patents) in NanoString Technologies. J.W. Smithy holds ownership interest (including patents) in Johnson & Johnson. H.M. Kluger reports receiving commercial research grants from Merck, Apexigen, and Bristol-Myers Squibb, and is a consultant/advisory board member for Array Biopharma, Alexion, Prometheus, Corvus, Nektar, Biodesix, Genentech, Pfizer, Iovance, Immunocore, and Celldex. D.L. Rimm reports receiving commercial research grants from AstraZeneca, Cepheid, Navigate Biopharma, NextCure, Lilly, and Ultivue, and is a consultant/advisory board member for Amgen, Bristol-Myers Squibb, AstraZeneca, Cell Signaling Technology, Cepheid, Daiichi Sankyo, GlaxoSmithKline, Konica Minolta, Merck, NanoString, and Ventana. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: M.I. Toki, P.F. Wong, S.E. Warren, J.M. Beechem, D.L. Rimm
Development of methodology: M.I. Toki, C.R. Merritt, G.T. Ong, S.E. Warren, J.M. Beechem, D.L. Rimm
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M.I. Toki, P.F. Wong, J.W. Smithy, H.M. Kluger, K.N. Syrigos, G.T. Ong
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): M.I. Toki, C.R. Merritt, P.F. Wong, H.M. Kluger, G.T. Ong, S.E. Warren, D.L. Rimm
Writing, review, and/or revision of the manuscript: M.I. Toki, C.R. Merritt, P.F. Wong, J.W. Smithy, H.M. Kluger, K.N. Syrigos, S.E. Warren, J.M. Beechem, D.L. Rimm
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M.I. Toki, C.R. Merritt, P.F. Wong, K.N. Syrigos, S.E. Warren, D.L. Rimm
Study supervision: J.M. Beechem, D.L. Rimm
Acknowledgments
This work was supported by grants from the NIH, including the Yale SPORE in Lung Cancer (P50-CA196530) and the Yale Cancer Center Support Grant (P30-CA016359), and the SU2C Lung Cancer Dream Team Funding. The authors also acknowledge the expert assistance of Lori Charette and her staff in the Yale Tissue Microarray Facility division of Yale Pathology Tissue Services for construction of the tissue microarrays used in the study.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.