Abstract
Purpose: Although tumor-infiltrating lymphocytes (TIL) have been associated with response to neoadjuvant therapy, measurement typically is subjective, semiquantitative, and unable to differentiate among subpopulations. Here, we describe a quantitative objective method for analyzing lymphocyte subpopulations and assessing their predictive value.
Experimental Design: We developed a quantitative immunofluorescence assay to measure stromal expression of CD3, CD8, and CD20 on one slide. We validated this assay by comparison with flow cytometry on tonsil specimens and assessed predictive value in breast cancer on a neoadjuvant cohort (n = 95). Then, each marker was tested for prediction of pathologic complete response (pCR) compared with pathologist estimation of the percentage of lymphocyte infiltrate.
Results: The lymphocyte percentage and CD3, CD8, and CD20 proportions were similar between flow cytometry and quantitative immunofluorescence on tonsil specimens. Pathologist TIL count predicted pCR [P = 0.043; OR, 4.77; 95% confidence interval (CI), 1.05–21.6] despite fair interobserver reproducibility (κ = 0.393). Stromal AQUA (automated quantitative analysis) scores for CD3 (P = 0.023; OR, 2.51; 95% CI, 1.13–5.57), CD8 (P = 0.029; OR, 2.00; 95% CI, 1.08–3.72), and CD20 (P = 0.005; OR, 1.80; 95% CI, 1.19–2.72) predicted pCR in univariate analysis. CD20 AQUA score predicted pCR (P = 0.019; OR, 5.37; 95% CI, 1.32–21.8) independently of age, size, nuclear grade, nodal status, ER, PR, HER2, and Ki-67, whereas CD3, CD8, and pathologist estimation did not.
Conclusions: We have developed and validated an objective, quantitative assay measuring TILs in breast cancer. Although this work provides analytic validity, future larger studies will be required to prove clinical utility. Clin Cancer Res; 20(23); 5995–6005. ©2014 AACR.
Tumor-infiltrating lymphocytes (TIL) have previously been identified as prognostic and predictive biomarkers in several cancers, including breast cancer. Despite these findings, assessment of TILs remains challenging, as existing methods are often subjective or semiquantitative, poorly reproducible, and unable to distinguish TIL subpopulations. Here, we develop and validate an objective immunofluorescence-based quantitative approach for measuring up to three TIL subpopulations per tissue section. The quantitative assay demonstrates improved reproducibility and similar specificity and sensitivity compared with traditional TIL classification. We hope that this initial report stimulates further validation of the objective method it describes to determine its clinical utility in breast and other cancers.
Introduction
Neoadjuvant chemotherapy facilitates breast conserving surgery and enables early evaluation of response, allowing for discontinuation of ineffective treatment (1). Moreover, patients who achieve pathologic complete response (pCR), elimination of invasive tumor following therapy, demonstrate better disease-free and overall survival (2, 3). However, most patients do not achieve pCR when treated with standard neoadjuvant chemotherapy (4, 5).
Tumor-infiltrating lymphocytes (TIL) are an important immune component of the response to cancer (6). Lymphocyte subpopulations are recognized by expression of specific cell surface biomarkers. CD3, a T-cell receptor protein, instigates a signaling cascade after antigen recognition to activate T cells (7). CD8, a coreceptor for the T-cell receptor, recognizes the major histocompatibility complex class I and is expressed on cytotoxic T lymphocytes, the predominant TIL subpopulation in breast cancer (8, 9). CD20, a transmembrane protein expressed on B cells but not plasma cells, regulates B-cell activity, proliferation, and differentiation (10, 11). B cells drive the humoral, antibody-mediated immune response and often colocalize with T cells in tumors (11, 12). Previously, lymphocyte infiltration has predicted improved response to chemotherapy and trastuzumab, and CD3, CD8, and CD20 have all been discussed as potential predictive biomarkers (13–18).
Assessing the predictive significance of TILs is challenging due to differing methods of measurement (19). TIL measurement has been reported as TIL count (8, 20), as density (TILs per high-power field; refs. 13, 18), in semiquantitative scales (21, 22), and as the percentage of stromal infiltrate (14, 23). TIL location is also not assessed in a standardized manner. Locations can be defined as intratumoral or stromal, and stromal location is further subdivided into adjacent and distant stroma (13). Another concern is variability in the selection and number of fields of view (FOV) analyzed, especially considering TIL heterogeneity (18). Perhaps the most significant issue plaguing TIL quantification is subjectivity and nonreproducibility. Several assays attempt to objectively quantify TILs, but many consider only one biomarker, precluding analysis of relationships between TIL subpopulations (24–28). Moreover, these assays are often difficult to apply to large cohorts, cannot account for location, and measure TILs only semiquantitatively.
This study aims to objectively measure TILs on a cohort of biopsies obtained before neoadjuvant chemotherapy. We develop and validate an immunofluorescence-based quantitative approach for measuring CD3, CD8, and CD20 expressions within the stroma adjacent to the tumor. We objectively analyze expression in all FOV, determine which TIL biomarkers predict pCR, and compare their predictive value.
Materials and Methods
Tissue collection and patient cohort
Freshly resected tonsil tissue was collected from 4 patients who underwent tonsillectomies at Yale-New Haven Hospital (New Haven, CT) in 2013. Upon collection, each tonsil specimen was divided into two approximately equal halves. One half was formalin fixed for 24 to 48 hours before paraffin embedding. The other half was placed in RPMI media on ice for single-cell suspension preparation.
Paraffin-embedded pretherapeutic core biopsies were collected from 105 consecutive patients with invasive breast cancer who received neoadjuvant therapy, primarily anthracycline and taxane based, as described previously (29). The distribution of treatment regimens, nuclear grade, tumor size, hormone receptor status, and HER2 status is described in Table 1. HER2 status was determined by IHC, with 0 and 1+ classified as HER2 negative and 3+ as HER2 positive. Cases that were HER2 2+ were tested by FISH and then classified into positive or negative groups as per the 2013 ASCO/CAP guidelines. Tissue was used after written patient consent. The study was performed according to the Yale University Institutional Review Boards protocol #9505008219.
Table 1. Clinical characteristics of patients in the neoadjuvant cohort
Characteristic | N (total, 95) | % |
---|---|---|
Age at diagnosis | ||
<50 | 56 | 58.9 |
≥50 | 39 | 41.1 |
Treatment | ||
Adriamycin-based therapy | 72 | 75.8 |
Carboplatin + paclitaxel + Herceptin | 13 | 13.7 |
Carboplatin + abraxane + bevacizumab | 6 | 6.3 |
Cytoxan + taxotere | 2 | 2.1 |
Navelbine + herceptin | 1 | 1.1 |
Carboplatin + etoposide + taxol | 1 | 1.1 |
Tumor size | ||
<2 cm | 10 | 10.5 |
2–5 cm | 68 | 71.6 |
≥5 cm | 16 | 16.8 |
Unknown | 1 | 1.1 |
Nuclear grade | ||
Grade 1 | 3 | 3.2 |
Grade 2 | 46 | 48.4 |
Grade 3 | 43 | 45.3 |
Unknown | 3 | 3.2 |
Nodal status | ||
Node positive | 48 | 50.5 |
Node negative | 33 | 34.7 |
Unknown | 14 | 14.7 |
ER status | ||
ER positive | 58 | 61.1 |
ER negative | 35 | 36.8 |
Unknown | 2 | 2.1 |
PgR status | ||
PgR positive | 50 | 52.6 |
PgR negative | 43 | 45.2 |
Unknown | 2 | 2.1 |
HER2 status | ||
HER2 positive | 27 | 28.4 |
HER2 negative | 67 | 70.5 |
Unknown | 1 | 1.1 |
Flow cytometry
Single-cell suspensions were extracted from tonsil tissue and filtered through a 70-μm nylon strainer (Corning). The resulting suspension was centrifuged and reconstituted to 10⁷ cells/mL in PBS with 5% FBS and 0.1% sodium azide. Cells were then divided into five equal portions of 10⁶ cells and incubated for 20 minutes on ice with the Fc Receptor Binding Inhibitor (eBioscience). One portion was incubated with a mix of anti-CD3 FITC-conjugated (0.2 mg/mL; eBioscience), anti-CD8 PE-conjugated (0.05 mg/mL; eBioscience), and anti-CD20 APC-conjugated antibodies (0.012 mg/mL; eBioscience) for 30 minutes in the dark on ice. The other four portions were compensation controls, one unstained and three stained in parallel with the antibody mix. After washing, flow-cytometry analysis for CD3-, CD8-, and CD20-positive subpopulations was done on an LSRII Flow Cytometer (BD Biosciences) at the Yale Cell Sorter Core Facility, and data were collected using FACSDiva software. Flow-cytometry plots were generated and analyzed with FLOWJO software (Tree Star).
Pathologic assessment of TILs
Histopathologic analysis was performed on hematoxylin and eosin (H&E)–stained sections of 103 core biopsies from the cohort. Sections were obtained from archives when available; otherwise, new sections were stained before reading. Analysis was conducted by two pathologists (V. Bossuyt and C. Nixon), both blinded to clinical parameters and response. TILs were quantified as the percentage estimate of the tumor stroma area that contained lymphocytic infiltrate (14). Percentages were reported in increments of 10%, with greater than 50% infiltrate denoted as lymphocyte predominant breast cancer (LPBC; ref. 23).
Multiplexed immunofluorescence staining for TILs
In situ detection of CD3, CD8, and CD20 with cytokeratin and 4′,6-diamidino-2-phenylindole (DAPI) was conducted on the same slide (Supplementary Fig. S1). Briefly, slides were deparaffinized and rehydrated before pH 8 EDTA antigen retrieval. Endogenous peroxidase activity was blocked with dual endogenous peroxidase block (Dako) and nonspecific antigens were blocked with 0.3% BSA in Tris-buffered saline/Tween. Primary monoclonal antibodies against CD3 (1:100; Novus, Rabbit IgG Clone NB600-1441), CD8 (1:250; Dako, Mouse IgG1 Clone C8/144B), and CD20 (1:150; Dako, Mouse IgG2a Clone L26) were coincubated for 1 hour at room temperature. Slides were incubated sequentially with three horseradish peroxidase (HRP)–conjugated secondary antibodies for 1 hour at room temperature before tyramide-based HRP activation for 10 minutes, followed by 1 mmol/L benzoic hydrazide with 0.15% hydrogen peroxide to quench HRP activation. The secondary antibodies were anti-rabbit Envision reagent (Dako), anti-mouse IgG1 (Abcam; 1:100), and anti-mouse IgG2a (Abcam; 1:200). HRP activators were biotinylated tyramide (PerkinElmer; 1:50), TSA™ Plus Fluorescein tyramide (PerkinElmer; 1:100), and Cy-5 tyramide (PerkinElmer; 1:50), respectively. Subsequently, slides were incubated in Alexa 750–conjugated streptavidin for 1 hour (1:100; Invitrogen). A rabbit polyclonal anti-cytokeratin antibody (Dako; 1:100) and goat anti-rabbit secondary (1:100; Invitrogen) identified tumor epithelium. DAPI identified nuclei.
Quantitative immunofluorescence using AQUA
Automated quantitative analysis (AQUA) objectively and accurately measures protein expression within the tumor and subcellular compartments, as described previously (30, 31). FOV were selected that were cytokeratin positive or adjacent to a cytokeratin-positive FOV in a previously acquired low-resolution image. For each FOV, five monochromatic, high-resolution images were captured at wavelengths matching DAPI, FITC, Cy-3, Cy-5, and Cy-7 fluorophores using a PM-2000 image workstation (HistoRx).
For analysis with AQUA software, staining artifacts and FOVs without invasive breast carcinoma were manually removed. A total compartment, consisting of all nuclei, and a tumor mask were generated by dichotomizing DAPI signal and cytokeratin signal, respectively, so that each pixel was “on” or “off.” The stromal compartment excluded the tumor mask from the total compartment. AQUA scores were calculated by dividing the summated pixel intensity within each compartment by area. Only cases with three or more cytokeratin-positive FOVs were included.
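The compartment logic described above can be sketched in a few lines. This is a minimal illustration and not the AQUA software itself: it assumes the tumor mask is obtained by thresholding the cytokeratin channel, uses fixed hypothetical thresholds (`dapi_thresh`, `ck_thresh`), and omits any exposure-time or bit-depth normalization that the commercial software may apply. The function names `compartment_scores` and `case_score` are also hypothetical.

```python
import numpy as np

def compartment_scores(dapi, cytokeratin, target, dapi_thresh, ck_thresh):
    """Area-normalized target intensity per compartment for one FOV.

    Assumption: masks are made by simple fixed thresholds; the real
    software may choose thresholds differently.
    """
    total_mask = dapi > dapi_thresh          # all nuclei ("on"/"off" pixels)
    tumor_mask = cytokeratin > ck_thresh     # tumor epithelium
    stromal_mask = total_mask & ~tumor_mask  # total compartment minus tumor mask

    def score(mask):
        area = int(mask.sum())
        # Summed pixel intensity within the compartment divided by its area.
        return float(target[mask].sum() / area) if area else float("nan")

    return {
        "total": score(total_mask),
        "tumor": score(tumor_mask),
        "stromal": score(stromal_mask),
    }

def case_score(fov_scores, min_fovs=3):
    """Average per-FOV scores for a case; None if too few FOVs."""
    if len(fov_scores) < min_fovs:
        return None
    return float(np.mean(fov_scores))
```

The `min_fovs=3` default mirrors the inclusion rule stated in the text; everything else about the interface is an illustrative choice.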
Statistical analysis
A weighted kappa test assessed interobserver variability between pathologist estimates. Pearson correlation coefficient (R) compared AQUA scores within different compartments. ANOVA testing was used for comparison of AQUA scores with histopathologic assessment and clinicopathologic characteristics and for analysis of response, and log-rank P values are reported. Logistic regression was used for univariate and multivariate analyses, with the CD3, CD8, and CD20 AQUA scores analyzed on a continuous scale. To generate a ratio of the likelihood of pCR for high TIL populations compared with low TIL populations, AQUA scores were split into low and high populations at a cutoff point objectively determined by Joinpoint software (NCI Surveillance Research). All statistical analyses were performed using Statview software (SAS Institute) and QuickCalcs (GraphPad Software).
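As a rough illustration of the interobserver statistic, the sketch below computes a weighted kappa for two raters scoring the same cases on an ordinal scale. The text does not state which weighting scheme was used, so linear weights are an assumption here (quadratic weights are offered as an option), and the function name `weighted_kappa` is hypothetical.

```python
import numpy as np

def weighted_kappa(rater1, rater2, n_categories, weights="linear"):
    """Weighted Cohen's kappa for ordinal category indices 0..n_categories-1."""
    if n_categories < 2:
        raise ValueError("need at least two categories")
    k = n_categories
    observed = np.zeros((k, k))
    for a, b in zip(rater1, rater2):
        observed[a, b] += 1
    observed /= observed.sum()

    # Agreement expected by chance from each rater's marginal distribution.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))

    # Disagreement weights grow with the distance between categories.
    i, j = np.indices((k, k))
    dist = np.abs(i - j) / (k - 1)
    w = dist if weights == "linear" else dist ** 2

    return 1.0 - (w * observed).sum() / (w * expected).sum()

# Percent estimates in 10% increments map to category indices 0..10.
pathologist_a = [p // 10 for p in (10, 20, 60, 30, 0)]
pathologist_b = [p // 10 for p in (10, 30, 60, 20, 10)]
kappa = weighted_kappa(pathologist_a, pathologist_b, n_categories=11)
```

The five paired estimates at the end are made-up values that only show the call pattern; they are not data from the cohort.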
Results
Histopathologic TIL assessment
Two pathologists estimated TIL infiltration in the tumor stroma on H&E-stained slides (H&E TILs) for 93 cases (Fig. 1A–C). TIL infiltrate was reported in increments of 10% with fair interobserver variability (weighted κ = 0.393; Fig. 1D). Eight patients (8.6%) exhibited LPBC with moderate interobserver variability of LPBC versus non-LPBC (κ = 0.501). Of the patients with LPBC, 5 (62.5%) achieved pCR. TIL percentages between pathologists were averaged for further analysis, and H&E TILs were significantly higher in the patients who achieved pCR compared with those who did not achieve pCR (P = 0.0075, Fig. 1E).
Figure 1. Histopathologic assessment of TILs. A, an example of a case with few TILs, as determined by both pathologists. B, an example of a case with an intermediate number of TILs (30%) as determined by both pathologists. C, an example of a case that was determined to be LPBC by both pathologists. D, interobserver variability between the two pathologists was moderate. Darker squares indicate agreement upon more cases. E, comparison between pathologist TIL counts and pCR.
Validation of quantitative immunofluorescence of CD3, CD8, and CD20
Four tonsil specimens were analyzed for CD3, CD8, and CD20 expressions by both quantitative immunofluorescence and flow cytometry. CD3 and CD8 demonstrated membranous staining and primarily extrafollicular localization, whereas CD20 expression was membranous and intrafollicular (Fig. 2A). A total of 28 to 292 FOV were analyzed, and the percentage of lymphocytic infiltrate was calculated for each FOV (Fig. 2A) and averaged across the entire tonsil specimen. Flow cytometry on the same specimens demonstrated a preponderance of lymphocytes. CD3- and CD20-positive lymphocyte populations were distinct, but CD3- and CD8-positive populations overlapped (Fig. 2B). Lymphocyte percentage ranged from 79.5% to 93.2% by flow cytometry and 64.8% to 82.6% by immunofluorescence (Fig. 2C). The lymphocyte percentage was consistently greater by flow cytometry, although within the error margin. By both methods, most lymphocytes were CD20-positive, and CD8-positive lymphocytes were a small fraction of CD3-positive cells (Fig. 2D–F).
Figure 2. Validation of quantitative immunofluorescence assay for measuring lymphocyte infiltration and subpopulations by comparison with flow cytometry on tonsil specimens. A, in this representative field of view from quantitative immunofluorescence on tonsil tissue, CD3 is recognized in the Cy7 channel (purple, top left), CD8 in the FITC channel (green, top right), and CD20 in the Cy5 channel (red, bottom left). All three markers demonstrate the expected membranous expression. Multiplexing CD3, CD8, and CD20 (bottom right) demonstrates the differential localization between the three markers. B, in this representative example of flow cytometry on tonsil tissue, lymphocytes—which were most of the cells analyzed (left)—were gated into distinct CD3 and CD20 populations (right, top) and overlapping CD3 and CD8 populations (right, bottom). C, the percentage of cells that were lymphocytes was similar when determined by flow cytometry and quantitative immunofluorescence. Flow cytometry consistently resulted in a greater proportion of lymphocytes, albeit within the margin of error. D, the percentage of lymphocytes positive for CD3 was similar between the two methods. E, in all cases, only a small proportion of tonsil lymphocytes was positive for CD8. F, the majority of lymphocytes were B cells, and the percentage of B cells was similar when determined by both methods.
Objective assessment of TILs by quantitative immunofluorescence on the neoadjuvant cohort
Of 93 slides analyzed, 87 had sufficient FOVs for analysis, ranging from 4 to 118 (mean 33.5, median 27). Stromal AQUA scores were averaged across all FOVs for each case. Correlation between AQUA scores within the stromal and total compartments for CD3, CD8, and CD20 was very strong and better than between tumor mask and total compartment AQUA scores, especially for CD20 (Supplementary Fig. S2). Index breast tissue microarrays stained alongside the cohort were reproducible on serial sections for all markers (Supplementary Fig. S3).
Analysis of each FOV allows visualization of intratumoral heterogeneity. Some FOVs had minimal TILs (Fig. 3A), whereas others had moderate TILs with low expression (Fig. 3B) or numerous TILs (Fig. 3C). Heatmaps were constructed to visualize this intratumoral heterogeneity (Fig. 3D).
Figure 3. AQUA analysis of TILs in the stroma and demonstration of heterogeneity. Selected images of CD3 staining are taken from different FOV from the same biopsy. These include a field with minimal CD3-positive TILs (A), a field with CD3-positive TILs but low intensity staining (B), and a field with CD3-positive TILs and high-intensity staining (C). D, a heatmap of CD3 stromal AQUA scores for all fields from this biopsy specimen demonstrates heterogeneity.
Association of TILs with clinicopathologic characteristics
Next, TIL subpopulations were compared with standard breast cancer classifiers, including estrogen receptor (ER), progesterone receptor (PgR), HER2, and triple-negative status. CD20 AQUA score was significantly higher (P = 0.0051), and CD8 expression trended higher in ER-negative patients. CD3 (P = 0.0301), CD8 (P = 0.0168), and CD20 (P = 0.0145) expressions were also significantly higher in PgR-negative tumors. No markers were significantly correlated with HER2, although CD3 stromal expression trended higher with HER2 positivity. Although only 24 patients (27.6%) had tumors negative for ER, PgR, and HER2, CD8 (P = 0.0052) and CD20 (P = 0.0058) stromal expressions were significantly higher in triple-negative tumors (Supplementary Fig. S4).
Association of TILs with response to neoadjuvant chemotherapy
To dichotomize TILs and assess their relationship with response, cutoff points between high and low expression of all three markers were determined by generating three possible Joinpoints (32) and selecting the middle cutoff point (Supplementary Fig. S5). Cutoff points for CD3, CD8, and CD20 were above the median AQUA score, and the rate of pCR above the cutoff point was 41.9% for CD3, 35.1% for CD8, and 39.4% for CD20 compared with pCR rates of 16.1%, 18.0%, and 16.7% below these cutoff points, respectively. When analyzed as a continuous variable, increased stromal expressions of CD3 (P = 0.0172), CD8 (P = 0.0225), and CD20 (P = 0.0004) were all significantly associated with pCR, and all three markers (P < 0.0001) strongly correlated with pathologist assessment of LPBC (Fig. 4).
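The comparison of pCR rates above and below a cutoff amounts to a simple split-and-count, sketched below. The helper name `pcr_rates` is hypothetical, and treating "above the cutoff" as strictly greater than the cutoff is an assumption; the cutoff itself is taken as given rather than recomputed with Joinpoint.

```python
def pcr_rates(scores, pcr, cutoff):
    """Return (rate_high, rate_low): fraction achieving pCR on each side.

    scores: AQUA scores, one per case
    pcr:    1 if the case achieved pCR, else 0
    Assumption: "high" means score strictly greater than the cutoff.
    """
    high = [r for s, r in zip(scores, pcr) if s > cutoff]
    low = [r for s, r in zip(scores, pcr) if s <= cutoff]

    def rate(group):
        return sum(group) / len(group) if group else float("nan")

    return rate(high), rate(low)

# Consistency check for CD3. Group sizes (31 high, 56 low) come from Table 2;
# the responder counts 13 and 9 are inferred from the reported percentages,
# not stated in the text.
cd3_high_rate = round(100 * 13 / 31, 1)
cd3_low_rate = round(100 * 9 / 56, 1)
```

Under that inference the two CD3 groups reproduce the reported 41.9% and 16.1%.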
Figure 4. Distribution of AQUA scores for TIL biomarkers and comparison with pathologist TIL assessment. For all markers, the cutoff point between low expression and high expression determined by Joinpoint software is indicated. Red columns signify the patients who achieved pCR, whereas blue columns signify those who did not. The cutoff point for CD3 AQUA score in the stromal compartment is slightly above the median (A), and correlation with pCR (B) and with pathologist assessment of TILs (C) on H&E slides from the same cohort is excellent. D, CD8 AQUA score in the stromal compartment demonstrates a similar distribution and is significantly correlated with pCR (E) and correlated with pathologist assessment of LPBC (F). G, the distribution of stromal CD20 AQUA score is skewed more negatively than for CD3 and CD8. H, nonetheless, this marker is significantly correlated with pCR, and correlation with pathologist TIL estimates is still excellent (I).
High stromal CD3 [P = 0.023; OR, 2.51; 95% confidence interval (CI), 1.13–5.57], CD8 (P = 0.029; OR, 2.00; 95% CI, 1.08–3.72), and CD20 (P = 0.0053; OR, 1.80; 95% CI, 1.19–2.72) expressions also predicted pCR in univariate analysis (Table 2). LPBC had a higher likelihood of response than non-LPBC tumors (P = 0.043; OR, 4.77; 95% CI, 1.05–21.6), although the 95% confidence interval overlapped with ORs for CD3, CD8, and CD20 scores (Table 2). Multivariable analysis with patient age, tumor size, nodal metastases, ER, PR, and HER2 positivity, and Ki-67 AQUA score from a previous study (29) was conducted for LPBC and CD3, CD8, and CD20 continuous AQUA scores. Only CD20 independently predicted pCR (P = 0.0186; OR, 5.368; 95% CI, 1.32–21.8), and high CD20 expressers had 5.5 times the rate of pCR (Table 2). With Ki-67 AQUA score removed, small tumor size, node-negative status, HER2 positivity, and increased CD3 (P = 0.0329) and CD20 (P = 0.0064) expressions all significantly predicted pCR. LPBC and high CD8 expression trended with pCR (data not shown).
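For readers who want to check that a reported OR, confidence interval, and P value hang together, the sketch below back-calculates a Wald statistic from the OR and its 95% CI. It assumes the interval is a symmetric Wald interval on the log-odds scale, which the text does not state; the function name `wald_from_or` is hypothetical.

```python
import math

def wald_from_or(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover (beta, se, z, two-sided p) from an OR and its 95% CI.

    Assumption: CI = exp(beta +/- z_crit * se), i.e. a Wald interval.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return beta, se, z, p

# Univariate values as reported in the text.
_, _, _, p_cd3 = wald_from_or(2.512, 1.13, 5.57)
_, _, _, p_cd20 = wald_from_or(1.799, 1.19, 2.72)
```

Under that assumption the recovered P values land close to the reported 0.023 for CD3 and 0.0053 for CD20, with small differences attributable to rounding of the published interval bounds.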
Table 2. Univariate and multivariate logistic regression analysis of the prediction of pCR by histopathologic assessment and AQUA analysis for markers of TILs
 | Univariate | Multivariate | Multivariate | Multivariate | Multivariate | |||||
---|---|---|---|---|---|---|---|---|---|---|
 | | LPBC | CD3 | CD8 | CD20 | |||||
 | | | AQUA score | AQUA score | AQUA score | |||||
Variable | OR (95% CI) | P | OR (95% CI) | P | OR (95% CI) | P | OR (95% CI) | P | OR (95% CI) | P |
Age | ||||||||||
<50 (n = 56) | 1 | 0.118 | 1 | 0.210 | 1 | 0.234 | 1 | 0.131 | ||
≥50 (n = 39) | 4.92 (0.666–36.4) | 3.340 (0.508–22.0) | 3.053 (0.486–19.2) | 6.322 (0.579–69.0) | ||||||
Tumor size | ||||||||||
<2 cm (n = 10) | 30.837 (1.28–7.46) | 0.035 | 25.597 (0.805–8.14) | 0.066 | 24.527 (0.789–7.62) | 0.068 | 237.504 (1.85–30.463) | 0.027 | ||
2–5 cm (n = 68) | 1 | 1 | 1 | 1 | ||||||
≥5 cm (n = 16) | 8.471 (0.340–2.11) | 0.193 | 9.429 (0.364–2.45) | 0.177 | 10.436 (0.403–2.70) | 0.158 | 37.342 (0.465–2.999) | 0.106 | ||
Nuclear grade | ||||||||||
Grade 1–2 (n = 49) | 1 | 0.466 | 1 | 0.605 | 1 | 0.808 | 1 | 0.126 | ||
Grade 3 (n = 43) | 0.460 (0.057–3.71) | 0.570 (0.068–4.79) | 0.786 (0.113–5.46) | 0.097 (0.005–1.93) | ||||||
Nodal status | ||||||||||
Node negative (n = 33) | 1 | 0.123 | 1 | 0.051 | 1 | 0.055 | 1 | 0.133 | ||
Node positive (n = 48) | 0.244 (0.041–1.46) | 0.180 (0.032–1.01) | 0.185 (0.033–1.04) | 0.209 (0.027–1.61) | ||||||
ER status | ||||||||||
ER negative (n = 35) | 1 | 0.667 | 1 | 0.678 | 1 | 0.848 | 1 | 0.900 | ||
ER positive (n = 58) | 0.542 (0.034–8.77) | 0.548 (0.032–9.36) | 0.766 (0.050–11.7) | 1.211 (0.061–24.0) | ||||||
PR status | ||||||||||
PR negative (n = 43) | 1 | 0.444 | 1 | 0.347 | 1 | 0.280 | 1 | 0.106 | ||
PR positive (n = 50) | 0.373 (0.030–4.66) | 0.309 (0.027–3.57) | 0.267 (0.024–2.93) | 0.100 (0.006–1.64) | ||||||
HER2 status | ||||||||||
HER2 negative (n = 67) | 1 | 0.196 | 1 | 0.253 | 1 | 0.211 | 1 | 0.175 | ||
HER2 positive (n = 27) | 3.830 (0.499–29.4) | 3.388 (0.417–27.5) | 3.712 (0.475–29.0) | 6.948 (0.421–1.15) | ||||||
Ki-67 AQUA | ||||||||||
Low Ki-67 (n = 41) | 1 | 0.039 | 1 | 0.056 | 1 | 0.063 | 1 | 0.024 | ||
High Ki-67 (n = 43) | 5.372 (1.089–26.5) | 4.706 (0.962–23.0) | 4.902 (0.916–26.2) | 9.791 (1.35–71.1) | ||||||
LPBC | ||||||||||
Non-LPBC (n = 85) | 1 | 0.043 | 1 | 0.983 | ||||||
LPBC (n = 8) | 4.773 (1.05–21.6) | N/A | ||||||||
CD3 | ||||||||||
Low CD3 (n = 56) | 1 | 0.023 | 1 | 0.401 | ||||||
High CD3 (n = 31) | 2.512 (1.13–5.57) | 1.821 (0.449–7.38) | ||||||||
CD8 | ||||||||||
Low CD8 (n = 50) | 1 | 0.029 | 1 | 0.592 | ||||||
High CD8 (n = 37) | 1.999 (1.08–3.72) | 1.331 (0.468–3.79) | ||||||||
CD20 | ||||||||||
Low CD20 (n = 54) | 1 | 0.005 | 1 | 0.019 | ||||||
High CD20 (n = 33) | 1.799 (1.19–2.72) | 5.368 (1.32–21.8) |
ROC curves were generated to assess the sensitivity and specificity of CD3, CD8, and CD20 AQUA score and H&E TILs for predicting pCR. The most sensitive and specific marker was CD20 (AUC 0.685) whereas CD3 (AUC 0.626), CD8 (AUC 0.653), and H&E TILs (AUC 0.672) were somewhat less sensitive and specific (Supplementary Fig. S6).
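The area under an ROC curve equals the probability that a randomly chosen responder has a higher score than a randomly chosen nonresponder, with ties counted as one half. The sketch below computes the AUC that way; it is an illustration of the statistic and not a reconstruction of the software used in the study, and the name `roc_auc` is hypothetical.

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC via pairwise comparison of positive and negative cases.

    scores: continuous marker values (e.g., an AQUA score per case)
    labels: 1 for pCR, 0 for no pCR
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one case in each class")

    # Compare every positive score with every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))

# Toy example with made-up values (not cohort data).
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which puts the reported values of 0.626 to 0.685 in context.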
Analysis stratified by molecular subtype
Association of CD3, CD8, and CD20 expression with pCR was stratified by ER, HER2, and triple-negative status. High CD8 expression trended with pCR among ER-negative patients, and increased CD20 AQUA score was significantly predictive (P = 0.0015) among ER-positive patients. Higher CD8 (P = 0.0005) and CD20 (P = 0.0021) expressions predicted pCR among HER2-negative cases, whereas higher CD20 expression also predicted pCR (P = 0.0386) among HER2-positive patients. Among triple-negative patients, increased CD8 expression significantly predicted pCR (P = 0.0386), whereas higher CD3 and CD20 expression trended with pCR (Supplementary Fig. S7).
Discussion
In an effort to objectively assess TILs, we have developed an automated, reproducible method for in situ measurement of lymphocyte infiltrate, quantifying expression of up to three TIL subpopulations within the tumor and adjacent stroma. We validated this method by comparison with flow cytometry on tonsil, and demonstrated reproducibility on breast tissue. For breast tumors before neoadjuvant treatment, high stromal expression of CD3, CD8, and CD20 each predicted pCR in univariate analysis. Moreover, CD20 predicted pCR in multivariate analysis independently of age, tumor size, nuclear grade, nodal metastasis, ER, PR, and HER2 status, and Ki-67 AQUA score. Agreement between this automated objective assay and traditional semiquantitative pathologist estimates is very good.
This is the first study to significantly correlate CD20 expression with response to neoadjuvant chemotherapy in both univariate and multivariate analysis. Previously, this association has been equivocal. Although one study demonstrated this correlation in univariate analysis (14), another study demonstrated no significant association and that genetic T-cell markers but not B-cell markers predicted chemotherapeutic response (17). This discrepancy could be attributed to localization of CD20-positive lymphocytes. In this study, CD20-expressing lymphocytes were often clustered in a few FOVs, and therefore could be easily missed in tissue microarrays with minimal stroma. In another study, B-cell–deficient mice were capable of an immune response, whereas T-cell–deficient mice were not (33). Although that study suggests that B cells are nonessential for an immune response against cancer, B cells may have antitumor effects through several mechanisms, including generation of autoantibodies against the tumor, direct cytotoxicity by granzyme B production, proimmunogenic cytokine secretion, and antigen presentation to T cells (11). The increased sensitivity and specificity of CD20 compared with CD3 and CD8 in our study support an important role for B cells in response to chemotherapy. Because our study is relatively small, we look forward to future studies to validate this result.
Flow cytometry is commonly used to quantify and characterize lymphocytes and cytokines in several diseases, including cancer (12, 34, 35), and can rapidly analyze up to 18 parameters. However, in solid tumors, it has not seen broad adoption in the clinical setting. One reason for this may be the importance of the architectural location of the infiltrating cells. Whether by traditional methods or our automated method, the spatial relationships are preserved, facilitating assessment of infiltrating cells in the context of the adjacent tumor. We differentiate tumor epithelium from stroma, whereas flow cytometry reveals nothing about stromal or intratumoral localization or about heterogeneity within a lymphocyte population (35). Here, we accurately localized lymphocyte subpopulations in tonsil, with CD20-positive B cells exclusively in the germinal centers and CD3- and CD8-positive T cells primarily in the interfollicular areas. Moreover, the in situ assay does not require a cell suspension, in which dispase and collagenase could disrupt cell surface biomarker expression (36, 37). In our quantitative assay, lymphocyte quantification by immunofluorescence-based multiplexing and flow cytometry was concordant. Although not perfect, minor differences could be attributable to tissue processing, including formalin fixation. Interestingly, this variability was minimized in samples with more FOVs.
Recently, an international group led by Galon and colleagues (19) proposed an “Immunoscore” that uses immune infiltration to develop a quantitative assay with prognostic and predictive power, feasibility, inexpensiveness, robustness, reproducibility, and standardization. Their proposed Immunoscore measures CD3 and CD8 infiltration at the central tumor and invasive margin using automated SpotBrowser image software to quantify immune infiltrate, ultimately producing a discrete score from I0 to I4 (24). This effort is notable but is specifically designed for colon cancer. Breast cancer is a distinct disease, and similar efforts are underway for it (38). In contrast to both traditional methods and the Immunoscore of Galon and colleagues, our assay reports lymphocyte infiltration as continuous data. Moreover, our assay analyzes three TIL biomarkers and addresses the contribution of each lymphocyte subpopulation to prognostic and predictive studies. Future studies will be necessary to assess the generality of our assay, including assessment in multiinstitutional trials and across quantitative platforms.
This work has a number of limitations. Perhaps most significant is the relatively small size of the cohort and nonuniform patient treatment. Future larger studies are required to validate these results before introduction into the clinic. A second and also significant limitation is the fact that this score is dependent on a quantitative immunofluorescent platform. Although many institutions have these platforms, they are almost exclusively in the research domain. However, at least two commercial CLIA laboratories now provide patient data to clinicians based on quantitative immunofluorescence. We are optimistic that the increased reproducibility and the increased objectivity of the data will lead to increased popularity and adoption of these approaches. Another limitation is that TIL quantification is not reported as a traditional percentage, but rather an AQUA score. This score represents intensity divided by the area, thus simulating a concentration. Because our method measures activation markers of T and B cells, concentration may be relatively more representative than the percentage. However, further data are required to verify this assertion.
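To illustrate what "intensity divided by the area" means in practice, the following is a simplified sketch that sums marker signal over the pixels of a compartment (for example, stroma) and divides by the number of pixels in that compartment. The function name, the list-of-lists image representation, and the example values are assumptions made for illustration; the sketch does not attempt to reproduce the AQUA software itself.

```python
def compartment_score(intensity, mask):
    """Mean marker intensity inside a compartment mask.

    intensity: 2D list of pixel intensities for one marker channel
    mask:      2D list of booleans, True where the pixel belongs to the
               compartment of interest (hypothetical stromal mask)
    Returns total intensity within the mask divided by the mask area.
    """
    total = 0.0
    area = 0
    for intensity_row, mask_row in zip(intensity, mask):
        for value, inside in zip(intensity_row, mask_row):
            if inside:
                total += value
                area += 1
    if area == 0:
        raise ValueError("compartment mask is empty")
    return total / area
```

With `intensity = [[10, 20], [30, 40]]` and `mask = [[True, False], [True, True]]`, the score is (10 + 30 + 40) / 3. Because the denominator is area rather than cell count, the result behaves like a concentration of signal and not like a percentage of positive cells.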
In summary, we have developed an objective immunofluorescence-based assay for detecting and quantifying TILs in situ. This method measures CD3, CD8, and CD20 expressions on one slide and reflects the location and heterogeneity of different lymphocyte subpopulations. We successfully validated this assay by comparison with flow cytometry on tonsil tissue and compared it with pathologist TIL estimates. We demonstrated that CD3, CD8, and CD20 predict pCR following neoadjuvant chemotherapy, with CD20 as the most sensitive and specific marker. We believe this method offers potential advantages over existing lymphocyte quantification methods, including objectivity and reproducibility. However, future use at multiple institutions in a range of clinical settings will be required to determine its true value.
Disclosure of Potential Conflicts of Interest
D.L. Rimm reports receiving commercial research grants from Genoptix, Gilead, and Kolltan and is a consultant/advisory board member for ACD Inc., Bristol-Myers Squibb, and Genoptix. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: J.R. Brown, D.R. Lannin, D.L. Rimm, V. Bossuyt
Development of methodology: J.R. Brown, D.L. Rimm, V. Bossuyt
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J.R. Brown, D.R. Lannin, C. Nixon, V. Bossuyt
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J.R. Brown, H. Wimberly
Writing, review, and/or revision of the manuscript: J.R. Brown, H. Wimberly, D.R. Lannin, C. Nixon, D.L. Rimm, V. Bossuyt
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J.R. Brown, D.R. Lannin
Study supervision: D.L. Rimm, V. Bossuyt
Acknowledgments
The authors thank Lori Charette and Yale Tissue Pathology Services for histology services. The authors also thank Curtis Perry and the Yale Cell Sorter Core Facility for technical support with flow-cytometry experiments.
Grant Support
This work was funded by a grant from the Breast Cancer Research Foundation and by the NIH (grant number R-01 CA 114277 and Medical Scientist Training Program grant TG T32GM07205).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.