Abstract
The immune microenvironment in lung squamous cell carcinoma (LUSC) is not well understood, with interactions between the host immune system and the tumor, as well as the molecular pathogenesis of LUSC, awaiting better characterization. To date, no molecularly targeted agents have been developed for LUSC treatment. Identification of predictive and prognostic biomarkers for LUSC could help optimize therapy decisions. We sequenced whole exomes and RNA from 101 tumors and matched noncancer control Korean samples. We used the information to predict subtype-specific interactions within the LUSC microenvironment and to connect genomic alterations with immune signatures. Hierarchical clustering based on gene expression and mutational profiling revealed subtypes that were either immune defective or immune competent. We analyzed infiltrating stromal and immune cells to further characterize the tumor microenvironment. Elevated expression of macrophage 2 signature genes in the immune competent subtype confirmed that tumor-associated macrophages (TAM) linked inflammation and mutation-driven cancer. A negative correlation was evident between the immune score and the amount of somatic copy-number variation (SCNV) of immune genes (r = −0.58). The SCNVs showed a potential detrimental effect on immunity in the immune-deficient subtype. Knowledge of the genomic alterations in the tumor microenvironment could be used to guide design of immunotherapy options that are appropriate for patients with certain cancer subtypes. Cancer Immunol Res; 6(7); 848–59. ©2018 AACR.
Introduction
Lung cancer is the second leading cause of death in Korea. The most common type of primary lung cancer, lung adenocarcinoma, has been characterized at the molecular level (1, 2). Lung squamous cell carcinoma, which accounts for 30% of all lung cancers (3), is not well characterized due to poor understanding of the cancer's genomic evolution (4) and the antitumor activity of immune cells (5, 6). Genomic alterations in the tumor characterize various stages of cancer progression. Immune defenses, on the other hand, are governed by tumor stroma, including basement membrane, extracellular matrix, vasculature, and cells of the immune system (7–9). Most cells in tumor stroma have some capacity to suppress a tumor, although this capacity changes as the cancer progresses; invasion and metastasis can follow (10–13).
Immune and stromal characteristics have emerged as prognostic and predictive factors that could be used to guide a personalized approach in cancer immunotherapy (14, 15). Analyses of genomic alterations, especially somatic mutations, have been used to predict response to immunotherapy (16, 17). Here, we used genomic and transcriptomic analysis to define molecular subtypes of tumors with immune responses. We show that genomic alterations affect the tumor microenvironment and tumor development in a subtype-specific manner. The data show how genomic alterations and tumor microenvironment impact cancer proliferation and invasion, and how predicted roles of immune cells and their interactions with cancer cells in lung squamous cell carcinoma (LUSC) might affect cancer therapy and patient survival.
Materials and Methods
RNA and whole-exome sequencing
All protocols of this study were approved by the Institutional Review Board of Seoul National University Hospital (IRB No:1312-117-545).
One hundred and one cases of lung squamous cell cancer samples, taken between 2011 and 2013, were included. Of these 101 patients we excluded two patients, a patient treated with one cycle of weekly docetaxel 65 mg and cisplatin 48 mg regimen preoperatively, and another patient who died of massive pulmonary embolism at 16 days after operation, from subsequent survival analysis. All the tumor and matched adjacent noncancer control tissue specimens were grossly dissected immediate after surgery and preserved in liquid nitrogen. Data on clinical features such as smoking history, pathologic TNM stage, tumor size, and degree of differentiations were collected (Table 1; Supplementary Table S1). For RNA-seq, we extracted RNA from tissue using RNAiso Plus (Takara Bio Inc.), followed by purification using RNeasy MinElute (Qiagen). RNA was assessed for quality and was quantified using an RNA 6000 Nano LabChip on a 2100 Bioanalyzer (Agilent Technologies). The RNA-seq libraries were prepared as previously described (18).
Clinical data summary
. | Korean (n = 101) . |
---|---|
Patient characteristics . | Number of patients . |
Age at diagnosis, years | |
Median | 70 |
Range | 35–83 |
Sex | |
Male | 95 |
Female | 6 |
Smoking status | |
Never-smoker | 12 |
Former smoker | 61 |
Current smoker | 28 |
Median follow-up, months | 45 |
Tumor stage | |
I | 51 |
II | 29 |
III | 21 |
IV | 0 |
T stage | |
T1 | 27 |
T2 | 58 |
T3 | 15 |
T4 | 1 |
N stage | |
N0 | 62 |
N1 | 24 |
N2 | 15 |
Recurrancy | 31 |
Total LN | |
Median | 30 |
Range | 5–66 |
. | Korean (n = 101) . |
---|---|
Patient characteristics . | Number of patients . |
Age at diagnosis, years | |
Median | 70 |
Range | 35–83 |
Sex | |
Male | 95 |
Female | 6 |
Smoking status | |
Never-smoker | 12 |
Former smoker | 61 |
Current smoker | 28 |
Median follow-up, months | 45 |
Tumor stage | |
I | 51 |
II | 29 |
III | 21 |
IV | 0 |
T stage | |
T1 | 27 |
T2 | 58 |
T3 | 15 |
T4 | 1 |
N stage | |
N0 | 62 |
N1 | 24 |
N2 | 15 |
Recurrancy | 31 |
Total LN | |
Median | 30 |
Range | 5–66 |
For whole-exome sequencing, genomic DNA was extracted and 3 μg from each sample was sheared and used for the construction of a paired-end sequencing library as described in the protocol provided by Illumina. Enrichment of exonic sequences was then performed for each library using the SureSelect Human All Exon 50Mb Kit (Agilent Technologies) following the manufacturer's instructions.
Libraries for RNA and whole-exome sequencing were sequenced with Illumina TruSeq SBS Kit v3 on a HiSeq 2000 sequencer (Illumina Inc.) to obtain 100-bp paired-end reads. The image analysis and base calling were performed using the Illumina pipeline (v1.8) with default settings.
RNA-seq analysis
To characterize the LUSC transcriptome profile in cancer and noncancer control cells, we performed RNA-seq for 101 LUSC and matched noncancer control samples. Total RNA extracted from lung specimens and depleted of ribosomal RNA was sequenced at the desired depth (100×) on RNA-Seq (Illumina HiSeq). The reads were aligned to the human genome (version GRCh37) with the Spliced Transcripts Alignment to a Reference (STAR) alignment software. The preprocessing pipeline on the GTAK website was followed (19). The raw read counts were generated using HTSeq-count for each annotated gene.
Unsupervised subtype clustering
With the Ensembl gene set, the number of raw reads aligned to each gene was computed by HT-seq count and was normalized by the Variance Stabilizing Data (VSD) method with use of the R package DEseq2. The variance for each gene was calculated, and the top 1,000 genes by variance were selected for PCA analysis (20). PCA analysis using the 1,000 most variable genes was conducted with all tumor and noncancer control samples. Samples were clustered based on principal components into three groups noncancer control with 95% confidence interval by hierarchical clustering methods as implemented in the R package rgl (21). When analyzing RNA sequencing data, batch effects should be considered if experimental conditions and library preparation varied. All of our samples were processed in the same batches, thus additional batch-effect corrections were not necessary (22).
Differentially expressed gene analysis
Differentially expressed genes of tumor subtypes compared with noncancer control expression in noncancer control cells were determined by the significance criteria (adjusted P < 0.05, |Log2 (fold change)| ≥ 1, and base mean ≥ 100) as implemented in the R packages DESeq2 and edgeR. The adjusted P value for multiple testing was calculated by using the Benjamini–Hochberg correction from the computed P value (23). The centered VSD values of the differentially expressed gene list were applied to the array hierarchical clustering algorism (Cluster 3.0) with uncentered correlation and average linkage (24). The gene expression pattern was visualized with use of JAVA treeview. The hierarchical tree by arrays was generated by the clustering process and two types of gene sets in differentially expressed genes (subtype A-UP and B-DOWN, subtype A-DOWN and B-UP) were selected and enriched for Gene Ontology (GO) gene sets by Gene Set Enrichment Analysis (GSEA) in order to determine genes enriched in ranked gene lists.
Fragments per kilobase million (FPKM) calculation and normalization
Raw reads (HTseqcounts) were normalized using FPKM as implemented in the R package edgeR, and the FPKM values were transformed to log2 values and adjusted to the median-centered gene expression values by subtracting the row-wise median from the expression values in each row (Cluster 3.0). The centered and log2-transformed VSD and FPKM expression were used to illustrate gene expression pattern in a heat map.
GSEA and network analysis
GO analysis of the gene expression data was performed using GSEA (v2.24) desktop tools (permutation = 1,000) and visualized by the Enrichment Map tool in Cytoscape [P ≤ 0.05, false discovery rate (FDR) q value ≤ 0.1, and similarity ≤ 0.5; ref. 25].
Tumor microenvironment analysis
The fractions of stromal and immune cells in tumor samples were estimated by Estimation of STromal and Immune cells in MAlignant Tumours using Expression data (ESTIMATE) scores, with predictions of tumor purity based on the absolute method previously reported (26). The abundance of six infiltrating immune cell types (B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells) in the two subtypes of LUSC was estimated with the Tumor IMmune Estimation Resource (TIMER) algorithm (27).
Immune score, stromal score, tumor purity, and cell-cycle score
Immune score, stromal score, and tumor purity were made using ESTIMATE. The gene set for cell-cycle score was used to calculate the cell-cycle score from the one in Davoli and colleagues (28).
Statistical analyses
Quantitative data are presented as mean ± standard deviation. We used R-3.2.3 to perform the statistical analyses. The normality of the variables was tested by the Shapiro–Wilk normality test (29). For two groups, significance (P value) for normally distributed variables was determined by an unpaired Student t test, and nonnormally distributed variables were analyzed by the Mann–Whitney U test. For more than two groups, Kruskal–Wallis and one-way ANOVA tests were used for the nonparametric and parametric methods, respectively (30). Statistically significant differences were tested at P values < 0.05. R value was computed by Pearson and distance correlation. Survival graph was plotted by the Kaplan–Meier method, and comparison between the subtypes was analyzed with the log-rank test, and Cox proportional hazards model was also used for survival analysis (SPSS for Windows, version 22; SPSS Inc.).
Somatic mutation detection using whole-exome sequencing
To find somatic mutations, we performed whole-exome sequencing (100×). Reads were aligned to the NCBI human reference genome (hg19) using BWA (31). Picard was applied to mark duplicates and we used Genome Analysis Toolkit (GATK Indel Realigner) to improve alignment accuracy. Somatic single-nucleotide variants (SNV) from 101 LUSC samples with matched noncancer control samples were called using MuTect (32). GATK's Haplotype Caller was also used for indel detection. All variants were annotated with information from several databases using ANNOVAR (33).
Somatic copy-number variant detection
Calculation of SCNV levels
We obtained amplified and deleted copy-number variants using EXCAVATOR and GISTIC analyses. Arm and focal SCNV levels of each patient were respectively calculated by summing the copy-number changes at each copy-number event. Arm, chromosome, and focal CNV level were normalized to the mean and standard deviation among the samples (28).
Prediction of neoantigens
We predicted neoantigens using the pVAC-Seq pipeline (36). We used nonsynonymous mutations to follow the pVAC-seq pipeline. Amino acid changes and transcript sequences were annotated by variant effect predictor. Epitopes predicted by HLAminer and were filtered by RNA expression (FPKM>1) and coverage (tumor coverage >10× and noncancer control coverage >5×).
Availability of data and material
Whole-exome and RNA sequencing data are available under the NCBI Sequence Read Archive (SRA) study accession no. SRP114315.
Results
Identification of LUSC subtypes
In this study, 101 LUSC and matched noncancer control samples were used to discover significant differential gene expression, and principal component analysis (PCA) on entire samples was performed to identify distinctive clusters based on gene variability between LUSC and noncancer control samples. PCA with the top 1,000 most variable genes and unsupervised hierarchical clustering with the k-means algorithm were applied to RNA-seq data and distinguished noncancer control from tumor samples. In contrast, 19 tumor samples overlapped and were more closely associated with the noncancer control group with 95% confidence interval ellipsoids (Fig. 1A; Supplementary Fig. S1). Additional PCA with 101 tumor samples distinguished the 19 tumor samples (subtype B) from 82 other major tumor samples (subtype A) at the 95% confidence interval. Thus, PCA revealed two molecular subtypes, A and B. These two subtypes were also distinguishable in The Cancer Genome Atlas (TCGA) LUSC cohort (n = 431) by PCA and unsupervised hierarchical clustering at 95% confidence interval (Supplementary Fig. S2).
Identification of the subtypes of LUSC. A, Principal component analysis (PCA) of the top 1,000 most variable genes among the subtype A, B, and noncancer control were performed across all samples. 3D PCA scores plots of top 1,000 of most variable genes was drawn as meshes containing cancer and noncancer control points (left) and subtype A and B points (right) based on K-means clustering (k-means = 2) on the first three PCs with 95% confidence interval ellipsoids. B, Ratio of somatic synonymous and nonsynonymous mutations in mutations per megabase and subtypes were classified. C, Name of significantly mutated genes (left), distribution of mutations across 101 LUSC, and frequency of significantly mutated genes (right) were plotted. D, The mutational signature revealed by somatic mutations in whole-exome sequencing. E, Arm-level CNV, focal-level CNV, and total level CNV of individual sample displayed in each column. F, Significant, focally amplified (red), and deleted (blue) regions are plotted across chromosomal locations.
Identification of the subtypes of LUSC. A, Principal component analysis (PCA) of the top 1,000 most variable genes among the subtype A, B, and noncancer control were performed across all samples. 3D PCA scores plots of top 1,000 of most variable genes was drawn as meshes containing cancer and noncancer control points (left) and subtype A and B points (right) based on K-means clustering (k-means = 2) on the first three PCs with 95% confidence interval ellipsoids. B, Ratio of somatic synonymous and nonsynonymous mutations in mutations per megabase and subtypes were classified. C, Name of significantly mutated genes (left), distribution of mutations across 101 LUSC, and frequency of significantly mutated genes (right) were plotted. D, The mutational signature revealed by somatic mutations in whole-exome sequencing. E, Arm-level CNV, focal-level CNV, and total level CNV of individual sample displayed in each column. F, Significant, focally amplified (red), and deleted (blue) regions are plotted across chromosomal locations.
To identify the pattern of genomic alterations in LUSC, we sequenced the exome of independent noncancer control samples (n = 101) and tumor samples (n = 101). The number of total mutations in subtype B was less than that in subtype A (P < 0.001 by Mann–Whitney U test; Fig. 1B; Supplementary Fig. S3A). Mutations in genes encoding TP53, NAV3, CDH10, KMT2D, NFE2L2, CTNNA3, KEAP1, NTRK3, RB1, NOTCH1, PTEN, FGFR2 and EGFR were also identified in the current study (Fig. 1C). Mutational signature combinations in subtype A were significantly different from those in subtype B (P < 0.01 by Mann–Whitney U test) showing more irregular patterns [Fig. 1D; Supplementary Fig. S3D, (32)]. We compared the burden of somatic copy-number variation (SCNV; ref. 28). Arm-level SCNVs (length > 98% of a chromosome arm), focal-level SCNVs (length < 98% of a chromosome arm), and total SCNVs (all of a chromosome), which comprise the burden of SCNV, were mostly found in subtype A (Fig. 1E and F). To elucidate the effect of genomic alteration on the tumor microenvironment, we also investigated major histocompatibility complex (MHC) and tumor-specific antigen (TSA) level, and found that subtype A carried more deletions of MHC regions than did subtype B (Supplementary Fig. S3C). TSA level was also higher in subtype A than in subtype B (P < 0.001 by Mann–Whitney U test; Supplementary Fig. S3D). Our transcriptome analysis thus defined two LUSC subtypes, A and B, with different patterns of somatic mutations.
Identification of differentially expressed genes in subtypes
We analyzed differential expression of 4,807 genes in subtype A and 1,471 genes in subtype B compared with noncancer control expression, and matched the significance criteria applied in this study (DESeq2: false discovery rate (FDR) < 0.1, P < 0.05). We derived a heat map depicting 5,387 differentially expressed genes in subtype A and B by excluding the overlapped genes (Supplementary Table S2). Differentially expressed genes were significantly enriched in several pathways through GO term enrichment analysis. Genes for which expression was upregulated were mainly involved with the cell-cycle and DNA replication (n = 1,826) in subtype A and with immune and defense pathways (n = 1,876) in subtype B (Fig. 2A).
Molecular subtypes at transcriptomic expression level. A, Heat map depicted 5,387 differentially expressed genes in subtypes A and B, and two distinct clustered genes in differentially expressed genes were selected. Top 10 GO gene sets in either clusters were determined based on the rank of enrichment −log10(P value) of pathway and the matched significance criteria (P < 0.05 and FDR q value < 0.1) B, Network visualization based on gene enrichment analysis. Nodes represent subtype A (red) and B (blue) networks, and green edges represent genetic overlap between networks using GO enrichment analysis (permutation = 1,000). The size of each node reflects the number of genes included in the network. Genes in significant networks were annotated and grouped with simplified GO terms. Networks meeting the cutoff conditions detailed at the bottom of the figure (right) were visualized with the Enrichment Map plugin for Cytoscape (P ≤ 0.05, FDR q value ≤ 0.1, and similarity ≤ 0.5).
Molecular subtypes at transcriptomic expression level. A, Heat map depicted 5,387 differentially expressed genes in subtypes A and B, and two distinct clustered genes in differentially expressed genes were selected. Top 10 GO gene sets in either clusters were determined based on the rank of enrichment −log10(P value) of pathway and the matched significance criteria (P < 0.05 and FDR q value < 0.1) B, Network visualization based on gene enrichment analysis. Nodes represent subtype A (red) and B (blue) networks, and green edges represent genetic overlap between networks using GO enrichment analysis (permutation = 1,000). The size of each node reflects the number of genes included in the network. Genes in significant networks were annotated and grouped with simplified GO terms. Networks meeting the cutoff conditions detailed at the bottom of the figure (right) were visualized with the Enrichment Map plugin for Cytoscape (P ≤ 0.05, FDR q value ≤ 0.1, and similarity ≤ 0.5).
Further functional gene enrichment analysis using GSEA (v2.2.3; ref. 25) proved that these differentially expressed genes were enriched in pathways related to cytoskeleton, mitosis, cell cycle, and chromatin modification in subtype A and pathways related to defense response, immune response, cytokine, and metabolic and biosynthetic process in subtype B. Using network analysis with Cytoscape (P ≤ 0.05, FDR q-value ≤ 0.1, and similarity ≤ 0.5), we found that the increase in expression and activation of pathway-related genes in subtype B was linked to regulation of immune cell differentiation and apoptotic processes, which would increase immune cell abundance. Genes characterizing subtype A, on the other hand, were linked to cell-cycle and microtubule pathways that would enhance cancer cell proliferation (Fig. 2B). Consistent with GSEA, network analysis showed that cell cycle and immune system gene sets were differentially correlated in each subtype. Both immune response and cell-cycle pathways were differently upregulated in the subtypes, showing an interaction between immune response and somatic mutations.
The immune landscape of the microenvironment in LUSC
To identify the roles of immunity and distribution of infiltrating immune cells in tumor and noncancer control samples, we computed stromal and immune scores along with tumor purity based on the ESTIMATE method (26). The stromal and immune scores in subtype B compared with subtype A were significantly increased (stromal core P = 9.10 × 10−25; immune score P = 7.51 × 10−19 by unpaired Student t test; Fig. 3A; Supplementary Fig. S4). The high immune scores suggested that recruitment of immune cells was more enhanced in subtype B than in A, which implicates not only immune cells surrounding the tumor but also immune cells infiltrating the tumor (26). This observation may reflect the fact that the different portion of infiltrating immune and stromal cells was intermixed in a dissecting tumor, and this can be useful for predicting the degree of tumor purity and tumor microenvironment (tumor purity P = 9.80 × 10−20 by unpaired Student t test; Supplementary Fig. S4). As subtype B had more tumor-infiltrating immune cells than did subtype A, cytolytic activity of subtype B must have been higher. Indeed, the cytolytic activity score was higher in subtype B than in subtype A (P = 2.07 × 10−6 by unpaired Student t test; Supplementary Fig. S4). Cytolytic activity is a metric of immune-mediated cell destruction, and immune infiltration predicts more cytolytic activity in subtype B of LUSC (37). Similarly, TCGA LUSC cohort results identified different microenvironmental factors in the two subtypes. Subtype B (n = 328) in the TCGA LUSC cohort was consistent with the description of our LUSC subtype B (n = 19; higher immune and stromal scores, lower tumor purity, and higher CYT score; Supplementary Fig. S5). The TCGA cohort contained more subtype B than subtype A samples; the Korean and TCGA cohorts likely had different subtype compositions.
The immune landscape of the microenvironment in LUSC. A, The heat map depicted the expression of stromal genes and the tumor microenvironmental factor [cytolytic activity (CYT), purity, immune, stromal], which divided into three groups, describing activated stroma-rich samples (subtype A), normal and activated stroma-rich samples (subtype B), and normal stroma-rich samples (noncancer control) gene expression. B, Scatterplots between stromal and immune scores with tumor purity gradient were shown, and correlation coefficient was indicated by each of the subtypes. The color grading corresponds to the tumor purity, indexed as shown on the color bar at the bottom right of the figure. C, The abundance of infiltrating B cells, CD4+ T cells, CD8+ T cells, neutrophil, macrophages, and dendritic cells in two subtypes was estimated, and each P value was indicated by each of the subtypes (Mann–Whitney U test and Kruskal–Wallis test). Box represents the median (thick line) and the quartiles (line). D, The median-centered and log2-transformed expression level (log2fpkm+1) of M1 and M2 signature genes in subtypes and noncancer control was box plotted with corresponding Mann–Whitney U and Kruskal–Wallis test results. Box represents the median (thick line) and the quartiles (line).
The immune landscape of the microenvironment in LUSC. A, The heat map depicted the expression of stromal genes and the tumor microenvironmental factor [cytolytic activity (CYT), purity, immune, stromal], which divided into three groups, describing activated stroma-rich samples (subtype A), normal and activated stroma-rich samples (subtype B), and normal stroma-rich samples (noncancer control) gene expression. B, Scatterplots between stromal and immune scores with tumor purity gradient were shown, and correlation coefficient was indicated by each of the subtypes. The color grading corresponds to the tumor purity, indexed as shown on the color bar at the bottom right of the figure. C, The abundance of infiltrating B cells, CD4+ T cells, CD8+ T cells, neutrophil, macrophages, and dendritic cells in two subtypes was estimated, and each P value was indicated by each of the subtypes (Mann–Whitney U test and Kruskal–Wallis test). Box represents the median (thick line) and the quartiles (line). D, The median-centered and log2-transformed expression level (log2fpkm+1) of M1 and M2 signature genes in subtypes and noncancer control was box plotted with corresponding Mann–Whitney U and Kruskal–Wallis test results. Box represents the median (thick line) and the quartiles (line).
Immune cells are important in tumors. Stromal cells influence tumor proliferation and inflammation. In order to evaluate tumor-associated stroma, the effects of activated and normal stroma on two tumor subtypes were described by 50 stromal signature genes as previously reported (38). The heat map of LUSC and noncancer control cells using active and normal stroma genes indicated three groups: activated stromal-rich samples (subtype A), normal and activated stromal-rich samples (subtype B), and normal stromal-rich samples (noncancer control). Mean normalized gene expression of the 25 exemplar activated and normal stromal genes was higher in subtype B than in subtype A (activated stroma P = 0.0247; normal stroma P < 0.01 by Mann–Whitney U test: Fig. 3A; Supplementary Figs. S6A and S6B). The activated stromal signature genes were associated with carcinoma-associated fibroblasts (CAF) and were overexpressed in subtype B as compared with subtype A and noncancer control. Likewise, the normal stromal genes, which reflect the composition of immune cells, were more highly expressed in subtype B than in subtype A. Overexpression of growth factors and chemokines related to CAFs, such as HGF, FGF7, CXCL12, MMP2, IL6, CCL2, NFkB, mediated tumor promotion and aggressive invasion in subtype B (Supplementary Fig. S6C; ref. 39).
To analyze tumor microenvironment integration in subtypes A and B, we assessed the correlation of stromal and immune scores as well as tumor purity. There was a positive correlation between stromal and immune scores (subtype A, Pearson r, 0.79; distance r, 0.78; subtype B Pearson r, 0.46; distance r, 0.54); samples with high tumor purity showed low stromal and immune scores (Fig. 3B). Subtypes A and B differed in their association with stromal and immune scores, suggesting variation in microenvironments and in interactions with stromal and immune cells.
We estimated (with use of the TIMER algorithm) the abundance of tumor-infiltrating immune cells in six cell types (B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages, and dendritic cells) to predict immune cell profiling in subtypes A and B. Noncancer control A and noncancer control B subtypes had no significant differences. Dendritic cells were the most abundant cells among both tumor and noncancer control cells. Although macrophages were not the most abundant cells in tumors and noncancer control samples, macrophages seemed to be the most influential in a subtype-specific manner (P = 6.27 × 10−9 by Mann–Whitney U test; Figs. 3C and D). The proportion of macrophages in subtype B was significantly higher than in subtype A, supporting the previous observations that macrophages are present in large numbers in the tumor microenvironment and promote tumor progression and metastasis (40).
Intratumoral immune coordination was assessed by analyzing the correlation between selected immune cell markers. Pairwise comparisons of immune cell abundance levels were done by measuring Pearson correlation coefficients (r). The relationships implied by these correlations were visualized as r values (Supplementary Fig. S7). Correlation between immune cells was greater in tumor cells than in noncancer control cells, and the degree of correlation was higher in subtype B than in subtype A. Dendritic cells were more correlated with other immune cells in subtype B than in subtype A, and the correlations of dendritic cells with neutrophil and neutrophil with CD8+ T cells were increased in subtype B. Dendritic cells and neutrophils seemed to have immediate connection with CD8+ T cells, confirming previous findings that the cross-talk between neutrophils and recruited dendritic cells accelerates antigen presentation to T cells and generates an antigen-specific immune response (41, 42). Elevated expression of macrophage 2 signature genes in subtype B, a previously validated gene set (43), suggests that tumor-associated macrophages (TAM) promote tumor growth in subtype B cancer (Fig. 3D; Supplementary Figs. S8A and S8B). We examined the reproducibility of the immune parameters by subtype in both our LUSC and TCGA LUSC cohorts. All Immune scores, activated stromal gene expression, and macrophage 2 activity patterns by subtype were highly reproducible across both cohorts. LUSC and TCGA LUSC cohort subtype B showed higher immune infiltration (Supplementary Fig. S9). The immune microenvironment in subtype B was converted to escape mode during the immunoediting process of LUSC in order to help proliferation of resistant clones in an immunocompetent host. This complexity of subtype-specific microenvironments should inform therapeutic strategies for treatment of LUSC.
Impacts of somatic copy-number variants on tumor microenvironment
To identify the effect of genomic alterations on tumor microenvironments in LUSC, we investigated the association between somatic genomic alterations and the immune score. The SCNVs were significantly higher in subtype A (P < 0.001 by Mann–Whitney test; Fig. 4A) and were negatively correlated with the immune score (Pearson coefficient: −0.58) and with other immune-related properties such as the stromal score (stromal cell infiltration), CYT score (immune cytolytic activity), CD4+ T-cell infiltration (Pearson coefficient: −0.42), macrophage infiltration (Pearson coefficient: −0.39), and dendritic cell infiltration (Pearson coefficient: −0.41; Fig. 4B and C). These results suggested that the negative correlation between immune score and SCNVs in subtype A may influence the immune response of subtype A. Focal-level SCNVs contributed more to immune scores than arm-level SCNVs (Fig. 4D and E). In an association test, focal-level SCNVs were more highly correlated with immune score (Fig. 4F). These findings indicate that focal-level SCNVs are more influential on the tumor microenvironment and might be better targets for immunotherapy. Focal-level SCNVs had a stronger correlation with LUSC immunity and may make it easier to determine the target genes for drug interventions than arm-level SCNVs, because arm-level SCNVs are large genomic defects that may affect multiple targets (44, 45). To assess the role of focal-level SCNV in the tumor microenvironment, we investigated focal-level SCNVs in immune-related pathways via KEGG enrichment analysis. Subtype A had a high prevalence of SCNVs on immune-related pathways (B-cell receptor signaling pathway, chemokine signaling pathway, T-cell receptor signaling pathway, and Toll-like receptor signaling pathway of KEGG; Fig. 5). Genes harboring SCNV deletions, which lead to loss of function, were enriched in subtype A only for immune-related pathways of immune system processes, immune responses, and response to cytokine (Supplementary Fig. S10). These data suggested that focal-level SCNVs may drive the low immune response in tumor microenvironments and would be good targets to control immunity.
Impacts of genomic alterations on tumor microenvironments in LUSC. A, The level of total CNVs in cell-cycle–related gene sets is displayed across classes (left), and the level of total CNVs in immune-related gene sets is displayed across classes (right). B, The association between genomic alteration load (the number of mutations, the level of CNV) and cell-cycle score is shown (left) and the association between genomic alteration load and immune score is shown (right). C, The correlations between genomic alterations (CNVs, mutations) and immune-related properties were displayed. D, The cell-cycle scores are shown in each sample, and the high score densities and low score densities of cell-cycle score are plotted on the x and y axes. E, The immune scores are shown in each sample, and the high score densities and low score densities of immune score are plotted on the x and y axes. F, The correlation between CNV level (arm, focal) and cell-cycle score or immune score is displayed across subtype.
Impacts of genomic alterations on tumor microenvironments in LUSC. A, The level of total CNVs in cell-cycle–related gene sets is displayed across classes (left), and the level of total CNVs in immune-related gene sets is displayed across classes (right). B, The association between genomic alteration load (the number of mutations, the level of CNV) and cell-cycle score is shown (left) and the association between genomic alteration load and immune score is shown (right). C, The correlations between genomic alterations (CNVs, mutations) and immune-related properties were displayed. D, The cell-cycle scores are shown in each sample, and the high score densities and low score densities of cell-cycle score are plotted on the x and y axes. E, The immune scores are shown in each sample, and the high score densities and low score densities of immune score are plotted on the x and y axes. F, The correlation between CNV level (arm, focal) and cell-cycle score or immune score is displayed across subtype.
SCNV in LUSC immune pathway. The diagrams show the genes with SCNV in the four immune-related pathways and the percentage of samples with SCNV in immune-related pathway across classes; copy-number deletion (blue), copy-number amplification (pink), the percentage of samples with SCNV in subtype A (red box), the percentage of samples with SCNV in subtype B (blue box)
SCNV in LUSC immune pathway. The diagrams show the genes with SCNV in the four immune-related pathways and the percentage of samples with SCNV in immune-related pathway across classes; copy-number deletion (blue), copy-number amplification (pink), the percentage of samples with SCNV in subtype A (red box), the percentage of samples with SCNV in subtype B (blue box)
By analyzing the tumor microenvironment with RNA sequencing and whole-exome sequencing, we defined two immune-related subtypes: subtype A (immune defective) and subtype B (immune competent). The log2 VSD expression pattern between tumors and noncancer control tissue was consistent over the validation gene set with previously studied subtypes (Supplementary Fig. S11; ref. 46). The subtype A expression pattern resembled that of the classical subtype, in which TP63, AKR1C3, FOXE1, and TXN genes were over-expressed. Subtype B appeared to overexpress basal related genes (S100A7, MMP13, and SERPINB3) and secretory related genes (ARHGDIB and TNFRSF14). Except for the secretory-related genes, the other genes were all overexpressed in subtype A relative to subtype B. The expression pattern of previous validated genes differed between subtypes. Thus, our clustering method with gene selection for subtyping achieved similar results to those of the previous study. Our immune subtyping approach categorized the LUSC into immune-defective and -competent subtypes based on the pattern of infiltrating immune, stromal cells, immune cell composition. Our data also suggest that genomic alterations, especially SCNVs, decrease immune-related activities and immune cells in cell proliferation–related subtypes of LUSC, and that activated stroma and TAM lead to the evolution of cancer cells in the immune-related subtype. Although the subtypes differed in immune activation, mutation burden, and SCNV, the subtypes had no significant differences in clinical features or overall survival in both cohorts (our LUSC cohort P value = 0.223; TCGA LUSC cohort P value = 0.54; log-rank test; Supplementary Figs. S12 and S13). It was difficult to evaluate to what extent clinical features were affected by low immune activities due to the high SCNV in subtype A and the increased activity of TAMs and activated stroma in subtype B.
We found that all immune checkpoint genes were more highly expressed in subtype B than subtype A in both our LUSC samples and the TCGA LUSC cohort (Fig. 6A), and two independent cohorts had similar expression patterns for immune checkpoints (Fig. 6B). Both PD-1 and its ligand PD-L1 as well as other immune checkpoint genes are highly expressed in subtype B samples. PD-1 and PD-L1 are involved in immune tolerance by preventing stimulation of an immune response and inducing tumor immune escape (47) and might be valuable targets for new drugs in subtype B. However, difficulty with inducibility of PD-L1 protein expression and with accurate assays complicates their use. More reliable biomarkers are needed (48).
The expression pattern of immune checkpoints in LUSC. A, The expression of immune checkpoint genes was analyzed between subtypes in our LUSC samples (n = 101) and TCGA LUSC cohort (n = 431), respectively. The comparison of median-centered and log2-transformed expression (log2fpkm) of immune checkpoint genes was performed between subtypes, and P value was indicated by Mann–Whitney U or unpaired Student t test based on the normality. A two-color scale was used, with blue indicating low expression values and red representing highly expressed genes. B, The median-centered and log2-transformed expression level (log2fpkm) of immune checkpoint genes in both cohorts was box plotted with corresponding Kruskal–Wallis test results. Box represents the median (thick line) and the quartiles (line).
The expression pattern of immune checkpoints in LUSC. A, The expression of immune checkpoint genes was analyzed between subtypes in our LUSC samples (n = 101) and TCGA LUSC cohort (n = 431), respectively. The comparison of median-centered and log2-transformed expression (log2fpkm) of immune checkpoint genes was performed between subtypes, and P value was indicated by Mann–Whitney U or unpaired Student t test based on the normality. A two-color scale was used, with blue indicating low expression values and red representing highly expressed genes. B, The median-centered and log2-transformed expression level (log2fpkm) of immune checkpoint genes in both cohorts was box plotted with corresponding Kruskal–Wallis test results. Box represents the median (thick line) and the quartiles (line).
Discussion
Although the study of PD1 expression for immunotherapy has produced benefit in many clinical cases, some patients do not respond to checkpoint blockade (49). For such nonresponders, tumor-specific antigens with appropriate MHC-binding affinity might be more useful for immunotherapy (36, 50, 51). We analyzed the MHC region and TSA levels for both subtypes and found that subtype A had more TSA and more deletions of MHC regions than did subtype B. However, these results did not explain the role of the tumor microenvironment. Further study is needed to validate the results. In particular TSA content was predicted but not validated in the clinical environment.
Our genetic results thus provide evidence for an immune response to cancer in humans and indicate a mechanism of tumor-intrinsic resistance to cytolytic activity in tumors with a high burden of somatic mutations. Analysis of genomic alterations and their impact on the tumor microenvironment in a subtype-specific manner might identify patients who could benefit from cancer immunotherapies that boost the immune system.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: J.-S. Seo, A. Kim, J.-Y. Shin, Y.T. Kim
Development of methodology: A. Kim, J.-Y. Shin, J.-Y. Yun, J. Kim
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J.-Y. Shin, Y.H. Kim, S. Park, H.J. Lee, I.-K. Park, C.-H. Kang, J.-Y. Yun, J. Kim, Y.T. Kim
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): J.W. Lee, A. Kim, J.-Y. Shin
Writing, review, and/or revision of the manuscript: J.W. Lee, A. Kim, J.-Y. Shin, Y.T. Kim
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J.-S. Seo, J.W. Lee, A. Kim, Y.H. Kim, Y.T. Kim
Study supervision: J.-S. Seo, C.-H. Kang, Y.T. Kim
Acknowledgments
This work has been supported by Macrogen, Inc. (grant no. MGR17-01) and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP; No. NRF-2014R1A2A2A05003665).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.