Purpose: To reconcile the heterogeneity of thymic epithelial tumors (TET) and gain deeper understanding of the molecular determinants of TETs, we set out to establish a clinically relevant molecular classification system for these tumors.

Experimental Design: Molecular subgrouping of TETs was performed in 120 patients from The Cancer Genome Atlas using a multidimensional approach incorporating analyses of DNA mutations, mRNA expression, and somatic copy number alterations (SCNA), and validated in two independent cohorts.

Results: Four distinct molecular subtypes of TETs were identified. The most commonly identified gene mutation was a missense mutation in General Transcription Factor II-I (GTF2I group), which was present in 38% of patients. The next group was identified by unsupervised mRNA clustering of GTF2I wild-type tumors and represented TETs enriched in expression of genes associated with T-cell signaling (TS group; 33%). The remaining two groups were distinguished by their degree of chromosomal stability (CS group; 8%) or instability (CIN group; 21%) based upon SCNA analyses. Disease-free survival and overall survival were favorable in the GTF2I group and unfavorable in the CIN group. These molecular subgroups were associated with TET histology and clinical features including disease-free survival. Finally, we demonstrate high expression of PD1 mRNA and correlation of PD1 and CD8A in the TS subgroup.

Conclusions: Molecular subtyping of TETs is associated with disease-free and overall survival. Classification of TETs by a molecular framework could aid in the refinement of staging and in the discovery and development of rational treatment options for patients with TETs. Clin Cancer Res; 23(16); 4855–64. ©2017 AACR.

Translational Relevance

Thymic epithelial tumors (TET) are enigmatic tumors composed of epithelial and lymphocytic components. Subgrouping of TETs with a multidimensional molecular approach identified four molecular subtypes of TETs. The first group was identified by the most commonly identified gene mutation, a missense mutation in General Transcription Factor II-I (GTF2I group; 38%). The next group was identified by unsupervised mRNA clustering demonstrating enrichment in the expression of genes associated with T-cell signaling (TS group; 33%). The remaining two groups were distinguished by their degree of chromosomal stability (CS group; 8%) or instability (CIN group; 21%), based upon somatic copy number alteration. Classification of TETs by a molecular framework could aid in the refinement of staging and in the discovery and development of rational treatment options for patients with TETs.

Thymic epithelial tumors (TET) are enigmatic tumors composed of variable proportions of epithelial and lymphocytic components. TETs are rare tumors with an incidence of 0.32 per 100,000 people every year worldwide, and a simple extrapolation of this would estimate that only approximately 390 TET cases are diagnosed in the United States every year (1). Given the rarity of these tumors and the wide spectrum of their underlying cellular composition, our understanding of TET biology continues to evolve.

Two of the most important factors that bridge the biology and clinical behavior of these tumors are stage and histology. Stage describes the invasiveness and metastatic capability of these tumors, and the current Masaoka–Koga staging system has proven useful for estimating prognosis in TET patients (2–4). The World Health Organization (WHO) classification system is a description of tumor histology, particularly as it relates to the distribution of epithelial and lymphocytic cellular components of these tumors, and has proven to be an independent prognostic factor for patients with TETs (5, 6). The biology of TETs is complex, however, and can result in heterogeneous clinical outcomes even among patients of similar stage or histology (7, 8). As a result, at least 14 staging systems have been proposed during the last four decades, and five of these systems have been validated, to some degree, within multiple patient cohorts (4, 9, 10). Although the Masaoka–Koga system is the most common staging system used in clinical practice, tumor heterogeneity exists within the stages defined in this system. For example, capsular invasion, which separates Masaoka–Koga stages I and II, does not significantly affect survival, and stage III currently includes tumors with heterogeneous prognosis (8). Similarly, for the WHO histological classification system, previous reproducibility studies (7, 11) and meta-analyses showed poor agreement on the separation between A and AB TETs, between B1, B2, and B3 TETs, and between B3 thymomas and thymic carcinoma. The molecular determinants of TETs are just beginning to be unraveled (12, 13) and will provide additional insight into the stratification of prognosis and treatment of patients with these tumors.

Surgery is a pillar in the treatment of patients with TETs, but effective systemic therapy is critical for patients with unresectable or metastatic tumors, who presently have only 19% 10-year overall survival (OS; ref. 14). Chemotherapy is rarely curative in this patient population, and the development of targeted therapies has been hampered by the insufficient characterization of the genetic abnormalities of TETs (6). In addition to stratifying prognosis, molecular classification of TETs might provide a more accurate representation of tumor biology, and as our knowledge of targeted therapeutics advances, molecular subtyping will be important for guiding therapy.

Efforts of The Cancer Genome Atlas (TCGA) research network have provided a deeper understanding of the molecular alterations associated with a variety of tumors (15, 16). Only recently, however, has information on the genomic changes associated with TETs become available. With the advent of newer and more sensitive techniques in molecular biology and bioinformatics, there has been an incremental increase in the availability of high-quality data that will continue to expand our understanding of the molecular pathogenesis of TETs. Such deeper insight into the molecular determinants of TETs has potential to transform the current way that these tumors are staged and how TET patients are treated. To reconcile the heterogeneity of TETs and to gain deeper understanding of the molecular underpinnings of these tumors, we set out to establish a clinically relevant molecular classification system to categorize these tumors.

Integrative analysis from the TCGA data set

We obtained available DNA sequencing, mRNA sequencing, DNA copy-number alteration, and clinical data for the TET cohort (n = 120) from the TCGA portal (https://gdc-portal.nci.nih.gov/) and The Broad Institute TCGA GDAC Firehose (17). The samples in this data set were obtained from multiple institutes in the United States, Brazil, Canada, France, Germany, and Italy from 2000 to 2013. We analyzed DNA sequencing data generated on the Illumina HiSeq 2000 platform at Canada's Michael Smith Genome Sciences Centre and RNA sequencing data generated on the Illumina HiSeq 2000 platform at the University of North Carolina at Chapel Hill. Normalized reads per kilobase per million (RPKM) values in the mRNA sequencing data were transformed logarithmically, centered by subtracting the mean of each channel, and further normalized to equalize the variance. Genomic data were processed as described in previous studies (15, 18). Molecular subgrouping of TETs was performed using a multidimensional approach incorporating DNA mutational analyses, unsupervised clustering of mRNA expression data, and somatic copy number alterations (SCNA). In SCNA analyses, we defined chromosomal instability solely on the basis of chromosomal deletions, as the overwhelming majority of chromosomal aberrations were deletions of tumor suppressors (Supplementary Fig. S1; ref. 17).
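The normalization of the mRNA sequencing values described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the function name, the log base 2, the pseudocount of 1, the scaling to unit variance, and the reading of "channel" as one sample's set of values are all assumptions.

```python
import math
from statistics import mean, pstdev

def normalize_sample(rpkm_values, pseudocount=1.0):
    """Log-transform, mean-center, and variance-scale one sample's RPKM values.

    Assumptions (not stated in the text): log base 2, a pseudocount to
    avoid log(0), and scaling to unit variance.
    """
    logged = [math.log2(v + pseudocount) for v in rpkm_values]
    mu = mean(logged)
    centered = [x - mu for x in logged]      # subtract the sample mean
    sd = pstdev(centered)
    if sd == 0:
        return centered                      # constant profile: nothing to scale
    return [x / sd for x in centered]        # equalize the variance

# Hypothetical RPKM values for five genes in one sample.
sample = [0.0, 1.0, 3.0, 7.0, 15.0]
normalized = normalize_sample(sample)
```

After this step every sample has mean 0 and variance 1 on the log scale, which makes samples comparable before clustering.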

The TCGA cohort includes 120 patients with primary TETs, of whom 119 underwent surgical resection as the primary treatment modality and 1 patient with stage IV disease underwent cisplatin-based chemotherapy after anterior mediastinotomy. Median follow-up was 42.3 months (0.5–150). Reverse-phase protein array (RPPA) data on 90 human TETs from TCGA were analyzed.

Analysis of microarray data

Microarray data were processed with the Robust MultiArray Average algorithm, with quantile normalization and log2 transformation of gene expression intensities, using BRB-Array Tools version 4.3.0 (Biometric Research Branch, National Cancer Institute, Bethesda, MD) and R scripts from the Bioconductor project (www.bioconductor.org). We then selected the human mRNAs and adjusted the data by the mean values for genes and arrays, respectively. An unsupervised hierarchical clustering algorithm was applied using the uncentered correlation coefficient as the measure of similarity and the method of average linkage (Cluster 3.0). Java Treeview 1.60 (Stanford University School of Medicine, Stanford, CA) was used for tree visualization.
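To make the clustering settings concrete, the sketch below implements average-linkage agglomerative clustering with the uncentered correlation coefficient as the similarity measure. It is an illustration only, not the Cluster 3.0 implementation; the function names, the choice of 1 minus similarity as the distance, and the toy profiles are assumptions.

```python
import math

def uncentered_correlation(x, y):
    """Correlation computed without subtracting the means (cosine similarity)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def average_linkage(profiles, n_clusters):
    """Merge clusters until n_clusters remain; distance = 1 - uncentered correlation."""
    def dist(i, j):
        return 1.0 - uncentered_correlation(profiles[i], profiles[j])

    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                d = sum(dist(i, j) for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical mean-adjusted expression profiles for four samples.
profiles = [
    [1.0, 2.0, -1.0],
    [2.0, 4.1, -2.0],
    [-1.0, -2.0, 1.2],
    [-2.0, -3.9, 2.0],
]
groups = average_linkage(profiles, n_clusters=2)
```

Because the uncentered coefficient ignores mean offsets only after the data have been mean-adjusted, the adjustment step described above matters for how samples group.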

The 483 genes differentially expressed in TET subgroups between General Transcription Factor II-I (GTF2I) mutation and wild-type [P < 10⁻⁷, false discovery rate (FDR) < 10⁻⁷, and fold change > 5] were defined as the GTF2I mutation gene set. The 205 genes differentially expressed in TET subgroups between the TS signature and others (P < 10⁻⁷, FDR < 10⁻⁷, and fold change > 5) were defined as the TS gene set. The 124 genes differentially expressed in TET subgroups between CS and others (P < 0.001, FDR < 0.07, and fold change > 2) were defined as the CS gene set. Finally, the 381 genes differentially expressed in TET subgroups between CIN and others (P < 10⁻⁷, FDR < 10⁻⁷, and fold change > 2) were defined as the CIN gene set (Supplementary Table S1). To determine mRNA signatures identifying the molecular subtypes of validation cohort 2 and the melanoma cohort (19), we adopted a previously developed model (20). Briefly, gene-expression data in the training set were combined to form a series of classifiers according to the compound covariate predictor (CCP) algorithm, and the robustness of the classifier was estimated by the misclassification rate determined during leave-one-out cross-validation (LOOCV) of the training set. After LOOCV, the sensitivity and specificity of the prediction models were estimated by the fraction of samples correctly predicted.
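The CCP and LOOCV procedure can be illustrated with a two-class sketch. It assumes the common formulation in which each gene is weighted by its two-sample t-statistic and a sample is assigned by comparing its weighted sum with the midpoint between the two class means. The gene-selection step is omitted, and all names and data are hypothetical; this is not the previously developed model cited above.

```python
import math
from statistics import mean, variance

def t_statistic(a, b):
    """Two-sample t-statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se if se else 0.0

def score(weights, sample):
    """Compound covariate: weighted sum of a sample's expression values."""
    return sum(w * x for w, x in zip(weights, sample))

def train_ccp(samples, labels):
    """Return per-gene weights (t-statistics) and the class-midpoint threshold."""
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    weights = [t_statistic([s[g] for s in pos], [s[g] for s in neg])
               for g in range(len(samples[0]))]
    threshold = (mean(score(weights, s) for s in pos)
                 + mean(score(weights, s) for s in neg)) / 2
    return weights, threshold

def predict_ccp(model, sample):
    weights, threshold = model
    return 1 if score(weights, sample) > threshold else 0

def loocv_misclassification(samples, labels):
    """Hold out each sample once, retrain on the rest, and count errors."""
    errors = 0
    for i in range(len(samples)):
        model = train_ccp(samples[:i] + samples[i + 1:], labels[:i] + labels[i + 1:])
        errors += predict_ccp(model, samples[i]) != labels[i]
    return errors / len(samples)

# Hypothetical expression values: 8 samples x 3 genes, label 1 = subtype of interest.
samples = [[5.0, 1.0, 3.0], [6.0, 1.5, 2.5], [5.5, 0.5, 3.2], [6.5, 1.2, 2.8],
           [1.0, 4.0, 3.1], [1.5, 5.0, 2.9], [0.5, 4.5, 3.0], [1.2, 5.5, 2.7]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
error_rate = loocv_misclassification(samples, labels)
```

Retraining inside each fold, including the t-statistics, keeps the held-out sample from influencing its own prediction.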

Independent validation

Two independent cohorts from the NCBI Gene Expression Omnibus (GEO) were used to validate our results [validation cohort 1: GSE57892 (n = 22; ref. 12) and validation cohort 2: GSE29695 (n = 36; ref. 21)]. In cohort 1, a multidimensional approach incorporating DNA mutational analyses, unsupervised clustering of mRNA expression data, and SCNA was performed. In cohort 2, we validated the four subtypes on the basis of the mRNA gene signatures that identified these groups. WHO classification and Masaoka stage data were available for both cohorts.

Statistical analysis

The association of each subtype with disease-free survival (DFS) and OS in the TCGA TET cohort was estimated using Kaplan–Meier plots and log-rank tests. DFS was defined as the time from surgery to the first confirmed recurrence or death, and OS was defined as the time from surgery to death. Development of a second primary cancer was not counted as a recurrence event. Data were censored when a patient was alive without recurrence at the last follow-up. Multivariable Cox proportional hazards regression analysis was used to evaluate independent prognostic factors associated with DFS. Covariates in these analyses included those with P values less than 0.2 in univariable analysis. To assess the association of each molecular subtype with clinical phenotype, we used the χ² or Fisher exact test. Statistical significance was accepted at P < 0.05, and all tests were two-tailed. All statistical analyses were performed with SPSS 20.0 (SPSS, Inc.) and the R language and software environment (http://www.r-project.org). Ingenuity pathway analysis (IPA; www.ingenuity.com, Ingenuity) was used for gene set enrichment analysis to identify enriched gene sets in all subtypes.
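For readers less familiar with the Kaplan–Meier estimate that underlies these survival curves, a minimal sketch follows. The function name and follow-up data are hypothetical; the analyses above were run in SPSS and R, not with this code.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (recurrence or death) occurred, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        n_t = sum(1 for ti in times if ti == t)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Patients with an event or censoring at t leave the risk set.
        at_risk -= n_t
    return curve

# Hypothetical follow-up times (months) and event indicators for six patients.
times = [5, 8, 12, 12, 20, 30]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Censored patients lower the number at risk without lowering the survival estimate, which is how patients alive without recurrence at last follow-up contribute to the curve.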

A weighted permutation approach was implemented to test for mutual exclusivity and co-occurrence of alterations across the subgroups; this test is available as an additional step when sorting the subgroups by mutual exclusivity. The test assesses the deviation of the observed coverage (number of columns with a signal) from the coverage expected when events are permuted, maintaining the number of events per row and using weighted permutations for columns. Mutual exclusivity of the subtypes was tested with Gitools (22, 23). Principal component analysis (PCA) was performed on mRNA gene expression data to analyze the association between the principal components and molecular subtypes relevant to TETs. We generated three-dimensional (3-D) plots by evaluating the association of the first three principal components (PC 1–3) with GTF2I mutation, T-cell signaling, and SCNA (24).
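The idea behind the coverage-based permutation test can be sketched in simplified form. The version below permutes events uniformly within each row, so it preserves the number of events per row but omits the column weighting described above. It is therefore not the Gitools algorithm, and the function names, matrix, and permutation count are assumptions.

```python
import random

def coverage(matrix):
    """Number of columns (samples) carrying at least one event."""
    return sum(1 for col in zip(*matrix) if any(col))

def mutual_exclusivity_pvalue(matrix, n_perm=2000, seed=0):
    """Fraction of permutations whose coverage is at least the observed coverage.

    Simplification: events are shuffled uniformly within each row
    (unweighted columns), keeping the number of events per row fixed.
    """
    rng = random.Random(seed)
    observed = coverage(matrix)
    hits = 0
    for _ in range(n_perm):
        permuted = []
        for row in matrix:
            shuffled = row[:]
            rng.shuffle(shuffled)
            permuted.append(shuffled)
        if coverage(permuted) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical alteration matrix: 3 alterations (rows) x 12 tumors (columns),
# constructed to be perfectly mutually exclusive.
matrix = [
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
]
p_value = mutual_exclusivity_pvalue(matrix)
```

Mutually exclusive alterations spread across more samples than chance would predict, so an unusually high observed coverage yields a small P value.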

Accession codes

Data deposited into NCBI GEO gene expression microarrays GSE57892 (12) and GSE29695 (21) were utilized.

Molecular classification of TETs

By integrating genomic data from multiple platforms including mutation data, mRNA expression, and copy number alterations, we used a decision-tree approach to stratify 120 TETs into molecularly distinct subtypes (Fig. 1A). GTF2I is the most frequently mutated gene in TETs (missense mutation p.Leu424His; chromosome 7, c.74146970T>A). Interestingly, this mutation status is well reflected in mRNA expression patterns, as all mutated tissues clustered together in hierarchical clustering analysis (P = 1.17 × 10⁻¹⁵; Supplementary Fig. S2A). Therefore, we grouped tumors with a GTF2I mutation (n = 46, 38.3%) as the first distinct molecular subtype of TETs. The second subtype was identified by unsupervised mRNA clustering of GTF2I wild-type tumors and represented TETs significantly enriched in expression of genes associated with T-cell signaling, hereafter referred to as the TS group (n = 39, 33%; Cluster C in Supplementary Fig. S2B). Analysis of genome-wide SCNA data further identified two additional subtypes: chromosomal stability (CS; n = 10, 8%) and chromosomal instability (CIN; n = 25, 21%). In TETs, the significantly deleted foci were 9p21.3, which contains the strong tumor suppressors CDKN2A and CDKN2B, 22q12.1, 6p25.2, 3p22.2, 2q37.1, 6q21, 10q26.3, 9p13.11, 22q13.32, 7q36.3, 11q22.1, and 9q22.33. CDKN2A at 9p21.3 was significantly deleted in the CIN group, with a deletion in this tumor suppressor gene occurring in 7 patients (28%) in this group. Furthermore, differences in genomic stability between the CS and CIN groups were also well reflected in mRNA expression patterns, as the two groups have distinct mRNA expression signatures (Fig. 1B). Mutual exclusivity testing and PCA were used to justify the decision-tree approach (Supplementary Fig. S3).
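The order of decisions in the classification tree can be written out as a short sketch. The function and argument names are hypothetical, and each input is treated as an already-determined yes/no call; how the SCNA data were thresholded into stable versus unstable is not specified here.

```python
def classify_tet(gtf2i_mutated, in_ts_cluster, chromosomally_unstable):
    """Assign a subtype following the order of the decision tree."""
    if gtf2i_mutated:
        return "GTF2I"      # step 1: DNA mutation status
    if in_ts_cluster:
        return "TS"         # step 2: mRNA cluster among GTF2I wild-type tumors
    # step 3: SCNA-based stability among the remaining tumors
    return "CIN" if chromosomally_unstable else "CS"

# Subtype counts reported for the TCGA cohort.
counts = {"GTF2I": 46, "TS": 39, "CS": 10, "CIN": 25}
total = sum(counts.values())
percentages = {k: round(100 * v / total, 1) for k, v in counts.items()}
```

Because the tree is evaluated top-down, a GTF2I-mutated tumor is assigned to the GTF2I group regardless of its mRNA cluster or SCNA profile.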

Figure 1.

Four distinct molecular subtypes of TETs. A, Illustration of the molecular subtype classification tree. B, TETs are divided into four subtypes by a multidimensional approach incorporating DNA mutational analyses, unsupervised clustering of mRNA expression data, and somatic copy number alterations (SCNA). GTF2I mutation–positive (GTF2I, blue), T-cell signaling (TS, green), chromosomally stable (CS, orange), and chromosomal instability (CIN, red). Clinical (top) and molecular data (bottom) in 120 tumors from the TCGA cohort profiled with mRNA expression and somatic copy number alteration (SCNA) are depicted.


Molecular subtypes are associated with clinical phenotypes and outcomes

We correlated molecular subtypes with clinical covariates in the TCGA cohort (Fig. 2A; Supplementary Table S2) and we observed four main trends: (i) The GTF2I mutation group was prevalent in WHO class A and AB tumors and was associated with a decreased prevalence of myasthenia gravis (MG; 13%). (ii) The TS group occurred predominantly in B1 and B2 WHO classifications. (iii) The CIN group was enriched for B3 and thymic carcinoma histology. (iv) CS and CIN groups were associated with an increased prevalence of myasthenia gravis (40%).

Figure 2.

Clinical and histologic characteristics among molecular subtypes of TETs. A, World Health Organization (WHO) histologic classification, Masaoka–Koga stage, and presence of myasthenia gravis are shown for each of the molecular subtypes of TETs in the TCGA data set. B, DFS and OS were examined in each of the 4 molecular subtypes of TETs in the TCGA cohort. C, DFS was examined in subgroups of patients with early-stage (WHO classification A, AB, and B1) and advanced-stage (WHO classification B2, B3, and thymic carcinoma) tumors. D, DFS was examined in subgroups of patients with early-stage (Masaoka–Koga stage I and II) and advanced-stage (Masaoka–Koga stage III and IV) tumors.


Univariable analyses of DFS within the TCGA data set demonstrated the expected correlation with Masaoka stage and WHO histology (Supplementary Fig. S4). Univariable Kaplan–Meier survival analyses demonstrated that patients with a GTF2I mutation and patients in the TS group had favorable DFS and OS and that patients in the CIN group had unfavorable DFS and OS (Fig. 2B). Among 54 patients with advanced WHO classification (B2, B3, and thymic carcinoma), 7 patients (13.0%) had a GTF2I mutation, and these patients did not experience recurrence (Fig. 2C). Similarly, among 21 patients with advanced Masaoka stage (III–IV), 5 patients (23.8%) had a GTF2I mutation, and these patients also did not experience recurrence (Fig. 2D). In contrast, analyses of patients with early WHO classification (A, AB, and B1) and early Masaoka stage (I–II) showed that patients characterized as CIN had unfavorable prognosis. Backward stepwise analysis in multivariable Cox regression models including WHO classification, Masaoka stage, and molecular subtyping revealed that molecular subtyping (CIN versus others) was an independent predictor of DFS (HR 3.27; 95% CI, 1.37–7.79; P = 0.007; Supplementary Table S3).
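The reported hazard ratio and confidence interval can be related to each other on the log scale. The sketch below recovers an approximate standard error and Wald P value from the published numbers. It assumes a normal approximation with a critical value of 1.96, and the function name is hypothetical; the statistical software used for the model may have reported a different test statistic.

```python
import math

def wald_from_ci(hr, ci_low, ci_high, z_crit=1.96):
    """Recover the log-scale standard error, z, and two-sided P from an HR and its CI."""
    log_hr = math.log(hr)
    # A 95% CI on the log scale spans 2 * z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability
    return se, z, p

se, z, p = wald_from_ci(3.27, 1.37, 7.79)
```

Under these assumptions the recovered P value falls between 0.005 and 0.01, in line with the reported significance of the CIN subtype.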

Independent external validation of molecular subtypes

Two independent cohorts were utilized to validate our findings in the TCGA cohort: cohort 1 and cohort 2. In these analyses, TET subtypes demonstrated a significant association with WHO classification and Masaoka stage in each independent cohort, consistent with our findings from the TCGA cohort (Fig. 3). With respect to SCNA analyses, cohort 1 showed similar findings to the TCGA cohort, that the CIN group was particularly enriched for chromosomal deletions and, specifically, deletions in CDKN2A. In cohort 2, DNA mutation and somatic copy number alteration data were not available. Therefore, molecular subtypes were identified on the basis of mRNA signature after proving statistically significant correlation between these mRNA signatures and their respective molecular subgroups within the TCGA cohort and in validation cohort 1 (Fig. 4).

Figure 3.

Validation of TET molecular subtypes in external independent cohorts. Cohort 1 included DNA mutation, mRNA expression, and somatic copy number alteration data, and molecular subtyping was performed in the same manner as in the TCGA cohort. In cohort 2, DNA mutation and somatic copy number alteration data were not available and therefore molecular subtypes were identified on the basis of mRNA signature.

Figure 4.

Validation of mRNA signature of molecular subtypes through the TCGA cohort and validation cohort 1. A, A schematic overview of the strategy used for constructing prediction models and evaluating predicted outcomes based on gene expression signatures. CCP, compound covariate predictor; LOOCV, leave-one-out cross-validation. B, ROC curve between each molecular subtype and the probability of molecular subtypes predicted by mRNA signature. ROC curve in CS subgroup was not generated due to n = 2.


Therapeutic implications for molecular subtype–stratified TETs

To explore viable therapeutic strategies for TETs within each molecular subtype, deeper investigation of genomic data was performed in each subtype. mRNA expression of GTF2I and its isoforms in the GTF2I mutation group was similar to that of adjacent normal tissue and of TETs within the CS and CIN groups (Fig. 5A). In addition, GTF2I mutation and GTF2I mRNA expression were not correlated across all types of cancer in TCGA (Supplementary Fig. S5). Using known transcription factor binding site motifs within the TRANSFAC predicted transcription factor targets data set (ref. 25; Supplementary Table S4), we identified 220 target genes of the GTF2I transcription factor. Z-scores, which indicate the degree of target gene enrichment and therefore GTF2I activity, were calculated for the 220 target mRNAs and analyzed for each molecular subtype (Fig. 5B). In these analyses, the TS subtype demonstrated the lowest GTF2I mRNA and isoform expression, with a distribution of 220 Z-scores that was shifted downward, crossing the x-intercept at the 134th-ranked mRNA (indicating that 134 target mRNAs were depleted and only 86 mRNAs were enriched). In the case of the CS subtype, which demonstrated the highest GTF2I expression, the opposite pattern was observed: 90 mRNAs were depleted and 130 mRNAs were enriched. The curve of the GTF2I subtype lies between those of the TS and CIN subtypes, which implies that the GTF2I mutation may not affect the function of the GTF2I gene as a transcription factor.
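The depleted and enriched counts quoted above correspond to where each subtype's sorted Z-score curve crosses zero. A minimal sketch of that bookkeeping follows, assuming a negative Z-score denotes depletion. The function name and toy values are hypothetical, and how each Z-score was computed is not specified here.

```python
def split_by_sign(z_scores):
    """Sort Z-scores and count depleted (negative) and enriched (positive) targets."""
    ordered = sorted(z_scores)
    depleted = sum(1 for z in ordered if z < 0)
    enriched = sum(1 for z in ordered if z > 0)
    return ordered, depleted, enriched

# Hypothetical Z-scores for five target mRNAs in one subtype.
toy_scores = [0.8, -1.2, 0.3, -0.4, -2.0]
ordered, depleted, enriched = split_by_sign(toy_scores)

# The counts reported in the text each sum to the 220 TRANSFAC targets.
ts_total = 134 + 86
cs_total = 90 + 130
```

The number of depleted targets equals the rank at which the sorted curve reaches zero, which is why the TS curve crosses at the 134th-ranked mRNA.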

Figure 5.

Therapeutic implications through pathway analysis and proteomic patterns of molecular subtypes of TETs. A, mRNA expression of GTF2I and its isoforms in the four subtypes and adjacent normal tissues. An asterisk (*) denotes P < 0.05 compared with the other three subtypes. B, A rank distribution plot of GTF2I target gene mRNA expression. Z-scores of 220 target mRNAs from TRANSFAC were sorted from the smallest to the largest values and plotted against an anonymous x-axis. The average Z-score in the TS subtype was significantly lower than that in the others (P = 0.002). C, The top canonical pathways derived from IPA in molecular subtypes of TETs. We performed IPA to find the major canonical pathways of each molecular subtype. The pathways on the y-axis emerged following the core analysis in IPA. The x-axis indicates the significance level, scored as −log(P value) from the Fisher exact test. D, Proteomic patterns associated with the molecular subtypes of TETs. Reverse-phase protein array (RPPA) data on 90 human TETs from TCGA in signaling pathways were analyzed according to the molecular subtypes of TETs. Critical values for the two-sided significance level of 0.1 are Z = −1.28 and Z = 1.28. E, mRNA expression of immune checkpoint inhibitory genes. The log2-transformed RPKM values of PD1, PD-L1, and CTLA4 are shown. F, A scatter plot of log2-transformed RPKM values of PD1 and CD8A. The TS subtype deviated significantly toward high expression of both CD8A and PD1 mRNA (χ² test). G, ROC curve suggesting the potential role of the TS signature in predicting the response to immune checkpoint inhibitors in the melanoma cohort.


Canonical pathway analyses were performed to investigate in further detail the biological processes associated with the molecular subgroups (Fig. 5C). IPA based on differential mRNA expression analyses showed that the GTF2I group was enriched for genes related to human embryonic stem cell pluripotency and Wnt/β-catenin signaling, which are involved in self-renewal of cells. The TS group had abundant genes related to T-cell signaling, including CD8A, CD8B, and CD3D, as well as genes associated with CTLA4 signaling, T-cell receptor signaling, and ICOS/ICOSL signaling. The CS group was associated with pathways involved in G-protein–coupled receptor signaling, Toll-like receptor signaling, and regulation of the epithelial–mesenchymal transition (EMT) pathway, and the CIN group with NF-κB, EGF, FAK, and telomerase signaling. Comparison of protein expression among molecular subgroups revealed findings consistent with the results of IPA. The GTF2I group demonstrated high expression of β-catenin, PIK3CA, and YAP1, proteins that are important in the stem cell pathway. The TS group demonstrated high expression of GATA3, which is known as a master transcription factor for the differentiation of T helper 2 cells (26), and of LCK, which plays a role in T lymphocyte activation and differentiation (27). Tumors in the CS group were associated with downregulation of TSC2, an inhibitory molecule that suppresses the mTOR pathway, and overexpression of collagen VI, which is related to the EMT pathway. Finally, the CIN group showed overexpression of SRC kinase and c-kit proteins (Fig. 5D).

Given the upregulation of genes associated with costimulatory and coinhibitory T-cell signaling in the TS group, we hypothesized that such a gene expression profile in TETs may reflect an immunogenic tumor microenvironment. Analysis of mRNA expression of immune checkpoint inhibitory genes, including programmed cell death 1 (PD1), programmed cell death-ligand 1 (PD-L1), and cytotoxic T-lymphocyte associated protein 4 (CTLA4), showed high expression of PD1 in the TS group and high expression of PD-L1 in the CS and CIN groups (Fig. 5E). Analyses of CD8A and PD1 mRNA revealed CD8A and PD1 expression in some TETs of the GTF2I, CS, and CIN groups, but only TETs in the TS group demonstrated high expression of both CD8A and PD1 (Fig. 5F), implying an abundance of PD1-expressing CD8 T cells that may respond favorably to immune checkpoint inhibitor therapy.

We therefore next examined whether a TS mRNA signature correlated with response to immune checkpoint inhibitors in a cohort of patients with melanoma (n = 54) who were treated with anti-PD1 or anti-CTLA4 antibodies (19). In this cohort, individuals whose tumors demonstrated a TS pattern of mRNA expression in early on-treatment tumor biopsies demonstrated the best responses to anti-PD1 therapy (AUC = 0.84, P = 0.020). Additionally, patients with melanomas demonstrating a TS pattern of mRNA expression in tumor biopsies performed before treatment or early during treatment also showed good response to anti-PD1 or anti-CTLA4 therapy (AUC = 0.668, P = 0.034; Fig. 5G).
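The AUC values above summarize how well a signature score separates responders from non-responders. A minimal sketch of the computation follows, using the rank-based definition in which the AUC is the probability that a randomly chosen responder scores higher than a randomly chosen non-responder. The function name and data are hypothetical and unrelated to the melanoma cohort.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (responder, non-responder) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5     # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

# Hypothetical signature scores; label 1 = responder, 0 = non-responder.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.84 and 0.668.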

The molecular underpinnings of cancer have been delineated in a number of solid human tumors and are the basis for personalized medicine approaches that have the potential to revolutionize patient care. The molecular landscape of TETs is just beginning to be uncovered, and only a handful of molecular characterization studies have been conducted in these tumors (12, 13, 28–30). Petrini and colleagues first reported that next-generation sequencing in TETs identified a missense mutation in GTF2I at high frequency in type A thymomas, which was associated with better survival (12). Additionally, thymic carcinomas have been found to carry a higher number of mutations than thymomas, with recurrent mutations of known cancer genes, including TP53, CYLD, CDKN2A, BAP1, and PBRM1 (12). Furthermore, a deeper analysis of TET data from the TCGA is forthcoming. Thus far, however, a framework that assimilates multiple molecular platforms to account for the molecular heterogeneity of these tumors, and that categorizes clinically relevant molecular subtypes, has not been reported.

Herein, we report a molecular classification system for TETs that is based upon distinct patterns of genomic alterations across multiple TET patient cohorts and that is associated with clinical features. This molecular classification system is based upon three main platforms. A DNA sequencing platform first identified tumors with a GTF2I mutation, which have a remarkably favorable prognosis. An mRNA sequencing or array platform was then utilized to distinguish TETs with a T-cell signaling gene profile (TS group), a group of tumors that also has a relatively favorable prognosis. Finally, somatic copy number analyses were used to separate chromosomally stable TETs (CS) from a chromosomal instability subtype (CIN), which was associated with the worst clinical prognosis. We show the applicability of our molecular subtypes in two additional TET cohorts and their consistent and significant association with clinical phenotypes despite the various sources of heterogeneity and cohort differences. These molecular subtypes are therefore robust and discrete.

The GTF2I mutation was the most commonly identified gene mutation in TETs, present in 38% of patients in the TCGA cohort and in 32% of patients in validation cohort 1. TFII-I is a multifunctional protein involved in the transcriptional regulation of several genes that control cell proliferation and developmental processes, and which is stimulated by the binding of TFII-I to the FOS promoter (31). It binds specifically to several DNA sequence elements and mediates growth factor signaling (32). Our unsupervised clustering of mRNA sequencing data from TETs demonstrated that all tumors with a GTF2I mutation grouped together in a single cluster, implying that the GTF2I mutation may be an important driver mutation in TETs, even though we failed to show an association between GTF2I mutation and altered expression of its transcriptional target mRNAs.

Our study has potentially important clinical implications, specifically related to estimating prognosis and to selecting treatment for patients suffering from TETs. When considered as a spectrum (ranging from GTF2I to TS to CS to CIN) this molecular classification system was associated with trends in clinical and histological TET features. For example, the GTF2I group was associated with more favorable WHO histology, less advanced Masaoka stage, and the absence of MG; and the CIN group was associated with less favorable WHO histology, more advanced Masaoka stage, and the presence of MG. Despite these relationships, the molecular classification system was independently associated with DFS when adjusted for these factors in multivariable analyses. Also highlighting the utility of this molecular classification in TET prognosis was the ability of this system to resolve heterogeneous clinical outcomes among patients with tumors considered in the same clinical classification groups. Specifically, amongst patients with advanced stage thymoma (III–IV), the molecular system stratified two discrete cohorts by substantially different survival outcomes.

While targeted therapy for thymoma and thymic carcinoma is in its infancy, deeper studies into the molecular determinants of TETs could result in the development of new targeted therapeutic agents. Similarly, molecular classification of TETs could identify patients who would respond to existing immunotherapeutic agents. For example, a promising preliminary report of a phase II study of pembrolizumab (PD1 inhibitor) in patients with recurrent thymic carcinoma revealed a response rate of 24% (33), and an early phase study of avelumab (anti-PD-L1 IgG1 antibody) demonstrated activity in thymoma, with 4 of 7 patients having objective responses (34). We have shown that tumors in the TS group are enriched for genes related to costimulatory and coinhibitory T-cell signaling such as PD1. Such tumors may be poised to respond well to immune checkpoint inhibitor therapy, and these data are consistent with reports of PD1 (46%) and programmed death ligand 1 (PD-L1; 23%–80%) positivity in TETs (35–38). TETs must be approached cautiously with checkpoint inhibitors, however, as 2 patients in the pembrolizumab study developed serious autoimmune disorders: one case of severe myositis/myocarditis and one case of type I diabetes (33), and all responders in the avelumab study experienced immune-related adverse effects, including myositis in 3 patients and enteritis in 1 patient (34). Further, it has been shown in melanoma cohorts that patients with preexisting autoimmune disorders treated with checkpoint inhibitors are generally at increased risk of autoimmune adverse events (39). This is especially important in patients with TETs, a significant fraction of whom will have preexisting MG or other autoimmune disease. For example, our analyses have determined that MG is present in 13% of the GTF2I group, 25% of the TS group, 40% of the CS group, and 44% of the CIN group. However, moderate to severe immune-related adverse events generally can be managed with interruption of the checkpoint inhibitor and the use of corticosteroid immunosuppression, depending upon the severity of the observed toxicity.

Further, advanced TETs have demonstrated responses to the epidermal growth factor receptor (EGFR) inhibitors cetuximab (40, 41) and erlotinib (42, 43) in some reports. Our canonical pathway analyses have demonstrated that the CIN molecular subtype is associated with activation of EGF or SRC-FAK signaling, suggesting that tumors displaying this molecular phenotype could have a favorable response to EGFR or SRC-FAK inhibitors. TETs have been reported to respond to the multitargeted receptor tyrosine kinase (RTK) inhibitor sunitinib (44) and to dasatinib, a novel, oral, multitargeted kinase inhibitor of Bcr-Abl and SRC family kinases, as well as ephrin receptor kinases, platelet-derived growth factor receptor, and c-Kit (45), although SRC inhibition by saracatinib did not result in any clinical responses in patients with relapsed or refractory TETs (46). Lastly, our data demonstrate that tumors classified in the CS group were enriched for genes associated with the EMT pathway and with Toll-like receptor signaling, and therefore may be best suited for trials with mTOR inhibitor class drugs, to which favorable responses of advanced TETs have been reported (47), and/or Toll-like receptor antagonists. The characteristics of the four molecular subtypes of TETs are summarized in Fig. 6, with the caveat that the prognosis and potential avenues for investigating treatment are based on limited follow-up data and on the indirect association of treatment with pathway analysis and RPPA data, and require prospective study and validation. At the present time, this molecular staging system can serve as a springboard for deeper investigation of targeted molecular therapy in this disease; with further discovery and experience with such therapy, this system may potentially be useful for assigning prognosis and for stratifying patients to specific therapy.

Figure 6.

Summary of characteristics of the four molecular subtypes of TETs.

Several limitations are inherent to our study. Although the number of TET samples in the TCGA cohort is small relative to TCGA cohorts of other tumor types, TETs are rare tumors and a sample size of 120 patients could be considered respectable. Further, the significant associations of molecular subtypes with clinical features such as WHO classification and Masaoka staging were validated in two additional data sets. Whereas these additional data sets are small compared with the TCGA set, they did not contain overlapping data. Because DNA mutation and SCNA data were unavailable in cohort 2, we constructed and validated mRNA signatures that could identify each of the molecular subtypes for further analysis. Thus, we could not fully validate the prognostic robustness of this molecular subtyping in the additional cohorts. However, despite the lack of validation in survival, the original study related to validation cohort 1 demonstrated that patients with tumors bearing GTF2I mutations (GTF2I subgroup in this article) had a better prognosis than those bearing wild-type GTF2I (96% compared with 70% 10-year survival, respectively) in 204 patients (12). Lastly, because clinical data from TCGA report treatment modality only as “surgical resection,” the completeness of resection could not be accounted for, which may have biased our survival analyses.

In summary, these data support a GTF2I mutation as the most common gene mutation in TETs, and the presence of this mutation as a correlate of favorable outcome. Deeper investigation into the molecular mechanisms underlying TETs and into the TET molecular stratification framework could hasten the clinical utility of this system as an adjunct to clinical staging and could support the discovery and development of rational treatment options for TET patients.

No potential conflicts of interest were disclosed.

Conception and design: H.-S. Lee, M. Hamaji, B.M. Burt

Development of methodology: H.-S. Lee, M. Hamaji, J.-S. Lee

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): H.-S. Lee, D. Yoon

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): H.-S. Lee, H.-J. Jang, R. Shah, D. Yoon, M. Hamaji, J.-S. Lee, B.M. Burt

Writing, review, and/or revision of the manuscript: H.-S. Lee, H.-J. Jang, R. Shah, M. Hamaji, B.M. Burt

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): H.-S. Lee, H.-J. Jang, O. Wald, B.M. Burt

Study supervision: H.-S. Lee, H.-J. Jang, O. Wald, D.J. Sugarbaker, B.M. Burt

The results shown here are in part based upon data generated by the TCGA Research Network: http://cancergenome.nih.gov/. In addition, the authors thank Michelle G. Almarez for her administrative assistance with this manuscript and Dr. Jungnam Joo in Biometric Research Branch, Research Institute and Hospital, National Cancer Center, Korea for her statistical assistance.

This project is supported by Dan L. Duncan Cancer Center Pilot Project Grant, Baylor College of Medicine.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Engels EA. Epidemiology of thymoma and associated malignancies. J Thorac Oncol 2010;5:S260–S5.
2. Masaoka A, Monden Y, Nakahara K, Tanioka T. Follow‐up study of thymomas with special reference to their clinical stages. Cancer 1981;48:2485–92.
3. Koga K, Matsuno Y, Noguchi M, Mukai K, Asamura H, Goya T, et al. A review of 79 thymomas: Modification of staging system and reappraisal of conventional division into invasive and non‐invasive thymoma. Pathol Int 1994;44:359–67.
4. Ruffini E, Detterbeck F, Van Raemdonck D, Rocco G, Thomas P, Weder W, et al. Thymic carcinoma: a cohort study of patients from the European society of thoracic surgeons database. J Thorac Oncol 2014;9:541–8.
5. Zucali PA, Di Tommaso L, Petrini I, Battista S, Lee HS, Merino M, et al. Reproducibility of the WHO classification of thymomas: practical implications. Lung Cancer 2013;79:236–41.
6. Kelly RJ, Petrini I, Rajan A, Wang Y, Giaccone G. Thymic malignancies: from clinical management to targeted therapies. J Clin Oncol 2011;29:4820–7.
7. Suster S, Moran CA. Problem areas and inconsistencies in the WHO classification of thymoma. Semin Diagn Pathol 2005;22:188–97.
8. Detterbeck FC, Stratton K, Giroux D, Asamura H, Crowley J, Falkson C, et al. The IASLC/ITMIG thymic epithelial tumors staging project: proposal for an evidence-based stage classification system for the forthcoming (8th) edition of the TNM classification of malignant tumors. J Thorac Oncol 2014;9:S65–S72.
9. Filosso PL, Ruffini E, Lausi PO, Lucchi M, Oliaro A, Detterbeck F. Historical perspectives: the evolution of the thymic epithelial tumors staging system. Lung Cancer 2014;83:126–32.
10. Kondo K, Monden Y. Therapy for thymic epithelial tumors: a clinical study of 1,320 patients from Japan. Ann Thorac Surg 2003;76:878–84.
11. Verghese ET, den Bakker M, Campbell A, Hussein A, Nicholson AG, Rice A, et al. Interobserver variation in the classification of thymic tumours–a multicentre study using the WHO classification system. Histopathology 2008;53:218–23.
12. Petrini I, Meltzer PS, Kim IK, Lucchi M, Park KS, Fontanini G, et al. A specific missense mutation in GTF2I occurs at high frequency in thymic epithelial tumors. Nat Genet 2014;46:844–9.
13. Okumura M, Fujii Y, Shiono H, Inoue M, Minami M, Utsumi T, et al. Immunological function of thymoma and pathogenesis of paraneoplastic myasthenia gravis. Gen Thorac Cardiovasc Surg 2008;56:143–50.
14. Hamaji M, Burt BM. Long-term outcomes of surgical and nonsurgical management of stage IV thymoma: a population-based analysis of 282 patients. Semin Thorac Cardiovasc Surg 2015;27:1–3.
15. The Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature 2012;489:519–25.
16. The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 2014;511:543–50.
17. Broad Institute TCGA Genome Data Analysis Center. SNP6 Copy number analysis (GISTIC2). Broad Institute of MIT and Harvard; 2016.
18. The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of gastric adenocarcinoma. Nature 2014;513:202–9.
19. Chen PL, Roh W, Reuben A, Cooper ZA, Spencer CN, Prieto PA, et al. Analysis of immune signatures in longitudinal tumor samples yields insight into biomarkers of response and mechanisms of resistance to immune checkpoint blockade. Cancer Discov 2016;6:827–37.
20. Jang HJ, Lee HS, Burt BM, Lee GK, Yoon KA, Park YY, et al. Integrated genomic analysis of recurrence-associated small non-coding RNAs in oesophageal cancer. Gut 2017;66(2):215–25.
21. Badve S, Goswami C, Gökmen-Polar Y, Nelson RP Jr, Henley J, Miller N, et al. Molecular analysis of thymoma. PLoS One 2012;7:e42669.
22. Perez-Llamas C, Lopez-Bigas N. Gitools: analysis and visualisation of genomic data using interactive heat-maps. PLoS One 2011;6:e19541.
23. Babur Ö, Gönen M, Aksoy BA, Schultz N, Ciriello G, Sander C, et al. Systematic identification of cancer driving signaling pathways based on mutual exclusivity of genomic alterations. Genome Biol 2015;16:45.
24. Cristescu R, Lee J, Nebozhyn M, Kim KM, Ting JC, Wong SS, et al. Molecular analysis of gastric cancer identifies subtypes associated with distinct clinical outcomes. Nat Med 2015;21:449–56.
25. Rouillard AD, Gundersen GW, Fernandez NF, Wang Z, Monteiro CD, McDermott MG, et al. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database 2016;2016. pii: baw100.
26. Ho IC, Tai TS, Pai SY. GATA3 and the T-cell lineage: essential functions before and after T-helper-2-cell differentiation. Nat Rev Immunol 2009;9:125–35.
27. Nika K, Soldani C, Salek M, Paster W, Gray A, Etzensperger R, et al. Constitutively active Lck kinase in T cells drives antigen receptor signal transduction. Immunity 2010;32:766–77.
28. Petrini I, Wang Y, Zucali PA, Lee HS, Pham T, Voeller D, et al. Copy number aberrations of genes regulating normal thymus development in thymic epithelial tumors. Clin Cancer Res 2013;19:1960–71.
29. Inoue M, Starostik P, Zettl A, Ströbel P, Schwarz S, Scaravilli F, et al. Correlating genetic aberrations with World Health Organization-defined histology and stage across the spectrum of thymomas. Cancer Res 2003;63:3708–15.
30. Wang Y, Thomas A, Lau C, Rajan A, Zhu Y, Killian JK, et al. Mutations of epigenetic regulatory genes are common in thymic carcinomas. Sci Rep 2014;4:7336.
31. Grueneberg DA, Henry RW, Brauer A, Novina CD, Cheriyath V, Roy AL, et al. A multifunctional DNA-binding protein that promotes the formation of serum response factor/homeodomain complexes: identity to TFII-I. Genes Dev 1997;11:2482–93.
32. Roy AL. Biochemistry and biology of the inducible multifunctional transcription factor TFII-I. Gene 2001;274:1–13.
33. Giaccone G, Thompson J, Crawford J, Mcguire C, Manning M, Subramaniam DS, et al. A phase II study of pembrolizumab in patients with recurrent thymic carcinoma. J Clin Oncol 2016;34:abstr 8517.
34. Rajan A, Heery CR, Perry S, Keen C, Mammen AL, Berman AW, et al. Safety and clinical activity of anti-programmed death-ligand 1 (PD-L1) antibody (ab) avelumab (MSB0010718C) in advanced thymic epithelial tumors (TETs). J Clin Oncol 2016;34:abstr e20106.
35. Padda SK, Riess JW, Schwartz EJ, Tian L, Kohrt HE, Neal JW, et al. Diffuse high intensity PD–L1 staining in thymic epithelial tumors. J Thorac Oncol 2015;10:500–8.
36. Katsuya Y, Fujita Y, Horinouchi H, Ohe Y, Watanabe S, Tsuta K. Immunohistochemical status of PD-L1 in thymoma and thymic carcinoma. Lung Cancer 2015;88:154–9.
37. Yokoyama S, Miyoshi H, Nakashima K, Shimono J, Hashiguchi T, Mitsuoka M, et al. Prognostic value of programmed death ligand 1 and programmed death 1 expression in thymic carcinoma. Clin Cancer Res 2016;22:4727–34.
38. Katsuya Y, Horinouchi H, Asao T, Kitahara S, Goto Y, Kanda S, et al. Expression of programmed death 1 (PD-1) and its ligand (PD-L1) in thymic epithelial tumors: impact on treatment efficacy and alteration in expression after chemotherapy. Lung Cancer 2016;99:4–10.
39. Menzies AM, Johnson DB, Ramanujam S, Atkinson VG, Wong AN, Park JJ, et al. Anti-PD-1 therapy in patients with advanced melanoma and preexisting autoimmune disorders or major toxicity with ipilimumab. Ann Oncol 2016; pii: mdw443.
40. Palmieri G, Marino M, Salvatore M, Budillon A, Meo G, Caraglia M, et al. Cetuximab is an active treatment of metastatic and chemorefractory thymoma. Front Biosci 2007;12:757–61.
41. Farina G, Garassino MC, Gambacorta M, La Verde N, Gherardi G, Scanni A. Response of thymoma to cetuximab. Lancet Oncol 2007;8:449–50.
42. Christodoulou C, Murray S, Dahabreh J, Petraki K, Nikolakopoulou A, Mavri A, et al. Response of malignant thymoma to erlotinib. Ann Oncol 2008;19:1361–2.
43. Pedersini R, Vattemi E, Lusso MR, Mazzoleni G, Ebner H, Graiff C. Erlotinib in advanced well-differentiated thymic carcinoma with overexpression of EGFR: a case report. Tumori 2008;94:849–52.
44. Thomas A, Rajan A, Berman A, Tomita Y, Brzezniak C, Lee MJ, et al. Sunitinib in patients with chemotherapy-refractory thymoma and thymic carcinoma: an open-label phase 2 trial. Lancet Oncol 2015;16:177–86.
45. Chuah C, Lim TH, Lim AS, Tien SL, Lim CH, Soong R, et al. Dasatinib induces a response in malignant thymoma. J Clin Oncol 2006;24:e56–e8.
46. Gubens MA, Burns M, Perkins SM, Pedro-Salcedo MS, Althouse SK, Loehrer PJ, et al. A phase II study of saracatinib (AZD0530), a Src inhibitor, administered orally daily to patients with advanced thymic malignancies. Lung Cancer 2015;89:57–60.
47. Wheler J, Hong D, Swisher SG, Falchook G, Tsimberidou AM, Helgason T, et al. Thymoma patients treated in a phase I clinic at MD Anderson Cancer Center: responses to mTOR inhibitors and molecular analyses. Oncotarget 2013;4:890–8.