Activation of the EGFR, KRAS, and ALK oncogenes defines 3 different pathways of molecular pathogenesis in lung adenocarcinoma. However, many tumors lack activation of any pathway (triple-negative lung adenocarcinomas) posing a challenge for prognosis and treatment. Here, we report an extensive genome-wide expression profiling of 226 primary human stage I–II lung adenocarcinomas that elucidates molecular characteristics of tumors that harbor ALK mutations or that lack EGFR, KRAS, and ALK mutations, that is, triple-negative adenocarcinomas. One hundred and seventy-four genes were selected as being upregulated specifically in 79 lung adenocarcinomas without EGFR and KRAS mutations. Unsupervised clustering using a 174-gene signature, including ALK itself, classified these 2 groups of tumors into ALK-positive cases and 2 distinct groups of triple-negative cases (groups A and B). Notably, group A triple-negative cases had a worse prognosis for relapse and death, compared with cases with EGFR, KRAS, or ALK mutations or group B triple-negative cases. In ALK-positive tumors, 30 genes, including ALK and GRIN2A, were commonly overexpressed, whereas in group A triple-negative cases, 9 genes were commonly overexpressed, including a candidate diagnostic/therapeutic target DEPDC1, that were determined to be critical for predicting a worse prognosis. Our findings are important because they provide a molecular basis of ALK-positive lung adenocarcinomas and triple-negative lung adenocarcinomas and further stratify more or less aggressive subgroups of triple-negative lung ADC, possibly helping identify patients who may gain the most benefit from adjuvant chemotherapy after surgical resection. Cancer Res; 72(1); 100–11. ©2011 AACR.

Lung cancer is the leading cause of cancer death worldwide (1, 2). Adenocarcinoma, which accounts for more than 50% of non-small-cell lung cancers (NSCLC), is the most frequent type and is increasing. Lung adenocarcinoma is heterogeneous in various aspects, including clinicopathologic features (3). Recent molecular studies have revealed at least 3 major molecular pathways for the development of lung adenocarcinoma (4–8). A considerable fraction (30%–60%) of lung adenocarcinomas develops through acquisition of mutations in the EGFR, KRAS, or ALK genes in a mutually exclusive manner, and the remaining lung adenocarcinomas, that is, those without EGFR, KRAS, and ALK mutations (herein designated “triple-negative adenocarcinomas”), develop with mutations of several other genes. Genes such as HER2 and BRAF are also known to be mutated in a mutually exclusive manner with the EGFR, KRAS, and ALK genes; however, the frequencies of their mutations are very low (<5%; refs. 4–7). Therefore, genes responsible for the development of triple-negative adenocarcinomas are largely unknown.

Mutations in the EGFR gene are prevalent in females and never-smokers, and the frequencies are considerably higher in Asians (40%–60%) than in Europeans/Americans (∼10%; refs. 5–7, 9). EGFR mutations make tumor cells dependent on epidermal growth factor receptor (EGFR) signaling and define patients who respond to EGFR tyrosine kinase inhibitors (TKI), such as gefitinib (10, 11). On the other hand, mutations in the KRAS gene occur predominantly in males and ever-smokers, and their frequencies are higher in Europeans/Americans (>15%) than in Asians (10%; ref. 9). Specific inhibitors against KRAS activity are being developed (12). Therefore, clinicopathologic features of lung adenocarcinomas with EGFR mutations (herein designated “EGFR-positive adenocarcinomas”) and those with KRAS mutations (herein designated “KRAS-positive adenocarcinomas”) are considerably different from each other. Recently, a small subset of EGFR- and KRAS-negative lung adenocarcinomas (∼5%) was shown to have rearrangements of the ALK gene generating gene fusion transcripts (13), and patients with ALK rearrangements tend to be younger and have little or no smoking histories (4, 6–8). Because lung adenocarcinoma cells with ALK rearrangements (herein designated “ALK-positive adenocarcinomas”) are specifically sensitive to ALK TKIs, ALK-positive adenocarcinomas have been recently considered to be another subset of adenocarcinomas by considering the differences in therapeutic targets (4, 6–8). In contrast, clinicopathologic features of triple-negative lung adenocarcinomas have not been precisely characterized because of the lack of sufficient genetic information in these adenocarcinomas.

There have been several studies which attempted to characterize gene expression profiles in particular types of lung adenocarcinoma, including EGFR-positive and KRAS-positive adenocarcinomas (14–17). However, such information is limited for ALK-positive adenocarcinomas and triple-negative adenocarcinomas. Therefore, in this study, we aimed to elucidate clinicopathologic features and gene expression profiles of ALK-positive adenocarcinomas and triple-negative adenocarcinomas in comparison with those of EGFR-positive adenocarcinomas and KRAS-positive adenocarcinomas. We conducted a genome-wide gene expression profiling of 226 lung adenocarcinomas, consisting of 127 EGFR-positive adenocarcinomas, 20 KRAS-positive adenocarcinomas, 11 ALK-positive adenocarcinomas, and 68 triple-negative adenocarcinomas. To identify genes useful for molecular diagnosis and applicable to targeted therapy of ALK-positive adenocarcinomas and triple-negative adenocarcinomas, we focused on genes that were upregulated in these adenocarcinomas by selecting genes with low expression in EGFR-positive and KRAS-positive adenocarcinomas. Several genes were identified as being specifically and significantly upregulated in ALK-positive adenocarcinomas. In particular, the ALK gene itself was highly expressed exclusively in ALK-positive adenocarcinomas. More importantly, a distinct group of triple-negative adenocarcinomas with unfavorable outcome was identified. This group of triple-negative adenocarcinomas showed much worse prognosis than the other group of triple-negative adenocarcinomas, EGFR-positive adenocarcinomas, KRAS-positive adenocarcinomas, and ALK-positive adenocarcinomas. Several genes were identified as being upregulated and critical for predicting prognosis of patients in this group of adenocarcinomas.

Patients

The tumors were pathologically classified according to the TNM classification of malignant tumors (18). A total of 226 lung adenocarcinoma cases subjected to expression profiling were selected from 393 stage I–II cases that underwent potentially curative resection between 1998 and 2008 at the National Cancer Center Hospital, as follows (ref. 19; Supplementary Fig. S1). Among the 393 cases, 363 cases, consisting of 305 stage I and 58 stage II cases, were eligible because they had not received any neoadjuvant therapy before surgery and had not been diagnosed with cancer in the 5 years before the lung adenocarcinoma diagnosis. All 58 stage II cases were subjected to expression profiling. The 305 stage I cases included 37 cases with relapse and 268 cases without relapse. To improve statistical efficiency, all 37 relapsed cases and 131 matched unrelapsed cases selected by the incidence density sampling method (20, 21) were subjected to expression profiling. In total, 226 cases, consisting of 168 stage I and 58 stage II cases, were subjected to expression profiling. Among the 226 cases, the 204 who received complete resection (i.e., free resection margins and no involvement of mediastinal lymph nodes examined by mediastinal dissection) and did not receive postoperative chemotherapy and/or radiotherapy, unless relapsed, were subjected to survival analyses. This study was approved by the Institutional Review Boards of the National Cancer Center.

Microarray experiments and data processing

Total RNA was extracted using TRIzol reagent (Invitrogen), purified by an RNeasy kit (Qiagen), and qualified with a model 2100 Bioanalyzer (Agilent). All samples showed RNA Integrity Numbers more than 6.0 and were subjected to microarray experiments. Two micrograms of total RNA were labeled using a 5X MEGAscript T7 Kit (Ambion) and analyzed by Affymetrix U133Plus2.0 arrays. The data were processed by the MAS5 algorithm, and the mean expression level of a total of 54,675 probes was adjusted to 1,000 for each sample. Microarray data are available at National Center for Biotechnology Information Gene Expression Omnibus (GSE31210).
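The per-sample scaling step described above can be illustrated with a minimal sketch. It assumes the MAS5 signals for one array are already available as a plain list; the function name and the example values are hypothetical and are not taken from the study.

```python
def scale_to_target_mean(signals, target=1000.0):
    """Rescale one sample's probe signals so that their mean equals `target`."""
    mean_signal = sum(signals) / len(signals)
    if mean_signal <= 0:
        raise ValueError("mean signal must be positive")
    factor = target / mean_signal
    return [s * factor for s in signals]

# Hypothetical signals for a single array (a real array has 54,675 probes).
sample = [60.0, 240.0, 1200.0, 500.0]
scaled = scale_to_target_mean(sample)
```

Because the same factor is applied to every probe on an array, relative expression within a sample is unchanged; only the overall intensity is made comparable between samples.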

Probe selection for unsupervised clustering

One hundred and seventy-four genes (190 probes), preferentially expressed in ALK-positive and triple-negative adenocarcinomas, were selected by the following criteria: probes whose expression levels were less than 1,000 in any adenocarcinomas with EGFR or KRAS mutations, and probes whose averaged expression levels in ALK-positive and triple-negative adenocarcinomas were more than 1.5-fold higher than those in EGFR-positive and KRAS-positive adenocarcinomas with P values less than 0.05 by t test. Expression levels for these 190 probes were log-transformed and median-centered, both for probes and samples, and were subjected to unsupervised hierarchical clustering. The clustering was done by the centroid linkage method using the Cluster 3.0 program, and the results were visualized using the Java Treeview program (22).
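The selection and preprocessing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the expression ceiling is read as applying to every EGFR- or KRAS-positive sample, the t test is omitted (it would need a statistics library), median-centering is done in a single pass over probes and then samples, and all names and values are hypothetical.

```python
import math
from statistics import mean, median

def select_probes(expr, reference_cols, target_cols, ceiling=1000.0, min_fold=1.5):
    """expr maps probe ID -> list of signals, one per sample.

    Keep probes that stay below `ceiling` in every reference sample
    (assumed reading of the criterion) and whose mean in the target samples
    is more than `min_fold` times the mean in the reference samples.
    The P < 0.05 t-test filter is intentionally left out of this sketch.
    """
    kept = []
    for probe, values in expr.items():
        ref = [values[i] for i in reference_cols]
        tgt = [values[i] for i in target_cols]
        if max(ref) < ceiling and mean(tgt) > min_fold * mean(ref):
            kept.append(probe)
    return kept

def log_median_center(matrix):
    """log2-transform, subtract each row (probe) median, then each column (sample) median."""
    logged = [[math.log2(v) for v in row] for row in matrix]
    rows = [[v - median(row) for v in row] for row in logged]
    col_medians = [median(col) for col in zip(*rows)]
    return [[v - m for v, m in zip(row, col_medians)] for row in rows]

# Hypothetical data: samples 0-1 are EGFR/KRAS-positive, samples 2-3 are ALK-positive/triple-negative.
expr = {
    "probe_a": [100.0, 200.0, 900.0, 1200.0],
    "probe_b": [100.0, 1500.0, 900.0, 1200.0],
    "probe_c": [400.0, 500.0, 450.0, 520.0],
}
selected = select_probes(expr, reference_cols=[0, 1], target_cols=[2, 3])
centered = log_median_center([[1, 2, 4], [2, 8, 32]])
```

Here only `probe_a` passes both filters: `probe_b` exceeds the ceiling in one reference sample, and `probe_c` does not reach the 1.5-fold difference. The centered matrix would then be passed to a clustering program.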

Mutation analyses

Genomic DNAs from all 226 lung adenocarcinomas were analyzed for EGFR and KRAS mutations by the high-resolution melting method as described (23, 24). Total RNAs from the 226 adenocarcinomas were examined for expression of fusion transcripts between ALK and EML4 or KIF5B using a multiplex reverse transcription PCR (RT-PCR) method (25).

Statistics

Cumulative survival was estimated by the Kaplan–Meier method, and differences in survival between 2 groups were analyzed by the log-rank test. Influences of variables on relapse-free survival (RFS) and overall survival (OS) were evaluated by uni- and multivariate analyses using the Cox proportional hazards model. For all analyses, smoking status was dichotomized as never-smokers (0 pack years) and ever-smokers (>0 pack years). Pathologic TNM staging was categorized as stage I versus stage II. For multivariate analysis, all variables that were moderately associated (P < 0.1) with RFS or OS in any of the analyses were included.
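As an illustration of the Kaplan–Meier method named above, the following minimal sketch computes the product-limit estimate from follow-up times and event indicators. The function name and the toy data are hypothetical; the study's own analyses were presumably run in a statistical package that is not named here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events[i] is 1 for an observed event (relapse or death) and 0 for a
    censored observation.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Process all subjects sharing the same follow-up time together.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Toy data: four patients, the second one censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In the toy data the censored patient leaves the risk set without lowering the curve, so the estimate steps from 0.75 to 0.375 and then to 0.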

Bioinformatics

Associations of gene expression levels with prognosis of NSCLC patients in 7 other expression profile studies were obtained from the PrognoScan database (26). In the PrognoScan database, association of gene expression with survival of patients was evaluated by the minimum P value approach. Briefly, patients were first arranged by expression levels of a given gene. They were then divided into high- and low-expression groups at all possible cutoff points, and the risk differences of any 2 groups were estimated by the log-rank test. Finally, the cutoff point that gave the most pronounced P value was selected.
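The minimum P value approach described above can be sketched as follows. This is a simplified illustration, not the PrognoScan implementation: the two-group log-rank test uses the standard chi-square approximation with 1 degree of freedom, every cutoff between distinct expression values is tried, and no multiple-testing correction is applied. All function names and the example data are hypothetical.

```python
import math

def logrank_p(times, events, in_high):
    """Two-group log-rank test; returns the P value (chi-square, 1 d.f.)."""
    observed = expected = variance = 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)                        # at risk, both groups
        n1 = sum(1 for u, h in zip(times, in_high) if u >= t and h)  # at risk, high group
        d = sum(1 for u, e in zip(times, events) if u == t and e)    # events at time t
        d1 = sum(1 for u, e, h in zip(times, events, in_high) if u == t and e and h)
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

def minimum_p_cutoff(expression, times, events):
    """Try every split of the expression-ordered patients; return (cutoff, P)."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = expression[order[k - 1]], expression[order[k]]
        if lo == hi:
            continue  # tied values cannot be separated by a cutoff
        cutoff = (lo + hi) / 2
        in_high = [x > cutoff for x in expression]
        p = logrank_p(times, events, in_high)
        if best is None or p < best[1]:
            best = (cutoff, p)
    return best

# Hypothetical cohort in which high expression goes with short survival.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
times = [20, 19, 18, 17, 4, 3, 2, 1]
events = [1] * 8
best_cutoff, best_p = minimum_p_cutoff(expression, times, events)
```

Because the smallest of many P values is selected, the raw minimum overstates significance; this sketch does not attempt any correction for that.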

EGFR/KRAS/ALK mutations and clinicopathologic characteristics of lung adenocarcinomas subjected to gene expression profiling

Among the 226 stage I–II lung adenocarcinomas, EGFR and KRAS mutations were detected in a mutually exclusive manner in 127 (56%) and 20 (9%) cases, respectively, and an EML4–ALK fusion gene was expressed in 11 (4.9%) cases (Table 1). EGFR or KRAS mutations were not detected in any of the 11 cases with EML4–ALK fusion expression; thus, the occurrence of ALK rearrangements in a mutually exclusive manner with EGFR and KRAS mutations in lung adenocarcinoma was confirmed. The incidence and the fraction of EGFR-, KRAS-, and ALK-positive cases in this study were consistent with those in previous studies (5–7, 9, 13). Accordingly, the remaining 68 (30%) cases were defined as “triple-negative adenocarcinomas” because of the absence of EGFR, KRAS, and ALK mutations. Clinicopathologic features of EGFR-positive adenocarcinomas and KRAS-positive adenocarcinomas in this study were consistent with those in previous studies of Japanese populations (27, 28). Patients with ALK-positive adenocarcinomas were younger and more likely to be never-smokers, as previously indicated (4, 6–8). Triple-negative adenocarcinomas showed clinicopathologic features similar to those of KRAS-positive adenocarcinomas, that is, a predominance of males, ever-smokers, and advanced stages.

Table 1.

Clinicopathologic characteristics of 226 lung adenocarcinomas subjected to expression profile analysis

Column groups: Mutation (EGFR (+), KRAS (+), ALK (+), Triple (−)); Expression profile (Group A, Group B)
Variable All EGFR (+) KRAS (+) ALK (+) Triple (−) Group A Group B
No. of cases 226 127 20 11 68 36 32 
Age 
 Mean 60 60 60 54 61 61 60 
 Range 30–76 35–72 46–75 30–68 46–76 46–71 47–76 
Sex 
 Male 105 50 12 41 25 16 
 Female 121 77 27 11 16 
Smoking habit 
 Never-smoker 115 67 10 31 10 21 
 Ever-smoker 111 60 10 37 26 11 
pStage 
 IA 114 77 28 10 18 
 IB 54 26 20 12 
 II 58 24 20 14 

Expression profile unique to ALK-positive lung adenocarcinomas

All 226 cases were subjected to genome-wide expression profiling using Affymetrix U133Plus2.0 arrays. One hundred and seventy-four genes, evaluated with 190 probes (Supplementary Table S1), were selected as those preferentially expressed in either ALK-positive adenocarcinomas or triple-negative adenocarcinomas under the criteria described in Materials and Methods. In particular, 10 genes evaluated with 11 probes were markedly upregulated according to the criteria of fold-differences more than 2.0 with P values less than 0.05 (Supplementary Table S2). It was noted that 2 probes for the ALK gene were present among them, and 1 of them (probe ID = 208212_s_at) showed the highest fold-difference of 8.7 between ALK-positive/triple-negative adenocarcinomas and EGFR-positive/KRAS-positive adenocarcinomas among the 190 probes. This result indicated that there is a subset of adenocarcinomas in which ALK was overexpressed. Therefore, an unsupervised hierarchical clustering using these 190 probes was done on 11 ALK-positive adenocarcinomas and 68 triple-negative adenocarcinomas (Supplementary Figs. S1 and S2). There were 3 distinct sets of genes/probes, as indicated by red, yellow, and blue bars on the left of the heat map. Two probes for the ALK gene were present in the gene/probe set with a yellow bar, and 11 cases with extremely high levels of ALK expression comprised a small subcluster in the right side of cluster 1. All the 11 cases corresponded to the ones with EML4–ALK fusion gene expression.

The results strongly indicated that ALK-positive adenocarcinomas have distinct expression profiles in comparison with ALK-negative adenocarcinomas, including not only triple-negative adenocarcinomas but also EGFR-positive and KRAS-positive adenocarcinomas. Therefore, genes with fold-differences more than 2.0 and P values less than 0.05 in their expression between ALK-positive adenocarcinomas and ALK-negative adenocarcinomas were further selected from the 190 probes. Thirty genes with 32 probes were then selected (Table 2). The ALK gene showed the highest fold difference in ALK-positive adenocarcinomas. Therefore, as previously reported (29–31), ALK-positive adenocarcinomas express high levels of ALK gene products, supporting the view that upregulation of the ALK gene is a biological consequence of ALK rearrangements in lung adenocarcinoma cells. Expression profiling further revealed that various other genes are distinctly upregulated in ALK-positive adenocarcinomas. In particular, fold differences of GRIN2A (glutamate receptor, ionotropic, N-methyl d-aspartate 2A) expression were more than 10, as with ALK expression. Moreover, GRIN2A clustered most closely with ALK in the heat map (Supplementary Fig. S2). Therefore, high levels of GRIN2A expression can be a characteristic unique to ALK-positive adenocarcinomas, in addition to upregulation of the ALK gene itself. The levels of GRIN2A expression in ALK-positive adenocarcinomas were significantly higher than those in ALK-negative adenocarcinomas by quantitative RT-PCR analysis (Supplementary Fig. S3).

Table 2.

Genes upregulated in ALK-positive lung adenocarcinomas

Gene symbol(a) Gene name Probe ID Fold difference
ALK Anaplastic lymphoma receptor tyrosine kinase 208212_s_at 55.2 
EST Transcribed locus 242964_at 26.8 
ALK Anaplastic lymphoma receptor tyrosine kinase 208211_s_at 17.2 
GRIN2A Glutamate receptor, ionotropic, N-methyl d-aspartate 2A 242286_at 13.0 
GRIN2A Glutamate receptor, ionotropic, N-methyl d-aspartate 2A 231384_at 12.4 
MUC5AC /// MUC5B Mucin 5AC, oligomeric mucus/gel-forming /// mucin 5B, oligomeric mucus/gel-forming 222268_x_at 9.2 
EST Transcribed locus 1570291_at 8.1 
LOC100292909 Hypothetical protein LOC100292909 241535_at 7.7 
BLID BH3-like motif containing, cell death inducer 1555675_at 7.4 
LOC100130894 Hypothetical LOC100130894 1564158_a_at 6.1 
CLDN10 Claudin 10 1556687_a_at 6.0 
KRT16 Keratin 16 209800_at 5.9 
PROM2 Prominin 2 1562378_s_at 5.6 
GJB5 Gap junction protein, beta 5, 31.1 kDa 206156_at 5.0 
KIAA1644 KIAA1644 221901_at 4.8 
EPHB1 EPH receptor B1 210753_s_at 4.5 
LRRC4 Leucine rich repeat containing 4 223552_at 4.2 
EST Transcribed locus 235373_at 3.4 
tcag7.1188 Hypothetical LOC340340 1561254_at 3.3 
SBNO2 Strawberry notch homolog 2 (Drosophila) 204166_at 3.3 
EST Transcribed locus 241083_at 3.1 
SLC25A37 Solute carrier family 25, member 37 222528_s_at 3.1 
NDP Norrie disease (pseudoglioma) 206022_at 3.1 
EST Transcribed locus 243478_at 3.0 
EST Transcribed locus 239136_at 2.9 
RHOV ras homolog gene family, member V 241990_at 2.9 
YIF1B Yip1 interacting factor homolog B (S. cerevisiae) 231211_s_at 2.9 
RPRM Reprimo, TP53 dependent G2 arrest mediator candidate 219370_at 2.5 
SYT12 Synaptotagmin XII 228072_at 2.5 
HES2 Hairy and enhancer of split 2 (Drosophila) 231928_at 2.4 
CDH11 Cadherin 11, type 2, OB-cadherin (osteoblast) 239769_at 2.2 
IRAK3 Interleukin-1 receptor-associated kinase 3 220034_at 2.1 

(a) Genes with fold difference >2.0 and P < 0.05 between ALK-positive and ALK-negative adenocarcinomas are shown.

Triple-negative lung adenocarcinomas with poor prognosis identified by gene expression profiling

By unsupervised hierarchical clustering, the 68 triple-negative adenocarcinomas were separated into 2 major groups, one containing 36 cases and the other 32 cases, designated groups A and B, respectively (Fig. 1). Group A cases formed cluster 1 together with the 11 ALK-positive adenocarcinomas. Group A cases were predominantly males, ever-smokers, and advanced-stage cases, whereas group B cases were predominantly never-smokers and early-stage cases (Table 1), suggesting that group A cases comprise an aggressive type of triple-negative adenocarcinoma. Therefore, we next compared RFS and OS among the 5 groups of patients: groups A and B, EGFR-positive cases, KRAS-positive cases, and ALK-positive cases (Fig. 2). Among the 226 cases, the 204 cases that received complete resection and did not receive postoperative chemotherapy and/or radiotherapy were subjected to survival analysis. Group A cases (n = 32) showed the worst prognosis for both RFS and OS among the 5 groups (Fig. 2A and B). In particular, group A cases showed significantly worse prognosis (P < 0.05) for both RFS and OS than group B cases (n = 30) and EGFR-positive cases (n = 116) by the log-rank test. Such differences were marginally significant between group A cases and KRAS-positive cases (n = 19) and not significant between group A cases and ALK-positive cases (n = 7), probably because the numbers of KRAS-positive and ALK-positive cases were smaller than those of group B and EGFR-positive cases.

Figure 1.

Unsupervised hierarchical clustering of 11 ALK-positive adenocarcinomas and 68 triple-negative adenocarcinomas. Triple-negative adenocarcinomas were separated into 36 group A cases and 32 group B cases, and group A cases form cluster 1 together with the 11 ALK-positive adenocarcinoma cases. Clinical and genetic features are shown below the tree: sex (black, male; white, female); smoking status (black, ever-smoker; white, never-smoker); pathologic stage (black, stage II; gray, stage IB; white, stage IA); relapse (black, evidence of relapse; white, no evidence of relapse); ALK (yellow, ALK-fusion gene expression positive; white, negative). Three colored bars corresponding to the main branches of probes/genes are shown on the left. Positions of probes for ALK, GRIN2A, and DEPDC1 are shown on the right. ADC, adenocarcinoma.

Figure 2.

Kaplan–Meier survival curves for RFS and OS of 204 lung adenocarcinoma cases according to EGFR-positive, KRAS-positive, ALK-positive, group A, and group B. RFS and OS of stage I–II (A, B) and stage I (C, D) cases are shown.

Similar results were obtained from the analysis of 162 patients with stage I adenocarcinomas (Fig. 2C and D), indicating that these associations were independent of stage. Therefore, we next carried out multivariate analyses of RFS and OS in these 5 groups (Table 3). In the analysis of the 204 stage I–II patients, RFS and OS of group A cases were significantly worse than those of EGFR-positive and group B cases, and the differences were independent of stage. HRs of group A cases relative to ALK-positive and KRAS-positive cases were as high as those relative to EGFR-positive and group B cases, although only the difference in RFS between group A cases and KRAS-positive cases was statistically significant. This could also be due to the small numbers of KRAS-positive and ALK-positive cases. Accordingly, multivariate analyses of the 162 stage I patients further showed significant differences in RFS and OS between group A cases and EGFR-positive cases, and also between group A cases and group B cases. Because the numbers of KRAS-positive cases and ALK-positive cases were small, we next compared RFS and OS between group A patients and patients in all 4 other groups combined (“Others” in Table 3). Differences in RFS as well as those in OS were highly significant and independent of stage. These results strongly indicated that group A patients comprise a distinct subclass of EGFR/KRAS/ALK-negative lung adenocarcinomas, and that the prognosis of group A patients was the worst among the 5 groups of patients.

Table 3.

Hazard ratios for relapse-free and overall survivals in lung adenocarcinomas

Survival Case (n) Variable Univariate HR (95% CI) Univariate P Multivariate HR (95% CI) Multivariate P
Relapse free Stage I–II (204) Age 1.03 (0.99–1.07) 0.12 1.04 (0.99–1.08) 0.092 
  Sex (male/female) 1.39 (0.82–2.38) 0.22 1.00 (0.49–2.05) 0.99 
  Smoking habit (ever/never) 1.43 (0.84–2.44) 0.19 1.10 (0.54–2.24) 0.80 
  pStage (II/I) 1.86 (1.41–2.45) 1.3E-05 3.50 (1.93–6.34) 3.6E-05 
  Subgroup 
   Group A/ALK (+) 4.78 (0.63–35.99) 0.13 6.01 (0.76–47.82) 0.09 
   Group A/KRAS (+) 2.43 (0.96–6.17) 0.062 2.85 (1.10–7.35) 0.031 
   Group A/EGFR (+) 3.58 (1.93–6.64) 5.3E-05 2.76 (1.44–5.29) 0.0022 
   Group A/Group B 4.58 (1.69–12.42) 0.0028 4.10 (1.50–11.24) 0.0061 
   Group A/Others 3.56 (2.00–6.34) 1.6E-05 3.04 (1.68–5.53) 2.5E-04 
 Stage I (162) Age 1.01 (0.96–1.06) 0.69 1.00 (0.95–1.05) 0.97 
  Sex (male/female) 0.99 (0.50–1.96) 0.98 0.83 (0.33–2.07) 0.69 
  Smoking habit (ever/never) 1.06 (0.54–2.08) 0.87 0.97 (0.39–2.45) 0.95 
  Subgroup 
   Group A/ALK (+) — — — — 
   Group A/KRAS (+) 2.31 (0.73–7.28) 0.15 2.36 (0.73–7.62) 0.15 
   Group A/EGFR (+) 4.33 (2.00–9.35) 2.0E-04 4.51 (2.05–9.91) 1.7E-04 
   Group A/Group B 5.36 (1.49–19.24) 0.010 5.52 (1.50–20.37) 0.010 
   Group A/Others 4.18 (2.03–8.60) 1.0E-04 4.32 (2.06–9.09) 1.1E-04 
Overall Stage I–II (204) Age 1.03 (0.98–1.08) 0.33 1.03 (0.98–1.09) 0.21 
  Sex (male/female) 1.69 (0.82–3.48) 0.16 0.89 (0.33–2.41) 0.82 
  Smoking habit (ever/never) 1.91 (0.92–3.97) 0.084 1.46 (0.54–3.92) 0.45 
  pStage (II/I) 2.07 (1.45–2.97) 7.2E-05 3.93 (1.83–8.44) 4.6E-04 
  Subgroup 
   Group A/ALK (+) 2.95 (0.38–22.78) 0.30 3.50 (0.41–29.85) 0.25 
   Group A/KRAS (+) 3.12 (0.88–11.09) 0.079 3.31 (0.91–12.03) 0.069 
   Group A/EGFR (+) 4.59 (2.06–10.23) 2.0E-04 3.35 (1.44–7.81) 0.005 
   Group A/Group B 6.83 (1.53–30.54) 0.012 5.68 (1.24–25.95) 0.025 
   Group A/Others 4.50 (2.17–9.36) 5.7E-05 3.61 (1.68–7.78) 0.0010 
 Stage I (162) Age 0.99 (0.93–1.06) 0.73 0.98 (0.91–1.04) 0.45 
  Sex (male/female) 1.15 (0.43–3.08) 0.79 0.77 (0.20–3.00) 0.70 
  Smoking habit (ever/never) 1.47 (0.55–3.91) 0.45 1.26 (0.32–4.89) 0.74 
  Subgroup 
   Group A/ALK (+) — — — — 
   Group A/KRAS (+) 5.79 (0.71–47.3) 0.10 5.61 (0.67–46.84) 0.11 
   Group A/EGFR (+) 5.83 (2.04–16.71) 0.0010 6.06 (2.08–17.71) 9.8E-04 
   Group A/Group B 9.13 (1.12–74.34) 0.039 9.32 (1.10–78.61) 0.040 
   Group A/Others 6.30 (2.34–16.99) 2.8E-04 6.47 (2.33–17.98) 3.4E-04 

Clustering of lung adenocarcinomas with poor prognosis by gene expression profiling

We next carried out unsupervised hierarchical clustering of all 226 adenocarcinoma cases, including 127 EGFR-positive cases and 20 KRAS-positive cases, to investigate whether expression profiling with the set of 174 genes (190 probes) could extract group A cases as a unique subset among all adenocarcinomas, and whether the profiling could be useful for predicting prognosis of patients with adenocarcinomas of any genotype. As shown in Supplementary Fig. S4, clustering patterns of all 226 patients were very similar to those of the 79 patients consisting of 11 ALK-positive cases and 68 triple-negative cases. In particular, the 11 ALK-positive cases comprised a small cluster in the right side of Cluster 1 (Cluster 1b), supporting the view that ALK-positive adenocarcinomas show unique expression profiles among all adenocarcinomas. Group A and group B cases also tended to accumulate in Cluster 1a and Cluster 2, respectively. However, group A cases often clustered with the KRAS-positive cases, whereas group B cases were distributed among the EGFR-positive cases. Thus, group A and group B triple-negative adenocarcinomas were not separated from the EGFR-positive and KRAS-positive adenocarcinomas by expression profiling of these 174 genes. Expression profiling with the set of 174 genes was therefore concluded to be useful for distinguishing ALK-positive adenocarcinomas among all lung adenocarcinomas.

However, RFS of the 119 patients in Cluster 1 was significantly worse than RFS of the 85 patients in Cluster 2 (HR = 3.73, P = 0.00016). When Cluster 1 was further divided into 2 subclasses, 1a and 1b, on the left and right sides, respectively, Cluster 1a, which contained most of the group A patients, showed the worst prognosis among the 3 subclasses (Supplementary Fig. S4). Therefore, the expression signature of these 174 genes was indicated to be useful for prognostic prediction in adenocarcinoma patients, in particular in triple-negative adenocarcinoma patients.

Minimum set of genes characterizing triple-negative lung adenocarcinomas with poor prognosis

The above results implied that triple-negative adenocarcinomas can be classified into 2 distinct subgroups by expression profiling and that the prognoses of these 2 groups differ significantly. Accordingly, expression of several of the 174 genes was expected to be independently associated with the prognosis of triple-negative adenocarcinoma patients. We therefore selected, from the 174 genes evaluated by the 190 probes, those whose expression was associated with prognosis. To evaluate the prognostic value of each probe, and to allow comparison with the association between gene expression and prognosis in other cohorts, we took a minimum P value approach to grouping the patients for survival analysis. The reason is as follows. A database named PrognoScan was recently developed by coauthors of this study (26). It provides, for a number of published cohorts, minimum P values for the association of gene expression with prognosis for all probes on a platform. The present findings could therefore be validated against data from various other cohorts by the same criteria. According to the method described previously (26), corrected minimum P values were calculated for each probe to control the error rate in evaluating the association with RFS and OS. Expression of 11 genes evaluated with 12 probes (2 probes for the DEPDC1 gene) showed significant associations with both RFS and OS in 62 triple-negative adenocarcinomas and also in 46 stage I triple-negative adenocarcinomas (Table 4). Expression of 10 of the 11 genes was positively correlated with poor prognosis, whereas expression of the remaining gene, KIF19, was negatively correlated with poor prognosis.
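The minimum P value approach can be illustrated with a short sketch. The Python code below is not the PrognoScan implementation: it assumes a two-group log-rank test, scans every split of patients ranked by expression, and reports the uncorrected minimum. The function names and the toy data are hypothetical, and the correction described in ref. 26 is not reproduced.

```python
import math

def logrank_p(times, events, groups):
    """Two-sided log-rank P value comparing group 1 with group 0."""
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({tt for tt, e in zip(times, events) if e}):
        at_risk = [g for tt, g in zip(times, groups) if tt >= t]
        n, n1 = len(at_risk), sum(at_risk)
        d = sum(1 for tt, e in zip(times, events) if tt == t and e)
        d1 = sum(1 for tt, e, g in zip(times, events, groups)
                 if tt == t and e and g)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0  # no information to separate the groups
    chi2 = obs_minus_exp ** 2 / var
    return math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 d.f.

def min_p_cutpoint(expression, times, events):
    """Try every high/low split by expression rank; return (min P, cutpoint)."""
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    best = (1.0, None)
    for k in range(1, len(order)):
        high = set(order[k:])
        groups = [1 if i in high else 0 for i in range(len(expression))]
        p = logrank_p(times, events, groups)
        if p < best[0]:
            best = (p, expression[order[k]])
    return best

# Hypothetical toy cohort: expression level, follow-up (months), event flag.
toy_expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
toy_times = [60, 58, 55, 50, 12, 10, 8, 5]
toy_events = [0, 0, 0, 0, 1, 1, 1, 1]
best_p, best_cut = min_p_cutpoint(toy_expression, toy_times, toy_events)
```

Because the minimum is taken over many candidate cutpoints, the raw value is optimistic, which is why corrected minimum P values are used in the analysis above.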

Table 4.

List of genes whose expression is associated with relapse free survival and overall survival of patients with lung adenocarcinoma

NCC data sets (each cell: P / HR; -: not shown)

Gene symbol | Probe ID (for NCC) | TN, stage I–II, RFS (n = 62) | TN, stage I–II, OS (n = 62) | TN, stage I, RFS (n = 46) | TN, stage I, OS (n = 46) | All, stage I–II, RFS (n = 204) | All, stage I–II, OS (n = 204) | All, stage I, RFS (n = 162) | All, stage I, OS (n = 162)
DEPDC1 | 222958_s_at | 0.00 / 2.3 | 0.00 / 3.0 | 0.00 / 3.0 | 0.02 / 2.7 | 0.00 / 2.1 | 0.00 / 1.8 | 0.00 / 2.0 | 0.00 / 2.1
DEPDC1 | 235545_at | 0.00 / 1.8 | 0.01 / 2.3 | 0.00 / 2.4 | 0.04 / 2.6 | 0.00 / 1.4 | 0.00 / 1.6 | 0.00 / 1.3 | 0.00 / 2.2
FOSL2 | 218881_s_at | 0.01 / 1.7 | 0.03 / 1.7 | 0.02 / 1.8 | 0.00 / 3.3 | 0.00 / 1.2 | 0.00 / 1.7 | 0.00 / 1.4 | 0.00 / 2.4
MCM4 | 222037_at | 0.00 / 1.8 | 0.00 / 3.0 | 0.01 / 2.0 | 0.04 / 2.6 | 0.00 / 1.4 | 0.00 / 1.8 | 0.00 / 1.5 | 0.00 / 2.1
UBE2S | 202779_s_at | 0.00 / 3.0 | 0.02 / 16.0 | 0.01 / 2.8 | 0.02 / 16.6 | 0.00 / 1.6 | 0.02 / 1.4 | 0.00 / 1.6 | 0.00 / 2.1
CD300A | 217078_s_at | 0.01 / 1.7 | 0.00 / 2.1 | 0.04 / 1.7 | 0.00 / 2.8 | 0.00 / 1.1 | 0.00 / 1.5 | 0.01 / 1.2 | 0.01 / 1.7
SLITRK4 | 232636_at | 0.02 / 1.7 | 0.03 / 1.7 | 0.00 / 2.9 | 0.00 / 2.5 | 0.01 / 1.1 | 0.00 / 1.4 | 0.04 / 1.1 | 0.00 / 2.0
KRT16 | 209800_at | 0.00 / 2.0 | 0.00 / 2.5 | 0.00 / 2.4 | 0.00 / 2.7 | 0.00 / 1.2 | 0.00 / 1.4 | 0.01 / 1.2 | 0.01 / 1.7
SIGLEC9 | 210569_s_at | 0.00 / 1.9 | 0.01 / 2.0 | 0.00 / 2.1 | 0.04 / 2.1 | 0.00 / 1.6 | 0.00 / 1.4 | 0.00 / 1.7 | 0.00 / 2.2
DIAPH3 | 232596_at | 0.02 / 1.5 | 0.00 / 3.0 | 0.05 / 1.6 | 0.03 / 2.6 | 0.00 / 1.2 | 0.00 / 2.1 | 0.00 / 1.5 | 0.00 / 2.0
LOC152225 | 1562048_at | 0.01 / 1.5 | 0.00 / 2.3 | 0.02 / 1.7 | 0.00 / 2.7 | 0.00 / 1.3 | 0.00 / 1.9 | - | 0.00 / 1.9
KIF19 | 1553314_a_at | 0.01 / −1.5 | 0.05 / −1.6 | 0.00 / −3.0 | 0.00 / −2.5 | - | 0.03 / −1.4 | - | -

Other cohorts, stage I–III (each cell: P / HR; -: not shown)

Gene symbol | CAN/DF, OS (n = 82) | HLM, OS (n = 79) | MSK, OS (n = 104) | UM, OS (n = 178) | Nagoya, OS (n = 117) | Duke, RFS (n = 111) | Seoul, RFS (n = 138)
DEPDC1 | - | - | 0.03 / 1.1 | - | 0.00 / 1.6 | 0.04 / 1.0 | 0.01 / 0.9
FOSL2 | - | 0.00 / 1.7 | - | 0.02 / 0.7 | - | - | 0.01 / 1.0
MCM4 | - | - | - | - | 0.00 / 1.7 | - | -
UBE2S | - | - | 0.05 / 1.0 | - | - | - | -
CD300A | - | - | 0.00 / 1.5 | - | - | - | -
SLITRK4 | - | - | - | - | - | - | -
KRT16 | - | - | - | - | - | - | -
SIGLEC9 | - | - | - | - | - | - | -
DIAPH3 | - | - | - | - | - | - | -
LOC152225 | - | - | - | - | - | - | -
KIF19 | - | - | - | - | 0.00 / −1.4 | - | -

Abbreviations: NCC, National Cancer Center; TN, Triple-negative.

HRs (log2 ratio) with corrected P value < 0.05 are shown.

We first selected 174 genes as being preferentially expressed in either ALK-positive adenocarcinomas or triple-negative adenocarcinomas by the criterion of “probes whose expression levels in any adenocarcinomas with EGFR or KRAS mutations were lower than the mean expression level of a total of 54,675 probes.” Then, 11 of the 174 genes were further selected as being associated with the prognosis of patients with triple-negative adenocarcinomas. Higher expression of several of these 11 genes was therefore predicted to be associated with poorer prognosis even when all adenocarcinoma cases, including EGFR-positive, KRAS-positive, and ALK-positive adenocarcinomas, were analyzed together. Furthermore, triple-negative adenocarcinomas with poor prognosis were expected to fall into the high-risk group defined by this procedure. For this reason, we next analyzed all 204 adenocarcinoma cases. Among the 11 genes with 12 probes, 9 genes with 10 probes showed significant associations with both RFS and OS in all 204 adenocarcinoma cases and also in the 162 stage I adenocarcinoma cases. LOC152225 and KIF19 were excluded because they showed no significant association in stage I adenocarcinoma cases. As predicted, higher expression of the 9 genes was correlated with poorer prognosis in the analysis of RFS and OS among the 204 stage I–II cases and also among the 162 stage I cases.
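The quoted selection criterion can be written as a simple filter. The sketch below is one reading of it, not the study's actual pipeline: it assumes the reference value is the mean over all probes and all samples, and it keeps a probe only if every EGFR- or KRAS-positive sample falls below that mean. Probe names, sample names, and values are hypothetical.

```python
def select_probes(expr, egfr_kras_samples):
    """Keep probes that are below the global mean in every EGFR/KRAS-positive sample.

    expr maps probe ID -> {sample ID -> expression level}.
    """
    all_values = [v for by_sample in expr.values() for v in by_sample.values()]
    global_mean = sum(all_values) / len(all_values)
    return [probe for probe, by_sample in expr.items()
            if all(by_sample[s] < global_mean for s in egfr_kras_samples)]

# Hypothetical toy matrix: 3 probes measured in 4 tumors.
toy_expr = {
    "probe_A": {"egfr1": 2.0, "kras1": 1.0, "alk1": 9.0, "tn1": 8.0},
    "probe_B": {"egfr1": 7.0, "kras1": 6.0, "alk1": 6.0, "tn1": 7.0},
    "probe_C": {"egfr1": 1.0, "kras1": 8.0, "alk1": 2.0, "tn1": 3.0},
}
selected = select_probes(toy_expr, ["egfr1", "kras1"])
```

In this toy matrix the global mean is 5.0, so only probe_A, which is low in both the EGFR- and KRAS-positive samples, passes the filter.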

The result strongly indicated that unsupervised hierarchical clustering using this 10-probe set (9 genes) would separate the patients into high-risk and low-risk groups for prognosis and that all group A triple-negative adenocarcinoma patients with poor prognosis would be classified into the high-risk group (Fig. 3 and Supplementary Table S3). As expected, expression profiling of these 9 genes successfully separated the 204 patients into high-risk and low-risk groups with significantly different RFS (HR = 3.79, 95% CI = 2.19–6.55, P = 1.9E-06) as well as OS (HR = 5.72, 95% CI = 2.53–12.87, P = 2.5E-05). Furthermore, when only the 62 triple-negative cases were separated with these 9 genes, HRs for both RFS and OS were much higher than those obtained with all 204 cases. All the relapsed cases in group A fell into the high-risk group in both analyses (all 204 cases and the 62 triple-negative cases only), supporting the view that triple-negative adenocarcinoma cases with poor prognosis can be selected as a high-risk group from all adenocarcinoma cases by expression profiling of these 9 genes (Fig. 3). This profiling further separated the 162 stage I cases, as well as the 46 stage I triple-negative adenocarcinoma cases, into high-risk and low-risk groups with significantly different RFS as well as OS (Supplementary Fig. S5 and Supplementary Table S3). Again, HRs for both RFS and OS were much higher in triple-negative adenocarcinoma cases than in all adenocarcinoma cases. Accordingly, high levels of expression of these 9 genes were concluded to be a distinct characteristic of triple-negative adenocarcinomas with poor prognosis.
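The two-group separation described above can be sketched as follows. The distance metric and linkage used in the study are not stated in this section, so Euclidean distance with average linkage is an assumption here, as are the function names and the toy values. The cluster with the higher mean expression is labeled high-risk, following the rule given in the legend of Fig. 3.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering; returns clusters as lists of sample indices."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average pairwise distance between the two candidate clusters.
            d = sum(euclidean(profiles[i], profiles[j])
                    for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters

def label_risk(profiles, clusters):
    """Label the cluster with the higher mean expression as high-risk."""
    def mean_expr(cluster):
        total = sum(sum(profiles[i]) for i in cluster)
        return total / (len(cluster) * len(profiles[0]))
    high = max(clusters, key=mean_expr)
    return ["high" if i in high else "low" for i in range(len(profiles))]

# Hypothetical log2 expression: rows are patients, columns are 3 genes.
toy_profiles = [
    [8.1, 7.9, 8.4],
    [7.8, 8.2, 8.0],
    [3.1, 2.9, 3.5],
    [2.8, 3.3, 3.0],
    [8.3, 8.0, 7.7],
]
risk = label_risk(toy_profiles, average_linkage(toy_profiles))
```

With these toy values, patients 0, 1, and 4 are labeled high-risk and patients 2 and 3 low-risk. The resulting labels would then be compared for RFS and OS.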

Figure 3.

Unsupervised hierarchical clustering based on the expression of a set of 9 genes. All 204 stage I–II adenocarcinomas and 62 triple-negative (TN) stage I–II adenocarcinomas of the National Cancer Center (NCC) data set subjected to survival analysis were analyzed, and a cluster with higher expression of these genes than the other cluster was recognized as a high-risk group (red bar). Results of 117 adenocarcinomas, including 57 double-negative (DN) adenocarcinomas, of the Aichi Cancer Center (ACC) data set are shown below.

Validation of associations using independent expression profiling data

To validate the present findings using data from other cohorts, we searched various databases for expression profiling data accompanied by mutation data for the EGFR, KRAS, and ALK genes. However, no cohort was found in which expression profiles had been analyzed specifically in triple-negative adenocarcinomas. Therefore, unsupervised hierarchical clustering using these 9 genes was performed on a cohort of 117 Japanese lung adenocarcinoma cases, because this was the only cohort for which both expression profile data and EGFR/KRAS mutation data were available (32). This cohort included 57 adenocarcinoma cases without EGFR and KRAS mutations. Although a different array platform was used, data for all 9 genes were available for clustering. These 57 cases were separated into 2 groups of 33 cases and 24 cases (Fig. 3). OS of the 33 cases was significantly shorter than that of the 24 cases (HR = 3.17, 95% CI = 1.17–8.63, P = 2.4E-02; Supplementary Table S3). As in our cohort, the high-risk group also showed a significantly higher risk (HR = 2.73) when all 117 cases were analyzed together. Although ALK mutation data were not available for this cohort, the results strongly supported the conclusion that expression profiling of the 9 genes would be highly informative for predicting the prognosis of lung adenocarcinoma patients, in particular patients with EGFR- and KRAS-negative adenocarcinomas.
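The HR, CI, and P value reported for the EGFR- and KRAS-negative cases of this cohort can be cross-checked against one another. The sketch below assumes that the interval is a Wald-type 95% CI, symmetric on the log-HR scale; the section does not state how the interval was computed, and the function name is hypothetical.

```python
import math

def wald_p_from_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Two-sided P value implied by a hazard ratio and its 95% CI.

    Assumes the CI is exp(log(HR) +/- z_crit * SE), i.e. a Wald interval.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail

# Values reported for OS in the 57 EGFR/KRAS-negative cases.
p_os = wald_p_from_ci(3.17, 1.17, 8.63)
```

Under this assumption the implied P value is about 0.024, consistent with the reported P = 2.4E-02.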

Associations of DEPDC1 expression with prognosis of NSCLC patients

Associations of gene expression with prognosis in various cancers are available from the PrognoScan database (26). Therefore, associations of expression of these 9 genes with prognosis of NSCLC patients were examined in 7 other cohorts (Table 4). Notably, DEPDC1 expression was positively associated with poor prognosis in 4 of the 7 cohorts: MSK, Nagoya, Duke, and Seoul. The results strongly indicated that DEPDC1 expression can be a novel prognostic marker for patients with NSCLC. Representative data showing the association of DEPDC1 expression with prognosis in 204 adenocarcinoma patients, obtained by the minimum P value approach, are shown in Supplementary Fig. S6. Associations of DEPDC1 expression with RFS and OS were validated by quantitative RT-PCR analysis of the 204 stage I–II cases and also of the 162 stage I cases (Supplementary Fig. S3).

FOSL2 expression was associated with prognosis in 3 of the 7 cohorts, whereas expression of MCM4, CD300A, and UBE2S was each associated with prognosis in 1 cohort (Table 4).

In this study, we attempted to characterize ALK-positive adenocarcinomas and triple-negative adenocarcinomas by genome-wide expression profiling. For this purpose, we selected a set of genes that are not transcriptionally activated in any EGFR-positive or KRAS-positive adenocarcinomas, and obtained 2 pieces of unique evidence. One is that ALK-positive adenocarcinomas show unique expression profiles in comparison with any other type of adenocarcinoma. The other is that there is a group of patients with extremely poor prognosis among those with triple-negative adenocarcinomas. Patients in this group, herein designated group A, showed much worse prognoses than patients with EGFR, KRAS, or ALK mutations and than patients in the other triple-negative group, group B.

ALK-positive adenocarcinomas are sensitive to ALK TKIs with an overall response rate of 55% (8). Therefore, for the clinical application of ALK-targeted therapy, it is indispensable to develop a simple and reliable method for detection of ALK rearrangements in lung adenocarcinomas. Here, we showed that ALK expression is high exclusively in ALK-positive adenocarcinomas and that several other genes, including GRIN2A, are overexpressed together with ALK specifically in ALK-positive adenocarcinomas. Therefore, GRIN2A can be a biomarker for detection of ALK-positive adenocarcinomas. GRIN2A encodes a subunit of the N-methyl-D-aspartate (NMDA) receptor, a neurotransmitter-gated ion channel involved in regulation of synaptic function in the central nervous system (33). Notably, the GRIN2A gene was recently reported to be frequently mutated in melanoma (34). Therefore, although the biological significance of GRIN2A upregulation in ALK-positive adenocarcinomas remains unclear, GRIN2A expression may play an important role in the phenotype unique to ALK-positive adenocarcinomas. The expression profiles unique to ALK-positive adenocarcinomas shown here will also be informative for improving clinical detection of ALK rearrangements.

Group A cases were discriminated by expression profiling of 9 genes among stage I–II patients who underwent complete surgical resection of their tumors. Therefore, this gene set will be applicable as a set of biomarkers to select lung adenocarcinoma patients who will benefit from adjuvant therapy after surgery, in particular among patients with triple-negative adenocarcinomas. For this reason, combining this expression profiling with mutational analyses of the EGFR, KRAS, and ALK genes will be appropriate for picking out triple-negative adenocarcinoma patients with poor prognosis from all adenocarcinoma patients. Molecular targeting drugs against triple-negative adenocarcinomas are not available at present; therefore, genes upregulated in group A cases will also be applicable as targets for therapy. DEPDC1 was previously identified as being upregulated in bladder cancer and breast cancer (35–37). Because DEPDC1 expression was hardly detectable in any normal tissue except testis, it has been considered a cancer/testis antigen and also a promising target of therapeutic drugs (35, 36). This study showed that DEPDC1 is preferentially expressed in triple-negative adenocarcinomas with poor prognosis. In the PrognoScan database, DEPDC1 expression is shown to be positively associated with poor prognosis in bladder cancer, multiple myeloma, breast cancer, glioma, and melanoma. Therefore, DEPDC1 could be a novel target for diagnosis as well as therapy in various cancers, including lung adenocarcinoma.

Identification of genetic alterations that occur specifically in group A cases will also be of great importance for the development of targeted therapy for stage I–II lung adenocarcinoma patients with poor outcomes. The majority of group A cases are males and ever-smokers (Table 1); therefore, group A cases are likely to carry several genetic alterations induced by tobacco carcinogens that lead to poor outcomes. Identification of genetic alterations in group A adenocarcinomas will further facilitate the development of targeted therapies for lung adenocarcinomas with poor prognosis.

No potential conflicts of interest were disclosed.

The authors thank Dr. Teruhiko Yoshida and Ms. Sachiyo Mimaki for their efforts in expression profiling.

This work was supported in part by grants-in-aid from the Ministry of Health, Labor and Welfare for the 3rd-term Comprehensive 10-year Strategy for Cancer Control and from the Program for Promotion of Fundamental Studies in Health Sciences of the National Institute of Biomedical Innovation (NIBIO: 10–41). K. Shiraishi was an awardee of a Research Resident Fellowship from the Foundation for Promotion of Cancer Research in Japan.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer 2010;127:2893–917.
2. Parkin DM, Bray F, Ferlay J, Pisani P. Global cancer statistics, 2002. CA Cancer J Clin 2005;55:74–108.
3. Colby T, Noguchi M, Henschke C, Vazquez M, Geisinger K, Yokose T, et al. Adenocarcinoma. In: Travis WD, Brambilla E, Muller-Hermelink HK, Harris CC, editors. World Health Organization classification of tumors: pathology and genetics; tumours of the lung, pleura, thymus and heart. IARC Press: Lyon, France; 2004. p. 35–44.
4. Pao W, Girard N. New driver mutations in non-small-cell lung cancer. Lancet Oncol 2011;12:175–80.
5. Herbst RS, Heymach JV, Lippman SM. Lung cancer. N Engl J Med 2008;359:1367–80.
6. Janku F, Stewart DJ, Kurzrock R. Targeted therapy in non-small-cell lung cancer–is it becoming a reality? Nat Rev Clin Oncol 2010;7:401–14.
7. Bronte G, Rizzo S, La Paglia L, Adamo V, Siragusa S, Ficorella C, et al. Driver mutations and differential sensitivity to targeted therapies: a new approach to the treatment of lung adenocarcinoma. Cancer Treat Rev 2010;36 Suppl 3:S21–9.
8. Gerber DE, Minna JD. ALK inhibition for non-small cell lung cancer: from discovery to therapy in record time. Cancer Cell 2010;18:548–51.
9. Sun S, Schiller JH, Gazdar AF. Lung cancer in never smokers–a different disease. Nat Rev Cancer 2007;7:778–90.
10. Rosell R, Moran T, Queralt C, Porta R, Cardenal F, Camps C, et al. Screening for epidermal growth factor receptor mutations in lung cancer. N Engl J Med 2009;361:958–67.
11. Mok TS, Wu YL, Thongprasert S, Yang CH, Chu DT, Saijo N, et al. Gefitinib or carboplatin-paclitaxel in pulmonary adenocarcinoma. N Engl J Med 2009;361:947–57.
12. Konstantinopoulos PA, Karamouzis MV, Papavassiliou AG. Post-translational modifications and regulation of the RAS superfamily of GTPases as anticancer targets. Nat Rev Drug Discov 2007;6:541–55.
13. Soda M, Choi YL, Enomoto M, Takada S, Yamashita Y, Ishikawa S, et al. Identification of the transforming EML4-ALK fusion gene in non-small-cell lung cancer. Nature 2007;448:561–6.
14. Takeuchi T, Tomida S, Yatabe Y, Kosaka T, Osada H, Yanagisawa K, et al. Expression profile-defined classification of lung adenocarcinoma shows close relationship with underlying major genetic changes and clinicopathologic behaviors. J Clin Oncol 2006;24:1679–88.
15. Angulo B, Suarez-Gauthier A, Lopez-Rios F, Medina PP, Conde E, Tang M, et al. Expression signatures in lung cancer reveal a profile for EGFR-mutant tumours and identify selective PIK3CA overexpression by gene amplification. J Pathol 2008;214:347–56.
16. Motoi N, Szoke J, Riely GJ, Seshan VE, Kris MG, Rusch VW, et al. Lung adenocarcinoma: modification of the 2004 WHO mixed subtype to include the major histologic subtype suggests correlations between papillary and micropapillary adenocarcinoma subtypes, EGFR mutations and gene expression analysis. Am J Surg Pathol 2008;32:810–27.
17. Meyerson M, Carbone D. Genomic and proteomic profiling of lung cancers: lung cancer classification in the age of targeted therapy. J Clin Oncol 2005;23:3219–26.
18. Travis WD, Brambilla E, Muller-Hermelink HK, Harris CC, editors. World Health Organization classification of tumors: pathology and genetics; tumours of the lung, pleura, thymus and heart. IARC Press: Lyon, France; 2004. p. 1–344.
19. Saito M, Schetter AJ, Mollerup S, Kohno T, Skaug V, Bowman ED, et al. The association of microRNA expression with prognosis and progression in early-stage, non-small cell lung adenocarcinoma: a retrospective analysis of three cohorts. Clin Cancer Res 2011;17:1875–82.
20. Robins JM, Gail MH, Lubin JH. More on “Biased selection of controls for case-control analyses of cohort studies”. Biometrics 1986;42:293–9.
21. Richardson DB. An incidence density sampling program for nested case-control analyses. Occup Environ Med 2004;61:e59.
22. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 1998;95:14863–8. Available from: http://rana.lbl.gov/eisen/.
23. Takano T, Ohe Y, Sakamoto H, Tsuta K, Matsuno Y, Tateishi U, et al. Epidermal growth factor receptor gene mutations and increased copy numbers predict gefitinib sensitivity in patients with recurrent non-small-cell lung cancer. J Clin Oncol 2005;23:6829–37.
24. Takano T, Ohe Y, Tsuta K, Fukui T, Sakamoto H, Yoshida T, et al. Epidermal growth factor receptor mutation detection using high-resolution melting analysis predicts outcomes in patients with advanced non small cell lung cancer treated with gefitinib. Clin Cancer Res 2007;13:5385–90.
25. Takeuchi K, Choi YL, Soda M, Inamura K, Togashi Y, Hatano S, et al. Multiplex reverse transcription-PCR screening for EML4-ALK fusion transcripts. Clin Cancer Res 2008;14:6618–24.
26. Mizuno H, Kitada K, Nakai K, Sarai A. PrognoScan: a new database for meta-analysis of the prognostic value of genes. BMC Med Genomics 2009;2:18. Available from: http://gibk21.bio.kyutech.ac.jp/PrognoScan/index.html.
27. Mitsudomi T. Advances in target therapy for lung cancer. Jpn J Clin Oncol 2010;40:101–6.
28. Suda K, Tomizawa K, Mitsudomi T. Biological and clinical significance of KRAS mutations in lung cancer: an oncogenic driver that contrasts with EGFR mutation. Cancer Metastasis Rev 2010;29:49–60.
29. Zhang X, Zhang S, Yang X, Yang J, Zhou Q, Yin L, et al. Fusion of EML4 and ALK is associated with development of lung adenocarcinomas lacking EGFR and KRAS mutations and is correlated with ALK expression. Mol Cancer 2010;9:188.
30. Boland JM, Erdogan S, Vasmatzis G, Yang P, Tillmans LS, Johnson MR, et al. Anaplastic lymphoma kinase immunoreactivity correlates with ALK gene rearrangement and transcriptional up-regulation in non-small cell lung carcinomas. Hum Pathol 2009;40:1152–8.
31. Mino-Kenudson M, Chirieac LR, Law K, Hornick JL, Lindeman N, Mark EJ, et al. A novel, highly sensitive antibody allows for the routine detection of ALK-rearranged lung adenocarcinomas by standard immunohistochemistry. Clin Cancer Res 2010;16:1561–71.
32. Tomida S, Takeuchi T, Shimada Y, Arima C, Matsuo K, Mitsudomi T, et al. Relapse-related molecular signature in lung adenocarcinomas identifies patients with dismal prognosis. J Clin Oncol 2009;27:2793–9.
33. Endele S, Rosenberger G, Geider K, Popp B, Tamer C, Stefanova I, et al. Mutations in GRIN2A and GRIN2B encoding regulatory subunits of NMDA receptors cause variable neurodevelopmental phenotypes. Nat Genet 2010;42:1021–6.
34. Wei X, Walia V, Lin JC, Teer JK, Prickett TD, Gartner J, et al. Exome sequencing identifies GRIN2A as frequently mutated in melanoma. Nat Genet 2011;43:442–6.
35. Kanehira M, Harada Y, Takata R, Shuin T, Miki T, Fujioka T, et al. Involvement of upregulation of DEPDC1 (DEP domain containing 1) in bladder carcinogenesis. Oncogene 2007;26:6448–55.
36. Harada Y, Kanehira M, Fujisawa Y, Takata R, Shuin T, Miki T, et al. Cell-permeable peptide DEPDC1-ZNF224 interferes with transcriptional repression and oncogenicity in bladder cancer cells. Cancer Res 2010;70:5829–39.
37. Kretschmer C, Sterner-Kock A, Siedentopf F, Schoenegg W, Schlag PM, Kemmner W. Identification of early molecular markers for breast cancer. Mol Cancer 2011;10:15.