Abstract
Array-based comparative genomic hybridization using microarrayed bacterial artificial chromosome clones allows high-resolution analysis of genome-wide copy number changes in tumors. To analyze the genetic alterations of primary lung adenocarcinoma in a high-throughput way, we used laser-capture microdissection of cancer cells and array comparative genomic hybridization focusing on 800 chromosomal loci containing cancer-related genes. We identified a large number of chromosomal numerical alterations, including frequent amplifications on 7p12, 11q13, 12q14-15, and 17q21, and two homozygous deletions on 9p21 and one on 8p23. Unsupervised hierarchical clustering analysis of multiple alterations revealed three subgroups of lung adenocarcinoma that were characterized by the accumulation of distinct genetic alterations and associated with smoking history and gender. The mutation status of the epidermal growth factor receptor (EGFR) gene was significantly associated with specific genetic alterations, and supervised clustering analysis based on EGFR gene mutations elucidated a subgroup including all EGFR gene mutated tumors, which showed significantly shorter disease-free survival. Our results suggest that there exist multiple molecular carcinogenesis pathways in lung adenocarcinoma that may be associated with smoking habits and gender, and that genetic cancer profiling will reveal previously uncharacterized genetic heterogeneity of cancer and be beneficial in estimating patient prognosis and discovering novel cancer-related genes, including therapeutic targets.
Introduction
Lung cancer is one of the most lethal cancers, and its incidence is increasing in Western countries as well as in Japan (1). Lung cancer is histopathologically divided into two subgroups, small cell lung carcinoma and non–small cell lung carcinoma, and lung adenocarcinoma comprises >40% of the latter (1).
Previous genetic analyses using allelotyping, comparative genomic hybridization (CGH), or the candidate gene approach revealed many genomic (genetic and epigenetic) alterations of tumor suppressor genes (such as p53, p16INK4a, FHIT, LKB1, and PTEN) and oncogenes [such as K-ras, B-RAF, MYC, epidermal growth factor receptor (EGFR), and ERBB2] as well as many chromosomal imbalances (such as on 3p, 8p, 9p, 17p, 18q, and 19p) in lung adenocarcinomas (2–10). However, overall understanding of genomic alterations in lung adenocarcinomas is far from complete and analysis of the relationship between the overall profile and combinations of genetic alterations with clinicopathologic parameters is still lacking. Recently, genome-wide gene expression analyses have uncovered a novel dimension of cancer profiling and helped define the nature of the heterogeneous subgroups of lung adenocarcinoma, each of which shows distinct tumor histology and patient prognosis (11–14). However, it is unclear whether there exist multiple genomic pathways in lung adenocarcinoma because of the lack of a genome-wide view of genetic alterations. It is clinically important to examine the correlations of certain molecular-genetic pathways with cancer cell traits relating to patient prognosis or chemotherapy sensitivity because it is possible that genetic alteration profiling may predict tumor recurrence/metastasis or sensitivity to molecular-target therapies as well as mRNA or protein expression profiles do (15–17).
The recently developed array-based CGH method using microarrayed bacterial artificial chromosome clones allows high-resolution analysis of genome-wide copy number changes in various tumors (18, 19). To define and analyze the genetic alterations of lung adenocarcinoma in more detail, we used the array CGH technique and laser-capture microdissection of cancer cells, a combination that we have successfully used in other tumor types (20; Shibata T, Hosoda F, Ohki M, Hirohashi S, unpublished data).
Materials and Methods
Patient materials. Surgical specimens from 55 lung adenocarcinoma patients who had been diagnosed and had undergone surgery between June 2001 and May 2002 at the National Cancer Center Hospital were examined. Fragments of tumor and corresponding normal lung tissue were taken immediately after surgery, fixed with 100% methanol, and embedded in paraffin. This study was approved by the institutional review boards of the National Cancer Center. The clinicopathologic data of the patients are shown in Table 1.
Characteristic | No. cases | Frequency (%) |
---|---|---|
Total no. patients | 55 | |
Mean age (range) | 62.3 (35-79) | |
Gender | | |
Male | 28 | 50.9 |
Female | 27 | 49.1 |
Smoking history | | |
Never | 27 | 49.1 |
Former | 10 | 18.2 |
Current | 18 | 32.7 |
Tumor differentiation | | |
Well | 20 | 36.4 |
Moderate | 25 | 45.5 |
Poor | 10 | 18.1 |
Stage* | | |
I (IA and IB) | 24 | 43.6 |
II (IIA and IIB) | 6 | 10.9 |
III (IIIA and IIIB) | 21 | 38.2 |
EGFR mutation | 26 | 47 |
Exon 18 G719S | 1 | 1.8 |
Exon 19 Del746-750 | 8 | 14.5 |
Exon 19 Del747-752 | 2 | 3.6 |
Exon 19 Del747-752insS | 1 | 1.8 |
Exon 21 L858R | 14 | 25.5 |
K-ras mutation | 6 | 11 |
Codon 12 | 4 | 7.3 |
Codon 13 | 1 | 1.8 |
Codon 61 | 1 | 1.8 |
*Clinical stage of four cases was not evaluated.
Laser-capture microdissection and whole-genome amplification. Laser-capture microdissection was done using an LM200 system (Arcturus, Mountain View, CA) as described (21). Only cancer cells were microdissected; lymphocytes, fibroblasts, and endothelial cells were carefully excluded. Corresponding normal lung epithelial cells were similarly microdissected and used as reference. To amplify the genomic DNA fragments, we used an adaptor-ligated whole-genome PCR as previously reported (22).
Array-based comparative genomic hybridization. This study used a custom-made CGH array called “MCG CancerArray-800 ver.2,” which consists of 800 duplicated bacterial artificial chromosome clones corresponding to various chromosomal loci that have been reported or considered to be altered in various human cancers (20, 23). Details of hybridization procedures have been previously reported (20). Sixteen-bit fluorescence intensity TIF images were obtained using a scanner (FLA8000, Fuji Film, Tokyo, Japan) and analyzed using GenePix Pro 5.0 (Axon Instruments, Inc., Foster City, CA). Thresholds for chromosomal gain (ratio >1.25) and loss (ratio <0.75) were determined by “normal versus normal experiments” (23, 24). We also validated our array CGH data by other methods. Loss of the 17p13 locus was confirmed by loss of heterozygosity of the p53 gene, which is located within that bacterial artificial chromosome, using a microsatellite marker (TP53CA). Gene amplification of a representative gene, cyclin D1, was validated by fluorescence in situ hybridization analysis (24). We also applied multiplex ligation-dependent probe amplification (MLPA) to validate our array data. Copy number alterations of multiple loci were analyzed using the MLPA-SALSA kit (MRC-Holland, Amsterdam, the Netherlands) as recommended by the manufacturer (25). Size and quantity of PCR products were calculated by Gene Mapper software (Version 3.5, Applied Biosystems, Tokyo, Japan) and copy number was determined by the ratio to the average of five normal control experiments.
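The gain and loss thresholds stated above can be illustrated with a minimal sketch. The thresholds (1.25 and 0.75) come from the text; averaging the duplicate spots of each clone before calling, the function name, and the clone names and ratios in the example are assumptions made only for illustration and are not taken from the study.

```python
from statistics import mean

GAIN_THRESHOLD = 1.25  # tumor/normal ratio above this is called a gain (from the text)
LOSS_THRESHOLD = 0.75  # tumor/normal ratio below this is called a loss (from the text)


def call_copy_number(duplicate_ratios):
    """Call one clone as 'gain', 'loss', or 'normal'.

    Assumption: the ratios of the duplicate spots are averaged first;
    the original text does not state how duplicates were combined.
    """
    ratio = mean(duplicate_ratios)
    if ratio > GAIN_THRESHOLD:
        return "gain"
    if ratio < LOSS_THRESHOLD:
        return "loss"
    return "normal"


# Hypothetical ratios for three clones, two spots each.
example = {
    "clone_A": [1.42, 1.38],
    "clone_B": [0.61, 0.66],
    "clone_C": [1.02, 0.97],
}
calls = {name: call_copy_number(ratios) for name, ratios in example.items()}
print(calls)
```

Because the comparisons are strict, a ratio exactly at a threshold is called normal, which matches the inequalities as written in the text.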
Mutational analysis. We amplified exons 18, 19, 21, and 23 of the EGFR gene; exons 2 and 3 (covering codons 12, 13, and 61) of the K-ras gene; exons 20 and 21 of the ERBB2 gene; and exons 10, 14, 16, 17, 18, 19, and 20 of the MET gene from microdissected tumor and corresponding normal DNA samples with PCR using High Fidelity Taq polymerase (Roche, Mannheim, Germany) and appropriate primers (primer sequences are available on request). All PCR products were purified and analyzed by sequencing. PCR products showing deletions were subcloned in TA-vector (Invitrogen, Carlsbad, CA) and sequenced.
Immunohistochemical analysis. Four-micrometer sections of formalin-fixed, paraffin-embedded specimens of lung adenocarcinoma were stained with an anti-MET mouse monoclonal antibody (1:100 dilution, Zymed, San Francisco, CA) as recommended by the supplier.
Statistical analysis. Two-dimensional hierarchical clustering analysis of the samples and signal ratios was done using the Impressionist (Gene Data, Basel, Switzerland) and GeneMaths (Applied Maths, Sint-Martens-Latem, Belgium) software programs as described (26, 27). Data were standardized by dividing by the root means and dendrograms were produced using the Pearson Correlation algorithm. For supervised clustering, we first selected loci that were significantly different between EGFR wild-type and mutated tumors based on the average ratio by Student's t test. We then used a machine-learning method, in which the leave-one-out cross-validation was done with all combinations of loci and multiple independent classifier algorithms, and selected 46 loci that could discriminate EGFR mutation status most accurately to classify the tumors. The Kaplan-Meier method was used to estimate the probability of disease-free survival. Cox proportional hazards regression model and multivariate analysis were done to detect the association between the presence of chromosomal alterations and disease-free survival. Log-rank analysis was used to assess the significance of the difference between subgroups.
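As a rough illustration of the similarity measure behind the dendrograms, the sketch below computes the Pearson correlation between two copy number profiles and converts it to a distance. Using 1 − r as the distance is a common convention and is an assumption here; the commercial packages named above may define it differently. The profiles are hypothetical.

```python
from math import sqrt


def pearson_correlation(x, y):
    """Pearson correlation coefficient of two equal-length profiles.

    Raises ZeroDivisionError if either profile has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pearson_distance(x, y):
    """Distance used for clustering in this sketch: 1 - r (assumed convention)."""
    return 1.0 - pearson_correlation(x, y)


# Hypothetical signal ratios at four loci for three tumors.
tumor_a = [1.4, 0.6, 1.0, 1.3]
tumor_b = [1.5, 0.5, 1.1, 1.2]  # similar pattern to tumor_a
tumor_c = [0.6, 1.4, 1.0, 0.7]  # roughly opposite pattern

print(pearson_distance(tumor_a, tumor_b), pearson_distance(tumor_a, tumor_c))
```

Tumors with similar patterns of gain and loss get a distance near 0 and would be joined early in an agglomerative clustering, whereas opposite patterns approach a distance of 2.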
Results
Array-based comparative genomic hybridization analysis of primary lung adenocarcinoma. We analyzed 55 cases of lung adenocarcinoma by array-based CGH and the chromosomal alteration profiles of 800 loci are shown in Fig. 1. We identified 32 loci that were lost in ≥40% of cases (Table 2). Among them, the 9p21 locus containing the p16INK4a gene and the 17p13.1 locus containing the p53 gene were lost in 54% and 40% of analyzed cases, respectively. We found homozygous deletions of three loci, including two on 9p21 and one on 8p23.3. We also identified 19 loci that were gained in >50% of cases (Table 3) and recurrent (≥4 cases) amplifications (>4 copies) on 12q14-15 (9 of 55, 16.4%) followed by 7p12.3 (5 of 55), 11q13 (5 of 55), 17q12 (5 of 55), 1p36.1 (4 of 55), 1q21 (4 of 55), 5p15 (4 of 55), 7q31 (4 of 55), 8q24 (4 of 55), 14q12 (4 of 55), and 17q21.2 (4 of 55). These included genes previously reported to be amplified in lung cancer, such as the cyclin D1 (11q13), EGFR (7p12.3), and ERBB2 (17q21.2) genes (28, 29). We further validated copy number alterations on 8q24.3, 17q21.2, 3p21, and 17p13.1 by the MLPA method. Chromosomal copy number changes (both gains and losses) detected by array CGH corresponded to those detected by MLPA (Fig. 1C).
Chromosomal location | Covered candidate gene | Chromosomal loss in lung adenocarcinoma (%) |
---|---|---|
9p22 | MLLT3 | 56.4 |
9p21 | p16INK4a* | 54.5 |
9p21 | TEK | 54.5 |
18q23d | CTDP1 | 54.5 |
9p23 | GASC1 | 52.7 |
9p21.3 | MTAP | 52.7 |
15q25 | NTRK3 | 50.9 |
13q14.1 | FKHR | 50.9 |
18q21 | SMAD4* | 50.9 |
8p22 | NAT2 | 50.9 |
18q21.3 | PI5 | 50.9 |
18q21 | GRP | 50.9 |
18q22 | BCL2 | 50.9 |
8p22 | LZTS1* | 49.1 |
15q12 | SNRPN | 47.3 |
8p23.3 | D8S504 | 47.3 |
18q21.3 | SCCA1 | 47.3 |
8p22 | N33† | 45.5 |
13q11-12 | FGF9 | 45.5 |
8p22-11 | NRG1 | 43.6 |
13q22.1 | KLF12 | 43.6 |
Xq28 | MAGEA2 | 41.8 |
3p24.3 | THRB* | 41.8 |
3p14.2 | FHIT* | 41.8 |
8p22-21.3 | DLC1† | 41.8 |
8p23.1 | AAC1 | 41.8 |
17p11.2 | RH68621 | 40 |
13q14.1 | LCP1 | 40 |
8p22-8p21 | TNFRSF10B | 40 |
13q33 | EFNB2 | 40 |
17p13.1 | RCV1 | 40 |
18q21.3 | FVT1 | 40 |
Loss of heterozygosity or mutations, and
Chromosomal location | Covered candidate gene | Chromosomal gain in lung adenocarcinoma (%) |
---|---|---|
17q25 | MAFG | 67.3 |
1q21 | MUC1* | 63.6 |
1q21 | MCL1* | 61.8 |
7p21 | IL6 | 58.2 |
1q21 | ARHGEF2 | 58.2 |
16p13.3 | ABCA3 | 56.4 |
17q11 | ITGB4 | 56.4 |
20q13 | Livin-2 | 56.4 |
5p15 | TERT | 54.5 |
8q24 | GLI4 | 54.5 |
16p13.3 | IGFALS | 54.5 |
17q24-25 | GRB2 | 54.5 |
1q21 | AF1Q | 54.5 |
1q23.1 | PMF1 | 54.5 |
12q24 | stSG8935 | 52.7 |
17q12 | PPARBP | 52.7 |
17q25 | Survivin* | 50.9 |
8q24 | RECQL4 | 50.9 |
11q12-13 | RELA | 50.9 |
Unsupervised hierarchical clustering of array comparative genomic hybridization data. To examine whether there exist multiple carcinogenesis pathways in lung adenocarcinoma, we attempted two-dimensional hierarchical profiling of the chromosomal alterations detected. We first plotted the number of loci against the incidence of alterations and found two peaks (loci altered in 10-15% and 20-25% of cases; Fig. 1D). We assumed that alterations appearing in <20% of cases reflect mostly random alterations as observed in genome-wide allelotyping analyses (30), whereas alterations affecting >20% of cases probably represent nonrandom (cancer-specific) alterations. Therefore, to exclude random changes that may be caused by the intrinsic genetic instability of cancer, we selected the loci that were affected in >25% of analyzed cases (397 loci in total) and performed unsupervised hierarchical clustering analysis. When we instead used loci affected in >5% or >15% of cases, we obtained almost the same classification as described below (data not shown).
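The locus selection step can be sketched as a simple frequency filter. The 25% cutoff is from the text; the function names, the representation of calls as the strings "gain", "loss", and "normal", and the example data are hypothetical.

```python
def alteration_frequency(calls):
    """Fraction of cases in which a locus is altered (gained or lost)."""
    altered = sum(1 for call in calls if call != "normal")
    return altered / len(calls)


def select_recurrent_loci(calls_by_locus, min_fraction=0.25):
    """Keep loci altered in more than `min_fraction` of cases (strict inequality)."""
    return [
        locus
        for locus, calls in calls_by_locus.items()
        if alteration_frequency(calls) > min_fraction
    ]


# Hypothetical calls for three loci across four cases.
calls_by_locus = {
    "locus_1": ["gain", "gain", "normal", "loss"],      # altered in 75% of cases
    "locus_2": ["normal", "normal", "normal", "loss"],  # altered in exactly 25%
    "locus_3": ["normal", "normal", "normal", "normal"],
}
selected = select_recurrent_loci(calls_by_locus)
print(selected)
```

Changing `min_fraction` to 0.05 or 0.15 reproduces the kind of sensitivity check described in the paragraph above.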
Our hierarchical clustering yielded three distinct subclasses of primary lung adenocarcinoma (clusters A, B, and C shown in Fig. 1E). Cluster B exhibited significantly fewer genetic alterations (losses and gains) in all examined clones than the other two clusters; the average number of alterations (losses, gains, total) in cluster A was 102, 141, and 244, respectively; in B 43, 81, and 125; and in C 85, 131, and 216 (A versus B: P < 0.0001; B versus C: P < 0.0001). Frequencies of the various lost or gained loci were significantly different among these three cluster groups (P < 0.01). Cluster A was characterized by gains on 1p32-26, 4p16.3, 11p15, 12q13-14, 16p11.2-13.3, 17q11.1-25, 19q13.2, 20p11, 20q11.2, and 22q12.2 and losses on 1p22, 6q26, 10q24-26, 13q22.1-34, 15q21-25, and 18p11.2. Cluster C was characterized by gains on 5p12-14.3, 7p12.3-21.1, 7q22, 7q31, 8q12-21, and 14q11-24, and losses on 1q23.3-41, 10q22.1, and Xq. Some loci were similarly altered in both clusters A and C, including losses on 3p21-24, 6q26, 8q24.3, 9q21, 10p15, 10q11, 10q26, 15q21.1, 15q26.1, and 19p13.3 (containing LKB1), and gains on 5p15, 6p21, 7p21-22, 7q21, 8q21, 8q22-24 (containing MYC), 9q21-22, 11q13 (containing cyclin D1), and 20q13.1. Losses on 3p14 (containing FHIT), 8p22-23.3, 9p21 (containing p16INK4a), 13q11-34, 17p13.1 (containing p53), and 18q21, and gains on 1q21-23, 1q42, 7p15, 17q12, 17q21.2, and 17q25 were observed in all subgroups with similar frequency. Two alterations (a gain on 19q13.1 and a loss on 22q12.2) were more frequently observed in cluster B than in clusters A and C.
The above classification into cluster groups showed significant correlation with the patients' smoking history (P < 0.01) and gender (P < 0.001); cluster A frequently contained female patients without any smoking history (female: 17 of 20 cases and never smoker: 14 of 20 cases), whereas cluster B included male patients with current or former smoking history (male: 11 of 15 cases and smoker: 11 of 15 cases). Cluster C included more male patients (male: 14 of 20 cases), but showed no significant association with smoking history (smoker: 11 of 20 cases). No significant differences were observed between the groups with regard to other clinical features (histologic differentiation, clinical stage, and disease-free survival). Multivariate analysis revealed that two chromosomal alterations showed significant association with disease-free survival: a loss on 13q14.1 (P = 0.01, hazard ratio, 3.21; 95% confidence interval, 1.30-7.91) and a gain on 8q24.2 (P = 0.02, hazard ratio, 2.92; 95% confidence interval, 1.16-7.37).
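As a consistency check on the reported hazard ratios, a two-sided P value can be approximately recovered from a hazard ratio and its 95% confidence interval. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which is an assumption about how the intervals were computed; the function name is hypothetical.

```python
from math import erfc, log, sqrt

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_p_from_ci(hazard_ratio, ci_low, ci_high):
    """Approximate two-sided P value from a hazard ratio and its 95% CI.

    Assumes the interval is exp(log(HR) +/- Z_95 * SE).
    """
    standard_error = (log(ci_high) - log(ci_low)) / (2 * Z_95)
    z = log(hazard_ratio) / standard_error
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))


p_13q14 = wald_p_from_ci(3.21, 1.30, 7.91)  # loss on 13q14.1
p_8q24 = wald_p_from_ci(2.92, 1.16, 7.37)   # gain on 8q24.2
print(round(p_13q14, 2), round(p_8q24, 2))
```

Under this assumption, both values round to the P values reported in the text (0.01 and 0.02), so the hazard ratios, intervals, and P values are mutually consistent.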
It has been reported that somatic mutations of the K-ras, EGFR, and ERBB2 genes are frequent in lung adenocarcinoma (2, 8–10, 31–33). We attempted to determine the correlation of these oncogenic mutations with the above classification. We sequenced exons covering the kinase domain of the EGFR gene and found somatic mutations in 26 cases (47%; Table 1). EGFR mutations were more frequently observed in never-smoker patients (P < 0.001) and in the A and C cluster groups (P = 0.01 and 0.02). We detected K-ras activating mutations in six cases (11%; Table 1) and EGFR and K-ras mutations were mutually exclusive in our cases as reported by others (31, 32). No mutation in the reported exons of the ERBB2 gene was detected.
Supervised clustering analysis revealed correlation of EGFR gene mutation with specific genetic alterations. Tumors with EGFR mutations showed significantly more genetic alterations (losses and gains) than those without EGFR mutations; the average numbers of alterations (losses, gains, total) were 68, 110, and 178 in EGFR wild-type tumors and 93, 134, and 227 in EGFR mutated tumors, respectively (wild-type versus mutated P = 0.01, 0.03, and 0.01). We found 58 loci that showed significant differences in the frequency of copy number alterations between EGFR wild-type and mutated tumors. To further examine the genetic profile of EGFR mutated tumors, we classified lung adenocarcinomas based on their EGFR mutation status with the use of supervised hierarchical clustering. We applied a machine-learning method with leave-one-out cross-validation and selected 46 loci that could discriminate EGFR mutation status. EGFR wild-type and mutated tumors were clustered in distinct groups using the ratios of the 46 selected loci (Fig. 2A). Tumors carrying K-ras mutations were segregated from the EGFR mutant branch (Fig. 2A). Interestingly, some cases without EGFR mutation were clustered with the mutant branch, and tumors carrying EGFR mutations were separated into two subbranches (EGFR-MUT-A and EGFR-MUT-B; Fig. 2A). One branch (EGFR-MUT-A) was characterized by amplification of 12q14 or 1p36.1, whereas the other (EGFR-MUT-B) contained frequent amplification of 7p12.3 (containing the EGFR gene), 1q44-23, 5p12, 14q31, and 16p13.3. Poorly differentiated tumors were significantly (P = 0.01) segregated in the EGFR-MUT-B subgroup (EGFR-MUT-A: 1 of 15 cases and EGFR-MUT-B: 5 of 21 cases).
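Leave-one-out cross-validation can be sketched as follows. The text does not name the classifier algorithms that were used, so a nearest-centroid classifier is substituted here purely for illustration; all function names and the example ratios are hypothetical.

```python
def centroid(rows):
    """Per-locus mean of a list of profiles."""
    return [sum(column) / len(column) for column in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class centroid is closest to `sample`."""
    centroids = {
        label: centroid([x for x, y in zip(train_x, train_y) if y == label])
        for label in sorted(set(train_y))
    }
    return min(centroids, key=lambda label: squared_distance(centroids[label], sample))


def leave_one_out_accuracy(profiles, labels):
    """Hold out each tumor once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(profiles)):
        train_x = profiles[:i] + profiles[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, profiles[i]) == labels[i]:
            correct += 1
    return correct / len(profiles)


# Hypothetical ratios at two loci for six tumors.
profiles = [[1.4, 0.6], [1.5, 0.7], [1.3, 0.5], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
labels = ["mutated", "mutated", "mutated", "wild-type", "wild-type", "wild-type"]
accuracy = leave_one_out_accuracy(profiles, labels)
print(accuracy)
```

In a locus selection setting, this accuracy would be computed for each candidate set of loci and the set with the best held-out accuracy retained.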
The EGFR wild-type tumors that were clustered with the EGFR-MUT branch led us to hypothesize that aberrant activation of tyrosine kinases other than EGFR has an effect equivalent to that of EGFR mutations in these tumors. We examined the copy number changes of loci containing oncogenic receptor–type tyrosine kinases (FGFR1, FGFR2, FGFR3, PDGFR, KIT, MET, ERBB2, FLT3, NTRK1, and NTRK3) detected by our arrays. We found that amplification of a locus containing the MET gene (7q31) was observed in EGFR wild-type tumors (three of four amplified cases; a representative case is shown in Fig. 3A) and that these tumors were clustered with the EGFR-MUT branch (Fig. 2A). Overexpression of MET protein was immunohistochemically detected in 24% (13 of 55) of cases, including all cases with amplification of the MET gene (Fig. 3B), although there was no significant association between MET overexpression and the clustering. To determine whether somatic mutations of this kinase also occur in lung adenocarcinoma, we sequenced the exons of the MET gene that have been reported to exhibit activating mutations in various tumors (34–36); however, no mutation was found in any of the 55 cases examined.
We further examined whether this classification is of clinical significance. Kaplan-Meier plots showed a statistically significant difference in disease-free survival between the two groups (EGFR-WT and EGFR-MUT; log-rank analysis, P = 0.01; Fig. 2B) although EGFR mutation status alone did not (log-rank analysis, P = 0.06; data not shown).
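For readers unfamiliar with the Kaplan-Meier method behind these plots, the following is a minimal sketch of the product-limit estimator. The follow-up times and event flags are hypothetical and are not data from this study.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of disease-free survival.

    times:  follow-up time for each patient
    events: True if recurrence was observed, False if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        recurrences = 0
        leaving = 0
        # Group all patients sharing this time; censored patients
        # still count as at risk at this time point.
        while i < len(data) and data[i][0] == current_time:
            if data[i][1]:
                recurrences += 1
            leaving += 1
            i += 1
        if recurrences:
            survival *= 1 - recurrences / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months for five patients.
curve = kaplan_meier([6, 12, 12, 18, 24], [True, False, True, True, False])
print(curve)
```

Comparing two such curves, one per subgroup, with a log-rank test is the kind of analysis summarized in the paragraph above.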
Discussion
This study is the first high-resolution copy number analysis of primary lung adenocarcinoma by the array CGH method. To extract common and specific genetic alterations, we collected and analyzed 55 cases of primary lung adenocarcinoma by combining array-based CGH analysis with laser-capture microdissection of tumor cells. Importantly, our results were validated by the detection of frequent alterations of previously reported cancer-related genes in lung adenocarcinoma, including losses on the p16INK4a, p53, and FHIT loci, and amplifications on the cyclin D1 and EGFR loci. Moreover, we identified novel frequent alterations in small chromosomal regions, such as losses on 13q11-14 and 15q12-25 and gains on 17q25, 1q21, and 16p13.3, which have not been detected by previous studies. Novel recurrent amplification, which may be a landmark for the existence of oncogenes, was also detected on loci including 1p36.1, 1q21, 5p15, 12q14-15, and 14q12. We also found a homozygous deletion on 8p23.3 accompanied by frequent chromosomal loss, and identification of candidate tumor suppressor genes in this locus is in progress.
It has been argued, on the basis of histopathologic observations and recent gene expression profiling, that there are distinct subclasses of lung adenocarcinoma (2, 13). Girard et al. (30) reported the possibility of classifying lung cancer by genome-wide allelotyping, although their study only examined lung cancer cell lines and could not discriminate between copy number gain and loss. In our study, we analyzed primary lung adenocarcinoma and used unsupervised hierarchical cluster analysis to identify three groups of lung adenocarcinoma based on their distinct genetic changes. Among them, two subclasses (clusters A and C, each comprising 20 of 55 cases, 36.4%) shared many genetic alterations but also had changes unique to each. This implies that they may be derived from a common precursor and diverge via the acquisition of specific genetic alterations during tumor development. In contrast, the third subclass (cluster B, 15 of 55 cases, 27.3%) showed characteristically fewer genetic alterations than the other two. Although one would expect this group to consist of tumors of an earlier stage, it contained tumors that varied clinically (from stage I to stage III) and there was no significant correlation of histopathologic features with the above classification. Interestingly, this clustering classification is significantly associated with smoking habits, suggesting that specific carcinogen exposure may affect the overall genetic profile of lung cancer.
We propose three possibilities for the carcinogenesis process in the third group; these tumors may predominantly acquire (a) genetic alterations not covered by our arrays, although they contain most of the known cancer-related genes; (b) genetic alterations that do not involve copy number changes, such as balanced chromosomal translocations or microsatellite instability (37, 38); or (c) epigenetic alterations such as aberrant methylation of gene promoters, which have been reported to associate with smoking history (39). Further analysis of this group, focusing on the above mechanisms, will provide a more complete view of lung carcinogenesis.
Because the EGFR gene is frequently altered in lung adenocarcinoma and its mutation status is correlated with sensitivity to the specific inhibitor Gefitinib (8–10), we assumed that the EGFR pathway plays important roles in lung cancer and examined whether EGFR mutated tumors have any characteristic genetic features. We detected EGFR gene mutations at a frequency similar to that reported (31, 32, 40), and the presence of somatic mutations was significantly associated with never-smoking history, as previous studies reported (8–10, 31, 32, 40). We detected K-ras mutations less frequently than previously reported (41) but at a frequency comparable to that in another study (42), probably because our cases included more female patients and nonsmokers. We found that EGFR mutation and K-ras mutation were mutually exclusive as reported (31, 32, 40), and this finding is consistent with the notion that activation of both EGFR and K-ras stimulates the same downstream pathway (43).
We identified 58 loci whose alterations significantly correlated with the presence of EGFR mutations. It is interesting to note that amplification of the EGFR gene itself was significantly more frequent in EGFR mutated tumors, indicating that somatic mutation and amplification of the EGFR gene occur simultaneously in a subset of lung adenocarcinomas. Using these selected loci, we classified the tumors by supervised hierarchical clustering. This classification revealed two groups: one containing only EGFR wild-type tumors (EGFR-WT) and the other (EGFR-MUT) containing all EGFR mutated and some EGFR wild-type tumors. Because the EGFR wild-type tumors that were grouped with the EGFR-MUT group shared similar genetic alterations with the EGFR mutated tumors, we hypothesized that they may have unknown genetic alterations complementary to EGFR activation and subsequently examined loci containing oncogenic receptor–type tyrosine kinases in our arrays. We found that a locus (7q31) containing the MET gene was amplified in a subset of these EGFR wild-type tumors and immunohistochemically validated overexpression of MET protein in these tumors. MET has been shown to be implicated in ras-mediated tumorigenicity (44, 45) and to be activated in many tumors (34–36). Although the number of cases with MET amplification is small in this study, it is tempting to speculate that amplification of the MET gene may play a role similar to EGFR mutation in lung adenocarcinoma. Recently, somatic alterations of the MET gene were detected in lung cancer and pharmacologic inhibitors specific to the MET kinase have been reported (46–48). Our results also support the idea that the MET oncoprotein is a potent new candidate therapeutic target in lung adenocarcinoma, although there was no somatic mutation in the analyzed exons of our cases. In our cases, there are seven tumors without either EGFR mutations or MET amplification in the EGFR-MUT group.
Somatic mutations in the kinase domain of ERBB2 were reported in EGFR wild-type lung adenocarcinomas (7, 33). Therefore, we searched for ERBB2 mutations in all 55 cases and found no somatic mutations, suggesting that other oncogenic kinases might be involved in these tumors.
EGFR mutation status could not predict tumor recurrence, which is consistent with a previous report of no significant relationship between EGFR mutation and patient prognosis (40). However, we found that the EGFR-MUT group, which was revealed by genetic classification, showed significantly shorter disease-free survival than the EGFR-WT group. Our results imply that specific combinations of genetic alterations (a genetic code) selected by genome-wide analysis could be used to evaluate tumor characteristics and that estimation of such codes would be applicable for diagnostic purposes. Our classification also revealed that there are two genetically distinct subgroups in EGFR mutated lung adenocarcinoma, which were associated with tumor histologic differentiation. Because Gefitinib is one of the most promising molecular target drugs against lung cancer and the molecular mechanisms determining its efficacy are still unclear (49), further analysis of a larger cohort is warranted to determine any possible relationship of genetic profiling with sensitivity to chemotherapeutic agents, including tyrosine kinase inhibitors.
Grant support: Grant-in-Aid for the Comprehensive 10-Year Strategy for Cancer Control from the Ministry of Health, Labor and Welfare, Japan.