Abstract
Epithelial-to-mesenchymal transition (EMT) is a key process associated with tumor progression and metastasis. To define molecular features associated with EMT states, we undertook an integrative approach combining mRNA, miRNA, DNA methylation, and proteomic profiles of 38 cell populations representative of the genomic heterogeneity in lung adenocarcinoma. The resulting data were integrated with functional profiles consisting of cell invasiveness, adhesion, and motility. A subset of cell lines that were readily defined as epithelial or mesenchymal based on their morphology and E-cadherin and vimentin expression elicited distinctive molecular signatures. Other cell populations displayed intermediate/hybrid states of EMT, with mixed epithelial and mesenchymal characteristics. A dominant proteomic feature of aggressive hybrid cell lines was upregulation of cytoskeletal and actin-binding proteins, a signature shared with mesenchymal cell lines. Cytoskeletal reorganization preceded loss of E-cadherin in epithelial cells in which EMT was induced by TGFβ. A set of transcripts corresponding to the mesenchymal protein signature enriched in cytoskeletal proteins was found to be predictive of survival in independent datasets of lung adenocarcinomas. Our findings point to an association between cytoskeletal and actin-binding proteins, a mesenchymal or hybrid EMT phenotype and invasive properties of lung adenocarcinomas. Cancer Res; 75(9); 1789–800. ©2015 AACR.
Introduction
Epithelial-to-mesenchymal transition (EMT) is a process in embryonic development that allows polarized epithelial cells to convert to loosely organized mesenchymal cells (1). The transition from an epithelial to a mesenchymal phenotype fosters cell movement during gastrulation and later during morphogenetic events such as neural crest ontogeny. Mesenchymal cells reaching the target site engage in a new differentiation program allowing development of diverse tissue types (2–4). The same features of EMT, namely loss of cell adhesion, increased migration, and invasion that aid metazoan development provide a likely mechanism for tumor progression with loss of an epithelial phenotype in more aggressive tumors (5–8).
Regulation of EMT is complex and multilayered, with diverse growth factors, miRNAs, genetic mutations, and epigenetic alterations all having been shown to play a role. TGFβ, hepatocyte growth factor, Notch, or Wnt can serve as initiating factors of EMT (9–13). Several miRNAs regulate EMT through inhibition of either effector genes or the signaling axis. The miR-200 family and miR-34 inhibit EMT, whereas miR-21 has an opposite effect (14–16). Several studies demonstrated a role for DNA methylation in regulating miR-200 and an altered DNA methylation profile associated with EMT (17–20). Other EMT regulatory mechanisms include zinc-finger transcription factors Snail1, Snail2, Zeb1, Zeb2, and the basic helix–loop–helix family members Twist1 and Twist2 which control expression of downstream genes and cellular features associated with EMT, such as cell adhesion and polarity (21–23). The transient nature of EMT has been considered as aiding distant site metastasis through a reverse mesenchymal-to-epithelial transition after invading cells have colonized distant sites (8, 24). Recent work by Lu and colleagues proposed occurrence of a hybrid epithelial-mesenchymal state with a determination between the three phenotypes being regulated by a circuit composed of two interconnected chimeric modules—the miR-34/SNAIL and the miR-200/ZEB mutual-inhibition feedback circuits (25).
We have undertaken a study of lung adenocarcinoma to determine molecular and phenotypic features associated with EMT states and their relevance to survival in early-stage disease. Extensive characterization of lung adenocarcinoma cell lines, including protein, mRNA, miRNA, DNA methylation, cell invasiveness, adhesion and motility analysis, revealed the occurrence of intermediate/hybrid phenotypes between epithelial and mesenchymal states. Gene and protein signatures associated with functional characteristics helped to define these hybrid states. Upregulation of cytoskeletal-related proteins was a common feature between mesenchymal and aggressive hybrid types. A set of transcripts enriched for cytoskeletal and actin binding proteins was found to be predictive of survival in independent lung adenocarcinoma datasets.
Materials and Methods
Cell culture
A panel of 38 lung adenocarcinoma cell lines selected to encompass known drivers in lung adenocarcinoma was grown in RPMI-1640 with 10% FBS and 1% penicillin/streptomycin unless otherwise noted. The identity of each cell line was confirmed by DNA fingerprinting via short tandem repeats at the time of mRNA and total protein lysate preparation using the PowerPlex 1.2 kit (Promega). Fingerprinting results were compared with reference fingerprints maintained by the primary source of the cell line. For SILAC labeling of cell lines, cells were grown for seven passages in RPMI-1640 supplemented with 13C-lysine and 10% dialyzed FBS according to the standard SILAC protocol (26).
Mass spectrometric analysis
Proteomic analysis was performed as previously described (27). Detailed methods for mass spectrometric analysis can be found in the Supplementary Information section.
Protein datasets are available as Supplementary Table S1.
Determination of EMT status for cell lines
A protein ratio of surface-localized CDH1 (CDH1_S) and VIM from total cell extracts was calculated as (CDH1_n + 1)/(VIM_n + 1), where n denotes the number of spectral counts. A high CDH1_S/VIM ratio was defined as a log2-transformed ratio > 0 and a low ratio as a log2-transformed ratio < 0. Cell morphology was assessed by plating cells at 25% to 50% confluence and acquiring phase contrast images on days 1, 2, 3, and 4 after plating. Cells were assessed for individual cell shape (spindle for mesenchymal or cuboid for epithelial) as well as cell–cell interaction. Cells were classified as having epithelial morphology if the individual cells were cuboid and cells grouped to form discrete clusters with smooth edges indicative of tight junctions. Epithelial-like morphology lacked complete cuboid morphology or failed to form discrete clusters. Mesenchymal morphology required primarily spindle shape and no cell–cell adhesion. Cells with mesenchymal-like morphology were primarily spindle shaped but demonstrated cell–cell adhesion by forming clusters. Cells with log2-transformed CDH1_S/VIM ratios > 0 and an epithelial morphology were classified as epithelial, whereas mesenchymal cells had log2-transformed CDH1_S/VIM ratios < 0 and a mesenchymal morphology.
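The classification rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the morphology labels passed as strings are hypothetical, and labeling every remaining cell line "hybrid" (including a ratio of exactly 0, which the text does not address) is a simplification of the further investigation described in the Results.

```python
import math


def log2_cdh1_vim_ratio(cdh1_counts: int, vim_counts: int) -> float:
    """log2 of (CDH1_n + 1) / (VIM_n + 1), where n is a spectral count.

    The +1 pseudocount keeps the ratio defined when either count is zero.
    """
    return math.log2((cdh1_counts + 1) / (vim_counts + 1))


def classify_emt(cdh1_counts: int, vim_counts: int, morphology: str) -> str:
    """Combine the protein ratio with morphology, as in the Methods text.

    Assumption: anything that is not concordantly epithelial or mesenchymal
    is returned as "hybrid".
    """
    ratio = log2_cdh1_vim_ratio(cdh1_counts, vim_counts)
    if ratio > 0 and morphology == "epithelial":
        return "epithelial"
    if ratio < 0 and morphology == "mesenchymal":
        return "mesenchymal"
    return "hybrid"


# Hypothetical spectral counts, for illustration only.
print(classify_emt(31, 0, "epithelial"))        # concordant epithelial
print(classify_emt(0, 15, "mesenchymal"))       # concordant mesenchymal
print(classify_emt(31, 0, "mesenchymal-like"))  # discordant, so hybrid
```

Requiring both criteria to agree is what leaves discordant cell lines in a separate group instead of forcing them into one of the two classes.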
Profiling of mRNA, miRNA, and DNA methylation
Gene expression data were obtained using Illumina Human WG-6 v3.0 Expression BeadChips (Illumina) and expression values were log2 normalized. miRNA profiling was performed using a real-time PCR-based approach using miRCURY LNA Universal RT miRNA PCR (panel I+II; Exiqon, Inc.). miRNA profiling was not available for cell lines H1299 and H1703. Illumina Infinium HumanMethylation27 BeadChips were used for DNA methylation analysis. DNA methylation profiling was not available for cell lines H1385 and H1703. mRNA, DNA methylation, and miRNA datasets were deposited in the National Center for Biotechnology Information's Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo).
Data analysis
Detailed methods for data analysis can be found in the Supplementary Information section.
Invasion, migration, and aggregation assays
Detailed methods for invasion, migration, and aggregation assays can be found in the Supplementary Information section.
Western blot analysis
Western blot analysis was performed according to standard procedures using polyvinylidene difluoride membranes and an enhanced chemiluminescence system (GE Healthcare). The following antibodies were used for Western blot analysis: ISYNA1 (Sigma Aldrich), FBXO2 (Novus), TCEAL4 (Novus), FKBP65 (BD Biosciences), vimentin (BD Biosciences), CDH1 (BD Biosciences), and AKAP12 (Abcam). α-tubulin (Sigma) was used as a loading control.
Immunofluorescence analysis
Detailed methods for immunofluorescence analysis and IHC analysis can be found in the Supplementary Information section.
Results
Characterization of cell lines based on their morphology and CDH1/VIM ratios
To define molecular features that distinguish epithelial from mesenchymal cells, a panel of 38 lung adenocarcinoma cell lines representative of the genomic diversity of this disease was subjected to proteomic, gene expression, miRNA, and DNA methylation profiling (Supplementary Fig. S1A). Changes in CDH1 and vimentin (VIM) have been considered hallmarks of EMT. Expression of CDH1 on the cell surface and VIM in whole-cell lysates was determined based on normalized spectral counts from mass spectrometry data (27). We assessed ratios of cell surface-localized CDH1 (CDH1_S) and VIM from whole-cell lysates along with cell morphology, and identified a subset of cell lines with a distinct mesenchymal or epithelial phenotype (28). Nine cell lines with a log2-transformed CDH1_S/VIM ratio > 0 and an epithelial morphology were classified as epithelial, while nine cell lines with a log2-transformed CDH1_S/VIM ratio < 0 and a mesenchymal morphology were classified as mesenchymal (Fig. 1A and Supplementary Fig. S1B). The log2-transformed CDH1_S/VIM protein ratio was significantly correlated with CDH1/VIM ratios of mRNA expression (r = 0.8650, P < 0.0001; Spearman correlation). Common somatic gene mutations that occur in lung adenocarcinoma (KRAS, TP53, EGFR) were not associated with a distinct EMT phenotype, with the exception of a negative correlation between EGFR mutation and a mesenchymal type as previously reported (29). The remaining cell lines could not be readily classified as epithelial or mesenchymal due to discordance between CDH1_S/VIM ratios and morphology and were investigated further for their hybrid properties. Immunofluorescence analysis of CDH1 and VIM revealed that both CDH1 and VIM were stained in the same cells in hybrid cell lines (Fig. 1B). We further investigated CDH1 and VIM protein expression in lung adenocarcinoma tissues. Among 141 lung adenocarcinoma tissues in the tissue microarray, 29 (20.6%) tumors were both CDH1 and VIM positive (Fig. 1C and D), indicative of a hybrid transcriptional program.
Classification of NSCLC cell lines. A, log2-transformed ratios of spectral counts of CDH1 on cell surface and VIM from whole-cell lysates, log2-transformed ratios of CDH1/VIM mRNA, and morphology, with representative images. Scale, 40 μm. B, immunofluorescence analysis of E-cadherin and vimentin in hybrid cell lines. C, CDH1 and VIM expression in lung adenocarcinoma tissue microarray. D, representative images of IHC analysis of E-cadherin and vimentin.
Identification of distinctive gene and protein signatures for mesenchymal and for epithelial cell lines
We performed a factor analysis to determine modality of gene expression patterns among the cell lines, which revealed a continuous rather than a modal distribution (Fig. 2A). Genes comprised in the discriminating factors included known markers of epithelial or mesenchymal cells, including CDH1, VIM, and EpCAM (data not shown). Comparison of mRNA expression between the nine epithelial and nine mesenchymal cell lines yielded 1,347 genes with a P value < 0.01 (t test), consisting of 659 with higher expression in mesenchymal and 688 in epithelial cell lines (Fig. 2B and Supplementary Table S2A). Gene ontology enrichment using DAVID (28) was applied to differentially expressed genes. The epithelial genes were enriched for genes encoding proteins localized to the cell surface, many of which play a role in cell adhesion, whereas mesenchymal genes were enriched for nuclear localized proteins and regulators of transcription (Supplementary Table S2B). Concordant findings were observed when gene ontology for the 1,347 epithelial and mesenchymal genes was analyzed using Gene Set Enrichment Analysis (GSEA; ref. 30; Supplementary Table S2C).
Molecular characterization of epithelial and mesenchymal cell lines. A, factor analysis of gene expression showing distribution of cell lines. B, volcano plot of the differential gene expression analysis. Gray boxes, significance cutoffs. C, factor analysis of protein expression showing distribution of cell lines. D, volcano plot of the differential protein expression analysis. Gray boxes, significance cutoffs. E, unsupervised hierarchical clustering of miRNA expression from 36 lung adenocarcinoma cell lines separates the epithelial and mesenchymal cell lines into distinct clusters. F, unsupervised hierarchical clustering of miRNA expression without miR-200 family members. G, volcano plot of the differential DNA methylation analysis. Gray boxes, significance cutoffs. H, starburst plot integrating differential DNA methylation and gene expression analyses. I, overlapping regulation of gene and protein expression. For sub figures A, C, E, and F, mesenchymal cell lines and hybrid cell lines are indicated.
Extensive proteomic analysis of cell lysates by LC-MS/MS for all 38 cell lines identified a total of 12,808 distinct proteins with an average of 3,690 proteins identified in each cell line, pointing to substantial heterogeneity among cell lines (Supplementary Table S1 and Supplementary Fig. S2A). Factor analysis performed with proteomic data similarly produced a continuum rather than a modal distribution (Fig. 2C). Proteins comprised in the discriminating factors included VIM but not CDH1 or EpCAM (data not shown). Proteomic analysis resulted in 232 proteins expressed more highly in epithelial cell lines and 166 proteins expressed more highly in mesenchymal cell lines with a P value < 0.05 (Mann–Whitney U test) and fold change > 1.5 (Fig. 2D and Supplementary Table S2D). The epithelial and mesenchymal mRNA signatures significantly overlapped with the corresponding protein signatures (P = 3.29 × 10−20 and P = 6.61 × 10−10, respectively, Fisher exact test; Supplementary Fig. S2B). Significantly enriched GO categories in the proteomic mesenchymal signature consisted predominantly of cytoskeleton and actin organization (GO: 0005856~cytoskeleton, GO: 0008092~cytoskeletal protein binding, GO: 0003779~actin binding; Supplementary Table S2E). The enrichment in cytoskeletal and actin-related proteins was also prominent by Ingenuity Pathway Analysis (http://www.ingenuity.com/; Supplementary Fig. S2C and S2D). The GO terms enriched in the protein signature for epithelial cells included those associated with translation and metabolism (Supplementary Table S2E). To confirm the functional relevance of cytoskeletal proteins in the mesenchymal signature, we performed knockdown experiments of AKAP12, which is associated with actin-cytoskeleton reorganization (31) and was identified in both the mRNA and protein mesenchymal signatures (Supplementary Tables S2A and S2C). Although no obvious change was observed in EMT status, cell invasion was inhibited in H1299 cells by treatment with AKAP12 shRNA (Supplementary Table S2E), indicating the functional relevance of cytoskeletal proteins in the mesenchymal signature.
To explore potential regulatory factors for gene and protein expression, miRNA and DNA methylation profiling was performed. Unsupervised hierarchical clustering of microRNA data resulted in two clusters that separated the epithelial and mesenchymal cell lines and significantly differed in CDH1_S/VIM expression (P = 5.6 × 10−6, t test; Fig. 2E). Separation of the cell lines into two clusters was principally due to expression of the miR-200 family, as removal of miR-200 family members from the dataset resulted in lack of clustering into two clusters (Fig. 2F). Restricting comparison of the miRNA data to the epithelial and mesenchymal cell lines resulted in 31 differentially expressed miRNAs at P < 0.01 (t test), with 10 expressed at lower levels in mesenchymal cell lines and 21 expressed at lower levels in epithelial cell lines (Supplementary Table S2F). Of the 10 miRNAs with lower expression in the mesenchymal cell lines, six had predicted binding sites in the mesenchymal gene signature (miR-200a, 200b, 200c, 429, 135b, and 148a) using the prediction algorithms miRanda and Targetscan (data not shown). Of the 21 miRNAs with lower expression in epithelial cell lines, four had predicted binding sites in the epithelial gene signature (miR-30a, 330-3p, 425, 455-3p). The miRNAs identified by both differential expression and algorithmic analysis of signatures are predicted to regulate 178 and 210 genes in the epithelial and mesenchymal signatures, respectively (Supplementary Table S2F).
Comparison of the DNA methylation status of the mesenchymal and epithelial cell lines using Infinium HumanMethylation27 BeadChips identified 75 hypo-methylated genes in the epithelial cell lines and 48 hypo-methylated genes in the mesenchymal cell lines (Fig. 2G and Supplementary Table S2G). Thirty one of the 75 hypo-methylated genes in the epithelial signature were also represented in the epithelial mRNA signature (P = 2.41 × 10−26, Fisher exact test). Conversely, eight of the 48 hypo-methylated genes in the mesenchymal cell lines were represented in the mesenchymal mRNA signature (P = 1.281 × 10−4, Fisher exact test; Fig. 2H). Assessment of the relative contribution of DNA methylation, miRNA, and gene expression to protein levels suggested greater concordance between gene and protein expression for epithelial than mesenchymal gene sets (Fig. 2I). Comparison of transcript and protein variance between mesenchymal and epithelial cell lines further revealed increased variance of protein expression in mesenchymal versus epithelial cell lines, compared with gene expression (Supplementary Table S2G).
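The signature overlaps reported above are assessed with a Fisher exact test. As a minimal sketch of how such an overlap P value can be computed, the one-sided version reduces to a hypergeometric tail probability. The function name is hypothetical, and the size of the background gene universe is an assumption the caller must supply, since it is not stated in this section.

```python
from math import comb


def overlap_p_value(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """One-sided Fisher exact test for enrichment of an overlap.

    Returns P(X >= overlap), where X is the number of shared items when a
    set of size set_b is drawn at random from a universe that contains
    set_a marked items (hypergeometric distribution).
    """
    largest_possible = min(set_a, set_b)
    total = comb(universe, set_b)
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, largest_possible + 1)
    )
    return tail / total


# Toy numbers for illustration only (not taken from the study):
# 5 of 5 genes shared between two 5-gene sets in a 10-gene universe.
print(overlap_p_value(universe=10, set_a=5, set_b=5, overlap=5))
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow in the intermediate sums, which matters when the resulting P values are very small.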
Heterogeneity among cell lines with a hybrid phenotype
We next explored phenotypic differences and similarities between epithelial, hybrid, and mesenchymal cell lines. We observed a significant difference in invasiveness between the epithelial and mesenchymal cell lines based on a Matrigel invasion assay (P = 0.0001, Mann–Whitney U test; Fig. 3A). Hybrid cell lines were heterogeneous in their invasive properties. Expression of the mesenchymal proteins (r = 0.736, Spearman correlation) and genes (r = 0.715, Spearman correlation) had significantly higher correlation with invasiveness than CDH1_S/VIM expression alone (r = −0.483, Spearman correlation; P = 0.041, Fisher r-to-z transformation). We observed a significant difference in cell migration between the epithelial and mesenchymal cell lines based on scratch wound assays (P = 0.0012, Mann–Whitney U test; Fig. 3B). Expression of mesenchymal signature proteins and genes but not CDH1_S/VIM expression significantly correlated with migration (r = 0.546, Spearman correlation; P = 7.0 × 10−4, Mann–Whitney U test and r = 0.599, Spearman correlation; P = 2.0 × 10−4, Mann–Whitney U test, respectively). Cell–cell adhesion is a hallmark of epithelial cells and loss of cell–cell adhesion is considered to be a critical step in metastasis (4). Analysis of cell aggregation in a liquid culture allows assessment of the strength of cell–cell cadherin/catenin complex binding, as cells do not have a solid surface on which to bind. We observed significantly more cell aggregation in epithelial cell lines compared with mesenchymal cell lines (P = 0.0251, Mann–Whitney U test), with the hybrid cell lines distributed across the spectrum (Fig. 3C). Cell line growth rates were also assessed. We did not observe a significant difference in cell growth rates between epithelial, hybrid, and mesenchymal cell lines as the variance in growth rates within each class was greater than the difference between them (data not shown).
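The correlations above are Spearman rank correlations, for example between a signature expression score and invasiveness. A minimal pure-Python sketch follows; the helper names are hypothetical, and tied values receive their average rank, which is the usual convention but is not specified in the text.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical signature scores and invasion counts, for illustration only.
scores = [0.2, 0.9, 0.4, 1.5, 1.1]
invaded_cells = [30, 160, 45, 400, 150]
print(spearman(scores, invaded_cells))
```

Because only ranks enter the calculation, the result is insensitive to the scale of the invasion counts, which suits assay readouts that span orders of magnitude.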
Phenotypic characterization of cell lines. A, cell invasion through Matrigel. In at least four fields from six replicate wells, the number of cells was counted for each cell line. B, cell migration measured by a scratch wound assay, with six scratches measured per cell line. C, cell aggregation after 24 hours in liquid culture over agarose as a measure of cell–cell adhesion. Six replicates with assay performed in three individual wells on different days. D, hierarchical clustering of differentially expressed genes in the epithelial and mesenchymal cell lines. E, hierarchical clustering of differentially expressed proteins in the epithelial and mesenchymal cell lines. For all sub figures, mesenchymal cell lines, epithelial, and hybrid cell lines are indicated. F, confirmation of protein expression of novel EMT-related proteins. Western blotting of several EMT-related proteins showed increased expression of novel markers in mesenchymal cell lines compared with epithelial cell lines.
Hierarchical clustering was performed based on the gene and protein signatures to determine their ability to discriminate between epithelial, mesenchymal, and hybrid cell lines. Clustering based on the mRNA signature separated the mesenchymal cell lines into one group and eight of the nine epithelial cell lines into the other (Fig. 3D). The two clusters were significantly different for CDH1_S/VIM expression (P = 1.49 × 10−5, Mann–Whitney U test), invasion (P = 0.002, Mann–Whitney U test), and migration (P = 0.009, Mann–Whitney U test). Clustering of the differentially expressed proteins separated the epithelial and mesenchymal cell lines into distinct groups that were significantly different for CDH1_S/VIM expression (P = 0.003, Mann–Whitney U test), invasion (P = 3.0 × 10−4, Mann–Whitney U test), and migration (P = 0.004, Mann–Whitney U test) and aggregation (P = 0.006, Mann–Whitney U test; Fig. 3E). The large number of upregulated proteins in mesenchymal cell lines includes many that have not been previously associated with EMT and thus would represent novel biomarkers. We selected several novel proteins (TCEAL4, FBXO2, FKBP65, and ISYNA1) and confirmed their increased expression in mesenchymal cell lines by Western blotting (Fig. 3F).
Hybrid EMT states are distinguishable by molecular features linked to their phenotypes
We observed phenotypic heterogeneity among the hybrid cell lines that was unrelated to their CDH1 and VIM expression. Cell lines in the hybrid group exhibited phenotypic traits of high invasion and migration that are characteristic of a mesenchymal phenotype, together with high aggregation, a feature of epithelial cells (Fig. 4A). A subset of hybrid cell lines (DFCI032, H1650, H1693, HCC827, and PC-9) exhibited high CDH1_S/VIM ratios but were also invasive and migratory (aggressive hybrid; Fig. 4B). Interestingly, although the mesenchymal cell lines migrated primarily as single cells, four of the five aggressive hybrid cell lines migrated by collective group migration (Fig. 4C).
Identification of aggressive hybrid cell lines. A, plot of cell invasion against migration with cell–cell adhesion indicated by color. B, cell lines are heterogeneous for phenotypic characteristics, with hybrid epithelial-to-mesenchymal cell lines highlighted. Legend categories: log2-transformed CDH1_S/VIM protein ratio, > 0 or < 0; invasion, > 150 or < 150 cells per field; aggregation, diffuse, aggregates, or compact; migration, < 33%, > 33% and < 63%, or > 63% area covered; cell morphology, mesenchymal, mesenchymal-like, mixed mesenchymal/epithelial, epithelial-like, or epithelial; EMT status, epithelial, hybrid, or mesenchymal. C, migration of aggressive hybrid cell lines. Aggressive hybrid cell lines imaged at 0 and 12 hours during a scratch wound assay reveal collective cell migration in four of five cell lines.
Recent work by Lu and colleagues proposed the ZEB/SNAIL/miR-200/miR-34 axis as a regulator of hybrid phenotypes (25). We assessed transcript and protein expression of a large number of genes previously identified as regulators or markers of epithelial or mesenchymal cell types (Fig. 5A for significantly expressed markers and Supplementary Table S3 for all markers). We identified upregulation of ZEB1 (P = 0.042, Mann–Whitney U test) and SNAI2 (P = 0.041, Mann–Whitney U test) mRNA expression levels in the aggressive hybrid cell lines compared with epithelial cell lines. We observed increased expression of miR-34a in the aggressive hybrid cell lines compared with the epithelial cell lines, whereas expression of miR-200 family members was not significantly different (Fig. 5B), concordant with the finding that four of the aggressive hybrid cell lines clustered together with epithelial cell lines (Fig. 2E). A comparative analysis between the epithelial and aggressive hybrid cell lines yielded 197 genes with upregulated mRNA levels in the aggressive hybrid cell lines out of 20,598 total genes (Fig. 5C). A set of 135 proteins was upregulated in the aggressive hybrid cell lines compared with epithelial cell lines (Supplementary Table S4A). Interestingly, the 135-protein signature was enriched for GO terms associated with cytoskeleton, actin binding, and organization, and significantly overlapped with the mesenchymal protein signature (P = 2.062 × 10−23, Fisher exact test; Fig. 5D and E and Supplementary Table S4B). Expression of cytoskeletal and actin-binding proteins in the aggressive hybrid cell lines was significantly higher than in the epithelial (P = 0.005, Mann–Whitney U test) or other hybrid cell lines (P = 0.009, Mann–Whitney U test; Fig. 5F) and was the primary discriminator of aggressive hybrid cell lines. Protein expression levels of TCEAL4 and ISYNA1, which are part of the mesenchymal protein signature, were significantly elevated in the aggressive hybrid cell lines compared with the epithelial type as determined by mass spectrometry, further supporting overlapping molecular characteristics of the aggressive hybrid and mesenchymal cell lines.
Molecular signatures for aggressive hybrid cell lines. A, expression of epithelial and mesenchymal markers at mRNA and protein levels in epithelial, aggressive hybrid, and mesenchymal cell lines. *, P value by t test under 0.05; X, P value ≥ 0.05. B, expression of epithelial- and mesenchymal-related miRNAs in epithelial, aggressive hybrid, and mesenchymal cell lines. *, P value by t test under 0.05; X, P value ≥ 0.05. C, volcano plot of the differential genes (left) and proteins (right) in a comparison of epithelial and aggressive hybrid cell lines. Gray boxes, significance cutoffs. D, overlap of mesenchymal and aggressive hybrid protein signatures. E, overlap of mesenchymal and aggressive hybrid protein significant gene ontology categories. F, cytoskeletal proteins are upregulated in the mesenchymal and aggressive hybrid cell lines compared against epithelial cell lines. Epi, epithelial; AH, aggressive hybrid; Mes, mesenchymal.
TGFβ-induced upregulation of cytoskeletal proteins precedes loss of E-cadherin
We next tested induction of EMT with TGFβ treatment to confirm that proteomic differences observed between the mesenchymal and epithelial cell lines were due to an EMT event. Following exposure to TGFβ, epithelial H1437 cells elongated and lost cell adhesion (Fig. 6A). We further observed increased cell invasion and migration, but no changes in cell aggregation (Fig. 6B and C). We next performed a proteomic analysis of H1437 cells after 8 days of treatment with TGFβ in comparison with untreated cells, selecting differentially expressed proteins with > 1.5 fold change and G score greater than 3.85 (equivalent to a P value < 0.05; Fig. 6D and Supplementary Table S5A). Changes in protein expression after TGFβ treatment correlated significantly with the differentially expressed epithelial and mesenchymal proteins and the aggressive hybrid protein signature (P = 1.14 × 10−9 and P = 7.53 × 10−10, respectively, Spearman correlation; Fig. 6E). Gene Ontology analysis identified cytoskeletal and actin-binding proteins, including AKAP12, as the most highly enriched group among the upregulated proteins (Supplementary Table S5B). Downregulated protein set with TGFβ treatment overlapped with the epithelial protein signature (P = 9.88 × 10−20 Fisher exact test). Downregulated proteins were enriched for oxidation reduction and multiple categories related to metabolism and glycolysis. The morphologic and cytoskeletal protein expression alterations preceded changes in EMT markers, as we did not observe changes in CDH1 and VIM proteins after 8 days of TGFβ treatment (Fig. 6F) as confirmed by Western blotting. Loss of CDH1 expression and increase in VIM expression were observed at 14 days. 
The similarity between protein changes induced by TGFβ treatment of epithelial cells and the aggressive hybrid signature (invasive, migratory, high expression of cytoskeletal proteins) supports reorganization of cytoskeletal proteins preceding loss of CDH1 as an intermediate stage in EMT, with aggressive hybrid cell lines expressing epithelial markers (Fig. 6G).
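The selection rule used for the TGFβ comparison (fold change > 1.5 and G score > 3.85) can be illustrated with a short sketch. It assumes, purely for illustration, that the G score is a likelihood-ratio statistic computed from one protein's counts in two samples, normalized to each sample's total counts; the function names and the choice of input counts are hypothetical and are not taken from the study. With 1 degree of freedom, a G score of 3.85 corresponds to a chi-square tail probability just under 0.05.

```python
import math


def g_statistic(count_a, count_b, total_a, total_b):
    """Likelihood-ratio G statistic for one protein's counts in two samples.

    Expected counts assume the protein takes the same share of each
    sample's total counts (an assumption made for this sketch).
    """
    pooled = count_a + count_b
    grand = total_a + total_b
    expected = (pooled * total_a / grand, pooled * total_b / grand)
    g = 0.0
    for observed, exp in zip((count_a, count_b), expected):
        if observed > 0:  # the term 0 * ln(0) is taken as 0
            g += observed * math.log(observed / exp)
    return 2.0 * g


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def is_differential(count_a, count_b, total_a, total_b,
                    min_fold=1.5, min_g=3.85):
    """Apply both cutoffs: fold change > min_fold and G score > min_g."""
    rate_a = count_a / total_a
    rate_b = count_b / total_b
    low, high = sorted((rate_a, rate_b))
    if high == 0:
        return False  # protein not observed in either sample
    fold = math.inf if low == 0 else high / low
    return fold > min_fold and g_statistic(count_a, count_b, total_a, total_b) > min_g


if __name__ == "__main__":
    print(chi2_sf_1df(3.85))                  # tail probability at the cutoff
    print(is_differential(30, 12, 1000, 1000))
```

Both cutoffs are applied together, so a protein with a large fold change but few counts would not pass the G-score filter in this sketch.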
Proteomic analysis of TGFβ induced EMT in an epithelial cell line. A, H1437 cells change from epithelial to mesenchymal morphology with TGFβ treatment. Scale, 40 μm. B, invasion and migration of H1437 increase with TGFβ treatment. C, aggregation of H1437 cells is unaffected by TGFβ treatment. Scale, 200 μm. D, volcano plot of the differential protein expression analysis. Gray boxes, significance cutoffs. E, proteins upregulated in H1437_TGFβ cells overlap significantly with the aggressive hybrid protein signatures. F, expression of epithelial or mesenchymal markers in TGFβ treated H1437 by LC-MS/MS or Western blot analysis. G, alteration of molecular and functional characteristics during EMT via aggressive hybrid type in H1437 with TGFβ treatment.
Relevance of a mesenchymal signature enriched in genes encoding for cytoskeletal proteins to survival in early stage lung adenocarcinomas
We next determined whether the mesenchymal gene signature we identified had predictive value in early-stage lung adenocarcinoma by interrogating three independent gene expression datasets of lung adenocarcinoma annotated for outcome: the Director's Challenge, Bhattacharjee and colleagues, and Tomida and colleagues datasets (32–34). Stage 1 and 2 tumors were ranked by their relative expression of genes in the signature, and hazard ratios were calculated by Cox regression. The mesenchymal mRNA signature significantly predicted survival in the Director's Challenge dataset (P = 7.62 × 10−3, Cox regression; Fig. 7A). Given the lack of databases for lung tumor protein expression annotated for survival, we next tested whether a set of mRNAs specifically encoding the mesenchymal proteomic signature had prognostic value. We first correlated protein abundance with mRNA expression by Spearman correlation and demonstrated significant correlation between transcript and protein expression, with a mean correlation coefficient of 0.216 (P = 1.0 × 10−16 based on permutation tests; Fig. 7B). The correlation between transcript and protein expression increased with measures of protein abundance (Fig. 7C). The set of mRNAs encoding the mesenchymal protein signature significantly predicted reduced survival in all three tumor datasets tested, whereas the epithelial signature was significantly associated with increased survival in one dataset (Fig. 7A). The Director's Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma tested 14 different methods for predicting survival in early-stage non–small cell lung cancer, with the best predictor found to be Model A, consisting of 13,830 genes produced from clustering (32). As a comparison, we tested Model A in the two other independent datasets and found that the statistical significance of the mesenchymal protein signature was similar to that of Model A (32).
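The transcript-to-protein comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: it assumes one Spearman coefficient per gene across the same ordered set of cell lines, averages the coefficients, and builds a null distribution by shuffling the cell-line order of the protein data. The function names, input layout (dictionaries keyed by gene), and permutation scheme are hypothetical.

```python
import math
import random


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_spearman(mrna, protein):
    """Mean per-gene Spearman coefficient over genes present in both inputs."""
    genes = [g for g in mrna if g in protein]
    return sum(spearman(mrna[g], protein[g]) for g in genes) / len(genes)


def permutation_p(mrna, protein, n_perm=1000, seed=0):
    """Empirical one-sided P value from shuffling cell-line order (assumed scheme)."""
    rng = random.Random(seed)
    observed = mean_spearman(mrna, protein)
    n_lines = len(next(iter(protein.values())))
    hits = 0
    for _ in range(n_perm):
        perm = list(range(n_lines))
        rng.shuffle(perm)
        shuffled = {g: [v[i] for i in perm] for g, v in protein.items()}
        if mean_spearman(mrna, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

An empirical P value obtained by counting in this way cannot fall below 1/(n_perm + 1), so a much smaller value would require a very large number of permutations or a fitted null distribution.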
We next performed a Kaplan–Meier analysis and found that the mesenchymal signature significantly predicted survival in all three datasets tested (the Director's Challenge, Bhattacharjee and colleagues and Tomida and colleagues datasets; Fig. 7D). Given that upregulation of cytoskeletal proteins was associated with both mesenchymal and aggressive hybrid phenotypes, we tested whether mRNAs representing the cytoskeletal proteins from the mesenchymal protein signature were predictive of survival. We found that this restricted signature also significantly predicted survival in the Director's Challenge and Bhattacharjee and colleagues tumor sets (Fig. 7E).
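The survival comparison between tumors with high and low signature expression can be sketched with a two-group log-rank test. The sketch below makes assumptions that go beyond the text: the signature score of a tumor is taken as the mean expression of the signature genes, tumors are split into the top and bottom thirds by that score, and the P value comes from the chi-square approximation with 1 degree of freedom. All function names are hypothetical.

```python
import math


def signature_score(expression, signature):
    """Mean expression of the signature genes in one tumor (assumed scoring)."""
    values = [expression[g] for g in signature if g in expression]
    return sum(values) / len(values)


def tertile_groups(scores):
    """Return indices of the lowest third and the highest third of tumors."""
    k = len(scores) // 3
    if k == 0:
        raise ValueError("need at least three tumors")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:k], order[-k:]


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 for death, 0 for censoring.

    Returns (chi-square statistic, P value with 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group a
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)  # at risk, group b
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

In use, the index lists returned by `tertile_groups` would select the survival times and event indicators passed to `logrank`; the middle third of tumors is left out of the comparison.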
Mesenchymal and cytoskeletal protein signatures predict survival in NSCLC. A, Cox regression of mRNA and protein signatures in three different gene expression datasets. B, correlation of mRNA and protein expression in 38 cell lines. C, correlation of mRNA and protein expression increases with protein abundance. D, Kaplan–Meier curves of mesenchymal protein signature in gene expression datasets. P values are derived from the log-rank test. Black line, the top one-third of tumors ranked by expression of the mesenchymal signature; gray line, the lower one-third of tumors by expression of the mesenchymal signature. E, Kaplan–Meier curves of cytoskeletal protein signature in gene expression datasets. P values are derived from the log-rank test. Black line, the top one-third of tumors ranked by expression of the cytoskeletal signature; gray line, the lower one-third of tumors by expression of the cytoskeletal signature.
Discussion
We have undertaken proteomic, gene expression, miRNA, and DNA methylation analysis of lung adenocarcinoma cell lines representative of genomic heterogeneity in lung adenocarcinoma, together with their functional characterization. Subsets that emerged from the study encompassed epithelial, mesenchymal, and an aggressive hybrid group with features of both epithelial and mesenchymal cell lines characterized by upregulation of cytoskeletal and actin-binding proteins. Findings from TGFβ treatment of epithelial cells support the occurrence of an intermediate state during EMT with hybrid features. In consideration of the role that EMT plays in tumor progression, we sought to elucidate gene sets that may be predictive of outcome in early-stage lung cancer based on biologic functions and identified a signature enriched in genes encoding cytoskeletal proteins that was predictive of survival.
Molecular profiling has revealed substantial tumor heterogeneity in many human cancers (35–38). The extensive molecular and phenotypic characterization of lung adenocarcinoma cell lines likewise has revealed substantial heterogeneity among the cell lines. Most cell lines could not be simply categorized as either mesenchymal or epithelial. We identified cell lines with features of both mesenchymal and epithelial cell types, substantiating the occurrence of a hybrid state in tumor cell populations (25). Costaining of E-cadherin and vimentin in hybrid cell lines suggested the occurrence of a hybrid transcriptional program. Partial EMT has also been described during development and wound healing in addition to tumorigenesis (39–42), and the existence of a subpopulation of E-cadherin-positive and vimentin-positive cells has recently been indicated in head and neck cancer (43, 44), supporting the need to further characterize this group at the genomic and proteomic levels as we have undertaken in this study. In addition, our IHC studies of CDH1 and VIM protein were performed using nonserial sections of lung adenocarcinoma tissues. On the basis of this potential limitation, further validation studies for the occurrence of the hybrid type in tumors are also warranted. A subset of hybrid cell lines was highly invasive despite gene and protein expression of epithelial markers. Aggressive hybrid cell lines expressed a pattern of upregulated cytoskeletal and actin-binding proteins similar to that of mesenchymal cell lines. The occurrence of hybrid EMT states is supported by our analysis of TGFβ induction of EMT in the H1437 cell line, in which we observed phenotypic alterations and upregulation of cytoskeletal and actin-binding proteins before changes in E-cadherin or vimentin expression. Lu and colleagues proposed a regulatory switch centered around miR-200/Zeb and Snail/miR-34 that regulates hybrid states of EMT (25).
Our findings support the importance of Zeb and miR-34a in the regulation of hybrid phenotypes. We observed limited concordance between mRNA and proteins for some genes, particularly for cells with a mesenchymal phenotype, suggesting an important role for posttranscriptional regulation affecting EMT.
Cytoskeletal rearrangements emerged as the dominant feature of mesenchymal and invasive cells based on both mRNA and protein analysis. This finding is further supported by upregulation of cytoskeletal proteins following TGFβ induction of EMT in our and other studies (45–47) and provided a rationale to investigate the relationship between signatures rich in cytoskeletal genes and survival. A set of 41 genes derived from the mesenchymal protein signature representing cytoskeletal and actin-binding proteins predicted survival in all three tumor datasets we tested. Moreover, while CDH1 loss is clearly a key feature of EMT, our study provides supporting evidence that cytoskeletal reorganization and invasiveness occur frequently in the absence of CDH1 loss.
We further assessed the relevance of the gene and protein signatures associated with EMT to survival in early-stage lung adenocarcinoma in three independent sets. Reproducibility of statistical association of gene expression signatures with survival across independent datasets has been challenging (48). Association of the full set of mRNAs in the mesenchymal signature with survival was significant in the Director's Challenge dataset. Remarkably, association of the more limited set of transcripts corresponding specifically to the mesenchymal proteomic signature was significant in all three independent datasets tested without initial training. Thus, our findings emphasize the functional relevance of proteomics to integrated cancer molecular profiling, pointing to an association between cytoskeletal and actin-binding proteins, a mesenchymal or aggressive hybrid EMT phenotype and invasive properties of lung adenocarcinomas. Byers and colleagues (49) identified an EMT gene signature consisting of 76 genes predictive of resistance to EGFR and PI3K/AKT inhibitors which partially overlapped with our epithelial and mesenchymal gene/protein signatures. We note that Byers and colleagues established their signature by selecting genes with significant correlation (both positive and negative) with gene expression of CDH1, VIM, CDH2, and FN1. In our study, we first defined epithelial and mesenchymal cell line properties based on cell morphology and expression at the protein level of CDH1 and VIM. Our results indicate that aggressiveness among hybrid type of cell lines is not associated with expression levels of CDH1 and VIM.
In conclusion, an integrated systems approach that encompassed functional and molecular characterization of lung adenocarcinoma cell lines uncovered substantial heterogeneity with respect to epithelial and mesenchymal features among cell lines. Signatures were identified that distinguish epithelial and mesenchymal cells, as well as signatures that were shared with cells with an intermediate/hybrid phenotype. Our findings point to an association between cytoskeletal and actin-binding proteins, a mesenchymal or hybrid EMT phenotype, and invasive properties of lung adenocarcinomas that impact survival.
Disclosure of Potential Conflicts of Interest
A.F. Gazdar has received speakers bureau honoraria from Genentech. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: M.J. Schliekelman, S.M. Hanash
Development of methodology: M.J. Schliekelman, J. Zhu, M. Celiktas, A.F. Gazdar, I.I. Wistuba, S.M. Hanash
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): M.J. Schliekelman, A. Taguchi, J. Rodriguez, M. Celiktas, A. Chin, H. Wang, S.A. Selamat, E.M. Kroh, C. Behrens, A.F. Gazdar, I.A. Laird-Offringa, I.I. Wistuba, S.M. Hanash
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): M.J. Schliekelman, A. Taguchi, J. Zhu, X. Dai, J. Rodriguez, Q. Zhang, L. McFerrin, S.A. Selamat, C. Yang, E.M. Kroh, K.S. Garg, I.A. Laird-Offringa, M. Tewari, I.I. Wistuba, J.P. Thiery, S.M. Hanash
Writing, review, and/or revision of the manuscript: M.J. Schliekelman, A. Taguchi, J. Zhu, K.S. Garg, A.F. Gazdar, I.A. Laird-Offringa, M. Tewari, J.P. Thiery, S.M. Hanash
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M.J. Schliekelman, A. Taguchi, C.-H. Wong, S.M. Hanash
Study supervision: M.J. Schliekelman, S.M. Hanash
Acknowledgments
The authors thank members of the Hanash lab for their invaluable suggestions and Paul Schliekelman for providing statistical support and advice on the article.
Grant Support
This work was supported by the Department of Defense (DOD) Congressionally Mandated Lung Cancer Research Program, the National Cancer Institute Early Detection Program, the Canary Foundation and the Lungevity Foundation. M. Tewari was supported by the Canary Foundation. L. McFerrin was supported by NIH grant R21/R33 CA-88245 and the Listwin Family Foundation.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.