Breast cancer is the second leading cause of cancer death for women in the United States. Of the different subtypes, estrogen receptor–negative (ER−) tumors, which are ErbB2+ or triple-negative, carry a relatively poor prognosis. In this study, we used system-wide analysis of breast cancer proteomes to identify proteins that are associated with the progression of ER− tumors. Our two-step approach included an initial deep analysis of cultured cells that were obtained from tumors of defined breast cancer stages, followed by a validation set using human breast tumors. Using high-resolution mass spectrometry and quantification by Stable Isotope Labeling with Amino Acids in Cell Culture (SILAC), we identified 8,750 proteins and quantified 7,800 of them. A stage-specific signature was extracted and validated by mass spectrometry and immunohistochemistry on tissue microarrays. Overall, the proteomics signature reflected both a global loss of tissue architecture and a number of metabolic changes in the transformed cells. Proteomic analysis also identified high levels of IDH2 and CRABP2 and low levels of SEC14L2 to be prognostic markers for overall breast cancer survival. Together, our findings suggest that global proteomic analysis provides information about the protein changes specific to ER− breast tumor progression as well as important prognostic information. Cancer Res; 72(9); 2428–39. ©2012 AACR.

The transformation of a normal somatic cell into an immortalized and later into an invasive cancer cell involves dramatic phenotypic changes including changes in cell proliferation rate, motility, metabolism, genomic stability, and survival. Patient prognosis largely depends on the cancer stage, which is assessed by tumor size, number of involved lymph nodes, and metastases [tumor-node-metastasis (TNM) stage; refs. 1–3]. To date, critical molecular events in breast cancer progression that have dramatic effects on disease outcome are still poorly characterized. Identification of such molecular changes can be achieved by system-wide and unbiased approaches.

In the past decade, numerous studies analyzed transcriptomes of breast cancer cell lines and tumor samples using cDNA arrays. Such transcriptomic analyses classified breast tumors into 5 distinct subtypes: basal-like tumors, luminal A, luminal B, ErbB2-overexpressing, and normal breast-like tumors (4–6). Luminal A tumors are estrogen receptor–positive (ER+) and have better prognosis, whereas the basal-like and ErbB2+ tumors are ER− and have poorer prognosis (6). Similar to the tumor transcriptome studies, classification has also been done on breast cancer cell lines (7–9). As in the tumor samples, there is a discrimination between basal and luminal cells, with small differences in the subclassification.

Analysis of the proteins rather than the mRNA levels may reflect the functional phenotype of the cells more directly. Moreover, some proportion of the changes at the genome as well as at the transcriptome levels are eliminated by higher regulatory mechanisms (10). However, measuring the proteome is technologically much more challenging than transcriptome analysis. As a result, in contrast to the extensive mRNA work, proteomics studies so far have usually analyzed only a few samples, with limited proteome coverage and often with inaccurate quantification, and therefore they did not provide a true global view of the system (11). For instance, 2-dimensional gel electrophoresis has been used in cancer proteomics, but this technique enabled analysis of only the most abundant proteins and generally with low quantitative accuracy. Mass spectrometry–based proteomics, particularly in a high resolution and quantitative format, has developed rapidly over the last few years (12). Hybrid mass spectrometers, such as the linear ion trap-Orbitrap, combine high resolution, high mass accuracy, and high peptide sequencing speed (13). Together with innovations in sample preparation and computational proteomics, these technologies can enable confident peptide and protein identification and quantification at a large scale. In our laboratory, these advances allow routine coverage of 5,000 to 7,000 proteins from mammalian cells (14). We hypothesized that the combination of confident identification, high proteome coverage, and accurate quantification using Stable Isotope Labeling with Amino Acids in Cell Culture (SILAC; ref. 15) could make proteomics applicable to system-wide analysis of cancer proteomes and could highlight proteins and processes that are altered during cancer progression.

We took a 2-step approach starting with deep analysis of the proteomes of cultured cells that were isolated from tumors of different stages using state-of-the-art mass spectrometric technology. The analysis of cell lines eliminates the effects of diverse cell populations in the tissue that may mask the changes in the cancer cells, and the controlled growth environment of cell lines further reduces the variability compared with human tissue samples. These advantages enable conclusions from a smaller number of samples and are therefore more suitable for proteomics, which still has limited throughput. We extracted a stage-specific proteomic signature and validated the results using a directed mass spectrometric approach and immunohistochemistry using tumor arrays. Examination of the signature proteins in gene expression studies of large patient cohorts identified IDH2 and CRABP2 as markers of poor prognosis and SEC14L2 as a marker of good prognosis.

Cell culture and SILAC labeling

Human mammary epithelial cells (HMEC) were obtained from Lonza and from the European Collection of Cell Cultures (ECACC); HMT-3522-S1 (16) and MFM223 (17) were obtained from the ECACC; HCC202 and HCC2218 cells (18) were obtained from the American Type Culture Collection; HCC1599, HCC1143, HCC1937 (18), and MCF7 cells were obtained from the German Collection of Microorganisms and Cell Cultures (DSMZ); MCF10a (19) and MDA-MB-453 (20) were kindly provided by Axel Ullrich (Max-Planck Institute of Biochemistry, Martinsried, Germany) and tested negative for mycoplasma contamination using DNA tests. Cell lines were purchased from the cell banks during 2008 and were authenticated by them using DNA tests and microbiologic cultures. Cells were used 1 to 6 months after arrival to the laboratory. HMEC cells were cultured in mammary epithelial cell growth medium (ECACC); HCC1599, HCC1143, HCC1937, HCC202, and HCC2218 were grown in RPMI supplemented with 10% FBS; MCF10a cells were cultured in Dulbecco's Modified Eagle's Media (DMEM):F12 supplemented with 5% horse serum, 20 ng/mL EGF, 10 μg/mL insulin, 0.5 μg/mL hydrocortisone, and 0.1 μg/mL cholera toxin. HMT-3522-S1 cells were cultured in DMEM:F12 supplemented with 250 ng/mL insulin, 10 μg/mL transferrin, 0.1 μmol/L sodium selenite, 0.1 nmol/L 17 β-estradiol, 5 μg/mL ovine prolactin, 0.5 μg/mL hydrocortisone, and 10 ng/mL EGF; MDA-MB-453 cells were cultured in L-15 supplemented with 10% FBS; MFM223 cells were grown in MEM supplemented with 10% FBS. All cells were cultured with penicillin/streptomycin under 5% CO2, except for MDA-MB-453 which were cultured under 0% CO2.

MCF7 cells were SILAC-labeled with Arg10 and Lys8 by culturing them for 8 doublings in the SILAC medium to reach complete labeling. For proteomic analysis, each of the cell lines was analyzed in 3 biologic replicates. The first 2 replicates were lysed with modified RIPA buffer (50 mmol/L Tris-HCl, pH 7.4, 150 mmol/L NaCl, 1 mmol/L EDTA, 1% NP40, 0.25% sodium deoxycholate, and protease inhibitors) at 4°C. For lysis of the cells of the third replicate, we used an improved method that enables better yield of membrane proteins, with a buffer containing 4% SDS, 100 mmol/L Tris-HCl, pH 7.6, and 100 mmol/L dithiothreitol (DTT). For tumor analysis, we used the super-SILAC mix as described previously (21).

Tissue sample preparation

Normal and tumor tissues were kindly provided by René Bernards (NKI, Amsterdam, The Netherlands). Analysis of the samples followed an informed consent approved by the local ethics committee. Tissue slices from snap-frozen tissue samples were lysed with 4% SDS, 100 mmol/L Tris-HCl, pH 7.6, and 100 mmol/L DTT.

Trypsin digestion

Each of the nonlabeled samples (HMEC, MCF10a, HMT-3522-S1, HCC1937, HCC1143, HCC1599, HCC202, HCC2218, MFM223, and MDA-MB-453) was mixed with SILAC-labeled MCF7 cells at a 1:1 protein ratio. For tissue analysis, the super-SILAC mix was combined with an equal protein amount of each of the tissue samples. Two methods were used for trypsin digestion: in-solution digestion was used for the first 2 replicates of the cell line analysis, where cells were lysed with RIPA buffer, and Filter Aided Sample Preparation (FASP; ref. 14) was used for the third cell line replicate and for the tissue/super-SILAC digestion, where lysis was done with the SDS-based buffer.

Peptide fractionation

In the cell line experiments, peptides were separated using an Agilent 3100 OFFGEL fractionator (Agilent, G3100AA) as described previously (22). Before liquid chromatography/mass spectrometry (LC/MS) analysis, peptides were concentrated and desalted on C18 StageTips. The tumor/super-SILAC peptides were separated by strong anion exchange in a StageTip format as described previously (23). Peptides were separated to 6 fractions with buffers of different pH values. Peptides were concentrated and purified on C18 StageTips before LC/MS analysis.

LC/MS analysis

For the cell line analyses, peptides were separated by reverse-phase chromatography using a nanoflow HPLC system (Thermo Fisher Scientific) with a 90-minute linear gradient of water/acetonitrile. High-performance liquid chromatography (HPLC) was coupled online to an LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific). Fragmentation of the top 5 peptides in each scan was done by collision-induced dissociation. For the tumor sample analyses, peptides were eluted with a 190-minute linear gradient and the MS analysis was done on an LTQ-Orbitrap Velos instrument with MS/MS selecting the top 10 precursor m/z values from an inclusion list. The m/z values that were used in the inclusion list are given in Supplementary Table S1. Peptides were fragmented by higher energy collisional dissociation (HCD).

Data analysis

Raw MS files from the LTQ-Orbitrap were analyzed by MaxQuant (version 1.1.1.9; ref. 24). MS/MS spectra were searched against the decoy IPI-human database version 3.68 containing both forward and reverse protein sequences by the Andromeda search engine (25). For identification, the false discovery rate (FDR) was set to 0.01 on the protein and on the peptide levels. Complete protein and peptides lists are given as Supplementary Table S2.
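The target-decoy principle behind the 1% FDR threshold can be illustrated with a minimal sketch. This is not the MaxQuant or Andromeda implementation, which uses a more elaborate scoring scheme; the function name, the input layout, and the simple decoy/target estimate are assumptions made for illustration only.

```python
def score_cutoff_for_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) pairs; a higher score is a better match.
    The FDR at a cutoff is estimated as (#decoy hits) / (#target hits) at or
    above that score. Returns None if no cutoff satisfies the threshold.
    Ties between equal scores are not treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest position in the list that still meets the FDR.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff


# Hypothetical scores: 50 forward hits, one reversed (decoy) hit, one more forward hit.
example = [(100 - i, False) for i in range(50)] + [(40, True), (30, False)]
cutoff = score_cutoff_for_fdr(example, max_fdr=0.01)
```

In this toy list, accepting the decoy hit would give an estimated FDR of 1/50, so the 1% cutoff stops just above it.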

Statistical analysis

All the statistical analyses of the MaxQuant protein tables were done with the Perseus program (J. Cox, manuscript in preparation). For hierarchical clustering, we filtered the data and kept proteins with a minimum of 5 ratio values from the 11 cell lines. Logarithmized ratios toward the internal standard were z-scored and clustered using Euclidean distances between averages. For ANOVA test, experimental systems were grouped according to their stage, and the statistical test was done with FDR = 0.05 and S0 = 1 (26). The S0 factor was described by Tusher and colleagues for t test and was here generalized for ANOVA test. Fisher exact tests were done with a Benjamini–Hochberg FDR threshold of 0.02.
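The filtering and z-scoring steps that precede clustering can be sketched as follows. This is an illustration rather than the Perseus code: the table layout (one list of log ratios per protein, with None for missing values), the use of the sample standard deviation, and the handling of missing values in the distance are all assumptions.

```python
import math
from statistics import mean, stdev

MIN_VALID = 5  # minimum number of ratio values per protein, as stated in the text


def zscore_row(log_ratios):
    """Z-score one protein's log ratios, leaving missing values (None) in place."""
    valid = [v for v in log_ratios if v is not None]
    m, s = mean(valid), stdev(valid)
    if s == 0:  # constant row: define all z-scores as 0
        return [None if v is None else 0.0 for v in log_ratios]
    return [None if v is None else (v - m) / s for v in log_ratios]


def filter_and_zscore(table, min_valid=MIN_VALID):
    """Keep proteins with at least min_valid ratios, then z-score each row."""
    kept = {}
    for protein, ratios in table.items():
        if sum(v is not None for v in ratios) >= min_valid:
            kept[protein] = zscore_row(ratios)
    return kept


def euclidean(a, b):
    """Euclidean distance over the positions where both profiles have a value."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs))


# Hypothetical log ratios across 11 cell lines; protein "B" has only 4 values.
table = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0] + [None] * 6,
    "B": [1.0, 2.0, 3.0, 4.0] + [None] * 7,
}
result = filter_and_zscore(table)
```

The resulting z-scored profiles would then be passed to a hierarchical clustering routine using the Euclidean distance.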

Tumor arrays

Breast cancer tissue arrays were obtained from Pantomics, Inc., and consisted of 75 breast tumor and normal breast tissues in duplicate. Primary antibodies, anti-IDH2, anti-CRABP2, and anti-ANX3, were kindly provided by the Human Protein Atlas. We carried out semiquantitative scoring of the intensity of staining using 4 values (0–3) for negative, low, medium, and high staining intensities.

Gene expression data

Eight data sets comprising 1,467 samples were downloaded from the Gene Expression Omnibus and from the Stanford microarray database. Twenty samples lacking clinical information were removed; where raw data were not available, the published normalized data were used. The Affymetrix data sets were quantile-normalized and dual channel platforms were loess-normalized (a locally weighted polynomial regression method; refs. 27, 28). Probes were mapped to Entrez gene IDs to gene center the data (29). Entrez gene IDs for genes of interest were obtained from the gene database at National Center for Biotechnology Information (NCBI). All calculations were carried out in the R statistical environment (30).
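Quantile normalization, as applied to the Affymetrix data sets, forces every sample to share the same distribution of values. A minimal sketch of the idea is given below; it is not the R routine used in the study, and it assumes complete data (no missing values) and breaks ties by position rather than averaging them.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length samples (one list of values per array)."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [sum(sc[k] for sc in sorted_cols) / len(columns) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices by ascending value
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]  # replace each value by the reference quantile
        normalized.append(out)
    return normalized


# Two hypothetical arrays with three probes each.
normalized = quantile_normalize([[5, 2, 3], [4, 1, 6]])
```

After normalization, both arrays contain the same set of values and differ only in which probe carries which rank.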

Survival analysis

For each of the 52 genes, median mRNA expression levels were used to determine high and low expression groups within each of the 8 individual data sets. The survival curve was based on Kaplan–Meier estimates and the log-rank P value is shown for difference in survival. The P values are adjusted for multiple testing using a Bonferroni correction. Cox regression analysis was used to calculate hazard ratios (HR). The R package survival was used for all calculations and to plot the Kaplan–Meier survival curves.
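The three building blocks of this analysis (median split, Kaplan–Meier estimate, and Bonferroni adjustment) can be sketched in a few lines. The study used the R package survival; the Python functions below are an illustrative stand-in, and the choice to assign samples exactly at the median to the low group is an assumption.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample 'high' if above the data set median, else 'low'."""
    cut = median(expression)
    return ["high" if x > cut else "low" for x in expression]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time per patient; events: True for death, False for censored.
    Patients censored at time t are counted as at risk at t.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    return [min(1.0, p * len(p_values)) for p in p_values]


# Hypothetical follow-up data for five patients.
curve = kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False])
groups = split_by_median([1.0, 2.0, 3.0, 4.0])
adjusted = bonferroni([0.01, 0.5, 0.001])
```

With 52 genes tested, a Bonferroni adjustment of this kind multiplies each log-rank P value by 52.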

A cell culture model for breast cancer development

To characterize the differences in the proteomes of ductal carcinomas of various stages, we assembled a panel of cell lines that were isolated from human tumors with a defined TNM stage. We aimed to identify molecular markers and cellular processes characteristic of specific stages in the transformation process rather than of a particular cell line or an individual patient. We therefore included 2 to 3 cell lines from each stage, derived from the tumors of different patients with cancer who were not previously treated. As control cells that represent the healthy tissue, we used primary mammary epithelial cells (HMEC) from 2 different sources. Premalignant cells were represented by MCF10a and HMT-3522-S1 cells, stage II tumors by HCC1143 and HCC1937 cells, stage III tumors by HCC202, HCC2218, and HCC1599 cells, and metastatic cells from pleural effusions by MFM223 and MDA-MB-453 cells (Fig. 1). Together, this panel of cells models the transformation process toward development of ER− tumors that are basal-A, luminal, or ErbB2-overexpressing. In microarray studies, HCC1143, HCC1937, and HCC1599 have been classified as basal-like tumor cells, and HCC202, HCC2218, and MDA-MB-453 as luminal and ErbB2-overexpressing (8), whereas the classification of MFM223 was unknown.

First, we tested whether the cell lines retained their original in vivo phenotype with respect to their ability to grow in anchorage-independent conditions. A colony formation assay showed that the tumorigenic potential indeed increased with the stages (Fig. 1). These results show that the metastatic state is evident in the primary tumors and that these characteristics are maintained in the cell lines in vitro. Thus, this cellular model represents crucial aspects in the development of the transformed phenotype and can therefore serve as the basis of proteomic profiling.

MS-based proteomic analysis of breast cancer cell lines

We carried out a SILAC-based proteomic analysis to accurately quantify the proteomes of each of the cell lines. We SILAC-labeled MCF7 cells that served as a “spike-in” standard (31). Briefly, we cultured the MCF7 cells with “heavy” lysine and arginine, and the lysates of the “heavy” MCF7 cells were mixed with the lysates of each of these cell lines before trypsin digestion. Peptides were fractionated by isoelectric focusing and analyzed with a high-resolution mass spectrometer (LTQ-Orbitrap). The spike-in approach allowed culturing the experimental cell lines described earlier under their standard conditions (Supplementary Fig. S1; see Materials and Methods) followed by relative quantification against the common SILAC standard.

Analysis of biologic triplicates of all the samples identified a total of 8,750 proteins and quantified 7,800 of them, the latter of which were used for all subsequent analysis (Supplementary Table S2). Of the quantified proteins, approximately half did not vary significantly in any of the cell lines (53%; ANOVA test for comparison of triplicates at a 5% FDR, see Materials and Methods). These proteins are enriched for basic cellular processes, such as the basal transcription machinery and chromatin assembly and may be considered the “household proteome” (Supplementary Table S3). Expression levels of proteins involved in many basic cellular functions, such as metabolic processes, protein expression, and cell adhesion, did change drastically as described later, reflecting the pronounced cellular differences between the cells.

As positive controls, we found in HMECs high levels of known myoepithelial markers: keratins 5, 6, and 14 and caldesmon, a regulator of actomyosin contractility (32–34). These markers were lower in the cancer cells, most dramatically in stage III and in the cells from pleural effusions (median 25-fold lower compared with the myoepithelial cells; Fig. 2A). CD44, a cell surface adhesion molecule, has been reported as a marker of breast cancer stem cells (35). However, it was also previously found to be highly expressed in myoepithelial cells and basal tumors compared with luminal cells (9). In agreement with that observation, our proteomic data showed high expression of this marker only in the cells previously reported as basal (Fig. 2B; refs. 7, 8). We further identified the expected upregulation of the DNA repair protein PARP in late transformation stage (36), and high expression level of ErbB2 in HCC2218, HCC202, and MDA-MB-453, which have the corresponding gene amplification and are known to overexpress this receptor (8).

Clustering analysis distinguishes between breast cancer subtypes

Unsupervised hierarchical clustering of the proteomic data separated the samples into 2 main groups (Fig. 2C). The basal cluster included the HMECs, the benign cells, and the basal cancer cells from stage II and stage III. Within the basal cluster, cell lines perfectly segregated according to their stage. This shows that MS-based proteomics can correctly group cancer proteomes according to the stage in single cancer subtypes. The luminal cluster contained cells from stage III and from pleural effusion metastasis, including MFM223 cells. These proteomics results show, in agreement with the abovementioned transcriptomic work, the overall dominance of the cancer subtype, but additionally indicate stage-related alterations.

The hierarchical clustering of the proteins revealed 3 main groups: those lower in the cancer cells than in the controls, those high in the basal tumor cells, and those high in the luminal cells (Fig. 2C). We carried out enrichment analysis to find cellular processes significantly altered in each of the clusters. In the first one, of proteins that are low in tumors relative to control irrespective of subtype, we identified a prominent reduction in the adhesive phenotype of the cells. We extracted all cell adhesion–related proteins (as annotated by Gene Ontology) and indeed found that the genuine adhesion proteins in this group are reduced in the transformed cells (Supplementary Fig. S2A and Table S3). Among the downregulated proteins, we found α-integrins 2, 3, 4, 5, 6, and V and β-integrins 1, 4, 5, and 6, reflecting reduced adhesion to fibronectin, collagen, and laminin (37). The reduction in integrins coincided with lower levels of the extracellular matrix (ECM) proteins laminin-5 (α3β3γ2) and laminin-10/11 (α5β1/2γ1), fibronectin and collagen (COL7A1), and mediators of integrin signaling, such as ILK and α-parvin. The second cluster, of proteins that were high in the basal cells, included proteins involved in DNA replication and mitosis as well as splicing regulators. Examination of the distribution of cell-cycle regulators showed that the majority (such as CDK1, CDK4, cyclin A2, cyclin B1, and the APC complex proteins) are increased in the basal cells already in the premalignant stage and that they were further upregulated in the malignant cells (Supplementary Fig. S2B). Regulators of splicing and spliceosomal proteins were generally higher in the transformed cells than in the control cells; however, their expression was higher in the basal cells than in the luminal cells from the same stage (Supplementary Fig. S2C). The cluster of proteins that are high in the luminal cells was enriched for mitochondrial proteins as well as endoplasmic reticulum–Golgi and vesicle transport processes, reflecting the physiologic characteristic of the luminal cell layer as secreting cells (Supplementary Fig. S2D).

Establishment of a stage-specific signature

Because of the dominant effect of the cancer subtypes, it is challenging to identify proteins that can serve as stage-specific markers that capture commonalities in breast cancer progression. In an attempt to find such proteins, we carried out an ANOVA test on cells grouped by stage. We extracted a stage-specific signature of 52 proteins, of which 11 were upregulated and 41 were downregulated (FDR = 0.05; Fig. 3; Supplementary Table S4). We divided the signature proteins into 4 clusters according to the pattern of their change (Fig. 3). The largest cluster (22 proteins) consisted of proteins that were expressed at similar levels in the normal, premalignant, and stage II cells and dropped dramatically between stage II and stage III. This cluster included the laminin receptors, integrins α6β4 and α6β1, as well as 2 laminin-5 subunits (laminin-β3 and -γ2). Interestingly, unlike the other proteins in the group, integrin α6 expression increased more than 8-fold in the cells from the metastatic location and likewise expression of each of its binding partners was increased approximately 2-fold in these cells. The signature also included the β-subunit of the αvβ6 fibronectin receptor, whereas fibronectin itself was downregulated already in the initial step of transformation. Furthermore, this cluster included an adherens junction protein (P-Cadherin, CDH3), 5 actin regulators (CALD1, PDLIM5, PDLIM7, CAPG, and FMNL2), and the intermediate filament protein vimentin. These results highlight the general loss of the adhesive phenotype and remodeling of cell architecture. The observed proteome changes are the molecular correlates to the detachment of the cells from the tissue of origin in the process of metastasis.

The second and third clusters include proteins that were downregulated mainly between the premalignant and stage II cells and between the myoepithelial cells and the premalignant cells (Fig. 3). They include the cell-cycle regulator stratifin (SFN), which is a p53 target whose promoter region is known to be hypermethylated in breast cancers (38, 39), and 4 adhesion molecules (BPAG1, COL17A1, CDHF7, and CD97). We also found the membrane-bound metalloprotease MMP14 to be downregulated in the stage II tumor cells in our system.

The last cluster included proteins that are high in the transformed cells, mostly between the premalignant cells and stage II cells, but are further increased in later stages (Fig. 3). This cluster has 11 members, among them metabolic proteins (IDH2, BLVRB, UCKL1, and CRABP2), protein glycosylation (FUT8), and vesicle transport (ANX6). These and the upregulated proteins with unknown function are potential positive breast cancer markers.

As a first assessment of the signature proteins as potential tumor markers, we validated their relevance to human tumors in 3 ways. In the first, we used a directed mass spectrometric approach to preferentially retrieve signature proteins in the analysis of single tumor samples, and in the second, we carried out immunohistochemistry on tumor arrays to analyze expression of individual proteins in multiple tissue samples. Finally, we validated signature proteins in a large compendium of gene expression studies.

Validation of the protein signature in human tumor samples using a directed MS approach

To quantify tumor samples with respect to healthy tissue, cell line–based SILAC cannot be applied directly. Instead, we made use of the recently developed super-SILAC method for quantification of human breast tumor samples (21). We used the super-SILAC mix as an internal standard to quantify a tumor tissue and a normal tissue that served as control. The tumor tissue originated from an ER− stage III tumor and therefore corresponds to the stage III tumor cell lines such as HCC1599.

To examine the expression levels of the signature proteins in tumor samples, we developed a mass spectrometric method based on an inclusion list, in which we preferentially target peptides belonging to these proteins for fragmentation and identification. With this preferential fragmentation of peptides of interest, we identified 48 of the 52 signature proteins. To compare the cell line experiments and the human tumor tissues, we normalized the cancer samples to the healthy controls in each of the experiments, HCC1599 to HMEC and the tumor tissue to the normal tissue. Of the 39 signature proteins for which we had accurate ratios compared with internal standards in all 4 experiments (HCC1599 and HMEC vs. labeled MCF7 and normal tissue and stage III tumor tissue vs. super-SILAC mix), we found positive correlation of the cancer versus normal ratios for 32 proteins (Fig. 4). For most of the proteins, the ratio between the tumor cells and the normal controls was less pronounced in the tissues than in the cell lines. Specifically, we validated the downregulation of ANX3, SEC14L2, all adhesion proteins, and laminins, as well as the myoepithelial markers, CD109, caldesmon, and caveolin. We also validated the dramatic decrease in MMP14 (>30-fold in both cases). For the proteins that were highly expressed in the cancer cell lines, we validated the overexpression of IDH2, CRABP2, FUT8, BLVRB, ANX6, and RBM47, suggesting 6 potential novel markers for breast cancer.
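The normalization described above works because each sample is measured against a common heavy standard, which cancels when two such ratios are divided. A minimal sketch follows; the function names and the example values are hypothetical, and counting same-direction changes is only one simple reading of the "positive correlation" criterion, not necessarily the exact procedure used.

```python
import math


def cancer_vs_normal_log2(cancer_over_std, normal_over_std):
    """log2(cancer/normal) from two ratios measured against the same heavy standard."""
    return math.log2(cancer_over_std) - math.log2(normal_over_std)


def count_concordant(cell_line_log2, tissue_log2):
    """Count shared proteins whose cancer-vs-normal change has the same sign in both systems."""
    shared = cell_line_log2.keys() & tissue_log2.keys()
    return sum(1 for p in shared if cell_line_log2[p] * tissue_log2[p] > 0)


# Hypothetical ratios for one protein: HCC1599/MCF7 = 4.0 and HMEC/MCF7 = 0.5.
change = cancer_vs_normal_log2(4.0, 0.5)

# Hypothetical log2 cancer-vs-normal values for a few proteins in each system.
cell_lines = {"P1": 3.0, "P2": -2.0, "P3": 1.0}
tissues = {"P1": 1.0, "P2": -0.5, "P3": -0.2, "P4": 1.0}
n_same_direction = count_concordant(cell_lines, tissues)
```

The same calculation applies to the tissue experiment, with the super-SILAC mix taking the place of labeled MCF7 as the standard.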

Verification of signature proteins by immunohistochemistry using human tumor tissue arrays

Three of the signature proteins, IDH2, CRABP2, and ANX3, were selected for further evaluation of their expression patterns in a larger number of tumor samples using immunohistochemistry and tumor arrays. In the proteomic data, the first 2 were high in the advanced primary tumor cells, whereas the third was downregulated starting already in premalignant cells. We reacted breast cancer tissue microarrays that include 75 tumor and normal samples with the corresponding antibodies. We scored the intensity of the staining in each of the tissue sections in the array and examined the correlation between the staining intensity and the tumor TNM stage. In agreement with the previous experiments, ANX3 was strongly stained in the myoepithelial layer in the normal tissue but was low in the tumor tissues (Fig. 5A). Globally, ANX3 staining negatively correlated with the tumor stage. IDH2 was completely absent in the normal tissues, although it showed strong mitochondrial staining in the tumor tissues. The overall correlation was positive between the tumor and lymph node state (Fig. 5B). CRABP2 was negative in the myoepithelial cells but had moderate staining of the luminal cell layer in the healthy tissue and showed strong staining of the tumor cells. For the overall correlation, we used the myoepithelial cells as control, as these were the controls used in the previous experiments. This confirmed the positive correlation between the tumor stage and the intensity of the CRABP2 staining. Thus, the analysis of 75 tissue samples confirmed that ANX3 is reduced with transformation and that IDH2 and CRABP2 are potential markers of advanced breast cancers.

Prognostic value of the signature proteins

To evaluate the prognostic value of the signature proteins, it is necessary to examine the protein expression in large patient cohorts. Because such proteomic data do not exist, we carried out a meta-analysis of publicly available patient mRNA data sets. Using Kaplan–Meier analysis of overall survival (OS) with median mRNA expression levels as a cutoff point, we examined the prognostic value of each of the 52 genes across 1,447 samples from 8 public data sets (Supplementary Table S5). The 3 most significant genes were SEC14L2 (adjusted P = 1.94e-9), CRABP2 (adjusted P = 9.41e-5), and IDH2 (adjusted P = 1.49e-4), with HR of 0.511 (CI, 0.4174–0.6259), 1.616 (CI, 1.325–1.972), and 1.597 (CI, 1.310–1.947), respectively (Fig. 6A–C; Supplementary Table S6). When using these 3 markers in combination (samples with greater than median expression of CRABP2 and IDH2 and less than median expression for SEC14L2), they had a greater effect on OS than they had individually (P = 6.4e-11; HR, 2.07; CI, 1.66–2.59; n = 1438; Fig. 6D). This is consistent with our observations that CRABP2 and IDH2 are markers of poor prognosis and SEC14L2 is a marker of good prognosis.
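A quick consistency check on the reported hazard ratios is possible because Cox regression confidence intervals are usually symmetric on the log scale, so the geometric mean of the two bounds should reproduce the HR. The sketch below assumes the intervals are 95% Wald intervals, which the text does not state; the function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Return (geometric mean of the CI bounds, implied standard error of log HR).

    Assumes a Wald interval exp(beta +/- z * SE); z = 1.96 corresponds to 95%.
    """
    center = math.sqrt(lower * upper)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return center, se


# HR and CI bounds as reported in the text.
reported = {
    "SEC14L2": (0.511, 0.4174, 0.6259),
    "CRABP2": (1.616, 1.325, 1.972),
    "IDH2": (1.597, 1.310, 1.947),
}
for gene, (hr, lo, hi) in reported.items():
    center, se = log_scale_summary(lo, hi)
    print(gene, hr, round(center, 3), round(se, 3))
```

An HR below 1 (SEC14L2) indicates lower hazard in the high-expression group, whereas an HR above 1 (CRABP2 and IDH2) indicates higher hazard.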

The relevance of breast cancer cell lines to tumors has been shown in genomic and gene expression studies (7, 40). On the basis of these studies, we here combine in vitro and in vivo analyses in a 2-step approach. We carried out a system-wide analysis in cultured cells, which provides a more homogenous cell population and controlled growth environment, and proceeded to validate the results in human tumor tissues. Our global analysis of the proteins from the cell lines quantified 7,800 proteins and is the first in-depth and quantitative proteomic study of breast cancer progression.

Collapse of the adhesive machinery in the transformed cells

Myoepithelial cells within normal tissue secrete ECM components, synthesize and maintain the basement membrane, and express high levels of adhesion proteins (33, 41, 42). Accordingly, we found high expression levels of ECM and adhesion proteins in the control cells but significantly lower expression in the transformed cells, mainly in the late-stage tumor cells. Interestingly, these proteins have been considered as basal markers (9), but our results show that they are also lost in basal stage III tumor cells. Therefore, their expression demarcates cancer stage rather than subtype.

The global proteomic profiles, as well as the signature proteins in the cell lines and tissues, reflect the general collapse of normal tissue architecture, which is a known feature of the development of carcinomas (Fig. 7A). The most dramatic changes in adhesion proteins occur between stage II and stage III tumor cells, highlighting their importance for detachment of the tumor cells from the original tissue. We show upregulation of integrin α6 in cells derived from pleural effusion metastases. This result agrees with previous studies that show overexpression of integrin α6 in metastatic sites of breast cancers (43, 44) and suggest a role for α6β4 and α6β1 integrins in adhesion in the metastatic location.

Novel breast cancer marker candidates reflect adhesive and metabolic changes

We found 52 potential markers of transformation; 41 of these were reduced in the transformed cells and 11 induced. The small number of commonly regulated proteins across the cancer subtypes shows that transformation likely occurs in diverse paths. Nevertheless, the existence of even a relatively small number of such proteins implies that there are commonalities in these paths. We validated these signature proteins in human tumor tissue albeit generally with lower ratios in the tissue analysis than in the cell lines. Presumably this results from the plurality of cell types in the tissue. As a result, the proteomic differences in the tissues can be diluted compared with the cell lines. This further validates our strategy of examination of the more homogenous cell population for the study of the changes in the cancer cells. In cases where the tissue protein is expressed mainly from noncancer cells, one may even obtain opposing results between the cultured cells and the tissues. For example, for fibronectin, decreased expression by tumor cells compared with myoepithelial cells was masked in the tissue by high expression by tissue fibroblasts. To establish the significance of each of the signature proteins, it would be necessary to perform broader studies using the same method on multiple tumor samples.

The top 2 prognostic markers in our data were retinoid-binding proteins. SEC14L2 was downregulated and CRABP2 was upregulated in the transformed cells. SEC14L2/TAP is a retinol- or α-tocopherol–binding protein that can act as a transcriptional regulator and as a regulator of cholesterol metabolism (45). Reduced levels of this protein were previously reported in prostate and breast cancers (46, 47). CRABP2 binds all-trans retinoic acid in the cytoplasm and targets it to the nucleus, where it binds its receptor and promotes cell differentiation. In head and neck cancers and gliomas, CRABP2 levels were shown to be reduced (48, 49). In contrast to these tumor types and in support of our data, ErbB2 has been shown to induce retinoic acid resistance in breast cancer cells (50). Furthermore, CRABP2 reportedly mediates proliferative activity through retinoic acid–induced PPARβ/δ activation in the presence of another factor, FABP5 (51). In our data, FABP5 is high in the basal cells and low in the luminal ones (Supplementary Table S2), suggesting 2 distinct mechanisms by which CRABP2 may induce hyperproliferation and possibly retinoic acid resistance.

Our results suggest IDH2 as another positive breast cancer marker. IDH2 attracted much attention when it was identified as a proto-oncogene in glioblastomas and acute myeloid leukemia (52, 53). In those cases, mutations in IDH2 appeared in early stages of transformation and induced overproduction of 2-hydroxyglutarate, thereby affecting global DNA methylation patterns (54). Furthermore, elevated IDH2 activity led to high production of NADPH, a cofactor involved in biosynthetic processes and in control of oxidative stress. In our data, the level of IDH2 increased only in late stages of transformation. Such an elevation was not seen for the mutant enzymes in other tumor types.

The proteomic signature suggests control of the cells' oxidative state and NADPH levels as principal processes regulated with transformation (Fig. 7B). The increase in IDH2 in our data coincided with increased expression of flavin reductase (BLVRB), which was included in the proteomic signature, and with glutathione reductase. These 2 enzymes regulate the redox state of the cells and require NADPH for their activity. Possibly, elevated IDH2 and overproduction of NADPH enable high activity of BLVRB and glutathione reductase, thereby regulating the oxidative state in the transformed cells. NADPH can also be produced in the reaction of retinol oxidation to retinal and retinoic acid. Upon binding of retinoic acid, CRABP2 affects cell proliferation.

In conclusion, we combined an unbiased system-wide view of the proteomes of cultured cells with focused analysis of tumors. This study captured the general processes that are altered upon transformation and showed the relevance of candidate markers to the in vivo situation. This strategy is particularly suitable for proteomics, which still has relatively limited throughput, because the proteomic data can be directly translated to immunohistochemistry and other protein-based analyses of tumors. Finally, our data identified CRABP2 and IDH2 as markers of poor prognosis and SEC14L2 as a marker of good prognosis, and they suggest additional markers that require further evaluation.

No potential conflicts of interest were disclosed.

Conception and design: T. Geiger

Development of methodology: T. Geiger, M. Mann

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): T. Geiger, M. Mann

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): T. Geiger, S.F. Madden, W.M. Gallagher, J. Cox, M. Mann

Writing, review, and/or revision of the manuscript: T. Geiger, S.F. Madden, W.M. Gallagher, M. Mann

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): M. Mann

Study supervision: M. Mann

The authors thank the members of the Department of Proteomics and Signal Transduction for helpful discussion, especially Pawel Ostasiewicz for assistance with immunohistochemical analysis; René Bernard and Jelle Wesseling from the NKI, Amsterdam, the Netherlands, for the breast tumor and healthy tissues; Axel Ullrich from the MPI of Biochemistry, Martinsried, for cell lines; Mathias Uhlén and Emma Lundberg from the Human Protein Atlas for antibodies; and Herbert Schiller from the MPI of Biochemistry and Benny Geiger from the Weizmann Institute for critical review of the manuscript.

This project was supported by the European Commission's 7th Framework Program PROteomics SPECification in Time and Space (PROSPECTS, HEALTH-F4-2008-021,648). W.M. Gallagher and S.F. Madden are supported by Science Foundation Ireland, Strategic Research Cluster award to Molecular Therapeutics for Cancer Ireland (award 08/SRC/B1410).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Shek LL, Godolphin W. Model for breast cancer survival: relative prognostic roles of axillary nodal status, TNM stage, estrogen receptor concentration, and tumor necrosis. Cancer Res 1988;48:5565–9.
2. Truong PT, Vinh-Hung V, Cserni G, Woodward WA, Tai P, Vlastos G. The number of positive nodes and the ratio of positive to excised nodes are significant predictors of survival in women with micrometastatic node-positive breast cancer. Eur J Cancer 2008;44:1670–7.
3. Vinh-Hung V, Verschraegen C, Promish DI, Cserni G, Van de Steene J, Tai P, et al. Ratios of involved nodes in early breast cancer. Breast Cancer Res 2004;6:R680–8.
4. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, et al. Molecular portraits of human breast tumours. Nature 2000;406:747–52.
5. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001;98:10869–74.
6. Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 2003;100:8418–23.
7. Neve RM, Chin K, Fridlyand J, Yeh J, Baehner FL, Fevr T, et al. A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell 2006;10:515–27.
8. Kao J, Salari K, Bocanegra M, Choi YL, Girard L, Gandhi J, et al. Molecular profiling of breast cancer cell lines defines relevant tumor models and provides a resource for cancer gene discovery. PLoS One 2009;4:e6146.
9. Charafe-Jauffret E, Ginestier C, Monville F, Finetti P, Adelaide J, Cervera N, et al. Gene expression profiling of breast cell lines identifies potential new basal markers. Oncogene 2006;25:2273–84.
10. Geiger T, Cox J, Mann M. Proteomic changes resulting from gene copy number variations in cancer cells. PLoS Genet 2010;6. pii: e1001090.
11. Hanash S, Taguchi A. The grand challenge to decipher the cancer proteome. Nat Rev 2010;10:652–60.
12. Mann M, Kelleher NL. Precision proteomics: the case for high resolution and high mass accuracy. Proc Natl Acad Sci U S A 2008;105:18132–8.
13. Makarov A, Denisov E, Kholomeev A, Balschun W, Lange O, Strupat K, et al. Performance evaluation of a hybrid linear ion trap/orbitrap mass spectrometer. Anal Chem 2006;78:2113–20.
14. Wisniewski JR, Zougman A, Nagaraj N, Mann M. Universal sample preparation method for proteome analysis. Nat Methods 2009;6:359–62.
15. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, et al. Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol Cell Proteomics 2002;1:376–86.
16. Briand P, Petersen OW, Van Deurs B. A new diploid nontumorigenic human breast epithelial cell line isolated and propagated in chemically defined medium. In Vitro Cell Dev Biol 1987;23:181–8.
17. Hackenberg R, Luttchens S, Hofmann J, Kunzmann R, Holzel F, Schulz KD. Androgen sensitivity of the new human breast cancer cell line MFM-223. Cancer Res 1991;51:5722–7.
18. Gazdar AF, Kurvari V, Virmani A, Gollahon L, Sakaguchi M, Westerfield M, et al. Characterization of paired tumor and non-tumor cell lines established from patients with breast cancer. Int J Cancer 1998;78:766–74.
19. Tait L, Soule HD, Russo J. Ultrastructural and immunocytochemical characterization of an immortalized human breast epithelial cell line, MCF-10. Cancer Res 1990;50:6087–94.
20. Cailleau R, Olive M, Cruciger QV. Long-term human breast carcinoma cell lines of metastatic origin: preliminary characterization. In Vitro 1978;14:911–5.
21. Geiger T, Cox J, Ostasiewicz P, Wisniewski JR, Mann M. Super-SILAC mix for quantitative proteomics of human tumor tissue. Nat Methods 2010;7:383–5.
22. Hubner NC, Ren S, Mann M. Peptide separation with immobilized pI strips is an attractive alternative to in-gel protein digestion for proteome analysis. Proteomics 2008;8:4862–72.
23. Wisniewski JR, Zougman A, Mann M. Combination of FASP and StageTip-based fractionation allows in-depth analysis of the hippocampal membrane proteome. J Proteome Res 2009;8:5674–8.
24. Cox J, Mann M. MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008;26:1367–72.
25. Cox J, Neuhauser N, Michalski A, Scheltema RA, Olsen JV, Mann M. Andromeda: a peptide search engine integrated into the MaxQuant environment. J Proteome Res 2011;10:1794–805.
26. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001;98:5116–21.
27. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 2002;30:e15.
28. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003;4:249–64.
29. Maglott D, Ostell J, Pruitt KD, Tatusova T. Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 2005;33:D54–8.
30. R Development Core Team. R: A language and environment for statistical computing. 2011. Available from: http://www.R-project.org.
31. Geiger T, Wisniewski JR, Cox J, Zanivan S, Kruger M, Ishihama Y, et al. Use of stable isotope labeling by amino acids in cell culture as a spike-in standard in quantitative proteomics. Nat Protoc 2011;6:147–57.
32. Adriance MC, Inman JL, Petersen OW, Bissell MJ. Myoepithelial cells: good fences make good neighbors. Breast Cancer Res 2005;7:190–7.
33. Polyak K, Hu M. Do myoepithelial cells hold the key for breast tumor progression? J Mammary Gland Biol Neoplasia 2005;10:231–47.
34. Deugnier MA, Teuliere J, Faraldo MM, Thiery JP, Glukhova MA. The importance of being a myoepithelial cell. Breast Cancer Res 2002;4:224–30.
35. Shipitsin M, Campbell LL, Argani P, Weremowicz S, Bloushtain-Qimron N, Yao J, et al. Molecular definition of breast tumor heterogeneity. Cancer Cell 2007;11:259–73.
36. Goncalves A, Finetti P, Sabatier R, Gilabert M, Adelaide J, Borg JP, et al. Poly(ADP-ribose) polymerase-1 mRNA expression in human breast cancer: a meta-analysis. Breast Cancer Res Treat 2011;127:273–81.
37. Humphries JD, Byron A, Humphries MJ. Integrin ligands at a glance. J Cell Sci 2006;119:3901–3.
38. Hermeking H, Lengauer C, Polyak K, He TC, Zhang L, Thiagalingam S, et al. 14-3-3 sigma is a p53-regulated inhibitor of G2/M progression. Mol Cell 1997;1:3–11.
39. Lodygin D, Hermeking H. Epigenetic silencing of 14-3-3sigma in cancer. Semin Cancer Biol 2006;16:214–24.
40. Wistuba II, Behrens C, Milchgrub S, Syed S, Ahmadian M, Virmani AK, et al. Comparison of features of human breast cancer cell lines and their corresponding tumors. Clin Cancer Res 1998;4:2931–8.
41. Hu M, Yao J, Carroll DK, Weremowicz S, Chen H, Carrasco D, et al. Regulation of in situ to invasive breast carcinoma transition. Cancer Cell 2008;13:394–406.
42. Barsky SH, Karlin NJ. Myoepithelial cells: autocrine and paracrine suppressors of breast cancer progression. J Mammary Gland Biol Neoplasia 2005;10:249–60.
43. Natali PG, Nicotra MR, Botti C, Mottolese M, Bigotti A, Segatto O. Changes in expression of alpha 6/beta 4 integrin heterodimer in primary and metastatic breast cancer. Br J Cancer 1992;66:318–22.
44. Davidson B, Konstantinovsky S, Nielsen S, Dong HP, Berner A, Vyberg M, et al. Altered expression of metastasis-associated and regulatory molecules in effusions from breast cancer patients: a novel model for tumor progression. Clin Cancer Res 2004;10:7335–46.
45. Porter TD. Supernatant protein factor and tocopherol-associated protein: an unexpected link between cholesterol synthesis and vitamin E (review). J Nutr Biochem 2003;14:3–6.
46. Wen XQ, Li XJ, Su ZL, Liu Y, Zhou XF, Cai YB, et al. Reduced expression of alpha-tocopherol-associated protein is associated with tumor cell proliferation and the increased risk of prostate cancer recurrence. Asian J Androl 2007;9:206–12.
47. Johnykutty S, Tang P, Zhao H, Hicks DG, Yeh S, Wang X. Dual expression of alpha-tocopherol-associated protein and estrogen receptor in normal/benign human breast luminal cells and the downregulation of alpha-tocopherol-associated protein in estrogen-receptor-positive breast carcinomas. Mod Pathol 2009;22:770–5.
48. Calmon MF, Rodrigues RV, Kaneto CM, Moura RP, Silva SD, Mota LD, et al. Epigenetic silencing of CRABP2 and MX1 in head and neck tumors. Neoplasia 2009;11:1329–39.
49. Campos B, Centner FS, Bermejo JL, Ali R, Dorsch K, Wan F, et al. Aberrant expression of retinoic acid signaling molecules influences patient survival in astrocytic gliomas. Am J Pathol 2011;178:1953–64.
50. Tari AM, Lim SJ, Hung MC, Esteva FJ, Lopez-Berestein G. Her2/neu induces all-trans retinoic acid (ATRA) resistance in breast cancer cells. Oncogene 2002;21:5224–32.
51. Schug TT, Berry DC, Shaw NS, Travis SN, Noy N. Opposing effects of retinoic acid on cell growth result from alternate activation of two different nuclear receptors. Cell 2007;129:723–33.
52. Yan H, Parsons DW, Jin G, McLendon R, Rasheed BA, Yuan W, et al. IDH1 and IDH2 mutations in gliomas. N Engl J Med 2009;360:765–73.
53. Mardis ER, Ding L, Dooling DJ, Larson DE, McLellan MD, Chen K, et al. Recurring mutations found by sequencing an acute myeloid leukemia genome. N Engl J Med 2009;361:1058–66.
54. Xu W, Yang H, Liu Y, Yang Y, Wang P, Kim SH, et al. Oncometabolite 2-hydroxyglutarate is a competitive inhibitor of alpha-ketoglutarate-dependent dioxygenases. Cancer Cell 2011;19:17–30.