The normal duct-lobular system of the breast is lined by two epithelial cell types, inner luminal secretory cells and outer contractile myoepithelial cells. We have generated comprehensive expression profiles of the two normal cell types, using immunomagnetic cell separation and gene expression microarray analysis. The cell-type specificity was confirmed at the protein level by immunohistochemistry in normal breast tissue. New prognostic markers for survival were identified when the luminal- and myoepithelial-specific molecules were evaluated on breast tumor tissue microarrays. Nuclear expression of luminal epithelial marker galectin 3 correlated with a shorter overall survival in these patients, and the expression of SPARC (osteonectin), a myoepithelial marker, was an independent marker of poor prognosis in breast cancers as a whole. These data provide a framework for the interpretation of breast cancer molecular profiling experiments, the identification of potential new diagnostic markers, and development of novel indicators of prognosis.

The terminal duct-lobular unit of the breast, the structure from which the majority of breast cancers arise, is composed of two types of epithelial cells. The inner or luminal cells, which are potential milk secreting cells, are surrounded by an outer basal layer of contractile myoepithelial cells. Most breast carcinomas express phenotypic markers that are consistent with an origin from luminal cells (1). The biology of normal luminal cells is the key to understanding breast cancer initiation, with genetic alterations occurring in both normal cells and epithelial hyperplastic lesions driving the earliest stages of progression (2, 3, 4).

Despite the luminal origin of breast tumors, a subset of invasive ductal carcinomas in the breast also express markers specific for myoepithelial cells (5, 6, 7, 8, 9, 10). Recent studies using cDNA microarray analysis of primary human breast tumors have also identified a basal-like subset of invasive ductal carcinomas (11) based on their patterns of gene expression. These tumors exhibiting a basal phenotype have been reported to have an aggressive phenotype and poorer prognosis for the patient (12, 13, 14). Although it is interesting to speculate on the cells from which these tumors may be derived, there is little evidence currently to suggest a myoepithelial origin for basal-like breast cancers.

Although tumor expression profiling data have begun to unravel the complexity of breast cancer, corresponding transcriptional studies of the two normal epithelial cell types have lagged behind. Immunomagnetic methods have been developed for large-scale purification of normal human luminal and myoepithelial breast cells from reduction mammoplasty samples, amenable to detailed molecular analysis (15). We report here the cDNA microarray analysis of separated normal luminal and myoepithelial cells. The objectives of this study were to provide a baseline reference dataset to help understand preexisting and forthcoming tumor expression profiles, to determine whether novel cell-type specific markers can be used for tumor subclassification and for differential diagnosis, and to identify new predictive and prognostic markers that could include potential targets for future therapy. Comparison of the transcriptional profiles of normal adult human luminal and myoepithelial cells does identify novel markers, including some which provide significant prognostic information for primary breast cancers.

Cell Preparations.

Purified populations of ∼10⁷ normal human breast luminal and myoepithelial cells were prepared from individual reduction mammoplasty samples (16) with modifications to enhance purity (15). Briefly, the different breast cells were immunomagnetically sorted from primary cultures by combined positive magnetic activated cell sorting selection with antibodies against the luminal epithelial membrane marker EMA (rat monoclonal ICR2) and the myoepithelial membrane antigen CD10 (mouse monoclonal DAKO-CALLA clone SS2/36), followed by negative Dynabead selection with mouse monoclonal antibodies against a different myoepithelial cell-surface antigen (Santa Cruz Biotechnology anti-β-4-integrin clone A9) and another luminal antigen (Dako BerEp-4 Epithelial Antigen).

cDNA Microarray Hybridizations.

The cDNA microarrays used in this study were constructed at the Sanger Centre as part of the Ludwig Institute for Cancer Research/Cancer Research UK Microarray Consortium, containing 9,930 sequence-validated cDNA clones representing ∼6000 unique human gene sequences (see web site for details and protocols).8 Clone annotation was based on the National Center for Biotechnology Information 34 assembly of the human genome with the Sanger Clone IDs mapping to Ensembl.9

RNA was prepared according to standard protocols (17), and preparations from nine luminal and nine myoepithelial samples (four of which were paired samples from the same patient) were individually hybridized against a common breast reference RNA in duplicated dye-swap experiments. The breast reference RNA was created by combining equal quantities of total RNA from the following breast cell lines, MDA-MB-361, MDA-MB-231, MDA-MB-435, BT20, HBL100, GI101, BT474, T47D, MCF7, SKBR3, ZR-75-1, and MDA-MB-468.

Image Processing and Data Analysis.

Fluorescent images of hybridized microarrays were captured using either the GenePix 4000 (Axon) dual color confocal laser scanner and GenePix software or a GSI Lumonics 4000 scanner and ScanArray software and quantitated and background subtracted using GSI Lumonics Quantarray 3.0 software. The log expression ratios were normalized using lowess local regression (18) using the statistical platform S-Plus version 6.1 for Windows (Insightful). All raw fluorescence intensity data and microarray image files have been deposited within the public repository for microarray based gene expression data ArrayExpress,10 complying with minimum information about a microarray experiment (MIAME) standards (19), with the accession number E-MEXP-36.
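As an illustration of the normalization step, the following is a minimal Python sketch of intensity-dependent lowess normalization of two-color log ratios. It is not the S-Plus code used in this study. The function names (`lowess_fit`, `normalize_log_ratios`), the span `frac=0.4`, and the choice to fit the log ratio M against the mean log intensity A are assumptions made for the example, and the robustness iterations of a full lowess fit are omitted.

```python
import math

def lowess_fit(x, y, frac=0.4):
    """Locally weighted linear fit of y on x, evaluated at every x (no robustness steps)."""
    n = len(x)
    k = min(n, max(3, math.ceil(frac * n)))  # neighbours used in each local fit
    fitted = []
    for x0 in x:
        # indices of the k points closest to x0 along the x axis
        nearest = sorted(range(n), key=lambda j: abs(x[j] - x0))[:k]
        dmax = abs(x[nearest[-1]] - x0) or 1.0
        # tricube weights: 1 at x0, falling to 0 at the farthest neighbour
        w = [(1 - (abs(x[j] - x0) / dmax) ** 3) ** 3 for j in nearest]
        sw = sum(w)
        xm = sum(wi * x[j] for wi, j in zip(w, nearest)) / sw
        ym = sum(wi * y[j] for wi, j in zip(w, nearest)) / sw
        sxx = sum(wi * (x[j] - xm) ** 2 for wi, j in zip(w, nearest))
        sxy = sum(wi * (x[j] - xm) * (y[j] - ym) for wi, j in zip(w, nearest))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(ym + slope * (x0 - xm))
    return fitted

def normalize_log_ratios(red, green, frac=0.4):
    """Subtract the intensity-dependent trend from log2(red/green)."""
    m = [math.log2(r / g) for r, g in zip(red, green)]        # log ratio M
    a = [0.5 * math.log2(r * g) for r, g in zip(red, green)]  # mean log intensity A
    trend = lowess_fit(a, m, frac)
    return [mi - ti for mi, ti in zip(m, trend)]
```

In a dye-swap design such as the one described above, each array of the pair would be normalized in this way before the two log ratios are combined.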

Genes were filtered from the total set of 9930 by exclusion because of low mean intensity values (<20th percentile of highest intensity across all arrays), consistent local artifact, and low mean absolute deviation (<0.3). This resulted in a filtered gene list of 1896 targets for unsupervised and supervised analysis. Any remaining missing values were imputed using the k-nearest neighbor method in Significance Analysis of Microarrays (SAM).
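The intensity and variability filters can be sketched in Python as follows. This is an illustrative reading of the criteria, not the original procedure: the helper names are hypothetical, the intensity cutoff is interpreted here as the 20th percentile of the per-gene mean intensities, and the exclusion of clones with consistent local artifacts (a manual step) is not represented.

```python
from statistics import mean

def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def mean_abs_dev(values):
    """Mean absolute deviation from the mean."""
    m = mean(values)
    return mean(abs(v - m) for v in values)

def filter_genes(intensity, log_ratio, pct=20, min_mad=0.3):
    """intensity, log_ratio: dict mapping gene -> list of per-array values.

    Keeps genes whose mean intensity is at or above the pct-th percentile
    and whose log ratios vary enough across arrays (MAD >= min_mad).
    """
    mean_int = {g: mean(v) for g, v in intensity.items()}
    cutoff = percentile(list(mean_int.values()), pct)
    return [g for g in intensity
            if mean_int[g] >= cutoff and mean_abs_dev(log_ratio[g]) >= min_mad]
```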

Unsupervised hierarchical clustering was carried out using “hclust” in S-Plus, as well as the Cluster package, and plotted with Treeview.11 Differentially expressed genes were identified by application of the SAM (version 1.12) Excel add-in.12 Supervised analysis was carried out by nearest shrunken centroid classification for class prediction using the Prediction Analysis of Microarrays package,13 implemented in R (1.6.2).14
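For orientation, the per-gene score on which SAM ranks genes can be written as a short Python sketch. This is only the relative-difference statistic. In SAM itself the constant `s0` is estimated from the data and significance is assessed by permutation to give a false discovery rate, whereas here `s0` is simply passed in (an assumption of the example). The sign convention (myoepithelial minus luminal) is chosen to match the scores listed in Table 1.

```python
import math

def sam_statistic(lum, myo, s0=0.0):
    """Relative difference d for one gene (myoepithelial minus luminal).

    lum, myo: lists of normalized log ratios for the two cell types.
    s0: small positive constant that damps scores of low-variance genes.
    """
    n1, n2 = len(lum), len(myo)
    m1, m2 = sum(lum) / n1, sum(myo) / n2
    # pooled sum of squares around each group mean
    ss = sum((v - m1) ** 2 for v in lum) + sum((v - m2) ** 2 for v in myo)
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (m2 - m1) / (s + s0)
```

A gene more highly expressed in myoepithelial cells receives a positive score and a luminal gene a negative one. Increasing `s0` shrinks all scores toward zero, most strongly for genes with small standard deviations.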

Reverse Transcription-PCR.

Ten μg total RNA was reverse transcribed from an oligo-dT primer under conventional conditions (Superscript II; Life Technologies, Inc.). The resulting reaction was diluted 10-fold in water, and 2 μl were used as a template for PCR amplification. PCR was performed under standard conditions in 50 μl for 25–40 cycles (primer sequences and cycle numbers are given in Supplementary Table S4). Products were resolved by standard agarose gel electrophoresis. Differential expression was confirmed by densitometry of ethidium bromide staining on conventional agarose gels using NIH Image software.

Immunohistochemistry.

Antibodies to differentially expressed genes were obtained commercially where available. Sections were dewaxed in xylene overnight, taken to ethanol (99.7–100% v/v), and blocked for endogenous peroxidase in methanol for 10 min. Sections were subjected to specific high temperature antigen retrieval techniques, blocked in normal horse serum (2.5%; Vector Labs) for 20 min, and primary antibodies were applied for 30 min. SPARC was subjected to 2 min of pressure cooking in citrate buffer (pH 6.0), 1/5 dilution (Novocastra); S100A2 received 2 min of pressure cooking, dilution 1/100 (Dako); maspin received 18 min of microwaving in Dako Target Retrieval Solution (pH 6.0), dilution 1/100 (Novocastra); galectin 3 (LGALS3) received 2 min of pressure cooking, dilution 1/750 (Novocastra); CLDN4 received 18 min of microwaving, dilution 1/100 (Zymed); CD24 received 3 min of pressure cooking, dilution 1/100 (Serotech). All antibodies were diluted in Tris-buffered saline. The primary antibodies were rinsed off in 0.1% Tween 20 in Tris-buffered saline, and sections were developed using Vectastain Universal ABC kit (Vector Labs) and visualized with 3,3′-diaminobenzidine (Dako).

Tissue Microarrays.

Breast tumors were selected from the archives of the Istituto di Anatomia Patologica (Sassari, Italy), with appropriate local ethical committee approval. A total of 566 unselected primary breast cancers comprising all grades and types was retrieved and reviewed by an experienced pathologist. Up to 10 years clinical follow-up data were available for all cases (3–110 months, mean = 62 months, median = 73 months). The paraffin blocks were marked and punched with 0.6-mm2 tumor cores taken from the donor blocks for inclusion in duplicate recipient tissue array blocks using a precision tissue array instrument (Beecher Instruments; Ref. 20).

Survival analysis was carried out using the statistical platform S-Plus version 6.1 for Windows (Insightful) on our right-censored clinical follow-up data from the cases on the tissue microarray, using the log-rank test and the Cox proportional hazards model.
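The two-group log-rank comparison used for the tissue microarray data can be illustrated with a minimal Python sketch. It is a generic textbook form of the test, not the S-Plus routine used here. The function name and argument layout are assumptions, and the sketch expects at least one informative event time so that the variance is nonzero.

```python
import math

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test on right-censored data.

    times_*: follow-up in months; events_*: 1 = death observed, 0 = censored.
    Returns (chi-square statistic with 1 df, two-sided P value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)] + \
           [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    var = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)               # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)  # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)         # deaths at t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p
```

For example, group A could hold the marker-positive cases and group B the marker-negative cases, with censored patients contributing to the risk sets only up to their last follow-up.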

cDNA Microarray Analysis.

The microarray data from normal breast luminal and myoepithelial cells were firstly analyzed in an unsupervised manner to determine the innate differences between the cell preparations and secondly using supervised algorithms to identify the most discriminatory genes associated with each cell type. Using the normalized data from the 1896 gene list, unsupervised hierarchical clustering on the normal breast luminal and myoepithelial preparations was carried out (Fig. 1A). The sample dendrogram clearly separates two main branches each consisting of one of the two cell types, exemplifying the inherent differences between the two epithelial cell types of the breast and the consistency of the cell separation and microarray analysis methods used. Clustering of the 1896 gene set also identifies luminal and myoepithelial-specific gene clusters, representative regions of which are shown (Fig. 1, B and C).
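The sample clustering can be sketched in Python as agglomerative clustering on a 1 minus Pearson correlation distance, the scale shown on the dendrogram in Fig. 1. The use of average linkage and the function names are assumptions of this example, since the linkage method is not stated in the text. The sketch stops once a requested number of clusters remains rather than building the full tree.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length profiles."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def cluster_samples(profiles, n_clusters=2):
    """profiles: dict mapping sample name -> list of log ratios (same gene order).

    Repeatedly merges the two closest clusters (average linkage) until
    n_clusters remain. Returns a list of sets of sample names.
    """
    names = list(profiles)
    dist = {(a, b): 1 - pearson(profiles[a], profiles[b])
            for a in names for b in names}
    clusters = [{name} for name in names]

    def linkage(c1, c2):
        # mean pairwise distance between members of the two clusters
        return sum(dist[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```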

A list of statistically significant genes which were differentially expressed between the two cell types was identified using SAM. With a false discovery rate of 1%, 132 myoepithelial and 77 luminal differential genes were found (Supplementary Table S1). Expression ratios and SAM scores for the top 50 most differentially expressed genes in each cell type are shown (Table 1). The data were also analyzed by a supervised classification method to identify those genes which are the most predictive of each cell type, using a class prediction algorithm based on the nearest shrunken centroid method (21). Using Prediction Analysis of Microarrays, a classifier was first trained using the 1896 gene set, before cross-validation and plotting of the cross-validated error curves, to determine the threshold (amount of shrinkage), which gives the minimum cross-validated error rate (Supplementary Figure S2). Applying a threshold of 2.7 gives a misclassification rate of 0 using 42 cDNA clones corresponding to 33 unique genes (Fig. 2A, Supplementary Figure S3). The cross-validated class probabilities by sample type (Fig. 2B) demonstrate that these 33 genes accurately classify all of the samples into their correct classes (cell type).
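The role of the threshold in the nearest shrunken centroid method can be illustrated with a short Python sketch of the shrinkage step alone. It follows the published description of the method in simplified form and is not the Prediction Analysis of Microarrays package itself. The function name, the use of the median pooled standard deviation as the offset `s0`, and the form of the class-size factor `m_k` are assumptions of the example, and the subsequent classification and cross-validation steps are omitted.

```python
import math
from statistics import median

def shrunken_centroids(x, labels, delta):
    """x: list of genes, each a list of per-sample values; labels: class of each sample.

    Returns, for each gene, a dict mapping class -> shrunken class centroid.
    Genes whose centroids all equal the overall mean no longer contribute
    to classification at this threshold.
    """
    classes = sorted(set(labels))
    n = len(labels)
    idx = {k: [j for j, lab in enumerate(labels) if lab == k] for k in classes}
    overall, class_mean, s = [], [], []
    for row in x:
        overall.append(sum(row) / n)
        means = {k: sum(row[j] for j in idx[k]) / len(idx[k]) for k in classes}
        class_mean.append(means)
        ss = sum((row[j] - means[k]) ** 2 for k in classes for j in idx[k])
        s.append(math.sqrt(ss / (n - len(classes))))  # pooled within-class SD
    s0 = median(s)
    result = []
    for i in range(len(x)):
        shrunk = {}
        for k in classes:
            m_k = math.sqrt(1 / len(idx[k]) + 1 / n)
            scale = m_k * (s[i] + s0)
            d = (class_mean[i][k] - overall[i]) / scale
            # soft thresholding: move d toward zero by delta, stopping at zero
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)
            shrunk[k] = overall[i] + scale * d_shrunk
        result.append(shrunk)
    return result
```

Raising `delta` removes progressively more genes from the classifier, which is why the cross-validated error is examined over a range of thresholds before one is chosen.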

Confirmation of Differential Expression.

To confirm the specific identity and differential expression of our cell type specific markers, semiquantitative reverse transcription-PCR was carried out on the four patient-matched luminal and myoepithelial samples used in our microarray analysis. Differential expression was confirmed for 56 of 62 (90%) genes by reverse transcription-PCR (primers, cycle numbers, and confirmatory data for the 66 unique genes from Table 1 are given in Supplementary Table S4).

These genes were next examined at the protein level in paraffin-embedded archival samples by immunohistochemistry, where the availability of appropriate antibodies made this possible. Differential luminal expression in normal breast lobules is shown for claudin 4 (CLDN4), CD24, and LGALS3 proteins. CLDN4 staining shows a strong membrane component of luminal epithelial cells consistent with its role in tight junction adhesion and does not stain the basement membrane of these cells (Fig. 3A). CD24 stains the cytoplasmic compartment of normal luminal epithelial cells as well as the apical cell surface (Fig. 3B). LGALS3 stains the nucleus and cytoplasm of luminal cells differentially compared with myoepithelial cells and also stains intralobular fibroblasts in the breast (Fig. 3C).

Differential myoepithelial expression in normal breast lobules is demonstrated for S100A2, SERPINB5, and SPARC proteins. S100A2 shows a strong nuclear and cytoplasmic staining specifically in the myoepithelial cells, with no expression in the stromal cells (Fig. 3E). Maspin (SERPINB5) expression is also restricted to myoepithelial cells, with strong nuclear and cytoplasmic staining (Fig. 3F). Osteonectin (SPARC) stains the cytoplasmic compartment of myoepithelial cells differentially to luminal cells and also stains inter- and intralobular fibroblasts (Fig. 3G).

Evaluation of Prognostic Significance Using Tissue Microarrays.

To evaluate whether the expression of the luminal and myoepithelial markers demonstrated any correlation with prognosis in breast cancer, immunohistochemistry was carried out on a tissue microarray consisting of 566 primary breast tumors of all types and grades for which outcome data in the form of overall survival was available. A summary of the results of univariate analysis is given in Table 2.

The luminal epithelial marker LGALS3 showed a loss of expression in approximately one-half of all assessable tumors on the tissue microarray, with 213 of 431 cases (49.4%) LGALS3 positive. This loss of expression did not correlate with prognostic outcome in all tumors (P = 0.597); however, when the subcellular localization of LGALS3 was evaluated, tumors with nuclear positivity (9 of 431, 2.1%, Fig. 3D) showed a statistically significant (P = 0.00895, log-rank test) poorer overall survival than negative cases or those in which expression was restricted to the cytoplasm (Fig. 4A). All of these nuclear positive cases were also positive for cyclin D1 (data not shown). There was no correlation between LGALS3 expression and age, grade, size, ER, progesterone receptor, or tumor type (data not shown). In multivariate analysis, loss of LGALS3 expression just failed to reach formal statistical significance as an independent prognostic factor (P = 0.051).

Loss of expression from normal luminal epithelial cells to invasive cancer was also seen for other luminal markers tested. CLDN4 was positive in 245 of 331 tumors (74.0%) and CD24 positive in 126 of 426 tumors (29.6%). Although loss of expression shows a clear association with breast cancer development, neither marker conferred any independent prognostic information, nor were they correlated with age, grade, size, ER, progesterone receptor, or tumor type (data not shown).

The myoepithelial marker osteonectin (SPARC) was found to be positive in 17 of 350 (4.9%) assessable tumor cores (Fig. 3H). When Kaplan-Meier curves for overall survival were plotted, a clearly poorer prognosis was observed for SPARC-positive tumors. This was statistically significant by the log-rank test in all tumors (P = 0.00844, Fig. 4B). By multivariate analysis (Table 3), SPARC was found to be an independent prognostic factor (P = 0.0057, Cox proportional hazards), conferring the highest relative risk of all factors fitted (6.88, 95% confidence interval 1.75–27.04), although the confidence interval is wide, given the small number of positive tumors.
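The relationship between the reported relative risk, its confidence interval, and the P value can be checked with a few lines of Python. The sketch assumes that the interval is a Wald-type interval, symmetric on the log scale, which is the usual output of a Cox model but is not stated explicitly here. The function name is hypothetical.

```python
import math

def wald_p_from_ci(rr, lower, upper, z_crit=1.959964):
    """Recover the standard error, z score, and two-sided P value of a
    log relative risk from its 95% confidence limits."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(rr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

se, z, p = wald_p_from_ci(6.88, 1.75, 27.04)
print(f"SE(log RR) = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

Under that assumption the recovered P value is close to 0.006, in line with the reported P = 0.0057, and the standard error of about 0.7 on the log scale reflects the small number of SPARC-positive tumors.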

Other myoepithelial markers tested showed a proportion of breast tumors expressing these basal proteins. S100A2 was positive in 8 of 443 cases (1.8%), whereas maspin (SERPINB5) was positive in 108 of 333 cases (32.4%). S100A2 conferred no independent prognostic information, and whereas maspin expression appeared to indicate a better overall survival, this did not reach statistical significance (P = 0.092). No association with clinicopathological variables or survival was observed with maspin expression or its subcellular localization.

Expression profiling of purified normal luminal epithelial and myoepithelial cells in the breast provides a basis for interpretation of the large amount of microarray data currently being generated for breast tumors. Identification of subclassifications of breast tumors termed basal-like and luminal-like (11), which differ in their clinical outcome (13), clearly demonstrates the need for accurate determination of the patterns of gene expression in these normal cells of the breast. Our observations of myoepithelial-specific genes such as S100A2, LGALS7, CSTA, and BPAG1 in our cell preparations, which also cluster together to define the basal-like group in these tumor studies (22), demonstrate the use of such an approach and will help to accurately classify the proposed breast cancer stratifications. Our normal luminal cell preparations are hormone receptor negative, in common with the vast majority of normal luminal epithelial cells in the breast. It is therefore not surprising that there is little overlap between our luminal epithelial profiles and those of the luminal-like tumors in these classifications (11, 13), which are almost exclusively ER positive and whose associated genes are ER-responsive. The cell-type specific expression profiles in the normal breast also provide a baseline for studies investigating breast cancer progression (23), outcome prediction (24, 25), and local (26) and distant metastasis (27).

Our data also help to clarify previous transcriptional studies using human mammary epithelial cells (HMECs) as the normal component. Such cells were derived from cultures of unsorted normal breast epithelium. As it is the myoepithelially derived cells that have the greatest proliferative potential in vitro (28), HMECs are essentially myoepithelial-like (i.e., basal) in phenotype (29). Consequently, when HMECs are compared with (luminally derived) breast cancer cells or solid tumors, the markers that emerge as differentially expressed are essentially those represented in our luminal versus myoepithelial lists, rather than specific tumor markers per se, as has been inferred (30).

Generation of a larger and more accurate panel of markers for differentiating the normal epithelial cell types will have a major impact in patient management. Routine histopathological discrimination of in situ from invasive cancer in the breast uses the retention of the myoepithelial layer as a critical diagnostic criterion, with huge implications for planning appropriate surgery. Improving the differential diagnosis from, for example, small needle core biopsies using novel myoepithelial-specific markers will make an important contribution to clinical practice.

Genes differentially expressed between luminal and myoepithelial cells were also found to confer independent prognostic information for breast cancer patients. Galectin 3 is a gene expressed in normal luminal epithelial but not myoepithelial cells. Galectin 3 is thought to regulate many biological processes and has been associated with ERBB2 expression (31). Down-regulation of galectin 3 has been implicated in breast cancer progression (32). Tissue microarray analysis showed that loss of galectin 3 expression by malignant epithelial cells was seen in approximately one-half of all tumors assessed. Of particular interest is the correlation of nuclear LGALS3 expression and poor outcome. Nuclear galectin 3 has been reported to have a growth promoting activity through cyclin D1 induction (33), and all of these (9 of 431) tumors were also positive for cyclin D1. This observation demonstrates the ability of our approach to identify luminal epithelial-specific markers in the normal breast and use them to monitor breast cancer progression and establish their relationship with patient prognosis.

SPARC (osteonectin) modulates cellular interaction with the extracellular matrix by its binding to structural matrix proteins such as collagen and vitronectin. It was found by our cDNA microarray analysis to be differentially expressed between myoepithelial and luminal cells. Up-regulation of SPARC has been associated with increased invasive potential in breast cancer cells in vitro (34), and SPARC has been identified as a breast tumor marker by serial analysis of gene expression (35). SPARC was found to be positive in malignant epithelial cells in 4.9% of all breast tumors examined in our tissue microarray study. SPARC-positive tumors showed a highly significant poorer overall survival in breast cancer patients by univariate analysis (P = 0.00844), and SPARC was an independent prognostic indicator by multivariate analysis (P = 0.0057). Aberrant expression of myoepithelial proteins has been recognized in breast tumors for some time, and data demonstrating poor prognosis of these tumors have largely been associated with ER- and lymph node-negative tumors (13, 14). Here, we present the identification of a protein expressed in the myoepithelial cells but not the luminal cells of the normal breast and whose expression in breast tumors confers a very poor clinical outcome, regardless of ER or lymph node status.

Expanding the list of normal cell-type specific genes not only provides novel diagnostic and prognostic markers but will also assist in understanding the multistep progression from epithelial cells to invasive cancer in the breast. Recent studies of breast cancer stem cell phenotypes (36) have identified CD24−/CD44+ cells as having tumorigenic potential, these markers being associated respectively with normal luminal and normal myoepithelial cells. At the other end of the progression spectrum, among 17 reporter genes associated with metastasis (27) were up-regulation of COL1A1 and down-regulation of MYLK, two of the most discriminant myoepithelial genes in the present study. Collectively, these observations highlight the importance of the normal breast cell expression profiles in understanding breast cancer. They also provide a unique dataset in which targets for future therapeutic intervention may be identified.

Grant support: The microarray consortium is funded by the Wellcome Trust, Cancer Research UK and the Ludwig Institute of Cancer Research. J. Reis-Filho is the recipient of the Gordon Signy International Fellowship Award and is partially supported by Ph.D. Grant SFRH/BD/5386/2001 from the Fundação para a Ciência e a Tecnologia, Portugal, and Programa Operacional Ciência, Tecnologia e Inovação POCTI/CBO/45157/2002. A. Cossu, M. Budroni, and G. Palmieri are partially funded by Regione Autonoma della Sardegna.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Note: C. Jones and A. Mackay contributed equally to this work. The human I.M.A.G.E. cDNA clone collection was obtained from the Medical Research Council Human Genome Mapping Project Resource Centre (Hinxton, United Kingdom). All cDNA clone resequencing was performed by Team 56 at the Sanger Institute.

Requests for reprints: Sunil R. Lakhani, The Breakthrough Toby Robins Breast Cancer Research Centre Institute of Cancer Research, Fulham Road, London SW3 6JB, United Kingdom. Phone: 020-7153-5525; Fax: 020-7153-5533; E-mail: [email protected]

8 Internet address: http://www.sanger.ac.uk/Projects/Microarrays/.

9 Internet address: http://www.ensembl.org.

10 Internet address: http://www.ebi.ac.uk/arrayexpress.

11 Internet address: http://rana.lbl.gov/EisenSoftware.htm.

12 Internet address: http://www-stat.stanford.edu/~tibs/SAM/.

13 Internet address: http://www-stat.stanford.edu/~tibs/PAM/.

14 Internet address: http://www.r-project.org.

Fig. 1.

Unsupervised hierarchical clustering of luminal and myoepithelial preparations. A, full heatplot of 1896 gene list, ordered by clustering of the samples separating into two cell-type specific arms. The scale on the right of the dendrogram shows 1 minus correlation. B, magnified view showing a representative luminal gene cluster. C, magnified view showing a representative myoepithelial gene cluster.

Fig. 2.

Supervised analysis using prediction analysis of microarrays (PAM). A, centroid plot showing a ranked list of the 42 most predictive clones, corresponding to 33 individual genes. The length of the horizontal bar for a given gene is equivalent to the difference between the overall centroid and the class-specific centroid. B, cross-validated probability plot with a threshold of 2.7, showing 0 misclassification error with the expression profiles of the 42 predictor clones.

Fig. 3.

Immunohistochemistry of antibodies raised against luminal and myoepithelial specific proteins. A, claudin 4, normal breast lobule (×40) showing luminal membrane staining. B, CD24, normal breast lobule (×20) showing luminal cytoplasmic and apical cell surface staining. C, LGALS3, normal breast lobule (×20) showing nuclear and cytoplasmic staining in luminal epithelial cells and also intralobular fibroblasts. D, LGALS3, invasive ductal carcinoma on tissue microarray (×10) showing nuclear positivity. E, S100A2, normal breast lobule (×20) showing myoepithelial nuclear and cytoplasmic staining. F, maspin, normal breast lobule (×20) showing myoepithelial nuclear and cytoplasmic staining. G, SPARC, normal breast lobule (×20) showing myoepithelial cytoplasmic staining as well as positivity in inter- and intralobular fibroblasts. H, SPARC, invasive ductal carcinoma on tissue microarray (×10) showing positive cytoplasmic staining in the tumor sample, as well as a positive stromal reaction.

Fig. 4.

Kaplan-Meier survival curves (months) from tissue microarray analysis. A, LGALS3 nuclear expression conferring a poor prognosis in all tumors compared with negative and cytoplasmic positive patients (P = 0.00895). B, SPARC positivity is associated with significantly shorter overall survival in all tumors (P = 0.00844).

Table 1

List of top 50 luminal-specific and top 50 myoepithelial-specific genes as determined by Significance Analysis of Microarrays (SAM)

Genes are ranked in order of fold change (myoepithelial over luminal) and are listed with their Sanger Institute Hver1.2.1 clone ID8 and Ensembl accession number.9 Genes highlighted in italics formed part of the discriminator genes identified by Prediction Analysis of Microarrays.

Sanger ID  Ensembl  Gene ID  SAM score  Fold difference
Luminal genes     
 741497_A ENSG00000148346 LCN2 −3.64 0.18 
 741497_C ENSG00000148346 LCN2 −3.80 0.18 
 111213_B ENSG00000167755 KLK6 −2.83 0.20 
 357842_A ENSG00000175315 CST6 −2.98 0.21 
 376599_A ENSG00000175315 CST6 −2.71 0.23 
 204335_A ENSESTG00000020862 CD24 −4.71 0.26 
 341021_A ENSG00000008517 NK4 −3.00 0.26 
 724533_B ENSG00000101443 WFDC2 −2.81 0.28 
 153508_A ENSG00000186996 CLDN4 −3.74 0.29 
 809923_A ENSESTG00000024749 TNFAIP2 −3.12 0.30 
 346130_B ENSG00000186996 CLDN4 −3.81 0.31 
 346510_A ENSG00000186996 CLDN4 −4.07 0.31 
 25433_A ENSG00000143153 ATP1B1 −5.66 0.32 
 1257299_A ENSG00000163975 MFI2 −3.21 0.35 
 767629_A ENSESTG00000006616 RARRES1 −2.90 0.36 
 137018_A ENSG00000012171 SEMA3B −4.07 0.37 
 180786_A ENSG00000006210 CX3CL1 −3.08 0.38 
 183573_A ENSG00000006210 CX3CL1 −2.71 0.40 
 153925_B ENSG00000052344 PRSS8 −4.82 0.40 
 382660_A ENSESTG00000003790 KIAA1641 −3.71 0.41 
 201516_B ENSG00000184930 MTND4 −3.32 0.41 
 182635_A ENSG00000070404 FSTL3 −2.54 0.42 
 151761_A ENSG00000185499 MUC1 −4.30 0.43 
 165830_B ENSG00000006210 CX3CL1 −2.63 0.43 
 357613_B ENSG00000184930 MTND4 −3.87 0.44 
 308173_A ENSG00000184689 MTND6 −2.73 0.45 
 233818_A ENSG00000183503 MTCO2 −3.94 0.45 
 809822_A ENSG00000129353 CTL2 −4.79 0.46 
 324225_B ENSG00000133321 RARRES3 −3.72 0.47 
 34461_A ENSG00000131981 LGALS3 −4.39 0.47 
 149218_A ENSG00000184930 MTND4 −2.73 0.48 
 41288_A ENSG00000185215 TNFAIP2 −3.27 0.48 
 293168_A ENSG00000184689 MTND6 −2.88 0.49 
 24593_B ENSG00000129353 CTL2 −3.32 0.49 
 156398_A  unknown −2.83 0.49 
 365623_A ENSG00000184930 MTND4 −3.47 0.50 
 782280_B  HDCRA −3.04 0.50 
 320142_A ENSG00000184316 MTATP6 −2.66 0.50 
 163072_A ENSG00000103335 KIAA0233 −2.80 0.50 
 167150_A ENSG00000169246 KIAA0220 −2.83 0.50 
 302373_C ENSG00000143153 ATP1B1 −3.36 0.51 
 1659533_B ENSG00000124159 MATN4 −3.61 0.51 
 307769_A ENSG00000122034 GTF3A −3.27 0.52 
 244297_A ENSG00000141934 PPAP2C −3.88 0.52 
 262390_A ENSG00000182240 BACE2 −4.40 0.52 
 772402_A ENSG00000099860 GADD45B −2.55 0.53 
 294203_A ENSG00000109062 SLC9A3R1 −3.06 0.53 
 796298_B ENSG00000110721 CHK −3.54 0.54 
 49200_A ENSG00000074416 MGLL −2.84 0.55 
 123164_A ENSG00000065361 ERBB3 −2.83 0.57 
Myoepithelial genes     
 298509_A ENSG00000108821 COL1A1 3.54 9.23 
 214997_A ENSG00000108821 COL1A1 3.58 9.22 
 323321_A ENSESTG00000026432 COL1A1 3.26 8.29 
 341752_A ENSG00000178939 LGALS7 4.17 7.97 
 810813_B ENSG00000160675 S100A2 3.66 7.48 
 300737_A ENSG00000065534 MYLK 2.50 7.11 
 188036_C ENSG00000151914 BPAG1 2.94 6.85 
 188036_A ENSG00000151914 BPAG1 3.39 6.38 
 264525_A ENSG00000065534 MYLK 3.24 5.92 
 249977_A ENSG00000100234 TIMP3 3.18 5.59 
 270187_A ENSG00000102265 TIMP1 3.57 5.04 
 310019_A ENSG00000065534 MYLK 2.40 4.98 
 stSG89269 ENSG00000100234 TIMP3 2.78 4.79 
 266325_A ENSG00000102265 TIMP1 3.42 4.79 
 327165_A ENSG00000166628 SERPINB5 3.14 4.72 
 263278_A ENSG00000113140 SPARC 3.89 4.66 
 1404774_A ENSG00000087494 PTHLH 2.42 4.55 
 346130_A ENSG00000169474 SPRR1B 2.73 4.45 
 51003_A ENSESTG00000011065 DKK3 2.88 4.11 
 359747_A ENSG00000178939 LGALS7 3.01 4.09 
 813614_C ENSG00000169474 SPRR1B 2.91 3.91 
 141815_A ENSG00000101384 JAG1 3.20 3.89 
 302294_A ENSG00000166033 PRSS11 3.10 3.65 
 346610_A ENSG00000175793 SFN 4.86 3.56 
 324700_A ENSG00000149968 MMP3 2.50 3.34 
 38967_A ENSG00000166033 PRSS11 3.39 3.30 
 810017_A ENSG00000104368 PLAT 3.12 3.26 
 111081_A ENSG00000169688 MT1F 3.08 3.21 
 809810_A ENSG00000137699 TRIM29 2.68 3.19 
1840568_A ENSG00000184330 S100A7 2.75 3.14 
347284_A ENSG00000105974 CAV1 3.33 3.13 
788192_A ENSG00000087494 PTHLH 3.24 3.09 
195273_A ENSG00000139219 COL1A1 2.55 3.02 
293270_A ENSG00000107987 AKR1C2 3.21 2.94 
728114_A ENSG00000166899 FABP5 2.46 2.93 
1568010_A ENSG00000109321 AREG 2.85 2.90 
137665_A ENSG00000073282 TP73L 2.89 2.82 
364409_A ENSG00000121552 CSTA 2.54 2.80 
22117_A ENSG00000136699 FLJ20297 3.75 2.77 
813614_A ENSG00000169474 SPRR1B 2.61 2.69 
297392_A ENSG00000187193 MT1L 3.79 2.61 
127982_A ENSG00000104368 PLAT 2.84 2.58 
148057_A ENSG00000117318 ID3 3.74 2.56 
66946_A ENSG00000169688 MT1B 3.13 2.54 
26285_A ENSG00000091409 ITGA6 3.30 2.50 
267759_A ENSG00000105974 CAV1 2.53 2.49 
149370_A ENSG00000109861 CTSC 4.47 2.45 
274164_A ENSG00000187193 MT1F 4.07 2.44 
292784_A ENSG00000109861 CTSC 3.69 2.43 
418137_A ENSG00000105281 SLC1A5 5.24 2.42 
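As a reading aid for Table 1, the fold differences can be placed on a log2 scale so that luminal-enriched values (fold difference below 1) and myoepithelial-enriched values (fold difference above 1) are comparable in magnitude. The sketch below assumes, from the direction of the values in the table, that the fold difference is the ratio of myoepithelial to luminal expression; the function name `log2_fold` is illustrative and not part of the original analysis.

```python
import math


def log2_fold(fold_difference):
    """Convert a fold difference (assumed myoepithelial/luminal ratio) to log2 scale."""
    return math.log2(fold_difference)


# Fold differences copied from Table 1.
examples = {"LGALS3": 0.47, "CLDN4": 0.29, "SPARC": 4.66, "COL1A1": 9.23}

for gene, fold in examples.items():
    value = log2_fold(fold)
    enriched = "myoepithelial" if value > 0 else "luminal"
    print(f"{gene}: fold {fold:.2f}, log2 {value:+.2f} ({enriched}-enriched)")
```

On this scale a negative value indicates higher expression in luminal cells and a positive value higher expression in myoepithelial cells, matching the sign of the Statistical Analysis of Microarray score in each row.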
Table 2

Univariate analysis of clinicopathological and immunohistochemical data on the 566-tumor tissue microarray

Ps are calculated by the log-rank test.

Factor No. cases Mean survival, death from breast cancer (mo) SE P
Size of tumor    <0.0001 
 T1 178 112.2 3.04  
 T2 170 97.5 3.86  
 T3 21 76.9 4.92  
 T4 38 50.2 5.79  
Nodal status    <0.0001 
 Positive 208 92.8 3.28  
 Negative 262 115.2 2.29  
Stage    <0.0001 
 I 124 117.9 3.05  
 II 197 102.7 3.4  
 III 50 66.4 4.85  
 IV 34 46.1 6.59  
Grade    0.00224 
 1 103 108.1 4.78  
 2 118 93.2 5.45  
 3 39 81.7 7.8  
Estrogen receptor    0.796 
 Positive 242 98.9 3.08  
 Negative 205 98.7 3.28  
Progesterone receptor    0.561 
 Positive 255 101.1 2.94  
 Negative 215 98.3 3.16  
LGALS3    0.597 
 Positive 213 96.1 3.2  
 Negative 218 99.7 3.14  
CD24    0.827 
 Positive 126 99.3 4.1  
 Negative 300 99.6 2.75  
CLDN4    0.707 
 Positive 245 101.7 2.95  
 Negative 86 99.4 4.86  
SPARC    0.00844 
 Positive 17 67.9 11.1  
 Negative 333 100.9 2.53  
S100A2    0.209 
 Positive 114.7 10.84  
 Negative 435 97.8 2.28  
MASPIN    0.092 
 Positive 108 103.3 4.34  
 Negative 225 96.3 3.12  
Table 3

Multivariate analysis of the tissue microarray cohort using the Cox proportional hazards model

Only those statistically significant independent prognostic factors as determined by the model are shown.

Factor Hazard ratio for death from breast cancer (95% confidence interval) P (Cox)
Age 1.05 (1.02–1.08) 0.0016 
Stage 1.83 (1.24–2.72) 0.0026 
Grade 2.53 (1.146–4.40) 0.001 
SPARC positive 6.88 (1.75–27.04) 0.0057 
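
The hazard ratios in Table 3 can be related back to the underlying Cox coefficients. A minimal sketch, assuming the reported interval is a symmetric Wald interval on the log-hazard scale and that the P is a two-sided Wald test (neither is stated in the table): the coefficient is the log of the hazard ratio, its standard error follows from the width of the interval, and the ratio of the two gives a z statistic. The function name `cox_summary` is hypothetical.

```python
import math


def cox_summary(hazard_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the log-hazard coefficient, its standard error, the Wald z,
    and a two-sided Wald P from a hazard ratio and its 95% confidence interval."""
    beta = math.log(hazard_ratio)
    # A Wald interval is exp(beta +/- z_crit * se), so its log-width is 2 * z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


# SPARC positive row of Table 3: hazard ratio 6.88 (1.75-27.04).
beta, se, z, p = cox_summary(6.88, 1.75, 27.04)
print(f"beta={beta:.3f} SE={se:.3f} z={z:.2f} Wald P={p:.4f}")
```

Under these assumptions the recovered P for SPARC lies close to the tabulated 0.0057, which suggests that the interval and P in that row are mutually consistent.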

We thank the staff of the Sanger Institute Microarray Facility for the supply of arrays, lab protocols, and technical advice (David Vetrie, Cordelia Langford, Adam Whittaker, Neil Sutton) and for the Quantarray/GeneSpring data files, data analysis, and databases relating to elements on the arrays (Kate Rice, Rob Andrews, Adam Butler, Harish Chudasama). We also thank Professor Alan Ashworth for continued support and helpful discussions pertaining to this project.

1. Taylor-Papadimitriou J, Lane EB. Keratin expression in the mammary gland. In: Neville MC, Daniel CW, eds. The mammary gland, development, regulation and function, p. 181-215, Plenum Press, New York, 1987.
2. Deng G, Lu Y, Zlotnikov G, Thor AD, Smith HS. Loss of heterozygosity in normal tissue adjacent to breast carcinomas. Science (Wash. DC), 274(5295): 2057-9, 1996.
3. Lakhani SR, Chaggar R, Davies S, et al. Genetic alterations in ’normal’ luminal and myoepithelial cells of the breast. J Pathol, 189(4): 496-503, 1999.
4. Forsti A, Louhelainen J, Soderberg M, Wijkstrom H, Hemminki K. Loss of heterozygosity in tumour-adjacent normal tissue of breast and bladder cancer. Eur J Cancer, 37(11): 1372-80, 2001.
5. Gusterson BA, Warburton MJ, Mitchell D, Ellison M, Neville AM, Rudland PS. Distribution of myoepithelial cells and basement membrane proteins in the normal breast and in benign and malignant breast diseases. Cancer Res, 42(11): 4763-70, 1982.
6. Dairkee SH, Puett L, Hackett AJ. Expression of basal and luminal epithelium-specific keratins in normal, benign, and malignant breast tissue. J Natl Cancer Inst (Bethesda), 80(9): 691-5, 1988.
7. Gould VE, Koukoulis GK, Jansson DS, Nagle RB, Franke WW, Moll R. Coexpression patterns of vimentin and glial filament protein with cytokeratins in the normal, hyperplastic, and neoplastic breast. Am J Pathol, 137(5): 1143-55, 1990.
8. Wetzels RH, Kuijpers HJ, Lane EB, et al. Basal cell-specific and hyperproliferation-related keratins in human breast cancer. Am J Pathol, 138(3): 751-63, 1991.
9. Tsuda H, Takarabe T, Hasegawa T, Murata T, Hirohashi S. Myoepithelial differentiation in high-grade invasive ductal carcinomas with large central acellular zones. Hum Pathol, 30(10): 1134-9, 1999.
10. Jones C, Nonni AV, Fulford L, et al. CGH analysis of ductal carcinoma of the breast with basaloid/myoepithelial cell differentiation. Br J Cancer, 85(3): 422-7, 2001.
11. Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature (Lond.), 406(6797): 747-52, 2000.
12. Tsuda H, Takarabe T, Hasegawa F, Fukutomi T, Hirohashi S. Large, central acellular zones indicating myoepithelial tumor differentiation in high-grade invasive ductal carcinomas as markers of predisposition to lung and brain metastases. Am J Surg Pathol, 24(2): 197-202, 2000.
13. Sorlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA, 98(19): 10869-74, 2001.
14. van de Rijn M, Perou CM, Tibshirani R, et al. Expression of cytokeratins 17 and 5 identifies a group of breast carcinomas with poor clinical outcome. Am J Pathol, 161(6): 1991-6, 2002.
15. Page MJ, Amess B, Townsend RR, et al. Proteomic definition of normal human luminal and myoepithelial breast cells purified from reduction mammoplasties. Proc Natl Acad Sci USA, 96(22): 12589-94, 1999.
16. Clarke C, Titley J, Davies S, O’Hare MJ. An immunomagnetic separation method using superparamagnetic (MACS) beads for large-scale purification of human mammary luminal and myoepithelial cells. Epithelial Cell Biol, 3(1): 38-46, 1994.
17. Chomczynski P, Sacchi N. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem, 162: 156-9, 1987.
18. Yang YH, Dudoit S, Luu P, et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res, 30(4): e15, 2002.
19. Brazma A, Hingamp P, Quackenbush J, et al. Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet, 29(4): 365-71, 2001.
20. Kononen J, Bubendorf L, Kallioniemi A, et al. Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nat Med, 4(7): 844-7, 1998.
21. Tibshirani R, Hastie T, Narasimhan B, Chu G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci USA, 99(10): 6567-72, 2002.
22. Chung CH, Bernard PS, Perou CM. Molecular portraits and the family tree of cancer. Nat Genet, 32 Suppl: 533-40, 2002.
23. Ma XJ, Salunga R, Tuggle JT, et al. Gene expression profiles of human breast cancer progression. Proc Natl Acad Sci USA, 100(10): 5974-9, 2003.
24. van ’t Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature (Lond.), 415(6871): 530-6, 2002.
25. van de Vijver MJ, He YD, van’t Veer LJ, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med, 347(25): 1999-2009, 2002.
26. Huang E, Cheng SH, Dressman H, et al. Gene expression predictors of breast cancer outcomes. Lancet, 361(9369): 1590-6, 2003.
27. Ramaswamy S, Ross KN, Lander ES, Golub TR. A molecular signature of metastasis in primary solid tumors. Nat Genet, 33(1): 49-54, 2003.
28. O’Hare MJ, Ormerod MG, Monaghan P, Lane EB, Gusterson BA. Characterization in vitro of luminal and myoepithelial cells isolated from the human mammary gland by cell sorting. Differentiation, 46(3): 209-21, 1991.
29. DiRenzo J, Signoretti S, Nakamura N, et al. Growth factor requirements and basal phenotype of an immortalized mammary epithelial cell line. Cancer Res, 62(1): 89-98, 2002.
30. Nacht M, Ferguson AT, Zhang W, et al. Combining serial analysis of gene expression and array technologies to identify genes differentially expressed in breast cancer. Cancer Res, 59(21): 5464-70, 1999.
31. Mackay A, Jones C, Dexter T, et al. cDNA microarray analysis of genes associated with ERBB2 (HER2/neu) overexpression in human mammary luminal epithelial cells. Oncogene, 22(17): 2680-8, 2003.
32. Castronovo V, Van Den Brule FA, Jackers P, et al. Decreased expression of galectin-3 is associated with progression of human breast cancer. J Pathol, 179(1): 43-8, 1996.
33. Lin H-M, Pestell RG, Raz A, Kim H-RC. Galectin-3 enhances cyclin D1 promoter activity through SP1 and a cAMP-responsive element in human breast epithelial cells. Oncogene, 21: 8001-10, 2002.
34. Briggs J, Chamboredon S, Castellazzi M, Kerry JA, Bos TJ. Transcriptional up-regulation of SPARC, in response to c-Jun overexpression, contributes to increased motility and invasion of MCF7 breast cancer cells. Oncogene, 21(46): 7077-91, 2002.
35. Porter DA, Krop IE, Nasser S, et al. A SAGE (serial analysis of gene expression) view of breast tumor progression. Cancer Res, 61(15): 5697-702, 2001.
36. Al-Hajj M, Wicha MS, Benito-Hernandez A, Morrison SJ, Clarke MF. Prospective identification of tumorigenic breast cancer cells. Proc Natl Acad Sci USA, 100(7): 3983-8, 2003.