Abstract
To identify molecular alterations involved in the initiation and progression of breast carcinomas, we analyzed the global gene expression profiles of normal mammary epithelial cells and in situ, invasive, and metastatic breast carcinomas using serial analysis of gene expression (SAGE). We identified sets of genes expressed only or most abundantly in a specific stage of breast tumorigenesis or in a certain subtype of tumors through the pair-wise comparison and by hierarchical clustering analysis of these eight SAGE libraries (two/stage). On the basis of these comparisons, we made the following observations: Normal mammary epithelial cells showed the most distinct and least variable gene expression profiles. Many of the genes highly expressed in normal mammary epithelium and lost in carcinomas encoded secreted proteins, cytokines, and chemokines, implicating abnormal paracrine and autocrine signaling in the initiation of breast tumorigenesis. Very few genes were universally up-regulated in all tumors regardless of their stage and histological grade, indicating a high degree of diversity at the molecular level that likely reflects the clinical heterogeneity characteristic of breast carcinomas. Tumors of different histology type and stage had very distinct gene expression patterns. No genes seemed to be specific for metastatic or for in situ carcinomas. We found that the most dramatic and consistent phenotypic change occurred at the normal-to-in situ carcinoma transition. This observation, combined with the fact that many of the genes involved encode secreted, cell-nonautonomous factors, implies that the normal epithelium-to-in situ carcinoma transition may be the most promising target for cancer prevention and treatment.
Introduction
Breast carcinoma is the second leading cause of cancer-related death in women of the Western world. In the United States alone, over 180,000 new cases are diagnosed annually (1). Despite improvements in cancer therapy, about one-quarter of the patients diagnosed with invasive breast carcinoma will eventually die from their disease. Thus, the identification of genes and biochemical pathways involved in breast oncogenesis is of utmost importance for the development of rational, molecularly based preventative and therapeutic approaches. Similar to other cancer types, breast tumorigenesis is a multistep process starting with benign then atypical hyperproliferation, progressing into in situ then invasive carcinomas, and culminating in metastatic disease (2). This tumor progression is driven by somatic genetic changes and is reflected in phenotypic changes such as altered gene expression profiles. Therefore, comprehensive analysis of gene expression profiles may identify key alterations that are important for the acquisition and maintenance of the cancerous phenotype and thus represent ideal targets for cancer prevention and treatment.
To identify such alterations, we generated SAGE libraries from normal mammary epithelial cells and in situ, invasive, and metastatic carcinomas. SAGE analyzes 14-bp tags derived from a defined position of the cDNAs without a priori knowledge of the sequence of the genes expressed (3). The SAGE tag numbers directly reflect the abundance of the mRNAs; therefore, SAGE data are highly accurate and quantitative. Although several previous studies analyzed the gene expression profiles of breast carcinomas, no study used an unbiased comprehensive gene expression profiling approach and highly purified, uncultured, patient-derived tissues as starting material (4, 5, 6). Therefore, we believe that our study is a faithful description of the in vivo gene expression profiles of normal and cancerous mammary epithelial cells.
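As an illustration of what a 14-bp tag "derived from a defined position" can mean, the following minimal sketch extracts a tag from a cDNA sequence. It assumes the commonly used convention in which the tag begins at the 3'-most CATG site (the NlaIII recognition sequence) and spans that site plus the next 10 bases. The article does not state which anchoring enzyme was used, so `ANCHOR`, `TAG_LENGTH`, and the function name `extract_sage_tag` are illustrative assumptions, and the example sequence is hypothetical.

```python
from collections import Counter

ANCHOR = "CATG"   # assumed anchoring-enzyme site (NlaIII); not stated in the article
TAG_LENGTH = 14   # anchor site (4 bp) plus 10 downstream bases


def extract_sage_tag(cdna):
    """Return the 14-bp tag starting at the 3'-most anchor site, or None."""
    seq = cdna.upper()
    pos = seq.rfind(ANCHOR)  # 3'-most occurrence of the anchor site
    if pos == -1 or pos + TAG_LENGTH > len(seq):
        return None          # no site, or too close to the 3' end for a full tag
    return seq[pos:pos + TAG_LENGTH]


def count_tags(cdnas):
    """Tally tags over a collection of cDNA sequences (one count per transcript copy)."""
    tags = (extract_sage_tag(s) for s in cdnas)
    return Counter(t for t in tags if t is not None)


# Hypothetical transcript with two CATG sites; only the 3'-most one defines the tag.
example = "GGACATGTTTACATGCTTCCTGTGAAAAAA"
print(extract_sage_tag(example))
print(count_tags([example, example, "AAAA"]))
```

Because each transcript contributes one tag from a fixed position, the tally returned by `count_tags` is proportional to transcript abundance, which is the property the text relies on when it describes SAGE data as quantitative.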
Materials and Methods
Normal and Cancerous Breast Tissue.
All human tissue was collected following NIH guidelines and using protocols approved by the Institutional Review Boards. One of the normal mammary tissues (N1) was obtained from reduction mammoplasty tissue of a healthy patient 43 years of age undergoing plastic surgery at Brigham and Women’s Hospital. One of the DCIS samples (DCIS1) was obtained from a patient 67 years of age undergoing mastectomy for extensive, high-grade comedo DCIS and IDC in the same breast at Brigham and Women’s Hospital. The other normal and DCIS libraries (N2 and DCIS2) were obtained from a patient 35 years of age undergoing mastectomy of the left breast for intermediate-grade extensive DCIS and IDC (source of DCIS tissue) and undergoing prophylactic mastectomy of the right breast (source of normal tissue) at Massachusetts General Hospital. In both cases, cells for SAGE library generation were derived from extensive DCIS, and histology was confirmed by microscopic examination of H&E-stained sections adjacent to the parts used for SAGE. Both normal and the DCIS2 libraries were generated from cells purified using anti-BerEP4-coupled magnetic beads (Epithelial Enrich, Dynal). DCIS1 was generated from macroscopically dissected 20-μm frozen sections. The two cases of paired invasive and metastatic breast carcinomas were obtained from patients 53 years of age (IDC2 and MET2) and 57 years of age (IDC1 and MET1) who were operated on at Duke University Medical Center. Both cases were of high nuclear grade, and one (IDC1 and MET1) was estrogen and progesterone receptor-negative, whereas the other one (IDC2 and MET2) was estrogen and progesterone receptor-positive. SAGE libraries were generated from macroscopically dissected 20-μm frozen sections.
Generation and Analysis of SAGE Libraries.
For immunomagnetic purification, minced breast tissue was digested in DMEM/F12 medium (Life Technologies, Inc., Rockville, MD) supplemented with 1% fetal bovine serum, 2 mg/ml collagenase I (C0130; Sigma Chemical Co.), and 2 mg/ml hyaluronidase (H3506; Sigma Chemical Co.) at 37°C for 2 h. Cells were collected by centrifugation; trypsinized; and resuspended in PBS, 1% BSA, and 2 mM EDTA and purified using the Epithelial Enrich kit (Dynal, Oslo, Norway) following the manufacturer’s recommendations. In the case of libraries derived from frozen sections, mRNA was prepared from OCT-embedded tissue using the μMACS kit (Miltenyi Biotec). SAGE libraries were generated following a modified micro-SAGE protocol, including a 1% SDS washing/heating step after each enzymatic reaction to ensure complete inactivation of the enzymes (7). As part of the Cancer Gene Anatomy Project SAGE consortium, SAGE libraries were arrayed at the Lawrence Livermore National Laboratories and sequenced at the Washington University Human Genome Center or at the National Institute Sequencing Center (NIH, Bethesda, MD). The data have been posted on the Cancer Gene Anatomy Project website as part of the SAGEmap database (8, 9). Approximately 50,000 SAGE tags were obtained from each library. The exact numbers were: N1 (49,351), N2 (38,371), D1 (42,556), D2 (29,768), IDC1 (39,996), IDC2 (67,703), MET1 (45,539), and MET2 (60,975). SAGE libraries were analyzed using the SAGE 2000 software (derivation of tags, library comparisons, and Monte Carlo analysis for statistical significance); the data were subsequently transferred into Microsoft Access (link to Unigene database) and Excel (to sort data and convert them into tab-delimited files). For comparisons, only SAGE tags that occurred at least twice/library/50,000 tags were included. To compensate for the unequal tag numbers in the eight different libraries, tag numbers were normalized to 50,000 tags/library before comparisons.
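The normalization and abundance filter described above amount to simple arithmetic, sketched below. The library totals are the ones reported in the text; the function names, the example counts, and the reading of "at least twice/library/50,000 tags" as "at least 2 normalized tags in at least one library" are assumptions made for illustration, not a description of the SAGE 2000 software.

```python
# Raw tag totals per library, as reported in the text.
LIBRARY_TOTALS = {
    "N1": 49351, "N2": 38371, "D1": 42556, "D2": 29768,
    "IDC1": 39996, "IDC2": 67703, "MET1": 45539, "MET2": 60975,
}
TARGET = 50_000  # tags per library after normalization


def normalize(raw_count, library):
    """Scale a raw tag count to its equivalent in a 50,000-tag library."""
    return raw_count * TARGET / LIBRARY_TOTALS[library]


def passes_abundance_filter(raw_counts, min_per_target=2):
    """True if the tag reaches `min_per_target` normalized counts in any library.

    Assumption: this is one plausible reading of the inclusion rule in the text.
    """
    return any(normalize(c, lib) >= min_per_target for lib, c in raw_counts.items())


# Hypothetical raw counts for a single tag (not taken from the article).
example = {"N1": 40, "N2": 31, "D1": 1, "D2": 0}
print({lib: round(normalize(c, lib), 1) for lib, c in example.items()})
print(passes_abundance_filter(example))
```

Scaling every library to the same total means that a count of 31 in the smaller N2 library and a count of 40 in the larger N1 library are compared on equal footing.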
Hierarchical clustering was applied to the data using the Cluster program developed by Eisen et al. (10). Data were log-transformed and filtered for at least one observation with abs(Val) ≥ 5 and MaxVal − MinVal > 2. Using these settings, 3,987 genes (of 16,808 total) were included in the analysis. Results in Fig. 1 were displayed with the TreeView program (10).
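To make the filtering and clustering steps concrete, here is a minimal pure-Python sketch that clusters the eight libraries using a handful of rows from Table 1. It is not the Cluster program: the choice of log2(count + 1), a distance of 1 minus the Pearson correlation, average linkage, and applying the filter to untransformed counts are all assumptions made for this example, and the function names are hypothetical.

```python
import math
from itertools import combinations

LIBS = ["N1", "N2", "DCIS1", "DCIS2", "IDC1", "IDC2", "MET1", "MET2"]
# A few rows from Table 1 (tag counts in LIBS order).
COUNTS = {
    "GRO-alpha": [219, 252, 5, 6, 0, 0, 0, 0],
    "IL-8": [205, 196, 4, 21, 1, 0, 0, 0],
    "SOD2": [172, 99, 0, 1, 1, 0, 5, 1],
    "Trefoil factor 3": [19, 3, 285, 477, 206, 69, 159, 136],
    "Desmoplakin": [4, 9, 50, 30, 7, 4, 6, 0],
    "GAPDH": [27, 24, 16, 21, 87, 73, 127, 92],
    "IBC-1": [0, 0, 0, 0, 98, 56, 110, 0],
}


def keep_gene(counts, min_abs=5, min_range=2):
    """Filter: at least one value >= min_abs and (max - min) > min_range."""
    return max(abs(c) for c in counts) >= min_abs and max(counts) - min(counts) > min_range


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_libraries(counts_by_gene, libs):
    """Average-linkage clustering of libraries; returns the list of merges in order."""
    genes = [g for g, c in counts_by_gene.items() if keep_gene(c)]
    profile = {lib: [math.log2(counts_by_gene[g][i] + 1) for g in genes]
               for i, lib in enumerate(libs)}
    dist = {frozenset(p): 1 - pearson(profile[p[0]], profile[p[1]])
            for p in combinations(libs, 2)}

    def avg(a, b):
        # Mean pairwise distance between the members of clusters a and b.
        return sum(dist[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

    clusters = [(lib,) for lib in libs]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda ab: avg(*ab))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b))
    return merges


for left, right in cluster_libraries(COUNTS, LIBS):
    print("+".join(left), "<->", "+".join(right))
```

Each printed line is one join of the dendrogram, from the most similar pair of libraries to the final merge. With only seven genes the result is illustrative and should not be read as a reproduction of Fig. 1.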
Cell Lines, RNA Preparation, and Northern Blot Analysis.
Breast cancer cell lines were obtained from American Type Culture Collection or were generously provided by Drs. Steve Ethier (University of Michigan Medical Center, Ann Arbor, MI), Gail Tomlinson (University of Texas Southwestern Medical Center, Dallas, TX), and Arthur Pardee (Dana-Farber Cancer Institute, Boston, MA). Cells were grown in media recommended by the provider. However, 2 days before RNA collection, all cell lines were cultured in DMEM/F12 medium supplemented with 10% fetal bovine serum to minimize gene expression differences attributable to different culture conditions. RNA isolation, reverse transcription-PCR, and Northern blot analyses were performed as described (11).
Results
Generation and Analysis of SAGE Libraries.
Genes and biochemical pathways underlying phenotypic changes during breast tumorigenesis can be identified in a number of different ways. We chose an unbiased, comprehensive gene expression profiling approach (SAGE) to create a transcriptome map of normal and cancerous mammary epithelial cells. Eight different SAGE libraries were generated and analyzed for this study: two independent cases of normal luminal epithelial cells, two different histological types of DCIS, two IDCs, and two lymph node METs. Approximately 50,000 SAGE tags were obtained from each library, leading to the analysis of over 50,000 unique transcripts. We first performed pair-wise comparisons and Monte Carlo analysis to identify SAGE tags with statistically significant differences in expression (P < 0.001) between normal-DCIS, DCIS-invasive, and invasive-metastatic libraries. Transcripts that were expressed in each pair-wise comparison were defined as normal, cancer, DCIS, invasive, or metastasis-specific genes (Table 1, A and B). In addition, we also identified genes with an expression pattern that correlated with histological grade or the presence of the ER (Table 2, A and B). Data were also analyzed using a clustering algorithm to delineate patterns of gene expression among all eight libraries (Fig. 1). Clusters of coexpressed genes suggested that the two normal mammary epithelial libraries, despite being derived from two different patients, were the most similar to each other. Only 43 tags were expressed differentially between the two normal libraries. The four invasive and metastatic libraries appeared to be quite similar to each other. Interestingly, the two DCIS libraries were very dissimilar (225 tags showed at least a 2-fold statistically significant difference) and clustered according to their predicted clinical behavior. High-grade DCIS have high recurrence rates and worse overall prognoses when compared with intermediate or low-grade DCIS.
In agreement with this, the high-grade comedo DCIS (DCIS1) was more similar to invasive tumors, whereas the intermediate-grade DCIS (DCIS2) was more similar to normal mammary epithelium. Although the two normal and the DCIS2 libraries were generated from cells purified using anti-BerEP4-coated magnetic beads, whereas all other libraries were derived from macroscopically dissected (>95% pure) tumors, it is unlikely that the presence of contaminating cells (<5%; mostly fibroblasts) in the dissected samples would be the only reason for this intriguing clustering pattern. However, this possibility cannot be excluded.
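The pair-wise comparisons in this section rely on a Monte Carlo estimate of statistical significance. The procedure implemented in the SAGE 2000 software is not described in this article, so the following is only a generic simulation under an explicitly stated null hypothesis (equal abundance of the tag in both libraries). The function name `monte_carlo_p`, the number of simulations, and the use of Table 1 counts as if they were raw counts are assumptions made for illustration.

```python
import random


def monte_carlo_p(count_a, total_a, count_b, total_b, n_sim=5000, seed=0):
    """Estimate how often chance alone gives a difference at least as large as observed.

    Null hypothesis: the tag is equally abundant in both libraries, so each of its
    (count_a + count_b) occurrences lands in library A with probability
    total_a / (total_a + total_b).
    """
    rng = random.Random(seed)
    pooled = count_a + count_b
    p_a = total_a / (total_a + total_b)
    observed = abs(count_a / total_a - count_b / total_b)
    hits = 0
    for _ in range(n_sim):
        # Redistribute the pooled tag occurrences between the two libraries.
        sim_a = sum(rng.random() < p_a for _ in range(pooled))
        sim_b = pooled - sim_a
        if abs(sim_a / total_a - sim_b / total_b) >= observed:
            hits += 1
    return hits / n_sim


# GRO-alpha counts for N1 and DCIS1 from Table 1, with the library totals from Methods.
print(monte_carlo_p(219, 49351, 5, 42556))
# A hypothetical tag with nearly identical counts in two equally sized libraries.
print(monte_carlo_p(10, 50000, 11, 50000))
```

A tag would be called differentially expressed at the threshold used in the text when the returned estimate falls below 0.001. Resolving values that small reliably requires many more simulations than the 5,000 used here, which were chosen only to keep the example fast.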
Genes Characteristic for Normal Mammary Epithelium.
A large fraction of normal mammary epithelium-specific genes (Table 1A) encode secreted proteins, chemokines (IL-8, GROα, GROβ, and MIP3α), and cytokines (LIF, IL-6, and HIN-1) that may play autocrine and/or paracrine roles in the regulation of normal mammary epithelial cell growth, differentiation, and morphogenesis. HIN-1, for example, is a novel growth-inhibitory cytokine hypermethylated in a large fraction of breast cancers that may play a role in epithelial cell proliferation and branching morphogenesis (I. Krop et al., PNAS, in press). Several other chemokines (SCYA2, SCYA7, SCYB5, SCYB6, and SYD1) and chemokine receptors (IL-4R, IL-6R, and IL-15R) were also more abundant (5–10-fold) in normal mammary epithelial cells, but because of their low tag numbers, the P value for this difference was above 0.001. It is unlikely that these chemokines and cytokines were derived from contaminating leukocytes, because SAGE libraries were generated from immunomagnetically purified normal luminal mammary epithelial cells, and the purity of the cells was confirmed by reverse transcription-PCR analysis using known luminal (HIN-1 and EMA) and myoepithelial (calponin and CALLA/CD10) cell-specific transcripts (data not shown). Preliminary mRNA in situ and immunohistochemical experiments also confirmed that chemokines and cytokines are expressed in normal luminal mammary epithelial cells (D. Porter et al., unpublished data). Similarly, a recently identified novel chemokine, mammary enriched chemokine, or MEC, was also shown to be expressed in normal mammary epithelial cells and found to be down-regulated in breast carcinomas (12).
Another gene expressed at high levels in normal mammary epithelium and lost in tumors is the mitochondrial superoxide dismutase, SOD-2. A decreased level of SOD-2 in cancer cells is well documented, and it may contribute to the high level of reactive oxygen species and subsequent oxidative stress characteristic of breast cancers (13, 14). The expression levels of two transcriptional regulators, IκBα and C/EBPδ, show a somewhat gradual decline from normal to in situ and then to invasive carcinomas. IκBα sequesters NF-κB in the cytoplasm in an inactive form; therefore, loss of IκBα expression may lead to activation of the NF-κB pathway. C/EBPs are transcription factors that regulate cell growth and differentiation in a cell type-specific manner. C/EBPδ has been shown to be required for growth arrest in mammary epithelial cells (15); thus, the gradual decline of C/EBPδ mRNA levels during breast tumor progression may correlate with increased proliferation rates.
Genes Differentially Expressed Among Various Tumors and Their Correlation with Histological Parameters.
In contrast to the relatively high number of normal mammary epithelium-specific genes, only three genes appeared to be highly expressed in all breast carcinomas. These included trefoil factor 3, X-box binding protein 1, and fatty acid synthase (Table 1B). Both trefoil factor 3 and fatty acid synthase previously have been demonstrated to be up-regulated in breast carcinomas (16, 17, 18). The IFN-α inducible protein, IFI-6–16, is also highly represented in most of these breast cancer libraries and is likely to reflect the activation of the IFN/signal transducers and activators of transcription-signaling pathway observed in several previous studies (5, 6). Interestingly, glutamine synthase and desmoplakin were the only two genes that were specifically up-regulated in both DCIS tumors. Both genes have been shown to be aberrantly expressed in various cancers. Using immunohistochemical analysis, the levels of desmoplakin protein were found to be decreased in poorly differentiated, advanced-stage breast carcinomas (19). Glutamine synthase, on the other hand, may be induced because of hypoxia and/or decreased intracellular glutamine levels (20). Another indication of metabolic alterations in cancer cells was the up-regulation of several metabolic enzymes involved in glycolytic (3-phosphoglycerate dehydrogenase and glyceraldehyde-3-phosphate dehydrogenase) and mitochondrial (NADH:ubiquinone dehydrogenase and NADH dehydrogenase 1α) function in invasive carcinomas. There are numerous reports on metabolic differences between normal and cancer cells, the most dramatic being a highly active glycolytic pathway in cancers (21). Many transcripts encoding ribosomal proteins and immunoglobulins were also included in the “invasive” cluster. Their presence in invasive tumors is likely attributable to increased protein synthesis and the possible presence of contaminating lymphocytes in the dissected samples.
Interestingly, the expression of the chemokine receptor 4 was also significantly higher in invasive carcinomas, correlating with recent data implicating chemokine receptors in invasive/metastatic behavior of breast carcinomas (22). Two SAGE tags seemed to be particularly interesting because of their restricted expression in invasive breast carcinomas. One corresponds to calmodulin-like skin protein, a calcium-binding protein implicated in keratinocyte differentiation, whereas the other, IBC-1, has no expressed sequence tag matches and corresponds to a novel gene (P. Seth et al., unpublished data). Both of these genes may potentially be used for breast cancer diagnosis and for differential diagnosis of in situ and invasive carcinomas.
The expression of certain genes appeared to correlate with the histological grade of the tumor and/or the presence of the ER. All four invasive/metastatic and one of the DCIS tumors were high nuclear grade, and most of the genes highly expressed in these tumors encode extracellular matrix proteins, such as various collagens, osteonectin, and BIGH3, a TGFβ-induced gene. Although some of these genes could be expressed in contaminating fibroblasts, previous studies demonstrated that osteonectin is produced by most ER-negative breast carcinomas and may contribute to the aggressive phenotype and the presence of microcalcifications characteristic for these tumors. To demonstrate that these secreted factors can be produced by normal and/or cancerous epithelial cells themselves, we performed Northern blot analysis of normal organoids (purified breast ducts) and various breast cancer cell lines (Fig. 2). Correlating with previous studies, breast cancer cell lines do not seem to reproduce the in vivo expression pattern of these differentially expressed genes. However, some cells clearly express high levels of genes encoding extracellular matrix components and other secreted proteins, indicating that these proteins can be produced by the tumor cells themselves.
We were not able to define an “intermediate grade-specific” gene cluster, because we analyzed only one intermediate-grade tumor (DCIS2). However, the expression of several STAT transcription targets, such as heat shock proteins and IFN-α inducible protein 27, appeared to be significantly higher in DCIS2 (Table 2B). Despite the fact that we analyzed three ER-negative (DCIS1, IDC1, and MET1) and three ER-positive (DCIS2, IDC2, and MET2) tumors, the only ER target that clearly correlated with the ER status of the tumors was cyclin D1. This could be attributable to the fact that both IDC2 and MET2 were of high nuclear grade, whereas most ER-positive tumors are of low-intermediate grade. All other known ER target genes were either present in all (cathepsin D and bcl-2) or in a subset of libraries (LIV-1-DCIS2 only; Trefoil factor 1, both normal and both DCIS).
Discussion
Characterization of global gene expression profiles may help elucidate important biological processes in both normal and cancer cells. Although these studies are largely descriptive, they may reveal the molecular basis of already-known phenotypic observations, and they may suggest new hypotheses that could stimulate future experiments. In agreement with this, our SAGE analysis of gene expression patterns at various stages of mammary tumorigenesis led to several important observations that revealed or confirmed the molecular basis of known biochemical findings and also identified novel ones that are likely to open up new avenues of future studies. In addition, because SAGE does not require prior knowledge of genes, novel transcripts can be identified. In fact, several of the differentially expressed SAGE tags we identified currently have no ESTs or other database match (Table 2, A and B). This is likely attributable to the restricted expression pattern of these genes, which makes them excellent candidate tumor markers. Searching against the completed human genome sequence, these SAGE tags will likely lead to the identification of novel transcripts. These transcripts could then be included in DNA arrays for high throughput analysis of breast and other tumors to determine their usefulness for cancer diagnosis and/or prognostication.
The high levels of SOD-2, IκBα, and C/EBPδ in normal mammary epithelial cells and the loss of their expression in breast cancers correlate with previous studies (13, 14, 23). One of the novel findings reported here was that several of the genes highly expressed in normal mammary epithelial cells and down-regulated in tumors encode secreted proteins, chemokines, and cytokines. Chemokines are mediators of immune cell trafficking, and, although they are also involved in regulating cell movements during morphogenesis, their principal targets are bone marrow-derived cells (24, 25). Recent data indicate that in addition to fibroblasts and adipocytes, macrophages, eosinophils, and endothelial cells are also required for normal mammary gland development (26, 27). Although the function of leukocytes and endothelial cells in this process has not been fully elucidated, they may be necessary for the formation of terminal end buds and for branching morphogenesis. The high abundance of GROα, GROβ, MIP3α, IL-8, and, to a lesser degree, SCYA2, SCYA7, SCYB5, SCYB6, and SYD1 chemokines in normal mammary epithelial cells suggests that epithelial cells may actively recruit bone marrow-derived and endothelial cells. Alternatively, chemokines may play an as yet uncharacterized role in interepithelial cell communication, a hypothesis that deserves additional investigation.
There were very few genes that were universally up-regulated in all of the tumors examined, indicating a high degree of heterogeneity of breast carcinomas at the molecular level. Even based on the limited number of tumors analyzed, tumors of different histological stage (invasive) or nuclear grade (high or intermediate) clearly show distinct gene expression patterns that are likely to reflect their clinical behavior. The same is true for ER-positive and -negative tumors. However, we found no genes that were up-regulated in both lymph node metastases that were not already expressed in invasive carcinomas. Similarly, there were very few genes that were up-regulated in both DCIS but not in invasive carcinomas. This could be attributable to the limited number of samples analyzed, or it may indicate that preinvasive and metastatic lesions are even more heterogeneous than invasive ones. Alternatively, genes responsible for invasive and metastatic behavior may already be expressed in preinvasive and invasive lesions, respectively. Analysis of large numbers of in situ carcinomas without adjacent invasive lesions and analysis of many invasive carcinomas with and without lymph node metastasis would be required to answer this question.
Some of the cancer-specific genes we identified could be used for cancer diagnosis and/or molecular-based anticancer therapy. For example, breast and other cancer cells have very high levels of fatty acid synthase (Table 1B; Ref. 18), and its inhibition leads to apoptosis in in vitro cell cultures and in xenograft models (28). Similarly, two of the transcripts highly expressed in invasive breast cancers, calmodulin-like skin protein, and IBC-1, a novel gene, have restricted expression patterns that make them good candidate invasive breast cancer tumor markers.
There were several similarities and differences between our findings and previously published data on the molecular profiling of breast tumors (4, 5, 6). Correlating with our data, trefoil factor, cytokeratin 18, X-box binding protein, cyclin D1, and transcriptional targets of the STAT/IFN pathway are highly up-regulated in a subset of breast cancers (4, 5, 6). In contrast, our “normal cluster” is very different from that of previous studies. There are several possible explanations of this difference: (a) in contrast with this study, no previous study used purified, uncultured, normal luminal mammary epithelial cells; (b) in previous cDNA array experiments the reference RNA was derived from a mix of cancer cell lines, potentially masking differences between normal and cancerous mammary epithelial cells; and (c) some of the genes we identified, especially the ones with no EST matches, may not be present on the arrays. Future studies analyzing the same samples on different platforms (SAGE versus arrays) are required to resolve the observed differences.
In summary, using comprehensive gene expression profiling, we identified several genes and pathways that have not previously been implicated in breast cancer and which may play important roles in the initiation and progression of breast carcinomas. Because we analyzed a limited number of specimens, additional experiments using high-throughput techniques, such as mRNA in situ hybridization or immunohistochemical analysis on tissue-microarrays, are required to determine how commonly these genes are differentially expressed. Additional analysis of these genes and the biochemical pathways in which they are involved will not only further our understanding of breast oncogenesis, but will also provide new and valuable targets for translational research.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Supported by National Cancer Institute-Cancer Gene Anatomy Project Contract S98-146A from the National Cancer Institute Cancer Gene Anatomy Project (to G. J. R. and K. P.), an American Society of Clinical Oncology Career Development Award (to K. P.), the National Cancer Institute Specialized Programs of Research Excellence in Breast Cancer at Dana-Farber Cancer Institute (to K. P.); and by the Dana-Farber Cancer Institute (to K. P., D. A. P., and I. E. K.).
The abbreviations used are: SAGE, serial analysis of gene expression; DCIS, ductal carcinoma(s) in situ; IDC, invasive ductal carcinoma; MET, metastasis; HIN-1, high in normal-1; IBC-1, invasive breast carcinoma-1; C/EBP, CCAAT/enhancer binding protein; ER, estrogen receptor.
I. Krop, D. Sgroi, D. Porter, K. Lunetta, R. LeVangie, P. Seth, C. Kaelin, E. Rhei, M. Bosenberg, S. Schnitt, J. Marks, Z. Pagon, D. Belina, J. Razumovic, and K. Polyak. HIN-1, a putative cytokine highly expressed in normal but not cancerous mammary epithelial cells. PNAS, in press.
D. Porter, J. Lahti-Domenici, S. Schnitt, and K. Polyak, unpublished data.
P. Seth, D. Porter, J. Lahti-Domenici, J. Marks, A. Richardson, and K. Polyak, unpublished data.
Fig. 1. Variation in expression of 3,987 genes in eight SAGE libraries and dendrogram representing similarities in expression patterns among samples. Only parts of certain clusters (normal, DCIS, invasive, and high-grade) are included in the figure. Each row represents a gene/SAGE tag, whereas each column corresponds to a SAGE library/tissue sample. The absolute abundance of the SAGE tag in the library (SAGE tag number) correlates with red color intensity (black, not present; intense red, highly abundant). On the dendrogram, all four invasive and metastatic breast cancers cluster together; similarly, the two normal samples (N1 and N2) cluster, indicating their high degree of similarity. The high-grade DCIS1 is closer to the invasive branch, whereas the intermediate-grade DCIS2 is closer to normal mammary epithelium.
Fig. 2. Northern blot analysis of differentially expressed genes in breast cancer cell lines and in normal organoids. The following genes were analyzed: collagen 1a2; connective tissue growth factor (CTGF); psoriasin/S100A7; osteonectin; IFN-induced clone IFI-6–16; the chemokines GROα, GROβ, interleukin-8 (IL-8), and MIP3α; leukemia inhibitory factor (LIF); trefoil factor 3; and IκBα.
Table 1. Genes highly abundant in normal and in cancerous mammary epithelial cells
Unigene ID no. | Gene description | N1 | N2 | DCIS1 | DCIS2 | IDC1 | IDC2 | MET1 | MET2
---|---|---|---|---|---|---|---|---|---
A. Genes highly abundant in normal mammary epithelial cells | | | | | | | | |
75765 | Chemokine GROβ | 67 | 138 | 1 | 1 | 0 | 0 | 0 | 0
789 | Chemokine GROα | 219 | 252 | 5 | 6 | 0 | 0 | 0 | 0
624 | Chemokine CXCα (IL-8) | 205 | 196 | 4 | 21 | 1 | 0 | 0 | 0
81328 | IκBα | 75 | 84 | 3 | 21 | 2 | 3 | 1 | 5
177781 | SOD2 a | 172 | 99 | 0 | 1 | 1 | 0 | 5 | 1
3337 | Transmembrane 4 super family member 1 | 74 | 53 | 5 | 18 | 1 | 0 | 0 | 1
75969 | Proline-rich protein | 21 | 24 | 2 | 0 | 1 | 0 | 1 | 0
2250 | Leukemia inhibitory factor | 35 | 75 | 0 | 1 | 0 | 0 | 0 | 0
23582 | Membrane component, surface marker 1 | 61 | 66 | 10 | 20 | 5 | 2 | 0 | 4
241507 | Ribosomal protein S6 | 143 | 108 | 51 | 40 | 38 | 16 | 28 | 16
75498 | Chemokine MIP3α | 24 | 16 | 1 | 0 | 0 | 0 | 0 | 0
25829 | ras-related protein | 25 | 16 | 3 | 0 | 5 | 1 | 1 | 2
NA | NO MATCH (CTTCCTGTGA) | 275 | 127 | 5 | 11 | 13 | 8 | 12 | 0
76722 | C/EBPδ | 86 | 62 | 21 | 25 | 3 | 2 | 2 | 3
91539 | ESTs | 29 | 27 | 0 | 0 | 0 | 0 | 0 | 0
101382 | TNF-α induced protein 2 | 36 | 18 | 0 | 1 | 1 | 2 | 0 | 1
NA | HIN-1 | 69 | 24 | 0 | 0 | 0 | 0 | 0 | 0
B. Genes highly abundant in cancerous mammary epithelial cells | | | | | | | | |
82961 | Trefoil factor 3 (intestinal) | 19 | 3 | 285 | 477 | 206 | 69 | 159 | 136
83190 | Fatty acid synthase | 9 | 2 | 29 | 35 | 93 | 18 | 141 | 25
149923 | X-box binding protein 1 | 44 | 32 | 82 | 109 | 136 | 138 | 110 | 334
265827 | IFN-α-inducible protein, IFI-6-16 | 2 | 0 | 9 | 359 | 72 | 95 | 7 | 293
170171 | Glutamine synthase | 6 | 3 | 41 | 30 | 2 | 6 | 3 | 3
74316 | Desmoplakin | 4 | 9 | 50 | 30 | 7 | 4 | 6 | 0
6335 | PI4P5 kinase, type II | 1 | 0 | 0 | 3 | 28 | 22 | 40 | 4
3343 | 3-phosphoglycerate dehydrogenase | 6 | 3 | 11 | 8 | 51 | 59 | 26 | 108
268571 | Apolipoprotein C-I | 1 | 1 | 4 | 0 | 48 | 32 | 45 | 15
7744 | NADH:ubiquinone dehydrogenase | 5 | 9 | 8 | 3 | 33 | 42 | 27 | 54
169476 | GAPDH | 27 | 24 | 16 | 21 | 87 | 73 | 127 | 92
84298 | CD74 antigen | 4 | 18 | 16 | 3 | 88 | 115 | 113 | 40
119206 | IGF-BP 7 | 0 | 0 | 5 | 3 | 27 | 35 | 30 | 6
74823 | NADH dehydrogenase 1α | 5 | 3 | 3 | 3 | 36 | 15 | 23 | 20
2186 | EEF1G | 48 | 45 | 58 | 31 | 820 | 658 | 954 | 76
302063 | Immunoglobulin μ | 0 | 0 | 5 | 0 | 96 | 39 | 178 | 7
181125 | Immunoglobulin λ gene cluster a | 0 | 3 | 22 | 3 | 146 | 100 | 191 | 9
180142 | Calmodulin-like skin protein | 0 | 0 | 0 | 0 | 26 | 14 | 10 | 0
NA | IBC-1 | 0 | 0 | 0 | 0 | 98 | 56 | 110 | 0
SOD2, superoxide dismutase 2; NA, Not Applicable; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; IGF-BP7, Insulin-like growth factor binding protein 7.
Genes highly abundant in high-grade and in intermediate-grade/ER+ tumors
Unigene ID no. | Gene description | N1 | N2 | DCIS1 | DCIS2 | IDC1 | IDC2 | MET1 | MET2 |
---|---|---|---|---|---|---|---|---|---|
A. Genes highly abundant in high-grade tumors |
172928 | Collagen type I, α1 (ACCAAAAACC) | 1 | 2 | 157 | 1 | 51 | 39 | 85 | 18 |
172928 | Collagen type I, α1 (TGGAAATGAC) | 1 | 1 | 106 | 0 | 102 | 50 | 140 | 48 |
111779 | Osteonectin | 2 | 0 | 65 | 1 | 62 | 53 | 91 | 18 |
2132 | EGF receptor pathway substrate 8 | 4 | 0 | 15 | 0 | 28 | 19 | 50 | 3 |
177486 | Amyloid A4 | 9 | 11 | 43 | 0 | 30 | 16 | 19 | 1 |
293441 | SNC73 | 3 | 13 | 80 | 0 | 215 | 111 | 197 | 28 |
5734 | KIAA0679 protein | 5 | 3 | 22 | 1 | 23 | 19 | 25 | 27 |
164170 | Clone HEP06932 | 3 | 7 | 54 | 3 | 33 | 22 | 46 | 27 |
184108 | Ribosomal protein S21 | 45 | 95 | 123 | 36 | 138 | 124 | 124 | 72 |
169793 | Ribosomal protein L32 | 273 | 375 | 679 | 124 | 362 | 278 | 349 | 314 |
300697 | Immunoglobulin γ3 | 0 | 0 | 30 | 0 | 402 | 371 | 807 | 60 |
NA | NO MATCH (TGAAGCAGTA) | 2 | 1 | 55 | 1 | 41 | 25 | 31 | 22 |
NA | NO MATCH (TTCGGTTGGT) | 1 | 0 | 56 | 1 | 32 | 22 | 30 | 12 |
B. Genes highly abundant in intermediate-grade/ER+ tumors |
274404 | Plasminogen activator | 10 | 10 | 3 | 63 | 2 | 2 | 0 | 4 |
2006 | Glutathione S-transferase M3 | 0 | 1 | 0 | 26 | 5 | 6 | 2 | 7 |
75410 | HSP70.5 | 20 | 16 | 8 | 94 | 48 | 15 | 29 | 34 |
76067 | HSP27 | 19 | 9 | 4 | 132 | 11 | 11 | 0 | 7 |
79516 | Neuronal tissue-enriched protein | 5 | 2 | 1 | 16 | 1 | 2 | 1 | 0 |
2340 | Plakoglobin | 30 | 24 | 24 | 99 | 25 | 19 | 20 | 11 |
75243 | Female sterile homeotic-related gene | 15 | 10 | 10 | 35 | 2 | 6 | 4 | 6 |
77886 | Lamin A/C | 32 | 41 | 8 | 67 | 6 | 3 | 8 | 2 |
182265 | Keratin 19 | 18 | 19 | 32 | 92 | 11 | 22 | 5 | 1 |
31439 | Serine protease inhibitor, Kunitz type 2 | 7 | 9 | 12 | 43 | 7 | 10 | 5 | 15 |
202833 | Heme oxygenase | 41 | 6 | 3 | 95 | 3 | 5 | 1 | 6 |
65114 | Keratin 18 | 31 | 31 | 21 | 63 | 25 | 22 | 21 | 36 |
223241 | EF1δ | 0 | 13 | 21 | 52 | 1 | 17 | 0 | 9 |
74631 | Basigin | 8 | 5 | 1 | 21 | 0 | 6 | 2 | 3 |
79144 | Clone MGC2479 | 1 | 2 | 2 | 20 | 2 | 3 | 3 | 3 |
72222 | Clone PP3795 | 0 | 0 | 0 | 18 | 1 | 0 | 0 | 0 |
NA | NO MATCH (CAGACTTTTT) | 4 | 2 | 2 | 30 | 1 | 5 | 0 | 2 |
NA | NO MATCH (TCGTTACGCA) | 3 | 3 | 0 | 20 | 1 | 1 | 1 | 1 |
56892 | C8ORF4 | 11 | 19 | 2 | 77 | 3 | 10 | 1 | 13 |
NA | NO MATCH (CGCCGAATAA) | 4 | 6 | 2 | 20 | 1 | 0 | 3 | 2 |
NA | NO MATCH (GCCGTCGGAG) | 0 | 0 | 0 | 41 | 0 | 2 | 1 | 10 |
NA | NO MATCH (GTATTTTCTC) | 0 | 5 | 0 | 25 | 0 | 0 | 0 | 0 |
NA | NO MATCH (GGGAAGCAGA) | 43 | 57 | 19 | 87 | 7 | 33 | 6 | 9 |
NA | NO MATCH (GCTTTCTCAC) | 9 | 13 | 1 | 33 | 7 | 3 | 5 | 6 |
17409 | Cysteine-rich protein (intestinal) | 18 | 2 | 11 | 36 | 3 | 27 | 1 | 33 |
278613 | IFN-α-inducible protein 27 | 0 | 0 | 2 | 20 | 0 | 11 | 1 | 17 |
82932 | Cyclin D1 | 4 | 1 | 10 | 35 | 31 | 63 | 10 | 77 |
NA, Not Applicable.
Acknowledgments
We thank Drs. Daniel Haber and Jeff Sklar for their critical review of this manuscript.