To identify genetic changes involved in the progression of breast carcinoma, we did cDNA array comparative genomic hybridization (CGH) on a panel of breast tumors, including 10 ductal carcinoma in situ (DCIS), 18 invasive breast carcinomas, and two lymph node metastases. We identified 49 minimal commonly amplified regions (MCRs) that included known (1q, 8q24, 11q13, 17q21-q23, and 20q13) and several uncharacterized (12p13 and 16p13) regional copy number gains. With the exception of the 17q21 (ERBB2) amplicon, the overall frequency of copy number alterations was higher in invasive tumors than that in DCIS, with several of them present only in invasive cancer. Amplification of candidate loci was confirmed by quantitative PCR in breast carcinomas and cell lines. To identify putative targets of amplicons, we developed a method combining array CGH and serial analysis of gene expression (SAGE) data to correlate copy number and expression levels for each gene within MCRs. Using this approach, we were able to distinguish a few candidate targets from a set of coamplified genes. Analysis of the 12p13-p12 amplicon identified four putative targets: TEL/ETV6, H2AFJ, EPS8, and KRAS2. The amplification of all four candidates was confirmed by quantitative PCR and fluorescence in situ hybridization, but only H2AFJ and EPS8 were overexpressed in breast tumors with 12p13 amplification compared with a panel of normal mammary epithelial cells. These results show the power of combined array CGH and SAGE analysis for the identification of candidate amplicon targets and identify H2AFJ and EPS8 as novel putative oncogenes in breast cancer. (Cancer Res 2006; 66(8): 4065-78)

Gene amplification is one of the mechanisms underlying the activation of oncogenes, and it is often associated with poor prognosis, tumor progression, and acquired drug resistance (1). Therefore, identification of amplified oncogenes has potential diagnostic and therapeutic implications. In breast cancer, gene amplification occurs recurrently at certain chromosomal locations, indicating the common activation of particular oncogenes during tumor development. The most prominent and frequent amplicons have been reported on chromosomes 1q, 8p12, 8q24, 11q13, 12q13, 17q21, 17q23, and 20q13, and several candidate targets have been proposed and verified (2). The best characterized breast cancer oncogene is ERBB2, located on chromosome 17q21 and amplified in 20% to 30% of breast carcinomas (3). Other oncogenes amplified in breast cancer include MYC (8q24); CCND1, EMS1, and EMSY (11q13); IGF1R (15q26); and STK15, AIB1, and ZNF217 (20q13), whereas PIK3CA is activated in 25% to 40% of breast carcinomas due to oncogenic mutations, although its amplification was also reported in a fraction of breast tumors (4–8). The successful treatment of ERBB2-amplified breast tumors with Herceptin, an inhibitor of ERBB2 activity, is one of the few examples of successful molecular-based therapy in breast cancer (9). Therefore, several large-scale genome resequencing and genomic approaches are aimed at the identification of novel tumor-specific genetic events, amplifications, or mutations, which could be targeted therapeutically.

Complicating the identification of relevant genes is the fact that most amplicons are fairly large, can span several megabases, and in extreme cases involve whole chromosome arms (e.g., chromosome 1q). Thus, amplification of the targeted oncogene is inevitably associated with coamplification of many surrounding genes. For example, amplification of ERBB2 is frequently accompanied by amplification and overexpression of nearby GRB7, TOP2A, and PIP4K2B genes (10). GRB7 encodes a protein that directly binds to ERBB2 and modulates ERBB2 signaling (11), TOP2A is a topoisomerase involved in DNA repair, and PIP4K2B is a lipid kinase that has been shown to enhance breast cancer cell growth (12). The function of these genes suggests that they may cooperate with ERBB2 and contribute to the malignant phenotype and therefore could also be considered targets of the 17q21 amplicon. Similar complexity is observed for the 20q13, 11q13, and other amplicons, suggesting that genetic selection for the overactivation of a group of genes is a general phenomenon. Thus, despite the fact that many amplified chromosomal regions have been identified, the characterization of their targets remains a difficult task. Performing comprehensive genome-wide screens on a large set of tumors and analyzing both genetic and gene expression changes in the same tumors may help in resolving this problem because this combined approach facilitates the identification of genes that are both amplified and overexpressed. Using this approach, KCNK9 was found to be the target of a small (550 kb) amplicon at 8q24.3 because it was the only overexpressed gene within that region (13).

Array comparative genomic hybridization (CGH) is a technology suitable for studying gene copy number changes on a genome-wide scale at high resolution (14–16). Currently, three different array CGH platforms are used: bacterial artificial chromosome, cDNA, and long (60 bp) oligo arrays, each with its own advantages and disadvantages. Previous cDNA array CGH studies of breast tumors and breast cancer cell lines revealed a strong correlation between copy number gain and increased gene expression and led to the identification of several new amplicons and their candidate targets, including HOXB7 at 17q21.3 (17–19). However, these studies were limited by the use of a reference cDNA mix for evaluating gene expression changes, by the potential probe hybridization bias associated with the use of fairly long cDNA fragments on the arrays, and by the restriction of their analysis to advanced-stage tumors. In this study, we did cDNA array CGH on 30 breast tumors, including 10 preinvasive tumors (DCIS), and five nonmalignant cells purified from normal breast tissue or breast carcinomas, and in parallel, we used serial analysis of gene expression (SAGE) for evaluating gene expression patterns. Using this integrated approach, we confirmed known amplicons and their targets, such as ERBB2 at 17q21 and PAK1/EMSY at 11q13. We further identified many uncharacterized amplicons and their putative targets in both in situ and invasive carcinomas. Based on a targeted screen for the amplification of known oncogenes, KRAS2 in the 12p13-p12 amplicon has previously been described as a gene amplified in a subset of breast carcinomas (20) and in one metastatic rectal tumor (21), but this amplicon has not been systematically characterized at high resolution. Our detailed characterization of the 12p13-p12 amplicon identified four putative candidate targets (ETV6, KRAS2, H2AFJ, and EPS8), and subsequent follow-up experiments confirmed H2AFJ and EPS8 as novel candidate oncogenes in breast cancer. Thus, these data show that the integration of SAGE with cDNA array CGH is a powerful approach for the identification of amplified candidate oncogenes.

Tissue specimens and cell lines. Tumor specimens were obtained from Brigham and Women's and Massachusetts General hospitals (Boston, MA), Duke University (Durham, NC), University Hospital Zagreb (Zagreb, Croatia), and the National Disease Research Interchange. All human tissue was collected using protocols approved by the Dana-Farber Cancer Institute Institutional Review Board. Tissues were snap frozen on dry ice and stored at −80°C until use or were immediately processed for immunomagnetic purification (22). Breast cancer cell lines were obtained from the American Type Culture Collection (Manassas, VA) or were generously provided by Dr. Steve Ethier (University of Michigan) and Dr. Arthur Pardee (Dana-Farber Cancer Institute). Cells were grown in media recommended by the provider.

cDNA array CGH profiling. Array CGH analysis was done essentially as previously described (23). Briefly, genomic DNA from normal and tumor tissues was fragmented and labeled according to published protocols.8

Labeled DNAs were hybridized to human cDNA microarrays containing 14,160 cDNA clones (Agilent Technologies, Palo Alto, CA), for which ∼9,000 unique map positions were defined (National Center for Biotechnology Information, build 34). The median interval between mapped elements is 100 kb, with 92.8% of intervals <1 Mb and 98.6% <3 Mb. Log 2 ratios were calculated from the Cy3/Cy5 fluorescence channels and further normalized by the GC content (%) of each probe's genomic region using local regression. These normalized profiles were then processed using Circular Binary Segmentation, a change-point identification technique developed for array CGH, to demarcate genomic segments with statistically uniform copy number (23, 24). Each segment is assigned a log 2 ratio that is the median log 2 ratio of the probes it contains. The data were then centered as previously described, so that the peak in the distribution of segment values lies at zero (23). Based on the histogram of log 2 ratio distribution among all normal samples and among all tumors, a minimum gain threshold was set to a log 2 ratio of ≥0.09 (Supplementary Fig. S1), and the amplification threshold was set to a log 2 ratio of 0.5. Similarly, segments with a log 2 ratio less than −0.08 and less than −0.35 are considered regions of “loss” and “deletion,” respectively. The raw and segmented data sets are provided as Supplementary Data Files.
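The thresholding step described above can be illustrated with a minimal Python sketch. It assumes that segmentation and centering have already been done and only shows how a segment value (the median of its probe log 2 ratios) could be mapped to a copy number call. The function names and the handling of values exactly at a threshold are assumptions for illustration, not taken from the published pipeline.

```python
from statistics import median

# Thresholds quoted in the text (centered segment log 2 ratios).
GAIN, AMPLIFICATION = 0.09, 0.5
LOSS, DELETION = -0.08, -0.35


def segment_value(probe_log2_ratios):
    """A segment is assigned the median log 2 ratio of the probes it contains."""
    return median(probe_log2_ratios)


def classify_segment(value):
    """Map a centered segment log 2 ratio to a copy number call.

    Boundary handling (>= for gains, < for losses) is an assumption.
    """
    if value >= AMPLIFICATION:
        return "amplification"
    if value >= GAIN:
        return "gain"
    if value < DELETION:
        return "deletion"
    if value < LOSS:
        return "loss"
    return "neutral"


if __name__ == "__main__":
    probes = [0.1, 0.7, 0.6]          # hypothetical probe values in one segment
    value = segment_value(probes)     # median of the three values
    print(value, classify_segment(value))
```

Keeping the thresholds as named constants makes it straightforward to repeat the calls at other cutoffs.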

Identification of amplified loci and statistical analysis. Identification of priority loci was done in a similar way as described (23). Based on the histogram of log 2 ratio distribution among all normal samples and among all tumors, a minimum gain threshold was set to a log 2 ratio of ≥0.09 (Supplementary Fig. S1), and the amplification threshold was set to a log 2 ratio of 0.5. Minimal common regions (MCRs) of chromosome amplification were generated based on overlapping recurrence across samples using the same algorithm as previously described (23). MCRs were further prioritized by the presence of the following features: (a) recurrence of high-fold amplification events in more than one sample, (b) a peak segment value of >0.8 in at least one sample, or (c) statistically significant recurrence of low-level alteration. MCRs with one or more of these features are summarized in Table 1. Recurrence of array CGH gains and losses was compared between DCIS and invasive ductal carcinoma (IDC) sample groups, as well as between estrogen receptor–negative (ER−) and ER+ sample groups. Low-level thresholds, >0.09 and less than −0.08, were used to define gain and loss, respectively. At each probe location, the total numbers of samples with gains and losses were counted in each sample group (e.g., DCIS versus IDC and ER+ versus ER− tumors), and significant difference was determined by Fisher's exact test (P < 0.05, not accounting for multiple testing).
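The per-probe group comparison can be sketched as follows. This is a self-contained, illustrative implementation of a two-sided Fisher's exact test on a 2×2 table (samples with and without a gain in each group). The function name and the example counts are hypothetical, and the two-sided definition used here (summing all tables no more likely than the observed one) is an assumption, because the text does not state which variant was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = 0.0
    for x in range(lowest, highest + 1):
        p_x = prob(x)
        if p_x <= p_observed * (1 + 1e-9):  # tolerance for float ties
            p_value += p_x
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical probe: gained in 1 of 10 DCIS and in 10 of 18 IDC samples.
    p = fisher_exact_two_sided(1, 9, 10, 8)
    print(f"P = {p:.3f}")
```

For the hypothetical counts above the test gives P ≈ 0.041, which would pass the uncorrected P < 0.05 cutoff. Because the test is repeated at every probe without correction, such calls are best read as exploratory.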

Table 1.

List of MCRs of amplifications

| Chromosome | Start position | Start gene | End position | End gene | Band | Size (Mb) | No. genes | Candidates and known targets |
|---|---|---|---|---|---|---|---|---|
| 1 | 801450 | FLJ22639 | 802851 | LOC284591 | 1p36.33 | 0.00 |  |  |
| 1 | 148592826 | RORC | 150329171 | S100A4 | 1q21 | 1.74 | 37 |  |
| 2 | 166880319 | SCN9A | 169484614 | NOSTRIN | 2q24 | 2.60 |  |  |
| 3 | 42800799 | HIG1 | 44462437 | ZNF445 | 3p22-p21 | 1.66 |  |  |
| 5 | 142639325 | NR3C1 | 145807087 | TCERG1 | 5q31 | 3.17 | 10 |  |
| 5 | 175319142 | THOC3 | 176717480 | RGS14 | 5q35.2-35.3 | 1.40 | 24 | NSD1 |
| 6 | 26195427 | HFE | 29631375 | UBD | 6p21.3 | 3.44 | 79 |  |
| 6 | 138230274 | TNFAIP3 | 146906521 | RAB32 | 6q23-q24.3 | 8.68 | 33 | C6orf115 |
| 7 | 30182906 | 0 | 54860934 | EGFR | 7p15.1-p12 | 24.68 | 111 | KIAA0241, ELMO1, UCC1, GLI3 |
| 7 | 109897061 | IMMP2L | 115173977 | TFEC | 7q31 | 5.28 | 13 |  |
| 8 | 28681099 | FLJ10871 | 30555575 | GTF2E2 | 8p21.1-p12 | 1.87 | 10 |  |
| 8 | 33568398 | MGC1136 | 39427722 | ADAM3A | 8p12 | 5.86 | 30 | FGFR1 |
| 8 | 59628282 | SDCBP | 62578374 | ASPH | 8q12 | 2.95 |  |  |
| 8 | 82515121 | PMP2 | 89122152 | MMP16 | 8q21-q22 | 6.61 | 25 | WWP1 |
| 8 | 124329884 | ZHX1 | 131133535 | DDEF1 | 8q24.1-q24.2 | 6.80 | 27 | KIAA0196, MYC |
| 9 | 125076688 | HSPA5 | 127106700 | GARNL3 | 9q33-q34.1 | 2.03 |  |  |
| 10 | 126075862 | OAT | 126480382 | KIAA0157 | 10q26 | 0.40 |  | KIAA0140, KIAA0157 |
| 11 | 74788222 | RPS3 | 77489641 | ALG8 | 11q13.3-q13.5 | 2.70 | 31 | WNT11, E2IG4, CLNS1A, PTD015, GARP, EMSY |
| 12 | 9896236 | CLECSF2 | 15926613 | STRAP | 12p13.3-p12.3 | 6.03 | 92 | H2AFJ, EPS8 |
| 12 | 24855562 | BCAT1 | 32151452 | BICD1 | 12p12.1-p11.21 | 7.30 | 54 | TEL/ETV6, KRAS2 |
| 12 | 61324033 | PPM1H | 64504458 | HMGA2 | 12q14-q15 | 3.18 | 18 |  |
| 12 | 66329021 | DYRK2 | 75248048 | OSBPL8 | 12q15 | 8.92 | 44 | NUP107, CPSF6, FRS2, CCT2, MGC23401, KCNC2 |
| 15 | 94674950 | NR2F2 | 99661657 | PCSK6 | 15q26 | 4.99 | 26 | IGF1R |
| 16 | 1246283 | TPSD1 | 4931075 | KIAA0420 | 16p13.3 | 3.68 | 128 | BAIAP3, CLCN7, KIAA0683, MAPK8IP3, NUBP2, NDUFB10, RAB26, LOC114984, PAQR4, HCFC1R1, FLJ14154, Magmas, C16orf5 |
| 16 | 6009133 | A2BP1 | 19087139 | LOC51760 | 16p13-p12 | 13.08 | 59 |  |
| 16 | 22264760 | CDR2 | 23499838 | NDUFAB1 | 16p12.3-p12.1 | 1.24 |  |  |
| 16 | 31408316 | FLJ13868 | 31792664 | ZNF267 | 16p11.2 | 0.38 |  |  |
| 17 | 7416868 | EIF4A1 | 7435272 | FXR2 | 17p13 | 0.02 |  |  |
| 17 | 22645233 | WSB1 | 23718425 | VTN | 17q11 | 1.07 | 15 |  |
| 17 | 23924266 | ALDOC | 24977091 | SSH2 | 17q11 | 1.05 | 30 | SPAG5, SDF2, SUPT6H |
| 17 | 27682505 | NJMU-R1 | 28364218 | ACCN1 | 17q11 | 0.68 |  |  |
| 17 | 33120548 | TCF2 | 34143676 | RNF110 | 17q12 | 1.02 | 10 | MLLT6 |
| 17 | 34816380 | PPARBP | 35502567 | NR1D1 | 17q21 | 0.69 | 21 | PERLD1, ERBB2 |
| 17 | 35853236 | IGFBP4 | 36285721 | KRT20 | 17q21 | 0.43 | 13 | CCR7 |
| 17 | 37562439 | KCNH4 | 38088158 | CNTNAP1 | 17q21 | 0.53 | 19 |  |
| 17 | 38256727 | AOC3 | 38916860 | DHX8 | 17q21 | 0.66 | 15 |  |
| 17 | 41327624 | MAPT | 41805904 | NSF | 17q21 | 0.48 |  |  |
| 17 | 44325162 | ATP5G1 | 45858592 | FLJ20920 | 17q21.32 | 1.53 | 33 |  |
| 17 | 52517620 | AKAP1 | 53952615 | PNUTL2 | 17q23 | 1.43 | 19 | MSI2 |
| 17 | 53952615 | PNUTL2 | 54188231 | PPM1E | 17q23 | 0.24 |  |  |
| 17 | 57910136 | TLK2 | 59754145 | PECAM1 | 17q23 | 1.84 | 28 | LOC51204 |
| 17 | 71128808 | MYO15B | 71284112 | H3F3B | 17q25.1 | 0.16 |  |  |
| 20 | 4149816 | ADRA1D | 5043599 | PCNA | 20p13-p12 | 0.89 |  |  |
| 20 | 19141290 | SLC24A3 | 20296765 | INSM1 | 20p11 | 1.16 |  |  |
| 20 | 24397835 | C20orf39 | 25176706 | PYGB | 20p11.2 | 0.78 |  | ACAS2L |
| 20 | 43004186 | TOMM34 | 59983250 | TAF4 | 20q13.1-q13.3 | 16.98 | 130 | PPGB, SULF2, ADNP, DPM1, STX16, NPEPL1, NCOA3, BCAS4, ZNF217, BCAS1, CYP24A1 |
| 20 | 61630222 | PTK6 | 62181932 | OPRL1 | 20q13.3 | 0.55 | 25 | UCKL1, TCEA2 |
| 21 | 36429337 | CBR3 | 46879955 | HRMT1L1 | 21q22.2-q22.3 | 10.45 | 127 | DYRK1A, DSCR8, HMGN1, MX1, TFF3, TSGA2, ICOSL, ADARB1, POFUT2, C21orf56, LSS, PCNT2 |
| 22 | 48487799 | BRD1 | 49353600 | ARSA | 22q13.33 | 0.87 | 28 | ALG12, TUBGCP6, MAPK12, SBF1, ECGF1, MGC16635, MAPK8IP2 |

NOTE: Gene positions are based on NCBI build 34. Candidates from statistical analysis of SAGE data are listed along with previously reported amplification targets (in bold).

Identification of best candidate targets in MCRs using SAGE. Breast cancer SAGE libraries were previously described (22, 25) and are also available online.9

SAGE libraries were normalized to 100,000 total tags. For each gene in the MCR, normalized SAGE tag numbers are listed for each tumor, and amplification status was defined by the local segmented array CGH value in that sample's profile. Thus, for each gene, samples are divided into three groups: (a) amplified tumors, (b) nonamplified tumors, and (c) normal tissues. For each gene with SAGE data, four Ps were generated reflecting the statistical significance of differences in tag numbers (a) between amplified tumors and normal group (PA/N); (b) among amplified tumor, nonamplified tumor, and normal group (PA/NA/N); (c) between amplified group and nonamplified plus normal group (PA/NA,N); and (d) between all tumors (amplified and nonamplified) and normal group (PA,NA/N). The last two Ps (PA/NA,N and PA,NA/N) were only calculated if PA/NA/N was significant (P < 0.05). These tests were done separately for each of three thresholds for defining “amplified” segment values: (a) + level (>0.09), (b) ++ level (>0.2), and (c) +++ level (>0.5) to confirm that results were stable across a range of copy number alterations. Fold overexpression is estimated by dividing the mean tag number from the tumor samples with amplification by the mean tag number from the normal group. A gene was identified as a candidate target of the amplicon if it met all of the following criteria: (a) either PA/N or PA/NA/N is <0.05; (b) overexpression is >2-fold in amplified tumors compared with normal; and (c) among tumors predicted to be amplified, the SAGE tag ratio of (tumor tag) / [max (normal tag)] reaches 0.67, 0.8, and 1.0 for the +, ++, and +++ amplification levels (see above), respectively.
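The three candidate criteria can be expressed as a short Python sketch. The P values are taken as inputs because the text does not specify the statistical test behind them. The function and argument names are hypothetical, and requiring every amplified tumor to reach the tag ratio in criterion (c) is one possible reading of the text, labeled here as an assumption.

```python
from statistics import mean

# Minimum (tumor tag) / max(normal tag) ratio required at each amplification level.
MIN_TAG_RATIO = {"+": 0.67, "++": 0.8, "+++": 1.0}


def normalize_tags(tag_count, library_total, scale=100_000):
    """Scale a raw SAGE tag count to a library of 100,000 total tags."""
    return tag_count * scale / library_total


def is_candidate(amplified_tags, normal_tags, p_a_n, p_a_na_n, level):
    """Apply criteria (a) to (c) to one gene.

    amplified_tags: normalized tag numbers in tumors called amplified
    normal_tags:    normalized tag numbers in normal samples
    p_a_n, p_a_na_n: precomputed P values (test not specified in the text)
    level: "+", "++", or "+++"
    """
    # (a) at least one of the two P values is below 0.05
    significant = p_a_n < 0.05 or p_a_na_n < 0.05

    # (b) more than 2-fold overexpression, amplified mean versus normal mean
    normal_mean = mean(normal_tags)
    tumor_mean = mean(amplified_tags)
    overexpressed = tumor_mean > 2 * normal_mean

    # (c) assumption: every amplified tumor must reach the ratio threshold
    max_normal = max(normal_tags)
    threshold = MIN_TAG_RATIO[level]
    ratio_ok = all(tag >= threshold * max_normal for tag in amplified_tags)

    return significant and overexpressed and ratio_ok


if __name__ == "__main__":
    # Hypothetical gene: two amplified tumors, two normal samples.
    print(is_candidate([30, 24], [5, 8], p_a_n=0.01, p_a_na_n=0.2, level="+++"))
```

Writing the comparisons as products rather than divisions avoids a division by zero when a gene has no tags in the normal libraries.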

Overall correlation between gene overexpression and amplification. SAGE data from each tumor with array CGH data were compared with those from two normal mammary epithelial cells using a previously described method (22, 26), and tags that satisfied the following two criteria were considered overrepresented in tumors: (a) the difference between the tag numbers in tumor and normal samples is statistically significant (P < 0.05) using the PK algorithm (26), and (b) normalized tumor tag number is at least 2-fold higher than the tag number in either of the two normal samples. Each tag with at least two copies/library in the SAGE libraries was assigned to the best matching gene using an online resource.9 The total number of overexpressed genes in each sample was estimated based on the number of overexpressed tags with unique best gene match. SAGE analysis was done for all genes located in predicted amplicons to determine the fraction of amplified genes that are also overexpressed, and the subset of overexpressed genes that are amplified. For each sample, the observed number of overexpressed genes within amplified regions is compared with the number expected by chance to give an odds ratio. Odds ratios are not calculated if the expected number of overexpressed genes in amplified regions is <2. For each sample, odds ratios are determined separately for all three levels of amplification, as above: >0.09 (+ level), >0.2 (++), and >0.5 (+++).
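As an illustration of the last step, the sketch below computes an odds ratio from a 2×2 table of genes (amplified or not, overexpressed or not) and skips the calculation when fewer than two overexpressed genes would be expected in amplified regions by chance. The exact formula used in the study is not given in the text, so the 2×2 construction, the expected-count formula, and the function name are assumptions.

```python
def amplification_odds_ratio(n_genes, n_amplified, n_overexpressed, n_both,
                             min_expected=2.0):
    """Odds ratio for overexpression inside versus outside amplified regions.

    n_genes:         all genes with SAGE and array CGH data in the sample
    n_amplified:     genes located in amplified segments
    n_overexpressed: genes called overexpressed
    n_both:          genes that are both amplified and overexpressed
    Returns None when the expected overlap is below min_expected.
    """
    # Overlap expected by chance if the two properties were independent.
    expected = n_amplified * n_overexpressed / n_genes
    if expected < min_expected:
        return None

    a = n_both                                  # amplified, overexpressed
    b = n_amplified - n_both                    # amplified, not overexpressed
    c = n_overexpressed - n_both                # not amplified, overexpressed
    d = n_genes - n_amplified - n_overexpressed + n_both  # neither
    if b == 0 or c == 0:
        return float("inf")
    return (a * d) / (b * c)


if __name__ == "__main__":
    # Hypothetical sample: 1,000 genes, 100 amplified, 50 overexpressed, 20 both.
    print(amplification_odds_ratio(1000, 100, 50, 20))
```

In practice the function would be called three times per sample, once for each amplification level (+, ++, and +++).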

Quantitative real-time PCR. Quantitative PCR primers were designed to amplify products of 100 to 150 bp (sequences of primers for the genes analyzed are listed in Supplementary Table S4). Quantitative PCR was done on an MJ Research Chromo 4 (Bio-Rad, Hercules, CA). Briefly, PCR reactions were done in a total volume of 25 μL composed of 1× PCR buffer [16.6 mmol/L (NH4)2SO4, 67 mmol/L Tris (pH 8.8), 6.7 mmol/L MgCl2, 10 mmol/L β-mercaptoethanol] containing 2 ng of genomic DNA, 0.5 μmol/L of each primer, 0.5 mmol/L deoxynucleotide triphosphates, 0.5 mg/mL bovine serum albumin, 1 μL 1:1,500 diluted SYBR Green I, and 0.2 μL Platinum Taq (Invitrogen, Carlsbad, CA). The cycling conditions were 10 minutes at 95°C followed by 40 cycles of 15 seconds at 95°C and 1 minute at 58.1°C with a plate read at the end of each cycle. Composition of PCR products was examined by generating melting curves. The relative gene copy number was calculated by the comparative Ct method (27) and normalized to normal human genomic DNA and to a nonamplified gene (PVR on chromosome 19q13) from the same sample. Based on array CGH data, PVR showed no copy number changes in any of the samples. Quantitative PCR using cDNA templates was similarly done using RPL39 (ribosomal protein L39) as control for normalization.
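The comparative Ct calculation can be sketched in a few lines of Python. The sketch assumes the standard 2^(−ΔΔCt) form with ideal (100%) amplification efficiency; the function name and the example Ct values are hypothetical.

```python
def relative_copy_number(ct_target_tumor, ct_ref_tumor,
                         ct_target_normal, ct_ref_normal):
    """Relative copy number by the comparative Ct method (2 ** -ddCt).

    The reference is a nonamplified gene (PVR in the text) and the
    calibrator is normal human genomic DNA. Assumes each PCR cycle
    doubles the product for both primer pairs.
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor      # dCt in the tumor
    delta_normal = ct_target_normal - ct_ref_normal   # dCt in normal DNA
    return 2.0 ** -(delta_tumor - delta_normal)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 2 cycles earlier
    # than the reference in the tumor but at the same cycle in normal DNA.
    print(relative_copy_number(22.0, 24.0, 24.0, 24.0))  # 4.0
```

A value of 1.0 indicates the same target-to-reference ratio as in normal DNA, and values above 1.0 indicate relative gain of the target gene.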

Fluorescence in situ hybridization. Bacterial artificial chromosome clones flanking or containing TEL/ETV6, KRAS2 (RP11-37P8), H2AFJ (RP-911J12), or EPS8 (RP-878D15) were obtained from Invitrogen/Research Genetics (Carlsbad, CA). The bacterial artificial chromosomes in the TEL/ETV6 probe are RP11-144O23 (AC006518) and RP11-267J23 (AC007537). The CEP4, CEP12 (D12Z3), and TEL/AML1 probes were obtained from Vysis, Inc. (Downers Grove, IL). The TEL/AML1 mix contains a TEL (spectrum green) probe at 12p13 that begins between exons 3 and 5 of TEL and extends ∼350 kb towards the telomere of 12p and an AML1 (spectrum red) probe at 21q22 spanning the entire AML1 gene. Touch preparations from the frozen tissues were prepared as follows: tissue was cut with a razor blade, and the freshly cut surface was touched gently against the surface of a clean glass slide in several places and air-dried. Cells were fixed in cold 70% ethanol at 4°C for 2 hours, dehydrated in an ascending ethanol series, and air-dried. After probe application, both tissue and probe DNAs were denatured simultaneously at 80°C for 2 minutes. Hybridization was carried out overnight at 37°C. Hybridizations of metaphase chromosomes obtained from the HCC1937 and ZR75-1 cell lines were done as described (28). Metaphase chromosomes and interphase nuclei were stained with 4′,6-diamidino-2-phenylindole.

Array CGH analysis. cDNA array CGH was used to analyze copy number changes in 30 breast tumors (10 DCIS, 18 IDCs, and two lymph node metastases) along with five nonmalignant cells purified from normal breast tissue or breast carcinomas. The array used in this study (Agilent Human 1 clone set) covers >9,000 unique map positions with a median interval of about 100 kb between mapped elements. Typical array CGH profiles after normalization and Circular Binary Segmentation (see Materials and Methods for details) are depicted in Fig. 1A. Overall, individual cDNA log 2 ratios are scattered with the majority of log 2 ratios ranging from −0.5 to 0.5 even when using normal DNA. However, segmented array CGH data of nonmalignant samples clearly showed lack of statistically significant copy number gains and losses, whereas that of tumor samples identified multiple genetic alterations (Fig. 1A). To identify genomic areas that are recurrently amplified or deleted, we generated plots summarizing copy number alterations in all tumors and in separate DCIS and IDC groups (Fig. 1B). This overview shows that low-fold copy number gains and losses (segment values >0.09 or less than −0.08) affect nearly all chromosomes, whereas most high-fold amplifications and deletions (>0.5 or less than −0.35) correspond to previously identified regions (1q21, 8q24, 11q13, 12q13, 15q26, 17q21, 20q13 and 1p32, 11q11-12, 13q, 16q24, and 17p13, respectively). Increased log 2 ratio for chromosome X was observed in all samples due to the use of male genomic DNA as reference, whereas all breast tissue samples were obtained from females. The highest level amplification (>50-fold, calculated based on array CGH log 2 ratio; refs. 17, 18) was found at 15q26.3, presumably targeting IGF1R (29); but because this amplification was present in only one tumor (IDC-B17), it is not prominent in the recurrence chart (Fig. 1B), whereas 17q21 harboring ERBB2 clearly stands out. We also identified areas of amplifications, such as 12p13 and 16p13, that have not been characterized in detail and may harbor novel breast cancer oncogenes (Fig. 1B). The segmentation algorithm (Circular Binary Segmentation; ref. 24) effectively disregards single cDNA probes possessing aberrant high CGH log 2 ratios but identifies sets of adjacent probes with an altered average log 2 ratio. This segmental filtration can only detect amplicons that are composed of more than two genes. However, we cannot exclude the possibility that single highly aberrant log 2 ratios could represent very small focal amplifications not captured in Fig. 1B.

Figure 1.

cDNA array CGH profiling of breast tumors. A, representative array CGH profiles of a normal and a tumor sample. Black dots represent raw log 2 ratios, and the red lines represent data after segmentation. B, recurrence of chromosomal alterations. Integer value recurrence of copy number alterations in segmented data (y axis) is plotted for each probe aligned along the x axis in chromosome order. Dark red or green bars denote gain or loss of chromosome material, respectively; bright red or green bars represent probes within regions of higher-level amplification or deletion, respectively (see Materials and Methods). Blue stars mark copy number alterations that are statistically significantly different between DCIS and IDC. C, clustering of normal and tumor samples based on copy number gain. “Gained” regions with corresponding segmented log 2 ratio of ≥0.09 were identified, and raw data from these regions were used along with a filter of at least one sample having a log 2 ratio of ≥0.5. Red and green signal represents genes with copy number gains and losses, respectively; black areas correspond to clones that were removed from the sample using the “Gain” filter. Depicted chromosome length is proportionate to the number of cDNA probes contained in each region. Normal samples (green) cluster together and are devoid of statistically significant chromosomal changes, whereas DCIS (blue), invasive (red), and metastatic (black) tumors do not form distinct clusters according to tumor stage. Colored rectangles indicate tumor grade (red, high; purple, intermediate; blue, low grade), lymph node (LN), ER, and HER2 status (red, positive; blue, negative; gray, unknown). D, identification of MCRs at 8q21 and 8q24. Raw array CGH log 2 ratios and segmented data of tumors IDC-C7, IDC-C2, and DCIS5. MCRs are marked with green lines, and their potential targets WWP1 and CMYC/PVT1 are indicated. E, identification of amplicons and MCRs at 12p13. Normalized array CGH log 2 ratios and segmented data of tumors IDC1, LN1, and IDC-C6. MCRs are marked with green lines, and their potential targets TEL/ETV6, H2AFJ, EPS8, and KRAS2 are indicated.


In addition to the combined analysis of all 30 tumors, we also analyzed the 10 DCIS and 18 invasive tumors as separate groups with the aim of identifying genetic alterations potentially involved in the in situ to invasive carcinoma transition. Copy number changes were readily detected by array CGH in DCIS and in some cases were extremely prevalent across the whole genome (e.g., in DCIS5; Fig. 1C), consistent with prior studies describing a high degree of genomic instability at this early stage of breast tumorigenesis (30). However, with the exception of the 1q and 17q21/ERBB2 amplicons, an overall trend toward an increase in the number and amplitude of gains and losses from DCIS to IDC was observed (Fig. 1B, middle and lower). Unsupervised clustering of filtered raw data (log 2 ratio above 0.09), including all amplified genes with at least one sample having a log 2 ratio above 0.5, did not identify clear DCIS and IDC clusters nor did the tumors cluster according to grade or ER status (Fig. 1C). However, statistical analyses determined that gain of 5q, chromosome 7, 11q, 16p, and 20p was statistically significantly (P < 0.05, not corrected for multihypothesis testing) more likely to be detected in IDC than in DCIS, whereas loss of chromosome 9 preferentially occurred in DCIS (Fig. 1B). Similarly, ER+ invasive tumors were more likely to have 1q and 11q gain than ER− ones, whereas we did not detect any statistically significant association between a specific amplification event and tumor grade, potentially because the majority of our tumors (22 of 30) were high grade (Fig. 1C).

Identification of MCRs of amplifications. MCRs of amplifications from the predicted copy number gains were identified using a recently described algorithm (23). The 49 highest ranked MCRs and their known and candidate targets are listed in Table 1, and examples of corresponding array CGH profiles with the MCRs indicated are depicted in Fig. 1D and F. Among these highest ranked MCRs are many known amplicons frequently (10-30% of tumors) amplified in breast cancer, including 17q21 (ERBB2), 8q24 (MYC), 11q13 (GARP and EMSY), 8p12 (FGFR1), and 20q13 (BCAS1 and ZNF217). We also detected less frequent amplicons, including 5q35 (FGFR4) and 15q25.6 (IGF1R), that are amplified in 3% to 5% of tumors. Although the majority of MCRs are fairly large (0.5-10 Mb), a few of them are small (<500 kb) and contain a limited number (5-10) of genes, representing attractive regions for further study.

To validate our array CGH results, we did quantitative real-time PCR analysis of selected candidate genes in primary breast tumors and breast cancer cell lines (Table 2). Amplification of most candidates was confirmed, although the fold amplification determined by array CGH and quantitative real-time PCR was not always in perfect agreement.
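To make the comparison between the two measurements concrete, a log 2 ratio can be converted to a linear fold change and set next to the quantitative PCR value. The sketch below is illustrative only: it uses the ETV6 values listed in Table 2A and assumes that the quantitative PCR copy numbers are expressed relative to a normal reference. The function name is hypothetical.

```python
def log2_ratio_to_fold(log2_ratio):
    """Convert a tumor/normal log 2 ratio to a linear fold change."""
    return 2.0 ** log2_ratio

# (sample, array CGH log 2 ratio, quantitative PCR copy number) for ETV6, from Table 2A.
etv6 = [("IDC-C6", 0.87, 2.8), ("IDC1", 1.76, 3.7), ("LN1", 2.17, 4.9)]

for sample, log2_ratio, qpcr in etv6:
    fold = log2_ratio_to_fold(log2_ratio)
    print(f"{sample}: array CGH fold {fold:.2f}, quantitative PCR {qpcr}")
```

The converted array CGH values track the quantitative PCR values but tend to be somewhat lower, in line with the imperfect agreement noted above.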

Table 2.

Quantitative PCR validation of selected genes from selected MCRs in primary breast tumors (A) and breast cancer cell lines (B)

A.
Chromosome | Gene | Band | Sample | CGH, log 2 ratio | Quantitative PCR, copy no. | AmpPredict
VAMP8 2p12 DCIS5 1.01 1.2     
   IDC1 1.32     
   LN1 0.94 1.1     
   IDC-C47 0.95 0.8     
TOX 8q12 IDC-C10 0.906 3.2     
   IDC-C22 0.858 2.2     
   IDC-C7 1.33 1.9     
   DCIS5 1.168 1.6     
KIAA0196 8q24 DCIS5 2.292 5.8     
   IDC-C6 1.05 5.4     
   IDC-C10 1.443 3.4     
   DCIS-D9728 0.75     
12 ETV6 12p13 IDC-C6 0.87 2.8     
   IDC1 1.76 3.7     
   LN1 2.17 4.9     
12 KRAS2 12p12 IDC-C6 -0.3     
   IDC1 1.39 3.6     
   LN1 1.83     
12 RAP1B 12q14 DCIS5 2.3 ND     
   IDC-C4 2.31 1.3     
   IDC-C47 4.07 0.7     
17 MLLT6 17q12 LN1 1.465 7.4     
   IDC1 1.153 4.5     
   I-EPI-7 2.329 1.1     
17 NGFR 17q21 IDC-C2 3.271 11     
   IDC-C3 2.374     
   D-EPI-3 0.98 3.4     
   DCIS4 0.943     
20 CEBPB 20q13 IDC-B17 1.974 1.7     
   IDC-C7 0.852 ND     
   DCIS5 0.782 2.5     
   D-EPI-7 1.032     
21 COL6A1 21q22 DCIS5 1.986 6.5     
   DCIS4 0.763 1.3     
21 AIRE 21q22 DCIS5 1.05 2.9     
   DCIS4 0.557 1.1     
22 HDAC10 22q13 DCIS4 1.56 5.5     
   IDC-C22 0.982 1.3     
22 MLC1 22q13 DCIS4 1.06 5.1     
   IDC5 0.834 1.5     
           
           
B.
Cell line | 8q13 BIG1 | 8 Y196 | 8q24 ZHX1 | 20q13 CEBPB | 12p13 ETV6 | 12p13 KRAS2 | 12p11 SURB7 | 21q22 COL6A1 | 22q13 HDAC10 | 22q13 MLC1
21MT1 2.1 2.7 3.6 1.4 1.4 1.5 2.3 0.7 1.2 1.6 
21NT 2.4 2.0 2.2 0.7 0.8 1.3 1.1 0.3 1.2 1.3 
BT-20 1.5 4.8 3.7 4.8 1.1 1.8 1.6 0.8 0.9 1.5 
BT-474 1.3 1.1 1.9 4.4 0.8 1.4 1.2 0.7 0.7 1.3 
BT-549 1.9 1.6 2.1 0.9 0.7 1.0 1.0 0.9 0.7 0.8 
HCC1937 3.1 4.3 5.1 1.5 5.7 3.4 3.8 1.0 1.3 1.2 
Hs578T 1.6 1.5 1.2 1.7 0.9 1.1 1.3 0.6 0.9 2.4 
MCF10DCIS 0.7 1.5 1.3 0.5 0.9 1.3 0.9 0.6 1.6 1.2 
MCF-7 1.1 3.9 3.1 2.0 0.9 1.3 1.1 1.7 0.8 0.6 
MDA-MB-157 1.3 2.0 1.7 6.0 0.2 0.3 0.4 2.3 0.6 0.6 
MDA-MB-231 0.7 0.3 1.6 0.7 0.8 0.8 0.7 0.7 0.7 0.8 
MDA-MB-435S 1.3 1.8 1.6 1.4 0.6 0.8 0.9 1.3 1.4 1.5 
MDA-MB-468 1.0 1.1 1.4 0.7 0.9 0.9 1.0 1.4 1.1 1.4 
SK-BR-3 1.0 2.9 8.5 1.4 0.7 0.5 0.7 1.0 0.8 2.3 
SUM-44 1.8 1.8 1.6 0.5 1.0 0.8 0.9 0.9 0.7 0.6 
SUM-52 0.5 1.5 1.6 0.8 1.2 1.0 1.4 0.6 0.5 0.3 
SUM-102 1.4 1.2 1.1 0.4 1.4 1.1 1.8 0.5 0.7 1.5 
SUM-1315 1.2 1.2 1.3 0.4 0.6 0.5 0.7 0.7 0.9 1.8 
SUM-149 1.1 1.5 1.8 1.0 0.8 1.7 1.9 0.9 1.8 2.1 
SUM-159 1.1 0.7 0.2 0.9 0.9 0.9 1.1 0.8 1.0 1.5 
SUM-185 0.8 0.8 1.0 0.6 0.7 0.6 0.9 1.0 1.0 1.0 
SUM-190 2.3 1.5 1.4 0.7 1.2 1.1 1.3 1.4 1.8 1.7 
SUM-225 0.9 3.2 3.3 1.3 1.0 1.1 1.4 1.8 9.9 13.6 
SUM-229 1.4 1.8 1.8 1.2 1.0 0.6 0.7 1.3 1.1 0.8 
T47-D 1.2 1.6 1.5 0.9 0.3 1.0 0.9 0.6 0.5 0.4 
UACC-812 1.0 1.8 1.4 1.1 0.8 0.7 0.8 1.4 0.7 0.7 
UACC-893 1.6 2.9 2.6 0.3 1.5 1.1 1.7 0.4 0.4 0.4 
ZR-75-1 1.2 1.5 1.6 4.2 2.7 1.5 2.1 1.1 0.7 0.7 

NOTE: Chromosomal location, genes, tumor or cell line name, and copy numbers predicted based on array CGH (A) and quantitative PCR are listed. The “AmpPredict” column denotes whether the gene is predicted to be amplified based on the segmentation algorithm.

Overall correlation between gene amplification and overexpression. We also analyzed the overall contribution of copy number gain to gene expression changes by calculating the percentage of overexpressed genes that are located in amplicons and the percentage of genes present in amplicons that are overexpressed within each tumor. Among the 30 tumor samples with array CGH profiling data, we had SAGE libraries for 14 of them. In addition, we had SAGE libraries from two different cases of normal luminal mammary epithelial cells to be used for comparisons. Odds ratios for overexpressed genes within amplicons were determined for each sample (see Materials and Methods and Supplementary Table S1). For amplifications defined by segment values >0.09 (+ level), the mean odds ratio among all samples is 1.85 (0.52-2.96). In other words, genes in low-level amplified regions are nearly twice as likely to be overexpressed as genes in nonamplified regions. Raising the amplification threshold to 0.2 (++) and 0.5 (+++) levels leads to a stronger association, with odds ratios of 2.14 (0.27-4.47) and 3.97 (1.49-10.65), respectively, supporting the idea that higher levels of amplification lead to greater overexpression of the contained genes. Contrary to previous studies reporting high overall association between gene expression and amplification (18, 19), we found that the correlation between gene amplification and overexpression is highly variable among tumors, suggesting different mechanisms of gene activation depending on tumor subtype (Supplementary Table S1). The difference between our results and those of previous studies could be due to the use of different platforms for expression profiling, the setting of more stringent criteria for “overexpression,” the use of purified normal mammary epithelial cells as reference, and the fact that SAGE tag numbers predict absolute mRNA copy numbers, whereas cDNA array data reflect relative mRNA levels.
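The per-sample odds ratio described above can be sketched as a calculation on a 2 x 2 table of genes (amplified or not, overexpressed or not). The counts below are hypothetical and the function name is an assumption; the actual definitions of "amplified" and "overexpressed" are those given in Materials and Methods.

```python
def odds_ratio(amp_over, amp_not_over, nonamp_over, nonamp_not_over):
    """Odds of overexpression for amplified genes divided by the odds for nonamplified genes."""
    if amp_not_over == 0 or nonamp_over == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (amp_over * nonamp_not_over) / (amp_not_over * nonamp_over)

# Hypothetical gene counts for one tumor (not from the paper):
# 60 of 400 amplified genes and 500 of 6,000 nonamplified genes are overexpressed.
or_sample = odds_ratio(amp_over=60, amp_not_over=340,
                       nonamp_over=500, nonamp_not_over=5500)
print(f"odds ratio = {or_sample:.2f}")
```

Repeating the calculation with amplification defined at the +, ++, and +++ thresholds would give one odds ratio per sample and per threshold, which could then be averaged across samples as in the text.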

Identification of candidate targets of MCRs. To identify candidate targets of amplicons based on gene expression patterns, we tested the genes in each MCR based on SAGE data obtained from the same samples by comparing them with data from three normal controls: two from purified normal mammary epithelial cells and one from normal breast organoid (details of the statistical analysis are described in Materials and Methods). Using this combinatorial approach, we were able in many cases to narrow down the number of candidate targets to a few genes. A complete gene list and statistical analysis for each MCR is available as a Supplementary Data File. It is noteworthy that previous gene expression profiling and array CGH studies did not use normal mammary epithelial cells for reference (18, 19). Thus, we believe our analysis is more reliable in detecting overexpressed oncogenes. On the other hand, a limitation of the SAGE method is that at the usual sequencing depth (∼50,000 tags per library), it is not able to identify targets with low overall expression levels. In addition, regardless of the method used, the magnitude of the gene dosage effect is not necessarily a predictor of functional relevance in tumorigenesis.
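The depth limitation mentioned above can be illustrated numerically. The sketch below assumes that tag sampling is approximately Poisson, which is a modeling assumption introduced here and not a statement from the study; the function name and the example abundances are hypothetical.

```python
from math import exp

def detection_probability(tags_per_100k, library_size, min_tags=1):
    """Probability of observing at least `min_tags` tags for a transcript.

    Assumes Poisson sampling with mean = abundance x library size.
    """
    expected = tags_per_100k / 100_000 * library_size
    p_below = 0.0
    term = exp(-expected)  # Poisson P(X = 0)
    for k in range(min_tags):
        p_below += term
        term *= expected / (k + 1)  # advance to P(X = k + 1)
    return 1.0 - p_below

# A library of about 50,000 tags, as mentioned in the text.
for abundance in (1, 5, 20):
    p = detection_probability(abundance, 50_000)
    print(f"{abundance} tags per 100,000: P(detected) = {p:.3f}")
```

Under this assumption, a transcript present at about 1 tag per 100,000 would be missed in most libraries of this size, whereas moderately abundant transcripts would almost always be sampled.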

To validate our approach, we first applied our statistical analysis to fairly well characterized amplicons, including 17q21, 11q13, and 8q24. The 17q21/ERBB2 MCR contains 21 genes, and among the 14 tumors with SAGE data, eight showed amplification of this 17q21 MCR. Using our method, we predicted that the strongest candidate of this MCR was ERBB2 based on the P values obtained in the four different statistical tests and the fold overexpression (Supplementary Table S2, top). However, in addition to ERBB2, the neighboring gene PERLD1 was also identified as a potential candidate target. Correlating with our data, a recent detailed characterization of the ERBB2 amplicon at the copy number [fluorescence in situ hybridization (FISH) on tissue microarrays] and gene expression (real-time PCR) levels also found that ERBB2, PNMT, and PERLD1 (MGC9753) show the best correlation between amplification and overexpression (31).

The same analysis done in the 11q13.3-5 MCR identified E2IG4 as the strongest candidate target, immediately adjacent to EMSY, a recently proposed target of this amplicon (refs. 32, 33; Supplementary Table S2, bottom). Interestingly, E2IG4 was previously identified as a gene induced by estrogen in the MCF-7 breast cancer cell line, and it encodes a secreted protein with leucine-rich repeats. Its exact function and potential role in breast cancer are unknown (34). EMSY was recently identified as a BRCA2-interacting protein and a target of the 11q13 amplicon in breast and ovarian carcinomas (32, 33). The amplification and overexpression of EMSY may compromise BRCA2 function in sporadic tumors. Although our SAGE data did not support EMSY as the target, the fact that our method pinpointed the neighboring gene E2IG4 confirmed the location of target(s) in this particular amplicon. Overexpression of E2IG4 and EMSY should be further validated using other means, such as quantitative RT-PCR or Northern hybridization.

On the other hand, our analysis of the 8q24.3 amplicon did not identify the known presumed target, MYC (Fig. 1D; Supplementary Table S3). Among the five tumors with amplification of this area, none of them showed significant overexpression of MYC compared with normal mammary epithelial cells or tumors that lacked amplification, suggesting that it may not be the best candidate target of this MCR in these breast tumors. A recent study analyzing myeloid malignancies with 8q24 amplification also concluded that MYC is not overexpressed in the amplified tumors and thus may not be the only target of this amplicon (35). Similarly, cDNA array CGH and gene expression analysis of breast carcinomas showed that only two of eight tumors with MYC amplification had increased expression of its mRNA, again raising the question whether MYC is the only target of the 8q24 amplicon in breast cancer (18). A previous study using an inducible MYC expression model showed that MYC expression was not fully required for tumor progression in the presence of additional genomic changes, such as KRAS2 mutation (36), raising the possibility that MYC overexpression in tumors might be transiently maintained. We identified KIAA0196 as another potential target of this region, correlating with recent findings that KIAA0196 was both amplified and overexpressed in prostate cancer (37, 38).

Systematic characterization of the 12p13-p12 amplicon. In addition to these previously described and well-characterized amplicons, we also identified several chromosomal areas with high-level copy number gains that have not previously been characterized in detail in breast cancer. Among these, we further characterized a 1.8-Mb amplicon at 12p13-p12 found in three tumors (Fig. 1F; Table 3), because it showed one of the highest levels of amplification (comparable with that of the ERBB2 amplicon). KRAS2, an oncogene that is located in the 12p13-p12 amplicon, has previously been described as a gene amplified in a subset of breast carcinomas based on a limited array CGH screen for the amplification of known oncogenes (20), but this amplicon has not been systematically characterized at high resolution. Based on our integrated array CGH/SAGE analysis, we identified three candidate target genes in this region: H2AFJ, EPS8, and KRAS2. However, we also considered TEL/ETV6 as a candidate target, because it is a known oncogene (39-41), and translocation of TEL/ETV6 to NTRK3 on 15q25 was reported in >90% of secretory breast cancers (42, 43) and even in rare cases of IDCs (44). TEL/ETV6 encodes a member of the ETS family of transcription factors with an NH2-terminal oligomerization (PNT) and a COOH-terminal DNA binding (ETS) domain (40). TEL/ETV6 translocation was frequently found in myeloid and lymphoid leukemias and in solid tumors (39, 40, 45), and recently, amplification of TEL/ETV6 was reported in a myelodysplastic syndrome (46). KRAS2 is another attractive target due to the well-established roles of Ras family proteins in tumorigenesis. Mutation of RAS genes is infrequent in human breast carcinomas (47, 48), but amplification of KRAS2 has previously been reported in 10 of 27 cases of breast carcinomas and breast cancer cell lines (20) and in one case of metastatic rectal carcinoma (21).

Table 3.

Identification of candidate genes in the 12p13 amplicon using SAGE data

Chromosome 12 position | Gene | SAGE, by sample with amplification status (N-EPI-1, N-EPI-2, N-ORG-1, − DCIS1, − D-EPI-3, − DCIS4, + DCIS5, − D-EPI-6, − D-EPI-7, − DCIS8, +++ IDC1, − IDC3, − IDC5, − IDC6, − I-EPI-7, +++ LN1, − LN2) | Tag | + level: P (A/NA/N), P (A/NA,N), P (A,NA/N), P (A/N), Fold | +++ level: P (A/NA/N), P (A/NA,N), P (A,NA/N), P (A/N), Fold
9896236 CLECSF2 0 0 0 AGAGGGAGTG NA NA NA NA NA NA NA NA 
9992003 FLJ46363 0 0 0 AAAATTTCAC NA NA NA NA NA NA NA NA 
10015281 MICL 0 0 0 CATTTATTAC NA NA NA NA NA NA NA NA 
10036942 CLEC2 0 0 0 CCTCGGAAAT NA NA NA NA NA NA NA NA 
10114349 CLEC1 0 0 0 CCGTTTCCCA NA NA NA NA NA NA NA NA 
10160649 CLECSF12 0 0 0 TGCTGATTTG NA NA NA NA NA NA NA NA 
10202169 OLR1 0 0 0 CATACTACAA 0.819 NA NA NA 0.864 NA NA NA 
10222898 FLJ31166 0 0 0 TTCCCATTTA NA NA NA NA NA NA NA NA 
10351684 KLRD1 0 0 0 TCACTATGCC NA NA NA NA NA NA NA NA 
10416221 KLRK1 0 0 0 TTGTATAAAT NA NA NA NA NA NA NA NA 
10451250 KLRC4 0 0 0 CTTCTATAAA NA NA NA NA NA NA NA NA 
10633039 KLRA1 0 0 0 CAGAAGAAAG NA NA NA NA NA NA NA NA 
10648061 FLJ10292 17 0 0 0 GGTTGGACAG NA NA NA NA NA NA NA NA 
10662805 STYK1 0 0 0 TATTGTTCAT NA NA NA NA NA NA NA NA 
10845480 TAS2R7 0 0 0 CATCTCTAAA NA NA NA NA NA NA NA NA 
10853003 TAS2R9 0 0 0 AGGGCCATAA NA NA NA NA NA NA NA NA 
10869212 TAS2R10 0 TTTTACTGTG NA NA NA NA NA NA NA NA 
10889749 PRR4 0 0 0 ACATTGAAAT NA NA NA NA NA NA NA NA 
10952253 TAS2R13 0 0 0 CTTTGTGAGA NA NA NA NA NA NA NA NA 
10982120 TAS2R14 0 0 15 0 TGTGTATGTA 0.819 NA NA NA 0.864 NA NA NA 
11040812 TAS2R49 0 0 0 GCAAAGGATC NA NA NA NA NA NA NA NA 
11065538 TAS2R48 0 0 0 TATCCTTCAT NA NA NA NA NA NA NA NA 
11310125 PRB3 0 0 0 ACATTGGAAG NA NA NA NA NA NA NA NA 
11396024 PRB1 0 0 0 ACATTGGAAA NA NA NA NA NA NA NA NA 
11694055 ETV6 0 0 0 GTGTTTTTGT 0.819 NA NA NA 0.864 NA NA NA 
12115145 BCL2L14 0 0 0 TGTTTCCACT NA NA NA NA NA NA NA NA 
12401322 LOH12CR1 0 0 0 AAAATCTGAC NA NA NA NA NA NA NA NA 
12520098 DUSP16 0 0 0 GTGGCATCTG NA NA NA NA NA NA NA NA 
12705263 GPR19 0 GCCAAAACTA NA NA NA NA NA NA NA NA 
12761576 CDKN1B 11 13 0 14 32 0 TTTTGTGCAT NA NA NA NA 0.907 NA NA 0.945 
12829882 DKFZP434F0318 14 12 0 0 TCAAGCAATC 0.399 NA NA 0.423 0.732 NA NA NA 
12935620 RAI3 39 16 45 13 44 20 39 42 26 72 41 47 GTGGTGGCAG 0.215 NA NA 0.441 0.397 NA NA 0.452 
12984976 GPRC5D 0 0 0 GGTTTTCCTG NA NA NA NA NA NA NA NA 
13019071 HEBP1 19 16 0 27 0 TTCCATATAC 0.718 NA NA 0.423 0.599 NA NA NA 
13044635 KIAA1467 12 0 0 13 CTCCTTTCTT 0.681 NA NA 0.423 0.478 NA NA NA 
13127761 GSG1 0 0 0 TGCTTAAGCC NA NA NA NA NA NA NA NA 
13415290 FLJ33810 0 0 0 GCAGGTTGTG NA NA NA NA NA NA NA NA 
13605411 GRIN2B 0 0 0 TCCCTGGACG NA NA NA NA NA NA NA NA 
14656843 GUCY2C 0 0 0 AATCAGATGT NA NA NA NA NA NA NA NA 
14818562 H2AFJ 90 48 18 208 26 51 168 87 233 65 885 90 147 106 29 1011 375 GAGGGCCGGT 0.005 0.129 0.062 0.103 13 0 0 0.062 0.006 18 
14847773 MGC47869 0 0 0 TGCTATGTTA 0.819 NA NA NA 0.864 NA NA NA 
14848851 LOC440087 0 0 0 AAAGACTTTA NA NA NA NA NA NA NA NA 
14926095 MGP 344 177 14 13 333 163 56 63 487 28 67 142 372 125 53 34 GTTTATGGAT NA NA NA NA NA NA NA NA 
14986232 ARHGDIB 13 17 0 0 34 14 0 CTGGCCCGAG 0.366 NA NA NA 0.486 NA NA NA 
15017245 PDE6H 0 0 0 AGCTCGCTCA NA NA NA NA NA NA NA NA 
15046034 LOC440088 0 0 0 AACGATTGGG NA NA NA NA NA NA NA NA 
15151985 RERG 0 0 0 ACTTATTTTG NA NA NA NA NA NA NA NA 
15366754 PTPRO 0 0 0 GATATACAAC NA NA NA NA NA NA NA NA 
15664365 EPS8 23 0 50 85 0 34 37 18 16 162 7 30 53 0 272 17 AGTCAGCTGG 0.008 0.113 0.355 0.093 6 0.001 0.077 0.355 0.061 9 
15926613 STRAP 34 16 27 13 30 28 45 43 56 19 17 36 278 30 ATAAAGTAAC 0.365 NA NA 0.584 0.012 0.327 0.888 0.355 
24855562 BCAT1 0 0 0 GAATAATTGT NA NA NA NA NA NA NA NA 
25037625 LOC196415 0 0 0 CAGATAATCC NA NA NA NA NA NA NA NA 
25152490 CASC1 0 0 0 GTGAAAGACA 0.819 NA NA NA 0.864 NA NA NA 
25249447 KRAS2 14 0 11 42 71 AACTGTACTA 0.014 0.217 0.065 0.19 8 0 0.066 0.065 0.082 12 
25520283 FLJ36004 0 0 0 AGAGGGTGAA NA NA NA NA NA NA NA NA 
26003229 C12orf2 0 0 0 TTCACTAATT 0.647 NA NA NA 0.729 NA NA NA 
26164228 BHLHB3 13 12 0 0 CTATTTTTGT 0.513 NA NA 0.423 0.728 NA NA NA 
26239789 SSPN 14 0 0 0 GAGTAGCTGA NA NA NA NA NA NA NA NA 
26381609 ITPR2 0 0 0 AGAAATTCAG NA NA NA NA NA NA NA NA 
26949385 FLJ10637 0 0 0 CAGGAGCAAA 0.819 NA NA NA 0.864 NA NA NA 
26982583 FGFR1OP2 0 0 0 GACTGGAGAG NA NA NA NA NA NA NA NA 
27017390 TM7SF3 12 0 11 12 CCTGGAGTGG 0.2 NA NA 0.184 0.553 NA NA 0.5 
27066750 SURB7 0 0 0 TAATACATTA NA NA NA NA NA NA NA NA 
27135702 LOC440091 0 0 0 AAGCTCCCCC NA NA NA NA NA NA NA NA 
27288373 STK38L 0 0 30 ATGCAAATTA 0.109 NA NA 0.423 10 0.017 0.5 0.336 0.5 15 
27377255 ARNTL2 0 0 0 GCTGCATTTA NA NA NA NA NA NA NA NA 
27568312 PPFIBP1 13 0 20 0 27 0 CCCGGCCCAA 0.365 NA NA NA 0.485 NA NA NA 
27740695 REP15 0 0 0 CTGGAATGAT NA NA NA NA NA NA NA NA 
27754996 MRPS35 0 15 0 12 ACTGCTGTCT 0.685 NA NA 0.423 0.455 NA NA 0.5 
28002284 PTHLH 0 0 39 0 TAAAAATAAC 0.693 NA NA NA 0.765 NA NA NA 
28014639 LOC440092 11 0 0 0 CAATGTGAAA NA NA NA NA NA NA NA NA 
28301400 FLJ11088 0 0 0 CAAAAGATCA NA NA NA NA NA NA NA NA 
29267865 MLSTD1 13 0 0 0 CAGAATGGAG 0.819 NA NA NA 0.864 NA NA NA 
29385227 PTX1 0 0 0 GAATTGGAAA NA NA NA NA NA NA NA NA 
29477951 OVCH1 0 0 0 CATATATGGG NA NA NA NA NA NA NA NA 
29550182 ARG99 0 0 0 TTCCCGCCTG NA NA NA NA NA NA NA NA 
30675001 IPO8 0 0 0 TGAGGCCTAT NA NA NA NA NA NA NA NA 
30753755 C1QDC1 0 0 0 ACTGATTGGT NA NA NA NA NA NA NA NA 
31118077 DDX11 0 0 0 ACTATAGAGA 0.819 NA NA NA 0.864 NA NA NA 
31324785 C12orf14 11 14 13 0 0 11 0 CACTTTGTAT NA NA NA NA NA NA NA NA 
31428985 MGC24039 0 0 0 CCGGTAATCT NA NA NA NA NA NA NA NA 
31703388 MGC50559 0 0 0 ATTTTAAATA NA NA NA NA NA NA NA NA 
31715340 LOC196394 0 0 0 AGCCAGTCTT NA NA NA NA NA NA NA NA 
32029259 FLJ10652 0 11 0 0 TGTAAGAAAT 0.65 NA NA NA 0.731 NA NA NA 
32151452 BICD1 0 0 0 TCTTCCTTCC NA NA NA NA NA NA NA NA 

NOTE: Normalized SAGE data (tags per 100,000) from 14 breast tumors, two normal mammary epithelial cells (N-EPI-1 and N-EPI-2), and one normal organoid (N-ORG-1) were analyzed to identify statistically significant differences in gene expression among amplified, nonamplified, and normal samples. Four statistical tests were done to calculate P values for difference between amplified and normal (A/N); among amplified (bold), nonamplified, and normal (A/NA/N); between amplified and nonamplified plus normal (A/NA,N); and between tumors and normal (A,NA/N) tissues. Amplification status of tumors was predicted by segmented CGH values (−, +, ++, and +++ for log 2 ratio <0.09, ≥0.09, ≥0.2, and ≥0.5, respectively). Statistical analyses were done at different levels in which tumors having only +, ++, +++, or above were considered as amplified (see Materials and Methods for details). Fold = fold increase in gene expression compared with normals. The Tag column lists the SAGE tag sequences used for the calculations. Candidate target(s) are highlighted in italic.

Abbreviation: NA, not applicable.
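As an illustration of how the Fold values in Table 3 can be read, the sketch below takes the H2AFJ tag counts and computes the ratio of mean expression in amplified tumors to mean expression in the three normal samples. The assignment of counts to samples follows the column order of the flattened table and is an assumption, as is the use of a simple ratio of means; the exact calculation is described in Materials and Methods. The function name is hypothetical.

```python
from statistics import mean

# Normalized SAGE tag counts (tags per 100,000) for H2AFJ, read from Table 3.
h2afj = {
    "N-EPI-1": 90, "N-EPI-2": 48, "N-ORG-1": 18,  # normal samples
    "DCIS5": 168, "IDC1": 885, "LN1": 1011,       # tumors called amplified
}
normals = ["N-EPI-1", "N-EPI-2", "N-ORG-1"]
# Amplification calls from the Table 3 header: DCIS5 is "+", IDC1 and LN1 are "+++".
amplified_plus = ["DCIS5", "IDC1", "LN1"]   # "+" level or above
amplified_3plus = ["IDC1", "LN1"]           # "+++" level only

def fold_over_normal(counts, amplified, normal):
    """Mean tag count in amplified samples divided by the mean in normal samples."""
    return mean(counts[s] for s in amplified) / mean(counts[s] for s in normal)

fold_plus = fold_over_normal(h2afj, amplified_plus, normals)
fold_3plus = fold_over_normal(h2afj, amplified_3plus, normals)
print(f"+ level fold: {fold_plus:.1f}; +++ level fold: {fold_3plus:.1f}")
```

Rounded to whole numbers, these ratios agree with the Fold entries of 13 (+ level) and 18 (+++ level) listed for H2AFJ in Table 3.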

Amplification of all four candidates in tumors predicted by array CGH was confirmed by quantitative PCR (Tables 2 and 4). Among an additional 16 tumor samples screened that did not have array CGH data, only one tumor had amplification of H2AFJ, EPS8, and KRAS2. Among 26 breast cancer cell lines screened, only HCC1937 and ZR-75-1 showed copy number gain in this region (Tables 2 and 4). Thus, amplification of 12p13-p12 occurs with a low frequency in breast cancers. We further did FISH to confirm amplification of the candidates at the single-cell level and to determine whether TEL/ETV6 is involved in a translocation, as in the case of secretory breast cancer and other tumor types. FISH displayed dramatic amplification of all four candidates in tumors IDC1 and LN1, but not in the control tumor LN2, which was also negative by array CGH (Fig. 2). FISH using two bacterial artificial chromosomes flanking the TEL/ETV6 gene labeled in red and green showed adjacent red and green signals in the tumors, suggesting that the TEL/ETV6 gene is not disrupted in these tumors. In the HCC1937 cell line, the majority of cells carried five copies of TEL/ETV6. Staining of metaphase chromosomes using centromere-specific probes showed that three copies of ETV6 were associated with chromosome 12, whereas the other two copies were associated with chromosome 4 and one as yet unidentified chromosome. Again, the TEL/ETV6 gene is not disrupted in the HCC1937 cell line. Therefore, translocation and fusion of TEL/ETV6 to NTRK3 or other genes is not a common event in breast cancers.

Table 4.

Validation of 12p13 amplicon candidate target genes in primary breast tumors and breast cancer cell lines

Sample | ETV6 copy no. | ETV6 Exp | H2AFJ copy no. | H2AFJ Exp | EPS8 copy no. | EPS8 Exp | KRAS2 copy no. | KRAS2 Exp | HER2 copy no. | HER2 Exp
Normal N050702 1.0 0.3 1.0 1.0 1.0 1.0 1.0 0.6 1.0 0.1 
 N051002 1.0 0.1 1.0 0.3 1.0 0.4 1.0 0.3 1.0 0.1 
 N052902 1.0 0.1 1.0 0.3 1.0 0.1 1.0 0.1 1.0 0.1 
 N061202 1.0 1.0 1.0 0.6 1.0 0.4 1.0 0.9 1.0 0.2 
 N062002 1.0 0.6 1.0 0.9 1.0 0.4 1.0 1.0 1.0 0.3 
Tumors BWH-T1 1.3 0.9 0.9 0.4 0.9 0.2 0.6 0.7 0.4 1.3 
 BWH-T3 1.4 1.5 1.5 0.6 1.2 1.2 0.7 0.9 0.5 2.9 
 BWH-T4 1.1 0.6 1.1 0.3 0.3 0.5 0.3 0.8 0.2 3.1 
 BWH-T7 0.6 1.0 0.4 1.0 0.0 1.1 0.2 2.8 0.1 6.8 
 BWH-T8 1.5 0.9 2.3 1.0 3.1 0.8 2.6 1.1 0.1 2.4 
 BWH-T15 0.5 0.3 0.7 0.4 0.6 0.6 0.3 0.7 0.3 4.6 
 BOT169 1.0 0.3 1.0 1.3 1.1 9.3 1.0 1.1 4.3 22.2 
 BWH-T18 0.8 0.3 0.7 0.9 0.7 0.4 0.5 0.4 0.4 0.9 
 BWH-T24 1.5 0.6 1.4 1.5 1.4 9.4 1.2 5.1 1.1 0.9 
 CT6 2.4 0.2 0.6 0.8 0.5 1.1 1.1 0.7 35.0 74.8 
 CT-36 0.8 1.5 1.2 7.5 1.3 1.1 0.6 1.0 0.9 23.1 
 CT-39 ND 0.2 ND 0.7 ND 0.6 ND 0.3 ND 0.4 
 CT-46 0.6 0.8 1.0 7.6 1.3 2.4 0.5 2.5 0.8 18.8 
 CT-47 0.8 0.6 1.0 0.9 1.0 0.9 0.6 0.5 0.8 55.4 
 CT-49 ND 1.6 ND 2.6 ND 2.1 ND 1.8 ND 5.6 
 IDC1 3.6 0.2 6.1 13.5 7.3 4.0 3.3 1.0 2.0 9.1 
 LN1 4.9 0.2 8.6 14.4 8.0 3.8 3.9 1.3 4.5 8.9 
 IDC2 ND 0.0 ND ND ND ND ND 0.0 ND 0.2 
 LN2 ND 0.0 0.7 15.7 1.0 0.3 ND 0.1 0.9 0.9 
 MGH-T3 0.8 0.1 1.1 0.4 1.6 0.1 0.3 0.1 0.1 1.6 
 MGH-T4 1.4 0.4 1.3 ND 0.3 ND 1.3 ND 1.3 107.0 
Cell lines 21MT1 1.4 0.7 1.2 3.7 1.3 2.5 1.5 0.5 ND 142.9 
 21MT2 1.0 0.6 1.1 2.7 0.9 3.7 1.1 0.6 ND 70.0 
 21NT 0.8 0.0 1.0 2.4 0.8 3.9 1.3 0.3 16.4 96.1 
 21PT 0.7 1.0 0.8 2.2 0.6 4.5 0.8 1.3 8.5 177.3 
 BT-20 1.1 0.2 0.5 3.5 0.7 1.6 1.8 0.8 0.0 0.0 
 BT-549 0.7 0.1 0.8 2.4 0.9 1.1 1.0 ND 0.5 142.4 
 HCC1937 5.7 0.7 1.2 3.9 0.9 6.2 3.4 1.1 0.3 0.6 
 Hs578T 0.9 0.2 0.7 0.3 0.5 1.3 1.1 ND 0.6 0.4 
 MCF-10A 0.9 0.1 0.5 0.4 0.4 0.3 1.3 ND ND 0.2 
 MCF10DCIS 0.9 0.2 0.5 0.4 0.4 0.4 1.3 ND ND 0.7 
 MCF-7 ND ND 0.7 ND 0.7 ND ND ND ND ND 
 MDA-MB-435 0.6 0.1 0.8 0.3 0.9 0.5 0.8 ND ND 0.1 
 MDA-MB-468 0.9 0.1 0.5 0.5 0.3 0.1 0.9 ND 0.3 0.1 
 SUM-44 1.0 0.1 0.8 0.0 0.9 0.7 0.8 ND ND 0.5 
 SUM-52 1.2 0.1 0.6 0.4 0.5 0.4 1.0 ND ND 0.9 
 SUM-102 1.4 0.1 1.4 0.3 0.9 0.3 1.1 ND 0.3 0.1 
 SUM-1315 0.6 0.2 1.1 1.8 0.8 2.8 0.5 ND 0.5 0.6 
 SUM-149 0.8 0.1 1.6 0.6 1.3 0.7 1.7 ND 0.3 0.3 
 SUM-159 0.9 0.2 1.1 4.8 0.7 2.9 0.9 ND 0.5 1.2 
 SUM-185 0.7 0.3 1.5 1.4 1.3 1.0 0.6 ND 0.7 290.6 
 SUM-190 1.2 0.4 0.9 1.8 1.0 1.2 1.1 ND 58.7 331.8 
 SUM-225 1.0 0.2 0.7 0.6 0.5 0.8 1.1 ND 34.8 265.5 
 SUM-229 1.0 0.1 0.6 0.0 0.6 0.8 0.6 ND ND 0.6 
 UACC-812 0.7 1.8 0.8 0.3 0.4 2.0 0.8 3.1 16.0 5,843.2 
 UACC-893 1.5 0.5 0.8 4.8 0.6 0.4 1.1 1.1 48.2 371.6 
 ZR-75-1 2.7 0.2 2.1 3.8 2.6 1.0 1.5 ND ND 1.7 

NOTE: Gene and sample names, copy numbers predicted based on quantitative PCR (copy no.) and overexpression determined by quantitative RT-PCR (Exp) are listed. Values predicting ≥2 fold copy number gain or overexpression are in boldface.

Abbreviation: ND, not determined.
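
The boldface that marks values of at least 2-fold in Table 4 does not carry over to plain text, but the same cutoff can be reapplied programmatically. The following Python sketch is illustrative only. It assumes that each row lists, after the sample name, the copy number and expression values for ETV6, H2AFJ, EPS8, KRAS2, and HER2 in that order, and the helper names (`parse_value`, `parse_row`, `flag_row`) are hypothetical, not part of the original analysis. Because the table reports rounded values, borderline calls (for example, exactly 2.0) may not match the original boldface.

```python
# Hypothetical helpers for re-flagging Table 4 values; names are illustrative.
GENES = ["ETV6", "H2AFJ", "EPS8", "KRAS2", "HER2"]  # assumed column order
THRESHOLD = 2.0  # ">=2 fold" cutoff taken from the table note


def parse_value(token):
    """Convert one table cell to float; 'ND' (not determined) becomes None."""
    if token == "ND":
        return None
    return float(token.replace(",", ""))  # handles cells such as '5,843.2'


def parse_row(line):
    """Return (sample, {gene: (copy_no, exp)}) for one flattened table row."""
    tokens = line.split()
    values = [parse_value(t) for t in tokens[-10:]]  # last 10 cells are data
    sample = tokens[-11]  # skips group labels such as 'Tumors' or 'Cell lines'
    pairs = {g: (values[2 * i], values[2 * i + 1]) for i, g in enumerate(GENES)}
    return sample, pairs


def flag_row(line):
    """Return (sample, {gene: (gain_flag, overexpression_flag)})."""
    sample, pairs = parse_row(line)
    flags = {
        gene: (
            copy_no is not None and copy_no >= THRESHOLD,
            exp is not None and exp >= THRESHOLD,
        )
        for gene, (copy_no, exp) in pairs.items()
    }
    return sample, flags


if __name__ == "__main__":
    sample, flags = flag_row("IDC1 3.6 0.2 6.1 13.5 7.3 4.0 3.3 1.0 2.0 9.1")
    for gene, (gain, over) in flags.items():
        print(sample, gene, "gain" if gain else "-", "overexpressed" if over else "-")
```

Cells marked ND are never flagged, so a missing measurement is not counted as either a gain or its absence.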

Figure 2.

FISH analysis of candidate targets of the 12p13 amplicon in breast tumors and breast cancer cell lines. FISH analysis using a commercially available TEL/AML probe in breast tumors IDC1 (A), LN1 (B), LN2 (C), IDC-C6 (D), and HCC1937 breast cancer cell line (E). A to E, only the TEL (green) signal was captured. E, inset, typical interphase nucleus observed in the majority of HCC1937 cells, showing five TEL/ETV6 signals per cell. FISH analysis using bacterial artificial chromosome probes centromeric (red) and telomeric (green) to TEL/ETV6 in tumor IDC-C6 (F) and metaphase chromosomes of HCC1937 breast cancer cell line (G). Colocalization of the two signals indicates integrity of the TEL/ETV6 chromosomal locus in both cases. H, FISH analysis of metaphase chromosomes of HCC1937 cell line. Green and red signals correspond to TEL/ETV6 and AML probes, respectively, whereas the aqua signal (white arrow) marks the centromeres of the three copies of chromosome 4. The majority of the metaphase cells in the HCC1937 breast cancer cell line seem to be near tetraploid. Yellow arrow points to TEL/ETV6 (green signal) localized on a derivative chromosome 4. FISH analysis of H2AFJ and EPS8 in breast tumor LN1 (I and J, respectively) and ZR-75-1 breast cancer cell line (K and L, respectively). In both cases, the red signal corresponds to the gene-specific bacterial artificial chromosome, whereas the green signal reflects hybridization using a chromosome 12 centromeric probe.

We next performed quantitative reverse transcription-PCR (RT-PCR) on cDNA samples from primary breast tumors and breast cancer cell lines, using five purified normal mammary epithelial cell samples as references, to examine overexpression of the four putative targets (Table 4). TEL/ETV6 and KRAS2 were overexpressed in a subset of breast tumors, but this overexpression did not correlate with their amplification. Of the four genes tested, only H2AFJ and EPS8 were overexpressed in tumors in which they were also amplified. The association between gene amplification and overexpression was, however, statistically significant only for ERBB2 (P = 0.007, Fisher exact test) because of the small sample size and the low frequency of amplification of the 12p13 target genes in breast tumors. Thus, based on these data, H2AFJ and EPS8 are the potential targets of this 12p13-p12 amplicon.

The H2AFJ gene encodes a member of the histone H2A superfamily. It has two isoforms generated by differential splicing. We detected the expression of only isoform 2 (NM_177925) by SAGE and quantitative RT-PCR. This isoform encodes a 129-amino-acid protein that is highly homologous (∼95%) to other histone H2A family members. The function of this particular H2A protein is not known, but presumably, it is involved in modulating chromatin structure and gene expression. A recent study described overexpression of H2AFJ in human metastatic melanoma lesions compared with common nevocellular nevi, suggesting a potential role for this gene in melanoma metastasis (49).

Epidermal growth factor (EGF) pathway substrate 8 (EPS8) was originally identified as a substrate of EGF receptor (EGFR) that enhances mitogenic signaling from receptor tyrosine kinases, phorbol ester, and c-Src (50, 51). Constitutive tyrosine phosphorylation of EPS8 was observed in many tumor cell lines (52). Overexpression of EPS8 in murine C3H10T1/2 fibroblasts induced cellular transformation in the presence of EGF (53), whereas down-regulation of EPS8 by trichostatin A or small interfering RNA inhibited the growth of v-Src-transformed chicken cells (46). At the molecular level, EPS8 binds to internalized EGFR, controls EGFR trafficking, and relays signals from Ras and phosphatidylinositol 3-kinase to Rac. More recently, EPS8 was found to bind to the barbed ends of actin filaments and to regulate actin polymerization and cell motility (54, 55). Public gene expression data suggest that EPS8 is overexpressed in breast and several other cancer types, including lung and pancreatic cancer. Further studies are required to confirm overexpression of EPS8 protein in breast tumors by immunochemistry and to evaluate its potential prognostic value.
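
As an illustration of the Fisher exact test mentioned above, the following Python sketch builds a 2 × 2 table of amplification versus overexpression for H2AFJ from the tumor rows of Table 4 and computes a two-sided P value. It assumes the ≥2-fold cutoff from the table note for both calls and skips samples with ND values. The exact sample set and calling criteria behind the published statistics are not stated here, so this is a sketch under those assumptions and not a reproduction of the authors' analysis. All function and variable names are hypothetical.

```python
from math import comb

# H2AFJ (copy number, expression) for tumor samples, transcribed from Table 4.
# None stands for ND (not determined).
H2AFJ_TUMORS = [
    ("BWH-T1", 0.9, 0.4), ("BWH-T3", 1.5, 0.6), ("BWH-T4", 1.1, 0.3),
    ("BWH-T7", 0.4, 1.0), ("BWH-T8", 2.3, 1.0), ("BWH-T15", 0.7, 0.4),
    ("BOT169", 1.0, 1.3), ("BWH-T18", 0.7, 0.9), ("BWH-T24", 1.4, 1.5),
    ("CT6", 0.6, 0.8), ("CT-36", 1.2, 7.5), ("CT-39", None, 0.7),
    ("CT-46", 1.0, 7.6), ("CT-47", 1.0, 0.9), ("CT-49", None, 2.6),
    ("IDC1", 6.1, 13.5), ("LN1", 8.6, 14.4), ("IDC2", None, None),
    ("LN2", 0.7, 15.7), ("MGH-T3", 1.1, 0.4), ("MGH-T4", 1.3, None),
]
THRESHOLD = 2.0  # assumed cutoff, taken from the Table 4 note


def contingency(rows, threshold=THRESHOLD):
    """Count (amp & over, amp & not over, not amp & over, neither)."""
    a = b = c = d = 0
    for _, copy_no, exp in rows:
        if copy_no is None or exp is None:
            continue  # skip samples with a missing measurement
        amp, over = copy_no >= threshold, exp >= threshold
        if amp and over:
            a += 1
        elif amp:
            b += 1
        elif over:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k amplified-and-overexpressed samples.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-7))


a, b, c, d = contingency(H2AFJ_TUMORS)
p_value = fisher_exact_two_sided(a, b, c, d)
print(f"table = [[{a}, {b}], [{c}, {d}]], two-sided P = {p_value:.3f}")
```

Under these assumptions, the table is [[2, 1], [3, 11]] and the two-sided P value is about 0.19, in line with the statement that the association does not reach significance for the 12p13 genes in this small sample set.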

In summary, we identified H2AFJ and EPS8 as novel candidate breast cancer oncogenes based on integrated cDNA array CGH and SAGE analyses. The combination of these two technologies seems to be powerful for the identification of candidate target genes of amplified loci as shown by the identification of a novel 12p13 amplicon and its putative targets (H2AFJ and EPS8) detected in a subset of breast tumors. Further functional studies are necessary to validate the role of these new candidate oncogenes in breast tumorigenesis.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

C. Brennan is currently at the Neurosurgery Service, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021.

Grant support: National Cancer Institute Cancer Genome Anatomy Project and Specialized Program in Research Excellence in Breast Cancer at Dana-Farber/Harvard Cancer Center grant CA89393, Department of Defense Breast Cancer Center of Excellence grant DAMD17-02-1-0692 (K. Polyak), Department of Defense Postdoctoral Fellowship grant DAMD17-02-1-0363 (J. Yao), and grant CA93683.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

We thank Drs. Drazen Belina, Zrinka Pagon, and Jasminka Razumovic (University Hospital Rebro and Zagreb Medical School, Zagreb, Croatia) and Drs. Andrea Richardson, Gabriela Lodeiro, and Ruth Gomes (Brigham and Women's Hospital, Boston, MA) for help with the acquisition of tumor samples and the current and past members of the Polyak laboratory for critical reading of the article and their constructive criticism throughout the execution of this project.

1. Savelyeva L, Schwab M. Amplification of oncogenes revisited: from expression profiling to clinical application. Cancer Lett 2001;167:115–23.
2. Courjal F, Cuny M, Simony-Lafontaine J, et al. Mapping of DNA amplifications at 15 chromosomal localizations in 1875 breast tumors: definition of phenotypic groups. Cancer Res 1997;57:4360–7.
3. Slamon DJ, Godolphin W, Jones LA, et al. Studies of the HER-2/neu proto-oncogene in human breast and ovarian cancer. Science 1989;244:707–12.
4. Bachman KE, Argani P, Samuels Y, et al. The PIK3CA gene is mutated with high frequency in human breast cancers. Cancer Biol Ther 2004;3:772–5.
5. Samuels Y, Wang Z, Bardelli A, et al. High frequency of mutations of the PIK3CA gene in human cancers. Science 2004;304:554.
6. Campbell IG, Russell SE, Choong DY, et al. Mutation of the PIK3CA gene in ovarian and breast cancer. Cancer Res 2004;64:7678–81.
7. Lee JW, Soung YH, Kim SY, et al. PIK3CA gene is frequently mutated in breast carcinomas and hepatocellular carcinomas. Oncogene 2004;24:1477–80.
8. Wu G, Xing M, Mambo E, et al. Somatic mutation and gain of copy number of PIK3CA in human breast cancer. Breast Cancer Res 2005;7:R609–16.
9. Yarden Y, Baselga J, Miles D. Molecular approach to breast cancer treatment. Semin Oncol 2004;31:6–13.
10. Kauraniemi P, Barlund M, Monni O, Kallioniemi A. New amplified and highly expressed genes discovered in the ERBB2 amplicon in breast cancer by cDNA microarrays. Cancer Res 2001;61:8235–40.
11. Janes PW, Lackmann M, Church WB, et al. Structural determinants of the interaction between the erbB2 receptor and the Src homology 2 domain of Grb7. J Biol Chem 1997;272:8490–7.
12. Luoh SW, Venkatesan N, Tripathi R. Overexpression of the amplified Pip4k2beta gene from 17q11–12 in breast cancer cells confers proliferation advantage. Oncogene 2004;23:1354–63.
13. Mu D, Chen L, Zhang X, et al. Genomic amplification and oncogenic properties of the KCNK9 potassium channel gene. Cancer Cell 2003;3:297–302.
14. Albertson DG, Pinkel D. Genomic microarrays in human genetic disease and cancer. Hum Mol Genet 2003;12 Spec No 2:R145–52.
15. Pinkel D, Segraves R, Sudar D, et al. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet 1998;20:207–11.
16. Mantripragada KK, Buckley PG, de Stahl TD, Dumanski JP. Genomic microarrays in the spotlight. Trends Genet 2004;20:87–94.
17. Pollack JR, Perou CM, Alizadeh AA, et al. Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat Genet 1999;23:41–6.
18. Pollack JR, Sorlie T, Perou CM, et al. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci U S A 2002;99:12963–8.
19. Hyman E, Kauraniemi P, Hautaniemi S, et al. Impact of DNA amplification on gene expression patterns in breast cancer. Cancer Res 2002;62:6240–5.
20. Daigo Y, Chin SF, Gorringe KL, et al. Degenerate oligonucleotide primed-polymerase chain reaction-based array comparative genomic hybridization for extensive amplicon profiling of breast cancers: a new approach for the molecular analysis of paraffin-embedded cancer tissue. Am J Pathol 2001;158:1623–31.
21. Rodenhuis S, van de Wetering ML, Mooi WJ, et al. Mutational activation of the K-ras oncogene. A possible pathogenetic factor in adenocarcinoma of the lung. N Engl J Med 1987;317:929–35.
22. Allinen M, Beroukhim R, Cai L, et al. Molecular characterization of the tumor microenvironment in breast cancer. Cancer Cell 2004;6:17–32.
23. Aguirre AJ, Brennan C, Bailey G, et al. High-resolution characterization of the pancreatic adenocarcinoma genome. Proc Natl Acad Sci U S A 2004;101:9067–72.
24. Olshen AB, Venkatraman ES, Lucito R, Wigler M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 2004;5:557–72.
25. Porter D, Lahti-Domenici J, Keshaviah A, et al. Molecular markers in ductal carcinoma in situ of the breast. Mol Cancer Res 2003;1:362–75.
26. Cai L, Huang H, Blackshaw S, et al. Clustering analysis of SAGE data using a Poisson approach. Genome Biol 2004;5:R51.
27. Ginzinger DG. Gene quantification using real-time quantitative PCR: an emerging technology hits the mainstream. Exp Hematol 2002;30:503–12.
28. Ney PA, Andrews NC, Jane SM, et al. Purification of the human NF-E2 complex: cDNA cloning of the hematopoietic cell-specific subunit and evidence for an associated partner. Mol Cell Biol 1993;13:5604–12.
29. Almeida A, Muleris M, Dutrillaux B, Malfoy B. The insulin-like growth factor I receptor gene is the target for the 15q26 amplicon in breast cancer. Genes Chromosomes Cancer 1994;11:63–5.
30. Chin K, de Solorzano CO, Knowles D, et al. In situ analyses of genome instability in breast cancer. Nat Genet 2004;36:984–8.
31. Kauraniemi P, Kuukasjarvi T, Sauter G, Kallioniemi A. Amplification of a 280-kilobase core region at the ERBB2 locus leads to activation of two hypothetical proteins in breast cancer. Am J Pathol 2003;163:1979–84.
32. Hughes-Davies L, Huntsman D, Ruas M, et al. EMSY links the BRCA2 pathway to sporadic breast and ovarian cancer. Cell 2003;115:523–35.
33. Rodriguez C, Hughes-Davies L, Valles H, et al. Amplification of the BRCA2 pathway gene EMSY in sporadic breast cancer is related to negative outcome. Clin Cancer Res 2004;10:5785–91.
34. Charpentier A, Bednarek A, Daniel R, et al. Effects of estrogen on global gene expression: identification of novel targets of estrogen action. Cancer Res 2000;60:5977–83.
35. Storlazzi CT, Fioretos T, Paulsson K, et al. Identification of a commonly amplified 4.3 Mb region with overexpression of C8FW, but not MYC in MYC-containing double minutes in myeloid malignancies. Hum Mol Genet 2004;13:1479–85.
36. D'Cruz CM, Gunther EJ, Boxer RB, et al. c-MYC induces mammary tumorigenesis by means of a preferred pathway involving spontaneous Kras2 mutations. Nat Med 2001;7:235–9.
37. Porkka KP, Tammela TL, Vessella RL, Visakorpi T. RAD21 and KIAA0196 at 8q24 are amplified and overexpressed in prostate cancer. Genes Chromosomes Cancer 2004;39:1–10.
38. van Duin M, van Marion R, Vissers K, et al. High-resolution array comparative genomic hybridization of chromosome arm 8q: evaluation of genetic progression markers for prostate cancer. Genes Chromosomes Cancer 2005;44:438–49.
39. Knezevich SR, McFadden DE, Tao W, Lim JF, Sorensen PH. A novel ETV6-NTRK3 gene fusion in congenital fibrosarcoma. Nat Genet 1998;18:184–7.
40. Wlodarska I, Mecucci C, Baens M, Marynen P, van den Berghe H. ETV6 gene rearrangements in hematopoietic malignant disorders. Leuk Lymphoma 1996;23:287–95.
41. Golub TR, Barker GF, Bohlander SK, et al. Fusion of the TEL gene on 12p13 to the AML1 gene on 21q22 in acute lymphoblastic leukemia. Proc Natl Acad Sci U S A 1995;92:4917–21.
42. Tognon C, Knezevich SR, Huntsman D, et al. Expression of the ETV6-NTRK3 gene fusion as a primary event in human secretory breast carcinoma. Cancer Cell 2002;2:367–76.
43. Makretsov N, He M, Hayes M, et al. A fluorescence in situ hybridization study of ETV6-NTRK3 fusion gene in secretory breast carcinoma. Genes Chromosomes Cancer 2004;40:152–7.
44. Letessier A, Ginestier C, Charafe-Jauffret E, et al. ETV6 gene rearrangements in invasive breast carcinoma. Genes Chromosomes Cancer 2005;44:103–8.
45. Rowley JD. The role of chromosome translocations in leukemogenesis. Semin Hematol 1999;36:59–72.
46. Mauvieux L, Helias C, Perrusson N, et al. ETV6 (TEL) gene amplification in a myelodysplastic syndrome with excess of blasts. Leukemia 2004;18:1436–8.
47. Thor A, Ohuchi N, Hand PH, et al. ras gene alterations and enhanced levels of ras p21 expression in a spectrum of benign and malignant human mammary tissues. Lab Invest 1986;55:603–15.
48. Miyakis S, Sourvinos G, Spandidos DA. Differential expression and mutation of the ras family genes in human breast cancer. Biochem Biophys Res Commun 1998;251:609–12.
49. de Wit NJ, Rijntjes J, Diepstra JH, et al. Analysis of differential gene expression in human melanocytic tumour lesions by custom made oligonucleotide arrays. Br J Cancer 2005;92:2249–61.
50. Fazioli F, Minichiello L, Matoska V, et al. Eps8, a substrate for the epidermal growth factor receptor kinase, enhances EGF-dependent mitogenic signals. EMBO J 1993;12:3799–808.
51. Gallo R, Provenzano C, Carbone R, et al. Regulation of the tyrosine kinase substrate Eps8 expression by growth factors, v-Src and terminal differentiation. Oncogene 1997;15:1929–36.
52. Matoskova B, Wong WT, Salcini AE, Pelicci PG, Di Fiore PP. Constitutive phosphorylation of eps8 in tumor cell lines: relevance to malignant transformation. Mol Cell Biol 1995;15:3805–12.
53. Maa MC, Hsieh CY, Leu TH. Overexpression of p97Eps8 leads to cellular transformation: implication of pleckstrin homology domain in p97Eps8-mediated ERK activation. Oncogene 2001;20:106–12.
54. Disanza A, Carlier MF, Stradal TE, et al. Eps8 controls actin-based motility by capping the barbed ends of actin filaments. Nat Cell Biol 2004;6:1180–8.
55. Offenhauser N, Borgonovo A, Disanza A, et al. The eps8 family of proteins links growth factor stimulation to actin reorganization generating functional redundancy in the Ras/Rac pathway. Mol Biol Cell 2004;15:91–8.