Abstract
Oncogene activation by gene amplification is a major pathogenetic mechanism in human cancer. Using comparative genomic hybridization, we determined that metastatic human colon cancers commonly acquire numerous extra copies of chromosome arms 7p, 8q, 13q, and 20q. We then examined the consequence of these amplifications on gene expression using DNA microarrays. Of 55,000 transcripts profiled, 2,146 were determined to map to one of the four common colon cancer amplicons and to also be expressed in normal or malignant colon tissues. Of these, only 81 transcripts (3.8%) demonstrated a 2-fold increase over normal expression among cancers bearing the corresponding chromosomal amplification. Chromosomal amplifications are common in colon cancer metastasis, but increased expression of genes within these amplicons is rare.
INTRODUCTION
Chromosomal aberrations may act as a fundamental pathophysiological event in human carcinogenesis (1). Common examples include inactivation of tumor suppressor genes by chromosomal deletion, and creation of oncogenic fusion genes by chromosomal translocation (1). Additionally, as exemplified by the HER2 gene (ERBB2), chromosome amplification can activate a target gene to become an oncogene by inducing its expression to levels substantially greater than normal (2). However, cancer-associated chromosome amplifications may span entire chromosome arms, and it has been unresolved whether this class of chromosome aberration can act by altering expression of thousands of amplified genes or rather acts by deregulating only a select few of such amplified genes. In this study, we have combined comparative genomic hybridization and DNA microarray expression profiling to examine the expression of over 2000 genes that were identified as residing on chromosome arms that were amplified in metastatic colon cancers. We have found for nearly all these genes that chromosome amplification does not result in up-regulation of gene expression, or alternatively, that amplified genes that also demonstrate increased expression levels are quite rare.
MATERIALS AND METHODS
DNA Microarray Analysis.
We designed two custom expression-monitoring DNA microarrays using Affymetrix GeneChip technology (3) that contained essentially all expressed human genes in the public domain at the time of design. Briefly, we selected the sequences for inclusion on the arrays from genes predicted from the available human genome sequence and from sequences derived from the expressed mRNA and EST databases in GenBank (4). Consensus sequences representing human expressed sequences were generated with the Clustering and Alignment Tool software (DoubleTwist, Oakland, CA) from the mRNA (nt) and EST (dbest) databases in GenBank. Prediction of the expressed genome from the human genome sequence was done using ab initio exon prediction (5).
The arrays were hybridized with labeled cRNA derived from 10 μg of total RNA using standard protocols (6). The intensity data from the arrays were analyzed using a statistically based analysis methodology that allows for estimating expression levels and providing confidence intervals for these estimates. This method uses a gamma distribution model of the intensity data for normalization to control for the systematic variation attributable to nonbiological factors, such as array-to-array variability, and attributable to variation in sample quality. For each probeset, a single measure or average intensity was calculated using Tukey’s trimean of the intensity of the constituent probes (7).
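As an illustration of the probeset summary described above, the following is a minimal sketch of Tukey's trimean, (Q1 + 2 × median + Q3)/4. The quartile convention (the "inclusive" method of Python's `statistics.quantiles`) and the example intensities are assumptions made for illustration; the text does not state which quartile definition the analysis software used.

```python
from statistics import quantiles


def trimean(intensities):
    """Tukey's trimean: (Q1 + 2 * median + Q3) / 4.

    The quartile convention ("inclusive") is an assumption; other
    conventions give slightly different values on small samples.
    """
    q1, q2, q3 = quantiles(intensities, n=4, method="inclusive")
    return (q1 + 2 * q2 + q3) / 4


# Hypothetical probe intensities for one probeset; the last probe is an outlier.
probe_intensities = [100, 110, 120, 130, 5000]
print(trimean(probe_intensities))
```

Because the trimean weights the median and the two quartiles, a single aberrant probe has little influence on the probeset value, unlike an arithmetic mean.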
Comparative Genomic Hybridization.
Total genomic DNA from normal and tumor tissue was labeled with digoxigenin and biotin, respectively, using nick translation. Two μg each of digoxigenin-labeled normal DNA and biotin-labeled tumor DNA were ethanol precipitated together in the presence of 10 μg of salmon sperm DNA and 60 μg of Cot-I fraction of human DNA (Life Technologies, Inc., Gaithersburg, MD). Hybridization conditions were as described previously (8). Briefly, probes were dried and resuspended in 10 μl of hybridization solution (50% formamide, 2 × SSC, and 10% dextran sulfate). DNA was denatured for 5 min at 80°C; repetitive sequences were allowed to preanneal for 1.5 h at 37°C and hybridized to normal human metaphase preparations. Normal metaphase slides were prepared from peripheral blood lymphocytes. Slides were treated with RNase (100 μg/ml) for 45 min, fixed, and dehydrated. DNA was denatured at 80°C for 1.5 min in 70% deionized formamide, 2 × SSC. The probe mixture was applied to the slide, covered with an 18-mm² coverslip, sealed with rubber cement, and hybridized for 48 h at 37°C in a humidified chamber. Probe signals were detected using an amplification procedure and counterstained with DAPI, as described previously (8). Slides were mounted in antifade solution.
Microscopy and Image Analysis.
Images were acquired with a cooled charge-coupled device camera (Photometrics, Tucson, AZ) mounted on a Leica DMRBE epifluorescence microscope using filters specific for DAPI, fluorescein, and rhodamine (Chroma Technologies, Brattleboro, VT). CGH ratio profiles were calculated using Leica Q-CGH software (Leica Imaging Systems, Cambridge, United Kingdom) as described (9).
RESULTS
CGH of Metastatic Colon Cancer.
To determine the relationship between chromosomal amplification and gene expression profiles, we characterized both processes in 23 independent metastatic colon cancers. This included 15 samples of metastatic tumor tissue resected from colon cancer liver metastases and an additional eight cell lines that were derived from biopsies of such colon cancer hepatic metastases. All metastatic tissue samples were dissected free of tissue contaminants and were confirmed by histological examination to be composed of at least 70% malignant epithelial cells. We focused this study on colon cancer metastases, because they have had maximal opportunity in vivo to select for chromosome amplifications that could confer an aggressive cancer phenotype.
To identify chromosomal regions selected for amplification, these tumors were first characterized by comparative genomic hybridization (CGH) (9, 10, 11). Chromosomal regions demonstrating a ratio of 2 or greater compared with normal control were scored as amplified. Chromosomal regions demonstrating a ratio between 1.5 and 2 were scored as gained (ratios of 1.5 and 2 correspond in a diploid genome to chromosomal copy numbers of 3 and 4, respectively). Chromosomal regions showing a ratio of 0.5 or less were scored as lost. The CGH findings for all amplifications, gains, and losses detected in these metastatic samples are displayed in Fig. 1,A, whereas only amplifications and losses are displayed in Fig. 1,B. As shown in Fig. 1,A, multiple different chromosomes frequently show chromosomal gains in colon cancer metastasis. In comparison with previous studies of lower stage colon cancers (8, 12), these metastatic samples more commonly showed genomic losses on chromosome 4 and uniquely showed gains without amplification of chromosome 2 (35% of samples). However, of the many chromosomal regions commonly gained in colon cancer metastasis, only four proved to also be recurrent sites of chromosomal amplification. These were chromosome arms 7p, 8q, 13q, and 20q, which were amplified in 26 to 43% of the cancer metastases (Fig. 1,B; Table 1), in a pattern observed similarly in both resected metastatic tumor tissues and metastasis-derived cell lines. Although detected in metastatic samples, these amplifications have also been identified by us and others in primary colon cancers (8, 12), suggesting that chromosomal amplification contributes mainly to steps in colon carcinogenesis that precede the development of metastases.
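The scoring rules above can be sketched as a small classifier. The function name and the handling of values falling exactly on a boundary are assumptions made for illustration; the amplification counts are those listed in Table 1 for the 23 metastases.

```python
def classify_cgh_ratio(ratio):
    """Score a tumor/normal CGH ratio using the thresholds stated in the text.

    Boundary handling (for example, a ratio of exactly 1.5) is an assumption.
    """
    if ratio >= 2.0:
        return "amplified"
    if ratio >= 1.5:
        return "gained"
    if ratio <= 0.5:
        return "lost"
    return "no change"


# Number of metastases scored as amplified per chromosome arm (Table 1).
AMPLIFIED_COUNTS = {"7p": 6, "8q": 6, "13q": 10, "20q": 9}
TOTAL_SAMPLES = 23

for arm, count in AMPLIFIED_COUNTS.items():
    frequency = round(100 * count / TOTAL_SAMPLES)
    print(f"{arm}: amplified in {count}/{TOTAL_SAMPLES} samples ({frequency}%)")
```

The lowest and highest frequencies computed this way (6 of 23 and 10 of 23) correspond to the 26 to 43% range quoted in the text.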
Expression Profiling of Colon Cancer Amplicons.
We next wanted to determine how each of the four major colon cancer amplicons affected expression of the genes that reside within the amplicons. Accordingly, cRNAs prepared from each of these samples were hybridized to DNA microarrays that measure gene expression of approximately 55,000 genes, EST clusters, and predicted exons. Analysis of the microarray data confirmed that the liver metastases samples expressed high levels of colon epithelial markers and, consistent with our histological determinations, were essentially free of hepatocyte-specific gene expression. The working draft genome assembly (13) was used to identify and order all transcription units present on the microarray that mapped to chromosomes 7p, 8q, 13q, and 20q. From this set, we selected for further analysis the 2146 transcription units that on the microarrays demonstrated either expression in control microdissected normal colon epithelial strips or expression in colon cancer liver metastases. This included all transcription units whose median expression in 9 normal colon epithelia exceeded a threshold of 100 average intensity units or whose expression in any of the 23 amplified liver metastases samples exceeded this threshold.
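The inclusion rule for the 2146 transcription units can be written as a simple filter. This is a sketch only: the function name and the intensity values are hypothetical, and only the threshold of 100 average intensity units and the "median in normals or any metastasis" logic come from the text.

```python
from statistics import median

EXPRESSION_THRESHOLD = 100  # average intensity units, as stated in the text


def is_expressed(normal_values, metastasis_values, threshold=EXPRESSION_THRESHOLD):
    """Keep a transcription unit if its median expression in normal colon
    epithelia exceeds the threshold, or if its expression in any liver
    metastasis sample exceeds the threshold."""
    return median(normal_values) > threshold or max(metastasis_values) > threshold


# Hypothetical intensities for three transcription units.
print(is_expressed([150, 120, 90], [40, 60, 80]))   # expressed in normal colon
print(is_expressed([20, 30, 40], [50, 250, 70]))    # expressed in one metastasis
print(is_expressed([20, 30, 40], [50, 60, 70]))     # below threshold everywhere
```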
Fig. 2 displays the qualitative gene expression patterns for the 2146 transcription units that were mapped to and ordered across the four major colon cancer chromosomal amplicons. For each transcript, the median expression level was first calculated among the group of colon cancers determined by CGH to be amplified across the corresponding chromosomal region. Median gene expression in the colon cancer metastases with chromosome amplifications was then compared with the median level of gene expression among 9 control normal colon epithelial samples. Fig. 2 denotes in green the position of each transcript in which chromosome amplification was associated with a 2-fold or greater increase in average gene expression compared with normal colon epithelia. In contrast, red denotes the positions of transcripts in which chromosome amplification was associated with decreased average gene expression to <0.5 the level of normal controls. Brown regions contain transcripts for which chromosome amplification was associated with expression of between 0.5 and 2-fold the normal controls.
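The color coding described above amounts to a comparison of two medians. The sketch below assumes a ratio of exactly 0.5 is coded brown, which the text leaves ambiguous; the function name and the expression values are hypothetical.

```python
from statistics import median


def color_code(amplified_tumor_values, normal_values):
    """Return (fold change, color) for one transcript.

    Fold change is the median expression among tumors amplified for the
    corresponding region divided by the median among normal colon samples.
    """
    fold = median(amplified_tumor_values) / median(normal_values)
    if fold >= 2.0:
        return fold, "green"   # 2-fold or greater increase
    if fold < 0.5:
        return fold, "red"     # decreased to <0.5 of normal
    return fold, "brown"       # between 0.5- and 2-fold of normal


# Hypothetical intensities: three amplified tumors versus three normal samples.
normals = [190, 200, 210]
print(color_code([400, 500, 600], normals))
print(color_code([150, 300, 350], normals))
print(color_code([50, 60, 70], normals))
```

Using medians rather than means on both sides keeps a single outlying tumor or normal sample from changing the color assigned to a transcript.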
As is visually obvious from Fig. 2, chromosome amplification does not in general induce increased expression of the overwhelming majority of genes residing in the common colon cancer amplicons. The ubiquitous brown shading of the chromosomal ideograms reflects that approximately 89% (1901) of the 2146 transcripts within these amplicons remain in the >0.5 and <2.0 range of expression relative to normal (coded brown), despite a greater than doubling of the corresponding gene copy number in each of the colon cancer samples represented (Table 2). Indeed, a 2-fold or greater increase in gene expression was found in only 81 of 2146 (3.8%) genes subject to chromosome amplification (Table 2). In contrast, 164 genes (7.6%) residing within amplified regions actually demonstrated decreased expression to <0.5 the level of normal colon (Table 2). Moreover, 60 of the 81 up-regulated genes showed only a 2–3-fold increased expression, suggesting they are not targets of high-level amplifications hidden within the broader regions of the chromosomal amplicons (Fig. 3). This observation is not attributable to any technical limitation of the microarrays in detecting increased gene expression. Indeed, the microarray analysis of genes residing on nonamplified chromosomes detected 14 transcripts whose median expression in the colon cancer metastases was increased from 4- to 12-fold over normal. Our first conclusion is thus that chromosomal amplification is associated with increased expression among only a small minority of genes residing within colon cancer chromosomal amplicons.
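The totals quoted in this paragraph can be recomputed directly from the per-arm counts in Table 2. The sketch below only sums those published counts; the dictionary layout and function names are illustrative choices.

```python
# Per-arm counts transcribed from Table 2.
TABLE2 = {
    "7p": {"total": 530, "increased": 13, "decreased": 42},
    "8q": {"total": 542, "increased": 23, "decreased": 47},
    "13q": {"total": 501, "increased": 26, "decreased": 34},
    "20q": {"total": 573, "increased": 19, "decreased": 41},
}


def summarize(table):
    """Sum the per-arm counts and derive the number of unchanged transcripts."""
    total = sum(row["total"] for row in table.values())
    increased = sum(row["increased"] for row in table.values())
    decreased = sum(row["decreased"] for row in table.values())
    return {
        "total": total,
        "increased": increased,
        "decreased": decreased,
        "unchanged": total - increased - decreased,
    }


def percent(count, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


summary = summarize(TABLE2)
for category in ("increased", "decreased", "unchanged"):
    share = percent(summary[category], summary["total"])
    print(f"{category}: {summary[category]} of {summary['total']} ({share}%)")
```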
Additionally, this analysis reveals that the few genes residing in colon cancer amplicons that do show increased expression are scattered, and that none of the four amplicons demonstrates a “hotspot” in which a cluster of genes all show increased expression. This is illustrated in Fig. 2 by the magnified views of neighboring gene expression in the regions surrounding representative individual up-regulated genes (coded in green). This observation contrasts with the finding in the MCF7 breast cancer cell line of a cluster of highly expressed genes mapping to a chromosome 17 amplicon, which thus may reflect events specific to MCF7 or to this specific breast cancer amplification (14).
The conclusion that genes within amplicons are rarely overexpressed remains valid even when applied to a single sample and to a segmental chromosomal amplicon delimiting a region smaller than the full chromosome arms we first characterized. For example, Fig. 4 shows CGH of a hepatic metastasis-derived cell line (V394) in which a significant 13q amplification is restricted to only the proximal portion of the 13q arm. High-resolution CGH of this sample mapped this amplicon to three distinct chromosomal bands ranging from 13q11 to 13q21 (Fig. 4). However, only 8 of the 251 expressed genes lying within this amplicon showed a 2-fold or greater increase over normal (range 2.14–4.48-fold).
These data suggest that the selective advantage of chromosomal amplifications is most likely to lie in rare individual “target” genes that, similar to HER2 in some breast cancers, show both a high copy number increase and an accompanying increase in gene expression. However, our analysis excludes some previously nominated candidate genes as being such specific targets of the major colon cancer-associated chromosomal amplifications. For example, genes posited as targets of chromosome 20q amplification in cancer that showed less than a 2-fold increased expression in colon cancer hepatic metastases amplified for 20q included: topoisomerase 1 (TOP1), the AIB1 transcription factor (NCOA3), the zinc finger transcription factor 217 (ZNF217), and the MYB transcription factor family member MYBL2 (Table 3) (15, 16, 17, 18). [Our array did not measure expression levels for the BTAK mitotic kinase (STK15), which remains a potential candidate target gene (19, 20)]. Similarly, we found that the epidermal growth factor receptor gene (EGFR), residing on chromosome 7p, demonstrates on average a decreased expression to 0.49-fold of normal colon in cases in which chromosome 7 is amplified (Table 3). Furthermore, consistent with published findings that increased MYC expression in colon cancer is not attributable to amplification of this gene (21), which resides on chromosome 8q, we noted that the majority of the colon cancers demonstrating increased MYC expression derived from cases that did not bear chromosome 8q amplification. Indeed, MYC expression was on average increased only 1.74-fold above normal in cases with amplification of chromosome 8q (Table 3). Northern analysis also independently confirmed the absence of induction of EGFR and MYC expression in representative hepatic metastases with high genomic amplifications of chromosomes 7p and 8q, respectively (data not shown).
Thus, each of the four major colon cancer-associated chromosomal amplifications is presumptively based on a novel gene target.
DISCUSSION
Virtually all carcinomas reveal a tumor-specific distribution of genomic imbalances (22), and the acquisition of such chromosomal gains and losses is evidenced during early stages of tumorigenesis. The maintenance of these aberrations is strongly selected for even in the presence of gross aneuploidy and intratumor heterogeneity (23), and individual tumor-specific chromosomal imbalances are preserved even after years of cell culture (9, 11). Moreover, as illustrated by this study, chromosomes found to be frequently amplified in cancers also participate in an additional group of cases as chromosomes that are commonly gained. It is tempting to speculate that the functional consequence, and presumed selective advantage, of these chromosomal aneuploidies is exerted on the tumor cells via modifications in the expression status of genes on these chromosomes. The aim of this study was to determine whether chromosomal amplification globally alters the expression of amplified genes or whether expression of only a select few genes is changed. Having examined the expression of >2000 transcripts resident on amplified chromosomes present in metastatic colon cancers, we found that chromosomal amplifications do not result in global induction of gene expression, even when examined in late-stage metastatic colon cancers. Thus, total relative gene expression levels for the great majority of transcription units must be tightly regulated. Genes that do demonstrate significantly increased expression in association with chromosome amplification are few in number and are not geographically clustered together. The microarrays used in this study sampled 55,000 genes, ESTs, and predicted exons and therefore provide a comprehensive analysis of the human transcriptome that buttresses our conclusion that increased expression of amplified genes is a rare event. 
This conclusion lends support to the likely importance in carcinogenesis of individual genes that both are overexpressed in cancer and that map to sites of chromosomal amplifications. We note for example in breast cancer the early identification of several such genes that are additional to HER2, albeit this identification currently relies on characterizations of a small number of cell lines (14, 24). The combination of CGH and expression array technologies represents a robust method for identifying these uncommon genes and in colon cancer yields a tractable 81 candidates that are highly attractive for future analysis as potential oncogenes.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
This work was supported by NIH Grant CA88130 and by a grant from the National Colon Cancer Research Alliance. S. D. M. is an associate investigator of the Howard Hughes Medical Institute.
The abbreviations used are: EST, expressed sequence tag; DAPI, 4′,6-diamidino-2-phenylindole; CGH, comparative genomic hybridization; EGFR, epidermal growth factor receptor.
Fig. 1. CGH analysis of chromosome copy number changes in colon cancer liver metastases (n = 23). A, for each chromosome ideogram, every bar on the left side of the ideogram indicates a sample with chromosomal loss (ratio, <0.5), and every bar on the right side indicates a sample with either chromosomal gain (ratios, >1 and <2; thin bar) or amplification (ratio, >2; thick bar). B, for each chromosome ideogram, every bar on the left side of the ideogram indicates a sample with chromosomal loss (ratio, <0.5), and every bar on the right side indicates a sample with chromosomal amplification (ratio, >2). Chromosomes 7p, 8q, 13q, and 20q are clearly amplified in substantial numbers of liver metastases.
Fig. 2. Pictorial representation of gene expression levels across colon cancer chromosomal amplicons. The working draft assembly of the Human Genome was used to identify and order all transcripts on the GeneChip Expression Arrays that mapped to chromosomes 7p, 8q, 13q, and 20q (8). On each chromosomal segment shown, green bars denote the positions of genes showing 2-fold or greater median expression in the set of colon cancer liver metastases bearing corresponding chromosomal amplifications as compared with normal colon. Red bars denote the position of genes with median expression in amplified liver metastases of 0.5 or less that of normal colon. Brown denotes regions in which all genes demonstrate median expression in amplified liver metastases of >0.5- and <2.0-fold of normal colon. Figures to the left and right of each ideogram show magnified views of chromosomal segments that flank the overexpressed green genes. Numerical values provide the map coordinates in the human genome sequence that correspond to the ideogram.
Fig. 3. Graphical representation of genes with increased expression in amplified colon cancer metastases. Depicted is the fold increase in expression for those genes on chromosomes 7p, 8q, 13q, and 20q that show a >2-fold increase in median expression relative to normal colon. For each transcript, the ratio depicted represents the median expression level in colon cancer liver metastases amplified for the corresponding chromosome compared with normal colon epithelium.
Fig. 4. Chromosome 13q amplification and gene expression in V394. Shown are high-resolution CGH hybridizations of probe from colon cancer V394 to chromosome 13, displaying from left to right: ideogram, inverted DAPI banding, raw intensity profile, and the hybridization. This hybridization identifies a high-level segmental amplicon mapping to 13q11-21, as opposed to the amplification of the entire 13q arm observed in other liver metastases. Levels of V394 gene expression for transcripts mapping within this amplicon (as delimited by the arrows) are depicted at right following the same conventions as in Fig. 2, in which green bars depict genes with 2-fold increased expression versus normal colon, red bars depict genes with 2-fold decreased expression versus normal colon, and brown regions contain genes with >0.5- and <2-fold change in gene expression versus normal colon.
Table 1. CGH analysis summary
A summary is shown of the total number of liver metastases analyzed and, of those, the number that displayed losses, gains, amplifications, or no change in chromosomes 7p, 8q, 13q, and 20q.
Chromosome arm | 7p | 8q | 13q | 20q
---|---|---|---|---
Total no. of liver metastases | 23 | 23 | 23 | 23
No. with amplifications | 6 | 6 | 10 | 9
No. with gains | 15 | 14 | 9 | 10
No. with no change | 2 | 3 | 4 | 4
No. with losses | 0 | 0 | 1 | 0
Table 2. Chromosome-specific gene expression
For cancers bearing amplifications of the chromosomal regions shown, listed are the number of genes showing a 2-fold or greater increase in median expression versus normal colon, the highest observed fold increase in median expression versus normal colon, and the number of genes showing a 2-fold or greater decrease in median expression versus normal colon.
Amplified chromosome arm | 7p | 8q | 13q | 20q
---|---|---|---|---
Total unique genes | 530 | 542 | 501 | 573
No. of genes with increased expression | 13 | 23 | 26 | 19
Highest fold expression increase | 4.38 | 4.86 | 5.22 | 3.68
No. of genes with decreased expression | 42 | 47 | 34 | 41
Table 3. Gene expression by amplified colon cancers
Listed for each gene is the median gene expression relative to normal colon among hepatic metastases bearing chromosomal amplifications corresponding to the given gene locations.
Gene | GenBank accession | Chromosome | Median expression relative to normal colon (fold)
---|---|---|---
TOP1 | U07804 | 20 | 1.69
AIB1 | AA150333, W46488, N56493 | 20 | 0.71
ZNF217 | N70546 | 20 | 0.71
MYBL2 | X13293 | 20 | 1.10
MYC | M13929, L00058 | 8 | 1.74
EGFR | X00588 | 7 | 0.49
Acknowledgments
We thank Dr. Natasha Aziz for genomic annotation assistance and helpful discussion and Dr. Karl Sirotkin for helpful discussion.