Abstract
Purpose: We hypothesize that changes in the transcription of up-regulated genes are biologically meaningful and may be linked to variations in tumor behavior and clinical features. This study aimed to find individual up-regulated genes responsible for clinicopathologic variations in human colorectal cancer.
Experimental Design: Genes up-regulated concurrently in four microarray experiments were taken as candidate genes; 20 candidate genes were verified using real-time reverse transcription-PCR in these four experiments, along with 27 new samples. The presence or absence of up-regulation of these genes was correlated with 10 clinicopathologic variables from 31 patients. The mRNA transcript levels of these 20 candidate genes in the 31 paired samples were also correlated with each other to disclose any expression relationship.
Results: Forty percent (8/20) of the candidate genes were verified by real-time reverse transcription-PCR to have a tumor/normal expression ratio > 2. Up-regulation of THY1 and PHLDA1 was associated with the presence of anemia in colon cancer patients (P = 0.036 and 0.009, respectively). Up-regulation of HNRPA1 was more frequent in cancers of the right-sided colon than in those of the left side (P = 0.027). Up-regulated GPX2 was related to a higher degree of tumor differentiation (P = 0.019). c-MYC was significantly over-expressed in specimens from male compared with female colon cancer patients (P = 0.012). GRO1 was significantly up-regulated in patients younger than 65 years old (P = 0.010) and was found to be frequently over-expressed when cancers were less invasive. In addition, we found that normalized transcript levels of HNRPA1 were tightly associated with those of c-MYC (r = 0.948).
Conclusions: Validation of microarray data using another independent laboratory approach is mandatory and statistical correlation between gene expression status and the patient's clinical features may reveal individual genes relevant to tumor behavior and clinicopathologic variations in human colorectal cancer.
Introduction
Colorectal cancer is one of the leading causes of cancer death in developed countries. Much effort has been devoted to researching its underlying molecular events. Understanding the differences in gene expression between cancerous and normal cells is important for cancer research. cDNA microarray technologies can analyze such differences at the transcriptional level with high throughput. Using microarrays, researchers have unveiled gene expression profiles and signatures that characterize colorectal cancer and its phenotypes (1-3). However, the individual genes responsible for tumor behavior and clinicopathologic variations in colon cancer have not been reported thus far.
The goal of this study was to identify up-regulated genes in human colorectal cancer and determine if there are individual genes linked to tumor behavior and patient clinicopathologic factors. A simple and cost-saving strategy was adopted to identify up-regulated genes: cDNA microarrays were used as a genomic filter and real-time reverse transcription-PCR as a validation method for verifying array data; a large number of tissue samples (n > 30) were used in post-array validation to minimize sample bias. To find associations between up-regulated genes and clinical factors, a variety of clinicopathologic parameters from 31 samples were correlated with the presence or absence of up-regulation of genes. In addition, the transcript levels of up-regulated genes were analyzed in correlation with each other, using regression analysis, to determine if any relationship existed between these up-regulated genes.
We used fresh tissue samples instead of cultured cell lines as experimental material because cell lines may not express the important surface molecules (antigens or receptors) and secretory molecules (matrix or signal molecules) that mediate communication with the surrounding microenvironment in vivo, an interaction that has recently been considered crucial to tumor progression and metastasis (4, 5).
Materials and Methods
Patients and Tissue Samples
From September 2001 to December 2002, fresh paired tissue samples of colorectal cancer and corresponding normal mucosa were harvested from consecutive patients who underwent radical colectomy for colorectal cancer at Feng-Yuan Hospital, Feng-Yuan City, Taichung County, Taiwan. The cancerous tissue was cut directly from the tumor mass and the normal mucosa was carefully dissected from the inner wall of the adjacent uninvolved colon with a sharp scalpel. The paired tissue samples were put in separate vials, snap-frozen in liquid nitrogen, and stored in a −85°C freezer. Samples were excluded if the tissue weighed <500 mg, if the tumor had an ambiguous boundary with the adjacent mucosa (risk of contamination by normal mucosa), or if warm ischemia (the period between excision of the surgical specimen and snap-freezing of the tissue sample in liquid nitrogen) lasted >15 minutes. Informed consent was obtained from all patients. Clinicopathologic data were obtained from patients' clinical records, operative notes, and pathologic reports.
Isolation of RNA from Tissue Samples
Blocks of tissue were ground into powder in liquid nitrogen (−196°C). Tissue powder was homogenized using TRIzol reagent (1 mL/50-100 mg; Life Technologies Inc., Gaithersburg, MD), and total RNA was isolated according to the manufacturer's instructions. Homogenates were incubated for 15 minutes at 20°C to permit complete dissociation of nucleoprotein complexes. Chloroform (one fifth of the TRIzol volume) was added. The homogenate was vigorously agitated and then centrifuged at 12,000 × g for 10 minutes at 4°C to allow phase separation; RNA remained exclusively in the upper aqueous phase, which was then collected. RNA was precipitated by mixing the aqueous phase with isopropanol (one half of the TRIzol volume), incubating for 10 minutes at 20°C, and centrifuging at 12,000 × g for 15 minutes at 4°C. The visible RNA pellet was washed with 75% ethanol, centrifuged at 7,500 × g for 5 minutes at 4°C, air-dried, and redissolved in diethyl pyrocarbonate-treated H2O. The RNA solution was stored at −70°C. Total RNA with OD260/OD280 > 1.6 was used for microarray experiments.
Microarray Experiments
Four microarray experiments were done using four paired (colon cancer and adjacent normal mucosa) tissue specimens from different patients. mRNA was isolated from total RNA with a Dynal MPC-s kit (Dynal Biotech, Lake Success, NY), and reverse-transcribed with Superscript II RNase H− reverse transcriptase (Life Technologies) to generate Cy5- and Cy3-labeled cDNA probes for cancer and normal samples, respectively. The labeled probes were hybridized to a commercial cDNA microarray comprising a total of 8,000 immobilized cDNA fragments (ABC Human UniversoChip 8k-1; Asia BioInnovations Corporation, Taiwan). Fluorescence intensities of Cy5 and Cy3 targets were scanned and measured separately using a GenePix 4000B Array Scanner (Axon Instruments, Union City, CA). Data analysis was done using the associated software, GenePix Pro 3.0.5.56 (Axon Instruments). The signal-to-noise ratios for the 635 and 532 nm channels were estimated as (F635Median − B635Median)/B635SD and (F532Median − B532Median)/B532SD, respectively, and were used to check the quality of the microarray hybridization process. The results were normalized for the labeling and detection efficiencies of the two fluorescent dyes, then used to determine differential gene expression between cancer and normal samples. Genes were considered to be up-regulated if the Cy5/Cy3 signal ratio was >2.
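As a rough illustration of the spot-level calculations described above, the following Python sketch computes the signal-to-noise ratio for one channel and applies the Cy5/Cy3 > 2 rule. The function names and the example intensities are hypothetical. The sketch also assumes that the ratio is taken on background-subtracted, already normalized signals, which the text does not spell out.

```python
def signal_to_noise(f_median, b_median, b_sd):
    """(F_median - B_median) / B_SD for one channel of one spot."""
    if b_sd <= 0:
        raise ValueError("background SD must be positive")
    return (f_median - b_median) / b_sd


def is_up_regulated(cy5, cy3, threshold=2.0):
    """True when the Cy5/Cy3 (tumor/normal) ratio exceeds the threshold."""
    if cy3 <= 0:
        raise ValueError("Cy3 signal must be positive")
    return cy5 / cy3 > threshold


# Hypothetical spot: values are illustrative, not taken from the study.
snr_635 = signal_to_noise(f_median=2400.0, b_median=400.0, b_sd=100.0)
snr_532 = signal_to_noise(f_median=1200.0, b_median=300.0, b_sd=90.0)
cy5 = 2400.0 - 400.0   # background-subtracted tumor (Cy5) signal
cy3 = 1200.0 - 300.0   # background-subtracted normal (Cy3) signal
print(snr_635, snr_532, is_up_regulated(cy5, cy3))
```

Note that the comparison is strict, so a ratio of exactly 2 is not called up-regulated.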
Real-time Reverse Transcription-PCR
A total of 31 paired samples were used for real-time reverse transcription-PCR, including the same samples studied in the initial array experiments, plus an additional 27 new samples. Two-step quantitative reverse transcription-PCR was done: 2 μg of total RNA was reverse-transcribed to cDNA using the Superscript preamplification system (Life Technologies), and quantitative real-time PCR was done on a Roche LightCycler system (Roche Molecular Biochemicals, Mannheim, Germany). For each PCR, the reaction was carried out in a reaction mixture (20 μL) consisting of 12.6 μL of H2O, 2.4 μL of MgCl2 (stock solution of kit), 0.5 μL (10 pmol) of forward primer, 0.5 μL (10 pmol) of reverse primer, 2 μL of cDNA, and 2 μL of LightCycler-FastStart DNA Master SYBR Green I. The gene-specific primers (Table 3) were designed using MaxVector software (Accelrys, Inc., San Diego, CA), and through genomic information obtained from the National Center for Biotechnology Information web site. The PCR protocol consisted of denaturation at 95°C for 10 minutes, followed by 40 to 60 cycles of PCR amplification (different annealing temperatures for specific primers, extension times, and fluorescence detection temperatures for different PCR products are listed in Table 3), with a subsequent melting curve analysis (continuous fluorescence detection from 65°C to 95°C with a temperature slope of 0.1°C/second). Relative mRNA quantification of each sample was calculated with reference to the standard curve constructed automatically by plotting the log number of 1-, 10-, 100-, 1,000-, and 10,000-fold serially diluted standard samples in each reaction. Most (95%) of the correlation coefficients of the standard curve in this study were r = 1.0 and the mean square errors were <0.2. The rest were r = 0.99 and the errors <0.4. Corrections for sample to sample differences were done by normalization to the reference gene (endogenous control). 
The constitutively expressed gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was taken as the reference gene in this study. The normalized expression of a target gene = mRNA level of the gene / mRNA level of GAPDH in the same sample.
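The quantification and normalization steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the standard curve is taken to be a straight line relating the crossing point of each standard to the log10 of its relative quantity, the crossing-point values are invented for illustration, and one curve is reused for both amplicons only for brevity (in practice each amplicon would have its own curve). The function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def relative_quantity(cp, slope, intercept):
    """Invert the standard curve: crossing point -> relative quantity."""
    return 10 ** ((cp - intercept) / slope)


def normalized_expression(gene_qty, gapdh_qty):
    """mRNA level of the gene / mRNA level of GAPDH in the same sample."""
    return gene_qty / gapdh_qty


# Hypothetical standards: relative quantities 1 to 10,000 in 10-fold steps.
log_qty = [0.0, 1.0, 2.0, 3.0, 4.0]
cp_std = [30.0, 26.7, 23.4, 20.1, 16.8]   # illustrative crossing points
slope, intercept = fit_line(log_qty, cp_std)

gene_qty = relative_quantity(23.4, slope, intercept)    # target gene
gapdh_qty = relative_quantity(20.1, slope, intercept)   # reference gene
normalized = normalized_expression(gene_qty, gapdh_qty)
print(round(normalized, 3))
```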
Statistical Data Analysis
There were four parts to data analysis in this study. The first part consisted of correlating the various clinicopathologic factors with each other. The purpose was to compare the clinical features of our patients with those of colorectal cancer patients in general, and thereby to check for selection bias among the patients enrolled in this study.
The second part of data analysis examined the real-time PCR data generated from post-array validation to identify candidate genes whose up-regulation on the arrays could be confirmed: for each candidate gene, its normalized mRNA expression in cancer and corresponding normal tissue were “gene in tumor/GAPDH in tumor” and “gene in normal/GAPDH in normal,” respectively. Wilcoxon signed ranks tests were used to determine the statistical significance of the expression difference for each test gene in 31 paired samples.
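The paired comparison can be illustrated with a standard-library Python sketch of the Wilcoxon signed ranks test. It uses the large-sample normal approximation with a tie correction, which is an assumption made here for simplicity; the statistical package used in the study may compute the P value differently. The expression values are invented.

```python
import math


def wilcoxon_signed_rank(tumor, normal):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences share their average rank.
    """
    diffs = [t - n for t, n in zip(tumor, normal) if t != n]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1          # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t              # variance correction for ties
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, p_value


# Hypothetical normalized expression (gene/GAPDH) in 10 paired samples.
tumor = [1.5, 2.0, 2.6, 3.3, 4.1, 5.0, 6.0, 7.1, 8.3, 9.6]
normal = [1.0] * 10
w_plus, p_value = wilcoxon_signed_rank(tumor, normal)
print(w_plus, p_value < 0.05)
```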
The third part of data analysis was to analyze the association between confirmed genes and clinicopathologic factors: for each patient, a gene was defined as up-regulated if the normalized tumor/normal expression ratio = (gene in tumor / GAPDH in tumor) / (gene in normal / GAPDH in normal) >2. The presence or absence of up-regulation of confirmed genes in 31 patients was analyzed in correlation with important clinicopathologic factors, including age (≤65 versus >65 years old), gender (male versus female), tumor location (right side versus left side), carcinoembryonic antigen (CEA) level (≤4.3 versus >4.3 ng/mL), tumor stage (Dukes' A, B, C, D), tumor differentiation (well, moderate, or poor), and the presence or absence of anemia, lymph node metastasis, distant metastasis, and concomitant polyp, using a χ2 or Fisher's exact test.
The fourth part of our statistical analysis involved determining if there existed any expression relationship between the confirmed genes: the 31 paired samples used in real-time reverse transcription-PCR provided data of varying transcript (expression) levels. For example, the mRNA transcript levels of gene A in 31 patients were analyzed in correlation with the levels of another gene, gene B, using regression analysis. The data were analyzed using Excel 2003 (Microsoft Corp.) and the Statistical Package for the Social Sciences 11.0 (SPSS Inc., Chicago, IL). Statistical significance was defined as P < 0.05.
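The per-patient up-regulation call and the 2 × 2 association test can be sketched in Python as follows. The function names are hypothetical, and the two-sided Fisher's exact test is implemented here in its common form (summing the probabilities of all tables no more likely than the observed one), which is an assumption about how the P values were obtained. The counts in the example are those reported in the Discussion for c-MYC (up-regulated in 17 of 18 males and 7 of 13 females); with them the sketch evaluates to roughly 0.012, in line with the reported P value.

```python
from math import comb


def up_regulated_in_patient(gene_t, gapdh_t, gene_n, gapdh_n, threshold=2.0):
    """(gene/GAPDH in tumor) / (gene/GAPDH in normal) > threshold."""
    ratio = (gene_t / gapdh_t) / (gene_n / gapdh_n)
    return ratio > threshold


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def weight(x):
        # Unnormalized hypergeometric weight for x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    observed = weight(a)
    # Integer weights are compared, so there is no floating-point tie issue.
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / total


# Rows: male, female. Columns: c-MYC up-regulated, not up-regulated.
p_value = fisher_exact_two_sided(17, 1, 7, 6)
print(round(p_value, 3))
```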
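The gene-versus-gene regression can be illustrated with a short Python sketch that returns the least-squares slope and intercept together with the Pearson correlation coefficient r. The function name and the expression values are hypothetical, and simple linear regression is assumed because the text does not specify the regression model.

```python
import math


def linear_regression(xs, ys):
    """Least-squares slope and intercept of ys on xs, plus Pearson r."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical normalized transcript levels of two genes in five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r = linear_regression(gene_a, gene_b)
print(round(slope, 2), round(r, 3))
```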
Results
Part 1: Clinical Features of Patients in this Study
Table 1 summarizes the clinical characteristics of patients in this study. The average age (67.4 years old) was close to 70. Tumors occurred with a mild predominance in males (males/females = 1.38:1); 45% of tumors occurred in the rectum and 55% in the colon. Tumors located in the rectum were largely in males, whereas colonic tumors had a female predominance, although this difference did not reach statistical significance (P = 0.067). Patients with cancers of the right-sided colon were more likely to have anemia than those with cancers of the left-sided colon (P = 0.008). All these clinical features were consistent with the current epidemiologic data concerning colorectal cancer. This consistency suggested that the 31 patients in this study were a reasonably unbiased sample of the colorectal cancer population. In addition, our data showed significant elevation of serum CEA levels in patients with left-sided colonic tumors (P = 0.036), as well as in patients at a more advanced tumor stage (P = 0.043).
Part 2: Microarray Experiments and Real-time Reverse Transcription-PCR Validation
Cy5/Cy3 ratios of 8,000 genes showed an approximately Gaussian distribution (Fig. 1). The moderate skew to the right in Fig. 1 roughly indicates that the number of up-regulated genes was greater than the number of down-regulated genes in colorectal cancers. Genes with Cy5/Cy3 signal ratios >2 were arbitrarily defined as up-regulated in this study. As summarized in Fig. 2, more than 400 genes were over-expressed in each array chip but only 29 genes (listed in Table 2) were concurrently up-regulated in all four array experiments. Of these 29 candidate genes, 20 were tested for post-array validation using real-time reverse transcription-PCR in 31 paired samples: 40% (8/20) of these genes were confirmed to have a tumor/normal expression ratio > 2 (Tables 2 and 3, Fig. 3).
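The selection of candidate genes amounts to intersecting the per-array sets of up-regulated genes. A minimal Python sketch of that filter is given below; the function names, gene labels, and ratios are placeholders, not data from the study.

```python
def up_regulated(ratios, threshold=2.0):
    """Genes whose Cy5/Cy3 ratio exceeds the threshold on one array."""
    return {gene for gene, ratio in ratios.items() if ratio > threshold}


def concurrently_up_regulated(arrays, threshold=2.0):
    """Genes up-regulated on every array in the list."""
    if not arrays:
        return set()
    per_array = [up_regulated(ratios, threshold) for ratios in arrays]
    return set.intersection(*per_array)


# Hypothetical ratios for four arrays; gene labels are placeholders.
arrays = [
    {"gene_A": 3.1, "gene_B": 2.4, "gene_C": 2.2, "gene_D": 0.8},
    {"gene_A": 2.6, "gene_B": 1.7, "gene_C": 4.0, "gene_D": 2.5},
    {"gene_A": 5.2, "gene_B": 2.9, "gene_C": 2.1, "gene_D": 1.1},
    {"gene_A": 2.3, "gene_B": 2.2, "gene_C": 3.3, "gene_D": 0.9},
]
shared = concurrently_up_regulated(arrays)
print(sorted(shared))
```

Requiring concurrence across all arrays is a strict filter: a gene that misses the threshold on a single chip, such as gene_B in this example, is dropped.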
Part 3: Correlation Between Up-regulated Genes and Clinicopathologic Factors
Among the eight confirmed genes, six were linked to clinicopathologic factors. The statistical correlation between GRO1 and the patients' clinicopathologic variables is shown in Table 4. Table 5 summarizes statistically significant relationships between the presence or absence of up-regulation of six genes and the presence or absence of 10 clinicopathologic factors. Up-regulation of THY1 and PHLDA1 was found to be associated with the presence of anemia (P = 0.036 and 0.009, respectively). Up-regulation of HNRPA1 was more frequent in cancers of the right-sided colon than in those of the left side (P = 0.027). Up-regulated GPX2 was related to a higher degree of tumor differentiation (P = 0.019). The c-MYC oncogene was up-regulated more often in specimens from male than from female colon cancer patients (P = 0.012). GRO1 was significantly up-regulated in patients younger than 65 years old (P = 0.010); its over-expression also tended to be more frequent in tumors with a less advanced stage, without lymph node metastasis, and with lower serum CEA levels, although these associations did not reach statistical significance (P = 0.075, 0.060, 0.058, respectively). The data in Table 4 show that the less invasive the tumor (less advanced stage, less lymph node metastasis, and lower CEA levels), the more prominent the GRO1 over-expression.
Part 4: Expression Relationship Between Up-regulated Genes
Regression analysis of mRNA transcript levels between eight confirmed genes revealed a tight correlation between c-MYC and heterogeneous nuclear ribonucleoprotein A1 (HNRPA1) in tumor tissue (r = 0.948; Fig. 4). When transcript levels of these two genes in normal tissue were taken as the baseline, the adjusted mRNA levels of c-MYC and HNRPA1 still showed a strong correlation (r = 0.871; Fig. 5). This finding suggested that the transcription of HNRPA1 may be coupled to that of c-MYC in an unknown manner.
Discussion
By themselves, the long lists of data obtained from microarray experiments help little in the understanding of clinical characteristics. However, analysis of gene expression in correlation with clinical or phenotypic variations may indicate biologically meaningful changes at the transcriptional level. Prior to this study, other array-based studies had shown expression profiles or gene clusters associated with colorectal cancer (1-3); however, the relationship between individual genes and clinicopathologic factors had not been clarified. Singh et al. (6) used microarrays to identify genes that might predict the clinical behavior of a disease (prostate cancer), but there was no individual gene in their report whose expression correlated with the relevant clinical and pathologic parameters. In this study, we validated eight up-regulated genes in colorectal cancer tissue and found six of them to be linked to clinicopathologic variables.
In post-array validation of this study, we tested 20 candidate genes using real-time reverse transcription-PCR in 31 paired samples, including the same samples studied in the initial array experiments, plus an additional 27 new samples. The 20 candidate genes were up-regulated in all four chips; however, only 40% (8/20) of these candidate genes were validated by real-time reverse transcription-PCR as being up-regulated in colorectal cancer. Previously, Rajeevan et al. (7) used real-time reverse transcription-PCR to test 24 selected candidate genes from their array data, and 71% (17/24) of those genes passed the post-array validation. Their high agreement between array data and real-time reverse transcription-PCR results was mainly due to the fact that they used only one pair of cultured cell lines (two keratinocyte subclones) in both array experiments and post-array validation. In the current study, we used tissue samples from different patients; heterogeneity of tissue cells and sample-to-sample variation led to the low agreement between array data and real-time reverse transcription-PCR results. Therefore, validation of microarray results using another independent laboratory approach with additional tissue samples is mandatory if few microarrays are carried out and tissues are used as sources of experimental samples (8).
In combination with previous reports, some connections, either between up-regulated genes and carcinogenesis or between gene expression and clinical factors, can be established from this study. Inferences based on our findings and evidence from other studies are discussed below.
MYC versus Sex
In this study, c-MYC was over-expressed in 94% (17/18) of samples from male colon cancer patients, but in only 54% (7/13) of samples from females. This suggests a gender-related influence on c-MYC expression. The role of androgen in increasing c-MYC expression has been investigated and confirmed in prostatic cancer by many authors (9-12). It is therefore possible that androgen has a similar effect in colorectal cancer. Epidemiologic data concerning the relative risk of colorectal cancer between males and females (1.25 in Taiwan) are consistent with this postulation.
MYC versus HNRPA1
Our data also disclosed a strong linear correlation between the mRNA transcript levels of c-MYC and those of HNRPA1 (correlation coefficient, 0.948). HNRPA1, the heterogeneous nuclear ribonucleoprotein (hnRNP) core protein A1, has a modular structure consisting of two conserved RNA binding domains (13) and functions as a carrier for RNA during its export to the cytoplasm (14). The strong correlation between c-MYC and HNRPA1 expression may indicate a tight association between the transcription factor and the RNA binding proteins that shuttle across the nuclear membrane. Many authors have reported using c-myc antisense oligonucleotides to inhibit the cellular proliferation of various cancers, including colon cancer (15). Our data suggest that a combination of c-myc and hnRNP A1 antisense oligonucleotides could be a strategy for treating colon cancer more effectively than c-myc antisense oligonucleotides alone.
GRO1 versus Age and Tumor Stage + Lymph Node Metastasis + CEA Level
The up-regulation of the GRO1 gene in colorectal cancer (1, 16), together with the fact that the GRO1 protein functions as a potent mediator of leukocyte recruitment and proliferation in inflammatory diseases (17-19), hints that the growth of colon cancer might trigger an immune response accompanied by GRO1 over-expression. In this study, two findings indirectly support this speculation. First, our data showed that GRO1 was frequently up-regulated in less invasive cancers (less advanced stage, less lymph node metastasis, and lower CEA levels). This implies that GRO1 may have a protective effect (through the immune response against tumors) in preventing the progression of colon cancer. Second, we found that the GRO1 gene was significantly up-regulated in patients younger than 65 years old; because the immune response of younger people is generally stronger than that of the elderly, this may explain why up-regulation of GRO1 largely occurs in colon cancer patients younger than 65.
GPX2 versus Differentiation
GPX2 is expressed mainly in the epithelium of the gastrointestinal tract and its protein product, GPX-GI, functions as an intracellular selenium-dependent glutathione peroxidase that can reduce H2O2 and alkyl hydroperoxides (20). Previously, Chu et al. (21) reported that GPX2 mRNA levels in the colon of mice relatively resistant to dimethylhydrazine-induced colon cancer were higher than those in mice sensitive to chemical-induced colon cancer. In this study, we found that up-regulation of GPX2 was related to a higher degree of tumor differentiation. Both Chu et al.'s study and ours suggest that GPX2 gene expression has an adverse effect on tumor proliferation, possibly due to its differentiation-promoting effect. Chu et al. also identified three retinoic acid response elements in this gene and showed that retinoic acid, a vitamin A metabolite with known prodifferentiation effects, could induce GPX2 gene expression in a human breast cancer cell line (22). Therefore, if retinoic acid can induce GPX2 over-expression in colon cancer cells just as it does in cultured breast cancer cells, vitamin A could possibly be used clinically to promote cell differentiation in colon cancer.
THY1 and PHLDA1 versus Anemia
THY1 encodes THY-1, a surface glycoprotein characteristic of T cells, hematopoietic stem cells, liver stem cells (23), and blast cells of acute myeloid leukemia (24). THY-1 is structurally the simplest of the T cell antigen receptors in the immunoglobulin supergene family (25), but its role in immune response is unclear. PHLDA1, known in mice as the T cell death-associated gene, is one of the gut-expressed proteins with high T cell epitope homology (26). In this study, we found that both THY1 and PHLDA1 were up-regulated in colon cancer tissue and that both were statistically associated with the patient's anemia. This finding, together with other reports (26, 27), hints that the THY1 and PHLDA1 products are two surface molecules responsible for crosstalk between colon cancer and the immune system.
Further work is required to determine whether the up-regulation of these two genes in colon cancer is related to tumor-infiltrating T cells (27) being recruited, leading to increased tumor cell necrosis and subsequent tumor mass bleeding and patients' anemia.
The clinical correlates of up-regulated genes in this study are essentially statistical inferences. More scientific evidence confirming the correlations and associated postulations is still necessary. However, the findings in this study provide clues to molecular events related to the carcinogenesis or clinical features of human colorectal cancer and suggest possible therapeutic targets.
Grant support: Supported in part by research grant AS92IMB3 from Academia Sinica, Taipei, Taiwan.
Acknowledgments
We thank Dr. K. Deen for his critical reading of this manuscript.