Abstract
Colon and ovarian cancers can be difficult to distinguish in the abdomen, and the distinction is important because it determines which drugs will be used for therapy. To identify molecular markers for that differential diagnosis, we developed a multistep protocol starting with the 60 human cancer cell lines used by the National Cancer Institute to screen for new anticancer agents. The steps included: (a) identification of candidate markers using cDNA microarrays; (b) verification of clone identities by resequencing; (c) corroboration of transcript levels using Affymetrix oligonucleotide chips; (d) quantitation of protein expression by “reverse-phase” protein microarray; and (e) prospective validation of candidate markers on clinical tumor sections in tissue microarrays. The two best candidates identified were villin for colon cancer cells and moesin for ovarian cancer cells. Because moesin stained stromal elements in both types of cancer, it would probably not have been identified as a marker if we had started with mRNA or protein profiling of bulk tumors. Villin appears at least as useful as the currently used colon cancer marker cytokeratin 20, and moesin also appears to have utility. The multistep process introduced here has the potential to produce additional markers for cancer diagnosis, prognosis, and therapy.
INTRODUCTION
Colon and ovarian carcinomas can be difficult to distinguish from each other histologically in ovarian masses, in peritoneal carcinomatosis, and in metastases to distant lymph nodes (1, 2, 3, 4, 5, 6). It has been reported that 7% of ovarian masses are, in fact, metastases and that most of those are from primary lesions in the gastrointestinal tract (1, 2, 7). Peritoneal carcinomatosis, an unfortunate manifestation of colon and ovarian tumors, also often requires differential diagnosis at sites of dissemination. Overall, adenocarcinomas account for 60% of all cancers of unknown primary, as defined by the detection of one or more metastatic tumors after an initial standard evaluation fails to identify a primary tumor site (8). Additional pathological studies rarely identify a primary site and are indicated in only selected patient subgroups (8). Carcinoma of unknown primary has a generally bad prognosis, with a median survival of 4–5 months, and the tumors tend to respond poorly to empirical therapy (8, 9). A number of markers have been used in an attempt to distinguish colon and ovarian malignancies by immunohistochemistry. These have included CA125, CEA, CK7, and CK20 (3, 4, 5, 6, 10). The combination of CK7 and CK20 has become the de facto standard against which other markers are judged, but there is still a clear need for additional markers (8, 10). The problem has practical importance for choice of therapy. Colorectal cancer is generally treated with 5-FU; ovarian cancer is generally treated with paclitaxel and a platinum agent (11, 12).
To identify additional markers for this differential diagnosis, we analyzed molecular profiling data on the 60 human cancer cell lines (the NCI-60) used by the Developmental Therapeutics Program of the NCI to screen potential anticancer agents (13, 14, 15, 16). More than 100,000 chemical compounds plus a large number of natural product extracts have been screened since 1990, and their patterns of activity have been found to encode rich information about mechanisms of drug action and resistance (15, 16, 17). Screening the compounds for activity also profiles the cells for their patterns of drug sensitivity, thereby providing a unique perspective on the molecular pharmacology of cancer (15, 18, 19, 20, 21). That perspective has been enhanced by profiling the cells at the molecular level. Many laboratories at the NCI and other institutions have characterized the cells one or a few molecules at a time. Our approach has been to profile the cells more broadly at the DNA, mRNA (19, 20, 21), and protein (18) levels. This profiling provided the starting point for the present study. Cell lines in culture are not fully representative of tumor cells in vivo, of course, but the RNA and protein profiling studies have indicated that they maintain molecular signatures of their tissues of origin (18, 19, 20, 21). It would have been equally reasonable to start the identification process with clinical specimens, but we began with the cell lines because, unlike clinical tumors, they are available in large amounts for study and are homogeneous in cell lineage. Ultimately, the value of the approach would, we felt, depend on its results. For a reason to be discussed later, one of the histological markers identified would probably not have been found if we had begun by profiling bulk clinical tumors.
A schematic view of our overall approach is shown in Fig. 1. The steps were as follows: (a) select candidate markers based on differential colon/ovarian transcript expression levels obtained from 9706-clone cDNA arrays and based on the availability of suitable antibodies for detection as well as biological plausibility; (b) resequence the candidate clones to verify their identity; (c) cross-check the mRNA profiles obtained for candidate genes by cDNA microarray (19, 21) with those obtained for the same samples by Affymetrix oligonucleotide chip (20); (d) use protein lysate arrays (22, 23) in a newly developed high-density format to test whether the differential profiles seen at the transcript level would be reflected at the protein level as well; and (e) validate the markers prospectively on clinical materials using tissue microarrays with amplified immunohistochemical detection (24). Two candidate markers that survived this triage process have favorable characteristics for use in clinical pathology for distinguishing colon from ovarian cancer cells. Use of those markers, in combination with each other or with conventional markers, appears to improve the accuracy of discrimination.
MATERIALS AND METHODS
Cell Lines.
The NCI-60 set includes seven colon carcinomas (COLO205, HCC-2998, HT29, KM12, SW-620, HCT-116, and HCT-15) and six ovarian carcinomas (SK-OV-3, IGROV1, OVCAR-4, OVCAR-3, OVCAR-5, and OVCAR-8). For this study, we focused on these 13 cell types out of the 60. Xenografts of these lines in nude mice differ widely in histology, from undifferentiated to well differentiated (25). The more differentiated ones, such as HT29 and OVCAR-5, tend to show tubular and acinar/glandular features (25). Immunohistochemical analysis showed CEA staining among the colon lines, particularly COLO205, HT29, and KM12. CA125 and CA19-9 stained primarily the ovarian lines, particularly OVCAR-3 and OVCAR-5 (25).
cDNA Microarrays.
The microarrays and hybridization experiments have been described in detail previously (19, 21). Briefly, the arrays were produced by Synteni, Inc. (now Incyte, Inc., Palo Alto, CA). They consisted of robotically spotted, PCR-amplified cDNAs on coated glass slides. Approximately 8000 UniGene clusters were represented by the 9706 cDNA clones (Research Genetics, Huntsville, AL), including 3700 named genes, 1900 human genes homologous with those of other organisms, and 4104 expressed sequence tags of unknown function but defined chromosomal location. Sequence reverification suggested that 15–20% of the array elements contained misidentified cDNA clones or contaminants representing more than one gene (21). For each two-color comparative array hybridization, labeled cDNA was synthesized by reverse transcription from test cell mRNA in the presence of Cy5-dUTP and from reference pool mRNA in the presence of Cy3-dUTP. The reference pool (19, 21) consisted of 12 highly diverse cell lines out of the 60 cell lines. Included were leukemias HL-60 (TB) and K-562, non-small cell lung cancer NCI-H226, colon cancer COLO205, central nervous system cancer SNB-19, melanoma LOX-IMVI, ovarian cancers OVCAR-3 and OVCAR-4, renal cancer CAKI-1, prostate cancer PC-3, and breast cancers MCF7 and HS 578T. The cells were harvested, and their RNA was stabilized within 1 min after removal from an incubator at 37°C. Polyadenylated mRNA was then prepared and characterized as described previously (19, 21). The data used here are available online.8
cDNA Microarray Data Analysis.
The experiments were those reported previously (19, 21), but the raw red and green channel data were normalized by a more refined algorithm based on fitting with a moving Gaussian kernel to remove the curvature generally observed in scatter plots of red and green channels against each other (26). After normalization, the expression level of each gene in the test cell relative to that in the reference pool was expressed as E = log10(CH2FL/CH1FL), where CH2FL and CH1FL are the signal amplitudes in the test cell and reference pool channels, respectively. To identify clones that might potentially distinguish colon and ovarian carcinomas, we then applied several statistical techniques to the expression profiles from the seven colon and six ovarian cell lines. Included as indices of selectivity were (a) the P from an unpaired t test comparing levels observed in the colon and ovarian lines, (b) the mean difference in expression ratio (E) between colon and ovarian lines, and (c) the median difference in amplitude of the single-channel test cell expression signal (CH2FL) between colon and ovarian lines. Single-channel values are not as accurately determined as ratios of red and green channels, but they suggest whether or not the amplitude of signal will be high enough for utility of the marker in in situ hybridization and, possibly, immunohistochemistry. On the basis of the three criteria, we identified the top 300 candidate genes in each direction and then clustered their expression patterns using a hierarchical average linkage algorithm with correlation metric (15, 18, 19) in our Clustered Image Map program package (cimminer).8
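The three selectivity indices described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used in the study: the function names are hypothetical, the input is assumed to be already normalized, and the t statistic is computed with pooled variance (an assumption; the text says only "unpaired t test"). For a fixed number of cell lines per group, ranking clones by the absolute value of this statistic is equivalent to ranking them by the P of the equal-variance test.

```python
import math
from statistics import mean, median, variance

def log_ratio(ch2fl, ch1fl):
    """E = log10(CH2FL / CH1FL): test-cell signal relative to the reference pool."""
    return math.log10(ch2fl / ch1fl)

def pooled_t(a, b):
    """Unpaired, equal-variance t statistic for two groups (assumed test variant)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))

def selectivity_indices(colon, ovarian):
    """colon, ovarian: lists of (CH2FL, CH1FL) pairs for one clone, one pair per cell line."""
    e_co = [log_ratio(t, r) for t, r in colon]
    e_ov = [log_ratio(t, r) for t, r in ovarian]
    ch2_co = [t for t, _ in colon]
    ch2_ov = [t for t, _ in ovarian]
    return {
        "t": pooled_t(e_co, e_ov),                              # index (a), as a t statistic
        "mean_diff_E": mean(e_co) - mean(e_ov),                 # index (b)
        "median_diff_CH2FL": median(ch2_co) - median(ch2_ov),   # index (c)
    }
```

Positive values of all three indices would favor colon selectivity for the clone in question, and negative values would favor ovarian selectivity.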
Sequence Verification of Candidate Clones.
Once the candidate molecules had been identified (based on the three numerical criteria above plus the availability of antibody and biological plausibility), their IMAGE clones were obtained from Research Genetics or American Type Culture Collection (Manassas, VA) and resequenced. Sequencing was performed using a universal primer pair, GF200, with the pT3T7 or pBluescript vector. Sequences were aligned using either Entrez web tools (National Center for Biotechnology Information, Bethesda, MD) or the DoubleTwist software package (Doubletwist, Inc., Oakland, CA) to confirm length and location of the clone insert in the full-length cDNA or genomic sequence. Only after full-length resequencing and verification that the clone was correctly identified did we proceed to further evaluation of the candidate marker.
Affymetrix Oligonucleotide Chips.
After identification and sequence validation of the candidates, we cross-compared their mRNA expression patterns from the cDNA microarrays with expression patterns that we and our collaborators obtained by running 6800-gene Affymetrix oligonucleotide chips (HuFL 6.8K; Affymetrix, Santa Clara, CA) on the same samples (20). The combination of these two databases generated by different methods with different sources of error provided mutual corroboration and, therefore, more reliable gene expression data.9 The Affymetrix oligonucleotide chip data for the NCI-60 cell lines are available at our web site8 in the form used in this analysis.
Analysis of Protein Expression Using Reverse-phase Protein Lysate Arrays.
To test whether the colon-ovarian selectivities seen at the transcript level would also be reflected at the protein expression level, we used reverse-phase protein lysate microarrays that included all 60 cell lines plus controls, each at 10 serial 2-fold dilutions. The arrays were designed and produced as described in detail elsewhere,10 using a modification of a previous protocol for making lower-density arrays (22). Briefly, the arrays of 648 spots were produced on nitrocellulose-coated glass slides (FAST Slides; Schleicher & Schuell, Dassel, Germany) using a pin-in-ring format Affymetrix GMS 417 Arrayer. Protein levels were assessed using the Catalyzed Signal Amplification System (DAKO, Carpinteria, CA), which is based on horseradish peroxidase and diaminobenzidine. Because of the 10 serial dilutions, the nominal dynamic range was ∼1000-fold. Array images were processed using a modification of the P-SCAN program package (27). Before immunostaining of the arrays, each antibody was tested against a pool of all 60 cell lines by conventional Western blotting to check that it produced only a single major band at a loading of 20 μg of protein.
Construction of Tissue Microarrays.
TARP1 tumor tissue arrays were designed and produced by the NCI TARP using anonymous donor blocks obtained from the Cooperative Human Tissue Network. Donor blocks that met previously described inclusion criteria were selected and reviewed. Portions containing tumor were selected by one of us (S. M. H.). The arrays were produced as described previously (24), using a manual tissue arrayer (Beecher Instruments, Silver Spring, MD). The array design included specimens of 42 ovarian malignancies of surface epithelial origin, 45 lymphomas, 91 colon carcinomas, 90 non-small cell lung carcinomas, 97 prostate carcinomas, 89 breast carcinomas, 24 melanomas, 25 glioblastomas multiforme, and 62 normal human tissues of various types. Five-μm sections were cut from the array blocks using tape sectioning materials from Instrumedics (Hackensack, NJ). Every fiftieth slide was stained with H&E for quality control review.
Immunohistochemical Staining of Tissue Microarrays.
The arrays were stained using a method based on horseradish peroxidase-labeled polymer (Envision+; DAKO), preceded by an antigen retrieval procedure. The antigen retrieval step was carried out as follows for villin and moesin: the arrays were placed in a microwavable pressure cooker containing preheated buffer [0.1% Tween 20 and 0.01 M citrate buffer (pH 6.0)] and heated in a microwave oven at maximum power (800 W) for 8 min. For retrieval of CK7 and CK20 antigens, the arrays were pretreated with a solution of 0.1% trypsin for 20 min before the pressure cooker procedure. Primary antibodies were applied manually to the arrays, and incubation was carried out for 2 h or overnight at room temperature. The remainder of the procedure for all markers was performed using an automated immunostainer (DAKO) and the Envision+ detection kit (according to manufacturer’s instructions). Murine monoclonal anti-villin antibody was obtained from Novocastra Laboratories (Newcastle upon Tyne, United Kingdom) and used at a dilution of 1:200. Murine monoclonal antimoesin was obtained from Transduction Laboratories (Lexington, KY) and used at a dilution of 1:500. Murine antibodies directed against CK7 and CK20 were obtained from DAKO and used at a dilution of 1:200. The immunohistochemistry was scored as 0 (negative) or 1 (positive) according to the staining of epithelial components.
Statistical Evaluation of the Markers Identified.
Statistical analysis was performed for the four single markers and all possible two-, three-, and four-marker combinations thereof. Parameters calculated (Table 3) included the accuracy, sensitivity, specificity, Pearson’s χ2 statistic, likelihood ratio χ2, RSquare, and Fisher’s exact Ps. These calculations were done using the JMP statistical software package (SAS Institute, Cary, NC). McNemar and bootstrap procedures were also performed, as noted in “Results.”
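For readers who wish to recompute the simpler parameters from binary staining scores, the following Python sketch shows the standard definitions of sensitivity, specificity, accuracy, and predictive values. It is not the JMP procedure used here; the function name and input layout are hypothetical, and the sketch assumes that both tumor types and both score values occur in the input.

```python
def binary_marker_stats(scores, is_target):
    """scores: 0/1 staining scores; is_target: True where the specimen is the
    tumor type the marker is meant to detect (e.g., colon for villin)."""
    pairs = list(zip(scores, is_target))
    tp = sum(1 for s, t in pairs if s == 1 and t)       # stained, target tumor
    fn = sum(1 for s, t in pairs if s == 0 and t)       # unstained, target tumor
    tn = sum(1 for s, t in pairs if s == 0 and not t)   # unstained, other tumor
    fp = sum(1 for s, t in pairs if s == 1 and not t)   # stained, other tumor
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
        "ppv": tp / (tp + fp),   # positive predictive value
        "npv": tn / (tn + fn),   # negative predictive value
    }
```

Applied to one marker column of a per-specimen score table, with the known tumor type as the reference, this yields the single-marker entries of a summary table of the kind described above.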
RESULTS
Selection of Candidate Markers Based on cDNA Microarray Expression Studies.
As described in “Materials and Methods,” we performed unpaired t tests for each of the 9706 genes (i.e., cDNA clones) to assess their colon-ovarian selectivities. The 300 genes with the lowest Ps for colon selectivity and the 300 genes with the lowest Ps for ovarian selectivity were selected for further consideration. Hierarchical clustering of these 600 genes in a Clustered Image Map (15) using only the colon and ovarian data (Fig. 2) showed a clear distinction between the two classes of markers. We next searched among the 600 genes for those with large expression ratio differences [(1/7)ΣE_co – (1/6)ΣE_ov or (1/6)ΣE_ov – (1/7)ΣE_co, where the sums run over the seven colon (co) and six ovarian (ov) cell lines; see “Materials and Methods” for definitions]. In this way, we identified genes with high relative expression levels favoring either colon or ovarian carcinoma cell lines. Expressed sequence tags of unknown function and clones whose sequences had not been reverified were eliminated from consideration (21). Removing those clones reduced the numbers of candidate genes from 300 in each direction to 122 for colon and 135 for ovarian cancer. Among those genes, we selected candidates for immunohistochemical testing on the basis of the ranking criteria described above and also on the basis of antibody availability. As a result of these statistical and nonstatistical criteria, we focused on villin (IMAGE clone ID 469974, UniGene cluster Hs.166068) as a candidate marker for colorectal adenocarcinoma and moesin (IMAGE 486864, UniGene cluster Hs.170328) as a candidate for ovarian adenocarcinoma. The percentile ranks and indices of colon-ovarian selectivity of expression for these two genes are shown in Table 1 and Fig. 3A.
Sequence Verification.
Because cDNA clones are notoriously subject to misidentification or contamination, the likelihood that a clone represents a pure sample of the correct clone on cDNA microarrays has been reported as 60–80% in various studies (21, 28, 29, 30). Hence, we resequenced the clones of candidate markers. When we did so, the insert sequence obtained for villin clone 469974 did not match any mRNA sequence available in dbEST or full-length cDNA entry in GenBank. However, when we examined available genomic sequence information using the DoubleTwist program (Oakland, CA), we found a genomic clone, RP11-398G12, that contains a sequence predicted to be in the last exon of villin. The last exon can be present or absent, depending on which of two polyadenylation signals is selected in the splicing process. Hence, there are villin transcripts of two different lengths (31, 32), both of which accumulate during differentiation of colon epithelial cells (31, 32, 33). The size difference is entirely due to an 800-nucleotide extension of the 3′ noncoding region in the longer mRNA. The coding sequences of the two mRNAs are identical (32). Clone 469974 (which is 791-bp long) represents the longer of the two villin transcripts. The two mRNAs for villin and the IMAGE clone alignment are shown schematically in Fig. 3B. Also shown in Fig. 3B is the IMAGE clone 486864 alignment for moesin mRNA based on resequencing.
Corroboration of mRNA Expression Levels Using Affymetrix Oligonucleotide Chips.
For transcripts indicated by microarray to be of interest, it is generally necessary to verify expression results using an independent method. In this instance, we had the advantage of having analyzed the same 60 cell samples using Affymetrix oligonucleotide GeneChip arrays (20), which are subject to a very different set of possible errors. Comparison of the seven colon and six ovarian lines revealed surprisingly high correlation between the two data sets for the candidate markers. Pearson’s correlation coefficients between cDNA and oligonucleotide arrays for villin and moesin were 0.84 (two-tailed 95% bootstrap confidence interval of 0.54–0.95) and 0.85 (0.56–0.97), respectively. The corresponding Spearman correlation coefficients were 0.69 (0.10–0.95) and 0.78 (0.24–0.99), respectively. Therefore, we concluded that the candidates we had selected on the basis of cDNA microarray data were worthy of further investigation.
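A cross-platform comparison of this kind can be sketched as follows. The sketch computes Pearson and Spearman coefficients for one gene across the cell lines and a two-tailed percentile bootstrap interval obtained by resampling cell lines with replacement. The percentile method, the number of resamples, and all function names are assumptions made for illustration; the text does not specify which bootstrap variant was used.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """Average ranks (tied values share the mean of their positions)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def bootstrap_ci(x, y, stat=pearson, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    vals = []
    while len(vals) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # correlation is undefined for a constant resample
        vals.append(stat(xs, ys))
    vals.sort()
    return vals[int(alpha / 2 * n_boot)], vals[int((1 - alpha / 2) * n_boot) - 1]
```

Here x and y would hold the cDNA array and oligonucleotide chip values for one gene across the 13 colon and ovarian lines; with so few points the interval is necessarily wide, consistent with the ranges reported above.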
Verification of Selectivity at the Protein Level Using Reverse-phase Protein Lysate Microarrays.
Because it is well understood that mRNA expression levels may not reflect protein expression levels, especially across disparate cell types, we asked whether the candidate markers selected at the mRNA level would show similar selectivity at the protein level. To address the question, we used reverse-phase protein lysate microarrays recently designed in this laboratory10 to study all 60 cell lines together on the same slide. First, the monoclonal antibodies to be used for detection of protein on the arrays were checked to see whether they would produce only a single band at the appropriate molecular weight on Western blots (with loading of ∼20 μg/lane). The antibodies against villin and moesin met that criterion. Immunohistochemical staining of the lysate arrays with these antibodies then showed a qualitative difference between colon and ovarian cell lines. Quantitative analysis using a dose-interpolation algorithm10 showed a surprisingly high correlation between the cDNA and protein lysate microarray data for both molecules, suggesting transcriptional regulation of expression. The Pearson correlation coefficients between cDNA array and protein lysate array data were 0.92 (0.72–0.98) and 0.84 (0.46–0.97), respectively, for villin and moesin. The corresponding Spearman correlation coefficients were 0.88 (0.52–0.98) and 0.86 (0.45–0.99), respectively.
Immunohistochemistry on Tissue Microarrays.
To test prospectively whether the results obtained for cell lines would carry over to human tumors, the 133 colon and ovarian tumor specimens on TARP tissue microarrays were examined with antibodies for villin, moesin, and “conventional” markers (CK7 and CK20) for distinguishing colon from ovarian carcinomas. CK7 is usually present in primary ovarian carcinomas but is infrequent in colon carcinomas. Conversely, CK20 is common in colon carcinomas but rare in ovarian carcinomas (4). On the TARP tissue microarrays, CK7 and CK20 showed clear staining patterns in ovarian and colon specimens, respectively. One of the candidate markers, villin, exhibited a strong, distinct pattern of staining for colon (Fig. 4, a and c). Most of the ovarian cases were villin-negative (Fig. 4b), but significant staining was seen in six cases. Stromal regions of both types of tumor were moesin-positive, but this staining was clearly distinguishable from the epithelial staining seen in the ovarian tumors (Fig. 4, g and h). The results for each specimen are recorded in Table 2.
Statistical Evaluation of the Candidate Markers.
Table 2 lists the result obtained for each tissue array specimen when arrays were stained with antibody for villin, moesin, CK7, or CK20. Table 3 then summarizes statistical calculations from Table 2 for each marker and combination of markers. In doing the calculations, we treated villin and CK20 as markers for colon cancer and moesin and CK7 as markers for ovarian cancer. Because no single statistic can capture all aspects of a marker’s quality, a number of different statistics were tabulated. Nominally, the highest accuracy was seen for villin as a marker of colon (0.93), but the distinction from the next best, CK20 as a marker for colon (0.91), is not statistically significant (P > 0.05 by McNemar’s test). Note that sensitivity, specificity, positive predictive value, and negative predictive value are symmetrical with respect to colon and ovary in Table 3, given the reciprocal nature of the scoring. CK7 and CK20 showed good sensitivity (0.97 and 0.88) and specificity (0.84 and 0.97) as ovarian and colon markers, respectively. For villin as a colon marker, the corresponding values were 0.97 and 0.85. Moesin showed reasonably good sensitivity (0.78) and specificity (0.92) as an ovarian marker, but the intensity of staining was not as strong as it was for villin (Fig. 4, e and f). The marker that gave the fewest incorrect predictions (eight cases) was the new candidate villin. A natural next step in the analysis was to consider the markers in all possible combinations to see whether results better than those for single markers could be obtained. When doing such calculations, decision rules of different types could have been invoked for marker combinations that were not unanimous in their prediction. “Majority rules” would give the decision to the tumor type with the most markers in its favor. Ties (1 versus 1 or 2 versus 2) could be left as ambiguous or else resolved in favor of the “best” marker(s) by some criterion such as accuracy.
For Table 3 we chose to use the most conservative rule, requiring unanimity for a prediction to be made. We did so in part to permit direct comparison among marker sets without invoking ad hoc tiebreakers (that would tend to favor villin over the others) for even numbers of markers. More importantly, we did so because the pathologist faced with conflicting predictions from different markers would be likely to consider the indications as ambiguous and seek other sources of information for making the differential diagnosis. The data in Table 2 can be used by those interested to introduce other decision rules for the calculations.
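The unanimity rule can be expressed compactly in code. The Python sketch below is one reading of the rule, offered for illustration: it assumes reciprocal scoring, in which a positive colon marker or a negative ovarian marker counts as a vote for colon and the converse counts as a vote for ovarian. The function names and the dictionary layout are hypothetical.

```python
COLON_MARKERS = {"villin", "CK20"}    # a positive stain favors colon
OVARIAN_MARKERS = {"moesin", "CK7"}   # a positive stain favors ovarian

def marker_vote(marker, score):
    """Tumor type favored by one marker's 0/1 score (assumed reciprocal scoring)."""
    if marker in COLON_MARKERS:
        return "colon" if score == 1 else "ovarian"
    if marker in OVARIAN_MARKERS:
        return "ovarian" if score == 1 else "colon"
    raise ValueError(f"unknown marker: {marker}")

def unanimous_call(scores):
    """scores: dict mapping marker name to 0/1 score for one specimen.
    Returns 'colon' or 'ovarian' only if every marker agrees, else 'ambiguous'."""
    votes = {marker_vote(m, s) for m, s in scores.items()}
    return votes.pop() if len(votes) == 1 else "ambiguous"
```

For example, a specimen scored CK7-negative and CK20-positive would be called colon under this rule, whereas one positive for both would be left ambiguous. A majority rule or an accuracy-weighted tiebreaker could be substituted by changing only `unanimous_call`.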
Pairing of the conventional markers, CK7 and CK20, provided more accurate prediction than did either marker alone. The CK7-CK20 pair yielded 87 correct, 3 incorrect, and 17 ambiguous among 107 informative cases. That is, 17 of the cases (16%) could not be diagnosed on the basis of the conventional markers. The immunostaining with villin and moesin yielded 85 correct, 3 incorrect, and 15 ambiguous among 103 informative cases for the same samples as those tested with antibodies to CK7 and CK20; 87 of 107 (81%) and 85 of 103 (83%) are indistinguishable by McNemar’s test (P > 0.05). Inclusion of a third marker decreased the number of incorrect predictions still further (to one or two), and use of all four markers left just one incorrect. The price of adding more markers and requiring unanimity was, of course, an increase in the number of cases left as ambiguous.
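McNemar’s test compares two paired classifiers using only the discordant cases, i.e., specimens that one marker set classifies correctly and the other does not. The Python sketch below implements the exact (binomial) form of the test. Whether the exact or the χ2 form was used in this study is not stated, so the choice here is an assumption, and the discordant counts would have to be tallied from the per-specimen data in Table 2.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar P.
    b: cases correct with marker set A only; c: cases correct with set B only."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) tail probability of a split at least as uneven as (b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With evenly split discordant counts the function returns 1.0. A split as uneven as 0 versus 6 gives P = 0.03125.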
DISCUSSION
Colon and ovarian carcinomas can be difficult to distinguish histologically from each other in ovarian masses, in peritoneal carcinomatosis, and in metastases to distant lymph nodes (1, 2, 3, 4, 5, 6). Misdiagnosis in this context may result in delayed identification of the primary lesion or misdirected clinical procedures. Misdiagnosis may also lead to inappropriate therapy because metastatic colon cancer is generally treated with 5-FU, whereas ovarian cancer is most often treated with paclitaxel and a platinum agent (11, 12). For patients with carcinoma of unknown primary, a number of palliative chemotherapy approaches using combinations of 5-FU, mitomycin-C, doxorubicin, and cyclophosphamide have been tried, but with limited success (8, 34). Colorectal adenocarcinomas can generally be differentiated from other carcinomas on the basis of ultrastructural features, such as the microfilamentous cores of microvilli, apical electron dense bodies, and glycocalyceal bodies (35). A prospective study by Hickey and Seiler (35) demonstrated high sensitivity (1.0) and specificity (0.97) by ultrastructural features alone. However, electron microscopic studies cannot be done routinely, so better markers for standard clinical pathology are required. Current candidates include the well-known marker molecules CEA, CA125, CA19-9, vimentin, CK7, and CK20 (2, 3, 4, 5, 6). There are reports that immunostaining for these markers can predict the primary site in metastatic adenocarcinoma correctly 60–80% of the time, but none of the markers is fully site specific (5, 10). Hence, additional markers would be advantageous, and our aim in this study was to provide them. To do so, we pursued the multistep protocol shown schematically in Fig. 1, starting with colon and ovarian cell lines in the NCI-60, using cDNA microarrays, clone sequencing, oligonucleotide arrays, protein lysate arrays, and tissue microarrays.
Clinical tumor specimens have provided the usual starting point for identification of diagnostic markers, sometimes using a high-throughput technology such as the cDNA array or Affymetrix oligonucleotide GeneChip (36, 37). Clinical materials have several drawbacks, however. First, it may be difficult to obtain the amount of RNA or protein necessary. Amplification methods for mRNA are problematical and do not exist for protein. Second, clinical tumors are grossly heterogeneous. They contain endothelial cells, infiltrating leukocytes, fibroblasts, and other stromal cells, as well as tumor cells. The latter may represent only a small fraction of the overall cellular material, effectively diluting out any inherent marker selectivity. Cell lines in culture have the obvious disadvantage that they are not fully representative of cells in vivo, but unlike clinical materials, they are available in unlimited numbers and are homogeneous with respect to cell lineage (15). Because they tend to retain gene expression signatures of their organs of origin (19, 21), we chose to use them as starting points for identification of differential markers.
The present study indicates that villin and perhaps moesin should be strongly considered for clinical use in distinguishing colon and ovarian carcinomas. Villin is a Mr 95,000 major microfilament-associated protein that binds actin filaments in a calcium-dependent manner (38, 39) and supports assembly of the actin core bundles of microvilli (33). Accumulation of villin has been observed at the transcript level in HT29 colon adenocarcinoma cells (31, 33) and at the protein level in colon adenocarcinomas but in no other tumor types tested (33). Because the most consistent and specific ultrastructural indicator of colorectal adenocarcinoma appears to be the set of long microfilamentous rootlets extending from the cell surface into the apical cytoplasm (35, 40), it makes sense that villin expression would provide a marker for colorectal carcinoma. Villin has previously been suggested as a marker for colon cancer (41), but, as far as we know, there has been no previous study that provides a statistical basis for its use in the differential diagnosis of colon and ovarian cancers.
Villin offers some potential technical advantages over CK20 in that detection of CKs must generally be enhanced through antigen unmasking using a proteolytic enzyme such as trypsin. This step is difficult to standardize, and it is sensitive to variations in tissue fixation. In contrast, the method we introduced here for antigen unmasking of villin uses a HIER step, rather than an enzymatic degradation. It is our experience that HIER methodologies are easier to standardize and result in consistently good results, regardless of variations in tissue fixation.
The second new candidate marker, moesin, was originally isolated as a heparin-binding protein and identified as a member of the ezrin-radixin-moesin (ERM) family of homologous proteins that serve as general cross-linkers between plasma membranes and actin filaments (42, 43). Interestingly, both villin and moesin are involved in the structural assembly of cytoskeletal elements. Perhaps coincidentally, they are also similar in that phosphorylations (of tyrosine in villin and threonine in moesin) play pivotal roles in the organization of actin-associated cytoskeleton (44, 45).
In our validation studies using TARP tissue microarrays, moesin proved less effective as a single marker than did villin, perhaps partly because ovarian cancers are more diverse than colon cancers. Antimoesin antibody did not generally stain colon cancer cells but did stain surrounding stromal cells to a considerable extent. Hence, moesin might easily have been missed as a marker for distinguishing immunohistochemically between the two types of tumor cells if we had started with protein or transcript profiling of bulk tumors, rather than cultured cells. As was the case with villin, we were able to use a HIER protocol, rather than enzymatic degradation, for moesin antigen retrieval. However, this advantage was counterbalanced by relatively low expression levels in comparison with those of villin and the CKs. A better antimoesin antibody and further methodological refinement may be beneficial. Overall, availability of high-quality antibodies was a limiting factor in the triage process described here. To alleviate that problem, we have now screened >400 antibodies of different specificities by Western blot and have incorporated the results (plus extensive information on the antibody reagents) into a user-friendly relational database11 to aid ourselves and others in the search for additional markers.
In this study, we have presented a rather general set of steps that can be used to identify and validate candidate molecular markers. The protocol draws on a series of different types of “omic” (i.e., genomic and proteomic) experiments and analyses (46, 47). Additional studies will, of course, be necessary to test the utility of villin and moesin for use in clinical pathology, but their characteristics as assessed to date using tissue microarrays are promising. Whether or not they might be similarly useful as serum markers remains to be explored.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
7. The abbreviations used are: CEA, carcinoembryonic antigen; CK, cytokeratin; NCI, National Cancer Institute; 5-FU, 5-fluorouracil; TARP, Tissue Array Research Program; HIER, heat-induced epitope retrieval.
8. http://discover.nci.nih.gov.
9. J. K. Lee, K. J. Bussey, F. G. Gwadry, W. Reinhold, G. Riddick, S. L. Pelletier, S. Nishizuka, G. Szakacs, J.-P. Annereau, M. Gottesman, and J. N. Weinstein. Comparing cDNA and oligonucleotide array data: Concordance of gene expression across platforms for the NCI-60 cancer cells. Genome Biology, submitted.
10. Satoshi Nishizuka, Lu Charboneau, Lynn Young, Sylvia Major, William C. Reinhold, Mark Waltham, Hosein Kouros-Mehr, Kimberly J. Bussey, Jae K. Lee, Virginia Espina, Peter J. Munson, Emanuel Petricoin III, Lance A. Liotta, John N. Weinstein. Proteomic profiling of the NCI-60 cancer cell lines using new high-density “reverse-phase” lysate microarrays, submitted.
11. Sylvia M. Major, Satoshi Nishizuka, Rick Rowland, Frank L. Washburn, John N. Weinstein. Abminer: a relational database of antibodies screened against the NCI-60 cancer cell lines, manuscript in preparation.
Acknowledgments
Multitumor tissue microarray slides were obtained from the TARP of the NCI, NIH (Bethesda, MD). Collaborations with D. Ross, P. O. Brown, M. Eisen, D. Botstein, et al. at Stanford and with J. Staunton, T. Golub, E. Lander, et al. at the Whitehead Institute produced the cDNA array and Affymetrix chip expression data sets, respectively. We also thank M. J. Miller for sequencing and M. Y. L. Song for immunohistochemical staining. We are grateful to staff of the NCI’s Developmental Therapeutics Program for providing the NCI-60 cells and to the late K. D. Paull for his pioneering work on the informatics of the NCI-60 profiling system.