Abstract
Advances in the understanding of cancer cell biology and response to drug treatment have benefited from new molecular technologies and methods for integrating information from multiple sources. The NCI-60, a panel of 60 diverse human cancer cell lines, has been used by the National Cancer Institute to screen >100,000 chemical compounds and natural product extracts for anticancer activity. The NCI-60 has also been profiled for mRNA and protein expression, mutational status, chromosomal aberrations, and DNA copy number, generating an unparalleled public resource for integrated chemogenomic studies. Recently, microRNAs have been shown to target particular sets of mRNAs, thereby preventing translation or accelerating mRNA turnover. To complement the existing NCI-60 data sets, we have measured expression levels of microRNAs in the NCI-60 and incorporated the resulting data into the CellMiner program package for integrative analysis. Cell line groupings based on microRNA expression were generally consistent with tissue type and with cell line clustering based on mRNA expression. However, mRNA expression seemed to be somewhat more informative for discriminating among tissue types than was microRNA expression. In addition, we found that there does not seem to be a significant correlation between microRNA expression patterns and those of known target transcripts. Comparison of microRNA expression patterns and compound potency patterns showed significant correlations, suggesting that microRNAs may play a role in chemoresistance. Combined with gene expression and other biological data using multivariate analysis, microRNA expression profiles may provide a critical link for understanding mechanisms involved in chemosensitivity and chemoresistance. [Mol Cancer Ther 2007;6(5):1483–91]
Introduction
Genomic and proteomic studies have yielded a wealth of novel insights into molecular targets and mechanisms of cancer chemosensitivity and resistance (1–10). Nonetheless, progress in translating those insights into effective therapies has been relatively slow. One way to accelerate the advance toward molecularly based cancer therapy is to integrate various types of molecular information on the same set of cancer samples to develop as comprehensive as possible a molecular portrait of the cells and their pharmacology. An attractive model system for that type of “integromic” enterprise (11) is the panel of 60 diverse human cancer cell lines (the NCI-60) used by the National Cancer Institute (NCI) to screen >100,000 chemical compounds and natural product extracts for anticancer activity since 1990 (12). Included in the panel are nine broad categories of cancer cells: leukemias, melanomas, and cancers of breast, central nervous system (CNS), colon, lung, ovarian, prostate, and renal origin. Dose-response curves generated by the screening process provide 50% growth inhibitory (GI50) values for each compound-cell line pair. Screening data for ∼43,000 nonproprietary compounds are publicly available.7
To take advantage of the pharmacologic profiling, the NCI-60 have also been the subject of numerous genomic, proteomic, and other “-omic” profiling studies (13). Profiles determined using cell materials obtained by methods of cell culture and harvesting compatible with those used in the present study have included the following: transcript profiling on multiple platforms (2, 3, 7, 14, 15); proteomic profiling using two-dimensional gels (16) and lysate arrays (15, 17, 18); analysis of single nucleotide polymorphism (19); DNA resequencing for mutational status (20); comparative genomic hybridization for DNA copy number changes (21); spectral karyotyping for chromosomal aberrations (22–24); and promoter region DNA methylation studies for cancer-related genes (19). The collection of data sets related to the NCI-60 provides an unparalleled public resource for integrated chemogenomic studies aimed at elucidating molecular targets, identifying biomarkers for personalization of therapy, and understanding mechanisms of chemosensitivity and chemoresistance. To highlight the possibilities, Molecular Cancer Therapeutics launched a new series in November 2006 under the rubric “Spotlight on Molecular Profiling” with three articles on molecular characterization of the NCI-60 (20, 25, 26). Such studies have led to hypotheses tested in well-controlled experimental systems, with promising candidates progressing toward the clinic. A well-integrated set of such molecular profile databases obtained under strictly controlled, standardized protocols has been incorporated into the CellMiner program package,8 which also includes tools for querying and combining the data sets. Various molecular profile data sets on the NCI-60 are also available online.9
Because cell lines have been removed from their in vivo context and selected for growth in culture, they cannot be considered accurate surrogates for clinical tumors. However, they offer a number of advantages for chemogenomic studies (19). The cell lines are reasonably stable and reproducible over extended time periods; they are available in large quantities; and they are manipulable experimentally (e.g., by transfection or by selection for drug resistance).
MicroRNAs are small noncoding RNAs of 21 to 25 nucleotides that negatively modulate protein expression (27–29). One strand of the mature double-stranded microRNA is incorporated into the RNA-induced silencing complex, which down-regulates target mRNAs either by degradation or by translational inhibition (28). MicroRNAs play important roles in normal regulation of gene expression for developmental timing, cell proliferation, and apoptosis. Moreover, altered microRNA expression is implicated in cancers, and microRNAs have been shown to play critical roles in cancer biology (30–35). For example, Volinia et al. (35) identified microRNAs that are differentially expressed in six solid tumors compared with normal tissues. However, their effect on chemotherapy has yet to be studied systematically. MicroRNAs could provide a critical link for understanding chemosensitivity/resistance patterns in the NCI-60. In particular, they could, in principle, help to explain discrepancies between mRNA and protein levels. Those discrepancies seriously complicate the use of mRNA profiles to study chemoresistance (7–10).
We have used custom pin-spotted microarrays to measure all known microRNAs (as of January 2006) in the NCI-60. The data have been deposited in the NCI Genomics and Bioinformatics Group's CellMiner database,8 which provides a variety of NCI-60 databases at the DNA, RNA, protein, and pharmacologic levels. CellMiner is a user-friendly, queryable, web-based relational database (under MySQL) that facilitates navigation and integrative analysis of the molecular profiles. The data have also been deposited in ArrayExpress (ref. 36; accession no. E-MEXP-1029) and in the NCI Developmental Therapeutics Program databases.9 This report provides an analysis of the distribution of microRNA expression across the NCI-60 (37). We assess data quality, compare cell line clusters based on microRNA and mRNA expression patterns, and analyze microRNAs that differentiate among tissue types or that distinguish among other cell line groupings.
Materials and Methods
The NCI-60 Cancer Cell Lines
Cell stocks were obtained from the NCI Developmental Therapeutics Program. The Genomics and Bioinformatics Group then cultured them, harvested RNA, and purified the RNA by a method (see below) that preserves the small microRNA species. To permit broad, integrative use of the data, the cells were cultured under essentially the same conditions as were used for the drug screen, for mRNA expression profiling, and for other molecular studies of the NCI-60 by the Genomics and Bioinformatics Group. In particular, the cells were grown in tissue culture flasks at 37°C in 5% CO2 in RPMI 1640 with L-glutamine and 10% fetal bovine serum, without antibiotics. Total RNA was extracted at ∼80% confluence using Trizol (Invitrogen) according to the manufacturer's instructions. The time from incubator to stabilization of the preparation was kept to <1 min. For details, see Shankavaram et al. (15).
Microarray Hybridization
MicroRNA labeling and hybridization were done as previously described (38) using 5 μg of total RNA. Our pin-spotted microRNA microarray (ref. 37; Ohio State University Comprehensive Cancer Center, version 3.0) contains probes for 321 mature human microRNAs, spotted in duplicate. Included are 627 human microRNA probes, typically with more than one probe for a given mature microRNA as well as probes for most precursor microRNAs. Hybridization signals were detected with Streptavidin-Alexa647 conjugate, and scanned images (Axon 4000B) were quantified using the Genepix 6.0 software (Axon Instruments).
Experimental Design
Because it was logistically infeasible to run microarrays for all of the samples simultaneously, data for microRNA expression in the NCI-60 cell lines were collected in three batches, 20 cell lines + 4 controls (24 samples) per batch for a total of 72 samples. To select cell lines for assignment to each batch, the lines were cluster-ordered based on gene expression (2) and divided into three groups of 20 cell lines with reasonably similar expression profiles. For each batch, a balanced random selection was taken from each of the three groups, and one cell line was chosen as a control from each batch. The controls, selected for tissue diversity and moderate doubling time, were (a) prostate cell line PC-3, (b) leukemia K-562, and (c) lung cancer A549. To enable us to assess within- and between-batch variability, the controls were run in each of the three batches, and, for each batch, one of the controls was run in triplicate. That design is illustrated schematically as follows, with letters representing the three controls and periods representing other cell lines:
Batch A: aabc.........a..........
Batch B: abbc.....b..............
Batch C: abcc................c...
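The layout can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis, that parses the three schematic strings exactly as printed above and counts samples per batch and replicates per control. The names `DESIGN` and `summarize_design` are hypothetical and introduced only for illustration.

```python
from collections import Counter

# Schematic copied from the text: letters are controls, periods are other cell lines.
DESIGN = {
    "A": "aabc.........a..........",
    "B": "abbc.....b..............",
    "C": "abcc................c...",
}


def summarize_design(design):
    """Return (samples per batch, total replicates per control letter)."""
    per_batch = {batch: len(layout) for batch, layout in design.items()}
    replicates = Counter()
    for layout in design.values():
        # Count only the control letters, skipping the period placeholders.
        replicates.update(ch for ch in layout if ch != ".")
    return per_batch, dict(replicates)


per_batch, replicates = summarize_design(DESIGN)
total_samples = sum(per_batch.values())
other_lines = sum(layout.count(".") for layout in DESIGN.values())
```

Under this reading, each batch holds 24 samples, each control is measured five times in total (three times in its own batch and once in each of the other two), and the 57 period positions plus the 3 controls account for the 60 cell lines.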
Quality Assessment and Normalization of Microarray Data
The data were analyzed using the R software package (39). The signal intensity of each spot was calculated by subtracting local background (based on the median intensity of the area surrounding each spot) from the median signal. After converting any negative values to 1, signal intensities were log2-transformed and duplicate spots were averaged. In addition, for cluster analyses, we did quantile normalization (40) across the 3 microarray batches, replaced all values <5 with the median of such values, and set the value for each control cell line to the mean of its five replicates.
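The preprocessing steps described here can be sketched in a few lines. The analysis itself was done in R; the Python below is only an illustrative sketch under stated assumptions. The function names are hypothetical, a background-subtracted signal of exactly zero is floored to 1 along with negative values (the text mentions only negative values), and the quantile normalization is a textbook version applied to equal-length columns that ignores ties, which may differ from the implementation actually used.

```python
import math


def spot_log2_signal(median_foreground, median_background):
    """Background-subtract one spot, floor non-positive values to 1, take log2."""
    signal = median_foreground - median_background
    if signal <= 0:  # assumption: zero is floored as well, so the log is defined
        signal = 1.0
    return math.log2(signal)


def average_duplicates(spot_a, spot_b):
    """Average the two duplicate spots printed for one probe."""
    return (spot_a + spot_b) / 2.0


def quantile_normalize(columns):
    """Give every column the same distribution: the mean of the sorted columns."""
    n = len(columns[0])
    sorted_columns = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_columns) / len(columns) for i in range(n)
    ]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = rank_means[rank]  # replace each value by its rank mean
        normalized.append(out)
    return normalized


example = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

Note that a floored spot yields a log2 value of 0, which is how zero values can appear in the transformed data.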
Using the unnormalized data, we then generated estimated bias curves from the 15 control values (i.e., 5 replicates of each of the 3 control cell lines, PC-3, K-562, and A549). For each microRNA probe and cell line, we averaged expression scores over the five replicates to give a “true” expression level for each of the 1,881 (i.e., 627 × 3) microRNA probe-cell line combinations. For each batch, there were 627 microRNA probes × 5 samples = 3,135 expression values that we paired with the 1,881 “true values” + 1,254 (i.e., 2 × 627) replicated true values. For each batch, we subtracted the paired true expression levels from the observed expression levels to get 3,135 bias values.
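As a worked illustration of the bias calculation, the Python sketch below handles a single probe in a single control cell line: the “true” level is the mean of the five replicates, and each bias value is the observed level minus that mean. The function name and the example values are hypothetical; only the counts (627 probes, 3 controls, 5 control samples per batch) come from the text.

```python
from statistics import mean

N_PROBES = 627          # probes on the array (from the text)
N_CONTROLS = 3          # PC-3, K-562, A549
SAMPLES_PER_BATCH = 5   # control samples run in each batch


def bias_values(replicates):
    """replicates: list of (batch, log2 value) for one probe in one control line.

    Returns the 'true' level (mean over replicates) and the per-replicate
    bias, defined as observed minus true.
    """
    true_level = mean(value for _, value in replicates)
    biases = [(batch, value - true_level) for batch, value in replicates]
    return true_level, biases


# Made-up values for one probe: three replicates in batch A, one each in B and C.
true_level, biases = bias_values(
    [("A", 8.0), ("A", 9.0), ("A", 10.0), ("B", 9.0), ("C", 9.0)]
)

true_value_count = N_PROBES * N_CONTROLS          # 1,881 probe-cell line pairs
per_batch_values = N_PROBES * SAMPLES_PER_BATCH   # 3,135 bias values per batch
```

The second count follows because each batch contains one control in triplicate, so its 627 true values are reused twice more (2 × 627 = 1,254) when pairing observed and true levels.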
Affymetrix U133A Expression Data for the NCI-60
The methods used have been described in detail elsewhere (15). Briefly, the protocols for cell culture, cell harvest, and purification of RNA were as described above for microRNA except that the Genomics and Bioinformatics Group purified total RNA using the Qiagen RNeasy Midi Kit according to the manufacturer's instructions. Labeling, hybridization, scanning, and primary data analysis were done at GeneLogic, Inc. as per the previously described protocols (15). The MAS5 algorithm was used to generate signal intensities for the HG_U133A arrays, and expression values were normalized to a mean target level of 100.
TarBase
TarBase (41) gives experimentally supported targets for microRNAs, with literature references and links to other databases. The version used here was downloaded Sept. 20, 2006.
Compound Database
To assess potential associations between microRNA expression and drug potencies, we used the September 2003 release of the NCI antitumor drug screening database. The data were obtained from the NCI DTP website,7 which contains nonconfidential screening results and chemical structural data. For each compound and cell line, growth inhibition after 48 h of drug treatment had been assessed from changes in total cellular protein using a sulforhodamine B assay (12). The assay provides GI50 values for all of the compound-cell line pairs. From those data, we obtained a curated subset of 7,794 compounds as follows: For drugs that had GI50 values recorded using multiple concentration ranges, we chose only the value obtained for the best concentration range. The best concentration range was chosen as the one containing the greatest number of values in the range (low concentration, +0.2; high concentration, −0.2). In other words, values close to either end point were discounted. Compounds with fewer than 30 accepted values were excluded. Finally, to focus on compounds showing good discrimination between cell lines, we selected compounds with [max(GI50) − min(GI50)] ≥ 2.0. That filtering process resulted in a database of 3,089 compounds.
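The last two filtering steps (a minimum of 30 accepted values and a potency range of at least 2.0) can be expressed as a simple predicate. This is a minimal sketch, assuming the GI50 values are held on a log10 scale with `None` marking missing or rejected measurements; the function name and the example data are hypothetical, and the concentration-range selection step is not reproduced.

```python
def passes_filter(log_gi50, min_values=30, min_range=2.0):
    """Keep a compound only if it has enough accepted values and enough spread.

    log_gi50: list of log-scale GI50 values across cell lines, None if missing.
    """
    observed = [v for v in log_gi50 if v is not None]
    if len(observed) < min_values:
        return False
    return max(observed) - min(observed) >= min_range


# Made-up compounds illustrating each outcome.
kept = passes_filter([None] * 30 + [4.0] * 29 + [6.5])   # 30 values, range 2.5
too_few = passes_filter([4.0] * 28 + [6.5])              # only 29 values
too_flat = passes_filter([4.0] * 30 + [5.0])             # range only 1.0
```

Because the range is a difference, the result is the same whether potencies are stored as log(GI50) or −log(GI50).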
Results and Discussion
For microRNA profiles, we compared the distributions of data values across the three batches, with 15,048 (i.e., 627 × 24) microRNA probe-cell line measurements per batch. Figure 1A shows quantile plots for the three batches. There was clearly a batch effect at low expression levels, that is, where the log2 relative spot fluorescence intensity was less than ∼8. Batches A, B, and C gave ∼24%, ∼7%, and ∼9% zero values, respectively. The expression levels at the 25th, 50th, and 75th percentiles were as follows:
Batch | 25th | 50th | 75th |
---|---|---|---|
A | 1.3 | 5.5 | 8.7 |
B | 4.3 | 6.1 | 8.6 |
C | 4.1 | 6.2 | 8.8 |
Above the 70th percentile (i.e., expression level ∼8), the quantile curves for all three batches became nearly identical, with the curve for batch C remaining slightly lower than the other two.
To assess batch effects at the microRNA probe level, we plotted estimated bias curves calculated from the 15 control samples, representing 5 replicates each of the PC-3, K-562, and A549 cell lines. The estimated bias curves in Fig. 1B to D show how the difference between true and observed expression levels varies with expression level for each batch. The diagonal line of data points at the lower left of each plot resulted from the constraint that the estimated bias is never less than the true value. All three batches showed greater variability at expression levels less than ∼8. Batches A and C were negatively biased at the low end, where observed values tended to be lower than true values. Batch B was positively biased at the low end. That observation is consistent with the quantile plot in Fig. 1A. At the high end, all three batches showed low variability and low bias.
To focus our analysis on microRNAs likely to be most informative and discriminating, we filtered the full set of 627 probes to select those with log2(exp) ≥ 8 in at least 10% of the cell lines. That selection process gave 279 microRNAs, ∼46% of the total probes. For further assessment of between-sample variability, we clustered the 72 samples based on the expression patterns of the 279 selected microRNAs using complete linkage clustering and a correlation metric. For each of the three control sets, the five replicates clustered together with no intervening cell lines. The mean pairwise correlation coefficient within replicate sets was r = 0.96 for PC-3 and r = 0.94 for both K-562 and A549.
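The probe filter amounts to a threshold on the fraction of cell lines in which a probe is well expressed. The Python sketch below is illustrative only: the function name, the dictionary layout, and the toy data are assumptions, and it does not settle whether replicate control samples were counted among the cell lines.

```python
def select_probes(expression, threshold=8.0, min_fraction=0.10):
    """Return probes whose log2 expression is >= threshold in at least
    min_fraction of the cell lines.

    expression: dict mapping probe name -> list of log2 values across cell lines.
    """
    selected = []
    for probe, values in expression.items():
        n_high = sum(1 for v in values if v >= threshold)
        if n_high / len(values) >= min_fraction:
            selected.append(probe)
    return selected


# Toy data for 10 cell lines: one probe high in a single line, one never high.
toy = {
    "probe_high_once": [9.0] + [2.0] * 9,
    "probe_never_high": [7.9] * 10,
    "probe_always_high": [10.0] * 10,
}
selected = select_probes(toy)
```

A probe that is high in only one of ten lines still passes, which reflects the intent of retaining microRNAs that are strongly expressed in a small subset of the panel.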
Clustering of NCI-60 Cell Lines Based on MicroRNA Expression
Figure 2A shows the clustering of cell lines based on expression patterns of the 279 selected microRNAs. The mean cell-cell Pearson correlation coefficient was 0.74, indicating an essential biological similarity across the various cell types. Setting a cutoff of 0.7 for intercluster correlation gave nine clusters and one singleton (K-562). See also Supplementary Fig. S1 (supplementary data for this article are available at Molecular Cancer Therapeutics Online, http://mct.aacrjournals.org/).
For comparison, we also clustered the cell lines on the basis of U133A gene expression data obtained previously (15). For that analysis, we used the 1,000 genes with the largest interquartile ranges, again by complete linkage clustering and a correlation metric (Fig. 2B). Cell line groupings based on microRNA expression were generally consistent with tissue type and with cell line clustering based on mRNA expression. The leukemias clustered together, well separated from all other cell types. The melanomas formed a tight cluster and included two cell lines originally thought to be breast cancers (MDA-MB-435 and MDA-N). A majority of the colon, renal, and CNS cancers also clustered by tissue type. Furthermore, cell pairs known to be very similar clustered together. Included were cell line pairs from the melanoma panel [MDA-MB-435 and MDA-N (r = 0.95), MDA-MB-435 and M14 (r = 0.90)], the CNS panel [SNB-19 and U251 (r = 0.92)], and the ovarian panel [OVCAR-8 and OVCAR-8/ADR-RES (r = 0.93)]. MDA-MB-435 was originally thought to be derived from the pleural effusion of a patient with breast cancer, but it is absolutely characteristic of melanotic melanoma in all molecular profiling studies (2, 14, 15). MDA-N is an ErbB2 transfectant of MDA-MB-435. On the basis of comparative genomic hybridization and spectral karyotyping, we identified what was previously called NCI/ADR-RES as a drug-resistant derivative of OVCAR-8 ovarian cancer (i.e., OVCAR-8/ADR-RES; refs. 21, 23). In both cluster dendrograms, the breast and lung cancer cell lines were widely distributed. As with virtually all of our molecular profiles of the NCI-60, the two hormone-dependent ER+ breast cancer lines, MCF7 and T47D, clustered together, but the other breast lines did not. Similarly, the lung lines are widely distributed in the cluster tree no matter what type of molecular profile at the DNA, RNA, protein, or pharmacologic level is being analyzed. 
On the basis of the multiple types of molecular data now in hand, we are investigating those relationships further, but that analysis is beyond the scope of the present study.
The fact that cell line groupings based on correlation of microRNA expression patterns largely paralleled the groupings found for transcript expression provides additional evidence for the quality of data from the microarray experiments. It also supports the hypothesis that microRNA patterns are characteristic of tissue type. To assess that finding more formally for both of the dendrograms in Fig. 2, we set a cutoff to give nine subgroups in each dendrogram. We then determined the minimum number of between-group moves necessary to rearrange the cells into tissue groups. For the microRNA clustering of Fig. 2A, 21 moves were required, whereas only 16 moves were needed for the transcript clustering of Fig. 2B.
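The text does not spell out how the minimum number of moves was computed. One plausible formalization, offered here purely as an assumption, is to match each cluster to a distinct tissue type so that the total overlap is as large as possible; every cell line outside that overlap then needs one move. The Python sketch below implements that reading by brute force over all one-to-one matchings, which is feasible for nine groups. The function name and toy labels are hypothetical.

```python
from itertools import permutations


def min_moves(cluster_labels, tissue_labels):
    """Fewest cell lines that must change cluster so clusters equal tissue groups.

    Assumes equal numbers of clusters and tissue types, and that each cluster
    is matched to exactly one tissue type.
    """
    clusters = sorted(set(cluster_labels))
    tissues = sorted(set(tissue_labels))
    if len(clusters) != len(tissues):
        raise ValueError("expected the same number of clusters and tissue types")

    # overlap[(c, t)] = number of cell lines of tissue t that sit in cluster c
    overlap = {(c, t): 0 for c in clusters for t in tissues}
    for c, t in zip(cluster_labels, tissue_labels):
        overlap[(c, t)] += 1

    # Try every one-to-one matching of clusters to tissues; keep the best.
    best = max(
        sum(overlap[(c, t)] for c, t in zip(clusters, matching))
        for matching in permutations(tissues)
    )
    return len(cluster_labels) - best


# Toy example: two clusters, two tissue types, six cell lines.
moves = min_moves([1, 1, 1, 2, 2, 2], ["x", "x", "y", "y", "y", "x"])
```

In the toy example, one "y" line sits in the "x"-dominated cluster and one "x" line sits in the "y"-dominated cluster, so two moves are needed.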
Although the leukemias were all more similar to each other than to other cell types, they formed a relatively loose group with mean pairwise correlation of 0.78. K-562 had the lowest mean correlation with the other 59 cell lines (mean r = 0.58), much lower than the next lowest, and its mean correlation with the other leukemias was only 0.66. By chance, K-562 was one of the three controls tested in five replicates, and its microRNA profile showed good reproducibility across replicates. Hence, the outlier status of K-562 in terms of microRNA expression pattern seemed to reflect real differences in biology, not experimental error. K-562 is the only NCI-60 cell type derived from chronic myelogenous leukemia and the only one with the BCR-ABL translocation. We do not know whether either of those characteristics is causally associated with the outlier status of the line.
Comparison of MicroRNAs and mRNAs for Prediction of Tissue of Origin
Initially, we used Predictive Analysis for Microarrays (PAMp), which performs sample classification based on the nearest shrunken centroid method (42), to investigate the ability of microRNA expression levels to predict tissue of origin. PAMp uses Significance Analysis of Microarrays (SAM; ref. 43) to select features (microRNAs in this case) for its classifier. Hereafter, we will refer to the algorithm as SAM-PAM. Because some of the microRNAs selected by SAM might be highly correlated with each other, thereby reducing the independent predictive value of each individual microRNA in multivariate analysis, we sought an alternative way to select more “distinctive” microRNAs. Toward that end, we used Partition Around Medoids [ref. 44; denoted PAMc as implemented by the
let-7d | mir-30e-5p | mir-141 | mir-301 |
mir-7 | mir-34a | mir-147 | mir-320 |
mir-7-prec | mir-99 | mir-181b | mir-321 |
mir-9 | mir-99b-prec | mir-183 | mir-326 |
mir-10a | mir-103 | mir-192 | mir-335 |
mir-16 | mir-106 | mir-196 | mir-365 |
mir-21 | mir-124a | mir-196a | mir-371 |
mir-24 | mir-126 | mir-219 | mir-377 |
mir-26a | mir-128a | mir-221 | mir-382 |
mir-29c | mir-130a | mir-223 | mir-497 |
The medoids from the clusters comprised the set of microRNA probes that we used in PAMp for predictive analysis. The overall approach is termed PAM-PAM. Six-fold cross-validation with 100 replicates was used in both SAM-PAM and PAM-PAM. We used the same analytic scheme to analyze the U133A transcript expression data. However, because the number of mRNAs was large even after preprocessing, we preselected 1,000 genes with the largest interquartile range before running PAMc. The medoids of 40 gene clusters were selected for comparative PAM-PAM analysis.
We applied the PAM-PAM analysis to a 43-cell line subset of the NCI-60 comprising the CNS, colon, leukemia, melanoma, ovarian, and renal tissue types. The breast and lung cancer cell lines were omitted because their intragroup correlations are low. The prostate cell lines were omitted because there are only two members, and the melanoma cell line LOX IMVI was omitted because it seems to be nonmelanotic and highly undifferentiated (15). The left-hand box plot in Fig. 3A shows the misclassification rates over 100 replications of PAM-PAM based on the microRNA data. The median error rate was 0.18. The right-hand box plot in Fig. 3A shows the analogous results obtained for the 40 U133A probes (i.e., the medoids selected by PAMc). The median error rate, 0.07, was smaller. Those results suggest that the mRNA expression data are more informative in discriminating among tissue types, an observation consistent with the relative numbers of moves needed to rearrange the subclusters in Fig. 2 into tissue groups. To assess that observation more formally, we applied the McNemar test (45) in R to the results from each 6-fold cross-validation. The P values over the 100 replications are plotted in Fig. 3B, which shows a skewed distribution rather than a U(0, 1), suggesting that, indeed, the U133A data are more informative in tissue type discrimination. However, that observation cannot be attributed unambiguously to the biology; it might, instead, reflect differences in experimental platform and/or methods of analysis between the microRNA and mRNA expression data. As we also found when comparing transcript and protein expression (15), it can be almost impossible to design experiments and analyses that rigorously eliminate all potential sources of technological and selection bias to focus exclusively on the underlying biological bias.
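For readers unfamiliar with the McNemar test, the comparison reduces to the discordant cell lines: those classified correctly from one data type but not from the other. The Python sketch below computes the chi-square form with continuity correction, which is the default behavior of R's `mcnemar.test`; whether the authors used that default is an assumption. The function names and example inputs are hypothetical.

```python
import math


def discordant_counts(correct_a, correct_b):
    """Count samples where exactly one of two classifiers was right.

    correct_a, correct_b: lists of booleans, one entry per cell line.
    Returns (b, c): b = only classifier A right, c = only classifier B right.
    """
    b = sum(1 for a_ok, b_ok in zip(correct_a, correct_b) if a_ok and not b_ok)
    c = sum(1 for a_ok, b_ok in zip(correct_a, correct_b) if b_ok and not a_ok)
    return b, c


def mcnemar_p(b, c):
    """Two-sided P value from the continuity-corrected McNemar chi-square test."""
    if b + c == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(statistic / 2.0))


# Made-up outcomes for four cell lines under two classifiers.
b, c = discordant_counts(
    [True, True, False, False],
    [True, False, True, True],
)
p_example = mcnemar_p(10, 2)
```

If the two data types were equally informative, the P values across the 100 cross-validation replicates would be expected to look roughly uniform on (0, 1), which is the reference distribution mentioned in the text.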
We used SAM itself (43) to identify microRNAs that are differentially expressed among the six tissue groups (CNS, colon, leukemia, melanoma, ovarian, and renal). The analysis was limited to the 40 microRNA medoids previously selected by PAMc (Table 1). For each tissue group, we applied SAM for identification of microRNAs differentially expressed between the given group and its complement (comprising the cell lines of the five remaining groups). The δ value was chosen to control the false discovery rate at <0.01. Table 2 lists the microRNAs called significant by SAM for each group.
Panel | Discriminating microRNAs |
---|---|
CNS | mir-21, mir-99, mir-382 |
Colon | mir-7, mir-10a, mir-29c, mir-103, mir-106, mir-130a, mir-141, mir-183, mir-192, mir-196, mir-196a, mir-335 |
Leukemia | mir-10a, mir-21, mir-99, mir-106, mir-221, mir-365 |
Melanoma | mir-141, mir-147, mir-192, mir-335 |
Ovarian | N/A |
Renal | mir-365 |
NOTE: For each tissue group, we applied SAM to identify microRNAs differentially expressed between the given group and its complement.
Correlations with Known mRNA Targets
We next asked whether there is an association between the expression pattern of a microRNA and that of a known target. TarBase (41) provides information on experimentally supported targets for microRNAs. The version downloaded Sept. 20, 2006 listed translationally repressed targets for 47 microRNAs and a total of 72 microRNA-gene pairs. We mapped gene names to HUGO symbols and then to Affymetrix probes on the U133A set. There are typically multiple probes per HUGO symbol, and there are also multiple probes on our microarray for each microRNA (38). All microRNA and mRNA probes were included in the analysis, for a total of 86 microRNA probes and 84 U133A probes.
We calculated the Pearson correlation coefficients for all 7,224 microRNA-mRNA probe pairs. Two-hundred fifteen of the 7,224 correlations corresponded to microRNA-target pairs listed in TarBase, and the remaining 7,009 correlations to non-TarBase pairs. A Wilcoxon rank sum test between the two sets of microRNA-mRNA correlations gave a P value of 0.28, suggesting that there is not a significant difference between the correlations of microRNA-target and nontarget pairs. Because microRNAs are predicted to target multiple unrelated genes (46, 47) that are not coexpressed, it is not surprising that microRNA expression levels do not tend to be strongly correlated with the expression of particular target transcripts.
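The comparison of target and nontarget correlations can be sketched as a rank sum test on two lists of correlation coefficients. The Python below is a simplified illustration, assuming a normal approximation with midranks for ties and with neither a tie correction to the variance nor a continuity correction, so its P values may differ slightly from those of the R implementation. The function names are hypothetical; only the probe counts come from the text.

```python
import math


def midranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def rank_sum_p(sample_1, sample_2):
    """Two-sided P value for the Wilcoxon rank sum test (normal approximation)."""
    n1, n2 = len(sample_1), len(sample_2)
    ranks = midranks(list(sample_1) + list(sample_2))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))


n_pairs = 86 * 84                 # all microRNA probe x mRNA probe pairs
n_nontarget = n_pairs - 215       # pairs not listed in TarBase
p_separated = rank_sum_p([1, 2, 3], [4, 5, 6])
p_identical = rank_sum_p([1, 2, 3], [1, 2, 3])
```

In the actual analysis, `sample_1` would hold the 215 correlations for TarBase pairs and `sample_2` the remaining 7,009.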
Correlations with Drug Activity
MicroRNAs have already been shown to play critical roles in the cancer biology underlying sensitivity or resistance to specific drugs. For example, Si et al. (48) found that inhibition of mir-21 using antisense oligonucleotides enhanced growth inhibition of MCF7 cells by topotecan (a clinical camptothecin analog) by ∼40%. The authors also found that inhibition of mir-21 resulted in lower levels of the Bcl-2 protein, which is not a direct target but may be partly responsible for increased apoptosis and increased drug sensitivity. In addition, Meng et al. (49) found that mir-21 was highly overexpressed in cholangiocarcinoma (a tumor of the biliary tract), and that inhibition of mir-21 increased sensitivity to gemcitabine in malignant cholangiocyte cell lines.
To assess the potential roles of microRNAs in cancer chemotherapy, we calculated Pearson correlation coefficients across the NCI-60 between expression patterns of the 279 microRNAs and potency patterns of the 3,089 compounds, each selected through the filtering processes described above. In general, the six leukemias were more sensitive to cytotoxic compounds than were the other cell lines. The mean −log(GI50) value for the leukemia cell lines was 5.9, in contrast to a mean of 5.0 for the nonleukemic cell lines. Because a number of microRNAs were overexpressed or underexpressed in the leukemias (in comparison with the other cell lines), an extreme correlation coefficient might have resulted from the intrinsic sensitivity of the leukemias, not from chemoresistance or chemosensitivity. Therefore, we calculated compound-microRNA correlations using only the 54 nonleukemia cell lines. There were ∼861,000 compound-microRNA correlations, and a qqplot showed that the distribution was nearly normal with only small deviations at very extreme values.
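The correlation step can be illustrated for a single compound-microRNA pair. The Python sketch below drops the leukemia cell lines and any cell line with a missing value before computing the Pearson coefficient. The function names, the tissue labels, and the toy numbers are hypothetical; only the counts of 279 microRNAs and 3,089 compounds come from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation, skipping positions where either value is None."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


def nonleukemia_correlation(expression, potency, tissues):
    """Correlate microRNA expression with compound potency outside the leukemias."""
    keep = [i for i, tissue in enumerate(tissues) if tissue != "leukemia"]
    return pearson([expression[i] for i in keep], [potency[i] for i in keep])


# Toy data: the leukemia line is an outlier in both expression and potency.
r = nonleukemia_correlation(
    expression=[1.0, 2.0, 3.0, 100.0],
    potency=[6.0, 4.0, 2.0, 9.0],
    tissues=["colon", "renal", "CNS", "leukemia"],
)
n_correlations = 279 * 3089
```

In the toy data, including the leukemia line would pull the coefficient toward a positive value even though the three remaining lines show a perfect negative relationship, which is the kind of distortion the restriction to 54 cell lines is meant to avoid.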
Compound-microRNA correlation coefficients ranged from −0.75 to +0.71. Statistics from box and whisker plots (Fig. 4) were as follows: For left whisker, 1st quartile, median, 3rd quartile, and right whisker, the values were (−0.49, −0.16, −0.04, +0.06, +0.40). Between the left and right whiskers were 99.48% of the correlations, with 1,606 values <−0.49 and 2,896 values >0.40. To assess extreme values from the correlation matrix, we compared the 0.005 and 0.995 quantiles (which have values −0.447 and 0.377) with the corresponding values from a null distribution. The two values were well within the linear region of the qqplot. That assessment showed that the low-end correlation (−0.447) was very significant, P < 0.001, but the high-end correlation (0.377) was not significant, P = 0.26.
Table 3 lists compound-microRNA pairs and 54-cell correlation coefficients for the 20 microRNAs with the most extreme values. A strong correlation between the expression pattern of a microRNA and the growth inhibitory pattern of a drug may indicate a role in drug response. For example, a strong negative correlation may mean that cells expressing higher levels of the microRNA are less sensitive to the compound. Such correlations suggest a potential chemoresistance mechanism and provide a hypothesis that can be tested experimentally. Preliminary studies to assess the potential roles of microRNAs in chemosensitivity and chemoresistance are in progress.
NSC | microRNA | r | NSC | microRNA | r |
---|---|---|---|---|---|
711670 | mir-220 | −0.75 | 650711 | mir-181c | −0.67 |
703783 | mir-221 | −0.73 | 633274 | mir-193a-prec | −0.67 |
703783 | mir-222 | −0.72 | 713070 | mir-196a | 0.67 |
721038 | mir-140 | −0.71 | 713070 | mir-326 | 0.67 |
697472 | mir-24 | −0.70 | 22842 | mir-423 | 0.67 |
609394 | mir-30a-3p | −0.69 | 713070 | mir-30b-prec | 0.68 |
637992 | mir-146a | −0.69 | 22842 | mir-494 | 0.69 |
622597 | mir-212-prec | −0.69 | 622926 | mir-342 | 0.69 |
637992 | mir-146b | −0.69 | 713070 | mir-363 | 0.70 |
625863 | mir-122a | −0.68 | 655765 | mir-214 | 0.71 |
NOTE: The table lists compound-microRNA correlations for the 20 microRNAs with the most extreme values. Pearson correlation coefficients (r) were calculated using data for the 54 nonleukemia cell lines, expression values for microRNAs, and −log(GI50) values for compounds.
Conclusion
The collection of pharmacologic and molecular profile data sets on the NCI-60 provides a resource for integrated chemogenomic studies aimed at elucidating molecular targets, identifying biomarkers for personalization of therapy, and understanding mechanisms of chemosensitivity and chemoresistance. However, information on expression patterns of microRNAs had been missing from the NCI-60 repertoire. Given the pervasive regulatory roles of microRNAs in cancer biology, we undertook to measure and analyze the expression of all known microRNAs (as of January 2006) in the NCI-60 cancer cell line panel. The data are available in CellMiner8 for direct integration with other molecular profiling data sets on the NCI-60. They have also been deposited in ArrayExpress and at the DTP web site. Cell line clustering based on microRNA expression patterns shows that cell groupings are generally consistent with tissue type and with clustering based on mRNA expression. The new microRNA microarray data reported here will make it possible for researchers to use the entire array of integrated NCI-60 molecular databases to study the role of microRNAs in the cellular response to chemotherapy. When combined with gene expression and other biological data in multivariate analyses, microRNAs may provide critical information for an understanding of cancer chemosensitivity and chemoresistance.
Grant support: National Institute of General Medical Sciences grant GM61390; NSF Agreement no. 0112050; and the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.
Acknowledgments
We thank the many members of the DTP staff whose efforts make the Molecular Targets Program possible.