Major advances have been made in the field of precision medicine for treating cancer. However, many open questions remain that need to be answered to realize the goal of matching every patient with cancer to the most efficacious therapy. To facilitate these efforts, we have developed CellMinerCDB: National Center for Advancing Translational Sciences (NCATS; https://discover.nci.nih.gov/rsconnect/cellminercdb_ncats/), which makes available activity information for 2,675 drugs and compounds, including multiple nononcology drugs and 1,866 drugs and compounds unique to the NCATS. CellMinerCDB: NCATS comprises 183 cancer cell lines, with 72 unique to NCATS, including some from previously understudied tissues of origin. Multiple forms of data from different institutes are integrated, including single and combination drug activity, DNA copy number, methylation and mutation, transcriptome, protein levels, histone acetylation and methylation, metabolites, CRISPR, and miscellaneous signatures. Curation of cell lines and drug names enables cross-database (CDB) analyses. Comparison of the datasets is made possible by the overlap between cell lines and drugs across databases. Multiple univariate and multivariate analysis tools are built-in, including linear regression and LASSO. Examples have been presented here for the clinical topoisomerase I (TOP1) inhibitors topotecan and irinotecan/SN-38. This web application provides both substantial new data and significant pharmacogenomic integration, allowing exploration of interrelationships.

Significance:

CellMinerCDB: NCATS provides activity information for 2,675 drugs in 183 cancer cell lines and analysis tools to facilitate pharmacogenomic research and to identify determinants of response.

The approach to molecular biology and pharmacology, commonly referred to as precision medicine, has been significantly changed over approximately the last 25 years by the introduction of omics data and the conceptual shift to the use of computer analyses of large datasets with a combination of statistics, machine learning, omics visualizations, and integration of multiple disparate forms of data. Starting with the pioneering work of the Developmental Therapeutics Program (DTP) at the NCI (1), many projects have been and are contributing sizable blocks of data, prominently including (but not limited to) the large (∼1,000 cell line) panels of the Cancer Cell Line Encyclopedia (CCLE) from the Broad/Novartis, the Genomics of Drug Sensitivity in Cancer (GDSC) from Sanger and Massachusetts General Hospital and the Cancer Therapeutics Response Portal (CTRP) from the Broad Institute.

The Genomics and Pharmacology Facility (GPF) has pioneered omics data acquisition and integration since the mid-1990s (1–9). Its efforts have led to the CellMiner and CellMinerCDB web applications (2–7, 9, 10), which allow pharmacogenomic database access and integrative analyses across all public cancer cell line genomics and drug response databases (2).

NCATS has operated an automated compound screening platform for large compound libraries, using a quantitative high-throughput screening (qHTS) format across multiple disease models, since 2008 (11–13). For cancer cell line viability screening, NCATS created the Mechanism Interrogation PlatEs (MIPE) compound library comprising approved and investigational chemotherapeutic agents, as well as common medications for noncancer indications. An additional design feature of the MIPE library is compound mechanistic redundancy, which allows analyses across multiple compounds reported to hit the same target. Compound screening data using the MIPE library have demonstrated value for multiple cancer types, such as diffuse intrinsic pontine glioma (DIPG), Hodgkin lymphoma, Ewing sarcoma, small-cell lung cancer (SCLC), glioblastoma, and others (9, 14–17). Published and unpublished MIPE library compound screening data have been aggregated into a unified dataset called the NCATS–NCI Cytotoxicity Dataset, shared internally with the NCI through the Palantir Foundry platform. A subset of this unified dataset is now being made public through CellMinerCDB.

Here we introduce the public databases and web portal of CellMinerCDB: NCATS (https://discover.nci.nih.gov/rsconnect/cellminercdb_ncats/). CellMinerCDB: NCATS enables individual users to access and explore the large NCATS drug response database, with an emphasis on pharmacology and its relationships to molecular genomics. CellMinerCDB: NCATS is integrated with 33 datasets from multiple projects from DTP, GPF, CCLE, GDSC, CTRP, the NCI DTP SCLC, NCI60-DTP Almanac, MD Anderson, and the Project Achilles from the Cancer Dependency Map Portal (DepMap; see Supplementary Materials and Methods for a full listing; refs. 4, 5, 7, 18–28). The omics analyses include single and two-drug activities, DNA copy number, methylation, and sequencing, whole genome transcriptome, mRNA and selected protein expression, metabolite levels, and clustered regularly interspaced short palindromic repeats (CRISPR) knockouts, allowing explorations of the relationships between those data and pharmacologic responses. Functionalities of the new CellMinerCDB: NCATS web application are introduced and discussed here with multiple examples validating the database. Details about general functionalities of the CellMinerCDB (https://discover.nci.nih.gov/rsconnect/cellminercdb/) platforms have been reviewed recently (2) and a 10-minute tutorial is on YouTube (Fig. 1A).

Figure 1.

The CellMinerCDB: NCATS web application, NCATS dataset, drugs, and cell lines. A, URL, banner, and tabs for the CellMinerCDB: NCATS web application. B, Schematic of the creation of the NCATS–NCI cytotoxicity dataset. Multiple versions of the MIPE library were combined into a single dataset to make the “NCATS–NCI cytotoxicity dataset.” This dataset was trimmed down to remove cell lines with introduced genetic modifications, pretreatment conditions, nonstandard media additives, and data not meeting the sharing embargo date of 18 months. C, Left, pie chart showing the clinical status of the 2,675 CellMinerCDB: NCATS compounds: 36% are FDA-approved, 30% have entered clinical trials, and 34% are experimental. Right, pie chart showing the compounds overlapping between CellMinerCDB: NCATS and all other datasets included in CellMinerCDB 1.4. Thirty percent (837) of NCATS compounds overlap with at least one of the other CellMinerCDB datasets and 70% (1,860) do not. Of those compounds found only in the NCATS datasets, there are multiple noncancer drug types included (see box). D, Pie chart showing the cell line overlaps between CellMinerCDB: NCATS and all other datasets included in CellMinerCDB 1.4.


CellMinerCDB: NCATS is a public web application hosted in the Genomics and Pharmacology Facility of the Developmental Therapeutics Branch of the NCI Center for Cancer Research, and of the NCATS of the NIH.

The NCATS screening data contained within the CellMinerCDB: NCATS web application utilize RSTUDIO-2022.12.0–353 and were generated as previously described (29). Cells were treated with compounds for 48 hours in 1,536-well plates and assessed for viability using CellTiter-Glo (Promega). Data were normalized to plate controls, with DMSO-treated cells set as 100% viability and wells without cells as 0% viability. A four-parameter curve fit was used to generate an IC50 and AUC. The Z-score AUC for each drug was calculated by subtracting that drug's mean AUC from each cell line's AUC and dividing by the SD of the drug's AUC across all cell lines screened.
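The normalization, curve model, and Z-score described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' pipeline: the function names, the example readings, the particular parameterization of the four-parameter logistic curve, and the use of the sample SD are all assumptions, and the curve fitting step itself is omitted.

```python
from statistics import mean, stdev


def percent_viability(raw_signal, dmso_signal, no_cell_signal):
    """Scale a raw reading so the DMSO control is 100% and the no-cell control is 0%."""
    return 100.0 * (raw_signal - no_cell_signal) / (dmso_signal - no_cell_signal)


def four_parameter_logistic(conc, bottom, top, ic50, hill):
    """Viability predicted by one common four-parameter logistic form (assumed here)."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def z_score_auc(auc_by_cell_line):
    """Z-score one drug's AUC values across all cell lines screened."""
    values = list(auc_by_cell_line.values())
    mu = mean(values)
    sd = stdev(values)  # sample SD; the text does not say which SD was used
    return {line: (auc - mu) / sd for line, auc in auc_by_cell_line.items()}


# Hypothetical readings, for illustration only.
viability = percent_viability(raw_signal=5500.0, dmso_signal=10000.0, no_cell_signal=1000.0)
half_max = four_parameter_logistic(conc=1.0, bottom=0.0, top=100.0, ic50=1.0, hill=1.0)
z = z_score_auc({"line_a": 80.0, "line_b": 100.0, "line_c": 120.0})
```

In this form the curve passes through the midpoint between `bottom` and `top` when the concentration equals `ic50`, which is why `half_max` evaluates to 50% viability.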

All compounds were matched using SMILES and InChIKey to external databases to pull clinical status. NCATS Inxight, DrugBank, and ChEMBL were used as references for compound structure matching and global clinical status (30–32). Structure matching was done within the Palantir Foundry platform (Palantir Technologies) utilizing RDKit: Open-source cheminformatics (2021-09-4; Q3 2021 Release) and NCATSFind Resolver. NCATS cell lines were annotated internally using Cellosaurus for disease and tissue type and matched to the other cell line sets (33). The NCATS web application is an R/Shiny app hosted on an NCI server.

Information sources for the cell lines and drugs include the NCI Thesaurus, PubChem, and the scientific literature. The large amount of data coming from the included omics efforts and the platforms used to develop them have been previously described. Compound and cell line name variations across the different institutions' cell line sets were resolved internally. An example is a single compound with the names 122958 (NCI-60), ATRA (GDSC), tretinoin (CTRP), and isotretinoin (NCATS). Another example is a single cell line with the names CO:COLO 205 (NCI-60), COLO 205 (CCLE), COLO-205 (GDSC), and COLO205 (MD Anderson). All datasets have instances of missing data for specific cell lines, drugs, or genes.
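The cell line example above suggests how much of the name variation is purely typographic. The sketch below shows one simple way such variants could be collapsed to a common key. It is not the curation procedure used for CellMinerCDB, which the text describes only as internal; the function name and the prefix-stripping rule are assumptions.

```python
import re


def normalize_cell_line_name(name):
    """Collapse punctuation, spacing, and case so spelling variants compare equal."""
    # Drop a tissue prefix such as "CO:" (as in the NCI-60 name) if one is present.
    name = name.split(":", 1)[-1]
    # Keep only letters and digits, then uppercase.
    return re.sub(r"[^A-Za-z0-9]", "", name).upper()


# The four spellings quoted in the text for one cell line.
variants = {
    "NCI-60": "CO:COLO 205",
    "CCLE": "COLO 205",
    "GDSC": "COLO-205",
    "MD Anderson": "COLO205",
}
keys = {source: normalize_cell_line_name(name) for source, name in variants.items()}
```

String normalization of this kind cannot link compound names such as 122958 and ATRA, which share no characters; those require a curated synonym table or structure-based matching. It can also merge distinct cell lines whose names differ only in punctuation, so its output would still need manual review.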

Univariate and multivariate analyses shown throughout were done using CellMinerCDB: NCATS functionalities or using data downloaded directly from CellMinerCDB: NCATS. The scatter plots, tables, and heatmap shown were generated by the web application using the selections described in the input boxes and figure legends. Drug versus drug activity comparisons not generated by the web application were done by Pearson correlation using R version 3.6.3. Bar charts were generated using GraphPad Prism version 7.0. Violin plots were generated using ggplot2 version 3.3.5.
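A drug versus drug comparison across two datasets reduces to a Pearson correlation over the cell lines the two datasets share. The sketch below illustrates that step in Python (the analyses in the text were run in R). The function names and the toy profiles are hypothetical; the default minimum overlap of 16 cell lines mirrors the threshold stated for Fig. 3F.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlate_on_overlap(profile_a, profile_b, min_overlap=16):
    """Return (r, n) over shared cell lines with data; r is None if n < min_overlap."""
    shared = sorted(
        line
        for line in profile_a.keys() & profile_b.keys()
        if profile_a[line] is not None and profile_b[line] is not None
    )
    if len(shared) < min_overlap:
        return None, len(shared)
    r = pearson([profile_a[s] for s in shared], [profile_b[s] for s in shared])
    return r, len(shared)


# Toy activity profiles keyed by harmonized cell line name (hypothetical values).
dataset_a = {f"line_{i}": float(i) for i in range(20)}
dataset_b = {f"line_{i}": 2.0 * i + 1.0 for i in range(4, 24)}
r, n_shared = correlate_on_overlap(dataset_a, dataset_b)
too_few, _ = correlate_on_overlap(dataset_a, dataset_b, min_overlap=17)
```

Missing values are represented as `None` and dropped before correlating, since all datasets contain gaps for specific cell lines or drugs.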

Bimodal drug activity density distributions were identified using a combination of a Gaussian mixture model-based approach (nor1mix package; version 1.3), a kurtosis test, and visual inspection. Both these calculations and the density plots were done using The R Project for Statistical Computing.
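The text does not specify which kurtosis test was applied, so the following is only a sketch of the underlying statistic, written in Python rather than R. Excess kurtosis is 0 for a normal distribution and approaches its minimum of -2 for two tight, equally populated clusters, so a strongly negative value is one hint of a roughly balanced bimodal distribution. The function name and the toy data are assumptions.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mu) ** 4 for v in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Toy activity values forming two equal, fully separated clusters.
bimodal = [0.0] * 50 + [10.0] * 50
k_bimodal = excess_kurtosis(bimodal)
```

Unbalanced mixtures can have positive excess kurtosis, which is consistent with the text combining this statistic with a mixture model fit and visual inspection rather than relying on it alone.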

Prediction of NCATS IC50 activity using CCLE microarray transcript expression by both univariate and multivariate analysis used Pearson correlation between drug response and gene expression of the target. The multivariate models use stepwise forward regression. Each model was initiated with a target for a given drug; multiple targets generated multiple models. Possible regression features included genes from Onco500 (34). A maximum of 10 features were added to each model and then pruned. For each iteration step, the feature with the lowest partial correlation P value after removing the effects of already included features was added using rcellminer 2.9.1 (35). A 10-fold cross-validated predicted response was calculated at each step using rcellminerElasticNet 0.1.1. Models were pruned by examining the statistical difference in the correlation of predicted versus observed response with each added feature using cocor 1.1–3. CCLE microarray expression data from CellMinerCDB was used (2).
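One iteration of the forward selection described above can be illustrated with a first-order partial correlation. The sketch below is written in Python and is not the rcellminer implementation. It assumes a single feature is already in the model, and it ranks candidates by absolute partial correlation, which gives the same ordering as ranking by lowest partial correlation P value only when every candidate is evaluated on the same cell lines. All names and data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


def next_feature(response, included, candidates):
    """Pick the candidate with the largest |partial r| given one included feature."""
    scores = {
        name: abs(partial_correlation(values, response, included))
        for name, values in candidates.items()
    }
    return max(scores, key=scores.get), scores


# Toy data: the response depends on the target and on gene_b; gene_c mostly tracks the target.
target = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
gene_c = [2.0, 4.0, 6.0, 8.0, 10.0, 13.0]
response = [t + 2.0 * b for t, b in zip(target, gene_b)]
best, scores = next_feature(response, target, {"gene_b": gene_b, "gene_c": gene_c})
```

Here `gene_b` is selected because it explains the variation in the response that the target leaves unexplained. With more than one feature already included, the same idea requires regressing on all included features at once, which this first-order formula does not cover.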

Data availability

The data analyzed in this study were obtained from multiple sources. Within the application, the source of each data set is accessible within the Metadata tab, both within the “select here to learn more about…” link and from the “download footnotes” tab. A description of all data sources used in CellMinerCDB: NCATS is provided in the Supplementary Materials and Methods.

The CellMinerCDB: NCATS web application

The CellMinerCDB: NCATS publicly accessible web application was created to both access the NCATS drug response data and enrich and expand its usefulness by integrating multiple other forms and sources of genomics, proteomics, and metabolomics data from the other public cancer cell line datasets using the CellMinerCDB platform (2).

A screenshot of the site, banner, and tabs for the CellMinerCDB: NCATS web application is presented in Fig. 1A. CellMinerCDB: NCATS allows drug comparisons and emphasizes cross-database (CDB) analyses with the other public cancer cell line databases. The univariate analyses tab allows generation of on-the-fly bivariate scatter plots and correlation analyses from a single input to compare all profiles within selected data sets. The multivariate analyses tab allows the exploration of multivariate models predictive of an observed profile. Analyzing selected tissues of origin is an option for both univariate and multivariate analyses. The metadata tab allows the download of datasets of interest for further processing and archiving. The search IDs tab provides the identifiers within each cell line set by data type. The help tab provides explanations and descriptions of the various functionalities within the web application. In addition, the video tutorial tab provides a description and explanation of the CellMinerCDB functionalities. Thus, CellMinerCDB: NCATS provides new data, multiple functionalities, and data integration, allowing users to mine independently the NCATS data without having to seek support from bioinformatics teams.

The NCATS input data

CellMinerCDB: NCATS comprises 2,675 drugs and compounds tested in 183 cell lines, of which, 2,667 have mechanism of action designations. The dataset was created as described in Materials and Methods and Fig. 1B. The output is fully compatible and integrated with CellMinerCDB (2). An asset of CellMinerCDB: NCATS is the unique compounds and cancer cell lines included (Fig. 1C and D).

NCATS contains two drug sensitivity metrics, Z-AUC and IC50 values. These are derived from a wide range of screening concentrations, routinely 11 concentrations between 0.79 nanomolar and 47 micromolar, which is an asset of NCATS drug testing (12). The drugs include 952 (36%) clinically approved, 790 (30%) that have entered clinical trials, and 908 drugs (34%) that are preclinical (Fig. 1C, left). Notably, 1,877 (70%) drugs and compounds are unique to NCATS (Fig. 1C, right). They have been annotated with their commonly accepted mechanisms of action. A feature of the NCATS dataset is the inclusion of 518 approved nononcology drugs not found in the other public databases (Supplementary Table S1). Those include 103 antiinfectives (antibacterial, mycobacterial, viral, or fungal) for systemic use, 86 cardiovascular or nervous system drugs, and 72 alimentary tract and metabolism compounds.
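As a worked check on the concentration range quoted above, the following Python sketch computes the step between adjacent concentrations. It assumes the 11 concentrations are equally spaced on a log scale, which the text does not state explicitly; under that assumption the endpoints imply a dilution of roughly 3-fold per step.

```python
lowest_nm = 0.79        # lowest screening concentration from the text, in nM
highest_nm = 47_000.0   # 47 micromolar expressed in nM
n_points = 11           # number of concentrations per curve, from the text

# Assumption: equal fold spacing between adjacent concentrations.
fold = (highest_nm / lowest_nm) ** (1.0 / (n_points - 1))

# Reconstructed series under that assumption, lowest to highest, in nM.
series_nm = [lowest_nm * fold ** i for i in range(n_points)]
```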

The distribution of the 183 NCATS cell lines by tissue of origin is detailed in Supplementary Table S2. They include 72 (38%) unique cancer cell lines absent from other public cancer cell line databases (Fig. 1D; Supplementary Table S3). Figure 1D shows several of the rare disease subtypes, including DIPG, renal Birt-Hogg-Dubé syndrome, hereditary leiomyomatosis, and TFE3 fusion cancer cell lines. Thus, CellMinerCDB: NCATS provides the user with substantial new drug and cell line data.

Cell line and drug overlaps of NCATS with other cancer cell line datasets

The cell line overlaps for CellMinerCDB: NCATS as well as all other cell line sets are listed in Fig. 2A. As in our other CellMinerCDB websites (https://discover.nci.nih.gov/), cell lines are matched with common tissue of origin terms based on the OncoTree ontology levels developed by the Memorial Sloan Kettering Cancer Center (New York, NY) and Dana-Farber Cancer Institute (Boston, MA), primarily version 1.1 as described previously (2). Additional information, such as the gender or age of the patient from whom the cell line originated, is also included. Comparison between drug responses in cell lines is made possible by the overlap of cell lines across databases (Fig. 2A).

Figure 2.

Cell line and drug overlap, and data types in CellMinerCDB-NCATS. A, Cell line overlap between NCATS and the nine other cell line datasets. Project Achilles is from the DepMap; PRISM from Broad-MIT; NCI Almanac is the NCI60-DTP Almanac. B, Drug overlap between NCATS and the seven other cell line datasets. The number of drugs is based on the overlap of NCATS AUC data with the seven other cell line sets. The MD Anderson and DepMap Achilles cell line datasets are not included as they have no drug activities. The NCI Almanac has two-drug activity measurements. The drugs with data for inhibitory concentration 50% (IC50) are slightly fewer in number. For acronym definitions see A. C, Available data in CellMinerCDB: NCATS. For the drug activities columns, the “single” numbers are compounds or drugs. The “combo” drugs are two-drug combinations for 105 FDA-approved drugs. For the DNA, RNA, and CRISPR columns, the numbers are genes with information for that cell line set. For the “protein” columns, the numbers are epitopes for the reverse phase protein arrays (RPPA) and protein fragments for the mass spectrometry (MS). For the “metabolite” column, the numbers are metabolites. For the “signatures” column, the number is signatures of various types. CTRP DNA copy number and mutation, microarray log2, and signatures data are identical to those in CCLE, and so are not included here.


The drug and compound activity overlap between the multiple cell line sets is presented in Fig. 2B. Information on the activity measurements of each cell line set is accessible in the “data type” input box, the Metadata “units” description or footnotes, or the provided URLs. An asset for the user is that CellMinerCDB: NCATS automatically matches cell line and drug data across any cell line sets queried, which allows comparison of drugs that are identical, related by mechanism of action, or disparate.

Omics data available for cross-comparisons in CellMinerCDB: NCATS

Figure 2C summarizes by cell line set and measurement type the profiles available in CellMinerCDB: NCATS, including 31,617 drug (and compound) activities, 261,848 molecular measurements and 18,119 miscellaneous signatures. All 28 included datasets are available for download from the Metadata tab (Fig. 1A). Our curation and standardization of these datasets minimizes the task of name matching.

The data types available for exploration based on the databases with overlapping cell lines include single-drug activities, two-drug combination activities, gene copy number, methylation and mutation levels, transcript expression, protein expression, metabolite levels, the DepMap Achilles (Achilles) CRISPR genetic dependencies, and miscellaneous molecular signatures. Those miscellaneous phenotypic signatures include the antigen presenting machinery (APM), epithelial–mesenchymal transition (EMT) status, replication stress (RepStress), genomic instability (HRD_LOH, HRD-SUM, NtAI, LST) and neuroendocrine status (NE). The metadata phenotypic signatures are accessible in the univariate analyses\data type\mda: miscellaneous phenotypic data. The number of data explorations one might pursue, depending on one's interest, easily jumps into the billions. The NCATS drug data can be compared to genomics data for the same cell lines in other datasets allowing one to relate the drug responses to omics features using CellMinerCDB: NCATS. The following examples illustrate the basic use of CellMinerCDB: NCATS.

Drug comparisons

The overlaps between cell lines and drugs across the “cell line sets” facilitate multiple forms of drug comparisons. Figure 3A shows a univariate analyses/plot data output for two structurally related TOP1 inhibitors commonly used in clinical oncology (36), topotecan (x-axis) versus SN-38 (y-axis), the active metabolite of irinotecan. Both are measured by NCATS and displayed using CellMinerCDB-NCATS. The highly significant correlation between the two drugs (P = 9.1 × 10^−52) demonstrates internal assay consistency.

Figure 3.

Comparisons of drugs in CellMinerCDB: NCATS. A, Scatter plot of the activities of topotecan (x-axis) versus SN-38 (y-axis), both measured by NCATS. The plot is a screenshot from CellMinerCDB-NCATS (Fig. 1A, univariate analyses). B, Comparison of the ALK inhibitor TAE-684 with the other ALK inhibitors tested by NCATS. The results were generated using CellMinerCDB-NCATS (univariate analyses/compare patterns tab selections) including a filter to output only “ALK inhibitor” in the mechanism of action (MOA) column and ordered by P value. C, Bar graph showing the top 15 compounds with the highest positive correlation for IC50 value comparisons between NCATS and GDSC. Red bars highlight the compounds also highly correlated between NCATS and CTRP (D): linifanib, sorafenib, AZD-7762, and tivozanib. The primary target of each compound is shown in parentheses. D, Bar graph showing the top 15 compounds with the highest positive correlation for IC50 values between NCATS and CTRP. Red bars highlight the compounds also highly correlated between NCATS and GDSC (C). The primary target of each compound is shown in parentheses. E, A scatter plot of AZD-7762 activity as measured by NCATS (x-axis) versus CTRP (y-axis). The plot is a screenshot generated using the univariate analyses/plot data tab selections. For the scatter plots A, B, and E, individual dots are cell lines with color coding by tissue of origin. F, Violin plot showing all compounds with IC50 with positive correlations and with P < 0.05 either between NCATS and CTRP or NCATS and GDSC. All compounds shown had a minimum of 16 cell lines overlapping between datasets. The box plot overlay shows a median correlation of 0.4. All correlations presented are Pearson.


Similarly, Fig. 3B shows a univariate analyses/compare patterns output comparing the NCATS anaplastic lymphoma kinase (ALK) inhibitor TAE-684 to other NCATS ALK inhibitors (selected by entering ALK inhibitor in the output MOA column). Of the 12 ALK inhibitors in the NCATS database, 10 show significant correlations, demonstrating assay and mechanism of action reproducibility across cell lines within the NCATS drug response database.

Comparison of NCATS with GDSC and CTRP drug activities in Fig. 3C and D, respectively, shows the top 15 correlated compounds for each. Four protein kinase inhibitors are common between these two (shown as red bars): linifanib, sorafenib, AZD-7762, and tivozanib. Figure 3E is a univariate analyses/plot data analysis of one of these comparisons, AZD-7762 as measured by both NCATS (x-axis) and CTRP (y-axis), yielding a P value of 1.1 × 10^−10. These observations demonstrate ways of comparing drug activities across databases to determine consistency across common cell line sets.

Compared globally, the average Pearson correlation for NCATS versus either GDSC or CTRP across all compounds using Z-AUC or IC50 is 0.4. Violin plots (Fig. 3F) show the significant positive correlations, which were found for 102/265 compounds (38.4%) between NCATS and CTRP and for 71/212 compounds (33.5%) between NCATS and GDSC. The NCATS versus PRISM drug data are not included in this analysis as no compound had a minimum of 16 overlapping cell lines. The Fig. 3 examples are only a small sampling of the types of informative comparisons one might do.

Exploration of NCATS drug responses with omics or CRISPR data

The integration of the NCATS drug responses with a wide range of molecular, phenotypic, and signature data from the other omics databases (CCLE, GDSC, and NCI) allows correlation queries for overlapping cell lines. We next present a small group of these as illustrations with outputs and screenshots from CellMinerCDB: NCATS.

Figure 4A validates SN-38 activity (in NCATS) versus SLFN11 gene transcript expression (in GDSC) using CellMinerCDB: NCATS univariate analyses/plot data. The scatter plot confirms the expected significant correlation between these causally linked parameters (36). Figure 4B presents additional examples between NCATS and GDSC; all showing significant correlation between a drug's activity and the transcript levels of that drug target.

Figure 4.

NCATS: CDB univariate comparisons of drug activities to transcript, DNA copy number, and CRISPR signatures. A, Scatter plot of SLFN11 transcript expression from GDSC (x-axis) versus SN-38 activity measured by NCATS (y-axis). The plot is a screenshot from CellMinerCDB-NCATS (univariate analyses). B, Additional examples of significantly correlated and biologically linked NCATS IC50 drug activities versus GDSC transcript expression levels. All gene examples are targets for the corresponding drugs. C, Scatter plot of MTOR DNA copy number as measured by CCLE (x-axis) versus VS-5584 activity as measured by NCATS (y-axis). The plot is a screenshot from CellMinerCDB-NCATS (univariate analyses/plot data tab selections), with the specific inputs used detailed in the boxes to the left. The vertical line was added at 0 intensity or 2N DNA copy number. The units for the x-axis were converted from intensity to ploidy (copy number = 2 × 2^intensity) for biological clarity. D, Additional examples of significantly correlated and biologically linked NCATS IC50 drug activities versus CCLE DNA copy number from plots generated as in C. Genes are targets of the corresponding drugs. E, Scatter plot of BRAF CRISPR knockdown cell survival from the Achilles Project (x-axis) versus vemurafenib activity as measured by NCATS (y-axis). The plot is a screenshot from CellMinerCDB (univariate analyses/plot data tab), with the specific inputs used detailed in the input boxes to the left. The vertical line was added at 0 to indicate that the cell lines to the left of the line have decreased survival following knocking down BRAF. F, Additional examples of significant correlations of drug activities versus CRISPR knockdown of the target genes. The CRISPR knockdown cell survival data are from the Achilles Project. All correlations presented in the figure are Pearson. For all scatter plots, dots are cell lines with color coding by tissue of origin indicated to the right.


A second form of omics data comparison is given in Fig. 4C, comparing activity of the mTOR inhibitor VS-5584 from NCATS and MTOR DNA copy number from CCLE demonstrating significant correlation. CellMinerCDB: NCATS also shows that mTOR DNA copy number is significantly correlated to its transcript level (r = 0.49, P = 1.6E–61), providing the logical link between the drug activity and DNA copy number. Figure 4D provides additional examples of significant correlations between drug activities and DNA copy numbers; all linked through having the same gene both as drug target and molecular measurement. All have significant correlations between gene DNA copy number and transcript levels.
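The Fig. 4C legend converts the copy number axis from intensity to ploidy so that an intensity of 0 corresponds to the diploid (2N) state. A minimal Python sketch of that conversion follows; the function name is hypothetical, and treating the intensity as a log2-scale value is an inference from the form of the conversion in the legend.

```python
def intensity_to_copy_number(intensity):
    """Convert a copy number intensity to estimated copies (intensity 0 -> 2 copies)."""
    return 2.0 * 2.0 ** intensity


diploid = intensity_to_copy_number(0.0)     # no gain or loss
one_gain = intensity_to_copy_number(1.0)    # intensity of +1 doubles the copies
one_loss = intensity_to_copy_number(-1.0)   # intensity of -1 halves the copies
```

On this scale each unit of intensity doubles or halves the estimated number of copies, so a single-copy loss from the diploid state appears at an intensity of -1.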

Figure 4E and F exemplify the possibility of testing NCATS drug activity versus genetic inactivation of the drug target. Figure 4E compares the growth inhibitory activity of vemurafenib (a BRAF inhibitor) to cell survival with BRAF CRISPR knockdown (as measured by Project Achilles). The resultant scatter plot demonstrates a significant correlation between the two. Figure 4F lists other examples showing significant correlations between drug activities and CRISPR knockdown, in each case linked through having the same gene both as the drug and CRISPR target. As for the drugs in Fig. 3, Fig. 4 provides only a small sampling of the types of informative comparisons one might perform.
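The univariate comparisons above rest on Pearson correlations computed over the cell lines shared by two datasets. The following is a minimal sketch of that idea, not the web application's implementation. The function name, cell line names, and values are hypothetical, and the calculation of a P value is omitted.

```python
import math


def pearson_on_overlap(x_by_line: dict, y_by_line: dict):
    """Pearson r between two measurements over the cell lines they share.

    Returns (r, n), where n is the number of overlapping cell lines.
    """
    shared = sorted(set(x_by_line) & set(y_by_line))
    n = len(shared)
    if n < 3:
        raise ValueError("need at least 3 overlapping cell lines")
    xs = [x_by_line[c] for c in shared]
    ys = [y_by_line[c] for c in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical values keyed by hypothetical cell line names.
expression = {"LINE_A": 1.0, "LINE_B": 2.0, "LINE_C": 3.0, "LINE_D": 4.0}
activity = {"LINE_B": 0.5, "LINE_C": 1.5, "LINE_D": 2.5, "LINE_E": 9.0}

r, n = pearson_on_overlap(expression, activity)
print(f"r = {r:.3f} over {n} shared cell lines")
```

Only cell lines present in both dictionaries contribute, which mirrors how cross-database comparisons depend on the overlap between panels.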

To compare the predictive value of different genomics parameters, the NCATS approved and clinical trial drugs IC50 activities were each compared with the different genomics evaluations of their gene targets (transcript expression, gene copy number, methylation, mutations, and CRISPR) across the other nine platforms in CellMinerCDB: NCATS, resulting in 1,100 drug versus gene pairings (Supplementary Table S4). The percent significant correlations by platform were: (i) 5.3% for the CCLE DNA copy number, (ii) 8.8% for the GDSC methylation, (iii) 6.5% for the CCLE mutation, (iv) 5.1% for GDSC mutation, (v) 11.8% for the CCLE transcript microarray, (vi) 10.8% for the GDSC microarray, (vii) 12.8% for the CCLE RNA sequencing (RNA-seq), (viii) 9.2% for the CCLE protein, and (ix) 6.2% for the Achilles CRISPR. These results demonstrate the value of RNA-seq and proteomic analyses for predicting drug activity.

Although determination of protein levels remains limited in clinical samples, we found that both protein expression and gene expression of the proapoptotic factor BAX in CCLE are significantly correlated with the IC50 activity of SN-38 in NCATS (Supplementary Fig. S1; P = 0.0013 and 0.0026 for 88 and 95 common cell lines, respectively). Thus, on the basis of the analysis of drugs tested in NCATS, we conclude that RNA-seq is currently the most practical predictor of drug response.

Multivariate and miscellaneous phenotypic signature (mda) analyses using CellMinerCDB: NCATS

Presuming that multiple factors are involved in drug response (36), we present two approaches for clinical TOP1 inhibitors (topotecan and SN-38, the active metabolite of irinotecan) using CellMinerCDB: NCATS.

The first utilizes the prior knowledge that the cytotoxicity of TOP1 inhibitors is dependent on SLFN11, apoptosis, and transcription (36). Combining transcript expression of SLFN11 (Fig. 5A), BPTF (Fig. 5B), and high mobility group nucleosome-binding domain-containing protein 1 (HMGN1; Fig. 5C) shows how the predictive value of SLFN11 can be strengthened by using the multivariate analysis tool of NCATS:CDB (Fig. 5D and E).

Figure 5.

Multivariate analysis of SN-38 activity in NCATS using the expression of SLFN11, BPTF, HMGN1, and BAX in the overlapping cell lines in CCLE is a better predictor of SN-38 activity than any of the four genes taken individually. A, Predictive value of SLFN11 expression. B, Predictive value of BPTF (encoding a protein regulating chromatin remodeling as a regulator of ATP hydrolysis of the NURF complex). C, Predictive value of HMGN1 (encoding HMGN1) associated with active transcription. D, Cluster image map of the multivariate analysis of SN-38 activity predicted by the expression of four genes together. See Supplementary Fig. S1 for BAX univariate data. E, Scatter plot of the observed versus 10-fold cross-validation for SN-38 using the same predictor genes as in D.
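The combination of several predictor genes with 10-fold cross-validation, as in Fig. 5D and E, can be sketched as an ordinary least-squares regression evaluated on held-out folds. This is an illustrative sketch under stated assumptions, not the web application's code: all function names are hypothetical, the solver is a plain Gaussian elimination on the normal equations, and the data are synthetic and noise-free so that the mechanics can be checked.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coef, row):
    """Apply fitted coefficients (intercept first) to one sample."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def cross_validated_predictions(rows, y, k=10, seed=0):
    """Predict each sample from a model fitted without that sample's fold."""
    order = list(range(len(y)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    preds = [None] * len(y)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        coef = fit_ols([rows[i] for i in train], [y[i] for i in train])
        for i in fold:
            preds[i] = predict(coef, rows[i])
    return preds


# Synthetic, noise-free data: two hypothetical predictor genes, 40 cell lines.
rng = random.Random(1)
genes = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(40)]
activity = [1.0 + 2.0 * g[0] - 1.0 * g[1] for g in genes]

cv_pred = cross_validated_predictions(genes, activity, k=10)
max_err = max(abs(p - t) for p, t in zip(cv_pred, activity))
print(f"largest cross-validated error: {max_err:.2e}")
```

With real expression and activity data the held-out predictions would not match the observations exactly; the correlation between the two is what a plot of observed versus cross-validated values summarizes.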

The second multivariate analysis available in NCATS:CDB (and the other CellMinerCDB websites) uses previously described multigene expression signatures, which can be retrieved using the “mda” tab in the “data type” pull-down menu at the left of the website (Fig. 6). Together, these examples demonstrate the increased power of aggregating multiple genomic parameters to predict drug activity.

Figure 6.

Genomic signature analysis identifies RepStress but not EMT as predictor of SN-38 activity in the overlapping cell lines of NCATS and CCLE. Left and right, snapshots of CellMinerCDB: NCATS for RepStress and EMT, respectively.

Drug activity distributions and additional multivariate analysis

Figure 7 presents another form of exploration generated from the NCATS drug database: drug activity distributions with consideration of tissues of origin. Bimodal drug distributions were identified, demonstrating both sensitive and resistant cancer cell line responses. Enrichment for specific tissues of origin in the activity peaks suggests novel prospective therapeutic indications. Multivariate analyses using CCLE transcriptomics visualize multivariate molecular predictors. The first example given is for filanesib, with its bimodal activity distribution visualized in Fig. 7A and the significant prediction of that activity by KIF11, MYBBP1A, and TNFRSF10D (P = 1.2 × 10⁻⁷) in Fig. 7B. The second example given is for epothilone A, with its bimodal activity distribution visualized in Fig. 7C and the significant prediction of that activity by TUBB6, ABCG1, GSK3G, and MLH1 (P = 1.2 × 10⁻⁷) in Fig. 7D. Drugs with diverse mechanisms of action show enhanced activities for bladder, blood (leukemia), bone (sarcoma), bowel, brain, and lymphatic cancer cells in Fig. 7E.
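The activity distributions in Fig. 7 are based on z-scores calculated across cell lines, with the number of cell lines of each tissue type falling within a peak reported against the total for that type. A minimal sketch of both steps follows. It assumes the sample standard deviation, a fixed z-score cutoff to define the peak, and higher values meaning greater activity; none of these choices is specified in the text. All names and values are hypothetical.

```python
import statistics
from collections import Counter


def z_scores(values):
    """Standardize values to mean 0 and unit sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1)
    if sd == 0:
        raise ValueError("all values are identical")
    return [(v - mean) / sd for v in values]


def peak_counts_by_tissue(z_by_line, tissue_by_line, threshold):
    """For each tissue, return (cell lines at or above threshold, total)."""
    total = Counter(tissue_by_line[c] for c in z_by_line)
    in_peak = Counter(
        tissue_by_line[c] for c, z in z_by_line.items() if z >= threshold
    )
    return {t: (in_peak.get(t, 0), total[t]) for t in total}


# Hypothetical activity values and tissue labels for six cell lines.
lines = ["L1", "L2", "L3", "L4", "L5", "L6"]
values = [1.0, 1.2, 0.9, 3.0, 3.2, 1.1]
tissue_by_line = {
    "L1": "blood", "L2": "blood", "L3": "brain",
    "L4": "bone", "L5": "bone", "L6": "brain",
}

z_by_line = dict(zip(lines, z_scores(values)))
counts = peak_counts_by_tissue(z_by_line, tissue_by_line, threshold=0.5)
for tissue, (n_peak, n_total) in sorted(counts.items()):
    print(f"{tissue}: {n_peak}/{n_total} cell lines in the high-activity peak")
```

A statistical test for enrichment would be applied to such counts; the choice of test is not stated in the text and is left out of the sketch.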

Figure 7.

Drug distributions, tissue of origin enrichments, and molecular predictors of drug activity. A, Density plot of filanesib activity (IC50 z-scores from NCATS; x-axis) versus distribution of the cell lines plotted as density (y-axis). B, Multivariate analysis for filanesib activity as the response variable and CCLE transcript expression of three genes as predictor variables. C, Density plot of epothilone A activity (x-axis) versus density (y-axis). The brain enrichment, P = 0.082. D, Multivariate analysis for epothilone A activity as the response variable and CCLE transcript expression of four genes as predictor variables. E, Density plots for four NCATS drugs showing drug activity IC50 z-scores versus distribution of the cell lines plotted as density (y-axis). For the density plots in A, C, and E, drug activities are z-scores calculated across cell lines for IC50s (x-axis). Enriched tissues of origin are included (if present) with both the number of cell lines present within the peak (first number) and total number of cell lines of that type (second number). Asterisks indicate significance at P < 0.05. All other P values are less than 0.07. In the scatter plots B and D, the predicted drug activity is on the x-axis and the observed drug activity is on the y-axis. All correlations presented are Pearson. Dots are cell lines with color coding by tissue of origin. The plots were created using the CellMinerCDB: NCATS\multivariate analyses\plot data tab selections, with the specific inputs used detailed in the input boxes to the left.

Supplementary Table S5 presents an example of a more systematic pharmacologic prediction approach of NCATS IC50 drug activity distributions using CCLE microarray transcript levels. Included are 63 significant gene–drug combinations in which the genes are known targets for those drugs. In the case of ABT-737 (a BH3 mimetic and BCL2 gene family inhibitor), the generated multivariate model includes two known targets: BCL2L2 and BCL2 (as given by NCATS annotation).

Making the NCATS drug activities publicly available is a significant addition to the omics arena. CellMinerCDB: NCATS gathers the NCATS drug response database and integrates it with nine other genomic and proteomic projects (see Fig. 1). The NCATS collection of 2,675 drugs and compounds is second only to the large NCI/DTP activity screen in number (Figs. 1 and 2; ref. 2). Its high proportion of novel drugs, large number of nononcology drugs, and inclusion of many novel cell lines, including those from rare tumors, add significantly to the omics cancer cell line field.

Our curation of both the cell line and drug names enables integration with our previous CellMiner databases (2, 3, 9). It also resolves differences, making data retrieval and comparisons available with an intuitive web application. This combined with the molecular, metabolic, phenotypic, and signature data from NCI, CCLE, GDSC, and other databases adds a myriad of informative molecular parameters for the purposes of exploration, discovery, prediction, and verification of either previously known or novel relationships.

We find that the activity of drugs with similar mechanisms of action is in general internally consistent within NCATS and across the other drug databases (CCLE, CTRP, GDSC, and NCI), as shown in Fig. 3. Activity variability for overlapping drugs between institutes is recognized and presumably comes from a combination of the type of robotics and biological techniques employed (37). NCATS uses 1,536-well plates, with compounds added immediately after cell plating and a 48-hour drug exposure. CTRP and GDSC use 384-well plates, with compounds added 24 hours after cell plating and a 72-hour drug incubation. All three projects use CellTiter-Glo. It is unsurprising that drug activity assays done under different conditions might give different results. However, our analyses show that multiple drugs and compounds perform similarly regardless of differences in assay parameters. Thus, our recommendation for pharmacogenomics exploration with CellMinerCDB: NCATS is to first perform interdatabase analyses with drugs present in at least two platforms and to prioritize drugs with consistent cytotoxicity response across databases.

CellMinerCDB: NCATS comprises two main analysis tools: “univariate analyses” and “multivariate analyses” (Fig. 1A). The pharmacogenomics analyses shown in Figs. 3–7, all generated within the CellMinerCDB: NCATS web application, provide examples of the many types of analysis possible. With 14.7 billion drug activity versus gene-molecular or phenotypic (CRISPR) measurements, one is limited in practice only by the number of questions and the knowledge one has. This number does not include the many intergene molecular and interdrug activity comparisons one might do.

Figures 3A, 4A, 5A–E, and Supplementary Fig. S1 provide pharmacogenomic and proteomic explorations for SN-38, as prior work has causally related SLFN11 expression to the activity of TOP1 inhibitors (6, 38–40). The additional transcript examples in Fig. 4B and DNA copy-number examples in Fig. 4C and D link various NCATS drugs to their molecular targets. The gene knockdown (CRISPR) comparisons show how a gene knockdown measured in Project Achilles relates to response to drugs measured in NCATS. None of the 33 drug-target examples listed are FDA-approved biomarkers for their respective drugs, so each of them provides a possible incentive for their development and use. One might easily expand this type of analysis to nontarget but biologically relevant genes based on domain knowledge.

When using “univariate analyses,” we find that the transcript data are stronger predictors of pharmacologic response than the other genomic data (gene mutations, copy-number variation, or methylation) available in the cancer cell lines (Supplementary Table S4). Currently, DNA mutation is a predominant biomarker used for drug prediction. Although we see the expected predictive value of BRAF mutations with the activity of vemurafenib and dabrafenib (Supplementary Fig. S2), mutations only predict the activity of a relatively small subset of drugs routinely used in oncology. In addition to having reliable gene coverage and being implemented clinically, RNA-seq data are advantageous for the construction of multigene signatures. The superiority of transcript data for predicting pharmacologic response in cell lines is likely to translate clinically over time, leading to its gaining dominance for that purpose.

Because pharmacologic response is a product of multiple molecular factors, drug activity prediction or exploration is expected to be improved and tested using the “multivariate analyses” tools of CellMinerCDB: NCATS. Figure 5 provides examples of how multigene analyses can be built and explored. This approach requires an understanding of the pathways and targets that determine drug response. Taking the example of SN-38 (the active metabolite of irinotecan) and topotecan (36), Fig. 5 shows how “multivariate analyses” can be generated. CellMinerCDB also provides preexisting gene signatures. Figure 6 uses a precomputed multigene signature, the 18-transcript RepStress signature (29). An increased level of this stress parameter is significantly correlated with topotecan and SN-38 response, providing proof-of-principle and a testable preclinical model for RepStress as predictive for patient response to TOP1 inhibitors. Having precomputed signatures avoids looking up the reference, finding the genes involved, and determining and then applying the algorithm for the cell line set of interest.

Downloading the data of CellMinerCDB: NCATS reveals drug activity distribution enrichments for some tissues of origin within the cancer cell line panels. All the cancer types enriched indicate prospective novel applications for those drugs, presumably with responsive subsets. Nononcology drugs might also be studied. An example from Fig. 7E is disulfiram, a drug used to discourage alcohol intake. Response to this drug is bimodal across the NCATS cancer cell lines, with improved activity in bone (sarcoma) cell lines. This result expands our prior work on the discovery of acetalax, another noncancer drug, with activity in triple-negative breast cancer cell lines (3).

In summary, the wealth of information in the CellMinerCDB: NCATS web application, albeit with its own limitations, allows basic and clinical researchers to explore pharmacogenomic relationships in either univariate or multivariate fashion. One may consider drug response in the context of multiple forms or combinations of outputs that easily run into the billions. The web application facilitates the user's ability to explore those relationships and to identify potential pharmacogenomic parameters applicable to clinical studies.

Limitations of the data come in multiple forms requiring multiple solutions. Missing data might be addressed by simply carrying out the salient form of analysis to fill those gaps.

More complete analysis of variability between platforms might be done by adding overlapping cell lines, drugs, or assays of interest. Algorithmic approaches that better consider the limitations and proper interpretation of datasets can improve results at that level, including the expansion of multivariate analysis functionality and approach selection. Recognition of signatures predictive of pharmacologic response should yield improved success in that area. It should be noted that the relationships found do not constitute proof of causality. The continued exploration and definition of how best to integrate cancer cell line omics data with that from patients, and to integrate clinical data into the omics format, remain fields in their infancy.

K.R. Bradwell reports other support from Palantir Technologies during the conduct of the study and other support from Palantir Technologies outside the submitted work. No disclosures were reported by the other authors.

W.C. Reinhold: Conceptualization, formal analysis, investigation, visualization, methodology, writing–original draft, project administration, writing–review and editing. K. Wilson: Resources, data curation, software, formal analysis, investigation, visualization, project administration, writing–review and editing. F. Elloumi: Resources, data curation, software, formal analysis, visualization, writing–review and editing. K.R. Bradwell: Resources, data curation, software, writing–review and editing. M. Ceribelli: Resources, data curation, and software. S. Varma: Resources, data curation, formal analysis, and investigation. Y. Wang: Resources, data curation, and software. D. Duveau: Resources, data curation, and software. N. Menon: Resources and data curation. J. Trepel: Conceptualization, resources, data curation, software, investigation, writing–review and editing. X. Zhang: Conceptualization, resources, data curation, software, investigation, writing–review and editing. C. Klumpp-Thomas: Resources. S. Michael: Resources. P. Shinn: Resources, data curation, and software. A. Luna: Data curation, software, formal analysis, writing–review and editing. C. Thomas: Conceptualization, resources, data curation, software, writing–review and editing. Y. Pommier: Conceptualization, resources, supervision, investigation, visualization, writing–review and editing.

Our studies are supported by the Center for Cancer Research, the Intramural Program of the NCI, NIH, Bethesda, MD (Z01 BC 006150).

The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

1. Weinstein JN, Myers TG, O'Connor PM, Friend SH, Fornace AJ Jr, Kohn KW, et al. An information-intensive approach to the molecular pharmacology of cancer. Science 1997;275:343–9.
2. Luna A, Elloumi F, Varma S, Wang Y, Rajapakse VN, Aladjem MI, et al. CellMiner cross-database (CellMinerCDB) version 1.2: exploration of patient-derived cancer cell line pharmacogenomics. Nucleic Acids Res 2021;49:D1083–D93.
3. Rajapakse VN, Luna A, Yamade M, Loman L, Varma S, Sunshine M, et al. CellMinerCDB for integrative cross-database genomics and pharmacogenomics analyses of cancer cell lines. iScience 2018;10:247–64.
4. Reinhold WC, Sunshine M, Liu H, Varma S, Kohn KW, Morris J, et al. CellMiner: a web-based suite of genomic and pharmacologic tools to explore transcript and drug patterns in the NCI-60 cell line set. Cancer Res 2012;13.
5. Reinhold WC, Sunshine M, Varma S, Doroshow JH, Pommier Y. Using cellminer 1.6 for systems pharmacology and genomic analysis of the NCI-60. Clin Cancer Res 2015;21:3841–52.
6. Reinhold WC, Thomas A, Pommier Y. DNA-targeted precision medicine; have we been caught sleeping? Trends Cancer 2017;3:2–6.
7. Reinhold WC, Varma S, Sousa F, Sunshine M, Abaan OD, Davis SR, et al. NCI-60 whole exome sequencing and pharmacological cellminer analyses. PLoS One 2014;9:e101670.
8. Scherf U, Ross DT, Waltham M, Smith LH, Lee JK, Tanabe L, et al. A gene expression database for the molecular pharmacology of cancer. Nat Genet 2000;24:236–44.
9. Tlemsani C, Takahashi N, Pongor L, Rajapakse VN, Tyagi M, Wen X, et al. Whole-exome sequencing reveals germline-mutated small cell lung cancer subtype with favorable response to DNA repair-targeted therapies. Sci Transl Med 2021;13:eabc7488.
10. Pongor LS, Tlemsani C, Elloumi F, Arakawa Y, Jo U, Gross JM, et al. Integrative epigenomic analyses of small cell lung cancer cells demonstrates the clinical translational relevance of gene body methylation. iScience 2022;25:105338.
11. Allison M. NCATS launches drug repurposing program. Nat Biotechnol 2012;30:571–2.
12. Huang R, Zhu H, Shinn P, Ngan D, Ye L, Thakur A, et al. The NCATS pharmaceutical collection: a 10-year update. Drug Discov Today 2019;24:2341–9.
13. Mathews Griner LA, Guha R, Shinn P, Young RM, Keller JM, Liu D, et al. High-throughput combinatorial screening identifies drugs that cooperate with ibrutinib to kill activated B-cell-like diffuse large B-cell lymphoma cells. Proc Natl Acad Sci U S A 2014;111:2349–54.
14. Heske CM, Davis MI, Baumgart JT, Wilson K, Gormally MV, Chen L, et al. Matrix screen identifies synergistic combination of PARP inhibitors and nicotinamide phosphoribosyltransferase (NAMPT) inhibitors in ewing sarcoma. Clin Cancer Res 2017;23:7301–11.
15. Ju W, Zhang M, Wilson KM, Petrus MN, Bamford RN, Zhang X, et al. Augmented efficacy of brentuximab vedotin combined with ruxolitinib and/or Navitoclax in a murine model of human Hodgkin's lymphoma. Proc Natl Acad Sci U S A 2016;113:1624–9.
16. Lin GL, Wilson KM, Ceribelli M, Stanton BZ, Woo PJ, Kreimer S, et al. Therapeutic strategies for diffuse midline glioma from high-throughput combination drug screening. Sci Transl Med 2019;11:eaaw0064.
17. Wilson KM, Mathews-Griner LA, Williamson T, Guha R, Chen L, Shinn P, et al. Mutation profiles in glioblastoma 3D oncospheres modulate drug efficacy. SLAS Technol 2019;24:28–40.
18. Holbeck SL, Camalier R, Crowell JA, Govindharajulu JP, Hollingshead M, Anderson LW, et al. The National Cancer Institute ALMANAC: a comprehensive screening resource for the detection of anticancer drug pairs with enhanced therapeutic activity. Cancer Res 2017;77:3564–76.
19. Varma S, Pommier Y, Sunshine M, Weinstein JN, Reinhold WC. High resolution copy number variation data in the NCI-60 cancer cell lines from whole genome microarrays accessible through CellMiner. PLoS One 2014;9:e92047.
20. Reinhold WC, Varma S, Sunshine M, Rajapakse V, Luna A, Kohn KW, et al. The NCI-60 methylome and its integration into cellminer. Cancer Res 2017;77:601–12.
21. Reinhold WC, Varma S, Sunshine M, Elloumi F, Ofori-Atta K, Lee S, et al. RNA sequencing of the NCI-60: integration into cellminer and cellminer CDB. Cancer Res 2019;79:3514–24.
22. Liu H, D'Andrade P, Fulmer-Smentek S, Lorenzi P, Kohn KW, Weinstein JN, et al. mRNA and microRNA expression profiles integrated with drug sensitivities of the NCI-60 human cancer cell lines. MCT 2010;9:1080–91.
23. Nishizuka S, Chen ST, Gwadry FG, Alexander J, Major SM, Scherf U, et al. Diagnostic markers that distinguish colon and ovarian adenocarcinomas: identification by genomic, proteomic, and tissue array profiling. Cancer Res 2003;63:5243–50.
24. Guo T, Luna A, Rajapakse VN, Koh CC, Wu Z, Liu W, et al. Quantitative proteome landscape of the NCI-60 cancer cell lines. iScience 2019;21:664–80.
25. Gopi LK, Kidder BL. Integrative pan cancer analysis reveals epigenomic variation in cancer type and cell specific chromatin domains. Nat Commun 2021;12:1419.
26. Barretina J, Caponigro G, Stransky N, Venkatesan K, Margolin AA, Kim S, et al. The cancer cell line encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 2012;483:603–7.
27. Ghandi M, Huang FW, Jane-Valbuena J, Kryukov GV, Lo CC, McDonald ER III, et al. Next-generation characterization of the cancer cell line encyclopedia. Nature 2019;569:503–8.
28. Heimerdinger P, Rosin A, Danzer MA, Gerdes T. A novel method for humidity-dependent through-plane impedance measurement for proton conducting polymer membranes. Membranes 2019;9:62.
29. Thomas A, Takahashi N, Rajapakse VN, Zhang X, Sun Y, Ceribelli M, et al. Therapeutic targeting of ATR yields durable regressions in small cell lung cancers with high replication stress. Cancer Cell 2021;39:566–79.
30. Mendez D, Gaulton A, Bento AP, Chambers J, De Veij M, Felix E, et al. ChEMBL: towards direct deposition of bioassay data. Nucleic Acids Res 2019;47:D930–D40.
31. Siramshetty VB, Grishagin I, Nguyen Eth T, Peryea T, Skovpen Y, Stroganov O, et al. NCATS inxight drugs: a comprehensive and curated portal for translational research. Nucleic Acids Res 2022;50:D1307–D16.
32. Wishart DS, Feunang YD, Guo AC, Lo EJ, Marcu A, Grant JR, et al. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res 2018;46:D1074–D82.
33. Bairoch A. The cellosaurus, a cell-line knowledge resource. J Biomol Tech 2018;29:25–38.
34. Zhao C, Jiang T, Ju JH, Zhang S, Tao J, Fu Y, et al. TruSight oncology 500: enabling comprehensive genomic profiling and biomarker reporting with targeted sequencing. Biorxiv 2020.
35. Luna A, Rajapakse VN, Sousa FG, Gao J, Schultz N, Varma S, et al. rcellminer: exploring molecular profiles and drug response of the NCI-60 cell lines in R. Bioinformatics 2016;32:1272–4.
36. Thomas A, Pommier Y. Targeting topoisomerase i in the era of precision medicine. Clin Cancer Res 2019;25:6581–9.
37. Niepel M, Hafner M, Mills CE, Subramanian K, Williams EH, Chung M, et al. A multi-center study on the reproducibility of drug-response assays in mammalian cell lines. Cell Syst 2019;9:35–48.
38. Zoppoli G, Regairaz M, Leo E, Reinhold WC, Varma S, Ballestrero A, et al. Putative DNA/RNA helicase Schlafen-11 (SLFN11) sensitizes cancer cells to DNA-damaging agents. Proc Natl Acad Sci U S A 2012;109:15030–5.
39. Rees MG, Seashore-Ludlow B, Cheah JH, Adams DJ, Price EV, Gill S, et al. Correlating chemical sensitivity and basal gene expression reveals mechanism of action. Nat Chem Biol 2016;12:109–16.
40. Jo U, Murai Y, Takebe N, Thomas A, Pommier Y. Precision oncology with drugs targeting the replication stress, ATR, and Schlafen 11. Cancers (Basel) 2021;13:4601.