Identifying robust biomarkers of drug response constitutes a key challenge in precision medicine. Patient-derived tumor xenografts (PDX) have emerged as reliable preclinical models that more accurately recapitulate tumor response to chemo- and targeted therapies. However, the lack of computational tools makes it difficult to analyze high-throughput molecular and pharmacologic profiles of PDX. We have developed Xenograft Visualization & Analysis (Xeva), an open-source software package for in vivo pharmacogenomic datasets that allows for quantification of variability in gene expression and pathway activity across PDX passages. We found that only a few genes and pathways exhibited passage-specific alterations and were therefore not suitable for biomarker discovery. Using the largest PDX pharmacogenomic dataset to date, we identified 87 pathways that are significantly associated with response to 51 drugs (FDR < 0.05). We found novel biomarkers based on gene expressions, copy number aberrations, and mutations predictive of drug response (concordance index > 0.60; FDR < 0.05). Our study demonstrates that Xeva provides a flexible platform for integrative analysis of preclinical in vivo pharmacogenomics data to identify biomarkers predictive of drug response, representing a major step forward in precision oncology.

Significance:

A computational platform for PDX data analysis reveals consistent gene and pathway activity across passages and confirms drug response prediction biomarkers in PDX.

See related commentary by Meehan, p. 4324

Preclinical models are vital for investigating disease biology and therapeutics, constituting essential tools for translational research and drug development. In cancer research, immortalized cell lines are the most used preclinical models because of their low cost, flexibility, and the existence of assays enabling genetic and chemical screen in a high-throughput manner. However, the cell line models suffer from multiple limitations. Although they are derived from patient tumors, they have evolved to survive in artificial culture conditions resulting in major alterations at the genomic level (1–6). These in vitro models also lack the tumor heterogeneity and three-dimensional structure of the origin patient tumor (4, 7–9).

To create cancer models that better recapitulate the tumor molecular features and drug response, the pharmaceutical industry and academia massively invested in the development of patient-derived xenografts (PDX; ref. 10), which enable engraftment of human tumors in animal models (11–17). PDXs are created by subcutaneous or orthotopic engraftment of the cancerous tissues or cells from patients' tumors into immunodeficient mice. Once established, these tumors can be passed from mouse to mouse, leading to consecutive “passages” of the initial tumor cells.

PDX models are being generated and distributed by several academic groups, research institutes, and commercial organizations. This makes it challenging to find PDX models with specific characteristics such as a model with a specific mutation. Therefore, catalogs of PDX models [e.g., PDXFinder (18), EurOPDX (19), and PRoXe (20)] are being developed, which contain relevant information and provide links to model acquisition. Furthermore, as PDXs are becoming the gold-standard model for preclinical studies, better data standardization and analysis platforms are required to ensure consistency and reproducibility in PDX-based analysis. Recently, a robust standard called PDX models minimal information (PDX-MI) has been proposed (21) for reporting and quality assurance for PDX models. However, management, analysis, and visualization of the PDX-based drug screening and genomic data still constitute major challenges.

Here we present Xenograft Visualization and Analysis (Xeva), a computational package enabling storage, access, and analysis of in vivo pharmacogenomics data. The Xeva toolbox facilitates biomarker discovery in PDX-based pharmacogenomic data. It implements class structure to manage and connect PDX-based drug screening data to the genomic features of the corresponding tumor. It provides functions for PDX data analysis, including multiple metrics to summarize drug response for tumor growth curves. Furthermore, Xeva provides functions to compute the association between genomic features and response to a drug in PDXs (gene–drug association). Different response metrics for the PDX growth curve can be used as outcome to identify novel gene–drug associations that can act as drug companion tests. Using the Xeva platform, we analyzed gene expression of PDXs across passages and demonstrated that activity patterns of the majority of the genes and pathways are stable across different PDX passages. We identified multiple pathways significantly associated with anticancer drug response, including known and new biomarkers based on gene expression, copy number aberrations (CNA), and mutations. Our results support the value of large-scale PDX-based drug screening for biomarker discovery using the integrative pharmacogenomic analysis pipelines implemented in the Xeva computational platform.

Processing of pharmacogenomic data

Gene expression data, PDX passages, and tissue information were obtained from the Gene Expression Omnibus and publications (Fig. 1). Molecular profiles including mutation, CNA, gene expression, and pharmacologic profiles were obtained from the publication Gao and colleagues (17) and Gene Expression Omnibus series GSE78806. Gene expression profiles were normalized with the RMA algorithm in the Affy package (version 1.58.0) in Bioconductor (22). PDX passages and tissue information were curated manually. Data were processed using R statistical software (http://www.r-project.org/). For mutation, CNA and RNAseq-based gene expression data, processed values were directly download from Gao and colleagues' publication (17). The final dataset contains 3,470 unique PDX models tested across 57 treatments and derived from 191 patients spanning across 5 different cancer types.

Figure 1.

Xeva facilitates analysis of the PDX-based pharmacogenomic data. A, Major publicly available PDX-based pharmacogenomic datasets. For a detailed list and references, see Supplementary Table S1. B, Distribution of genomic, drug screening, passage-related data, and cancer types in PDXE dataset. For 350 patients belonging to 6 different cancer types, availability of genomic data (CNA, mutation, and RNAseq) is shown in inner tracks. Number of drugs also includes untreated/control PDXs. Availability of passage-specific gene expression data is shown in outer track. C, Computation and visualization of response for PDX-based drug screening in PDXE breast cancer data using Xeva mRECIST function. For patient IDs highlighted in gray color, the control and treatment (paclitaxel) growth curve of PDXs is shown in D. The patient IDs are X-2344, X-1004, X-3078, and X-5975, respectively. Visualization was done using plotPDX function in Xeva.

Figure 1.

Xeva facilitates analysis of the PDX-based pharmacogenomic data. A, Major publicly available PDX-based pharmacogenomic datasets. For a detailed list and references, see Supplementary Table S1. B, Distribution of genomic, drug screening, passage-related data, and cancer types in PDXE dataset. For 350 patients belonging to 6 different cancer types, availability of genomic data (CNA, mutation, and RNAseq) is shown in inner tracks. Number of drugs also includes untreated/control PDXs. Availability of passage-specific gene expression data is shown in outer track. C, Computation and visualization of response for PDX-based drug screening in PDXE breast cancer data using Xeva mRECIST function. For patient IDs highlighted in gray color, the control and treatment (paclitaxel) growth curve of PDXs is shown in D. The patient IDs are X-2344, X-1004, X-3078, and X-5975, respectively. Visualization was done using plotPDX function in Xeva.

Close modal

Implementation

PDX pharmacogenomic experiments aim to investigate how tumor volume changes in in vivo models with respect to time, with and without drug treatment. The corresponding metadata, such as drug dose, number of days from tumor implantation to treatment start, or reason for stopping the experiment (e.g., whether mice died because of complications or were sacrificed due to maximal allowed tumor volume reached), are crucial factors for downstream analysis. In the Xeva platform, we have developed the XevaSet class to effectively store pharmacologic response (time vs. tumor volume) of PDXs along with metadata related to the experiments and molecular data (Supplementary Fig. S1). Furthermore, to store individual PDX (mouse) model data, we have implemented the pdxModel class, which provides slots for PDX-MI variables, along with time versus tumor volume data. Detailed schematics of the XevaSet and pdxModel classes are shown in Supplementary Figs. S2 and S3, respectively. XevaSet object can contain multi-omics data, which are linked to individual xenograft models and their pharmacologic profiles. Data can be subsetted by metadata and experimental factors, such as tumor types or drug names.

PDX experiment design

In vivo evaluation of drug sensitivity in cancer xenografts has traditionally used experimental approaches incorporating multiple animals (5–10) replicates for each control and treatment arm. This experimental design allows the assessment of the variability in the PDX drug response across individual animals (23). However, scaling this strategy for high-throughput drug screening is costly and requires the use of a large number of experimental animals. To increase the number of compounds being tested, reduce cost, and permit the use of fewer animals to provide essential data (24), a simpler “1 × 1 × 1″ experimental design for PDX clinical trial (PCT) has been proposed (17, 25). In this design, each compound is tested in only 1 PDX model from each patient (Supplementary Fig. S4). Although the 1 × 1 × 1 experiment design allows high-throughput screening at reduced cost, it is more prone to experimental errors due to lack of replications. Factors such as difference in the handling of mice or measurement errors in the tumor volume and biological variability in tumor growth rates between animals could significantly impact the results. We foresee that both experimental design strategies will coexist, where the 1 × 1 × 1 experiment design will be used for population-level and high-throughput screening, whereas the replicate-based experiment design will be used for more focused studies where discrimination of small differences in drug response is a desired outcome. We therefore implemented functions to accommodate both experimental designs for data visualization and statistical analyses.

PDX response metrics

Given the lack of consensus regarding the best summary metrics to estimate in vivo drug response (17, 26–29), we implemented the state-of-the-art response metrics used for PDX-based drug response experiments. These include the slope of curves, angle between the mean control and treatment curves, tumor growth inhibition (TGI), area between the curves, linear mixed model (28), best average response (BAR), best response (BR), and mRECIST (17).

For each PDX model, least squares fits were obtained by regressing tumor volume at each time point as:

formula

where V denotes tumor volume, T denotes time. The intercept and slope are denoted by α and β, respectively. Subsequently, the angle was computed using inverse tangent of regression line slope as:

formula

The TGI is defined as:

formula

where VC and VT are the median of control and treated growth curve, respectively, at the end of the study. VC0 and VT0 indicate the initial tumor volume for control and treated growth curve, respectively.

The BAR metric for each PDX model is defined as follows:

At each time point t, the normalized change in tumor volume is computed as

formula

Next, for each time t, the running average of |\Delta {V_t}$| from t = 0 to t was calculated. BAR is defined as the minimum of this running average (at |t \ge 10\ \,{\rm{days}}$|⁠). The minimum value of |\Delta V$|(at |t \ge 10\ \,{\rm{days}}$|⁠) defines as BR. Using these, the mRESIST metric for PDX is computed as:

  • complete response (CR): BR < −95% and BAR < −40%

  • partial response (PR): BR < −50% and BAR < −20%

  • stable disease (SD): BR < 35% and BAR < 30%

  • progressive disease (PD): not otherwise categorized

Visualization of in vivo drug-screening data

Several Xeva functions enable multifaceted visualization and exploration of the PDX data. The plotPDX function displays tumor growth curves, plotting time versus tumor volume data for a patient–drug pair or matched control and treatment PDX models (called batch). The waterfall function visualizes population-level response for a given set of PDXs. Similarly, plotmRECIST displays mRECIST-based drug response as a heatmap, with drugs along heatmap rows and PDXs along columns. Heatmap cells are colored according to the mRECIST value for the corresponding PDX–drug pair.

Gene expression consistency analysis

We calculated the Pearson correlation coefficient between pairs of samples belonging to the same lineage using all available gene expression. For comparison, we also computed the Pearson correlation coefficient between all possible pairs of samples that do not belong to the same lineage. Similar analysis was performed using the L1000 landmark gene set (30). For genes, we computed the intraclass correlation coefficient (ICC) using the psych package (version 1.8.4). For a particular gene, in each sample we computed its rank by sorting the expression values. The rank of genes along with passage information is used for ICC calculation. ICC values of the genes were used to perform gene-set enrichment analysis on the MSigDB hallmark gene sets and Reactome pathway database (31), which contains gene sets associated with specific pathways.

Pathways and gene–drug association analysis

We computed the association between a molecular feature and response to a drug across PDXs (commonly referred to as gene–drug association or drug response predicting biomarker). The gene–drug association was assessed separately for all three available molecular features, that is, gene expression, CNA, and gene mutation. The response of PDXs to a drug treatment is defined using BAR.

The association between genomic feature and PDX response was computed using nonparametric measure of association, concordance index (CI). The CI represents the probability that two variables will rank a random pair of samples the same order. For each drug, we compiled a list of potential biomarkers using OncoKB (32), DrugBank, and literature curation. We adjusted the P value using FDR for drug combinations and drugs with multiple potential biomarkers. Univariate gene–drug associations were calculated for three different molecular profiling modalities that are gene expression, copy number variation, and mutation. For drug–pathway association analysis, the Reactome pathway database (31) was used. We created a subset of the pathway database by selecting only the gene sets containing at least 1 potential drug target and containing less than 300 genes. We also grouped the drugs in different classes according to their genomic target. In total, we selected 94 pathways related to 57 drugs and drugs were classified into 11 classes. Pathway analysis was performed using the R Piano package (version 1.20.1; ref. 33). We used the R package ComplexHeatmap (version 1.20.0) to visualize the gene–drug associations and R package circlize (version 0.4.6) to visualize the drug–pathway association (34).

Research reproducibility

The open-source Xeva package is available from GitHub (https://github.com/bhklab/Xeva) at under GPLv3 license and Bioconductor repository (http://bioconductor.org/packages/Xeva/). We also provide a complete software environment through Code Ocean containing all necessary data and code to reproduce the analysis and figures described in this manuscript under the doi https://doi.org/10.24433/CO.bfb2dd77-53a3-4083-bf3a-16504fc8c786.

Xeva follows PDX-MI standards to store pharmacogenomic data

We designed XevaSet, a new object class enabling integration of molecular and pharmacologic profiles of PDXs (Supplementary Figs. S1–S3) following the recent minimal information for patient-derived tumor xenograft models (PDX-MI) standard for reporting on the generation, quality assurance, and use of PDX models (21). The PDX-MI standard ensures that all necessary clinical attributes of the tumor along with PDX-related essential experimental information, such as host mouse strain and passage information, is reported. Given that this information is crucial for downstream analysis and research reproducibility, we have implemented the pdxModel class (Supplementary Figs. S2–S4) that provides slots for PDX-MI variables. In this study, we curated the recent Novartis PDX Encyclopedia (PDXE; ref. 17) and created the PDXE XevaSet object (Fig. 1A and B; Supplementary Table S1) to investigate the consistency of gene expression patterns across passages and mine the pharmacogenomic data for known and new biomarkers predictive of drug response in vivo.

Xeva provides useful functions for analysis and visualization of PDX-based pharmacogenomic data. For every PDX model in the PDXE breast cancer dataset, mRECIST-based response metrics was computed using the Xeva function response (Fig. 1C). Heatmaps representing the mRECIST data cutaneous melanoma (CM), colorectal cancer, gastric cancer, non–small cell lung carcinoma (NSCLC) and pancreatic ductal adenocarcinoma can be found in Supplementary Figs. S5 to S9, respectively. Visualization of PDX growth curves is an essential part of data quality control and analysis. Tumor growth curves for individual PDX models and for matched control-treatment models can be plotted using the plotPDX function (Supplementary Table S2). Examples of PDX tumor growth curves in control (untreated) and treatment (paclitaxel) conditions are shown in Fig. 1D. All 4,483 tumor growth curves with matched control can be found in Supplementary Data File S1. PDX-related genomic data and linked drug screening response data can be extracted using the function summarizeMolecularProfiles (Supplementary Table S2). Similarly, the drugSensitivitySig function (Supplementary Table S2) allows users to quantify the strength of each gene–drug association using a linear regression model.

Gene expression is consistent across passages

PDX models are known to better represent the molecular characters of human tumor compared with simpler in vitro models (13, 35, 36). PDXs show high similarity to patient samples for mutation rate (17, 37) and CNA (13) and methylation (38) patterns, histological and molecular subtypes (13, 36, 39). Before conducting drug screening, PDXs are passaged multiple times with the assumption that genomic characteristics of PDXs are stable across passages. Several studies have shown that mutation and copy number patterns are largely stable across passages (14, 36, 40–44). However, a meta-analysis using inferred copy number data asserted that the genomic landscape of PDXs changes rapidly during passaging (45). Given these contradictory results, it is vital to perform systematic analysis of noninferred genomic data (i.e., gene expression profile) across passage (45).

To address these concerns, we sought to systematically analyze the changes in gene expression pattern across passages. We curated gene expression data for 661 PDX samples [all samples are overlapping with Ben-David and colleagues (45) study], derived from 371 patients and spanning from passage 0 (P0) to passage 5 (P5; Fig. 2A; Supplementary Table S3). To visualize the consistency of gene expression patterns across passages, we performed a t-distributed stochastic neighbor embedding (t-SNE) analysis and projected the high-dimensional PDX gene expression data into a two-dimensional plot (Fig. 2B). PDXs derived from a patient but belonging to different passages (defined as belonging to same lineage) were linked together in the t-SNE visualization. We observed that in the visualization PDXs from the same lineage are projected nearby even though coming from different passages. Next, we computed the Pearson correlation coefficient for all pairs of PDX samples belonging to same lineage using all available genes (Fig. 2C) and L1000 gene set (Supplementary Fig. S10). We found that the median Pearson correlation for related pairs is high (0.98) in comparison with nonrelated tissue-specific PDX pairs (0.87) and the difference was statistically significant (Wilcoxon rank sum test, P < 0.0001). Collectively, these results strongly support that the gene expression profile of PDXs is consistent across different passages. For specific genes, however, the expression behavior can vary across PDX passages, which may affect drug response depending on which passage is used for drug testing. We therefore assessed stability of each gene across passages by computing the intraclass correlation coefficient (ICC) for genes in samples from same PDX lineage for all tissue types and stratified by tissue type (Fig. 3A). We found that the ICC values of genes in pairs of related PDXs are higher when compared with nonrelated pairs (Wilcoxon rank sum test, P <1E−15). Furthermore, the ICC values of genes are skewed towards high values, indicating that the majority of genes have a stable expression pattern across PDX passages. Our results show that expression patterns of known biomarker genes such as EGFR, ERBB2, and MAP2K1 are stable across passage (Table 1; Supplementary Data File S2). Consistent with Rubio-Viqueira and colleagues (46), we found that expression of genes such as VEGFA, MDM2, and CDK4 is stable in pancreatic cancer PDXs.

Figure 2.

Gene expression landscape of PDXs is consistent across passages. A, Distribution of samples in different passages. B, t-SNE analysis of gene expression data (for 17,304 genes) from different passages of the PDXs. Samples belonging to same lineage but belonging to different passages are linked together by line (light gray color). Overlapping samples or samples very near to each other are connected using curved lines (light gray color). C, Pearson correlation for related sample pairs (belonging to same lineage) and randomly selected samples pairs. The correlation coefficient of related pairs is significantly higher than randomly selected pairs (P < 0.001). All available (17,304) gene expressions were used for the computation of the correlation coefficient.

Figure 2.

Gene expression landscape of PDXs is consistent across passages. A, Distribution of samples in different passages. B, t-SNE analysis of gene expression data (for 17,304 genes) from different passages of the PDXs. Samples belonging to same lineage but belonging to different passages are linked together by line (light gray color). Overlapping samples or samples very near to each other are connected using curved lines (light gray color). C, Pearson correlation for related sample pairs (belonging to same lineage) and randomly selected samples pairs. The correlation coefficient of related pairs is significantly higher than randomly selected pairs (P < 0.001). All available (17,304) gene expressions were used for the computation of the correlation coefficient.

Close modal
Figure 3.

PDX maintains expression pattern of the genes across passages. A, Violin plot shows ICC for genes across PDX passages for all samples and is stratified by tissue type. Black violin plot represents ICC values for genes calculated using nonpassage-related (randomly selected) samples. B, Pathways are stable across passages in PDXs. Barplot shows top 10 pathways with negative enrichment score in gene-set enrichment analysis. Only one pathway has a statistically significant (FDR < 0.05) negative enrichment score across PDX passages.

Figure 3.

PDX maintains expression pattern of the genes across passages. A, Violin plot shows ICC for genes across PDX passages for all samples and is stratified by tissue type. Black violin plot represents ICC values for genes calculated using nonpassage-related (randomly selected) samples. B, Pathways are stable across passages in PDXs. Barplot shows top 10 pathways with negative enrichment score in gene-set enrichment analysis. Only one pathway has a statistically significant (FDR < 0.05) negative enrichment score across PDX passages.

Close modal
Table 1.

Consistency of expression pattern of biomarkers across PDX passages

Consistency of expression pattern of biomarkers across PDX passages
Consistency of expression pattern of biomarkers across PDX passages

To identify pathways that are enriched with unstable genes, we performed a gene set enrichment analysis with the genome-wide ranking of genes based on their stability across passages (ICC values; Fig. 3B; Supplementary Data Files S3 and S4). Pathways enriched with unstable genes will be sensitive to passage and targeting these pathways might result in inconsistent drug response across passages. Our analysis shows that the pathways in the Hallmark gene set (47) have positive enrichment scores, indicating high stability of these pathways across passages (Supplementary Data File S3). For the Reactome gene set (31), we found that posttranscriptional mRNA processing pathways such as mRNA 3′ end processing and mRNA splicing are enriched with unstable genes (Fig. 3B; Supplementary Data File S4). It is well established that during proliferation and differentiation, cells adjust the mRNA and protein level by controlling the posttranscriptional mRNA processing pathways (48–50). Therefore, the instability of these posttranscriptional mRNA-processing pathways might be attributed to the tumor growth in the PDX. Targeting these pathways using a drug in PDXs might result in inconsistent response at different passages.

In vivo biomarker discovery

One of the main goals of pharmacogenomic studies is to find genomic biomarkers for drug response prediction. We evaluated the association between a molecular feature and response to a given drug (gene–drug association) in PDXE (17) data. The PDXE data consists of a 1 × 1 × 1 experimental design where 60 compounds were tested across 277 PDXs. The Xeva function plotPDX provides an interface for the visualization of time versus tumor volume data of PDXs. To model the tumor growth curve and to quantify the response of PDXs, we implemented several drug response metric including slope, area between curves, linear mixed effects model (28), BAR, and mRECIST (17).

We used Xeva to identify biomarkers for drug response prediction by computing gene–drug associations for drugs in the PDXE data and their corresponding known biomarker genes defined using OncoKB resource (32). For the analysis, a PDX's response to a drug is defined using BAR and association was computed using concordance index. Analysis was done for each tissue type with gene expression, CNA, and mutation data. In vitro drug testing has shown that the drug encorafenib can produce synergistic effects with binimetinib in cutaneous melanoma (17, 51). This drug combination also shows synergistic effects in PDXs as 50% of tested PDXs show tumor shrinkage, whereas binimetinib and encorafenib monotherapy show tumor shrinkage in 40% and 25% of PDXs, respectively (Fig. 4A and B). For the drug combination of encorafenib and binimetinib, we found that NRAS mutation status is significantly associated with response (Wilcoxon tests, FDR = 0.01; Fig. 4C). Among PDXs with NRAS wild-type status, 63.6% show a negative BAR to the binimetinib and encorafenib drug combination, whereas in the mutated category only 20% show a response.

Figure 4.

Xeva facilitates biomarker discovery and visualization. Waterfall plots show response of CM PDXs for drugs binimetinib (A), encorafenib (B), and binimetinib + encorafenib (C). Each bar represents one PDX derived from a patient and color represents NRAS mutation status (red mutated and blue wild-type). Response of the PDX is defined as BAR. Association between mutation and drug response was calculated using Xeva function drugSensitivitySig and FDR correction was applied to P values. For visualization, waterfall function in Xeva is used.

Figure 4.

Xeva facilitates biomarker discovery and visualization. Waterfall plots show response of CM PDXs for drugs binimetinib (A), encorafenib (B), and binimetinib + encorafenib (C). Each bar represents one PDX derived from a patient and color represents NRAS mutation status (red mutated and blue wild-type). Response of the PDX is defined as BAR. Association between mutation and drug response was calculated using Xeva function drugSensitivitySig and FDR correction was applied to P values. For visualization, waterfall function in Xeva is used.

Close modal

The drug trastuzumab (Herceptin) is an monoclonal antibody that targets the extracellular domain of the HER2 protein and inhibits the proliferation of tumor cells. In the breast cancer PDXE data, we found that expression of the HER2 encoding gene (ERBB2) is significantly associated with trastuzumab response (CI = 0.682; FDR = 0.02; Fig. 5). ERBB3, another member of the EFGR (EGFR/ERBB) family, was also found to be associated with trastuzumab response (CI = 0.68; FDR = 0.026). ERBB3 is known to be implicated in growth, proliferation, metastasis, and drug resistance in tumors through interacting with ERBB2 (52, 53).

Figure 5.

PDXs faithfully recapitulate known gene–drug associations. For the breast cancer PDXE dataset, the left side of the figure shows mutation (top), CNA (middle), and expression (bottom) pattern of known biomarker genes. Here, each column represents genomic features from an individual PDX model. The right panel shows association (concordance index) between genomic features and response of the drug. Columns represent drugs for which names are annotated on the top. In this panel, the association between gene feature and drug is represented using circles. For drugs, the corresponding biomarker is highlighted using dark gray color if the association is statistically significant (FDR < 0.05). Association between genomic feature and drug response was computed using the Xeva function drugSensitivitySig and visualization was done using ComplexHeatmap.

Figure 5.

PDXs faithfully recapitulate known gene–drug associations. For the breast cancer PDXE dataset, the left side of the figure shows mutation (top), CNA (middle), and expression (bottom) pattern of known biomarker genes. Here, each column represents genomic features from an individual PDX model. The right panel shows association (concordance index) between genomic features and response of the drug. Columns represent drugs for which names are annotated on the top. In this panel, the association between gene feature and drug is represented using circles. For drugs, the corresponding biomarker is highlighted using dark gray color if the association is statistically significant (FDR < 0.05). Association between genomic feature and drug response was computed using the Xeva function drugSensitivitySig and visualization was done using ComplexHeatmap.

Close modal

The drug binimetinib is a targeted and potent MAPK kinase (MAP2K or MEK) inhibitor (54). In breast cancer PDXs, expression of MAP2K2 (CI = 0.67; FDR = 0.02; Fig. 5) was significantly associated with binimetinib response. This gene belongs to the MAP2K kinase family and produces a protein that activates the MAPK/ERK pathway. Associations between potential biomarkers and PDX drug response for all tissue types can be found in Supplementary Data File S5.

Figure 6.

Pathways targeted by drugs are significantly enriched in PDXs. In the circos plot, drug classes (left) and targeted pathways (right) are linked when the association is significant (gene set enrichment analysis, FDR < 0.05). Activity of EGFR class drugs has significant association with EGFR signaling pathway. Similarly, MAPK class drugs show significant association with MAPK activation-related pathways.

Figure 6.

Pathways targeted by drugs are significantly enriched in PDXs. In the circos plot, drug classes (left) and targeted pathways (right) are linked when the association is significant (gene set enrichment analysis, FDR < 0.05). Activity of EGFR class drugs has significant association with EGFR signaling pathway. Similarly, MAPK class drugs show significant association with MAPK activation-related pathways.

Close modal

To gain understanding of the association between drug response and target pathway, we performed gene set enrichment analysis (Fig. 6). Drugs were classified into 11 classes according to their known targets and the Reactome pathway database was used for gene set enrichment analysis. We found that EGFR signaling in cancer pathway is significantly enriched (FDR < 0.05) in the EGFR class of drugs. Similarly, for MAPK class drugs, relevant pathways such as MAPK activation in Toll-like receptor (TLR) cascade and NF-κB and MAPK activation by TLR are significantly enriched (FDR < 0.05). Associations between each drug and reactome pathways can be found in Supplementary Data File S6.

PDXs are valuable models for cancer modeling and pharmacogenomic analysis. However, the translational potential of PDX preclinical models is highly dependent on the tools and techniques to processing and analyzing the data. Computational tools that enable standardized processing are thus an integral part of this line of research, and harmonized approaches can provide the community with accessible means for the analysis. The Xeva platform allows researchers to visualize and analyze the complex pharmacogenomic data generated during in vivo drug-screening studies. The key strengths of the Xeva platform is its ability to store all metadata from a PDX experiment, link genomic data to corresponding PDX models, and provide user friendly functions for analysis.

In a recent study, Ben-David and colleagues (45) analyzed changes in CNAs during PDX passaging using experimental and computational inferred copy number alterations (CNA) from gene expression profiles. They concluded that the CNA landscape of PDXs changes rapidly with passage as within four passages 12.3% (median) of the genome was affected by model-acquired CNAs. Although such an analysis is an important component of credentialing PDX as a preclinical platform, their analysis depended largely on inference rather than direct measurement of CNAs: for 84% (933 of 1,110) of PDX samples, the CNAs were inferred from microarray-based gene expression data and lack matched normal tissue samples. As the authors stated, virtual karyotyping does not fully recapitulate the CNA observed from SNP microarrays or whole-exome sequencing. For the samples where DNA-based CNA profiles are available, the authors have reported a concordance of 0.82 between experimental (DNA-based) and expression-based inferred CNA profile. However, directly assessing the gene expression pattern can provide better insight about changes in genomic landscape of PDXs across passages than the CNA profile. In this study, using Xeva, we have assessed the gene expression landscape of PDXs. Our results indicate that the gene expression landscape of PDXs is similar across different passages as the correlation between the related PDXs is very high. At the level of individual genes, we observe high consistency in expression patterns for the majority of genes. However, caution is required when analyzing genes with low stability in PDX models. Lack of stability in gene expression may lead to inconsistent results when targeting proteins or pathways related to genes with low expression stability. We have provided a list of genes and their stability score that will help researchers to assess the consistency in expression patterns of genes of interest and thereby deciding if PDXs are suitable models for their (targeted) drug of interest or to study behavior of a particular gene.

In our study, we found that the multiple known biomarkers predictive of drug response can be identified in the large Novartis PDXE dataset. Notably, the dataset used in this study has the 1 × 1 × 1 PDX experiment design (Supplementary Fig. S4). Thus, our results demonstrate that the 1 × 1 × 1 PDX experiment design is an adequate way to discover drug response predicting biomarkers at the population level. Although the 1 × 1 × 1 experiment design is a cost- and animal-resource efficient way to find biomarkers, using Xeva, we have found cases where control (untreated) mice show a partial response (PR) or tumor shrinkage. Possible causes of such early tumor shrinkage might include experimental error, handling glitches, or genetic properties of the tumor. In a 1 × 1 × 1 experiment design, lack of replicates makes it impossible to decipher the exact cause. Visualization and analysis of PDX response data using Xeva provides an efficient way to recognize such cases.

For biomarker discovery analysis, treatment response in PDXs is defined by the BAR (17). This metric of treatment response provides a continuous value; however, it does not take into account the control arm of the PDX experiment. As PDX-based pharmacogenomics gains popularity, standardized metrics to define response are required, thereby taking into account the control arm of the PDX experiments. Such methods will also improve the biomarker discovery process. Xeva provides a standard tool for comparison of different PDX response metrics.

The Xeva package enables easy and efficient analysis of the PDX-based pharmacogenomic data. Xeva includes functions to link molecular features to drug response, therefore providing a unified framework for analysis and development of biomarkers of drug response.

We developed the Xeva package to facilitate visualization, analysis, and biomarker discovery in PDX pharmacogenomic data. We showed that PDX gene expression is consistent across passages. Our platform allowed us to confirm the existence of several drug response prediction biomarkers in a large PDX pharmacogenomic dataset. The reproducibility of known biomarkers and consistency in gene expression shows that PDX experiments are suitable for in vivo biomarker discovery or validation. Xeva is an open-source, flexible, and timely tool in an era of increasing efforts to use PDXs as the main model system for cancer research. We envision Xeva will play a crucial role in PDX-based pharmacogenomic analysis, biomarker discovery, and validation.

No potential conflicts of interest were disclosed.

Conception and design: A.S. Mer, M.-S. Tsao, D.W. Cescon, B. Haibe-Kains

Development of methodology: A.S. Mer, D.W. Cescon, B. Haibe-Kains

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): A.S. Mer, M.-S. Tsao

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): A.S. Mer, B. Brew, J. Ortmann, D.W. Cescon, A. Goldenberg, B. Haibe-Kains

Writing, review, and/or revision of the manuscript: A.S. Mer, W. Ba-Alawi, Y.X. Wang, M.-S. Tsao, D.W. Cescon, B. Haibe-Kains

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): A.S. Mer, P. Smirnov, M.-S. Tsao, B. Haibe-Kains

Study supervision: D.W. Cescon, A. Goldenberg, B. Haibe-Kains

Other (partial funding of study): M.-S. Tsao

A.S. Mer was supported by the SU2C Canada—Canadian Cancer Society Breast Cancer Dream Team Research Funding (SU2C-AACR-DT-18-15), with supplemental support from the Ontario Institute for Cancer Research, through funding provided by the Government of Ontario. Stand Up To Cancer Canada is a Canadian Registered Charity (Reg. #80550 6730 RR0001). Research Funding is administered by the American Association for Cancer Research International—Canada, the Scientific Partner of SU2C Canada. B. Haibe-Kains was supported by the Gattuso Slaight Personalized Cancer Medicine Fund at Princess Margaret Cancer Centre, the Canadian Institute of Health Research, Natural Sciences and Engineering Research Council, and The Terry Fox Research Institute fund. A.S. Mer would like to extend a sincere thank you to Dr. Carmen Ludwig for her help with editing the manuscript. We would like to thank the investigators of the Novartis Patient-Derived Xenograft Encyclopedia who have made their invaluable data available to the scientific community.

1.
Stein
WD
,
Litman
T
,
Fojo
T
,
Bates
SE
. 
A Serial Analysis of Gene Expression (SAGE) database analysis of chemosensitivity: comparing solid tumors with cell lines and comparing solid tumors from different tissue origins
.
Cancer Res
2004
;
64
:
2805
16
.
2.
Porter
SN
,
Baker
LC
,
Mittelman
D
,
Porteus
MH
. 
Lentiviral and targeted cellular barcoding reveals ongoing clonal dynamics of cell lines in vitro and in vivo
.
Genome Biol
2014
;
15
:
R75
.
3.
Begley
CG
,
Ellis
LM
. 
Drug development: raise standards for preclinical cancer research
.
Nature
2012
;
483
:
531
3
.
4.
Bousquet
G
,
Janin
A
. 
Patient-derived xenograft: an adjuvant technology for the treatment of metastatic disease
.
Pathobiology
2016
;
83
:
170
6
.
5.
Caponigro
G
,
Sellers
WR
. 
Advances in the preclinical testing of cancer therapeutic hypotheses
.
Nat Rev Drug Discov
2011
;
10
:
179
87
.
6.
Ben-David
U
,
Siranosian
B
,
Ha
G
,
Tang
H
,
Oren
Y
,
Hinohara
K
, et al
Genetic and transcriptional evolution alters cancer cell line drug response
.
Nature
2018
;
560
:
325
30
.
7.
Johnson
JI
,
Decker
S
,
Zaharevitz
D
,
Rubinstein
LV
,
Venditti
JM
,
Schepartz
S
, et al
Relationships between drug activity in NCI preclinical in vitro and in vivo models and early clinical trials
.
Br J Cancer
2001
;
84
:
1424
31
.
8.
Auman
JT
,
McLeod
HL
. 
Colorectal cancer cell lines lack the molecular heterogeneity of clinical colorectal tumors
.
Clin Colorectal Cancer
2010
;
9
:
40
7
.
9.
Domcke
S
,
Sinha
R
,
Levine
DA
,
Sander
C
,
Schultz
N
. 
Evaluating cell lines as tumour models by comparison of genomic profiles
.
Nat Commun
2013
;
4
:
2126
.
10.
Jung
J
,
Seol
HS
,
Chang
S
. 
The generation and application of patient-derived xenograft model for cancer research
.
Cancer Res Treat
2018
;
50
:
1
10
.
11.
Lai
Y
,
Wei
X
,
Lin
S
,
Qin
L
,
Cheng
L
,
Li
P
. 
Current status and perspectives of patient-derived xenograft models in cancer research
.
J Hematol Oncol
2017
;
10
:
106
.
12.
Wang
Y
,
Lin
D
,
Gout
PW
.
Patient-derived xenograft models of human cancer
.
Cham, Switzerland:
Springer
; 
2017
.
13.
Wang
D
,
Pham
N-A
,
Tong
J
,
Sakashita
S
,
Allo
G
,
Kim
L
, et al
Molecular heterogeneity of non-small cell lung carcinoma patient-derived xenografts closely reflect their primary tumors
.
Int J Cancer
2017
;
140
:
662
73
.
14.
Tignanelli
CJ
,
Herrera Loeza
SG
,
Yeh
JJ
. 
KRAS and PIK3CA mutation frequencies in patient-derived xenograft models of pancreatic and colorectal cancer are reflective of patient tumors and stable across passages
.
Am Surg
2014
;
80
:
873
7
.
15.
Beshiri
ML
,
Tice
CM
,
Tran
C
,
Nguyen
HM
,
Sowalsky
AG
,
Agarwal
S
, et al
A PDX/organoid biobank of advanced prostate cancers captures genomic and phenotypic heterogeneity for disease modeling and therapeutic screening
.
Clin Cancer Res
2018
;
24
:
4332
45
.
16.
Kim
MP
,
Evans
DB
,
Wang
H
,
Abbruzzese
JL
,
Fleming
JB
,
Gallick
GE
. 
Generation of orthotopic and heterotopic human pancreatic cancer xenografts in immunodeficient mice
.
Nat Protoc
2009
;
4
:
1670
80
.
17.
Gao
H
,
Korn
JM
,
Ferretti
S
,
Monahan
JE
,
Wang
Y
,
Singh
M
, et al
High-throughput screening using patient-derived tumor xenografts to predict clinical trial drug response
.
Nat Med
2015
;
21
:
1318
25
.
18.
Conte
N
,
Mason
J
,
Halmagyi
C
,
Neuhauser
S
,
Mosaku
A
,
Begley
DA
, et al
PDX finder: a portal for patient-derived tumor xenograft model discovery [Internet]
.
Nucleic Acids Res.
2019
;
47
:
D1073
9
.
19.
Hidalgo
M
,
Amant
F
,
Biankin
AV
,
Budinská
E
,
Byrne
AT
,
Caldas
C
, et al
Patient-derived xenograft models: an emerging platform for translational cancer research
.
Cancer Discov
2014
;
4
:
998
1013
.
20.
Townsend
EC
,
Murakami
MA
,
Christodoulou
A
,
Christie
AL
,
Köster
J
,
DeSouza
TA
, et al
The public repository of xenografts enables discovery and randomized phase II-like trials in mice
.
Cancer Cell
2016
;
29
:
574
86
.
21.
Meehan
TF
,
Conte
N
,
Goldstein
T
,
Inghirami
G
,
Murakami
MA
,
Brabetz
S
, et al
PDX-MI: minimal information for patient-derived tumor xenograft models
.
Cancer Res
2017
;
77
:
e62
6
.
22.
Irizarry
RA
,
Hobbs
B
,
Collin
F
,
Beazer-Barclay
YD
,
Antonellis
KJ
,
Scherf
U
, et al
Exploration, normalization, and summaries of high density oligonucleotide array probe level data
.
Biostatistics
2003
;
4
:
249
64
.
23.
Laajala
TD
,
Jumppanen
M
,
Huhtaniemi
R
,
Fey
V
,
Kaur
A
,
Knuuttila
M
, et al
Optimized design and analysis of preclinical intervention studies in vivo
.
Sci Rep
2016
;
6
:
30723
.
24.
Stephens
ML
,
Goldberg
AM
,
Rowan
AN
. 
The first forty years of the alternatives approach: refining, reducing, and replacing the use of laboratory animals
; 
2001
[cited 2018 Nov 2]. Available from:
https://animalstudiesrepository.org/sota_2001/6/.
25.
Migliardi
G
,
Sassi
F
,
Torti
D
,
Galimi
F
,
Zanella
ER
,
Buscarino
M
, et al
Inhibition of MEK and PI3K/mTOR suppresses tumor growth but does not cause tumor regression in patient-derived xenografts of RAS-mutant colorectal carcinomas
.
Clin Cancer Res
2012
;
18
:
2515
25
.
26.
Krepler
C
,
Sproesser
K
,
Brafford
P
,
Beqiri
M
,
Garman
B
,
Xiao
M
, et al
A comprehensive patient-derived xenograft collection representing the heterogeneity of melanoma
.
Cell Rep
2017
;
21
:
1953
67
.
27.
Savage
P
,
Blanchet-Cohen
A
,
Revil
T
,
Badescu
D
,
Saleh
SMI
,
Wang
Y-C
, et al
A targetable EGFR-dependent tumor-initiating program in breast cancer
.
Cell Rep
2017
;
21
:
1140
9
.
28.
Tsao
M-S
,
Wu
L
,
Allo
G
,
John
T
,
Li
M
,
Tagawa
T
, et al
Patient-derived xenograft establishment from human malignant pleural mesothelioma
.
Clin Cancer Res
2017
;
23
:
1060
7
.
29.
Fan
L
,
Pepicelli
CV
,
Dibble
CC
,
Catbagan
W
,
Zarycki
JL
,
Laciak
R
, et al
Hedgehog signaling promotes prostate xenograft tumor growth
.
Endocrinology
2004
;
145
:
3961
70
.
30.
Subramanian
A
,
Narayan
R
,
Corsello
SM
,
Peck
DD
,
Natoli
TE
,
Lu
X
, et al
A next generation connectivity map: L1000 platform and the first 1,000,000 profiles
.
Cell
2017
;
171
:
1437
52
.
31.
Fabregat
A
,
Jupe
S
,
Matthews
L
,
Sidiropoulos
K
,
Gillespie
M
,
Garapati
P
, et al
The reactome pathway knowledgebase
.
Nucleic Acids Res
2018
;
46
:
D649
55
.
32.
Chakravarty
D
,
Gao
J
,
Phillips
SM
,
Kundra
R
,
Zhang
H
,
Wang
J
, et al
OncoKB: a precision oncology knowledge base
.
JCO Precis Oncol
2017
.
doi: 10.1200/PO.17.00011
.
33.
Väremo
L
,
Nielsen
J
,
Nookaew
I
. 
Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods
.
Nucleic Acids Res
2013
;
41
:
4378
91
.
34.
Gu
Z
,
Gu
L
,
Eils
R
,
Schlesner
M
,
Brors
B
. 
circlize Implements and enhances circular visualization in R
.
Bioinformatics
2014
;
30
:
2811
2
.
35.
Siolas
D
,
Hannon
GJ
. 
Patient-derived tumor xenografts: transforming clinical samples into mouse models
.
Cancer Res
2013
;
73
:
5315
9
.
36.
Zhang
X
,
Claerhout
S
,
Prat
A
,
Dobrolecki
LE
,
Petrovic
I
,
Lai
Q
, et al
A renewable tissue resource of phenotypically stable, biologically and ethnically diverse, patient-derived human breast cancer xenograft models
.
Cancer Res
2013
;
73
:
4885
97
.
37.
Garman
B
,
Anastopoulos
IN
,
Krepler
C
,
Brafford
P
,
Sproesser
K
,
Jiang
Y
, et al
Genetic and genomic characterization of 462 melanoma patient-derived xenografts, tumor biopsies, and cell lines
.
Cell Rep
2017
;
21
:
1936
52
.
38.
Guilhamon
P
,
Butcher
LM
,
Presneau
N
,
Wilson
GA
,
Feber
A
,
Paul
DS
, et al
Assessment of patient-derived tumour xenografts (PDXs) as a discovery tool for cancer epigenomics
.
Genome Med
2014
;
6
:
116
.
39.
Einarsdottir
BO
,
Bagge
RO
,
Bhadury
J
,
Jespersen
H
,
Mattsson
J
,
Nilsson
LM
, et al
Melanoma patient-derived xenografts accurately model the disease and develop fast enough to guide treatment decisions
.
Oncotarget
2014
;
5
:
9609
18
.
40.
Julien
S
,
Merino-Trigo
A
,
Lacroix
L
,
Pocard
M
,
Goéré
D
,
Mariani
P
, et al
Characterization of a large panel of patient-derived tumor xenografts representing the clinical heterogeneity of human colorectal cancer
.
Clin Cancer Res
2012
;
18
:
5314
28
.
41.
Bertotti
A
,
Migliardi
G
,
Galimi
F
,
Sassi
F
,
Torti
D
,
Isella
C
, et al
A molecularly annotated platform of patient-derived xenografts (“xenopatients”) identifies HER2 as an effective therapeutic target in cetuximab-resistant colorectal cancer
.
Cancer Discov
2011
;
1
:
508
23
.
42.
Reyal
F
,
Guyader
C
,
Decraene
C
,
Lucchesi
C
,
Auger
N
,
Assayag
F
, et al
Molecular profiling of patient-derived breast cancer xenografts
.
Breast Cancer Res
2012
;
14
:
R11
.
43.
Hidalgo
M
,
Amant
F
,
Biankin
AV
,
Budinská
E
,
Byrne
AT
,
Caldas
C
, et al
Patient-derived xenograft models: an emerging platform for translational cancer research
.
Cancer Discov
2014
;
4
:
998
1013
.
44.
Gendoo
DMA
,
Denroche
RE
,
Zhang
A
,
Radulovich
N
,
Jang
GH
,
Lemire
M
, et al
Whole genomes define concordance of matched primary, xenograft, and organoid models of pancreas cancer
.
PLoS Comput Biol
2019
;
15
:
e1006596
.
45.
Ben-David
U
,
Ha
G
,
Tseng
Y-Y
,
Greenwald
NF
,
Oh
C
,
Shih
J
, et al
Patient-derived xenografts undergo mouse-specific tumor evolution
.
Nat Genet
2017
;
49
:
1567
75
.
46.
Rubio-Viqueira
B
,
Jimeno
A
,
Cusatis
G
,
Zhang
X
,
Iacobuzio-Donahue
C
,
Karikari
C
, et al
An in vivo platform for translational drug development in pancreatic cancer
.
Clin Cancer Res
2006
;
12
:
4652
61
.
47.
Liberzon
A
,
Birger
C
,
Thorvaldsdóttir
H
,
Ghandi
M
,
Mesirov
JP
,
Tamayo
P
. 
The molecular signatures database (MSigDB) hallmark gene set collection
.
Cell Syst
2015
;
1
:
417
25
.
48.
Hollerer
I
,
Grund
K
,
Hentze
MW
,
Kulozik
AE
. 
mRNA 3′ end processing: a tale of the tail reaches the clinic
.
EMBO Mol Med
2014
;
6
:
16
26
.
49.
Fu
Y
,
Sun
Y
,
Li
Y
,
Li
J
,
Rao
X
,
Chen
C
, et al
Differential genome-wide profiling of tandem 3′ UTRs among human breast cancer and normal cells by high-throughput sequencing
.
Genome Res
2011
;
21
:
741
7
.
50.
Masamha
CP
,
Xia
Z
,
Yang
J
,
Albrecht
TR
,
Li
M
,
Shyu
A-B
, et al
CFIm25 links alternative polyadenylation to glioblastoma tumour suppression
.
Nature
2014
;
510
:
412
6
.
51.
Flaherty
KT
,
Infante
JR
,
Daud
A
,
Gonzalez
R
,
Kefford
RF
,
Sosman
J
, et al
Combined BRAF and MEK inhibition in melanoma with BRAF V600 mutations
.
N Engl J Med
2012
;
367
:
1694
703
.
52.
Holbro
T
,
Beerli
RR
,
Maurer
F
,
Koziczak
M
,
Barbas
CF
 3rd
,
Hynes
NE
. 
The ErbB2/ErbB3 heterodimer functions as an oncogenic unit: ErbB2 requires ErbB3 to drive breast tumor cell proliferation
.
Proc Natl Acad Sci U S A
2003
;
100
:
8933
8
.
53.
Hynes
NE
,
Lane
HA
. 
ERBB receptors and cancer: the complexity of targeted inhibitors
.
Nat Rev Cancer
2005
;
5
:
341
54
.
54.
Dummer
R
,
Schadendorf
D
,
Ascierto
PA
,
Arance
A
,
Dutriaux
C
,
Di Giacomo
AM
, et al
Binimetinib versus dacarbazine in patients with advanced NRAS-mutant melanoma (NEMO): a multicentre, open-label, randomised, phase 3 trial
.
Lancet Oncol
2017
;
18
:
435
45
.