Abstract
Traditional approaches to evaluating antitumor agents using human tumor xenograft models have generally used cohorts of 8 to 10 mice against a limited panel of tumor models. An alternative approach is to use fewer animals per tumor line, allowing a greater number of models that capture greater molecular/genetic heterogeneity of the cancer type. We retrospectively analyzed 67 agents evaluated by the Pediatric Preclinical Testing Program to determine whether a single mouse, chosen randomly from each group of a study, predicted the median response for groups of mice using 83 xenograft models. The individual tumor response from a randomly chosen mouse was compared with the group median response using established response criteria. A total of 2,134 comparisons were made. The single tumor response accurately predicted the group median response in 1,604 comparisons (75.16%). The mean correct prediction rate across 1,000 random single mouse samples was 78.09%. Correct prediction rates for individual models ranged from 60% to 87.5%. Allowing for misprediction of ± one response category, the overall mean correct single mouse prediction rate was 95.28%, and single mouse data predicted overall objective response rates for group data in 66 of 67 drug studies. For molecularly targeted agents, occasional exceptional responder models were identified and the activity of that agent confirmed in additional models with the same genotype. Assuming that large treatment effects are targeted, this alternative experimental design has predictive value similar to that of traditional approaches, allowing for far greater numbers of models to be used that more fully encompass the heterogeneity of disease types. Cancer Res; 76(19); 5798–809. ©2016 AACR.
Introduction
Preclinical development of anticancer agents has predominantly relied upon murine syngeneic tumor models, and subsequently human tumor xenografts. The NCI screening program used a limited number of xenograft models to represent major forms of adult cancer, such as NSCLC, breast, and colon cancer (1–4). These tumors were established in immune-deficient mice, and standard criteria were developed for agents to meet or surpass to continue clinical development. In general, the pharmaceutical industry adopted the same approach as the NCI, using similar models. However, these models have been criticized for not being representative of human disease (5, 6), and there is a move away from cell line–derived xenografts to patient-derived xenografts (PDX), where tumor models are established directly in mice from patient biopsies (7). Although development of PDX adult cancer and childhood cancer models started several decades ago (8–10), it has more recently been realized that, once plated on plastic, characteristics of tumors may be lost, and that xenografts developed from established cell lines may not accurately represent the original tumor (11, 12).
For many cancers, molecular characterization has identified multiple subtypes with particular characteristics that relate to sensitivity to therapy. An example is responsiveness to trastuzumab in patients with HER2 amplified cancers. Similarly, subtypes of NSCLC with EML4-ALK translocation or mutant EGFR respond to crizotinib and erlotinib, respectively (13–15), whereas these agents have little impact on survival in unselected patients. With increasing “omics” analysis of childhood cancers, subsets have also been identified for ependymoma (16, 17), medulloblastoma (18, 19), neuroblastoma (20–22), sarcomas (23–26), and acute leukemias (27), diseases previously considered rather homogeneous entities. Thus, there is a need to represent more diverse genotypes/phenotypes in the preclinical models. The problem is how to represent such diversity in a screening program that will allow identification of molecular subsets that may be responsive to particular therapeutics.
Within the Pediatric Preclinical Testing Program (PPTP) over one hundred PDX models have been developed, although resources limit evaluation of agents to approximately 50 models in a primary screen. As the screen attempts to identify active agents in multiple cancer types (sarcomas, brain tumors, neuroblastoma, kidney tumors, and acute lymphoblastic leukemia), only limited panels of tumors can be used that almost certainly do not represent the genetic or phenotypic heterogeneity of the clinical disease. One approach to expanding the models to overcome this limitation is to use each mouse as a “patient” (28). In these preclinical trials, each mouse has a tumor derived from a different patient, and is evaluated for tumor progression or regression during therapy. In this way, 30 or more tumors of a given type (e.g., melanoma) can be evaluated with the potential to generate response rates that more closely parallel clinical response rates. With the relative ease of generating new PDX models, the approach of using one mouse to represent a tumor allows a far greater number of tumor lines to be evaluated, and may give better insight into potential response rates in humans or allow identification of subsets of tumors within a histotype that have increased or decreased drug sensitivity. In PPTP studies, examples of agents for which subsets of tumors are hypersensitive to treatment include the MEK inhibitor selumetinib, for which a single tumor with a BRAF mutation responded (29), whereas 45 other models failed to respond; the oncolytic virus NTX-010, for which there was selective sensitivity of neuroblastoma and alveolar rhabdomyosarcoma models (29); and the PARP inhibitor talazoparib, for which xenografts with DNA damage repair defects responded (30).
Even within Ewing sarcoma models with a common etiology as a consequence of the oncogenic EWS/FLI1 fusion, only 5 of 10 xenograft models were highly sensitive to combination treatment with temozolomide and a PARP inhibitor (31) and only 1 of 5 xenograft models was sensitive to the IGF-1R inhibitor SCH 717454 (32).
Although there are many reasons that preclinical results fail to translate to clinical activity (7, 33), the ability to evaluate an agent across a larger panel of models derived from a particular cancer type may be valuable for identifying subsets of tumors that respond to specific therapies, and for obtaining an estimate of likely response rates in unselected patients having that diagnosis. However, the validity of such an approach will depend upon the predictive value of the individual mouse/tumor. To assess the “one mouse” design, we evaluated the ability of a single mouse/tumor chosen at random to predict accurately the “group” response for 67 agents tested in the PPTP. The results suggest that the “single mouse” experimental design accurately predicts group response and, with a few exceptions, predicts the overall response rate across panels of childhood cancers and accurately identifies responsive tumor types.
Materials and Methods
Detailed methods and response criteria used by the PPTP for evaluating new agents (34) are presented in Supplementary Materials and Methods. Briefly, PDX models for solid tumors were derived by subcutaneous transplant of tumor fragments and ALL models were derived by intravenous injection of purified blasts as described previously (35). Cell line–derived xenografts were serially propagated by transplantation of tumor fragments, as for the PDX models. Solid tumors and non-glioblastoma brain tumors were propagated by serial passage of fragments subcutaneously into CB17SC (scid−/−) female mice (Taconic Farms, Germantown, NY); glioblastomas were transplanted into BALB/c nu/nu mice (36). Leukemia models were propagated by intravenous inoculation in female non-obese diabetic (NOD)/scid−/− mice (37). Tumor volumes (cm³) or percentages of human CD45-positive cells (ALL xenografts) were measured for each tumor at the initiation of the study and weekly for up to 42 days after study initiation.
The xenograft models included 11 kidney tumors (Wilms, rhabdoid), 14 soft tissue sarcomas (Ewing/rhabdomyosarcoma), 9 non-glioblastoma brain tumors (medulloblastoma, ependymoma, PNET, glioma), 6 glioblastomas, 8 neuroblastomas, 7 osteosarcomas, and 28 acute lymphoblastic leukemias (ALL). Pathologic (34) and molecular characteristics (38, 39) for many of these xenografts have been reported. Solid tumors were grown subcutaneously, whereas ALL models were disseminated disease following intravenous inoculation. Of the 83 models, 62 (75%) were derived from direct transplantation of patient tumor into mice, and maintained by serial passage in mice. For solid tumors, 10 mice per group were used; for ALL models, 8 mice per group were used. Growth curves for tumors in each mouse were derived weekly until the tumor reached the endpoint criteria (4-fold its volume at initiation of treatment). The PPTP response criteria have been previously described (34), and are presented in detail in the Supplementary Materials and Methods. Briefly, each of the 8 to 10 mice in a treatment group is assigned one of 6 response scores based on the effect of the treatment on their tumor: progressive disease without growth delay (PD1 = 0), progressive disease with growth delay (PD2 = 2), stable disease (SD = 4), partial response (PR = 6), complete response (CR = 8), and maintained CR (MCR = 10). The objective response for the treatment group is based on the median of the objective response scores for the individual mice within the group. Treatment groups with PR, CR, or MCR are considered to have an objective response.
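The scoring scheme described above can be illustrated with a minimal sketch. The scores are those listed in the text; the example group is hypothetical, and because the rule for mapping an in-between median (e.g., 5) back to a category is given only in the Supplementary Materials and Methods, the sketch simply assumes that a group has an objective response when its median score is at least the PR score.

```python
from statistics import median

# Response categories and scores as listed in the text.
SCORES = {"PD1": 0, "PD2": 2, "SD": 4, "PR": 6, "CR": 8, "MCR": 10}


def group_median_score(responses):
    """Median of the per-mouse response scores for one treatment group."""
    return median(SCORES[r] for r in responses)


def has_objective_response(responses):
    """Assumption: objective response means a median score of PR or better."""
    return group_median_score(responses) >= SCORES["PR"]


# Hypothetical solid tumor group of 10 mice (illustrative, not study data).
group = ["PR", "PR", "CR", "SD", "PR", "CR", "PR", "PD2", "PR", "CR"]
print(group_median_score(group), has_objective_response(group))
```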
The response to treatment was scored for each tumor in the group (PD1–MCR), and the Group Median Response was determined. In the initial analysis, one tumor in the treatment group was selected using a random number–generating routine (1–10 for solid tumors, 1–8 for ALL studies). The response of the randomly selected tumor was then compared with the Group Median Response. This allowed the deviation (the number of response categories, + or −, between the single mouse result and the group median response) to be calculated. The analysis involved 2,106 tumor/drug comparisons.
As a more robust analysis, we repeated the random selection of one mouse/tumor from each treatment group of each tumor line 1,000 times, using a slightly larger database (2,134 treatment group comparisons). In each iteration, the response of the randomly chosen mouse/tumor was compared with the median group tumor response, and we calculated the average number of times, across the 1,000 random samples, that the single mouse response was the same as the median group response (mean number of correct predictions). The percentage of correct predictions per tumor line is derived from the mean number of correct predictions from the random mouse/tumor across all treatment groups for that tumor line, irrespective of the drug tested.
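The resampling procedure can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the analysis: the function name, the seed, and the two example treatment groups are hypothetical, and the group median response is supplied directly as a category label.

```python
import random


def mean_correct_predictions(groups, n_iter=1000, seed=0):
    """Mean number of groups whose median is matched by one random mouse.

    groups: list of (mouse_responses, group_median_response) pairs.
    In each iteration, one mouse is drawn at random from every group and
    its response is compared with that group's median response.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total_correct = 0
    for _ in range(n_iter):
        total_correct += sum(rng.choice(mice) == med for mice, med in groups)
    return total_correct / n_iter


# Two hypothetical treatment groups for one tumor line (not study data).
groups = [
    (["PD1"] * 10, "PD1"),            # every mouse matches the median
    (["PR"] * 8 + ["SD"] * 2, "PR"),  # 8 of 10 mice match the median
]
mean_correct = mean_correct_predictions(groups)
print(mean_correct, mean_correct / len(groups))
```

With these made-up groups the expected mean number of correct predictions is 1.8 of 2, so the printed proportion should be close to 0.9.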
Cell lines and xenografts
All cell lines, other than those generated from PDX models, were obtained from the ATCC. Patient-derived xenografts and cell lines were deposited in a Master Bank at the initiation of the PPTP project, and characterized by short tandem repeat (STR) analysis. Periodically during the PPTP, lines were verified as authentic by STR analysis.
Statistical analysis
For the analysis, a single mouse prediction was considered to be correct when it was equal to the median tumor response of the group responses as described in Supplementary Materials and Methods. The four best and four worst predictive models were selected on the basis of the percentage of correct responses among all of the studies. The Wilcoxon rank-sum test was used to compare percentages between the models. Regression analysis was used to study the association between single mouse predicted objective response and group mouse objective response.
Statistical analyses were done using SAS 9.3 and R. P values less than 0.05 were considered statistically significant.
Results
Criteria used to assess tumor response are defined in Materials and Methods and in more detail in the Supplementary Materials and Methods and have been published previously (34). The studies included a wide range of agents, including molecularly targeted agents such as tyrosine kinase inhibitors, antibody–drug conjugates, standard cytotoxic agents, ligand-binding antibodies, growth factor receptor–binding antibodies and small molecules, as well as two small-molecule agents with unknown mechanism of action (Supplementary Table S1 with references to published studies). Agents of greatest interest are those that induce tumor regression (i.e., objective responses).
In our initial analysis of PPTP datasets, results for 82 tumor models, comprising 2,106 comparisons, were examined (Supplementary Table S2) and indicated a correct prediction rate for single mouse data of 79.3%. If a deviation of +1 or −1 response category was used as an acceptable level of error (PD1 vs. PD2, or MCR vs. CR, etc.), the prediction accuracy increased to 95.6%.
To further test the accuracy with which a randomly chosen single mouse predicts the group median response, we analyzed a slightly larger dataset that included a total of 83 xenograft models and 2,134 treatment groups. Some of these tests involved screening of an agent against many tumor models (>40), whereas other tests used a focused testing approach (i.e., evaluated only against specific models, e.g., selected ALL xenografts, or tumor models having a specific genetic characteristic). The “test” mouse was chosen at random from each treatment group of a tumor line. The response of the tumor in that individual mouse was compared with the treatment group's response (median tumor response). This process was repeated 1,000 times for the whole dataset (2,134 groups). Tumor models, number of observations, and accuracy of prediction are presented in Table 1. To quantify the accuracy, we used the PPTP response classifications. If the single mouse predicted response accurately (e.g., the single mouse response was SD and the group response was also SD), the result was scored as 0. If the single mouse overpredicted response by one category, the score was +1; if it underpredicted by one response category, the score was −1. Thus, the maximum overprediction would be MCR when the group response was PD1 (+5), and the maximum underprediction would be PD1 when the group response was MCR (−5). Overall, the single mouse data predicted the group response with 78.09% accuracy (score 0), based on 2,134 treatment groups (Fig. 1A). If an error of +1 or −1 response category was used as an acceptable level of error, the prediction accuracy increased to 95.49% (Table 2). Gross overprediction (+5) and gross underprediction (−5) each occurred in 3 of 2,134 comparisons (0.14%; Table 2).
Tumor code | Tumor line | Histology | Mean number incorrect | Mean number correct | Total no. of studies | Proportion of correct response |
---|---|---|---|---|---|---|
A1 | BT-29 | Kidney ATRT | 11.554 | 30.571 | 42 | 0.725 |
A2 | KT-10 | Wilms tumor | 7.769 | 39.050 | 47 | 0.838 |
A3 | KT-11 | Wilms tumor | 7.557 | 30.447 | 38 | 0.802 |
A4 | KT-13 | Wilms tumor | 11.236 | 35.833 | 47 | 0.761 |
A5 | KT-16 | Kidney ATRT | 6.378 | 12.652 | 19 | 0.661 |
A6a | SK-NEP-1 | Ewing sarcoma | 8.606 | 43.378 | 52 | 0.832 |
A7 | KT-12 | Kidney ATRT | 8.475 | 19.466 | 28 | 0.697 |
A8 | KT-14 | Kidney ATRT | 9.851 | 30.094 | 40 | 0.751 |
A9 | KT-5 | Wilms tumor | 0.719 | 2.265 | 3 | 0.756 |
A10 | WT-8 | Wilms tumor | 0.000 | 1.000 | 1 | 1.000 |
A11 | WT-6 | Wilms tumor | 0.352 | 0.671 | 1 | 0.695 |
B1 | Rh30R | Alveolar rhabdomyosarcoma | 10.855 | 36.238 | 47 | 0.770 |
B2 | EW-5 | Ewing sarcoma | 10.810 | 39.357 | 50 | 0.789 |
B3a | EW-8 | Ewing sarcoma | 7.050 | 38.912 | 46 | 0.845 |
B4 | Rh10 | Alveolar rhabdomyosarcoma | 7.834 | 29.289 | 37 | 0.789 |
B5a | Rh18 | Embryonal rhabdomyosarcoma | 12.248 | 35.619 | 48 | 0.744 |
B6 | Rh26 | Alveolar rhabdomyosarcoma | 9.971 | 36.111 | 46 | 0.779 |
B7 | Rh30 | Alveolar rhabdomyosarcoma | 10.346 | 38.678 | 49 | 0.788 |
B8 | Rh41 | Alveolar rhabdomyosarcoma | 8.765 | 39.157 | 48 | 0.816 |
B9 | Rh36 | Embryonal rhabdomyosarcoma | 3.475 | 8.569 | 12 | 0.710 |
B10 | Rh65 | Alveolar rhabdomyosarcoma | 1.175 | 6.752 | 8 | 0.849 |
B11a | TC-71 | Ewing sarcoma | 8.717 | 37.173 | 46 | 0.811 |
B12a | CHLA258 | Ewing sarcoma | 7.905 | 38.265 | 46 | 0.831 |
B13 | Rh66 | Alveolar rhabdomyosarcoma | 0.465 | 1.523 | 2 | 0.753 |
B15a | ES-6 | Ewing sarcoma | 0.404 | 0.592 | 1 | 0.601 |
C1 | BT-28 | Medulloblastoma | 8.082 | 37.913 | 46 | 0.825 |
C4 | BT-45 | Medulloblastoma | 6.594 | 32.477 | 39 | 0.831 |
C5 | BT-36 | Ependymoma | 6.774 | 10.216 | 17 | 0.600 |
C6 | BT-41 | Ependymoma | 8.293 | 12.713 | 21 | 0.600 |
C7 | BT-46 | Medulloblastoma | 3.013 | 7.950 | 11 | 0.72 |
C8 | BT-50 | Medulloblastoma | 8.578 | 26.411 | 35 | 0.755 |
C9 | BT-44 | Ependymoma | 6.501 | 21.697 | 28 | 0.770 |
C11 | BT-35 | Glioma | 0.229 | 0.784 | 1 | 0.765 |
C12 | BT-40 | Glioma | 0.000 | 2.000 | 2 | 1.000 |
D1a | GBM2 | Glioblastoma | 14.266 | 32.727 | 47 | 0.699 |
D2 | BT-39 | Glioblastoma | 9.858 | 39.063 | 49 | 0.800 |
D3a | D645 | Glioblastoma | 9.405 | 33.641 | 43 | 0.784 |
D4a | D456 | Glioblastoma | 10.554 | 33.423 | 44 | 0.761 |
D5 | BT-56 | Glioblastoma | 1.208 | 1.792 | 3 | 0.594 |
D6a | D212 | Glioblastoma | 0.525 | 1.516 | 2 | 0.749 |
E1a | NB-SD | Neuroblastoma | 10.353 | 30.541 | 41 | 0.747 |
E2a | NB-1771 | Neuroblastoma | 10.691 | 34.444 | 45 | 0.764 |
E3a | NB-1691 | Neuroblastoma | 8.737 | 39.124 | 48 | 0.817 |
E4a | NB-EBc1 | Neuroblastoma | 10.710 | 37.338 | 48 | 0.737 |
E5a | CHLA79 | Neuroblastoma | 8.302 | 28.678 | 37 | 0.776 |
E6a | NB1643 | Neuroblastoma | 10.274 | 34.761 | 45 | 0.773 |
E7a | NB1382 | Neuroblastoma | 0.198 | 1.822 | 2 | 0.909 |
E9a | SK-NAS | Neuroblastoma | 1.238 | 5.727 | 7 | 0.813 |
F1 | OS-1 | Osteosarcoma | 10.714 | 40.301 | 51 | 0.791 |
F2 | OS-2 | Osteosarcoma | 5.972 | 42.230 | 48 | 0.875 |
F3 | OS-17 | Osteosarcoma | 8.439 | 38.475 | 47 | 0.819 |
F9 | OS-9 | Osteosarcoma | 7.911 | 37.228 | 45 | 0.822 |
F10 | OS-33 | Osteosarcoma | 6.804 | 43.225 | 50 | 0.862 |
F11 | OS-31 | Osteosarcoma | 6.961 | 41.013 | 48 | 0.856 |
F12 | OS-29 | Osteosarcoma | 0.000 | 1.00 | 1 | 1.000 |
G1 | ALL-2 | Primary; B-precursor | 10.252 | 32.945 | 43 | 0.766 |
G2 | ALL-3 | Primary; B-precursor | 11.956 | 21.008 | 33 | 0.640 |
G3 | ALL-4 | Primary; B-precursor; Ph+(BCR-ABL) | 8.470 | 37.410 | 46 | 0.814 |
G4 | ALL-7 | Primary; B-precursor | 8.934 | 31.202 | 40 | 0.776 |
G5 | ALL-8 | Primary; B-precursor | 10.132 | 35.888 | 45 | 0.785 |
G6 | ALL-16 | Primary; T-cell ALL | 12.663 | 20.272 | 33 | 0.615 |
G7 | ALL-17 | Primary; B-precursor | 8.569 | 36.319 | 45 | 0.810 |
G8 | ALL-19 | Primary; B-precursor | 11.542 | 32.449 | 44 | 0.738 |
G9 | ALL-10 | Primary; B-precursor | 0.358 | 1.638 | 2 | 0.807 |
G11 | ALL-27 | Tertiary, T-cell ALL | 0.442 | 0.580 | 1 | 0.549 |
G12 | ALL-29 | Tertiary, T-cell ALL | 0.435 | 0.585 | 1 | 0.597 |
G13 | ALL-30 | Tertiary, T-cell ALL | 0.359 | 0.630 | 1 | 0.645 |
G14 | ALL-31 | Tertiary, T-cell ALL | 2.652 | 7.411 | 10 | 0.742 |
G15 | MLL-7 | Tertiary, infant, precursor B-ALL | 2.416 | 8.554 | 13 | 0.778 |
G19 | MLL-2 | Tertiary, infant, precursor B-ALL | 0.411 | 0.591 | 1 | 0.562 |
G20 | MLL-3 | Infant BCP-ALL | 0.261 | 1.745 | 2 | 0.874 |
G21 | MLL-5 | Tertiary, infant, precursor B-ALL | 0.491 | 1.444 | 2 | 0.750 |
G22 | MLL-6 | Infant BCP-ALL | 0.124 | 0.877 | 1 | 0.875 |
G23 | MLL-8 | Infant BCP-ALL | 0.482 | 0.520 | 1 | 0.546 |
G24 | MLL-14 | Infant BCP-ALL | 0.342 | 1.667 | 2 | 0.829 |
G25 | TGT-020 | ALL JAK2 R683G | 0.000 | 3.000 | 3 | 1.000 |
G27 | TGT-047 | ALL JAK2 R683G | 0.243 | 1.748 | 2 | 0.862 |
G28 | TGT-052 | ALL JAK1 V658F | 0.000 | 1.000 | 1 | 1.000 |
G29 | TGT-144 | ALL JAK2 | 0 | 1 | 1 | 1 |
G30 | TGT-174 | ALL JAK2 P933R | 0.840 | 1.140 | 2 | 0.579 |
H18a | RS4;11 | B-Lineage, monocytic | 0.115 | 0.865 | 1 | 0.881 |
J1a | KARPAS299 | T-cell lymphoma | 0.000 | 2.000 | 2 | 1.000 |
J2a | MV4;11 | Acute monocytic | 0.204 | 1.783 | 2 | 0.895 |
aCell line–derived xenografts.
Direction | Deviationa | Number of predictions | Percentage |
---|---|---|---|
Underprediction | −5 | 3 | 0.14 |
Underprediction | −4 | 13 | 0.61 |
Underprediction | −3 | 15 | 0.70 |
Underprediction | −2 | 31 | 1.45 |
Underprediction | −1 | 194 | 9.09 |
Accurate prediction | 0 | 1,718 | 80.50 |
Overprediction | 1 | 126 | 5.90 |
Overprediction | 2 | 20 | 0.94 |
Overprediction | 3 | 7 | 0.33 |
Overprediction | 4 | 4 | 0.19 |
Overprediction | 5 | 3 | 0.14 |
Total | | 2,134 | 100 |
aThe deviation represents the distance between the treatment group objective response (PD1, PD2, SD, PR, CR, or MCR) and the response observed in the randomly selected single mouse. Negative values represent underprediction (the single mouse shows an inferior response to the treatment group response) and positive values represent overprediction (the single mouse shows a superior response to the treatment group response). Percentages represent the rates at which there was overprediction or underprediction by the specified deviation value.
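The deviation score defined in this footnote can be computed by ranking the six response categories and subtracting. The sketch below is illustrative only; the function names and the example pairs are hypothetical and are not data from the study.

```python
from collections import Counter

# Response categories ordered from worst to best, as in the text.
ORDER = ["PD1", "PD2", "SD", "PR", "CR", "MCR"]
RANK = {name: i for i, name in enumerate(ORDER)}


def deviation(single, group):
    """Signed category distance: positive = overprediction by the single mouse."""
    return RANK[single] - RANK[group]


def tally(pairs):
    """Count and percentage of comparisons at each deviation from -5 to +5."""
    counts = Counter(deviation(s, g) for s, g in pairs)
    n = len(pairs)
    return {d: (counts[d], 100.0 * counts[d] / n) for d in range(-5, 6)}


def within_one(pairs):
    """Percentage of comparisons with a deviation of -1, 0, or +1."""
    hits = sum(abs(deviation(s, g)) <= 1 for s, g in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical (single mouse, group median) pairs.
pairs = [("SD", "SD"), ("PR", "SD"), ("PD1", "SD"), ("MCR", "PD1"), ("CR", "MCR")]
print(tally(pairs)[0], within_one(pairs))
```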
Because the methodology for assessing model response is different for the disseminated leukemia models (measurement of human CD45-positive cells in peripheral blood), we also analyzed the ALL data separately from the solid tumors. There were 375 treatment groups. The mean of 1,000 random samples of “single mouse” data demonstrated that the single mouse results predicted the group response accurately 75.33% of the time. If a deviation of ± one response category was used, the “success” rate increased to 94.30% (Supplementary Fig. S1).
We also identified the four best and four worst tumor models in terms of accuracy of response prediction. The four models having the highest proportions of accurate predictions were B10, F11, F10, and F2, with correct prediction rates of 84.9%, 85.6%, 86.2%, and 87.5%, respectively (see Table 1 for tumor line identity and type). The four worst tumor models estimated from the mean response rates were C5, C6, G2, and G6, with accurate predictions of 60.0%, 60.0%, 64.0%, and 61.5%, respectively. The P value from the Wilcoxon rank-sum test was 0.0294, which indicates that the percentage of correct predictions for the best tumor models is significantly higher than that for the worst tumor models. Histograms showing the best and worst models are shown in Fig. 1B and C. One possible explanation for lower prediction from a single mouse is that growth rates or responses were far less consistent in the tumor models with poor predictive value than in those models with higher prediction success. We analyzed the consistency of responses for the four worst tumor models (C5, C6, G2, and G6), compared with the four best tumor models (B10, F2, F10, and F11; Supplementary Table S3). The means of variances of response for the four worst tumor models (range, 3.15–7.3) were much larger than the means of variances of response for the four best tumor models (range, 1.39–1.74; P = 0.03). Thus, inconsistency of the responses within a treatment group reduced the predictive value of the single mouse response evaluation. Specific examples that illustrate deviation of the “single mouse” response relative to the group median response are shown in Fig. 2. Analysis of exomic mutations in the four “best” and three of the four “worst” models showed similarly low mutation frequencies in tumors with good predictive value (range, 2–8 mutations/model) and in those with poorer predictive value (range, 2–7 mutations/model; Supplementary Table S4).
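For readers who wish to check the comparison of the best and worst models, the following is a minimal standard-library sketch of a two-sided Wilcoxon rank-sum test using the proportions from Table 1. It assumes the normal approximation with midranks for ties, a tie-corrected variance, and a continuity correction; whether the original analysis in SAS or R used exactly these options is an assumption, although under them the sketch gives a P value close to the reported 0.0294.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (w, p), where w is the sum of the ranks of x in the pooled sample.
    Assumes the pooled values are not all identical (variance would be zero).
    """
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and pooled[end + 1][0] == pooled[start][0]:
            end += 1
        midrank = (start + end) / 2 + 1  # mean of 1-based ranks in the tie block
        for k in range(start, end + 1):
            ranks[pooled[k][1]] = midrank
        t = end - start + 1
        tie_term += t**3 - t
        start = end + 1
    w = sum(ranks[:n1])
    mean = n1 * (n + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = max(abs(w - mean) - 0.5, 0.0) / math.sqrt(var)  # continuity correction
    return w, math.erfc(z / math.sqrt(2))  # two-sided normal tail area


best = [0.849, 0.856, 0.862, 0.875]   # B10, F11, F10, F2 (Table 1)
worst = [0.600, 0.600, 0.640, 0.615]  # C5, C6, G2, G6 (Table 1)
w, p = rank_sum_test(best, worst)
print(w, round(p, 4))
```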
One approach to assessing the validity of “single mouse” predictions is to determine the effect of “false positive” and “false negative” predictions. For example, would data from the single mouse have predicted significant activity for an agent that was not active based on the median response data for the group (i.e., false positive)? To examine this, we first analyzed the number of objective responses (i.e., response of PR, CR, or MCR) within the group-derived dataset. In the 2,134 response determinations, there were 318 objective responses. We next analyzed whether the differences between predictions of the single mouse and that for the group median response altered the “outcome” (i.e., number of models where response was at least PR). Single mouse data predicted an objective response when it was not present with a frequency of 14.03% and failed to identify an objective response when it was present with a frequency of 7.78%. Considering only the 318 treatment groups with objective responses (determined by group median response), the single mouse approach predicted the group objective response correctly 73.0% of the time. For the 1,816 treatment groups with less than objective response, the single mouse predicted correctly 79.1% of the time. Applying a higher response standard, there were 250 treatment groups with CR or MCR among the 2,134 response determinations. Considering these 250 treatment groups with CR/MCR responses, the single mouse predicted the group response correctly 78.2% of the time.
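The false positive and false negative bookkeeping described above can be sketched as a two-by-two tabulation. This is a hedged illustration: it assumes that agreement means the single mouse and the group median both fall in (or both fall outside) the objective response categories, which may differ from the exact-category matching used for some of the percentages in the text. The function name and example pairs are hypothetical.

```python
# Categories counted as objective responses in the text.
OBJECTIVE = {"PR", "CR", "MCR"}


def objective_response_agreement(pairs):
    """Tabulate single mouse vs. group median objective response calls.

    pairs: iterable of (single_mouse_response, group_median_response).
    """
    tp = fp = fn = tn = 0
    for single, group in pairs:
        s, g = single in OBJECTIVE, group in OBJECTIVE
        if s and g:
            tp += 1   # both show an objective response
        elif s:
            fp += 1   # single mouse overcalls ("false positive")
        elif g:
            fn += 1   # single mouse misses a group response ("false negative")
        else:
            tn += 1   # neither shows an objective response

    def rate(num, den):
        return num / den if den else float("nan")

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "agreement_in_responders": rate(tp, tp + fn),
        "agreement_in_nonresponders": rate(tn, tn + fp),
    }


# Hypothetical (single mouse, group median) pairs, not study data.
pairs = [("PR", "PR"), ("SD", "PR"), ("CR", "CR"),
         ("PD1", "PD1"), ("PR", "SD"), ("PD2", "PD2")]
result = objective_response_agreement(pairs)
print(result)
```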
Each drug study analyzed included a range of tumor models (median 41, range 1 to 55, depending on the experimental objective). Single mouse results and group median results were highly correlated in both the initial analysis and the response rate from 1,000 random comparisons (r² = 0.99, Fig. 3). In most studies, the over- and underpredictions were minor and did not alter the outcome (objective response rate; Table 3 and Supplementary Table S2). As shown in Fig. 3, the single mouse data correlated well with the activity levels of each drug. The two outliers (studies 0614 and 1008; 20% vs. 40% and 50% vs. 0%) occurred in our initial analysis of the single mouse data. These experiments used 5 and 2 tumor models, respectively. This discrepancy was not apparent when 1,000 random iterations were performed.
ORRa | Study number | Study agent | Number of studies | Single mouse ORR (%) | Group median ORR (%) |
---|---|---|---|---|---|
0%–20% | 502 | Bortezomib | 7 | 18.421 | 10.526 |
0%–20% | 504 | 17-DMAG | 39 | 13.310 | 12.821 |
0%–20% | 506 | BMS-354825 | 42 | 11.490 | 7.143 |
0%–20% | 507 | AZD-2171 | 39 | 10.269 | 5.128 |
0%–20% | 601 | SU11248 | 44 | 7.370 | 6.818 |
0%–20% | 602 | Rapamycin | 45 | 15.376 | 13.333 |
0%–20% | 603 | Lapatinib | 41 | 1.254 | 0.000 |
0%–20% | 604 | ABT-263 | 43 | 9.805 | 9.302 |
0%–20% | 605 | 19D12 | 43 | 8.002 | 6.977 |
0%–20% | 607 | SAHA | 41 | 3.115 | 0.000 |
0%–20% | 613 | Cytarabine | 6 | 2.033 | 0.000 |
0%–20% | 614 | Chloretazine | 5 | 28.120 | 20.000 |
0%–20% | 702 | HGS-ETR1 | 45 | 0.411 | 0.000 |
0%–20% | 703 | GSK690693 | 46 | 2.930 | 2.174 |
0%–20% | 704 | Aplidin | 42 | 4.583 | 2.381 |
0%–20% | 706 | Sorafenib | 44 | 2.950 | 0.000 |
0%–20% | 708 | IMC-A12 | 35 | 7.449 | 2.857 |
0%–20% | 801 | CGC(PG)-11047 | 41 | 5.722 | 2.439 |
0%–20% | 803 | BMS-754807 | 44 | 1.609 | 0.000 |
0%–20% | 805 | AZD8055 | 45 | 1.698 | 0.000 |
0%–20% | 806 | JNJ26854165 | 45 | 14.456 | 13.333 |
0%–20% | 807 | SCH 727965 | 43 | 6.793 | 4.651 |
0%–20% | 808 | MLN4924 | 44 | 4.707 | 0.000 |
0%–20% | 814 | Pazopanib | 7 | 0.000 | 0.000 |
0%–20% | 901 | AT13387 | 42 | 1.131 | 0.000 |
0%–20% | 902 | LCL161 | 46 | 4.246 | 2.174 |
0%–20% | 903 | Lenalidomide | 45 | 2.931 | 0.000 |
0%–20% | 905 | JNJ26481585 | 41 | 9.207 | 7.317 |
0%–20% | 1000 | Trisenox(ATO) | 5 | 0.000 | 0.000 |
0%–20% | 1001 | RO4929097 | 34 | 0.000 | 0.000 |
0%–20% | 1002 | SGI-1776 | 39 | 0.551 | 0.000 |
0%–20% | 1003 | MK-2206 | 38 | 1.774 | 0.000 |
0%–20% | 1006 | TAK-701 | 6 | 0.000 | 0.000 |
0%–20% | 1007 | XL147 | 39 | 4.328 | 2.564 |
0%–20% | 1008 | XL765 | 2 | 4.950 | 0.000 |
0%–20% | 1010 | BAL101553 | 38 | 0.295 | 0.000 |
0%–20% | 1101 | AZD1480 | 50 | 7.830 | 6.000 |
0%–20% | 1102 | PF-03084014 | 44 | 2.527 | 0.000 |
0%–20% | 1103 | INK128-1110-028 | 38 | 3.129 | 0.000 |
0%–20% | 1104 | Ganetespib | 11 | 0.000 | 0.000 |
0%–20% | 1105 | Pixantrone | 8 | 9.738 | 12.500 |
0%–20% | 1108 | XL765 | 6 | 0.000 | 0.000 |
0%–20% | 1109 | PCI-32765 | 7 | 0.000 | 0.000 |
0%–20% | 1112 | Cabozantinib (XL-148) | 34 | 7.706 | 5.882 |
0%–20% | 1113 | KPT330 | 45 | 12.891 | 11.111 |
0%–20% | 1201 | CX-5461 | 44 | 8.177 | 6.818 |
0%–20% | 1207 | NSC060043 | 1 | 0.000 | 0.000 |
21%–40% | 501 | Vincristine | 47 | 40.774 | 38.298 |
21%–40% | 505 | Cisplatin | 46 | 23.593 | 23.913 |
21%–40% | 606 | Topotecan | 45 | 33.284 | 33.333 |
21%–40% | 701 | MLN8237 | 45 | 42.396 | 40.000 |
21%–40% | 804 | GSK923295A | 38 | 42.939 | 39.474 |
21%–40% | 1004 | IMGN901 | 25 | 40.616 | 40.000 |
21%–40% | 1005 | RG7112 | 45 | 45.211 | 40.000 |
21%–40% | 1011 | BI6727 (Volasertib) | 41 | 23.834 | 24.390 |
21%–40% | 1107 | TL32711 | 6 | 40.300 | 33.333 |
21%–40% | 1110 | NSC750854 | 30 | 41.090 | 40.000 |
21%–40% | 1203 | Glembatumumab | 8 | 37.550 | 37.500 |
41%–60% | 707 | SAR3419 | 12 | 47.883 | 50.000 |
41%–60% | 813 | GENZ-644282 | 17 | 37.318 | 41.176 |
41%–60% | 904 | Temozolomide | 5 | 60.000 | 60.000 |
41%–60% | 1111 | Cabazitaxel | 10 | 60.420 | 50.000 |
41%–60% | 1202 | Abraxane | 7 | 42.857 | 42.857 |
61%–80% | 503 | Cyclophosphamide | 47 | 67.038 | 63.830 |
61%–80% | 802 | PR-104 | 42 | 70.967 | 69.048 |
61%–80% | 1106 | Eribulin | 43 | 60.116 | 60.465 |
81%–100% | 913 | CPX351 | 5 | 100.000 | 100.000 |
aORR, objective response rate = number of models with (PR, CR, and MCR)/number of models tested with that agent.
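The ORR definition in the table footnote translates directly into code. The following is a minimal sketch with a hypothetical function name and made-up response categories; the only element taken from the text is the definition itself (models with PR, CR, or MCR divided by models tested).

```python
# Categories counted as objective responses, per the table footnote.
OBJECTIVE = {"PR", "CR", "MCR"}

def orr_percent(model_responses):
    """Objective response rate (%) for one agent.

    model_responses: one response category per tumor model tested.
    """
    if not model_responses:
        raise ValueError("at least one model is required")
    n_objective = sum(r in OBJECTIVE for r in model_responses)
    return 100.0 * n_objective / len(model_responses)

# Made-up example: 1 objective response among 8 models.
print(orr_percent(["PR", "SD", "PD1", "PD2", "SD", "PD1", "PD2", "SD"]))  # 12.5
```

With few models per study, a single discordant call moves the ORR by a large step (for example, 20 percentage points when 5 models are tested), which is why small studies are the most sensitive to single mouse mispredictions.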
We analyzed the dataset to see whether single mouse data identified the same tumor types as sensitive as did the group-derived data. Even when analyzing the “worst” four studies for molecularly targeted agents, the single mouse result appears to identify responsive tumor types (Fig. 4). Single mouse results would have overpredicted activity of a VEGFR2 inhibitor in neuroblastoma (AZD2171; study 0507), and would have missed the potential signal for activity both in non-glioblastoma brain tumors (AZD8055; study 0805) and ALL (AZD1480; study 1101). However, in both of these studies, only a single brain tumor (n = 5) and one ALL model (n = 10) responded (group data), suggesting limited activity for these agents against these cancer histologies.
The assumption made is that by simulating the clinical heterogeneity of tumor types in the preclinical model, one may more readily identify tumor subtypes that are responsive to particular agents. We retrospectively analyzed responses to 67 agents across all models to see whether this assumption was valid. Over 100 of the xenograft models in the PPTP have been characterized by exome sequencing, expression profiling, or both techniques (38, 39 and unpublished data). The “omic” characteristics of “Exceptional Responders” were identified. For example, single responsive xenograft models identified dasatinib (Ph-positive ALL; ref. 40), sunitinib (Flt3-activated ALL; ref. 41), selumetinib [AZD6244; BRAF(V600E) astrocytoma; ref. 29], and the MDM2 inhibitor RG7112 (infant MLL; ref. 42). RG7112 was tested in a larger cohort of infant MLL and showed marked activity (43) against all infant-derived models. T-ALL models were hypersensitive to the pre-prodrug PR-104, and similarly, further testing of PR-104 revealed activity against T-ALL that express high levels of the PR-104–activating enzyme aldo-keto reductase 1C3 (AKR1C3; ref. 44). For antibodies that block ligand binding to the Type 1 insulin-like growth factor receptor (IGF1R), the only meaningful activity was identified in sarcoma models at relatively low frequency [Ewing sarcoma (1 of 5), rhabdomyosarcomas (1 of 5), and osteosarcomas (2 of 6; refs. 32, 45)], consistent with clinical data in sarcoma (46).
Discussion
The concept of designing preclinical “patient” trials to evaluate new agents has not been validated, although this design opens the opportunity to incorporate additional models, derived from individual patients, that will more closely represent the heterogeneity of clinical cancer (28). Clearly, in these rare cancers of childhood, “omics” analysis is providing information on subgroups and potentially on therapy options for treatment. However, there are few preclinical models that represent these subgroups to allow interrogation of their drug responsiveness. Furthermore, using “traditional” study designs, the resources required to simulate these diverse genetic subgroups would be prohibitive. Indeed, if the preclinical “patient” trials approach is valid, then it may be possible to gain some estimate of likely clinical objective response rates at an early stage in development of an agent. Another advantage of using a greater number of models to represent the clinical disease is that one may identify a subset of highly sensitive tumors (“exceptional responders”) that would be valuable in developing a robust biomarker for subsequent patient selection (30).
We tested a range of drugs from standard chemotherapeutics to molecularly targeted agents, including antibodies and small molecules. Thus, the results are independent of drug class or mechanism of action. Overall, the single mouse data for all tumor models accurately predicted group response in 1,604 comparisons (75.16%). Allowing for a deviation of ± 1 response classification, the correct prediction rate increased to 95.28%. Analysis of the ALL models showed that the single mouse accurately predicted response in 75.33% of studies, and this increased to 94.49% with allowance of ± one response classification. There was no difference in prediction accuracy between xenograft models derived from cell lines and those established by direct transplant of tumor into mice (P = 0.396).
The single mouse design identifies agents that induce tumor regression in specific models, and single mouse data accurately predict the objective responses obtained using a larger number of mice per group (n = 8–10). Conversely, a lack of objective response in a single mouse is highly predictive of no response in traditional study designs. Considering agents inducing CR or MCR against selected models, the single mouse approach is also highly predictive. Furthermore, the single mouse data accurately identified the tumor histologies that responded in the group study analysis. For pediatric drug development, agents inducing robust tumor regression are particularly important, and these operating characteristics for the single mouse design support its use in projects seeking to identify agents with this level of anticancer activity. For the studies analyzed, all solid and brain tumors were grown in the subcutaneous site. Consequently, stroma associated with the normal or orthotopic site may differ in the models. However, this is unlikely to influence the variation in response for individual mice. Rather, for brain tumors, it is likely that the subcutaneous models will overpredict drug responses, as drug penetration to the brain may be lower in the orthotopic site. Thus, for accurate translation, secondary testing in the orthotopic site may be necessary. However, the primary criterion for a screen is to identify agents with significant activity, and such activity can be filtered in secondary testing (i.e., orthotopic brain models, etc.).
The advantage of the single mouse approach is in allowing a much larger number of models to be screened against agents of interest. There are increasing data to support the genetic fidelity of human tumors transplanted into immune-deficient mice. Expression profiling of 109 pediatric xenograft models with a similar number of patient samples showed that with rare exceptions xenografts clustered with the respective clinical histology (38, 39). Similarly, exome sequencing indicates more common gene mutations are recapitulated in the PPTP models. For example, the PPTP models include 4 embryonal rhabdomyosarcomas with three showing Ras-pathway activation, but each having an individual mechanism (NRAS mutant, NF1 mutation, and HRAS mutant). Each is different, and may represent a subtype that may respond differently to therapy. For alveolar rhabdomyosarcoma all 7 xenograft models are characterized by the reciprocal chromosomal translocation t(2;13) encoding the Pax3-Foxo1 chimeric transcription factor, but only one model has an activating mutation in the kinase domain of FGFR4, consistent with clinical data (47). Similar examples can be cited for each of the other solid tumor, brain tumor and ALL panels, thus documenting the heterogeneity of the models as representative of at least some of the heterogeneity of actual human tumors.
The assumption made in using larger panels of tumor models derived from a particular cancer type is that the overall response rate in preclinical models will more closely parallel response rates in the clinic. Intrinsic to this rationale is that by increasing the representation of subtypes of a diagnosis in the preclinical testing studies, it will be possible to identify particular subtypes that have “exceptional responses” to treatments. PPTP data tend to support this; for example, antibodies targeting IGF1R showed activity only in sarcoma models at a relatively low frequency but not in any other tumor types (32). Furthermore, “exceptional responders” to kinase inhibitors were identified only in models where there was an activating mutation that predisposed the tumor to drug sensitivity (29, 40). Other examples included identification of sensitivity in one or two models, where the cohort of tumor models was expanded to confirm drug activity. Examples include the activity of an MDM2 inhibitor against an infant MLL (42) where an additional six infant MLL models were used to confirm the general sensitivity of this ALL subgroup (43). Similarly, PR-104 when administered at dose levels consistent with human tolerated exposure identified T-ALL as the only sensitive tumor type, probably through selective drug activation (44). Other examples include sensitivity to temozolomide segregating with MGMT deficiency (48), or responsiveness to cisplatin and the PARP inhibitor talazoparib in a PALB2-mutated Wilms tumor (30). Thus, the proposed screening strategy can identify subgroups of potential responders and can accelerate therapy development in the clinic for specified subgroups.
Generating large panels of patient-derived xenograft models across the range of childhood solid tumors (including tumors at diagnosis and relapse) will require collaboration and model-sharing across multiple laboratories, as well as coordination with clinical trial organizations. Data generated by the PPTP clearly show that, at least in some cases, the genetic predisposition in an exceptional responder can be identified (e.g., PALB2 mutation in the response to cisplatin and talazoparib, BCR-ABL1 translocation in an ALL responding to dasatinib, or the well-established sensitivity of MGMT-deficient tumors to chloroethylating nitrosoureas or temozolomide). Conversely, the response to a particular drug may identify a previously unrecognized susceptibility for a subset of tumors. For example, the JAK1/2 inhibitor AZD1480 unexpectedly induced regression of a Wilms tumor xenograft, and subsequent testing demonstrated regression in 3 of 5 models. This suggests some underlying predisposition to this agent, possibly through an off-target kinase, as ruxolitinib, another JAK1/2 inhibitor, was not active in these models (unpublished data). Thus, incorporating additional heterogeneity into the preclinical testing approach will lead to better insight into responses in humans.
Our studies identify response inconsistency as a characteristic of those models with lower predictive success. Thus, criteria for using a model with the single mouse experimental design may require running experiments using a “traditional” design as above for approximately 10 drugs and then analyzing data retrospectively, as in our study. To test this, we chose 10 studies at random for each tumor line and computed the accuracy for each model, allowing for deviation of plus or minus one response category. The models predicted response accurately (>80% correct) in 98% of trials, with the ependymoma line (C6) being the worst model with an overall accuracy of 60%. Thus, this relatively easy approach may assist in excluding or including a model in the screen.
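The model-qualification check described above can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, and the category ordering PD1 < PD2 < SD < PR < CR < MCR is an assumption based on the usual PPTP response scale rather than something stated in this section. The 10-study sample and the >80% threshold follow the text.

```python
import random

# Assumed ordering of response categories, from least to most favorable.
ORDER = ["PD1", "PD2", "SD", "PR", "CR", "MCR"]
RANK = {category: i for i, category in enumerate(ORDER)}

def within_one(single, group):
    """True if the single mouse call is at most one category from the group median."""
    return abs(RANK[single] - RANK[group]) <= 1

def model_accuracy(pairs, n_studies=10, seed=0):
    """Accuracy for one tumor line over a random sample of its studies.

    pairs: list of (single_mouse_category, group_median_category), one per study.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sample = rng.sample(pairs, min(n_studies, len(pairs)))
    return sum(within_one(s, g) for s, g in sample) / len(sample)

def qualifies(pairs, threshold=0.80, **kwargs):
    """Apply the >80% correct criterion mentioned in the text."""
    return model_accuracy(pairs, **kwargs) > threshold

# Made-up example for one tumor line, not data from the study.
line = [("PR", "CR"), ("SD", "SD"), ("PD2", "PD1"), ("CR", "MCR"), ("SD", "PR"),
        ("PD1", "PD1"), ("PR", "PR"), ("PD2", "SD"), ("CR", "CR"), ("SD", "SD")]
print(model_accuracy(line), qualifies(line))
```

A line that repeatedly falls more than one category away from its group median would fail this check and could be excluded from, or flagged within, a single mouse screen.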
In summary, this analysis indicates that for most experiments results from a single tumor-bearing mouse accurately predict the result from a larger cohort (8–10) of mice. By assessing response with a single mouse, responsive tumor types were identified with relatively few exceptions. Furthermore, the overall response rates determined using single mouse data or group data were essentially identical. Assuming that large treatment effects are targeted (e.g., substantial tumor regression for treated animals vs. progressive disease for controls), utilization of the single mouse design may prove a feasible approach for reliably screening large numbers of models with diverse genetic characteristics.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: B. Murphy, E.A. Kolb, C.P. Reynolds, R.T. Kurmasheva, I. Dvorchik, J. Wu, R.B. Lock, P.J. Houghton
Development of methodology: B. Murphy, E.A. Kolb, I. Dvorchik, J. Wu, P.J. Houghton
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J.M. Maris, E.A. Kolb, R. Gorlick, M.H. Kang, S.T. Keir, R.B. Lock, P.J. Houghton
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): B. Murphy, H. Yin, J.M. Maris, E.A. Kolb, C.P. Reynolds, I. Dvorchik, J. Wu, C.A. Billups, N. Boateng, M.A. Smith, R.B. Lock, P.J. Houghton
Writing, review, and/or revision of the manuscript: J.M. Maris, E.A. Kolb, R. Gorlick, C.P. Reynolds, R.T. Kurmasheva, J. Wu, N. Boateng, M.A. Smith, R.B. Lock, P.J. Houghton
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): B. Murphy, N. Boateng, P.J. Houghton
Study supervision: P.J. Houghton
Grant Support
NO1-CM42216 and UO1CA199297 from the National Cancer Institute.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.