Traditional approaches to evaluating antitumor agents using human tumor xenograft models have generally used cohorts of 8 to 10 mice against a limited panel of tumor models. An alternative approach is to use fewer animals per tumor line, allowing a greater number of models that capture greater molecular/genetic heterogeneity of the cancer type. We retrospectively analyzed 67 agents evaluated by the Pediatric Preclinical Testing Program to determine whether a single mouse, chosen randomly from each group of a study, predicted the median response for groups of mice using 83 xenograft models. The individual tumor response from a randomly chosen mouse was compared with the group median response using established response criteria. A total of 2,134 comparisons were made. The single tumor response accurately predicted the group median response in 1,604 comparisons (75.16%). The mean tumor response correct prediction rate for 1,000 single mouse random samples was 78.09%. Models had a range for correct prediction (60%–87.5%). Allowing for misprediction of ± one response category, the overall mean correct single mouse prediction rate was 95.28%, and predicted overall objective response rates for group data in 66 of 67 drug studies. For molecularly targeted agents, occasional exceptional responder models were identified and the activity of that agent confirmed in additional models with the same genotype. Assuming that large treatment effects are targeted, this alternate experimental design has similar predictive value as traditional approaches, allowing for far greater numbers of models to be used that more fully encompass the heterogeneity of disease types. Cancer Res; 76(19); 5798–809. ©2016 AACR.

Preclinical development of anticancer agents has predominantly relied upon murine syngeneic tumor models, and subsequently human tumor xenografts. The NCI screening program used a limited number of xenograft models to represent major forms of adult cancer, such as NSCLC, breast, and colon cancers (1–4). These tumors were established in immune-deficient mice, and standard criteria were developed for agents to meet or surpass to continue clinical development. In general, the pharmaceutical industry adopted the same approach, using similar models, as the NCI. However, these models have been criticized for not being representative of human disease (5, 6), and there is a move away from cell line–derived xenografts to patient-derived xenografts (PDX), where tumor models are established directly in mice from patient biopsies (7). Although development of PDX adult cancer and childhood cancer models started several decades ago (8–10), it has more recently been realized that, once plated on plastic, characteristics of tumors may be lost, and that xenografts developed from established cell lines may not accurately represent the original tumor (11, 12).

For many cancers, molecular characterization has identified multiple subtypes with particular characteristics that relate to sensitivity to therapy. An example is responsiveness to trastuzumab in patients with HER2 amplified cancers. Similarly, subtypes of NSCLC with EML4-ALK translocation or mutant EGFR respond to crizotinib and erlotinib, respectively (13–15), whereas these agents have little impact on survival in unselected patients. With increasing “omics” analysis of childhood cancers, subsets have also been identified for ependymoma (16, 17), medulloblastoma (18, 19), neuroblastoma (20–22), sarcomas (23–26), and acute leukemias (27), diseases previously considered rather homogeneous entities. Thus, there is a need to represent more diverse genotypes/phenotypes in the preclinical models. The problem is how to represent such diversity in a screening program that will allow identification of molecular subsets that may be responsive to particular therapeutics.

Within the Pediatric Preclinical Testing Program (PPTP), over one hundred PDX models have been developed, although resources limit evaluation of agents to approximately 50 models in a primary screen. As the screen attempts to identify active agents in multiple cancer types (sarcomas, brain tumors, neuroblastoma, kidney tumors, and acute lymphoblastic leukemia), only limited panels of tumors can be used that almost certainly do not represent the genetic or phenotypic heterogeneity of the clinical disease. One approach to expanding the models to overcome this limitation is to use each mouse as a “patient” (28). In these preclinical trials, each mouse has a tumor derived from a different patient, and is evaluated for tumor progression or regression during therapy. In this way, 30 or more tumors of a given type (e.g., melanoma) can be evaluated with the potential to generate response rates that more closely parallel clinical response rates. Given the relative ease of generating new PDX models, using one mouse to represent a tumor allows a far greater number of tumor lines to be evaluated, and may give better insight into the response rates likely to occur in humans or allow identification of subsets of tumors within a histotype that have increased or decreased drug sensitivity. In PPTP studies, examples of agents for which subsets of tumors are hypersensitive to treatment include the MEK inhibitor selumetinib, for which a single tumor with a BRAF mutation responded (29), whereas 45 other models failed to respond; the oncolytic virus NTX-010, for which there was selective sensitivity of neuroblastoma and alveolar rhabdomyosarcoma models (29); and the PARP inhibitor talazoparib, for which xenografts with DNA damage repair defects responded (30).
Even within Ewing sarcoma models with a common etiology as a consequence of the oncogenic EWS/FLI1 fusion, only 5 of 10 xenograft models were highly sensitive to combination treatment with temozolomide and a PARP inhibitor (31) and only 1 of 5 xenograft models was sensitive to the IGF-1R inhibitor SCH 717454 (32).

Although there are many reasons that preclinical results fail to translate to clinical activity (7, 33), the ability to evaluate an agent across a larger panel of models derived from a particular cancer type may be valuable for identifying subsets of tumors that respond to specific therapies, and for obtaining an estimate of likely response rates in unselected patients having that diagnosis. However, the validity of such an approach will depend upon the predictive value of the individual mouse/tumor. To assess the “one mouse” design, we evaluated the ability of a single mouse/tumor chosen at random to predict accurately the “group” response for 67 agents tested in the PPTP. The results suggest that the “single mouse” experimental design accurately predicts group response and, with a few exceptions, predicts the overall response rate across panels of childhood cancers and accurately identifies responsive tumor types.

Detailed methods and response criteria used by the PPTP for evaluating new agents (34) are presented in Supplementary Materials and Methods. Briefly, PDX models for solid tumors were derived by subcutaneous transplant of tumor fragments and ALL models were derived by intravenous injection of purified blasts as described previously (35). Cell line–derived xenografts were serially propagated by transplantation of tumor fragments, as for the PDX models. Solid tumors and non-glioblastoma brain tumors were propagated by serial passage of fragments subcutaneously into CB17SC (scid−/−) female mice (Taconic Farms, Germantown, NY); glioblastomas were transplanted into BALB/c nu/nu mice (36). Leukemia models were propagated by intravenous inoculation in female non-obese diabetic (NOD)/scid−/− mice (37). Tumor volumes (cm³) or percentages of human CD45-positive cells (ALL xenografts) were measured for each tumor at the initiation of the study and weekly for up to 42 days after study initiation.

The xenograft models included 11 kidney tumors (Wilms, rhabdoid), 14 soft tissue sarcomas (Ewing/rhabdomyosarcoma), 9 non-glioblastoma brain tumors (medulloblastoma, ependymoma, PNET, glioma), 6 glioblastomas, 8 neuroblastomas, 7 osteosarcomas, and 28 acute lymphoblastic leukemias (ALL). Pathologic (34) and molecular characteristics (38, 39) for many of these xenografts have been reported. Solid tumors were grown subcutaneously, whereas ALL models were disseminated disease following intravenous inoculation. Of the 83 models, 62 (75%) were derived from direct transplantation of patient tumor into mice, and maintained by serial passage in mice. For solid tumors, 10 mice per group were used, and for ALL models, 8 mice per group. Growth curves for tumors in each mouse were derived weekly until the tumor reached the endpoint criteria (4-fold its volume at initiation of treatment). The PPTP response criteria have been previously described (34), and are presented in detail in the Supplementary Materials and Methods. Briefly, each of the 8 to 10 mice in a treatment group is assigned one of six response scores based on the effect of the treatment on its tumor: progressive disease without growth delay (PD1 = 0), progressive disease with growth delay (PD2 = 2), stable disease (SD = 4), partial response (PR = 6), complete response (CR = 8), and maintained CR (MCR = 10). The objective response for the treatment group is based on the median of the objective response scores for the individual mice within the group. Treatment groups with PR, CR, or MCR are considered to have an objective response.
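The scoring scheme can be illustrated with a short sketch. The score values are those listed above; the handling of a median that falls between two categories is not stated in this section, so taking the lower of the two middle scores (`median_low`) is an assumption made only for illustration, and the example group is hypothetical.

```python
from statistics import median_low

# Response scores as listed in the text.
SCORES = {"PD1": 0, "PD2": 2, "SD": 4, "PR": 6, "CR": 8, "MCR": 10}
CATEGORY = {score: name for name, score in SCORES.items()}
OBJECTIVE = {"PR", "CR", "MCR"}


def group_median_response(responses):
    """Return the response category at the median score of a treatment group.

    Assumption: for an even group size the lower middle score is used, so the
    result is always one of the six defined categories.
    """
    return CATEGORY[median_low([SCORES[r] for r in responses])]


def is_objective(response):
    """True when the response is PR, CR, or MCR."""
    return response in OBJECTIVE


# Hypothetical solid tumor group of 10 mice.
group = ["PD2", "SD", "SD", "PR", "PR", "PR", "CR", "CR", "CR", "MCR"]
median_response = group_median_response(group)
print(median_response, is_objective(median_response))
```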

The response to treatment was scored for each tumor in the group (PD1–MCR), and the Group Median response was determined. In initial analysis, the response of one tumor in the treatment group was selected using a random number generating routine (1–10 for solid tumors, 1–8 for ALL studies). The response of the randomly selected tumor was then compared with the Group Median Response. This allowed for the deviation (± response categories between the single mouse result and the group median response) to be calculated. The analysis involved 2,106 tumor/drug comparisons.

As a more robust analysis, we repeated this process of randomly selecting one mouse/tumor from each treatment group of each tumor line 1,000 times, using a slightly larger database (2,134 treatment group comparisons). In each repetition, the response of the randomly chosen mouse/tumor was compared with the median group tumor response, and we calculated the average number of times over the 1,000 random samples that the single mouse response was the same as the median group response (mean number of correct predictions). The percentage of correct predictions per tumor line is the mean number of correct predictions from the random mouse/tumor across all treatment groups for that tumor line, irrespective of the drug tested.
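The repeated sampling can be sketched as follows. This is a minimal illustration, not the SAS/R code used for the analysis: the function name, the fixed seed, and the three example treatment groups are hypothetical.

```python
import random


def mean_correct_predictions(groups, n_iter=1000, seed=0):
    """Mean number of groups whose randomly drawn mouse matches the group median.

    groups: list of (individual_responses, group_median_response) tuples.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total_correct = 0
    for _ in range(n_iter):
        for responses, median in groups:
            # One mouse is drawn at random from each treatment group.
            if rng.choice(responses) == median:
                total_correct += 1
    return total_correct / n_iter


# Hypothetical treatment groups for one tumor line (10 mice, or 8 for ALL).
groups = [
    (["PD1"] * 10, "PD1"),
    (["PD2"] * 6 + ["SD"] * 4, "PD2"),
    (["CR"] * 5 + ["MCR"] * 3, "CR"),
]
mean_correct = mean_correct_predictions(groups)
proportion_correct = mean_correct / len(groups)
print(round(mean_correct, 3), round(proportion_correct, 3))
```

Dividing the mean number of correct predictions by the number of treatment groups gives a per-line proportion comparable to the last column of Table 1.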

Cell lines and xenografts

All cell lines, other than those generated from PDX models, were obtained from the ATCC. Patient-derived xenografts and cell lines were deposited in a Master Bank at the initiation of the PPTP project, and characterized by short tandem repeats. Periodically, during the PPTP, lines were verified as authentic by STR analysis.

Statistical analysis

For the analysis, a single mouse prediction was considered to be correct when it was equal to the median tumor response of the group, as described in Supplementary Materials and Methods. The four best and four worst predictive models were selected on the basis of the percentage of correct responses among all of the studies. The Wilcoxon rank-sum test was used to compare percentages between the models. Regression analysis was used to study the association between single mouse predicted objective response and group objective response.

Statistical analyses were done using SAS 9.3 and R. P values less than 0.05 were considered statistically significant.

Criteria used to assess tumor response are defined in Materials and Methods and in more detail in the Supplementary Materials and Methods and have been published previously (34). The studies included a wide range of agents, including molecularly targeted agents such as tyrosine kinase inhibitors, antibody–drug conjugates, standard cytotoxic agents, ligand-binding antibodies, growth factor receptor–binding antibodies and small molecules, as well as two small-molecule agents with unknown mechanism of action (Supplementary Table S1 with references to published studies). Agents of greatest interest are those that induce tumor regression (i.e., objective responses).

In our initial analysis of PPTP datasets, 2,106 comparisons across 82 tumor models were evaluated (Supplementary Table S2), indicating a correct prediction rate for single mouse data of 79.3%. If a deviation of +1 or −1 response category was accepted as a tolerable level of error (PD1 vs. PD2, MCR vs. CR, etc.), the prediction accuracy increased to 95.6%.

To further test the accuracy for a randomly chosen single mouse to predict the group median response, we analyzed a slightly larger dataset that included a total of 83 xenograft models and 2,134 treatment groups. Some of these tests involved screening of an agent against many tumor models (>40), whereas other tests used a focused testing approach (i.e., evaluated only against specific models, e.g., selected ALL xenografts, or tumor models having a specific genetic characteristic). The “test” mouse was chosen at random from each treatment group of a tumor line. The response of the tumor in that individual mouse was compared to the treatment group's response (median tumor response). This process was repeated 1,000 times for the whole dataset (2,134 groups). Tumor models, number of observations and accuracy of prediction are presented in Table 1. To quantify the accuracy, we used the PPTP response classifications. If the single mouse predicted response accurately (e.g., the single mouse response was SD and the group response was also SD, etc.) the result was scored as 0. If the single mouse overpredicted response by one category, the score was +1, or if underpredicted by one response category it was scored −1. Thus, the maximum overprediction would be MCR when the group response was PD1 (+5) or maximum under-prediction if the single mouse was PD1 but the group response was MCR (−5). Overall, the single mouse data predicted the group response with 78.09% accuracy (score 0), based on 2,134 treatment groups, Fig. 1A. If an error of +1 or −1 response category was used as an acceptable level of error the prediction accuracy increased to 95.49%, Table 2. Gross overprediction (+5), or underprediction (−5), occurred in 0.14% and 0.001% of studies, respectively (Table 2).
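The deviation score described above amounts to the difference in rank between the single mouse response and the group median response. The sketch below assumes only the category ordering given in the text; the function names and the example pairs are hypothetical.

```python
from collections import Counter

# Response categories in order of increasing response.
ORDER = ["PD1", "PD2", "SD", "PR", "CR", "MCR"]
RANK = {name: i for i, name in enumerate(ORDER)}


def deviation(single, group):
    """Positive values are overprediction, negative values underprediction."""
    return RANK[single] - RANK[group]


def deviation_distribution(pairs):
    """Percentage of comparisons at each deviation from -5 to +5."""
    counts = Counter(deviation(s, g) for s, g in pairs)
    return {d: 100.0 * counts[d] / len(pairs) for d in range(-5, 6)}


def accuracy_within(pairs, tolerance):
    """Percentage of comparisons whose absolute deviation is <= tolerance."""
    hits = sum(abs(deviation(s, g)) <= tolerance for s, g in pairs)
    return 100.0 * hits / len(pairs)


# Hypothetical (single mouse response, group median response) pairs.
pairs = [("SD", "SD"), ("PR", "SD"), ("PD1", "PD2"), ("MCR", "PD1"), ("CR", "CR")]
print(deviation_distribution(pairs))
print(accuracy_within(pairs, 0), accuracy_within(pairs, 1))
```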

Table 1.

Tumor types and mean frequency of single mouse accurate predictions for group response (based on 1,000 random single mouse samples)

| Tumor code | Tumor line | Histology | Mean number incorrect | Mean number correct | Total no. of studies | Proportion of correct response |
|---|---|---|---|---|---|---|
| A1 | BT-29 | Kidney ATRT | 11.554 | 30.571 | 42 | 0.725 |
| A2 | KT-10 | Wilms tumor | 7.769 | 39.050 | 47 | 0.838 |
| A3 | KT-11 | Wilms tumor | 7.557 | 30.447 | 38 | 0.802 |
| A4 | KT-13 | Wilms tumor | 11.236 | 35.833 | 47 | 0.761 |
| A5 | KT-16 | Kidney ATRT | 6.378 | 12.652 | 19 | 0.661 |
| A6a | SK-NEP-1 | Ewing sarcoma | 8.606 | 43.378 | 52 | 0.832 |
| A7 | KT-12 | Kidney ATRT | 8.475 | 19.466 | 28 | 0.697 |
| A8 | KT-14 | Kidney ATRT | 9.851 | 30.094 | 40 | 0.751 |
| A9 | KT-5 | Wilms tumor | 0.719 | 2.265 | | 0.756 |
| A10 | WT-8 | Wilms tumor | 0.000 | 1.000 | | 1.000 |
| A11 | WT-6 | Wilms tumor | 0.352 | 0.671 | | 0.695 |
| B1 | Rh30R | Alveolar rhabdomyosarcoma | 10.855 | 36.238 | 47 | 0.770 |
| B2 | EW-5 | Ewing sarcoma | 10.810 | 39.357 | 50 | 0.789 |
| B3a | EW-8 | Ewing sarcoma | 7.050 | 38.912 | 46 | 0.845 |
| B4 | Rh10 | Alveolar rhabdomyosarcoma | 7.834 | 29.289 | 37 | 0.789 |
| B5a | Rh18 | Embryonal rhabdomyosarcoma | 12.248 | 35.619 | 48 | 0.744 |
| B6 | Rh26 | Alveolar rhabdomyosarcoma | 9.9710 | 36.111 | 46 | 0.779 |
| B7 | Rh30 | Alveolar rhabdomyosarcoma | 10.346 | 38.678 | 49 | 0.788 |
| B8 | Rh41 | Alveolar rhabdomyosarcoma | 8.765 | 39.157 | 48 | 0.816 |
| B9 | Rh36 | Embryonal rhabdomyosarcoma | 3.475 | 8.569 | 12 | 0.710 |
| B10 | Rh65 | Alveolar rhabdomyosarcoma | 1.175 | 6.752 | | 0.849 |
| B11a | TC-71 | Ewing sarcoma | 8.717 | 37.173 | 46 | 0.811 |
| B12a | CHLA258 | Ewing sarcoma | 7.905 | 38.265 | 46 | 0.831 |
| B13 | Rh66 | Alveolar rhabdomyosarcoma | 0.465 | 1.523 | | 0.753 |
| B15a | ES-6 | Ewing sarcoma | 0.404 | 0.592 | | 0.601 |
| C1 | BT-28 | Medulloblastoma | 8.082 | 37.913 | 46 | 0.825 |
| C4 | BT-45 | Medulloblastoma | 6.594 | 32.477 | 39 | 0.831 |
| C5 | BT-36 | Ependymoma | 6.774 | 10.216 | 17 | 0.600 |
| C6 | BT-41 | Ependymoma | 8.293 | 12.713 | 21 | 0.600 |
| C7 | BT-46 | Medulloblastoma | 3.013 | 7.950 | 11 | 0.72 |
| C8 | BT-50 | Medulloblastoma | 8.578 | 26.411 | 35 | 0.755 |
| C9 | BT-44 | Ependymoma | 6.501 | 21.697 | 28 | 0.770 |
| C11 | BT-35 | Glioma | 0.229 | 0.784 | | 0.765 |
| C12 | BT-40 | Glioma | 0.000 | 2.000 | | 1.000 |
| D1a | GBM2 | Glioblastoma | 14.266 | 32.727 | 47 | 0.699 |
| D2 | BT-39 | Glioblastoma | 9.858 | 39.063 | 49 | 0.800 |
| D3a | D645 | Glioblastoma | 9.405 | 33.641 | 43 | 0.784 |
| D4a | D456 | Glioblastoma | 10.554 | 33.423 | 44 | 0.761 |
| D5 | BT-56 | Glioblastoma | 1.208 | 1.792 | | 0.594 |
| D6a | D212 | Glioblastoma | 0.525 | 1.516 | | 0.749 |
| E1a | NB-SD | Neuroblastoma | 10.353 | 30.541 | 41 | 0.747 |
| E2a | NB-1771 | Neuroblastoma | 10.691 | 34.444 | 45 | 0.764 |
| E3a | NB-1691 | Neuroblastoma | 8.737 | 39.124 | 48 | 0.817 |
| E4a | NB-EBc1 | Neuroblastoma | 10.710 | 37.338 | 48 | 0.737 |
| E5a | CHLA79 | Neuroblastoma | 8.302 | 28.678 | 37 | 0.776 |
| E6a | NB1643 | Neuroblastoma | 10.274 | 34.761 | 45 | 0.773 |
| E7a | NB1382 | Neuroblastoma | 0.198 | 1.822 | | 0.909 |
| E9a | SK-NAS | Neuroblastoma | 1.238 | 5.727 | | 0.813 |
| F1 | OS-1 | Osteosarcoma | 10.714 | 40.301 | 51 | 0.791 |
| F2 | OS-2 | Osteosarcoma | 5.972 | 42.230 | 48 | 0.875 |
| F3 | OS-17 | Osteosarcoma | 8.439 | 38.475 | 47 | 0.819 |
| F9 | OS-9 | Osteosarcoma | 7.911 | 37.228 | 45 | 0.822 |
| F10 | OS-33 | Osteosarcoma | 6.804 | 43.225 | 50 | 0.862 |
| F11 | OS-31 | Osteosarcoma | 6.961 | 41.013 | 48 | 0.856 |
| F12 | OS-29 | Osteosarcoma | 0.000 | 1.00 | | 1.000 |
| G1 | ALL-2 | Primary; B-precursor | 10.252 | 32.945 | 43 | 0.766 |
| G2 | ALL-3 | Primary; B-precursor | 11.956 | 21.008 | 33 | 0.640 |
| G3 | ALL-4 | Primary; B-precursor; Ph+(BCR-ABL) | 8.470 | 37.410 | 46 | 0.814 |
| G4 | ALL-7 | Primary; B-precursor | 8.934 | 31.202 | 40 | 0.776 |
| G5 | ALL-8 | Primary; B-precursor | 10.132 | 35.888 | 45 | 0.785 |
| G6 | ALL-16 | Primary; T-cell ALL | 12.663 | 20.272 | 33 | 0.615 |
| G7 | ALL-17 | Primary; B-precursor | 8.569 | 36.319 | 45 | 0.810 |
| G8 | ALL-19 | Primary; B-precursor | 11.542 | 32.449 | 44 | 0.738 |
| G9 | ALL-10 | Primary; B-precursor | 0.358 | 1.638 | | 0.807 |
| G11 | ALL-27 | Tertiary, T-cell ALL | 0.442 | 0.580 | | 0.549 |
| G12 | ALL-29 | Tertiary, T-cell ALL | 0.435 | 0.585 | | 0.597 |
| G13 | ALL-30 | Tertiary, T-cell ALL | 0.359 | 0.630 | | 0.645 |
| G14 | ALL-31 | Tertiary, T-cell ALL | 2.6520 | 7.4110 | 10 | 0.7421 |
| G15 | MLL-7 | Tertiary, infant, precursor B-ALL | 2.416 | 8.554 | 13 | 0.778 |
| G19 | MLL-2 | Tertiary, infant, precursor B-ALL | 0.411 | 0.591 | | 0.562 |
| G20 | MLL-3 | Infant BCP-ALL | 0.261 | 1.745 | | 0.874 |
| G21 | MLL-5 | Tertiary, infant, precursor B-ALL | 0.491 | 1.444 | | 0.750 |
| G22 | MLL-6 | Infant BCP-ALL | 0.124 | 0.877 | | 0.875 |
| G23 | MLL-8 | Infant BCP-ALL | 0.482 | 0.520 | | 0.546 |
| G24 | MLL-14 | Infant BCP-ALL | 0.342 | 1.667 | | 0.829 |
| G25 | TGT-020 | ALL JAK2 R683G | 0.000 | 3.000 | | 1.000 |
| G27 | TGT-047 | ALL JAK2 R683G | 0.243 | 1.748 | | 0.862 |
| G28 | TGT-052 | ALL JAK1 V658F | 0.000 | 1.000 | | 1.000 |
| G29 | TGT-144 | ALL JAK2 | | | | |
| G30 | TGT-174 | ALL JAK2 P933R | 0.840 | 1.140 | | 0.579 |
| H18a | RS4;11 | B-Lineage, monocytic | 0.115 | 0.865 | | 0.881 |
| J1a | KARPAS299 | T-cell lymphoma | 0.000 | 2.000 | | 1.000 |
| J2a | MV4;11 | Acute monocytic | 0.204 | 1.783 | | 0.895 |

^a Cell line–derived xenografts (tumor codes ending in “a”).

Figure 1.

A, distribution of deviation for 2,134 observations. Single mouse prediction of response was compared with the median response for groups of tumor-bearing mice (solid tumors, n = 10; ALL models, n = 8). A score equaling zero indicates accurate prediction, whereas +1 and −1 refer to over- and underprediction by one response classification. B, four “best” tumor lines (F2, OS-2; F10, OS-33; and F11, OS-31 osteosarcomas; and B10, Rh65 rhabdomyosarcoma). C, four “worst case” tumor lines [C5, BT-36 and C6, BT-41 ependymomas; G2, ALL-3 (primary B-cell precursor ALL); and G6, ALL-16 (primary T-cell ALL)].

Table 2.

Distribution of deviation of all experiments (1,000 comparisons/treatment group)

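| Deviation^a | Number of predictions | Percentage |
|---|---|---|
| Underprediction −5 | | 0.001 |
| −4 | 13 | 0.006 |
| −3 | 15 | 0.007 |
| −2 | 31 | 1.45 |
| −1 | 194 | 9.09 |
| 0 | 1,718 | 80.50 |
| Overprediction +1 | 126 | 5.90 |
| +2 | 20 | 0.94 |
| +3 | | 0.33 |
| +4 | | 0.19 |
| +5 | | 0.14 |
| Total | 2,134 | 100 |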

^a The deviation represents the distance between the treatment group objective response (PD1, PD2, SD, PR, CR, or MCR) and the response observed in the randomly selected single mouse. Negative values represent underprediction (the single mouse shows an inferior response to the treatment group response) and positive values represent overprediction (the single mouse shows a superior response to the treatment group response). Percentages represent the rates at which there was overprediction or underprediction by the specified deviation value.

Because the methodology for assessing model response is different for the disseminated leukemia models (measurement of human CD45-positive cells in peripheral blood), we also analyzed the ALL data separately from the solid tumors. There were 375 treatment groups. The mean of 1,000 random samples of “single mouse” data demonstrated that the single mouse results predicted the group response accurately 75.33% of the time. If a deviation of ± one response category was used, the “success” rate increased to 94.30% (Supplementary Fig. S1).

We also identified the four best and four worst tumor models in terms of accuracy of response prediction. The four models having the highest proportions of accurate predictions were B10, F11, F10, and F2, with correct prediction rates of 84.9%, 85.6%, 86.2%, and 87.5%, respectively (see Table 1 for tumor line identity and type). The four worst tumor models estimated from the mean response rates were C5, C6, G2, and G6, with accurate predictions of 60.0%, 60.0%, 64.0%, and 61.5%, respectively. The P value from the Wilcoxon rank-sum test was 0.0294, which indicates that the percentage of correct predictions for the best tumor models is significantly higher than that for the worst tumor models. Histograms showing the best and worst models are shown in Fig. 1B and C. One possible explanation for lower prediction from a single mouse is that growth rates or responses were far less consistent in the tumor models with poor predictive value than in those with higher prediction success. We analyzed the consistency of responses for the four worst tumor models (C5, C6, G2, and G6) compared with the four best tumor models (B10, F2, F10, and F11; Supplementary Table S3). The means of variances of response for the four worst tumor models (range, 3.15–7.3) were much larger than the means of variances of response for the four best tumor models (range, 1.39–1.74; P = 0.03). Thus, inconsistency of the responses within a treatment group reduced the predictive value of the single mouse response evaluation. Specific examples that illustrate deviation of the “single mouse” response relative to the group median response are shown in Fig. 2. Analysis of exomic mutations in the four “best” and three of the four “worst” models showed a similarly low mutation frequency in tumors with good predictive activity (range, 2–8 mutations/model) and in those with poorer predictive value (range, 2–7 mutations/model; Supplementary Table S4).
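The best-versus-worst comparison can be sketched with the eight proportions quoted above. The text states only that a Wilcoxon rank-sum test was used; the normal approximation with continuity and tie corrections shown here is an assumption about how the P value was obtained, and the function name is hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test using a normal approximation.

    Assumptions: midranks for ties, tie-corrected variance, and a continuity
    correction of 0.5. Returns (rank sum of x, two-sided P value).
    """
    pooled = sorted(x + y)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank, tie_term, i = {}, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        rank[pooled[i]] = (i + 1 + j) / 2.0
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    w = sum(rank[v] for v in x)
    mean = n1 * (n + 1) / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = max(0.0, abs(w - mean) - 0.5) / math.sqrt(var)
    return w, math.erfc(z / math.sqrt(2))


best = [0.849, 0.856, 0.862, 0.875]   # B10, F11, F10, F2
worst = [0.600, 0.600, 0.640, 0.615]  # C5, C6, G2, G6
w, p = rank_sum_test(best, worst)
print(w, round(p, 4))
```

Under these assumptions the result is close to the reported P value; an exact permutation test on the same eight values would give a slightly different number.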

Figure 2.

Examples of over- and underprediction by single mouse tumors. Left, growth of individual untreated (control) tumors. Right, growth of treated individual tumors. The tumor chosen at random (single mouse) is shown by the gray broken line (arrow). For solid tumors, tumor volume (cm³) is plotted against time, and for ALL-16, the percentage of human CD45-positive cells in peripheral blood is plotted against time. Treatments started on day 0.


One approach to assessing the validity of “single mouse” predictions is to determine the effect of “false positive” and “false negative” predictions. For example, would data from the single mouse have predicted significant activity for an agent that was not active based on the median response data for the group (i.e., false positive)? To examine this, we first analyzed the number of objective responses (i.e., response of PR, CR, or MCR) within the group-derived dataset. In the 2,134 response determinations, there were 318 objective responses. We next analyzed whether the differences between the prediction from the single mouse and the group median response altered the “outcome” (i.e., number of models where response was at least PR). Single mouse data predicted objective response when it was not present with a frequency of 14.03% and failed to identify objective response when it was present with a frequency of 7.78%. Considering only the 318 treatment groups with objective responses (determined by group median response), the single mouse approach predicted the group objective response correctly 73.0% of the time. For the 1,816 treatment groups with less than objective response, the single mouse predicted correctly 79.1% of the time. Applying a higher response standard, there were 250 treatment groups with CR or MCR among the 2,134 response determinations. Considering these 250 treatment groups with CR/MCR responses, the single mouse predicted the group response correctly 78.2% of the time.
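The false positive and false negative tally can be sketched by reducing each response to whether it is an objective response (PR, CR, or MCR). The function name and the example pairs below are hypothetical, and the sketch reports raw counts only, because the denominators behind the percentages quoted above are not spelled out in this section.

```python
OBJECTIVE = {"PR", "CR", "MCR"}


def objective_response_table(pairs):
    """Count agreement on objective response between single mouse and group.

    pairs: list of (single_mouse_response, group_median_response).
    'fp' = single mouse shows an objective response that the group does not;
    'fn' = single mouse misses an objective response that the group shows.
    """
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for single, group in pairs:
        s, g = single in OBJECTIVE, group in OBJECTIVE
        key = ("t" if s == g else "f") + ("p" if s else "n")
        counts[key] += 1
    return counts


# Hypothetical comparisons.
pairs = [("PR", "CR"), ("SD", "PR"), ("PR", "SD"), ("PD1", "PD2"), ("CR", "CR")]
table = objective_response_table(pairs)
print(table)
```

The pair ("PR", "CR") counts as agreement here even though the categories differ, which illustrates how a one-category misprediction need not change the objective response outcome.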

Each drug study analyzed included a range of tumor models (median 41, range 1 to 55, depending on the experimental objective). The single mouse results and the group median results were highly correlated in both the initial analysis and the response rate from 1,000 random comparisons (r² = 0.99, Fig. 3). In most studies, the over- and underpredictions were minor, and did not alter the outcome (objective response rate; Table 3 and Supplementary Table S2). As shown in Fig. 3, the single mouse data correlated well with the activity levels of each drug. The two outliers (studies 0614 and 1008; 20% vs. 40% and 50% vs. 0%) occurred in our initial analysis of the single mouse data. These experiments used 5 and 2 tumor models, respectively. This discrepancy was not apparent when 1,000 random iterations were performed.
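The per-drug comparison can be sketched by computing an objective response rate (ORR) and the squared correlation between the single mouse and group median values. The six ORR pairs below are copied from Table 3 (studies 504, 602, 501, 505, 606, and 707); using the squared Pearson correlation as the r² of a simple linear regression, and the helper names, are assumptions made for illustration.

```python
OBJECTIVE = {"PR", "CR", "MCR"}


def objective_response_rate(responses):
    """Percentage of tumor models with PR, CR, or MCR for one drug."""
    return 100.0 * sum(r in OBJECTIVE for r in responses) / len(responses)


def pearson_r2(x, y):
    """Squared Pearson correlation, equal to r^2 of a simple linear fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Single mouse ORR (%) and group median ORR (%) for six studies in Table 3.
single_orr = [13.310, 15.376, 40.774, 23.593, 33.284, 47.883]
group_orr = [12.821, 13.333, 38.298, 23.913, 33.333, 50.000]
r2 = pearson_r2(single_orr, group_orr)
print(round(r2, 3))
print(objective_response_rate(["PD1", "PR", "CR", "SD"]))
```

Because only six of the 67 studies are used, the value printed here is illustrative and is not expected to reproduce the reported r² exactly.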

Figure 3.

Objective response rates (ORR) were calculated for all tumor models tested for a particular drug (n = 1 to n = 55) for different studies, based upon the group median response. Red, ORR predicted from a randomly chosen single mouse plotted against the group median ORR; blue, mean single mouse ORR based on 1,000 single mouse samples plotted against the group median ORR.

Table 3.

Mean objective response rates (ORR) for 67 drugs based upon single mouse predictions of group median response (1,000 comparisons)

| ORR^a | Study number | Study agent | Number of studies | Single mouse ORR (%) | Group median ORR (%) |
|---|---|---|---|---|---|
| 0%–20% | 502 | Bortezomib | | 18.421 | 10.526 |
| | 504 | 17-DMAG | 39 | 13.310 | 12.821 |
| | 506 | BMS-354825 | 42 | 11.490 | 7.143 |
| | 507 | AZD-2171 | 39 | 10.269 | 5.128 |
| | 601 | SU11248 | 44 | 7.370 | 6.818 |
| | 602 | Rapamycin | 45 | 15.376 | 13.333 |
| | 603 | Lapatinib | 41 | 1.254 | 0.000 |
| | 604 | ABT-263 | 43 | 9.805 | 9.302 |
| | 605 | 19D12 | 43 | 8.002 | 6.977 |
| | 607 | SAHA | 41 | 3.115 | 0.000 |
| | 613 | Cytarabine | | 2.033 | 0.000 |
| | 614 | Chloretazine | | 28.120 | 20.000 |
| | 702 | HGS-ETR1 | 45 | 0.411 | 0.000 |
| | 703 | GSK690693 | 46 | 2.930 | 2.174 |
| | 704 | Aplidin | 42 | 4.583 | 2.381 |
| | 706 | Sorafenib | 44 | 2.950 | 0.000 |
| | 708 | IMC-A12 | 35 | 7.449 | 2.857 |
| | 801 | CGC(PG)-11047 | 41 | 5.722 | 2.439 |
| | 803 | BMS-754807 | 44 | 1.609 | 0.000 |
| | 805 | AZD8055 | 45 | 1.698 | 0.000 |
| | 806 | JNJ26854165 | 45 | 14.456 | 13.333 |
| | 807 | SCH 727965 | 43 | 6.793 | 4.651 |
| | 808 | MLN4924 | 44 | 4.707 | 0.000 |
| | 814 | Pazopanib | | 0.000 | 0.000 |
| | 901 | AT13387 | 42 | 1.131 | 0.000 |
| | 902 | LCL161 | 46 | 4.246 | 2.174 |
| | 903 | Lenalidomide | 45 | 2.931 | 0.000 |
| | 905 | JNJ26481585 | 41 | 9.207 | 7.317 |
| | 1000 | Trisenox (ATO) | | 0.000 | 0.000 |
| | 1001 | RO4929097 | 34 | 0.000 | 0.000 |
| | 1002 | SGI-1776 | 39 | 0.551 | 0.000 |
| | 1003 | MK-2206 | 38 | 1.774 | 0.000 |
| | 1006 | TAK-701 | | 0.000 | 0.000 |
| | 1007 | XL147 | 39 | 4.328 | 2.564 |
| | 1008 | XL765 | | 4.950 | 0.000 |
| | 1010 | BAL101553 | 38 | 0.295 | 0.000 |
| | 1101 | AZD1480 | 50 | 7.830 | 6.000 |
| | 1102 | PF-03084014 | 44 | 2.527 | 0.000 |
| | 1103 | INK128-1110-028 | 38 | 3.129 | 0.000 |
| | 1104 | Ganetespib | 11 | 0.000 | 0.000 |
| | 1105 | Pixantrone | | 9.738 | 12.500 |
| | 1108 | XL765 | | 0.000 | 0.000 |
| | 1109 | PCI-32765 | | 0.000 | 0.000 |
| | 1112 | Cabozantinib (XL-148) | 34 | 7.706 | 5.882 |
| | 1113 | KPT330 | 45 | 12.891 | 11.111 |
| | 1201 | CX-5461 | 44 | 8.177 | 6.818 |
| | 1207 | NSC060043 | | 0.000 | 0.000 |
| 21%–40% | 501 | Vincristine | 47 | 40.774 | 38.298 |
| | 505 | Cisplatin | 46 | 23.593 | 23.913 |
| | 606 | Topotecan | 45 | 33.284 | 33.333 |
| | 701 | MLN8237 | 45 | 42.396 | 40.000 |
| | 804 | GSK923295A | 38 | 42.939 | 39.474 |
| | 1004 | IMGN901 | 25 | 40.616 | 40.000 |
| | 1005 | RG7112 | 45 | 45.211 | 40.000 |
| | 1011 | BI6727 (Volasertib) | 41 | 23.834 | 24.390 |
| | 1107 | TL32711 | | 40.300 | 33.333 |
| | 1110 | NSC750854 | 30 | 41.090 | 40.000 |
| | 1203 | Glembatumumab | | 37.550 | 37.500 |
| 41%–60% | 707 | SAR3419 | 12 | 47.883 | 50.000 |
| | 813 | GENZ-644282 | 17 | 37.318 | 41.176 |
| | 904 | Temozolomide | | 60.000 | 60.000 |
| | 1111 | Cabazitaxel | 10 | 60.420 | 50.000 |
| | 1202 | Abraxane | | 42.857 | 42.857 |
| 61%–80% | 503 | Cyclophosphamide | 47 | 67.038 | 63.830 |
| | 802 | PR-104 | 42 | 70.967 | 69.048 |
| | 1106 | Eribulin | 43 | 60.116 | 60.465 |
| 81%–100% | 913 | CPX351 | | 100.000 | 100.000 |

^a ORR, objective response rate = (number of models with PR, CR, or MCR)/(number of models tested with that agent).
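As a hedged illustration of the ORR definition in the footnote and of the repeated random sampling summarized in Table 3, the following Python sketch computes an ORR from per-model response calls and averages the single mouse ORR over repeated random draws of one mouse per model. The data layout, function names, seed, and toy study are assumptions made for illustration; the sampling procedure actually used in the study may differ in detail.

```python
import random

# Objective response categories as named in the footnote.
OBJECTIVE = {"PR", "CR", "MCR"}


def orr(calls):
    """ORR (%) = models with PR, CR, or MCR / models tested."""
    if not calls:
        raise ValueError("at least one model is required")
    return 100.0 * sum(call in OBJECTIVE for call in calls) / len(calls)


def mean_single_mouse_orr(mice_by_model, n_samples=1000, seed=0):
    """Average ORR over n_samples draws of one random mouse per model.

    mice_by_model maps a (hypothetical) model name to the list of
    individual-mouse response calls in that treatment group.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = 0.0
    for _ in range(n_samples):
        draw = [rng.choice(mice) for mice in mice_by_model.values()]
        total += orr(draw)
    return total / n_samples


# Hypothetical study with three models of eight mice each.
study = {
    "model_A": ["CR"] * 8,
    "model_B": ["PD"] * 8,
    "model_C": ["PR", "PR", "SD", "PR", "PR", "SD", "PR", "PR"],
}
group_median_calls = ["CR", "PD", "PR"]  # assumed group-level calls
print("group median ORR (%):", round(orr(group_median_calls), 3))
print("mean single mouse ORR (%):", round(mean_single_mouse_orr(study), 3))
```

Models in which every mouse gives the same call contribute identically to both columns; differences between the single mouse ORR and the group median ORR arise only from models with mixed individual responses, such as the hypothetical model_C here.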

We analyzed the dataset to see whether single mouse data identified the same tumor types as sensitive as did the group-derived data. Even for the “worst” four studies of molecularly targeted agents, the single mouse result appears to identify responsive tumor types (Fig. 4). Single mouse results would have overpredicted the activity of a VEGFR2 inhibitor in neuroblastoma (AZD2171; study 0507), and would have missed the potential signal for activity both in non-glioblastoma brain tumors (AZD8055; study 0805) and in ALL (AZD1480; study 1101). However, in the latter two studies, only a single brain tumor model (n = 5) and one ALL model (n = 10) responded (group data), suggesting limited activity for these agents against these cancer histologies.

Figure 4.

Objective response data presented as “waterfall” plots for the “worst” four studies using molecularly targeted agents. Each bar represents the response of a single tumor model. Left, distribution of responses based upon a single mouse analysis. Right, the median group response based on 8 to 10 mice per group. Colors indicate different tumor types.

The assumption made is that by simulating the clinical heterogeneity of tumor types in the preclinical model, one may more readily identify tumor subtypes that are responsive to particular agents. We retrospectively analyzed responses to 67 agents across all models to see whether this assumption was valid. Over 100 of the xenograft models in the PPTP have been characterized by exome sequencing, expression profiling, or both techniques (38, 39 and unpublished data). The “omic” characteristics of “Exceptional Responders” were identified. For example, single responsive xenograft models identified dasatinib (Ph-positive ALL; ref. 40), sunitinib (Flt3-activated ALL; ref. 41), selumetinib [AZD6244; BRAF(V600E) astrocytoma; ref. 29], and the MDM2 inhibitor RG7112 (infant MLL; ref. 42). RG7112 was tested in a larger cohort of infant MLL and showed marked activity (43) against all infant-derived models. T-ALL models were hypersensitive to the pre-prodrug PR-104, and similarly, further testing of PR-104 revealed activity against T-ALL that express high levels of the PR-104–activating enzyme aldo-keto reductase 1C3 (AKR1C3; ref. 44). For antibodies that block ligand binding to the Type 1 insulin-like growth factor receptor (IGF1R), the only meaningful activity was identified in sarcoma models at relatively low frequency [Ewing sarcoma (1 of 5), rhabdomyosarcomas (1 of 5), and osteosarcomas (2 of 6; refs. 32, 45)], consistent with clinical data in sarcoma (46).

The concept of designing preclinical “patient” trials to evaluate new agents has not been validated, although this design opens the opportunity to incorporate additional models, derived from individual patients, that will more closely represent the heterogeneity of clinical cancer (28). Clearly, in these rare cancers of childhood, “omics” analysis is providing information on subgroups and potentially on therapy options for treatment. However, there are few preclinical models that represent these subgroups to allow interrogation of their drug responsiveness. Furthermore, using “traditional” study designs, the resources required to simulate these diverse genetic subgroups would be prohibitive. Indeed, if the preclinical “patient” trials approach is valid, then it may be possible to gain some estimate of likely clinical objective response rates at an early stage in development of an agent. Another advantage of using a greater number of models to represent the clinical disease is that one may identify a subset of highly sensitive tumors (“exceptional responders”) that would be valuable in developing a robust biomarker for subsequent patient selection (30).

We tested a range of drugs from standard chemotherapeutics to molecularly targeted agents, including antibodies and small molecules. Thus, the results are independent of drug class or mechanism of action. Overall, the single mouse data for all tumor models accurately predicted group response in 1,604 of 2,134 comparisons (75.16%). Allowing for a deviation of ± one response classification, the correct prediction rate increased to 95.28%. Analysis of the ALL models showed that the single mouse accurately predicted response in 75.33% of studies, and this increased to 94.49% with an allowance of ± one response classification. There was no difference in prediction accuracy between xenograft models derived from cell lines and those established by direct transplant of tumor into mice (P = 0.396).
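The exact and ± one category prediction rates can be sketched in Python as follows. The sketch assumes an ordinal scale of response categories; the ordering PD1, PD2, SD, PR, CR, MCR used here is an assumption (this section names only PR, CR, and MCR explicitly), and the function name and toy calls are hypothetical.

```python
# Assumed ordinal scale, from least to most favorable response.
ORDER = ["PD1", "PD2", "SD", "PR", "CR", "MCR"]
RANK = {category: index for index, category in enumerate(ORDER)}


def prediction_rates(single_calls, group_calls, tolerance=1):
    """Return (exact %, within-tolerance %) agreement for paired calls."""
    if not single_calls or len(single_calls) != len(group_calls):
        raise ValueError("calls must be paired and non-empty")
    exact = within = 0
    for single, group in zip(single_calls, group_calls):
        distance = abs(RANK[single] - RANK[group])
        exact += distance == 0
        within += distance <= tolerance
    n = len(single_calls)
    return 100.0 * exact / n, 100.0 * within / n


# Hypothetical toy data: five single mouse vs. group median comparisons.
single = ["PD1", "SD", "CR", "PR", "MCR"]
group = ["PD2", "SD", "MCR", "PD2", "MCR"]
exact_pct, within_pct = prediction_rates(single, group)
print(exact_pct, within_pct)

# Arithmetic check of the overall rate reported in the text.
print(round(100 * 1604 / 2134, 2))  # 75.16
```

In the toy data, two of the five comparisons agree exactly and two more differ by one category, so the exact rate is 40% and the ± one category rate is 80%.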

The single mouse design identifies agents that induce tumor regression in specific models, and single mouse data accurately predict the objective responses observed using larger numbers of mice per group (n = 8–10). Conversely, a lack of objective response in a single mouse is highly predictive of no response in traditional study designs. Considering agents inducing CR or MCR against selected models, the single mouse approach is also highly predictive. Furthermore, the single mouse data accurately identified the tumor histologies that responded in the group study analysis. For pediatric drug development, agents inducing robust tumor regression are particularly important, and these operating characteristics for the single mouse design support its use in projects seeking to identify agents with this level of anticancer activity. For the studies analyzed, all solid and brain tumors were grown in the subcutaneous site. Consequently, the stroma may differ from that associated with the normal or orthotopic site. However, this is unlikely to influence the variation in response for individual mice. Rather, for brain tumors, it is likely that the subcutaneous models will overpredict drug responses, as drug penetration to the brain may be lower in the orthotopic site. Thus, for accurate translation, secondary testing in the orthotopic site may be necessary. However, the primary criterion for a screen is to identify agents with significant activity, and such activity can be filtered in secondary testing (e.g., in orthotopic brain models).

The advantage of the single mouse approach is in allowing a much larger number of models to be screened against agents of interest. There are increasing data to support the genetic fidelity of human tumors transplanted into immune-deficient mice. Expression profiling of 109 pediatric xenograft models with a similar number of patient samples showed that with rare exceptions xenografts clustered with the respective clinical histology (38, 39). Similarly, exome sequencing indicates more common gene mutations are recapitulated in the PPTP models. For example, the PPTP models include 4 embryonal rhabdomyosarcomas with three showing Ras-pathway activation, but each having an individual mechanism (NRAS mutant, NF1 mutation, and HRAS mutant). Each is different, and may represent a subtype that may respond differently to therapy. For alveolar rhabdomyosarcoma all 7 xenograft models are characterized by the reciprocal chromosomal translocation t(2;13) encoding the Pax3-Foxo1 chimeric transcription factor, but only one model has an activating mutation in the kinase domain of FGFR4, consistent with clinical data (47). Similar examples can be cited for each of the other solid tumor, brain tumor and ALL panels, thus documenting the heterogeneity of the models as representative of at least some of the heterogeneity of actual human tumors.

The assumption made in using larger panels of tumor models derived from a particular cancer type is that the overall response rate in preclinical models will more closely parallel response rates in the clinic. Intrinsic to this rationale is that by increasing the representation of subtypes of a diagnosis in the preclinical testing studies, it will be possible to identify particular subtypes that have “exceptional responses” to treatments. PPTP data tend to support this; for example, antibodies targeting IGF1R showed activity only in sarcoma models at a relatively low frequency but not in any other tumor types (32). Furthermore, “exceptional responders” to kinase inhibitors were identified only in models where there was an activating mutation that predisposed the tumor to drug sensitivity (29, 40). Other examples included identification of sensitivity in one or two models, where the cohort of tumor models was expanded to confirm drug activity. Examples include the activity of an MDM2 inhibitor against an infant MLL (42) where an additional six infant MLL models were used to confirm the general sensitivity of this ALL subgroup (43). Similarly, PR-104 when administered at dose levels consistent with human tolerated exposure identified T-ALL as the only sensitive tumor type, probably through selective drug activation (44). Other examples include sensitivity to temozolomide segregating with MGMT deficiency (48), or responsiveness to cisplatin and the PARP inhibitor talazoparib in a PALB2-mutated Wilms tumor (30). Thus, the proposed screening strategy can identify subgroups of potential responders and can accelerate therapy development in the clinic for specified subgroups.

Generating large panels of patient-derived xenograft models across the range of childhood solid tumors (including tumors at diagnosis and relapse) will require collaboration and model-sharing across multiple laboratories, as well as coordination with clinical trial organizations. Data generated by the PPTP clearly show that, at least in some cases, the genetic predisposition in an exceptional responder can be identified (e.g., PALB2 mutation in the response to cisplatin and talazoparib, BCR-ABL1 translocation in an ALL responding to dasatinib, or the well-established sensitivity of MGMT-deficient tumors to chloroethylating nitrosoureas or temozolomide). Conversely, the response to a particular drug may identify a previously unrecognized susceptibility for a subset of tumors. For example, the JAK1/2 inhibitor AZD1480 unexpectedly induced regression of a Wilms tumor xenograft, and subsequent testing demonstrated regression in 3 of 5 models, suggesting some underlying predisposition to this agent; the relevant target may be an off-target kinase, as ruxolitinib, another JAK1/2 inhibitor, was not active in these models (unpublished data). Thus, incorporating additional heterogeneity into the preclinical testing approach will lead to better insight into responses in humans.

Our studies identify response inconsistency as a characteristic of those models with lower predictive success. Thus, criteria for using a model with the single mouse experimental design may require running experiments using a “traditional” design for approximately 10 drugs and then analyzing the data retrospectively, as in our study. To test this, we chose 10 studies at random for each tumor line and computed the accuracy for each model, allowing for a deviation of ± one response category. The models predicted response accurately (>80% correct) in 98% of trials, with the ependymoma line (C6) being the worst model, with an overall accuracy of 60%. Thus, this relatively easy approach may assist in excluding or including a model in the screen.
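The model qualification step described above could be sketched as follows. This is a minimal Python illustration under stated assumptions: each tumor line has a list of per-study flags recording whether the single mouse call fell within ± one response category of the group median, 10 studies are drawn at random without replacement, and a model qualifies when more than 80% of the sampled predictions are correct. The function name, the seed, and the toy flags are hypothetical.

```python
import random


def qualify_model(within_one_flags, n_studies=10, threshold=80.0, seed=0):
    """Sample n_studies retrospective studies and score one tumor model.

    within_one_flags: one boolean per completed drug study, True when the
    single mouse call was within one response category of the group median.
    Returns (accuracy %, qualifies) where qualifies means accuracy > threshold.
    """
    if len(within_one_flags) < n_studies:
        raise ValueError("not enough completed studies for this model")
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(within_one_flags, n_studies)
    accuracy = 100.0 * sum(sample) / n_studies
    return accuracy, accuracy > threshold


# Hypothetical model with 40 completed studies, 36 predicted within one category.
flags = [True] * 36 + [False] * 4
accuracy, qualifies = qualify_model(flags)
print(f"accuracy: {accuracy:.0f}%, include in screen: {qualifies}")
```

Because the result depends on which 10 studies are drawn, repeating the draw with different seeds gives a sense of how stable the inclusion decision is for a given model.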

In summary, this analysis indicates that for most experiments results from a single tumor-bearing mouse accurately predict the result from a larger cohort (8–10) of mice. By assessing response with a single mouse, responsive tumor types were identified with relatively few exceptions. Furthermore, the overall response rates determined using single mouse data or group data were essentially identical. Assuming that large treatment effects are targeted (e.g., substantial tumor regression for treated animals vs. progressive disease for controls), utilization of the single mouse design may prove a feasible approach for reliably screening large numbers of models with diverse genetic characteristics.

No potential conflicts of interest were disclosed.

Conception and design: B. Murphy, E.A. Kolb, C.P. Reynolds, R.T. Kurmasheva, I. Dvorchik, J. Wu, R.B. Lock, P.J. Houghton

Development of methodology: B. Murphy, E.A. Kolb, I. Dvorchik, J. Wu, P.J. Houghton

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J.M. Maris, E.A. Kolb, R. Gorlick, M.H. Kang, S.T. Keir, R.B. Lock, P.J. Houghton

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): B. Murphy, H. Yin, J.M. Maris, E.A. Kolb, C.P. Reynolds, I. Dvorchik, J. Wu, C.A. Billups, N. Boateng, M.A. Smith, R.B. Lock, P.J. Houghton

Writing, review, and/or revision of the manuscript: J.M. Maris, E.A. Kolb, R. Gorlick, C.P. Reynolds, R.T. Kurmasheva, J. Wu, N. Boateng, M.A. Smith, R.B. Lock, P.J. Houghton

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): B. Murphy, N. Boateng, P.J. Houghton

Study supervision: P.J. Houghton

NO1-CM42216 and UO1CA199297 from the National Cancer Institute.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Goldin A, Venditti JM. Progress report on the screening program at the Division of Cancer Treatment, National Cancer Institute. Cancer Treat Rev 1980;7:167–76.
2. Goldin A, Venditti JM. The new NCI screen and its implications for clinical evaluation. Recent Results Cancer Res 1980;70:5–20.
3. Goldin A, Venditti JM. A prospective screening program: current screening and its status. Recent Results Cancer Res 1981;76:176–91.
4. Sausville EA, Burger AM. Contributions of human tumor xenografts to anticancer drug development. Cancer Res 2006;66:3351–4.
5. Gould SE, Junttila MR, de Sauvage FJ. Translational value of mouse models in oncology drug development. Nat Med 2015;21:431–9.
6. Williams SA, Anderson WC, Santaguida MT, Dylla SJ. Patient-derived xenografts, the cancer stem cell paradigm, and cancer pathobiology in the 21st century. Lab Invest 2013;93:970–82.
7. Hidalgo M, Amant F, Biankin AV, Budinska E, Byrne AT, Caldas C, et al. Patient-derived xenograft models: an emerging platform for translational cancer research. Cancer Discov 2014;4:998–1013.
8. Pickard RG, Cobb LM, Steel GG. The growth kinetics of xenografts of human colorectal tumours in immune deprived mice. Br J Cancer 1975;31:36–45.
9. Houghton JA, Taylor DM. Maintenance of biological and biochemical characteristics of human colorectal tumours during serial passage in immune-deprived mice. Br J Cancer 1978;37:199–212.
10. Houghton JA, Houghton PJ, Webber BL. Growth and characterization of childhood rhabdomyosarcomas as xenografts. J Natl Cancer Inst 1982;68:437–43.
11. Gillet JP, Calcagno AM, Varma S, Marino M, Green LJ, Vora MI, et al. Redefining the relevance of established cancer cell lines to the study of mechanisms of clinical anti-cancer drug resistance. Proc Natl Acad Sci U S A 2011;108:18708–13.
12. Hausser HJ, Brenner RE. Phenotypic instability of Saos-2 cells in long-term culture. Biochem Biophys Res Commun 2005;333:216–22.
13. Kwak EL, Bang YJ, Camidge DR, Shaw AT, Solomon B, Maki RG, et al. Anaplastic lymphoma kinase inhibition in non–small cell lung cancer. N Engl J Med 2010;363:1693–703.
14. Lynch TJ, Bell DW, Sordella R, Gurubhagavatula S, Okimoto RA, Brannigan BW, et al. Activating mutations in the epidermal growth factor receptor underlying responsiveness of non–small cell lung cancer to gefitinib. N Engl J Med 2004;350:2129–39.
15. Paez JG, Janne PA, Lee JC, Tracy S, Greulich H, Gabriel S, et al. EGFR mutations in lung cancer: correlation with clinical response to gefitinib therapy. Science 2004;304:1497–500.
16. Mack SC, Witt H, Wang X, Milde T, Yao Y, Bertrand KC, et al. Emerging insights into the ependymoma epigenome. Brain Pathol 2013;23:206–9.
17. Nagasawa DT, Trang A, Choy W, Spasic M, Yew A, Zarinkhou G, et al. Genetic expression profiles of adult and pediatric ependymomas: molecular pathways, prognostic indicators, and therapeutic targets. Clin Neurol Neurosurg 2013;115:388–99.
18. Hovestadt V, Jones DT, Picelli S, Wang W, Kool M, Northcott PA, et al. Decoding the regulatory landscape of medulloblastoma using DNA methylation sequencing. Nature 2014;510:537–41.
19. Hovestadt V, Remke M, Kool M, Pietsch T, Northcott PA, Fischer R, et al. Robust molecular subgrouping and copy-number profiling of medulloblastoma from small amounts of archival tumour material using high-density DNA methylation arrays. Acta Neuropathol 2013;125:913–6.
20. Schleiermacher G, Javanmardi N, Bernard V, Leroy Q, Cappo J, Rio Frio T, et al. Emergence of new ALK mutations at relapse of neuroblastoma. J Clin Oncol 2014;32:2727–34.
21. Kushner BH, Modak S, Kramer K, LaQuaglia MP, Yataghene K, Basu EM, et al. Striking dichotomy in outcome of MYCN-amplified neuroblastoma in the contemporary era. Cancer 2014;120:2050–9.
22. Maris JM. Recent advances in neuroblastoma. N Engl J Med 2010;362:2202–11.
23. Parham DM, Barr FG. Classification of rhabdomyosarcoma and its molecular basis. Adv Anat Pathol 2013;20:387–97.
24. Abraham J, Nunez-Alvarez Y, Hettmer S, Carrio E, Chen HI, Nishijo K, et al. Lineage of origin in rhabdomyosarcoma informs pharmacological response. Genes Dev 2014;28:1578–91.
25. Khanna C, Fan TM, Gorlick R, Helman LJ, Kleinerman ES, Adamson PC, et al. Toward a drug development path that targets metastatic progression in osteosarcoma. Clin Cancer Res 2014;20:4200–9.
26. Shern JF, Chen L, Chmielecki J, Wei JS, Patidar R, Rosenberg M, et al. Comprehensive genomic analysis of rhabdomyosarcoma reveals a landscape of alterations affecting a common genetic axis in fusion-positive and fusion-negative tumors. Cancer Discov 2014;4:216–31.
27. Mullighan CG. Genomic characterization of childhood acute lymphoblastic leukemia. Semin Hematol 2013;50:314–24.
28. Gao H, Korn JM, Ferretti S, Monahan JE, Wang Y, Singh M, et al. High-throughput screening using patient-derived tumor xenografts to predict clinical trial drug response. Nat Med 2015;21:1318–25.
29. Kolb EA, Gorlick R, Houghton PJ, Morton CL, Neale G, Keir ST, et al. Initial testing (stage 1) of AZD6244 (ARRY-142886) by the Pediatric Preclinical Testing Program. Pediatr Blood Cancer 2010;55:668–77.
30. Smith MA, Hampton OA, Reynolds CP, Kang MH, Maris JM, Gorlick R, et al. Initial testing (stage 1) of the PARP inhibitor BMN 673 by the pediatric preclinical testing program: PALB2 mutation predicts exceptional in vivo response to BMN 673. Pediatr Blood Cancer 2015;62:91–8.
31. Smith MA, Reynolds CP, Kang MH, Kolb EA, Gorlick R, Carol H, et al. Synergistic activity of PARP inhibition by talazoparib (BMN 673) with temozolomide in pediatric cancer models in the pediatric preclinical testing program. Clin Cancer Res 2015;21:819–32.
32. Kolb EA, Gorlick R, Houghton PJ, Morton CL, Lock R, Carol H, et al. Initial testing (stage 1) of a monoclonal antibody (SCH 717454) against the IGF-1 receptor by the pediatric preclinical testing program. Pediatr Blood Cancer 2008;50:1190–7.
33. Peterson JK, Houghton PJ. Integrating pharmacology and in vivo cancer models in preclinical and clinical drug development. Eur J Cancer 2004;40:837–44.
34. Houghton PJ, Morton CL, Tucker C, Payne D, Favours E, Cole C, et al. The pediatric preclinical testing program: description of models and early testing results. Pediatr Blood Cancer 2007;49:928–40.
35. Morton CL, Papa RA, Lock RB, Houghton PJ. Preclinical chemotherapeutic tumor models of common childhood cancers: solid tumors, acute lymphoblastic leukemia, and disseminated neuroblastoma. Curr Protoc Pharmacol 2007;Chapter 14:Unit14 8.
36. Friedman HS, Colvin OM, Skapek SX, Ludeman SM, Elion GB, Schold SC Jr, et al. Experimental chemotherapy of human medulloblastoma cell lines and transplantable xenografts with bifunctional alkylating agents. Cancer Res 1988;48:4189–95.
37. Liem NL, Papa RA, Milross CG, Schmid MA, Tajbakhsh M, Choi S, et al. Characterization of childhood acute lymphoblastic leukemia xenograft models for the preclinical evaluation of new therapies. Blood 2004;103:3905–14.
38. Whiteford CC, Bilke S, Greer BT, Chen Q, Braunschweig TA, Cenacchi N, et al. Credentialing preclinical pediatric xenograft models using gene expression and tissue microarray analysis. Cancer Res 2007;67:32–40.
39. Neale G, Su X, Morton CL, Phelps D, Gorlick R, Lock RB, et al. Molecular characterization of the pediatric preclinical testing panel. Clin Cancer Res 2008;14:4572–83.
40. Kolb EA, Gorlick R, Houghton PJ, Morton CL, Lock RB, Tajbakhsh M, et al. Initial testing of dasatinib by the pediatric preclinical testing program. Pediatr Blood Cancer 2008;50:1198–206.
41. Maris JM, Courtright J, Houghton PJ, Morton CL, Kolb EA, Lock R, et al. Initial testing (stage 1) of sunitinib by the pediatric preclinical testing program. Pediatr Blood Cancer 2008;51:42–8.
42. Carol H, Reynolds CP, Kang MH, Keir ST, Maris JM, Gorlick R, et al. Initial testing of the MDM2 inhibitor RG7112 by the pediatric preclinical testing program. Pediatr Blood Cancer 2013;60:633–41.
43. Richmond J, Carol H, Evans K, High L, Mendomo A, Robbins A, et al. Effective targeting of the P53-MDM2 axis in preclinical models of infant MLL-rearranged acute lymphoblastic leukemia. Clin Cancer Res 2015;21:1395–405.
44. Moradi Manesh D, El-Hoss J, Evans K, Richmond J, Toscan CE, Bracken LS, et al. AKR1C3 is a biomarker of sensitivity to PR-104 in preclinical models of T-cell acute lymphoblastic leukemia. Blood 2015;126:1193–202.
45. Houghton PJ, Morton CL, Gorlick R, Kolb EA, Keir ST, Reynolds CP, et al. Initial testing of a monoclonal antibody (IMC-A12) against IGF-1R by the pediatric preclinical testing program. Pediatr Blood Cancer 2010;54:921–6.
46. Pappo AS, Vassal G, Crowley JJ, Bolejack V, Hogendoorn PC, Chugh R, et al. A phase 2 trial of R1507, a monoclonal antibody to the insulin-like growth factor-1 receptor (IGF-1R), in patients with recurrent or refractory rhabdomyosarcoma, osteosarcoma, synovial sarcoma, and other soft tissue sarcomas: results of a Sarcoma Alliance for Research Through Collaboration study. Cancer 2014;120:2448–56.
47. Taylor JGt, Cheuk AT, Tsang PS, Chung JY, Song YK, Desai K, et al. Identification of FGFR4-activating mutations in human rhabdomyosarcomas that promote metastasis in xenotransplanted models. J Clin Invest 2009;119:3395–407.
48. Keir ST, Maris JM, Reynolds CP, Kang MH, Kolb EA, Gorlick R, et al. Initial testing (stage 1) of temozolomide by the pediatric preclinical testing program. Pediatr Blood Cancer 2013;60:783–90.