Abstract
Many promising new cancer drugs proceed through preclinical testing and early-phase trials only to fail in late-stage clinical testing. Thus, improved models that better predict survival outcomes and enable the development of biomarkers are needed to identify patients most likely to respond to and benefit from therapy. Here, we describe a comprehensive approach in which we incorporated biobanking, xenografting, and multiplexed phospho-flow (PF) cytometric profiling to study drug response and identify predictive biomarkers in acute myeloid leukemia (AML) patients. To test the efficacy of our approach, we evaluated the investigational JAK2 inhibitor fedratinib (FED) in 64 patient samples. FED robustly reduced leukemia in mouse xenograft models in 59% of cases and was also effective in limiting the protumorigenic activity of leukemia stem cells as shown by serial transplantation assays. In parallel, PF profiling identified FED-mediated reduction in phospho-STAT5 (pSTAT5) levels as a predictive biomarker of in vivo drug response with high specificity (92%) and strong positive predictive value (93%). Unexpectedly, another JAK inhibitor, ruxolitinib (RUX), was ineffective in 8 of 10 FED-responsive samples. Notably, this outcome could be predicted by the status of pSTAT5 signaling, which was unaffected by RUX treatment. Consistent with this observed discrepancy, PF analysis revealed that FED exerted its effects through multiple JAK2-independent mechanisms. Collectively, this work establishes an integrated approach for testing novel anticancer agents that captures the inherent variability of response caused by disease heterogeneity and in parallel, facilitates the identification of predictive biomarkers that can help stratify patients into appropriate clinical trials. Cancer Res; 76(5); 1214–24. ©2016 AACR.
Introduction
Historically, improvements in long-term survival of cancer patients due to new therapeutic approaches have been incremental. Promising preclinical studies or early-phase clinical trials frequently do not translate into survival benefits in phase III trials (1), implying that traditional preclinical models and endpoints in early-phase trials are insufficient surrogates for predicting long-term outcomes. Moreover, the mechanistic basis for variable treatment responses in clinical trials is often unknown and can result in rejection of a drug that may be of benefit to a subset of patients. Thus, a re-examination of traditional drug development models and parallel identification of drug response biomarkers for patient selection are required in order to improve the success rate in bringing forward new effective oncologic drugs.
Current preclinical models seldom reflect the disease state within humans. For example, new drugs are frequently screened for their antiproliferative activity against cancer cell lines in vitro. However, cell lines and in vitro cultures do not fully capture the intrinsic and extrinsic diversity of human disease. Moreover, proliferation in culture measures drug effects on the bulk population and not the cancer stem cells (CSC), which in many tumors have been linked to therapy failure and disease recurrence (2). Tumor heterogeneity is also not well modeled by (frequently nonorthotopic) injection of human cancer cell lines into mice, or even by engineered mouse models; the low variability and good reproducibility of the latter are actually disadvantageous for drug testing as they do not reflect intratumor and interpatient heterogeneity (3, 4). Xenotransplantation of primary cancer cells is currently the best functional assay for both normal and malignant adult human stem cells, and in the context of human acute myeloid leukemia (AML) reads out clinically relevant properties of repopulating cells (5–7). Numerous previous studies have evaluated the efficacy of antileukemia drugs in the setting of xenotransplantation assays (8–12); however, the number of primary patient samples tested has generally been small, thus precluding biomarker development.
Here, we describe a comprehensive approach that combines drug testing of a large cohort of primary patient samples in xenotransplantation assays with parallel phospho-flow (PF) cytometric single-cell profiling of short-term drug responsiveness in vitro to develop companion drug response biomarkers. To test this approach, we studied the efficacy of fedratinib (FED, also known as SAR302503 or TG101348), an investigational Janus kinase 2 (JAK2) inhibitor, against leukemia stem cells (LSC) in AML. JAK2 inhibitors including ruxolitinib (RUX) and FED have demonstrated efficacy in clinical trials for the treatment of myeloproliferative neoplasms (MPN; refs. 13–15), but have not been employed in AML, where activating JAK2 mutations are rare. Nevertheless, downstream STAT transcription factors are activated in the majority of AML cases (16, 17). Furthermore, high levels of phosphorylated JAK2 (pJAK2) expression have been associated with worse outcome in AML, and in vitro studies suggest that JAK2 could be a therapeutic target in this disease (18). We demonstrate here that our approach effectively captures the variability in treatment response that is generally seen in patient cohorts and allowed identification of a PF signature that correlated with drug responses in xenografts.
Materials and Methods
Xenotransplantation assay and drug studies
Xenotransplantation and in vivo drug treatment experiments were carried out in Toronto (T) and Vancouver (V) using locally optimized protocols. In pilot studies, these protocols yielded similar engraftment results at both sites for the patient samples tested. NOD.SCID (NS) and NOD.SCID-IL2Rγnull (NSG) mice were bred and housed at the University Health Network (UHN) Animal Facility (T) or the BC Cancer Research Centre Animal Resource Centre (V). Eight- to 10-week-old mice were sublethally irradiated (T: 225 cGy; V-NS: 325 cGy; V-NSG: 315 cGy) 24 hours before transplantation. NS mice (T) received 200 μg anti-CD122 mAb by subcutaneous injection immediately after irradiation. For NSG experiments, T-cell depletion was carried out by treating mice with 12.5 μg/kg anti–CD3-diphtheria toxin by i.p. injection 24 hours after AML transplantation for 2 consecutive days (V), or by using the EasySep CD3 Positive Selection Kit (StemCell Technologies) prior to intrafemoral (IF) transplantation (T). AML samples were injected IF, except for two Vancouver samples that were transplanted intravenously (i.v.) as indicated in Table 1. AML samples were transplanted at a dose of 2 to 5 × 10⁶ cells/mouse. Treatment with FED (Sanofi) or RUX (Selleck Chemicals), both at a dose of 60 mg/kg, or vehicle (0.5% methylcellulose) was given twice daily by oral gavage for 14 days starting 2 to 3 weeks after transplantation. For serial transplantation studies, equal numbers of human CD45+ cells harvested from the pooled bone marrow of FED-treated or vehicle-treated mice were injected into untreated secondary recipients, and engraftment was evaluated 10 to 12 weeks after transplantation. For cytarabine combination studies, mice were treated with cytarabine (80 mg/kg/d, i.p.) for 5 days prior to FED or vehicle treatment. For dasatinib (DAS) combination studies, mice received 60 mg/kg FED twice daily, 50 mg/kg DAS once daily, or both for 2 weeks by oral gavage.
For all drug studies, mice were sacrificed the day after the final dose, and the level of human leukemic engraftment in the injected femur and noninjected bones (other femur plus two tibias) was evaluated by flow cytometry using human-specific mAbs (for details, see Supplementary Methods).
Site | Patient ID | FAB | Sample | Injected RF: vehicle, mean engraftment (%) | Injected RF: FED, mean engraftment (%) | Injected RF: P value | Injected RF: RR (%) | Noninjected BM: vehicle, mean engraftment (%) | Noninjected BM: FED, mean engraftment (%) | Noninjected BM: P value | Noninjected BM: RR (%) | In vivo responseᵃ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
T | 1 | M1 | Relapse | 90.7 | 7.2 | 0.001 | 92 | 67.9 | 5.8 | 0.001 | 91 | R |
T | 2 | M0 | Diagnosis | 21.6 | 1.2 | 0.001 | 94 | 5.9 | 0.3 | 0.004 | 95 | R |
T | 3 | Unclassified | Diagnosis | 50.9 | 7.4 | 0.002 | 85 | 19.9 | 3.6 | 0.001 | 82 | R |
T | 4 | M4 | Relapse | 34.3 | 6.3 | 0.008 | 82 | 7.7 | 1.5 | 0.002 | 81 | R |
T | 5 | Unclassified | Diagnosis | 36.4 | 8.6 | 0.002 | 76 | 15.5 | 4.1 | 0.001 | 74 | R |
T | 6 | M4 | Relapse | 17.1 | 4.2 | 0.001 | 75 | 6.8 | 0.5 | 0.001 | 92 | R |
T | 7 | ND | Diagnosis | 76.9 | 19.3 | 0.001 | 75 | 46.8 | 7.9 | 0.001 | 83 | R |
T | 8 | M4 | Relapse | 75.7 | 19.4 | 0.001 | 74 | 50.9 | 9.3 | 0.001 | 82 | R |
V | 9 | M5 | Diagnosis | i.v. | i.v. | i.v. | i.v. | 27.4 | 6.7 | 0.0001 | 76 | R |
V | 10 | M4 | Diagnosis | i.v. | i.v. | i.v. | i.v. | 60.5 | 18.3 | 0.0001 | 70 | R |
T | 11 | M5a | Diagnosis | 22.6 | 7.3 | 0.004 | 68 | 13.5 | 3.7 | 0.001 | 73 | R |
T | 12 | ND | Diagnosis | 86.9 | 28.3 | 0.001 | 67 | 40.4 | 9.6 | 0.001 | 76 | R |
T | 13 | M5 | Diagnosis | 36.6 | 12.7 | 0.02 | 65 | 3.6 | 1.8 | 0.05 | 50 | R |
V | 14 | M0 | Diagnosis | 18.7 | 7.1 | 0.0001 | 62 | 8.5 | 1.8 | 0.004 | 79 | R |
T | 15 | M5b | Diagnosis | 25.6 | 10.0 | 0.01 | 61 | 11.5 | 3.6 | 0.01 | 69 | R |
V | 16 | ND | Diagnosis | 66.1 | 27.4 | 0.0008 | 59 | 6.7 | 4.7 | NS | 30 | R |
T | 17 | M1 | Diagnosis | 27.9 | 12.4 | 0.05 | 56 | 0.6 | 0.2 | NS | 67 | R |
T | 18 | M5 | Diagnosis | 28.0 | 15.2 | 0.001 | 46 | 15.2 | 8.1 | 0.01 | 47 | PR |
V | 19 | M2 | Diagnosis | 55.7 | 32.5 | NS | 42 | 12.3 | 5.3 | 0.044 | 57 | PR |
V | 20 | M4 | Diagnosis | 84.9 | 64.1 | 0.009 | 24 | 57.1 | 33.7 | 0.045 | 41 | PR |
T | 21 | M2 | Diagnosis | 86.1 | 84.5 | NS | 2 | 53.7 | 37.1 | 0.03 | 31 | PR |
V | 22 | M0 | Diagnosis | 93.6 | 93.7 | NS | 0 | 90.9 | 32.3 | 0.0001 | 64 | PR |
V | 23 | M5b | Diagnosis | 17.9 | 3.3 | NS | 82 | 2.8 | 1.3 | NS | 54 | NR |
T | 24 | ND | Relapse | 41.0 | 26.5 | NS | 35 | 9.6 | 1.7 | NS | 82 | NR |
V | 25 | M4 | Diagnosis | 24.6 | 19.4 | NS | 21 | 5.2 | 2.5 | 0.017 | 52 | NR |
V | 26 | M4Eo | Diagnosis | 16.0 | 13.6 | NS | 15 | 1.0 | 1.5 | NS | −56 | NR |
V | 27 | M4 | Diagnosis | 25.7 | 22.7 | NS | 12 | 2.8 | 1.3 | NS | 54 | NR |
T | 28 | M2 | Diagnosis | 68.0 | 64.9 | NS | 5 | 25.1 | 5.6 | NS | 78 | NR |
V | 29 | M4 | Diagnosis | 20.7 | 20.2 | NS | 2 | 2.3 | 3.0 | NS | −30 | NR |
T | 30 | Unclassified | Diagnosis | 82.2 | 80.9 | NS | 2 | 84.3 | 81.0 | NS | 4 | NR |
T | 31 | M4 | PD | 96.4 | 95.0 | NS | 1 | 92.3 | 80.7 | 0.001 | 13 | NR |
V | 32 | M4 | Diagnosis | 98.4 | 97.8 | NS | 1 | 98.7 | 87.8 | 0.014 | 11 | NR |
V | 33 | M1 | Diagnosis | 79.8 | 82.2 | NS | −3 | 55.8 | 32.4 | NS | 42 | NR |
V | 34 | M4Eo | Diagnosis | 46.2 | 52.0 | NS | −13 | 14.8 | 19.3 | NS | −30 | NR |
Abbreviations: FAB, French-American-British; ND, not determined; PD, persistent disease; NS, not statistically significant.
ᵃ In vivo response criteria: R, >50% RR in RF; PR, 20% to 50% RR in RF or >20% RR in BM only; NR, no significant difference between FED- and vehicle-treated mice (NS) or <20% RR in both RF and BM.
Definition of drug response for xenotransplantation studies
For in vivo drug studies, response was defined by the relative reduction (RR) in human leukemic engraftment in drug-treated versus vehicle-treated mice. RR, expressed as a percentage, was calculated as [(mean % engraftment of vehicle-treated mice) − (mean % engraftment of drug-treated mice)]/(mean % engraftment of vehicle-treated mice) × 100. We distinguished effects in the injected right femur (RF) from those in noninjected bones (BM) because leukemic burden is usually higher in the injected RF, so a significant reduction in leukemic engraftment is more difficult to achieve in the RF than in noninjected BM. Patients were classified as responders (R) if RR in the RF was >50%, partial responders (PR) if we observed 20% to 50% RR in the RF or >20% RR in the BM only, and nonresponders (NR) if there was no statistically significant difference in engraftment levels between vehicle- and drug-treated mice or RR was <20% in both RF and BM.
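As an illustration only, the RR calculation and the response categories can be written as a short Python sketch. The function names, the boolean significance flags, and the handling of boundary values are assumptions made for this example and are not the authors' analysis code; intravenously transplanted samples, which have no injected femur, are not handled. The worked values are taken from Table 1.

```python
def relative_reduction(vehicle_mean, drug_mean):
    """Return RR (%) from mean % engraftment in vehicle- and drug-treated mice."""
    return (vehicle_mean - drug_mean) / vehicle_mean * 100.0


def classify_response(rf_rr, rf_significant, bm_rr, bm_significant):
    """Assign R, PR, or NR from RR (%) and significance in RF and BM.

    Assumption: a reduction only counts when the corresponding
    vehicle-versus-drug comparison is statistically significant.
    """
    if rf_significant and rf_rr > 50:
        return "R"
    if rf_significant and 20 <= rf_rr <= 50:
        return "PR"
    if bm_significant and bm_rr > 20:
        return "PR"
    return "NR"


# Patient 1 (Table 1): RF 90.7 -> 7.2, BM 67.9 -> 5.8, both significant
rf = relative_reduction(90.7, 7.2)
bm = relative_reduction(67.9, 5.8)
print(round(rf), round(bm), classify_response(rf, True, bm, True))

# Patient 18 (Table 1): RF 28.0 -> 15.2, BM 15.2 -> 8.1, both significant
rf = relative_reduction(28.0, 15.2)
bm = relative_reduction(15.2, 8.1)
print(round(rf), round(bm), classify_response(rf, True, bm, True))
```

This is a literal reading of the stated criteria. A few table entries (for example, patient 25) are listed as NR despite a significant reduction in noninjected bones, so the sketch should not be expected to reproduce every published classification.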
PF cytometric analysis
AML patient samples tested in vivo were subjected to PF analysis following short-term drug treatment in vitro. Viably frozen samples were thawed and serum starved for 1 hour at 37°C, then treated with DMSO (vehicle), FED (100 nmol/L), AC220 (5 nmol/L; Selleck Chemicals), RUX (300 nmol/L), or DAS (100–200 nmol/L; Toronto Research Chemicals) for another hour. During the last 30 minutes, cells were incubated with viability dye, then fixed, washed, and permeabilized. Phosphomarker and extracellular staining was carried out for 30 minutes with optimized concentrations of antibodies (for details, see Supplementary Methods). Data were acquired on a BD LSRFortessa and analyzed using FlowJo and Cytobank software (http://cytobank.org/; ref. 19). Ba/F3 cells were obtained from R. Rottapel in 2006. This IL3-dependent hematopoietic cell line remains exquisitely IL3 dependent for survival and proliferation, as assessed by cell viability assays following IL3 withdrawal (ongoing). OCI-AML5 cells were obtained from M.D. Minden in 2010. This patient-derived AML cell line was authenticated by short-tandem repeat analysis in 2014 at the Centre for Applied Genomics (Hospital for Sick Children, Toronto, Canada).
Statistical analysis
Sixty-four independent patient samples were used in the study to capture the diversity of AML. Comparison of engraftment in drug- versus vehicle-treated mice was performed using two-tailed t tests. For PF analysis, two-group comparisons were performed using the Mann–Whitney U test. Correlations were assessed by two-tailed Spearman correlation. All data were analyzed with GraphPad Prism software, version 5.0, for Mac OS X.
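For readers unfamiliar with the rank-based statistics named above, the following Python sketch shows how the Mann–Whitney U statistic and the Spearman correlation coefficient are obtained from ranks. It is a teaching example under stated assumptions, not the GraphPad Prism procedure used in the study: the function names and input values are hypothetical, and P values are not computed.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) from the rank sums of the pooled observations."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical marker levels in two groups of samples (not study data)
print(mann_whitney_u([0.2, 0.5, 0.9], [1.4, 1.8, 2.6]))
# Hypothetical paired measurements (not study data)
print(spearman_rho([0.2, 1.5, 0.9, 2.4, 0.1], [0.1, 1.1, 0.6, 2.0, 0.0]))
```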
Study approval
Peripheral blood cells were collected from patients with newly diagnosed or relapsed AML at the Princess Margaret Cancer Centre or Vancouver General Hospital after obtaining informed consent according to procedures approved by the UHN and University of British Columbia (UBC) Research Ethics Boards. Thawed viably frozen samples were prescreened for engraftment ability in xenotransplanted mice, and engrafting samples were used in drug studies. All animal experiments were performed in accordance with institutional guidelines approved by the UHN or UBC Animal Care Committee.
Results
FED targets LSCs in primary AML xenografts with heterogeneous responses
We tested the potential efficacy of FED in AML in an initial cohort of 34 patient samples obtained at diagnosis or relapse and representing multiple cytogenetic and molecular subtypes (Supplementary Table S1). Samples were transplanted into cohorts of immune-deficient mice (n = 5–8/group); following a 2- to 3-week engraftment period, mice were treated with FED or vehicle control for another 2 weeks (Fig. 1A). For 17 of 34 samples, leukemic engraftment in the injected femurs was 56% to 94% lower in FED-treated relative to vehicle-treated mice (P < 0.05, Table 1 and Fig. 1B). These samples also showed a 30% to 95% RR in leukemic engraftment of noninjected bones (P < 0.05). Five additional samples responded less robustly (<50% RR in injected femur and 31%–64% RR in noninjected bones; P < 0.05; Table 1). These 22 samples were classified as xenograft responders (X-R). By contrast, FED had a small or negligible effect on the leukemic graft in 12 of 34 samples (<20% RR, P < 0.05 or any RR, P > 0.05); these were classified as xenograft nonresponders (X-NR; Table 1 and Fig. 1C).
We also evaluated whether treatment with a standard agent such as cytarabine could potentiate the effects of FED, using samples from 3 AML patients that were partial- or nonresponders to FED alone (Supplementary Fig. S1A). In 2 of 3 samples tested, cytarabine+FED treatment significantly reduced leukemia burden in treated mice compared with either drug alone (Supplementary Fig. S1B). These findings suggest that combining the targeted therapeutic FED with cytarabine may increase overall efficacy.
To evaluate whether FED targets LSCs, AML cells were harvested from primary mice and transplanted into untreated secondary recipients; two X-NR and seven X-R samples were evaluable (cells from vehicle-treated primary mice generated >10% mean engraftment levels in secondary mice). For the two X-NR samples, the FED-treated and control groups yielded similar engraftment levels in secondary mice (Fig. 1D, left and Supplementary Fig. S1C). By contrast, in five of seven X-R samples, the FED-treated group gave rise to smaller grafts compared with controls (Fig. 1D, right and Supplementary Fig. S1C), although this only reached statistical significance in two samples due to small numbers of transplanted mice. These results suggest that in responding samples, FED treatment may impair the function and/or survival of LSCs exposed to drug in the primary mice, although variable sensitivity was observed as in the primary mice. Overall, by testing a large cohort of patient samples in a clinically relevant model, we evaluated drug effects against LSCs and captured the heterogeneous treatment response that is often seen in clinical trials.
FED-sensitive pSTAT5 provides a biomarker of in vivo response to FED
Given the observed heterogeneity of response in xenograft assays, we sought to identify a biomarker of FED responsiveness. As FED is a tyrosine kinase inhibitor, we used a multiplexed PF cytometry assay (20) to profile, with single-cell resolution, the impact of short-term (30–60 minutes) in vitro FED treatment on the basal activity of the SYK, BCR, JAK/STAT, MAPK, PI3K/mTOR, and NF-κB signaling pathways in primary AML blasts (Supplementary Fig. S2A–S2C). We used 100 nmol/L FED in these studies, because this concentration greatly reduced IL3-induced STAT5 phosphorylation in the FLT3L-responsive OCI-AML5 cell line (Supplementary Fig. S2D). As expected (5), the AML samples in this cohort displayed highly variable expression of CD34, CD45, CD123, and CD33 (Supplementary Fig. S3A and S3B). Therefore, we costained each sample with antibodies directed against these markers to evaluate signaling in immunophenotypic cell subsets within individual samples. Basal levels of MAPK p38, pSTAT5, pAKT(T), pSRC, and IκBα showed the highest variance across this cohort (Fig. 2A, top). As would be expected, a 60-minute treatment with FED in vitro decreased basal levels of pSTAT5 to a greater extent than other phosphoproteins (Fig. 2A, bottom). Interestingly, not all samples with high basal pSTAT5 levels were FED-responsive in the PF assay (R² = 0.54), but samples with the highest basal levels of pSTAT5 showed the greatest FED-mediated decrease in pSTAT5 (Supplementary Fig. S3C). In most cases, only a fraction of the AML blast population showed elevated basal pSTAT5; the level of CD34 expression on pSTAT5hi cells was highly variable (Supplementary Fig. S3D). Thus, we identified considerable signaling heterogeneity within and between AML samples in the cohort.
By comparing the in vivo drug response and PF data for each patient sample, we found that basal pSTAT5 levels were significantly higher in X-R compared with X-NR samples (Fig. 2B). FED treatment also led to a significantly greater decrease in basal pSTAT5 levels in X-R relative to X-NR samples (Fig. 2B), suggesting that the PF assay represents a biomarker of AML responsiveness to FED in vivo. This conclusion was supported by unsupervised clustering and heatmap analysis of FED-mediated decreases in basal levels of 12 phosphoproteins, which divided the cohort into two major groups (Fig. 2C). One cluster contained primarily X-R samples (14/15) that showed robust FED-mediated decreases in pSTAT5 [0.47 to 2.78 fold change (FC), log2 scale]; there were inconsistent but significant decreases in several other phosphoproteins in this sample cluster following FED treatment. The second cluster contained a mixture of X-R and X-NR samples where FED treatment did not cause changes in the measured phosphoproteins. Considering all of the X-R samples in this initial patient cohort, 64% (14/22) exhibited ≥0.4-fold (log2) FED-mediated decrease in pSTAT5 (Supplementary Fig. S3E), whereas the remaining X-R samples lacked this biomarker, suggesting at least two different mechanisms of response to FED. Importantly, 92% (11/12) of X-NR samples did not show loss of pSTAT5 when treated with FED. This assay thus provides a highly specific predictor of treatment sensitivity to FED in vivo with a high positive predictive value (14/15 = 93%).
To confirm these findings, we tested the suitability of FED-sensitive decrease in pSTAT5 as a biomarker of in vivo response to FED in an independent validation cohort of 30 additional AML patient samples. In this cohort, 16 and 14 samples were classified as X-R and X-NR, respectively, based on RR of leukemic engraftment in treated xenotransplanted mice (Table 2 and Supplementary Fig. S4). A FED-mediated reduction in pSTAT5 signaling was observed in 9 of 16 X-R and was absent in 12 of 14 X-NR patients, giving a sensitivity of 56% and specificity of 86% for this response biomarker in the validation cohort. Collectively, these data indicate that FED-sensitive pSTAT5 may provide a phosphoproteomic biomarker to identify AML patients whose leukemia cells are likely to respond to FED treatment in the in vivo model.
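The sensitivity, specificity, and positive predictive value quoted for the two cohorts follow directly from the reported counts. A minimal Python sketch of that arithmetic is shown here; the function name and the mapping of counts to true/false positives and negatives are this example's own framing (biomarker present in an X-R sample counts as a true positive).

```python
def biomarker_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, and PPV as fractions.

    tp: X-R samples with the FED-sensitive pSTAT5 biomarker
    fn: X-R samples without it
    tn: X-NR samples without it
    fp: X-NR samples with it
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# Initial cohort: biomarker in 14 of 22 X-R; absent in 11 of 12 X-NR
initial = biomarker_metrics(tp=14, fn=8, tn=11, fp=1)
# Validation cohort: biomarker in 9 of 16 X-R; absent in 12 of 14 X-NR
validation = biomarker_metrics(tp=9, fn=7, tn=12, fp=2)

for name, metrics in (("initial", initial), ("validation", validation)):
    print(name, {key: f"{value:.0%}" for key, value in metrics.items()})
```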
Site | Patient ID | FAB | Sample | Injected RF: vehicle, mean engraftment (%) | Injected RF: FED, mean engraftment (%) | Injected RF: P value | Injected RF: RR (%) | Distal BM: vehicle, mean engraftment (%) | Distal BM: FED, mean engraftment (%) | Distal BM: P value | Distal BM: RR (%) | In vivo responseᵃ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
T | 35 | M4 | Diagnosis | 45.3 | 8.6 | 0.001 | 81 | 24.8 | 7.3 | 0.01 | 71 | R |
T | 36 | M5a | Diagnosis | 20.4 | 4.0 | 0.001 | 80 | 6.9 | 3.3 | NS | 52 | R |
T | 37 | Unclassified | Diagnosis | 39.6 | 8.6 | 0.001 | 78 | 23.4 | 4.0 | 0.001 | 83 | R |
T | 38 | M5 | Diagnosis | 42.4 | 9.7 | 0.001 | 77 | 15.4 | 3.7 | 0.02 | 76 | R |
V | 39 | M4 | Diagnosis | 51.4 | 12.6 | 0.0001 | 75 | 4.3 | 0.7 | 0.029 | 84 | R |
T | 40 | M4 | Diagnosis | 30.0 | 8.8 | 0.003 | 71 | 16.2 | 5.6 | 0.001 | 65 | R |
T | 41 | M5a | Diagnosis | 27.1 | 10.9 | 0.005 | 60 | 6.2 | 2.0 | 0.03 | 68 | R |
T | 42 | M5a | Diagnosis | 21.5 | 9.1 | 0.001 | 58 | 16.0 | 8.0 | 0.01 | 50 | R |
T | 43 | Unclassified | Diagnosis | 79.5 | 39.8 | 0.001 | 50 | 52.2 | 9.3 | 0.001 | 82 | R |
V | 44 | M4Eo | Diagnosis | 25.5 | 13.9 | 0.024 | 46 | 12.9 | 4.5 | 0.001 | 65 | PR |
T | 45 | M5a | Diagnosis | 72.4 | 45.1 | 0.06 | 38 | 50.2 | 16.6 | 0.001 | 67 | PR |
V | 46 | Unclassified | Diagnosis | 56.8 | 35.8 | 0.0057 | 37 | 15.5 | 6.4 | 0.013 | 59 | PR |
T | 47 | M1 | Diagnosis | 48.7 | 30.9 | 0.03 | 37 | 13.8 | 9.8 | NS | 29 | PR |
T | 48 | M0 | Diagnosis | 42.6 | 28.4 | NS | 33 | 9.8 | 5.1 | 0.05 | 48 | PR |
T | 49 | M4 | Diagnosis | 97.8 | 78.9 | 0.001 | 19 | 88.4 | 55.9 | 0.001 | 37 | PR |
T | 50 | Unclassified | Diagnosis | 72.7 | 63.4 | NS | 13 | 63.8 | 49.1 | 0.04 | 23 | PR |
V | 51 | M5 | Diagnosis | 44.2 | 19.5 | NS | 56 | 71.8 | 61.7 | NS | 14 | NR |
V | 52 | M2 | Diagnosis | 19.0 | 10.4 | NS | 45 | 5.4 | 2.0 | 0.017 | 64 | NR |
T | 53 | M5b | Diagnosis | 54.3 | 43.4 | NS | 20 | 36.8 | 24.8 | NS | 33 | NR |
V | 54 | Unclassified | Diagnosis | 24.4 | 20.0 | NS | 18 | 24.1 | 17.1 | NS | 29 | NR |
T | 55 | Unclassified | Diagnosis | 78.0 | 68.3 | NS | 12 | 55.5 | 49.5 | NS | 11 | NR |
T | 56 | M1 | Diagnosis | 92.2 | 81.0 | NS | 12 | 55.9 | 53.7 | NS | 4 | NR |
T | 57 | Unclassified | Diagnosis | 46.9 | 42.2 | NS | 10 | 39.8 | 31.0 | NS | 22 | NR |
T | 58 | M5 | Diagnosis | 75.1 | 71.0 | NS | 5 | 53.5 | 39.9 | NS | 25 | NR |
V | 59 | M4Eo | Diagnosis | 43.4 | 41.2 | NS | 5 | 8.6 | 7.6 | NS | 11 | NR |
T | 60 | M1 | Relapse | 93.1 | 90.1 | NS | 3 | 76.7 | 65.2 | NS | 15 | NR |
T | 61 | M1 | Diagnosis | 75.6 | 73.6 | NS | 3 | 20.4 | 25.6 | NS | −25 | NR |
T | 62 | Unclassified | PD | 95.8 | 93.3 | 0.001 | 3 | 89.4 | 79.6 | 0.05 | 11 | NR |
V | 63 | M1 | Diagnosis | 74.1 | 72.9 | NS | 2 | 69.7 | 58.6 | 0.014 | 16 | NR |
V | 64 | M4 | Diagnosis | 30.0 | 36.7 | NS | −22 | 38.4 | 49.4 | NS | −29 | NR |
Abbreviations: FAB, French-American-British; PD, persistent disease; NS, not statistically significant.
ᵃ In vivo response criteria: R, >50% RR in RF; PR, 20% to 50% RR in RF or >20% RR in BM only; NR, no significant difference between drug- and vehicle-treated mice (NS) or <20% RR in both RF and BM.
To examine whether the effects of FED against LSCs in AML were indeed mediated by JAK2 inhibition, we tested the efficacy of RUX, a JAK1/2 inhibitor approved for the treatment of MPNs, against 10 AML samples classified as FED X-R. Surprisingly, eight of ten samples did not respond to RUX treatment in xenograft assays, and the remaining two showed only a partial response (Fig. 3A and Table 3). Furthermore, in vitro treatment with RUX had minimal or no impact on pSTAT5 levels, in marked contrast with strong FED-mediated effects. As cytokine-induced JAK2 activation is accompanied by autophosphorylation of Y1007 and Y1008 (Fig. 3B; ref. 21), we also examined the abundance of pJAK2Y1007/Y1008 in samples from both cohorts. Interestingly, when detected, pJAK2 was predominantly expressed in CD34−CD45hi AML blasts, and levels were not affected by in vitro treatment with FED (Fig. 3C). Samples with FED-sensitive pSTAT5 either had little pJAK2 (Fig. 3C, AML43), or expressed pJAK2 and pSTAT5 in phenotypically distinct cellular subsets (Fig. 3C, AML50), suggesting that STAT5 phosphorylation was JAK2 independent in most cases. Finally, samples with the highest basal pJAK2 typically had low pSTAT5 (Fig. 3C, AML53), suggesting that pJAK2 did not induce STAT5 phosphorylation in these samples. Collectively, these data suggest that FED exerts its effects in AML primarily through JAK2-independent mechanisms.
Sample ID | FLT3-ITD | In vivo response: FED | In vivo response: RUX | Log2 FC drug/veh pSTAT5: FED | Log2 FC drug/veh pSTAT5: RUX |
---|---|---|---|---|---|
AML48 | + | PR | NR | −1.22 | −0.2 |
AML37 | + | R | NR | −1.52 | −0.48 |
AML38 | + | R | PR | −0.52 | −0.07 |
AML40 | + | R | NR | −0.76 | 0.15 |
AML2 | + | R | NR | −2.09 | ND |
AML12 | + | PR | NR | −3.03 | −0.44 |
AML43 | + | R | NR | −1.6 | −0.21 |
AML1 | − | R | NR | −0.35 | −0.01 |
AML36 | − | R | NR | −0.03 | 0.04 |
AML7 | − | R | PR | −0.56 | −0.64 |
Abbreviation: ND, not determined.
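As a reading aid for the pSTAT5 columns, a log2 fold change (FC) of drug over vehicle converts to a percent change as (2^FC − 1) × 100. The sketch below is an illustration only and is not part of the original analysis. The helper name `log2fc_to_percent_change` is hypothetical. The input strings mirror the table entries, including the typographic minus sign (U+2212), which has to be normalized before parsing.

```python
def log2fc_to_percent_change(value: str) -> float:
    """Convert a log2(drug/vehicle) table entry into a percent change."""
    log2_fc = float(value.replace("\u2212", "-"))  # typographic minus -> ASCII
    return (2.0 ** log2_fc - 1.0) * 100.0


# FED and RUX entries for AML12, as they appear in the table.
fed = log2fc_to_percent_change("\u22123.03")
rux = log2fc_to_percent_change("\u22120.44")
print(f"FED: {fed:.0f}%  RUX: {rux:.0f}%")
```

On this scale, the AML12 value of −3.03 for FED corresponds to a drop of roughly 88% in pSTAT5 signal relative to vehicle, whereas the RUX value of −0.44 corresponds to a drop of about 26%.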
Although FED was developed as a JAK2 inhibitor, it has significant effects on other tyrosine kinases, as do other tyrosine kinase inhibitors. Specifically, FED also inhibits FLT3 (Supplementary Fig. S5A; ref. 22), albeit with 5-fold lower potency than for JAK2 (23). Therefore, we examined whether the observed FED effects might be mediated by FLT3 inhibition, because activating FLT3-ITD mutations are common in AML. In accordance with studies showing that FLT3-ITD causes STAT5 activation (24), basal pSTAT5 levels were significantly higher in FLT3-ITD+ compared with FLT3-ITD− AML samples (Supplementary Fig. S5B), as previously reported (25). Treatment in vitro with the FLT3 inhibitor AC220 or FED decreased pSTAT5 to a greater extent in FLT3-ITD+ samples (Supplementary Fig. S5B and S5C). Nonetheless, FED also robustly decreased pSTAT5 levels in 3 FLT3-ITD− AML samples (Supplementary Fig. S5C), and 15/38 X-R samples lacked FLT3-ITD (Supplementary Fig. S5D). These data argue that in vitro and in vivo responses to FED are not restricted to FLT3-ITD+ samples.
Identification of subpopulations with distinct pSRC/pSTAT5 signaling and drug sensitivity
Many AML samples from both cohorts exhibited small FED-mediated reductions in basal pSRC levels (Fig. 2A and Supplementary Fig. S4A), suggesting that inhibition of SRC signaling might partly account for FED's efficacy in vivo. Indeed, prior reports have suggested that several SRC family kinases are aberrantly activated in AML (26, 27). In vitro treatment with dasatinib (DAS), a dual SRC/ABL kinase inhibitor, robustly decreased SRC phosphorylation in our initial patient cohort (Fig. 4A). Interestingly, most AML samples contained phenotypically defined cellular subsets that exhibited distinct signaling profiles, including a discrete pSRChi subset that was typically CD45hiCD34−/lo (Fig. 4B) and CD33hi (data not shown), and a pSTAT5hi subset that was CD45lo (CD34+ or CD34−). In most cases, pSTAT5 levels were robustly decreased by in vitro treatment with FED but not DAS, whereas pSRC was reduced to a greater extent by DAS than FED (Fig. 4B). One exception was sample AML21, in which pSTAT5 and pSRC, although clearly present in different subsets, were both more sensitive to inhibition by DAS than by FED (Supplementary Fig. S6). Indeed, DAS treatment reduced global tyrosine phosphorylation in this sample; of note, this sample contains a BCR-ABL1 gene rearrangement. This unusual PF profile was also observed in six chronic phase CML samples tested (Supplementary Fig. S6 and data not shown).
The identification of distinct cell populations with differential pSTAT5/pSRC signaling and the prediction from in vitro analysis that FED and DAS might be efficacious when used together prompted us to test the efficacy of FED+DAS combination therapy in vivo with 3 patient samples (Fig. 4C–E and data not shown). Response to single-agent therapy (either FED or DAS) in xenotransplanted mice was concordant with drug sensitivity predicted by PF analysis of in vitro treatment effects (Fig. 4B and D), and the combination of FED+DAS was more effective in reducing leukemic engraftment in treated mice compared with either agent alone. In addition, although a limited number of mice were tested, serial transplantation of AML8 (a relapse sample) suggested that combination therapy may more effectively impair LSC function compared with the single agents (Fig. 4E). Thus, combining multiparameter PF analysis with in vivo drug testing of primary patient samples enabled the rational design and evaluation of a combination regimen to target LSCs in an animal model of AML.
Discussion
Here, we developed an approach that combines drug testing of large numbers of primary patient samples with multiparameter PF analysis. Using FED as a prototype, we demonstrated heterogeneity of treatment response in our xenograft model, with efficacy observed in 38 of 64 (59%) samples. In parallel, we identified a drug response biomarker that predicts which patients are likely to benefit from FED treatment based on the responses observed in xenografts. To our knowledge, this is the largest patient cohort tested in xenografts with a single drug. This scale of testing allowed inclusion of patients with refractory/relapsed disease and heterogeneous molecular and cytogenetic abnormalities. The similar concordance of in vivo drug response with PF response in Toronto and Vancouver points to the robustness and feasibility of our approach for future preclinical drug development in AML.
Although xenograft models are often employed in anticancer drug development, timing and sample availability generally preclude contemporaneous analysis of patients in a clinical trial and xenografts derived from the same patients' samples, leading to uncertainty as to the validity of the latter for predicting treatment response. In the absence of such paired analyses, characterization of a predictive biomarker derived from xenograft treatment responses represents an ideal means to correlate the heterogeneous responses seen in the two settings. Validation of predicted drug responses in patients would strongly support the utility and applicability of large-scale xenografting for preclinical drug development, which if carried out in parallel with studies to identify response biomarkers as described here, will increase the success rate of anticancer drugs that proceed to trial, especially for drugs that may be beneficial to only a subset of patients.
In the case of AML, this approach can only be applied to the approximately 50% of patient samples that can generate xenografts using current methods, which may raise questions about the universality of drug-testing results. However, AML patients whose cells are engraftment-capable have worse outcomes (Kennedy and colleagues; unpublished data; ref. 7), and thus are in greatest need. Furthermore, there is substantive evidence that the properties of LSCs assayed in AML xenotransplant models have clinical relevance across a broader set of samples beyond those that can generate xenografts (5–7). For example, a functionally defined LSC-specific gene expression signature is highly prognostic in multiple independent AML cohorts (5, 6), suggesting that common pathways that are operative within LSC-enriched (i.e., xenograft-initiating) cell fractions are linked to outcome in all patients. This link also implies that, regardless of a patient's mutational spectrum and even when the subclonal composition of a xenograft does not reflect the dominant leukemic clone in a patient (28), the common stemness pathways that govern engrafting cells represent potentially relevant therapeutic targets. Thus, drugs that impair leukemic engraftment may affect outcomes, a possibility that must ultimately be tested in clinical trials.
Patient-derived tissues that have been serially propagated as xenografts (distinct from primary patient samples) have been used previously by the Pediatric Preclinical Testing Program (29) and others (30) to test a panel of therapeutic agents against a limited number of individual tumors. Although effective in screening potential new drugs, this approach neither captures interpatient tumor heterogeneity nor allows for identification of drug response biomarkers. Furthermore, xenografts that have been extensively propagated may not mirror the patient's tumor accurately. The largest study in a single tumor type examined the efficacy of sagopilone against a panel of 22 primary non–small cell lung cancer (NSCLC) xenografts (31). Fifty percent of samples exhibited partial responses defined by RECIST (32), but only 5% of patients had a partial response in the clinical trial (33). The incongruence between success at preclinical stages and failure in clinical trials seen in this and other studies may be explained by the inability of previous preclinical models of NSCLC and other tumors both to reflect disease heterogeneity and to read out CSC function (2). Drugs that reduce tumor bulk may not necessarily have activity against CSCs, and vice versa. As evaluation of drug effects against CSCs in clinical trials is challenging, it is important to evaluate CSC function in well-characterized preclinical models of human cancer, of which the AML xenograft is the best example.
The use of a large number of primary AML samples in our study enabled us to capture the response heterogeneity that is commonly seen in the clinic. The mechanistic basis for variability in treatment responses in clinical trials is often unknown but is likely related at least in part to disease heterogeneity among patients. Both inter- and intrapatient tumor heterogeneity exist on functional and genetic levels, and cell lines or even a small number of patient-derived xenografts do not sufficiently capture this heterogeneity. This precludes identification of response biomarkers and confounds efforts to understand mechanisms of action of candidate drugs and correlate these to response. For example, treatment with AZD1480, an ATP-competitive inhibitor of JAK kinases, reduced leukemic engraftment in xenograft studies of a small number of AML samples (n = 7; ref. 34). In vitro analysis of 48 AML samples demonstrated inhibition of both pSTAT3 and pSTAT5 in the phenotypically defined CD34+ LSC/progenitor blast population, but the degree of inhibition was quite variable (0%–100%). Thus, it is difficult to draw firm conclusions about whether inhibition of pSTAT3/5 signaling is the critical mechanism by which AZD1480 mediates its in vivo effects against LSCs. In our study, PF profiling provided some mechanistic insights into how FED exerts its anti-LSC effects in vivo; specifically, that FED is not acting solely through inhibition of JAK2 or FLT3 activity. Unexpectedly, we observed very few in vivo responses to another JAK inhibitor (RUX), highlighting the fact that tyrosine kinase inhibitors usually have multiple specificities (23), and that the mechanism of action against LSCs may not be related to their most potent inhibitory activity. The discordance between the in vivo and in vitro responses observed with FED and RUX also demonstrates that response biomarkers are likely tied to drug-specific actions and cannot be extrapolated to a drug class. Parallel assessment of in vivo response and in vitro analysis of affected pathways also provides an opportunity for uncovering drug synergy, as demonstrated here for FED+DAS.
Predictive response biomarkers are increasingly being sought to guide patient selection in clinical trials to maximize the chances of demonstrating true drug effectiveness. Even when patients are selected for a trial based on the presence of a putative drug target (mutation and/or pathway), variability in response may be related to off-target effects, as seen in the current study, and thus unpredictable. The drug-responsive pSTAT5 biomarker that we identified based on correlation with treatment responses in xenografts was highly specific (few false positives) with strong positive predictive value in an independent validation cohort. Although this biomarker correlated with FLT3-ITD positivity, it also correctly predicted in vivo response to FED in 3 FLT3-ITD− xenografted patient samples. The biomarker did not identify all patients who exhibited drug responsiveness in xenograft assays; however, reserving treatment for patients who have the response biomarker avoids exposing those unlikely to derive therapeutic benefit to potential drug toxicities. Indeed, when our study was initiated, FED was in clinical trials for the treatment of MPNs. However, Sanofi recently halted these trials after reports of Wernicke's encephalopathy in a small number of treated patients (35).
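The terms specificity and positive predictive value used above follow from a 2 × 2 comparison of biomarker status against in vivo response. The sketch below shows the arithmetic only. The counts are invented for illustration and are not the study's data, and the function name `biomarker_metrics` is hypothetical.

```python
def biomarker_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Summary statistics for a binary biomarker scored against xenograft response.

    tp: biomarker-positive responders    fp: biomarker-positive non-responders
    fn: biomarker-negative responders    tn: biomarker-negative non-responders
    """
    return {
        "sensitivity": tp / (tp + fn),  # responders the biomarker detects
        "specificity": tn / (tn + fp),  # non-responders it correctly excludes
        "ppv": tp / (tp + fp),          # biomarker-positive samples that respond
    }


# Hypothetical counts, NOT taken from the study.
metrics = biomarker_metrics(tp=18, fp=2, fn=12, tn=18)
print({name: round(value, 2) for name, value in metrics.items()})
```

With these made-up counts, specificity and positive predictive value are both 0.90 while sensitivity is only 0.60. This is the same qualitative pattern described above: few false positives, but some responders are missed.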
Our study provides a modern paradigm for preclinical drug development that improves the chances of correctly identifying drugs with LSC activity to move forward into clinical trials. The evidence that xenotransplantation assays detect properties of AML patient samples that correlate with outcome (5–7) increases the confidence that candidate drugs that ablate LSCs in this preclinical model will also be effective when administered to patients. Clinical trials evaluating novel anticancer therapies are expensive, time consuming, and expose patients to risk. We anticipate that adoption of this approach for preclinical evaluation and biomarker development will lead to improved patient selection and clinical trial outcomes.
Disclosure of Potential Conflicts of Interest
M.D. Minden is a consultant/advisory board member for Celgene. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: W.C. Chen, M.D. Minden, C. Guidos, J.E. Dick, J.C.Y. Wang
Development of methodology: W.C. Chen, J.S. Yuan, A. Mitchell, C. Guidos, J.E. Dick, J.C.Y. Wang
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): W.C. Chen, Y. Xing, N. Mbong, G. Gerhard, G. Bogdanoski, S. Lauriault, Y. Merkulova, M.D. Minden, D.E. Hogge, C. Guidos, J.C.Y. Wang
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): W.C. Chen, J.S. Yuan, Y. Xing, A. Mitchell, J.A. Kennedy, G. Bogdanoski, S. Lauriault, Y. Merkulova, C. Guidos
Writing, review, and/or revision of the manuscript: W.C. Chen, J.S. Yuan, A. Mitchell, M.D. Minden, D.E. Hogge, C. Guidos, J.E. Dick, J.C.Y. Wang
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): W.C. Chen, J.S. Yuan, Y. Xing, A.C. Popescu, J.A. Kennedy, G. Bogdanoski, S. Perdu, Y. Merkulova, D.E. Hogge
Study supervision: D.E. Hogge, C. Guidos, J.E. Dick, J.C.Y. Wang
Other (performed experiments): J. McLeod
Acknowledgments
The authors thank Jaime Claudio and Amanda Kotzer for project management support and Sherry Zhao and members of the SickKids-UHN Flow Facility for technical support.
Grant Support
All authors were supported by the Cancer Stem Cell Consortium with funding from the Government of Canada through Genome Canada and the Ontario Genomics Institute (OGI-047), and through the Canadian Institutes of Health Research (CSC-105367). J.E. Dick is also supported by grants from the Canadian Cancer Society, Terry Fox Foundation, Ontario Institute for Cancer Research with funds from the province of Ontario, a Canada Research Chair and the Ontario Ministry of Health and Long Term Care (OMOHLTC). The views expressed do not necessarily reflect those of the OMOHLTC.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.