Multiple myeloma remains treatable but incurable. Despite a growing armamentarium of effective agents, choice of therapy, especially in relapse, still relies almost exclusively on clinical acumen. We have developed a system, Ex vivo Mathematical Myeloma Advisor (EMMA), consisting of patient-specific mathematical models parameterized by an ex vivo assay that reverse engineers the intensity and heterogeneity of chemosensitivity of primary cells from multiple myeloma patients, allowing us to predict clinical response to up to 31 drugs within 5 days after bone marrow biopsy. From a cohort of 52 multiple myeloma patients, EMMA correctly classified 96% as responders/nonresponders and correctly classified 79% according to International Myeloma Working Group stratification of level of response. We also observed a significant correlation between predicted and actual tumor burden measurements (Pearson r = 0.5658, P < 0.0001). Preliminary estimates indicate that, among the patients enrolled in this study, 60% were treated with at least one ineffective agent from their therapy combination regimen, whereas 30% would have responded better if treated with another available drug or combination. Two in silico clinical trials with experimental agents ricolinostat and venetoclax, in a cohort of 19 multiple myeloma patient samples, yielded consistent results with recent phase I/II trials, suggesting that EMMA is a feasible platform for estimating clinical efficacy of drugs and inclusion criteria screening. This unique platform, specifically designed to predict therapeutic response in multiple myeloma patients within a clinically actionable time frame, has shown high predictive accuracy in patients treated with combinations of different classes of drugs. The accuracy, reproducibility, short turnaround time, and high-throughput potential of this platform demonstrate EMMA's promise as a decision support system for therapeutic management of multiple myeloma. Cancer Res; 77(12); 3336–51. ©2017 AACR.
We have developed a novel tool capable of predicting, within 5 days, 3 month clinical response of multiple myeloma patients to 31 drugs, using fresh bone marrow aspirates, a digital image analysis algorithm, mathematical models, and pharmacokinetic data.
We have implemented a gray-box parametric model that represents the tumor's response to drugs as a collective of subpopulations, each with different levels of chemosensitivity. In this model, the likelihood of cell death depends on drug concentration and exposure time. This model is parameterized by an ex vivo chemosensitivity assay, where primary multiple myeloma cells from fresh bone marrow aspirates are exposed to different concentrations of drug for 96 hours (Fig. 1; Supplementary Fig. S1).
Suppose the population of multiple myeloma cells in the patient's body is represented by p(t). The tumor burden varies with time according to the difference equation:
where G(dt) represents growth due to tumor cell replication and D(t,dt) represents drug-induced cell death between times t and t+dt. In cell culture, doubling time is used as a metric for quantifying cell replication. For mammalian cells, this number is approximately 24 hours. However, the doubling time of multiple myeloma tumors is much longer due to its characteristically low proliferative index (LI, labeling index). In the mathematical model, the growth factor for the multiple myeloma population in the absence of therapy is given by Equation A, where dt is a time interval (in days), and Δt is the time step used in the simulation's calculation. To determine LI for a given patient, we use the two closest prebiopsy measures of tumor burden, obtained from monoclonal paraprotein, and Equation A, with D = 1.
To describe the stochastic cell death process, we propose an empirical pharmacodynamics model based on the drug occupancy theory (8):
where drug and receptor molecules form drug–receptor complexes, which in turn cause cellular damage β and, eventually, cell death. The dynamics of this reversible reaction follow the law of mass action (9), where R(t) is the drug concentration at time t, h is an empirical exponent denoting the rate of conversion of drug exposure into cell damage, and κ of cell damage repair.
When the cellular damage is greater than the threshold τ, the probability of cell death increases asymptotically:
where δ is a nondimensionalizing empirical factor.
Short-term response of multiple myeloma patients to therapy can be monotonic or present an inflexion point followed by relapse (Fig. 2A–C). Thus, tumor chemosensitivity cannot always be accurately described by a single “clonal” population, but requires a more nuanced representation in the general model. We propose two subpopulations, with different degrees of sensitivity to therapy. Each subpopulation can either be modeled as “clonal” or as a distribution, with drug-specific threshold values (τ, Equation D). These threshold values are obtained from a normally distributed probability density function that specifies the fraction of a subpopulation that initiates cell death beyond a given threshold. Figure 2B shows an example of such a representation of tumor chemosensitivity as a single and as a double distribution. Thus, the total tumor burden of a patient is represented as:
where the composition of each subpopulation at initial time t0 is modeled as a distribution with a mean μj and standard deviation σj that define the percentage of cells that initiate cell death when the accumulated damage surpasses τ. For computational purposes, we have discretized this distribution in a histogram with n bins, ranging from μj − 6σj to μj + 6σj, using MATLAB's function normpdf (Supplementary Fig. S2).
There is no biological meaning for negative τ, so the histogram is truncated when μj − 6σj < 0, and the value of each bin is normalized so that the sum of all bins corresponds to pj. Thus, at initial time t0, the composition of the jth subpopulation is:
To separate drug-induced and spontaneous cell death, we divide every ex vivo dose-response curve by the corresponding vehicle control. Thus, the ex vivo data used to parameterize the mathematical model consist of percent live cells normalized by control at every time point, drug concentration, and exposure time. We then use MATLAB's lsqcurvefit function to find the model parameters that minimize the difference between normalized ex vivo data and the model estimates.
The last step consists of choosing among the four possibilities (1 or 2 subpopulations, clonal or distribution) the model that best describes the ex vivo data. We achieve this by applying Akaike's Information Criterion (AIC; ref. 10), which favors the best fitting model with the least number of parameters. Our simulations have shown that models with three or more subpopulations are never chosen, as they produce minor improvements in ex vivo data fitting, and are significantly penalized by AIC due to their large number of parameters (e.g., 1-population distribution requires 5 parameters, 2-subpopulation distributions require 9 parameters, 3-subpopulation distributions require 13 parameters).
In summary, the model assumes the existence of one or two tumor subpopulations, with different degrees of chemosensitivity. Each subpopulation, in turn, exhibits a range of sensitivity to therapy modeled as a normally distributed probability density function.
For decades, there have been attempts to develop predictive biomarkers in cancer, unfortunately, with limited translational success (1). Most biomarker development today depends on molecular techniques applied to dead cells, cell lines, or primary cancer cells isolated from their microenvironment, thus failing to account for many elements needed to properly assess therapeutic efficacy (2). We anticipate that the development of novel technologies and multidisciplinary approaches to directly assess chemosensitivity of primary cells in ex vivo reconstructions of the tumor microenvironment (TME) are critical avenues toward personalized predictive biomarker development (3–5).
Multiple myeloma is a treatable, but incurable malignancy of plasma cells (6), which serves as an excellent model disease to examine the potential of personalized management strategies. Frontline therapy combining multiple novel agents, high-dose therapy with autologous stem cell transplant, and maintenance therapy has yielded a high success rate of response in multiple myeloma (7). However, all patients eventually relapse, with the treatment of relapsed patients relying mainly on clinical acumen. This empirical approach has been made increasingly more difficult by the large number of approved anti–multiple myeloma agents, leading to astronomical numbers of possible two-, three-, or even four-drug combinations. In addition, at least one or two cycles of therapy are required to determine clinical efficacy, during which time the patient may suffer side effects without clinical benefit. Moreover, individual patients may be sensitive to targeted agents not formally investigated in multiple myeloma (11, 12). Thus, an assay capable of choosing the drug combination with highest clinical benefit would provide a critical step forward in the personalized treatment of relapsed multiple myeloma and other hematologic malignancies (13).
We describe an approach to predict clinical response of multiple myeloma patients to several classes of drugs, hereon referred to as Ex vivo Mathematical Myeloma Advisor (EMMA), designed to overcome the main hurdles that have limited the success of past and present technologies for assessment of clinical efficacy of experimental drugs in multiple myeloma. Historic colony formation assays require 2 to 3 weeks to yield results, which is beyond clinically actionable time. In addition, multiple myeloma cells have a low success rate of colony formation (1). Patient-derived tumor xenografts (PDX) models, although an invaluable tool for basic and translational research, are equally limited as predictive biomarkers due to extensive interval required for tumor engraftment and treatment response (14). Further, the infrastructure, financial burden, number of multiple myeloma cells, and, until recently, the lack of an appropriate host make PDX models suboptimal as clinically predictive biomarkers for testing the large number of drugs available in multiple myeloma (15). In contrast, using off-the-shelf multi-well plates, EMMA can test 31 drugs or combinations in 384-well plates, or 127 drugs in 1,536-well plates (16), with as few as 0.5 million multiple myeloma cells, thus allowing the clinical screening of an individual multiple myeloma patient to all standard-of-care and clinically relevant “non-multiple myeloma” therapeutics, in a single experiment.
In order to evaluate the accuracy of EMMA as a predictive multiple myeloma biomarker, we have tested samples of primary multiple myeloma cells from fresh bone marrow biopsies against multiple standard-of-care and experimental agents generating patient- and drug-specific mathematical models of chemosensitivity and clinical response to therapy within 5 days of biopsy. Crucially, we prospectively validated these in silico responses with postbiopsy treatment outcomes. EMMA also provides a platform to conduct in silico clinical trials to assess the efficacy of novel therapeutics. We assessed the efficacy of a series of experimental agents, including 25 protein kinases inhibitors (PKI), demonstrating patient-specific responses. Further, test/re-test reproducibility was demonstrated in patients with sequential biopsies. Collectively, these results support the use of EMMA as a novel rapid, reproducible, and high-throughput, ex vivo–, and mathematically informed decision-support tool for patient-specific multiple myeloma therapy.
Materials and Methods
Primary cancer cells
We investigated the ex vivo response of cancer cells from multiple myeloma patients (newly diagnosed or relapsed). Investigators obtained signed informed consent from all patients who were enrolled on the clinical trials MCC#14745, MCC#14690, and MCC#18608 conducted at the H. Lee Moffitt Cancer Center and Research Institute, as approved by the Institutional Review Board. To this end, patient samples were utilized in accordance with the Declaration of Helsinki, International Ethical Guidelines for Biomedical Research Involving Human Subjects (CIOMS), Belmont Report, and U.S. Common Rule. The medical records were de-identified, and only the following clinically relevant information was reviewed: (A) treatment administered (therapeutic agents, doses, and schedule) prior to biopsy; (B) cytogenetics; and (C) serum and urine electrophoresis results.
MM1.S myeloma cells were obtained from ATCC in 2009 and are validated biannually (last, February 2017) by comparing short tandem repeat analysis with ATCC's genetic profile (Geneteca) and screening for mycoplasma contamination by PCR (Agilent Technologies). Cells are utilized for only 2 to 8 passages before renewal with validated cryostorage aliquots.
The non–CD138-selected cells from bone marrow aspirates were placed in a flask with RPMI 1640 media supplemented with FBS (heat inactivated), penicillin/streptomycin, and passaged until only adherent cells remained (17). As this process takes weeks, primary multiple myeloma cells from fresh biopsies were cocultured with established stroma from prior patient samples.
Ex vivo assay
The ex vivo assay used to quantify chemosensitivity of primary multiple myeloma cells was described in detail previously (16). Briefly, fresh bone marrow aspirate cells are enriched for CD138+ expression using magnetic beads. Multiple myeloma cells (CD138+) were seeded in multi-well plates with collagen-I and previously established human-derived stroma, to a total volume of 8 μL containing approximately 4,000 multiple myeloma cells and 1,000 stromal cells. Each well is filled with 80 μL of RPMI 1640 media supplemented with FBS (heat inactivated), penicillin/streptomycin, and patient-derived plasma (10%, freshly obtained from patient's own aspirate, filtered) and left overnight for adhesion of stroma (Supplementary Figs. S1 and S3). The next day, drugs were added using a robotic plate handler, so that every drug was tested at five concentrations (1:3 serial dilution) in two replicates. Negative controls (supplemented growth media with and without vehicle control, DMSO) were included, as well as positive controls for each drug (cell line MM1.S at highest drug concentration). Plates were placed in a motorized stage microscope (EVOS Auto FL; Life Technologies) equipped with an incubator and maintained at 5% CO2 and 37°C. Each well was imaged every 30 minutes for a total duration of 4 days (Supplementary Figs. S1, S3, and S4).
Digital image analysis algorithm
We have developed a digital image analysis algorithm previously described (8, 16) to determine changes in viability of each well longitudinally across the 96-hour interval. In summary, this algorithm computes differences in sequential images and identifies as live cells those with continuous membrane deformations resulting from the interaction with the surrounding matrix. These interactions cease upon cell death. By applying this operation to all 192 images acquired for each well, it is possible to quantify nondestructively, and without the need to separate stroma and myeloma, the effect of drugs as a function of concentration and exposure time (Supplementary Fig. S1D and S1E).
Simulation of clinical treatment
The ultimate goal of this work is to predict the clinical response of multiple myeloma patients to therapy using the proposed mathematical model of chemosensitivity, and the pharmacokinetic (PK) properties of the drug and regimen chosen. Below, we describe an example of prediction of treatment with carfilzomib, whose regimen consists of infusions on days 1, 2, 8, 9, 15, and 16 in a cycle of 28 days. Blood concentration of carfilzomib peaks at approximately 5.5 μmol/L upon injection, quickly decreasing to 30 nmol/L after 20′, 1.4 nmol/L at 1 hour, and 0.14 nmol/L after 4 hours (18).
To simulate the treatment of a patient, we replace the function R(t) from Equation B by the PK curve of blood concentration of drug for the entire interval of the treatment. For example, let us consider patient Pt103, whose chemosensitivity to carfilzomib is depicted in Supplementary Fig. S5, drug concentrations ranged from 50 nmol/L to 0.6 nmol/L. Total exposure time was 96 hours, with imaging intervals of 30 minutes. Black dots represent actual data measurements. Green and blue surfaces represent distributed models of one and two populations, whereas cyan and red (overlapped) represent one and two population models with no distribution. As per the Akaike's Information Criterion (AIC), the two-population model with distribution is the best ex vivo fit.
The parameters for this patient's model of chemosensitivity to carfilzomib are listed in Supplementary Table S1. The in vivo growth rate, obtained from the rate of increase in monoclonal paraprotein from the latest relapse, is determined by the labeling index (LI) of 1.44%. Parameter a6 indicates that there are two subpopulations within this tumor burden in terms of chemosensitivity to carfilzomib. The first, more sensitive, occupies 53% of the number of cells and the other, more resistant, 47%.
As mentioned earlier, the least squares method is used to estimate parameters for each of the four proposed models. The implementation of this method is done through MATLAB's lsqcurvefit function, which uses an iterative gradient-based optimization algorithm to find those parameter values that yield the smallest SSR. In order to determine how AIC's choice of the “best model” eliminates over-parameterization, we have conducted convergence studies on the parameters during lsqcurvefit optimization of patient Pt103's ex vivo response to panobinostat (see Supplementary Material).
Importantly, this approach does not require a training set: all parameter values are obtained by fitting a general set of equations to the ex vivo data, using AIC to penalize more complex and favor simpler models. Thus, this work is a type 4 TRIPOD study (19).
Generation of the heatmap representing the activity of 30 drugs in multiple myeloma cells from 13 patients
Twenty-five protein kinase inhibitors were tested in 13 primary multiple myeloma samples, at 10 μmol/L maximum concentration each (1:3 serial dilution, 5 concentrations, 2 replicates). We have quantified the AUC (the average of viability between all replicates across 96 hours) of each drug for each patient sample and normalized this measure by the maximum possible AUC. If the drug showed no effect whatsoever, the normalized AUC would be 100%, whereas drugs with higher effect have a lower AUC. Drugs were sorted from most active (lowest average AUC) to least active (highest average AUC).
Ex vivo data and mathematical models capture the heterogeneity of individual tumor chemosensitivity
Figure 1 and Supplementary Fig. S1 illustrate the workflow for ex vivo analysis of chemosensitivity: CD138-selected multiple myeloma cells obtained from fresh bone marrow aspirates were seeded in multi-well plates with collagen and previously established human-derived primary stroma. After overnight incubation to ensure stroma adhesion, 31 different drugs were added, and plates were imaged every 30 minutes for 4 days. As previously described (5, 16), a digital image analysis algorithm determined the number of viable cells in every well at each of the 192 time points, thus producing 1,920 data points per drug. Supplementary Fig. S1E demonstrates an output of the ex vivo assay. Curves represent changes in viability of primary multiple myeloma cells and a cell-line control to different drug concentrations during 96 hours. These data were used to parameterize patient-specific mathematical models of chemosensitivity (see Methods and Supplementary Material).
The central aspect of EMMA is the ability to characterize tumor heterogeneity in the form of subpopulations with different degrees of chemosensitivity to a given drug (Fig. 2A). The importance of proper characterization of tumor heterogeneity is depicted in Fig. 2B and C: a homogeneous tumor reacts to therapy monotonically, either by steadily decreasing viability (Fig. 2C, green line) or by sustained growth. A heterogeneous tumor, however, harboring a chemoresistant subpopulation, will have a curve of response characterized by an inflection point at the time of relapse (Fig. 2C, blue line). To identify these subpopulations from the ex vivo dose-response data, we test four hypotheses: the first assumes only one “clonal” population (all cells in the tumor have the same degree of sensitivity to one particular drug), the second assumes that there is one population, but its chemosensitivity follows a normal distribution. The two other models assume two “clonal” subpopulations or two distributions, respectively. EMMA identifies the parameters that best fit each of these four models to the ex vivo data (Fig. 2D) and chooses the best model after residue correction by AIC to avoid overfitting (10). In the example of Fig. 2E, EMMA's interpretation is that the tumor is composed of two subpopulations, p1 and p2: the first is more uniform (“clonal”) and more sensitive, whereas the second has higher variance (Fig. 2E and F) and is more resistant.
EMMA predicts clinical response of multiple myeloma patients to single agents or drug combinations
A total of 52 patient specimens were tested ex vivo against a panel of drugs, and EMMA model predictions were tested against clinical outcome to the same drugs. The median patient age was 64.5 years, 21 patients were female, 13 were newly diagnosed, and the majority relapsed and/or refractory (Table 1). In order to generate clinical predictions of response to therapy, EMMA simulates each of the drugs in a regimen independently and combines all responses assuming additivity. Figure 3 describes patient Pt111's response to a triplet regimen of carfilzomib (K), lenalidomide (R), and dexamethasone (D). Figure 3A depicts the ex vivo response to carfilzomib (black dots) as well as the mathematical model proposed by EMMA (blue surface). Figure 3B depicts the actual response of this patient to the three-agent treatment (black dots linked by dashed lines) and model-predicted response of this patient to carfilzomib as single agent (green line). Figure 3C–D and Fig. 3E–F represent ex vivo and clinical predictions for the same patient to dexamethasone and lenalidomide, respectively. The combination of the three models shows high correlation with the actual outcome (Fig. 3G). Figure 3H shows the predictions of clinical response for this patient to other drugs: those expected to be most clinically effective were carfilzomib, bortezomib, and liposomal doxorubicin, whereas pomalidomide and lenalidomide were predicted to have little effect on this patient's tumor.
|Patient ID .||Age .||Gender .||Status at Bmbx .||Actual treatment .||Actual response/no response .||Model's prediction response/no response .||Actual outcome (last M-Spike available) .||Model's outcome (last M-Spike available) .|
|Patient ID .||Age .||Gender .||Status at Bmbx .||Actual treatment .||Actual response/no response .||Model's prediction response/no response .||Actual outcome (last M-Spike available) .||Model's outcome (last M-Spike available) .|
NOTE: The age range was 40–81, with a median of 65 years old. Biopsies were obtained between April 2014 and July 2016. * and ** indicate sequential biopsies of the same patient.
Abbreviations for disease status: PD, progressive disease; ND, newly diagnosed; RD, relapsed disease; RFD, refractory disease. For treatment: M, marizomib; CFZ, carfilzomib; CY, cyclophosphamide; D, dexamethasone; VD, bortezomib + dexamethasone; VRD, bortezomib + lenalidomide + dexamethasone; OPR, oprozomib; CRD, carfilzomib + lenalidomide + dexamethasone; VPD, bortezomib + pomalidomide + dexamethasone; CyBorD, cyclophosphamide + bortezomib + dexamethasone; V, bortezomib; CFZ + P + D, carfilzomib + pomalidomide + dexamethasone; DX, liposomal doxorubicin. For clinical response: CR, complete response; VGPR, very good partial response; PR, partial response; MR, minimal response; SD, stable disease; PD, progressive disease.
Classification of 52 multiple myeloma patients as responders or nonresponders.
Clinically, multiple myeloma response is monitored over time via sequential measurements of serum or urine monoclonal antibody produced by malignant cells (paraprotein) as a surrogate assessment of tumor burden. The least strict level of validation for this predictive model was to classify patients as responders and nonresponders. Table 1 shows that this model correctly classified 50 of 52 patients (96%) according to response/no-response. The first exception, Pt73 was predicted not to respond, but the cycle 2 day 1 paraprotein measure indicated a 70% tumor reduction. Unfortunately, this patient died 3 weeks later with disease progression, and no subsequent measures are available to confirm response/progression. Pt95's model correctly predicted an initial response (Supplementary Fig. S6), but anticipated an early relapse, thus classifying this patient at 90 days as nonresponder. Of note, this patient relapsed after 4 months.
Predictions according to International Myeloma Working Group stratification.
A more strict level of prediction is used to assess clinical response according to a stratification system aligned with the International Myeloma Working Group (IMWG) multiple myeloma response criteria (27), which clusters patient responses in three categories: VGPR/CR includes complete response (>99% tumor reduction) or very good partial response (90%–99% reduction), MR/PR includes partial response (50%–90% reduction) or minimal response (25%–50% reduction), and PD/SD, which includes stable disease (<25% reduction) or progressive disease (>25% increase). Table 1 outlines the postbiopsy therapy received by each patient in this study, corresponding clinical outcome according to the IMWG response criteria (28), and EMMA's predictions. Model predictions and clinical outcome agreed in 41 of 52 patients (79%). The highest accuracy occurred when the model predictions were PD/SD (15/17, or 88% accuracy), followed by MR/PR (16/19, or 84% accuracy) and VGPR/CR (10/16, or 63% accuracy). The narrower the range of response in each category, the greater the chance of disagreement between model predictions and actual outcome. Therefore, a more natural validation of EMMA would be a direct correlation between the actual tumor burden measures and model predictions as continuous variables.
Linear correlation between predicted and actual tumor burden.
The strictest validation of this model is the direct correlation of the tumor burden predictions with all available tumor burden measurements. Supplementary Fig. S6 highlights the model predictions and actual tumor burden measurements from 52 patients. Each graph shows tumor burdens normalized by the date of beginning of treatment post-biopsy (i.e., tumor burden = 100% at time = 0 days). Thick-colored lines represent model predictions of clinical response generated 5 days after biopsy. Each prediction curve is flanked by thinner lines, representing the upper and lower boundaries of the prediction. These boundaries are a function of low and high estimates of tumor growth rate, computed from prebiopsy measures of tumor burden. Black dots linked by dashed lines represent the patient's clinical after biopsy tumor burden measurements. Figure 3I shows the aggregated correlation between mathematical model predictions and actual clinical response for all available tumor burden measurements from the 52 patients within 90 days after biopsy (133 data points). The regression line between in silico model predictions and clinical response, shown flanked by the 95% confidence interval, had a slope of 0.83 and Pearson correlation coefficient r = 0.5658 (P < 0.0001).
Estimated clinical benefit of EMMA as a decision-support system for choice of therapy
As a multi-drug predictive biomarker, EMMA has two main goals: (i) to ensure that each patient receives the most effective therapy and (ii) to remove ineffective drugs from therapy. According to the model predictions, if EMMA's choice of drugs was used, the number of patients in this study who achieved VGPR or CR would have increased from 13 to 22, the number of patients presenting MR or PR would have decreased from 24 to 23, and the number of patients with no clinical benefit (PD or SD) would have decreased from 15 to 7. Also, according to EMMA, 60% of patients in this study received at least one agent with no predicted clinical efficacy (Supplementary Table S2). As an estimate of the single-agent efficacy of the drugs administered to these 52 patients, EMMA predicted that 34% of the agents had no predicted clinical efficacy, 24% were predicted to produce stable disease, 27% a minimum or partial response, and 15% a very good or complete response. These data suggest the potential clinical benefit of identifying the right drug for the right patient at the right time.
A high-throughput tool for personalized drug screening
We have tested the sensitivity of primary multiple myeloma cells from 13 patient samples to a panel of 5 anti–multiple myeloma agents and 25 PKIs. In Fig. 4A, each cell of the heatmap represents the average 96-hour AUC of the five concentrations (1:3 serial dilution, two replicates each) for each drug in individual patients. The drugs were ordered from lowest to highest AUC averaged across all patients, green being the most effective and red the least effective. Figure 4B lists the previous lines of therapy. Despite interpatient variation, it is possible to identify PKIs with consistently higher activity (e.g., BI2536, INK128, ponatinib, MK2206, and crizotinib), whereas others are consistently ineffective (e.g., ralimetinib, vemurafenib, VX745, and BMS777607). Further, PKIs that demonstrated patient-specific activity (e.g., ibrutinib, momelotinib, AZD1480, and palbociclib) highlight the potential for personalized management strategies. For example, the multidrug refractory patients 79 and 83 demonstrated sensitivity to the FDA-approved BTK inhibitor ibrutinib, suggesting that these two patients may derive clinical benefit from treatment with this PKI. The remaining patients were resistant to ibrutinib, further illustrating the need for personalized strategies for treatment allocation. These data indicate that this approach may be used to assess patient sensitivity to targeted therapeutics facilitating patient sample–derived drug screening or in silico clinical trials of experimental agents.
In addition, by grouping the 30 agents in pairs and performing a linear regression in each of the possible 435 combinations, it was possible to investigate agents with putative ex vivo activity correlation (Supplementary Fig. S7). Fifteen pairs of drugs showed Pearson correlation coefficient r > 0.75, suggesting that their anti–multiple myeloma activities involved similar biological processes in these patient specimens. Certain pairs were consistent with known activities such as carfilzomib/bortezomib and bortezomib/panobinostat, which have established links in anti–multiple myeloma activities (29). Other pairs, including panobinostat (HDACi)/ponatinib (Abli), with a positive slope of 0.8755 and r = 0.7695, suggest previously undefined shared biological pathways contributing to multiple myeloma survival. Interestingly, there were no instances of drugs with significant interpatient inverse correlation. This suggests that, across a group of patients, increased resistance to one drug during treatment correlates with increase, or has no effect at all, on resistance to a second drug, while cross-sensitization between drugs is unlikely (Supplementary Fig. S8). This has been observed in multiple clinical trials of alternating therapies seeking to exploit a cost of adaptation to two different regimens (30). Ultimately, those studies failed to demonstrate a significant improvement in survival between sequential and alternating groups (31, 32). However, our data suggest that there are patient-specific exceptions to this rule. For instance, Supplementary Fig. S9 depicts changes in 96-hour LD50 values for 2 patients between two sequential biopsies for 20 PKIs, with the posttreatment tumors becoming more resistant to some PKIs and, importantly, more sensitive to others, including the clinically relevant crizotinib and ponatinib, as well as the PLK inhibitor BI2536 (33), once again, highlighting the potential for phenotypically derived biomarker tools for truly personalized management.
Critically, even drugs within a same class, such as bortezomib and carfilzomib, can have significantly different clinical efficacy. Carfilzomib has been shown to be the more effective PI in early relapsed multiple myeloma (34); however, from Supplementary Table S2, Fig. 5, and Supplementary Fig. S10, it is clear that individual patients have differential predicted sensitivities to one PI versus the other. For instance, 7 of the 21 patients predicted as resistant to bortezomib were predicted to respond to carfilzomib. In contrast, 10 of the 24 patients predicted as resistant to carfilzomib were predicted to respond to bortezomib. Despite correlated ex vivo activity (Supplementary Fig. S7), the two drugs have different PK, leading to different predicted clinical responses. Collectively, these data again illustrate the potential clinical importance of allocating the right drug to the right patient at the right time even within a class of agents.
A platform for in silico clinical trials
We have tested 19 patient samples (Supplementary Table S3) with the HDAC6 inhibitor ricolinostat (Ri) and the Bcl-2 inhibitor venetoclax (Ve), created patient-specific models of chemosensitivity and simulated how these patients would have responded in a clinical trial. We have compared our results with actual phase I/II studies with single agents and combination with bortezomib (V) and dexamethasone (D; refs. 35, 36). The only data used for the simulations were the ex vivo results and drug-specific PK. The results are depicted in Fig. 6: in the single agent ricolinostat arm of the in silico trial (Fig. 6A), 1 patient (5%) was predicted to achieve VGPR/CR, 1 patient (5%) would achieve MR, whereas the remaining 17 (90%) would present PD/SD, in agreement with the low single-agent efficacy observed in the actual trial (60% PD and 40% SD). The simulation of single-agent venetoclax (Fig. 6B) predicted 3 patients reaching VGPR/CR (17%), 3 reaching PR (17%), 3 reaching MR (17%), and 9 presenting PD/SD (50%). Consistent with the clinical activity of venetoclax in phase I trials (36), t(11,14) status correlated with EMMA-predicted drug sensitivity with a mean depth of response 60% versus 31% of t(11,14) positive versus negative multiple myeloma, respectively (P = 0.0275). Note that newly diagnosed status (NDMM) was also a predictor of response (Fig. 6B, inset). It is also important to note that responses and failures were noted in both groups as well, demonstrating that molecular screening alone would not adequately predict clinical outcome. The virtual trial also projected clinical benefit for adding either drug to bortezomib and dexamethasone (V+D, P = 0.0181 and P = 0.0175 for Ri for Ve, Fig. 6C and D, respectively). Again, the actual benefit is observed in only a percentage of patients, highlighting the potential for the utilization of a phenotypic biomarker screening prior to treatment.
Multiple myeloma is an example of a cancer in which the efforts of basic, translational, and clinical research have provided a growing number of therapeutics with significant improvements in survival. Yet, curative intent therapy remains elusive. To this end, it is critical that we best allocate these therapies to maximize outcomes (and ideally minimize toxicities). Here, we have demonstrated a novel approach to predict clinical response of multiple myeloma patients to a wide range of therapeutics using an ex vivo chemosensitivity assay and computational models. This assay is scalable, reproducible and allows analysis of drug efficacy in primary multiple myeloma cells in the presence of elements of the tumor microenvironment (matrix, patient-derived serum, and human bone marrow–derived stroma). The major contribution of this approach, compared with existing techniques, is the detailed ex vivo characterization of the heterogeneity of tumor chemosensitivity, and its integration with mathematical models to accurately and reproducibly predict clinical response, with the potential to improve patient clinical outcomes through model-informed personalized management decisions.
EMMA has a number of advantages compared with past and current preclinical chemosensitivity assays. First, similar experiments would have cost and time-prohibitive hurdles in PDX models and may not be concluded in clinically actionable time frame (14). Second, EMMA not only mimics the real-time action of the drug on cancer cells, but mathematical models are used to extrapolate this short-term response into a longer clinical time frame, based on PK data. This trait makes EMMA an attractive system for multiple myeloma patients, including multi-drug refractory patients requiring salvage therapy. Third, patient-specific EMMA mathematical models can be used to test the effect of multiple classes of drugs in different regimens, leading to the assignment of the most efficacious regimen or drug to individual patients (21). Finally, EMMA's output is not limited to a dichotomized response/no-response or depth of response, but trajectories of actual clinical response at any moment during the first 3 months of treatment. Thus, its predictions can be followed in real-time during treatment, giving both physician and patient the opportunity to make informed, pretreatment decisions, and proactively act during therapy.
Recent works in the field have suggested different approaches to identify agents with clinical efficacy in liquid and solid cancers. Pemvoska and colleagues (4) have described a combination of ex vivo chemosensitivity and molecular profiling to determine therapeutic windows for drugs in acute myeloid leukemia. Majumder and colleagues have combined ex vivo chemosensitivity assays of slices of tumor explants (37), immunohistochemistry, and clinical data to create a signature to classify clinical response of patients with solid tumors. Both methods assess chemosensitivity at one fixed time point and do not account for the temporal dynamics of cell death, essential for the extrapolation of the effect of short periods of drug exposure (in vitro) to actual clinical response. In an analogy to physical sciences, these assays are capable of determining the initial speed of the clinical response but not “acceleration” and thus cannot predict the clinical trajectory. The concept of “acceleration” implies that the response of cancer cells to therapy cannot be described as a first-order differential equation, where rate of cell death is proportional to drug concentration, but instead requires a second-order model, which incorporates the notion of “damage,” and a threshold beyond which cell death starts. To this end, other assays can predict the initial effect of therapy on tumor burden, but cannot predict the actual depth, duration, or time to relapse. In cancers such as multiple myeloma, depth of response is commonly utilized as surrogates of clinical benefit (38). As such, a system capable of creating actual clinical trajectories (response) will be central to successfully translating in silico predictions to true clinical outcomes. In addition, although the agnostic pattern recognition techniques used in these published works (37) are adequate to create signatures capable of classifying patients into categories such as responders or nonresponders, they lack the ability to extrapolate conditions for which the signature was not trained. For instance, how would a patient respond to a combination of two drugs for which signatures were predetermined, or a different therapeutic regimen (dosing and schedule) for a known drug? The novel approach developed in this work provides an instrumental platform to address these issues. So far, EMMA-generated clinical predictions for regimens of two or more drugs assume additivity, which is the simplest possible implementation. However, our preliminary data indicate regions in the time–concentration space where there is synergy in primary multiple myeloma cells treated ex vivo (Supplementary Materials and Methods and Supplementary Fig. S11). Given drug-specific PK, staggered drug administration schedules, and the inherent heterogeneity of tumor populations, further work is required to adequately incorporate this information in EMMA models.
Predicting clinical response of patients based on ex vivo assays is a major challenge irrespective of how close the assay is to in vivo conditions. The most obvious difficulty is the translation of results from an assay that lasts for days into estimates of clinical response across months or even years (1, 38). We have begun to bridge this timescale gap through the use of mathematical models accounting for tumor heterogeneity, pharmacodynamics, and PK imputed with a tested phenotype (drug sensitivity). It has been long known that nature selects for phenotype, not genotype, and that multiple genotypes can produce the same phenotype (39). This nonexclusive relationship makes it challenging to predict clinical outcome from genotype alone or even gene expression profiles (40). EMMA directly identifies the phenotypic (or functional) representation of subpopulations regardless of genotypic background, thus removing the “middle man” and producing in silico clinical response outputs. Through nonlinear regression of the ex vivo chemosensitivity results, the model identifies subpopulations within the tumor burden based on chemosensitivity. In turn, the platform combines these data with drug- and regimen-specific PK, generating trajectories of clinical response demonstrating a high degree of accuracy in predicting outcomes.
We anticipate that this approach can provide precise clinical insight about treatment efficacy in a timely manner and assist oncologists in practicing truly personalized management, by proposing the best choice of therapy for each patient and identifying those with risk of early relapse due to the presence of therapy-resistant cells. In addition to clinical predictions of standard-of-care regimens, this approach can also serve as a means to perform in silico clinical trials (Fig. 6; refs. 5, 41, 42), where several experimental agents are tested in primary multiple myeloma cells from a cohort of patients mimicking the actual clinical setting without the potential cost or toxicity to patients. Additional uses of these mathematical models include the simulation of alternative regimens, such as metronomic therapy (43), adaptive therapy (44), or the introduction of treatment “holidays” (45), depending on the nature of the mechanism of resistance. Once the subpopulations have been identified and characterized by this method, computer simulations can be used to determine which alternative regimens may lead to best clinical outcome (45, 46).
Importantly, the mechanistic nature of these computational models allows the incorporation of additional influences on clonal evolution, including genotypic, epigenetic, microenvironmental, and clinical data, to continually upgrade this system. To this end, we expect to continue to build additional parameters into this computational model to improve the predictive capacity and to account for new classes of therapeutics. Studies are underway to integrate our prior models of response trajectories (47, 48) with EMMA to move from predications of depth of response to PFS. Further, we recognize the increasing importance of immune-mediated therapies in multiple myeloma (49, 50). Our preliminary studies with the CD38 antibody daratumumab (50), using EMMA's current protocol, have shown activity in primary multiple myeloma cells in concentrations as low as 86 nmol/L, with cell death initiated after 4 days, and reaching 25% viability reduction 1 day later. As depicted in Supplementary Fig. S12, the mechanism of cell death is phagocytosis by a yet-to-be-determined adherent cell present in the coculture, and only occurs in the presence of daratumumab. Research is ongoing to parameterize T-cell, NK-cell, and myeloid-derived stromal cell phenotypes in patient bone marrow samples to direct T-cell cytotoxicity assays in this platform. We anticipate that these data can be incorporated in EMMA to account for sensitivity to specific immune-based therapies. Further, continued validation of inter-day reproducibility (Supplementary Fig. S13), intra-plate variation (Supplementary Fig. S14), and dependence of AIC convergence of solution on ex vivo experiment duration and dynamic range of drug concentration (Supplementary Fig. S15, Supplementary Table S4, Supplementary Fig. S16, Supplementary Table S5, and Supplementary Figs. S17 and S18) are ongoing in larger numbers of patients to achieve the goal of a true clinical decision support tool.
To our knowledge, this study provides estimates on the individual efficacy of clinically administered agents in multiple myeloma for the first time: approximately one third of agents administered in this study may have had little to no clinical efficacy, 60% of patients received at least one ineffective agent, and 31% could have been treated with a more effective agent proposed by the mathematical models. Thus, we anticipate that EMMA would provide a critical support to oncologists to customize regimens by avoiding therapeutics that will not benefit the patient, thus reducing the potential toxicity while maximizing the clinical benefit.
Disclosure of Potential Conflicts of Interest
R. Baz reports receiving commercial research grant from Celgene, Karyopharm, Merck, Millennium, and Millenium/Takeda. K.H. Shain reports receiving commercial research grant from AbbVie and Signal Genetics, received honoraria from the Speakers Bureau of Amgen, Celgene, Janssen, Novartis, and Takeda, and is a consultant/advisory board member for Amgen, Celgene, Janssen, and Takeda. No potential conflicts of interest were disclosed by the other authors.
Conception and design: A. Silva, P. Sudalagunta, T. Jacobson, J. Song, R. Baz, W. Dalton, R. Gatenby, R. Gillies, E. Sontag, K.H. Shain
Development of methodology: A. Silva, M.C. Silva, P. Sudalagunta, T. Jacobson, C. Cubitt, W. Dalton, E. Sontag, M.B. Meads, K.H. Shain
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): A. Silva, M.C. Silva, A. Distler, T. Nguyen, C. Cubitt, R. Baz, L. Perez, W. Dalton, M.B. Meads, K.H. Shain
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): A. Silva, M.C. Silva, P. Sudalagunta, T. Jacobson, D.-T. Chen, L. Chen, R. Baz, J. Greene, R. Gillies, E. Sontag, M.B. Meads, K.H. Shain
Writing, review, and/or revision of the manuscript: A. Silva, M.C. Silva, P. Sudalagunta, T. Jacobson, J. Song, C. Cubitt, R. Baz, L. Perez, D. Rebatchouk, W. Dalton, R. Gatenby, R. Gillies, E. Sontag, M.B. Meads, K.H. Shain
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): A. Collins, C. Cubitt, W. Dalton, K.H. Shain
Study supervision: A. Silva, K.H. Shain
The authors thank the patients at H. Lee Moffitt Cancer Center who provided clinical samples for our ex vivo assays as well as consented access to the their clinical data through the Total Cancer Care database.
This research was funded by the H. Lee Moffitt Cancer Center Physical Sciences in Oncology (PSOC) Grant (1U54CA193489-01A1) and by H. Lee Moffitt Cancer Center's Team Science Grant. This work has been supported in part by the Translational Research Core Facility at the H. Lee Moffitt Cancer Center & Research Institute, a NCI-designated Comprehensive Cancer Center (P30-CA076292). Access to primary cells was made possible through the Total Cancer Care Protocol at the Moffitt Cancer Center.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.