Abstract
Models of tumor drug response have illuminated important concepts in oncology, but there remains a need for theory that combines intratumor and interpatient heterogeneity to explain patient outcomes, especially for curative treatments. In this study, we present a mathematical model of multidrug therapy that describes both cell-to-cell and patient-to-patient heterogeneity as distributions of drug sensitivity and apply it to simulate curative combination therapies for diffuse large B-cell lymphoma (DLBCL). Simulated trials reproduced progression-free survival and changes in ctDNA observed under standard therapy and accurately predicted success or failure of nine randomized trials of first-line combinations based on drug efficacies in relapsed/refractory DLBCL. Finally, we used the model to explore how drug synergies, biomarkers, and subtype-specific endpoints could improve the chance of success of targeted combination therapies. This study offers a quantitative model of curative drug combinations and suggests that predictive simulations could aid the design of new regimens with curative intent.
A new model of intratumor and interpatient heterogeneity in response to drug combinations explains and predicts the results of clinical trials of curative-intent treatments for DLBCL. This model can be used to understand and inform optimal design of curative drug combinations and clinical trials.
Introduction
The discovery of curative cancer treatments is among the most significant medical advances of the past century (1). Many lymphomas, including diffuse large B-cell lymphoma (DLBCL), have been curable by combination chemotherapy since the 1970s (2, 3). Despite the clinical success of many combination therapies, the basis for their curative potential remains poorly understood. To be clear, individual drug mechanisms are generally known, but single drugs are rarely curative. The success of drug combinations is often attributed to synergy. However, if synergy is loosely defined as “combinations are more effective than single drugs,” it is a tautology that does not explain mechanisms. If synergy is rigorously defined as “combinations have greater-than-additive pharmacologic interaction,” then it is demonstrably lacking from the majority of approved oncology drug combinations, including the regimen containing five drugs, rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisone (RCHOP), that has long been the standard curative-intent treatment for DLBCL (4, 5). A more satisfactory understanding of the curative power of drug combinations would be a “bottom–up” mechanistic model that quantitatively explains their clinical efficacies and can predict which new drug combinations improve cure rates. Here, we present such a model for DLBCL, which reproduces the distribution of outcomes in patient populations treated with RCHOP, explains the failure of attempts over the past 20 years to improve upon RCHOP, and predicts the success of the first new regimen to be superior to RCHOP and receive FDA approval.
The original rationale for combination therapy was to overcome tumor heterogeneity (6) that occurs both within patients (“intratumor”) and between patients (“interpatient”). Past analyses and theories of intratumor and interpatient heterogeneity, considered separately, have contributed greatly to our understanding of combination therapy. The relevance of intratumor heterogeneity to combination therapy was described in 1952 by Lloyd Law, who proposed that although single drugs are resisted by a fraction of cells in a tumor, drug combinations may be resisted by only a fraction of the fraction of cells, which might be small enough to eradicate a cancer (7); there has been a renewed interest in this approach to tumor extinction (8, 9). The role of interpatient heterogeneity was observed in 1962 by the Acute Leukemia Group B, which found that combination therapy can increase the percentage of patients who respond because some patients’ cancers respond to one drug and some patients’ cancers respond to a different drug (10). Recently, we experimentally confirmed Law’s hypothesis in DLBCL cultures treated with RCHOP (5). We also found that interpatient heterogeneity explains much of the clinical efficacy of modern combination therapies for incurable cancers (11, 12). Given the importance of both intratumor and interpatient heterogeneity to combination therapy, there is a need to unite both forms of heterogeneity into one theory.
To begin to understand the joint effect of interpatient and intratumor heterogeneity, we recently described simple models which propose that because single drugs have variable efficacies among patients, drug combinations can be viewed as a sum of variable effects in each patient (13). Applied to fractional kill in tumor cells, this concept explains historical improvements in response and cure rates in pediatric acute lymphocytic leukemia (13). Applied to progression-free survival (PFS) data, this concept explains the PFS durations of most drug combinations to have received FDA approval for advanced cancers in the past 25 years (4). These simple phenomenologic models have practical applications, but a mechanistic model, grounded in pharmacology and tumor biology, is desirable for a conceptual understanding of how combination therapies sometimes succeed and sometimes fail. In particular, understanding curative combinations will require models that can describe dose responses, treatment schedules, and tumor kinetics.
To develop a mechanistic model of tumor responses to drug combinations, in this study, we first consider the seminal work of Skipper and colleagues (14), who showed in mice with leukemia that a dose of chemotherapy kills a constant fraction of cancer cells, known as the “fractional kill hypothesis.” An extrapolation of this hypothesis has become a textbook figure, showing a constant fraction of tumor population killed by each cycle of therapy (Fig. 1A; refs. 6, 15). However, Skipper’s experiments did not suggest that second or later doses have the same cytotoxic effect as the first, and this extrapolation leads to the false conclusion that any initially responsive tumor could be cured with enough cycles of chemotherapy. Past models have resolved this deficit by including drug-resistant subpopulations of cancer cells, which are selected by therapy and cause relapse (Fig. 1A). Variations on this approach have produced many insights into the evolution of drug resistance and have informed the design of treatments (16–19). Describing cancer cells simply as drug “sensitive” or “resistant” may be appropriate for cases like BCR-ABL inhibition in which a binding site mutation is the chief determinant of drug resistance (20). However, this binary model is generally oversimplistic, being unable to capture the spectrum of treatment responsiveness in human populations or to explain how it could be that some cancers are curable even though treatment always selects for resistant subpopulations. Here, we show that by moving beyond a binary view of drug resistance, modeling both intratumor and interpatient heterogeneity as distributions of drug sensitivity phenotypes can reproduce survival distributions in curative settings.
Modeling intratumor heterogeneity in drug sensitivity as a distribution. A, Traditional model depicting tumors as two subpopulations of cells: drug sensitive and drug resistant, with dose–response functions FS and FR, respectively. B, New model depicting tumor heterogeneity as a population of cells with a distribution of drug sensitivity phenotypes. Each cell’s dose–response function depends on its level of drug sensitivity. C, In the new model, each treatment cycle preferentially kills more sensitive cells (arrows represent cytotoxicity), shifting the distribution toward resistance and making subsequent cycles less effective. D, Distributions of drug sensitivity phenotypes in DLBCL cells treated with RCHO are approximately log-normal, as previously measured by high-complexity clone tracing (106 clones; only enrichment is quantifiable) and by genome-wide CRISPR inhibition screens (∼18,900 gene targets; both enrichment and depletion are quantifiable). E, Lymphoma cells’ sensitivities to different drugs (e.g., doxorubicin and vincristine), as previously measured by CRISPR inhibition screens in either single drug, are approximately described by a multivariate log-normal distribution (see other pairs in Supplementary Fig. S1C and S1D). F, DLBCL cells’ dose responses to drugs in RCHO, as previously measured with equipotent combinations of one through four drugs, are approximately reproduced by simulations using exponential dose responses. CRISPRi, CRISPR inhibition; LBCL, large B-cell lymphoma.
Modeling intratumor heterogeneity in drug sensitivity as a distribution. A, Traditional model depicting tumors as two subpopulations of cells: drug sensitive and drug resistant, with dose–response functions FS and FR, respectively. B, New model depicting tumor heterogeneity as a population of cells with a distribution of drug sensitivity phenotypes. Each cell’s dose–response function depends on its level of drug sensitivity. C, In the new model, each treatment cycle preferentially kills more sensitive cells (arrows represent cytotoxicity), shifting the distribution toward resistance and making subsequent cycles less effective. D, Distributions of drug sensitivity phenotypes in DLBCL cells treated with RCHO are approximately log-normal, as previously measured by high-complexity clone tracing (106 clones; only enrichment is quantifiable) and by genome-wide CRISPR inhibition screens (∼18,900 gene targets; both enrichment and depletion are quantifiable). E, Lymphoma cells’ sensitivities to different drugs (e.g., doxorubicin and vincristine), as previously measured by CRISPR inhibition screens in either single drug, are approximately described by a multivariate log-normal distribution (see other pairs in Supplementary Fig. S1C and S1D). F, DLBCL cells’ dose responses to drugs in RCHO, as previously measured with equipotent combinations of one through four drugs, are approximately reproduced by simulations using exponential dose responses. CRISPRi, CRISPR inhibition; LBCL, large B-cell lymphoma.
Distributions of phenotypes are a common observation in physiology and pharmacology. A foundational theorem of probability, the central limit theory, helps explain why: a quantity that is the product of many random variables tends to be log-normally distributed (21). Consequently, complex biological traits influenced by many underlying factors often exhibit cell-to-cell or organism-to-organism variations that can be well approximated by log-normal distributions (Supplementary Fig. S1A and S1B). This property has long been exploited by “mixed-effects” models of population pharmacokinetics (PK), which use parameter distributions to describe patient-to-patient variations in PK rates (22). This approach does not have to explicitly model each mechanism of variation; instead, it fits a magnitude of interpatient variance to observed PK variation in populations, implicitly accounting for biological variation even when underlying mechanisms are unknown. This approach could be particularly helpful to model drug resistance in cancer, wherein mechanisms of resistance are diverse and poorly understood and, indeed, experimental data show distributions of phenotypic heterogeneity in drug responsiveness (23, 24). Fitting to human distributions of treatment effect can capture the net impact of many sources of drug response heterogeneity, including genetics, epigenetics, tumor microenvironment, drug bioavailability, and others (irrespective of debates over which is most important). Mixed-effects models have been used before to model interpatient variance for in silico clinical trials (25). Applying the population PK approach to tumor drug responses, “population-tumor kinetic” (pop-TK) models represent both intratumor and interpatient heterogeneity with parameter distributions, enabling simulations of chemotherapy response that include both cell-to-cell and patient-to-patient variations.
We apply the pop-TK model to understand the development of first-line treatments for DLBCL, which has spanned the cyclophosphamide, doxorubicin, vincristine, and prednisone (CHOP) regimen introduced in the 1970s (2), the addition of rituximab making RCHOP in 2002 (26), two decades of unsuccessful attempts to improve upon RCHOP (27–36), and the recent success of the RCHP plus polatuzumab vedotin (Pola-RCHP) regimen (29). This history of trials in DLBCL prompts a critical question: Why do some new combination therapies improve cure rate, whereas others fail? Here, we use a mechanistic model to understand how cures are achieved by current regimens and inform the design of new curative-intent regimens.
Results
Modeling Multidrug Response of Heterogeneous Cancers
Our model of heterogeneous cancers treated with drug combinations is explained here by building up in complexity. We begin with a simple, familiar framework of one drug acting on one patient, whose cancer contains drug-sensitive and drug-resistant cells with initial populations and , in which and are dose–response functions describing the fractions of sensitive and resistant cells that survive one cycle of drug treatment at dose (Fig. 1A). The total number of cells that survive treatment will be . Applying this model to repeated cycles of therapy with growth between cycles produces the Darwinian selection for drug resistance (Fig. 1A). If heterogeneity in drug sensitivity is described as a series of many states rather than two extremes, the total population after treatment can be described as a sum . It follows then, that a sum of many states can be described as an integral over a distribution of drug sensitivity levels : (Fig. 1B). In this view, the consequence of intratumor heterogeneity is that the cytotoxic effect of chemotherapy is described by an ensemble of dose–response functions, with some cells being more sensitive than average and some more resistant than average. Greater or lesser drug sensitivity of the tumor cell population as a whole is defined by the initial center of the drug sensitivity distribution. Treatment preferentially kills cells on the drug-sensitive side of the distribution, such that repeated cycles of therapy progressively shift a tumor population toward resistance (Fig. 1C), reproducing a common clinical scenario of initial drug response followed by relapse. Cancer cells can resist chemotherapy for a multitude of reasons. Here, rather than modeling specific mechanisms, all physiologically relevant causes of resistance are viewed as reasons why some (perhaps many) tumor cells lie on the resistant side of the distribution. Darwinian selection therefore emerges from this model in a mechanism-agnostic manner.
Cross-resistance between drugs manifests as a correlation in the sensitivity distribution , and the multidrug response function can include synergistic, additive, or antagonistic drug interactions (“Simulating Pharmacologic Interactions”). This framework similarly expands to any number of drugs.
This approach has separated tumor response to chemotherapy into two parts: the distribution of tumor cells’ drug sensitivities and their dose–response function . These functions will depend on the drugs and cancer type of interest and are both measurable. Drug sensitivity of cancer cells in vivo depends on many factors, including genetics, epigenetics, microenvironment, PK, and more. When a quantitative phenotype is the product of many factors, its population distribution can usually be approximated by a log-normal distribution, as a consequence of the central limit theorem (Supplementary Fig. S1A and S1B). Indeed, for DLBCL sensitivity to drugs in RCHOP, we reanalyzed published high-complexity clone-tracing experiments and genome-wide CRISPR inhibition screens under therapy (5) and observed that drug sensitivity phenotypes in DLBCL cells are approximately log-normally distributed (Fig. 1D). Furthermore, considering sensitivity to multiple drugs in RCHOP, experimental data closely resemble partly correlated multivariate log-normal distributions and specify magnitude of correlation for RCHOP (Fig. 1E; Supplementary Fig. S1C and S1D). Determinants of chemosensitivity and tumor heterogeneity are surely more complex in vivo than in vitro, which strengthens rather than weakens the rationale for describing drug sensitivity as a distribution rather than binary extremes. This approach is helpfully agnostic to our limited knowledge of in vivo determinants of chemosensitivity.
Turning to dose–response , like most cytotoxic chemotherapies, the drugs in RCHOP exhibit approximately exponential dose–response functions, both as single drugs and when combined (Fig. 1F; refs. 5, 6). Although some pairs of drugs in RCHOP interact antagonistically, the response to the full five-drug combination was observed to be additive (noninteracting) according to both Loewe’s and Bliss’ definitions of synergy (5). The effect of differing drug sensitivity is modeled using a previously reported approach of rescaling drug concentration, such that drug-resistant cells behave as though exposed to less drug (37, 38). Rescaling drug concentration is specifically true for resistance mechanisms that decrease target binding (e.g., efflux pumps, binding site mutations, drug degradation, and the loss of prodrug activation) and is also observed to approximate the effects of diverse resistance mechanisms in settings ranging from antibacterial drugs to chemotherapies for DLBCL (5).
A clinical trial is modeled as a virtual cohort of heterogeneous patients. Thus, Eq. (B) is applied to each of the many simulated patients having their own values for average drug sensitivity (), initial number of cancer cells (), and tumor growth rate (). Tumor growth rates and initial population sizes are sampled from clinically measured distributions (39, 40). Patient-to-patient variations in drug sensitivity are modeled as a log-normal distribution, consistent with a prior analysis of individual patient data from ≈500 oncology trial arms showing that survival distributions are well approximated as log-normal (median R2 = 0.98; ref. 41). We also investigated whether the observed molecular heterogeneity in DLBCL corresponds to approximately the log-normal variation, by estimating the net effect of expression differences on drug sensitivity. Specifically, measured effects of gene expression on sensitivity or resistance to drugs in RCHOP were weighted by gene expression in each of the 137 treatment-naïve DLBCL samples, which resulted in approximately log-normal patient-to-patient variations (“Estimating Patient-to-Patient Variation in Drug Sensitivity,” Supplementary Fig. S2A and S2B). Parameters defining patient-level drug sensitivity distributions (average and variance) are fitted to reproduce observed PFS distributions, as detailed in the following section. Variation among patients in average drug sensitivity means that, for example, some patients’ entire cancer cell populations are refractory to treatment (Fig. 2A, red), whereas other patients may be exceptional responders with highly drug-sensitive cancers (Fig. 2A, yellow). Various treatment schedules can be simulated, producing trajectories of tumor cell populations over time, which determine whether a virtual patient is cured, or if not, when they experience disease progression (Fig. 2B). A cohort of outcomes yields a Kaplan–Meier plot of PFS (Fig. 2C). Three parameters chiefly influence drug activity: average drug sensitivity across all patients, interpatient variance (), and intratumor or cell-to-cell variance (); these affect PFS distributions in distinct ways and can therefore be fitted to clinical trial results (Supplementary Fig. S3A–S3C).
Including interpatient heterogeneity to simulate clinical trials. A, Top, Interpatient variation is simulated by sampling each virtual patient’s drug sensitivity from a log-normal distribution with mean and interpatient heterogeneity . Bottom, Each patient’s cancer is then modeled as a population of cells with a distribution of drug sensitivities (as shown in Fig. 1), with the average and intratumor heterogeneity . B, Drug combinations are modeled by describing each tumor with a multivariate sensitivity distribution, with one dimension per drug. A virtual patient’s trajectory of tumor population is calculated from their drug sensitivity distribution, initial tumor size, and tumor growth rate. Trajectories for three virtual patients are shown. C, Progression times from tumor trajectories of a virtual cohort are used to generate PFS distributions.
Including interpatient heterogeneity to simulate clinical trials. A, Top, Interpatient variation is simulated by sampling each virtual patient’s drug sensitivity from a log-normal distribution with mean and interpatient heterogeneity . Bottom, Each patient’s cancer is then modeled as a population of cells with a distribution of drug sensitivities (as shown in Fig. 1), with the average and intratumor heterogeneity . B, Drug combinations are modeled by describing each tumor with a multivariate sensitivity distribution, with one dimension per drug. A virtual patient’s trajectory of tumor population is calculated from their drug sensitivity distribution, initial tumor size, and tumor growth rate. Trajectories for three virtual patients are shown. C, Progression times from tumor trajectories of a virtual cohort are used to generate PFS distributions.
Explaining Clinical Outcomes in DLBCL
The RCHOP regimen contains five drugs with distinct mechanisms of action, and therefore its antitumor effect is modeled by a five-dimensional extension of Eq. (B) (pop-TK model). We calibrated our model on PFS distributions from the CHOP and RCHOP arms of the RICOVER-60 trial (42). Specifically, the Metropolis–Hastings algorithm fitted six parameters (mean sensitivity, interpatient and intratumor heterogeneity, interpatient and intratumor cross-resistance, and average tumor growth rate) to CHOP and one to the activity of rituximab in RCHOP (Supplementary Fig. S4), with the remaining parameters directly set by clinical measurements (Supplementary Table S1). The algorithm returned an ensemble of parameter sets that falls within the 95% confidence interval (CI) of observed PFS distributions and reproduces the benefit of rituximab (Fig. 3A). The primary source of parameter flexibility was a trade-off between intratumor heterogeneity and average drug sensitivity because greater heterogeneity entails more resistant cells. The parameter set closest to the average of the ensemble was selected as the representative set for subsequent simulations (Supplementary Fig. S5A and S5B).
Model reproduces clinical efficacy of combination chemoimmunotherapy in DLBCL. A, PFS distributions of CHOP or RCHOP as first-line therapy for DLBCL (RICOVER-60 trial, arms with six or eight cycles are aggregated; 95% CIs are shown) are approximately reproduced by simulations. Each simulation line is one parameter set from an ensemble of acceptable sets (Supplementary Figs. S4 and S5). B, Simulated tumor kinetics of virtual patients treated with RCHOP, using the parameter set closest to the center of the ensemble. Simulated PFS curves shown in (A) are calculated from 1,000 such simulated patients. For clarity, lines connect the minimum of tumor population after each cycle and therefore lack the “sawtooth” appearance of Fig. 2B. C, Simulated and observed changes in lymphoma population after the first cycle of RCHOP. Observed changes were measured by liquid biopsy assays of ctDNA (40). Simulated changes are from trajectories in (B). D, Pooled analysis of the HR for disease progression associated with high Ki67 expression, indicating rapid tumor proliferation, in studies on patients with DLBCL treated with RCHOP. Simulations of RCHOP, stratified by tumor growth rate, produce a similar prognostic effect of rapid tumor proliferation. LBCL, large B-cell lymphoma; ND, no detectable ctDNA.
Model reproduces clinical efficacy of combination chemoimmunotherapy in DLBCL. A, PFS distributions of CHOP or RCHOP as first-line therapy for DLBCL (RICOVER-60 trial, arms with six or eight cycles are aggregated; 95% CIs are shown) are approximately reproduced by simulations. Each simulation line is one parameter set from an ensemble of acceptable sets (Supplementary Figs. S4 and S5). B, Simulated tumor kinetics of virtual patients treated with RCHOP, using the parameter set closest to the center of the ensemble. Simulated PFS curves shown in (A) are calculated from 1,000 such simulated patients. For clarity, lines connect the minimum of tumor population after each cycle and therefore lack the “sawtooth” appearance of Fig. 2B. C, Simulated and observed changes in lymphoma population after the first cycle of RCHOP. Observed changes were measured by liquid biopsy assays of ctDNA (40). Simulated changes are from trajectories in (B). D, Pooled analysis of the HR for disease progression associated with high Ki67 expression, indicating rapid tumor proliferation, in studies on patients with DLBCL treated with RCHOP. Simulations of RCHOP, stratified by tumor growth rate, produce a similar prognostic effect of rapid tumor proliferation. LBCL, large B-cell lymphoma; ND, no detectable ctDNA.
Simulated tumor kinetics for a cohort of virtual patients underlies the fitted PFS distributions (Fig. 3B) and reproduces kinetic features of response to RCHOP. Simulations demonstrate the necessity of six cycles of RCHOP to cure most patients, even though some are cured earlier, which are consistent with the observed marginal difference made by eight cycles (42). Sensitive assays of circulating tumor DNA (ctDNA) have allowed liquid biopsies to measure tumor cell kill below the radiologic limit of detection, and although our model was not fitted to such data, the observed distribution of ctDNA changes after the first cycle of RCHOP was similar to its simulated equivalent over five orders of magnitude, as was the proportion of patients with undetectable disease after the first cycle (observed 28% and simulated 33%), indicating that the model reproduces the measured decline of tumor populations (Fig. 3C). Furthermore, Ki67 is a marker of tumor cell proliferation and high Ki67 staining is an adverse prognostic factor in DLBCL (43–48). We stratified simulated patients by their tumor proliferation rate (a parameter with interpatient variation) and found that it was an adverse prognostic factor in the model (HR = 1.3, P < 0.01) similar to clinical findings (Fig. 3D). Together, these results demonstrate that the fitted pop-TK model reproduces kinetics of RCHOP response and the prognostic effect of tumor growth rate.
Predicting DLBCL Combination Therapy Trials from Monotherapy Activities
Since the development of RCHOP, many phase III trials have tested new drugs in combinations for first-line DLBCL, nearly all of which have been preceded by a trial of the new drug in relapsed or refractory (R/R) DLBCL (Supplementary Table S2; refs. 27–36, 49). Because treatment-naïve disease encountered at first-line is more drug sensitive, the efficacy of new drugs in R/R disease has not been formally used to predict the efficacy of first-line treatments. To bridge this gap, we used the pop-TK model to simulate the selective pressure of first-line treatment and thereby relate drug efficacy in R/R disease with first-line therapy.
A virtual cohort of R/R patients was created by simulating an RCHOP-treated cohort and selecting patients who experience disease progression (Fig. 4A; Supplementary Fig. S6). This process represents the most drug-sensitive lymphomas being cured at first-line and is therefore absent from R/R cohorts, as well as R/R lymphomas becoming even more drug resistant from the selective pressure of first-line chemotherapy. We next simulated subsequent therapies in this R/R cohort, fitting drug efficacy parameters (mean and interpatient variance of sensitivity) to match observed PFS distributions of nine drug trials in R/R disease (Fig. 4B, column 1; refs. 49–57). In the case of polatuzumab vedotin (Pola), efficacy parameters were fit to trials of bendamustine and rituximab with or without Pola, and rituximab with Pola (Supplementary Fig. S7A). These parameters for drug efficacies were used to simulate first-line combination therapy trials, tailored to actual regimens in terms of whether a new drug was added to RCHOP (ibrutinib, lenalidomide, bevacizumab, venetoclax, and tucidinostat), administered as maintenance after response to RCHOP (everolimus and enzastaurin), or replaced a drug in RCHOP (obinutuzumab replaced rituximab and Pola replaced vincristine; Fig. 4B, column 2; Supplementary Table S2). Predicted PFS HRs fell within the 95% CIs of observed HRs for eight of nine trials (Fig. 4C). The model overestimated the efficacy of RCHOP plus bevacizumab; in actuality, this trial was terminated early because of a high incidence of congestive heart failure, which also necessitated administering only half of the intended cumulative bevacizumab dose (30). This underscores that models of tumor response do not predict adverse events in other organs, which are a distinct and critical determinant of outcomes, although pop-TK models predicted the antitumor efficacy of each tolerable regimen.
Predicting the efficacy of first-line combination therapies from drug efficacy in R/R DLBCL. A, R/R trials are simulated by first treating a virtual cohort of patients with DLBCL with RCHOP, which cures some but not all patients, then simulating subsequent therapy for virtual patients with R/R disease. B, Simulations of R/R therapy fitted two parameters—mean and interpatient variance in drug sensitivity—to align simulated (gray) with observed (red) PFS from R/R trials of nine drugs. These parameters for the efficacy of new drug “X” were used to simulate first-line trials of RCHOP (black) vs. RCHOP + X (purple), with dose schedules and sample sizes corresponding to the actual trial designs. Each simulated trial is compared with the observed trial result (RCHOP, black; RCHOP + X, blue). C, Comparison of simulated (purple) and observed (blue) PFS HRs from RCHOP + X clinical trials. D, Expected statistical power to detect significant improvement in PFS was calculated from 1,000 virtual trials, matching sample size, follow-up, and distribution of censoring events of each actual trial. The vertical dashed line marks 80% power. E, The difference in restricted mean survival time (RMST) between the experimental and control arms was calculated for each simulated trial and actual trial at 1, 2, and 3 years (note: trials of bevacizumab had sufficient follow-up for 1 year and tucidinostat for 2 years). Spearman correlation between simulated and observed restricted mean survival time was 0.73 (P < 0.001).
Predicting the efficacy of first-line combination therapies from drug efficacy in R/R DLBCL. A, R/R trials are simulated by first treating a virtual cohort of patients with DLBCL with RCHOP, which cures some but not all patients, then simulating subsequent therapy for virtual patients with R/R disease. B, Simulations of R/R therapy fitted two parameters—mean and interpatient variance in drug sensitivity—to align simulated (gray) with observed (red) PFS from R/R trials of nine drugs. These parameters for the efficacy of new drug “X” were used to simulate first-line trials of RCHOP (black) vs. RCHOP + X (purple), with dose schedules and sample sizes corresponding to the actual trial designs. Each simulated trial is compared with the observed trial result (RCHOP, black; RCHOP + X, blue). C, Comparison of simulated (purple) and observed (blue) PFS HRs from RCHOP + X clinical trials. D, Expected statistical power to detect significant improvement in PFS was calculated from 1,000 virtual trials, matching sample size, follow-up, and distribution of censoring events of each actual trial. The vertical dashed line marks 80% power. E, The difference in restricted mean survival time (RMST) between the experimental and control arms was calculated for each simulated trial and actual trial at 1, 2, and 3 years (note: trials of bevacizumab had sufficient follow-up for 1 year and tucidinostat for 2 years). Spearman correlation between simulated and observed restricted mean survival time was 0.73 (P < 0.001).
We further evaluated trial predictions by their expected statistical power and by restricted mean survival time for PFS, which quantifies the area under the survival curve up to a defined time. Each trial’s predicted HR and actual sample size were used to compute “statistical power,” being the likelihood of observing significantly improved PFS (using P < 0.05). The only regimens the trials of which reached the conventional target of 80% power were Pola-RCHP (for higher risk DLBCL defined by IPI ≥ 2) and RCHOP plus tucidinostat (for higher risk DLBCL defined by double expression of MYC and BCL2), which were indeed the only regimens to significantly improve PFS or event-free survival (EFS) over RCHOP (Fig. 4D; refs. 29, 49). Notably, our preliminary prediction of Pola-RCHP versus RCHOP was posted online before this trial’s results were first presented (Supplementary Fig. S7B). Finally, predicted and observed changes in restricted mean survival time at 1, 2, and 3 years were significantly correlated (Fig. 4E; Spearman r = 0.73; P < 0.001). These results show that the pop-TK model can use new drug efficacy in R/R clinical trials to predict the efficacy of new first-line combination therapies, including strategies of adding a drug, replacing a drug, or adjuvant therapy.
Model-Driven Clinical Trial Design Considerations
Having found that the pop-TK model explains recent trial results in DLBCL, we next used it to explore how future trials of combination therapy can improve patient outcomes. First, we aimed to understand why adding new drugs that do have monotherapy activity failed to improve PFS in so many trials. Virtual trials of increasing the numbers of drugs, all having equal monotherapy activity (N vs. N + 1 drugs), demonstrated diminishing returns in their relative reduction of risk (Fig. 5A). This suggests that improving survival beyond a five-drug standard of care will require either new drugs to be dramatically superior to existing ones or additional strategies such as synergistic combinations or the use of biomarkers for precision drug combinations. We next investigated these possibilities using virtual trials.
Impact of drug properties on clinical outcomes. A, The simulated effect of adding more drugs to a combination, each with equal individual efficacy, exhibits diminishing returns (HRs closer to 1). Simulated PFS distributions are shown for drugs with relative efficacies of 0.1 or 1, where 1 means as effective as the drugs in CHOP. B, Simulated HRs of RCHOP + X vs. RCHOP, as a function of the efficacy of drug X (horizontal) and pharmacologic interaction between X and a drug in RCHOP (vertical). Efficacy is quantified as a relative dose, where 1 is as effective as a drug in CHOP. Synergy is quantified as the combination index (CI), where CI = 1 means additive, CI < 1 means synergy (e.g., 0.5 means potency is doubled), and CI > 1 means antagonism (e.g., 2 means potency is halved). A white contour marks where a trial with 400 patients per arm has 80% power. Isobolograms illustrate different magnitudes of synergy. C, Simulated HRs of RCHOP + X vs. RCHOP as a function of the efficacy of drug X (horizontal) and the accuracy of a biomarker to select X-responsive patients (vertical). Biomarker accuracy is quantified as a correlation between a biomarker score and patient sensitivity to drug X, with P = 1 indicating perfect prediction and P = 0 indicating no predictive value. Swimmer plots illustrate patient selection at different correlations.
Impact of drug properties on clinical outcomes. A, The simulated effect of adding more drugs to a combination, each with equal individual efficacy, exhibits diminishing returns (HRs closer to 1). Simulated PFS distributions are shown for drugs with relative efficacies of 0.1 or 1, where 1 means as effective as the drugs in CHOP. B, Simulated HRs of RCHOP + X vs. RCHOP, as a function of the efficacy of drug X (horizontal) and pharmacologic interaction between X and a drug in RCHOP (vertical). Efficacy is quantified as a relative dose, where 1 is as effective as a drug in CHOP. Synergy is quantified as the combination index (CI), where CI = 1 means additive, CI < 1 means synergy (e.g., 0.5 means potency is doubled), and CI > 1 means antagonism (e.g., 2 means potency is halved). A white contour marks where a trial with 400 patients per arm has 80% power. Isobolograms illustrate different magnitudes of synergy. C, Simulated HRs of RCHOP + X vs. RCHOP as a function of the efficacy of drug X (horizontal) and the accuracy of a biomarker to select X-responsive patients (vertical). Biomarker accuracy is quantified as a correlation between a biomarker score and patient sensitivity to drug X, with P = 1 indicating perfect prediction and P = 0 indicating no predictive value. Swimmer plots illustrate patient selection at different correlations.
Unlike our previous simple models of noninteracting drug combinations (4, 11–13), pop-TK models can simulate the clinical consequence of drug–drug interactions. We simulated combination therapy trials spanning a range of values for individual efficacy of a new drug and a range of pharmacologic interactions between the new drug and a drug in RCHOP, from strong antagonism (combination index 5, meaning fivefold weaker potency) to additive (combination index 1) to strong synergy (combination index 0.2, meaning fivefold stronger potency; Fig. 5B). Synergies of up to twofold potentiation had a negligible effect on trial results, but synergies greater than twofold in all patients had the potential to improve outcomes, provided the new drug’s single-agent activity was at least as good as drugs in the standard of care (i.e., RCHOP). Conversely, strong antagonism had only a minor detrimental effect, suggesting that antagonism need not necessarily disqualify agents from being combined if they have substantial single-agent efficacy. These results suggest that enthusiasm for preclinical synergy would be best limited to interactions with greater than twofold enhanced potency (combination index <0.5), with consistency across most tumor models (e.g., panels of cell cultures), and involving drugs with substantial single-agent efficacy in humans. Such properties are uncommon but have been demonstrated by drug pairs in the investigational regimens LTRA and ViPOR, which have shown clinical promise (58, 59).
Drugs with moderate response rates could also elicit greater activity in combinations if biomarkers are available to select patients most responsive to the added drug. We simulated combination therapy trials with biomarker-based enrollment, investigating the influence of overall drug efficacy and biomarker accuracy. Specifically, we defined a metric for correlation between a biomarker and drug efficacy, whereby a correlation of 1 indicates that a biomarker perfectly selects patients who respond best to a drug and a correlation of 0 means the biomarker has no predictive value (“Simulating Biomarker Stratified Trials”). Virtual trials found that increasing biomarker performance led to superior HRs and statistical power at all levels of drug efficacy (Fig. 5C). A correlation of 0.5 between a biomarker and drug activity, which leads to an assignment of two thirds of patients to the optimal treatment (Supplementary Fig. S8A and S8B), was just as beneficial as a drug being 2.5-fold more potent, suggesting that even imperfect biomarkers have substantial potential for precision combination therapy (Fig. 5C).
DLBCL has been stratified into prognostically significant subgroups based on molecular features. Subgroups of interest include double-expressor lymphoma (concurrent expression of MYC and BCL2 or BCL6), cell-of-origin subtypes of activated B cell (ABC) and germinal center B cell (GCB), and genetic subtypes defined by exome sequencing (60–63). Subgroup-specific simulations of RCHOP therapy can be created by fitting average drug sensitivity to clinically observed PFS in each subgroup (Fig. 6A–C). We find that significant prognostic differences can be explained by differences in drug sensitivity that could seem modest in experiments; for example, 30% lower PFS at 2 years in double-expressor lymphoma arises from threefold lower drug potency. As an example of modeling subtype-specific combinations, we analyzed the performance of cell-of-origin as a predictor of benefit from Pola. Multiple clinical trials have observed that ABC DLBCL is more responsive to Pola than GCB DLBCL (64, 65), including a randomized phase II study of bendamustine and rituximab (BR) with or without Pola (52). We simulated subtype-specific trials of BR ± Pola to calibrate the efficacy of Pola in each cell-of-origin subtype and found that fourfold greater potency of Pola against ABC than GCB reproduced their observed differences in PFS HR (Fig. 6D). Next, we used these relative Pola efficacies to predict subtype-specific PFS distributions and HRs for the phase III trial of Pola-RCHP versus RCHOP (29, 63) and found that the actual large difference was predictable from the subtype-specific efficacy seen in the preceding phase II trial (Fig. 6D). This example demonstrates the potential of early-phase biomarker observations to anticipate biomarker-based endpoints in phase III trials, which could be used to select patients who will derive the greatest survival benefits from new combination therapies.
Modeling clinical outcomes in DLBCL subgroups. Subgroup-specific models of RCHOP therapy are constructed by fitting patient sensitivity distributions to observed PFS distributions in subgroups including (A) DLBCL stratified by MYC/BCL2 expression status (double-expressor vs. all others), (B) DLBCL stratified by LymphGen subtypes, and (C) DLBCL cell-of-origin (GCB and ABC). D, The efficacy of Pola in each cell-of-origin subtype was fitted to reproduce the observed PFS HRs by cell-of-origin in a phase II trial comparing bendamustine and rituximab (BR) with or without Pola in R/R DLBCL. These subtype-specific efficacies were used to predict PFS distributions and HRs for each cell-of-origin subtype in the first-line (1L) phase III trial comparing Pola-RCHP with RCHOP.
Modeling clinical outcomes in DLBCL subgroups. Subgroup-specific models of RCHOP therapy are constructed by fitting patient sensitivity distributions to observed PFS distributions in subgroups including (A) DLBCL stratified by MYC/BCL2 expression status (double-expressor vs. all others), (B) DLBCL stratified by LymphGen subtypes, and (C) DLBCL cell-of-origin (GCB and ABC). D, The efficacy of Pola in each cell-of-origin subtype was fitted to reproduce the observed PFS HRs by cell-of-origin in a phase II trial comparing bendamustine and rituximab (BR) with or without Pola in R/R DLBCL. These subtype-specific efficacies were used to predict PFS distributions and HRs for each cell-of-origin subtype in the first-line (1L) phase III trial comparing Pola-RCHP with RCHOP.
Discussion
By modeling tumor heterogeneity as distributions of drug sensitivity, simulations of multidrug response in patient populations can reproduce clinical trial results and predict the efficacy of new combination regimens. This study has not proposed new mechanisms but sought to assemble established facts into a lifelike simulation of tumor responses in patient populations. Our model is based on three central assumptions: (i) the effects of drugs depend on dose, (ii) drug sensitivity varies among the cancer cells within a patient, and (iii) the average drug sensitivity of cancers varies between patients. We model these phenomena by mathematical approximations, and although other descriptions are possible, it is well established that these phenomena are real: cancers are not homogeneous. Adopting the mixed-effects approach of population-PK allows this “population tumor kinetics” model to implicitly account for heterogeneity in drug response caused by many complex aspects of tumor biology. Rather than explicitly modeling each important influence on drug response, such as tumor genetics and the immune microenvironment, their overall impact is described by distributions of drug sensitivity at the patient-to-patient and cell-to-cell scales. In short, if any process contributes to drug resistance, it is a reason why some patients or some cells lie in the resistant tail of a sensitivity distribution. This approach sidesteps many deeper details but enables practical applications to design regimens and trials. Further mechanistic complexity could be added as needed for particular questions.
Application of the pop-TK model to combination therapy for DLBCL explains and predicts important clinical outcomes. When calibrated to PFS distributions of >1,000 patients treated with either CHOP or RCHOP (Fig. 3A), the pop-TK model reproduced tumor shrinkage kinetics (Fig. 3C; Supplementary Fig. S5B) and the prognostic impact of growth rate (Fig. 3D). By fitting a model of R/R patients to trials of new therapies in R/R patients, the pop-TK model was able to accurately predict the results of adding new therapies to first-line combinations (Fig. 4). Virtual trials were also used to calculate the statistical power of real trials, not based on an aspirational HR as is usual in trial design but using a data-based prediction of HR, and correctly predicted that only Pola-RCHP and RCHOP–tucidinostat were likely to be significantly superior to RCHOP (Fig. 4D). Therapies that did not improve efficacy when added to first-line combinations also had poor efficacy in the R/R setting. Two scenarios could underlie poor efficacy in R/R disease: the therapy has little antitumor activity at clinically achievable doses or the it is sufficiently cross-resistant with the first-line treatment that RCHOP-resistant patients also resist the new therapy. In either scenario, the common feature of failed combinations is that the new therapy is not effective at killing cancer cells that survive the standard regimen. Interestingly, the model did not reproduce the incidence of relapses later than 3 years, but recent genomic studies have found that “late relapses” of DLBCL usually result from cancer precursor cells acquiring new mutations to develop into related yet de novo cases of DLBCL (66). Thus, the inability of our simulations to reproduce late relapses supports the theory that late relapses are essentially new, treatment-naïve lymphomas.
This model focuses on antitumor effects of drug combinations and does not attempt to predict toxicity or overall survival (OS). OS depends on response to subsequent treatments that are outside the scope of this model, such as stem-cell transplant or chimeric antigen receptor (CAR) T-cell therapy. As such, although OS is a critical trial endpoint, post-progression survival is not amenable to mechanistic modeling. When drug combinations result in toxicities that require major adjustments to drug doses or schedule, as occurred for RCHOP plus bevacizumab, modeling the intended full dose cannot be expected to reproduce actual outcomes (Fig. 4C). However, this model could be used to understand the clinical impact of known adjustments, including dose interruptions or reductions. Empiric data on incidence and magnitude of dose adjustments, as observed in phase I or II studies, could be incorporated into models, to forecast whether those adjustments might compromise the antitumor efficacy of a new regimen.
This mechanistic model of combination therapy in DLBCL illuminated why some new drug combinations improve survival and others do not. We investigated three factors influencing efficacy, noting that considerations of efficacy are subject to the prerequisite that a regimen is tolerable. First, a new drug’s efficacy in R/R disease was a critical determinant of its efficacy in a first-line combination; most ineffective combinations were predictable from inadequate monotherapy activity. Second, using biomarkers to select patients who were likely to respond to a new drug could improve the outcomes of trials in which that drug is added to a combination (Fig. 5C). For example, the greater efficacy of Pola in R/R patients with ABC-subtype DLBCL predicted the observation in previously untreated DLBCL that Pola-RCHP selectively improves PFS in the ABC subtype (HR = 0.34) and not in the GCB subtype (HR = 1.03; Fig. 6D; refs. 29, 63). This finding supports using cell-of-origin subtype to choose between RCHOP and Pola-RCHP for untreated DLBCL and underscores the broader potential of biomarker-driven selection to improve outcomes in combination therapy trials (65). Notably, if subtype-specific efficacy is observed in early-phase trials, subtype-specific models could be used to predict the value of subtype-specific trial endpoints. Third, synergistic drug interactions demonstrated limited ability to improve combination therapy except when strong synergy is exhibited by therapies that also possess substantial single-agent efficacy. This finding cautions against enthusiasm for synergy when it has a modest magnitude, is spuriously present across preclinical models, or occurs among individually ineffective agents. Together, these clinical trial simulations suggest that developing new individually effective drugs and biomarkers for the most responsive disease subsets offers substantial potential to improve the success of first-line therapy for DLBCL.
In the future, this model could be adapted to other therapies and types of cancer by adjusting dose–response functions, sensitivity distributions, and growth rates based on data from other diseases. We also anticipate that further biological details, such as spatial aspects of solid tumors, could be built atop the current framework. Simulations of clinical trials do not replace clinical trials, just as simulations of airplanes do not replace airplanes, but, in both cases, the models can inform new designs and reduce the rate of failure. Reducing the frequency of unsuccessful phase III clinical trials has great potential to lower the cost of developing superior cancer treatments and accelerate improvements in survival.
Methods
Pop-TK Model
Tumor growth between cycles is described by the term , where is the interval between cycles in weeks and is growth rate in weeks−1. Multidimensional integration is performed numerically using PyCuba (67). Code and source data for all simulations in this work are available in a Code Ocean capsule at https://codeocean.com/capsule/7562510/tree/v1, and a concise version of the code is available in a Supplementary Mathematica notebook (Supplementary Code).
Simulating Pharmacologic Interactions
Sampling Virtual Patient Cohorts
Clinical trials were simulated as virtual patient cohorts, in which each patient’s drug sensitivities, tumor growth rate, and initial tumor population are sampled from distributions as described below.
Drug Sensitivity
Growth Rate
Tumor Burden
Generating Simulated PFS Distributions
We model time until treatment failure (progression). Post-progression survival, which determines OS, is outside of the scope of this model. Thus, we model PFS distributions, which were generated by simulating the pop-TK model for each patient in a cohort. Progression times were computed from tumor growth rate and population at the end of treatment:
- (1)
If the tumor population was greater than its initial size, progression was defined by the time required for the tumor to reach twice its initial population.
- (2)
If the tumor population was reduced but remained more than one cell, progression was defined by the time required to regrow to its initial population.
- (3)
If the tumor population was reduced to <1 cell, a patient was considered potentially cured, with a probability inversely proportional to residual tumor cell count; for example, a patient with “0.1 cells” has a 90% likelihood to be cured.
Cured patients were treated as censoring events at the longest follow-up time of the trial. Kaplan–Meier curves were generated from a cohort of simulated progression and censoring events.
HR and Power Calculations
HRs were calculated using the Cox proportional hazards model in the Lifelines Python package (69). Expected statistical power of trials was calculated by simulating 1,000 trials with the same design, sample size, and distribution of censoring events as the corresponding real trial. Power was the percentage of simulated trials that were “successful,” defined by HR significantly <1, at P < 0.05.
Model Parameters
The model has 12 parameters, some of which are determined directly from experimental or clinical data and some of which are fitted to reproduce PFS distributions (Supplementary Table S1).
Initial Tumor Sizes (Three Parameters)
Parameters defining the distribution of the initial tumor burden were fitted to clinical measurements of total metabolic tumor volume for a cohort of patients with varying Ann Arbor stage. This distribution is defined by three parameters: the standard deviation of tumor size for patients with the same Ann Arbor stage (), a mean initial tumor population of stage I DLBCL (), and the difference in initial tumor population between stages (; Eq. L; Supplementary Fig. S9). For each simulation of a specific real trial, the frequency of patients with stages I to IV disease was matched to that reported in the real trial.
Tumor Growth Rates (Three Parameters)
Tumor growth rates were sampled from a log-normal distribution with mean growth rate , standard deviation , and maximum growth rate (Eq. K). The average growth rate was estimated using the Metropolis–Hastings algorithm. Standard deviation was determined from clinical measurements of growth rates of non–Hodgkin lymphomas (39). The maximum growth rate was set to the average doubling time of Burkitt lymphoma, the fastest growing human tumor (68).
Drug Sensitivity (Six Parameters)
Drug sensitivity distributions for RCHOP were defined by six parameters: average sensitivity to the drugs comprising CHOP (), intratumor heterogeneity in drug sensitivity (; applied to each drug), patient-to-patient heterogeneity in drug sensitivity (; applied to each drug), cross-resistance between drugs at the patient scale (), cross-resistance between drugs at the cellular scale (), and average sensitivity to rituximab was defined by a parameter () describing how much more effective is rituximab, on average, than drugs in CHOP (average sensitivity to rituximab was calculated as ).
These parameters were fitted, via the Metropolis–Hastings algorithm, to reproduce PFS distributions of the RICOVER-60 trial. Parameters each uniquely affect the shape of the PFS distribution (Supplementary Fig. S3). Because trials prior to the establishment of CHOP predate the modern histologic definition of DLBCL, data are not available to estimate specific to each single agent in CHOP and so a shared average sensitivity is used for these drugs.
Metropolis–Hastings Parameter Estimation
Parameter values were fitted to reproduce the PFS distributions produced by six or eight cycles of CHOP, with 2-week cycles, in the RICOVER-60 trial (42). A preliminary search screened three values each for the six parameters (Supplementary Table S1). Twenty of these parameter sets produced PFS approximately near the efficacy of CHOP (residual errors being twice the 99% CI of the clinical data; Supplementary Fig. S4). These parameter sets were used as starting values for Metropolis–Hastings random walks by using the procedure below.
- (1)
Initiate the Metropolis–Hastings algorithm with one of the 20 candidate parameter sets.
- (2)
For a specified chain length (500), perform the following steps:
- a.
Modify one randomly selected parameter from the previous step () by adding a normally distributed random value centered at zero with the parameter’s standard deviation from the prior distribution in the preliminary screen. The resulting parameter set is .
- b.
Check that the modified parameter is within the predetermined range. If it is outside of the range, set it to the closest value allowed. The resulting parameter set is .
- c.
Use the score function, , to calculate the error value for the parameter set .
- d.
With the acceptance ratio , accept or reject the parameter set . If accepted, then becomes .
- a.
The Metropolis–Hastings algorithm was run six times for each of the 20 starting parameter sets. In sum, this tested 60,000 parameter sets (6 × 20 × 500). Pairwise distributions of all parameter sets in the 85% and 97.5% CIs of CHOP’s PFS distribution are shown in Supplementary Fig. S4. Among the 10 parameter sets that most closely reproduce the efficacy of CHOP, one set was the closest to the centroid of these most plausible sets in every pairwise view and was selected for further modeling (Supplementary Table S1 lists the selected set and range of the ensemble). Using this selected parameter set, we estimated the average sensitivity to rituximab () that best reproduced the RCHOP arms of RICOVER-60 (42).
Simulating and Fitting Trials in the R/R Cohorts
The R/R patient cohort was generated using a “six-drug” model wherein all patients received first-line treatment with “RCHOP + X” with “drug X” having a dose of zero. Virtual patients who progressed during or after initial treatment (pink and blue patients in Fig. 4A) were selected to simulate second-line therapy with “drug X” in R/R patients. Dosing schedules in virtual R/R trials were matched to schedules for each real trial of interest (Supplementary Table S2). R/R trials were simulated across a grid of two parameters—dose of drug X (equivalent to adjusting tumor sensitivity to drug X) and interpatient variance in sensitivity (). For each real trial of “drug X,” we selected the parameter set that most closely fit the clinically observed PFS distribution of that drug in R/R patients, by mean squared error in PFS over the reported duration of the trial, up to 2 years. These parameters defining the activity of each “drug X” were used to simulate phase III trials of RCHOP + X with dose schedules matching each particular trial (including whether a drug was added or replaced; Fig. 4B; Supplementary Table S2). Pola had been trialed in combination with rituximab (52) and in combination with rituximab and bendamustine (70), and therefore we fit parameters for Pola to both of these trials, treating Pola as a more potent form of vincristine (O in RCHOP) because both vincristine and the cytotoxic payload of Pola, monomethyl auristatin E, are microtubule-destabilizing agents that bind to TUBB.
Simulating Adjuvant Therapy Trials
Trials of adding enzastaurin or everolimus to RCHOP administered these agents as adjuvant therapy to patients with a complete response to RCHOP (31, 32), and their primary endpoints included only patients with a complete response to RCHOP. Correspondingly, we modeled these trials as enrolling only patients with a complete response to RCHOP, defined as attaining a minimum tumor size of <1% of initial tumor burden. The time to progression was defined as the time from the end of RCHOP until the tumor reached its pretreatment size.
Simulating Biomarker Stratified Trials
Simulating Subgroup-Specific Outcomes
A different subtype of DLBCL was modeled by fitting the average drug sensitivity of each subtype to the subtype-specific PFS distribution. This fitting minimized the mean squared error between the simulated and observed PFS distributions with the restriction that the sum of the subtype distributions for each stratification approach had to approximate the drug sensitivity distribution of all patients with DLBCL.
Assessing Cell-to-Cell Drug Sensitivity Distribution
Cell-to-cell variation in sensitivity to single drugs in RCHOP was previously measured in DLBCL cultures by high-complexity clone tracing and genome-wide CRISPR inhibition screens under treatment (5). These data were reanalyzed to graph drug sensitivity distributions as shown in Fig. 1, with log(enrichment) from 106 barcoded DLBCL clones treated with each drug in RCHOP and log(enrichment) of CRISPR inhibition guide RNAs (called “rho phenotype”) in DLBCL cultures treated with drugs in RCHOP (5). Enrichment scores from clone tracing were the geometric mean of three independent experiments, and enrichment scores from CRISPR inhibition were the geometric mean of 10 guide RNAs per gene.
Estimating Patient-to-Patient Variation in Drug Sensitivity
For each drug (rituximab, cyclophosphamide, doxorubicin, and vincristine), the 20 genes the knockdown of which most strongly perturbs drug sensitivity were identified from genome-wide CRISPR inhibition screens in drug-treated DLBCL cells, quantified by the rho phenotype (a metric of barcode enrichment or depletion; ref. 5). For each of the 137 previously untreated patients with DLBCL (NCBI GEO: GSE98588; ref. 71), the rho phenotype of each selected gene was multiplied by its relative expression (log2, normalized by the mean of all patients). The sum of these weighted effects across all selected genes estimates the net influence on their expression variation on drug sensitivity in each patient (Supplementary Fig. S2A). Drug responses are affected by more than tumor gene expression, and this calculation only illustrates that the observed molecular heterogeneity in DLBCL is expected to produce approximately log-normal variations in drug sensitivity between patients (Supplementary Fig. S2B).
Data Availability
Code and source data used in this work are available in a Code Ocean capsule at https://codeocean.com/capsule/2985122. A list of all clinical data digitized and available in the Code Ocean capsule can be found in Supplementary Table S3. A concise implementation of the model is provided in a Supplementary Mathematica notebook.
Authors’ Disclosures
A.E. Pomeroy reports personal fees from Respiratorious AB outside the submitted work. A.C. Palmer reports grants from the NCI during the conduct of the study, as well as grants from Prelude Therapeutics, personal fees from AstraZeneca, Kymera Therapeutics, Novartis, and Sanofi, and grants and personal fees from Merck outside the submitted work.
Authors’ Contributions
A.E. Pomeroy: Software, investigation, writing–original draft, writing–review and editing. A.C. Palmer: Software, investigation, writing–original draft, writing–review and editing.
Acknowledgments
We thank the investigators and patients who participated in the clinical trials analyzed in this work. We thank Ash Alizadeh, Bruce Chabner, Christopher Chidley, Haeun Hwangbo, Feng Fu, Jacob Pantazis, Jeremy Purvis, Sarah Patterson, Noah Schlachter, and Peter Sorger for discussions. A.E. Pomeroy is supported by National Institute of General Medical Sciences grant K12GM000678. This work is supported by National Cancer Institute grant R01CA279968.
Note: Supplementary data for this article are available at Blood Cancer Discovery Online (https://bloodcancerdiscov.aacrjournals.org/).
References
Supplementary data
Cell-to-cell variation in drug sensitivity is approximately log-normal
Patient-to-patient variation in drug sensitivity is approximately log-normal