## Abstract

Rapid improvements in the detection and tracking of early-stage tumor progression aim to guide decisions regarding cancer treatments as well as predict metastatic recurrence in patients following surgery. Mathematical models may have the potential to further assist in estimating metastatic risk, particularly when paired with *in vivo* tumor data that faithfully represent all stages of disease progression. Herein, we describe mathematical analysis that uses data from mouse models of spontaneous metastasis developing after surgical removal of orthotopically implanted primary tumors. Both presurgical (primary tumor) growth and postsurgical (metastatic) growth were quantified using bioluminescence and were then used to generate a mathematical formalism based on general laws of the disease (i.e., dissemination and growth). The model was able to fit and predict pre/postsurgical data at the level of the individual as well as the population. Our approach also enabled retrospective analysis of clinical data describing the probability of metastatic relapse as a function of primary tumor size. In these data-based models, interindividual variability was quantified by a key parameter of intrinsic metastatic potential. Critically, our analysis identified a highly nonlinear relationship between primary tumor size and postsurgical survival, suggesting possible threshold limits for the utility of tumor size as a predictor of metastatic recurrence. These findings represent a novel use of clinically relevant models to assess the impact of surgery on metastatic potential and may guide optimal timing of treatments in neoadjuvant (presurgical) and adjuvant (postsurgical) settings to maximize patient benefit. *Cancer Res; 76(3); 535–47. ©2015 AACR*.

A mathematical model was used to connect presurgical primary tumor volume and postsurgical metastatic burden and survival in two clinically relevant animal models of spontaneous metastasis and one clinical dataset of metastatic relapse probability in breast cancer patients. This model used one specific parameter to quantify differential metastatic aggressiveness, which could be of help for personalizing adjuvant therapy. Simulations revealed a highly nonlinear relationship between resected primary tumor size and metastatic recurrence. These results uncover a computable and patient-dependent threshold for evaluating the impact of surgery on overall survival.

## Introduction

Surgical removal of an early-stage localized tumor remains one of the most effective strategies in reducing the probability of systemic metastatic disease spread (1). Improved technologies of early cancer detection aim to classify primary tumor stage to identify whether potential treatment modalities—such as presurgical “neoadjuvant” or postsurgical “adjuvant”—should be considered to complement surgery and reduce metastatic potential. However, the relationship between primary tumor growth and eventual metastasis remains enigmatic (2). Metastatic seeding was initially thought to occur only during late stages of primary tumor growth and invasion (3); however, recent evidence suggests systemic dissemination is a much earlier event (4). Indeed, even the direction of tumor spread, initially thought to occur unidirectionally from primary to secondary sites, has been replaced by more complex and dynamic theories of interaction. These include models where primary and secondary lesions grow (and evolve) in parallel (2) and the possibility that cell seeding can be bidirectional, with metastasis potentially “re-seeding” back to original primary location (5, 6).

To assist in understanding this complexity, mathematical modeling has been used to determine the relationship between primary (localized) and secondary (metastatic) tumor dissemination and growth. Early studies used statistical analyses only (7, 8), whereas later work included experimentally derived data to validate models using biologic information that aimed to more faithfully represent the metastatic process (9). In 2000, Iwata and colleagues used imaging data from 1 patient with metastatic hepatocellular carcinoma to introduce a more formalistic and biologically based approach that relied on the description of the temporal dynamics of a population of metastatic colonies, with equations written at the organ or organism scale (10). In parallel, several studies have sought to include additional variables when modeling tumor growth, such as angiogenesis (11), stem cell behavior (12), tumor–immune interactions (13), and microenvironment influences (14), among numerous others. To date, the majority of mathematical studies in cancer modeling have focused on primary tumor, and relatively few have investigated the metastatic development (15–22).

This dearth in metastatic data stems largely from the complexity of studying metastasis itself. Metastasis starts with localized primary tumor growth, which then invades and intravasates into the bloodstream, which, in turn, spreads systemically until extravasating into tissue at a distant (hospitable) site (23, 24). While clinical (retrospective) data have value (2, 7, 20, 25, 26), mouse tumor models have typically aimed to mimic (and distinguish between) several stages of the metastatic process. In certain mouse models, metastasis can derive from a tumor that is implanted ectopically or orthotopically into a primary or metastatic site (“ectopic,” “orthotopic,” or “ortho-metastatic” models, respectively; ref. 27) and can involve various immune states (i.e., human xenograft or mouse isograft). Although more rarely performed, models can also include surgical resection of the primary tumor, which allows for progression of clinically relevant spontaneous metastatic disease. These can include surgery following ectopic implantation (i.e., “ecto-surgical,” such as tumors grown in the ear or limb that are later amputated), or orthotopic implantation and resection (i.e., “ortho-surgical”), which more faithfully represent patient disease. To date, no studies have utilized data from ortho-surgical metastasis models for mathematical analysis.

Herein, we describe a mathematical approach developed using data derived from two ortho-surgical metastasis models representing competent and incompetent immune systems with luciferase-tagged human breast (LM2-4^{LUC+}) and mouse kidney (RENCA^{LUC+}) cell lines. We first defined a mathematical formalism from basic laws of the disease (dissemination and growth). Then, we confronted the mathematical outputs to longitudinal measurements of primary tumor size, metastatic burden, and survival using a population approach (nonlinear mixed effects) for statistical estimation of the parameters. Minimally parameterized models of each experimental system were generated and used to fit and predict pre/postsurgical data at the individual and population levels. Next, we used clinical datasets to assess metastatic relapse probability from primary tumor size and show that, in both cases (preclinical and clinical), one specific parameter (*μ*) allowed quantification of interanimal/individual variability in metastatic propensity. Critically, our models confirm a strong dependence between presurgical primary tumor size and postsurgical metastatic growth and survival. However, quantitative analysis revealed a highly nonlinear pattern in this dependency and identified a range of tumor sizes (either large or small) where variation of tumor size did not significantly affect survival. These represent potential threshold limits for the utility of primary size as a predictor of metastatic disease (i.e., if small, then surgical cure; if large, then surgical redundancy). These findings represent the first time clinically relevant surgical models have been integrated with data-based mathematical models to inform the quantitative impact of presurgical primary tumor size on subsequent metastatic disease.

The metastatic modeling approach we used follows the formalism initiated by Iwata and colleagues (10), which was further developed/expanded in recent works in two key ways: (i) effect of systemic therapies (28, 29) and (ii) use in an (nonsurgical) *in vivo* human xenograft model involving orthotopic primary tumors (PT) and metastasis (21). Metastatic development is reduced to two main components:

Growth: It includes presurgical primary (

*g*) and secondary (_{p}*g*) tumor growth rates.Dissemination: It includes metastatic dissemination rate (

*d*).

A schematic description of the model is depicted in Fig. 1. More complex considerations on the biology (1, 30) and modeling (31) of the metastatic process have been considered elsewhere.

The PT volume *V _{p}*(

*t*) solves the following equations

The initial condition for the PT, denoted by *V _{i}*, was determined either by the number of injected cells (preclinical case) or the initial tumor size at inception (clinical case,

*V*= 1 cell). Metastases were assumed to start from one cell. For each case, the optimal structure resulting from our investigations was to assume the same structural growth law for the PT and the metastases, although with possibly different parameter values.

_{i}^{LUC+}) metastasis model

Growth dynamics were defined by

Gomp-Exp (32) growth model (see expression below).

Growth parameters for PT and metastases treated identically (

*g*=*g*)._{p}

In a previous study quantifying the descriptive power of several growth kinetics models using data from the same breast animal model (33), the Gompertz model accurately described primary tumor growth curves, in accordance with a large body of literature (33). However, a limitation of this model is that the tumor doubling time could become arbitrarily small for small volumes, a feature that we considered biologically irrelevant for small volumes at metastatic initiation (of the order of the cell). A lower bound to this doubling time might be expressed by the *in vitro* doubling time of the cell line, which can be experimentally determined. Consequently, we adopted the Gomp-Exp model (32), defined by

Under this model, growth is divided between two phases: an initial exponential phase, followed by a Gompertz growth phase. Parameter λ is the maximal proliferation rate, taken here to be equal to the value inferred from *in vitro* proliferation assays (see Supplementary Fig. S1A and Table 2). The second term in the min function is the Gompertz growth rate, defined by two parameters. Parameter α is the intrinsic relative (also termed specific) growth rate at the size *V*_{0} of one cell. Parameter β is the exponential decay rate of the relative (also termed specific) growth rate.

^{LUC+}) metastasis model

Growth dynamics were defined by

Exponential growth model.

Growth parameters for PT and metastases treated differently.

In mathematical terms, this is expressed by

Growth dynamics were defined by

Gompertz growth model.

Growth parameters for PT and metastases treated identically (

*g*=_{p}*g*).

The formation of new metastases was assumed to occur at a PT volume–dependent rate *d*(*V _{p}*) having the following parametric expression:

where parameter *μ* is an intrinsic parameter of metastatic aggressiveness. This critical coefficient is the daily probability for a given tumor cell to successfully establish a metastasis. Therefore, it is the product of several probabilities: (i) the probability of having evolved the necessary genetic mutations to ensure the phenotypic abilities required at each step of the metastatic process, (ii) the survival probability of all adverse events occurring in transit including survival in the blood or immune escape, among others, and (iii) the probability to generate a functional colony at the distant site. Following reported observations (34), we assumed that all the metastases were growing at the same volume (*v*)-dependent rate *g*(*v*) and that they all started from the same volume corresponding to the volume of one cell. The population of metastases was then formalized by means of a time (*t*)-dependent volume distribution ρ(*t*, *v*) solving the following problem (10):

The first equation is a continuity equation expressing conservation of the number of metastases when they grow. The second equation is a Neumann boundary condition on the flux of entering metastases at size *V* = *V*_{0}. The third equation describes the initial condition (no metastases at the initial time). From the solution of this problem, two main macroscopic quantities can be derived, the metastatic burden *M*(*t*) and the number of metastases *N*(*t*). In the convolution formula for *M*(*t*) (35), *V*(*s*) represents a solution to the Cauchy problem (1), with *g* instead of *g _{p}* and

*V*

_{0}as initial condition. This formula allows fast simulation of the model using the fast Fourier transform algorithm (35), which was essential for estimation of the parameters that required a very large number of model evaluations.

## Materials and Methods

### Preclinical methodology

#### Cell lines.

The human LM2-4^{LUC+} cells are a luciferase-expressing metastatic variant of the MDA-MB-231 breast cancer cell line derived after multiple rounds of *in vivo* lung metastasis selection in mice, as previously described (36, 37). Mouse kidney RENCA^{LUC+} cells expressing luciferase were a kind gift from R. Pili, Roswell Park Cancer Institute, and described previously (38). LM2-4^{LUC+} and RENCA^{LUC+} were maintained in DMEM (Corning; cat. # MT10-013-CV) and in RPMI medium (Corning; cat. # MT15-041-CV), respectively, with 5% heat-inactivated FBS (Corning; cat. # MT35-010-CV). Cells were authenticated by STR profile comparison with ATCC parental cell database (for LM2-4^{LUC+}) or confirmation of species origin (for RENCA^{LUC+}; DDC Medical). All cells were incubated at 37°C and 5% CO_{2} in a humidified incubator.

#### Cell proliferation assay.

LM2-4^{LUC+} cells were plated in 35-mm plates (5 × 10^{5} cells per plate) and were manually counted using trypan blue staining every 24 hours for 72 hours total (Cellgro; cat. # 25-900-CI).

#### Photon-to-cell ratio.

LM2-4^{LUC+} cells were trypsinized and counted. A total of 5 × 10^{6} cells were serial diluted 2-fold down to 9.77 × 10^{3} cells and processed with Bright-Glo Luciferase Assay System (Promega; cat. # E2610) following the manufacturer's protocol.

#### Ortho-surgical models of metastasis.

Animal tumor model studies were performed in strict accordance with the recommendations in the Guide for Care and Use of Laboratory Animals of the NIH and according to guidelines of the Institutional Animal Care and Use Committee at Roswell Park Cancer Institute (Protocol: 1227M to J.M.L. Ebos).

The optimization and use of animal models of breast and kidney metastasis, orthotopic primary tumor implantation, and surgical resection have been extensively detailed elsewhere (39). Briefly, LM2-4^{LUC+} cells (1 × 10^{6} cells in 100 μL) and RENCA^{LUC+} (4 × 10^{4} cells in 5 μL) were implanted, respectively, into the right inguinal mammary fat pad (right flank) or left kidney (subcapsular space) of 6- to 8-week-old female CB-17 SCID or Balb/c mice(39). Primary breast tumor size was assessed regularly with Vernier calipers using the formula width^{2}(length × 0.5), and in both tumor models, animals were monitored biweekly for bioluminescence (BL) to quantify tumor growth (40). See Supplementary Preclinical Methodology section for more details.

### Mathematical methodology: Fit procedures

#### Preclinical data: Primary tumor and metastatic burden dynamics.

Three fit procedures were investigated: (i) fitting the population average time series, (ii) individual fits of each mouse's primary tumor (PT) and metastatic burden (MB) kinetics, and (iii) a mixed-effect population approach. Due to the high variability in the data, the first approach was not considered relevant. The second approach showed that the model was able to describe individual dynamics but, due to the relative scarcity of the data in a given animal, led to very poor identifiability of the coefficients, in particular the metastatic dissemination parameter *μ*. The third approach was considered the most appropriate to our case. Indeed, nonlinear mixed-effect modeling (41) is a statistical technique specifically tailored for sparse serial measurements in a population. It assumes that interanimal variability can be described by a parametric distribution on the model's parameters (here assumed to be lognormal, consistently with other works; refs. 20, 42). Multiple strategies were tested in order to find the appropriate formalism to fit the data. These included fitting PT and MB separately or together. The strategy fitting PT and MB was ultimately selected because it resulted in more accurate fits and allowed for possible correlations between the primary and secondary tumor growth parameters in a same animal.

One of the model parameters for Gomp-Exp growth was the *in vitro* proliferation rate, which was determined by an exponential fit to an *in vitro* proliferation assay. Maximization of the likelihood function under nonlinear mixed-effect formalism was solved using the function *nlmefitsa* implemented in Matlab (43), which is based on the stochastic approximation of expectation maximization algorithm. Specific assumptions were: log-transformation of the parameters (i.e., log-normal population distribution), proportional error model and full covariance matrix. For individual fits, weighted least squares minimization corresponding to individual likelihood maximization was performed using the function *fminsearch* of Matlab (Nelder–Mead algorithm), following previously reported methods (33).

#### Clinical data: Calculation of metastatic relapse probability.

Our methodology for fitting the clinical data followed the same format as in ref. 44, although here the model was simplified (only parameter *μ* was allowed to vary among individuals) and PT size at diagnosis was considered to be uniformly distributed within each size range. Parameters for the growth of the primary and secondary tumors were fixed (not subject to optimization) and corresponded to a maximal volume of 10^{12} cells (≅ 1 kg) and a doubling time of 7.5 months at 1 g, consistently with clinical values reported in the literature (8, 25).

The data reported in (26) consisted of metastatic relapse probabilities during the next 20 years after surgery, for patients stratified by PT size (see Table 1). Diameter data from PT sizes at diagnosis were converted into volumes under the assumption of a spherical shape and then converted to number of cells using the conversion rule 1 mm^{3} ≅ 10^{6} cells (45). Parameter *μ* was assumed log-normally distributed in the population, with mean *μ*^{m} and standard deviation *μ*^{σ}.

Diameter (cm) . | Number of patients . | Prop. of relapse (data) . | Prop. of relapse (model) . |
---|---|---|---|

1 ≤ D ≤ 2.5 | 317 | 27.1 | 25.5 |

2.5 < D ≤ 3.5 | 496 | 42.0 | 42.4 |

3.5 < D ≤ 4.5 | 544 | 56.7 | 56.3 |

4.5 < D ≤ 5.5 | 422 | 66.5 | 65.9 |

5.5 < D ≤ 6.5 | 329 | 72.8 | 74.3 |

6.5 < D ≤ 7.5 | 192 | 83.8 | 80.8 |

7.5 < D ≤ 8.5 | 136 | 81.3 | 85.7 |

Diameter (cm) . | Number of patients . | Prop. of relapse (data) . | Prop. of relapse (model) . |
---|---|---|---|

1 ≤ D ≤ 2.5 | 317 | 27.1 | 25.5 |

2.5 < D ≤ 3.5 | 496 | 42.0 | 42.4 |

3.5 < D ≤ 4.5 | 544 | 56.7 | 56.3 |

4.5 < D ≤ 5.5 | 422 | 66.5 | 65.9 |

5.5 < D ≤ 6.5 | 329 | 72.8 | 74.3 |

6.5 < D ≤ 7.5 | 192 | 83.8 | 80.8 |

7.5 < D ≤ 8.5 | 136 | 81.3 | 85.7 |

NOTE: Fit of the model was significant for Pearson χ^{2} test for goodness of fit (*P* = 0.023).

Abbreviation: Prop., proportion.

The probability of having a metastatic relapse in the next 20 years for a primary tumor diagnosed with a given size was assumed to be equal to the probability of already having one distant tumor at the time of diagnosis. For a given volume range of PT sizes at diagnosis |$\left({V^k,\,V^{k + 1}} \right)$|, |$k \in \left\{{1, \ldots, 7}\right\}$|, we considered the diagnosis volume |$V_D^k$| as a random variable uniformly distributed in |$\left({V^k, \,V^{k + 1}} \right)$|. Then, we computed the corresponding age of the tumor at diagnosis (i.e., the time elapsed from the first cancer cell) from the assumption of Gompertzian growth with the parameter values previously mentioned. This quantity was denoted |$T_D (V_D^k)$|. Under our formalism, the probability of having a disseminated metastasis at time |$T_D (V_D^k)$| then writes

where Met* ^{k}* stands for the event of having one metastasis at diagnosis when the PT volume is in |$\left({V^k, \,V^{k + 1}} \right)$|. For any volume range and value of

*μ*and

^{m}*μ*

^{σ}, this formalism allowed us to compute a probability to be compared with the respective empirical proportion of relapsing patients reported in ref. 26, by simulating the two random variables involved (|$V_D^k$| and

*μ*). We then determined the best-fit parameters by minimizing the sum of squared errors to the data, using the function

*fminsearch*from Matlab.

## Results

To mimic clinical progression of spontaneous systemic metastatic disease, two models involving orthotopic tumor implantation and surgical resection (ortho-surgical) were used. These included a xenograft breast model (LM2-4^{LUC+} cells implanted into the mammary fat pad) and an isograft kidney model (RENCA^{LUC+} implanted into the subcapsular kidney space; see Materials and Methods; ref. 38). Presurgical PT and postsurgical MB were tracked by BL emission, expressed in photons/second (p/s; Fig. 2A).

In the breast model, simultaneous BL and gross tumor volume measurements (caliper) were performed. The former only quantifies living cells, whereas the latter computes a total volume indifferently of its composition. Volume and BL emission were significantly correlated (Supplementary Fig. S1B), as observed by others (46). Determination of the signal corresponding to one cell was required in our modeling for the value assigned to *V*_{0}. Based on linear regression between BL emission and tumor volume, we established that BL = 2.19 × 10^{6} *V* + 7.89 × 10^{7}, where BL is the BL in p/s and *V* is the volume in mm^{3}. This relationship, evaluated at *V* = 10 mm^{3} ≅ 10^{7} cells, gives 1 cell ≅ 10.08 p/s, which was approximated to 10 p/s. Using this value gave reasonable fits to the PT growth data (Supplementary Fig. S2).

### Validation and calibration of the mathematical model in ortho-surgical models

We assessed the ability of the models to describe and predict the experimental data of postsurgical MB dynamics. Several model designs were evaluated to define the optimal structure and methodology that would allow accurate and reliable data description. Specifically, for each *in vivo* experimental system, multiple structural expressions and parametric dependences between the growth rate of the PT and MB were tested. We refer to Supplementary Figs. S3 and S4 for direct comparison of goodness-of-fit and identifiability under different modeling setups. Population and individual fits of the best models to the data are shown in Fig. 2B and C (and Supplementary Fig. S5), and Fig. 3, respectively. The parameter values inferred from the population fits are reported in Table 2. The mathematical models—combined with the population distribution of the parameters inferred from the nonlinear mixed-effects statistical procedure—were able to give reasonable descriptions of the presurgical PT and postsurgical MB growth. Importantly, these combinations could quantify the dynamics of the process as well as the interanimal variability. The latter was better characterized by the metastatic potential parameter *μ* (large coefficients of variation in Table 2). The models could also fit individual dynamics of longitudinal data of presurgical PT and postsurgical MB (see Fig. 3 for some representative examples of growth dynamics in particular mice and Supplementary Figs. S6 and S7 for fits of all mice).

Data . | Growth model . | Location . | Parameter . | Unit . | Estimate (CV) . | 95% CI . |
---|---|---|---|---|---|---|

In vitro (breast) | Exp. | λ | day^{−1} | 0.837 (–) | (0.795–0.879) | |

Preclinical breast | Gomp-Exp | PT | V _{i} | cell | 1.00 × 10^{6} (–) | — |

α | day^{−1} | 1.9 (5.73) | (1.84–1.96) | |||

β | day^{−1} | 0.0893 (21.3) | (0.0791–0.101) | |||

Met | V _{0} | p/s | 10 (–) | — | ||

μ | cell^{−1} ·day^{−1} | 4.43 × 10^{−11} (176) | (2.70 × 10^{−11}–7.27 × 10^{−11}) | |||

Preclinical kidney | Exp. | PT | V _{i} | p/s | 1.63 × 105 (45.5) | (9.40 × 10^{4}–2.83 × 10^{5}) |

α _{p} | day^{−1} | 0.21 (60.3) | (0.151–0.292) | |||

Met | V _{0} | p/s | 10 (−) | — | ||

α | day^{−1} | 0.0307 (201) | (0.0133–0.0707) | |||

μ | cell^{−1} ·day^{−1} | 0.0415 (397) | (0.0181–0.0948) | |||

Clinical breast | Gomp | PT | V _{i} | cell | 1 (−) | |

α | day^{−1} | 0.013 (−) | ||||

β | day^{−1} | 0.000471 (−) | ||||

Met | V _{0} | cell | 1 (−) | |||

μ | cell^{−1} ·day^{−1} | 7.00 × 10^{−12} (1.04 × 10^{4}) |

Data . | Growth model . | Location . | Parameter . | Unit . | Estimate (CV) . | 95% CI . |
---|---|---|---|---|---|---|

In vitro (breast) | Exp. | λ | day^{−1} | 0.837 (–) | (0.795–0.879) | |

Preclinical breast | Gomp-Exp | PT | V _{i} | cell | 1.00 × 10^{6} (–) | — |

α | day^{−1} | 1.9 (5.73) | (1.84–1.96) | |||

β | day^{−1} | 0.0893 (21.3) | (0.0791–0.101) | |||

Met | V _{0} | p/s | 10 (–) | — | ||

μ | cell^{−1} ·day^{−1} | 4.43 × 10^{−11} (176) | (2.70 × 10^{−11}–7.27 × 10^{−11}) | |||

Preclinical kidney | Exp. | PT | V _{i} | p/s | 1.63 × 105 (45.5) | (9.40 × 10^{4}–2.83 × 10^{5}) |

α _{p} | day^{−1} | 0.21 (60.3) | (0.151–0.292) | |||

Met | V _{0} | p/s | 10 (−) | — | ||

α | day^{−1} | 0.0307 (201) | (0.0133–0.0707) | |||

μ | cell^{−1} ·day^{−1} | 0.0415 (397) | (0.0181–0.0948) | |||

Clinical breast | Gomp | PT | V _{i} | cell | 1 (−) | |

α | day^{−1} | 0.013 (−) | ||||

β | day^{−1} | 0.000471 (−) | ||||

Met | V _{0} | cell | 1 (−) | |||

μ | cell^{−1} ·day^{−1} | 7.00 × 10^{−12} (1.04 × 10^{4}) |

NOTE: Parameters corresponding to the preclinical data were obtained using nonlinear mixed-effects modeling. Interanimal variability of each parameter is represented by its respective coefficient of variation (CV). Parameter values for the clinical data are those that produced the fit to the clinical data of metastatic relapse probability from ref. 26, reported in Table 1. For these data, only parameter *μ* was allowed to vary between the individuals, and consequently it is the only parameter having a CV.

Abbreviations: CV, coefficient of variation in percentage = *std/est* × 100, with *std* the standard deviation of the lognormal distribution of the parameter and *est* the population estimate; CI, confidence interval on the population estimate inferred from the standard errors on the fit.

In addition to their descriptive power, the models were able to predict growth dynamics in external datasets that were not used for estimation of the parameters (Fig. 2D and E). These results emphasize the ability of our general modeling structure to capture MB growth dynamics. In addition, the modeled postsurgical MB could also be related to empirical survival by means of a lethal burden threshold, which was estimated to be 4 × 10^{9} p/s (Supplementary Fig. S8).

### Qualitative and quantitative differences across ortho-surgical models

#### Xenograft model: Breast metastasis.

Using the same growth model (Gomp-Exp) and parameters for both presurgical PT and postsurgical MB, we were able to adequately fit the data, while ensuring reasonable standard errors on the parameters estimates (Table 2). Although more complex structures (e.g., models with one parameter differing between primary and secondary growth) provided marginally better fits, robustness in estimating *μ* was impaired (Supplementary Fig. S3). Quantitative inference of *μ* revealed small metastatic potential (Table 2), which translated into late development of metastases following xenograft and growth of the MB mostly dominated by proliferation (Figs. 2B and 3A–C).

#### Isograft model: Kidney metastasis.

In contrast, the kidney model MB growth curves exhibited a different behavior, with a marked change of regimen at the time of surgery. In the context of the model, this means that most of the presurgical MB increase was driven by the dissemination process, and not by proliferation of the metastases themselves. This was reflected by a very large value of *μ* (Table 2), with nine orders of magnitude of difference compared with the breast model. This feature was not directly visible, nor quantifiable, by direct examination of the data, and reflects the large metastatic aggressiveness of isograft spontaneous metastasis animal models, because overpassing the immune surveillance is a major challenge in the metastatic process (4). When the PT was removed, dissemination was stopped and only proliferation remained for further growth of the MB, which happened at a slower rate than at the primary site (Figs. 2C and 3D–F). In some cases, growth of the MB remained constant or even decreased after surgery (see Supplementary Fig. S7). This result reflects the fact that the competent immune status of the mice might have an important impact on the establishment of durable, fast-growing metastatic colonies at the secondary sites (47).

Together, our data-based quantitative modeling analysis of presurgical PT and postsurgical MB growth kinetics demonstrated the descriptive power of the models, unraveled distinct growth patterns between the two animal models, and emphasized the critical role of the parameter *μ* for quantification of the interanimal variability.

### Clinical data of metastatic relapse probability

Clinical data reported in the literature generally do not provide detailed information about the untreated growth of the metastatic burden, either because the residual disease is invisible, or because the patients benefit from adjuvant therapy after resection of their PT. Nevertheless, before the generalization of adjuvant therapy for breast cancer, Koscielny and colleagues (26) reported data from a cohort of 2,648 patients followed for 20 years after surgery of the PT, without additional treatment. Their data (reproduced in Table 1) demonstrated that, despite a clear association between PT size at diagnosis and the probability of metastatic relapse, not all the patients having a given PT size were relapsing. For instance, only 42% of patients with a PT diameter at diagnosis between 2.5 and 3.5 cm developed metastasis. Based on this observation, we used our model to describe interindividual variability by means of a limited number of parameters. We considered that the probability of developing a metastasis in the next 20 years was equal to the probability of already having one at the time of diagnosis (see Materials and Methods). Using a lognormal population distribution of parameter *μ*, we were able to obtain a significant fit to the data of metastatic relapse for all size ranges (Table 1, *P* = 0.023). Interestingly, the median value of *μ* resulting from these human data was close to the value from the preclinical breast data, in comparison with the kidney model (see Table 2).

These results demonstrated that, within our semimechanistic modeling approach, parameter *μ* was able to capture the interindividual metastatic variability, not only in animal models but also for patient data.

### Assessing the impact of surgery on metastasis and survival: a simulation study

When diagnosis detects only a localized primary tumor, distant occult disease might already be present. In our model, the extent of this invisible metastatic burden depends on: (i) the PT size at diagnosis and (ii) the patient's metastatic potential *μ*. For instance, if the PT size (or *μ*) is small, then the occult MB might be negligible and surgery would substantially benefit to the patient in terms of metastatic reduction, by stopping further spread of new foci. Conversely, if the PT size (or *μ*) is large, then the occult MB might already be consequent and removing the PT might only have a marginal impact.

#### Virtual simulation of two breast cancer patients.

We simulated the quantitative impact of PT surgery in two virtual breast cancer patients having a PT diagnosed at 4.32 cm and two values of *μ* (median and 90th percentile within a population distributed according to our previous estimate). Results are reported in Fig. 4 and Supplementary Movies S1 and S2. A discrete and stochastic version of the metastatic dissemination was used here for the simulations (see Supplementary Methods for details). Interestingly, our simulation revealed that at the time of diagnosis, no metastasis was detectable (i.e., below the imaging detection limit, taken here to 10^{8} cells) in both cases (Fig. 4A and B). In clinical terms, this means that both patients would have been diagnosed with a localized disease. However, the two size distributions were very different, with a much larger residual burden in the “large *μ*” case, illustrative of the increased metastatic potential.

For the “median *μ*” case, our model predicted the presence of two small metastases, with respective sizes of 6 and 278 cells. Not surprisingly, when no surgery was simulated, this number continued to increase, reaching 160 secondary lesions after 15 years (Fig. 4C). However, most of the metastatic burden (126 tumors, i.e., 78.8% of the total burden) was composed of lesions smaller than 10^{9} cells (|$ \simeq$|1g). Figure 4E and G demonstrate that a substantial relative benefit (larger than 10%) in MB reduction was eventually obtained, but only after 7.8 years. Nevertheless, at the end of the simulation (15 years after surgery), the predicted two occult metastases at diagnosis had reached substantial sizes (1.41 × 10^{11} and 1.89 × 10^{11} cells). Therefore, for this patient with median metastatic potential, the model indicates an important benefit in using adjuvant therapy.

For a patient with higher metastatic potential (at the level of the 90th percentile, see Fig. 4B, D, F, and H and Supplementary Movie S2), even with a PT diagnosed at the same size, the predicted metastatic burden at diagnosis was considerably more important, with 76 lesions and the largest comprising 6.23 × 10^{6} cells. This consequent occult burden translated into poor outcome and the metastatic mass would have reached a lethal burden of 10^{12} cells 9.3 years after the initial diagnosis if no therapy would have been administrated.

These results illustrate the potential of the model as a diagnostic and prognostic numerical tool for assessment of the occult metastatic burden and postsurgery growth. In this, it could help to determine the extent of adjuvant therapy necessary to achieve a long-term control of the disease.

#### Impact of tumor size on postsurgical survival.

To further examine the relationship between the PT size at surgery and survival, we performed simulations for (i) an individual with fixed value of *μ* (the population median, see Fig. 5A) or (ii) an entire population (simulated survival curves in Fig. 5B) for three PT sizes. Numerical survival was defined by the time to reach a lethal burden of 1 kg (|$ \cong 10^{12}$| cells; ref. 2) from the time of cancer initiation. Interestingly, we observed a highly nonlinear relationship between the PT size and the survival, which suggested three size ranges delimited by two thresholds (Fig. 5A). The lower threshold—termed “recurrence” threshold (4 cm in Fig. 5A)—was defined as the maximal limit whereupon no metastasis was present at surgery (number of metastases lower than 1). The upper size threshold—termed “benefit” threshold (5.2 cm in Fig. 5A)—was defined as the size above which surgery had a negligible (<10%) impact on survival time. Above and below these “recurrence” and “benefit” thresholds, PT size had no important correlative value. Conversely, within the PT size range delimited by these two bounds, the relationship between presurgical PT and postsurgical MB/survival was highly correlative, with a large derivative and a sharp transition between the two extremes. The same qualitative PT size/survival relationship was obtained for any value of *μ* sampled within the population distribution, although with substantial quantitative differences (see Supplementary Fig. S9).

In Fig. 5C, we present quantitative estimates of the recurrence and benefit thresholds for various percentiles of *μ* within the population distribution (see also Supplementary Fig. S9). Our simulations predicted that for the first half of the population, surgery was almost always leading to negligible metastatic recurrence risk, with large values of the recurrence threshold (larger than the usual detection levels). On the other hand, the patients with large metastatic potential were predicted not to substantially benefit from the surgery, as far as reduction of future MB was concerned. For instance, a patient with *μ* at the level of the 90th percentile and a PT diagnosed at 4 cm would have an increase in absolute survival time of only 1.9% following surgery (Fig. 5C).

## Discussion

Using a formalism based on simple laws of metastatic development (including dissemination and proliferation), we derived mathematical models able to connect presurgical PT growth to postsurgical development of the MB in two ortho-surgical animal metastasis models (with two immune states) as well as one clinical dataset. These quantitative models allowed identification of different metastatic growth patterns and characterization of the metastatic potential (and associated interanimal/individual variability) as a critical parameter, *μ*. Our results also revealed a nonlinear quantitative relationship between the PT size at diagnosis and postsurgical survival improvement.

Previous studies have utilized experimental data derived from mouse metastasis models to inform mathematical analysis. For instance, Hartung and colleagues used human MDA-MB-231 breast cancer cells implanted orthotopically in mice in order to validate a mathematical model for longitudinal data of metastatic burden growth (21). This animal model was nonsurgical and utilized severe immunocompromised Nod SCID γ mice to improve the low metastatic potential observed in the MDA-MB-231, a phenomenon recently reported elsewhere (47). In our studies, we utilized a variant of the MDA-MB-231 previously selected for increased metastatic potential by repeated orthotopic implantation and metastatic resection in SCID mice (36). Because the selection of cells and immune state could influence analysis, we also included an immunocompetent mouse kidney model to confirm (and compare) findings. Although these and other modifications to the metastatic systems could significantly influence mathematical modeling (i.e., different mouse strain and cell line, different BL technique, etc.), the impact of surgery appears to be the most significant factor. In this regard, several technical discrepancies likely impair a relevant comparison between surgical and nonsurgical models presented by Hartung and colleagues (21) and the current study. For instance, in surgical models, we found it unnecessary to assume different growth between the primary and secondary lesions. In addition, we considered a less complex dissemination rate [expression |$d(V_p) = \mu V_p^\gamma$| and |$\gamma =⅔}$| was used in ref. 21). Notably, we could fit our data equally well with various values of γ, and thus concluded that it cannot be identified from combined PT growth and MB dynamics data alone (Supplementary Fig. S10). Future studies would require more data, especially on the number and size distribution of the secondary lesions, to precisely determine the shape of the dissemination coefficient. When using the dissemination and growth terms from ref. 21 and fitting the resulting model to our surgical data, we found a much larger metastatic potential *μ* and a significantly faster metastatic growth kinetics parameter than computed in the nonsurgical model (see Supplementary Text; ref. 21). Although the former probably illustrates higher metastatic propensity due to a more permissive immune state, the latter possibly suggests postsurgery metastatic acceleration (48–50).

In this regard, this raises another critical consideration of the impact of surgery on metastatic potential in mathematical modeling. Preclinical and clinical works have suggested that removal of the PT might provoke acceleration of metastatic growth (50, 51). There are various biologic rationales that could explain this, including inhibition of secondary growth by the presence of a primary neoplasm as a result of nutrient availability, concomitant immunity, or even systemic inhibition of angiogenesis (52). Such a theory could conceivably be assessed within the context of our model by defining different pre- and postsurgical metastatic growth rates *g*(*v*) and comparing goodness of fit. However, this would add at least one degree of freedom (thus deteriorating the reliability of the estimation) and invalidate the convolution formula used for computation of the metastatic burden in a model with nonautonomous *g*(*t*, *v*) (instead of *g*(*v*)), and therefore was not considered here. Importantly, theoretical integration of higher-order phenomena for the biologic dynamics of metastatic development has been considered elsewhere (14, 16, 18, 53) and recent findings in the organism-scale dynamics of metastases [such as the self-seeding phenomenon (5, 6) or the influence of the (pre-) metastatic niche (54)] could be embedded within the general formalism developed in our model. This could lead to complex models, however, and given the amount of information contained in our present data, reliable identification of such dynamics was not realistic. Instead, we only considered metastatic dynamics as reduced to its most essential features: dissemination and proliferation. Future studies should examine the potential of metastases to metastasize, as has been extensively debated in the past (55–57), particularly with the recent demonstration that some metastases are able to re-seed the primary tumor (5, 6). Although not included in this study, preliminary tests using our model suggest negligible differences in the simulations and no impact on our results; however, a more extensive analysis is required.

Our modeling philosophy elaborates on Fisher's theory (58) of cancer as a systemic disease and relates also to the parallel progression model (2). The dissemination rate *d*, characterized by parameter *μ*, quantifies the metastatic potential and allows for a *continuum* of possibilities between early and late dissemination. Our results seem to parallel clinical evidence of the impact (and importance) of early surgery—particularly in the case of breast cancer. For example, in a retrospective study of 2,838 breast cancer patients, the postsurgical residual risk of recurrence at 5 years for stage I disease was 7% (59). Consistently, our quantitative analysis demonstrates that in this case, for most patients, metastases that could have been shed before diagnosis would not develop into overt clinical disease during the remaining life history of the patient. For stage IV breast cancer (that would correspond, in our formalism, to a large value of *μ*), our analysis predicts only negligible benefit of the surgery (if only considering reduction of metastatic shedding), in accordance with preliminary results of a recent clinical trial (60). In order to use our model as a practical diagnosis and prognosis tool that could help to refine and individualize adjuvant therapy, the critical next step is to find a way to estimate the parameter *μ*, in a patient-specific manner. One of the main challenges will be to do so using data derived from the primary tumor only, because metastases are often undetectable at the time of diagnosis. Although the value of *μ* might very likely depend on the combination of several phenomena (including some genetic alterations or the immune status of the patient, which could be linked to different biomarkers; ref. 61), recent successes of genetic signatures as prognosis factors for metastasis might allow for patient-specific estimation of *μ* (62).

Any mathematical modeling attempt is limited by the intrinsic measurement error of the experimental technique. For monitoring the dynamics of total metastatic burden, BL imaging represents one of the best methods so far (63). However, measurement variability is hard to assess due to inherent issues, such as the long half-life of luciferin that prevents immediate replication of the measurements. Comparison of BL with caliper measurements showed large variance (Supplementary Fig. S1B), which increased with tumor size. This justified our assumption of a proportional measurement error model. Standard deviation of the relative error could in turn be estimated from the fit procedure and yielded a value of 0.72. This high degree of uncertainty should be taken into account as an inevitable limitation for quantitative modeling studies of BL data. We therefore put a strong emphasis on using a minimal number of parameters and assessed the robustness of our results on various assumptions, such as the shape of *d* and the value of *V*_{0} (Supplementary Figs. S10 and S11).

Together, our mathematical methodology provides a quantitative *in silico* framework that could be of valuable help for preclinical and clinical aims. Indeed, validation of our modeling methodology allows us to address in future works the differential effects of systemic therapies on primary tumor growth and metastases (39, 40). Clinically, our methodology could be used to refine/optimize therapeutic strategies for patients diagnosed with a localized cancer and inform on the timing of surgery, extent of occult metastatic disease, and probability of recurrence. In turn, this may affect decisions on duration and intensity of presurgical neoadjuvant or postsurgical adjuvant treatments (64).

## Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

## Disclaimer

Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the Roswell Park Alliance Foundation (RPAF) and by the Department of Defense (DoD).

## Authors' Contributions

**Conception and design:** S. Benzekry, D. Barbolosi, J.M.L. Ebos

**Development of methodology:** S. Benzekry, A. Tracz, M. Mastri, D. Barbolosi, J.M.L. Ebos

**Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.):** A. Tracz, M. Mastri, R. Corbelli, J.M.L. Ebos

**Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis):** S. Benzekry, D. Barbolosi

**Writing, review, and/or revision of the manuscript:** S. Benzekry, D. Barbolosi, J.M.L. Ebos

**Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases):** A. Tracz, M. Mastri, R. Corbelli

**Study supervision:** S. Benzekry, J.M.L. Ebos

**Other (design and implementation of the numerical code used in the analysis):** S. Benzekry

## Grant Support

This study has been carried out within the frame of the LABEX TRAIL, ANR-10-LABX-0057 with financial support from the French State, managed by the French National Research Agency (ANR) in the frame of the Investments for the future Programme IdEx (ANR-10-IDEX-03-02). This work was also supported by the RPAF and by the DoD, through the Peer Reviewed Cancer Research Program, under award no. W81XWH-14-1-0210 (both to J.M.L. Ebos).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked *advertisement* in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.