## Abstract

**Background:** The Colorectal Cancer Simulated Population model for Incidence and Natural history (CRC-SPIN) is a new microsimulation model for the natural history of colorectal cancer that can be used for comparative effectiveness studies of colorectal cancer screening modalities.

**Methods:** CRC-SPIN simulates individual event histories associated with colorectal cancer, based on the adenoma-carcinoma sequence: adenoma initiation and growth, development of preclinical invasive colorectal cancer, development of clinically detectable colorectal cancer, death from colorectal cancer, and death from other causes. We present the CRC-SPIN structure and parameters, data used for model calibration, and model validation. We also provide basic model outputs to further describe CRC-SPIN, including annual transition probabilities between various disease states and dwell times. We conclude with a simple application that predicts the impact of a one-time colonoscopy at age 50 on the incidence of colorectal cancer assuming three different operating characteristics for colonoscopy.

**Results:** CRC-SPIN provides good prediction of both the calibration and the validation data. Using CRC-SPIN, we predict that a one-time colonoscopy greatly reduces colorectal cancer incidence over the subsequent 35 years.

**Conclusions:** CRC-SPIN is a valuable new tool for combining expert opinion with observational and experimental results to predict the comparative effectiveness of alternative colorectal cancer screening modalities.

**Impact:** Microsimulation models such as CRC-SPIN can serve as a bridge between screening and treatment studies and health policy decisions by predicting the comparative effectiveness of different interventions. As such, it is critical to publish model descriptions that provide insight into underlying assumptions along with validation studies showing model performance. Cancer Epidemiol Biomarkers Prev; 19(8); 1992–2002. ©2010 AACR.

## Background

Microsimulation models simulate individual histories using stochastic rules that describe transitions between specified health states. Parameters associated with these transition rules are selected so that the model reproduces expected or observed data, a process called calibration. Calibration data include results from randomized controlled trials and epidemiologic and observational studies. Some calibration targets may be based on expert opinion. Once a model is calibrated, it can be used to provide policy-relevant information by generating predictions across a range of scenarios, including scenarios that may be difficult or impossible to evaluate in real-life settings. Thus, microsimulation models provide a powerful method for systematically combining evidence from a variety of sources to provide critical information to health policy decision makers. For example, microsimulation models provide a way to estimate and compare the effectiveness of different screening programs on colorectal cancer incidence and mortality, offering insights beyond those gained from either observational or randomized studies.

Colorectal cancer is the second leading cause of cancer death in the United States (1). Joint Guidelines from the American Cancer Society, the U.S. Multi-Society Task Force on Colorectal Cancer, and the American College of Radiology recommend a range of possible colorectal cancer screening options for average-risk adults aged ≥50 years, including annual fecal occult blood testing, annual fecal immunochemical testing, stool DNA testing (although the screening interval is uncertain), flexible sigmoidoscopy every 5 years, double-contrast barium enema every 5 years, computed tomographic colonography every 5 years, or colonoscopy every 10 years (2). Although this range of options may increase screening compliance by allowing individuals to select the test they are most comfortable with, observed rates of screening remain low (3-6). In addition, the range of screening options reflects uncertainty about which most effectively reduces colorectal cancer mortality.

We describe a new microsimulation model for colorectal cancer that was used to inform health policy decisions (7). Our goal is to both comprehensively describe our specific model and to provide an example of the type of information needed for model evaluation, including model structure and parameters, data and method used for model calibration, and model validation. To facilitate comparison with other microsimulation models for colorectal cancer, we report basic model outputs, including annual rates of transition between various disease states and dwell times. We conclude with a simple application of the Colorectal Cancer Simulated Population model for Incidence and Natural history (CRC-SPIN), predicting the impact of a one-time colonoscopy on colorectal cancer incidence.

## Materials and Methods

We developed our CRC-SPIN model to explore trends in colorectal cancer incidence and mortality, and to compare the effectiveness of different screening modalities. CRC-SPIN is based on the adenoma-carcinoma sequence (8, 9) and assumes that all colorectal cancer arises from an adenoma. Four distinct model components describe the natural history of colorectal cancer: (*a*) adenoma risk; (*b*) transition from adenoma to preclinical cancer; (*c*) transition from preclinical to clinically detectable cancer and stage at diagnosis; and (*d*) survival given stage at diagnosis. CRC-SPIN models state transitions in continuous time. Table 1 provides an overview of our model structure, including functional forms that describe transitions, statistical distributions that allow variability between and within individuals, and associated model parameters. Additional details are described below. See cisnet.cancer.gov for a more complete description of CRC-SPIN.

**Table 1.** Overview of the CRC-SPIN model structure

**Adenoma risk: nonhomogeneous Poisson process**

Log-risk for the *i*th individual:

$$
\alpha_{0i} + \alpha_{1}\,\mathrm{sex}_{i} + \sum_{k=1}^{4} \delta\bigl(A_{k} < \mathrm{age}_{i}(t) \le A_{k+1}\bigr)\Bigl\{\mathrm{age}_{i}(t)\,\alpha_{2k} + \sum_{j=2}^{k} A_{j}\,(\alpha_{2,j-1} - \alpha_{2j})\Bigr\}
$$

- Baseline log-risk, $\alpha_{0i}$, is normally distributed with mean $\Lambda$ and SD $\sigma$
- $\delta(\cdot)$ is an indicator function with $\delta(x) = 1$ when $x$ is true and $\delta(x) = 0$ otherwise
- $\mathrm{age}_{i}(t)$ is the *i*th individual's age at time $t$
- $A_{1} = 20$, $A_{2} = 50$, $A_{3} = 60$, $A_{4} = 70$, $A_{5} = \infty$ (effectively 100 years old)
- Calibrated parameters: $\Lambda$, $\sigma$, $\alpha_{1}$, $\alpha_{21}$, $\alpha_{22}$, $\alpha_{23}$, and $\alpha_{24}$

**Adenoma growth: Janoschek growth curve**

$$
d_{ij}(t) = d_{\infty} - (d_{\infty} - d_{0})\exp(-\lambda_{ij}\,t)
$$

- $d_{ij}(t)$ is the maximum diameter of the *j*th adenoma in the *i*th individual at time $t$ after initiation
- $d_{0} = 1$ mm, minimal detectable adenoma size
- $d_{\infty} = 50$ mm, maximum adenoma size
- Time to reach 10 mm: $-\ln\bigl((d_{\infty} - 10)/(d_{\infty} - d_{0})\bigr)/\lambda$

**Time to reach 10 mm: type 2 extreme value distribution**

- Adenomas in the colon: distribution parameterized by $\beta_{1c}$ and $\beta_{2c}$
- Adenomas in the rectum: distribution parameterized by $\beta_{1r}$ and $\beta_{2r}$
- Calibrated parameters: $\beta_{1c}$, $\beta_{2c}$, $\beta_{1r}$, and $\beta_{2r}$

**Transition to preclinical cancer: normal cumulative distribution**

- Probability of transition, male colon: $\Phi\bigl(\{\ln(\gamma_{1cm}\,\mathrm{size}) + \gamma_{2cm}(a - 50)\}/\gamma_{3}\bigr)$
- Probability of transition, male rectum: $\Phi\bigl(\{\ln(\gamma_{1rm}\,\mathrm{size}) + \gamma_{2rm}(a - 50)\}/\gamma_{3}\bigr)$
- Probability of transition, female colon: $\Phi\bigl(\{\ln(\gamma_{1cf}\,\mathrm{size}) + \gamma_{2cf}(a - 50)\}/\gamma_{3}\bigr)$
- Probability of transition, female rectum: $\Phi\bigl(\{\ln(\gamma_{1rf}\,\mathrm{size}) + \gamma_{2rf}(a - 50)\}/\gamma_{3}\bigr)$
- $\Phi(\cdot)$ is the standard normal cumulative distribution function, size is adenoma size in mm, and $a$ is age at adenoma initiation
- Calibrated parameters: $\gamma_{1cm}$, $\gamma_{2cm}$, $\gamma_{1rm}$, $\gamma_{2rm}$, $\gamma_{1cf}$, $\gamma_{2cf}$, $\gamma_{1rf}$, $\gamma_{2rf}$, and $\gamma_{3}$

**Sojourn time: lognormal distribution**

- Preclinical colon cancer: lognormal with mean $\mu_{c}$ and SD $\tau_{c}\mu_{c}$
- Preclinical rectal cancer: lognormal with mean $\mu_{r}$ and SD $\tau_{r}\mu_{r}$
- Calibrated parameters: $\mu_{c}$, $\tau_{c}$, $\mu_{r}$, and $\tau_{r}$

### Adenoma risk

CRC-SPIN initiates adenomas within individuals using a nonhomogeneous Poisson process that allows adenoma risk to vary systematically by gender and age, and to vary randomly across individuals. Age effects are modeled using a piecewise log-linear model with four age-risk intervals: (20,50), (50,60), (60,70), and ≥70. CRC-SPIN assumes that individuals <20 years old do not develop detectable adenomas.
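The piecewise log-linear age effect can be illustrated with a minimal Python sketch. It assumes the Table 1 log-risk is read as a baseline term, a sex term, and a continuous piecewise-linear age term with knots at 20, 50, 60, and 70 years. The function names, the sex coding, and every numeric parameter value are hypothetical placeholders, not calibrated CRC-SPIN estimates.

```python
import math

# Age knots A1..A5 from Table 1 (A5 is "effectively 100 years old").
KNOTS = [20.0, 50.0, 60.0, 70.0, 100.0]


def age_term(age, slopes):
    """Continuous piecewise-linear age contribution to the log-risk.

    slopes = (a21, a22, a23, a24): one slope per age interval.
    Returns None below age 20, where no adenomas are initiated.
    """
    if age < KNOTS[0]:
        return None
    # Find the 1-based interval k with A_k < age <= A_{k+1}.
    k = 1
    while k < 4 and age > KNOTS[k]:
        k += 1
    term = age * slopes[k - 1]
    # Offsets A_j * (a_{2,j-1} - a_{2j}) keep the curve continuous at each knot.
    for j in range(2, k + 1):
        term += KNOTS[j - 1] * (slopes[j - 2] - slopes[j - 1])
    return term


def log_risk(age, sex_indicator, alpha0, alpha1, slopes):
    """Log of the adenoma initiation rate for one individual at a given age."""
    term = age_term(age, slopes)
    if term is None:
        return -math.inf  # zero risk before age 20
    return alpha0 + alpha1 * sex_indicator + term


# Hypothetical values, chosen only to make the sketch runnable.
EXAMPLE_SLOPES = (0.04, 0.03, 0.02, 0.01)
for example_age in (30, 50, 55, 65, 80):
    rate = math.exp(log_risk(example_age, 1, -6.0, -0.3, EXAMPLE_SLOPES))
    print(example_age, round(rate, 4))
```

The offset terms are what make the log-risk continuous: evaluating the age term just below and just above a knot gives the same value, so only the slope changes between intervals.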

Once adenomas are initiated, CRC-SPIN assigns their location using a multinomial distribution across six possible sites of the large intestine (from proximal to distal): (*a*) *P*(cecum) = 0.08; (*b*) *P*(ascending colon) = 0.23; (*c*) *P*(transverse colon) = 0.24; (*d*) *P*(descending colon) = 0.12; (*e*) *P*(sigmoid colon) = 0.24; and (*f*) *P*(rectum) = 0.09. This location information allows us to model endoscopic screening modalities with different reach (e.g., colonoscopy and flexible sigmoidoscopy, and colonoscopy with incomplete reach).
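A minimal sketch of this location assignment, assuming each adenoma's site is drawn independently with the probabilities listed above (the function name and random seed are illustrative):

```python
import random

# Sites ordered from proximal to distal, with the probabilities stated in the text.
SITES = ["cecum", "ascending colon", "transverse colon",
         "descending colon", "sigmoid colon", "rectum"]
PROBS = [0.08, 0.23, 0.24, 0.12, 0.24, 0.09]


def sample_location(rng):
    """Draw one adenoma location from the six-site multinomial distribution."""
    return rng.choices(SITES, weights=PROBS, k=1)[0]


rng = random.Random(2010)  # arbitrary seed, for reproducibility only
draws = [sample_location(rng) for _ in range(100_000)]
share_rectum = draws.count("rectum") / len(draws)
print(f"simulated share of adenomas in the rectum: {share_rectum:.3f}")
```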

### Transition from adenoma to preclinical cancer

The time from adenoma initiation to preclinical disease is driven by the models for adenoma growth and adenoma transition to preclinical cancer. Although relatively few adenomas occur in the rectum (<10%), nearly a third of clinically detected colorectal cancers are rectal cancers (10). Therefore, CRC-SPIN allows adenomas in the colon and rectum to have different adenoma growth and adenoma transition probabilities. CRC-SPIN initiates adenomas at 1 mm, and describes subsequent adenoma growth using an extension to the Janoschek growth curve model (11, 12). This model is asymmetric, with exponential growth early that slows to an asymptote at the maximum adenoma size, *d*_{∞}. CRC-SPIN assumes that within individuals adenomas grow independently.
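The growth curve in Table 1 can be sketched as below, using the stated values of 1 mm for the initial diameter and 50 mm for the maximum. The growth rate passed in is a hypothetical value; in CRC-SPIN it follows from the simulated time to reach 10 mm.

```python
import math

D0 = 1.0      # mm, diameter at initiation (Table 1)
D_INF = 50.0  # mm, maximum adenoma diameter (Table 1)


def diameter(t, lam):
    """Adenoma diameter in mm at time t (years) after initiation."""
    return D_INF - (D_INF - D0) * math.exp(-lam * t)


def time_to_size(size_mm, lam):
    """Years needed to grow from D0 to size_mm (requires D0 <= size_mm < D_INF)."""
    return -math.log((D_INF - size_mm) / (D_INF - D0)) / lam


def rate_from_time_to_10mm(t10):
    """Growth rate implied by a given time (years) to reach 10 mm."""
    return -math.log((D_INF - 10.0) / (D_INF - D0)) / t10


lam = rate_from_time_to_10mm(20.0)  # hypothetical: 10 mm reached after 20 years
print(round(diameter(20.0, lam), 3), round(time_to_size(10.0, lam), 3))
```

Parameterizing by the time to reach 10 mm rather than by the rate itself matches how Table 1 specifies the distribution: the type 2 extreme value draw gives the time, and the rate is recovered by inverting the curve.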

CRC-SPIN assigns the cumulative probability of adenoma transition to preclinical disease based on a function of location (colon or rectum), size, and age at initiation. Adenomas do not transition to preclinical cancer if their simulated size at transition is greater than the maximum adenoma size (*d*_{∞}) or if the individual dies before the adenoma reaches transition size.
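One way to read this rule is as inverse-CDF sampling of a size at transition. The sketch below assumes the Table 1 transition probability has the form Φ({ln(γ1 · size) + γ2(a − 50)}/γ3) for a given sex and location. The parameter values are hypothetical, and the function names are illustrative rather than taken from the CRC-SPIN code.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()
D_INF = 50.0  # mm, maximum adenoma size


def transition_cdf(size_mm, age_at_init, g1, g2, g3):
    """Cumulative probability of transition by the time an adenoma reaches size_mm."""
    z = (math.log(g1 * size_mm) + g2 * (age_at_init - 50.0)) / g3
    return STD_NORMAL.cdf(z)


def size_at_transition(u, age_at_init, g1, g2, g3):
    """Invert the CDF at u in (0, 1); None means the adenoma never transitions."""
    z = STD_NORMAL.inv_cdf(u)
    size = math.exp(g3 * z - g2 * (age_at_init - 50.0)) / g1
    return size if size <= D_INF else None


# Hypothetical parameters, chosen only to make the sketch runnable.
G1, G2, G3 = 0.02, 0.02, 0.5
for u in (0.01, 0.5, 0.9):
    print(u, size_at_transition(u, 60.0, G1, G2, G3))
```

A draw whose implied size exceeds the maximum corresponds to an adenoma that never becomes cancer; an adenoma with a smaller size at transition still fails to progress if the individual dies before the growth curve reaches that size.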

### Time from preclinical to clinically detected cancer (sojourn time) and stage at diagnosis

Clinically detected cancer is defined as cancer diagnosed in the absence of screening. Sojourn time is defined as the time from cancer initiation to clinical detection. The sojourn time of each preclinical cancer is assumed to be independently distributed, according to a lognormal distribution that depends on whether the adenoma is located in the colon or the rectum.
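Table 1 specifies the sojourn time as lognormal with mean μ and SD τμ. Assuming these refer to the mean and SD on the natural (years) scale, a sketch of the conversion to log-scale parameters and of sampling is given below; the numeric values are hypothetical.

```python
import math
import random


def lognormal_params(mean, tau):
    """Log-scale (mu_log, sigma_log) for a lognormal with the given mean and SD = tau * mean."""
    sigma2 = math.log(1.0 + tau ** 2)          # CV^2 = exp(sigma^2) - 1
    mu_log = math.log(mean) - sigma2 / 2.0     # mean = exp(mu_log + sigma^2 / 2)
    return mu_log, math.sqrt(sigma2)


def sample_sojourn(rng, mean, tau):
    """Draw one sojourn time in years."""
    mu_log, sigma_log = lognormal_params(mean, tau)
    return rng.lognormvariate(mu_log, sigma_log)


rng = random.Random(1)  # arbitrary seed
# Hypothetical mean of 2 years with SD equal to half the mean.
sojourn_draws = [sample_sojourn(rng, 2.0, 0.5) for _ in range(200_000)]
print(round(sum(sojourn_draws) / len(sojourn_draws), 2))
```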

Once a cancer is detected, CRC-SPIN simulates size at detection using a smoothed distribution based on size at detection in the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) data for 1975-1979 (13), a period when there was little or no colorectal cancer screening. Next, CRC-SPIN stochastically assigns stage at detection, conditional on size, using a multinomial logistic regression model estimated using the same SEER data.

### Survival

CRC-SPIN stochastically assigns time of colorectal cancer death, using relative survival probabilities estimated from SEER survival data for cases diagnosed from 1975 to 1979. Survival probabilities are based on proportional hazards models that are stratified by location (colon or rectum) and American Joint Committee on Cancer stage, with age and gender included as covariates.

CRC-SPIN also stochastically assigns time of non–colorectal cancer death using survival probabilities based on product-limit estimates for age and birth-year cohorts from the National Center for Health Statistics Databases (14).

### CRC-SPIN model calibration

CRC-SPIN contains 23 calibrated parameters (7 adenoma risk parameters, 4 adenoma growth parameters, 8 adenoma transition parameters, and 4 sojourn time parameters). CRC-SPIN is a Bayesian model that includes specification of prior distributions for each of these model parameters (provided in the Appendix). We specify diffuse prior distributions when little is known about a parameter. When there are prior beliefs or evidence about parameter values, then informative prior distributions are used to incorporate this knowledge into the model.

Calibration (that is, estimation) of CRC-SPIN parameters is based on the joint posterior parameter distribution, which updates prior distributions (which reflect prior beliefs about parameters) using calibration data. We estimated the joint posterior distribution using a Markov chain Monte Carlo approach (15). This estimation approach results in a sequence of simulated draws from the posterior distribution: θ_{1}, …, θ_{N}. We estimated model parameters and outcomes (i.e., predictions) using the mean across these simulated draws, with 95% credible intervals (CI) and prediction intervals (PI) estimated using the 2.5th and 97.5th percentiles (16).

We simultaneously estimated (calibrated) all 23 CRC-SPIN parameters using three types of data: population-based outcomes from SEER data, individual-level study outcomes, and adenoma-level outcomes.
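The posterior summaries used throughout (the mean across simulated draws, with the 2.5th and 97.5th percentiles as interval bounds) can be sketched as follows. The draws here are synthetic stand-ins generated for illustration, not CRC-SPIN output, and the percentile interpolation method is an assumption.

```python
import random
import statistics


def summarize(draws):
    """Return (mean, 2.5th percentile, 97.5th percentile) of simulated draws."""
    # n=40 yields cut points at 2.5%, 5%, ..., 97.5%.
    cuts = statistics.quantiles(draws, n=40, method="inclusive")
    return statistics.mean(draws), cuts[0], cuts[-1]


rng = random.Random(16)  # arbitrary seed
# Synthetic stand-in for 1,000 posterior draws of one parameter or outcome.
synthetic_draws = [rng.gauss(0.0, 1.0) for _ in range(1000)]
estimate, lower, upper = summarize(synthetic_draws)
print(f"{estimate:.2f} (95% interval, {lower:.2f} to {upper:.2f})")
```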

Individual-level outcomes came from studies that focused on minimally screened asymptomatic individuals. We assumed imperfect accuracy when calibrating to data that used colonoscopy to identify outcomes (17, 18). Consistent with observed miss rates (19-21), all lesions ≥20 mm were assumed to be detected by colonoscopy. For smaller lesions, miss rates were assumed to be a quadratic function of size. More specifically, the probability of missing a lesion given its size was P(miss | size = *s*, *s* < 20 mm) = 0.34 − 0.035*s* + 0.0009*s*^{2}, where *s* is adenoma diameter in mm. The associated miss rates for lesions 1 mm, 5 mm, 10 mm, and 15 mm in size were 31%, 19%, 8%, and 2%, respectively.
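The miss-rate function is simple enough to check directly. The sketch below implements the quadratic stated above and reproduces the four quoted miss rates (the function name is illustrative):

```python
def miss_probability(size_mm):
    """Probability that colonoscopy misses a lesion of the given diameter (mm)."""
    if size_mm >= 20:
        return 0.0  # all lesions >= 20 mm are assumed to be detected
    return 0.34 - 0.035 * size_mm + 0.0009 * size_mm ** 2


for s in (1, 5, 10, 15):
    print(f"{s} mm: {round(100 * miss_probability(s))}%")
# prints 31%, 19%, 8%, and 2% for 1, 5, 10, and 15 mm
```

The quadratic equals zero at exactly 20 mm, so the miss probability is continuous at the boundary between the two cases.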

#### Adenoma prevalence.

Information about adenoma prevalence is largely incorporated into CRC-SPIN through prior distributions for adenoma risk parameters that are based on results from a meta-analysis of autopsy data describing adenoma prevalence by age and gender (22). We also included information from an additional study by Strul and colleagues (17) that reported adenoma prevalence and preclinical cancer rates by age for 1,171 asymptomatic individuals with a completed screening colonoscopy, excluding individuals with a personal or family history of colorectal cancer and those with colonoscopy, sigmoidoscopy, or barium enema within the last five years.

#### Adenoma size.

Lieberman and colleagues (18) reported adenoma prevalence and categorical size distributions from a sample of veterans with completed colonoscopies who participated in a study of screening colonoscopy. Because this study was conducted in a population of veterans, 96.8% of the participants were men. All recruited patients were asymptomatic and without a history of colorectal cancer or diseases of the colon, and had not had a colonic examination in the past 10 years. Because individuals in this sample may be at somewhat higher risk of adenoma initiation than the general population, we restricted our focus to the size of adenomas among the 1,141 individuals with at least one adenoma.

#### Adenoma count and size.

Pickhardt and colleagues (23) reported adenoma counts and categorical size distribution of adenomas from 1,233 asymptomatic individuals participating in a study comparing computed tomography colonography to optical colonoscopy. This study excluded people with a positive fecal occult blood test within the last 6 months, colonoscopy within the last 10 years, or barium enema within the last 5 years.

#### Preclinical cancer prevalence.

Imperiale and colleagues (24) reported preclinical cancer prevalence in a sample of 1,994 individuals employed by Eli Lilly who participated in a study of screening colonoscopy. Eligible individuals were ≥50 years old, asymptomatic for colon cancer, and did not have a personal history of colorectal cancer, colorectal polyps, or inflammatory bowel disease.

Two studies of clinical series drawn from pathology records provided adenoma-level data, that is, outcomes reported for adenomas, rather than individuals. The first series, reported by Church (25), described the pathology of 5,722 adenomas removed between January 1995 and September 2002 in a single endoscopist's practice. The second series, reported by Odom and colleagues (26), described the pathology of 3,225 adenomas removed between January 1999 and December 2003, excluding those obtained from bowel resection and colorectal cancer not associated with a polyp. We calibrated to preclinical cancer rates only in adenomas >5 mm because rates of preclinical cancer were near zero in smaller adenomas (0.05% in the Church data and 0.03% in the Odom data), and these near-zero rates made estimation of the likelihood function computationally intractable.

### Additional model description

For each θ_{i} (*i* = 1, …, 1,000) we simulated model results for a cohort of 30 million individuals born in 1928. These simulated model results provided additional description of the CRC-SPIN model. We simulated a large cohort size so that credible intervals primarily provided information about the variability of estimated transition probabilities due to variability in estimated model parameters, rather than between-individual variability in the modeled disease process. We report estimated state transition probabilities based on the predicted proportion of simulated individuals making state transitions as they age from 60 to 61 years. We selected a one-year period to allow comparison with Markov models that use a one-year cycle. We calculated these proportions both for the overall cohort and for men and women. We also describe lifetime transition probabilities and dwell times for adenomas and preclinical cancers.

### External model validation

We used CRC-SPIN to predict results from two studies that were not used to calibrate (estimate) model parameters. When simulating data from these studies we assumed an error-prone colonoscopy, as described in the section on CRC-SPIN model calibration. These simulation runs also assumed incomplete reach: 85% of simulated colonoscopy exams were complete to the cecum, 7% were complete to the ascending colon, 5% were complete to the transverse colon, and 3% were complete to the descending colon. The two studies are described below.
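The incomplete-reach assumption implies a site-specific probability that a lesion lies within the examined portion of the bowel. The sketch below computes it under one interpretation, which is an assumption here: an exam "complete to" a given site visualizes that site and everything distal to it.

```python
# Sites ordered from proximal to distal.
SITES = ["cecum", "ascending colon", "transverse colon",
         "descending colon", "sigmoid colon", "rectum"]

# Probability that a simulated exam's most proximal point reached is each site.
REACH = {"cecum": 0.85, "ascending colon": 0.07,
         "transverse colon": 0.05, "descending colon": 0.03}


def prob_site_examined(site):
    """Probability that a lesion at this site falls within the reach of an exam."""
    idx = SITES.index(site)
    # The site is examined when the exam endpoint is at or proximal to it.
    return sum(p for endpoint, p in REACH.items() if SITES.index(endpoint) <= idx)


for site in SITES:
    print(f"{site}: {prob_site_examined(site):.2f}")
```

Under this reading, a cecal lesion is within reach in 85% of exams, an ascending colon lesion in 92%, a transverse colon lesion in 97%, and any more distal lesion in every exam.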

#### Validation study 1: negative colonoscopy and subsequent colorectal cancer.

Brenner et al. (27) used a case-control study to examine the association between a negative colonoscopy and the subsequent risk for colorectal cancer. Their study included 380 cases >30 years old diagnosed with colorectal cancer between January 2003 and June 2004, excluding individuals with a prior positive colonoscopy (colonoscopy finding an adenoma or preclinical cancer). These 380 cases were frequency matched to 480 controls, also without a prior positive colonoscopy, by age and gender. The investigators used logistic regression to estimate the overall association between a prior negative colonoscopy and subsequent colorectal cancer diagnosis, and the association between the number of years since a prior negative colonoscopy and subsequent colorectal cancer diagnosis. Logistic regression models included age and gender as covariates. We used their reported adjusted odds ratios (OR) as our validation points.

We simulated Brenner et al.'s case-control sample for each θ_{i} in our set of posterior draws. Each simulated sample matched the age and gender distributions of cases and controls, and the overall probability of a prior (negative) colonoscopy. To generate these data, we simulated whether an individual had a prior colonoscopy at baseline, excluding those with a simulated positive finding at prior colonoscopy. For each simulated sample we used logistic regression to estimate age- and gender-adjusted ORs. When analyzing data simulated to represent the Brenner study, we added one case and one control aged 60 for each category of time since colonoscopy so that we could obtain estimates for every simulated study. This had a small conservative effect on our estimates, moving ORs slightly nearer to 1.

#### Validation study 2: adenoma prevalence following a negative colonoscopy.

Imperiale and colleagues (28) reported the proportion of individuals with at least one adenoma among 1,256 individuals with a prior negative screening colonoscopy (no cancers or adenomas found at screening) who completed follow-up. This study was a follow-up of a screening colonoscopy study used for model calibration (24). The average time to follow-up colonoscopy was 5.3 years (SD, 1.3 years; interquartile range, 5.0-6.0 years).

We simulated the timing of follow-up colonoscopy in the study cohort using a normal distribution (mean, 4.5 years; SD, 1.2 years), truncated to range from 4.75 to 8 years. This resulted in a mean time to follow-up colonoscopy of 5.6 years (interquartile range, 5.1-6.0 years). We simulated one cohort for each θ_{i} and, using this cohort, estimated the proportion with any adenoma detected at follow-up colonoscopy.
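The truncated normal distribution for follow-up timing can be sketched with simple rejection sampling. The sampling method is an illustrative choice, not necessarily the one used for CRC-SPIN; the distribution parameters are those stated above.

```python
import random
import statistics


def sample_followup_time(rng, mean=4.5, sd=1.2, low=4.75, high=8.0):
    """Draw a follow-up time (years) from a normal truncated to [low, high]."""
    while True:
        t = rng.gauss(mean, sd)
        if low <= t <= high:  # reject draws outside the truncation range
            return t


rng = random.Random(28)  # arbitrary seed
times = [sample_followup_time(rng) for _ in range(100_000)]
mean_time = statistics.mean(times)
q1, _, q3 = statistics.quantiles(times, n=4)
print(f"mean {mean_time:.1f} years, interquartile range {q1:.1f}-{q3:.1f} years")
```

Because the lower truncation point lies above the untruncated mean, the truncated mean is pulled upward, which is why a normal distribution centered at 4.5 years produces a simulated mean of about 5.6 years.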

### Model application: evaluation of a hypothetical screening strategy

To show the use of CRC-SPIN, we predicted the hypothetical impact of a simple colonoscopy screening strategy on colorectal cancer incidence. For each θ_{i} we simulated a cohort of 10 million individuals born in 1928 (51.7% female at age 50). We simulated the effect of one-time screening colonoscopy, without subsequent surveillance, at age 50 years. We predict cancer incidence from age 55 through 85 years.

We explored three screening scenarios: a perfect screening test, colonoscopy with complete compliance, and colonoscopy with noncompliance. All simulated individuals were free of clinically detectable colorectal cancer at the time of screening, which occurred on their 50th birthday. The perfect screening test detects and removes all adenomas and preclinical cancers, regardless of size or location. Colonoscopy detects adenomas and preclinical cancers with the probability of detection based on lesion size and location. Lesion miss rates are described in the section on CRC-SPIN model calibration, whereas colonoscopy completion rates are in the section on external model validation. Colonoscopy with noncompliance assumes that half of all eligible individuals undergo colonoscopy on their 50th birthday. When simulating colonoscopy we assumed 1 death per 10,000 colonoscopies (29).

## Results

### Calibration and internal validation

Internal validation refers to the comparison of observed calibration points with model predictions of those points, based on simulating calibration data using calibration sample sizes to incorporate sampling variability. With few exceptions, the resulting 95% prediction intervals (95% PI) include the observed calibration points. Below, we focus on point predictions.

CRC-SPIN provided a good fit to calibration data. In particular, CRC-SPIN provided excellent fit to SEER colorectal cancer rates (Fig. 1A). CRC-SPIN predictions balanced data from three studies describing preclinical cancers (Fig. 1B), predicting fewer preclinical cancers at screening than were found by Imperiale et al. (ref. 24; 3 versus 4 per thousand), more preclinical cancers among biopsied adenomas up to 20 mm, and fewer preclinical cancers among biopsied adenomas >20 mm (although prediction intervals were wide for adenomas ≥20 mm because few adenomas were this large). The numbers of predicted preclinical cancers per 1,000 adenomas, relative to the numbers observed, were as follows: for 6 to 10 mm, 8 versus 2 observed by Church (25) and 6.5 observed by Odom et al. (26); for 11 to 20 mm, 36 versus 16 (26); for >10 mm, 48 versus 30 (25); and for >20 mm, 140 versus 190 (26). CRC-SPIN predicted higher adenoma prevalence than observed by Strul et al. (ref. 17; Fig. 1C), reflecting our strong prior information about adenoma prevalence (22). CRC-SPIN also predicted more adenomas than found by Pickhardt and colleagues (23), predicting 575 adenomas among 1,233 screened individuals (95% PI, 450-747), compared with the 554 observed. Finally, as shown in Fig. 1D, CRC-SPIN predicted that 61% of detected adenomas would be <6 mm, similar to the 62% reported by Pickhardt et al. (23), but predicted more detected adenomas >10 mm than reported by either Lieberman et al. (ref. 18; 28% versus 23%) or Pickhardt et al. (ref. 23; 17% versus 9%), and fewer adenomas 6 to 10 mm than reported by Pickhardt et al. (ref. 23; 22% versus 29%).

### Descriptive model results

Estimated one-year transition probabilities at age 60 (Table 2) show that none of the simulated individuals made the transition from the “no adenoma” state to clinically detected colorectal cancer or colorectal cancer death within one year, and transition to preclinical cancer was possible, but very unlikely. Similarly, the estimated probability of transition from the “small adenoma” state to any colorectal cancer state was nonzero, but extremely small. The probability of transition to colorectal cancer states increased with the size of the largest adenoma. Individuals were most likely to remain in the same adenoma size category for at least one year.

**Table 2.** Estimated one-year transition probabilities at age 60

| Most advanced state at time 0 | Most advanced state one year later | Overall transition probability, estimate (95% CI) |
|---|---|---|
| No adenomas | No adenomas | 0.975 (0.969-0.979) |
| | ≥1 small adenoma | 0.013 (0.009-0.019) |
| | ≥1 medium adenoma | <10^{−8} (—) |
| | ≥1 large adenoma | <10^{−8} (—) |
| | Preclinical CRC | <10^{−8} (—) |
| | Clinically detected CRC | 0 (—) |
| | CRC death | 0 (—) |
| | Non-CRC death | 0.012 (0.012-0.013) |
| ≥1 small adenoma | ≥1 small adenoma | 0.925 (0.909-0.942) |
| | ≥1 medium adenoma | 0.061 (0.044-0.078) |
| | ≥1 large adenoma | <10^{−8} (—) |
| | Preclinical CRC | 0.00014 (0.00008-0.00021) |
| | Clinically detected CRC | 0.000002 (0.00000-0.00004) |
| | CRC death | <10^{−8} (—) |
| | Non-CRC death | 0.013 (0.013-0.013) |
| ≥1 medium adenoma | ≥1 medium adenoma | 0.932 (0.922-0.940) |
| | ≥1 large adenoma | 0.051 (0.043-0.061) |
| | Preclinical CRC | 0.003 (0.002-0.004) |
| | Clinically detected CRC | 0.0004 (0.0000-0.0010) |
| | CRC death | 0.00002 (0.00000-0.00006) |
| | Non-CRC death | 0.013 (0.013-0.013) |
| ≥1 large adenoma | ≥1 large adenoma | 0.970 (0.966-0.973) |
| | Preclinical CRC | 0.015 (0.011-0.019) |
| | Clinically detected CRC | 0.002 (0.000-0.004) |
| | CRC death | 0.00008 (0.00000-0.00024) |
| | Non-CRC death | 0.013 (0.013-0.014) |
| Preclinical CRC | Preclinical CRC | 0.566 (0.351-0.719) |
| | Clinically detected CRC | 0.381 (0.244-0.575) |
| | CRC death | 0.039 (0.024-0.060) |
| | Non-CRC death | 0.013 (0.012-0.014) |
| Clinically detected CRC | Clinically detected CRC | 0.924 (0.921-0.927) |
| | CRC death | 0.063 (0.060-0.067) |
| | Non-CRC death | 0.012 (0.012-0.013) |

NOTE: Estimates were based on 30 million individuals simulated for each of 1,000 draws from the posterior distribution of CRC-SPIN model parameters. Individuals were assigned to their most advanced disease state at each time point. For example, an individual with multiple medium adenomas and one preclinical colorectal cancer would be categorized as having preclinical colorectal cancer. Small adenomas, <5 mm; medium adenomas, 5-<10 mm; large adenomas, ≥10 mm. Individuals who die during the one-year period are included in their most recent state.

Abbreviation: CRC, colorectal cancer.
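The "most advanced state" rule in the table note can be sketched as a small classifier. The size cut points are those given in the note; the function names and the ordering list are illustrative.

```python
# States ordered from least to most advanced (death states omitted for brevity).
STATE_ORDER = ["no adenomas", "small adenoma", "medium adenoma",
               "large adenoma", "preclinical CRC", "clinically detected CRC"]


def adenoma_category(diameter_mm):
    """Size category from the table note: small <5 mm, medium 5-<10 mm, large >=10 mm."""
    if diameter_mm < 5:
        return "small adenoma"
    if diameter_mm < 10:
        return "medium adenoma"
    return "large adenoma"


def most_advanced_state(adenoma_sizes_mm, has_preclinical=False, has_clinical=False):
    """Assign an individual to the most advanced disease state present."""
    states = ["no adenomas"] + [adenoma_category(d) for d in adenoma_sizes_mm]
    if has_preclinical:
        states.append("preclinical CRC")
    if has_clinical:
        states.append("clinically detected CRC")
    return max(states, key=STATE_ORDER.index)


# The example from the note: several medium adenomas plus one preclinical cancer.
print(most_advanced_state([6.0, 7.5], has_preclinical=True))
```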

One-year transition probabilities differed slightly for men and for women, although these small differences accumulate over time and are reflected in higher adenoma prevalence and colorectal cancer incidence in men. Overall, men had slightly higher probabilities of transition from no adenomas to one or more small adenomas (0.015 versus 0.011 for women), from having at least one small adenoma to at least one medium adenoma (0.066 versus 0.064), and from having at least one medium adenoma to at least one large adenoma (0.052 versus 0.050). The probabilities of transition from adenomas to preclinical cancers were somewhat smaller for men than for women (from one or more small adenomas, 0.00013 versus 0.00018; from one or more medium adenomas, 0.003 versus 0.004; from one or more large adenomas, 0.014 versus 0.016). Probabilities of transition into the clinically detected cancer state and colorectal cancer death states were similar for men and for women. Men had a higher probability of transition into other-cause death from all stages, approximately 0.016 versus 0.009 for women. The probability of transition into non–colorectal cancer death is independent of the colorectal cancer state; the differences in estimated probabilities in Table 2 reflect sampling variability.

Lifetime transition probabilities and dwell times are another way to describe the natural history model. Overall, 7.4% of adenomas transition to preclinical cancer before an individual dies (95% CI, 5.8-9.0%), and of these the average time from adenoma initiation to cancer initiation is 25.4 years (95% CI, 21.5-28.2). Most (87.0%) preclinical cancers transition to clinical cancer before an individual dies, and of these the average time from preclinical cancer initiation to clinical cancer detection is 1.9 years (95% CI, 1.2-3.0 years). Finally, 6.4% of adenomas transition to clinically detected cancer before an individual dies (95% CI, 5.0-8.0%), and of these the average time from adenoma initiation to clinically detected cancer is 27.2 years (95% CI, 23.3-30.1 years).
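
As a rough consistency check on these lifetime figures, the two stage-wise probabilities can be chained. The sketch below uses only the point estimates quoted above; the variable names are illustrative, and multiplying point estimates is a simplification that ignores the posterior uncertainty behind the reported intervals.

```python
# Point estimates quoted in the text (proportions, not percentages).
p_adenoma_to_preclinical = 0.074   # adenoma -> preclinical cancer before death
p_preclinical_to_clinical = 0.870  # preclinical -> clinically detected before death

# Chaining the two stages approximates adenoma -> clinically detected cancer.
p_adenoma_to_clinical = p_adenoma_to_preclinical * p_preclinical_to_clinical
print(f"{p_adenoma_to_clinical:.3f}")  # close to the reported 6.4%

# Summing the stage-wise average dwell times gives a similar cross-check.
# The averages are taken over different subsets of adenomas (those reaching
# preclinical cancer versus those reaching clinical detection), so the sum
# need not equal the reported 27.2 years exactly.
mean_years_to_preclinical = 25.4
mean_years_preclinical_to_clinical = 1.9
approx_years_to_clinical = mean_years_to_preclinical + mean_years_preclinical_to_clinical
print(f"{approx_years_to_clinical:.1f}")
```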

### External model validation

Although internal validation results suggested that CRC-SPIN may overestimate adenoma prevalence and size, this was not replicated by external validation (Table 3). The CRC-SPIN model predicted fewer adenomas than found in the Imperiale follow-up study (28).

| | Reported | Predicted |
|---|---|---|
| **Study of individuals 5 years after a negative colonoscopy (28)** | | |
| Percentage of individuals with at least one adenoma | 0.16 (0.14-0.18) | 0.10 (0.07-0.15) |
| **Case-control study of negative colonoscopy and subsequent colorectal cancer\* (27)** | | |
| Any prior (negative) colonoscopy | 0.22 (0.14-0.34) | 0.19 (0.10-0.33) |
| Colonoscopy 1-2 years ago | 0.13 (0.06-0.30) | 0.22 (0.08-0.44) |
| Colonoscopy 3-4 years ago | 0.26 (0.12-0.58) | 0.15 (0.05-0.30) |
| Colonoscopy 5-9 years ago | 0.21 (0.08-0.57) | 0.15 (0.05-0.30) |
| Colonoscopy 10-19 years ago | 0.29 (0.11-0.80) | 0.31 (0.12-0.62) |
| Colonoscopy 20+ years ago | 0.38 (0.13-1.09) | 0.58 (0.24-1.05) |

NOTE: Reported values are the estimates reported in print, and include 95% confidence intervals. Predicted values are based on CRC-SPIN simulations and include 95% prediction intervals.

*Odds ratios are relative to no previous negative colonoscopy.

CRC-SPIN predicted a smaller reduction in odds of colorectal cancer associated with colonoscopy 1 to 2 years ago than reported by Brenner and colleagues, greater reductions in odds of colorectal cancer associated with colonoscopy 3 to 4 and 5 to 9 years ago, a similar reduction in odds for colonoscopy 10 to 19 years ago, and a smaller reduction in the odds of colorectal cancer associated with colonoscopy ≥20 years ago (Table 3). The predicted overall OR associated with any negative colonoscopy was similar to the reported OR (predicted OR of 0.19 versus reported OR of 0.22). Prediction intervals included the reported ORs and were generally similar to reported 95% confidence intervals. Several assumptions were required to predict these data. Perhaps most important were our assumptions that the accuracy of colonoscopy and the likelihood of a complete colonoscopy were constant over the ≥20-year time interval. We also assumed a “perfect” case-control study, free from selection bias that may result in differential risk for colorectal cancer among individuals who chose to undergo colonoscopy.

### Evaluation of a hypothetical screening strategy

Table 4 shows the predicted results for the hypothetical screening scenarios. CRC-SPIN predicted that a one-time screening of all individuals at age 50 with a perfect test would detect 95 cancers per 100,000 individuals and would result in reduced colorectal cancer incidence through age 85. This one-time perfect test would reduce the *cumulative* colorectal cancer incidence (including those detected at screening) by 87% by age 60 and 64% by age 80. A one-time colonoscopy at age 50 years (with missed lesions and incomplete reach) would reduce the cumulative colorectal cancer incidence by 74% by age 60 and 52% by age 80. Miss rates and incomplete reach associated with screening colonoscopy had a large impact on the predicted rates of cancer following the initial colonoscopy. The predicted relative rates of incident cancer 1-5, 6-10, 11-15, and 16-25 years postscreening are <0.01, 0.03, 0.13, and 0.35, respectively, for a perfect test; versus 0.12, 0.18, 0.29, and 0.49, respectively, for colonoscopy. Noncompliance further reduces the impact of colonoscopy on colorectal cancer incidence. When half of all 50-year-old individuals underwent colonoscopy, its impact on colorectal cancer incidence was approximately halved.

| Age | No screening | Colonoscopy with 50% compliance | Colonoscopy with 100% compliance | Perfect screening |
|---|---|---|---|---|
| **Screen-detected colorectal cancers per 100,000** | | | | |
| 50 | 0 (—) | 43 (24-71) | 87 (48-142) | 95 (53-155) |
| **Clinically detected colorectal cancers per 100,000** | | | | |
| 50 | 43 (38-49) | 23 (20-27) | 4 (3-5) | 0 (—) |
| 55 | 73 (66-81) | 42 (37-46) | 10 (8-12) | 0 (0-1) |
| 60 | 115 (106-125) | 70 (64-76) | 25 (21-29) | 7 (3-13) |
| 65 | 168 (154-181) | 112 (102-123) | 57 (47-70) | 31 (19-48) |
| 70 | 234 (216-251) | 172 (155-191) | 110 (90-136) | 78 (55-111) |
| 75 | 314 (288-342) | 249 (221-282) | 184 (151-229) | 148 (109-200) |
| 80 | 475 (419-539) | 345 (299-398) | 279 (228-346) | 241 (183-315) |
| 85 | 522 (455-600) | 458 (389-545) | 395 (319-490) | 356 (274-460) |

NOTE: Shown are model-based predictions of 1-year colorectal cancer incidence per 100,000 with 95% credible intervals in parentheses.

To provide further insight into CRC-SPIN assumptions, we predicted both lead time and overtreatment associated with removal of adenomas and preclinical cancers using a one-time perfect screening test at age 50. Lead time is defined for individuals who would have developed clinically detected colorectal cancer in the absence of screening, and is equal to the time from screen detection of a preclinical cancer to clinical detection in the absence of screening. We predicted an average lead time of 2.0 years (95% CI, 0.92-3.31). Overtreatment is defined as the removal of adenomas or preclinical cancer that would never have gone on to develop into clinically detected colorectal cancer. We predicted that 78.6% (95% CI, 73.6-82.7%) of individuals with an adenoma removed would never go on to develop invasive cancer. The predicted percentage of overtreated individuals decreased with the number of adenomas removed: one adenoma removed, 83.1% (95% CI, 80.0-86.0%); two adenomas removed, 71.4% (95% CI, 66.1-76.0%); three adenomas removed, 56.4% (95% CI, 46.9-66.7%). Similarly, the rate of overtreatment decreased as the size of the largest adenoma increased: <5 mm, 86.7% (95% CI, 83.2-90.0%); 5-<10 mm, 73.5% (95% CI, 68.4-78.1%); ≥10 mm, 57.7% (95% CI, 51.0-63.7%).

## Discussion

Microsimulation models are a valuable tool for combining expert opinion with observational and experimental results, and they offer a relatively inexpensive way to explore the impact of different interventions and policy changes on disease incidence and mortality. Because such models are complex, it is important to describe them in several ways that give insight into model assumptions.

We developed a relatively simple natural history model for colorectal cancer with disease progression that is a function of age, gender, and adenoma size and location. Our model was consistent with both clinical beliefs and available data. Even this relatively simple model contains 23 calibrated parameters. Because the characteristics of CRC-SPIN are not obvious from the model structure, we presented predicted transition probabilities, dwell times, and the predicted impact of screening to provide additional insight into CRC-SPIN assumptions.

Predicted transition probabilities can be used to compare the CRC-SPIN model with state transition models that report similar probabilities. For example, Pickhardt and colleagues (30) used a state transition model to examine the effectiveness of computed tomography colonography. Compared with the Pickhardt model, CRC-SPIN has lower one-year transition probabilities associated with new polyp development (one-year probability of transition into the adenoma state of 0.013 for a 60-year-old versus 0.019 for 50- to 60-year-olds and 0.033 for 60- to 70-year-olds), faster rates of adenoma growth (one-year probabilities of transition from small to medium adenomas of 0.061 versus 0.02, and one-year probabilities of transition from medium to large of 0.051 versus 0.02), and slower rates of transition to preclinical cancer (one-year probabilities of transition from large adenomas to preclinical cancer of 0.015 versus 0.03). These comparisons are approximate, because CRC-SPIN transition probabilities describe transition of individuals between states whereas Pickhardt and colleagues describe transition probabilities for individual adenomas.

Microsimulation models can be used to predict the impact of an intervention in a hypothetical target population. Our example examined the effect of a one-time screening colonoscopy, showing the use of our model in a very simple setting, with a one-time test applied to a cohort of individuals. Although our model accounts for between-individual differences in the underlying risk for developing an adenoma, we do not account for changes in risk that could be attributed to changes in modifiable risk factors, such as changes in diet, exercise, and smoking. Changes in risk factors could modify the impact of screening, through their effect on both colorectal cancer disease processes and risk for other-cause death. In addition, the predicted impact of screening will be influenced by the proportion of women in our cohort, because women are at lower risk than men for colorectal cancer. In general, screening with removal of adenomas will have a greater impact in higher-risk populations.

Comparison of the predicted impact of perfect screening relative to colonoscopy shows that the assumptions made about colonoscopy implementation can have a large effect on its predicted effectiveness. The effect of these assumptions adds difficulty to validation based on observational studies, such as our validation of the Brenner study (27). These results also have implications for a recent case-control study of colonoscopy, which concluded that colonoscopy may not provide protection against right-sided tumors (31). Our results support the idea that the findings by Baxter and colleagues (31) could be explained by the combined effect of higher lesion miss rates and lower colonoscopy completion rates in community clinical practice relative to a clinical trials setting.

The CRC-SPIN model offers a new method for estimating the comparative effectiveness of colorectal cancer screening tests and programs. We calibrated our model using a Bayesian approach, which allows both point and interval predictions, a major advantage over other existing calibration methods. The ability to incorporate information into the model through prior distributions is another advantage of the Bayesian calibration approach. However, this brings with it the potential for disagreement about specification of prior distributions. We specified informative prior distributions for adenoma risk parameters as an efficient method of incorporating prior information from an earlier meta-analysis (22). We specified uniform prior distributions for remaining parameters, incorporating prior beliefs about their plausible ranges. In the absence of data to inform a particular parameter, posterior distributions will not be updated and prior distributions will necessarily inform posterior estimates. If these prior distributions are diffuse uniform distributions, then interval estimates will be wide to reflect their uncertainty (15). Thus, the Bayesian model allows estimation in the face of uncertainty with estimates that reflect this uncertainty. This offers an important advantage over alternative calibration approaches that provide no method for communicating the precision of estimates.

As noted by Box and Draper (32), “all models are wrong, but some are useful.” Microsimulation models must balance model complexity against the availability of data to inform model parameters. In an effort to develop a model that is largely informed by observed data, we kept the CRC-SPIN model relatively simple, although this simplicity results in several limitations.

CRC-SPIN does not include risk factors (other than age and gender). Incorporating risk factors requires information about their actions. For example, models that include race as a risk factor must describe how race affects risk, that is, whether risk is increased through an elevation in adenoma risk, faster adenoma growth, a greater chance of adenoma transition to preclinical cancer, shorter sojourn time, or all or some combination of these effects. Although it is technically straightforward to incorporate risk factors into the CRC-SPIN model, we defer their inclusion until there are data to support these model extensions.

CRC-SPIN models progression based on adenoma size, but does not specify other adenomatous features associated with malignant potential. For example, advanced adenomas, one of the primary targets of colorectal cancer screening, are defined as those with size ≥10 mm, >25% villous component, or high-grade dysplasia (33). Although modeling of high-grade dysplasia is important for estimating the effectiveness of screening tests designed to detect dysplasia, simulation of villous components is less important as it is highly correlated with adenoma size (34). CRC-SPIN also assumes that adenoma size is measured without error. Recent studies have begun to quantify the amount of error in adenoma size measurement (35, 36). Given more information about errors in adenoma size measurements, screening modules used with CRC-SPIN could easily be modified to account for errors in size measurement in much the same way adenoma miss rates are incorporated into endoscopic exam modules.

Finally, CRC-SPIN assumes that all colorectal cancer arises from the adenoma-carcinoma pathway, that the distribution of adenomas in the colorectum does not vary by age, and that the probability of transition from adenoma to cancer does not vary by location within the colon. Although CRC-SPIN can be extended in a variety of ways, data are necessary to support these extensions. We will continue to examine model extensions to address these limitations as more data become available.

## Appendix

Prior distributions are used to incorporate prior information and beliefs about parameters into the CRC-SPIN model. The ability to incorporate information from prior studies via prior distributions is an important advantage of our Bayesian calibration approach.

Our model includes prior information from a meta-analysis of autopsy data describing adenoma prevalence by age and gender (22). The meta-analysis used a Poisson model with log-linear risk function that was similar to the CRC-SPIN adenoma risk model. Results from 14 autopsy studies were combined using a Bayesian approach that used minimally informative uniform prior distributions, and the analysis included careful assessment of model fit to autopsy data (details are provided by Rutter, Miglioretti, and Yu, 2007). For the CRC-SPIN model, we specified prior distributions based on results from this meta-analysis, with Λ ∼ Normal (−6.7,0.27) and α_{1} ∼ Normal (−0.3,0.04). The meta-analysis specified a single age effect, and we *a priori* assumed a constant age effect with α_{2k} ∼ Normal (0.03,0.003) for each of the *k* age effects.

Prior information for remaining model parameters was limited to expert opinion. For these parameters, we selected uniform prior distributions, with ranges selected to be wide enough to include plausible but unlikely values of parameters.

We specified a Uniform [0.05,3.0] prior distribution for between-individual variability in adenoma risk (σ), which allows the expected number of adenomas for an individual who is 2 SDs above average risk to range from 1.1 times the expected number of adenomas for an average-risk individual of the same age and gender (when σ = 0.05) up to more than 400 times the expected number of adenomas for an average-risk individual (when σ = 3.0).
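
The multipliers quoted for σ follow from a log-linear risk model. The sketch below assumes that an individual's deviation from average risk enters the log risk additively with standard deviation σ, so that being n SDs above average multiplies the expected number of adenomas by exp(nσ). That functional form and the helper name `risk_multiplier` are assumptions made for illustration.

```python
import math


def risk_multiplier(sigma: float, n_sd: float = 2.0) -> float:
    """Expected-adenoma multiplier for an individual n_sd SDs above average risk.

    Assumes individual risk is additive on the log scale with SD sigma.
    """
    return math.exp(n_sd * sigma)


# Endpoints of the Uniform [0.05, 3.0] prior for sigma.
for sigma in (0.05, 3.0):
    print(f"sigma = {sigma}: multiplier = {risk_multiplier(sigma):.1f}")
```

Under this assumption the lower endpoint gives a multiplier of about 1.1 and the upper endpoint gives exp(6), which is a little over 400, matching the range stated in the text.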

Prior distributions for adenoma growth parameters β_{1c}, β_{2c}, β_{1r}, and β_{2r} were selected to allow the median time for an adenoma to grow to 10 mm to range from 1.1 to 144.3 years with an interquartile range that can be as narrow as 0.4 years or as wide as 275.5 years. Specifically, we assumed β_{1c} and β_{1r} are *a priori* distributed Uniform [1,100] and β_{2c} and β_{2r} are *a priori* distributed Uniform [1,4].
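
The quoted ranges for time to 10 mm can be reproduced at the corners of the prior box under one assumption that this appendix does not state: that the time to reach 10 mm has a Fréchet (type 2 extreme value) distribution with cumulative distribution function F(t) = exp(−(t/β_1)^(−β_2)). The sketch below is illustrative only, and the function names are hypothetical.

```python
import math


def time_to_10mm_quantile(p: float, beta1: float, beta2: float) -> float:
    """Quantile of the assumed Frechet CDF F(t) = exp(-(t / beta1) ** (-beta2))."""
    return beta1 * (-math.log(p)) ** (-1.0 / beta2)


def median_and_iqr(beta1: float, beta2: float) -> tuple[float, float]:
    """Median and interquartile range of the time (years) to reach 10 mm."""
    q25 = time_to_10mm_quantile(0.25, beta1, beta2)
    q50 = time_to_10mm_quantile(0.50, beta1, beta2)
    q75 = time_to_10mm_quantile(0.75, beta1, beta2)
    return q50, q75 - q25


# Two corners of the prior box: fastest growth with least spread (beta1 = 1,
# beta2 = 4) and slowest growth with most spread (beta1 = 100, beta2 = 1).
for beta1, beta2 in ((1.0, 4.0), (100.0, 1.0)):
    median, iqr = median_and_iqr(beta1, beta2)
    print(f"beta1={beta1}, beta2={beta2}: median={median:.1f}, IQR={iqr:.1f}")
```

With this parameterization the two corners give medians of about 1.1 and 144.3 years and interquartile ranges of about 0.4 and 275.5 years, in line with the ranges stated above.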

Prior distributions for adenoma transition parameters γ_{1cm}, γ_{2cm}, γ_{1rm}, γ_{2rm}, γ_{1cf}, γ_{2cf}, γ_{1rf}, and γ_{2rf} were selected to limit the transition probability of small adenomas (based on expert opinion), to accommodate observed rates of invasive cancer by adenoma size reported by Nusko et al. (37), and to allow a wide range of variability. For example, cumulative transition probabilities of a 10-mm adenoma can range from 0.0006 to 0.08 for a 50-year-old, and up to 0.28 for a 70-year-old with the maximum age effect. Cumulative transition probabilities for a 20-mm adenoma can range from 0.03 to 0.50 at age 50, and up to 0.79 at age 70. Specifically, we assumed that γ_{1cm}, γ_{1rm}, γ_{1cf}, and γ_{1rf} are *a priori* distributed Uniform [0.02,0.05]. We assumed that γ_{2cm}, γ_{2rm}, γ_{2cf}, and γ_{2rf} are *a priori* distributed Uniform [0,0.02].

Prior distributions for sojourn time parameters μ_{c}, μ_{r}, τ_{c}, and τ_{r} were based on expert opinion and data from the Taiwan Multicenter Cancer Screening Project (38) and were specified to allow a broad range in both between-individual variability and expected sojourn time. The specified prior distributions allow the 10th percentile of sojourn time to range from 0.1 to 4.4 years, the 90th percentile of sojourn time to range from 0.6 to 11.1 years, and the interquartile range of sojourn time to range from 0.1 to 4.4 years. Specifically, we assumed μ_{c} and μ_{r} are *a priori* distributed Uniform [0.5,5] and τ_{c} and τ_{r} are *a priori* distributed Uniform [0.1,1.5].
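
The percentile ranges quoted for sojourn time can be checked under an assumed parameterization that this appendix does not spell out: sojourn time is lognormal with mean μ and standard deviation τμ, so that τ acts as a coefficient of variation. The sketch below is illustrative, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sojourn_quantile(p: float, mu: float, tau: float) -> float:
    """Quantile of a lognormal with mean mu and SD tau * mu (assumed form)."""
    log_var = math.log(1.0 + tau ** 2)       # variance on the log scale
    log_mean = math.log(mu) - log_var / 2.0  # mean on the log scale
    z = NormalDist().inv_cdf(p)              # standard normal quantile
    return math.exp(log_mean + z * math.sqrt(log_var))


# Corners of the prior box: mu in [0.5, 5] and tau in [0.1, 1.5].
print(f"10th percentile: {sojourn_quantile(0.10, 0.5, 1.5):.2f} "
      f"to {sojourn_quantile(0.10, 5.0, 0.1):.2f} years")
print(f"90th percentile: {sojourn_quantile(0.90, 0.5, 0.1):.2f} "
      f"to {sojourn_quantile(0.90, 5.0, 1.5):.2f} years")

# The widest interquartile range occurs at the largest mu and tau.
iqr_max = sojourn_quantile(0.75, 5.0, 1.5) - sojourn_quantile(0.25, 5.0, 1.5)
print(f"Largest interquartile range: {iqr_max:.2f} years")
```

Under this assumption the corner values agree with the stated ranges to the precision reported in the text (10th percentile up to about 4.4 years, 90th percentile up to about 11.1 years, interquartile range up to about 4.4 years).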

## Disclosure of Potential Conflicts of Interest

No potential conflicts of interest were disclosed.

## Acknowledgments

**Grant Support:** NCI U01 CA97427.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked *advertisement* in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.