Objective: Many new stool tests intended to detect neoplastic cells or cell products are currently being developed for colorectal cancer (CRC) screening. The aim of this study was to simulate a population-based screening setting in order to assess and compare the potential for early detection and prevention of CRC offered by screening with stool tests of different sensitivity and specificity and by screening with colonoscopy as the primary screening tool.
Method: A Markov model was developed to estimate the proportion of CRC cases that are detected early or prevented by screening, as well as the number of required stool tests and colonoscopies per early detected or prevented CRC case. Model outcomes were calculated for the offer of annual stool testing from age 55 to 74 in combination with colonoscopic follow-up of positive test results and for the offer of screening colonoscopy as a primary screening tool at ages 55 and 65. The long-lasting risk reduction after colonoscopy, which allows the removal of precancerous lesions, was taken into account quantitatively.
Results: For a variety of stool tests with different performance characteristics, the proportion of CRC cases early detected or prevented was estimated to be higher for stool testing in combination with colonoscopic follow-up of positive test results compared with screening colonoscopy assuming levels of compliance to be expected for the respective screening scheme. Optimizing performance characteristics of stool tests in terms of detecting precancerous lesions, in addition to those in terms of detecting CRC, seemed to be crucial for maximizing effectiveness of CRC screening with stool tests.
Conclusion: Screening based on new stool tests with colonoscopic follow-up of positive test results might offer a high potential for early detection or prevention of CRC.
Introduction
With more than 900,000 new cases and about 500,000 deaths per year, colorectal cancer (CRC) is the third most common malignancy in the world (1). Screening for CRC has a high potential to reduce morbidity and mortality of the disease, but there is an ongoing debate regarding the best and most effective screening method (2-4). Stool testing seems to be particularly advantageous with respect to achievable compliance rates and practicability. Therefore, many new tests based on molecular markers intended to detect neoplastic cells or cell products in stool are currently being developed and evaluated (5-9).
An important question in this context is which levels of sensitivity and specificity for detecting CRC are needed for these tests to be useful in a population-based screening setting. High levels of sensitivity seem to be highly desirable in any case. Specificity is a more complex issue, however, as colonoscopic follow-up of “false-positive” findings will allow the detection and removal of precancerous lesions which may also have a positive effect on CRC morbidity and mortality in the long run. For example, Lang and Ransohoff (10) suggested that at least one third to one half of the observed reduction in CRC mortality in the Minnesota trial, a randomized controlled trial aimed to investigate the use of fecal occult blood test (FOBT) screening (11), was attributable to the high false-positive rate in this study, and they concluded that it would make more sense simply to screen with colonoscopy.
The aim of this study was to develop a mathematical model that makes it possible to estimate and compare the potential for reducing the burden of CRC of two strategies: screening with stool tests of different sensitivity and specificity, followed by colonoscopic follow-up of positive results, and screening with colonoscopy as the primary screening method.
Materials and Methods
A Markov model was developed to estimate (a) the proportion of CRC cases that would be detected early or prevented by screening, and (b) the required number of examinations per early detected or prevented CRC case. Model calculations were carried out for two different screening strategies: the offer of annual stool screening from age 55 to 74 in combination with colonoscopic follow-up of positive test results (“scheme A”) and the offer of colonoscopy at ages 55 and 65 as the primary screening method (“scheme B”). Annual stool testing (based on FOBT) represents an essential part of many national guidelines for CRC screening (12-14). The offer of two colonoscopies as the primary screening method has been part of the German health care system since fall 2002, and screening colonoscopy every 10 years is recommended by the American Cancer Society (12). Model outcomes for scheme A were calculated for stool tests with different sensitivity and specificity. In this article, the terms “sensitivity” and “specificity” are used only in the context of stool testing, and they refer to the detection of CRC, not to the detection of precancerous lesions (adenomas). The term “risk reduction” is used to describe the potential of colonoscopy to allow both the early detection of CRC and the prevention of CRC by detecting and removing adenomas.
The model was given a longitudinal design in which the underlying population was a hypothetical cohort of people aged 55 with a balanced gender ratio. The cohort was assumed to be subject to the age- and sex-specific mortality rates (from all causes of death) in Germany in the year 2000, which were obtained from the German Federal Statistical Office. The number of CRC cases expected in the absence of screening was calculated from the sex- and age-specific incidence rates for CRC taken from the population-based cancer registry of Saarland, Germany, for the year 2000 (15). Table 1 gives an overview of the assumptions underlying the model and of the model variables.
The Markov model developed to assess scheme A is illustrated in Fig. 1. Among subjects with sex s eligible for the stool test at age i (“candidates”), a certain proportion (“compliance with stool testing”) was assumed to participate (base-case value: 40%), and it was assumed that compliance with stool testing does not change over time. The proportion of negative and positive test results among the participants was determined by the performance characteristics of the stool test as well as by the age- and sex-specific incidence rate. Because the performance characteristics reported for new stool tests vary from low to very high sensitivities and specificities (5-9), the whole range between 50% and 100% was entered into the model. It was assumed that a high proportion (base-case value: 90%) of positive test results are followed by colonoscopy with removal of early cancer stages and precancerous lesions. Because of the long-lasting reduction in the risk of developing CRC following colonoscopy (16, 17), individuals who had once undergone colonoscopy were no longer considered eligible for further screening with the stool test. Therefore, the candidates at age i + 1 consisted of the survivors among those candidates at age i who had not undergone colonoscopy (i.e., candidates at age i who did not participate in the stool test, who had a negative stool test, or who had a positive stool test that was not followed by colonoscopy). To account for people who, in principle, never participate in stool testing, the number of candidates at age 55 was reduced by 20% before progressing through the model as described in Fig. 1. Consequently, the participation rate for each screening round (base-case value: 40%) refers only to candidates who are, in principle, willing to participate in stool testing.
The assumption of the proportion of never-screened subjects is based on estimates from a large cohort study conducted in Germany among age group 55 to 74 years and corresponds to the proportion of people in this study who have never undergone FOBT until age 74 (despite an offer of annual FOBT screening for more than 20 years).
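The flow of candidates through scheme A can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are not part of the source: mortality is ignored (it is taken to apply equally to all subgroups), a single constant yearly CRC rate `crc_rate` stands in for the age- and sex-specific incidence rates, and the function names are hypothetical. The base-case values (20% never willing, 40% compliance per round, 90% follow-up) are those stated in the text.

```python
def positive_rate(sensitivity, specificity, crc_rate):
    """Share of participants with a positive stool test (true plus false positives)."""
    return crc_rate * sensitivity + (1.0 - crc_rate) * (1.0 - specificity)


def candidate_shares(sensitivity, specificity, rounds=20, never_willing=0.20,
                     test_compliance=0.40, followup_compliance=0.90,
                     crc_rate=0.002):  # crc_rate is a hypothetical placeholder value
    """Share of the original cohort still eligible for the stool test in each round."""
    pool = 1.0 - never_willing  # candidates who are, in principle, willing to participate
    shares = []
    for _ in range(rounds):
        shares.append(pool)
        positives = pool * test_compliance * positive_rate(sensitivity, specificity, crc_rate)
        # Only positives who actually undergo colonoscopy leave the candidate pool.
        pool -= positives * followup_compliance
    return shares


shares = candidate_shares(sensitivity=0.9, specificity=0.9)
print([round(x, 3) for x in shares[:3]])
```

The lower the specificity, the faster the pool of candidates shrinks, because more participants are referred to colonoscopy in every round.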
In the base-case analysis evaluating screening scheme B, it was assumed that 20% of the candidates take part in each offered screening colonoscopy (at age 55 and at age 65) and that utilization of both screening offers is independent. Thus, for the base model of scheme B, the proportion of never screened subjects amounts to 64%.
Based on recent investigations (16, 17), it was assumed that risk reduction following colonoscopy with removal of precancerous lesions lasts for at least 20 years in both screening schemes. In the base-case analysis, risk reduction was assumed to start at the maximum value of 100% in the year of colonoscopy and to decrease linearly by 2 percentage points per year (corresponding to a risk reduction of 80% and 60% after 10 years and 20 years, respectively).
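The assumed decline of risk reduction is a simple linear function of the time since colonoscopy. The following Python sketch reproduces the base-case values; the function name and the floor at zero are assumptions added for illustration.

```python
def risk_reduction(age_at_colonoscopy, age_now, annual_decrease=0.02):
    """Risk reduction at `age_now` after a colonoscopy at `age_at_colonoscopy`.

    Starts at 1.0 (100%) in the year of colonoscopy and falls linearly.
    The floor at 0.0 is an added safeguard for long follow-up times.
    """
    years = age_now - age_at_colonoscopy
    if years < 0:
        raise ValueError("age_now must not be earlier than the colonoscopy")
    return max(0.0, 1.0 - annual_decrease * years)


# 10 and 20 years after a colonoscopy at age 55 (about 0.8 and 0.6).
print(round(risk_reduction(55, 65), 2), round(risk_reduction(55, 75), 2))
```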
For both screening schemes, the proportion P of early detected or prevented CRC cases within age range 55 to 74 and the required number of examinations (stool tests, colonoscopies) per early detected or prevented CRC case were estimated as a function of the various model variables as outlined in Appendix 1.
To assess robustness of the model regarding the various variables, one-way sensitivity analyses were done in which values of screening participation, target age for screening, and risk reduction after colonoscopy were varied in a systematic manner. For scheme A, the sensitivity analyses were exemplified by performance characteristics of the FOBT (Hemoccult II: sensitivity = 37.1%, specificity = 97.7%; ref. 18).
Results
Table 2 shows the results of the base model and the one-way sensitivity analyses for the offer of screening colonoscopy at ages 55 and 65 (scheme B). Under the assumptions of the base model (20% compliance, target age range 55-74, annual decrease of risk reduction after colonoscopy 2%) the proportion of early detected or prevented CRC cases amounts to 25.2% requiring 45 colonoscopies per early detected or prevented CRC case. The proportion of early detected or prevented CRC cases strongly depends on the compliance and to a much lesser degree on the target age range for screening and on the annual decrease of risk reduction after colonoscopy. The latter variables have a somewhat stronger effect on the number of colonoscopies per early detected or prevented CRC case.
For the offer of annual stool tests with different sensitivity and specificity (scheme A), Table 3A-C shows the outcomes of the base model: the proportion P of early detected or prevented CRC cases, the number of required stool tests per early detected or prevented CRC case, and the number of required colonoscopies per early detected or prevented CRC case, respectively. More detailed tables including a larger number of variable values are available from the author.
For stool tests with a specificity of 100% and a sensitivity of 80%, 90%, or 100%, the proportion P of early detected or prevented CRC cases amounts to 22.9%, 25.8%, and 28.6%, with 748, 665, and 598 stool tests per early detected or prevented CRC case, respectively. With a specificity of 100%, the minimum number of colonoscopies is carried out (i.e., only those following a true-positive test result).
Higher values of P are achieved with less than perfect specificity, which is associated with a higher number of colonoscopies following false-positive test results and thus with the incidental detection and removal of adenomas. For example, for a sensitivity of 80%, 90%, or 100%, the outcome P increases to 27.1%, 29.7%, and 32.3%, respectively, if specificity is 97.5%, and to 37.1%, 39.0%, and 41.0%, respectively, if specificity is only 90%. However, the higher proportion of early detected or prevented CRC comes at the price of higher numbers of colonoscopies per early detected or prevented CRC case. For the sensitivities mentioned above and a specificity of 90%, each early detected or prevented CRC case requires 31, 29, and 28 colonoscopies, respectively, which is more than twice the number needed with a test with a specificity of 97.5%. A further decrease of specificity down to 50% increases the proportion of early detected or prevented CRC to 50% and more. At the same time, the number of stool tests per early detected or prevented CRC case decreases, whereas the number of colonoscopies stays more or less constant at about 40. For example, for a stool test with a sensitivity and a specificity of 50% (i.e., a test that would be no better than flipping a coin), the proportion of early detected or prevented CRC amounts to 54.6%, and 92 stool tests and 41 colonoscopies are required for each early detected or prevented CRC case.
Table 4 shows the results of the base model and the one-way sensitivity analyses for the offer of annual stool testing with FOBT (Hemoccult II, sensitivity 37.1%, specificity 97.7%; ref. 18) in combination with colonoscopic follow-up of positive test results (scheme A). The proportion of early detected or prevented CRC cases amounts to 15.6%, and each early detected or prevented CRC case requires 1,019 stool tests and 22 colonoscopies. These values are based on the assumption that FOBT has no potential to detect precancerous lesions in addition to its potential to detect CRC, which may lead to a certain underestimation of the true benefit. The sensitivity analyses confirm the robustness of the model with respect to variations in the compliance with follow-up colonoscopy, the target age range, and the assumed annual decrease of risk reduction following colonoscopy. By contrast, compliance with stool testing seems to be of crucial importance. The variations mentioned above also affect the number of stool tests per early detected or prevented CRC case to some extent, whereas the number of colonoscopies is more or less constant.
Discussion
New stool tests based on molecular markers are regarded as promising tools for future CRC screening. Performance characteristics are now reported for an increasing number of the new tests. At the same time, some countries, including Germany, recently have introduced the offer of colonoscopy as a primary screening tool. Regarding these developments in research and practice of CRC screening, a model simulating CRC screening in a population-based screening setting, either based on stool tests with different performance characteristics (with follow-up of positive results by colonoscopy) or based on colonoscopy as a primary screening tool, may be useful to compare the potentials of the various screening strategies for early detection or prevention of CRC.
The results of the Markov model simulating annual stool testing clearly showed that a high sensitivity for detecting CRC is advantageous to maximize the proportion of early detected or prevented CRC cases. Specificity, which determines the number of colonoscopies following false-positive test results, is a much more complex issue, however. For example, the proportion of early detected or prevented CRC cases would be higher for a test with 90% sensitivity and 90% specificity than for a test with 90% sensitivity and 100% specificity. The higher outcome for a test with only 90% specificity is due to the fact that a false-positive rate of 10% would, by chance, identify 10% of the test participants bearing adenomas. The possibility of removing these precancerous lesions during follow-up colonoscopy results in the prevention of CRC cases that would occur in the absence of screening. At the same time, however, 10% of those test participants having neither adenomas nor CRC are unnecessarily prompted to undergo follow-up examination.
Consequently, the higher yield of early detected or prevented CRC achieved with lower levels of specificity comes at the price of an increasing number of required follow-up colonoscopies. Once specificity is reduced to a certain level, however, the higher number of colonoscopies is balanced by the growing screening benefit (i.e., the number of colonoscopies per early detected or prevented CRC case stays more or less constant). Although the model considered a gradual decrease of risk reduction after colonoscopy, the potential need for surveillance colonoscopy after polypectomy was not explicitly taken into account. If the latter were integrated into the model, the proportion of early detected or prevented CRC would be even higher, but the number of required colonoscopies would also increase.
Currently, information about the performance characteristics of the new stool tests is mostly restricted to the potential of identifying CRC patients. Therefore, in the model, sensitivity and specificity of the stool test were defined with respect to the detection of CRC, and it was assumed that the stool test does not have additional potential to distinguish subjects with and without adenomas. In this case, increasing the false-positive rate represents the only way to extend screening to the detection of adenomas. The very favorable model outcomes for tests with a higher false-positive rate clearly emphasize the necessity of extending CRC screening to the detection of adenomas. Nevertheless, simply increasing the false-positive rate is a rather poor and arbitrary method to identify people bearing adenomas. Ideally, the stool test would select “false-positive” patients with adenomas, preferably only those with clinically relevant adenomas, for colonoscopic follow-up. Any potential of a stool test to detect clinically relevant adenomas would further increase the efficacy of the screening. Thus, optimizing performance characteristics of stool tests in terms of detecting precancerous lesions, in addition to those in terms of detecting CRC, seems to be crucial for maximizing effectiveness of CRC screening with stool tests.
If it is not possible to optimize stool tests with respect to the detection of adenomas, however, the model results point out that less than perfect specificity might be an alternative to increase the benefit of screening. Then, of course, the effect of stool testing is to a large extent achieved by leading a large proportion of the population to colonoscopic examination, which offers the opportunity of preventive removal of precancerous lesions.
The effect of accidental detection of adenomas resulting from a higher false-positive rate on the screening benefit is not only theoretical but was also discussed in connection with FOBT. It may have contributed to the observed reduction of CRC mortality in the Minnesota trial although estimations regarding the extent of this effect varied (10, 19). Lang and Ransohoff (10) concluded that it would make better sense to simply screen with colonoscopy in the first place. However, the results of the model simulating the offer of screening colonoscopy at ages 55 and 65 showed that substantial proportions of CRC cases would be early detected or prevented at rather high participation rates only, which seem to be difficult if not impossible to achieve in practice. So far, data regarding compliance with screening colonoscopy are scarce. In Bavaria, a state located in South Germany, the participation rate was only about 2% of those eligible in the first year after the offer of screening colonoscopy was introduced (20) although intensive publicity campaigns took place.
According to screening studies with FOBT, adherence to stool testing in combination with colonoscopic follow-up of positive test results is expected to be higher (21). Nevertheless, the results of the sensitivity analyses clearly point out that participation rates are also crucial for the screening benefit of stool testing and, thereby, they underline the necessity of finding methods to improve compliance with screening in general. In order to simulate a realistic screening setting rather than an optimized screening setting, compliance with stool testing in the base model was assumed to be lower than what was obtained in clinical trials (21).
The model presented here is limited to a very small number of assumptions and is therefore well suited for a general comparison of different stool tests and of different screening strategies. Other models estimating the clinical and economic consequences of different CRC screening methods (22, 23) are often based on numerous assumptions (e.g., regarding the natural history of the disease). However, evidence for the latter is scarce, and the relevant studies were published a long time ago. In our model, the main focus of calculating the outcome lies on compliance, and barriers to screening adherence varying by screening strategy are taken into account. Thus, the model may be close to screening reality, and it is transferable to other scenarios. The comparison of the screening methods is not affected by the assumptions that are common to both screening schemes. On the other hand, the model outcome, the proportion of early detected or prevented CRC cases, is only a crude estimate of the screening benefit and does not allow the reduction of incidence and mortality to be quantified, which would be necessary to estimate the cost-effectiveness of screening.
In conclusion, the results of the model simulating CRC screening showed that the offer of new stool tests in combination with colonoscopic follow-up of positive test results may exceed the potential of screening with colonoscopy as the primary screening tool for which high levels of compliance are more difficult to achieve. Stool tests allowing selective detection of clinically relevant adenomas, in addition to CRC, would represent the best tools for CRC screening by focusing colonoscopic follow-up on those at highest risk of developing CRC.
Appendix 1
For each age $i$ and sex $s$, the proportion $P_{s,i}$ of early detected or prevented CRC cases was estimated assuming that screening participation among individuals who develop CRC and among the general population is the same. The model was based on this cautious assumption although there is some evidence that, for example, compliance is higher among subjects with a family history of CRC (24, 25) or among subjects with a higher self-perceived risk for CRC (24).
For scheme A, the proportion $P_{s,i}$ is composed of those CRC cases which are detected by stool testing in combination with colonoscopic follow-up at age $i$ and those CRC cases which are prevented due to a former false-positive test result leading to the removal of precancerous lesions. $P_{s,i}$ was calculated as follows for scheme A (with $55 \le i \le 74$):
where $s$ denotes sex, $\mathrm{Can}_{s,i}$ denotes subjects with sex $s$ eligible for the test at age $i$, $\mathrm{Can}_{s,55}$ denotes candidates with sex $s$ at age 55 (original cohort), $\mathrm{Com}_t$ denotes compliance with stool testing, $\mathrm{Se}$ denotes sensitivity of the stool test, $\mathrm{Com}_c$ denotes compliance with colonoscopic follow-up after positive test result, $\mathrm{FP}_{s,j}$ denotes subjects with sex $s$ who had false-positive test result at age $j$, $\mathrm{Sur}_{s,(a_1,a_2)}$ denotes sex-specific population cumulative survival rate between ages $a_1$ and $a_2$, and $R_{j,i} = 1 - 0.02 \cdot (i - j)$ denotes risk reduction at age $i$ after colonoscopy at age $j$.
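One possible reading of these definitions is sketched below in Python. It is an illustration, not the authors' equation, and it rests on several assumptions: survival terms are dropped because mortality is taken to be the same in all subgroups (so they cancel in the proportion), a single constant yearly CRC rate `crc_rate` replaces the age- and sex-specific incidence, only one sex is modelled, and any later protection of people whose colonoscopy followed a true-positive test is ignored. All function and parameter names are hypothetical.

```python
def scheme_a_proportions(sensitivity, specificity, start_age=55, end_age=74,
                         never_willing=0.20, test_compliance=0.40,
                         followup_compliance=0.90, annual_decrease=0.02,
                         crc_rate=0.002):  # crc_rate is a hypothetical placeholder value
    """Per-age share of CRC cases detected early or prevented under scheme A."""
    pool = 1.0 - never_willing   # candidates as a share of the original cohort
    scoped_after_fp = {}         # age -> cohort share with colonoscopy after a false positive
    result = {}
    for age in range(start_age, end_age + 1):
        tested = pool * test_compliance
        # Cases found at this age: tested, test positive, colonoscopy done.
        detected = tested * sensitivity * followup_compliance
        # Cases avoided at this age thanks to earlier colonoscopies after false positives.
        prevented = sum(share * max(0.0, 1.0 - annual_decrease * (age - j))
                        for j, share in scoped_after_fp.items())
        result[age] = detected + prevented
        false_pos = tested * (1.0 - crc_rate) * (1.0 - specificity) * followup_compliance
        true_pos = tested * crc_rate * sensitivity * followup_compliance
        scoped_after_fp[age] = false_pos
        pool -= false_pos + true_pos   # people with a colonoscopy leave the pool
    return result


perfect = scheme_a_proportions(sensitivity=1.0, specificity=1.0)
leaky = scheme_a_proportions(sensitivity=0.9, specificity=0.9)
print(round(perfect[55], 3), round(leaky[55], 3), round(leaky[74], 3))
```

Under these assumptions a test with perfect sensitivity and specificity gives 0.8 × 0.4 × 1.0 × 0.9 = 0.288 at age 55, which is close to the 28.6% reported for the whole age range. With 90% specificity the per-age proportion rises with age as colonoscopies after false-positive results accumulate.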
Using the notations introduced above, the proportion $P_{s,i}$ of early detected or prevented CRC cases for scheme B was calculated for each age $i$ as follows:
where $\mathrm{ScrCol}_{s,55}$ and $\mathrm{ScrCol}_{s,65}$ denote the sex-specific number of subjects who underwent their last screening colonoscopy at age 55 or at age 65 in the base model.
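For scheme B, a comparable sketch follows from the base-case assumptions of 20% compliance with each offer and independent utilization of the two offers. As before, this is an illustrative reading rather than the authors' equation: survival terms are dropped on the assumption that mortality is the same for screened and unscreened subjects, and the function name is hypothetical.

```python
def scheme_b_proportions(compliance=0.20, first_offer=55, second_offer=65,
                         end_age=74, annual_decrease=0.02):
    """Per-age share of CRC cases detected early or prevented under scheme B."""
    def risk_reduction(j, i):
        return max(0.0, 1.0 - annual_decrease * (i - j))

    last_at_second = compliance                      # took part at 65 (whatever happened at 55)
    last_at_first = compliance * (1.0 - compliance)  # took part at 55 but not at 65
    result = {}
    for age in range(first_offer, end_age + 1):
        if age < second_offer:
            result[age] = compliance * risk_reduction(first_offer, age)
        else:
            result[age] = (last_at_second * risk_reduction(second_offer, age)
                           + last_at_first * risk_reduction(first_offer, age))
    return result


b = scheme_b_proportions()
print(round(b[55], 3), round(b[65], 3), round(sum(b.values()) / len(b), 3))
```

The unweighted mean over ages 55 to 74 is about 0.24 in this sketch. The source weights each age by the expected number of CRC cases, which rises with age, and reports 25.2% for the base model.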
For both screening schemes the proportion P of all early detected or prevented CRC cases within the entire age range from 55 to 74 years was calculated as follows:
where the denominator yields the number of CRC cases which would occur in the absence of screening and the numerator yields the number of CRC cases which would be early detected or prevented due to screening.
The number of examinations per early detected or prevented CRC case was calculated by dividing the sum of required stool tests and/or colonoscopies for both genders by the number of CRC cases which would be early detected or prevented due to screening.
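The two summary outcomes can be sketched as follows in Python. The per-stratum proportions and expected case numbers used in the example are made-up values for illustration only, and the function names are hypothetical; the strata would be the combinations of sex and age from 55 to 74.

```python
def overall_proportion(p_by_stratum, expected_cases_by_stratum):
    """Case-weighted proportion of CRC cases detected early or prevented.

    Numerator: cases detected early or prevented by screening.
    Denominator: cases expected in the absence of screening.
    """
    numerator = sum(p_by_stratum[k] * expected_cases_by_stratum[k]
                    for k in expected_cases_by_stratum)
    denominator = sum(expected_cases_by_stratum.values())
    return numerator / denominator


def examinations_per_case(total_examinations, cases_detected_or_prevented):
    """Number of stool tests or colonoscopies per early detected or prevented case."""
    return total_examinations / cases_detected_or_prevented


# Hypothetical two-stratum example: keys are (sex, age).
p = {("f", 55): 0.2, ("m", 74): 0.4}
cases = {("f", 55): 100.0, ("m", 74): 300.0}
print(round(overall_proportion(p, cases), 3))
print(examinations_per_case(4500.0, 100.0))
```

Weighting by expected cases means that ages with higher incidence contribute more to P than a simple average over ages would suggest.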