Period analysis has been shown to provide more up-to-date estimates of cancer survival than traditional methods of survival analysis. There is, however, a tradeoff between up-to-dateness and precision of period survival estimates: increasing up-to-dateness by restricting the analysis to a relatively short period, such as the most recent calendar year, goes along with loss of precision. Recently, a model-based approach was proposed, in which more precise period survival estimates for the most recent year can be obtained through modeling of survival trends within a recent 5-year period. We assess the possibilities of extending the time window used for modeling to obtain even more precise, but equally accurate and up-to-date, estimates of prognosis. Empirical evaluation using data from the Finnish Cancer Registry shows that extension of the time window to about 10 years provides, in most cases, results as accurate as those obtained with a 5-year time window (whereas further extension may lead to considerably less accurate results in some cases). Using 10-year time windows for modeling, SEs of survival estimates can be approximately halved compared with conventional period survival estimates for the most recent calendar year. Furthermore, we present a modification of the modeling approach, which allows extension to 10-year time windows to be achieved without the need to include additional cohorts of patients diagnosed a longer time ago and which provides similarly accurate survival estimates at comparable levels of precision in most cases. Our analyses indicate opportunities to further maximize benefits of model-based period analysis of cancer survival. (Cancer Epidemiol Biomarkers Prev 2007;16(8):1675–81)

Period analysis, a new method of survival analysis introduced 10 years ago (1), has been shown to provide more up-to-date cancer survival estimates than traditional methods of survival analysis (2-6), and the method has gained increasing popularity in recent years (e.g., refs. 7-16). The principle of period analysis has been described in detail elsewhere (1, 17). Briefly, it consists of restricting the analysis to the survival experience of cancer patients in some recent time period, which is achieved by left truncation of observations at the beginning of that time period (in addition to right censoring of observations at its end). With conventional application of period analysis, there is a tradeoff between up-to-dateness and precision of survival estimates: increasing up-to-dateness by restricting the analysis to a relatively short recent time period, such as the most recent calendar year for which cancer registry data are available, goes along with a loss of precision. We recently proposed a model-based approach, in which much more precise period estimates of survival for the most recent single calendar year can be obtained through modeling of survival trends within a time window encompassing the most recent 5 years (18). The aim of this article is to assess the possibilities to extend the time window used for modeling to come up with even more precise but equally accurate and up-to-date estimates of prognosis of most recently diagnosed cancer patients to maximize the benefits of model-based period analysis of cancer patient survival.

Database

Our analysis is based on data from the nationwide Finnish Cancer Registry, which covers a population of ∼5 million people and which is well known for its high levels of completeness and data quality (19). At the time of this analysis, the database encompassed patients diagnosed within more than half a century from 1953 to 2004, with a follow-up with respect to vital status until the end of 2004. In this analysis, we included patients ages ≥15 years with a first diagnosis with one of 20 common forms of cancer between 1953 and 1999.

Statistical Analysis

Throughout this article, we present relative rather than absolute survival rates because the former are the measures of prognosis most commonly reported by population-based cancer registries. Relative survival rates reflect the probability of surviving the cancer of interest rather than the total survival probability (20, 21), taking expected deaths in the absence of cancer into account. For this analysis, the expected numbers of deaths were derived from age-, gender-, and calendar period–specific mortality figures of the general population of Finland according to the so-called Ederer II method (22).
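As a minimal illustration of the definition above, cumulative relative survival can be sketched as the ratio of cumulative observed survival to cumulative expected survival. The function name and the input values below are hypothetical and are not taken from the registry data; the derivation of expected survival from life tables (Ederer II) is not reproduced here.

```python
from math import prod


def cumulative_relative_survival(observed_conditional, expected_conditional):
    """Cumulative relative survival from interval-specific survival probabilities.

    observed_conditional: observed survival probability of the patients in each
        follow-up year, conditional on being alive at the start of that year.
    expected_conditional: matching expected survival probabilities derived from
        general-population mortality (assumed to be supplied by the caller).
    """
    if len(observed_conditional) != len(expected_conditional):
        raise ValueError("need one expected value per follow-up year")
    return prod(observed_conditional) / prod(expected_conditional)


# Hypothetical values for five follow-up years (illustration only).
observed = [0.80, 0.90, 0.93, 0.95, 0.96]
expected = [0.97, 0.97, 0.96, 0.96, 0.95]
print(round(100 * cumulative_relative_survival(observed, expected), 1))
```

Because expected survival is below 1, the relative survival rate is always somewhat higher than the corresponding absolute survival rate.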

In a first step, actual 5-year relative survival of patients diagnosed in 1995-1999 and followed through 2004 was compared with the most up-to-date estimates of 5-year relative survival that would potentially have been available by the end of 1997 (the median year of diagnosis of this cohort) by the following methods of survival analysis (see Fig. 1): first, a “conventional” period analysis for the year 1997 only; second, a “modeled” period analysis, in which a period estimate of 5-year relative survival for 1997 is derived by trend analysis from a time window of 5 years (1993-1997), 10 years (1988-1997), or 15 years (1983-1997).

Figure 1.

Database used for conventional and modeled period estimates of 5-y relative survival for the year 1997 (solid frames) and database used for calculating actual 5-y relative survival of patients diagnosed 1995-1999 (dotted frame). The numbers within the cells indicate the years following diagnosis.

The modeling approach, which has previously been described in detail for application with a 5-year time window (18), is outlined in Appendix 1. Briefly, survival probabilities are modeled for each combination of calendar year and year of follow-up within the specified time window. For that purpose, numbers of patients at risk and of deaths by year of follow-up are first calculated for each single calendar year within the specified time window. Next, a Poisson regression model for relative survival is used, in which the numbers of deaths for each combination of calendar year and year of follow-up are modeled as a function of calendar year (included as a numerical predictor variable) and year of follow-up (included as a categorical predictor variable). The logarithm of the person-years at risk is used as offset, and late entries and withdrawals are accounted for as half-persons. The model assumes a linear trend for the conditional survival estimates within the time periods used for modeling. This trend is assumed to be the same for each year of follow-up, but, like in conventional, nonparametric life table analysis, no specific shape of the survival curve is assumed. Conditional survival probabilities for each year of follow-up, 5-year cumulative period survival estimates, and their SEs are derived from the model results, as previously described (18).
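One way to write down a model of this kind is sketched below. The notation is ours, and the excess-mortality link and the conversion to survival probabilities are assumptions borrowed from common Poisson formulations of relative survival; the exact specification is the one given in Appendix 1 and ref. 18, which is not reproduced here. Let $d_{ij}$ be the number of deaths, $d^{*}_{ij}$ the expected number of deaths, and $y_{ij}$ the person-years at risk in calendar year $t_i$ and follow-up year $j$.

```latex
d_{ij} \sim \mathrm{Poisson}(\mu_{ij}), \qquad
\ln\!\left(\mu_{ij} - d^{*}_{ij}\right) = \ln(y_{ij}) + \alpha_j + \beta\,(t_i - t_0),
\qquad j = 1,\dots,5

r_j(T) = \exp\!\left\{-\exp\!\left[\alpha_j + \beta\,(T - t_0)\right]\right\},
\qquad
R_5(T) = \prod_{j=1}^{5} r_j(T)
```

Here $\alpha_j$ is the categorical effect of follow-up year $j$, $\beta$ is the linear calendar-year trend shared by all follow-up years, $t_0$ is an arbitrary reference year, and $T$ is the most recent calendar year of the window (1997 in Fig. 1). Under these assumptions, $r_j(T)$ is the modeled conditional relative survival in follow-up year $j$ and $R_5(T)$ the modeled 5-year period estimate.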

The model-based approach provides a general framework that encompasses conventional cohort or period analyses as special cases of applications of saturated models. For example, a conventional period estimate of 5-year survival for 1997 can be obtained from a saturated model, in which the period of interest includes just one calendar year (i.e., 1997). This way, only five observations are included in the regression model, from which five parameters are estimated (one for each year of follow-up, none for calendar year). To ensure perfect comparability of results for conventional and modeled period analysis, we derived the former as special case from a saturated model by the same computer programs used for the modeling approach in our analysis.

Next, the analyses described in the previous paragraphs were repeated for each single calendar year from 1972 to 1997, the widest possible range of years for which pertinent calculations could be carried out with the data available. In addition, we carried out pertinent analyses varying the time windows by 1-year steps between 1 and 15 years. This allowed us to assess the performance of the various methods in a much broader range of settings. We calculated the following summary indicators of the performance of the various methods: the mean difference and the mean square difference between the various estimates of 5-year relative survival potentially available in the respective year and 5-year relative survival later observed for patients diagnosed in the 5-year calendar intervals around that year (5-year intervals were chosen to limit the role of random variation). The mean differences reflect the average underestimation or overestimation of the 5-year relative survival. In addition, the mean square differences reflect, among other factors, the random variation in the various estimates.
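The two summary indicators can be sketched as follows. Function names and input values are hypothetical; differences are taken as estimate minus later observed survival, so that negative mean differences indicate estimates that were too pessimistic.

```python
def mean_difference(estimates, observed):
    """Average of (estimate - later observed survival), in percent units."""
    diffs = [e - o for e, o in zip(estimates, observed)]
    return sum(diffs) / len(diffs)


def mean_square_difference(estimates, observed):
    """Average squared difference; grows with both bias and random variation."""
    squares = [(e - o) ** 2 for e, o in zip(estimates, observed)]
    return sum(squares) / len(squares)


# Hypothetical 5-year relative survival values (%) for three calendar years.
period_estimates = [50.0, 52.0, 49.0]
later_observed = [51.0, 53.0, 52.0]
print(mean_difference(period_estimates, later_observed))
print(mean_square_difference(period_estimates, later_observed))
```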

Extension of the time window included in modeling from 5 to 10 years or from 10 to 15 years requires additional inclusion of patients diagnosed a longer time ago. For example, with the database shown in Fig. 1, patients diagnosed from 1983 on or from 1978 on would have to be included in modeling using 10- and 15-year time windows, whereas modeling using a 5-year time window could be achieved with a database including patients diagnosed from 1988 on only. Extension of time windows might therefore be difficult for “younger” cancer registries with less long-standing time series of registration. We therefore additionally evaluated “abbreviated-period” modeling using 10- and 15-year time windows, but relying on the same cohorts of patients needed for “full-period” modeling using 5- and 10-year time windows, respectively, as illustrated in Fig. 2. In addition, abbreviated-period modeling using a 5-year window was also used, which requires a minimum number of 5 one-year cohorts for analysis. Compared with full-period modeling, some of the survival experience in the later years of follow-up (which would have to come from the “older cohorts”) is left out, whereas the database for the early years of follow-up, where the vast majority of cancer deaths occurs, is essentially the same.
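The data requirements of the two variants can be summarized in a short sketch. The function name and parameters are hypothetical; the rule for abbreviated-period modeling (only cohorts diagnosed within the time window itself are used) is our reading of the examples given in the text.

```python
def earliest_diagnosis_year(target_year, window_years, follow_up_years=5,
                            abbreviated=False):
    """Earliest year of diagnosis needed to model survival for target_year.

    Full-period modeling needs cohorts reaching back follow_up_years before
    the start of the time window; abbreviated-period modeling is assumed to
    use only cohorts diagnosed within the window itself.
    """
    window_start = target_year - window_years + 1
    if abbreviated:
        return window_start
    return window_start - follow_up_years


for window in (5, 10, 15):
    print(window,
          earliest_diagnosis_year(1997, window),
          earliest_diagnosis_year(1997, window, abbreviated=True))
```

For the 1997 example, this reproduces the cohorts named in the text: from 1988, 1983, and 1978 on for full-period modeling with 5-, 10-, and 15-year windows, and from 1988 and 1983 on for abbreviated-period modeling with 10- and 15-year windows.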

Figure 2.

Database used for conventional and modeled period analyses of 5-y relative survival for the year 1997, applying abbreviated-period modeling for the 5-, 10-, and 15-y time windows (solid frames), and database used for calculating actual 5-y relative survival of patients diagnosed 1995-1999 (dotted frame). The numbers within the cells indicate the years following diagnosis.

The analyses were carried out using the SAS statistical software package. For all survival analyses, the macro period was used to derive the numbers of patients at risk and of deaths by year of follow-up and by calendar year (17, 23). Some minor formal modifications of the output were made to facilitate the subsequent steps. Next, the procedure GENMOD was used to carry out Poisson regression, and the output of the regression models was used to carry out the subsequent calculations, as previously described (18).
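The tabulation step can be illustrated with a small Python sketch. This is not the SAS macro: it uses exact person-time per cell instead of the actuarial counts with half-persons described above, and the function name, the decimal-year time scale, and the example patients are all hypothetical.

```python
from collections import defaultdict


def tabulate_period_cells(patients, first_year, last_year, max_follow_up=5):
    """Person-years and deaths by (calendar year, year of follow-up).

    patients: iterable of (diagnosis_time, exit_time, died), with times given
        in decimal calendar years. Follow-up outside the window
        first_year..last_year is ignored (left truncation and right censoring).
    """
    person_years = defaultdict(float)
    deaths = defaultdict(int)
    for diagnosis, exit_time, died in patients:
        for k in range(max_follow_up):
            fu_start, fu_end = diagnosis + k, diagnosis + k + 1
            for year in range(first_year, last_year + 1):
                # Overlap of the follow-up year, the calendar year, and the
                # time the patient was actually under observation.
                start = max(fu_start, float(year))
                end = min(fu_end, float(year + 1), exit_time)
                if end <= start:
                    continue
                person_years[(year, k + 1)] += end - start
                if died and fu_start <= exit_time < fu_end and exit_time < year + 1:
                    deaths[(year, k + 1)] += 1
    return person_years, deaths


# Hypothetical patients: (diagnosis, end of follow-up, died).
patients = [
    (1995.5, 1997.25, True),   # dies in 1997, in the 2nd year of follow-up
    (1990.0, 2000.0, False),   # beyond 5 years of follow-up by 1997
    (1996.0, 1999.0, False),   # 2nd year of follow-up covers all of 1997
    (1996.5, 1997.75, False),  # censored in 1997, spans follow-up years 1 and 2
]
person_years, deaths = tabulate_period_cells(patients, 1997, 1997)
for cell in sorted(person_years):
    print(cell, person_years[cell], deaths[cell])
```

Restricting the window to a single calendar year, as in this example, yields the five cells used by a conventional period analysis; wider windows add one set of cells per calendar year.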

Overall, 639,011 patients ages ≥15 years were reported to the Finnish Cancer Registry with a first diagnosis of cancer between 1953 and 1999. Of these, we excluded 2.6% notified by death certificate only, another 2.5% notified by autopsy only, and 0.1% due to missing information on month of diagnosis. The 20 forms of cancer specifically addressed in this article include 87.1% of the remaining cancer cases.

The numbers of notifications of patients with these 20 forms of cancer in 1995-1999, as well as the actual 5-year relative survival later observed for these patients, are shown in Table 1. In addition, Table 1 shows the estimates of 5-year relative survival potentially available in 1997, the median year of the 1995-1999 interval, by the different analytic approaches (ignoring delay in cancer registration). The point estimates obtained by the conventional period analysis and the different variants of modeled period analysis were, in general, quite similar. In particular, the model-based estimates were, on average, about as up-to-date as conventional period estimates, regardless of the length of the time window included in the modeling. The modeled period estimates were even closer (or as close) to the later observed survival estimates in most cases. This was true for 13, 15, and 13 of 20 cancers with full-period modeling and for 11, 15, and 13 of 20 cancers with abbreviated-period modeling, using 5-, 10-, and 15-year time windows, respectively (Table 1, bold figures). The SEs were much lower for the modeled period estimates and they decreased with increasing length of the time window used for modeling. With full-period modeling using 5-, 10-, and 15-year windows, SEs were, on average, about one third, almost one half, and more than one half lower than the SEs of estimates obtained from conventional period analysis for 1997. SEs were substantially higher for abbreviated-period modeling than for full-period modeling in case of 5-year time windows, but differences were almost negligible for 10- or 15-year time windows.
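The average gains in precision quoted above can be checked against the SEs in Table 1 with a short sketch. The values are transcribed from Table 1 for conventional period analysis and full-period modeling; the column assignment is our reading of the table, and averaging the per-cancer SE ratios is one of several possible ways to summarize the reduction.

```python
# SEs from Table 1: (conventional, 5-y full, 10-y full, 15-y full period).
TABLE1_SE = {
    "Oral cavity": (2.9, 2.1, 1.6, 1.4),
    "Esophagus": (3.2, 1.6, 1.1, 1.0),
    "Stomach": (1.7, 1.2, 0.9, 0.7),
    "Colon": (1.9, 1.3, 1.0, 0.8),
    "Rectum": (2.3, 1.7, 1.3, 1.1),
    "Liver": (2.2, 1.2, 0.9, 0.7),
    "Pancreas": (1.1, 0.7, 0.5, 0.4),
    "Lung": (0.8, 0.5, 0.4, 0.3),
    "Breast": (0.9, 0.7, 0.5, 0.4),
    "Cervix": (4.6, 3.4, 2.5, 2.2),
    "Corpus": (2.0, 1.5, 1.1, 0.9),
    "Ovaries": (2.4, 1.8, 1.4, 1.2),
    "Prostate": (1.6, 1.2, 0.9, 0.8),
    "Kidneys": (2.3, 1.7, 1.3, 1.1),
    "Urinary bladder": (2.4, 1.7, 1.3, 1.1),
    "Melanoma": (2.2, 1.6, 1.3, 1.1),
    "Brain": (2.4, 1.8, 1.4, 1.2),
    "Thyroid gland": (1.9, 1.4, 1.1, 0.9),
    "Leukemias": (2.9, 2.1, 1.6, 1.3),
    "Lymphomas": (2.4, 1.8, 1.4, 1.2),
}


def mean_se_reduction(column):
    """Average relative reduction in SE versus conventional period analysis."""
    ratios = [row[column] / row[0] for row in TABLE1_SE.values()]
    return 1 - sum(ratios) / len(ratios)


for label, column in (("5-y window", 1), ("10-y window", 2), ("15-y window", 3)):
    print(label, round(mean_se_reduction(column), 2))
```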

Table 1.

Various point estimates of 5-y relative survival potentially available by the end of 1997 and of actual 5-y relative survival later observed for patients diagnosed in Finland in 1995-1999

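Columns: number of patients diagnosed in 1995-1999 and their actual 5-y relative survival, followed by period estimates of 5-y relative survival for 1997* (conventional; modeled with 5-, 10-, and 15-y windows, full or abbreviated period).

| Cancer site | Number | Actual PE | Actual SE | Conventional PE | Conventional SE | 5-y full PE | 5-y full SE | 5-y abbrev. PE | 5-y abbrev. SE | 10-y full PE | 10-y full SE | 10-y abbrev. PE | 10-y abbrev. SE | 15-y full PE | 15-y full SE | 15-y abbrev. PE | 15-y abbrev. SE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Oral cavity | 2,472 | 66.6 | 1.2 | 67.0 | 2.9 | 66.6 | 2.1 | 66.3 | 2.7 | 66.0 | 1.6 | 65.9 | 1.7 | 65.9 | 1.4 | 65.6 | 1.4 |
| Esophagus | 1,037 | 11.9 | 1.1 | 11.6 | 3.2 | 9.9 | 1.6 | 10.1 | 1.9 | 9.4 | 1.1 | 8.4 | 1.1 | 9.7 | 1.0 | 9.5 | 1.0 |
| Stomach | 4,355 | 29.5 | 0.8 | 27.0 | 1.7 | 27.5 | 1.2 | 26.5 | 1.5 | 27.8 | 0.9 | 27.7 | 0.9 | 27.7 | 0.7 | 27.7 | 0.7 |
| Colon | 6,236 | 57.6 | 0.8 | 55.8 | 1.9 | 56.7 | 1.3 | 56.5 | 1.8 | 56.8 | 1.0 | 57.3 | 1.0 | 56.0 | 0.8 | 56.3 | 0.9 |
| Rectum | 4,075 | 54.4 | 1.0 | 53.6 | 2.3 | 55.1 | 1.7 | 55.4 | 2.1 | 53.5 | 1.3 | 54.2 | 1.3 | 53.1 | 1.1 | 53.3 | 1.1 |
| Liver | 2,283 | 9.7 | 0.7 | 14.2 | 2.2 | 11.9 | 1.2 | 12.1 | 1.7 | 11.5 | 0.9 | 11.1 | 0.9 | 10.7 | 0.7 | 10.6 | 0.7 |
| Pancreas | 3,421 | 3.7 | 0.4 | 4.5 | 1.1 | 4.9 | 0.7 | 3.8 | 1.2 | 4.9 | 0.5 | 4.3 | 0.5 | 4.5 | 0.4 | 4.5 | 0.4 |
| Lung | 10,224 | 10.3 | 0.3 | 10.6 | 0.8 | 11.1 | 0.5 | 10.3 | 0.7 | 10.2 | 0.4 | 9.9 | 0.4 | 9.5 | 0.3 | 9.3 | 0.3 |
| Breast | 16,258 | 85.5 | 0.4 | 83.2 | 0.9 | 82.8 | 0.7 | 83.2 | 0.9 | 83.3 | 0.5 | 83.2 | 0.6 | 83.2 | 0.4 | 83.5 | 0.5 |
| Cervix | 725 | 63.4 | 2.0 | 58.5 | 4.6 | 65.4 | 3.4 | 63.7 | 4.2 | 62.5 | 2.5 | 62.4 | 2.6 | 58.1 | 2.2 | 58.2 | 2.2 |
| Corpus | 3,370 | 82.5 | 0.9 | 81.8 | 2.0 | 82.3 | 1.5 | 80.9 | 1.9 | 82.8 | 1.1 | 82.9 | 1.1 | 81.3 | 0.9 | 81.7 | 1.0 |
| Ovaries | 2,356 | 44.6 | 1.1 | 39.5 | 2.4 | 40.4 | 1.8 | 39.2 | 2.2 | 40.6 | 1.4 | 41.3 | 1.4 | 40.1 | 1.2 | 40.4 | 1.2 |
| Prostate | 14,062 | 80.4 | 0.6 | 72.6 | 1.6 | 72.9 | 1.2 | 73.3 | 1.6 | 71.7 | 0.9 | 72.4 | 0.9 | 71.2 | 0.8 | 71.7 | 0.8 |
| Kidneys | 3,422 | 57.7 | 1.0 | 56.3 | 2.3 | 56.8 | 1.7 | 57.0 | 2.1 | 57.3 | 1.3 | 57.4 | 1.3 | 57.8 | 1.1 | 57.9 | 1.1 |
| Urinary bladder | 3,943 | 72.4 | 1.0 | 71.6 | 2.4 | 74.0 | 1.7 | 73.1 | 2.2 | 71.9 | 1.3 | 72.5 | 1.3 | 71.6 | 1.1 | 71.6 | 1.1 |
| Melanoma | 2,939 | 83.6 | 0.9 | 82.2 | 2.2 | 83.3 | 1.6 | 83.3 | 2.1 | 82.3 | 1.3 | 81.7 | 1.4 | 82.8 | 1.1 | 82.9 | 1.1 |
| Brain | 1,793 | 30.9 | 1.1 | 29.1 | 2.4 | 27.7 | 1.8 | 28.2 | 2.1 | 29.5 | 1.4 | 29.7 | 1.5 | 29.4 | 1.2 | 29.4 | 1.2 |
| Thyroid gland | 1,772 | 90.7 | 0.9 | 92.5 | 1.9 | 91.5 | 1.4 | 92.6 | 1.3 | 90.3 | 1.1 | 90.8 | 1.0 | 89.7 | 0.9 | 89.8 | 0.9 |
| Leukemias | 2,022 | 40.4 | 1.3 | 42.7 | 2.9 | 43.2 | 2.1 | 43.2 | 2.8 | 40.4 | 1.6 | 41.1 | 1.6 | 40.8 | 1.3 | 41.2 | 1.3 |
| Lymphomas | 2,834 | 49.6 | 1.1 | 48.2 | 2.4 | 48.4 | 1.8 | 46.6 | 2.3 | 49.6 | 1.4 | 50.3 | 1.4 | 48.5 | 1.2 | 48.7 | 1.2 |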

NOTE: In addition, the SEs of all estimates are given.

Abbreviation: PE, point estimates.

*Bold figures indicate better or equal performance of the modeling approach compared with conventional period analysis (with respect to difference of 5-y relative survival estimate from later observed 5-y relative survival of patients diagnosed in 1995-1999).

With few exceptions, all types of analyses provided, on average, somewhat too pessimistic estimates of 5-year relative survival later observed for patients diagnosed in the 5-year interval around each single year between 1972 and 1997, which can be seen from the negative values of most of the mean differences shown in Table 2. With respect to this criterion, results were generally quite similar for the various analytic approaches. Nevertheless, the modeling approaches did better than or as well as conventional period analysis (bold figures) in a slight majority of cancers using 10-year time windows, whereas full-period modeling did worse for 14 cancers using 15-year time windows and abbreviated-period modeling did worse for 15 cancers using 5-year time windows. Mean square differences obtained with the modeling strategies were lower than those obtained with conventional period analysis for most cancers (bold figures). According to this criterion, full-period modeling did better than conventional period analysis for 20, 20, and 17 cancers, respectively, if 5-, 10-, and 15-year time windows were used. Abbreviated-period modeling did better than conventional period analysis for 14, 18, and 19 cancers, respectively.

Table 2.

Mean difference and mean square difference of the various period estimates of 5-y relative survival of cancer patients in Finland potentially available at the end of each single year between 1972 and 1997 from true 5-y relative survival later observed for patients diagnosed in the 5-y interval around those years

MD, mean difference*; MSD, mean square difference*. Modeled estimates are given for 5-, 10-, and 15-y windows, with full-period or abbreviated-period (abbrev.) modeling.

| Cancer site | MD, conventional | MD, 5-y full | MD, 5-y abbrev. | MD, 10-y full | MD, 10-y abbrev. | MD, 15-y full | MD, 15-y abbrev. | MSD, conventional | MSD, 5-y full | MSD, 5-y abbrev. | MSD, 10-y full | MSD, 10-y abbrev. | MSD, 15-y full | MSD, 15-y abbrev. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Oral cavity | 0.31 | 0.41 | −0.35 | 0.61 | 0.52 | 0.90 | 0.92 | 8.48 | 6.10 | 9.07 | 7.00 | 6.68 | 8.08 | 8.30 |
| Esophagus | −0.12 | −0.32 | −1.21 | −0.23 | −0.79 | −0.16 | −0.58 | 8.57 | 3.63 | 7.08 | 2.66 | 3.68 | 2.02 | 2.27 |
| Stomach | −1.02 | −1.12 | −1.78 | −1.27 | −1.62 | −1.55 | −1.84 | 2.41 | 1.83 | 4.20 | 1.93 | 2.88 | 2.59 | 3.57 |
| Colon | −0.86 | −0.59 | −0.80 | −0.32 | −0.15 | −0.03 | 0.12 | 3.51 | 3.15 | 3.30 | 2.41 | 2.29 | 2.09 | 2.23 |
| Rectum | −1.24 | −1.36 | −1.62 | −1.25 | −1.12 | −0.91 | −0.77 | 9.57 | 7.85 | 13.67 | 6.55 | 6.45 | 4.53 | 4.08 |
| Liver | −0.07 | 0.19 | −0.45 | 0.06 | −0.11 | 0.10 | −0.11 | 4.91 | 1.91 | 4.01 | 0.69 | 1.12 | 0.67 | 0.62 |
| Pancreas | −0.06 | −0.06 | −0.19 | −0.13 | −0.28 | −0.16 | −0.29 | 0.93 | 0.59 | 0.60 | 0.55 | 0.66 | 0.48 | 0.60 |
| Lung | 0.01 | −0.03 | −0.69 | 0.01 | −0.39 | 0.34 | 0.00 | 0.87 | 0.53 | 0.97 | 0.61 | 0.66 | 0.82 | 0.65 |
| Breast | −2.01 | −1.97 | −2.26 | −1.99 | −1.76 | −2.12 | −1.83 | 5.72 | 5.19 | 6.52 | 5.70 | 4.68 | 6.38 | 5.08 |
| Cervix | 0.11 | −0.12 | −1.02 | −0.43 | −0.67 | −0.13 | −0.30 | 25.04 | 16.07 | 21.03 | 15.54 | 15.63 | 23.79 | 22.82 |
| Corpus | −0.74 | −0.71 | −0.83 | −0.66 | −0.43 | −0.71 | −0.53 | 4.59 | 2.76 | 4.50 | 3.19 | 2.72 | 3.08 | 2.88 |
| Ovaries | −0.27 | 0.09 | −0.69 | 0.21 | 0.32 | 0.47 | 0.57 | 9.33 | 6.18 | 8.25 | 7.82 | 8.09 | 6.83 | 6.97 |
| Prostate | −2.60 | −2.38 | −2.56 | −1.99 | −1.84 | −1.30 | −1.10 | 16.77 | 13.20 | 15.04 | 11.57 | 10.87 | 13.12 | 12.38 |
| Kidneys | −1.76 | −1.50 | −1.75 | −1.23 | −1.20 | −1.15 | −1.08 | 9.51 | 7.29 | 7.51 | 5.04 | 4.50 | 7.86 | 7.44 |
| Urinary bladder | −1.50 | −1.48 | −1.71 | −1.28 | −1.31 | −0.81 | −0.78 | 13.64 | 8.75 | 9.32 | 3.96 | 4.32 | 2.54 | 2.62 |
| Melanoma | −2.29 | −2.22 | −2.63 | −2.30 | −2.25 | −2.60 | −2.47 | 17.82 | 12.73 | 16.28 | 13.19 | 11.85 | 19.54 | 17.32 |
| Brain | −0.29 | −0.28 | −0.51 | 0.14 | 0.25 | 0.39 | 0.54 | 12.04 | 10.70 | 11.11 | 10.26 | 10.38 | 6.36 | 7.19 |
| Thyroid gland | −1.24 | −1.12 | −0.68 | −1.19 | −0.81 | −1.50 | −1.25 | 14.25 | 7.93 | 7.45 | 5.35 | 4.05 | 8.27 | 7.01 |
| Leukemias | −0.87 | −1.15 | −0.23 | −1.06 | −0.56 | −0.93 | −0.62 | 12.55 | 7.30 | 6.46 | 4.60 | 4.28 | 2.05 | 1.54 |
| Lymphomas | −1.67 | −1.75 | −2.07 | −1.68 | −1.79 | −1.96 | −2.02 | 14.57 | 9.94 | 13.16 | 9.10 | 10.39 | 11.58 | 11.67 |

*Bold figures indicate better or equal performance of the modeling approach compared with conventional period analysis according to each criterion.

A more comprehensive evaluation of the performance of full-period and abbreviated-period modeling according to the length of the time window used for modeling is shown in Fig. 3A and B, again using the mean square difference from later observed survival rates as criterion. With full-period modeling (Fig. 3A and B, left columns), minimum mean square difference levels are reached for 12 of 20 cancers using time windows between 3 and 10 years, whereas for the remaining 8 cancers, lowest levels of mean square difference are reached for longer time windows. On the other hand, a steep increase of mean square difference for time windows longer than 10 years is observed for a few cancers. Nevertheless, almost all modeling approaches perform better than conventional period analysis, which is represented by the results for 1-year time windows in Fig. 3A and B (left columns). In general, mean square differences were similar or even slightly lower with abbreviated-period than with full-period modeling (Fig. 3A and B, right columns), except for short time windows around 5 years (the minimum time windows for abbreviated-period modeling).

Figure 3.

Mean square difference between modeled estimates of 5-y relative survival potentially available by full-period or abbreviated-period modeling in each calendar year from 1972 to 1997 and 5-y relative survival later observed for patients diagnosed in the 5-y intervals around those calendar years, according to length of time windows included in the modeling. A. Gastrointestinal cancers (top) and gynecologic cancers (bottom). B. Urological cancers (top) and other common cancers (bottom).

In this article, we show that the benefits of the modeling approach, which we recently introduced to come up with both up-to-date and precise period estimates of cancer patient survival (18), can be further enhanced by extending the time window used for modeling. Although extension to 15 years or more may often be beneficial, extension to about 10 years seems to be a more prudent choice because risk of misprediction of survival seems to increase rapidly with increasing length of time windows beyond 10 years in some cases. By extension of time windows from 5 to 10 years, period estimates of survival for the most recent calendar year can be derived with SEs approximately half of those that would be obtained with conventional period analysis. This gain in precision can be obtained at no cost in up-to-dateness or accuracy of period survival estimates in most cases. To enhance precision to a comparable degree in conventional period analysis, the period included in the analysis would have to be extended from the most recent single year to the most recent 4 years. This option, however, would go along with a substantial loss of up-to-dateness (by, on average, 1.5 years), which can be avoided by use of the model-based period analysis.
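The figures of 4 years and 1.5 years can be recovered with a rough calculation, assuming that each calendar year contributes about the same amount of roughly independent information, so that the SE of a conventional period estimate based on the most recent $k$ years scales with $1/\sqrt{k}$. This scaling is our simplifying assumption, not a result from the analysis.

```latex
\mathrm{SE}_k \approx \frac{\mathrm{SE}_1}{\sqrt{k}}, \qquad
\frac{\mathrm{SE}_k}{\mathrm{SE}_1} = \frac{1}{2} \;\Rightarrow\; k = 4, \qquad
\text{shift of period midpoint} = \frac{k - 1}{2} = 1.5 \text{ years}
```

The midpoint of a 4-year period lies 1.5 years before the midpoint of its most recent calendar year, which corresponds to the average loss of up-to-dateness quoted above.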

The gain in precision by moving from 5- to 10-year time windows for modeling can be achieved without additional requirements about the registry database. Although full-period modeling using 10-year windows would require existence of a reliable registry database for at least 15 years (rather than at least 10 years as in full-period modeling using a 5-year time window), no such extension is needed for abbreviated-period modeling using a 10-year time window, which seems to perform equally well or even slightly better in terms of accuracy of predictions than full-period modeling in most cases and which can be applied at virtually no cost of precision. The possibility to switch to abbreviated-period modeling is of particular relevance for younger cancer registries, such as those having been set up in many European countries as well as in multiple locations in the United States in the past 10 to 20 years (24, 25), because it will enable them to carry out modeled period analysis years before having reached the long time series needed for full-period modeling.

The type of database used for abbreviated-period modeling in our analysis can also be used, and has recently been proposed, for “cohort modeling” of population-based cancer survival data, in which numbers of deaths are modeled by 1-year cohorts (years of diagnosis) rather than by 1-year periods (years of follow-up; ref. 26). Despite the common database, the two approaches are not identical and yield different results. Their relative performance needs to be evaluated in further research.

As previously illustrated (18), the modeling approach may also be useful to disclose and estimate recent trends in cancer survival. Our analyses were carried out for all ages combined, and they therefore pertain to crude estimates of relative survival. Given that levels and trends of relative survival may differ by age, and given that the age distribution of patients has shifted and continues to shift to older ages for many forms of cancer, it may often be useful to carry out age-specific analyses or to adjust for age, particularly if the time window included in the modeling is extended to 10 or 15 years. Such age-specific or age-adjusted analyses could be easily implemented in the modeling framework.

The different performance of models based on time windows of various lengths shown for various cancers in Fig. 3A and B raises the question of whether the choice of the length of time windows should be based on preassessment of time trends in cancer survival. In theory, with steady improvement in survival at a constant pace (or no change in survival at all), performance of the survival predictions should be independent of length of time windows, in which case longer-term windows might generally be preferred as they lead to more precise survival estimates. Use of longer-term time windows may be particularly risky, however, in case of changing pace of improvement (or even changing direction of survival trends) over time.
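The risk associated with a changing pace of improvement can be illustrated with a deliberately simplified sketch. It fits an ordinary least-squares line to hypothetical yearly survival values rather than the Poisson model used in our analysis, and the series (no change for 10 years, then an improvement of 1 percent unit per year) is invented for illustration.

```python
def fitted_value_last_year(values, window):
    """Fit a least-squares line to the last `window` values (window >= 2)
    and return the fitted value for the final year."""
    ys = values[-window:]
    xs = range(len(ys))
    n = len(ys)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean + slope * (xs[-1] - x_mean)


# Hypothetical survival (%) for 1983-1997: flat until 1992, then rising.
survival = [50.0] * 10 + [51.0, 52.0, 53.0, 54.0, 55.0]

for window in (5, 10, 15):
    print(window, round(fitted_value_last_year(survival, window), 2))
```

With the 5-year window, the linear trend matches the recent improvement and the fitted value for the last year equals the true value of 55. With longer windows, the earlier flat period pulls the fitted value for the last year downward, which mirrors the misprediction discussed above.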

For example, the poor performance of models based on long-term time windows for cervical cancer observed in our analysis may reflect the inconsistent trends in survival observed for this form of cancer in Finland in the past decades, which are mostly explained by selection effects resulting from the (very successful) screening program. For other cancers, the reasons for worse performance of longer time periods were less obvious from long-term survival trends. In general, it seems to be difficult, if not impossible, to determine the optimal length of time windows for each single cancer from assessing long-term time trends in survival. In particular, even a long-term apparently consistent pace of change in survival in preceding years does not guarantee good performance of the use of longer time windows and may give misleading results in case of a sudden change affecting patients diagnosed in the period of interest. Furthermore, the choice of different time windows would impede comparisons of cancer survival estimates both within and between cancer registries. We therefore believe that choice of a common time window that proves useful in a broad range of scenarios (such as those evaluated in our analysis) may be a preferred strategy, and our analyses suggest that a time window of around 10 years may be a reasonable choice.

The analyses shown in this article refer to derivation of most up-to-date period survival estimates of currently diagnosed cancer patients on the basis of observed survival data. As recently shown, period modeling may also be useful for predicting survival of future cancer patients by extrapolation from observed survival data (27). Optimal length of time windows for the latter purpose may not necessarily be the same as the optimal length for the purpose addressed in this article and requires further empirical evaluation.

All of our analyses are based on data from the Finnish Cancer Registry. Whereas the methods proposed and evaluated are applicable in a large number of cancer registries around the world, the type of thorough empirical evaluation provided in this article, especially in Table 2, can only be carried out with a very long history of high-quality cancer registration. The Finnish Cancer Registry is one of very few population-based cancer registries in the world meeting this criterion, and there is no reason to believe that results would be substantially different with other long-standing cancer registries. In particular, differences in levels and trends of survival between patients with cancer at various sites within a population are much larger than differences between survival of patients with the same type of cancer in various populations (at least within developed countries). The consistency of patterns found for 20 cancers with such strongly divergent levels and trends of prognosis suggests that the methods presented are likely to be useful for a very broad range of settings. This suggestion is further supported by previous replications of the empirical evaluation of period analysis techniques first evaluated using data from the Finnish Cancer Registry, which have generally yielded very similar results (3-6).

In conclusion, our empirical evaluation suggests that the benefits of the modeling approach for the provision of up-to-date and precise period estimates of cancer patient survival for the most recent year for which cancer registry data are available may be further enhanced by extending the time window used for modeling. In most cases, time windows of around 10 years and application of abbreviated-period modeling seem to be a reasonable choice. With this approach, SEs of the most up-to-date period estimates can be approximately halved compared with conventional period analysis. Compared with period modeling based on 5-year time windows, abbreviated-period modeling allows this improvement to be achieved without the need to include additional cohorts of patients diagnosed a longer time ago. Extension of the time window included in the modeling beyond 10 years may be problematic because potential risks may outweigh further benefits in some cases.

Let $l_{ij}$ be the effective numbers of persons at risk (accounting for late entries and withdrawals as half persons); $d_{ij}$, the observed numbers of deaths; and $e_{ij}$, the expected numbers of deaths (from population life tables) for each combination of follow-up year $i$ ($1 \le i \le 5$) and calendar year $j$.

Then, a generalized linear model $d_{ij} = f(i, j)$ is fitted with outcome $d_{ij}$, Poisson error structure, predictor variables $i$ (categorical) and $j$ (linear), link $\ln(\mu_{ij} - d^{*}_{ij})$, and offset $\ln(l_{ij} - d_{ij}/2)$, where $\mu_{ij}$ is the model-based number of deaths and $d^{*}_{ij} = -(l_{ij} - d_{ij}/2) \times \ln[(l_{ij} - e_{ij})/l_{ij}]$.
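The quantities entering this model can be sketched in a few lines of Python. The sketch assumes that the offset is the logarithm of the approximate person-years at risk, l − d/2, that d* = −(l − d/2) × ln[(l − e)/l], and that calendar year enters the linear predictor as (j − p1), with p1 the first year of the time window; the function names and the example counts are hypothetical and serve only as illustration. It does not fit the model, which would require GLM software supporting a user-defined link.

```python
import math


def person_years(l, d):
    """Approximate person-years at risk in a 1-year interval: l - d/2."""
    return l - d / 2.0


def adjusted_expected_deaths(l, d, e):
    """d* = -(l - d/2) * ln((l - e) / l), the expected deaths on the person-years scale."""
    return -person_years(l, d) * math.log((l - e) / l)


def model_based_deaths(l, d, e, alpha_i, beta, j, p1):
    """Invert the link ln(mu - d*) = offset + alpha_i + beta * (j - p1).

    The coding (j - p1) of calendar year is an assumption of this sketch.
    """
    excess = person_years(l, d) * math.exp(alpha_i + beta * (j - p1))
    return adjusted_expected_deaths(l, d, e) + excess


# Hypothetical cell: 1,000 effectively at risk, 100 observed and 20 expected deaths.
d_star = adjusted_expected_deaths(1000.0, 100.0, 20.0)

# Hypothetical coefficients: excess hazard of 0.05 per year, no calendar trend.
mu = model_based_deaths(1000.0, 100.0, 20.0, math.log(0.05), 0.0, 2000, 2000)

print(round(d_star, 4), round(mu, 4))
```

Writing the mean as d* plus an exponential term makes explicit that the linear predictor describes the excess (cancer-related) hazard only, whereas the expected mortality enters as a fixed component.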

Let $\alpha_i$ and $\beta$ be the estimated regression coefficients for follow-up years $i$ ($1 \le i \le 5$) and for a 1-year increase in calendar year, and let $p_1$ be the first calendar year within the period included in the modeling. Then, estimates of conditional relative survival for each combination of follow-up year $i$ and calendar year $j$ are given as

$$\exp\{-\exp[\alpha_i + \beta\,(j - p_1)]\}$$

and an estimate of cumulative 5-year relative survival for each calendar year $j$ is given as

$$\prod_{i=1}^{5} \exp\{-\exp[\alpha_i + \beta\,(j - p_1)]\}$$

Variance estimates of 5-year cumulative relative survival for calendar year j can be obtained by the delta method, as previously described (18).
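As a rough illustration of these last steps, the following Python sketch computes conditional and cumulative relative survival from a set of coefficients and applies a generic first-order delta method to obtain an SE. It assumes that α_i + β(j − p1) is the log excess hazard for follow-up year i in calendar year j and that a covariance matrix of the estimates (α_1, ..., α_5, β) is available from the fitted model. The coefficient values and the covariance matrix below are hypothetical, and the calculation is a textbook delta method that may differ in detail from the one described in ref. 18.

```python
import math


def conditional_rs(alpha_i, beta, j, p1):
    """Conditional 1-year relative survival: exp(-exp(alpha_i + beta * (j - p1)))."""
    return math.exp(-math.exp(alpha_i + beta * (j - p1)))


def cumulative_rs(alphas, beta, j, p1):
    """Cumulative relative survival: product of the conditional estimates."""
    return math.prod(conditional_rs(a, beta, j, p1) for a in alphas)


def cumulative_rs_se(alphas, beta, j, p1, cov):
    """First-order delta-method SE of cumulative relative survival.

    cov is the covariance matrix of (alpha_1, ..., alpha_k, beta), in that order.
    """
    t = j - p1
    hazards = [math.exp(a + beta * t) for a in alphas]
    r = math.exp(-sum(hazards))
    # Partial derivatives of r with respect to each alpha_i and to beta.
    grad = [-r * h for h in hazards] + [-r * t * sum(hazards)]
    k = len(grad)
    var = sum(grad[a] * cov[a][b] * grad[b] for a in range(k) for b in range(k))
    return math.sqrt(var)


# Hypothetical coefficients: excess hazard of 0.1 in each follow-up year, no trend.
alphas = [math.log(0.1)] * 5
beta = 0.0
# Hypothetical covariance matrix: independent estimates with variance 0.01 each.
cov = [[0.01 if a == b else 0.0 for b in range(6)] for a in range(6)]

rs5 = cumulative_rs(alphas, beta, 2000, 2000)
se5 = cumulative_rs_se(alphas, beta, 2000, 2000, cov)
print(round(rs5, 4), round(se5, 4))
```

With a nonzero β and j later than p1, the derivative with respect to β grows with (j − p1), so the uncertainty of the trend estimate contributes more to the SE for calendar years further from the start of the window.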

Grant support: The German Cancer Foundation (Deutsche Krebshilfe), Project No. 70-3166-Br 5 (H. Brenner), and the Academy of Finland and the Cancer Society of Finland (Timo Hakulinen).


1. Brenner H, Gefeller O. An alternative approach to monitoring cancer patient survival. Cancer 1996;78:2004–10.
2. Brenner H, Hakulinen T. Advanced detection of time trends in long-term cancer patient survival: experience from 50 years of cancer registration in Finland. Am J Epidemiol 2002;156:566–77.
3. Brenner H, Hakulinen T. Up to date survival curves of patients with cancer by period analysis. J Clin Oncol 2002;20:826–32.
4. Brenner H, Söderman B, Hakulinen T. Use of period analysis for providing more up-to-date estimates of long-term survival rates: empirical evaluation among 370,000 cancer patients in Finland. Int J Epidemiol 2002;31:456–62.
5. Talbäck M, Stenbeck M, Rosén M. Up-to-date long-term survival of cancer patients: an evaluation of period analysis on Swedish Cancer Registry data. Eur J Cancer 2004;40:1361–72.
6. Ellison L. An empirical evaluation of period survival analysis using data from the Canadian Cancer Registry. Ann Epidemiol 2006;16:191–6.
7. Aareleid T, Brenner H. Trends in cancer patient survival in Estonia before and after the transition from a Soviet republic to an open market economy. Int J Cancer 2002;102:45–50.
8. Brenner H. Long-term survival rates of cancer patients achieved by the end of the 20th century: a period analysis. Lancet 2002;360:1131–5.
9. Smith LK, Lambert PC, Jones DR. Up-to-date estimates of long-term cancer survival in England and Wales. Br J Cancer 2003;89:74–6.
10. Coleman MP, Rachet B, Woods LM, et al. Trends and socioeconomic inequalities in cancer survival in England and Wales up to 2001. Br J Cancer 2004;90:1367–73.
11. Talbäck M, Rosén M, Stenbeck M, Dickman PW. Cancer patient survival in Sweden at the beginning of the third millenium—predictions using period analysis. Cancer Causes Control 2004;15:967–76.
12. Brenner H, Stegmaier C, Ziegler H. Long-term survival of cancer patients in Germany achieved by the beginning of the 3rd millennium. Ann Oncol 2005;16:981–6.
13. Houterman S, Janssen-Heijnen ML, van de Poll-Franse LV, et al. Higher long-term survival rates in southeastern Netherlands using up-to-date period analysis. Ann Oncol 2006;17:709–12.
14. Stang A, Valiukeviciene S, Aleknaviciene B, Kurtinaitis J. Time trends of incidence, mortality and relative survival of invasive skin melanoma in Lithuania. Eur J Cancer 2006;42:660–7.
15. Zuccolo L, Dama E, Maule MM, Pastore G, Merletti F, Magnani C. Updating long-term childhood cancer survival trend with period and mixed analysis: good news from population-based estimates in Italy. Eur J Cancer 2006;42:1135–42.
16. Ellison LF, Gibbons L. Survival from cancer—up-to-date prediction using period analysis. Health Rep 2006;17:19–30.
17. Brenner H, Gefeller O, Hakulinen T. Period analysis for “up-to-date” cancer survival data: theory, empirical evaluation, computational realisation and applications. Eur J Cancer 2004;40:326–35.
18. Brenner H, Hakulinen T. Up-to-date and precise estimates of cancer patient survival: model based period analysis. Am J Epidemiol 2006;164:689–96.
19. Teppo L, Pukkala E, Lehtonen M. Data quality and quality control of a population-based cancer registry. Experience in Finland. Acta Oncol 1994;33:365–9.
20. Ederer F, Axtell LM, Cutler SJ. The relative survival rate: a statistical methodology. Monogr Natl Cancer Inst 1961;6:101–21.
21. Henson DE, Ries LA. The relative survival rate. Cancer 1995;76:1687–8.
22. Ederer F, Heise H. Instructions to IBM 650 programmers in processing survival computations. Methodological note no. 10, End Results Section. Bethesda (MD): National Cancer Institute; 1959.
23. Arndt V, Talbäck M, Gefeller O, Hakulinen T, Brenner H. Modification of SAS macros for more efficient analysis of relative survival rates. Eur J Cancer 2004;40:778–9.
24. Available from: http://www.encr.com.fr/, last accessed April 20, 2007.
25. Available from: http://seer.cancer.gov/registries/, last accessed April 20, 2007.
26. Mariotto AB, Wesley MN, Cronin KA, Johnson KA, Feuer EJ. Estimates of long-term survival of newly diagnosed patients. Cancer 2006;106:2039–50.
27. Brenner H, Hakulinen T. Up-to-date estimates of cancer patient survival even with common latency in cancer registration. Cancer Epidemiol Biomarkers Prev 2006;15:1727–32.