Abstract
Test sensitivity pertains to the ability of a test to identify subjects with the target disorder. In cancer screening, test sensitivity can be estimated using interval cancer incidence as an indicator of false-negative results. A randomized trial provides the optimal approach for estimating test sensitivity, as the control arm provides the expected rates. We estimated the sensitivity of the prostate-specific antigen test using the incidence method, i.e., based on the incidence of interval cancer among subjects with negative screening results, compared with that in the control arm. Data from three centers in the European randomized screening trial were used to estimate interval cancer incidence (II) among 39,389 men with negative screening tests. This was compared with incidence among the 79,525 men in the control arm of the trial (IC) to estimate test sensitivity (S = 1 − II / IC). Confidence intervals were calculated using simulations, assuming that the number of cases follows a Poisson distribution. The estimated test sensitivity following the first screen was 0.87 (95% confidence interval, 0.83-0.92) in Finland, 0.87 (0.62-1.00) in Sweden, and 0.93 (0.90-0.96) in the Netherlands. There was some indication of a higher test sensitivity for aggressive cancers (0.85-0.98 for non–organ-confined cases or Gleason 8-10) and for the second screening round (approximately 0.85-0.95). Test sensitivity varied to some extent between the three centers in the European trial, probably reflecting variation in screening protocols, but was acceptable in the first screening round, and may be better for aggressive cancers and in the second screening round. (Cancer Epidemiol Biomarkers Prev 2009;18(7):2000–5)
Introduction
The objective of screening is to reduce disease burden from the target disorder; in cancer screening, this means primarily cancer mortality. Effective screening requires that (a) the screening test is able to identify unrecognized disease, (b) the diagnostic process detects early cases, and (c) treatment outcomes of screen-detected cases are superior to those detected after the preclinical phase (1). In some cases, reduction in incidence (by means of curing premalignant lesions) and/or improved quality of life is also achieved.
The validity of a screening test comprises two components, sensitivity and specificity (2). Specificity relates to the ability of the test to correctly identify those subjects without the disease. Sensitivity pertains to the ability to correctly identify subjects with disease, determined by the rate of false-negative findings (missed cases) among those that would potentially be detectable (3). In the screening setting, the target disease is previously unrecognized, i.e., in the preclinical detectable phase. Sensitivity is thus the probability of obtaining a positive test result in a person with the target disorder. Sensitivity is also an indicator of effectiveness, in the sense that it defines the maximal benefit achievable (provided that sensitivity pertains to potentially lethal, curable cases in the preclinical detectable phase).
Prostate-specific antigen (PSA) has emerged as the principal screening test for prostate cancer, used in both screening trials and for opportunistic screening. The specificity of the PSA test has been reported very consistently as 90% or slightly higher (4-6). Much wider variability in sensitivity has been reported, ranging from 20% to 95% (7-9). This discrepancy between studies has been due to differences in definitions, study designs and methods, suggesting that the reported estimates are not comparable. Similar results have been reported from prospective serum bank studies: 67% to 92% at 4 to 6 years of follow-up after drawing the blood sample (10-13).
We report here the test sensitivity in the three largest centers of the European Randomised Study of Screening for Prostate Cancer. In addition to the randomized design, the major advantages of the approach include the large study size, uniform definitions, and the availability of population-based incidence data.
Materials and Methods
Test sensitivity was estimated using the incidence method, i.e., by relating interval cancer incidence among men with negative screening results to cancer incidence in the control arm of the screening trial.
The screening interval was 4 years in Finland and the Netherlands, whereas a 2-year interval was implemented in Sweden. The screening protocol varied slightly by center, but the core protocol was based on serum PSA as the primary screening test and included referral of men with serum PSA of 4 ng/mL or higher (Table 1). At lower PSA levels, additional criteria were used: in Finland, men with serum PSA 3.0 to 3.9 ng/mL underwent an ancillary test (digital rectal examination [DRE] during the first 3 years of the trial, 1996-1998, and determination of the proportion of free PSA with a cutoff value of 0.16 from 1999 onwards); in Rotterdam, the Netherlands, DRE and transrectal ultrasound (TRUS) were offered initially to all men (1993-1995), then to those with a PSA of >1.0 ng/mL (1995-1997), and were abandoned from 1997 onwards; in Gothenburg, Sweden, a cutoff level of 3.0 ng/mL without ancillary testing was used. For analyses of sensitivity, however, only men with a PSA of <3 ng/mL were included as screen-negative for consistency. Thus, men with a PSA of <3 ng/mL (study population for analysis of test sensitivity) received additional tests only in the Netherlands. The Dutch men with a PSA of <3 ng/mL, but a positive ancillary test, were regarded as screen-positive and excluded from the analysis (1,019 men in the first round and 763 in the second round).
Center | Screening interval (y) | PSA cutoff (ng/mL) | Age range (y) | No. of men screened | Screen-positive | Cancers detected |
---|---|---|---|---|---|---|
Finland (Helsinki and Tampere) | 4 | 4* | 55-67 | 20,793 | 1,979 (9.5%) | 543 (2.6%) |
The Netherlands (Rotterdam) | 4 | 4† | 55-74 | 19,970 | 4,587 (23.0%) | 1,014 (5.1%) |
Sweden (Gothenburg) | 2 | 3 | 50-65 | 5,855 | 663 (11.3%) | 142 (2.4%) |
* With ancillary tests in the PSA range of 3.0 to 3.9 ng/mL (DRE in 1996-1998 and free/total PSA from 1999).
† With ancillary tests (DRE and TRUS) for all men (1993-1995) or men with a PSA of >1.0 ng/mL (1995-1997).
Study populations differed to some extent between centers, but all covered the core age group of 55 to 69 years, with additional younger ages in Sweden and older men in the Netherlands. The results shown in this article are confined to the core age group (men ages 55-69 years at entry). The total number of screened men of all ages was ∼20,000 in both Finland and the Netherlands, with slightly fewer than 6,000 men in Sweden. The number of men in the core age group was somewhat smaller.
Among screened men, follow-up started at the date of screening and ended at the next screening round, death, emigration, or common closing date (most recent date covered by follow-up was at the end of 2005 in Finland, at the end of 2004 in the Netherlands, and at the end of June 2006 in Sweden), whichever occurred first. In the control arm, follow-up started at date of randomization, but to improve comparability between the arms, the mean lag between entry to the study and screening in the intervention arm (199 days in Finland, 24 in Rotterdam, and 305 in Gothenburg) was subtracted from the follow-up time (and events during that time excluded). In the analyses of the second round, only men with a screen-negative finding at the second round and who participated in both rounds were included (14,198 men in Finland, 9,185 in the Netherlands, and 3,283 in Sweden). Similar to the first-round analysis, start of follow-up in the control arm was defined based on the screening dates (1,622 days from randomization in Finland, 1,509 days in the Netherlands, and 1,107 days in Sweden). Information on death and emigration was obtained from local or national population registries. Incidence density rates are shown per 100,000 person-years.
Information on incident cancer cases in the trial population was obtained from the regional (the Netherlands and Sweden) or national cancer registries (Finland), all population-based and well established, as evidenced by inclusion in several volumes of Cancer Incidence in Five Continents monographs, requiring stringent evaluation of data quality (14). Identification was based on automated record linkage, with personal identification number as the key.
Aggressive cancers were defined as those with advanced stage (T3-4 or N1 or M1) or poor differentiation (Gleason sum 8-10). These features were chosen as they predict a poor prognosis. Test sensitivity, S, was estimated from the interval cancer incidence among men with a negative screening test (II, denoted $I_I$ below) relative to the incidence in the control arm (IC, denoted $I_C$):

$$S = 1 - \frac{I_I}{I_C}$$
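Furthermore, an adjusted estimate of sensitivity was calculated with correction for incomplete participation and difference in risk among participants and nonparticipants, using the formula (ref. 15)

$$S = \frac{I_C - P \times \mathrm{RR} \times I_C - (1 - P) \times I_I}{I_C - P \times \mathrm{RR} \times I_C}$$

where $I_C$ is the incidence in the control arm (IC), $I_I$ is the interval cancer incidence among screen-negative men (II), P is the proportion of nonparticipants in the screening arm, and RR is the prostate cancer incidence rate ratio for nonparticipants relative to the control arm.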
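As a worked check, the following minimal Python sketch recomputes the crude estimate (S = 1 − II / IC, as given in the Abstract) from the first-round Finnish counts in Table 2, and the corrected estimate (formula in the footnote to Table 4) from the rounded inputs in Table 4. The function names are illustrative and not taken from the trial's analysis code. The sketch assumes that P, the proportion of nonparticipants, equals one minus the participation column of Table 4; under that assumption the published corrected values are reproduced.

```python
def incidence_rate(cases, person_years, per=100_000):
    """Incidence density: cases per `per` person-years."""
    return cases / person_years * per


def crude_sensitivity(i_interval, i_control):
    """Crude test sensitivity, S = 1 - I_I / I_C."""
    return 1.0 - i_interval / i_control


def corrected_sensitivity(i_interval, i_control, p_nonparticipants, rate_ratio):
    """Sensitivity corrected for incomplete participation.

    p_nonparticipants: proportion of nonparticipants in the screening arm (P).
    rate_ratio: incidence among nonparticipants relative to the control arm (RR).
    """
    expected = i_control - p_nonparticipants * rate_ratio * i_control
    return (expected - (1.0 - p_nonparticipants) * i_interval) / expected


# Crude estimate for Finland, first round (counts from Table 2).
i_i = incidence_rate(31, 58_167)
i_c = incidence_rate(668, 157_173)
print(f"Finland crude: {crude_sensitivity(i_i, i_c):.2f}")  # 0.87

# Corrected estimates from the rounded Table 4 inputs.
# Tuple layout: (participation, RR, I_C, I_I); P is assumed to be 1 - participation.
table4 = {
    "Finland": (0.65, 0.84, 425, 53),
    "The Netherlands": (0.94, 0.83, 395, 26),
    "Sweden": (0.59, 0.59, 228, 30),
}
for center, (participation, rr, ic, ii) in table4.items():
    s = corrected_sensitivity(ii, ic, 1.0 - participation, rr)
    print(f"{center} corrected: {s:.2f}")  # 0.89, 0.93, 0.90
```

Because the inputs are rounded table values, small deviations in the third decimal are to be expected relative to an analysis based on the unrounded data.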
Confidence intervals were obtained with simulation using statistical software R (with 10,000 simulations per interval estimate), assuming that the observed numbers of cases follow a Poisson distribution.
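The simulation idea can be sketched as follows. This is not the authors' R code, and the details are assumptions: case counts in each group are redrawn independently from Poisson distributions whose means equal the observed counts, person-years are held fixed, and the interval is taken from the 2.5th and 97.5th percentiles of the simulated sensitivities. The sketch is written in Python with the standard library only, and the demonstration uses fewer replicates than the 10,000 reported above to keep the run short.

```python
import random


def poisson_draw(rng, mean):
    """Draw one Poisson(mean) variate by counting unit-rate exponential gaps within [0, mean]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulated_sensitivity_ci(cases_i, py_i, cases_c, py_c,
                             n_sim=10_000, level=0.95, seed=1):
    """Percentile interval for S = 1 - I_I / I_C under Poisson-distributed case counts."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sim):
        sim_i = poisson_draw(rng, cases_i)
        sim_c = poisson_draw(rng, cases_c)
        if sim_c == 0:
            continue  # rate ratio undefined when the simulated control count is zero
        draws.append(1.0 - (sim_i / py_i) / (sim_c / py_c))
    draws.sort()
    lower = draws[int((1.0 - level) / 2.0 * len(draws))]
    upper = draws[int((1.0 + level) / 2.0 * len(draws)) - 1]
    return lower, upper


# Finland, first round (counts and person-years from Table 2); 2,000 replicates for speed.
low, high = simulated_sensitivity_ci(31, 58_167, 668, 157_173, n_sim=2_000)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

The resulting interval should be close to, but need not match exactly, the published interval for Finland, since the seed, the number of replicates, and the exact simulation scheme may differ from those used in the original analysis.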
The study protocols were reviewed by the appropriate research ethics bodies in each country. Written informed consent was obtained from screened men. In the Netherlands, consent was obtained prior to randomization from all trial participants. In Finland and Sweden, randomization before consent was used, and the men in the control arm were not contacted, under an exemption from consent granted by the ethical committees. The funding agencies had no role in planning the study, preparing the report, or the decision to publish.
Results
The proportion of screen-positive men was >20% in the Netherlands and ∼10% in Finland and Sweden (Table 1). Correspondingly, the detection rate was twice as high in the Dutch center compared with the other two (5% versus 2.5%).
The numbers of screen-negative men in the first screening round were ∼18,000 in Finland, 15,000 in the Netherlands, and 3,400 in Sweden (Table 2). The mean length of follow-up was practically identical to the screening interval due to the small number of censorings. The number of interval cancer cases following the first screen was small (from 2 to 31 per center). The corresponding figures for the control arm are shown in Table 2.
Center | Screen-negative*: men | Screen-negative*: person-years | Screen-negative*: cancers | Screen-negative*: aggressive† | Control arm: men | Control arm: person-years | Control arm: cancers | Control arm: aggressive† |
---|---|---|---|---|---|---|---|---|
Finland | 18,814 | 58,167 | 31 | 8 | 48,409 | 157,173 | 668 | 209 |
Netherlands | 15,383 | 56,936 | 15 | 2 | 21,162 | 80,822 | 319 | 129 |
Sweden | 3,445 | 6,604 | 2 | 1 | 6,479 | 12,737 | 29 | 11 |
NOTE: Analysis restricted to the core age group (55-69 y).
* Men with a serum PSA of <3 ng/mL (excluding men with suspect DRE or TRUS in the Netherlands).
† Including advanced stage (T3-4, N1, M1) and/or poorly differentiated (Gleason 8-10) cancers.
The crude cancer incidence rate among screen-negative men (with PSA <3 ng/mL) after the first screening round was 53 per 100,000 in Finland, 26 per 100,000 in the Netherlands, and 30 per 100,000 in Sweden (Table 3). The crude cancer incidence in the control arm was 348 per 100,000 in Finland, 403 per 100,000 in the Netherlands, and 174 per 100,000 in Sweden. Adjustment for age (by 5-year age group) did not diminish the differences between countries.
Center and subgroup | Incidence, screen-negative* (per 100,000 person-years) | Incidence, control (per 100,000 person-years) | Sensitivity (95% CI) |
---|---|---|---|
Finland, overall | 53.0 | 425.0 | 0.87 (0.83-0.92) |
Finland, follow-up year 0-1 | 6.5 | 377.9 | 0.98 (0.94-1.00) |
Finland, follow-up year 1-2 | 52.9 | 417.9 | 0.87 (0.77-0.95) |
Finland, follow-up year 2-3 | 66.9 | 433.7 | 0.85 (0.74-0.94) |
Finland, follow-up year 3-4 | 93.4 | 483.6 | 0.80 (0.66-0.90) |
Finland, age group 55-59 | 37.9 | 249.2 | 0.85 (0.75-0.93) |
Finland, age group 60-64 | 67.7 | 505.9 | 0.87 (0.77-0.94) |
Finland, age group 65-69 | 76.7 | 746.3 | 0.90 (0.82-0.96) |
The Netherlands, overall | 26.3 | 394.7 | 0.93 (0.90-0.96) |
The Netherlands, follow-up year 0-1 | 8.0 | 260.8 | 0.97 (0.89-1.00) |
The Netherlands, follow-up year 1-2 | 0.0 | 311.0 | 1.00 (ND) |
The Netherlands, follow-up year 2-3 | 32.5 | 274.3 | 0.88 (0.73-0.98) |
The Netherlands, follow-up year 3-4 | 16.5 | 424.8 | 0.96 (0.90-1.00) |
The Netherlands, age group 55-59 | 4.7 | 172.6 | 0.97 (0.91-1.00) |
The Netherlands, age group 60-64 | 18.8 | 296.0 | 0.94 (0.85-1.00) |
The Netherlands, age group 65-69 | 24.4 | 532.8 | 0.95 (0.89-1.00) |
Sweden, overall | 30.3 | 227.7 | 0.87 (0.62-1.00) |
Sweden, follow-up year 0-1 | 0.0 | 280.4 | 1.00 (ND) |
Sweden, follow-up year 1-2 | 62.9 | 174.1 | 0.64 (0.19-1.00) |
Sweden, age group 55-59 | 16.5 | 176.7 | 0.90 (0.64-1.00) |
Sweden, age group 60-64 | 19.9 | 473.5 | 0.96 (0.85-1.00) |
Sweden, age group 65-69 | 251.5 | 885.2 | 0.74 (0.39-0.95) |
Abbreviation: ND, not defined.
* Men with a serum PSA of <3 ng/mL (excluding men with suspect DRE or TRUS in the Netherlands).
Test sensitivity in the first screening round was estimated as 0.87 [95% confidence interval (95% CI), 0.83-0.92] in Finland, 0.93 (95% CI, 0.90-0.96) in the Netherlands, and 0.87 (95% CI, 0.62-1.00) in Sweden (Table 3). Sensitivity tended to decrease with time since screening (duration of follow-up). Test sensitivity was slightly lower for the oldest age group in the Netherlands and Sweden, but no differences by age were found in Finland. When the first two screening rounds in Sweden were combined to obtain a 4-year follow-up, the sensitivity was 0.88 (95% CI, 0.77-0.96).
Incidence of aggressive cancer (stage T3-4, N1, or M1, and/or Gleason sum 8-10) was substantially lower than the overall incidence. Of the interval cancers after the first screening, only eight in Finland, two in the Netherlands, and one in Sweden showed such characteristics. The corresponding incidence rates were 14 per 100,000 in Finland, 3.5 per 100,000 in the Netherlands, and 15 per 100,000 in Sweden. Sensitivity for the first screen based on aggressive cancers only (among both test-negative men and in the control arm) was 0.90 (95% CI, 0.81-0.96) in Finland and 0.98 (95% CI, 0.94-1.00) in the Netherlands. In Sweden, sensitivity was calculated based on aggressive cancers during the first two screening rounds due to the small numbers of cases and it was estimated as 0.79 (95% CI, 0.48-1.00).
Adjustment for selection bias due to incomplete participation did not substantially affect the results: the corrected sensitivity in Finland was 0.89 versus uncorrected 0.87, in Sweden, it was 0.90 versus 0.87, and the Dutch estimate was unaffected (Table 4). The slightly increased sensitivity in Finland and Sweden was due to a low cancer incidence among nonparticipants.
Center | Participation (1 − P) | Rate ratio* (RR) | Incidence in control arm (IC) | Incidence among screen-negative men (II)† | Corrected sensitivity‡ |
---|---|---|---|---|---|
Finland | 0.65 | 0.84 | 425 | 53 | 0.89 |
The Netherlands | 0.94 | 0.83 | 395 | 26 | 0.93 |
Sweden | 0.59 | 0.59 | 228 | 30 | 0.90 |
* Incidence density rate among nonparticipants relative to the control arm.
† Men with a serum PSA of <3 ng/mL (excluding men with suspect DRE or TRUS in the Netherlands).
‡ Calculated using the formula S = [IC − P × RR × IC − (1 − P) × II] / (IC − P × RR × IC), where P is the proportion of nonparticipants (ref. 15).
The point estimates for test sensitivity were higher at the second compared with the first screening round in each country except Finland, although the CIs overlap. The sensitivity for the second screening interval was 0.85 (95% CI, 0.80-0.91) in Finland and 0.95 (95% CI, 0.91-0.98) in the Netherlands (Table 5). For the Swedish third and fourth rounds combined, the sensitivity was 0.94 (95% CI, 0.88-0.99). Sensitivity based on the incidence of aggressive cancers alone following the second round was 0.92 (95% CI, 0.84-0.98) in Finland and 0.96 (95% CI, 0.90-1.00) in the Netherlands. In Sweden, there were no aggressive interval cancers among screen-negative men in the subsequent intervals (following the third screening round).
Center | Screen-negative men*: cases† | Screen-negative men*: person-years | Screen-negative men*: incidence | Control group: cases | Control group: person-years | Control group: incidence | Sensitivity (95% CI) |
---|---|---|---|---|---|---|---|
Finland | 30 | 45,422 | 66 | 669 | 147,676 | 453 | 0.85 (0.80-0.91) |
The Netherlands | 8 | 27,869 | 29 | 303 | 51,969 | 583 | 0.95 (0.91-0.98) |
Sweden‡ | 4 | 11,472 | 35 | 218 | 35,616 | 612 | 0.94 (0.88-0.99) |
* Men with a serum PSA of <3 ng/mL (excluding men with suspect DRE or TRUS in the Netherlands).
† Numbers of aggressive interval cancers (stage T3-4, or M1, and/or Gleason sum 8-10): Finland 6, the Netherlands 2, Sweden 2.
‡ Based on two consecutive 2-y screening intervals (total of 4 y).
Discussion
Test sensitivity is an indicator of test performance and a high sensitivity is a prerequisite for effective screening. Sensitivity of the screening test can be measured using the incidence of interval cancers following a negative test. In our analysis, the test sensitivities of serum PSA determinations were roughly comparable for the three countries within the European Randomised Study of Screening for Prostate Cancer trial, with overall estimates of ∼85%. There was some decline during the screening interval, but the overall estimates are reasonably high, which suggests that the interval was not too long, given the screening protocol. The findings suggest a higher sensitivity for aggressive cancers and for the second screening round.
No reports of test sensitivity following the second round of screening have been previously published, but some estimates of test sensitivity after one screening round have appeared. Compared with an earlier report, the results reported here for Finland are based on longer follow-up and a larger number of cancer cases, but are consistent, showing test sensitivity of 0.89 for a PSA of <3 ng/mL in the earlier analysis and 0.83 overall (16). A previous Dutch article reported a total of 25 interval cancers among screened men, including screen-positive men (7). Based on incidence among all screened men, episode sensitivity was estimated as 80%, but no test sensitivity was calculated. A meta-analysis combining studies with different definitions and methods of analysis suggested a sensitivity of 72% for PSA (5). This was, however, based on relating the number of men with elevated PSA to the number of positive biopsies and is therefore not comparable to our approach.
Our results are based on the three largest centers of the European randomized screening trial. The strengths of the study include randomized design, large size, and population-based cancer incidence data. All centers used a shared core protocol but with some variation in procedures. For instance, the screening interval was 2 years in Sweden, but 4 years in Finland and the Netherlands. Furthermore, the PSA cutoff level was 3 ng/mL in Sweden, whereas in Finland and the Netherlands, a higher threshold was used (4 ng/mL, with an ancillary test for men with PSA 3.0-3.9 ng/mL). However, to obtain comparable figures, only men with a PSA of <3 ng/mL were included in the analysis of sensitivity from all centers. Yet, the results of analyses using only men with a PSA of <3 ng/mL were practically similar to those also including men with a PSA of 3.0 to 3.9 ng/mL (e.g., test sensitivity 0.84 versus 0.87 for the first round in Finland and 0.91 versus 0.93 in the Netherlands). As the screening protocol in the Netherlands involved additional tests (DRE and TRUS) for men with a PSA of <3 ng/mL (all in 1993-1995, those with PSA >1 ng/mL in 1995-1997), the Dutch result does not exclusively reflect the properties of the PSA test, but the combination of tests. This may have resulted in slight overestimation of the test sensitivity in the Netherlands.
Whether the results from the three largest European Randomised Study of Screening for Prostate Cancer centers are applicable to the entire trial is uncertain. The rationale for restricting the current analysis to the three largest centers was statistical power. The incidence of interval cancer among screen-negative men is low and therefore the results tend to be very imprecise for smaller sample sizes (and in populations with lower background incidence rates). The analysis was also limited to men ages 55 to 69 years at entry, and the results are not directly applicable to other age groups.
Less than a fifth of all prostate cancers during the initial follow-up period showed features predictive of a poor outcome (T3-4, N1, or M1, and/or Gleason sum 8-10). These are likely to be the cancers that will be most important in terms of prostate cancer mortality. Therefore, it is notable that the sensitivity tended to be higher in Finland and the Netherlands when the analysis was restricted to such cases.
The indications of improved sensitivity after two screening rounds suggest that men with confirmed low PSA level at two measurements 2 to 4 years apart represent a low-risk population, i.e., the validity of a repeated PSA determination is better than that of a single measurement. The difference in sensitivity between all cancers and aggressive tumors only was not apparent following the second screening round. This may be due to the small number and the lower proportion of such cases after the initial 4 years of follow-up.
As expected, the incidence of prostate cancer in the control arm was reasonably similar to the national rates in the centers with population-based control groups (Finland and Sweden). In Finland, the mean incidence in the 55- to 69-year age group in 2000 to 2004 was 450 per 100,000, whereas the corresponding figure in the control arm of the trial was 452 per 100,000. For Sweden, the national rate for the 55- to 64-year age group was 344 per 100,000, whereas that in the control group was 318 per 100,000. In the Netherlands, volunteers were recruited for the screening trial, and they were older than in the two other countries. Consequently, they had higher incidence rates (crude rate, 403 per 100,000 in the first 4 years), despite lower age-specific and age-standardized incidence in the entire population compared with the two Nordic countries (17). Yet, in age-specific comparisons, the Finnish control group had the highest rates, with comparable incidence for the Swedish and Dutch control populations. In Sweden, the cancer incidence in the control arm was low in the first 2 years of follow-up and increased considerably thereafter, reaching rates comparable to both the national figures and those in the other centers.
Sensitivity can be estimated using a variety of methods. A randomized trial allows the use of the incidence method (18), with comparison of incidence in the screening arm with that in an unscreened reference population (providing an estimate of incidence in the absence of screening; ref. 19). The principal outcome in such an analysis is the reduction in disease incidence following screening. The rationale is that due to randomization, the disease risk at baseline is identical in both arms. The screening test is used to distinguish a high-risk group (screen positive) and the rest should therefore represent subjects at lower than average risk of disease. As the incidence rate in the control arm provides the hypothetical incidence in the absence of screening, it can be used to estimate how well the negative test result has identified a low-risk group. Because the incidence among screen-negative and screen-positive men is complementary (in the sense that the overall incidence without screening would be identical to that in the control arm), it follows that the lower the incidence among screen-negative men, the higher the incidence would have been among the screen-positive men. When attendance is incomplete, this method requires information on nonparticipants in order to correct for selective participation (as we have done here).
Several different definitions of sensitivity, more precisely of test sensitivity and episode sensitivity, have been proposed. Hakama and coworkers used test sensitivity to refer to incidence among men with a positive screening test, but negative diagnostic confirmation (15), whereas a definition and analysis similar to the present one have been used in some other analyses (e.g., ref. 16).
The incidence method has several advantages compared with the cross-sectional approach (prevalence method), in which the yield of one test is contrasted with another and maximal yield assumed to represent perfect sensitivity. The incidence method is also superior to the detection method, which uses the prevalence/incidence ratio as the indicator of sensitivity, based on a longitudinal analysis without a control group. In the detection method, sensitivity is estimated as the proportion of screen-detected cases out of the total cases (screen-detected and interval cancers). First, the incidence method avoids the inflation of sensitivity due to overdiagnosis, as the screen-detected cases are not included in the assessment. Second, it does not include the assumption that a high detection rate is equivalent to high sensitivity, i.e., low subsequent risk of disease.
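The effect of overdiagnosis on the two approaches can be illustrated with a small Python sketch. The screen-detected and interval case counts below are hypothetical and chosen only for illustration; they are not trial data. The incidence-method inputs are the rounded first-round Finnish rates from Table 3, and the function names are illustrative.

```python
def detection_method_sensitivity(screen_detected, interval_cancers):
    """Detection method: screen-detected cases as a share of all cases (screen-detected plus interval)."""
    return screen_detected / (screen_detected + interval_cancers)


def incidence_method_sensitivity(interval_rate, control_rate):
    """Incidence method: S = 1 - I_I / I_C; screen-detected cases do not enter."""
    return 1.0 - interval_rate / control_rate


# Hypothetical counts (not trial data).
progressive_detected = 500   # screen-detected cases that would have surfaced clinically
overdiagnosed = 200          # screen-detected cases that would never have surfaced
interval_cancers = 60

without_overdiagnosis = detection_method_sensitivity(progressive_detected, interval_cancers)
with_overdiagnosis = detection_method_sensitivity(progressive_detected + overdiagnosed,
                                                  interval_cancers)
print(f"Detection method without overdiagnosis: {without_overdiagnosis:.3f}")  # 0.893
print(f"Detection method with overdiagnosis:    {with_overdiagnosis:.3f}")     # 0.921

# The incidence method depends only on the two incidence rates (per 100,000 person-years).
incidence_estimate = incidence_method_sensitivity(53.0, 425.0)
print(f"Incidence method: {incidence_estimate:.3f}")  # 0.875
```

In this toy setting, adding overdiagnosed cases raises the detection-method estimate although no additional progressive cancers were found, whereas the incidence-method estimate is unchanged because it uses only interval and control-arm incidence.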
Interval cancer incidence reflects test sensitivity and its evaluation requires accurate identification of interval cases. Interval cases comprise both those cancers that would have been detectable at screening (i.e., were already in the detectable preclinical phase) and those that were not, but reached that stage only after screening. Properties of the test affect only the frequency of cases that would have been potentially detectable at the time of screening, but were not identified, i.e., the false-negative screening results (test-negative but disease-positive subjects). Both the natural course of the target disease and the screening protocol (length of interval) affect the frequency of de novo disease arising after the screen. Such de novo cases therefore do not constitute a failure of the screening test. Yet, distinguishing them from missed cases is difficult or even impossible. Given the slow growth rate and long lead-time of prostate cancer (20, 21), the effect of de novo cases is likely to be small in this analysis.
Valid estimation of sensitivity requires comparability of populations. This is achieved by means of randomization for comparisons between the arms (intention to screen analyses). However, we include here only the screened men in the intervention arm. Compliance in the Dutch volunteer-based trial was very high (94%) and noncompliance is unlikely to affect the results. Yet, in the population-based Finnish and Swedish studies, a meaningful proportion of men failed to comply, even if the participation was relatively high (69% in Finland and 59% in Sweden). If the risk of prostate cancer differs between participants and nonparticipants, a selection bias is introduced. In both the Finnish and the Swedish trials, prostate cancer incidence among nonparticipants was slightly below the rate in the control arm (16, 22). However, our adjusted estimates take this into account.
Another aspect is the comparability of the age distribution. Within a center, this was ensured by randomization. This means that the test sensitivity estimates are comparable across centers, provided that the sensitivity does not vary by age. Even if there was some variability in test sensitivity by age, mainly in the Swedish study, comparisons within the age group also confirmed the overall results.
Another requirement for valid assessment of sensitivity is the comparability of information. First, the probability of diagnosis given histologic evidence of prostate cancer should be similar among screen-negative men and the control arm. It is difficult to assess whether the frequency of opportunistic PSA testing in the control arm differs from that among the screen-negative men (excluding the screening tests). In a Dutch study (23), PSA tests were more frequent in the control arm than in the screening arm, but the effect on cancer detection was small due to a low rate of prostate biopsies. Furthermore, in the screening trial, sextant biopsies were used, but it is possible that a larger number of cores was taken in contexts other than screening. However, as the comparison here is based on interval cancers versus control arm cancers, diagnostic procedures are not based on the screening protocol and the procedures are therefore not likely to differ between the arms. Second, incidence rates should be based on comprehensive information on cases. In this study, we used well-established, population-based cancer registries that have high enough completeness to merit inclusion in the Cancer Incidence in Five Continents monograph (14). They all use multiple sources of information and partly computerized reporting to improve coverage. The Finnish Cancer Registry has been estimated to reach at least 99% completeness for solid cancers (24). A recent study indicated an overall completeness of 96% for the Swedish Cancer Registry (25). There was practically no loss to follow-up, owing to high-quality data on vital status and emigration obtained from population registers.
In conclusion, determination of serum PSA has a sensitivity of 83% to 90% in the first round and close to 95% in the second round in the major centers of the European Randomised Study of Screening for Prostate Cancer. Test sensitivity showed only modest variability related to screening protocol, such as ancillary screening tests or screening interval. A tendency towards higher sensitivity for aggressive cancers and after the second screening round is also encouraging for the potential effectiveness of prostate cancer screening.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: The European Randomised Study of Screening for Prostate Cancer has received financial support from the European Union and Beckman Coulter (formerly Beckmann-Hybritech). Individual centers have also been supported by the Academy of Finland (grant no. 123054) and Finnish Cancer Organisations (Finland); the Dutch Cancer Society and the Netherlands Organization for Health Research and Development ZonMW (the Netherlands); and Cancerfonden (Sweden).
Acknowledgments
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.