Exposure Definition in Case–Control Studies of Cervical Cancer Screening: A Systematic Literature Review

Abstract
The first step in evaluating the effectiveness of cervical screening is defining exposure to screening. Our aim was to describe the spectrum of screening exposure definitions used in studies of the effectiveness of cervical screening. This systematic review included case-control studies in a population-based screening setting. Outcome was incidence of cervical cancer. Three electronic databases were searched from January 1, 2012 to December 6, 2018. Articles prior to 2012 were identified from a previous review. The qualitative synthesis focused on describing screening exposure definitions reported in the literature and the methodologic differences that could have an impact on the association between screening and cervical cancer. Forty-one case–control studies were included. Six screening exposure definitions were identified. Cervical cancer risk on average decreased by 66% when screening exposure was defined as ever tested, by 77% by time since last negative test, and by 79% after two or more previous tests. Methodologic differences included composition of the reference group and whether diagnostic and/or symptomatic tests were excluded from the analysis. Consensus guidelines to standardize exposure definitions are needed to ensure evaluations of cervical cancer screening can accurately measure the impact of transitioning from cytology to human papillomavirus–based screening and to allow comparisons between programs.


Introduction
Cervical cancer screening has the potential to detect both asymptomatic cancers and precancerous lesions, enabling the reduction of cervical cancer mortality and incidence. The case-control study design is efficient when the outcome is rare and/or primary data collection is needed, providing a feasible approach to quantify the benefit of screening in reducing cervical cancer incidence at a population level. Hence this study design is often used for evaluations or audits of cancer screening programmes. Many case-control studies of cervical screening rely on medical records rather than questionnaires to ascertain screening exposure but have characterized screening exposure in different ways. This variation in screening exposure definition, combined with other methodologic considerations, may lead to different estimates of the benefit of cervical cancer screening in these case-control studies.
The detectable preclinical phase (DPP; ref. 1) refers to the period beginning at the time when a cancerous or precancerous lesion is detectable by screening and ending with the onset of clinical signs or symptoms of invasive cancer (Supplementary Fig. S1). Ideally, case-control studies evaluating the association between screening and cervical cancer incidence should aim to compare screening histories of cases and controls during the subperiod of the DPP in which only precancerous lesions are present. It is only during the precancerous phase that screening can lead to the detection and treatment of lesions to prevent cancer (2). The stages of cervical carcinogenesis are relatively well understood, making it possible to estimate the average DPP duration (3). However, interindividual variation in the DPP duration, or misspecification of the DPP duration or its precancerous phase, are potential sources of bias in case-control studies of the effectiveness of screening for cancer prevention (4,5).
For a sensitive screening modality, women diagnosed with cervical cancer will be far less likely than their controls to have a screening test performed during the precancerous phase of the DPP; had they had a test during this period, the precancerous lesion could have been treated and the cancer could have been prevented. A screening test during the occult invasive phase of the DPP will not prevent the cancer but is likely to lead to detection at an early stage, thereby potentially reducing morbidity and mortality (6).
Valid case-control studies of screening aim to ignore non-screening tests in analyses. In practice it can be challenging to accurately determine test indication, as information may not be accurate (e.g., self-reported test indication from interviews or questionnaires) or available (e.g., limited data from administrative claims or screening databases). Knowledge of the time from test to cancer detection can help to infer whether or not it was a screening test.
There have been several previous efforts to summarize the association between screening and cervical cancer incidence. A 2005 International Agency for Research on Cancer (IARC) review of cervical cancer screening summarized the association between "ever" having been screened and the risk of cervical cancer (7). In 2013, Peirson and colleagues (8) reviewed literature published between April 1995 and April 2012 in order to assess the association between screening and risk of cervical cancer incidence and mortality. They also examined associations with varying screening intervals and the age at which screening began and ended. A third review by Meggiolaro and colleagues (9) in 2016 aimed to quantify the association for cytology screening and identify potential sources of heterogeneity. Both the Peirson and Meggiolaro reviews reported high levels of heterogeneity between included studies. Meggiolaro and colleagues stratified results by study quality, cervical cancer histology, and calendar year of screening in order to explain the observed heterogeneity. None of these previous reviews considered screening exposure definitions and differences across them.
We conducted a systematic review of case-control studies evaluating the effectiveness of screening to reduce cervical cancer incidence in order to classify the spectrum of screening exposure definitions used in these studies. Our review updates literature from the prior reviews by including publications from January 2012 to December 2018. Our goal was to better understand the implications of various screening exposure definitions on results across these case-control studies to better inform screening evaluations and audits of screening programs designed to quantify the impact of screening on cervical cancer prevention.

Materials and Methods
This systematic review was undertaken and reported in adherence to the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. A protocol for this systematic review was developed prior to conducting the searches (Supplementary Table S1).
In addition to considering all studies from the IARC (7), Peirson (8), and Meggiolaro (9) reviews for inclusion, we used broad search criteria similar to those used by Peirson and colleagues (8) to search PubMed Central, Ovid MEDLINE and Embase (Ovid) databases for additional studies published from January 1, 2012 to December 6, 2018. Our search included the following terms (and their derivatives): cervix uteri, cervical intraepithelial neoplasia, Papillomavirus infection, Papanicolaou (Pap/smear), screening and early detection. The present search differed from the Peirson protocol only in the exclusion of literature in languages other than English and limitation of the search from January 1, 2012 onward. In addition, reference lists from included manuscripts and from the three previous reviews were searched.
Two investigators (A. Castanon and A.W.W. Lim) independently reviewed titles and abstracts of identified articles for eligibility for full-text review. Disagreements were resolved by consensus between the two reviewers. Full-text articles of eligible abstracts were retrieved and also independently reviewed for inclusion by the same investigators. Studies were excluded if any study subjects were less than 15 years of age, if study subjects were not offered either conventional or liquid-based cervical cytology as a screening test, or if the study did not include a comparison group of women who had the opportunity to be screened but did not have cervical cancer. Study designs other than case-control and studies that examined an outcome other than cervical cancer incidence were excluded. Books, conference abstracts, narrative review articles, and articles without individual-level data on screening exposure were also excluded.
Included studies were quality-appraised with the Newcastle-Ottawa Scale (10). Information to assess the quality of evidence was abstracted into a pre-specified table in duplicate (by A. Castanon and A.W.W. Lim) from the primary methodology paper for each study. The two investigators extracting these data were blind to each other's quality ratings, and rating disagreements were resolved by consensus or consultation with a third investigator (P. Sasieni).
Characteristics of included studies were abstracted into structured tables and included: lead author, publication year, country of study setting, age range of cases and controls, diagnosis years for cases, number of cases, FIGO stage and/or histological type of cases, screening organization (opportunistic or following an invitation), data sources for outcome and exposure assessment, control eligibility criteria, screening exposure measure, screening intervals studied, and measures of association.
Measures of association from included studies were most frequently reported as odds ratios (OR) for screening relative to no screening, but occasionally as relative protection (i.e., the reciprocal of the OR). Results originally reported as relative protection (RP) were converted to odds ratios by division (1/RP). When the reference group was the most frequently screened group, all ORs were divided by the OR in the least screened group. Therefore, the OR of 1.00 in the "never screened" group has an associated confidence interval (CI), and no CI is associated with the OR for the most screened group. When studies did not report ORs, the Altman method (11) was used to calculate ORs and associated standard errors and 95% CIs when sufficient information was available within the manuscripts.
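The conversions described above can be illustrated with a short Python sketch. The function names and the example numbers are hypothetical and are not taken from any included study. The 2x2 calculation uses the standard log-odds approximation for the standard error; whether this matches the cited Altman method (11) in every detail is an assumption.

```python
import math

def rp_to_or(rp):
    """Relative protection (RP) is the reciprocal of the odds ratio (OR)."""
    return 1.0 / rp

def rereference(ors, new_ref):
    """Re-express ORs relative to category `new_ref` by dividing every OR
    by the OR of that category (e.g., the least screened group)."""
    base = ors[new_ref]
    return {group: value / base for group, value in ors.items()}

def or_from_2x2(a, b, c, d, z=1.96):
    """OR, standard error of log(OR), and approximate 95% CI from counts.
    a: screened cases, b: unscreened cases,
    c: screened controls, d: unscreened controls (all must be > 0)."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, se_log_or, (lower, upper)

def percent_reduction(odds_ratio):
    """Percentage reduction in odds implied by an OR below 1."""
    return (1.0 - odds_ratio) * 100.0

# Hypothetical study that used the most screened group as the reference.
reported = {"never screened": 4.0, "screened once": 2.0, "regularly screened": 1.0}
converted = rereference(reported, "never screened")
print(converted)                      # ORs now relative to "never screened"
print(percent_reduction(rp_to_or(4.0)))
print(or_from_2x2(20, 80, 50, 50))    # hypothetical counts
```

After re-referencing, the "never screened" group has an OR of 1.00 by construction, which is why the confidence interval reported by such a study attaches to that group rather than to the most screened group.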
The qualitative synthesis was carried out using a framework comprised of two elements: (i) description of the measures of screening exposure reported in the literature, and (ii) description of methodological differences which could have an impact on the association between screening and cervical cancer among studies reporting the same screening exposure.

Results
The database searches identified 1,384 records in PubMed, 2,553 in Ovid MEDLINE and 2,514 in Embase (Ovid) for a total of 6,451 citations (Fig. 1). In addition, 41 citations were identified by searching previous systematic reviews and their reference lists. After 1,701 duplicates were removed, a total of 4,750 records remained. Title and abstract review excluded 4,714 records because they did not meet study inclusion criteria. Full text was reviewed for 36 manuscripts, of which eight had a cervical mortality endpoint, three were narrative reviews, and 20 were not case-control studies. Of the 41 manuscripts identified from previous reviews, full text was unavailable for four (12-15) and one was not in English (16). The included manuscripts comprised four published from 2012 to 2018 that were identified through the database searches, 36 from the IARC, Peirson, and Meggiolaro reviews, and one from reference list searches. Thus, 41 manuscripts were included, likely representing 33 unique studies. Manuscripts by Sasieni and colleagues (n = 4; refs. 17-20) and Castanon and colleagues (n = 2; refs. 21, 22) are from the same case-control study, and the manuscripts by Celetano and colleagues (23) and Klassen and colleagues (24) appear to report on the same population, but this cannot be confirmed from the publications.
Supplementary Table S2 lists the key characteristics of included manuscripts. Studies included cases diagnosed from 1959 to 2014 from 17 different countries. Most studies included a broad age range of women, and two studies focused exclusively on women 55 and older (22,25). Twenty-two studies were conducted in non-invitational screening settings, seventeen studies were conducted in invitational settings, and two studies were conducted in settings with both non-invitational and invitational screening. Eight studies restricted analyses to International Federation of Gynecology and Obstetrics (FIGO) stage 1B or worse cervical cancers and five reported histology-specific results. Six studies did not specify any matching criteria. Sixteen studies obtained screening information through interviews, four through both interviews and medical/screening records, and twenty-one through medical/screening records or databases.
Full details on the risk of bias for included studies are shown in the supplementary material (Supplementary Table S3). The risk of bias varied considerably between studies but did not seem to be influenced by the number of cervical cancers included (Supplementary Table S2).
The case-control studies with the least likelihood of bias were those in which cancers were identified through hospital pathology databases or cancer registries and in which screening histories for both cases and controls were extracted from electronic databases, screening registries, or medical records. Similarly, studies in which controls were identified through electronic databases (allowing the inclusion of all, or a sample of all, eligible women) rather than through population lists (which require the investigator to contact participants for an interview or provide a questionnaire to obtain information) were the least susceptible to bias. Six of the included studies had Newcastle-Ottawa scores of less than six and are therefore particularly susceptible to bias (24, 26-30). Eleven studies had low risk of bias (Newcastle-Ottawa scores of eight or nine) and the remaining 24 studies had a moderate risk of bias.
We identified six definitions of screening exposure among the manuscripts included in this review: "ever having a test", "time since last negative test", "number of tests", "maximum interval between tests", "screening history", and "screening in a three-year age band and risk of cancer over a five-year period". Several manuscripts reported more than one definition of screening exposure. Since most studies attempted to exclude diagnostic tests from analyses, we use the terms "screening" and "testing" interchangeably throughout this manuscript except within the tables where the distinction is important when interpreting results.

Ever having a test
Twenty-nine of the included studies examined the risk of cervical cancer associated with ever having a test during various look-back windows prior to cases' diagnosis date or corresponding reference date for controls (Table 1). Supplementary Figure S2 illustrates the look-back window during which screening history is ascertained when measuring 'ever having a test'.
Most studies that considered screening exposure as 'ever having a test' (n = 25) examined look-back windows of less than 10 years, two studies considered 10-year look-back windows, and two studies only examined lifetime screening exposure. Most studies excluded tests within 1 month (n = 1), 6 months (n = 10), or 12 months (n = 9) prior to the diagnosis/reference date. Few explicitly acknowledged whether these exclusions were intended to eliminate symptomatic tests and/or to exclude tests taken during the occult invasive DPP. Of the studies that did not explicitly specify an exposure exclusion period, three studies excluded all self-reported diagnostic tests and six did not exclude any tests prior to diagnosis/reference date. Olesen and colleagues (31) and Van der Graff and colleagues (32) were the only studies to exclude both tests within 6 and 12 months of diagnosis/reference date, respectively, and any tests taken in response to symptoms during the precancerous phase of the DPP. The impact of including symptomatic tests and those occurring during the occult invasive DPP is demonstrated by results from Kasinpila and colleagues (33). In this study, the risk of cervical cancer was 53% higher among those tested 6-11 months prior to the diagnosis/reference date and 84% higher among those tested within 6 months, although neither increase was statistically significant. However, a 73% reduction in cervical cancer risk was seen when the test occurred 1-3 years prior to diagnosis.
Using a 7-year look-back window, Kamineni and colleagues (25) were the only investigators to perform sensitivity analyses using various estimates of the occult invasive DPP (≤6 months, ≤12 months, 18 months, and 24 months); they examined testing only during the corresponding estimates of the precancerous phase of the DPP (6.5, 6.0, 5.5, and 5 years). Their results were robust to these estimates of the occult invasive DPP.
Other studies use the exposure exclusion period to remove symptomatic tests from analysis and attempt to account for the effect of tests taken during the occult invasive DPP either by restricting analysis to cancers of FIGO stage 1B or worse (17,18), which are less likely to be screen-detected, or by presenting results by FIGO stage at diagnosis (21).
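The correspondence between the assumed occult invasive duration and the remaining precancerous window can be made explicit with a small Python sketch. It assumes, as a simplification, that the precancerous window is the look-back window minus the assumed occult invasive period; the function names are hypothetical and the sketch is not a reproduction of the analysis in (25).

```python
def precancerous_windows(lookback_months, occult_invasive_months):
    """Map each assumed occult invasive duration (months) to the length of
    the remaining precancerous window within the look-back period."""
    return {occult: lookback_months - occult for occult in occult_invasive_months}

def in_precancerous_window(months_before_dx, lookback_months, occult_months):
    """True if a test taken `months_before_dx` months before the
    diagnosis/reference date falls inside the look-back window but
    outside the assumed occult invasive period."""
    return occult_months < months_before_dx <= lookback_months

# 7-year look-back with the four occult invasive estimates from the text.
windows = precancerous_windows(84, [6, 12, 18, 24])
for occult, window in windows.items():
    print(f"occult invasive {occult} months -> precancerous window {window / 12} years")
```

Under this simplification, occult invasive estimates of 6, 12, 18, and 24 months leave precancerous windows of 6.5, 6.0, 5.5, and 5 years.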
In studies that reported results by various look-back windows, we noted that the reference group differed across studies. Some reference groups only included women who had never been screened or tested during the look-back window, while others also included those screened or tested prior to the look-back window.
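To make the two reference-group compositions concrete, the following Python sketch classifies a woman's testing history under the 'ever having a test' definition. It is only an illustration: the category labels, the function names, and the handling of boundary values (a test exactly at the exclusion cut-off is treated as excluded) are assumptions, not rules drawn from the included studies.

```python
def classify_ever_tested(tests_months_before_dx, lookback_months, exclusion_months):
    """Classify a testing history given tests expressed as months before the
    diagnosis/reference date. Tests inside the exclusion period are dropped."""
    eligible = [t for t in tests_months_before_dx if t > exclusion_months]
    if any(t <= lookback_months for t in eligible):
        return "tested_in_window"
    if eligible:
        return "tested_before_window_only"
    return "never_tested"

def in_reference_group(category, never_tested_only):
    """Reference-group membership under the two compositions seen in the
    literature: never-tested women only, or never-tested women plus those
    tested only before the look-back window."""
    if never_tested_only:
        return category == "never_tested"
    return category in ("never_tested", "tested_before_window_only")

# Hypothetical histories, 5-year look-back, 6-month exclusion period.
for history in ([3, 30], [3], [90]):
    category = classify_ever_tested(history, lookback_months=60, exclusion_months=6)
    print(history, category,
          in_reference_group(category, never_tested_only=True),
          in_reference_group(category, never_tested_only=False))
```

A woman whose only test was taken before the look-back window is in the reference group under one composition and excluded from it under the other, which is one way the same data can yield different estimates.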
Among the 11 studies with look-back windows of 10 or more years, the average reduction in risk associated with screening was 67% (range 57-85%). Among the 24 studies reporting results for having a test during the 2 to 6 years prior to diagnosis/reference date, the average reduction in risk associated with screening was 66% (range 11-95%).

Time since last negative test
Fourteen studies reported results for time since last negative test (Table 2). A negative test will not have prevented cancer but will identify those at a period of lower risk than their unscreened counterparts. Studying how this risk increases over time can inform appropriate screening intervals.
There are a number of methodological issues that can introduce bias in this measure of screening exposure. Only negative tests after which no further action is recommended should be considered for this analysis. The manuscripts by IARC (34), Kamineni and colleagues (35), Andrae and colleagues (36), and all manuscripts by Sasieni and colleagues and Castanon and colleagues, included only negative tests that did not lead to further action. We note that the IARC study (34) is a pooled analysis of case-control studies with similar designs and a common measure of screening exposure.
Another methodological difference is the composition of the reference group. Some studies included those with no negative tests (n = 4) in the reference group, others only included women with no tests (n = 2), and others included women with tests outside the intervals under study (n = 3). Five other studies included both those with no tests and those with tests outside the intervals under study. The different composition of the reference groups can partly be attributed to the fact that the studies by Makino (37), Sobue (38), Ibanez (39), Macgregor (40), Mitchell (41), Yang (42) and Zhang (43) included screened women only.
Using the shortest time interval between testing and cancer diagnosis for each study, the average reduction in risk associated with a negative screen was 77% (range 59-95%). However, in all studies the observed reduction in risk became smaller as the time between the last negative test and the diagnosis/reference date increased.

Number of tests
Eight studies reported results by number of tests (Table 3). Five of these eight studies were agnostic to test results while three restricted analyses to negative tests. We note that, as in the previous exposure definition, the composition of the reference group differs between studies. Tests during the occult invasive DPP are not excluded in this definition of screening exposure because it does not refer to the time between the test and the diagnosis/reference date. Instead, this screening exposure definition evaluates whether the risk is the same after receipt of the first primary screening test as it is after a series of negative primary screening tests (i.e., a dose response). Triage and surveillance tests should be excluded.
All studies with this screening exposure definition observed decreasing risk of cervical cancer with increasing number of tests prior to diagnosis/reference date. The average reduction in risk associated with two or more tests was 79% (range 54-94%).

Other screening exposure definitions
The most common other definition of screening exposure was that which established how often women attended screening (Table 4). This definition is useful to assess how testing intensity changes the risk of cervical cancer. Four studies reported on the frequency at which women were screened (21,22,28,42). For example, Yang and colleagues (42) classified women depending on whether they only attended one of the last two screening rounds (i.e., not regularly screened) or whether they attended both of the last two screening rounds. This classification was done irrespective of test result. In contrast to 'number of tests', this measure does take into account the intervals at which tests were taken; hence all four studies excluded tests taken during the estimated occult invasive DPP before establishing how often women had attended screening. More frequent screening was associated with a lower risk of cervical cancer compared to irregular attendance.
A problem with screening exposure definitions looking at the time since last test is that women on annual or six-monthly follow-up or repeat schedules will have a short time since last test. Such simple classification is likely to falsely make screening appear less effective (since such women are at increased risk of cervical cancer).
Defining exposure as the "maximum interval between tests" (Supplementary Fig. S3) can address the issue raised above. For example, consider a look-back window of 6 years and a woman who had three tests six months apart, but whose previous test was five years prior to these; her maximum interval would be five years. No attention is given to her most recent interval (6 months) or her average interval (2 years). Had this woman's only test been the one 6 months prior to diagnosis, her interval would have been 5.5 years and she would have been grouped with those who had no tests. In this review two studies used this measure of screening exposure (19,22).
All previous screening exposure definitions have defined the age groups by the age of the cases and controls (i.e., age at diagnosis/reference date) rather than the age at screening. In order to specifically examine the benefit of screening within a specific age range (e.g., ages 20-24), a different approach is required. The final screening exposure definition identified in this review considered whether a woman had been screened in a narrow three-yearly age band (e.g., 40-42 years vs. not screened 40-44 years) and then looked at whether she developed cancer in the subsequent five years (e.g., 45-49 years; Supplementary Fig. S4). To date, this screening exposure definition has only been studied by Sasieni and colleagues (20).
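The "maximum interval between tests" definition described above can be sketched in Python. This is a minimal illustration under stated assumptions: tests are expressed as whole months before the diagnosis/reference date, and the gaps from the start of the look-back window to the first test and from the last test to the diagnosis/reference date both count as intervals. The function name is hypothetical, and the test dates are chosen only to reproduce the worked example in the text.

```python
def max_interval_months(tests_months_before_dx, lookback_months):
    """Longest gap (in months) without a test inside the look-back window.
    The window start and the diagnosis/reference date act as boundaries."""
    in_window = sorted(
        (t for t in tests_months_before_dx if 0 <= t <= lookback_months),
        reverse=True,
    )
    # Walk from the window start, through each test, to the diagnosis date.
    points = [lookback_months] + in_window + [0]
    return max(earlier - later for earlier, later in zip(points, points[1:]))

# Worked example: 6-year look-back; a test 5 years before three tests
# taken six months apart (assumed here at 12, 6 and 0 months).
print(max_interval_months([72, 12, 6, 0], 72))  # 60 months, i.e. 5 years
# Only test taken 6 months before diagnosis.
print(max_interval_months([6], 72))             # 66 months, i.e. 5.5 years
```

Because only the longest gap is used, a run of short-interval follow-up tests does not make a woman appear frequently screened if she previously went years without a test.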

Studies with more than one screening exposure definition
Studies using more than one screening exposure definition illustrate the differences in magnitude of observed associations based on the choice of exposure definition. Hoffman and colleagues (44) reported a 70% lower risk of cervical cancer among women ever tested but an 80% lower risk among those who had three or more tests.
Andrae and colleagues (36) reported a 52% lower risk among those tested as recommended compared with a 65% lower risk among those who tested negative. Sasieni and colleagues (18) observed an even greater difference: 66% lower risk 2.5-3.4 years after a test, but an 87% lower risk within 3 years of a negative test.
Manuscripts by IARC (34), La Vecchia and colleagues (45), Macgregor and colleagues (46), Mitchell and colleagues (47), Palli and colleagues (48) and Parazzini and colleagues (49,50) report results by time since last negative test and by number of tests. All studies except for Macgregor and colleagues observed a lower risk of cervical cancer after multiple tests than for time since last negative test. For example, Palli and colleagues reported a 66% reduction in risk of cervical cancer within 3 years of a negative test, but a 94% reduction following three or more tests.
Table note: all studies in this table except for Macgregor and Mitchell used interviews to establish screening history. When using interviews, it is not possible to ascertain the look-back window under study. Where screening databases were used and the period is not specified, the look-back period will be to the age at which screening is first offered from the date when the registry was created.

Discussion
Our review examined data from studies in 17 countries spanning 6 continents covering cervical cancer cases diagnosed from 1959 to 2014. They varied by how often screening was offered and under what conditions (invitational or not). Choice of screening exposure definition impacted the magnitude of observed screening benefit. Cervical cancer risk on average decreased by 66% when screening exposure was defined as ever tested, by 77% when exposure was defined as time since last negative test and by 79% after two or more previous tests. Within-study differences between estimates from different exposure definitions were even greater than between-study differences. Methodological differences that affect estimates of screening benefit within an exposure definition include the estimated duration of the occult invasive DPP and the choice of reference group.
A bridging search in PubMed covering January 2019 to April 2020 identified two additional manuscripts that met the inclusion criteria for this review (51,52). No new measures of screening exposure were reported, and results were in keeping with those presented in this review.
We compared results from different screening exposure definitions to illustrate their impact on the magnitude of the effect. Observed differences are largely due to differences in the underlying risk of cervical cancer among women included in each exposure definition. Some exposure definitions are evaluating the tests' ability to predict risk (i.e., after a negative test) and some the ability to reduce risk (i.e., participating in screening). In practice, risks from different exposure definitions should not be compared.
The population-level benefit of cervical cancer screening will depend on multiple factors including screening coverage, accuracy of the screening test, and quality of the follow-up for those testing positive (7,53), leading prior studies to conclude that these factors may be driving observed differences in study results (7,9). None of the previous reviews addressed screening exposure definition as a source of heterogeneity between studies. Although Peirson and colleagues (8) reported the screening exposure definition for each study, they did not stratify results by them.
The case-control design ensures that differences in screening coverage do not impact results. Defining exposure to screening as ever having a test provides the most straightforward estimate of benefit of at least one test when more specific screening history and/or test results are not available. However, study results will, to some extent, reflect the quality of the follow-up for positive tests.
Analyses focusing on negative tests will reflect the sensitivity of the screening test, allowing for estimation of appropriate screening intervals. Although knowledge of test result is needed when focusing on negative tests, assuming the test has a reasonable negative predictive value there is no need to exclude symptomatic or diagnostic tests for this exposure definition.
Other screening exposure definitions are less practical as they require data from more than one round of screening, which may not always be available. However, these screening exposure measures may be more desirable for mature screening programs because they can consider multiple primary tests prior to the diagnosis/reference date and allow estimation of the benefit of repeat testing. Measures such as number of tests (in particular negative tests) and regularity of testing can be useful when comparing results from different settings. Accounting for number of past tests aims to equalize the risk among individuals and it is also a way to standardize the difference in the testing accuracy between studies. The likelihood that three consecutive negative tests are all false negatives is much lower than the likelihood that a single negative test is a false negative (assuming sensitivity is independent between tests). Note that the advantage of two average-quality cytology tests over one could be significant, whereas the advantage of two human papillomavirus (HPV) primary tests over one may be small.
Most studies in this review excluded tests during a short period prior to diagnosis. This exclusion period was typically 6 or 12 months but varied from 1 to 24 months. Some studies indicated that the rationale was to exclude tests taken in response to symptoms, but few explicitly stated whether they (also) intended to exclude tests taken during the occult invasive DPP. The risk of cervical cancer associated with tests during the occult invasive DPP reflects the prevalence of screen-detected cancer. Determining the precise duration of the occult invasive phase is challenging and it will, of course, not be identical for every individual. In this review, only Kamineni and colleagues (25) actively focused on screening that occurred during the presumed precancerous period of the DPP. To isolate the precancerous phase, they estimated the duration of the occult invasive phase. As the duration of the occult invasive phase will be half as long, on average, in screen-detected cases as among cases whose cancer was detected as a result of symptoms, Kamineni and colleagues (25) analysed their data in two strata: (i) symptomatic cases and controls with no screening during the presumed occult invasive phase, and (ii) screen-detected cases and recently screened controls. They performed sensitivity analyses using various estimates of the occult invasive DPP, and results were robust to estimates of the occult invasive DPP of up to 2 years prior to diagnosis/reference date.
In contrast, Wang and colleagues (51) knew the majority of cervical cancer cases in their study had an abnormal test result within 6 months of diagnosis and a review of medical records found that tests within one month of diagnosis were likely to be performed because of symptoms. However, instead of analysing data in the manner of Kamineni and colleagues, they excluded, for both cases and controls, tests within 6 months of diagnosis and extended the exposure window by half a year so that analysis would reflect screening in the round prior to diagnosis. If screening has been only recently introduced in a population, if data for only one screening round are available, or if screening is not invitational (i.e., there is no 'round prior to diagnosis'), excluding tests taken during the occult invasive DPP will lead to a high proportion of cases being classed as 'never screened'. This may bias in favour of screening unless it is accounted for during the analysis by, for instance, using the analytic approach taken by Kamineni and colleagues (25).
The choice of reference group is a methodological consideration that has not been given attention in the literature. The observed screening benefit will be greater if the reference group includes only women who have never been screened as opposed to a reference group that also includes women whose last test was prior to the defined look-back window.
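The point about repeated negative tests can be illustrated with a small calculation. Under the independence assumption stated above, the probability that all of n negative tests are false negatives is (1 - sensitivity) raised to the power n. The sensitivities used below (0.60 for an average-quality cytology test and 0.95 for an HPV test) are purely illustrative assumptions, not estimates from this review.

```python
def prob_all_false_negative(sensitivity, n_tests):
    """Probability that n independent tests all miss existing disease,
    given a per-test sensitivity between 0 and 1."""
    return (1.0 - sensitivity) ** n_tests

# Illustrative (assumed) sensitivities; not taken from the included studies.
for label, sensitivity in (("cytology", 0.60), ("HPV", 0.95)):
    one = prob_all_false_negative(sensitivity, 1)
    two = prob_all_false_negative(sensitivity, 2)
    print(f"{label}: one test {one:.4f}, two tests {two:.4f}, "
          f"absolute gain from second test {one - two:.4f}")
```

With these assumed values, a second test removes far more missed disease in absolute terms for the less sensitive test than for the more sensitive one, which is the intuition behind the contrast between cytology and HPV testing drawn in the text.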
This review did not focus on evaluation of screening effectiveness by age or by histological type. There is evidence from Italy (54), the UK (18), and South Africa (44) that screening is less effective in young women, but this has not been consistently observed (36). There is also evidence that screening with cytology (but not HPV tests) is less sensitive for detecting adenocarcinoma of the cervix (21,51). It is likely that age and histological type will also be sources of heterogeneity when comparing results from different studies.
Many cervical cancer screening programs are transitioning from cytology-based screening to primary HPV screening. Routine evaluations of the effectiveness of primary HPV screening in preventing cervical cancer will be critical to ensure that benefits observed in randomized trials are borne out in practice. Additionally, larger cohorts of HPV-vaccinated women are becoming screen-eligible and the lower prevalence of cervical disease has already been shown to decrease the positive predictive value of screening (55).
As evidenced by the number of case-control studies in this review, this study design is largely accepted as an efficient way to quantify the benefits of cervical screening in a variety of settings. The exposure definitions identified in this review will be relevant irrespective of whether a screening program employs cytology alone, co-testing, primary HPV screening, or is transitioning to a new screening modality.
To ensure programs evaluate their progress towards cervical cancer elimination and can accurately measure the impact of transitioning to new screening modalities, case-control studies should be implemented alongside routine quality assurance measures to allow for routine evaluation of screening.
The next step is to establish international consensus for core screening exposure definitions to be used in case-control studies of screening effectiveness, similar to those established for effectiveness trials (56). This will enable the development of guidelines to standardize definitions and establish key scientific questions to be addressed under each exposure definition. Only then will consistent evaluation of cervical cancer screening programs and international comparisons be possible given the evolving cervical cancer prevention landscape.