Abstract
Background: Several options for the triage of high-risk HPV screen–positive (hrHPV+) women were assessed.
Methods: This study incorporated CIN2+ cases and controls, all of whom tested hrHPV+ and for whom results of liquid-based cytology (LBC), HPV16/18 genotyping, and p16/Ki-67 cytoimmunochemistry were available. The sensitivity and specificity of these triage tests for CIN2+ were evaluated.
Results: Absolute sensitivities of HPV 16/18 typing, LBC, and p16/Ki-67 cytoimmunochemistry for CIN2+ detection were 61.7%, 68.3%, and 85.0%, respectively, for women with hrHPV+ clinician-taken samples. Respective specificities were 70.5%, 89.1%, and 76.7%. The absolute accuracy of the triage tests was similar for women with an hrHPV+ self-sample. P16/Ki-67 cytoimmunochemistry was significantly more sensitive than LBC, although significantly less specific.
Conclusions: All three single-test triage options, if positive, exceed the threshold of 20% risk at which colposcopy would be indicated. However, none of them, if negative, conferred a post-test probability of CIN2+ of <2%, which would permit routine recall. A negative p16/Ki-67 cytoimmunochemistry result in HPV16/18-negative women conferred a post-test probability of CIN2+ of 1.7%, falling to 0.6% if LBC was also negative.
Impact: This is one of the few studies to directly compare the performance of triage strategies of hrHPV+ women, in isolation and in combination. It is the only study assessing triage strategies in women who test hrHPV+ in self-taken vaginal samples. A combined triage option that incorporated HPV 16/18 typing prior to p16/Ki-67 cytoimmunochemistry in HPV 16/18–negative women yielded a post-test probability of CIN2+ of >20% in women who tested positive, whereas women who tested negative had a probability of CIN2+ of <2%. Cancer Epidemiol Biomarkers Prev; 26(11); 1629–35. ©2017 AACR.
This article is featured in Highlights of This Issue, p. 1579
Introduction
Current evidence indicates that high-risk human papillomavirus (hrHPV) testing is more effective in reducing the burden of cervical intraepithelial neoplasia grade 3 (CIN3) and cervical cancer compared with cytology (1). Consequently, international guidelines support the approach of using HPV molecular testing as the optimal modality for primary cervical screening (2–4). Consensus around what the optimal “second line” triage test(s) for HPV-positive women should be is, however, less apparent. This is reflected in the variety of triage algorithms proposed across countries committed to the rollout of HPV-based primary screening (4, 5). The optimal triage of screen-positive women is one of the most important considerations in this new era of cervical screening given the high prevalence of transient HPV infection and the implications of unnecessary referrals for colposcopy. Performing cytology on hrHPV-positive women has been recommended as a triage strategy (6, 7). However, even in organized settings, the reliance on morphology and subjective skills may limit the overall effectiveness of this approach—particularly in immunized populations (8, 9).
In addition, data indicate that up to two-thirds of HPV-positive women will be cytology negative, thus creating a category of women who need further follow-up (7, 10). The ideal triage test would create a scenario where negative women could be returned to routine recall while positive women could be referred to colposcopy with robust cause.
Other triage options are HPV 16/18 genotyping and detection of the cyclin-dependent kinase inhibitor protein p16INK4a, either in isolation or in combination with Ki-67; the p16/Ki-67 dual-stained cytology is available commercially as CINtec PLUS (Roche Molecular Systems; refs. 10–15). The rationale for these approaches is based on the higher risk that HPV 16 and 18 confer for the development of CIN3+ compared with other hrHPV types and on the observation that markers of abnormal cell proliferation that are detectable through cytoimmunochemistry are indicative of potentially transforming infections (16, 17). There is evidence to indicate that the specificity and positive predictive value (PPV) of both these approaches exceed those of hrHPV testing when used to stratify the risk among women with minor abnormalities (18). A recent triage study conducted in the United States demonstrated higher specificity of p16/Ki-67 dual-stained cytology compared with conventionally stained cytology (where cytologists were aware of the HPV status) for the triage of HPV-positive women (13). However, few studies have compared p16/Ki-67 dual-stained cytology, HPV 16/18 typing, and cytology concurrently within the same population in the setting of primary screening. Moreover, there are few reports where combinations of the three triage approaches have been considered and none where the performance of p16/Ki-67 dual-stained cytology has been assessed as a triage of women who tested hrHPV positive in a self-taken screening sample.
Previously, we assessed the performance of hrHPV DNA testing using the cobas 4800 HPV assay on clinician-collected cervical samples and self-collected vaginal samples in more than 5,000 women recruited to the Papillomavirus Dumfries and Galloway (PaVDaG) study (19). This study was nested into the PaVDaG framework with the objective of comparing the accuracy of HPV16/18 genotyping, liquid-based cytology (LBC), p16/Ki-67 dual-stained cytology and combinations thereof, for the detection of CIN2+ in a sample of hrHPV-positive women in one round of screening. In addition, the accuracy of triage with p16/Ki-67 dual-stained cytology and other markers in women who were hrHPV positive on a self-collected vaginal sample was also assessed.
Material and Methods
Overview of PaVDaG cohort/study
A detailed description of the PaVDaG cross-sectional study can be found in Stanczuk and colleagues (19). Briefly, women aged 20 to 60 attending for routine cervical screening were invited to participate in the study and asked to provide a routine LBC sample taken by a clinician, a self-taken vaginal swab, and a random void urine specimen. HrHPV testing on all specimens was performed using the cobas 4800 PCR DNA test, which detects 14 hrHPV types and provides a readout that reports HPV 16 and HPV 18 separately. Cervical screening history and follow-up information were accessed through the national Scottish Cytology Call-Recall System (SCCRS). The British Society for Clinical Cytopathology reporting guidelines and CIN nomenclature were used to classify cytology findings and histologic outcomes, respectively (20, 21). Histology was performed as per NHS standard care; review histology was not done unless routinely indicated. Management of women with abnormal cytology results was performed according to national guidelines and complemented with additional diagnostic interventions for hrHPV-positive women who were cytology negative, as described previously (19–22).
Subset of PaVDaG cohort used to evaluate the diagnostic accuracy of tests
A diagnostic test accuracy study was set up that incorporated 61 cases with histologically confirmed CIN2+ and 279 controls (≤CIN1), all being hrHPV positive and selected at random from the whole PaVDaG study population. Controls incorporated LBC samples from women with at least two cytology-negative screening results (3 years apart) and no previous evidence of CIN2+. Women who had two follow-up negative LBC results after a borderline squamous cytology result, or three negative LBC results after low-grade cytology, and no previous evidence of CIN2+ were also eligible for inclusion as controls, as were women who had normal colposcopy without biopsy or who had biopsies showing ≤CIN1. All selected cases and controls were submitted for p16/Ki-67 dual-stained cytology. In addition, conventionally stained LBC and HPV16/18 results were available for all the women. The accuracy of the aforementioned triage tests was also assessed for 57 CIN2+ cases and 335 controls associated with self-taken samples.
For both cervical and self-taken samples, the sensitivity and specificity for CIN2+ and CIN3+ of p16/Ki-67 dual-stained cytology, HPV 16/18 genotyping, and cytology were assessed (as a triage of hrHPV-positive women), as single or combined test strategies.
P16/Ki-67 cytoimmunochemistry
New slides were made for the p16/Ki-67 dual-stained cytology analysis using the CINtec PLUS Cytology Kit (Roche MTM Laboratories) from residual LBC material. CINtec PLUS interpretation was blinded to the hrHPV and cytology results. Samples with one or more cervical epithelial cells that simultaneously showed brown cytoplasmic immunostaining (p16) and red nuclear immunostaining (Ki-67) were classified as positive regardless of the morphologic appearance of the cells. Slides with excessive background staining were considered not evaluable and excluded from analysis.
Ethical approval
The study was approved by the NHS West of Scotland Research Ethics Service (ref: 12/WS/0085). The study sponsor was NHS Dumfries and Galloway. Roche provided cobas 4800 collection and testing kits and CINtec PLUS kits but was not involved in study design and all analysis was performed independently.
Data analysis and power calculation
Absolute accuracy (for detection of CIN2+) of HPV16/18 genotyping, LBC, and p16/Ki-67 dual-stained cytology was measured in terms of sensitivity and specificity. PPV and the complement of the negative predictive value (cNPV = 1 − NPV) were computed using the prevalence of CIN2+ among hrHPV+ women observed in the PaVDaG screening study (16.2%; ref. 19). The number of tests used in a particular triage option, the number of colposcopies per 1,000 hrHPV+ patients, and the number of referrals needed to find one CIN2+ case [NNR (number needed to refer) = 1/PPV] were also assessed as measures of cost and efficiency.
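The derived measures described above follow from sensitivity, specificity, and prevalence through Bayes' rule. The following is a minimal sketch of that arithmetic, not the authors' analysis code; the function name `triage_metrics` is hypothetical, and the colposcopy count assumes that every triage-positive woman is referred.

```python
def triage_metrics(sensitivity, specificity, prevalence):
    """Post-test risks and workload for one triage test among hrHPV+ women.

    Assumes every triage-positive woman is referred to colposcopy.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)      # risk of CIN2+ if triage positive
    cnpv = false_neg / (false_neg + true_neg)    # risk of CIN2+ if triage negative (1 - NPV)
    return {
        "ppv": ppv,
        "cnpv": cnpv,
        "nnr": 1 / ppv,                          # referrals needed to find one CIN2+
        "colposcopies_per_1000": 1000 * (true_pos + false_pos),
    }


# Inputs taken from the text: p16/Ki-67 sensitivity 85.0%, specificity 76.7%,
# and a CIN2+ prevalence of 16.2% among hrHPV+ women.
PREVALENCE = 0.162
dual_stain = triage_metrics(0.850, 0.767, PREVALENCE)
print(f"PPV {dual_stain['ppv']:.1%}, cNPV {dual_stain['cnpv']:.1%}, "
      f"NNR {dual_stain['nnr']:.2f}")
```

With these inputs the sketch gives a PPV of about 41.4%, in line with the value reported for p16/Ki-67 dual-stained cytology in the Results.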
The relative sensitivity and specificity of each triage test or combination of triage tests (concurrent or sequential) compared with another were also determined. The McNemar test was used to assess concordance between tests. Statistical significance was defined as P < 0.05 and the CI not spanning 1.
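A paired comparison of this kind can be sketched as follows. The text does not state whether an exact or a chi-square version of the McNemar test was used; the exact binomial form is assumed here, the function names are hypothetical, and the counts are illustrative only and are not study data.

```python
from math import comb


def mcnemar_exact(only_a_pos, only_b_pos):
    """Two-sided exact McNemar P value from the two discordant-pair counts."""
    n = only_a_pos + only_b_pos
    if n == 0:
        return 1.0
    k = min(only_a_pos, only_b_pos)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def relative_sensitivity(both_pos, only_a_pos, only_b_pos):
    """Sensitivity of test A divided by sensitivity of test B, same CIN2+ cases."""
    return (both_pos + only_a_pos) / (both_pos + only_b_pos)


# Hypothetical paired results among CIN2+ cases (NOT taken from the study):
# 30 positive on both tests, 12 positive on A only, 2 positive on B only.
p_value = mcnemar_exact(12, 2)
ratio = relative_sensitivity(30, 12, 2)
print(f"relative sensitivity {ratio:.2f}, McNemar P = {p_value:.4f}")
```

Only the discordant pairs enter the test, which is why two tests with quite different sensitivities can still fail to differ significantly when few women are discordant.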
The utility of the triage tests can be displayed on pre-test–post-test probability (PPP) plots (23). The left y-axis gives the prevalence of CIN2+ in the given population before triage testing (pre-test probability), and the right y-axis gives the post-test probability of CIN2+ when the triage test is positive (PPV) or negative (cNPV). Decision thresholds are defined by colored zones, where the red zone indicates immediate referral for colposcopy (defined as a risk of CIN2+ of >20%), the yellow zone indicates further surveillance (risk of CIN2+ between 2% and 20%), and the green zone indicates routine screening/recall (risk of CIN2+ <2%).
Knowledge of the positive hrHPV status has been demonstrated to increase sensitivity of cytologic interpretation (24). An assumption was made that p16/Ki-67 dual-stained cytology would allow more objective interpretation and increase specificity of triage without affecting the sensitivity, compared with traditional cytology at a cutoff value of borderline squamous changes (13). A matched design was used for sample size computation (25). We hypothesized noninferior sensitivity, accepting a minimal value for the lower 95% confidence interval bound of the relative sensitivity of 0.90, which yielded a sample size of 60 CIN2+ cases as per the validation protocol of Meijer and colleagues (3). The number of nondiseased subjects (≤CIN1) needed was based on an assumed difference in specificity between 71% (double staining) and 60% (cytology at ASC-US+) with a power of 90%, which yielded 230 subjects.
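The colored decision zones used on the PPP plots amount to a simple mapping from post-test risk to management. A minimal sketch is given below; the function name `risk_zone` is hypothetical, and assigning risks of exactly 2% or 20% to the yellow zone is an assumption, because the text defines the zones only as >20%, between 2% and 20%, and <2%.

```python
def risk_zone(risk_cin2_plus):
    """Map a probability of CIN2+ (0-1) to the management zone of the PPP plot."""
    if risk_cin2_plus > 0.20:
        return "red"      # immediate referral for colposcopy
    if risk_cin2_plus >= 0.02:
        return "yellow"   # further surveillance (boundary handling is an assumption)
    return "green"        # routine screening/recall


# Values quoted in the text: pre-test prevalence 16.2%, and post-test
# probabilities of 1.7% and 0.6% for the sequential strategies.
for label, risk in [("pre-test", 0.162), ("sequential negative", 0.017),
                    ("three-test negative", 0.006)]:
    print(label, risk_zone(risk))
```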
Results
Absolute accuracy of triage tests to detect CIN2+
The absolute accuracy of the triage tests (LBC, p16/Ki-67 dual-stained cytology, and HPV 16/18 typing) in clinician-taken samples for the detection of CIN2+ is presented singly and in various combinations in Table 1. Unless otherwise stated, “LBC” is based on a threshold of borderline squamous changes. As stand-alone assays, the triage test with the highest sensitivity was p16/Ki-67 dual-stained cytology: 85% (95% CI, 73–93), followed by LBC: 68% (95% CI, 55–80) and HPV 16/18 typing: 62% (95% CI, 48–74). Comparatively, the triage test with the highest specificity was LBC: 89% (95% CI, 85–93), followed by p16/Ki-67 dual-stained cytology: 76.7% (95% CI, 71.1–81.8) and HPV 16/18 typing: 70.5% (95% CI, 64.6–76.0). The PPV of LBC was the highest at 54.9%, compared with 41.4% for p16/Ki-67 dual-stained cytology and 28.8% for HPV 16/18 typing. HPV 16/18 typing generated the greatest number of colposcopies and the highest number of referrals to detect one CIN2+.
The corresponding accuracy parameters for triage of women with an hrHPV-positive self-sample were similar (see Supplementary Table S1).
Relative accuracy of triage tests to detect CIN2+
The relative sensitivity and specificity of p16/Ki-67 dual-stained cytology and HPV 16/18 typing compared with LBC at three thresholds [borderline (ASC-US), low-grade dyskaryosis, and high-grade dyskaryosis] are presented in Table 2 for women who were hrHPV positive on clinician-taken samples. P16/Ki-67 dual-stained cytology was more sensitive than LBC and HPV16/18 genotyping at all thresholds (p McN < 0.05; p McN = 0.0060). However, the relative specificity of p16/Ki-67 dual-stained cytology versus LBC was significantly lower at all thresholds (p McN < 0.001 for all). The relative sensitivity of HPV 16/18 typing was not significantly different from that of LBC at thresholds of borderline and low-grade squamous changes, although it was significantly higher than that of LBC at a threshold of high-grade dyskaryosis (p McN < 0.0001). The specificity of HPV 16/18 typing was lower than that of LBC at all thresholds (P < 0.001).
Relative accuracy findings for triage of women who were hrHPV positive on a self-sample were slightly different in relation to the specificity of p16/Ki-67 dual-stained cytology, which was significantly higher than that of HPV 16/18 typing (see Supplementary Table S2).
Combination of triage tests
A comprehensive list of options with two or three triage tests used in parallel or sequentially is available for cervical samples (Tables 1 and 3) and self-samples (Supplementary Tables S1 and S3).
In clinician-taken samples, application of two triage tests as co-tests [HPV 16/18 typing plus p16/Ki-67 dual-stained cytology, LBC plus p16/Ki-67 dual-stained cytology, and LBC (≥borderline) plus HPV 16/18 typing] was associated in every case with an absolute sensitivity of ≥90%. Respective specificities of the above 2-test approaches ranged from 56% to 72%. For full details and CIs, refer to Table 1.
The relative sensitivity and specificity of the 2-test approaches are detailed in Table 3 using LBC at a threshold of borderline as a comparator. All 2-test approaches that included p16/Ki-67 dual-stained cytology were significantly more sensitive than the comparator. HPV 16/18 typing with LBC at borderline and low-grade thresholds was significantly more sensitive than the comparator, although the gain in sensitivity was lost when the combination of HPV 16/18 typing plus LBC at a cutoff value of ≥moderate was considered. Regarding specificity, all of the dual co-test triage approaches described above were significantly less specific than the comparator test.
Multistep triage options
The lower part of Table 1 shows the accuracy estimates of three multistep (sequential) triage strategies that all start with HPV 16/18 typing. The strategy with the highest sensitivity of 98% (95% CI, 91–100) and the lowest post-test risk of CIN2+ (0.6%) involved triage of HPV16/18-negative women with LBC and if negative then a further triage with p16/Ki-67 dual-stained cytology. However, this approach was associated with the highest number of colposcopies and a specificity that was lower than any of the stand-alone and combined options.
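The three-step strategy described above can be written as a short decision rule. The sketch below is illustrative rather than the authors' implementation: the class and function names are hypothetical, and the eight records are made-up examples, not study data.

```python
from dataclasses import dataclass


@dataclass
class TriageResults:
    hpv16_18: bool    # HPV 16/18 genotyping positive
    lbc: bool         # LBC at borderline or worse
    dual_stain: bool  # p16/Ki-67 dual-stained cytology positive
    cin2_plus: bool   # reference outcome


def sequential_referral(w):
    """Refer if HPV 16/18+; otherwise if LBC+; otherwise if dual stain+."""
    if w.hpv16_18:
        return True
    if w.lbc:
        return True
    return w.dual_stain


def accuracy(women):
    """Sensitivity and specificity of the sequential rule for CIN2+."""
    cases = [w for w in women if w.cin2_plus]
    controls = [w for w in women if not w.cin2_plus]
    sensitivity = sum(sequential_referral(w) for w in cases) / len(cases)
    specificity = sum(not sequential_referral(w) for w in controls) / len(controls)
    return sensitivity, specificity


# Hypothetical records (NOT study data): four cases and four controls.
women = [
    TriageResults(True, False, False, True),
    TriageResults(False, True, True, True),
    TriageResults(False, False, True, True),
    TriageResults(False, False, False, True),
    TriageResults(False, False, False, False),
    TriageResults(False, False, False, False),
    TriageResults(True, False, False, False),
    TriageResults(False, False, True, False),
]
sens, spec = accuracy(women)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Because a woman is referred if any step is positive, each added step can only raise sensitivity and lower specificity, which matches the trade-off reported for this strategy.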
Figure 1 illustrates the probability of CIN2+ according to the three single-triage scenarios (i.e., LBC at borderline or worse, p16/Ki-67 dual-stained cytology, and HPV 16/18 typing; columns 1–3, respectively). The option of p16/Ki-67 dual-stained cytology subsequent to HPV 16/18 typing is incorporated into column 4. In the absence of triage, the pre-test probability of CIN2+ associated with hrHPV-positive status resides in the yellow zone (2%–20%), indicating that surveillance is warranted. All three individual triage options, if positive, exceed the threshold of 20% risk at which colposcopy would be indicated. However, the post-test risk in women who are negative by any of the three individual triage options does not fall into the green zone, where routine screening can be advised. By contrast, if p16/Ki-67 dual-stained cytology is performed in women who test negative for HPV 16/18, the post-test probability after both negative tests does fall into the green zone.
Discussion
Now that hrHPV testing with a validated assay is accepted as the optimal modality for primary screening (26), one of the key challenges is to define and implement robust triage strategies to avoid unnecessary referral to colposcopy (27). The lack of consensus around optimal triage tests is evidenced through the variety of approaches considered across the various hrHPV-based screening programs (7). Arguably, the most evidenced triage tests involve cytology, repeat HPV testing, HPV 16/18 typing, and p16 staining (with or without Ki-67) and various combinations thereof (5). However, there are comparatively few studies in which these have been compared with each other within the same population and fewer still where they have been evaluated in both clinician and self-taken samples. The present analysis attempted to address this.
P16/Ki-67 dual-stained cytology was more sensitive than traditional LBC and HPV16/18 genotyping, although its specificity was significantly lower than that of LBC. This is at variance with the findings of Wentzensen and colleagues (13), who showed similar sensitivity and a higher specificity of p16/Ki-67 triage compared with traditional cytology. However, it is well established that the performance of cytology varies between settings (8), and the cytology associated with the present study was performed as part of an organized call-recall UK-wide program, which is subject to nationally observed quality assurance and guidelines, whereas the United States provides opportunistic screening. Importantly, unlike in the U.S. study, cytologists were unaware of HPV status when reporting results. Previous research has shown that knowledge of HPV positivity can lead to a higher sensitivity at the expense of specificity (24). Ideally, a triage test, if negative, should provide sufficient reassurance to permit a safe return to routine recall while conferring sufficient risk, if positive, to warrant referral to colposcopy. Thresholds of <2% probability of CIN2+ for a negative triage result and ≥20% probability of CIN2+ for a positive triage result have been proposed (23, 28). It is notable that no single-triage approach satisfied these criteria.
HPV 16/18 typing alone had the lowest sensitivity and specificity as a single triage; however, one practical advantage of typing is that it occurs concurrently with the screening test. In settings where cytology is impractical and/or of low quality, this may be a more workable solution—particularly as the post-test probability of a positive result exceeds 20% for CIN2+. Managing the 16/18 negatives is more challenging; a subsequent negative dual stain led to a post-test probability of <2%; however, the number of additional colposcopies associated with this scenario was relatively high.
Screening programs that enfranchise hard-to-reach populations with a self-sampling option are increasing, as is the evidence that this may be a practical and technically sound option for women who do attend. Although the focus of this work was to evaluate triage approaches in clinician-taken samples, it was notable that triage test performance was similar in self-taken samples. The only significant difference was observed in relation to HPV 16/18 typing, which showed a lower relative specificity compared with both LBC and the dual stain. The higher prevalence and diversity of hrHPV in the vagina compared with the cervix is relatively well documented and may account for this observation (29, 30). Previous studies have also demonstrated that self-taken samples are valid biospecimens for host methylation biomarker detection (19, 31, 32).
The main limitation of this study is that all analyses were performed retrospectively and women were not triaged in real time. This may have accounted for the relatively high proportion of samples that were technically invalid for dual-staining (between 7% and 8%). Comparatively, in the original PaVDaG study only around 2% of samples were unsatisfactory (19). In addition, had the study been prospective, women who were hrHPV positive on their self-taken sample would have been invited for a subsequent cytology sample, whereas we used the contemporaneous LBC sample (collected at the same time as the self-sample) as a proxy.
Another limitation is that not all controls/negative outcomes were associated with the gold standard (histology). However, using negative cytology across screening rounds has precedent (33). We are also aware that the overarching PaVDaG design may have somewhat biased the absolute specificity estimates. This was addressed by including relative specificity estimates that are less susceptible to bias (34).
The focus of this study was to evaluate the performance of the three most-evidenced triage strategies, and this informed our selection of cytology, dual-staining, and HPV 16/18 typing. However, the evidence for other triage markers, including host and viral methylation targets and sequential viral load measurement, is increasing (3, 35, 36). Arguably, the optimal triage test would be delivered at a single time point, as a single test. If such a test were amenable to self-collected samples, avoided subjective interpretation, and was also impervious to the impact of immunization, it would be of significant value. A recent study that measured the prevalence of HPV 16/18, using clinically validated assays, in immunized young women attending for first screen showed that HPV 16/18 prevalence was 75% lower than that of unimmunized cohorts. The changing pattern of HPV 16/18 infection will clearly have implications for the predictive value of HPV 16/18 typing in immunized women (37). A further report also showed significant reductions in the predictive value of cytology in immunized women, particularly of low-grade cytology for CIN2+ (9). Objective triage tests that are positive irrespective of HPV type would have greater durability for immunized populations.
We report that LBC and dual-staining outperform HPV16/18 genotyping as single time point triage tests in terms of sensitivity and specificity. Although dual-staining showed significantly higher sensitivity than LBC, its specificity was significantly lower.
The lowest risk of CIN2+ in screen-negative women (0.6%) was achieved by offering colposcopy to all HPV16/18-positive women and to those who were hrHPV positive for non-16/18 types and had positive LBC and/or dual-staining. Such a low cNPV is likely to remain below 2% even with the anticipated reduced performance of screening tests in immunized populations. Although we did not perform a formal cost/benefit analysis on the various approaches, clearly the relative cost per test (including patient and clinician time) will be influential for future decision-making. Furthermore, given the rate of development of self-sampling and rapid molecular diagnostics, a purely molecular pathway for cervical screening may be a not-too-distant proposition.
Disclosure of Potential Conflicts of Interest
G.A. Stanczuk has received speakers honoraria from Roche. K. Cuschieri reports receiving a commercial research grant from GeneFirst, EuroImmun, Genomica, LifeRiver, SelfScreen, and Hologic. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: G.A. Stanczuk, H. Currie, A. Wilson
Development of methodology: G.A. Stanczuk, A. Wilson
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): G.A. Stanczuk, G.J. Baxter, H. Currie, W. Forson, A. Wilson
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): G.A. Stanczuk, A. Wilson, L. Patterson, M. Arbyn
Writing, review, and/or revision of the manuscript: G.A. Stanczuk, H. Currie, K. Cuschieri, A. Wilson, T. Palmer, M. Arbyn
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): G.A. Stanczuk, G.J. Baxter, J.R. Lawrence, K. Cuschieri, L. Patterson, L. Govan, J. Black
Study supervision: G.A. Stanczuk, G.J. Baxter
Grant Support
The study was sponsored by NHS Dumfries and Galloway and supported by funding from the CSO, Scotland. Roche donated Cobas 4800 collection and testing kits and CINtec PLUS kits. M. Arbyn was supported by the Seventh Framework Program of the Directorate-General for Research of the European Commission through the comparing health services interventions for the prevention of HPV-related cancer (coheaHr) network (grant 603019) and by the Joint Action on Comprehensive Cancer Control, which has received funding from the European Union within the framework of the Health Program (2008–2013).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.