Purpose: Offering self-sampling of cervico-vaginal material for high-risk human papillomavirus (hrHPV) testing is an effective method to increase the coverage in cervical screening programs. Molecular triage directly on hrHPV-positive self-samples for colposcopy referral opens the way to full molecular cervical screening. Here, we set out to identify a DNA methylation classifier for detection of cervical precancer (CIN3) and cancer, applicable to lavage and brush self-samples.

Experimental Design: We determined genome-wide DNA methylation profiles of 72 hrHPV-positive self-samples, using the Infinium Methylation 450K Array. The selected DNA methylation markers were evaluated by multiplex quantitative methylation-specific PCR (qMSP) in both hrHPV-positive lavage (n = 245) and brush (n = 246) self-samples from screening cohorts. Subsequently, logistic regression analysis was performed to build a DNA methylation classifier for CIN3 detection applicable to self-samples of both devices. For validation, an independent set of hrHPV-positive lavage (n = 199) and brush (n = 287) self-samples was analyzed.

Results: Genome-wide DNA methylation profiling revealed 12 DNA methylation markers for CIN3 detection. Multiplex qMSP analysis of these markers in large series of lavage and brush self-samples yielded a 3-gene methylation classifier (ASCL1, LHX8, and ST6GALNAC5). This classifier showed a very good clinical performance for CIN3 detection in both lavage (AUC = 0.88; sensitivity = 74%; specificity = 79%) and brush (AUC = 0.90; sensitivity = 88%; specificity = 81%) self-samples in the validation set. Importantly, all self-samples from women with cervical cancer scored DNA methylation–positive.

Conclusions: By genome-wide DNA methylation profiling on self-samples, we identified a highly effective 3-gene methylation classifier for direct triage on hrHPV-positive self-samples, which is superior to currently available methods. Clin Cancer Res; 24(14); 3456–64. ©2018 AACR.

Translational Relevance

Offering self-sampling of cervico-vaginal specimens for high-risk human papillomavirus (hrHPV) testing to nonattendees increases the attendance rate in cervical screening. However, an additional triage test directly applicable to self-sampled material is necessary to identify hrHPV-positive women at risk for progression to cervical cancer. Because cytology, the widely accepted triage method, cannot be reliably performed on self-sampled material, there is an urgent need for molecular triage markers. This is the first study performing a genome-wide DNA methylation discovery directly on self-samples, which allowed us to define optimal DNA methylation markers. We identified and validated a highly effective 3-gene methylation classifier (ASCL1, LHX8, and ST6GALNAC5) for detection of cervical precancer and cancer in both lavage and brush self-samples from hrHPV-positive women, which outperforms currently available methods. These findings could greatly improve the clinical management of women with hrHPV-positive self-samples and indicate that a transition to a full molecular self-screening approach in cervical screening programs is feasible.

Organized cytology-based cervical screening programs using physician-collected cervical scrapes have led to a substantial decrease in cervical cancer incidence and mortality in high-income countries (1). However, a considerable subset of women does not attend cervical screening (nonattendees), which compromises the effectiveness of the screening program (2). Previous studies have shown that offering self-sampling of cervico-vaginal specimens (self-samples) for high-risk human papillomavirus (hrHPV) testing (hrHPV self-sampling) to nonattendees increases the attendance to cervical screening. Up to 30% of the invited nonattendees returned their self-sample to the laboratory for hrHPV testing (3–6). Importantly, the diagnostic accuracy of hrHPV testing on self-samples for cervical intraepithelial neoplasia grade 3 and cervical cancer (CIN3+) is similar to hrHPV-screening of physician-collected cervical scrapes (7, 8). Therefore, offering hrHPV self-sampling as an alternative to conventional scrapes has just been implemented in the new hrHPV-based cervical screening program in the Netherlands. Partial substitution of hrHPV testing on physician-collected scrapes in cervical screening programs by hrHPV self-sampling can be envisioned in the near future.

Although hrHPV testing has a higher sensitivity for CIN3+ compared with cytology, its 3% to 5% lower specificity for CIN3+ necessitates the use of a triage test to distinguish women with clinically relevant disease from those with irrelevant, transient hrHPV infections to prevent overreferral and overtreatment. Currently, cytology is the most widely accepted triage tool. Because cytology cannot be reliably performed on self-sampled material (9–11), women with hrHPV-positive self-samples need to visit a physician for an additional cervical scrape for cytology. This may lead to loss to follow-up and delay the diagnostic track, and it is less feasible in low-income countries given the lack of adequate infrastructure and the limited number of trained practitioners (8, 12, 13). Therefore, molecular triage testing directly applicable to self-sampled material from hrHPV-positive women is preferred.

We and others have shown that DNA methylation analysis of tumor-suppressor genes on self-samples is feasible and effective for detecting CIN3+ using quantitative methylation-specific PCR (qMSP; refs. 12–16). DNA methylation analysis has already shown competitive clinical performance versus other triage options in cervical scrapes, whereas improvements in performance on self-samples are conceivable. Previous findings have shown that DNA methylation markers originally discovered in tissue specimens and tested on hrHPV-positive cervical scrapes are not necessarily of clinical value when applied to hrHPV-positive self-samples (17). This is likely due to the cellular composition of self-samples, which contain fewer disease-related cells. Therefore, self-samples may display distinct epigenetic signatures compared with physician-collected cervical specimens. Hence, DNA methylation marker discovery screens directly performed on self-samples are more likely to yield the most informative DNA methylation markers for hrHPV-positive self-samples.

In this study, we describe the identification and validation of a DNA methylation classifier for the detection of CIN3 and cervical cancer in hrHPV-positive self-samples. A genome-wide DNA methylation marker discovery for CIN3 detection was performed using the Infinium 450K BeadChip array on 72 hrHPV-positive self-samples from a screening cohort of nonattendees. The identified candidate DNA methylation markers were evaluated by multiplex qMSP in unique, large series of lavage-based (n = 245; further referred to as “lavage self-samples”) and brush-based (n = 246; further referred to as “brush self-samples”) self-samples from screening cohorts of nonattendees to build an optimal DNA methylation classifier for detection of CIN3 that is applicable to self-samples of both devices. The clinical performance of the obtained DNA methylation classifier was subsequently validated by multiplex qMSP on an independent series of lavage (n = 199) and brush (n = 287) self-samples.

Clinical specimens

Discovery set: case-control series for DNA methylation marker discovery screen.

For genome-wide DNA methylation marker discovery for CIN3 detection, hrHPV-positive lavage self-samples collected using the Delphi screener (Delphi Bioscience) were obtained from a screening cohort of nonattendees (PROHTECT-1 trial, ref. 3; NTR792; n = 72; Fig. 1, Discovery screen). Detailed characteristics of study design, clinical specimens, inclusion criteria, and follow-up procedures have been described previously (3). Array data from a pilot experiment of 12 self-samples for power calculations revealed a ratio of 3 (hrHPV-positive controls) to 4 (CIN3) for proper marker discovery. Therefore, the discovery series comprised hrHPV-positive lavage self-samples from 29 control women, who either had histologic evidence of absence of CIN2+ (≤CIN1) or displayed hrHPV clearance combined with normal cytology in follow-up (further referred to as hrHPV-positive controls; median age, 36; range, 31–56), and 39 cases histologically diagnosed with CIN3 (median age, 36; range, 31–62). Controls and cases were matched according to age and hrHPV type to the extent of sample availability. The hrHPV types in controls were eight HPV16, four HPV51, four HPV52, four HPV56, three HPV45, two HPV35, two HPV58, two HPV66, one HPV33, and one HPV39; the hrHPV types in CIN3 were 21 HPV16, six HPV31, four HPV52, three HPV33, three HPV56, two HPV51, two HPV68, one HPV18, one HPV35, one HPV39, one HPV45, and one HPV66. In addition, hrHPV-positive lavage self-samples from women histologically diagnosed with cervical squamous cell carcinoma (SCC; n = 4) were included (median age, 49; range, 42–61). The hrHPV types in SCC were two HPV16, one HPV31, and one HPV45.

Figure 1.

Experimental set-up of the study. All self-samples were obtained from screening cohorts of nonattendees, except seven SCC and five AdCA brush self-samples in the validation set.

Building set: case-control series to build a DNA methylation classifier.

To build a DNA methylation classifier for CIN3 detection, both hrHPV-positive lavage self-samples (n = 245; PROHTECT-1 trial, ref. 3; excluding samples used for the discovery screen) and brush self-samples collected using a VibaBrush (Rovers; n = 246; PROHTECT-2 trial, ref. 4; NTR1851) were obtained from screening cohorts of nonattendees who reached a study endpoint; none of the samples were preselected (Fig. 1; building a DNA methylation classifier; Supplementary Fig. S1). Detailed characteristics of study design, clinical specimens, inclusion criteria, and follow-up procedures have been described previously (4). Available lavage self-samples of 214 hrHPV-positive controls (controls; median age, 41; range, 31–62) and 31 women histologically diagnosed with CIN3 (cases; median age, 36; range, 31–62) were included. Brush self-samples included 174 hrHPV-positive controls (controls; median age, 37; range, 30–62) and 72 women histologically diagnosed with CIN3 (cases; median age, 36; range, 31–61).

Validation set: independent series to validate the DNA methylation classifier.

To validate the clinical performance of the DNA methylation classifier, independent series of both hrHPV-positive lavage (n = 199) and brush (n = 287) self-samples, none of which were preselected, were used (Fig. 1; Validation of DNA methylation classifier; Supplementary Fig. S1). For lavage self-samples, hrHPV-positive samples collected using the Delphi Screener (Delphi Bioscience) were obtained from a screening cohort of nonattendees who reached a study endpoint in the PROHTECT-3 trial (methylation-arm; NTR2606; ref. 12). Detailed characteristics of study design, clinical specimens, inclusion criteria, and follow-up procedures have been described previously (12). Half of the available samples in this trial were randomly chosen for evaluation in the current study. These were supplemented with an independent series of four lavage self-samples from women with SCC who participated in the PROHTECT-1 trial (3). The total lavage series comprised 134 hrHPV-positive controls (median age, 38; range, 33–63), 22 women with CIN2 (median age, 38; range, 33–58), 35 women with CIN3 (median age, 38; range, 33–48), seven women with SCC (median age, 48; range, 38–61), and one woman with adenocarcinoma (AdCA; age 33).

For brush self-samples, hrHPV-positive samples collected using the Evalyn brush (Rovers) were obtained from a screening cohort of nonattendees who reached a study endpoint in the PROHTECT-3B trial (NTR3350; ref. 18). Detailed characteristics of study design, clinical specimens, inclusion criteria, and follow-up procedures have been described previously (18). These were supplemented with an independent series of four brush self-samples from women with SCC and one brush self-sample from a woman with adenocarcinoma in situ (ACIS) who participated in the PROHTECT-2 trial (4) and seven brush self-samples from women with SCC and five brush self-samples from women with AdCA who visited the gynecology clinic (METC15.1468/X15MET study). The total brush series comprised 178 hrHPV-positive controls (median age, 39; range, 33–63), 28 women with CIN2 (median age, 38; range, 33–53), 56 women with CIN3 (median age, 38; range, 33–59), 16 women with SCC (median age, 44; range, 29–75), one woman with ACIS (age 41), and eight women with AdCA (median age, 44; range, 27–62).

This study followed the ethical guidelines of the Institutional Review Board of VU University Medical Center and Antoni van Leeuwenhoek Hospital/Netherlands Cancer Institute. All participants in the PROHTECT and X15MET trials gave informed consent.

Infinium HumanMethylation450 BeadChip and data preprocessing

Before application, quality of the DNA was assessed by Qubit BR dsDNA measurement and visual evaluation of DNA integrity on an agarose gel. Genome-wide DNA methylation profiling was performed by Infinium HumanMethylation450 BeadChip (Illumina). Data are available from the NCBI Gene Expression Omnibus (GEO) through series accession number GSE99511 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE99511). Data were preprocessed and checked for sample and probe quality. Four samples (one hrHPV-positive control and three CIN3) were excluded based on the correlation heatmap results (Supplementary Fig. S2). For further data preprocessing, see Supplementary Methods.

hrHPV and DNA methylation testing

For sample processing, hrHPV testing, and DNA methylation analysis, see Supplementary Methods and Supplementary Table S1. HrHPV positivity was determined for all samples. HrHPV genotypes were defined in a subset of the classifier building set only. In each multiplex qMSP assay, three targets and the housekeeping gene β-actin (ACTB) were combined as described before (19). Target DNA methylation values were normalized to reference gene ACTB and the calibrator using the comparative Ct method (2^(−ΔΔCt) × 100) to obtain ΔΔCt ratios (20). The ΔΔCt ratios were square root transformed. Only samples for which sufficient DNA material was available and which achieved an ACTB Ct value <30 were included.
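As an illustration of the normalization described above, the following Python sketch computes a 2^(−ΔΔCt) × 100 ratio and its square root for one target in one sample. It assumes the standard comparative Ct formulation, in which ΔΔCt is the difference between (Ct target − Ct ACTB) in the sample and the same difference in the calibrator. The function names, argument names, and Ct values are hypothetical and are not taken from the study.

```python
import math

def ddct_ratio(ct_target, ct_actb, cal_ct_target, cal_ct_actb):
    """Return the 2^(-ddCt) x 100 ratio for one target in one sample."""
    d_ct_sample = ct_target - ct_actb              # dCt of the sample
    d_ct_calibrator = cal_ct_target - cal_ct_actb  # dCt of the calibrator
    dd_ct = d_ct_sample - d_ct_calibrator          # ddCt
    return 2 ** (-dd_ct) * 100

def transformed_level(ct_target, ct_actb, cal_ct_target, cal_ct_actb,
                      max_actb_ct=30.0):
    """Square root-transformed ratio, or None when the ACTB Ct is not
    below the inclusion cutoff (Ct < 30 in the text)."""
    if ct_actb >= max_actb_ct:
        return None  # sample excluded for insufficient DNA quality
    return math.sqrt(ddct_ratio(ct_target, ct_actb, cal_ct_target, cal_ct_actb))

# Hypothetical Ct values: sample dCt = 6, calibrator dCt = 2, so ddCt = 4.
example_ratio = ddct_ratio(32.0, 26.0, 28.0, 26.0)
example_level = transformed_level(32.0, 26.0, 28.0, 26.0)
```

With these invented inputs the ratio is 2^(−4) × 100 = 6.25 and the square root-transformed level is 2.5.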

Statistical analyses

Discovery screen: genome-wide DNA methylation array data.

After preprocessing of Infinium data, we applied adaptive group-regularized logistic ridge regression (GRridge; ref. 21). We incorporated auxiliary information (referred to as codata) in building the GRridge classification model, namely P values from a similar study in cervical tissue specimens using the same array platform (Farkas and colleagues, ref. 22) and standard deviation of each probe in the current dataset. Using informative codata has been shown to enhance the identification of valuable markers in rather impure samples, such as self-samples (Supplementary Fig. S3). More details on how the GRridge model incorporates such information are provided in Supplementary Methods and elsewhere (21). Post hoc forward selection was applied to the GRridge model to render a model of DNA methylation markers. The performance of the GRridge model was visualized by an ROC curve, obtained by leave-one-out cross-validation, and quantified by AUC. Predicted probabilities, representing the risk for an underlying CIN3, were calculated using the GRridge model. Hierarchical clustering of the 28 DNA methylation markers was performed to further select the genes that were most discriminative between CIN3 and hrHPV-positive controls.
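The leave-one-out evaluation described above can be sketched in a few lines of Python. This is not GRridge: as a stand-in it uses plain ridge-penalized logistic regression fitted by gradient descent, with no codata and no group-specific penalties. All function names, hyperparameters, and the toy data are assumptions made for illustration only.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_ridge_logistic(X, y, lam=0.1, lr=0.5, n_iter=2000):
    """Gradient descent on the log-loss plus (lam / 2) * ||w||^2.
    The intercept b0 is left unpenalized."""
    n, p = len(X), len(X[0])
    b0, w = 0.0, [0.0] * p
    for _ in range(n_iter):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            err = sigmoid(b0 + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            g0 += err
            for j in range(p):
                g[j] += err * xi[j]
        b0 -= lr * g0 / n
        w = [wj - lr * (gj + lam * wj) / n for wj, gj in zip(w, g)]
    return b0, w

def predict(model, x):
    """Predicted probability of being a case for one sample."""
    b0, w = model
    return sigmoid(b0 + sum(wj * xj for wj, xj in zip(w, x)))

def loocv_probabilities(X, y):
    """Predict each sample with a model trained on all other samples."""
    probs = []
    for i in range(len(X)):
        model = fit_ridge_logistic(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        probs.append(predict(model, X[i]))
    return probs

def auc(probs, y):
    """Fraction of case-control pairs ranked correctly (ties count half)."""
    cases = [p for p, yi in zip(probs, y) if yi == 1]
    controls = [p for p, yi in zip(probs, y) if yi == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Invented toy data: two features per sample; 0 = control, 1 = case.
X = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.2], [0.2, 0.2],
     [0.8, 0.9], [0.9, 0.8], [1.0, 0.9], [0.8, 1.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
loocv_probs = loocv_probabilities(X, y)
loocv_auc = auc(loocv_probs, y)
```

Because every held-out sample is scored by a model that never saw it, the resulting AUC is a less optimistic estimate than one computed on the training data itself.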

Building and validation of DNA methylation classifier: qMSP data.

To compare DNA methylation levels between two groups (hrHPV-positive controls and CIN3), the Wilcoxon rank-sum test (two-sided) was applied on the square root–transformed ΔΔCt ratios. Statistical significance was set at P < 0.05.
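A two-sided Wilcoxon rank-sum comparison of this kind can be sketched as follows. The sketch assumes the normal approximation with a tie correction and no continuity correction; the text does not state which variant or software was used, so that choice, the function names, and the example values are all assumptions. It also assumes that not all pooled values are identical, since the variance would then be zero.

```python
import math

def midranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(controls, cases):
    """Return (U for cases, z score, two-sided P value)."""
    n1, n2 = len(controls), len(cases)
    pooled = list(controls) + list(cases)
    ranks = midranks(pooled)
    w = sum(ranks[n1:])             # rank sum of the cases
    u = w - n2 * (n2 + 1) / 2       # Mann-Whitney U for the cases
    n = n1 + n2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, z, p

# Hypothetical square root-transformed ratios for one marker.
u_stat, z_score, p_value = rank_sum_test([0.0, 0.5, 1.0, 1.2],
                                         [2.0, 2.5, 3.0, 3.1])
```

Dividing U by the product of the two group sizes gives the AUC of the marker as a single-variable classifier, which links this test to the ROC analyses used for the classifier.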

To build a DNA methylation classifier, classical logistic regression analysis was performed on qMSP data to select relevant DNA methylation markers for CIN3 detection in both lavage and brush self-samples (detailed description in Supplementary Methods and Supplementary Fig. S4). In brief, logistic regression analysis followed by stepwise selection and backward elimination was performed on the combination of lavage and brush self-sample datasets (to encourage overlap) to obtain an initial marker panel of two DNA methylation markers for both self-sample types. Forward selection on the separate lavage and brush datasets suggested the addition of a third DNA methylation marker, which was particularly relevant for the brush dataset, without harming the performance in the lavage dataset. Because DNA methylation in CpG islands has been shown to increase with age (23), we included age as a factor in the DNA methylation classifier. Supplementary Table S2 shows the P value and contribution (coefficient/SD) of age and the third DNA methylation marker ST6GALNAC5 in the 3-gene methylation classifier. These two factors were included in the classifier because exclusion of age and ST6GALNAC5 resulted in a lower performance, particularly in the brush self-samples. Predicted probabilities and 95% CI were calculated for all analyzed samples using the logistic regression models of the DNA methylation classifier for lavage and brush self-samples. The clinical performance of the logistic regression models in both classifier building and validation sets was visualized by an ROC curve and evaluated by AUC calculation. The ROC curves show the sensitivity and specificity for the complete spectrum of different thresholds in predicted probabilities using the logistic regression models.

A threshold was fixed for predicted probabilities corresponding to 80% specificity (lavage self-samples: 0.053; brush self-samples: 0.240) based on the classifier building set and subsequently evaluated in the independent validation set for CIN3 sensitivity and specificity. In addition, the DNA methylation classifier at a fixed threshold was applied on self-samples from women with CIN2, SCC, and ACIS/AdCA to evaluate the positivity rates in these disease categories. A classification and regression tree (CART) algorithm, which renders a DNA methylation classifier using marker-based cutoffs, was built for comparison with the continuous values obtained by regression. For the details of the CART method, see Supplementary Methods.
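The thresholding procedure described above can be illustrated with a short Python sketch: a cutoff is derived from the predicted probabilities of the controls in the building set and is then applied unchanged to a validation set. The model coefficients are not given in the text, so `predicted_probability` takes them as arguments. All names and toy probabilities are hypothetical, and the convention that a score at or above the threshold counts as methylation-positive is an assumption.

```python
import math

def predicted_probability(intercept, coefficients, predictors):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def threshold_at_specificity(control_probs, target=0.80):
    """Cutoff such that at least `target` of the controls score below it
    (assuming no tied values at the cutoff)."""
    ranked = sorted(control_probs)
    # Number of controls that must test negative; round() guards against
    # floating-point noise in target * n.
    k = math.ceil(round(target * len(ranked), 9))
    if k >= len(ranked):
        return math.inf  # every control must be negative
    return ranked[k]

def sensitivity_specificity(case_probs, control_probs, threshold):
    """Positive means predicted probability >= threshold."""
    sens = sum(p >= threshold for p in case_probs) / len(case_probs)
    spec = sum(p < threshold for p in control_probs) / len(control_probs)
    return sens, spec

# Invented predicted probabilities, for illustration only.
building_controls = [0.01, 0.02, 0.03, 0.04, 0.05,
                     0.06, 0.07, 0.08, 0.30, 0.40]
cutoff = threshold_at_specificity(building_controls, 0.80)

validation_cases = [0.10, 0.35, 0.90, 0.50]
validation_controls = [0.02, 0.05, 0.31, 0.10, 0.20]
val_sens, val_spec = sensitivity_specificity(
    validation_cases, validation_controls, cutoff)
```

Fixing the cutoff on the building set and only then scoring the validation set keeps the validation estimate free of any tuning on the validation data, which is why the validation specificity need not equal the 80% target exactly.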

An overview of the study design is given in Fig. 1.

Discovery of DNA methylation markers in hrHPV-positive self-samples

In total, we obtained 68 genome-wide DNA methylation profiles of hrHPV-positive lavage self-samples from a screening cohort of nonattendees, of which 64 (28 controls and 36 women with CIN3) were suitable to identify DNA methylation markers for CIN3 detection (Fig. 1; Discovery screen). Adaptive group-regularized ridge regression, GRridge (21), and variable selection on the DNA methylation profiles from women with and without CIN3 yielded a panel of 28 DNA methylation markers with discriminatory power for CIN3 (AUC of 0.77). Hierarchical clustering of all 28 DNA methylation markers showed that 12 methylated genes, that is, ACAN, ASCL1, LHX8, MYADM, NRG3, RGS7, ST6GALNAC3, ST6GALNAC5, WDR17, ZNF582, ZNF583, and ZNF781, contributed most to the discrimination between women with and without CIN3 (Fig. 1; hierarchical clustering, Fig. 2A, Table 1; Supplementary Fig. S5). Evaluation of the DNA methylation profiling data from four hrHPV-positive lavage self-samples from women with SCC confirmed high DNA methylation levels for all these 12 DNA methylation markers (Fig. 2B; Supplementary Fig. S5).

Figure 2.

Heatmap of the 28 DNA methylation markers in the discovery screen. Hierarchical clustering of the 28 Infinium 450K BeadChip probes; each probe corresponds to a DNA methylation marker. Low (blue) to high (purple) DNA methylation levels (arcsine square root–transformed beta values) are displayed for each DNA methylation marker (cg numbers of the probes). A, DNA methylation data of self-samples from hrHPV-positive controls (green; n = 28) and from women with CIN3 (orange; n = 36). The samples are ordered by predicted probability. The 12 DNA methylation markers above the black line showed the most discriminative DNA methylation profile between women with and without CIN3. B, DNA methylation data of self-samples from women with SCC (red; n = 4).

Table 1.

The 12 candidate DNA methylation markers from the discovery screen

Infinium BeadChip probe   Chr.   Chr. location   Gene name
cg08272731                1      75602167        LHX8
cg14156405                1      241520286       RGS7
cg20707222                1      76540222        ST6GALNAC3
cg23243867                1      77334045        ST6GALNAC5
cg27486637                4      176987174       WDR17
cg10401879                10     83634276        NRG3
cg20718350                12     103352294       ASCL1
cg06675190                15     89346205        ACAN
cg13499300                19     54369556        MYADM
cg02763101                19     56904945        ZNF582
cg00796360                19     56915650        ZNF583
cg14587524                19     38183262        ZNF781

Abbreviation: Chr., chromosome.

Building a DNA methylation classifier using hrHPV-positive lavage and brush self-samples

Next, the 12 most discriminative DNA methylation markers from the discovery screen were further analyzed using multiplex qMSP in large series of hrHPV-positive lavage self-samples (n = 245) and brush self-samples (n = 246) from women with and without CIN3 from two screening cohorts (Fig. 1; building a DNA methylation classifier). In both lavage and brush self-samples, all genes except ACAN (in lavage only; P < 0.05) showed significantly increased DNA methylation levels (P < 0.001) in self-samples from women with CIN3 compared with hrHPV-positive controls (Fig. 3).

Figure 3.

Differential DNA methylation levels of the 12 candidate methylation markers in hrHPV-positive self-samples. DNA methylation levels represented by the square root–transformed ΔΔCt ratios (y axis) in (A) lavage self-samples from hrHPV-positive controls (n = 214) and women with CIN3 (n = 31; x axis), and (B) brush self-samples from hrHPV-positive controls (n = 174) and women with CIN3 (n = 72; x axis). The three genes left of the black line are included in the 3-gene methylation classifier. *, P < 0.05; ***, P < 0.001; and NS, not significant.

To build an optimal DNA methylation classifier for detection of CIN3, which is applicable to different self-sample types, logistic regression analysis followed by stepwise selection and backward elimination was performed on the combined dataset of lavage and brush self-sample qMSP results (see Materials and Methods, Supplementary Methods, and Supplementary Fig. S4). This revealed a 3-gene methylation classifier for CIN3 detection in both self-sample types, consisting of ASCL1, LHX8, and ST6GALNAC5 (Supplementary Fig. S4; Supplementary Table S2). This 3-gene methylation classifier showed a very good clinical performance for CIN3 detection in both hrHPV-positive lavage (AUC = 0.90) and brush (AUC = 0.86) self-samples (Fig. 4A and B, black lines). At the threshold corresponding to a specificity of 80% in hrHPV-positive controls, 83% (25 of 30) of lavage self-samples and 76% (52 of 68) of brush self-samples from women with CIN3 were DNA methylation–positive (Supplementary Fig. S6).

Figure 4.

Clinical performance of the 3-gene methylation classifier for CIN3 detection in hrHPV-positive lavage and brush self-samples. ROC curve and AUC of the 3-gene methylation classifier for CIN3 detection in (A) lavage and (B) brush self-samples in classifier building set (gray) and validation set (black).

Validation of DNA methylation classifier

To validate the clinical performance of the 3-gene methylation classifier, an independent, large series of hrHPV-positive lavage self-samples (n = 199) and brush self-samples (n = 287) was analyzed by multiplex qMSP (Fig. 1; Validation of DNA methylation classifier). Solely hrHPV-positive controls and CIN3 from independent screening cohorts were used for validation of the 3-gene methylation classifier. This showed a comparable clinical performance for CIN3 detection as observed in the above-described classifier building set, in both hrHPV-positive lavage (AUC = 0.88) and brush (AUC = 0.90) self-samples (Fig. 4A and B, gray lines). The predefined threshold corresponding to an 80% specificity in the classifier building set (see above) was applied to this validation set. This resulted in a CIN3 sensitivity of 74% (26 of 35) in lavage self-samples and 88% (49 of 56) in brush self-samples, at 79% and 81% specificity in hrHPV-positive controls, respectively (Supplementary Fig. S6). To confirm these findings, we applied an alternative method (CART) on both lavage and brush self-samples, which rendered similar results to those shown here (Supplementary Methods; Supplementary Table S3; and Supplementary Figs. S7 and S8).

Furthermore, this validation series also comprised self-samples from women with CIN2 from a screening cohort. Fifty percent of these lavage self-samples (11 of 22) and brush self-samples (14 of 28) were DNA methylation–positive (Supplementary Fig. S6). Importantly, all 23 SCC (seven lavage self-samples and 16 brush self-samples; Supplementary Fig. S6), and all ACIS (one brush self-sample) and AdCA (one lavage self-sample and eight brush self-samples) scored DNA methylation–positive (Supplementary Fig. S9).
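The positivity rates quoted above follow directly from the reported counts, as the short Python check below shows. Only counts stated in the text are used; the dictionary keys and the function name are hypothetical labels.

```python
# (methylation-positive, total) counts as reported for the validation set
reported_counts = {
    "CIN3, lavage": (26, 35),
    "CIN3, brush": (49, 56),
    "CIN2, lavage": (11, 22),
    "CIN2, brush": (14, 28),
    "SCC, lavage": (7, 7),
    "SCC, brush": (16, 16),
}

def positivity_percent(positive, total):
    """Share of methylation-positive samples, in percent."""
    return 100.0 * positive / total

rates = {group: positivity_percent(pos, total)
         for group, (pos, total) in reported_counts.items()}

for group, rate in rates.items():
    print(f"{group}: {rate:.1f}%")
```

The unrounded values are 74.3% and 87.5% for CIN3 in lavage and brush self-samples, respectively, which correspond to the 74% and 88% sensitivities reported in the text.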

Here, we identified a DNA methylation classifier consisting of three methylated gene promoters, ASCL1, LHX8, and ST6GALNAC5, for the detection of CIN3 and cervical cancer in hrHPV-positive self-samples and validated the clinical performance in large series of both cervical lavage and brush self-samples from independent screening cohorts of nonattendees.

Previous publications showed that CIN lesions detected by DNA methylation analysis do not completely overlap with those detected by cytology (24). In fact, DNA methylation analysis tends to preferentially detect cervical cancer and advanced high-grade precursor lesions, defined as CIN2/3 associated with a persistent hrHPV infection of >5 years. Women with advanced CIN2/3 are presumed to have a high short-term progression risk to cancer and are therefore in need of immediate referral and treatment (24, 25). Cytology, on the other hand, detects both early and advanced CIN lesions with a moderate sensitivity of 65% to 80%, and cannot be reliably applied to self-samples, requiring a visit to the physician (9–11). DNA methylation markers are applicable to self-samples and have the potential to reduce the risk for undetected cervical cancers and advanced CIN2/3. Conversely, women with a negative DNA methylation marker test would have a low short-term cancer progression risk, indicating that immediate colposcopy referral is unnecessary. To prevent overreferral and overtreatment in hrHPV-based self-sampling, direct triage testing by DNA methylation markers in self-sampled material enables the identification of only those hrHPV-positive women with clinically relevant disease who are in need of treatment, and it allows for full molecular cervical self-screening.

This is the first study performing a discovery screen directly on self-samples, which allowed us to define an optimal DNA methylation classifier for direct molecular triage testing on hrHPV-positive self-sampled material. Our 3-gene methylation classifier showed a very good and reproducible clinical performance for detection of CIN3 in both hrHPV-positive lavage (classifier building set AUC = 0.90; classifier validation set AUC = 0.88) and brush (classifier building set AUC = 0.86; classifier validation set AUC = 0.90) self-samples. This indicates that it represents a universal triage test for both self-sample devices. Furthermore, the combined analysis of the 3-gene methylation classifier and a reference gene in a single multiplex assay saves material, costs, and time and allows for (semi)high-throughput screening.

To select the most discriminatory DNA methylation markers for CIN3 from our discovery screen on hrHPV-positive self-samples, which are rather impure due to an overrepresentation of non–disease-related cells, we applied our recently proposed GRridge model (21). This method enables objective use of codata and was shown to potentially outperform other prediction methods (Supplementary Fig. S3; ref. 26). In particular, publicly available DNA methylation data from relatively pure cervical tissue specimens, obtained by the same array platform, proved to be useful codata (22). The validity of this approach is supported by the identification of the three DNA methylation classifier genes, all of which have been previously described in DNA methylation studies on cervical cancer (22, 27, 28). The combination of GRridge (on array data) and classical logistic regression analysis (on qMSP data) enabled us to build a highly discriminative methylation classifier for CIN3 detection consisting of ASCL1, LHX8, and ST6GALNAC5. The narrow range of the 95% confidence interval of the predicted probabilities (i.e., the methylation classifier value; range, 0–1) in both lavage and brush self-samples supports a good representation of the disease state (case vs. control) in the population by the 3-gene methylation classifier (Supplementary Fig. S10). Comparison of the three markers in HPV16-positive self-samples to self-samples positive for other hrHPV types (non-HPV16), in the subset of samples with HPV typing information, revealed no significant difference in DNA methylation levels in either lavage or brush self-samples, except for LHX8 in HPV16 versus non-HPV16 controls of lavage self-samples (P = 0.03; Supplementary Fig. S11).

ASCL1, achaete-scute family bHLH transcription factor 1, is a proneural transcription factor and functions as a main regulator of differentiation in neurogenesis (29). LHX8, LIM homeobox 8, is a highly conserved transcription factor regulating cell fate in neurogenesis, tooth morphogenesis, and oogenesis (30). ST6GALNAC5, ST6 N-acetylgalactosaminide alpha-2,6-sialyltransferase 5, is a transmembrane sialyltransferase involved in the biosynthesis of gangliosides on the cell surface (31). In addition to cervical cancer, LHX8 methylation has been detected in breast cancer (32), ST6GALNAC5 methylation has been described in colorectal cancer studies (33), and ASCL1 methylation has been detected in oral and colorectal cancer (34, 35).

Of the previously described DNA methylation markers tested in self-samples (12–14, 17, 24), the DNA methylation panel FAM19A4/miR124-2 showed the best clinical performance in a large screening cohort. Analysis of the same study cohorts as used in the present study showed a CIN3+ sensitivity of 71% in lavage and 69% in brush self-samples at a specificity of 68% and 76%, respectively (14). Within the CIN3+ group, 68% of CIN3 and all cancers were detected in both self-sample types. Other DNA methylation marker panels, such as JAM3/EPB41L3/TERT/C13ORF18, have only been analyzed in small selected series of self-samples (15, 16). Combining DNA methylation markers with HPV16/18 genotyping results in higher sensitivities than DNA methylation alone, albeit at the cost of considerably lower specificities due to detection of early CIN2/3 (14, 36). Our 3-gene methylation classifier shows a better sensitivity for CIN3 than other assays in both lavage (74%) and brush (88%) self-samples in a similar screening population, at a higher specificity of 79% and 81%, respectively. These findings emphasize the validity and importance of our approach to perform the DNA methylation marker discovery directly on self-sampled material. Furthermore, the 3-gene methylation classifier detected all self-samples from women with SCC. Importantly, these self-samples showed very high predicted probabilities (median, 1.00; range, 0.54–1.00), which accentuates the value of our 3-gene methylation classifier for detection of cervical cancer. In addition, all self-samples from women with ACIS and AdCA scored DNA methylation–positive, indicating that glandular lesions are also detected by our 3-gene methylation classifier. Nevertheless, further evaluation of cervical glandular lesions and other rare cervical cancer types is warranted.
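For reference, the sensitivity and specificity figures compared above follow from simple counts of correctly classified cases and controls. The sketch below shows the arithmetic on a made-up toy data set; the counts are not study data and the function name `sensitivity_specificity` is hypothetical.

```python
def sensitivity_specificity(is_case, is_test_positive):
    """Compute (sensitivity, specificity) from parallel lists of
    disease status and test result (both booleans)."""
    pairs = list(zip(is_case, is_test_positive))
    true_pos = sum(1 for case, pos in pairs if case and pos)
    false_neg = sum(1 for case, pos in pairs if case and not pos)
    true_neg = sum(1 for case, pos in pairs if not case and not pos)
    false_pos = sum(1 for case, pos in pairs if not case and pos)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Toy data: 4 cases (3 test-positive) and 5 controls (4 test-negative).
toy_status = [True] * 4 + [False] * 5
toy_result = [True, True, True, False, False, False, False, False, True]
toy_sens, toy_spec = sensitivity_specificity(toy_status, toy_result)
```

Sensitivity is the fraction of cases (here, CIN3) that test positive, and specificity is the fraction of controls that test negative; a triage test aims for both to be high so that few lesions are missed and few women are referred unnecessarily.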

A limitation of our study is that we used cohorts of nonattending women. Therefore, further confirmation in a regular screening population is warranted. In addition, the Infinium 450K array is limited to 485,577 CpG measurements. A new version of this platform, the Infinium MethylationEPIC BeadChip array, covers over 850,000 CpG sites and would yield more discovery data, especially in the enhancer regions (37). Although the Infinium 450K array is not fully genome-wide, and may yield partly different results than other methylome analysis methods, its 485,577 probes cover 99% of RefSeq genes and 96% of all CpG islands, with multiple probes per gene and CpG island (38). Furthermore, this array is one of the most widely accepted methods for genome-wide DNA methylation profiling, and it is cost-effective (39).

In conclusion, by genome-wide DNA methylation profiling on self-samples obtained from a screening trial, we identified and validated an effective 3-gene methylation classifier for detection of CIN3 and cervical cancer in both lavage and brush self-samples from hrHPV-positive women. Moreover, this 3-gene methylation classifier showed an improved clinical performance compared with current (complex) triage strategies for the management of hrHPV-positive self-samples (13). Our findings indicate that a transition toward full molecular self-screening in hrHPV-based cervical screening programs is feasible.

D.A.M. Heideman holds ownership interest (including patents) in Self-screen B.V., and is a consultant/advisory board member for Bristol-Myers Squibb and Pfizer. P.J.F. Snijders reports receiving speakers bureau honoraria from Gen-Probe, Qiagen, Roche, and Seegene, and holds ownership interest (including patents) in Self-screen B.V. J. Berkhof is a consultant/advisory board member for GlaxoSmithKline, Merck, and Roche. C.J.L.M. Meijer is an employee of Self-screen B.V., reports receiving speakers bureau honoraria from Merck and Qiagen, holds ownership interest (including patents) in Diassay B.V., Qiagen, and Self-screen B.V., and is a consultant/advisory board member for Qiagen. R.D.M. Steenbergen holds ownership interest (including patents) in Self-screen B.V. No potential conflicts of interest were disclosed by the other authors.

Conception and design: D.A.M. Heideman, P.J.F. Snijders, C.J.L.M. Meijer, R.D.M. Steenbergen

Development of methodology: W. Verlaat, P.W. Novianti, R.D.M. Steenbergen

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): W. Verlaat, A.P. van Splunter, N.E. van Trommel, L.F.A.G. Massuger, R.L.M. Bekkers, W.J.G. Melchers, F.J. van Kemenade, C.J.L.M. Meijer

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): W. Verlaat, D.A.M. Heideman, S.M. Wilting, P.W. Novianti, C.F.W. Peeters, W.J.G. Melchers, F.J. van Kemenade, M.A. van de Wiel, R.D.M. Steenbergen

Writing, review, and/or revision of the manuscript: W. Verlaat, B.C. Snoek, D.A.M. Heideman, S.M. Wilting, P.J.F. Snijders, P.W. Novianti, A.P. van Splunter, C.F.W. Peeters, N.E. van Trommel, L.F.A.G. Massuger, R.L.M. Bekkers, W.J.G. Melchers, F.J. van Kemenade, J. Berkhof, M.A. van de Wiel, C.J.L.M. Meijer, R.D.M. Steenbergen

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): W. Verlaat, D.A.M. Heideman, C.J.L.M. Meijer

Study supervision: C.J.L.M. Meijer, R.D.M. Steenbergen

Other (development and design of original study of which samples are used in this study): R.L.M. Bekkers

Other (interpretation of the data and define clinical impact of the data): C.J.L.M. Meijer

We thank Lise De Strooper, Bart Hesselink, Maarten van der Salm, Saskia Doorn, Martijn Bogaarts, and Dénira Agard for excellent technical assistance. In addition, we thank Dr. S. Farkas for providing the raw data of her study (22). This work was supported by the European Research Council (ERC advanced 2012-AdG; 322986; Mass-Care) to C.J.L.M. Meijer and by ZonMw (Netherlands Organisation for Health Research and Development; 91216012) to M.A. van de Wiel.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Peto PJ, Gilham PC, Fletcher O, Matthews FE. The cervical cancer epidemic that screening has prevented in the UK. Lancet 2004;364:249–56.
2. Bos AB, Rebolj M, Habbema JDF, Ballegooijen M Van. Nonattendance is still the main limitation for the effectiveness of screening for cervical cancer in the Netherlands. Int J Cancer 2006;119:2372–5.
3. Gok M, Heideman DAM, Kemenade FJ Van, Berkhof J, Rozendaal L, Spruyt JWM, et al. HPV testing on self collected cervicovaginal lavage specimens as screening method for women who do not attend cervical screening: cohort study. Br Med J 2010;340:c1040.
4. Gok M, Kemenade FJ Van, Heideman DAM, Berkhof J, Rozendaal L, Spruyt JWM, et al. Experience with high-risk human papillomavirus testing on vaginal brush-based self-samples of non-attendees of the cervical screening program. Int J Cancer 2012;130:1128–35.
5. Bais AG, Kemenade FJ Van, Berkhof J, Verheijen RHM, Snijders PJF, Voorhorst F, et al. Human papillomavirus testing on self-sampled cervicovaginal brushes: an effective alternative to protect nonresponders in cervical screening programs. Int J Cancer 2007;120:1505–10.
6. Racey CS, Withrow DR, Gesink D. Self-collected HPV testing improves participation in cervical cancer screening: a systematic review and meta-analysis. Can J Public Health 2013;104:159–66.
7. Snijders PJF, Verhoef VMJ, Arbyn M, Ogilvie G, Minozzi S, Banzi R, et al. High-risk HPV testing on self-sampled versus clinician-collected specimens: a review on the clinical accuracy and impact on population attendance in cervical cancer screening. Int J Cancer 2013;132:2223–36.
8. Arbyn M, Verdoodt F, Snijders PJF, Verhoef VMJ, Suonio E, Dillner L, et al. Accuracy of human papillomavirus testing on self-collected versus clinician-collected samples: a meta-analysis. Lancet Oncol 2014;15:172–83.
9. Rijkaart DC, Berkhof J, Kemenade FJ Van, Coupé VMH, Hesselink AT, Rozendaal L, et al. Evaluation of 14 triage strategies for HPV DNA-positive women in population-based cervical screening. Int J Cancer 2012;130:602–10.
10. Dijkstra MG, Niekerk D Van, Rijkaart DC, Kemenade FJ Van, Heideman DAM, Snijders PJF, et al. Primary hrHPV DNA testing in cervical cancer screening: how to manage screen-positive women? A POBASCAM trial substudy. Cancer Epidemiol Biomarkers Prev 2014;23:55–63.
11. Garcia F, Barker B, Santos C, Brown EM, Muño T, Giuliano A, et al. Cross-sectional study of patient- and physician-collected cervical cytology and human papillomavirus. Obstet Gynecol 2003;102:266–72.
12. Verhoef VMJ, Bosgraaf RP, Van Kemenade FJ, Rozendaal L, Heideman DAM, Hesselink AT, et al. Triage by methylation-marker testing versus cytology in women who test HPV-positive on self-collected cervicovaginal specimens (PROHTECT-3): a randomised controlled non-inferiority trial. Lancet Oncol 2014;15:315–22.
13. Luttmer R, Strooper LMA De, Steenbergen RDM, Berkhof J, Snijders PJF, Heideman DAM, et al. Management of high-risk HPV-positive women for detection of cervical (pre)cancer. Expert Rev Mol Diagn 2016;16:961–74.
14. De Strooper LMA, Verhoef VMJ, Berkhof J, Hesselink AT, Bruin HME De, Van Kemenade FJ, et al. Validation of the FAM19A4/mir124-2 DNA methylation test for both lavage- and brush-based self-samples to detect cervical (pre)cancer in HPV-positive women. Gynecol Oncol 2016;141:341–7.
15. Boers A, Bosgraaf RP, van Leeuwen RW, Schuuring E, Heideman DAM, Massuger LFAG, et al. DNA methylation analysis in self-sampled brush material as a triage test in hrHPV-positive women. Br J Cancer 2014;111:1095–101.
16. Eijsink JJH, Yang N, Lendvai A, Klip HG, Volders HH, Buikema HJ, et al. Detection of cervical neoplasia by DNA methylation analysis in cervico-vaginal lavages, a feasibility study. Gynecol Oncol 2011;120:280–3.
17. Hesselink AT, Heideman DAM, Steenbergen RDM, Gök M, Van Kemenade FJ, Wilting SM, et al. Methylation marker analysis of self-sampled cervico-vaginal lavage specimens to triage high-risk HPV-positive women for colposcopy. Int J Cancer 2014;135:880–6.
18. Bosgraaf RP, Verhoef VMJ, Massuger LFAG, Siebers AG, Bulten J, de Kuyper-de Ridder GMD, et al. Comparative performance of novel self-sampling methods in detecting high-risk human papillomavirus in 30,130 women not attending cervical screening. Int J Cancer 2015;136:646–55.
19. Snellenberg S, De Strooper LMA, Hesselink AT, Meijer CJLM, Snijders PJF, Heideman DAM, et al. Development of a multiplex methylation-specific PCR as candidate triage test for women with an HPV-positive cervical scrape. BMC Cancer 2012;12:551.
20. Schmittgen TD, Livak KJ. Analyzing real-time PCR data by the comparative CT method. Nat Protoc 2008;3:1101–8.
21. Wiel MA van de, Lien TG, Verlaat W, Wieringen WN van, Wilting SM. Better prediction by use of co-data: adaptive group-regularized ridge regression. Stat Med 2016;35:368–81.
22. Farkas SA, Milutin-Gašperov N, Grce M, Nilsson TK. Genome-wide DNA methylation assay reveals novel candidate biomarker genes in cervical cancer. Epigenetics 2013;8:1213–25.
23. Christensen BC, Houseman EA, Marsit CJ, Zheng S, Wrensch MR, Wiemels JL, et al. Aging and environmental exposures alter tissue-specific DNA methylation dependent upon CpG island context. PLoS Genet 2009;5:e1000602.
24. Steenbergen RDM, Snijders PJF, Heideman DAM, Meijer CJLM. Clinical implications of (epi)genetic changes in HPV-induced cervical precancerous lesions. Nat Rev Cancer 2014;14:395–405.
25. De Strooper LMA, Meijer CJLM, Berkhof J, Hesselink AT, Snijders PJF, Steenbergen RDM, et al. Methylation analysis of the FAM19A4 gene in cervical scrapes is highly efficient in detecting cervical carcinomas and advanced CIN2/3 lesions. Cancer Prev Res 2014;7:1251–7.
26. Novianti PW, Snoek BC, Wilting SM, Wiel MA van de. Better diagnostic signatures from RNAseq data through use of auxiliary co-data. Bioinformatics 2017;33:1572–4.
27. Boers A, Wang R, van Leeuwen RW, Klip HG, Bock GH De, Hollema H, et al. Discovery of new methylation markers to improve screening for cervical intraepithelial neoplasia grade 2/3. Clin Epigenetics 2016;8:29.
28. Clarke MA, Luhn P, Gage JC, Bodelon C, Dunn ST, Walker J, et al. Discovery and validation of candidate host DNA methylation markers for detection of cervical precancer and cancer. Int J Cancer 2017;141:701–10.
29. Vasconcelos FF, Castro DS. Transcriptional control of vertebrate neurogenesis by the proneural factor Ascl1. Front Cell Neurosci 2014;8:412.
30. Zhou C, Yang G, Chen M, He L, Xiang L, Ricupero C, et al. Lhx6 and Lhx8: cell fate regulators and beyond. FASEB J 2015;29:4083–91.
31. Drolez A, Vandenhaute E, Delannoy CP, Dewald JH, Gosselet F, Cecchelli R, et al. ST6GALNAC5 expression decreases the interactions between breast cancer cells and the human blood-brain barrier. Int J Mol Sci 2016;17:1309.
32. Tommasi S, Karm DL, Wu X, Yen Y, Pfeifer GP. Methylation of homeobox genes is a frequent and early epigenetic event in breast cancer. Breast Cancer Res 2009;11:R14.
33. Øster B, Thorsen K, Lamy P, Wojdacz TK, Hansen LL, Birkenkamp-Demtröder K, et al. Identification and validation of highly frequent CpG island hypermethylation in colorectal adenomas and carcinomas. Int J Cancer 2011;129:2855–66.
34. Jin B, Yao B, Li J-L, Fields CR, Delmas AL, Liu C, et al. DNMT1 and DNMT3B modulate distinct polycomb-mediated histone modifications in colon cancer. Cancer Res 2009;69:7412–21.
35. Li Y-F, Hsiao Y-H, Lai Y-H, Chen Y-C, Chen Y-J, Chou J-L, et al. DNA methylation profiles and biomarkers of oral squamous cell carcinoma. Epigenetics 2015;10:229–36.
36. Verhoef VMJ, Heideman DAM, Van Kemenade FJ, Rozendaal L, Bosgraaf RP, Hesselink AT, et al. Methylation marker analysis and HPV16/18 genotyping in high-risk HPV positive self-sampled specimens to identify women with high grade CIN or cervical cancer. Gynecol Oncol 2014;135:58–63.
37. Moran S, Arribas C, Esteller M. Validation of a DNA methylation microarray for 850,000 CpG sites of the human genome enriched in enhancer sequences. Epigenomics 2016;8:389–99.
38. Bibikova M, Barnes B, Tsan C, Ho V, Klotzle B, Le JM, et al. High density DNA methylation array with single CpG site resolution. Genomics 2011;98:288–95.
39. Yong W-S, Hsu F-M, Chen P-Y. Profiling genome-wide DNA methylation. Epigenetics Chromatin 2016;9:26.

Supplementary data