Abstract
Human papillomavirus (HPV) in oropharyngeal squamous cell carcinoma (OPSCC) is tumorigenic and has been associated with a favorable prognosis compared with OPSCC caused by tobacco, alcohol, and other carcinogens. Meanwhile, machine learning has evolved as a powerful tool to predict molecular and cellular alterations from medical images of various sources.
We generated a deep learning–based HPV prediction score (HPV-ps) on regular hematoxylin and eosin (H&E) stains and assessed its performance to predict HPV association using 273 patients from two different sites (OPSCC; Giessen, n = 163; Cologne, n = 110). Then, the prognostic relevance in a total of 594 patients (Giessen, Cologne, HNSCC TCGA) was evaluated. In addition, we investigated whether four board-certified pathologists could identify HPV association (n = 152) and compared the results to the classifier.
Although pathologists were able to diagnose HPV association from H&E-stained slides (AUC = 0.74, median of four observers), the interrater reliability was minimal (Light Kappa = 0.37; P = 0.129), as compared with AUC = 0.8 using the HPV-ps within two independent cohorts (n = 273). The HPV-ps identified individuals with a favorable prognosis in a total of 594 patients from three cohorts (Giessen, OPSCC, HR = 0.55, P < 0.0001; Cologne, OPSCC, HR = 0.44, P = 0.0027; TCGA, non-OPSCC head and neck, HR = 0.69, P = 0.0073). Interestingly, the HPV-ps further stratified patients when combined with p16 status (Giessen, HR = 0.06, P < 0.0001; Cologne, HR = 0.3, P = 0.046).
Detection of HPV association in OPSCC using deep learning on regular H&E stains may be used either as a single biomarker or in combination with p16 status to identify patients with OPSCC with a favorable prognosis, potentially outperforming combined HPV-DNA/p16 status as a biomarker for patient stratification.
Human papillomavirus (HPV) in oropharyngeal squamous cell carcinoma (OPSCC) is tumorigenic and has been associated with a favorable prognosis. Within our study, we highlight the significance of assessing HPV-related morphologic changes within OPSCC by proposing an HPV prediction score (HPV-ps) that identified patients with a favorable prognosis. Future studies, including deescalation trials, may find it meritorious to use HPV prediction either as a single marker or in combination with p16 status.
Introduction
Cancers arising in the head and neck region, particularly the oropharynx, are often related to persistent infection with high-risk human papillomavirus (HPV; refs. 1–3). As incidences are rising and patients with HPV-associated OPSCC generally display a significantly better prognosis than those with HPV-negative OPSCC, biomarkers to guide personalized treatment concepts are urgently needed to spare these patients the high rates of therapy-related side effects (4–6). The unmet need to identify patients qualifying for less-toxic treatment regimens using optimized clinical trial design and more precise inclusion criteria (7) is reflected by the failure of recent phase III deescalation trials, in which the less toxic mAb cetuximab proved an insufficient replacement for cisplatin (8, 9).
For identification of HPV-related OPSCC, p16INK4A (p16) has been included as a valid biomarker in the AJCC-8/UICC-8 staging system. However, few studies have investigated the role of sole p16 IHC to identify HPV-related OPSCC versus dual testing (p16 and HPV-DNA detection), and these reported a discrepancy in 5%–20% of cases (10–13). Furthermore, it was demonstrated that a subgroup of patients with sole p16 positivity displayed an overall survival similar to that of patients with HPV-negative OPSCC (13). This is in contrast to other studies that reported high accuracy of p16 for diagnosis of HPV-related OPSCC (14, 15). Clinically, the AJCC-8/UICC-8 staging manual solely uses p16 IHC as a biomarker for evaluation of HPV association in OPSCC. At the same time, there are limitations in tissue quality and specimen origin for HPV PCR testing using formalin-fixed paraffin-embedded (FFPE) samples that may challenge an accurate assessment of HPV status, although current technologies and automated tests, as well as liquid biopsies, are emerging (16–18).
In parallel, deep learning has evolved as a powerful tool for precision medicine, allowing classification and segmentation of several kinds of medical images, including regular histologic H&E specimens (19, 20). A recently published study showed that p16 status can be predicted from regular CT images alone with an AUC of approximately 0.7 (21).
On the basis of this idea, we developed a deep learning model to identify HPV status from H&E slides. In detail, we (i) generated an algorithm to detect areas of viable tumor cells within OPSCC and HNSCC, (ii) trained a network exclusively on cases of OPSCC with a dichotomous status for both HPV-DNA and p16 (HPV-DNA+/p16+; HPV-DNA−/p16−), and (iii) determined the prognostic relevance of the HPV prediction score (HPV-ps) either alone or in combination with p16 status.
Materials and Methods
Patients
Patients from two different sites were included. Patients of the Giessen and the Cologne cohorts were diagnosed with primary squamous cell carcinoma of the oropharynx (ICD-O code C10, International Classification of Diseases for Oncology) and treated either at the Department of Oto-Rhino-Laryngology, Head and Neck Surgery of the University Hospital of Giessen, Germany between 2000 and 2009 (n = 163, Supplementary Table S1) or at the Department of Oto-Rhino-Laryngology, Head and Neck Surgery, University of Cologne, Germany between 2005 and 2019 (n = 110, Supplementary Table S2). The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the regional ethics committees (Giessen: AZ 95/15, dated October 19, 2015; Cologne: AZ 19–1288_1, dated February 3, 2020). Informed written consent was obtained from each subject. Patient characteristics were recorded prospectively by the Giessen cancer registry database (GTDS), as well as by the cancer registry database of the Center for Integrated Oncology (CIO), Cologne. Patients at both sites were either treated with upfront surgery and concomitant (chemo)radiotherapy as appropriate or definitive chemoradiotherapy following approved guidelines. Characteristics of the cohort from the TCGA database of HNSCC are summarized in Supplementary Table S3.
HPV-DNA and p16 status
Corresponding p16 status on tumor tissue was assessed by staining and scoring for p16 according to EORTC guidelines, based on cytoplasmic and nuclear p16 expression in 70% of tumor cells (22, 23). Isolation of tumor DNA and HPV genotyping were performed as described previously (24). A dichotomous HPV+/p16+ status was declared if both a high-risk HPV variant and p16 positivity were present.
Whole-slide images and processing
Regular H&E-stained slides, prepared following standard protocols, were scanned using a NanoZoomer S360 (Hamamatsu Photonics) whole-slide scanning device at 40× magnification. All digitized slides were evaluated for image quality and included when more than 90% of the tissue area was in focus. Then, the whole-slide images were processed and 1,000 × 1,000 pixel image tiles from relevant tumor areas were created.
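The tiling step can be illustrated with a minimal sketch. It assumes non-overlapping tiles on a regular grid and a hypothetical callable `tumor_fraction` that reports how much of a tile is covered by tumor; the grid layout, the callable, and the `min_fraction` cutoff are illustrative assumptions and are not taken from the study.

```python
def tile_origins(width, height, tile=1000):
    """Top-left (x, y) of every full, non-overlapping tile in a width x height image."""
    return [
        (x, y)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


def tumor_tiles(width, height, tumor_fraction, tile=1000, min_fraction=0.5):
    """Keep only tiles whose tumor coverage reaches min_fraction.

    tumor_fraction(x, y, tile) is a hypothetical callable returning a value
    in [0, 1]; min_fraction is an assumed cutoff, not a value from the paper.
    """
    return [
        (x, y)
        for (x, y) in tile_origins(width, height, tile)
        if tumor_fraction(x, y, tile) >= min_fraction
    ]


# Toy example: a 2,500 x 2,000 pixel image in which only the left half is tumor.
left_half_is_tumor = lambda x, y, tile: 1.0 if x < 1000 else 0.0
print(tumor_tiles(2500, 2000, left_half_is_tumor))
```

Partial tiles at the right and bottom borders are dropped in this sketch; a real pipeline might pad or overlap them instead.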
Statistical analysis
For evaluation of interdependence of clinicopathologic parameters, statistical analyses were performed using SPSS statistical software (IBM SPSS 25.0). P values were calculated by Pearson χ2 test, asymptotic, two-sided. Light Kappa was used to determine inter-rater reliability as described previously (25). Survival analysis for the cohort was performed using a Cox Proportional-Hazards Model.
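For readers unfamiliar with the statistic, Light's kappa is the mean of Cohen's kappa over all pairs of raters. The following is a minimal sketch of that definition in plain Python; the ratings are hypothetical and the function names are illustrative. The study itself used SPSS, so this is not the authors' implementation.

```python
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two raters who labelled the same cases."""
    if len(a) != len(b) or not a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(a)
    labels = set(a) | set(b)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def light_kappa(ratings):
    """Light's kappa: mean of Cohen's kappa over all rater pairs."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical calls (1 = HPV-associated, 0 = not) by three raters on six cases.
ratings = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 0],
]
print(round(light_kappa(ratings), 3))
```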
Expert review process–evaluation of H&E slides for HPV status
Four experienced, board-certified pathologists reviewed n = 152 cases of OPSCC using virtual whole-slide images for classification as either HPV-positive or HPV-negative OPSCC. The analysis was done blinded, while six training cases, including three HPV-associated and three non–HPV-associated OPSCC, were provided to the observers for training purposes. We also compiled the morphologic consensus criteria used by the four expert pathologists to classify HPV-associated or non–HPV-associated tumors. The following criteria were agreed to be indicative of an HPV+ tumor (their absence or only weak expression corresponds to non–HPV-associated tumors): (i) confluent tumor growth (so-called “medullary” features), (ii) low level of desmoplastic stroma reaction (little stroma, few fibroblasts), (iii) pronounced accompanying lymphocytic inflammation, whereby the lymphocytes are in close spatial contact with the tumor cells (lymphocytes are not predominantly located in the stroma between the tumor cells but “seek” proximity to the carcinoma cells), (iv) basaloid nuclear morphology, (v) absence of keratinization by the tumor cells, (vi) absence of dysplasia at the edge of the tumor. Then, the corresponding test cases were classified as HPV-associated or HPV-negative based on digitized whole-slide images.
Training dataset, general
To account for variability in color and staining, and to make the model more generalizable, the training cohort was built from both TCGA data and whole-slide images from two different sites (Cologne and Giessen, Germany). Only cases that were either HPV-DNA+/p16+ (double positive) or HPV-DNA−/p16− (double negative) were included in the training dataset.
Training data, image segmentation
A U-Net style network (26) was trained to detect relevant tumor areas on H&E OPSCC virtual whole-slide images. In total, 1,385 image crops (1,000 × 1,000) were annotated for viable tumor areas using a balanced dataset of HPV-positive and HPV-negative cases (705 image crops from 44 HPV-positive cases, 679 image crops from 59 HPV-negative cases; see Supplementary Table S4). In detail, images were annotated by a trained pathologist and several augmentation steps were applied, including stain normalization on H&E images (27), as well as gray-scale augmentation with a ten-percent probability using Albumentations (28). Training was performed on an NVIDIA RTX 6000 using the PyTorch framework and Adam as an optimizer (29). Training was performed on a mixture of TCGA cases, as well as cases from Giessen and Cologne. All training cases were completely independent of the test sets.
Training data, image classification
To distinguish HPV-associated from HPV-negative images, we trained a DenseNet (30) using extracted tumor patches (224 × 224) that had been detected by our segmentation algorithm (U-Net) as areas of viable tumor cells. A balanced training dataset of 3,939 image crops from HPV-positive cases and 6,061 image crops from HPV-negative cases was used (total of 10,000 tumor patches of 1,000 × 1,000 pixels; see Supplementary Table S4). Stochastic gradient descent (SGD) was used as an optimizer (31) for the classifier. The HPV-ps was calculated on the basis of the network's output, using the scores of all tumor patches: for a given tumor, the median of all patch scores was added to the highest patch score. Thresholds were set at the 30th and 70th percentiles (high, >70th percentile; medium, between the 30th and 70th percentiles; low, <30th percentile).
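The score aggregation and thresholding described above can be sketched as follows. This is a minimal illustration, assuming the per-patch network outputs are available as a list of floats; the function names, the inclusive percentile method, and the assignment of scores exactly on a cutoff to the medium group are assumptions, as the text does not specify them.

```python
from statistics import median, quantiles


def hpv_prediction_score(patch_scores):
    """Median of all patch scores plus the highest patch score for one tumor."""
    if not patch_scores:
        raise ValueError("at least one tumor patch score is required")
    return median(patch_scores) + max(patch_scores)


def percentile_cutoffs(cohort_scores):
    """Return the 30th and 70th percentiles of a cohort's HPV-ps values."""
    deciles = quantiles(cohort_scores, n=10, method="inclusive")
    return deciles[2], deciles[6]  # third and seventh decile cut points


def risk_group(score, low_cut, high_cut):
    """Assign low / medium / high; ties with a cutoff fall into 'medium' (assumed)."""
    if score > high_cut:
        return "high"
    if score < low_cut:
        return "low"
    return "medium"


# Hypothetical patch scores for one tumor and a toy cohort of eleven scores.
print(hpv_prediction_score([0.1, 0.4, 0.9]))
low_cut, high_cut = percentile_cutoffs(list(range(11)))
print(risk_group(8, low_cut, high_cut))
```

Adding the maximum to the median makes the score sensitive both to the overall tendency of a tumor and to its single most HPV-like patch.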
Results
Classifying HPV-associated OPSCC using deep learning
We developed a deep learning–based algorithm that would allow (i) detection of areas of viable tumor cells, as well as (ii) classification of image patches according to HPV status (Fig. 1A). By training a U-Net architecture for image segmentation to identify tumor areas within OPSCC tumors, our approach allowed a consistent and controllable selection of the image information passed to the classifier. Following the extraction of relevant tumor patches from OPSCC, a DenseNet architecture classified images according to HPV status to provide an HPV prediction score (HPV-ps, Fig. 1A).
Figure 1. Schematic of the algorithm and study approach. A, A U-Net deep convolutional neural network detects areas of viable tumor cells of OPSCC and HNSCC. Extracted tumor patches (green line circles tumor areas) are then classified using a DenseNet, where an HPV prediction score (HPV-ps) is assigned. Scale bar is 2 mm. B, Schematic overview of three cohorts being analyzed (n = 594).
We then evaluated the performance of the proposed HPV-ps in predicting HPV status in OPSCC, as well as its prognostic relevance for stratifying individuals with OPSCC and non-OPSCC (Fig. 1B).
Pathologists can predict HPV status from regular whole-slide images
To initially explore whether pathologists could also predict HPV status from morphologic information on regular H&E stains in OPSCC, four board-certified pathologists with more than 10 years of experience were asked to determine HPV status using H&E slides. To control for comparable sources of information, the blinded analysis was done on digitized images.
While the performance of the observers was generally sufficient (AUC = 0.74; median of four observers), the inter-rater reliability (Light Kappa = 0.37; P = 0.129) was minimal (Table 1). Still, features of HPV-association could be detected on regular H&E whole-slide-images by human observers.
Table 1. Pathologist evaluation of HPV status on OPSCC.
Statistic | Observer 1 | Observer 2 | Observer 3 | Observer 4 |
---|---|---|---|---|
Sensitivity | 67.86% | 71.43% | 92.86% | 85.71% |
Specificity | 69.35% | 88.62% | 32.26% | 75.00% |
Positive likelihood ratio | 2.21 | 6.28 | 1.37 | 3.43 |
Negative likelihood ratio | 0.46 | 0.32 | 0.22 | 0.19 |
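The likelihood ratios reported for the four observers follow directly from sensitivity and specificity: LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The short sketch below recomputes them from the tabulated percentages as a consistency check; the function and variable names are illustrative.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity per observer, taken from Table 1.
table1 = {
    "Observer 1": (0.6786, 0.6935),
    "Observer 2": (0.7143, 0.8862),
    "Observer 3": (0.9286, 0.3226),
    "Observer 4": (0.8571, 0.7500),
}

for name, (sens, spec) in table1.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Rounded to two decimals, the recomputed values agree with the likelihood ratios listed in the table.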
Predicting HPV status using H&E images of OPSCC with help of deep learning
We then applied our deep learning–based algorithm (HPV-ps) to two OPSCC cohorts from two different sites that were independent of the training dataset (Giessen, n = 163; Cologne, n = 110). The HPV-ps showed an AUC of 0.8 (95% CI, 0.73–0.87) in the Giessen cohort and an AUC of 0.8 (95% CI, 0.71–0.89) in the Cologne cohort (Fig. 2A and B).
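As a reminder of what the AUC measures, it equals the probability that a randomly chosen HPV-associated case receives a higher score than a randomly chosen non–HPV-associated case, with ties counted as one half. The sketch below computes it in this pairwise form on hypothetical scores and labels; it illustrates the metric only and does not reproduce the study data or the software the authors used.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical HPV-ps values and reference status (1 = HPV-DNA+/p16+).
scores = [1.2, 0.8, 0.3, -0.1, -0.5, 0.8]
labels = [1, 1, 0, 1, 0, 0]
print(round(roc_auc(scores, labels), 3))
```

The pairwise form is quadratic in the number of cases, which is unproblematic for cohorts of a few hundred patients.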
Figure 2. Performance of an algorithm to predict HPV status on regular H&E whole-slide images. A, Receiver operating characteristic curve for the HPV-ps compared with HPV status (HPV-DNA+/p16+) in OPSCC of Giessen, n = 163, and (B) Cologne, n = 110. C, Comparison of the cumulative number of pathologists that declared HPV association of OPSCC using regular digitized H&E whole-slide images. The cumulative number of the observers is plotted against the HPV-ps for the corresponding cases. The HPV association (HPV-DNA+/p16+; HPV-DNA−/p16−) is indicated within the legend. The mean of the HPV-ps for the given number of pathologists is indicated using a line. The dot size of the individual data points represents the cumulative number of pathologists that declared OPSCC as HPV-associated (total number of cases n = 152). D, Grad-CAM layer maximization visualization of an image patch (224 × 224) that revealed a high score of HPV presence. The top image represents the original H&E stain. The lower image shows a heatmap of probability, where regions of high probability of HPV association appear in yellow. Highlighted in yellow are tumor cells with vacuolized cytoplasm and spike-like nuclear elongation.
To understand the decisions of our network, we correlated the scores of the algorithm with the predictions of the four observers on the same OPSCC (Fig. 2C). Here, the median of the HPV-ps of cases where either one or none of the four observers declared HPV association was −0.5 (n = 84), while the median of the HPV-ps where two or more observers declared HPV association was 0.8 (n = 68; P < 0.0001, Mann–Whitney U test; Fig. 2C). In addition, there were 24 cases where two or more pathologists agreed on HPV association that scored as HPV-ps-high. Conversely, there were 8 cases where two or more pathologists determined HPV association that were scored as HPV-ps-low. There was only one case scored as HPV-ps-high for which none of the four expert reviewers declared HPV association. However, this case was found to be HPV-DNA+/p16+ (Fig. 2C).
When the HPV predictions of the human experts were evaluated, observer 2 and observer 4 had the best overall performance in declaring HPV-associated OPSCC (Table 1). We therefore compared their declarations with the HPV-ps and demonstrated a significant overlap of cases that had been determined as HPV-associated (Light Kappa = 0.46, P = 0.00088). Together, these results support the idea that our HPV-ps detected phenotypical characteristics of HPV-associated OPSCC on regular H&E stains, which were also relevant to expert human observers.
To further determine which image information was relevant to the network's output, we applied gradient-weighted class activation mapping (Grad-CAM) to visualize the image information that was significant to the network's output (32). Cellular changes, including vacuolized cytoplasm and axon-like elongated nuclei, showed high activations of the network's layers (Fig. 2D).
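The core arithmetic of Grad-CAM can be sketched in a few lines: each feature map of a convolutional layer is weighted by the spatial average of the gradient of the class score with respect to that map, the weighted maps are summed, and negative values are clipped. The example below assumes that activations and gradients have already been extracted from a trained network and are supplied as nested lists; the values are hypothetical and the code is not the implementation used in the study.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into a Grad-CAM heatmap.

    activations, gradients: nested lists indexed [channel][row][col] for one
    convolutional layer; gradients are d(class score) / d(activation).
    """
    rows, cols = len(activations[0]), len(activations[0][0])
    heatmap = [[0.0] * cols for _ in range(rows)]
    for act, grad in zip(activations, gradients):
        # Channel weight: global average of the gradient over all positions.
        weight = sum(sum(row) for row in grad) / (rows * cols)
        for i in range(rows):
            for j in range(cols):
                heatmap[i][j] += weight * act[i][j]
    # ReLU keeps only regions that support the class of interest.
    return [[max(0.0, v) for v in row] for row in heatmap]


# Hypothetical layer with two 2 x 2 feature maps.
activations = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]], [[-1, -1], [-1, -1]]]
print(grad_cam(activations, gradients))
```

In practice the resulting low-resolution map is upsampled to the patch size and overlaid on the H&E image, as in a heatmap visualization.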
HPV-ps identified OPSCC and non-OPSCC HNSCC patients with a favorable prognosis
We next asked whether the HPV-ps could be used to identify patients with OPSCC with a favorable prognosis, as this has been demonstrated for p16 status as a surrogate marker for HPV association (14). Patients with a high HPV-ps showed a favorable prognosis in two cohorts from two different sites [Giessen, OPSCC, HR = 0.55 (95% CI = 0.42–0.72), P < 0.0001; Cologne, OPSCC, HR = 0.44 (95% CI = 0.26–0.75), P = 0.0027; Table 2; Fig. 3A and B].
Table 2. Comparison of HPV and p16 with the HPV prediction score on survival.
Marker (G, Giessen; C, Cologne) | HR (95% CI) | P |
---|---|---|
HPV-DNA+/p16+ (G) | 0.26 (0.13–0.54) | 0.00026 |
HPV-ps-high/p16+ (G) | 0.06 (0.02–0.26) | 0.00011 |
p16+ (G) | 0.26 (0.14–0.48) | 0.000013 |
HPV-ps (G) | 0.55 (0.42–0.72) | 0.000013 |
HPV-DNA+/p16+ (C) | 0.37 (0.18–0.77) | 0.0083 |
HPV-ps-high/p16+ (C) | 0.3 (0.09–0.98) | 0.046 |
p16+ (C) | 0.37 (0.18–0.77) | 0.0083 |
HPV-ps (C) | 0.44 (0.26–0.75) | 0.0027 |
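A hazard ratio and its confidence interval are linked to the P value through the Wald statistic of the Cox model: HR = exp(β), the 95% CI is exp(β ± 1.96 × SE), and z = β / SE. The sketch below recovers SE, z, and a two-sided P value from a reported HR and CI. It assumes the intervals are Wald intervals on the log scale, which the text does not state, and because the tabulated values are rounded the results are approximations only.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE of log(HR), Wald z, and two-sided P from HR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return se, z, p


# HPV-ps in the Giessen cohort: HR = 0.55 (95% CI, 0.42-0.72).
se, z, p = wald_from_hr(0.55, 0.42, 0.72)
print(f"SE = {se:.3f}, z = {z:.2f}, P = {p:.1e}")
```

For this row the recovered P value is on the order of 10⁻⁵, in line with the value listed in the table.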
Figure 3. Kaplan–Meier curves stratified for HPV-ps within patients with OPSCC. A, Kaplan–Meier curve for the Giessen cohort showing the survival divided by low (<30th percentile), medium (between 30th and 70th percentile), and high (>70th percentile) HPV prediction scores (n = 155). B, Kaplan–Meier curve for the Cologne cohort showing the survival divided by low (<30th percentile), medium (between 30th and 70th percentile), and high (>70th percentile; n = 110). Cox proportional-hazards model.
Having shown that the HPV-ps was associated with a favorable prognosis in patients with OPSCC from two different sites, we investigated whether the HPV-ps could also be applied to a cohort of non-OPSCC head and neck cancer patients with unknown HPV-DNA/p16 status (TCGA, n = 329). Interestingly, the HPV-ps was associated with a favorable prognosis within this unselected cohort of non-OPSCC head and neck cancer patients [TCGA, HR = 0.69 (95% CI = 0.52–0.9); P = 0.0073; Supplementary Fig. S1]. The prognostic effect of the HPV-ps could also be observed using a multivariate analysis within the three cohorts (Giessen OPSCC, Cologne OPSCC, TCGA non-OPSCC HNSCC; Supplementary Table S5).
HPV-ps further stratified HPV-associated OPSCC but had no prognostic value in non–HPV-associated OPSCC patients
To assess whether the HPV-ps was specific to HPV status, we analyzed the prognostic effect of the HPV-ps within HPV-DNA−/p16− OPSCC cases. Interestingly, the HPV-ps was not associated with prognosis within this subset of OPSCC [Giessen, HR = 0.78 (95% CI = 0.58–1.1), P = 0.12; Cologne, HR = 0.66 (95% CI = 0.26–1.6), P = 0.36; Table 3]. Conversely, the HPV prediction score did further stratify HPV-associated (HPV-DNA+/p16+) OPSCC patients [Giessen, HR = 0.11 (95% CI = 0.025–0.45), P = 0.0022; Cologne, HR = 0.45 (95% CI = 0.2–0.99), P = 0.048; Table 3].
Table 3. Comparison of HPV prediction score according to HPV-DNA−/p16− and HPV-DNA+/p16+.
Subgroup | Description (cohort) | HR (95% CI) | P |
---|---|---|---|
HPV-DNA−/p16− | HPV-ps (Giessen) | 0.78 (0.58–1.1) | 0.12 |
HPV-DNA−/p16− | HPV-ps (Cologne) | 0.66 (0.26–1.6) | 0.36 |
HPV-DNA+/p16+ | HPV-ps (Giessen) | 0.11 (0.025–0.45) | 0.0022 |
HPV-DNA+/p16+ | HPV-ps (Cologne) | 0.45 (0.2–0.99) | 0.048 |
By combining both cohorts from Giessen and Cologne, we next explored whether the HPV-ps could indeed provide prognostic relevance within HPV-associated OPSCC. Here, HPV-associated (HPV-DNA+/p16+) OPSCC could further be stratified according to the HPV-ps (Fig. 4A), while this effect was lost within non–HPV-associated (HPV-DNA−/p16−) OPSCC (Fig. 4B). Together, these results further indicated that HPV-associated morphologic changes were relevant to the classifier.
Figure 4. Kaplan–Meier curves stratified for HPV-ps within OPSCC patients in HPV-associated and non–HPV-associated tumors. A, Kaplan–Meier curve for HPV-associated (HPV-DNA+/p16+) OPSCC from both the Giessen and Cologne cohorts showing the survival divided by low (<30th percentile), medium (between 30th and 70th percentile), and high (>70th percentile) HPV prediction scores (n = 94). B, Kaplan–Meier curve for non–HPV-associated (HPV-DNA−/p16−) OPSCC from both the Giessen and Cologne cohorts showing the survival divided by low (<30th percentile), medium (between 30th and 70th percentile), and high (>70th percentile; n = 171). Cox proportional-hazards model.
Combining the HPV-ps to p16 status in OPSCC improves stratification of patients
As recent deescalation trials using p16 status as a single biomarker showed limitations in stratifying patients for less toxic regimens, we analyzed whether combining the HPV-ps with p16-status could add prognostic relevance to this established marker.
Interestingly, cases that demonstrated a high HPV-ps and were p16-positive had a better prognosis than was obtained by stratifying patients using p16 status alone or HPV-DNA status and p16 (HPV/p16) in combination. This effect could be observed within the Giessen cohort [HPV-ps/p16+, HR = 0.06 (95% CI, 0.02–0.26), P < 0.0001; p16+/HPV+, HR = 0.26 (95% CI, 0.13–0.54), P = 0.00026; Table 2; Supplementary Fig. S2A], as well as in the Cologne cohort with dichotomous HPV-DNA+/p16+ status [HPV-ps/p16+, HR = 0.3 (95% CI, 0.09–0.98), P = 0.046; HPV-DNA+/p16+, HR = 0.37 (95% CI, 0.18–0.77), P = 0.0083; Table 2; Supplementary Fig. S2B].
Discussion
As recent deescalation trials in OPSCC patient cohorts showed disappointing results using p16 status as a single stratification biomarker, there is a diagnostic need to identify clinically applicable biomarkers that enable stratification of patients for less-toxic treatment regimens (8, 9).
Within our study, we highlight that a score that was exclusively trained on dichotomous HPV-DNA-positive and p16-positive OPSCC using regular H&E stains identified patients with a favorable prognosis from three different cohorts (Giessen, Cologne, TCGA-HNSCC; n = 594). Combining the HPV-ps with p16-status further stratified patients with a favorable prognosis, outperforming both p16-status alone, as well as p16- and HPV-DNA-status in combination.
This may be of particular clinical interest, as HPV-DNA testing of FFPE tissue using molecular techniques (e.g., PCR) may sometimes be challenged by tissue quality, speed, and interpretation (identification of HPV high-risk variants, as well as the threshold of the sensitive PCR technology). In general, our approach (i) can deliver results within minutes, and (ii) is insensitive to DNA or RNA degradation related to tissue fixation or general sample quality.
While these results are promising, they rely on the classification of tumor patches by a deep convolutional neural network, which makes it difficult to interpret clearly what the network detects (33). However, by extracting image information only from areas of viable tumor cells, we control the input of the classifier. In addition, when HPV status was assessed by trained pathologists (>10 years' experience), the overlap between the classifier and the two best-performing experts (observer 2, observer 4; Light Kappa = 0.46; P = 0.00088) provides evidence that HPV-associated morphologic changes were relevant to the classifier. Furthermore, we compared cases that had been analyzed by human observers in an unbiased way. Unlike other attempts, we did not compare only the positive results of the classifier and evaluate their morphologic characteristics, but instead compared a sizeable number of cases (n = 152) that had been analyzed prospectively. In addition, there was no prognostic effect of the HPV-ps within HPV-negative OPSCC, indicating a specificity of the classifier toward HPV-related image information (Fig. 4A and B). Moreover, patient characteristics showed a similar pattern for HPV/p16 status and the HPV prediction score (Supplementary Tables S1 and S2), which also argues against differing tumor stages between the groups affecting the prognostic relevance (Supplementary Tables S1–S3). Unlike previous attempts that showed efficacy in determining p16 status using radiological images, we trained our classifier exclusively on HPV-DNA+/p16+ cases (21). This may guide future attempts to identify HPV-related OPSCC, as previous studies highlighted that p16 status as a single biomarker may not be sufficient (13).
Together, we highlight the significance of assessing HPV-related morphological changes within OPSCC by proposing an HPV prediction score (HPV-ps) that identified patients with a favorable prognosis. Future studies, including deescalation trials, may find it meritorious to use HPV prediction either as a single marker or in combination with p16 status. In summary, this may allow for classification of OPSCC into different risk groups and potential stratification of patients who may qualify for less toxic therapeutic options.
Authors' Disclosures
E.-S. Prigge reports grants and personal fees from MSD Sharp & Dohme GmbH, and personal fees from Institut für Frauengesundheit (IFG) GmbH outside the submitted work. M. Maltseva reports grants and personal fees from EFRE NRW during the conduct of the study. J.P. Klussmann reports grants and personal fees from MSD and personal fees from Merck and BMS outside the submitted work. N. Wuerdemann reports grants from European Union Funds for Regional development (EFRE) and the German State of North Rhine Westphalia (NRW) during the conduct of the study, as well as other from German Research Council and personal fees from Merck Sharp & Dohme and Merck Serono GmbH outside the submitted work. No disclosures were reported by the other authors.
Authors' Contributions
S. Klein: Conceptualization, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, visualization, methodology, writing-original draft, project administration, writing-review and editing. A. Quaas: Conceptualization, formal analysis, writing-review and editing. J. Quantius: Data curation, formal analysis, writing-review and editing. H. Löser: Formal analysis, validation, investigation. J. Meinel: Formal analysis, validation, investigation. M. Peifer: Conceptualization, writing-review and editing. S. Wagner: Resources, investigation, writing-review and editing. S. Gattenlöhner: Resources, investigation. C. Wittekindt: Resources, investigation, writing-review and editing. M. von Knebel Doeberitz: Resources, investigation, writing-review and editing. E.-S. Prigge: Resources, investigation. C. Langer: Resources, investigation. K.-W. Noh: Resources, investigation, writing-review and editing. M. Maltseva: Resources, investigation. H.C. Reinhardt: Conceptualization, writing-review and editing. R. Büttner: Conceptualization, resources, formal analysis, funding acquisition, writing-review and editing. J.P. Klussmann: Conceptualization, funding acquisition, writing-original draft, writing-review and editing. N. Wuerdemann: Resources, data curation, formal analysis, investigation, writing-original draft, writing-review and editing.
Acknowledgments
S. Klein received a grant from the Else Kröner-Fresenius-Stiftung (Kolleg_2016). S. Klein, J.P. Klussmann, A. Quaas, and R. Büttner received funding from the European Union Funds for Regional development (EFRE) and the German State of North Rhine Westphalia (NRW). R. Büttner was supported by a grant from the Deutsche Krebshilfe to the Center for Integrated Oncology Cologne (CIO). N. Wuerdemann was supported by the Cologne Clinician Scientist Program (CCSP), funded by the German Research Council (FI 773/15–1).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.