Abstract
Oral cavity and oropharyngeal cancer (oral cancer) is a deadly disease that is increasing in incidence. Worldwide 5-year survival is only 50% because of delayed intervention, with more than half of diagnoses made at stage III or IV, whereas earlier detection (stage I and II) yields survival rates of up to 80% to 90%. Salivary soluble CD44 (CD44), a tumor-initiating marker, and total protein levels may facilitate oral cancer risk assessment and early intervention. This study used a hospital-based design with 150 cases and 150 frequency-matched controls to determine whether CD44 and total protein levels in oral rinses were associated with oral cancer independent of age, gender, race, ethnicity, tobacco and alcohol use, and socioeconomic status (SES). High-risk subjects receiving oral cancer prevention interventions as part of a community-based program (n = 150) were followed over 1 year to determine marker specificity and variation. CD44 ≥5.33 ng/mL was highly associated with case status [adjusted OR 14.489; 95% confidence interval (CI), 5.973–35.145; P < 0.0001, vs. reference group CD44 <2.22 ng/mL and protein <1.23 mg/mL]. Total protein aided prediction above CD44 alone. Sensitivity and specificity in the frequency-matched study were 80% and 48.7%, respectively. However, controls were not representative of the target screening population due, in part, to a high rate of prior cancer. In contrast, specificity in the high-risk community was 74% and reached 95% after annual retesting. Simple and inexpensive salivary CD44 and total protein measurements may help identify individuals at heightened risk for oral cancer from the millions who partake in risky behaviors. Cancer Prev Res; 9(6); 445–55. ©2016 AACR.
Introduction
Head and neck squamous cell carcinoma (HNSCC), which includes cancers of the oral cavity, pharynx, and larynx, affects 550,000 people worldwide each year (1). In India, oral cancer, defined here as cancers of the oral cavity and oropharynx, is the most common fatal cancer in middle-aged men, and it is the costliest cancer in low-income countries (2, 3). The main risk factors include tobacco use, alcohol use, and human papillomavirus (HPV) infection (4–6). The incidence of oral cancer is rising with the increasing incidence of HPV+ oropharyngeal cancer (7).
Worldwide 5-year survival reaches only 50%, largely due to late-stage (III or IV) presentation (8). Upper aerodigestive tract (UADT) mucosa progresses through a premalignant phase, dysplasia, prior to development of frank malignancy. Dysplasia is reversible (9) and can regress with tobacco cessation or spontaneously (10, 11). Unfortunately, dysplasia often mimics benign inflammation, so it frequently remains occult until further progression results in a late-stage cancer diagnosis (12).
Screening for HNSCC in India reduced oral cancer mortality by over 80% in tobacco and/or alcohol users (13). Screening by oral exam followed by tissue biopsy, the gold standard, has only 64% sensitivity for oral cancer (8) and 31% specificity for oral dysplasia or cancer (14).
Molecular tests including hypermethylation, RNA, and protein-based panels are under development but have not been validated (15–18). Other technologies that use dyes, autofluorescence, or exfoliative cytology as adjuncts to the physical examination are used in clinical practice but have not improved early detection rates (19, 20).
CD44, a cell surface transmembrane glycoprotein involved in cell proliferation, cell migration, and tumor initiation (21–24), is overexpressed in premalignant lesions (25–27). Soluble CD44 (solCD44), released by proteinases, is detectable in body fluids (28, 29). Prior work suggests that total protein is also an effective oral cancer marker (30, 31). Both can be measured with simple, inexpensive assays and are overexpressed in oral cavity and oropharyngeal cancers, suggesting usefulness in both HPV-positive and HPV-negative disease (29–32).
This study uses a case–control, hospital-based design to evaluate salivary markers in oral cancer cases and controls, frequency-matched for important risk and demographic factors to determine whether CD44 and total protein levels are associated with cancer rather than potential confounders. The markers are then tested in a community at elevated risk for oral cancer (n = 150) at baseline and 1-year follow-up to examine marker changes over time. Moreover, this study begins to explore whether oral rinse CD44 and total protein levels (i) detect both HPV+ and HPV− disease, (ii) are associated with prognosis, and (iii) change over a 1-year period. The outcome of this work is a reliable, inexpensive, and noninvasive risk prediction test for oral cancer with potential to greatly benefit populations that suffer most from this disease.
Materials and Methods
Case–control design to determine marker cut-off points
Subjects for the 2012 hospital-based, case–control study were recruited from clinics at the University of Miami Sylvester Comprehensive Cancer Center (UM, Miami, FL) and Jackson Memorial Hospital (JMH, Miami, FL) between 2007 and 2012 (Fig. 1). This study evaluated whether soluble salivary tumor markers distinguish 150 oral cancer patients from 150 controls frequency-matched for age, gender, race, ethnicity, tobacco and alcohol use, and socioeconomic status (SES). Oral cancer cases included newly diagnosed, previously untreated subjects with squamous cell carcinoma. Control subjects were identified from family medicine and internal medicine clinics and chosen prior to testing so that the key covariates (age, tobacco use, etc.) in the control group were not significantly different from the covariates in the case group. All subjects were recruited equally from UM, a private university hospital system serving mostly insured, white patients, and JMH, a county hospital system serving primarily low-income patients and a large minority population. All subjects completed a questionnaire including demographics, behavioral risk factors, and SES. For cases, data on tumor characteristics and outcomes were extracted from medical records. Controls with lesions suspicious for oral cancer were excluded, as were HIV+ or pregnant individuals. Exclusion decisions were blinded to marker level results. The resulting marker panel was validated using 27 oral cavity and oropharyngeal cases and 39 high-risk controls enrolled between 2004 and 2006 in a previous case–control study (31).
Test performance in a high-risk target screening group
The hospital-based, case–control study was designed to determine whether CD44 and total protein were associated with oral cancer independent of demographic and risk variables. To determine the specificity of the markers in a potential target screening population, the marker panel developed using data from the 2012 hospital-based, case–control study was evaluated in 150 participants from a community previously determined to be at elevated risk for oral cancer due to poverty and smoking (33). Subjects in this study received free head and neck cancer screening, education on smoking cessation, good nutrition, and oral health. This community control group was followed over time; baseline and annual follow-up oral rinses were obtained and measured between the years 2011 and 2013 to assess specificity and variation in marker levels. As the community control group was still at elevated risk for cancer, we also estimated true specificity in a group of 21 normal volunteers who were primarily nonsmokers.
All participants consented according to The Code of Ethics of the World Medical Association (Declaration of Helsinki).
Laboratory analysis
Oral rinses were collected using a previously published method that samples the oral cavity and oropharynx (29–32). Levels of solCD44 (normal and variant isoforms) were measured using a sandwich ELISA (eBioscience), with previously published modifications (29–32). We performed the DC protein assay (Bio-Rad Laboratories) according to the manufacturer's protocol using saliva samples prepared as previously published (29–32). Each sample was tested in duplicate and the technician was blinded to disease status.
Formalin-fixed, paraffin-embedded specimens were retrieved from cases, where available (n = 79). HPV status was assessed using p16INK4A IHC, an accepted surrogate marker for HPV (34–36). p16INK4A IHC was performed according to the manufacturer's protocol on 68 specimens (BD Biosciences). In addition, HPV status was already available in 11 cases (IHC, n = 10; in situ hybridization, n = 1). All specimens were reviewed by a pathologist (C. Gomez), blinded to the patient's clinical data. p16INK4A expression was scored as positive if strong and diffuse nuclear and cytoplasmic staining was present in ≥50% of the tumor specimen (36).
Statistical analysis
Patient groups were compared with respect to the distribution of potentially important categorical covariates using the χ2 test or Fisher exact test. Data on solCD44 were log base-2 transformed to stabilize estimates of variance and improve the fit to the normal distribution. Continuous variables were analyzed using Student t test or ANOVA followed by Fisher least significant difference test for pairwise mean comparison, and tests of prespecified contrasts. Logistic regression analysis was used to assess the association between markers and the risk for oral cancer. OR estimates were reported with the corresponding 95% confidence interval (95% CI) and the area under the ROC curve (AUC) for fitted models. Estimates of sensitivity, specificity, and accuracy were derived from a fitted multivariable logistic model, which included significant interactions between markers and covariates, as well as from a model including only risk groups based on cut-off points for solCD44 and protein that were derived using multivariate recursive partitioning analysis (37) implemented in the R packages MVPART (v.1.6.1) and Recursive Partitioning and Regression Trees (RPART), version 1.6-0 (38). Kaplan–Meier and Cox regression models were used to evaluate progression-free survival (PFS) and overall survival (OS). HR estimates and corresponding 95% CIs are reported. Statistical analyses were performed using SAS version 9.2 (SAS Institute, Inc.) and R.
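The output of the recursive-partitioning step can be read as a small decision tree over the two markers. The sketch below is illustrative only: it hard-codes the cut-off points reported later in Table 3, part C (solCD44 at 2.22 and 5.33 ng/mL; protein at 0.558 and 1.23 mg/mL), and the function name `risk_group` is a hypothetical helper, not the authors' SAS or R code.

```python
def risk_group(solcd44_ng_ml: float, protein_mg_ml: float) -> str:
    """Assign a risk level from raw (untransformed) marker values.

    Thresholds are copied from Table 3, part C; the function itself is
    an assumed illustration of how those rows combine.
    """
    if solcd44_ng_ml >= 5.33:
        # High CD44: high risk regardless of protein level.
        return "high"
    if solcd44_ng_ml < 2.22:
        # Low CD44: high risk only when protein is high.
        return "high" if protein_mg_ml >= 1.23 else "low"
    # Medium CD44 (>= 2.22 and < 5.33): low protein marks high risk.
    return "high" if protein_mg_ml < 0.558 else "medium"


if __name__ == "__main__":
    for cd44, protein in [(1.0, 1.0), (3.0, 0.8), (3.0, 0.4), (6.0, 0.5)]:
        print(cd44, protein, risk_group(cd44, protein))
```

Note that the function takes solCD44 in ng/mL rather than on the log2 scale, because the cut-off points in Table 3 are stated in ng/mL.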
Results
Characteristics of hospital-based case–control study
The description of the hospital-based, case–control study, comprising 150 patients with oral cancer and 150 controls, is summarized in Table 1 and Fig. 1. There were no significant differences between cases and controls with respect to age, gender, race, SES, oral health (number of teeth removed), smoking history, alcohol habit, or enrollment clinic (county JMH versus private hospital UM system). Supplementary Table S1 (online version only) shows cancer-specific characteristics for cases.
Variable/category | Cases (n = 150), N (%) | Controls (n = 150), N (%) | P |
---|---|---|---|
Site of enrollment | |||
JMH | 80 (53.3) | 71 (47.3) | 0.299 |
UM | 70 (46.7) | 79 (52.7) | |
Age, years | |||
<40 | 4 (2.7) | — | 0.214 |
40–<50 | 20 (13.3) | 29 (19.3) | |
50–<60 | 60 (40.0) | 56 (37.3) | |
60–<70 | 44 (29.3) | 44 (29.3) | |
≥70 | 22 (14.7) | 21 (14.0) | |
<60 | 84 (56.0) | 85 (56.7) | 0.449 |
≥60 | 66 (44.0) | 65 (43.3) | |
Mean (SD) | 58.6 (10.5) | 58.5 (9.7) | 0.887 |
Median (Range) | 58 (28–88) | 58.5 (40–87) | |
Gender | |||
Male | 121 (80.7) | 118 (78.7) | 0.907 |
Female | 29 (19.3) | 32 (21.3) | |
Race | |||
White | 123 (82.6) | 118 (79.7) | 0.534 |
Black | 26 (17.4) | 30 (20.3) | |
Asian/Other/Missing (1 case other, 1 control Asian, and 1 control missing) | 1 | 2 | |
Ethnicity | |||
Hispanic | 77 (51.3) | 93 (62.0) | 0.062 |
Non-Hispanic | 73 (48.7) | 57 (38.0) | |
SESa | |||
Low | 100 (66.7) | 90 (60.0) | 0.231 |
High | 50 (33.3) | 60 (40.0) | |
Oral health score | |||
Poor/Fair | 80 (64.0) | 87 (58.0) | 0.310 |
Good | 45 (36.0) | 63 (42.0) | |
Missing | 25 | ||
Teeth removed | |||
None/1 to 5 | 86 (58.9) | 92 (63.0) | 0.301 |
6 or more but not all | 36 (24.7) | 39 (26.7) | |
All | 24 (16.4) | 15 (10.3) | |
Missing | 4 | 4 | |
Smoking status | |||
Never | 33 (22.0) | 32 (21.3) | 0.889 |
Ever | 117 (78.0) | 118 (78.7) | |
Drinking habitsb | |||
Non-drinker/Mild | 78 (52.0) | 85 (56.7) | 0.279 |
Moderate | 24 (16.0) | 30 (20.0) | |
Heavy | 48 (32.0) | 35 (23.3) |
aSocioeconomic status (SES) categories high and low were defined based on income (≤$25,000, >$25,000), education (“≤grade 12 or GED”, “some college or college graduate”), and employment (“out-of/unable-to work”, “occupation with some income”); see Supplementary Table S3 (online version only). High SES: income >$25,000 or, if income was missing, “some college or college graduate” and “occupation with some income”. Low SES: income ≤$25,000 or, if income was missing, low education and/or “out-of/unable-to work”; 1 subject missing income and education with “occupation with some income” was classified as low SES.
bDrinking habits: Non-drinker/Mild: past drinking ≤2 drinks/day or current drinking ≤2 drinks/day for 1–15 days/month; moderate: past drinking 3 to <5 drinks/day or current drinking ≤2 drinks/day for 16–30 days/month or ≥3 drinks/day for 1–15 days/month; heavy: past drinking 5 or more drinks/day or current drinking ≥3 drinks/day for 16–30 days/month.
Log2 solCD44, hereafter referred to as CD44, and total protein were evaluated with respect to risk factors or demographic variables within the case and control groups (Table 2). CD44 and protein were higher in cases compared with controls at the P < 0.05 level when age, gender, race/ethnicity, SES, smoking habit or drinking habit, tooth loss, or ability to gargle were considered. This provides strong evidence that CD44 and total protein levels are associated with oral cancer independent of these risk factors. In cases but not in controls, CD44 was significantly higher in subjects who were older, had worse gargle ability, and had more tooth loss. CD44 and protein did not differ significantly by TNM status or HPV status.
Variable | Cases, N | Controls, N | log2 [solCD44 (ng/mL)]: cases, mean | cases, SE | controls, mean | controls, SE | P | Protein (mg/mL): cases, mean | cases, SE | controls, mean | controls, SE | P |
---|---|---|---|---|---|---|---|---|---|---|---|---|
All | 150 | 150 | 1.94a | 0.09 | 1.28a | 0.07 | <0.0001 | 0.94a | 0.05 | 0.76a | 0.03 | 0.003 |
Site of enrollment | ||||||||||||
JMH | 80 | 71 | 1.96a | 0.14 | 1.32a | 0.11 | <0.0001 | 0.95w | 0.07 | 0.81w | 0.05 | 0.017 |
UM | 70 | 79 | 1.92b | 0.11 | 1.26b | 0.09 | | 0.93a | 0.06 | 0.73a | 0.04 | |
Age | ||||||||||||
<60 | 84 | 85 | 1.71a,c | 0.12 | 1.16a,w | 0.08 | <0.0001 | 0.88w | 0.06 | 0.75w | 0.04 | 0.010 |
60 or more | 66 | 65 | 2.24b,c | 0.14 | 1.45b,w | 0.12 | | 1.00a | 0.07 | 0.78a | 0.05 | |
Gender | ||||||||||||
Male | 121 | 118 | 2.01a | 0.10 | 1.29a | 0.08 | <0.0001 | 0.96a | 0.05 | 0.80a | 0.04 | 0.006 |
Female | 29 | 32 | 1.68 | 0.21 | 1.28 | 0.16 | | 0.86w | 0.10 | 0.64w | 0.07 | |
Race/Ethnicity (n) | (149) | (148) | ||||||||||
White Non-Hispanic | 53 | 29 | 1.93a | 0.15 | 1.31a | 0.14 | <0.001 | 0.91a | 0.08 | 0.68a | 0.07 | 0.017 |
White Hispanic | 70 | 89 | 1.91b | 0.13 | 1.32b | 0.08 | | 0.91 | 0.07 | 0.81 | 0.04 | |
Black | 26 | 30 | 2.06c | 0.26 | 1.14c | 0.19 | | 1.08b | 0.12 | 0.71b | 0.07 | |
SES | ||||||||||||
Low | 100 | 90 | 1.92a | 0.12 | 1.36a | 0.09 | <0.0001 | 0.95w | 0.06 | 0.81w | 0.04 | 0.009 |
High | 50 | 60 | 1.98b | 0.14 | 1.17b | 0.10 | | 0.91a | 0.07 | 0.69a | 0.04 | |
Smoking status | ||||||||||||
Never | 33 | 32 | 1.72a | 0.20 | 1.23a | 0.13 | <0.0001 | 0.96 | 0.14 | 0.76 | 0.06 | 0.027 |
Ever | 117 | 118 | 2.01b | 0.10 | 1.30b | 0.08 | | 0.93a | 0.05 | 0.76a | 0.04 | |
Never | 33 | 32 | 1.72a | 0.20 | 1.23a | 0.13 | <0.0001 | 0.94 | 0.14 | 0.76 | 0.06 | 0.080 |
Former | 37 | 59 | 2.13b | 0.18 | 1.31b | 0.10 | | 0.98a | 0.09 | 0.78a | 0.06 | |
Current | 80 | 59 | 1.95c | 0.13 | 1.29c | 0.12 | | 0.91w | 0.05 | 0.75w | 0.05 | |
In current smokers (n) | (75) | (52) | ||||||||||
<20 pack-years | 33 | 29 | 1.86a | 0.18 | 1.07a | 0.20 | 0.003 | 0.96 | 0.08 | 0.71 | 0.07 | 0.203 |
≥20 pack-years | 42 | 23 | 1.99w | 0.19 | 1.51w | 0.14 | | 0.84 | 0.08 | 0.81 | 0.09 | |
Alcohol past | ||||||||||||
Non-drinker | 35 | 40 | 2.08a | 0.19 | 1.38a | 0.13 | <0.0001 | 1.00a | 0.09 | 0.73a | 0.05 | 0.018 |
Drinker (Mild/Mod/Heavy) | 115 | 110 | 1.90b | 0.11 | 1.25b | 0.08 | | 0.92b | 0.05 | 0.78b | 0.04 | |
Alcohol current (n) | (148) | (148) | ||||||||||
Non-drinker | 84 | 72 | 1.87a | 0.12 | 1.40a | 0.10 | <0.0001 | 0.97w | 0.07 | 0.82w | 0.05 | 0.010 |
Drinker (Mild/Mod/Heavy) | 64 | 76 | 2.04b | 0.15 | 1.17b | 0.10 | | 0.91a | 0.06 | 0.72a | 0.04 | |
Alcohol status | ||||||||||||
Never | 33 | 30 | 2.08a | 0.20 | 1.39a | 0.16 | <0.0001 | 1.01a | 0.10 | 0.75a | 0.07 | 0.019 |
Ever | 117 | 120 | 1.90b | 0.10 | 1.26b | 0.08 | | 0.92b | 0.05 | 0.77b | 0.04 | |
Teeth removed | ||||||||||||
None/1 to 5 | 86 | 92 | 1.82a,d | 0.11 | 1.26a | 0.08 | <0.0001 | 0.90a | 0.05 | 0.74a | 0.04 | 0.020 |
≥6, but not all | 36 | 39 | 1.79b | 0.15 | 1.25b | 0.14 | | 0.81b | 0.06 | 0.76 | 0.07 | |
All | 24 | 15 | 2.33c,d | 0.26 | 1.43c | 0.20 | | 1.05b | 0.13 | 0.84 | 0.10 | |
Gargle (n) | (138) | (143) | ||||||||||
Poor/Fair | 38 | 12 | 2.23a,c | 0.22 | 0.88a | 0.36 | <0.0001 | 1.13a,b | 0.11 | 0.66a | 0.13 | <0.001 |
Good | 100 | 131 | 1.82b,c | 0.10 | 1.29b | 0.07 | | 0.85b | 0.05 | 0.77 | 0.03 | |
Cancer site | ||||||||||||
Lip/OC | 59 | | 2.12 | 0.15 | | | 0.132 | 0.98 | 0.09 | | | 0.490 |
Oropharyngeal | 91 | | 1.83 | 0.11 | | | | 0.91 | 0.05 | | | |
Stage | ||||||||||||
Stage I/II | 26 | | 1.78 | 0.17 | | | 0.425 | 0.90 | 0.09 | | | 0.719 |
Stage III/IV | 124 | | 1.98 | 0.11 | | | | 0.94 | 0.05 | | | |
T-stage | ||||||||||||
T1–T2 | 63 | | 1.76w | 0.12 | | | 0.088 | 0.89 | 0.05 | | | 0.431 |
T3–T4 | 87 | | 2.07w | 0.13 | | | | 0.97 | 0.07 | | | |
N-stage | ||||||||||||
N0, Nx | 51 | | 1.97 | 0.14 | | | 0.848 | 0.95 | 0.08 | | | 0.850 |
N1–N3 | 99 | | 1.93 | 0.12 | | | | 0.93 | 0.06 | | | |
HPV (n) | (79) | |||||||||||
HPV+ | 31 | | 1.90 | 0.23 | | | 0.760 | 0.88 | 0.09 | | | 0.997 |
HPV− | 48 | | 1.99 | 0.16 | | | | 0.88 | 0.08 | | | |
NOTE: Same letters identify pairwise mean comparison within group or within a category of a key variable that was significant at the 5% level (letters a, b, c) or at the 10% level (letters w, y) by Fisher least significant difference test.
Abbreviations: (n), total cases or total controls excluding missing data; OC, oral cavity; OP, oropharynx; P: P value from ANOVA global test of equality of all means.
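Because Table 2 reports means on the log2 scale, a reader may want them back on the ng/mL scale. The minimal sketch below back-transforms the overall means from the "All" row (1.94 for cases, 1.28 for controls); the result is a geometric mean, not an arithmetic mean, and the helper names are assumptions made for illustration.

```python
def geometric_mean_ng_ml(mean_log2: float) -> float:
    """Back-transform a mean of log2(solCD44) values to a geometric mean in ng/mL."""
    return 2.0 ** mean_log2


def fold_difference(mean_log2_a: float, mean_log2_b: float) -> float:
    """Ratio of geometric means implied by a difference in log2 means."""
    return 2.0 ** (mean_log2_a - mean_log2_b)


if __name__ == "__main__":
    cases_log2, controls_log2 = 1.94, 1.28  # "All" row of Table 2
    print(f"cases:    {geometric_mean_ng_ml(cases_log2):.2f} ng/mL")
    print(f"controls: {geometric_mean_ng_ml(controls_log2):.2f} ng/mL")
    print(f"fold difference: {fold_difference(cases_log2, controls_log2):.2f}")
```

On this reading, a difference of 0.66 log2 units corresponds to a geometric-mean solCD44 level in cases that is roughly 1.6 times that in controls.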
HPV+ tumors, frequent in nonsmokers with oropharyngeal HNSCC, have a better prognosis compared with smoking- and alcohol-related tumors (39). HPV+ tumors are rarely found in the oral cavity (OC). In our study, only 4 of the 31 HPV+ tumors were from the OC (see Supplementary Table S1, online only). The CD44 levels between the 4 OC HPV+ and 27 HPV+ oropharyngeal (OP) cases were not significantly different. The total protein levels were significantly lower (OC, 0.54 mg/mL; OP, 0.93 mg/mL; P = 0.001) in the OC compared with OP HPV+ samples.
Risk modeling
In univariate analysis, CD44 and total protein were significantly associated with cancer status, with an OR for a 1-unit increase in CD44 of 2.036 (95% CI, 1.552–2.671; P < 0.0001; AUC = 0.68) and for a 1-unit increase in protein of 2.159 (95% CI, 1.288–3.617; P = 0.003; AUC = 0.59). The AUC improved to 0.763 in a multivariable model including adjustments for important variables and their interactions, which removed residual confounding not accounted for in the frequency matching. The OR for CD44 increased to 2.684 (95% CI, 1.797–4.010; P < 0.0001), while the OR for protein became less than 1 and nonsignificant (OR = 0.646; 95% CI, 0.301–1.386; P = 0.262; Table 3, part A). This “markers + covariates” model with AUC = 0.763 provided significantly better prediction than the reduced model excluding both markers and including only potential risk factors (AUC = 0.686; P = 0.003), indicating that the markers aid prediction over and above that provided by knowledge of risk factors alone.
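Since CD44 here denotes log2 solCD44, a 1-unit increase corresponds to a doubling of the solCD44 concentration. Under the log-linear assumption of the logistic model, the OR for any other fold change follows by exponentiation. The sketch below illustrates this with the univariate estimate of 2.036; the function name is a hypothetical helper, and the extrapolation is only as good as the model's linearity on the log2 scale.

```python
import math


def odds_ratio_for_fold_change(or_per_doubling: float, fold: float) -> float:
    """Scale an OR reported per 1 log2 unit (per doubling) to an arbitrary fold change.

    Assumes the log-odds are linear in log2(concentration), as in a logistic
    model with log2-transformed solCD44 as a continuous predictor.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return or_per_doubling ** math.log2(fold)


if __name__ == "__main__":
    or_cd44 = 2.036  # univariate OR per 1-unit increase in log2 solCD44
    for fold in (2, 4, 8):
        print(f"{fold}-fold higher solCD44: OR = {odds_ratio_for_fold_change(or_cd44, fold):.3f}")
```

For example, a 4-fold higher solCD44 level (2 log2 units) would correspond to an OR of about 2.036 squared, or roughly 4.1, under this model.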
Part A: Logistic regression, all patients | OR (95% CI) | P | AUC | Rescaled R2 | | | |
---|---|---|---|---|---|---|---|
Univariate models (150 cases/150 controls) | |||||||
log2 solCD44 | 2.036 (1.552–2.671) | <0.0001 | 0.681 | 0.137 | |||
Protein | 2.159 (1.288–3.617) | 0.003 | 0.590 | 0.042 | |||
Multivariable modela (149 cases/148 controls) | |||||||
log2 solCD44 | 2.684 (1.797–4.010) | <0.0001 | 0.763 | 0.276 | |||
Protein | 0.646 (0.301–1.386) | 0.262 | |||||
Part B: Logistic regression stratified by HPV status | |||||||
HPV negative (48 cases/150 controls) | |||||||
Univariate | |||||||
log2 solCD44 | 2.311 (1.561–3.422) | <0.0001 | 0.689 | 0.146 | |||
Protein | 1.838 (0.888–3.807) | 0.101 | 0.562 | 0.020 | |||
Multivariable modelb (48 cases/148 controls): | |||||||
log2 solCD44 | 4.017 (2.124–7.597) | <0.0001 | 0.771 | 0.275 | |||
Protein | 0.179 (0.052–0.620) | 0.006 | |||||
HPV positive (31 cases/150 controls) | |||||||
Univariate | |||||||
log2 solCD44 | 2.001 (1.291–3.102) | 0.002 | 0.667 | 0.096 | |||
Protein | 1.882 (0.789–4.492) | 0.154 | 0.567 | 0.018 | |||
Multivariable modelc (148 controls) | |||||||
log2 solCD44 | 3.079 (1.486–6.378) | 0.003 | 0.773 | 0.221 | |||
Protein | 0.384 (0.080–1.833) | 0.230 | |||||
Part C: Logistic regression analysis of risk groups derived by multivariate recursive partitioning | |||||||
Univariate Modeld of risk groups based on CD44 and protein levels | |||||||
Risk Level (n = case + control) | SolCD44 (ng/mL; level description) | Protein (mg/mL) | OR (95% CI) | Prediction | P | AUC | Rescaled R2 |
Low (102 = 29 + 73) | <2.22 (low) | <1.23 (low–medium) | Reference | Control | | 0.722 | 0.227 |
Medium (116 = 54 + 62) | ≥2.22 and <5.33 (medium) | ≥0.558 (medium–high) | 2.192 (1.247–3.854) | Control | 0.006 | | |
High (5 = 4 + 1) | <2.22 (low) | ≥1.23 (high) | 10.069 (1.079–93.93) | Case | 0.043 | | |
High (20 = 16 + 4) | ≥2.22 and <5.33 (medium) | <0.558 (low) | 10.069 (3.103–32.672) | Case | 0.0001 | | |
High (57 = 47 + 10) | ≥5.33 (high) | – | 11.830 (5.279–26.508) | Case | <0.0001 | | |
Multivariable modele of risk groups based on CD44 and protein levels | |||||||
Risk level (n) | SolCD44 | Protein | OR (95% CI) | Prediction | P | AUC | Rescaled R2 |
Low (102) | <2.22 (low) | <1.23 (low–medium) | Reference | Control | | 0.790 | 0.325 |
Medium (116) | ≥2.22 and <5.33 (medium) | ≥0.558 (medium–high) | 2.755 (1.483–5.117) | Control | 0.001 | | |
High (5) | <2.22 (low) | ≥1.23 (high) | 5.905 (0.591–59.053) | Case | 0.131 | | |
High (20) | ≥2.22 and <5.33 (medium) | <0.558 (low) | 11.860 (3.312–42.472) | Case | <0.0001 | | |
High (57) | ≥5.33 (high) | – | 14.489 (5.973–35.145) | Case | <0.0001 | | |
SES High vs. low | | | 0.577 (0.304–1.094) | | 0.092 | | |
White Non-Hispanic vs. Black at age <60 | | | 7.885 (2.372–26.206) | | | | |
White Hispanic vs. Black at age <60 | | | 1.767 (0.636–4.907) | | | | |
White Non-Hispanic vs. Black at age ≥60 | | | 0.799 (0.216–2.956) | | | | |
White Hispanic vs. Black at age ≥60 | | | 0.382 (0.124–1.175) | | | | |
Age ≥60 vs. <60 in Black | | | 3.099 (0.838–11.457) | | | | |
Age ≥60 vs. <60 in White Non-Hispanic | | | 0.314 (0.111–0.892) | | | | |
Age ≥60 vs. <60 in White Hispanic | | | 0.669 (0.324–1.383) | | | | |
Alcohol Ever vs. Never in Male | | | 1.615 (0.713–3.660) | | | | |
Alcohol Ever vs. Never in Female | | | 0.202 (0.056–0.726) | | | | |
Male vs. Female in alcohol=Never | | | 0.216 (0.062–0.757) | | | | |
Male vs. Female in alcohol=Ever | | | 1.723 (0.695–4.273) | | | | |
NOTE: Rescaled R2, coefficient of determination measured the dispersion explained by model; ORs, 1-unit increase for continuous variables log2 CD44, protein, and age, unless specified categories; race/ethnicity (WNH and Black vs. WH), gender (male vs. female), smoking and alcohol (ever vs. never), and SES (high vs. low).
^a Adjusted for age (P = 0.132), race/ethnicity (P = 0.004), age × race/ethnicity (P = 0.006), gender (P = 0.030), alcohol (P = 0.032), gender×alcohol (P = 0.020), smoking (P = 0.527), and SES (P = 0.042). Model “markers + covariates” (AUC=0.763) provided significantly better prediction than the reduced model excluding both markers (AUC=0.686) and only including potential risk factors (P = 0.003), indicating that the markers aid prediction over and above prediction provided by knowledge of risk factors.
^b Adjusted for age (P = 0.020), gender (P = 0.009), age × gender (P = 0.008), race/ethnicity (P = 0.740), alcohol (P = 0.183), smoking (P = 0.487), and SES (P = 0.047).
^c Adjusted for age (P = 0.052), gender (P = 0.104), age × gender (P = 0.096), race/ethnicity (P = 0.298), alcohol (P = 0.537), smoking (P = 0.131), and SES (P = 0.070).
^d AUC=0.722 for risk group model (based on CD44 and protein) is significantly different from AUC = 0.681 for univariate model log2 solCD44 (P = 0.025).
^e Logistic regression model included CD44-protein risk groups (5 categories, P < 0.0001), age (≥60 vs. <60, P = 0.090), gender (P = 0.017), race/ethnicity (P = 0.001), alcohol (P = 0.014), SES (P = 0.092), and interaction age × race/ethnicity (P = 0.029) and gender×alcohol (P = 0.007). Smoking (ever vs. never, P = 0.700) and teeth removed (6 or more or all vs. 5 or less, P = 0.485) were tested for inclusion into model (AUC=0.791); they were removed since their inclusion did not improve model fit.
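The unadjusted ORs in part C of the table can be checked directly from the case and control counts listed for each risk group. The following is a minimal sketch, assuming a standard Wald (Woolf) interval on the log-odds scale; the function name and the dictionary labels are illustrative and not taken from the study's analysis code.

```python
import math

def odds_ratio_wald(cases_grp, controls_grp, cases_ref, controls_ref, z=1.96):
    """Unadjusted odds ratio vs. a reference group, with a Wald 95% CI."""
    or_ = (cases_grp * controls_ref) / (controls_grp * cases_ref)
    # Standard error of log(OR) from the four cell counts (Woolf method).
    se_log = math.sqrt(1 / cases_grp + 1 / controls_grp
                       + 1 / cases_ref + 1 / controls_ref)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

# (cases, controls) from Table 3, part C; the low-risk group is the reference.
REFERENCE = (29, 73)
GROUPS = {
    "medium": (54, 62),
    "high (low CD44, high protein)": (4, 1),
    "high (medium CD44, low protein)": (16, 4),
    "high (CD44 >= 5.33 ng/mL)": (47, 10),
}

for name, (cases, controls) in GROUPS.items():
    or_, lower, upper = odds_ratio_wald(cases, controls, *REFERENCE)
    print(f"{name}: OR = {or_:.3f} (95% CI, {lower:.3f}-{upper:.3f})")
```

Under these assumptions the medium-risk group gives an OR of about 2.19 with an interval of roughly 1.25 to 3.85, in line with the tabulated univariate values. The multivariable ORs cannot be reproduced this way because they depend on subject-level covariates.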
Findings for the analysis stratified by p16INK4A (surrogate for HPV status) were similar to the combined analysis. In the HPV− group, protein levels were associated with a significant protective effect following multivariate analysis (Table 3, part B).
Multivariate recursive partitioning and logistic regression analyses were employed to understand the relationship between CD44, protein, and prediction of disease presence (Table 3, part C). Importantly, when CD44, protein, age, gender, race, ethnicity, and SES were entered into the model, CD44 and protein were the most important predictors of cancer status, defining five risk groups. Furthermore, the AUC of 0.722 for the risk group model defined by CD44 and protein was significantly different from the AUC of 0.681 for the univariate log2 CD44 model (P = 0.025), indicating that the addition of protein improves prediction.
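Because the risk group model assigns one predicted risk per group, its AUC can be recovered from the grouped counts in Table 3, part C. The sketch below uses the Mann-Whitney form of the AUC, counting a case-control pair as 1 when the case sits in a higher-risk group and 0.5 when both are in the same group. It assumes groups are ordered by their unadjusted OR and that the two small high-risk groups, which share an OR of 10.069, form a single level; the function name is illustrative.

```python
def auc_from_groups(groups):
    """AUC for a score that takes one value per risk group.

    `groups` is a list of (cases, controls) tuples ordered from the lowest
    to the highest predicted risk.
    """
    total_cases = sum(cases for cases, _ in groups)
    total_controls = sum(controls for _, controls in groups)
    controls_below = 0
    score = 0.0
    for cases, controls in groups:
        # Each case beats every control in a lower group and ties
        # (half credit) with controls in its own group.
        score += cases * (controls_below + 0.5 * controls)
        controls_below += controls
    return score / (total_cases * total_controls)

RISK_LEVELS = [
    (29, 73),          # low (reference)
    (54, 62),          # medium
    (4 + 16, 1 + 4),   # two high-risk groups with the same OR, pooled
    (47, 10),          # high, CD44 >= 5.33 ng/mL
]

print(f"AUC = {auc_from_groups(RISK_LEVELS):.3f}")  # prints AUC = 0.722
```

The result agrees with the AUC reported for the univariate risk group model. The comparison against the log2 CD44 model (AUC = 0.681) requires subject-level marker values and is not reproduced here.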
The classification tree defined subjects as “controls” if CD44 was <2.22 ng/mL and protein was <1.23 mg/mL (reference group) or if CD44 was ≥2.22 and <5.33 ng/mL and protein was ≥0.558 mg/mL (Table 3, part C). However, compared with the reference group, the OR for the latter group was 2.192 (95% CI, 1.247–3.854) and significant (P = 0.006), indicating elevated risk. Furthermore, many cancer subjects and 2 control subjects who went on to develop cancer during the course of the study had levels that fell into this medium CD44 and medium-to-high protein group, leading us to consider this group as a case group. The other groups classified as “cases” included subjects with CD44 <2.22 ng/mL and protein ≥1.23 mg/mL, CD44 ≥2.22 and <5.33 ng/mL and protein <0.558 mg/mL, and CD44 ≥5.33 ng/mL regardless of protein level. Thus, based on the levels of CD44 and total protein, we treated 4 of the 5 risk groups as cases. ORs derived from a multivariate model including the CD44- and protein-defined risk groups, demographics, and risk factors showed similar results (Table 3, part C). The percentage of cancer patients that fell into each risk category by HPV status and stage is shown in Supplementary Table S2 (online version only).
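The cut-points described above amount to a small decision rule. The following sketch encodes them as stated in the text; the function and constant names are hypothetical, and the three-level labels ("low", "medium", "high") follow the risk levels of Table 3, part C rather than any published software.

```python
CD44_LOW_CUT = 2.22      # ng/mL
CD44_HIGH_CUT = 5.33     # ng/mL
PROTEIN_LOW_CUT = 0.558  # mg/mL
PROTEIN_HIGH_CUT = 1.23  # mg/mL

def risk_group(cd44_ng_ml, protein_mg_ml):
    """Assign a risk level from salivary CD44 and total protein."""
    if cd44_ng_ml >= CD44_HIGH_CUT:
        return "high"  # high CD44, regardless of protein
    if cd44_ng_ml < CD44_LOW_CUT:
        # Low CD44: only high protein moves the subject out of the reference group.
        return "low" if protein_mg_ml < PROTEIN_HIGH_CUT else "high"
    # Medium CD44: low protein is high risk, otherwise medium risk.
    return "medium" if protein_mg_ml >= PROTEIN_LOW_CUT else "high"

def is_test_positive(cd44_ng_ml, protein_mg_ml):
    """Treat every group except the reference (low) group as a positive result."""
    return risk_group(cd44_ng_ml, protein_mg_ml) != "low"

print(risk_group(3.0, 0.3))        # medium CD44, low protein
print(is_test_positive(1.5, 0.8))  # low CD44, low-medium protein
```

Treating the medium group as positive mirrors the decision to consider it a case group; a screening program could instead report all three levels and tailor follow-up accordingly.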
Defining the reference group as control and all others as cases, sensitivity was 80.7% and specificity was 48.7% for the 2012 hospital-based group (see Table 3, part C for numbers of cases and controls that fell into each group). Sensitivity reached 80% for stages I–IV and 85% for stages I–II. These results were validated using CD44 and total protein results from a similar hospital-based study whose enrollment was completed in 2006 (single test, stage I–IV: sensitivity 2012 = 80.7%; 2006 = 77.8%; specificity 2012 = 48.7%; 2006 = 56.4%). The frequency-matched control group was at exceptionally high risk for cancer, as over 10% of these controls had a history of prior cancer outside the UADT. Hospital-based controls with a history of cancer had significantly higher solCD44 and protein levels compared with controls without prior cancer (P < 0.05). Thus, the community-based population was used to estimate the specificity of the test. This is in keeping with suggestions by the Early Detection Research Network (EDRN), which notes that control subjects from clinical settings may not be representative of control subjects recruited from the population because they have been referred for some reason to the clinic (40). The EDRN suggests that, although selection based on convenience may be necessary early, final conclusions should be based on population-based studies, if possible (40).
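The sensitivity and specificity quoted for the 2012 hospital-based group follow from the counts in Table 3, part C once the reference group is treated as test-negative. A minimal sketch, with illustrative group labels:

```python
# (cases, controls) per risk group, copied from Table 3, part C.
COUNTS = {
    "low (reference)": (29, 73),
    "medium": (54, 62),
    "high: CD44 <2.22, protein >=1.23": (4, 1),
    "high: CD44 2.22-5.33, protein <0.558": (16, 4),
    "high: CD44 >=5.33": (47, 10),
}

def sensitivity_specificity(counts, negative_group="low (reference)"):
    """Sensitivity and specificity when one group is test-negative."""
    neg_cases, neg_controls = counts[negative_group]
    total_cases = sum(cases for cases, _ in counts.values())
    total_controls = sum(controls for _, controls in counts.values())
    sensitivity = (total_cases - neg_cases) / total_cases  # cases flagged
    specificity = neg_controls / total_controls            # controls cleared
    return sensitivity, specificity

sens, spec = sensitivity_specificity(COUNTS)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints: sensitivity = 80.7%, specificity = 48.7%
```

That is, 121 of 150 cases fall outside the reference group and 73 of 150 controls fall inside it.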
Specificity in a target screening population
To predict specificity in a true screening population, a community at high risk for HNSCC [n = 150, see Supplementary Table S3 (online version only) for demographic and risk characteristics] was evaluated. These subjects were all African-American, were heavier smokers and drinkers, and had worse oral health than the cases. They were younger than the cases and were enrolled from a community center rather than a clinic. We also studied oral rinses from 21 normal volunteers. Specificity was greatest in the normal volunteers (95.2%). In the high-risk community, specificity was 74% (n = 150) after one baseline evaluation and reached 95% in subjects retested at one year (n = 95). In the latter case, a result was considered positive only if both the baseline and annual results were positive. Importantly, as part of the oral cancer prevention program, these subjects had received counseling on smoking cessation, nutrition, and oral health, as well as assistance with access to such services, prior to this apparent drop in marker levels.
Changes in CD44 and protein levels over time in a screening population
A total of 95 patients in the community-based control group provided baseline and annual follow-up collections. The distribution of changes in CD44 and protein over 1 year is shown in Fig. 2A and B, respectively. The average annual drop in CD44 of 0.439 ng/mL (24%) was significant (P < .0001). Linear regression analysis confirmed a significant linear trend for lower CD44 values [R2 = 0.227, intercept = 0.785 (P < .0001), slope = 0.331 (P < .0001); Fig. 2C]. Mean protein also dropped from 0.644 to 0.543 mg/mL (P = 0.036) with confirmation by linear regression analysis [R2 = 0.108; intercept = 0.284 (P = 0.002), slope = 0.402 (P < .0001); Fig. 2D]. Of 22 community subjects at baseline elevated risk, only 5 remained in an at-risk category after 1-year follow-up suggesting that retesting may improve specificity.
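The effect of retesting on specificity can be illustrated with the counts given above. The sketch assumes that the 22 subjects at elevated risk at baseline are among the 95 who provided both collections, that all 95 are disease-free for the purpose of the calculation, and that a result counts as positive only when both tests are positive. The function name is illustrative.

```python
def specificity(n_disease_free, n_false_positive):
    """Fraction of disease-free subjects with a negative result."""
    return (n_disease_free - n_false_positive) / n_disease_free

N_RETESTED = 95            # subjects with baseline and annual collections
POSITIVE_AT_BASELINE = 22  # elevated risk at baseline
POSITIVE_AT_BOTH = 5       # still elevated at the 1-year follow-up

single_test = specificity(N_RETESTED, POSITIVE_AT_BASELINE)
repeat_test = specificity(N_RETESTED, POSITIVE_AT_BOTH)

print(f"baseline only:        {single_test:.1%}")  # 76.8%
print(f"baseline AND annual:  {repeat_test:.1%}")  # 94.7%
```

The baseline figure here applies to the retested subset only, so it differs slightly from the 74% reported for all 150 community subjects. Requiring two positive results can only reduce the number of positives, which raises specificity at a possible cost in sensitivity that this calculation does not address.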
To determine whether the decreased marker levels were due to variation in assay conditions over the course of the year rather than a true decrease in the markers, a second baseline aliquot (baseline 2) was run on the same plate as the annual follow-up collection, with 81 such pairs for each assay (protein and CD44). The average drop in levels between baseline 2 and annual follow-up was significant for CD44 but not for protein (CD44: 0.296 ng/mL, P = 0.023; protein: 0.013 mg/mL, P = 0.796), while linear regression showed a significant trend towards lower values for both markers [CD44: R2 = 0.227; intercept = 0.882 (P < .0001), slope = 0.288 (P < .0001); protein: R2 = 0.155; intercept = 0.256 (P = 0.008), slope = 0.534 (P < .0001); figures not shown]. We also fit a linear regression of baseline 2 on baseline 1. For CD44, the regression indicated that the two baselines were equivalent, suggesting that the changes in CD44 level were not due to technical changes in the assay. For protein, the differences between baselines were not within the expected random variation (data not shown).
Prognostic significance of markers
Overdiagnosis has been observed in breast, prostate, and thyroid cancer screening (41). To avoid this, markers should identify aggressive forms of oral cancer rather than indolent cancers that will not cause significant problems during a patient's lifetime (41). We assessed marker association with prognostic factors and adjusted for confounders such as stage to determine whether the markers have potential to detect early, aggressive forms of the disease. Kaplan–Meier curves for PFS and OS by risk group are shown in Fig. 2E and F. Unadjusted and adjusted estimates of HRs for PFS and OS by risk groups are shown in Supplementary Table S4 (online version only). On the basis of multivariate analysis with adjustment for tumor stage, age, gender, race and ethnicity, and SES, hospital-based cases with CD44 levels ≥5.33 ng/mL had reduced PFS (adjusted HR = 3.919; 95% CI, 1.692–9.080; P = 0.001) and OS (adjusted HR = 3.242; 95% CI, 1.299–8.089; P = 0.012) compared with cases in the reference group. Subjects with CD44 <2.22 ng/mL and protein ≥1.23 mg/mL had a borderline association with decreased PFS (adjusted HR = 3.446; 95% CI, 0.857–13.867; P = 0.082) and no significant difference in OS (adjusted HR = 2.186; 95% CI, 0.524–9.123; P = 0.284) compared with cases in the reference group; however, this group included only 4 cases. Taken together, these data suggest that the markers have the potential to identify the most aggressive forms of oral cancer.
Potential application of CD44 in detecting oral cancer or cancers at other sites
Similar to prior work (32), in this study, 2 control subjects fell into an elevated risk category and developed early HNSCC (lip and carcinoma in situ of the larynx) in follow-up. Two other controls were excluded because of bladder cancer and possible oral premalignancy, respectively. The latter went on to develop lung cancer. A subject from the community-based study classified in an elevated risk category developed lung cancer 14 months following testing.
Discussion
Despite over 550,000 new diagnoses of HNSCC worldwide each year, few people receive a skilled oral cancer screening exam. Early diagnosis dramatically improves survival, but most patients present late. This study describes a simple, inexpensive, noninvasive risk assessment test based on salivary CD44 and protein that is able to distinguish stage I–IV oral cancer cases from controls. Sensitivity for early-stage lesions (I–II = 85%) was as good as or better than that for all stages combined (I–IV = 80%). The finding that both early- and late-stage disease is detected is in keeping with prior publications on salivary CD44 levels by this and other groups (32, 42).
Also consistent with prior work, adding total protein increases the accuracy of the test at very minimal cost (30). The relative protein and CD44 levels may greatly facilitate risk stratification as these specific levels are associated with varying risk, as indicated by the OR. This may enable clinicians to tailor follow-up and patients to understand their risk better thus motivating change. Further work must be done to determine the cut-off points that characterize multiple risk levels across diverse populations.
A strength of the study that adds to prior work is frequency matching, which ensures that there are no statistically significant differences between cases and controls with respect to age, race, gender, SES, tobacco use, or alcohol use. This ensures that the biomarkers are associated with cancer risk and not some other confounder such as tobacco use. We included additional covariates in modeling to remove any residual confounding. Results strongly support that CD44 and total protein are associated with cancer risk independent of tobacco or alcohol use, age, gender, race, etc.
In frequency matching, a hospital-based control group was chosen as the cases were also hospital-based and the goal was to ensure that the cases and controls were as similar as possible except for cancer status. While this minimizes confounding, there are some limitations to this design as control subjects from such clinics may not be representative of control subjects recruited from the population (40). Indeed, over 10% of our control population had a prior history of cancer. The EDRN suggests that final conclusions should be based on population-based studies, if possible (40). To begin to investigate the markers in a population-based control group, we also enrolled a target screening group of smokers from an underserved, minority community population and followed them over time. When we compared results with the cases from the hospital-based study, specificity in the high-risk community could reach as high as 95% after annual retesting, increasing from 74% for a single initial test. Given that this population had higher levels of tobacco and alcohol use, worse oral health, and lower SES than the case group, this specificity is quite high.
The study included a diverse population. We enrolled subjects from a public institution that serves primarily unfunded patients and includes a large percentage of Hispanic and African-American minorities as well as a private academic institution that serves mostly funded, white patients. Thus, we had ample minorities, patients from low SES, and patients with poor oral health. This further ensures that the markers will work in diverse populations and limits potential confounding.
The study provides exploratory evidence that high salivary CD44 is associated with poor PFS and OS. Thus, CD44 appears to be associated with aggressive disease, although further study would be needed to determine whether these markers are useful for prognosis (43).
We performed a preliminary analysis of CD44 and protein levels in HPV+ versus HPV− cancer. Our data did not show a significant difference in CD44 or protein levels between HPV+ and HPV− subjects. We do not think this is related to the oral cavity cases that were HPV+, as there were only 4 of these and their CD44 levels were not significantly different from those of the oropharyngeal HPV+ samples. Protein levels were significantly higher in the oropharyngeal (OP) HPV+ subjects compared with the oral cavity (OC) HPV+ subjects, although the sample size was small. HPV status was unknown in 39 of 91 of the oropharyngeal cases. This is a limitation of the study; further investigation is needed to better understand the relationship between CD44, total protein, and HPV status.
Two false-positive control subjects developed HNSCC during the follow-up period. Additional subjects with other smoking-associated tumors, including lung and bladder, also had elevated CD44 levels. Thus, “false positives” could actually be true positives for occult oral disease or other cancers. As CD44 is a tumor initiation factor, levels might go down if risk factors decrease and occult lesions disappear. The data suggest that individuals who stayed in the community screening program for a year underwent a significant decrease in CD44 levels not attributable to technical differences in the test. All subjects who stayed in the community screening program received education on smoking cessation and access to resources to assist them in improving oral hygiene and nutrition, raising the possibility that these prevention efforts may result in lower marker levels and lower risk. However, more investigation is needed to show this definitively.
This study assesses risk of oral cancer in that certain levels of CD44 and protein are associated with elevated ORs and the OR for relatively rare diseases like oral cancer approximates relative risk (44). While the study does provide directional, anecdotal evidence that certain levels of CD44 and protein may identify those patients that will go on to develop cancer or precancer, the study was not designed to assess leukoplakia or dysplasia or determine whether these markers predict progression to invasive cancer. Whether these markers predict progression is an area of considerable interest that should be explored further in larger, prospective studies.
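The statement that the OR approximates relative risk for a rare disease can be made concrete. For a baseline risk p0 in the reference group, RR = OR / (1 - p0 + p0 × OR), which tends to the OR as p0 approaches zero. The sketch below applies this identity to the adjusted OR of 14.489 reported for the highest CD44 group; the baseline risks are purely illustrative assumptions and are not estimates from this study.

```python
def relative_risk_from_or(odds_ratio, baseline_risk):
    """Relative risk implied by an odds ratio at a given reference-group risk."""
    return odds_ratio / (1 - baseline_risk + baseline_risk * odds_ratio)

ADJUSTED_OR = 14.489  # CD44 >= 5.33 ng/mL vs. reference group (Table 3, part C)

# Hypothetical baseline risks, chosen only to show how the gap grows.
for p0 in (0.0001, 0.001, 0.01, 0.1):
    rr = relative_risk_from_or(ADJUSTED_OR, p0)
    print(f"baseline risk {p0:>6}: RR = {rr:.2f} (OR = {ADJUSTED_OR})")
```

At very low baseline risk the two measures are nearly identical, whereas at a baseline risk of 10% the same OR corresponds to a relative risk of roughly 6, which is why the approximation is reserved for rare outcomes.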
Conclusion
The results provided here are encouraging. Further investigations with larger sample sizes are needed to determine whether marker levels vary with behavioral changes such as smoking cessation, whether reversal of premalignant lesions is associated with a drop in marker levels, and whether this test increases the number of screen-detected oral cancer lesions. Success in any of these areas could revolutionize oral cancer screening, by providing a simple and reliable measure of oral cancer risk that alerts primary care providers, dentists, and other frontline screeners to individuals most in need of skilled oral exam at a stage when the process can be more easily treated or perhaps even reversed with behavioral modification.
Disclosure of Potential Conflicts of Interest
E.J. Franzmann is a chief scientific officer, reports receiving a commercial research grant, and has ownership interest (including patents) in Vigilant Biosciences. No potential conflicts of interest were disclosed by the other authors.
Disclaimer
The University of Miami, Drs. Franzmann, Reis, Pereira and Duncan hold intellectual property used in the study and have potential for financial benefit from its future commercialization. Dr. Franzmann is Chief Scientific Officer, consultant, and equity holder in Vigilant Biosciences, Inc., licensee of the intellectual property used in this study.
Authors' Contributions
Conception and design: I.M. Reis, P. Fisher, J.J. Hu, E.J. Franzmann
Development of methodology: L.H.M. Pereira, I.M. Reis, A. Perez, E.J. Franzmann
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): L.H.M. Pereira, E.P. Reategui, S. Saint-Victor, C. Gomez, W.J. Goodwin, E.J. Franzmann
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): L.H.M. Pereira, I.M. Reis, R.C. Duncan, C. Gomez, S. Bayers, A. Perez, J.J. Hu, E.J. Franzmann
Writing, review, and/or revision of the manuscript: L.H.M. Pereira, I.M. Reis, E.P. Reategui, S. Saint-Victor, R.C. Duncan, P. Fisher, W.J. Goodwin, J.J. Hu, E.J. Franzmann
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): L.H.M. Pereira, E.P. Reategui, C. Gordon, E.J. Franzmann
Study supervision: L.H.M. Pereira, E.P. Reategui, E.J. Franzmann
Other (longtime mentor of senior author): W.J. Goodwin
Acknowledgments
The authors thank members of the University of Miami, Division of Head and Neck Surgery, the Department of Family Medicine, the Sylvester Comprehensive Cancer Center Disparities and Community Outreach Core, Liberty Square Community Center, Curley's House Food Bank, Liberty City Community Health Advisory Board, and Dr. John Deo for their assistance with this work.
Grant Support
The work was funded by Woman's Cancer Association, a gift from Vigilant Biosciences, Inc., Sylvester Comprehensive Cancer Center and University of Miami, Department of Otolaryngology. E.J. Franzmann was supported by grants NCI R01CA118584, NCI RO3 CA107828, 4BB-20 Bankhead-Coley, and 10BG-02 Bankhead-Coley.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.