Abstract
Background: The peripheral blood neutrophil-to-lymphocyte ratio (NLR) is a cytologic marker of both inflammation and poor outcomes in patients with cancer. DNA methylation is a key element of the epigenetic program defining different leukocyte subtypes and may provide an alternative to cytology in assessing leukocyte profiles. Our aim was to create a bioinformatic tool to estimate NLR using DNA methylation, and to assess its diagnostic and prognostic performance in human populations.
Methods: We developed a DNA methylation–derived NLR (mdNLR) index based on normal isolated leukocyte methylation libraries and established cell-mixture deconvolution algorithms. The method was applied to cancer case–control studies of the bladder, head and neck, ovary, and breast, as well as publicly available data on cancer-free subjects.
Results: Across cancer studies, mdNLR scores were either elevated in cases relative to controls, or associated with increased hazard of death. High mdNLR values (>5) were strong indicators of poor survival. In addition, mdNLR scores were elevated in males, in non-Hispanic white versus Hispanic ethnicity, and increased with age. We also observed a significant interaction between cigarette smoking history and mdNLR on cancer survival.
Conclusions: These results mean that our current understanding of mature leukocyte methylomes is sufficient to allow researchers and clinicians to apply epigenetically based analyses of NLR in clinical and epidemiologic studies of cancer risk and survival.
Impact: As cytologic measurements of NLR are not always possible (e.g., with archival blood), mdNLR, which is computed from DNA methylation signatures alone, has the potential to expand the scope of epigenome-wide association studies. Cancer Epidemiol Biomarkers Prev; 26(3); 328–38. ©2016 AACR.
This article is featured in Highlights of This Issue, p. 287
Introduction
Systemic inflammation in cancer is associated with altered myelopoiesis and the production of myeloid-derived suppressor cells (MDSC), contributing to an immunosuppressive network that adversely affects cancer survival (1–3). MDSCs are aberrantly activated immature myeloid cells that are functionally distinct from terminally differentiated myeloid cells, although they are morphologically similar to their normal mature counterparts (i.e., mononuclear cells, neutrophils; ref. 4). Epigenetic reprogramming is implicated in the altered differentiation pathway leading to MDSCs (5), which has been shown to target the retinoblastoma cell cycle (6), Notch signaling pathways (7), as well as other transcriptional networks (8).
The best methods to assess altered myeloid populations or systemic inflammation, more generally, are still evolving and, as a result, large-scale studies are lacking. However, the peripheral blood neutrophil-to-lymphocyte ratio (NLR), derived from the common five-part white blood cell differential count, has emerged as a robust marker of cancer-associated inflammation (9–14). Increases in the blood NLR have been remarkably consistent in their association with poor cancer survival. A recent meta-analysis including 100 independent studies encompassing more than 40,000 subjects demonstrated that an elevated NLR was a statistically significant predictor of poor overall survival, cancer-specific survival, as well as progression-free and disease-free survival, even after adjustment for established risk predictors (15). While the NLR is an index of systemic inflammation, the biology underlying its strong connection with cancer outcomes remains obscure. Intriguingly, it has recently been shown that the mutational landscape of some cancers (associated with tobacco carcinogen exposure) engenders an immune response detectable in the periphery, strongly supporting the concept that the blood harbors phenotypically active subsets of immune cells (16).
One plausible explanation for the association of elevated NLRs with cancer mortality is their presumed correlation with altered myeloid differentiation and production of MDSCs. While this idea is largely untested, the crucial role that epigenetic modifications (including DNA methylation) play in programming myeloid and lymphoid cell differentiation is recognized (17–20). Remodeling the epigenome during hematopoiesis leads to progressively restricted immune subtypes and DNA methylation provides a chemically stable mark for these cell fate decisions (21). The DNA methylomes of circulating myeloid (monocytes, neutrophils, basophils, eosinophils) and lymphoid (CD4, CD8 T cells, B cells, NK cells) cells have been extensively studied, revealing that lineage-specific peripheral blood immune cells can be distinguished by a signature or “fingerprint” of leukocyte differentially methylated regions (L-DMRs; refs. 22–24). Using a bioinformatic approach (a cell-mixture deconvolution algorithm), specific L-DMR libraries accurately estimate the proportion of each cell type in complex mixtures such as whole blood (25). Although the transition from immature to mature myeloid cells has been shown to involve changes in DNA methylation (26, 27), the diagnostic L-DMRs of immature myeloid cells and more specifically, cancer-induced MDSCs, have not been defined. Given that epigenetic mechanisms specify both normal and cancer-related myelopoiesis, the possibility exists that without specific DMRs for cancer-related leukocytes, previously identified immune methylation profiles may not predict cancer patient outcomes as they do with the cytologic NLR. We questioned whether, similar to the cytologic NLR, established DMR signatures of blood neutrophils and lymphocytes used to estimate the NLR would be predictive of cancer patient survival and correlate with other risk factors.
Here, we develop and evaluate a methylation-derived NLR (mdNLR) index using L-DMR cell libraries and our validated bioinformatic approach (23, 28). Cytologic NLR data are often unavailable in clinical and research studies, including epigenome-wide association studies (EWAS); an mdNLR index, based solely on archival blood DNA, would expand the scope of studies that seek to evaluate immune parameters in cancer to include large prospective studies.
Materials and Methods
Computing the mdNLR
Estimation of the mdNLR required three main steps: (i) identify differentially methylated CpGs among leukocyte subtypes (L-DMRs), (ii) perform cell-mixture deconvolution (23) to estimate the proportion of leukocyte subtypes using the L-DMRs identified in step (i), and (iii) compute the ratio of the predicted proportion of neutrophil granulocytes to lymphocytes (Fig. 1). Using DNA methylation data from isolated leukocyte subtypes (22, 23), we identified CpG-specific patterns of DNA methylation among monocytes, granulocytes, and lymphocytes using a series of t tests fit independently to each CpG. For each of the three pairwise comparisons, the 50 CpGs with the smallest and the 50 CpGs with the largest t statistics were combined to create a single list of J nonoverlapping L-DMRs (Supplementary Tables S1 and S2). The rationale for selecting only these CpGs to create the L-DMR library was based on previous work (29) and empirical analyses that suggested that the inclusion of additional CpGs offered only marginal improvements in the accuracy of our estimates of NLR (data not shown). Using the J L-DMRs and cell-mixture deconvolution, we estimated the fractions of granulocytes, monocytes, and lymphocytes for the ith study sample, |$\hat{\Omega}_i = [\hat{\omega}_{(\mathrm{Gran},i)},\ \hat{\omega}_{(\mathrm{Mono},i)},\ \hat{\omega}_{(\mathrm{Lymph},i)}].$| Finally, the mdNLR was computed for each sample by taking the ratio of its predicted granulocyte and lymphocyte fractions, |$mdNLR_i = \frac{\hat{\omega}_{(\mathrm{Gran},i)}}{\hat{\omega}_{(\mathrm{Lymph},i)}},\ 0 \le mdNLR_i < \infty.$| An implementation of this method is available by request from the corresponding author.
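The three steps can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names (`select_ldmrs`, `deconvolve`, `mdnlr`) are hypothetical, the data are synthetic, and the constrained projection of ref. 23 is replaced here by plain non-negative least squares solved with projected gradient descent, so the estimated fractions are not forced to sum to one.

```python
import random
from statistics import mean, variance

CELL_TYPES = ("Gran", "Mono", "Lymph")
PAIRS = (("Gran", "Mono"), ("Gran", "Lymph"), ("Mono", "Lymph"))


def welch_t(x, y):
    """Two-sample t statistic for one CpG (unequal-variance form)."""
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    return (mean(x) - mean(y)) / se


def select_ldmrs(ref, n_top=50):
    """Step (i): union of the n_top smallest and largest t per comparison.

    ref maps cell type -> list of samples; a sample is a list of beta values.
    """
    n_cpg = len(ref[CELL_TYPES[0]][0])
    chosen = set()
    for a, b in PAIRS:
        t = [welch_t([s[j] for s in ref[a]], [s[j] for s in ref[b]])
             for j in range(n_cpg)]
        order = sorted(range(n_cpg), key=t.__getitem__)
        chosen.update(order[:n_top] + order[-n_top:])
    return sorted(chosen)


def deconvolve(y, profiles, n_iter=5000):
    """Step (ii): non-negative least squares via projected gradient descent.

    profiles[k][j] is the mean beta of cell type k at L-DMR j; y[j] is the
    target sample's beta at the same L-DMR. (Assumed stand-in for ref. 23.)
    """
    k = len(profiles)
    gram = [[sum(p * q for p, q in zip(profiles[r], profiles[c]))
             for c in range(k)] for r in range(k)]
    rhs = [sum(p * v for p, v in zip(profiles[r], y)) for r in range(k)]
    step = 1.0 / sum(gram[r][r] for r in range(k))  # 1/trace is a safe step
    w = [1.0 / k] * k
    for _ in range(n_iter):
        grad = [sum(gram[r][c] * w[c] for c in range(k)) - rhs[r]
                for r in range(k)]
        w = [max(0.0, w[r] - step * grad[r]) for r in range(k)]
    return dict(zip(CELL_TYPES, w))


def mdnlr(fractions):
    """Step (iii): granulocyte fraction divided by lymphocyte fraction."""
    return fractions["Gran"] / fractions["Lymph"]


# Synthetic demo: each cell type is hypomethylated at its own block of CpGs.
random.seed(0)
n_cpg, block = 300, 60
truth = {c: [0.2 if i * block <= j < (i + 1) * block else 0.8
             for j in range(n_cpg)] for i, c in enumerate(CELL_TYPES)}
ref = {c: [[b + random.gauss(0, 0.02) for b in truth[c]] for _ in range(6)]
       for c in CELL_TYPES}
ldmrs = select_ldmrs(ref)
profiles = [[mean(s[j] for s in ref[c]) for j in ldmrs] for c in CELL_TYPES]
mix = {"Gran": 0.6, "Mono": 0.1, "Lymph": 0.3}  # true ratio is 0.6 / 0.3
y = [sum(mix[c] * truth[c][j] for c in CELL_TYPES) for j in ldmrs]
estimate = deconvolve(y, profiles)
print(round(mdnlr(estimate), 2))
```

Because the ratio is unbounded above, a sample with a very small estimated lymphocyte fraction yields a very large mdNLR; a production implementation would need to decide how to handle a fraction of exactly zero.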
Validating the mdNLR
As a validation of the mdNLR, we performed an analysis comparing mdNLR to cytologic NLR within an independent set consisting of whole blood (WB) DNA methylation measurements across 18 samples (29). These data are publicly available in Gene Expression Omnibus (GEO accession number: GSE77797). Of the 18 samples, 12 were artificial WB reconstructions for which the mixing proportions of leukocyte cell types (i.e., granulocytes, monocytes, CD4T, CD8T, natural killer cells, and B cells) were known exactly. The remaining 6 WB samples were collected from disease-free adult donors with available immune cell profiling data from flow cytometry (FC). Further details on the 18 WB samples can be found elsewhere (29). The mdNLR and cytologic NLR were first determined for each of the 18 samples and subsequently compared by computing the coefficient of determination (R²) and root mean squared prediction error (RMSPE).
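The two agreement measures can be sketched as follows. The text does not say how the coefficient of determination was computed; this sketch assumes the squared Pearson correlation, and the input values are invented for illustration rather than taken from GSE77797.

```python
from math import sqrt


def r_squared(pred, obs):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(obs)
    mp, mo = sum(pred) / n, sum(obs) / n
    sxy = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    sxx = sum((p - mp) ** 2 for p in pred)
    syy = sum((o - mo) ** 2 for o in obs)
    return sxy ** 2 / (sxx * syy)


def rmspe(pred, obs):
    """Root mean squared prediction error, in the units of the inputs."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


# Made-up values for illustration only (not data from the study).
cytologic_nlr = [1.0, 2.0, 3.0, 4.0]
md_nlr = [0.9, 1.8, 2.9, 3.6]
print(r_squared(md_nlr, cytologic_nlr), rmspe(md_nlr, cytologic_nlr))
```

The two measures answer different questions: a squared correlation near 1 can coexist with a systematic bias, which is why the RMSPE is reported alongside it.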
Statistical analyses of the mdNLR and clinical outcomes
Associations between mdNLR and clinical covariates were assessed using either logistic regression or linear regression models. Both univariate and multivariate regression models adjusted for potential confounders were employed. Cox proportional hazards regression models adjusted for potential confounding covariates were used to examine the association between mdNLR and survival time. In our survival analyses, mdNLR was modeled both as a continuous predictor and by dichotomizing subjects into high and low mdNLR groups. High and low mdNLR groups were determined by first identifying the mdNLR cutoff point that maximized the log-rank (LR) test statistic within each dataset separately (|$\Delta_d$|, |$d \in \{\mathrm{Bladder}, \mathrm{HNSCC}\}$|), followed by group assignment: subject i in dataset d was assigned to the high mdNLR group if |$mdNLR_{di} > \Delta_d$| and to the low mdNLR group otherwise.
To determine the mdNLR cutoff point (|$\Delta_d$|) for each dataset, we computed the LR test statistic that resulted from comparing the survival profiles of subjects in the high and low mdNLR groups as a function of varying thresholds for defining membership in those groups, that is, |$LR_{dt} = f_d(S_{di}, C_{di} \mid \delta_t)$|, where |$S_{di}$| is the time to death or censoring for subject i in dataset d (Bladder or HNSCC), |$C_{di}$| is the censoring indicator for subject i in dataset d, and |$\delta_t$|, t = 1,…,T, represents the candidate cutoff point for determining the high and low mdNLR groups. Finally, the optimal cutoff point was obtained by finding the |$\delta_t$| that resulted in the maximum LR test statistic, that is, |${\Delta _d} = \arg {\max _{{\delta _t}}}( {L{R_{dt}}} ).$| The purpose of dichotomizing mdNLR into high and low groups was 2-fold: first, to guard against a potentially nonlinear effect of mdNLR on the log-hazard and second, to enable straightforward comparisons across datasets and previously published studies.
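A minimal sketch of this cutoff search is given below. It assumes the standard two-group log-rank chi-square statistic and takes every observed mdNLR value except the largest as a candidate threshold; the function names and the follow-up data are hypothetical.

```python
def logrank_chi2(times, events, in_high):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    obs_minus_exp = 0.0
    var = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        # Subjects still at risk just before time t, flagged if they die at t.
        risk = [(g, bool(e) and ti == t)
                for ti, e, g in zip(times, events, in_high) if ti >= t]
        n = len(risk)
        n_high = sum(1 for g, _ in risk if g)
        d = sum(1 for _, died in risk if died)
        d_high = sum(1 for g, died in risk if g and died)
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            var += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / var if var > 0 else 0.0


def optimal_cutoff(mdnlr_values, times, events):
    """Return (cutoff, statistic) maximising the log-rank statistic."""
    candidates = sorted(set(mdnlr_values))[:-1]  # keep both groups non-empty
    scored = [(logrank_chi2(times, events, [m > c for m in mdnlr_values]), c)
              for c in candidates]
    stat, cutoff = max(scored)
    return cutoff, stat


# Made-up follow-up data for illustration (months; 1 = death, 0 = censored).
mdnlr_values = [1, 2, 3, 4, 6, 7, 8]
times = [60, 55, 50, 45, 5, 4, 3]
events = [1, 1, 1, 1, 1, 1, 1]
delta, lr = optimal_cutoff(mdnlr_values, times, events)
print(delta, round(lr, 2))
```

Subjects with an mdNLR strictly greater than the returned cutoff form the high group, matching the ≤5 versus >5 split used in the Results.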
Target DNA methylation datasets
We derived and investigated mdNLR scores in five published DNA methylation datasets. The study sample sizes, clinical characteristics, and available demographic/epidemiological information are given in Table 1. The datasets used here included case–control EWAS of bladder cancer (30), head and neck squamous cell carcinoma (HNSCC; ref. 31), ovarian cancer (32), and breast cancer (33). Treatment status was available for the patients in the ovarian cancer dataset and separate analyses were performed on pre- and post-treatment cases. For the HNSCC dataset, blood was drawn pretreatment. Conversely, blood was drawn post-treatment for cases in the bladder cancer dataset. Treatment was fairly consistent among non-muscle–invasive bladder cancer cases, which comprised the vast majority (70%) of bladder cancer cases. Across all cases, approximately 75% had received surgery only (i.e., transurethral resection of the bladder tumor; TURBT) and approximately 13% had received bacillus Calmette-Guérin (BCG) at the time of blood draw. Treatment information was not available for the six blood samples collected postdiagnosis in the breast cancer dataset.
| Study (Array) | Characteristics | Cases | Controls | P difference |
|---|---|---|---|---|
| HNSCC (27K) | Total samples | 92 | 92 | |
| | mdNLR, mean (SD) | 3.0 (1.6) | 1.7 (0.7) | 1.2 × 10⁻¹⁰ |
| | Age, median years (range) | 58 (31–84) | 59 (32–86) | 0.54ᵃ |
| | Gender | | | 0.99ᵇ |
| | Male | 64 (70%) | 64 (70%) | |
| | Female | 28 (30%) | 28 (30%) | |
| | Race | | | 0.99ᵇ |
| | White | 84 (91%) | 85 (92%) | |
| | Nonwhite | 8 (9%) | 7 (8%) | |
| | Smoking history | | | 0.04ᵇ |
| | Never | 17 (19%) | 32 (35%) | |
| | Former | 59 (64%) | 47 (51%) | |
| | Current | 16 (17%) | 13 (14%) | |
| | HPV16 (E6, E7, or L1) | | | 0.0002ᵇ |
| | Negative | 66 (72%) | 83 (90%) | |
| | Positive | 26 (28%) | 9 (10%) | |
| | Tumor site | | | |
| | Laryngeal | 18 (20%) | N/A | |
| | Oral cavity | 47 (50%) | N/A | |
| | Oropharyngeal | 25 (27%) | N/A | |
| Bladder cancer (27K) | Total samples | 223 | 237 | |
| | mdNLR, mean (SD) | 2.6 (1.9) | 2.8 (2.2) | 0.33 |
| | Age, median years (range) | 66 (25–74) | 65 (28–74) | 0.05ᵃ |
| | Gender | | | 0.06ᵇ |
| | Male | 171 (77%) | 158 (67%) | |
| | Female | 52 (23%) | 79 (33%) | |
| | Race | | | 0.99ᵇ |
| | White | 223 (100%) | 237 (100%) | |
| | Nonwhite | | | |
| | Smoking history | | | <0.001ᵇ |
| | Never | 40 (18%) | 72 (30%) | |
| | Former | 111 (50%) | 126 (53%) | |
| | Current | 72 (32%) | 39 (16%) | |
| | Tumor stage | | | |
| | T0a | 156 (70%) | N/A | |
| | Tis | 6 (2.7%) | N/A | |
| | T1 | 37 (17%) | N/A | |
| | T2 | 12 (5.4%) | N/A | |
| | T3 | 6 (2.7%) | N/A | |
| | T4 | 6 (2.7%) | N/A | |
| Ovarian cancer (27K) | Total samples | 266 | 274 | |
| | Pretreatment cases | 131 (49%) | N/A | |
| | Posttreatment cases | 135 (51%) | N/A | |
| | mdNLR, mean (SD) | 3.15 (2.24) | 2.08 (1.01) | 2.2 × 10⁻¹⁶ᵃ |
| | Age group | | | 3.7 × 10⁻⁶ᵇ |
| | 50–55 | 34 (13%) | 14 (5%) | |
| | 55–60 | 47 (18%) | 68 (25%) | |
| | 60–65 | 42 (16%) | 67 (24%) | |
| | 65–70 | 43 (16%) | 39 (14%) | |
| | 70–75 | 50 (19%) | 66 (24%) | |
| | >75 | 50 (19%) | 20 (7%) | |
| | Histology | | | |
| | Serous | 150 (56%) | N/A | |
| | Endometrioid | 34 (13%) | N/A | |
| | Mucinous | 30 (11%) | N/A | |
| | Clear cell | 25 (9%) | N/A | |
| | Other | 27 (10%) | N/A | |
| Breast Twin Study (450K) | Total samples | 15 | 15 | |
| | mdNLR, mean (SD) | 2.7 (1.4) | 2.4 (1.2) | 0.08ᶜ |
| Healthy aging study (450K) | Total | 656 | | |
| | mdNLR, mean (SD) | 3.0 (5.6)ᵈ | | |
| | Age, median years (range) | 65 (19–101) | | |
| | Race | | | |
| | Caucasian - European | 426 (65%) | | |
| | Hispanic - Mexican | 230 (35%) | | |
| | Gender | | | |
| | Female | 338 (52%) | | |
| | Male | 318 (48%) | | |
ᵃWilcoxon rank-sum test for a difference between cases and controls.
ᵇFisher's exact or χ² test for a difference between cases and controls.
ᶜOne-sided paired t test to assess the difference between cases and controls.
ᵈAverage and SD driven up by several large mdNLR outlier values in the data.
To explore the relationship between mdNLR and age, ethnic origin, and gender in healthy adults, we used a large (n = 656), publicly available blood-derived DNA methylation dataset (34) (GEO accession number: GSE40279). None of the methylation studies included direct cytologic measurements of leukocyte proportions.
Reference leukocyte-specific DNA methylation datasets
We made use of two previously published leukocyte-specific DNA methylation datasets as the basis for mdNLR estimation because the target DNA methylation datasets used in our analysis spanned two different array technologies (HM27 and HM450 arrays; refs. 22, 23). Specifically, we used HM27 methylation data for leukocyte subtypes isolated from the peripheral blood of 46 different nondiseased human adults [B cells (n = 6), natural killer cells (n = 11), CD4+ T cells (n = 8), CD8+ T cells (n = 2), Pan-T cells (n = 6), monocytes (n = 5), granulocytes (n = 8); ref. 23], and a dataset (GEO accession number: GSE35069) that profiled the same leukocyte subtypes in each of 6 healthy male adults using the Illumina HM450 BeadChip (22).
Quality control and preprocessing of the DNA methylation datasets
For each of the DNA methylation datasets, preprocessing and quality control were performed using the minfi Bioconductor package (35). To ensure high-quality methylation data, CpG loci having a sizable fraction (>25%) of detection P values above a predetermined threshold (detection P > 10E−5) were excluded (36). For the HM450 datasets, Subset Quantile Within Array (SWAN) normalization was performed for type 1/2 probe adjustment (37). The presence of technical sources of variability induced by plate and/or BeadChip was examined using principal components analysis (PCA), and the top K principal components (38) were examined in terms of their association with plate and BeadChip. If plate and/or BeadChip was found to be significantly associated with any of the top K principal components, we applied the ComBat method (39) for normalization using the sva Bioconductor package.
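The locus-level exclusion rule can be illustrated with a short sketch. It is not the minfi workflow used in the study: the function name `failed_loci` and the toy matrix are hypothetical, and the default threshold of 1e-5 is one reading of the "10E−5" stated above (read literally as scientific notation it would be 1e-4), so the threshold is left as a parameter.

```python
def failed_loci(detection_p, p_threshold=1e-5, max_fail_fraction=0.25):
    """Return indices of CpG loci to exclude.

    detection_p[j][i] is the detection P value of locus j in sample i.
    A locus is excluded when more than max_fail_fraction of its samples
    have a detection P value above p_threshold.
    """
    excluded = []
    for j, row in enumerate(detection_p):
        n_fail = sum(1 for p in row if p > p_threshold)
        if n_fail / len(row) > max_fail_fraction:
            excluded.append(j)
    return excluded


# Toy matrix: 3 loci x 4 samples (values invented for illustration).
detection_p = [
    [1e-12, 1e-9, 1e-8, 1e-10],   # all samples pass
    [1e-12, 0.03, 1e-8, 1e-10],   # 1 of 4 fails (25%): kept, not > 25%
    [0.2, 0.03, 1e-8, 1e-10],     # 2 of 4 fail (50%): excluded
]
print(failed_loci(detection_p))
```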
Results
mdNLR validation analysis
As a validation of our proposed mdNLR, we first performed an analysis comparing mdNLR with cytologic NLR using an independent set consisting of WB DNA methylation measurements across 18 samples with observed cell proportions. The results of this analysis showed a high correlation between mdNLR and cytologic NLR (R² = 0.99; Supplementary Fig. S1). While a small downward bias in our estimates of NLR was observed, the average difference between mdNLR and NLR across the 18 samples was minimal, RMSPE = 0.60 (in NLR units); that is, on average, mdNLR and cytologic NLR differed by 0.60 units.
Head and neck squamous cell carcinoma
A univariate comparison of the mdNLR between cases and controls revealed a statistically significant inflation in the mdNLR among head and neck squamous cell carcinoma (HNSCC) cases (P = 1.2 × 10⁻¹⁰) with the mean mdNLR of cases estimated at 2.99 compared with 1.75 for controls (Fig. 2A and B). In a multivariable linear regression model adjusted for patient age, gender, race, smoking status, and HPV16 status, HNSCC cases exhibited a significantly larger mdNLR compared with controls (P = 2.6 × 10⁻¹⁰). Within the same model, age was also observed to be positively associated with mdNLR (P = 0.009), with each 10-year increment in age being associated with an expected increase of 0.20 in the mdNLR.
We computed receiver operating characteristic (ROC) curves and corresponding area under the curve (AUC) based on covariate data only (age, gender, race, smoking status, and HPV16 status), mdNLR, and their combination (Fig. 2C). The classifier with mdNLR alone was sufficient to distinguish HNSCC cases from controls with an AUC = 0.76 (95% CI = 0.69–0.83), and including the covariates with mdNLR resulted in an AUC = 0.82 (95% CI = 0.76–0.88), a statistically significant improvement in the AUC compared with the covariate only classifier (P = 0.002).
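For a single continuous score such as the mdNLR, the AUC equals the probability that a randomly chosen case has a higher score than a randomly chosen control. The sketch below computes it that way; the function name and the input values are invented for illustration, and confidence intervals and the covariate-adjusted classifiers are not covered.

```python
def auc(case_scores, control_scores):
    """Probability that a random case scores above a random control.

    Ties count one half (the Mann-Whitney formulation of the ROC AUC).
    """
    wins = 0.0
    for x in case_scores:
        for y in control_scores:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented mdNLR values for illustration only.
cases = [3.1, 2.4, 4.0, 1.9]
controls = [1.5, 2.0, 1.2, 2.4]
print(auc(cases, controls))
```

An AUC of 0.5 corresponds to a score that does not separate the groups at all, and 1.0 to perfect separation.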
Among HNSCC cases there was no significant difference in the mdNLR based on the site of the tumor (oral, pharyngeal, and laryngeal: P = 0.83); however, the mdNLR was elevated in subjects who died during the follow-up period compared with those who survived or were censored [mean mdNLR = 3.23 and 2.62, respectively (P = 0.07)]. We next compared survival between HNSCC cases whose mdNLR ≤ 5 (88% of cases) and cases whose mdNLR > 5 (12% of cases), and observed that cases in the low mdNLR group had a median survival time that was approximately 5.5 times longer than subjects in the high group (Fig. 2E; log-rank P = 0.002). In a Cox regression model adjusted for age, gender, smoking status (never vs. ever), HPV16 status, and tumor stage (I vs. II, III, and IV), cases in the high mdNLR group had an approximately 2-fold increased hazard of death compared with those in the low mdNLR group (HR = 2.04; 95% CI = 0.97–4.29; Supplementary Table S3). As it has been shown that tobacco carcinogens alter the blood immune profile associated with cancer survival (16), we tested for and found a statistically significant interaction between mdNLR and smoking status in their association with survival (Table 2; P = 0.014). Among those with a high mdNLR, the never-smokers exhibited a 3-fold increased hazard of death compared with ever-smokers (HR = 3.19; 95% CI = 0.71–14.34); among those with a low mdNLR, the never-smokers exhibited a 3-fold decreased hazard of death compared with ever-smokers (HR = 0.33; 95% CI = 0.12–0.91). Furthermore, a Cox proportional hazards model treating mdNLR as a continuous predictor also revealed significant effects of both mdNLR and smoking status on overall survival as well as a significant interaction between smoking status and mdNLR (Table 2).
| Study | Variable | HR (95% CI) | P |
|---|---|---|---|
| HNSCC | | | |
| | mdNLR (cont.) | 2.58 (1.41–4.71) | 0.002ᵃ |
| | Age | 1.05 (1.02–1.08) | 2.9 × 10⁻⁴ᵃ |
| | Gender | | |
| | Female | ref (N/A) | N/A |
| | Male | 0.71 (0.38–1.31) | 0.270 |
| | Smoking history | | |
| | Never | ref (N/A) | N/A |
| | Ever | 45.72 (3.22–648.56) | 0.005ᵃ |
| | Tumor site | | |
| | Laryngeal | ref (N/A) | N/A |
| | Oral cavity | 1.13 (0.53–2.43) | 0.744 |
| | Oropharyngeal | 2.7 (1.13–6.43) | 0.025ᵇ |
| | HPV16 | 0.17 (0.07–0.42) | 1.2 × 10⁻⁴ᵃ |
| | Tumor stage | | |
| | Stage I | ref (N/A) | N/A |
| | Stages II, III, IV | 2.53 (0.97–6.6) | 0.058ᶜ |
| | Smoking × mdNLR | 0.41 (0.22–0.78) | 0.006ᵈ |
| HNSCC | | | |
| | mdNLR | | |
| | mdNLR ≤ 5 | | |
| | mdNLR > 5 | 13.87 (2.76–69.66) | 0.001ᵃ |
| | Age | 1.05 (1.02–1.08) | 4.9 × 10⁻⁴ᵃ |
| | Gender | | |
| | Female | | |
| | Male | 0.71 (0.38–1.33) | 0.283 |
| | Smoking history | | |
| | Never | | |
| | Ever | 3.05 (1.1–8.45) | 0.032ᵇ |
| | Tumor site | | |
| | Laryngeal | | |
| | Oral cavity | 1.19 (0.56–2.54) | 0.646 |
| | Oropharyngeal | 2.45 (1.01–5.92) | 0.047ᵇ |
| | HPV16 | 0.2 (0.08–0.48) | 2.9 × 10⁻⁴ᵃ |
| | Tumor stage | | |
| | Stage I | | |
| | Stages II, III, IV | 2.31 (0.88–6.1) | 0.090ᶜ |
| | Smoking × mdNLR | 0.1 (0.02–0.63) | 0.014ᵇ |
| Bladder | | | |
| | mdNLR (cont.) | 1.65 (1.15–2.37) | 0.007ᵈ |
| | Age | 1.07 (1.04–1.10) | 2.3 × 10⁻⁶ᵃ |
| | Gender | | |
| | Female | ref (N/A) | N/A |
| | Male | 1.62 (0.96–2.74) | 0.070ᶜ |
| | Smoking history | | |
| | Never | ref (N/A) | N/A |
| | Ever | 4.98 (1.44–17.17) | 0.011ᵇ |
| | Tumor stage | | |
| | Low (T0a & T1–T3) | ref (N/A) | N/A |
| | High (Tis & T4) | 2.75 (1.45–5.20) | 0.002ᵃ |
| | Smoking × mdNLR | 0.67 (0.46–0.97) | 0.033ᵇ |
| Bladder | | | |
| | mdNLR | | |
| | mdNLR ≤ 5 | ref (N/A) | N/A |
| | mdNLR > 5 | 33.67 (6.77–167.5) | 1.7 × 10⁻⁵ᵃ |
| | Age | 1.08 (1.05–1.11) | 1.7 × 10⁻⁷ᵃ |
| | Gender | | |
| | Female | ref (N/A) | N/A |
| | Male | 1.61 (0.95–2.72) | 0.076ᶜ |
| | Smoking history | | |
| | Never | ref (N/A) | N/A |
| | Ever | 1.97 (1.09–3.55) | 0.024ᵇ |
| | Tumor stage | | |
| | Low (T0a & T1–T3) | ref (N/A) | N/A |
| | High (Tis & T4) | 3.29 (1.73–6.27) | 2.8 × 10⁻⁴ᵃ |
| | Smoking × mdNLR | 0.08 (0.01–0.42) | 0.003ᵃ |
ᵃP ≤ 0.005.
ᵇP ≤ 0.05.
ᶜP ≤ 0.10.
ᵈP ≤ 0.01.
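The stratum-specific hazard ratios quoted in the text follow from the main-effect and interaction terms of a Cox model by multiplication. The sketch below works through the dichotomized HNSCC model in Table 2. It assumes the coding the table suggests (never-smokers and mdNLR ≤ 5 as reference levels) and uses a hypothetical helper name; it is arithmetic on the published estimates, not a refit of the model.

```python
def never_vs_ever_hr(hr_ever, hr_interaction, high_mdnlr):
    """Hazard ratio for never- versus ever-smokers within one mdNLR group.

    Assumes the coding suggested by Table 2: 'never' and 'mdNLR <= 5' are
    the reference levels, and the interaction HR multiplies the smoking HR
    only in the high-mdNLR group.
    """
    ever_vs_never = hr_ever * (hr_interaction if high_mdnlr else 1.0)
    return 1.0 / ever_vs_never


# Dichotomized HNSCC model in Table 2: Ever = 3.05, Smoking x mdNLR = 0.1.
low = never_vs_ever_hr(3.05, 0.1, high_mdnlr=False)
high = never_vs_ever_hr(3.05, 0.1, high_mdnlr=True)
print(round(low, 2), round(high, 2))
```

Under these assumptions the low-mdNLR value matches the reported HR of 0.33, and the high-mdNLR value comes out near 3.3, close to the reported 3.19. The small gap is plausibly explained by the tabulated interaction HR being rounded to one decimal place, though that is an assumption.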
Bladder cancer
A multivariable linear regression analysis adjusted for subject age, gender, and smoking status (never, former, current) showed no statistically significant difference in mean mdNLR between bladder cancer cases and controls (P = 0.23; Fig. 3A and B). However, both age and gender were significantly associated with mdNLR among bladder cancer cases (P = 0.009 and P = 0.005, respectively), adjusting for tumor stage, grade, age, gender, and smoking history. In particular, among bladder cancer cases, females had a lower mdNLR on average compared with males (1.88 vs. 2.78; P = 0.005) and each 10-year increase in age was associated with an expected increase of 0.30 in mdNLR (P = 0.009).
Similar to the HNSCC dataset, mdNLR was significantly elevated among bladder cancer cases who died during the follow-up period compared with those who were censored or remained alive at the end of the study period (mean = 3.14 vs. 2.46; P = 0.05; Fig. 3C). Also paralleling the HNSCC dataset, the optimal cutoff for defining low and high mdNLR groups among bladder cancer cases was found to be 5 (Fig. 3D). On the basis of the optimal cutoff point, 7% and 93% of the cases were assigned to the high and low mdNLR groups, respectively, and a univariate comparison of survival showed that those with an mdNLR ≤ 5 had a median survival nearly twice as long as subjects in the high group (Fig. 3E; log-rank P = 2.2 × 10⁻⁵). Furthermore, in a model adjusted for age, gender, smoking status (ever vs. never), and tumor stage, cases with an mdNLR > 5 had an approximately 3-fold increased hazard of death compared with those with an mdNLR ≤ 5 (HR = 3.01; 95% CI = 1.69–5.36; Supplementary Table S4). Strikingly, mimicking the results observed in the HNSCC dataset, the association between mdNLR and survival also exhibited a statistically significant interaction with smoking status (P = 0.003), such that among those with a high mdNLR, never-smokers exhibited a 3.5-fold increased hazard of death compared with ever-smokers (HR = 3.52, P = 1.7 × 10⁻⁵), whereas among those with a low mdNLR, never-smokers exhibited a 2-fold decreased hazard of death compared with ever-smokers (HR = 0.51; 95% CI = 0.28–0.92). Furthermore, a Cox proportional hazards model treating mdNLR as a continuous predictor also revealed significant effects of both mdNLR and smoking status on overall survival as well as a significant interaction between smoking status and mdNLR (Table 2). Intriguingly, despite the association between mdNLR and overall survival, there was no significant association between survival and the components used to calculate the mdNLR: granulocytes (HR = 2.10; P = 0.427) and lymphocytes (HR = 0.47; P = 0.412).
Ovarian cancer
A comparison of mdNLR between controls and ovarian cancer cases showed that the mdNLR was significantly higher in cases (P = 2.2 × 10⁻¹⁶; Table 1), and this difference remained significant after adjustment for patient age at blood draw (P = 8.0 × 10⁻¹¹). Comparing the mean mdNLR of controls versus pre- and posttreatment cases separately revealed a gradient; controls had the smallest average mdNLR (mean = 2.08), followed by posttreatment cases (mean = 2.47), and pretreatment cases had the highest mdNLR (mean = 3.84; Fig. 4A). In addition, posttreatment cases had significantly elevated mdNLR compared with cancer-free controls (P = 0.002, age-adjusted) and pretreatment cases had significantly elevated mdNLRs compared with both controls (P = 2.2 × 10⁻¹⁶, age-adjusted) and posttreatment cases (P = 1.3 × 10⁻⁶, age-, histology-, and stage-adjusted). To examine the potential of mdNLR to correctly classify controls, pretreatment cases, and posttreatment cases, we computed ROC curves and corresponding AUCs for each pairwise comparison (Fig. 4B). The results from this analysis revealed that mdNLR alone was sufficient to distinguish controls from pretreatment cases with an AUC = 0.79 (95% CI = 0.74–0.83). Unexpectedly, however, our results showed that the mdNLR was better able to distinguish pretreatment from posttreatment cases (AUC = 0.69; 95% CI = 0.63–0.76) than to distinguish controls from posttreatment cases (AUC = 0.61; 95% CI = 0.55–0.67).
Breast cancer
Comparing the mdNLR between twins discordant for breast cancer using a Wilcoxon signed rank test showed that subjects with breast cancer had elevated mdNLRs compared with their cancer-free twins (median difference in mdNLR between twin pairs = 0.33), and this difference was statistically significant (P = 0.005). We also examined the difference in mdNLR between twin pairs as a function of the time pre- versus postdiagnosis at which samples were collected (Fig. 4C). A Lowess smoothed curve reflecting the relationship between the twin-pair difference in the mdNLR as a function of sample collection relative to the time of cancer diagnosis was generated on the basis of all 13 twin pairs. Despite limited power due to the small sample size of this study, our results revealed a trend of increasing separation in the mdNLR between twin pairs in the years leading up to cancer diagnosis, which peaked around the time of diagnosis and decreased thereafter (Fig. 4C).
Healthy aging
Both the mean and variance of the mdNLR values increased with age (Fig. 4D). The mdNLR was, on average, higher among white non-Hispanics (mean = 3.49) than among Mexican Hispanics (mean = 2.23), and this difference was statistically significant after adjustment for subject age (P = 0.006; Fig. 4E). The mean mdNLR was also higher in males (mean = 3.30) than in females (mean = 2.81); however, this difference was not statistically significant (P = 0.26).
Discussion
Here we demonstrate that a methylation-derived estimate of the NLR displays associations consistent with the simple cytologic NLR. Our approach is based on cell mixture deconvolution (23) and differs from the estimateCellCounts function in the minfi Bioconductor package (35) only in the library (i.e., set of L-DMRs) used as the basis for deconvolution. Whereas estimateCellCounts uses a library composed of the top 50 hyper- and hypomethylated CpGs between each cell type (i.e., CD4T, CD8T, natural killer, B cells, monocytes, and granulocytes) and the remaining five subtypes (J = 600 total L-DMRs comprise the estimateCellCounts library), our library was constructed by identifying the L-DMRs (top 50 hyper- and hypomethylated CpGs) that best discriminated lymphocytes (i.e., CD4T, CD8T, natural killer, and B cells, collectively), monocytes, and granulocytes. The difference between libraries is subtle; however, our decision to select the library in this way was based on the formulation of NLR (i.e., the ratio of the neutrophil fraction to the lymphocyte fraction) and on empirical results suggesting more accurate estimates when mdNLR was estimated using the library considered here. Because the two libraries are similar, comparisons of mdNLR with cytologic NLR in our validation analysis gave highly similar results for the two approaches: R² = 0.99 when mdNLR was estimated using our approach versus estimateCellCounts.
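To make the relationship between deconvolution and the ratio concrete, the sketch below estimates three leukocyte fractions from a mixed methylation profile and then forms the granulocyte-to-lymphocyte ratio. It is a toy illustration under stated assumptions, not our implementation and not the minfi code: a generic non-negative least-squares fit solved by projected gradient descent stands in for the deconvolution algorithm (23), and the six-CpG reference library, the mixing fractions, and the function name `estimate_fractions` are all hypothetical.

```python
def estimate_fractions(profile, library, steps=500, lr=0.2):
    """Non-negative least squares by projected gradient descent.

    profile: beta values of the mixed sample, one per CpG.
    library: reference beta values, one row per CpG, one column per cell type.
    The step size lr suits this toy library; in general it must stay below
    2 / (largest eigenvalue of library^T library).
    """
    k = len(library[0])
    w = [1.0 / k] * k
    for _ in range(steps):
        # Residual of the current fit at every CpG
        resid = [sum(row[j] * w[j] for j in range(k)) - y
                 for row, y in zip(library, profile)]
        # Gradient of half the squared error with respect to each fraction
        grad = [sum(r * row[j] for r, row in zip(resid, library))
                for j in range(k)]
        # Gradient step, then projection onto non-negative values
        w = [max(0.0, wj - lr * gj) for wj, gj in zip(w, grad)]
    return w


# Hypothetical library; columns are lymphocytes, monocytes, granulocytes
library = [
    [0.9, 0.1, 0.1],
    [0.1, 0.9, 0.1],
    [0.1, 0.1, 0.9],
    [0.1, 0.9, 0.9],
    [0.9, 0.1, 0.9],
    [0.9, 0.9, 0.1],
]

# Simulate a noise-free blood sample with known (invented) fractions
true_fractions = [0.25, 0.10, 0.65]
profile = [sum(b * f for b, f in zip(row, true_fractions)) for row in library]

lymph, mono, gran = estimate_fractions(profile, library)
md_nlr = gran / lymph  # granulocyte fraction over lymphocyte fraction
print(f"lymph = {lymph:.3f}, mono = {mono:.3f}, gran = {gran:.3f}, mdNLR = {md_nlr:.2f}")
```

Because lymphocytes enter the ratio as a single pooled fraction, a library that treats them as one class estimates the denominator directly rather than summing four separately estimated subtype fractions.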
Our mdNLR estimates were higher in cases than in controls across multiple cancer types and were positively associated with an increased hazard of death in two independent cancer cohorts after adjustment for established risk predictors. The prognostic importance of the NLR beyond the additive effects of its constituents is underscored by the fact that our analysis of bladder cancer showed a significant association between overall survival and mdNLR, but no associations with the two components of the ratio (i.e., granulocyte and lymphocyte fractions). This is consistent with previous observations using the cytologic NLR (40). We further showed that the mdNLR increases with age and varies as expected among ethnic groups (41). Surprisingly, in both bladder cancer and HNSCC, cigarette smoking was associated with shorter survival times among patients with nonelevated mdNLR, but not among those with elevated mdNLR scores. Smoking is associated with poor survival in these and other cancers (42, 43); however, it has never been shown to interact with any measure of immune status in the manner observed here. This is particularly intriguing given recent observations showing the importance of tobacco and mutagen exposure in shaping treatment response to immunotherapy among tumors with high mutational loads (16). Our data, showing a novel interaction of smoking with the mdNLR, may similarly reflect this modification of the immune profile, and the mdNLR could potentially act as a biomarker of effect for immune therapies. Thus, the mdNLR represents a distinct approach utilizing the leukocyte lineage-determining epigenome that mirrors many established features of systemic inflammation (9–13) and offers promise as an informative survival biomarker.
The mechanisms driving the association of the NLR with cancer survival are not understood. Elevated NLR scores are associated with increases in inflammatory and angiogenic cytokines (44, 45) and in the levels of MDSCs (46, 47). The importance of MDSCs in cancer progression is gaining increased acceptance, as is the fact that MDSCs are major obstacles in the application of new immune checkpoint blockade therapies (48, 49). Elevated cytologic NLR values are also associated with resistance to checkpoint blockade inhibitors (50, 51). It is therefore important to understand the relationships between the NLR and immune dysregulation in cancer. Measuring the mdNLR in the context of genome-wide methylation analyses could help identify the epigenetic lineage of phenotypically active cells that are prognostically important, something that has not been possible using the cytologic NLR. Discovery of the methylomic features of the lineage of immunomodulatory cells could also provide a path to detecting targets important in cancer inflammation and compromised immune response.
As with all studies, this work has limitations. Like examinations of the diagnostic and prognostic potential of the cytologic NLR, the five studies used in this report, and their accompanying results, are subject to considerations of external and internal validity. In this regard, we acknowledge that the mdNLR survival and risk associations reported herein may not be generalizable to the entire population of HNSCC, ovarian, bladder, and breast cancer patients. Also, while the dataset used to validate mdNLR was relatively small (N = 18), 12 of the 18 samples comprising this dataset were obtained by mixing leukocyte subtype–specific DNA in known, predetermined proportions. Thus, for these 12 samples, the underlying leukocyte fractions, and consequently NLR, are known with high confidence and are likely less prone to the measurement error associated with cell sorting/counting techniques (e.g., FACS, complete blood cell count). Consequently, these 12 samples represent an ideal dataset on which to validate the accuracy of our methylation-derived estimates of NLR.
In conclusion, the mdNLR will allow epidemiologists to explore systemic inflammation on an extremely large scale, using archival blood specimens previously not available for this line of inquiry. Our observation of differences in mdNLR among twin pairs discordant for breast cancer suggests that cancer-promoting lifestyle and environmental factors that modify host immunity may be revealed through epigenetic analysis of peripheral blood. Such studies aim to integrate environmental and genetic epidemiologic risk factors for cancer, and incorporation of mdNLR opens an opportunity to better understand the etiologic underpinnings of cancer-associated immunomodulation.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: D.C. Koestler, C.J. Marsit, M.R. Karagas, K.T. Kelsey, J.K. Wiencke
Development of methodology: D.C. Koestler, J. Usset, M.R. Karagas, K.T. Kelsey
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): C.J. Marsit, M.R. Karagas, K.T. Kelsey
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): D.C. Koestler, J. Usset, B.C. Christensen, C.J. Marsit, M.R. Karagas, K.T. Kelsey, J.K. Wiencke
Writing, review, and/or revision of the manuscript: D.C. Koestler, J. Usset, B.C. Christensen, C.J. Marsit, M.R. Karagas, K.T. Kelsey, J.K. Wiencke
Study supervision: K.T. Kelsey
Grant Support
This work was supported by NIH grants (1KL2TR000119 to D.C. Koestler; R01CA52689 and P50CA097257 to J.K. Wiencke; R01DE022772 and P20GM104416 to B.C. Christensen; P30CA023108 to C.J. Marsit and M.R. Karagas), the Kansas IDeA Network of Biomedical Research Excellence Bioinformatics Core supported in part by the National Institute of General Medical Sciences award P20GM103418 (to D.C. Koestler), and the Robert Magnin Newman endowment for neurooncology at the University of California, San Francisco (to J.K. Wiencke).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.