Abstract
Low-dose CT screening can reduce lung cancer–related mortality. However, CT screening has a false discovery rate (FDR) of nearly 96%. We sought to assess whether urine samples can be a source for DNA methylation–based detection of non–small cell lung cancer (NSCLC).
In this nested case–control study of subjects with suspicious nodules on CT imaging, plasma and urine samples were obtained preoperatively. Cases (n = 74) had pathologic confirmation of NSCLC. Controls (n = 27) had a noncancer diagnosis. We detected promoter methylation in plasma and urine samples using methylation on beads and quantitative methylation–specific real-time PCR for cancer-specific genes (CDO1, TAC1, HOXA7, HOXA9, SOX17, and ZFP42).
DNA methylation at cancer-specific loci was detected in both plasma and urine, and was more frequent in patients with cancer than in controls for all six genes in plasma and for CDO1, TAC1, HOXA9, and SOX17 in urine. Univariate and multivariate logistic regression analyses showed that methylation detection in each of the six genes in plasma and in CDO1, TAC1, HOXA9, and SOX17 in urine was significantly associated with the diagnosis of NSCLC, independent of age, race, and smoking pack-years. When methylation was detected for three or more genes in both plasma and urine, the sensitivity and specificity for lung cancer diagnosis were 73% and 92%, respectively.
DNA methylation–based biomarkers in plasma and urine could be useful as an adjunct to CT screening to guide decision-making regarding further invasive procedures in patients with pulmonary nodules.
The National Lung Screening Trial showed that lung cancer screening can reduce lung cancer–related mortality by 20% using low-dose CT. However, CT screening has an FDR of nearly 96%. Biomarkers from liquid biopsy assays hold promise for enhancing the diagnostic accuracy of early-stage lung cancer screening in conjunction with CT imaging. Urine samples have the potential to be easily implemented in a primary care practice. This study suggests that liquid biopsy biomarkers based on methylation detection from plasma and urine could be used as an adjunct to CT screening to help guide the decision to proceed with further invasive procedures, because plasma and urine yield low false-positive rates and the methylation of these genes is associated with a high lung cancer risk independent of age, race, and smoking pack-years.
Introduction
Lung cancer is the leading cause of cancer-related mortality among men and women in the United States, accounting for one quarter of all cancer-related deaths (1). Non–small cell lung cancer (NSCLC) accounts for 87% of lung cancer cases and has an 18% overall 5-year survival rate (2). This low survival rate is most likely because over 40% of NSCLC cases are diagnosed at stage IV, which has a 5-year survival rate of 2%–13% (3, 4).
The National Lung Screening Trial (NLST) showed that lung cancer screening can reduce lung cancer–related mortality by 20% by using low-dose CT (LDCT; ref. 5). However, in that study, baseline LDCT scans were positive in 27.3% of subjects, with a false-positive rate (FPR) of 26.3% and a false discovery rate (FDR) of 96% (5). Altering the criteria for designating a scan as positive and implementing the Lung-RADS algorithm can reduce this rate (6, 7). For example, raising the threshold for a positive nodule from ≥4 mm as reported in NLST (5) to ≥6 mm decreases the FPR to 17% (8). However, 7 patients would have had a delayed diagnosis (2.7% of detected patients with cancer), and the majority of “positive” LDCT findings (4,470/4,726) would still come from subjects without cancer (FDR of 94.5%). Lung-RADS criteria further reduce the FPR of baseline screening to 12.8% (6, 7) but also reduce sensitivity to 84.9%, with 25 cancers not detected on the baseline scan (248 vs. 273 for the NLST cutoff), and 3,095/3,334 positive scans are from patients without cancer, for an FDR of 93.6%. These figures demonstrate the need for biomarker approaches for the management of screen-detected pulmonary nodules.
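As a worked check on the figures above, the FDR is the share of positive scans that come from subjects without cancer, that is, false positives divided by all positives. The following is a minimal Python sketch using the ≥6 mm counts quoted above; the function name is illustrative and does not come from the study.

```python
def false_discovery_rate(false_positives: int, total_positives: int) -> float:
    """Fraction of positive results that are false positives: FP / (FP + TP)."""
    if total_positives <= 0:
        raise ValueError("total_positives must be greater than zero")
    return false_positives / total_positives


# Counts quoted in the text for the >=6 mm nodule threshold:
# 4,470 of 4,726 positive scans came from subjects without cancer.
fdr_6mm = false_discovery_rate(4470, 4726)
print(f"FDR at >=6 mm threshold: {fdr_6mm:.1%}")
```

This evaluates to roughly 94.6%, which agrees with the 94.5% quoted above up to rounding.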
Using lung cancer tissue samples from The Cancer Genome Atlas, we previously identified promoter methylation changes that discriminate lung cancer tissue (all stages) from normal lung samples with very high sensitivity and specificity (9). We later validated these epigenetic markers by detecting promoter methylation in liquid biopsies (plasma and sputum) from patients with early-stage NSCLC (10). Urine can be a source for detection of specific somatic mutations in tumor DNA from cancers including urothelial, colon, and lung cancer (11–13). In addition, epigenetic changes in tumor DNA from urine sediment have been reported for urothelial (renal) cancer (14). Our aim was to determine whether urine samples can be used as a source for DNA methylation detection in NSCLC.
Materials and Methods
Study population
This was a prospective, observational nested case–control study conducted at two institutions: The Johns Hopkins Hospital (within the Johns Hopkins Lung Cancer Specialized Program of Research Excellence, Baltimore, MD) and the University of Illinois at Chicago (UIC) Hospital Health Science System (UIHHSS, Chicago, IL). This study was conducted in accordance with the Declaration of Helsinki. Institutional review board (IRB) approval was obtained prior to study initiation (NA_00005998 for Johns Hopkins and IRB #2017–1286 and #2018-0755 for UIC). All participants from both institutions signed informed consent. The reporting of this study conforms to the Strengthening the Reporting of Observational Studies in Epidemiology statement (15). Patients selected for inclusion were 50 years or older, had a CT scan for suspicion of lung cancer, and were referred for surgical resection. Exclusion criteria were small cell lung cancer pathology, presence of another malignancy, any history of cancer within the past 5 years, or lack of capacity to consent. Surgical resection and pathologic records were obtained from lung cancer lesions in patients who met the tumor–node–metastasis guidelines classification criteria (3, 16). Cases had pathologically confirmed NSCLC. Controls were defined as patients histologically confirmed not to have cancer. Pack-years of cigarette smoking were defined as the average number of packs smoked per day times the number of years smoked. Nodule dimensions were obtained from the pathology report, and nodule volume was calculated using the ellipsoid volume formula (volume = 4/3 × π × radius A × radius B × radius C). Urine and plasma samples were obtained from all participants.
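The two derived quantities defined above, pack-years and ellipsoid nodule volume, can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the assumption that the inputs are three orthogonal diameters (halved to obtain the radii) is ours, not stated in the text.

```python
import math


def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Average packs smoked per day times the number of years smoked."""
    return packs_per_day * years_smoked


def ellipsoid_volume(diam_a: float, diam_b: float, diam_c: float) -> float:
    """Volume = 4/3 * pi * radius A * radius B * radius C.

    Assumption: inputs are diameters (e.g., in cm), so each is halved.
    The result is in cubic units of the input (e.g., cm^3).
    """
    radius_a, radius_b, radius_c = diam_a / 2, diam_b / 2, diam_c / 2
    return 4.0 / 3.0 * math.pi * radius_a * radius_b * radius_c


# Hypothetical patient: 1.5 packs/day for 20 years, nodule 2.0 x 1.6 x 1.4 cm.
print(pack_years(1.5, 20))                          # 30.0 pack-years
print(round(ellipsoid_volume(2.0, 1.6, 1.4), 2))    # volume in cm^3
```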
The study population comprised a predominantly Caucasian cohort from the Johns Hopkins Hospital (Baltimore, MD; n = 52) and African American patients from the University of Illinois at Chicago (Chicago, IL; n = 49), for a total of 101 patients.
Plasma and urine sample collection
Patients enrolled in the study provided urine and matching blood samples. Blood samples were collected in EDTA tubes (Becton Dickinson) and processed within 2 hours after sample collection. Plasma was collected and stored at −80°C for up to 6 months until use. Longer storage of specimens has not previously affected the ability to detect DNA methylation in plasma; the time from collection to processing and freezing is more critical for DNA degradation. Urine samples were collected in 50 mL urine collection containers (Thermo Fisher Scientific NC9512383). For processing, 10 mL of urine was transferred into 15 mL conical tubes. To prevent DNA degradation, 200 μL of 0.5 mol/L EDTA (pH 8.0; Thomas Fisher Scientific, Inc.: 351027721) was added to each tube and mixed, and the tubes were spun at 3,200 rpm for 15 minutes. The supernatant was collected and stored at −80°C until use.
DNA isolation and bisulfite conversion
DNA was extracted from urine and plasma samples using an optimized methylation on beads (MOB) protocol. MOB is a process that allows DNA extraction and bisulfite conversion in a single tube via the use of silica super magnetic beads (10, 17). This approach yields a 1.5- to 5-fold improvement in extraction efficiency compared with traditional techniques (10, 18). We shortened the MOB protocol from the previous 24 hours to 6 hours, and we describe here for the first time its use for the isolation of DNA from urine.
In the improved protocol, plasma samples were incubated with Proteinase K (10 mg/mL; New England Biolabs Co.: P8107s) and Buffer AL (Qiagen, Co.: 19075) at 55°C for 1 hour. During the DNA bisulfite treatment procedure, CT lightning conversion reagent was added and incubated at 98°C for 10 minutes and then at 70°C for 1 hour.
For DNA extraction from urine, 150 μL of Proteinase K (10 mg/mL; New England Biolabs Co.: P8107s) was added to 3 mL of urine, followed by 3 mL of Buffer AL (Qiagen, Co.: 19075), and the mixture was incubated in a water bath at 55°C for 1 hour. After digestion, 3 mL of 100% isopropanol and 150 μL of Magnetic Beads (Promega, Co: magnesi KF-MD1471) were added to the sample to bind the DNA. Plasma and urine samples were prepared with parallel digestion workflows running concurrently (10).
Assessment of cell-free DNA tumor fraction
DNA isolated from plasma and urine of patients whose tumor DNA carried EGFR or KRAS mutations was sent to our Genome Research Core facility for droplet digital PCR (ddPCR) at these loci. ddPCR Supermix for probes (first line, ddPCR SMX PRBS nodUTP 500RXN) from Bio-Rad was used with the restriction enzyme from Life Technologies to set up the PCR before generating droplets. The assays used were: (i) ddPCR mutation assay: KRAS p.G12V c.35G>T, Human and (ii) ddPCR mutation assay: EGFR p.L858R c.2573T>G, Human. Tumor DNA fraction in cell-free DNA (cfDNA) was defined as the number of mutated DNA copies detected by ddPCR divided by the total number of DNA copies (mutant + wild-type).
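The tumor fraction definition above is a simple ratio. A minimal Python sketch follows; the function name and the copy counts are hypothetical, and returning `None` when no DNA copies are detected is our choice, not a rule stated in the study.

```python
from typing import Optional


def tumor_fraction(mutant_copies: float, wild_type_copies: float) -> Optional[float]:
    """Mutant copies divided by total copies (mutant + wild-type).

    Returns None when no copies were detected, because the ratio is undefined.
    """
    total = mutant_copies + wild_type_copies
    if total == 0:
        return None
    return mutant_copies / total


# Hypothetical ddPCR readout: 12 mutant and 1,188 wild-type copies.
print(tumor_fraction(12, 1188))  # 0.01, i.e., a 1% tumor fraction
```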
DNA methylation analysis
Primers and hybridization probes for methylation analysis were designed using Primer3 (v.0.4.0; refs. 19, 20). For this study, primers generating shorter amplicons were designed within the same genomic regions of the primers we previously used for plasma methylation detection (10). All primer and probe sequences are listed in Supplementary Table S1. The analysis was performed using quantitative real-time methylation-specific PCR (qMSP), and β-Actin was used as reference gene for normalization of methylation levels (21). The PCR reaction mix and 2^(−ΔCt) for each methylation detection replicate were performed as described previously (10). Positive promoter gene methylation was defined as 2^(−ΔCt) values ≥ 10^(−10) and negative promoter gene methylation as 2^(−ΔCt) values < 10^(−10). This bimodal distribution of 2^(−ΔCt) values, as shown previously in sputum and plasma detection (10), is in essence detectable versus undetectable, and all lower quantities (<10^(−10)) are actually zero, with real-time cycle thresholds of infinity designated as Ct of 100 to allow calculation of 2^(−ΔCt). The variation in 2^(−ΔCt) for undetectable methylation is the result of different control (β-actin) cycle thresholds. In this study, the lowest positive methylation 2^(−ΔCt) value was 1.66 × 10^(−5) and the highest negative methylation 2^(−ΔCt) value was 1.85 × 10^(−19).
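The detection rule above can be illustrated with a short Python sketch. It assumes the conventional definition ΔCt = Ct(target gene) − Ct(β-actin), which the text does not spell out; the function names and the example Ct values are hypothetical.

```python
import math
from typing import Optional

UNDETECTED_CT = 100.0        # Ct assigned when no amplification is observed
POSITIVE_THRESHOLD = 1e-10   # cutoff on the relative quantity for a positive call


def relative_methylation(ct_target: Optional[float], ct_actin: float) -> float:
    """Return the relative quantity 2 to the power of minus delta-Ct.

    Assumption: delta-Ct = Ct(target) - Ct(beta-actin).
    A missing or infinite target Ct is replaced by UNDETECTED_CT.
    """
    if ct_target is None or math.isinf(ct_target):
        ct_target = UNDETECTED_CT
    delta_ct = ct_target - ct_actin
    return 2.0 ** (-delta_ct)


def is_methylated(ct_target: Optional[float], ct_actin: float) -> bool:
    """Positive call when the relative quantity reaches the threshold."""
    return relative_methylation(ct_target, ct_actin) >= POSITIVE_THRESHOLD


# Hypothetical replicates: one amplified target and one with no amplification.
print(is_methylated(35.0, 28.0))   # True: the quantity is 2 to the power -7
print(is_methylated(None, 30.0))   # False: the quantity is 2 to the power -70
```

Under this convention an undetected target always falls many orders of magnitude below the threshold, and its exact value depends only on the β-actin Ct, which matches the explanation of the variation among undetectable samples given above.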
Statistical analysis
Quantitative data are expressed as median (interquartile range) for continuous variables and frequency (percentage) for categorical variables. Baseline demographic characteristics of the cases and controls were compared with the Wilcoxon rank sum test for continuous data and the Fisher exact test for categorical data. Two-sided statistical tests were used. Pearson correlation analysis was performed on the ΔCt values from DNA methylation in plasma and urine.
We determined the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and FPR in this cohort from the presence or absence of detectable methylation, based on the methylation ΔCt values. We performed ROC analysis using the 2^(−ΔCt) values to determine the performance of each individual gene. The AUC was reported with its 95% confidence interval (CI). In addition, we tested which multigene rule gave the best AUC: at least one, two, three, four, or five positively methylated genes, or all six simultaneously. For plasma, urine, and both combined, the best performing option was having at least three genes with positive methylation. Univariate and multivariate logistic regression models adjusted for age, race, and pack-years were used to assess the association of methylated genes with lung cancer diagnosis by measuring the ORs with 95% CI. P < 0.05 was considered statistically significant. Bonferroni correction was applied to all analyses to adjust P values for multiple tests. Supplementary Table S2 provides the univariate logistic regression analysis for risk of having NSCLC for each of the covariates. Only age was significantly associated with NSCLC risk (P = 0.03) among baseline characteristics in the uncorrected analysis; after Bonferroni correction for multiple comparisons, no variable was significantly associated with NSCLC risk (Supplementary Table S2). Nevertheless, given the importance of age, race, and pack-years, these were included as covariates in the multivariate analysis. All statistical analyses were performed using R statistical software, version 3.4.0 (22).
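The diagnostic metrics named above all follow from the four cells of a 2 × 2 table. The Python sketch below is an illustration (the study itself used R); the class and function names are hypothetical. The example counts are taken from the plasma "at least 3 positive" row of Table 2 (65 of 74 cases and 10 of 25 controls positive).

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    fpr: float


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> DiagnosticMetrics:
    """Compute standard diagnostic metrics from 2 x 2 table counts."""
    return DiagnosticMetrics(
        sensitivity=tp / (tp + fn),   # positive cases / all cases
        specificity=tn / (tn + fp),   # negative controls / all controls
        ppv=tp / (tp + fp),           # true positives / all positive tests
        npv=tn / (tn + fn),           # true negatives / all negative tests
        fpr=fp / (fp + tn),           # equals 1 - specificity
    )


# Plasma, at least three methylated genes: 65/74 cases and 10/25 controls positive.
m = diagnostic_metrics(tp=65, fn=74 - 65, fp=10, tn=25 - 10)
print(f"sensitivity={m.sensitivity:.0%} specificity={m.specificity:.0%} "
      f"PPV={m.ppv:.0%} NPV={m.npv:.0%} FPR={m.fpr:.0%}")
```

With these counts the sketch gives approximately 88% sensitivity, 60% specificity, 87% PPV, 63% NPV (62.5% before rounding), and 40% FPR, in line with the corresponding Table 2 row.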
Results
Characteristics of the patients
A total of 101 patients fulfilled inclusion criteria, with 74 NSCLC subjects and 27 controls with noncancerous lung lesions (Table 1). Clinical and demographic variables were balanced in cases and controls. Overall, the 101 subjects' characteristics were: median age 64 years; 51% male and 49% female; 51% Caucasian, 38% African American, and 11% other; median body mass index (BMI) of 27; 19% current smokers, 66% former smokers, and 15% never smokers; and a median 30 pack-year history.
Patient characteristics | Cancer (n = 74) | Control (n = 27) | P |
---|---|---|---|
Age at diagnosis (years; IQR) | 64 (59–70) | 62 (50–67) | 0.10 |
Gender | |||
Male (%) | 34 (46%) | 18 (67%) | 0.08 |
Female (%) | 40 (54%) | 9 (33%) | |
Race | |||
White (%) | 41 (55%) | 11 (41%) | 0.27 |
African American (%) | 24 (32%) | 14 (52%) | |
Hispanic (%) | 1 (1%) | 1 (4%) | |
Asian (%) | 4 (5%) | 0 (0%) | |
Other (%) | 4 (5%) | 1 (4%) | |
BMI (IQR) | 26 (22–30) | 27 (24–32) | 0.26 |
Smoking status | |||
Current (%) | 13 (18%) | 6 (22%) | 0.75 |
Former (%) | 49 (66%) | 18 (67%) | |
Never (%) | 12 (16%) | 3 (11%) | |
Pack-year (IQR) | 30 (15–46) | 30 (10–46) | 0.93 |
COPD (%) | 23 (31%) | 5 (19%) | 0.31 |
FEV1 % predicted (IQR) | 82 (72–95) | 69 (60–81) | 0.07 |
FVC % predicted (IQR) | 87 (77–103) | 80 (57–103) | 0.29 |
FEV1/FVC % ratio (IQR) | 78 (77–80) | 79 (77–82) | 0.66 |
Histology | |||
Adenocarcinoma (%) | 65 (88%) | NA | NA |
Squamous cell (%) | 9 (12%) | NA | |
Stage | |||
I (%) | 34 (46%) | NA | NA |
II (%) | 14 (19%) | NA | |
III (%) | 12 (16%) | NA | |
IV (%) | 14 (19%) | NA | |
Nodule size (cm) | 2.1 (1.6–3.7) | 3 (2.3–4) | 0.70 |
<1 cm | 4 (6%) | 1 (11%) | 0.28 |
1–2 cm | 22 (35%) | 1 (11%) | |
>2 cm | 37 (59%) | 7 (78%) |
Abbreviations: COPD, chronic obstructive pulmonary disease; FEV1, forced expiratory volume in 1 second; FVC, forced vital capacity; IQR, interquartile range; NA, nonapplicable.
Detection of DNA methylation
To detect DNA methylation in free DNA present in the urine, we modified our previous approach to detect shorter fragments of DNA, as described in the Materials and Methods section. We first measured methylation using ΔCt values. In plasma, methylation was detected more frequently in patients with cancer than in controls for all six genes (Figs. 1A and 2A). In urine, CDO1, TAC1, HOXA9, and SOX17 showed significantly more patients with cancer having positive methylation compared with controls (Figs. 1B and 2B), and the quantitation of methylation was similar in urine and plasma when detectable. Table 2 summarizes methylation detection. Sensitivity and specificity for lung cancer diagnosis using individual genes in plasma ranged from 58% to 93% and 28% to 84%, respectively. The frequency of methylation detection in urine versus plasma for patients with cancer was similar for CDO1, SOX17, and ZFP42, but was slightly lower in urine for TAC1, HOXA7, and HOXA9. FPR in plasma ranged from 16% to 72%. When at least three genes had positive methylation in plasma, the sensitivity and specificity for lung cancer diagnosis were 88% and 60%, respectively, with an FPR of 40%. Sensitivity and specificity for lung cancer diagnosis using individual genes in urine ranged from 48% to 92% and 22% to 81%, respectively, with an FPR of 19%–78%. When at least three genes had positive methylation in urine, the sensitivity and specificity for lung cancer diagnosis were 93% and 30%, respectively, with an FPR of 70%. When urine and plasma results were combined, the sensitivity and specificity for lung cancer diagnosis for genes simultaneously methylated in plasma and urine ranged from 27% to 85% and 32% to 96%, respectively, for individual genes, with an FPR of 4%–68% (4%–15% when not considering ZFP42).
When at least three genes had simultaneous positive methylation in both plasma and urine, the sensitivity and specificity for lung cancer diagnosis were 73% and 92%, respectively, with an FPR of 8%.
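The combined rule described above can be sketched in Python as follows. This reflects our reading of the rule, namely that a gene counts only when it is positive in both plasma and urine from the same patient, and that a patient is called positive when at least three genes qualify. The function names and the example patient are hypothetical.

```python
GENES = ("CDO1", "TAC1", "HOXA7", "HOXA9", "SOX17", "ZFP42")


def genes_positive_in_both(plasma: dict, urine: dict) -> list:
    """Genes with detectable methylation in plasma AND urine."""
    return [g for g in GENES if plasma.get(g, False) and urine.get(g, False)]


def combined_call(plasma: dict, urine: dict, min_genes: int = 3) -> bool:
    """Positive when at least `min_genes` genes are methylated in both fluids."""
    return len(genes_positive_in_both(plasma, urine)) >= min_genes


# Hypothetical patient: per-gene methylation calls in each fluid.
plasma = {"CDO1": True, "TAC1": True, "HOXA7": True,
          "HOXA9": False, "SOX17": True, "ZFP42": True}
urine = {"CDO1": True, "TAC1": False, "HOXA7": False,
         "HOXA9": False, "SOX17": True, "ZFP42": True}

print(genes_positive_in_both(plasma, urine))  # ['CDO1', 'SOX17', 'ZFP42']
print(combined_call(plasma, urine))           # True
```

Requiring agreement between the two fluids trades sensitivity for specificity, which is the pattern reported above (73% sensitivity and 92% specificity for the combined rule).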
Plasma | Cancer (n = 74), n | Sensitivity | Control (n = 25), n | Specificity | PPV | NPV | AUC (95% CI) |
---|---|---|---|---|---|---|---|
CDO1 | 56 | 76% | 11 | 56% | 84% | 44% | 0.68 (0.55–0.80) |
TAC1 | 61 | 82% | 10 | 60% | 86% | 54% | 0.73 (0.61–0.86) |
HOXA7 | 55 | 74% | 4 | 84% | 93% | 53% | 0.79 (0.69–0.90) |
HOXA9 | 43 | 58% | 5 | 80% | 90% | 39% | 0.66 (0.54–0.77) |
SOX17 | 59 | 80% | 9 | 64% | 87% | 52% | 0.75 (0.63–0.86) |
ZFP42 | 69 | 93% | 18 | 28% | 79% | 58% | 0.70 (0.58–0.82) |
All (at least 3 positive) | 65 | 88% | 10 | 60% | 87% | 63% | 0.68 (0.56–0.80) |
Urine | Cancer (n = 71), n | Sensitivity | Control (n = 27), n | Specificity | PPV | NPV | AUC (95% CI) |
CDO1 | 51 | 72% | 10 | 63% | 84% | 46% | 0.70 (0.58–0.82) |
TAC1 | 48 | 68% | 7 | 74% | 87% | 47% | 0.70 (0.58–0.83) |
HOXA7 | 36 | 51% | 12 | 56% | 75% | 30% | 0.54 (0.41–0.67) |
HOXA9 | 34 | 48% | 5 | 81% | 87% | 37% | 0.66 (0.54–0.77) |
SOX17 | 56 | 79% | 9 | 67% | 86% | 55% | 0.76 (0.65–0.88) |
ZFP42 | 65 | 92% | 21 | 22% | 76% | 50% | 0.65 (0.52–0.77) |
All (at least 3 positive) | 66 | 93% | 19 | 30% | 78% | 62% | 0.70 (0.58–0.81) |
Plasma and urine | Cancer (n = 71), n | Sensitivity | Control (n = 27), n | Specificity | PPV | NPV | AUC (95% CI) |
CDO1 | 42 | 58% | 4 | 85% | 91% | 42% | 0.69 (0.50–0.82) |
TAC1 | 39 | 53% | 2 | 92% | 95% | 41% | 0.72 (0.59–0.85) |
HOXA7 | 32 | 45% | 4 | 85% | 89% | 37% | 0.70 (0.58–0.82) |
HOXA9 | 20 | 27% | 1 | 96% | 95% | 33% | 0.77 (0.66–0.87) |
SOX17 | 47 | 65% | 3 | 88% | 94% | 48% | 0.78 (0.67–0.89) |
ZFP42 | 60 | 85% | 17 | 32% | 78% | 42% | 0.72 (0.60–0.84) |
All (at least 3 positive) | 52 | 73% | 2 | 92% | 96% | 55% | 0.72 (0.61–0.84) |
Circulating cfDNA tumor fraction
To compare methylation detection in plasma and urine with the tumor fraction in these samples, we identified 16 patients with known driver mutations detected in their lung cancer tissue specimens (11 patients with KRAS mutations and 5 with EGFR mutations). We used a standardized ddPCR on DNA isolated from plasma and urine to assess the number of copies (and mutant fraction) of circulating DNA containing the known mutation. The mutant fraction for each patient in plasma (Fig. 3A) and urine (Fig. 3B) was compared with the methylation quantitation. For patients with detectable DNA mutations, methylation levels were similar to or higher than mutational quantities in plasma, and similar in urine. The detection frequency for mutations, which was 9 of 16 (56%) in plasma and 3 of 16 (19%) in urine, was compared with the methylation detection frequency for these same 16 patients (Fig. 3C). This comparison shows a similar frequency of detection in plasma, but a greater frequency of methylation detection in urine.
Plasma and urine methylation correlation
Genes were simultaneously methylated in both plasma and urine (positive concordance) in a minimum of 22% of patients (e.g., SOX17) to a maximum of 80% of patients (e.g., ZFP42) when all subjects were included (Supplementary Table S3), and nonmethylated status was matched in plasma and urine (negative concordance) in 5% (e.g., ZFP42) to 31% (e.g., HOXA9) of all subjects. Total concordance (simultaneously methylated or simultaneously nonmethylated in both plasma and urine) ranged from 53% (e.g., HOXA9) to 85% (e.g., ZFP42). The level of methylation was also concordant, and the ΔCt values for TAC1, HOXA7, and SOX17 showed significant correlation between plasma and urine when looking at all patients (Supplementary Table S3). However, the relative importance of positive and negative concordance differed between patients with cancer and controls. Among patients with NSCLC, the positive methylation concordance increased to a range of 28%–85% and the negative methylation concordance decreased to a range of 0%–21%, with a total concordance of 48%–85%. Among cancer-free controls, the positive methylation concordance decreased to a range of 4%–68% and the negative methylation concordance increased to a range of 20%–64%, with a total concordance of 52%–88%. For discordant samples, the frequency of positive plasma with negative urine was similar to that of positive urine with negative plasma.
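The concordance measures used above can be computed from paired per-patient calls for a single gene. The Python sketch below is illustrative only; the function name and the five example patients are hypothetical and are not data from the study.

```python
def concordance(plasma_calls: list, urine_calls: list) -> dict:
    """Positive, negative, and total concordance for one gene.

    Each list holds one boolean per patient (True = methylation detected),
    with plasma and urine entries in the same patient order.
    """
    if len(plasma_calls) != len(urine_calls):
        raise ValueError("plasma and urine lists must have the same length")
    n = len(plasma_calls)
    if n == 0:
        raise ValueError("at least one paired sample is required")
    both_pos = sum(1 for p, u in zip(plasma_calls, urine_calls) if p and u)
    both_neg = sum(1 for p, u in zip(plasma_calls, urine_calls) if not p and not u)
    return {
        "positive": both_pos / n,
        "negative": both_neg / n,
        "total": (both_pos + both_neg) / n,
    }


# Hypothetical calls for five patients.
plasma_calls = [True, True, False, False, True]
urine_calls = [True, False, False, True, True]
print(concordance(plasma_calls, urine_calls))
```

Total concordance is the sum of the positive and negative components, which is consistent with the ZFP42 figures quoted above (80% positive plus 5% negative gives 85% total).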
Gene methylation and lung cancer diagnostic accuracy
ROC curves for lung cancer detection were obtained for each single gene, using the normalized methylation ΔCt values calculated as described in Materials and Methods. The AUC values were 0.66–0.79 in plasma samples and 0.54–0.76 in urine samples. The genes with the largest AUC values in plasma were TAC1 AUC: 0.73, 95% CI (0.61–0.86); HOXA7 AUC: 0.79, 95% CI (0.69–0.90); and SOX17 AUC: 0.75, 95% CI (0.63–0.86; Table 2). When examining multigene methylation, having at least three positive methylated genes in plasma had an AUC of 0.68, 95% CI (0.56–0.80). The genes with the largest AUC in urine were: CDO1 AUC: 0.70, 95% CI (0.58–0.82); TAC1 AUC: 0.70, 95% CI (0.58–0.83); and SOX17 AUC: 0.76, 95% CI (0.65–0.88; Table 2). When at least three positive methylated genes in urine were used, the AUC was 0.70, 95% CI (0.58–0.81). The genes with the largest AUC when simultaneously methylated in both plasma and urine were: TAC1 AUC: 0.72, 95% CI (0.59–0.85); HOXA9 AUC: 0.77, 95% CI (0.66–0.87); SOX17 AUC: 0.78, 95% CI (0.67–0.89); and ZFP42 AUC: 0.72, 95% CI (0.60–0.84).
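For readers less familiar with the AUC, it can be read as the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. The Python sketch below computes it directly from that definition. It is not the study's R workflow, it does not compute the confidence intervals reported above, and the scores shown are hypothetical.

```python
def roc_auc(case_scores: list, control_scores: list) -> float:
    """AUC as the fraction of case/control pairs ranked correctly.

    A pair counts 1 when the case score is higher, 0.5 when tied, 0 otherwise.
    Higher scores are assumed to indicate cancer.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical relative methylation quantities for three cases and three controls.
cases = [1e-2, 2e-3, 1e-20]
controls = [1e-21, 1e-20, 1e-3]
print(round(roc_auc(cases, controls), 3))
```

The pairwise loop is quadratic in sample size, which is adequate for a cohort of about 100 subjects; a rank-based formulation would be preferable for large datasets.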
Gene methylation and lung cancer risk association
Univariate logistic regression for risk of having NSCLC was performed for each baseline characteristic to assess for confounders (Supplementary Table S2). To look for other potential confounders for the multivariate analysis, we examined differences in methylation between Caucasians and African Americans. Our study included 51% Caucasians and 38% African Americans, and African Americans have been underrepresented in the scientific literature while carrying a disproportionate share of the lung cancer burden, with an 11% higher incidence rate than their Caucasian counterparts, later stage at diagnosis, and a poorer 5-year overall survival rate (16% in African Americans vs. 19% in Caucasians; refs. 5, 23–27). We did not find any difference in the percentage of positive methylation in any of the genes when comparing Caucasians with African Americans (the sensitivity, specificity, PPV, NPV, FPR, and AUC for each gene in plasma, urine, and both, for Caucasian and African American patients, are given in Supplementary Materials and Methods). In our study, only age was significantly associated with NSCLC risk (P = 0.03) among all baseline characteristics; however, after Bonferroni correction for multiple comparisons, no variable was significantly associated with NSCLC risk (Supplementary Table S2).
Univariate logistic regression analysis showed that the ΔCt methylation value of each one of the genes in plasma was significantly associated with NSCLC risk (Table 3). This association remained statistically significant for CDO1, TAC1, HOXA7, HOXA9, and SOX17 after adjusting for age, race, and smoking pack-years (Bonferroni corrected for multiple comparisons). The logistic regression results confirm those found by directly comparing the percentage of methylation in cases versus controls. In addition, having at least three genes with positive methylation in plasma was significantly associated with NSCLC risk in both univariate and multivariate analysis.
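For a single binary predictor, the OR from univariate logistic regression equals the cross-product ratio of the corresponding 2 × 2 table, so the univariate ORs in Table 3 can be checked against the counts in Table 2. The Python sketch below does this for two plasma rows; the function name is illustrative. The multivariate ORs require fitting the adjusted model and are not reproduced here.

```python
def odds_ratio(case_pos: int, case_neg: int, control_pos: int, control_neg: int) -> float:
    """Cross-product odds ratio: (case_pos * control_neg) / (case_neg * control_pos)."""
    if case_neg == 0 or control_pos == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (case_pos * control_neg) / (case_neg * control_pos)


# Plasma counts from Table 2 (74 cases, 25 controls).
# At least three genes methylated: 65 cases and 10 controls positive.
or_three_genes = odds_ratio(65, 74 - 65, 10, 25 - 10)
# CDO1: 56 cases and 11 controls positive.
or_cdo1 = odds_ratio(56, 74 - 56, 11, 25 - 11)

print(round(or_three_genes, 2), round(or_cdo1, 2))
```

These evaluate to about 10.83 and 3.96, matching the univariate ORs listed for the same markers in Table 3.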
Marker | Univariate OR (95% CI) | Univariate P | Multivariate OR (95% CI) | Multivariate P (not corrected) | Multivariate P (Bonferroni correction) |
---|---|---|---|---|---|
CDO1 methylated in plasma | 3.96 (1.54–10.49) | 0.005 | 4.90 (1.73–4.77) | 0.003 | 0.023 |
TAC1 methylated in plasma | 7.04 (2.65–19.79) | <0.001 | 9.22 (3.11–230.63) | <0.001 | <0.001 |
HOXA7 methylated in plasma | 15.20 (5.05–57.34) | <0.001 | 17.02 (5.16–71.84) | <0.001 | <0.001 |
HOXA9 methylated in plasma | 5.55 (2.00–18.14) | 0.002 | 6.05 (2.05–21.22) | 0.002 | 0.015 |
SOX17 methylated in plasma | 6.99 (2.65–19.65) | <0.001 | 7.03 (2.53–21.19) | <0.001 | 0.002 |
ZFP42 methylated in plasma | 5.37 (1.54–20.09) | 0.009 | 5.13 (1.32–21.49) | 0.019 | 0.135 |
At least 3 genes methylated in plasma | 10.83 (3.87–32.80) | <0.001 | 13.56 (4.33–549.21) | <0.001 | <0.001 |
CDO1 methylated in urine | 4.34 (1.73–11.41) | 0.002 | 6.69 (2.31–22.02) | <0.001 | 0.006 |
TAC1 methylated in urine | 5.96 (2.29–17.11) | <0.001 | 5.30 (1.87–16.84) | 0.003 | 0.018 |
HOXA7 methylated in urine | 1.29 (0.59–3.18) | 0.580 | 1.40 (0.52–3.88) | 0.503 | 1.000 |
HOXA9 methylated in urine | 4.04 (1.47–13.15) | 0.011 | 4.33 (1.46–15.17) | 0.013 | 0.088 |
SOX17 methylated in urine | 7.47 (2.87–20.78) | <0.001 | 8.69 (3.05–27.81) | <0.001 | <0.001 |
ZFP42 methylated in urine | 3.10 (0.88–10.92) | 0.073 | 3.82 (0.98–15.57) | 0.053 | 0.374 |
At least 3 genes methylated in urine | 5.56 (1.66–20.33) | 0.006 | 5.76 (1.60–22.73) | 0.008 | 0.059 |
CDO1 plasma and urine | 7.45 (2.55–27.45) | <0.001 | 12.68 (3.55–61.04) | <0.001 | 0.003 |
TAC1 plasma and urine | 13.76 (3.71–89.64) | <0.001 | 13.81 (3.38–99.40) | <0.001 | 0.010 |
HOXA7 plasma and urine | 4.72 (1.61–17.34) | 0.009 | 4.90 (1.55–19.24) | 0.012 | 0.081 |
HOXA9 plasma and urine | 9.63 (1.85–177.47) | 0.031 | 10.19 (1.83–192.57) | 0.031 | 0.217 |
SOX17 plasma and urine | 14.41 (4.47–65.06) | <0.001 | 18.17 (5.12–91.16) | <0.001 | <0.001 |
ZFP42 plasma and urine | 2.57 (0.87–7.41) | 0.081 | 2.76 (0.86–8.82) | 0.083 | 0.581 |
At least 3 genes in plasma and urine | 31.47 (8.27–208.29) | <0.001 | 69.34 (13.21–721.89) | <0.001 | <0.001 |
Note: Multivariate analysis adjusted by age, race, and pack-years.
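The Bonferroni-corrected column can be illustrated with a minimal sketch. It assumes the correction multiplies each uncorrected multivariate P value by the number of tests and caps the result at 1. The choice of seven tests per sample type (six genes plus the "at least 3 genes" rule) is an assumption inferred from reported pairs such as 0.083 and 0.581; it is not stated in the text. The function name `bonferroni` is hypothetical.

```python
def bonferroni(p_values, n_tests=None):
    """Return Bonferroni-adjusted P values, each capped at 1.0.

    n_tests defaults to the number of P values supplied.
    """
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


# Uncorrected P values as rounded in the table; n_tests=7 is an assumption.
adjusted = bonferroni([0.083, 0.503], n_tests=7)
print([round(p, 3) for p in adjusted])
```

Because the table reports rounded P values, other rows reproduce only approximately under this assumption.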
In urine, univariate logistic regression analysis showed that methylation of CDO1, TAC1, HOXA9, and SOX17 was significantly associated with lung cancer risk. ZFP42 showed a trend toward significance (P = 0.07), but the low specificity of this gene kept the association from reaching statistical significance. These findings remained statistically significant for CDO1, TAC1, and SOX17 after adjusting for age, race, and pack-years (Bonferroni corrected for multiple comparisons). Having at least three genes with positive methylation in urine was also significantly associated with NSCLC risk in univariate analysis.
Univariate logistic regression analysis showed that simultaneous methylation of the same gene in both plasma and urine was significantly associated with lung cancer risk for CDO1, TAC1, HOXA7, HOXA9, and SOX17. These findings remained statistically significant for CDO1, TAC1, and SOX17 after adjusting for age, race, and pack-years (Bonferroni corrected for multiple comparisons). Finally, having at least three genes with simultaneous positive methylation in both plasma and urine was significantly associated with NSCLC risk in both univariate and multivariate analysis.
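The combined rule described above (at least three genes positive in both plasma and urine) can be sketched as follows. This is an illustration only: the function names, the dictionary layout, and the example patient are hypothetical, and the sketch assumes each gene has already been called methylation-positive or methylation-negative per sample type.

```python
GENES = ["CDO1", "TAC1", "HOXA7", "HOXA9", "SOX17", "ZFP42"]


def concordant_gene_count(plasma, urine):
    """Count genes called methylation-positive in BOTH plasma and urine."""
    return sum(1 for g in GENES if plasma.get(g, False) and urine.get(g, False))


def panel_positive(plasma, urine, min_genes=3):
    """Apply the 'at least min_genes genes in plasma and urine' rule."""
    return concordant_gene_count(plasma, urine) >= min_genes


# Hypothetical patient: per-gene calls (True = methylation detected).
plasma_calls = {"CDO1": True, "TAC1": True, "HOXA7": True,
                "HOXA9": False, "SOX17": True, "ZFP42": False}
urine_calls = {"CDO1": True, "TAC1": True, "HOXA7": False,
               "HOXA9": False, "SOX17": True, "ZFP42": False}

print(concordant_gene_count(plasma_calls, urine_calls))
print(panel_positive(plasma_calls, urine_calls))
```

In this example HOXA7 is positive in plasma only, so it does not count toward the concordant total.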
Subset analysis by stage
When the frequency of methylation detection was compared between early (stage I and II) and late (stage III and IV) disease in plasma and urine, many loci showed a higher percentage of positive methylation in late stages than in early stages (Fig. 2C and D; Supplementary Fig. S1 showing quantitative methylation). In plasma, methylation was detected more frequently in early-stage patients than in controls for all genes (Supplementary Fig. S2A). In urine, CDO1, TAC1, and SOX17 were positive for methylation in significantly more early-stage patients than controls (Supplementary Fig. S2B). In plasma, methylation was likewise detected more frequently in late-stage patients than in controls for all genes (Supplementary Fig. S3A). In urine, five genes were positive for methylation in significantly more late-stage patients than controls (Supplementary Fig. S3B).
Overall sensitivity and AUC were slightly higher for late stages (Supplementary Table S5) compared with early stages (Supplementary Table S4). Sensitivity for early-stage NSCLC diagnosis using individual genes from plasma ranged from 54% to 92% with AUC ranging from 64% to 73%. Sensitivity for late-stage NSCLC using individual genes from plasma ranged from 65% to 100% with AUC ranging from 69% to 91%. Sensitivity and specificity for early-stage NSCLC diagnosis using individual genes from urine ranged from 40% to 91% with AUC ranging from 49% to 74%. Sensitivity and specificity for late-stage NSCLC using individual genes from urine ranged from 63% to 96% with AUC ranging from 62% to 86%.
Discussion
The results from this study show that methylation can be detected more frequently in patients with cancer compared with controls in all six genes in plasma and in CDO1, TAC1, HOXA9, and SOX17 in urine. When at least three genes had simultaneous positive methylation in both plasma and urine, the sensitivity and specificity for lung cancer diagnosis were 73% and 92%, respectively, with an FPR of 8%. These data suggest that epigenetic biomarkers from liquid biopsies based on methylation detection from plasma and urine may complement LDCT screening to help in the decision process to proceed with further invasive diagnostic/treatment procedures. The FPRs in this study are significantly lower than those reported for LDCT screening in the NLST (NLST FPR 24%; ref. 5) and in Lung Cancer Screening in the Veterans Health Administration, with an FPR of 98% (28).
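The relationship between the reported specificity (92%) and FPR (8%) follows from FPR = 1 − specificity. A minimal sketch of the calculation from confusion-matrix counts is given below. The counts are hypothetical, chosen only to reproduce the reported percentages on a base of 100 cases and 100 controls; they are not the study's actual counts, and the function name is an illustrative choice.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and false-positive rate from counts."""
    sensitivity = tp / (tp + fn)   # fraction of cases testing positive
    specificity = tn / (tn + fp)   # fraction of controls testing negative
    fpr = fp / (fp + tn)           # equals 1 - specificity
    return {"sensitivity": sensitivity, "specificity": specificity, "fpr": fpr}


# Hypothetical counts (not from the study): 100 cases and 100 controls.
metrics = diagnostic_metrics(tp=73, fn=27, tn=92, fp=8)
print({name: round(value, 2) for name, value in metrics.items()})
```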
The use of noninvasive liquid biopsies for cancer monitoring, and most recently for early cancer diagnosis, is becoming increasingly accepted in oncology, given recent developments that overcome the challenge of detecting minute quantities of DNA in body fluids with low DNA yield (29, 30). Previous studies sought to improve lung cancer detection accuracy by using molecular biomarkers obtained from noninvasive liquid biopsies (31–37). In addition, several studies found that DNA methylation could be associated with lung cancer independently of pack-years (38–41). However, none of these tests achieved sensitivity and specificity adequate for adoption in lung cancer screening (31–37, 42–44). With improvements in DNA extraction methods and processing for methylation detection, along with the use of highly prevalent cancer-specific methylation targets, we believe we have overcome these limitations and optimized our approach for methylation detection in liquid biopsies (10, 17). In previous studies, we showed that MOB can reduce sample loss, thereby increasing DNA methylation detection sensitivity. In this study, we further optimized an ultrasensitive detection strategy based on MOB and real-time qMSP for DNA methylation detection from plasma and urine. The real-time qMSP assay used in this study can detect single molecules containing dense hypermethylation with the pattern for which the assay has been designed (fully methylated). It is less efficient for incomplete methylation in this region, but may also detect partial methylation. Other methods able to distinguish partial methylation (including bisulfite sequencing) lack sensitivity for the rare molecules present in plasma or urine and, without extremely deep reads (100,000× coverage), would not approach the necessary level of sensitivity.
We have developed approaches that can detect partial methylation and epigenetic heterogeneity in difficult samples such as liquid biopsies that contain low fractional concentrations of circulating tumor DNA (ctDNA) and rare epigenetic subclonal populations (45, 46), but these approaches are not yet capable of analyzing a large number of samples and would not be suitable for the very short DNA fragments found in urine. Our previous publications provided additional data to address this question (10). However, methylation levels vary in the tumor, and most plasma levels are lower than expected from the tissue because the fraction of plasma cfDNA derived from the tumor is much lower than the cellularity of the tumor itself. We have also recently published quantitative methylation of tumor versus benign nodules for many of these genes, which indeed shows some differences in methylation between cancer and normal, and quantitatively between different tumors (47). The majority of tumors have high levels of methylation, while the majority of benign lesions have no detectable methylation. For benign nodules with detectable methylation, the quantity is similar to, but in some cases lower than, that in tumors. We can confirm the effect that dilution of tumor DNA in plasma has on the quantity of methylation detected in plasma. As previously published, the quantity of methylation, when detected in the plasma, is lower than that observed in tumor tissue (10).
Urine has been shown to be a source for detection of specific tumor genetic mutations in ctDNA from different types of cancer including urothelial, colon, and lung cancer (11–13). For this reason, we further optimized our previously published approach by using shorter amplicons to detect methylation changes in the smaller DNA fragments present in urine. Urine samples have advantages compared with other body fluids. They are obtained noninvasively and easily, larger volumes can be collected, they have fewer processing limitations than other samples, they can be stored at room temperature, and their DNA content is stable for longer periods of time than that of other body fluids. Urine samples therefore have the potential to be easily implemented in primary care practice.
We observed some differences in sensitivity for detection according to stage, which for most genes was greater for late stage (III–IV) than early stage (I–II). This is not surprising, given other studies showing greater sensitivities for detecting late-stage cancers compared with early-stage cancers, primarily related to increased levels of ctDNA with larger tumor burden (29). While this is one explanation, an additional factor could be increased tumor DNA methylation associated with lung cancer progression. However, there is no evidence of a stage-dependent increase in the presence of DNA methylation at these loci, and indeed, these loci were chosen because they are frequently methylated in early-stage lung cancer (9, 10), making them optimal as early detection biomarkers. This suggests that tumor burden, with its increased levels of tumor-derived cfDNA, is the likely reason for differences in detection of DNA methylation.
In addition, in this study, we note some differences in methylation detection between plasma and urine, with a trend toward better methylation detection in plasma compared with urine. This was also reflected in the circulating free DNA mutation tumor fraction, which was detected in 56% of plasma samples and 19% of urine samples. The differences in methylation detection between plasma and urine may be multifactorial. First, urine is a plasma ultrafiltrate, and thus only a portion of plasma DNA may be filtered into the urine. In addition, only small DNA molecules make it into the urine stream. To detect DNA methylation in urine, we redesigned this assay to detect even shorter amplicons, but even with this approach some methylated cfDNA molecules may be too small to be detected. Finally, there could be stability issues from exposure of cfDNA to the kidney or urothelial tract environments that could result in loss of tumor DNA in urine compared with plasma. However, despite these differences, urine should be further explored for its utility for cancer detection as discussed in the Introduction.
For our approach, we report a relatively quantitative (normalized to input amplifiable DNA) measure of the abnormal methylation among cfDNA, and this regional DNA methylation was previously shown to be a cancer-specific change (9, 10). In lung cancer, detection of EGFR mutations in urine has been shown to be comparable with blood-based detection (13, 48–50). This also appears to be from cfDNA, and sensitivities for detecting EGFR mutations in urine are similar to what we observe for patients with metastatic lung cancer, where urine mutation detection has been reported. In the majority of patients, there was concordance for methylation detection between plasma and urine. Discordance in patients with cancer was split between positive methylation detection in urine and plasma, and most likely reflects stochastic detection from sampling very rare tumor cfDNA fragments. If true, additional sampling time points or greater volume of plasma or urine may increase sensitivity.
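The suggestion that larger sample volumes or additional time points may increase sensitivity can be illustrated with a simple sampling model. The sketch below assumes, purely for illustration, that methylated tumor cfDNA fragments are Poisson distributed in a sample, that a single captured fragment is enough for a positive call, and that repeated draws are independent. The fragment concentration used is a hypothetical value, not a measurement from this study.

```python
import math


def detection_probability(mean_fragments_per_ml, volume_ml, n_draws=1):
    """Probability that at least one draw contains >= 1 tumor fragment.

    Assumes Poisson-distributed fragment counts and independent draws.
    """
    expected = mean_fragments_per_ml * volume_ml
    p_single = 1.0 - math.exp(-expected)      # P(at least one fragment in a draw)
    return 1.0 - (1.0 - p_single) ** n_draws  # P(at least one positive draw)


# Hypothetical concentration of 0.2 methylated fragments per mL.
for volume in (2, 5, 10):
    print(volume, round(detection_probability(0.2, volume), 3))
```

Under these assumptions, two independent 5 mL draws give the same detection probability as a single 10 mL draw, which is why either strategy could plausibly raise sensitivity when tumor fragments are rare.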
This study strongly suggests that methylation of specific genes hypermethylated in NSCLC can be detected in urine by using assay modifications for the more fragmented cfDNA found in urine. This study did not fully address the clinical utility of this approach: as discussed in the Materials and Methods section, patients selected for inclusion were 50 years and older and had been referred for surgical resection after a CT scan raised suspicion of lung cancer. The utility of methylation detection would likely not be for the smallest nodules (which are lowest risk and can be followed) or large lesions (where risk is high and evaluation is needed). Rather, the indeterminate nodule (8 to 20–30 mm) carries a cancer risk for which further discrimination is most needed, and this is where the epigenetic liquid biopsy test could potentially help.
As a first proof of principle, this study has limitations. These include a relatively small sample size, with its associated limited power to accurately assess the sensitivity and specificity of methylation detection in urine. In addition, because this study included patients with NSCLC of all stages, with only 65% of patients diagnosed at early stages, and a relatively small number of controls, future prospective studies with larger sample sizes and limited to early-stage cancer are needed to fully explore and validate this approach for early detection.
Conclusion
This study suggests that epigenetic biomarkers from liquid biopsies based on methylation detection from plasma and urine could be used as an adjunct to CT screening to guide decision-making regarding further invasive procedures in patients with pulmonary nodules. Plasma and urine methylation detection yields low FPRs, and the methylation of these genes is associated with a high NSCLC risk independent of age, race, and smoking pack-years.
Disclosure of Potential Conflicts of Interest
T. Ito is an employee/paid consultant for Toray Medical Co., Ltd. R.C. Gaba reports receiving other commercial research support from Guerbet USA LLC and Janssen Research & Development, and holds ownership interest (including patents) in Sus Clinicals, Inc. R.A. Winn is an employee/paid consultant for Genentech. M.V. Brock is an employee/paid consultant for Cepheid. A. Hulbert is an employee/paid consultant for Everest Detection. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: L.E. Feldman, I. Jusue-Torres, R.A. Winn, J.G. Herman, A. Hulbert
Development of methodology: B. Liu, A. Kottorou, K. Rodgers, C. Chen, I. Jusue-Torres, R.A. Winn, J.G. Herman, A. Hulbert
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): B. Liu, J. Ricarte Filho, A. Mallisetty, K. Rodgers, K. Holmes, N. Gastala, K. Valyi-Nagy, R.C. Gaba, C. Ascoli, M. Pasquinelli, L.E. Feldman, R.A. Winn, M.V. Brock, A. Hulbert
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): B. Liu, J. Ricarte Filho, K. Valyi-Nagy, I. Jusue-Torres, R.A. Winn, J.G. Herman, A. Hulbert
Writing, review, and/or revision of the manuscript: B. Liu, J. Ricarte Filho, A. Mallisetty, C. Villani, A. Kottorou, C. Chen, T. Ito, K. Holmes, N. Gastala, K. Valyi-Nagy, O. David, R.C. Gaba, M. Pasquinelli, L.E. Feldman, M.G. Massad, T.-H. Wang, I. Jusue-Torres, E. Benedetti, R.A. Winn, M.V. Brock, J.G. Herman, A. Hulbert
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): K. Rodgers, M. Pasquinelli, M.G. Massad, E. Benedetti, A. Hulbert
Study supervision: L.E. Feldman, M.G. Massad, I. Jusue-Torres, E. Benedetti, M.V. Brock, A. Hulbert
Other (assisted with preparation and execution of biorepository protocol for tissue acquisition, provided expertise related to pathology of lung cancer, reviewed manuscript): O. David
Acknowledgments
This work was supported by grants from The University of Illinois at Chicago Cancer Center, EDRN U01CA214165-03 and DOD W81XWH-12-1-0323, and in part, under a grant with the Pennsylvania Department of Health.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.