Abstract
Purpose: To establish a novel panel of cancer-specific methylated genes for cancer detection and prognostic stratification of early-stage non–small cell lung cancer (NSCLC).
Experimental Design: Identification of differentially methylated regions (DMR) was performed with bumphunter on “The Cancer Genome Atlas (TCGA)” dataset, and clinical utility was assessed using quantitative methylation-specific PCR assay in multiple sets of primary NSCLC and body fluids that included serum, pleural effusion, and ascites samples.
Results: A methylation panel of 6 genes (CDO1, HOXA9, AJAP1, PTGDR, UNCX, and MARCH11) was selected from the TCGA dataset. Promoter methylation of the gene panel was detected in 92.2% (83/90) of the training cohort with a specificity of 72.0% (18/25) and in 93.0% (40/43) of an independent cohort of stage IA primary NSCLC. In serum samples from the latter 43 stage IA subjects and 42 population-matched control subjects, the gene panel yielded a sensitivity of 72.1% (31/43) and a specificity of 71.4% (30/42). Similar diagnostic accuracy was observed in pleural effusion and ascites samples. A prognostic risk category based on the methylation status of CDO1, HOXA9, PTGDR, and AJAP1 refined the risk stratification for outcomes as an independent prognostic factor for early-stage disease. Moreover, the paralog group for HOXA9, predominantly overexpressed in subjects with HOXA9 methylation, was associated with poor outcomes.
Conclusions: Promoter methylation of a panel of 6 genes has potential for use as a biomarker for early cancer detection and to predict prognosis at the time of diagnosis. Clin Cancer Res; 23(22); 7141–52. ©2017 AACR.
Lung cancer is the leading cause of cancer-related deaths worldwide, and most patients are diagnosed at advanced stage because of a lack of symptoms at early stage of the disease, resulting in poor outcomes. However, a considerable heterogeneity of outcomes still remains, even in the early stage of the disease. Thus, identifying biomarkers for minimally invasive detection and prognosis of early-stage disease is needed. In this study, we identified a panel of novel cancer-specific methylated genes and evaluated the detection accuracy of these genes using serum samples from patients with stage IA disease. Moreover, we constructed a prognostic risk category based on the methylation status of different combinations of tested genes, and the clinical utility was validated in an independent cohort of stage IA disease. Testing of this gene panel may facilitate the development of improved diagnosis, clinical management, and outcome prediction.
Introduction
Lung cancer is the second most common cancer and the leading cause of cancer-related deaths worldwide. A hallmark of lung cancer development is the sequential accumulation of epigenetic and genetic abnormalities in somatic cells (1), and a better understanding of the intrinsic biological traits that underlie the initiation and progression of NSCLC may be essential for developing biomarkers to manage this disease appropriately.
Methylation is one of the most common epigenetic alterations, and aberrant methylation in CpG dinucleotide-rich clusters (CpG islands) of gene promoter regions interferes with gene transcription and can eliminate tumor suppressor gene (TSG) function (2). This aberrant methylation can be an early event in cancer progression, indicating its potential as a biomarker for cancer detection (2). We, along with others, previously reported the presence of DNA methylation in body fluids, including serum, sputum, and bronchoalveolar lavage, in NSCLC (3–5). Because approximately two-thirds of patients have late-stage tumors at the time of diagnosis owing to a lack of symptoms at the early stage of the disease, assessing methylation in body fluids may be a promising, minimally invasive approach for screening high-risk groups such as smokers and for follow-up of early-stage disease after surgical resection. Furthermore, even when the disease is diagnosed at an early stage, considerable heterogeneity of outcomes remains (6). Thus, biomarkers for risk stratification at the time of diagnosis of early-stage disease are needed in clinical practice. Although The Cancer Genome Atlas (TCGA) attempted to elucidate genome-wide DNA methylation profiles of NSCLC (7, 8), most studies have selected candidate genes based on the β-value of individual promoter CpG islands (9, 10). However, functionally relevant traits have generally been associated with differentially methylated regions (DMR) rather than with a single differentially methylated CpG island (11).
Homeobox (HOX) genes, organized into four clusters (HOXA, HOXB, HOXC, and HOXD) on different chromosomes, have a common homeodomain and act as transcription factors (12). Their expression pattern is tightly regulated to direct the formation of body structures during embryonic development (13). Genes in the 3′ region of the HOXA and HOXB clusters are predominantly expressed in normal adult lung tissues, whereas the HOXC and HOXD clusters are expressed in fetal lungs, in diseased lungs such as those with primary pulmonary hypertension, and in NSCLC (13, 14), suggesting that an altered pattern of HOX gene expression may contribute to lung diseases including cancer. The HOXA cluster is frequently methylated in NSCLC (10, 15), and promoter methylation of HOXA5 and HOXA11 attenuates their tumor-suppressive function via transcriptional silencing (16, 17). On the other hand, blocking HOX activity reduced in vivo tumor growth (14), indicating the tumor-promoting properties of HOX genes. Thus, dysregulated HOX genes may be important for both oncogenesis and tumor suppression in NSCLC. However, the role of aberrant HOX gene expression in a particular tumor remains unclear.
Here, we first identified a candidate methylation panel of 6 genes, including HOXA9, using the DMR analysis of the TCGA dataset. We then determined the clinical utility of this gene panel for cancer detection and prognostic prediction by analyzing 90 primary NSCLC, 43 primary NSCLC with matched serum samples from stage IA lung adenocarcinoma (LUAD), 40 serum samples from stage IA lung squamous cell carcinoma (LUSC), 70 pleural effusions (PE), and 49 ascites samples. Moreover, we found compensatory overexpression of the paralog group (HOXB9, HOXC9, and HOXD9) for HOXA9 transcriptional silencing via its promoter methylation, resulting in poor outcomes.
Materials and Methods
DMR discovery and panel selection using bump hunting
We used the bump hunting method to perform an epigenome-wide analysis of the LUAD methylome and to identify DMRs of biological interest from methylation array data (11). To this end, we conducted two separate epigenome-wide analyses using the minfi package in Bioconductor (18), as we described previously (19). Briefly, we performed an unbiased epigenome-wide DNA methylation analysis to identify DMRs in 359 primary, chemotherapy-naïve LUAD samples and 42 normal lung tissue samples from the TCGA dataset. We used the following statistical genomics criteria to select candidate genes associated with LUAD in this biomarker identification and validation study: significant DMRs (family-wise error rate [FWER] area P < 0.001) located in a CpG island within 200 bp upstream or downstream of the 5′ end of the gene. We subsequently selected genes for downstream validation using a candidate gene approach to prioritize methylated biomarkers with known biological function in lung cancer.
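The selection criteria above can be illustrated as a simple filter over bump-hunting output. The following is a minimal Python sketch, not the authors' pipeline (which used minfi and bump hunting in Bioconductor). The `DMR` record, its field names, and the function `passes_criteria` are hypothetical, and reading "up to 200 bp upstream and downstream" as "the DMR overlaps the ±200-bp window around the 5′ end" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class DMR:
    """Hypothetical record for one candidate region from bump hunting."""
    chrom: str
    start: int            # region start coordinate (bp)
    end: int              # region end coordinate (bp)
    fwer_area: float      # FWER-adjusted P value based on bump area
    in_cpg_island: bool   # True if the region lies in an annotated CpG island


def passes_criteria(dmr, gene_chrom, gene_5prime, window=200, alpha=0.001):
    """Return True if the DMR is significant, lies in a CpG island, and
    overlaps the +/- `window` bp interval around the gene's 5' end."""
    if dmr.chrom != gene_chrom:
        return False
    if dmr.fwer_area >= alpha or not dmr.in_cpg_island:
        return False
    lo, hi = gene_5prime - window, gene_5prime + window
    # Two closed intervals overlap when each starts before the other ends.
    return dmr.start <= hi and dmr.end >= lo
```

Genes whose promoter region passes this filter would then go forward to the candidate gene prioritization step described above.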
Tissue samples
Informed consent was obtained from all patients before collecting samples. Approval for research on human subjects was obtained from the Johns Hopkins University Institutional Review Boards. This study qualified for an exemption under the U.S. Department of Health and Human Services policy for the protection of human subjects [45 CFR 46.101(b)]. The demographic and clinical characteristics of all the cohorts are summarized in Table 1.
Samples | Primary tumor: training cohort (a) | Primary tumor: validation cohort (b) | Serum: stage IA LUAD (b) | Serum: stage IA LUSC | Serum: control | |
---|---|---|---|---|---|---|
Patients | (n = 90) | (n = 43) | (n = 43) | (n = 40) | (n = 42) | |
Age (years) | ||||||
Mean ± SEM (years) | 64.61 ± 1.13 | 70.02 ± 8.92 | 70.02 ± 8.92 | 71.80 ± 9.06 | 66.65 ± 1.02 | |
Range (years) | 41–86 | 46–88 | 46–88 | 49–87 | 49–76 | |
Race | ||||||
Caucasian (%) | 60 (80.0) | 39 (90.7) | 39 (90.7) | 36 (90.0) | 40 (95.2) | |
Asian (%) | 1 (1.3) | 4 (9.3) | 4 (9.3) | 2 (5.0) | 1 (2.4) | |
Others (%) | 14 (18.7) | 0 (0.0) | 0 (0.0) | 2 (5.0) | 1 (2.4) | |
Gender | ||||||
Female (%) | 37 (49.3) | 29 (67.4) | 29 (67.4) | 17 (42.5) | 27 (64.3) | |
Male (%) | 38 (50.7) | 14 (32.6) | 14 (32.6) | 23 (57.5) | 15 (35.7) | |
Smoking history | ||||||
Absence (%) | 29 (38.7) | 16 (37.2) | 16 (37.2) | 1 (2.5) | 12 (28.6) | |
Presence (%) | 46 (61.3) | 27 (62.8) | 27 (62.8) | 39 (97.5) | 30 (71.4) | |
Tumor size | ||||||
Mean ± SEM (cm) | 3.87 ± 0.21 | 1.73 ± 0.60 | 1.73 ± 0.60 | 1.72 ± 0.73 | — | |
Histology | — | |||||
Adenocarcinoma (%) | 60 (66.7) | 43 (100.0) | 43 (100.0) | — | — | |
Squamous carcinoma (%) | 25 (27.8) | — | — | 40 (100.0) | — | |
Large cell carcinoma (%) | 5 (5.5) | — | — | — | — | |
Stage | — | |||||
Stage I (%) | 31 (41.3) | 43 (100.0) | 43 (100.0) | 40 (100.0) | — | |
Stage II (%) | 22 (29.3) | — | — | — | — | |
Stage III/IV (%) | 22 (29.3) | — | — | — | — | |
Adjuvant chemotherapy | — | |||||
Absence (%) | unknown | 43 (100.0) | 43 (100.0) | 40 (100.0) | — | |
Recurrence | — | |||||
Absence (%) | 40 (53.3) | 36 (83.7) | 36 (83.7) | 33 (82.5) | — | |
Presence (%) | 35 (46.7) | 7 (16.3) | 7 (16.3) | 7 (17.5) | — | |
Samples | Pleural effusion: tumor, cytology positive | Pleural effusion: tumor, cytology negative | Pleural effusion: normal (c) | Ascites: tumor, cytology positive | Ascites: tumor, cytology negative | Ascites: normal (d) |
Patients | (n = 37) | (n = 16) | (n = 17) | (n = 24) | (n = 11) | (n = 14) |
Age (years) | ||||||
Mean ± SEM (years) | 62.19 ± 2.47 | 66.69 ± 2.57 | 65.53 ± 3.54 | 61.50 ± 2.18 | 64.18 ± 4.70 | 53.07 ± 3.20 |
Range (years) | 26–92 | 49–84 | 35–84 | 46–84 | 39–90 | 30–73 |
Race | ||||||
Caucasian (%) | 22 (59.5) | 14 (87.5) | 10 (58.8) | 11 (45.8) | 7 (63.6) | 7 (50.0) |
Others (%) | 15 (40.5) | 2 (12.5) | 7 (41.2) | 13 (54.2) | 4 (36.4) | 7 (50.0) |
Gender | ||||||
Female (%) | 24 (64.9) | 7 (43.8) | 9 (52.9) | 13 (54.2) | 3 (27.3) | 4 (28.6) |
Male (%) | 13 (35.1) | 9 (56.2) | 8 (47.1) | 11 (45.8) | 8 (72.7) | 10 (71.4) |
Primary tumor | ||||||
Lung (%) | 19 (51.4) | 1 (6.3) | — | 2 (100.0) | 0 (0.0) | — |
Others (%) | 18 (48.6)e | 15 (93.7)f | — | 22 (100.0)g | 11 (100.0)h | — |
Histology | ||||||
Adenocarcinoma (%) | 27 (73.0) | — | — | 24 (100.0) | — | — |
Squamous carcinoma (%) | 1 (2.7) | — | — | 0 (0.0) | — | — |
Others (%) | 9 (24.3) | — | — | 0 (0.0) | — | — |
(a) Fifteen samples with LUAD did not have detailed information on clinicopathologic features.
(b) The paired primary tumor and serum samples from stage IA LUAD were used as a validation cohort of primary tumor and a serum cohort, respectively.
(c) Normal samples include pleural effusion from subjects with hypertension, rib fracture, and a valvular disease of the heart.
(d) Normal samples include ascites from subjects with hepatitis C, lung transplant, and kidney transplant.
(e) Others include 6 gastrointestinal cancer, 2 hepatobiliary-pancreatic cancer, 4 breast cancer, 2 germ cell tumor, 1 gynecologic cancer, 1 hematologic tumor, 1 anaplastic ependymoma, and 1 renal cell carcinoma.
(f) Others include 4 gastrointestinal cancer, 3 hepatobiliary-pancreatic cancer, 1 gynecologic cancer, 1 hematologic tumor, 1 renal cell carcinoma, 1 prostate cancer, and 4 adenocarcinoma of unknown origin.
(g) Others include 5 gastrointestinal cancer, 8 hepatobiliary-pancreatic cancer, 4 breast cancer, 5 gynecologic cancer, 1 anaplastic ependymoma, and 1 renal cell carcinoma.
(h) Others include 2 gastrointestinal cancer, 6 hepatobiliary-pancreatic cancer, 1 gynecologic cancer, and 2 hematologic tumor.
DNA extraction and bisulfite treatment
DNA was extracted using the standard phenol–chloroform extraction protocol as described previously (5). Bisulfite treatment was conducted with an EpiTect Bisulfite Kit (Qiagen).
Conventional methylation-specific PCR and quantitative MSP
Methylation-specific PCR (MSP) primers were designed for all 30 genes to test for the presence of methylated alleles in the CpG island promoter region. PCR products were separated by electrophoresis on 1.5% agarose gels stained with ethidium bromide and imaged in the Gel Doc XR with Quantity One software, version 4.6.1 (Bio-Rad).
For quantitative MSP (Q-MSP), amplification reactions were done in a 7900HT Fast Real-Time PCR System (Life Technologies). Results were analyzed by Sequence Detector System (SDS) software (Applied Biosystems). The sequences of primers and probes used in this study are shown in Supplementary Table S1.
Cell lines and 5-Aza-2′-deoxycytidine treatment
NSCLC cell lines NCI-H226, NCI-H1437, NCI-H1703, and NCI-H1975 were obtained from and propagated according to the recommendations of the ATCC. Cells were treated with 5 μmol/L 5-Aza-2′-deoxycytidine (5-Aza-dC; Sigma-Aldrich), as described previously (20).
qRT-PCR
Total RNA from cells and formaldehyde-fixed paraffin-embedded human tissues was isolated using the RNeasy Plus Mini Kit (Qiagen) and the RecoverAll Total Nucleic Acid Isolation Kit (Ambion), respectively. qRT-PCR was performed using the Fast SYBR Green Master Mix (Thermo Fisher Scientific).
Statistical analysis
For continuous variables, data are expressed as mean ± SEM. Two groups were compared with a two-tailed Student t test or a Wilcoxon–Mann–Whitney test, where appropriate. Multiple groups were compared with a Kruskal–Wallis test followed by a post hoc Dwass–Steel test. Categorical variables were analyzed with a Fisher exact test or a χ2 test. Overall survival (OS) time was calculated from the date of surgery to the date of death from any cause, or censored at the last follow-up. The Kaplan–Meier method was used to estimate the distributions of OS, and the log-rank test was used to compare the distributions of survival time. Univariate and multivariate prognostic analyses were performed using the Cox proportional hazard model, which yields the adjusted HR and the 95% confidence interval (CI). The level of statistical significance was set at P < 0.05. All statistical analyses were conducted with the JMP 12 software package (SAS Institute).
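To make the Kaplan–Meier step concrete, the following minimal Python sketch computes the product-limit survival estimate from follow-up times and event indicators. It is an illustration only and not the authors' analysis, which was run in JMP 12; the function name `kaplan_meier` and the toy data in the usage lines are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up time for each subject (e.g., months from surgery)
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        # Both deaths and censored subjects leave the risk set after t.
        at_risk -= removed
    return curve


# Toy example (hypothetical data): five subjects, two of them censored.
toy_curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1])
```

Censored subjects contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why OS can be estimated even when follow-up is incomplete.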
For details, see Supplementary Materials and Methods.
Results
Identification of cancer-specific methylated genes by analyzing TCGA dataset
To identify DMRs in LUAD, we conducted an epigenome-wide analysis using the bump hunting method in TCGA LUAD samples and selected 30 genes based on the criteria described in Materials and Methods (Supplementary Table S2). We then screened the methylation status of these 30 genes using conventional MSP in 8 matched pairs of primary LUAD tumors and adjacent normal samples. Among these 30 genes, the promoters of cysteine dioxygenase type 1 (CDO1), adherens junctions-associated protein 1 (AJAP1), prostaglandin D2 receptor (PTGDR), homeobox A9 (HOXA9), membrane-associated ring-CH-type finger 11 (MARCH11), and UNC homeobox (UNCX) were the most frequently methylated in tumor samples (sensitivity ranging from 62.5% to 87.5%, with 100% specificity for all 6 genes), suggesting that these are cancer-specific methylated genes (Fig. 1A). Indeed, these 6 genes showed higher methylation values in tumors than in normal samples in the TCGA dataset (Supplementary Fig. S1A), and the methylation of the promoter CpG islands of these 6 genes was confirmed by bisulfite sequencing analysis (Supplementary Fig. S1B). Moreover, these 6 genes showed significantly decreased expression levels in tumors compared with adjacent normal samples, consistent with their methylation status (Fig. 1B); this finding was confirmed in the TCGA dataset (Supplementary Fig. S2A). In addition, an inverse linear correlation between promoter methylation and expression level was observed in TCGA samples (Supplementary Fig. S2B). After treatment with the demethylating agent 5-Aza-dC, expression of these 6 genes was robustly reactivated in a majority of cell lines with promoter methylation (Fig. 1C). Collectively, these findings suggest that promoter methylation of these 6 genes is a frequent and cancer-specific event and may act as a regulatory mechanism for their transcriptional silencing.
Methylation frequency and association with clinicopathologic features in primary lung cancer samples
To confirm the cancer specificity of methylation events of these 6 genes, a Q-MSP assay for each gene was conducted on 25 primary NSCLC and matched adjacent normal samples. All 6 genes showed significantly higher methylation values in tumors than in their corresponding adjacent normal samples (Fig. 2A). To assess the prevalence of methylation of these genes more broadly, we performed Q-MSP assays on an additional 65 primary NSCLC samples (90 samples in total). The demographic and clinical characteristics of this cohort are summarized in Table 1. Again, higher methylation values of these 6 genes were observed in tumor samples (Fig. 2B). The optimal cut-off value for distinguishing between tumor (n = 90) and normal samples (n = 25) was calculated using a ROC analysis for each gene. Using the optimal methylation cut-off value for each gene, the observed sensitivity of an individual gene for cancer detection ranged from 51.7% to 77.8%, and the specificity ranged from 72.0% to 88.0% (Supplementary Table S3). We further assessed the methylation pattern spectrum of these 6 genes. When we considered the methylation of at least one of 5 genes (AJAP1, CDO1, UNCX, HOXA9, and MARCH11), the sensitivity and specificity were 92.2% (83/90) and 72.0% (18/25), respectively (Fig. 2C). Although the combination panel of all 6 genes detected almost all of the subjects with tumors (87/90, 96.7%), the specificity decreased to 60.0% (15/25).
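The cut-off selection and the "at least one gene methylated" rule can be sketched as follows. This is a minimal Python illustration under stated assumptions: the text says only that the optimal cut-off came from a ROC analysis, so the use of Youden's index (sensitivity + specificity − 1) here is an assumption, and the function names and the dictionary-based sample layout are hypothetical.

```python
def youden_cutoff(tumor_values, normal_values):
    """Pick the cut-off that maximizes sensitivity + specificity - 1.

    A sample is called methylation positive when its value is >= cut-off.
    Using Youden's index as the ROC criterion is an assumption.
    """
    best_cut, best_j = None, -1.0
    for cut in sorted(set(tumor_values) | set(normal_values)):
        sens = sum(v >= cut for v in tumor_values) / len(tumor_values)
        spec = sum(v < cut for v in normal_values) / len(normal_values)
        j = sens + spec - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut


def panel_positive(sample, cutoffs):
    """A sample is panel positive if at least one gene reaches its cut-off.

    sample  : dict mapping gene name -> methylation value
    cutoffs : dict mapping gene name -> cut-off for that gene
    """
    return any(sample[gene] >= cut for gene, cut in cutoffs.items())


def sensitivity_specificity(tumors, normals, cutoffs):
    """Return (sensitivity, specificity) of the panel on labeled samples."""
    true_pos = sum(panel_positive(s, cutoffs) for s in tumors)
    true_neg = sum(not panel_positive(s, cutoffs) for s in normals)
    return true_pos / len(tumors), true_neg / len(normals)
```

Because the panel call is a logical OR across genes, adding a gene can only keep or raise sensitivity and can only keep or lower specificity, which matches the trade-off reported for the 5-gene and 6-gene combinations.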
To confirm the methylation frequency in an independent set, we tested 43 primary tumors with stage IA LUAD. Forty of 43 subjects (93.0%) were methylation positive for our gene panel, which indicated that it had potential for detecting neoplastic changes at a very early stage (Fig. 2C).
We next assessed the correlation between methylation of these 6 genes and clinicopathologic features (Supplementary Table S4). Only AJAP1 showed a higher frequency of methylation in early-stage (stage I–II) than in late-stage (stage III–IV) NSCLC patients. Although some epigenetic alterations in NSCLC are associated with histology or other clinicopathologic factors, such as age, smoking, and alcohol consumption (21), no such correlation was found with promoter methylation of any of the genes.
Prognostic significance of a methylation gene panel in early lung cancer
We investigated whether promoter methylation assessment of the gene panel could be used as a prognostic marker for early-stage NSCLC. Seventy-five of the 90 NSCLC subjects had follow-up information available, of whom 53 had early-stage disease. In Kaplan–Meier analysis, promoter methylation of PTGDR was associated with a significantly favorable outcome (5-year OS, 57.2% vs. 21.4%; P = 0.042; Supplementary Fig. S3A). Although no statistically significant association was detected between prognosis and promoter methylation of the remaining genes, promoter methylation of CDO1 and HOXA9 showed a trend toward a poor outcome (5-year OS, 36.6% vs. 63.1% for CDO1, and 43.0% vs. 54.2% for HOXA9), whereas AJAP1 showed a trend toward a favorable outcome (5-year OS, 60.1% vs. 34.6%). Interestingly, concomitant promoter methylation of both CDO1 and HOXA9 was significantly associated with a poorer outcome (HR, 2.20; 95% CI, 1.01–5.30; P = 0.043), whereas the combination of PTGDR and AJAP1 promoter methylation was associated with a better outcome (HR, 0.386; 95% CI, 0.15–0.98; P = 0.037; Supplementary Fig. S3B). To refine prognostic stratification in the early stage, we constructed a prognostic risk category based on the promoter methylation status of CDO1, HOXA9, PTGDR, and AJAP1: subjects were defined as low risk when both PTGDR and AJAP1 were positive for promoter methylation and either CDO1 or HOXA9 was negative, as high risk when either PTGDR or AJAP1 was negative and both CDO1 and HOXA9 were positive, and as moderate risk otherwise. According to this risk category, 21 (39.6%), 27 (50.9%), and 5 (9.4%) of the 53 subjects with early-stage disease were classified as being at low, moderate, and high risk, respectively. The risk categories provided significant prognostic stratification for OS (5-year OS, 72.7% for low risk, 38.6% for moderate risk, and 0% for high risk; P = 0.034; Fig. 2D).
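The risk category defined above is a rule over four binary methylation calls and can be written out directly. The following Python sketch encodes that rule as stated in the text; the function name `risk_category` and the boolean encoding of methylation status are illustrative choices, not part of the original analysis.

```python
def risk_category(cdo1, hoxa9, ptgdr, ajap1):
    """Assign a prognostic risk category from promoter methylation status.

    Each argument is True if the gene is methylation positive.
    """
    favorable = ptgdr and ajap1      # both favorable markers methylated
    unfavorable = cdo1 and hoxa9     # both unfavorable markers methylated
    if favorable and not unfavorable:
        return "low"
    if unfavorable and not favorable:
        return "high"
    return "moderate"


# Example: PTGDR and AJAP1 methylated, CDO1 unmethylated.
example = risk_category(cdo1=False, hoxa9=True, ptgdr=True, ajap1=True)
```

Subjects in whom both pairs are fully methylated, or neither pair is, fall into the moderate category, which is why it is the largest group in the early-stage cohort.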
The risk category remained as an independent prognostic factor, even after adjusting for stage and histology in the multivariate Cox proportional hazard analysis (P = 0.035; Table 2).
Variables | Univariate HR (95% CI) | Univariate P (a) | Multivariate HR (95% CI) | Multivariate P (a) |
---|---|---|---|---|
Combination methylation marker | | 0.027 | | 0.035 |
CDO1 or HOXA9 negative/PTGDR and AJAP1 positive | 1 (reference) | | 1 (reference) | |
CDO1 and HOXA9 positive/PTGDR or AJAP1 negative | 6.157 (1.105–34.494) | 0.039 | 4.696 (1.111–27.447) | 0.045 |
The others | 3.821 (1.259–16.510) | 0.016 | 5.342 (1.685–23.695) | 0.003 |
Stage | | | | |
Stage I | 1 (reference) | 0.376 | 1 (reference) | 0.046 |
Stage II | 1.478 (0.614–3.521) | | 2.718 (1.020–7.493) | |
Histology | | 0.035 | | 0.004 |
Adenocarcinoma | 1 (reference) | | 1 (reference) | |
Squamous carcinoma | 2.755 (1.099–6.997) | 0.0314 | 5.386 (1.845–16.019) | 0.002 |
Large cell carcinoma | 0.576 (0.088–2.240) | 0.456 | 0.738 (0.111–2.920) | 0.692 |
(a) Cox proportional hazard model.
To validate the prognostic risk category in early-stage disease, we analyzed an independent cohort of 43 subjects with stage IA LUAD. Surprisingly, even in this independent stage IA set, the risk category significantly stratified the prognosis (5-year OS, 100.0% for low risk, 96.0% for moderate risk, and 55.6% for high risk; P = 0.015; Fig. 2D), suggesting the potential clinical utility of our gene panel as a prognostic marker.
Potential of the methylation gene panel as a biomarker for cancer detection in serum samples from stage IA NSCLC
Given the high positive frequency of our methylation gene panel in early-stage primary tumors, we next assessed its potential for minimally invasive early cancer detection using serum samples from 43 subjects with stage IA LUAD, 40 subjects with stage IA LUSC, and 42 population-matched subjects from the New York University Lung Cancer Screening Cohort (ref. 22; Table 1). In general, the mean methylation value of each of the 6 genes was higher in cancer subjects than in controls (Supplementary Fig. S4). With the optimal cutoff determined from ROC curves in the 43 stage IA LUAD and 42 control samples, the sensitivity of individual methylated genes in serum for a lung cancer diagnosis ranged from 4.7% to 34.9%, and the specificity ranged from 85.7% to 100% (Supplementary Table S3). In the 43 primary tumors and matched serum samples from stage IA LUAD, the methylation status of an individual gene in serum DNA was always concordant with the primary tumor DNA, and the concordance rate was 55.6% (15/27) for MARCH11, 42.9% (12/28) for HOXA9, 31.8% (7/22) for CDO1, 27.3% (3/11) for UNCX, 22.2% (4/18) for PTGDR, and 20% (2/10) for AJAP1 (Fig. 3A). Of note, our gene panel yielded a sensitivity of 72.1% and a specificity of 71.4% in serum DNA from stage IA LUAD and detected 24 (60.0%) of the 40 serum samples from stage IA LUSC using the same cutoff (Fig. 3B).
Methylation of the gene panel in pleural effusion and ascites samples
As malignant pleural effusion (MPE) is a common complication of NSCLC (23), we next assessed the feasibility of our methylation gene panel for detecting cancer using 70 pleural effusion (PE) samples (Table 1). Promoter methylation of the 6 genes was significantly higher in MPE samples than in PEs with negative cytology. The differences between PEs from cancer and benign subjects, and between MPEs from NSCLC and other cancer types, were not significant (Supplementary Fig. S5A). With the optimal cutoff (Supplementary Table S3), the sensitivity and specificity were 70.3% and 84.8%, respectively, when we considered the methylation of at least one of 4 genes (CDO1, PTGDR, MARCH11, and UNCX; Fig. 3C). Adding AJAP1 to the latter 4 genes increased the sensitivity to 75.7% but decreased the specificity to 75.8%. Promoter methylation of our gene panel was also detected in ascites samples from cancer patients at a frequency and level similar to those of MPEs using the cutoff that was set in the PE samples, even though the malignant ascites came from subjects with various types of cancer (Fig. 3C; Supplementary Fig. S5B; Table 1), suggesting the utility of our gene panel for cancer detection in diverse body fluids.
Compensating for the expression of HOX paralog group 9 for HOXA9 methylation
We examined whether mRNA expression levels of HOXA9 were also associated with outcomes using the 75 primary NSCLC samples that were informative for prognostic analysis. Surprisingly, overexpression of HOXA9 mRNA was associated with shorter OS (Fig. 4A), which appeared to contradict the association between its methylation status and outcomes (Supplementary Fig. S3A). Therefore, to explore any compensatory mechanism, we assessed the association between HOXA9 methylation and expression of HOX genes. The median level of HOXA9 expression was almost equal in normal and tumor tissues, despite the frequent cancer-specific methylation in the TCGA dataset (Supplementary Fig. S2A), indicating a population with different expression levels in tumors. Indeed, overexpression of HOXA9 was observed in subjects without its promoter methylation (Fig. 4B). Interestingly, we found that HOXA9 promoter methylation was significantly more frequent in subjects with overexpression of the other HOX genes (67/149, 45.0%), especially the paralog group for HOXA9 (i.e., HOXB9, HOXC9, and HOXD9; 36/65, 55.4%), than in those without any overexpression of HOX genes (60/246, 24.4%; P < 0.001) or those with overexpression of HOXA9 (4/29, 13.8%; P = 0.002). Of note, the paralog group for HOXA9 was predominantly expressed in tumor tissues (Supplementary Fig. S2A), and overexpression of any member of HOX paralog group 9 (i.e., HOXA9, HOXB9, HOXC9, and HOXD9) was associated with a poor outcome (Fig. 4A). In addition, the expression levels of the paralog group for HOXA9 were significantly higher in subjects with HOXA9 methylation than in those without methylation (Fig. 4C). As expected, the expression level of HOXA9 was low in subjects with its promoter methylation, indicating an inverse correlation between HOXA9 methylation and the expression of the paralog group for HOXA9.
Similar findings were observed in the TCGA dataset, and the expression of HOX paralog group 9 was associated with a poor outcome in various types of cancer, including LUSC (Supplementary Fig. S6).
Discussion
A growing number of epigenetic alterations, especially gene promoter methylation, contribute to the initiation and progression of NSCLC, indicating epigenetic abnormality as a prime candidate for integration into clinical practice and precision medicine. In this study, we demonstrate a potentially clinically applicable methylation gene panel that may facilitate the development of improved diagnosis, clinical management, and outcome prediction.
Promoter methylation of our gene panel (CDO1, HOXA9, AJAP1, PTGDR, UNCX, and MARCH11) was not only frequent and cancer specific but also an early event. Cancer detection at an early stage potentially increases survival rates. Indeed, low-dose spiral CT can be a reliable screening tool for the early detection of lung cancer, which decreases the mortality rate from lung cancer by 20% (24). However, a serious limitation of low-dose spiral CT is its poor specificity with a 96.4% false-positive rate (24, 25), and the rate of surgical resections for benign disease remains too high (6%–38%; ref. 26). Therefore, novel minimally invasive biomarkers are urgently needed to increase diagnostic accuracy, decrease unnecessary invasive diagnostic procedure, and improve prognosis. Our gene panel showed comparable sensitivity (72.1% vs. 71.1%) and superior specificity (71.4% vs. 62.7%) in serum samples from stage IA LUAD compared with the CT screening, suggesting a potential for early cancer detection. We assessed methylation status using Q-MSP because of the most frequently used and well-established method to detect DNA methylation (4, 5). Q-MSP can achieve more objective and specific assessment through PCR amplification using methylation-specific primers and fluorescent probes. Furthermore, this method is a simple and cost-efficient technique that can detect minute foci of tumor cells that would be insufficient to raise morphologic suspicion of malignancy, making it a more sensitive assay compared with standard histopathology (2). Furthermore, it is not possible to detect NSCLC in blood by using histopathology approach. Therefore, DNA methylation analysis in serum provides minimally invasive alternatives to routine procedures presently required to obtain biopsy material and may be valuable for the management of thoracic malignancies. Collectively, our findings indicate the potential clinical utility of methylation marker for detecting stage IA NSCLC using serum.
Our gene panel showed detection sensitivity in primary tumor tissues comparable to that of previous pivotal studies that used different combinations of genes (3, 9); however, these studies calculated sensitivity for stage I and II cases together, whereas our validation cohort consisted of only stage IA disease. The sensitivity of our gene panel was relatively low, albeit with high specificity, compared with previous studies (4, 5). This discrepancy may be due to differences in DNA extraction procedures and gene panels; using the Methylation-on-Beads procedure for DNA extraction may improve the sensitivity of our assay by reducing sample loss (4).
The median survival time of patients with MPE was 8 months, categorizing the cancer as a stage IV disease in the TNM staging system (27). Thus, managing patients with MPE is essentially palliative, and the diagnosis must be established promptly and with minimal risk. Although cytomorphologic examination remains the major diagnostic tool for MPE, its diagnostic sensitivity is only 60% with repeated thoracentesis (28). Our gene panel yielded a relatively higher diagnostic sensitivity (70.3%–75.7%) and similar specificity (75.8%–84.8%) for MPE compared with previous studies, in which the sensitivity and specificity were 39.5%–66.7% and 79.4%–100%, respectively (29–31). These results suggest that promoter methylation has clinical potential as an auxiliary tool to complement cytomorphologic examination in diagnosing MPE and to avoid repeated thoracentesis. In addition, our gene panel successfully detected not only MPE from almost all types of malignancy included in this study, but also malignant ascites from various types of cancer with similar accuracy. Thus, the methylation of these genes may be a highly prevalent alteration in diverse tumor types.
The TCGA dataset is useful to elucidate cancer-specific methylated genes because of its large number of well-collected samples and the high standards used for clinical and bioinformatics analyses (9, 10). DMR identification in TCGA samples is a promising approach to discover methylated genes with functionally relevant traits, as we have reported previously (19). All 6 genes in the panel showed significantly lower expression in tumors with promoter methylation than in those without methylation, and their expression was restored by treatment with 5-Aza-dC in NSCLC cell lines. These findings suggest that promoter methylation is one of the key mechanisms of transcriptional silencing of these genes and that they have potential functional roles, with biologic implications, in the initiation and progression of cancer (2). We previously demonstrated that CDO1 was frequently methylated and acted as a TSG in multiple cancer types including lung cancer (32). CDO1 is a cytosolic non-heme, iron-dependent enzyme that catalyzes the irreversible addition of molecular oxygen to the sulfhydryl group of cysteine; this reaction yields cysteine sulfinic acid, resulting in the attenuation of oxidative phosphorylation and tumor growth (33). AJAP1 integrates into E-cadherin–mediated adherens junctions, which play a suppressive role in cell–cell and cell–extracellular matrix interactions associated with cell migration and invasion (34). The prostaglandin D2 (PGD2)-PTGDR signaling pathway is an endogenous negative regulator of vascular permeability and tumorigenesis, and PTGDR deficiency enhances lung carcinoma cell–derived tumor progression accompanied by abnormal vascular expansion (35). UNCX and HOXA9 belong to evolutionarily conserved homeobox superfamilies, which play decisive roles in regulating apoptosis, differentiation, motility, and angiogenesis (36). Thus, all 6 genes we tested are functionally associated with tumorigenesis.
HOXA9 regulates progenitor abundance by suppressing differentiation and maintaining self-renewal during myelopoiesis (37), and its ectopic expression induces serous ovarian cancer (38). In contrast, the HOXA9 signaling pathway suppresses breast cancer growth and metastasis (39), indicating context-dependent behavior of HOXA9. In NSCLC, the functional role of HOXA9 is not fully understood (14, 40), and the contribution of its expression level to patient outcomes and its association with the paralog group remain elusive. Our findings indicated that overexpression of HOXA9 and its paralog group conferred a poor prognosis and that paralog group 9 for HOXA9 was predominantly expressed in tumors with HOXA9 methylation. The paralog group reflects sequence homology among HOX clusters; for example, HOXA9 shares 91% sequence homology with HOXB9 (41). Thus, the high sequence identity may induce functional redundancy during early development (42) and in malignant cells (43). This hypothesis of functional redundancy is further supported by the observation that mutation of a single HOX gene causes no dramatic alteration in morphogenesis, because paralogous genes compensate for one another when one of them is disrupted (44). Therefore, the combinatorial expression profile of the HOX paralog group 9 genes may correlate with the aggressive phenotype, and the compensatory expression of the paralog group for HOXA9 in tumors with HOXA9 methylation may explain the association between HOXA9 methylation and a poor outcome. Identifying the mechanism by which paralog genes are compensatorily overexpressed in response to transcriptional silencing via promoter methylation may, therefore, provide important clues leading to new molecular therapeutic targets in NSCLC.
In addition to early cancer detection, identifying biomarkers for risk stratification at the time of diagnosis of early-stage disease remains another major clinical challenge. However, no gene panel currently addresses both of these clinical challenges. Although it is widely accepted that the TNM staging system is a clinically useful prognostic marker, NSCLCs encompass different biological entities with variable clinical outcomes, even in the early stage of the disease (6). In addition, clinical trials have failed to show a significant survival benefit for adjuvant chemotherapy in patients with completely resected stage I disease (45), despite the expected 5-year recurrence rate of 30% to 50% (46), indicating the importance of selecting the patients most likely to benefit from adjuvant chemotherapy. It is believed that patients at a higher risk of recurrence benefit more from adjuvant chemotherapy than patients at a lower risk of recurrence, even in the early stage of the disease (47). Thus, developing predictive biomarkers to identify patients at high risk of poor prognosis is clearly imperative for weighing the expected absolute benefit against the possible risk of toxicity when deciding on adjuvant chemotherapy in the early stage of the disease. We demonstrated a more rigorous stratification of outcomes in early-stage NSCLC using our risk category based on the methylation status of the CDO1, HOXA9, AJAP1, and PTGDR genes. The promoter methylation risk category was a prognostic factor independent of disease stage and histopathologic subtype in multivariate analysis for OS. Of note, the reproducibility was confirmed in a relatively homogeneous population of stage IA LUAD subjects who were diagnosed and treated at a separate institution, representing an independent validation cohort.
Thus, an assessment based on promoter methylation risk panels may serve as a prognostic biomarker and help to identify patients at high risk who may benefit from adjuvant therapy in early-stage NSCLC.
Promoter methylation of AJAP1 and PTGDR, in contrast to that of CDO1, was associated with favorable outcomes despite the reported tumor-suppressive functions of these genes (34, 35). However, promoter methylation of genes with biologic implications does not necessarily have a similar prognostic impact. For example, among TSGs methylated in NSCLC, RAR-β and APC methylation have been reported to be associated with favorable outcomes, in contrast with the poor outcomes of RASSF1 methylation, because transcriptional silencing by methylation is incomplete compared with that caused by genetic alteration (48, 49). In addition, individual promoter methylation is just a part of the epigenome-wide methylation status (50), and its suppressive role may be neutralized by differential methylation of CpG island shores, enhancers, and repetitive elements in the noncoding areas of the genome, by the activation of other signaling pathways, or by the compensatory expression of functionally similar molecules such as the paralog group for HOXA9. Thus, inactivation of AJAP1 and PTGDR via promoter methylation may result in a protective effect. Therefore, combinatorial analysis of methylated genes with different outcomes provides prognostically divergent subgroups (49). On the basis of this concept, our promoter methylation risk category was constructed from the CDO1, HOXA9, AJAP1, and PTGDR genes, which have different prognostic associations.
Limitations of the current study include possible selection bias due to retrospective analysis, possible statistical error due to testing multiple variables, and statistical power hampered by the relatively small sample size. In addition, our training cohort was heterogeneous, including different stages, histologies, and smoking histories. However, there was no association of our gene panel with these clinicopathologic features, except for an association between AJAP1 promoter methylation and early stage of disease, suggesting that the panel is applicable to diverse subtypes of lung cancer rather than to a specific subtype such as smoking-related or nonsmoking-related lung cancer. We experimentally validated only 30 genes positioned on DMRs, selected by statistical genomic criteria and a candidate gene approach, followed by extended studies using the promising 6-gene panel. Gene selection using novel epigenome-wide statistical approaches may yield more sensitive and specific markers. We realize, therefore, that while promising, these results cannot be considered conclusive and must be compared with prior gene panels to determine clinical feasibility in another screening cohort with a large sample size, such as the National Lung Screening Trial (NLST) biobank.
In conclusion, promoter methylation of a panel of 6 genes (CDO1, HOXA9, AJAP1, PTGDR, UNCX, and MARCH11) is a cancer-specific alteration and has the potential for use as a biomarker for cancer detection and prognostic prediction of early-stage NSCLC.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: A. Ooki, Z. Maleki, M. Brait, H.-S. Nam, H. Pass, D. Sidransky, R. Guerrero-Preston, M.O. Hoque
Development of methodology: A. Ooki, Z. Maleki, M. Brait, H. Pass, M.O. Hoque
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): A. Ooki, Z. Maleki, J.-C.J. Tsay, W. Rom, R. Guerrero-Preston, M.O. Hoque
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): A. Ooki, M. Brait, N. Turaga, H.-S. Nam, H. Pass, D. Sidransky, R. Guerrero-Preston, M.O. Hoque
Writing, review, and/or revision of the manuscript: A. Ooki, Z. Maleki, J.-C.J. Tsay, M. Brait, H.-S. Nam, W. Rom, H. Pass, D. Sidransky, R. Guerrero-Preston, M.O. Hoque
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): A. Ooki, Z. Maleki, C.M. Goparaju, M. Brait, H. Pass, R. Guerrero-Preston, M.O. Hoque
Study supervision: M.O. Hoque
Grant Support
This work was funded by the Flight Attendant Medical Research Institute Clinical Innovative Award 103015 (M.H.), the Career Development award from SPORE in Cervical Cancer Grants P50 CA098252 (to M.O. Hoque), National Institute of Environmental Health Sciences R01-ES018845-04S1, and National Cancer Institute grants K01-CA164092 (to R. Guerrero-Preston) and U01-CA84986 (to D. Sidransky).