Abstract
Several oncogenic signals are involved in the synthesis, metabolism, transportation, and modulation of cholesterol. However, the roles of genetic variants of the cholesterol pathway genes in cancer survival remain unclear.
We investigated associations between 26,781 common SNPs in 209 genes of the cholesterol pathway and non–small cell lung cancer (NSCLC) survival by utilizing genotyping data from two published genome-wide association studies. We used multivariate Cox proportional hazards regression and expression quantitative trait loci analyses to identify survival-associated SNPs and their correlations with the corresponding mRNA expression, respectively. We also used the Kaplan–Meier survival analysis and bioinformatics functional prediction to further evaluate the identified independent SNPs.
We found five independent SNPs (APOB rs1801701 C>T; CDH13 rs35859010 C>T, rs1833970 T>A, rs254315 T>C, and rs425904 T>C) to be significantly associated with NSCLC survival in both discovery and replication datasets. When the unfavorable genotype (APOB rs1801701 CC) and haplotypes (CDH13 rs35859010-rs1833970-rs254315-rs425904 C-A-T-C and T-T-T-T) were combined into a genetic score as the number of unfavorable genotypes/haplotypes (NUGH) in the multivariate analysis, an increased NUGH was associated with worse survival (Ptrend < 0.0001). In addition, both APOB rs1801701 C>T and CDH13 rs425904 T>C were correlated with mRNA expression of the genes in normal lung tissues from the Genotype-Tissue Expression project.
Genetic variants of APOB and CDH13 in the cholesterol pathway were associated with NSCLC survival, possibly by affecting their gene expression.
Genetic variants of APOB and CDH13 in the cholesterol pathway may provide new scientific insights into NSCLC prognosis.
Introduction
Lung cancer is one of the most common cancer types in the United States and remains the leading cause of cancer death. It is estimated that in 2019 there will be 228,150 new cases of lung cancer and 142,670 related deaths, accounting for 24% of all cancer deaths (1). The 5-year survival rate of lung cancer improved gradually to 19.4% between 2009 and 2015 (2) but is still strikingly lower than that of other cancers, especially those with similar morbidities (3). Therefore, additional research is needed to identify appropriate biomarkers for treatment response and thus survival of patients with lung cancer.
There are different histologic subtypes of lung cancer, and non–small cell lung cancer (NSCLC), which is further classified as lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC), accounts for 85% of all patients with lung cancer (4, 5). NSCLC treatments include surgery, radiotherapy, chemotherapy, and targeted therapy, depending upon histologic typing and staging. It has been reported that patients with cancer often adopt lifestyle changes after their diagnosis and treatments to improve their health status. For example, there was a 3-fold increase in dietary supplement use after patients were diagnosed with cancer (6), and approximately 70% of breast and prostate cancer survivors were overweight or obese after cancer diagnosis and successful treatment (7). Therefore, there is a growing concern that both a poor nutritional status and an improper diet may worsen the outcomes of treatment in patients with cancer (8). In particular, dietary cholesterol as well as genes involved in its synthesis, metabolism, transportation, and modulation have recently been a research focus of epidemiologic, preclinical, and clinical studies (9–13), which indicate that high dietary cholesterol intake may predispose to cancer and may even influence the survival of patients with cancer.
The controversy over the role of cholesterol in cancer development stems from several conflicting epidemiologic studies that provided inconclusive results about associations between serum cholesterol levels and risk for certain cancer types (10, 14, 15), but preclinical studies more consistently suggest a role of cholesterol in survival of patients with cancer. For example, some studies provided evidence for a correlation between cholesterol synthesis and prognostic outcome, showing that several oncogenic signals, such as PI3K/AKT/mTOR, RTK/RAS, and TP53, could modulate cholesterol synthesis in cancer cells (16–23). Studies in cultured cells and in animals also revealed that induction of cholesterol synthesis by the AKT/mTORC1/SREBP pathway contributed to cell growth (21) and promoted cancer aggressiveness and bone metastases (24, 25). Furthermore, multiple cholesterol metabolites, such as steroids and oxysterols metabolized by mitochondrial cytochrome P450 family enzymes, were found to be involved in tumor growth and metastasis (26, 27). Therefore, targeting the synthesis, transport, or metabolites of cholesterol may be an alternative option for controlling cancer growth (28–33).
Although cholesterol effects differ by cancer type (34), the role of genetic variants in affecting cholesterol pathway function and cancer survival is not yet clear (16). Therefore, we conducted the present study to investigate the associations of genetic variants in the cholesterol pathway and related genes with NSCLC survival by using available genotyping data from two previously published genome-wide association studies (GWAS) of lung cancer.
Materials and Methods
Study populations
Two independent genotyping datasets were used for the discovery and replication, respectively, in the present study. The study population of the discovery dataset was derived from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial that was implemented between 1993 and 2001, in which baseline demographic characteristics and risk factors (such as smoking status) were archived, and whole blood samples were collected at enrollment (35–37). During the 13-year follow-up, there were 1,185 Caucasian participants who were confirmed to have developed NSCLC and whose histologic diagnosis, tumor stage, treatment method, and survival time, including overall survival (OS) and disease-specific survival (DSS) time, were recorded. Furthermore, their genomic DNA samples were extracted from the whole blood and genotyped with Illumina HumanHap240Sv1.0 and HumanHap550v3.0 (dbGaP accession: phs000093.v2.p2 and phs000336.v1.p1; refs. 38–41).
The replication dataset included 984 Caucasian patients with histologically confirmed NSCLC from the Harvard Lung Cancer Susceptibility (HLCS) study launched in 1992 (42). For these patients, demographic and clinical information along with survival time were collected. The whole blood samples were collected, and DNA was extracted with the Auto Pure Large Sample system for nucleic acid purification (QIAGEN Company) and genotyped by using the Illumina Humanhap610Quad array. The genotyping data were used for imputation with the Mach3 software based on the sequencing data for Caucasians from the 1000 Genomes Project.
The present study was approved by both the Institutional Review Board of Duke University School of Medicine (#Pro00054575) and the dbGaP database administration (#6404). The comparison of the characteristics between the PLCO trial (n = 1,185) and the HLCS study (n = 984) is presented in Supplementary Table S1. Because the discovery PLCO dataset had detailed genotyping data with more covariates available but the HLCS replication dataset did not, we could only use the PLCO dataset for further multivariate analyses after replication.
Gene and SNP selection
Using the Molecular Signatures Database (http://software.broadinstitute.org/gsea/msigdb/index.jsp), we included 209 genes related to the synthesis, metabolism, and regulation of cholesterol in the analyses after excluding 60 duplicated genes, two genes withdrawn from NCBI, and five genes on the X chromosome (Supplementary Table S2). We first extracted the genotype data for these 209 candidate genes and their ±2-kb flanking regions from the PLCO dataset and then performed imputation with IMPUTE2 using the sequencing data for Caucasians from the 1000 Genomes Project database. As a result, a total of 26,781 SNPs (1,666 genotyped and 25,115 imputed), which met the criteria of a genotyping rate ≥95%, a minor allelic frequency (MAF) ≥5%, a Hardy–Weinberg equilibrium (HWE) P value ≥1 × 10⁻⁵, and an imputation info score ≥0.8, were retained for subsequent analyses (Supplementary Fig. S1B).
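The quality-control criteria above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study; the record layout, the `SnpQc` class, and the example SNP identifiers are hypothetical, and the choice to pass an info score of 1.0 for genotyped SNPs is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Per-SNP quality-control summary (hypothetical record layout)."""
    snp_id: str
    call_rate: float  # genotyping rate, as a proportion between 0 and 1
    maf: float        # minor allele frequency
    hwe_p: float      # P value of the Hardy-Weinberg equilibrium test
    info: float       # imputation info score (assumed 1.0 for genotyped SNPs)


def passes_qc(snp, min_call_rate=0.95, min_maf=0.05, min_hwe_p=1e-5, min_info=0.8):
    """Return True when a SNP meets all four thresholds stated in the text."""
    return (
        snp.call_rate >= min_call_rate
        and snp.maf >= min_maf
        and snp.hwe_p >= min_hwe_p
        and snp.info >= min_info
    )


# Illustrative records only; these are not SNPs from the study.
snps = [
    SnpQc("rs_demo_pass", call_rate=0.99, maf=0.12, hwe_p=0.40, info=0.95),
    SnpQc("rs_demo_low_maf", call_rate=0.99, maf=0.01, hwe_p=0.40, info=0.95),
    SnpQc("rs_demo_low_info", call_rate=0.99, maf=0.20, hwe_p=0.40, info=0.55),
    SnpQc("rs_demo_hwe_fail", call_rate=0.99, maf=0.20, hwe_p=1e-8, info=0.95),
]

kept = [s.snp_id for s in snps if passes_qc(s)]
print(kept)
```

Only the first record satisfies all four thresholds, so it is the only identifier retained.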
Multivariate Cox proportional hazards regression analysis
For the discovery PLCO dataset, we first employed a single-locus analysis with multivariate Cox proportional hazards regression models to assess the association between each of the 26,781 candidate SNPs and NSCLC survival by calculating the HR and 95% confidence interval (CI). To correct for multiple testing, we first applied the false discovery rate (FDR) with a cutoff value of 0.2, followed by the Bayesian false discovery probability (BFDP) with a cutoff value of 0.8, as recommended for SNPs that are highly correlated as a result of imputation (43, 44). We assigned a prior probability of 0.10 and a detectable upper boundary HR of 3.0 for an association with variant genotypes or minor alleles of the SNPs with P < 0.05. The multivariate Cox regression analysis and multiple test correction were performed by using the GenABEL package of R software (45).
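To make the BFDP step concrete, the following Python sketch computes a BFDP from a reported HR and its 95% CI. It assumes the commonly used approximate Bayes factor formulation, in which the prior variance of the log HR is derived from the upper boundary HR; the text does not state the exact formula applied, so this is an illustration under that assumption. The function name `bfdp` and its parameters are hypothetical.

```python
import math


def bfdp(hr, ci_low, ci_high, prior_assoc=0.10, upper_hr=3.0):
    """Approximate Bayesian false discovery probability for one SNP.

    Assumes a normal prior on log(HR) whose 97.5th percentile equals
    log(upper_hr), and recovers the standard error from the 95% CI.
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    v = se ** 2                               # variance of the estimate
    w = (math.log(upper_hr) / 1.96) ** 2      # prior variance of log(HR)
    z2 = (beta / se) ** 2
    # Approximate Bayes factor in favour of the null hypothesis.
    abf = math.sqrt((v + w) / v) * math.exp(-0.5 * z2 * w / (v + w))
    prior_odds_null = (1 - prior_assoc) / prior_assoc
    return abf * prior_odds_null / (1 + abf * prior_odds_null)


# Example input: an HR of 0.75 with 95% CI 0.62-0.90 (the PLCO estimate
# reported for APOB rs1801701 in Table 1).
value = bfdp(0.75, 0.62, 0.90)
print(f"BFDP = {value:.2f}; passes the 0.8 cutoff: {value < 0.8}")
```

Because the CI is rounded to two decimals and the exact settings of the original analysis are not given, the result should be expected to fall in the same range as the tabulated BFDP rather than to match it exactly.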
For the HLCS replication dataset, associations between the identified significant SNPs and NSCLC survival in the PLCO dataset were further evaluated by using the multivariate Cox regression model with a significance level of P < 0.05. Finally, an inverse variance weighted meta-analysis was performed to combine the results of both discovery and replication datasets by PLINK 1.90.
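The inverse variance weighted meta-analysis can be sketched in a few lines of Python. The study used PLINK 1.90 for this step; the code below is only an illustration of the fixed-effects calculation, with standard errors recovered from the reported 95% CIs. The function names are hypothetical, and the heterogeneity P value shown is valid only for two studies (one degree of freedom).

```python
import math


def log_hr_and_se(hr, ci_low, ci_high):
    """Convert an HR and its 95% CI to a log HR and its standard error."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
    return math.log(hr), se


def fixed_effect_meta(studies):
    """Inverse variance weighted fixed-effects pooling of (HR, low, high) tuples."""
    estimates = [log_hr_and_se(*s) for s in studies]
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    # Cochran's Q statistic for heterogeneity between studies.
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    ci = (math.exp(pooled - 1.96 * pooled_se), math.exp(pooled + 1.96 * pooled_se))
    return math.exp(pooled), ci, q


# PLCO and HLCS estimates for APOB rs1801701 as reported in Table 1.
hr, (lo, hi), q = fixed_effect_meta([(0.75, 0.62, 0.90), (0.81, 0.66, 0.98)])
# With two studies Q has 1 degree of freedom, so P = erfc(sqrt(Q / 2)).
p_het = math.erfc(math.sqrt(q / 2))
print(f"pooled HR = {hr:.2f} ({lo:.2f}-{hi:.2f})")  # 0.78 (0.68-0.89)
```

For this SNP the pooled estimate rounds to the combined HR and CI reported in Table 1. The heterogeneity P value from this sketch is close to, but not identical with, the tabulated Phet, which is expected given the rounded inputs.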
Stepwise multivariate Cox regression analysis
To identify independent SNPs associated with NSCLC survival, we used a multivariate stepwise Cox model for the PLCO dataset, in which all the significant SNPs were included one by one, and their independence in predicting the outcome was evaluated by P < 0.05. In addition to the available demographic characteristics and clinical variables in the PLCO dataset, 15 other previously published SNPs (see the Results) associated with survival of NSCLC in the same PLCO dataset were also included in the model for further adjustment to confirm the newly identified independent survival-associated SNPs.
Combined effect analysis of all the independent SNPs
After the independent SNPs were confirmed, their effects on the NSCLC survival were assessed in the form of genotypes or haplotypes by multivariate analysis with adjustment for other covariates in the PLCO dataset in each of additive, dominant, and recessive models. The genotype model with HR > 1 and P < 0.05 was regarded as the unfavorable genotype. If a cluster of independent SNPs was in the same gene, their haplotypes were also constructed and evaluated (46). For haplotype inference, we applied the HAPLOTYPE procedure of the SAS Genetics module, given a multilocus sample of genetic marker genotypes under the assumption of HWE. The expectation–maximization algorithm was used to estimate the probability that each individual possesses a particular haplotype pair. The most likely haplotypes were used for each individual, and then the unfavorable haplotypes were identified by multivariate Cox regression analysis with HR > 1 and P < 0.05. Finally, a diplotype of each PLCO patient was assigned a value according to the number of unfavorable haplotypes on two strands of homologous chromosomes, and the associations of diplotypes with OS and DSS of NSCLC were also evaluated using the same statistical method as for genotypes.
Once the unfavorable genotypes/haplotypes (UGH) were verified, they were combined into a number of unfavorable genotypes/haplotypes (NUGH) as a genetic score to assess the combined effect of all independent SNPs. We used Kaplan–Meier (K-M) curves and log-rank tests to evaluate the effects of NUGH on cumulative probability of OS and DSS with GraphPad Prism 8, and P < 0.05 was considered statistically significant.
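The NUGH score can be illustrated with a short Python sketch. It assumes that a patient contributes one unit for the APOB rs1801701 CC genotype and one unit for carrying at least one unfavorable CDH13 haplotype (C-A-T-C or T-T-T-T, in the order rs35859010-rs1833970-rs254315-rs425904), so the score ranges from 0 to 2. The function name, the string encoding of haplotypes, and the example patients are hypothetical.

```python
# Unfavorable CDH13 haplotypes, written in the SNP order
# rs35859010-rs1833970-rs254315-rs425904.
UNFAVORABLE_CDH13_HAPLOTYPES = {"C-A-T-C", "T-T-T-T"}


def nugh(apob_rs1801701_genotype, cdh13_diplotype):
    """Number of unfavorable genotypes/haplotypes (0, 1, or 2) for one patient.

    cdh13_diplotype is a pair of haplotype strings, one per chromosome.
    Assumption: carrying one or two unfavorable haplotypes both count as 1.
    """
    apob_unfavorable = apob_rs1801701_genotype == "CC"
    cdh13_unfavorable = any(
        h in UNFAVORABLE_CDH13_HAPLOTYPES for h in cdh13_diplotype
    )
    return int(apob_unfavorable) + int(cdh13_unfavorable)


# Illustrative patients only; "C-T-T-T" stands in for any other haplotype.
print(nugh("CT", ("C-T-T-T", "C-T-T-T")))  # neither unfavorable factor
print(nugh("CC", ("C-T-T-T", "C-T-T-T")))  # APOB CC only
print(nugh("CC", ("C-A-T-C", "C-T-T-T")))  # APOB CC and one unfavorable haplotype
```

The three example calls return 0, 1, and 2, respectively, which correspond to the three NUGH groups compared in the survival analysis.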
Prediction model construction
We constructed a survival prediction model by using the ROC curve with the “survival” and “timeROC” packages of R software (version 3.5.0). Sensitivity, specificity, and time-dependent AUC were used to measure the ability of the survival models to predict NSCLC survival on the basis of both clinical and genetic variables (47).
Stratified analysis
We performed a stratified analysis to evaluate associations between the UGH/NUGH and survival (both OS and DSS) of NSCLC in each stratum of the available covariates in the PLCO dataset, and the associations were assessed with P < 0.05. We also assessed possible interactions with the χ²-based Q-test between genotypes/haplotypes and NSCLC among subgroups in the stratified analysis with Pi < 0.05.
Expression quantitative trait loci analysis
We performed the expression quantitative trait loci (eQTL) analysis to identify correlations between genotypes of the independent SNPs and mRNA expression levels of the corresponding genes. Two approaches were used for the eQTL analysis: one was a linear regression model performed with the R software, in which the mRNA expression data were obtained from 373 European individuals in the 1000 Genomes Project; the other was based on two other GWAS datasets with normal lung tissue samples of 383 subjects and 369 whole blood samples, respectively, which are made available in the Genotype-Tissue Expression (GTEx) project (48, 49).
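The linear regression approach to eQTL analysis can be sketched as follows. The study performed this step in R; the Python code below is only an illustration that regresses expression on the number of effect alleles (0, 1, or 2) under an additive coding, which is an assumption here. The data values are invented for demonstration and are not taken from any dataset in the study.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data: effect-allele dosage and normalized mRNA expression.
dosage = [0, 0, 1, 1, 2, 2]
expression = [5.0, 5.2, 5.6, 5.4, 6.1, 5.9]

slope, intercept = simple_linear_regression(dosage, expression)
print(f"expression change per effect allele: {slope:.2f}")
```

A positive slope indicates higher expression with each additional effect allele; in a real analysis the slope would be accompanied by a standard error and P value, and covariates would typically be included.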
Correlation between mRNA expression in lung cancer tissues and NSCLC survival
The correlations between mRNA expression levels of the SNP-associated genes and NSCLC survival were examined in 111 pairs of lung cancer (51 LUSC and 60 LUAD) tissues and their adjacent normal tissues from The Cancer Genome Atlas (TCGA) database by using a paired Student t test. We also used the K-M survival curves to visualize the associations by an online tool (http://kmplot.com/analysis/index.php?p=service&cancer=lung), in which 1,926 NSCLC samples with published gene expression data and survival information from the caBIG, GEO, and TCGA repositories were integrated; and P values for the K-M survival plot with HR and log-rank were calculated, and the plots were made with R language (50).
All statistical analyses in the present study were performed by using the SAS software (version 9.4; SAS Institute), unless otherwise indicated.
Bioinformatics functional prediction
Bioinformatics functional prediction for each of the identified significant SNPs was performed with the online tools of SNPinfo (ref. 51; https://snpinfo.niehs.nih.gov), RegulomeDB (ref. 52; http://www.regulomedb.org), and HaploReg (ref. 53; https://pubs.broadinstitute.org/mammals/haploreg).
Results
Basic characteristics of 1,185 patients with NSCLC from the PLCO trial and 984 patients with NSCLC from the HLCS study have been described elsewhere (54), and the detailed workflow of the present study is shown in Supplementary Fig. S1A. Among the 26,781 SNPs of the 209 candidate genes in the cholesterol-related pathway, the single-locus analysis identified 1,004 SNPs that were significantly associated with OS in the PLCO dataset with multiple test correction by BFDP after the FDR correction failed, of which 24 SNPs remained significant in the replication with the HLCS dataset. As shown in Table 1 for the results of additive genetic models, these 24 SNPs are located in six genes, i.e., APOB, ABCG5, RORA, CDH13, ABCG1, and COMT. Additional meta-analysis of the PLCO and HLCS datasets for these 24 identified SNPs showed consistent results, and there was no heterogeneity between these two datasets (all Phet > 0.05).
PLCO, n = 1,185; HLCS, n = 984.
SNP | Allele^a | Gene | PLCO FDR | PLCO BFDP | PLCO EAF | PLCO HR (95% CI)^b | PLCO P^b | HLCS EAF | HLCS HR (95% CI)^c | HLCS P^c | Combined Phet^d | Combined I² | Combined HR (95% CI)^e | Combined P^e |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
rs1801701 | C>T | APOB | 0.27 | 0.34 | 0.09 | 0.75 (0.62–0.90) | 0.002 | 0.09 | 0.81 (0.66–0.98) | 0.032 | 0.592 | 0 | 0.78 (0.68–0.89) | 0.0002 |
rs77105521 | C>T | ABCG5 | 0.30 | 0.67 | 0.18 | 0.83 (0.73–0.95) | 0.007 | 0.18 | 0.84 (0.72–0.97) | 0.020 | 0.918 | 0 | 0.83 (0.76–0.92) | 0.0003 |
rs76382861 | G>A | RORA | 0.29 | 0.47 | 0.15 | 0.81 (0.70–0.93) | 0.002 | 0.13 | 0.81 (0.68–0.96) | 0.017 | 0.998 | 0 | 0.81 (0.73–0.90) | 0.0002 |
rs78023262 | T>A | RORA | 0.29 | 0.47 | 0.15 | 0.81 (0.70–0.93) | 0.002 | 0.13 | 0.81 (0.68–0.96) | 0.016 | 0.991 | 0 | 0.81 (0.73–0.90) | 0.0002 |
rs35859010 | C>T | CDH13 | 0.32 | 0.67 | 0.15 | 0.83 (0.72–0.95) | 0.008 | 0.15 | 0.79 (0.65–0.96) | 0.018 | 0.672 | 0 | 0.82 (0.73–0.91) | 0.0004 |
rs1833970 | T>A | CDH13 | 0.18 | 0.16 | 0.36 | 0.83 (0.74–0.92) | 0.001 | 0.38 | 0.89 (0.79–1.00) | 0.049 | 0.391 | 0 | 0.86 (0.79–0.93) | 0.0001 |
rs2067861 | C>T | CDH13 | 0.29 | 0.61 | 0.37 | 0.85 (0.77–0.95) | 0.004 | 0.38 | 0.89 (0.79–1.00) | 0.048 | 0.571 | 0 | 0.87 (0.80–0.94) | 0.0003 |
rs9934609 | C>T | CDH13 | 0.26 | 0.48 | 0.37 | 0.84 (0.75–0.94) | 0.002 | 0.38 | 0.89 (0.79–1.00) | 0.049 | 0.476 | 0 | 0.86 (0.80–0.94) | 0.0004 |
rs9934700 | C>T | CDH13 | 0.27 | 0.48 | 0.37 | 0.84 (0.75–0.94) | 0.002 | 0.38 | 0.89 (0.79–0.99) | 0.040 | 0.512 | 0 | 0.86 (0.80–0.93) | 0.0003 |
rs6563943 | G>A | CDH13 | 0.29 | 0.41 | 0.36 | 0.85 (0.76–0.94) | 0.003 | 0.38 | 0.89 (0.79–1.00) | 0.048 | 0.556 | 0 | 0.87 (0.80–0.94) | 0.0004 |
rs72795378 | G>A | CDH13 | 0.37 | 0.74 | 0.12 | 1.20 (1.04–1.38) | 0.013 | 0.13 | 1.20 (1.03–1.40) | 0.023 | 0.995 | 0 | 1.20 (1.08–1.33) | 0.0006 |
rs17689520 | C>G | CDH13 | 0.39 | 0.72 | 0.10 | 1.22 (1.04–1.42) | 0.014 | 0.11 | 1.20 (1.00–1.42) | 0.044 | 0.868 | 0 | 1.21 (1.08–1.36) | 0.0014 |
rs60978336 | C>A | CDH13 | 0.30 | 0.57 | 0.10 | 1.25 (1.06–1.46) | 0.006 | 0.10 | 1.19 (1.00–1.42) | 0.049 | 0.699 | 0 | 1.22 (1.09–1.38) | 0.0008 |
rs12446784 | A>G | CDH13 | 0.42 | 0.78 | 0.10 | 1.21 (1.03–1.41) | 0.019 | 0.11 | 1.19 (1.00–1.42) | 0.049 | 0.904 | 0 | 1.20 (1.07–1.35) | 0.0020 |
rs72795399 | A>G | CDH13 | 0.42 | 0.78 | 0.10 | 1.21 (1.03–1.41) | 0.019 | 0.11 | 1.20 (1.00–1.43) | 0.044 | 0.932 | 0 | 1.20 (1.07–1.35) | 0.0019 |
rs11525693 | C>G | CDH13 | 0.40 | 0.77 | 0.10 | 1.22 (1.04–1.43) | 0.016 | 0.10 | 1.20 (1.01–1.43) | 0.043 | 0.886 | 0 | 1.21 (1.08–1.36) | 0.0014 |
rs254315 | T>C | CDH13 | 0.29 | 0.67 | 0.15 | 1.21 (1.06–1.39) | 0.006 | 0.15 | 1.19 (1.02–1.38) | 0.029 | 0.844 | 0 | 1.20 (1.08–1.33) | 0.0004 |
rs374476 | A>G | CDH13 | 0.30 | 0.59 | 0.15 | 1.21 (1.06–1.38) | 0.006 | 0.15 | 1.19 (1.02–1.39) | 0.024 | 0.899 | 0 | 1.20 (1.09–1.33) | 0.0003 |
rs425904 | T>C | CDH13 | 0.41 | 0.78 | 0.20 | 1.16 (1.03–1.30) | 0.017 | 0.21 | 1.15 (1.01–1.31) | 0.039 | 0.917 | 0 | 1.16 (1.06–1.26) | 0.0011 |
rs183436 | A>C | ABCG1 | 0.31 | 0.72 | 0.32 | 0.86 (0.78–0.96) | 0.007 | 0.31 | 0.88 (0.78–1.00) | 0.046 | 0.734 | 0 | 0.87 (0.80–0.94) | 0.0005 |
rs225390 | G>A | ABCG1 | 0.29 | 0.72 | 0.33 | 0.86 (0.77–0.96) | 0.006 | 0.33 | 0.87 (0.77–0.99) | 0.029 | 0.854 | 0 | 0.87 (0.80–0.94) | 0.0005 |
rs225395 | C>T | ABCG1 | 0.29 | 0.61 | 0.34 | 0.85 (0.76–0.95) | 0.003 | 0.34 | 0.88 (0.79–1.00) | 0.043 | 0.637 | 0 | 0.87 (0.80–0.94) | 0.0005 |
rs225398 | C>G | ABCG1 | 0.29 | 0.56 | 0.34 | 0.86 (0.77–0.95) | 0.005 | 0.35 | 0.88 (0.78–0.99) | 0.038 | 0.768 | 0 | 0.87 (0.80–0.94) | 0.0005 |
rs174699 | T>C | COMT | 0.18 | 0.13 | 0.05 | 1.45 (1.17–1.79) | 0.001 | 0.05 | 1.30 (1.02–1.68) | 0.037 | 0.527 | 0 | 1.39 (1.18–1.63) | 0.0001 |
Abbreviation: EAF, effect allele frequency.
^a Reference > effect allele.
^b Obtained from an additive genetic model with adjustment for age, sex, stage, histology, smoking status, chemotherapy, radiotherapy, surgery, PC1, PC2, PC3, and PC4.
^c Obtained from an additive genetic model with adjustment for age, sex, stage, histology, smoking status, chemotherapy, radiotherapy, surgery, PC1, PC2, and PC3.
^d Phet: P value for heterogeneity by Cochran's Q test.
^e Meta-analysis in the fixed-effects model.
Identification of independent SNPs among the 24 significant SNPs
After adjustment for 15 other previously reported survival-associated SNPs in the same PLCO dataset, five SNPs remained as independent survival predictors for further analysis. As shown in Table 2, APOB (rs1801701 C>T) and CDH13 (rs35859010 C>T, rs1833970 T>A, rs254315 T>C, and rs425904 T>C), as well as the demographic and clinical covariates except for radiotherapy, were independently associated with NSCLC survival (all P < 0.05).
Variables | Category | Frequency | HR (95% CI)^a | P^a | HR (95% CI)^b | P^b |
---|---|---|---|---|---|---|
Age | Continuous | 1185 | 1.03 (1.02–1.05) | <0.0001 | 1.04 (1.02–1.05) | <0.0001 |
Sex | Male/female | 698/487 | 0.76 (0.65–0.88) | 0.0004 | 0.74 (0.63–0.87) | 0.0002 |
Smoking status | Never/current | 115/423 | 1.73 (1.29–2.33) | 0.0002 | 2.03 (1.5–2.74) | <0.0001 |
Smoking status | Never/former | 115/647 | 1.79 (1.36–2.36) | <0.0001 | 2.01 (1.51–2.68) | <0.0001 |
Histology | Adeno/squam | 577/285 | 1.26 (1.04–1.51) | 0.0173 | 1.25 (1.03–1.52) | 0.0223 |
Histology | Adeno/others | 577/323 | 1.38 (1.16–1.64) | 0.0003 | 1.40 (1.17–1.68) | 0.0002 |
Tumor stage | I–IIIA/IIIB–IV | 655/528 | 3.22 (2.65–3.91) | <0.0001 | 3.29 (2.70–4.02) | <0.0001 |
Chemotherapy | No/yes | 639/538 | 0.54 (0.45–0.65) | <0.0001 | 0.55 (0.46–0.66) | <0.0001 |
Radiotherapy | No/yes | 762/415 | 1.03 (0.87–1.21) | 0.7578 | 0.99 (0.84–1.17) | 0.9012 |
Surgery | No/yes | 637/540 | 0.21 (0.17–0.28) | <0.0001 | 0.19 (0.15–0.25) | <0.0001 |
APOB, rs1801701, C>T | CC/CT/TT | 982/192/11 | 0.75 (0.62–0.90) | 0.0024 | 0.79 (0.66–0.96) | 0.0155 |
CDH13, rs35859010, C>T | CC/CT/TT | 856/297/32 | 0.85 (0.74–0.99) | 0.0299 | 0.83 (0.71–0.96) | 0.0108 |
CDH13, rs1833970, T>A | TT/TA/AA | 488/531/166 | 0.83 (0.75–0.93) | 0.0011 | 0.83 (0.74–0.92) | 0.0008 |
CDH13, rs254315, T>C | TT/TC/CC | 863/300/22 | 1.24 (1.08–1.42) | 0.0023 | 1.33 (1.15–1.53) | <0.0001 |
CDH13, rs425904, T>C | TT/TC/CC | 757/371/57 | 1.23 (1.08–1.39) | 0.0012 | 1.24 (1.09–1.40) | 0.0007 |
^a Stepwise analysis included age, sex, smoking status, tumor stage, histology, chemotherapy, radiotherapy, surgery, PC1, PC2, PC3, and PC4.
^b Fifteen published SNPs were used for poststepwise adjustment: five SNPs from PMID: 27557513; one SNP from PMID: 29978465; two SNPs from PMID: 30259978; two SNPs from PMID: 26757251; three SNPs from PMID: 30650190; and two SNPs from PMID: 30989732. The model was also adjusted for two significant SNPs that have not been published but are being studied concurrently in other pathways (related to cell membrane transporters and trans-sulfation).
The five independent SNPs are presented as marked in two separate Manhattan plots for both PLCO and HLCS datasets (Supplementary Fig. S1C and S1D). Furthermore, the regional association plot (http://locuszoom.org/) of each independent SNP is shown in Supplementary Fig. S2A–S2E to illustrate their surrounding SNPs in the discovery dataset and the recombination rate estimated from HapMap Data Rel 22/phase II European population (55). Meanwhile, the pairwise linkage disequilibrium (LD) analysis using HaploView 4.2 software showed that four CDH13 SNPs were in low LD (Supplementary Fig. S2F).
Combined APOB genotypes and CDH13 haplotypes and survival of NSCLC
For APOB, the rs1801701 CT+TT genotypes were associated with a better survival (HR = 0.73, 95% CI, 0.60–0.88, P = 0.014 for OS and HR = 0.74, 95% CI, 0.60–0.91, P = 0.004 for DSS), compared with the rs1801701 CC genotype. Therefore, the APOB rs1801701 CC genotype was the unfavorable genotype (Table 3).
Genotypes/Haplotypes | Frequency^a | OS^b deaths (%) | OS^b HR (95% CI) | OS^b P | DSS^b deaths (%) | DSS^b HR (95% CI) | DSS^b P |
---|---|---|---|---|---|---|---|
APOB, rs1801701, C>T | | | | | | | |
CC | 972 | 663 (68.21) | 1.00 | | 594 (61.11) | 1.00 | |
CT | 192 | 119 (61.98) | 0.72 (0.59–0.88) | 0.016 | 108 (56.25) | 0.73 (0.60–0.91) | 0.004 |
TT | 11 | 7 (63.64) | 0.76 (0.36–1.61) | 0.476 | 7 (63.64) | 0.82 (0.39–1.73) | 0.593 |
Trend test | | | | 0.002 | | | 0.006 |
Dominant | | | | | | | |
CC^c | 972 | 663 (68.21) | 1.00 | | 594 (61.11) | 1.00 | |
CT+TT | 203 | 126 (62.07) | 0.73 (0.60–0.88) | 0.014 | 115 (56.65) | 0.74 (0.60–0.91) | 0.004 |
Recessive | | | | | | | |
CC+CT | 1,164 | 782 (67.18) | 1.00 | | 702 (60.31) | 1.00 | |
TT | 11 | 7 (63.64) | 0.82 (0.39–1.72) | 0.593 | 7 (63.64) | 0.87 (0.41–1.84) | 0.720 |
CDH13 haplotypes | | | | | | | |
H1 | 938 | 637 (67.91) | 1.00 | | 567 (60.45) | 1.00 | |
H2 | 431 | 270 (62.65) | 0.77 (0.67–0.89) | 0.0004 | 245 (56.84) | 0.80 (0.69–0.93) | 0.003 |
H3 | 261 | 178 (68.20) | 0.97 (0.82–1.15) | 0.733 | 164 (62.84) | 1.01 (0.84–1.20) | 0.951 |
H4 | 236 | 145 (61.44) | 0.78 (0.65–0.94) | 0.008 | 127 (53.81) | 0.79 (0.65–0.96) | 0.019 |
H5 | 231 | 170 (73.59) | 1.24 (1.05–1.47) | 0.013 | 155 (67.10) | 1.26 (1.05–1.51) | 0.012 |
H6 | 162 | 117 (72.22) | 1.27 (1.05–1.55) | 0.017 | 105 (64.81) | 1.25 (1.02–1.55) | 0.034 |
H7 | 91 | 61 (67.03) | 0.94 (0.72–1.23) | 0.653 | 55 (60.44) | 0.93 (0.71–1.24) | 0.633 |
Combined CDH13 haplotypes | | | | | | | |
FH | 1,957 | 1,291 (65.97) | 1.00 | | 1,158 (59.17) | 1.00 | |
UH^d | 393 | 287 (73.03) | 1.38 (1.21–1.57) | <0.0001 | 260 (66.16) | 1.36 (1.19–1.56) | <0.0001 |
NUGH^e | | | | | | | |
0 | 142 | 84 (59.15) | 1.00 | | 77 (54.23) | 1.00 | |
1 | 738 | 492 (66.67) | 1.44 (1.13–1.82) | 0.003 | 439 (59.49) | 1.41 (1.10–1.81) | 0.007 |
2 | 295 | 213 (72.20) | 1.93 (1.49–2.50) | <0.0001 | 193 (65.42) | 1.82 (1.43–2.46) | <0.0001 |
Trend test | | | | <0.0001 | | | <0.0001 |
NUGH | | | | | | | |
0–1 | 880 | 576 (65.45) | 1.00 | | 516 (58.64) | 1.00 | |
2 | 295 | 213 (72.20) | 1.42 (1.21–1.68) | <0.0001 | 193 (65.42) | 1.41 (1.19–1.66) | <0.0001 |
Abbreviations: FH, favorable haplotypes; HR, hazard ratio; NUGH, number of unfavorable genotypes/haplotypes; UH, unfavorable haplotypes.
aTen missing data were excluded.
bAdjusted for age, sex, smoking status, histology, tumor stage, chemotherapy, surgery, and principal components.
cUnfavorable genotype was APOB rs1801701 CC.
dUnfavorable haplotypes were H5 (C-A-T-C) and H6 (T-T-T-T) for CDH13 rs35859010-rs1833970-rs254315-rs425904.
eNUGH were assigned according to the APOB unfavorable genotype model and CDH13 diplotype. 0, no APOB rs1801701 CC or CDH13 H5/H6; 1, one APOB rs1801701 CC or CDH13 H5/H6; 2, both APOB rs1801701 CC and CDH13 H5/H6.
For CDH13, considering the low LD of four SNPs in the same gene, their haplotypes were constructed and used in subsequent analyses. The frequencies of CDH13 haplotypes in the PLCO dataset were estimated first, yielding seven haplotypes named H1 to H7, of which H1 (C-T-T-T) was the most frequent (39.91%), followed by H2 (C-T-T-C, 18.34%), H3 (C-T-C-T, 11.11%), H4 (C-A-T-T, 10.04%), H5 (C-A-T-C, 9.83%), H6 (T-T-T-T, 6.89%), and H7 (T-T-C-T, 3.87%). Haplotypes H5 and H6 were associated with the worst NSCLC OS and DSS; when haplotypes H5 and H6 were combined, they remained significantly associated with worse NSCLC survival (HR = 1.38, 95% CI, 1.21–1.57, P < 0.0001 for OS and HR = 1.36, 95% CI, 1.19–1.56, P < 0.0001 for DSS; Table 3). As for the CDH13 diplotype, an increased number of unfavorable haplotypes on the two homologous chromosomes was associated with worse survival in the multivariate analysis in the PLCO dataset (all P < 0.05 for OS and DSS, Supplementary Table S3).
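The haplotype percentages quoted above can be recovered directly from the counts in Table 3. The short sketch below does that arithmetic; the dictionary and function names are illustrative only, and the total of 2,350 chromosomes is assumed to correspond to two chromosomes for each of the 1,175 genotyped participants.

```python
# Haplotype counts for CDH13 rs35859010-rs1833970-rs254315-rs425904,
# copied from the "Frequency" column of Table 3.
HAPLOTYPE_COUNTS = {
    "H1": 938,  # C-T-T-T
    "H2": 431,  # C-T-T-C
    "H3": 261,  # C-T-C-T
    "H4": 236,  # C-A-T-T
    "H5": 231,  # C-A-T-C
    "H6": 162,  # T-T-T-T
    "H7": 91,   # T-T-C-T
}


def haplotype_frequencies(counts):
    """Return each haplotype's share of all chromosomes, in percent (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


TOTAL_CHROMOSOMES = sum(HAPLOTYPE_COUNTS.values())
FREQUENCIES = haplotype_frequencies(HAPLOTYPE_COUNTS)

if __name__ == "__main__":
    print(TOTAL_CHROMOSOMES)
    for name, pct in FREQUENCIES.items():
        print(name, pct)
```

The same division reproduces the combined groups in Table 3: H5 and H6 together give 231 + 162 = 393 unfavorable haplotypes, and the remaining 1,957 are the favorable ones.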
To further evaluate the combined effect of these UGH on NSCLC OS and DSS in the PLCO dataset, we combined the significant unfavorable genotype (APOB rs1801701 CC) and diplotype (CDH13 H5/H6) into a genetic score, the NUGH. As shown in Table 3, an increased NUGH was associated with worse survival in the multivariate analysis in the PLCO dataset (both Ptrend < 0.0001 for OS and DSS). We further used K-M survival curves to visualize the associations of the NUGH with NSCLC OS and DSS (Fig. 1A–D), in which NSCLC survival declined as the NUGH increased from 0 to 1 to 2 (Log-rank P = 0.004 for OS and Log-rank P = 0.010 for DSS). Similar results were observed when the NUGH 2 group was compared with the NUGH 0–1 group (Log-rank P = 0.007 for OS and Log-rank P = 0.012 for DSS).
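As a rough illustration of how the NUGH could be assigned per participant, the sketch below follows footnote e of Table 3. It is not the authors' code: the function name and input format are hypothetical, and it assumes that carrying at least one H5 or H6 haplotype counts as having the unfavorable CDH13 diplotype, a threshold the text does not spell out.

```python
# Unfavorable CDH13 haplotypes per Table 3, footnote d.
UNFAVORABLE_HAPLOTYPES = {"H5", "H6"}


def nugh(apob_genotype, cdh13_diplotype):
    """Number of unfavorable genotypes/haplotypes (0, 1, or 2).

    apob_genotype: rs1801701 genotype as "CC", "CT", or "TT".
    cdh13_diplotype: pair of haplotype labels, e.g. ("H1", "H5").
    Assumption: one copy of H5 or H6 is enough to count as unfavorable.
    """
    has_unfavorable_genotype = apob_genotype == "CC"
    has_unfavorable_haplotype = any(
        h in UNFAVORABLE_HAPLOTYPES for h in cdh13_diplotype
    )
    return int(has_unfavorable_genotype) + int(has_unfavorable_haplotype)


if __name__ == "__main__":
    print(nugh("CT", ("H1", "H2")))  # neither risk factor
    print(nugh("CC", ("H1", "H2")))  # APOB CC only
    print(nugh("CC", ("H6", "H3")))  # both risk factors
```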
Combined APOB genotypes/CDH13 haplotypes and survival prediction model
As shown in Fig. 1, the ROC curves indicated an improved prediction performance with the addition of NUGH to the model with the covariates, compared with the model with the covariates only. We did not observe any difference in the prediction of 5-year survival by the UGH based on AUC and ROC curves. However, when we evaluated the 10-year NSCLC OS and DSS, the addition of UGH to the model with the covariates significantly increased the AUCs from 86.87% to 89.00% (P < 0.0001) and from 87.53% to 89.39% (P = 0.0004), respectively (Fig. 1F and H). We further plotted the ROC curves for stage subgroups (i.e., I, II, III, and IV; Supplementary Fig. S3), and there was no difference in AUCs by stage. Although the 10-year NSCLC OS may not be clinically valuable, it may also reflect the roles of genetic factors in response to changes over time in lifestyle and dietary intake of cholesterol, which could influence the health status, even the outcome of clinical treatment of patients with cancer (6, 7).
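To make the AUC comparison concrete, the sketch below computes an AUC as the proportion of case/non-case pairs in which the case has the higher predicted risk. The labels and risk scores are toy values invented for illustration, not study data, and the time-dependent ROC analysis and the significance test for the AUC difference used in the study are not reproduced here.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a concordant pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in controls
    )
    return concordant / (len(cases) * len(controls))


# Toy example (hypothetical values): 1 = died within the time horizon.
labels = [1, 1, 1, 0, 0, 0]
risk_covariates_only = [0.90, 0.60, 0.40, 0.50, 0.30, 0.20]
risk_with_nugh = [0.90, 0.70, 0.55, 0.50, 0.30, 0.20]

auc_covariates_only = auc(risk_covariates_only, labels)
auc_with_nugh = auc(risk_with_nugh, labels)

if __name__ == "__main__":
    print(auc_covariates_only, auc_with_nugh)
```

In this toy case the added predictor moves one case above a non-case it previously ranked below, which is the kind of reordering that raises an AUC.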
Stratified analysis of associations between APOB genotypes/CDH13 haplotypes and survival of NSCLC
As shown in Table 4, compared with those with 0–1 NUGH, individuals with 2 NUGH had a worse survival, consistently in each of the strata by all the covariates (all HR > 1.0 and P < 0.05), except for smoking status and chemotherapy. Meanwhile, the heterogeneity and interactions were also evaluated among these subgroups, and the results indicated that smoking status and chemotherapy had an interactive effect on the associations of NUGH with OS (Pinter = 0.026 for smoking status and Pinter = 0.0006 for chemotherapy) and DSS (Pinter = 0.050 for smoking status and Pinter = 0.002 for chemotherapy) of NSCLC.
. | 0–1 NUGHa . | 2 NUGHa . | Multivariate analysis for OSb . | . | Multivariate analysis for DSSb . | . | ||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
Characteristics . | All . | OS . | DSS . | All . | OS . | DSS . | HR (95% CI) . | P . | Pinterc . | HR (95% CI) . | P . | Pinterc . |
Age (years) | ||||||||||||
≤71 | 479 | 290 | 258 | 155 | 109 | 101 | 1.27 (1.01–1.59) | 0.039 | 1.26 (1.00–1.60) | 0.051 | ||
>71 | 401 | 286 | 258 | 140 | 104 | 92 | 1.52 (1.20–1.91) | 0.0004 | 0.408 | 1.50 (1.18–1.92) | 0.001 | 0.321 |
Sex | ||||||||||||
Male | 526 | 381 | 336 | 169 | 124 | 108 | 1.28 (1.04–1.57) | 0.019 | 1.25 (1.01–1.56) | 0.044 | ||
Female | 354 | 195 | 180 | 126 | 89 | 85 | 1.74 (1.34–2.25) | <0.0001 | 0.390 | 1.75 (1.34–2.29) | <0.0001 | 0.348 |
Smoking status | ||||||||||||
Never | 86 | 46 | 45 | 28 | 16 | 15 | 0.92 (0.50–1.72) | 0.802 | 0.88 (0.47–1.66) | 0.694 | ||
Current | 311 | 198 | 171 | 106 | 68 | 64 | 1.18 (0.88–1.57) | 0.266 | 1.27 (0.94–1.72) | 0.116 | ||
Former | 483 | 332 | 300 | 161 | 129 | 114 | 1.62 (1.32–1.99) | <0.0001 | 0.026 | 1.56 (1.25–1.94) | <0.0001 | 0.050 |
Histology | ||||||||||||
Adeno | 432 | 249 | 226 | 143 | 97 | 95 | 1.56 (1.23–1.99) | 0.0003 | 1.63 (1.27–2.08) | 0.0001 | ||
Squamous | 216 | 146 | 127 | 68 | 45 | 34 | 1.07 (0.76–1.51) | 0.693 | 0.86 (0.58–1.27) | 0.439 | ||
Others | 232 | 181 | 163 | 84 | 71 | 64 | 1.63 (1.22–2.16) | 0.0009 | 0.659 | 1.66 (1.23–2.25) | 0.001 | 0.472 |
Tumor stage | ||||||||||||
I–IIIA | 491 | 224 | 179 | 163 | 90 | 74 | 1.37 (1.06–1.76) | 0.015 | 1.36 (1.03–1.79) | 0.033 | ||
IIIB–IV | 389 | 352 | 337 | 132 | 123 | 119 | 1.39 (1.12–1.72) | 0.002 | 0.293 | 1.39 (1.12–1.73) | 0.003 | 0.357 |
Chemotherapy | ||||||||||||
No | 478 | 260 | 217 | 160 | 106 | 90 | 1.72 (1.36–2.17) | <0.0001 | 1.65 (1.28–2.12) | 0.0001 | ||
Yes | 402 | 316 | 299 | 135 | 107 | 103 | 1.14 (0.91–1.42) | 0.254 | 0.0006 | 1.15 (0.92–1.45) | 0.222 | 0.002 |
Radiotherapy | ||||||||||||
No | 567 | 320 | 275 | 194 | 129 | 116 | 1.43 (1.16–1.76) | 0.0007 | 1.47 (1.18–1.83) | 0.0007 | ||
Yes | 313 | 256 | 241 | 101 | 84 | 77 | 1.35 (1.05–1.75) | 0.019 | 0.536 | 1.29 (0.99–1.68) | 0.058 | 0.276 |
Surgery | ||||||||||||
No | 474 | 424 | 401 | 161 | 141 | 133 | 1.23 (1.01–1.49) | 0.038 | 1.21 (0.99–1.48) | 0.059 | ||
Yes | 406 | 152 | 115 | 134 | 72 | 60 | 1.78 (1.33–2.37) | <0.0001 | 0.147 | 1.89 (1.37–2.60) | <0.0001 | 0.092 |
aTen missing data were excluded.
bAdjusted for age, sex, stage, histology, smoking status, chemotherapy, radiotherapy, surgery, PC1, PC2, PC3, and PC4.
cPinter: P value for interaction analysis between characteristic and UGH.
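A reader who has only the published stratum-specific estimates can get a rough sense of heterogeneity by comparing two hazard ratios on the log scale. The sketch below recovers each standard error from the 95% CI and applies a two-sided z-test to the difference. This is a generic approximation from summary data, not the interaction analysis behind the Pinter values in Table 4, so its P value need not match them; the function names are illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def log_hr_and_se(hr, lower, upper):
    """Return ln(HR) and its standard error recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(hr), se


def heterogeneity_test(stratum_a, stratum_b):
    """Two-sided z-test for the difference of two independent log hazard ratios."""
    log_a, se_a = log_hr_and_se(*stratum_a)
    log_b, se_b = log_hr_and_se(*stratum_b)
    z = (log_a - log_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# OS hazard ratios (HR, lower, upper) for 2 vs. 0-1 NUGH, copied from Table 4.
NO_CHEMOTHERAPY = (1.72, 1.36, 2.17)
CHEMOTHERAPY = (1.14, 0.91, 1.42)

z_stat, p_value = heterogeneity_test(NO_CHEMOTHERAPY, CHEMOTHERAPY)

if __name__ == "__main__":
    print(round(z_stat, 2), round(p_value, 4))
```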
eQTL effects of APOB rs1801701 and CDH13 rs425904 on mRNA expressions of their genes
For APOB, the eQTL analysis of the data from the GTEx project revealed that the rs1801701 T allele was significantly correlated with a lower expression level of APOB in 383 normal lung tissue samples (P = 0.0269; Fig. 2A) but not in 369 whole blood samples. Because APOB expression data were not available in the 1,000 Genomes Project, the eQTL analysis could not be performed for APOB in that dataset.
For CDH13, the eQTL analysis of the data from the GTEx project revealed that the rs425904 C allele was significantly correlated with a lower expression level of CDH13 in 383 normal lung tissue samples (P = 0.0275; Fig. 2D) but not in 369 whole blood samples. In the RNA sequencing data of lymphoblastoid cell lines from the 1,000 Genomes Project, none of the four SNPs in CDH13 (i.e., rs35859010, rs1833970, rs254315, and rs425904) showed a significant correlation with the mRNA expression of the gene in any of the three genetic models (Supplementary Fig. S4A–S4D); nor were the haplotypes of CDH13 correlated with the mRNA expression levels (Supplementary Fig. S4E).
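The genetic models referred to here and in Table 3 differ only in how a genotype is turned into a numeric predictor. The sketch below shows the conventional coding, assuming the three models are additive, dominant, and recessive with respect to the minor allele; the function name is illustrative and the code is not taken from the study.

```python
def encode_genotype(genotype, minor_allele, model):
    """Code a biallelic genotype string (e.g. "CT") for a given genetic model.

    additive:  number of minor alleles (0, 1, or 2)
    dominant:  1 if at least one minor allele is carried, else 0
    recessive: 1 only if both alleles are the minor allele, else 0
    """
    if len(genotype) != 2:
        raise ValueError("genotype must have exactly two alleles")
    count = genotype.count(minor_allele)
    if model == "additive":
        return count
    if model == "dominant":
        return int(count >= 1)
    if model == "recessive":
        return int(count == 2)
    raise ValueError("unknown model: " + model)


if __name__ == "__main__":
    # APOB rs1801701 C>T, with T as the minor allele.
    for g in ("CC", "CT", "TT"):
        print(g, [encode_genotype(g, "T", m)
                  for m in ("additive", "dominant", "recessive")])
```

Under this coding, the dominant model contrasts CT+TT with CC and the recessive model contrasts TT with CC+CT, matching the row groupings in Table 3.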
Associations of mRNA levels of APOB and CDH13 with survival of NSCLC
As shown in Fig. 2B, in comparison with adjacent normal tissues, NSCLC (LUAD+LUSC) tumor tissues had a lower mRNA expression level of APOB (P < 0.001), which held for LUAD and LUSC samples analyzed separately (all P < 0.001). Meanwhile, as shown in Fig. 2C, a lower expression level of APOB was associated with a better survival of 1,926 patients with NSCLC (HR = 1.28, 95% CI, 1.08–1.39 and Log-rank P = 0.0016).
As shown in Fig. 2E, in comparison with adjacent normal tissues, NSCLC (LUAD+LUSC) tumor tissues also had a lower mRNA expression level of CDH13 (P < 0.001), which remained for LUAD samples (P < 0.001) but not for LUSC samples (P = 0.073). Meanwhile, as shown in Fig. 2F, a higher expression level of CDH13 was associated with a better survival of 1,926 patients with NSCLC (HR = 0.76, 95% CI, 0.65–0.88 and Log-rank P = 0.00032).
Bioinformatics functional prediction of the five independent SNPs
The functional prediction for the five independent SNPs identified above revealed no evidence of functional relevance based on SNPinfo, but there was some evidence of function based on RegulomeDB and HaploReg. Specifically, APOB rs1801701C>T and CDH13 rs254315T>C are likely to have some effects on enhancer histone marks, DNase hypersensitivity, and motifs, whereas CDH13 rs35859010C>T may have an effect on enhancer histone marks and motifs (Supplementary Table S4).
Discussion
In the present study, we performed a comprehensive analysis to investigate the associations between SNPs in genes involved in the cholesterol pathway and survival of NSCLC, utilizing two published GWAS datasets with a relatively long median follow-up time and strict quality control procedures. Five novel SNPs in two genes were identified, and the APOB genotypes and CDH13 haplotypes were found to be associated with NSCLC survival.
APOB is located on chromosome 2p24.1 and encodes apolipoprotein B (ApoB), which is the main apolipoprotein of chylomicrons and low-density lipoprotein (LDL; ref. 56). LDL is commonly known as "bad cholesterol" for both heart diseases and vascular diseases in general, whereas the functional roles of cholesterol as well as its carrier ApoB in cancer growth, especially in NSCLC, remain somewhat unclear. As mentioned above, increasing cellular cholesterol levels may promote proliferation and migration of cancer cells, likely leading to tumor progression (9–12). In one study that prospectively evaluated the associations between cancer mortality and circulating lipid biomarkers in 15,602 females, lipid levels were found to be associated with total cancer deaths, including those from lung cancer (57). In addition, because a whole-exome sequencing study provided strong evidence that APOB had an influence on LDL (58), we speculate that APOB may affect the progression of NSCLC through regulating the dietary intake and transport of cholesterol as well as the levels of downstream cholesterol metabolites. Therefore, reduction of the digestion and transport of cholesterol by APOB may be a potential adjuvant method for future NSCLC therapies, as a result of a better understanding of nutrient requirements, dietary intakes, and nutrient metabolism in the patients.
CDH13 is located on chromosome 16q23.3 and encodes a member of the cadherin superfamily, which is localized on the surface of the cell membrane and is anchored by a glycosylphosphatidylinositol moiety, rather than by a transmembrane domain (59), affecting cellular behavior mainly through its signaling properties (60, 61). Most published studies focusing on the methylation of CDH13 observed that the methylation level of CDH13 was higher in NSCLC tumor tissues than in adjacent normal tissues and suggested that CDH13 hypermethylation was associated with early recurrence and worse survival in NSCLC (62–64), although another study found that high CDH13 mRNA expression levels were correlated with a better OS in patients with adenocarcinoma (63). The mechanism underlying the observed association between downregulation of CDH13 and poor prognosis of NSCLC may be related to the loss of the ability of CDH13 to inhibit cell proliferation and invasiveness, an ability that may otherwise either increase susceptibility to apoptosis or reduce tumor growth (61).
On the other hand, CDH13 is a putative receptor for high-molecular-weight adiponectin, a cytokine produced by adipocytes, which has also attracted research interest in recent years (65). Several SNPs in CDH13 reportedly affect disease progression by influencing serum adiponectin levels (66, 67), and the serum adiponectin level was found to be associated with prognosis of lung cancer (68). It is likely that the interaction between CDH13 and adiponectin is a potential signaling mechanism that influences NSCLC progression related to the cholesterol pathway. In the present study, we identified four novel SNPs and confirmed four haplotypes in CDH13 to be associated with survival of NSCLC.
When we combined the unfavorable APOB genotype and CDH13 haplotypes, we observed a dose–effect relationship between NUGH and both NSCLC OS and DSS. In addition, there was a weak interaction between smoking status and NUGH on survival of patients with NSCLC. This may be related to nicotine in tobacco, which could affect serum cholesterol levels through APOB. In one study on the effect of nicotine on lipoprotein metabolism in rats, there was a significant increase in the levels of total cholesterol and ApoB in the sera of nicotine-treated rats (69). In the present study, we found that smoking status was strongly associated with a worse survival of NSCLC in the presence of the APOB rs1801701 CC genotype as well as CDH13 haplotypes H5 (C-A-T-C) and H6 (T-T-T-T) that affected the gene expression. Further stratified analysis found an interaction between chemotherapy and NUGH. Among the patients with 2 NUGH, those who did not receive chemotherapy tended to have a much worse NSCLC survival than those who received chemotherapy, but the difference was not statistically significant. This is likely due to selection bias, which needs to be verified in future studies.
There are several limitations in the present study. First, different distributions of demographic and clinical characteristics between the PLCO trial and HLCS study populations might have partially affected the validation; therefore, additional validation by other studies with more detailed prognostic factors, such as tumor stages and treatments, is needed to confirm these findings. Second, the sample sizes of the two genotyping datasets were not large enough to allow for adequate subgroup analysis, particularly not for the FDR test, a more desirable multiple-testing correction method. Third, because the present study used the genotyping data from populations of European ancestry, similar studies on other ethnic populations should be performed in the future. Fourth, we only analyzed associations between survival and genetic variants in the identified genes of one selected pathway; more survival-association studies are needed on genetic variants in other important biological pathway genes that are likely relevant to tumor phenotypes and treatment response in patients with NSCLC. Finally, additional mechanistic studies should be performed to explore possible molecular mechanisms underlying the observed associations between the SNPs and survival of patients with NSCLC.
In summary, the present study suggested a potential role of genetic variants of the cholesterol pathway genes APOB and CDH13 in NSCLC survival, possibly through the modulation of the synthesis, transport, and metabolism of cholesterol by these SNPs and genes, which may provide new scientific insights into NSCLC prognosis and clinical management, once replicated by other investigators.
Disclosure of Potential Conflicts of Interest
J. Clarke reports receiving commercial research grants from Bristol-Myers Squibb, Genentech, AstraZeneca, Spectrum, Adaptimmune, Medpacto, Bayer, AbbVie, Moderna, GlaxoSmithKline, and Array; reports receiving speakers bureau honoraria from Merck; and is a consultant/advisory board member for AstraZeneca, Guardant, Merck, Pfizer, and NGM Bio. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: W. Deng, D.C. Christiani, Q. Wei
Development of methodology: H. Liu, Q. Wei
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): H. Liu, D.C. Christiani, Q. Wei
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): W. Deng, H. Liu, J. Clarke, L. Su, L. Lin, D.C. Christiani, Q. Wei
Writing, review, and/or revision of the manuscript: W. Deng, H. Liu, S. Luo, J. Clarke, C. Glass, D.C. Christiani, Q. Wei
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): L. Su, D.C. Christiani, Q. Wei
Study supervision: Q. Wei
Acknowledgments
This work was supported by the NIH (R01NS091307 to S. Luo); the 2018 Guangxi One Thousand Young and Middle-Aged College and University Backbone Teachers Cultivation Program (GJR 2018-18 to W. Deng); the Duke Cancer Institute as part of the P30 Cancer Center Support Grant (NIH/NCI CA014236 to M. Kastan); and the V Foundation for Cancer Research (D2017-19 to Q. Wei). The Harvard Lung Cancer Susceptibility Study was supported by NIH grants R01CA092824 and R01CA074386 to D.C. Christiani.
The authors thank all the participants of the PLCO Cancer Screening Trial; Dongfang Tang, Sen Yang, and Yuchen Zhao for their technical and statistical assistance; and the NCI for providing access to the data collected by the PLCO trial. The statements contained herein are solely those of the authors and do not represent or imply concurrence or endorsement by the NCI. The authors also acknowledge the dbGaP repository for providing cancer genotyping datasets. The accession numbers for the datasets for lung cancer are phs000336.v1.p1 and phs000093.v2.p2. A list of contributing investigators and funding agencies for those studies can be found in the Supplementary Data.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.