Purpose: The main prognostic factor of lung cancer patient outcome is clinical stage, a parameter of tumor aggressiveness. Our study was conducted to test whether germ line variations modulate individual differences in clinical stage.

Experimental Design: We conducted a case-only genome-wide association study (GWAS) using a 620,901 single-nucleotide polymorphism (SNP) array in a first series of 600 lung adenocarcinoma (ADCA) patients and in a replication series of 317 lung ADCA patients.

Results: GWAS identified 54 putatively associated SNPs, 3 of which were confirmed in the replication series. Joint analysis of the two series pointed to 22 statistically associated (P < 0.01) genetic variants that together explained about 20% of the phenotypic variation in clinical staging (P < 2 × 10−16) and showed a statistically significant difference in overall survival (P = 8.0 × 10−8). The strongest statistical association was observed at rs10278557 (P = 1.1 × 10−5), located in the mesenchyme homeobox 2 (MEOX2) gene.

Conclusion: These data point to the role of germ line variations involving multiple loci in modulating clinical stage and, therefore, prognosis in lung ADCA patients. Clin Cancer Res; 17(8); 2410–6. ©2011 AACR.

Translational Relevance

Clinical stage is the main clinical parameter affecting prognosis in lung cancer patients. We tested whether a genetic component might be involved in modulating patient clinical stage. In 917 Italian lung adenocarcinoma patients, we detected 22 genetic variants that together explained a substantial share of the individual variation in clinical stage and that were also associated with overall survival. This demonstration that individual genetic constitution can affect clinical stage represents a step toward understanding the role of genetic factors in a clinically relevant parameter and opens the possibility of identifying new genetic targets for lung cancer therapy based on individual genetic constitution.

Lung cancer prognosis is affected by clinical stage, tumor histologic subtype, and the possibility of surgical resection. However, clinical stage is the parameter with the greatest impact on lung cancer patient survival; indeed, patients with early-stage cancer have an excellent prognosis, whereas prognosis is poorer with increasing stage (reviewed in ref. 1).

Variations in cancer aggressiveness and malignancy have been associated mainly with the accumulation of multiple somatic alterations and epigenetic changes in the neoplastic cells (2). While most studies aimed at identifying factors that affect cancer patient outcome/survival have focused on genetic alterations or transcriptional changes in the cancer tissue, a recent study suggests a role for germ line variations in the control of lung cancer patient survival (3). Those findings are supported by results in mouse models of lung tumorigenesis showing that specific genetic loci modulate tumor progression (4, 5).

In the present study, we conducted a genome-wide association study (GWAS) of lung adenocarcinoma (ADCA) patients to test the hypothesis that clinical stage can be genetically modulated. Comparison of DNA from patients at clinical stage I with DNA of patients at higher clinical stages revealed multiple and unlinked single-nucleotide polymorphisms (SNPs) that may genetically modulate clinical stage and that together are associated with patient overall survival.

Study population and DNAs

All patients were enrolled in the authors' institutes in Milan, Italy (Table 1). Study protocols were approved by all ethics committees, and each subject gave informed consent to the use of their biological samples for research purposes.

Table 1.

Characteristics of lung ADCA patients

Characteristics            All patients (N = 917)
                           GWAS       Replication study
No. of subjects            600        317
Age at diagnosis, y
  Median                   63         65
  Range                    20–81      34–84
Gender
  Male                     442        233
  Female                   156        84
Smoker status
  Never                    98         50
  Ever                     494        262
Clinical stage
  I                        300        160
  >I                       300        141
Follow-up at 60 mo
  No. patients alive       316        183
  Median duration, mo      59.1       60
  Range                    4.4–60     1.7–60

Genomic DNA was extracted from peripheral blood using the DNeasy Blood & Tissue Kit (QIAGEN) and quantified by fluorimetry using the Picogreen dsDNA Quantitation Kit (Invitrogen).

Patients in the first series were divided into 2 groups according to clinical stage (I or >I), and the same amounts of each DNA sample were used to create a DNA pool of 300 stage I patients and a DNA pool of 300 patients at higher clinical stages. Since the accuracy of analyses using a DNA pooling strategy depends heavily on the estimates of DNA concentration (6), we performed serial dilutions of each DNA sample.

SNP array and genotyping

A genome-wide DNA pooling strategy was used for initial screening to minimize interindividual sample variability and to reduce cost and time relative to the analysis of individual samples, while retaining the same study power and a robust estimation of allele frequency (7). Putative associations were confirmed by individual genotyping.

DNA pools were analyzed using the Human610-Quad BeadChip array (Illumina), which allows analysis of 620,901 genetic markers chosen from the International HapMap release 23. Twelve SNP array hybridizations were performed for each DNA pool to verify genotype reproducibility and to estimate technical variability. Data were obtained as intensity signals used to determine allele frequencies of each SNP and to reconstruct the number of chromosomes carrying each of the 2 possible alleles. Selected SNPs were genotyped in individual samples using MassARRAY (Sequenom) as described (8).

Statistical analysis

Differences between stage I and stage >I lung ADCA cases in allelic frequencies assessed in SNP array hybridization were analyzed using random variance t statistics and BRB ArrayTools (http://linus.nci.nih.gov/BRBArrayTools.html). Differences in chromosome counts between the 2 groups were tested by Fisher's exact test or by χ2 analysis when the normal approximation was appropriate. The correlation between SNP array and individual genotype allelic frequencies was expressed as a Pearson correlation coefficient. Association between clinical stage (I or >I) and confounding variables was analyzed using ANOVA or logistic analysis, whereas association between SNPs and clinical stage was analyzed using PLINK software (9), which included analysis of Hardy–Weinberg equilibrium (HWE), linkage disequilibrium (LD) between SNPs, and population-based association between prognostic factors and genotype/allelotype. Age at cancer diagnosis was downcoded to binary dummy variables (age in decades), which were used as covariates in logistic regression analyses. The average genetic risk score of clinical stage >I for individuals was calculated using the “score” procedure of PLINK, that is, the sum, across the 22 significantly (P < 0.01) associated SNPs in the joint analysis, of the number of minor alleles (0, 1, or 2) at each SNP multiplied by the log of the OR for that SNP. The reliability of the model was assessed by bootstrap resampling with replacement (10). Overall survival was assessed using Cox regression analysis and the “survival” package in R, with follow-up cutoff at 60 months to reduce bias due to mortality caused by non–cancer-related factors. All statistical tests were 2-sided.
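The genetic risk score defined above can be illustrated with a minimal Python sketch. It is not the PLINK implementation: the function name, the handling of missing genotypes, and the choice to average over genotyped SNPs (to reflect the word "average" in the text) are assumptions made here for illustration. The two ORs are taken from Table 4; the genotypes are invented.

```python
import math


def genetic_risk_score(minor_allele_counts, odds_ratios):
    """Average, over genotyped SNPs, of minor-allele count x log(OR).

    minor_allele_counts: dict SNP id -> 0, 1, 2, or None if the genotype is missing
    odds_ratios: dict SNP id -> per-allele OR for clinical stage >I
    The normalization by the number of genotyped SNPs is an assumption.
    """
    total = 0.0
    n_genotyped = 0
    for snp, odds_ratio in odds_ratios.items():
        count = minor_allele_counts.get(snp)
        if count is None:
            continue  # skip missing genotypes
        total += count * math.log(odds_ratio)
        n_genotyped += 1
    if n_genotyped == 0:
        raise ValueError("no genotyped SNPs for this individual")
    return total / n_genotyped


# Per-allele ORs from Table 4; the genotypes below are hypothetical.
table4_or = {"rs10278557": 0.6, "rs3823111": 2.6}
patient = {"rs10278557": 1, "rs3823111": 2}
score = genetic_risk_score(patient, table4_or)
```

Because minor alleles with OR < 1 contribute negative terms and those with OR > 1 contribute positive terms, a higher score corresponds to a higher estimated probability of stage >I disease.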

GWAS identifies multiple SNPs associated with clinical stage

Genome-wide SNP array analysis of 12 replicate hybridizations of each DNA pool (lung ADCA cases at clinical stage I and cases at higher clinical stages) allowed the screening of 620,901 SNPs. After discarding SNPs whose minor allele frequency was less than 0.10 in the pools and SNPs whose imbalance of allelic frequencies between the 2 DNA pools did not reach the genome-wide threshold of P ≤ 2.0 × 10−6, analysis of the reconstructed number of chromosomes of the remaining SNPs in the 2 groups using a 2 × 2 contingency table identified the 80 most strongly associated SNPs (P < 1.0 × 10−4). These were selected for individual genotyping to validate the SNP array findings in the DNA pool. Of the 80 SNPs, 2 mitochondrial SNPs, 1 SNP on chromosome Y, and 9 redundant SNPs in tight LD with close-by SNPs (<58-kb distance) in the same locus in the HapMap Caucasian (CEU) population were excluded; 1 SNP failed PCR or MassEXTEND primer design and 4 additional SNPs failed genotyping, reducing the number of markers to 63 SNPs.
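The 2 × 2 contingency analysis of reconstructed chromosome counts can be sketched as follows. This is an illustrative Pearson χ2 test without continuity correction, written with the Python standard library only; it is not the authors' code, and the allele counts in the example are hypothetical (each pool of 300 patients corresponds to 600 chromosomes).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for [[a, b], [c, d]].

    Rows: stage I pool, stage >I pool. Columns: minor allele, major allele.
    Returns (statistic, two-sided P value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square with 1 df: P(X > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: 150/600 minor alleles in stage I, 210/600 in stage >I.
stat, p_value = chi_square_2x2(150, 450, 210, 390)
```

For tables with small expected counts, Fisher's exact test (as mentioned in the statistical analysis) would be the appropriate choice instead of this approximation.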

A good correlation was observed in the minor allele frequencies obtained either by MassARRAY genotyping in single individuals or by SNP array analysis in DNA pools (r = 0.85, P < 2.2 × 10−16), demonstrating the reliability of the DNA pooling approach. None of the selected SNPs showed significant deviation from the HWE, except for rs565968 (P = 0.00076). No statistically significant LD was observed between any SNP pairs (r2 < 0.1).
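The comparison between pooled and individually genotyped allele frequencies rests on a Pearson correlation coefficient, which can be computed as in the sketch below. The function name and the frequency values are hypothetical and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical minor allele frequencies for four SNPs.
pool_maf = [0.12, 0.25, 0.33, 0.41]        # estimated from SNP array pools
individual_maf = [0.15, 0.22, 0.35, 0.44]  # from individual genotyping
r = pearson_r(pool_maf, individual_maf)
```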

Association analysis using a logistic model, adjusted for age at diagnosis and smoking status, indicated that 54 of 63 SNPs were statistically associated with clinical stage status (Table 2, P < 0.05). The strongest association was observed for SNP rs10278557 (P = 5.0 × 10−6), which maps in the mesenchyme homeobox 2 (MEOX2) gene on chromosome 7.

Table 2.

SNPs showing statistically significant association with clinical stage (P < 0.05) in GWAS

SNP (a)  Chromosome  Position, Mb  Gene  OR  95% CI  P (b)
rs951774 102.28  0.7 0.5–1 2.6 × 10−2 
rs10187901 140.3  0.6 0.4–0.9 2.8 × 10−2 
rs13390491 179.29 TTN 1.7 1.2–2.3 1.8 × 10−3 
rs10498217 227.72 COL4A4 0.6 0.4–0.9 6.9 × 10−3 
rs16843438 241.79 TMEM16G 0.6 0.4–0.8 1.6 × 10−3 
rs2574711 11.64 VGLL4 0.5 0.3–0.8 3.5 × 10−3 
rs7694589 29.95  0.7 0.5–0.9 9.4 × 10−3 
rs11722134 73.78  1.7 1.1–2.8 2.5 × 10−2 
rs1994854 77.99  0.5 0.4–0.7 2.1 × 10−4 
rs423997 86.53  1.4 1.1–1.8 1.2 × 10−2 
rs4505911 68.28  0.5 0.3–0.8 4.1 × 10−3 
rs10900886 105.28  0.5 0.4–0.8 1.0 × 10−3 
rs3823111 53.62 KLHL31 2.4 1.3–4.5 5.0 × 10−3 
rs806435 88.83 SPACA1 1.3–2.9 4.8 × 10−4 
rs458523 95.12  1.6 1.2–2 5.8 × 10−4 
rs565968 125.42 RNF217 0.8 0.6–1 4.3 × 10−2 
rs10278557 15.67 MEOX2 0.5 0.4–0.7 5.0 × 10−6 
rs17819684 82.57 PCLO 1.5 1.2–1.9 2.4 × 10−3 
rs2299297 104.53 MLL5 1.6 1.2–2.2 4.2 × 10−4 
rs2648 128.6 TSPAN33 2.3 1.4–3.7 8.0 × 10−4 
rs17125699 17.75  0.5 0.3–1 4.4 × 10−2 
rs972519 4.48 SLC1A1 0.6 0.4–0.9 7.3 × 10−3 
rs824249 28.76  0.5 0.4–0.8 3.1 × 10−3 
rs10987191 128  0.5 0.3–0.8 2.7 × 10−3 
rs11259181 10 14.64 FAM107B 0.6 0.4–0.9 2.2 × 10−2 
rs10832757 11 17.29 NUCB2 0.5 0.4–0.7 4.6 × 10−5 
rs7107350 11 21.17 NELL1 1.3–3 7.2 × 10−4 
rs3808996 11 124.46 SLC37A2 0.6 0.4–0.9 1.3 × 10−2 
rs3825305 12 61.32 PPM1H 0.5 0.3–0.8 3.3 × 10−3 
rs9596742 13 52.46  0.4 0.3–0.7 1.3 × 10−4 
rs2391875 13 110.29  0.5 0.4–0.7 2.0 × 10−4 
rs8020076 14 27.53  1.7 1.3–2.2 8.2 × 10−5 
rs718998 14 37.44 SLC25A21 1.4 1.1–1.9 2.0 × 10−2 
rs1255641 14 63.05 PPP2R5E 1.8 1.3–2.5 1.1 × 10−3 
rs10520058 15 36.34 SPRED1 0.4 0.2–0.6 5.4 × 10−4 
rs2937940 15 84.17  1.9 1.4–2.6 1.9 × 10−5 
rs9927531 16 26.44  1.8 1.2–2.6 1.6 × 10−3 
rs1183259 16 58.97  0.5 0.3–0.8 1.8 × 10−3 
rs4788587 16 70.56 PKD1L3 0.7 0.5–0.9 1.2 × 10−2 
rs10514440 16 77.24 WWOX 3.5 1.7–7 5.4 × 10−4 
rs1860444 17 46.22  2.9 1.6–5.4 4.3 × 10−4 
rs16950191 17 47.07 CA10 1.4 1–1.8 3.2 × 10−2 
rs12610723 19 3.73 MATK 2.4 1.4–4.3 2.8 × 10−3 
rs2287700 19 14.44 PKN1 1.2–3.2 7.9 × 10−3 
rs4805442 19 34.78  0.5 0.3–0.7 4.3 × 10−4 
rs6030680 20 41.23 PTPRT 1.6 1.2–2.1 1.1 × 10−3 
rs4553110 6.53  0.6 0.4–0.9 5.2 × 10−3 
rs12687904 6.81  0.5 0.3–0.9 1.4 × 10−2 
rs4830793 12.62 FRMPD4 2.3 1.3–4 4.0 × 10−3 
rs7887846 22.58  0.6 0.4–0.9 1.3 × 10−2 
rs5972356 31.19 DMD 0.5 0.3–0.8 7.4 × 10−3 
rs5927730 31.27 DMD 0.7 0.5–0.9 1.8 × 10−2 
rs404481 102.39  0.7 0.5–0.9 7.8 × 10−3 
rs2207031 127.84  2.3 1.5–3.6 1.9 × 10−4 

(a) SNPs sorted by chromosome and position.

(b) Values of P obtained by logistic regression procedure of PLINK toolset, on the basis of allelic test for association, that is, rare allele versus common allele, adjusted by age at tumor diagnosis (in decades) and smoking status.

Among the 63 SNPs tested in the replication series of 317 lung ADCA samples (Table 1), 3 SNPs showed replication in the independent smaller ADCA series (Table 3). Joint analysis of the GWA and replication series, increasing the statistical power of association analyses (11) and bringing the total sample size to 917 lung ADCA patients, revealed no statistically significant deviation (P < 0.01) from the HWE for any of the 63 SNPs and identified 22 SNPs significantly associated with clinical stage at the statistical threshold of P < 0.01 by logistic analysis adjusted for age at diagnosis and smoking status (Table 4). The strongest association was again observed at SNP rs10278557 (P = 1.1 × 10−5) mapping in the MEOX2 gene (Table 4).

Table 3.

SNPs associated with lung ADCA clinical stage in the GWAS and replication studies

SNP (a)  Chromosome  Position, Mb  Gene  Rare allele  GWAS: OR  95% CI  P (b)  Replication: OR  95% CI  P (b)
rs3823111 53.62 KLHL31 2.4 1.3–4.5 5.0 × 10−3 2.9 1.2–6.8 1.8 × 10−2 
rs9927531 16 26.44  1.8 1.2–2.6 1.6 × 10−3 1.7 1.0–2.9 4.8 × 10−2 
rs16950191 17 47.07 CA10 1.4 1.0–1.8 3.2 × 10−2 1.6 1.1–2.3 2.2 × 10−2 

(a) SNPs sorted by chromosome and position.

(b) Logistic regression procedure in PLINK toolset, based on allelic test for association, adjusted for age at cancer diagnosis and smoking status. Selection of SNPs based on P < 0.05 threshold in the replication study.

Table 4.

SNPs associated with lung ADCA clinical stage in the joint analysis of the GWAS and replication studies and used to build up the polygenic model with additive effects of SNP rare alleles on risk of clinical stage >I

SNP (a)  Chromosome  Position, Mb  Gene  Rare allele  OR  95% CI  P (b)
rs951774 102.28  0.7 0.5–0.9 6.9 × 10−3 
rs13390491 179.29 TTN 1.4 1.1–1.8 6.3 × 10−3 
rs10498217 227.72 COL4A4 0.7 0.5–0.9 9.4 × 10−3 
rs1994854 77.99  0.7 0.5–0.9 2.8 × 10−3 
rs4505911 68.28  0.5 0.3–0.8 4.0 × 10−3 
rs10900886 105.28  0.7 0.5–0.9 7.7 × 10−3 
rs3823111 53.62 KLHL31 2.6 1.6–4.3 2.1 × 10−4 
rs806435 88.83 SPACA1 1.8 1.3–2.4 2.5 × 10−4 
rs10278557 15.67 MEOX2 0.6 0.4–0.7 1.1 × 10−5 
rs2299297 104.53 MLL5 1.6 1.3–2.0 7.5 × 10−5 
rs824249 28.76  0.6 0.5–0.9 9.5 × 10−3 
rs10987191 127.99  0.5 0.4–0.8 2.5 × 10−3 
rs10832757 11 17.29 NUCB2 0.6 0.5–0.8 5.8 × 10−5 
rs9596742 13 52.46  0.6 0.5–0.9 7.7 × 10−3 
rs2391875 13 110.29  0.6 0.4–0.8 8.8 × 10−5 
rs8020076 14 27.53  1.3 1.1–1.6 9.8 × 10−3 
rs10520058 15 36.34 SPRED1 0.5 0.3–0.8 5.8 × 10−3 
rs9927531 16 26.44  1.7 1.3–2.3 2.5 × 10−4 
rs10514440 16 77.24 WWOX 2.4 1.4–4.1 1.1 × 10−3 
rs16950191 17 47.07 CA10 1.5 1.2–1.8 1.6 × 10−3 
rs7887846 22.58  0.7 0.5–0.9 7.8 × 10−3 
rs2207031 127.84  1.8 1.2–2.5 1.4 × 10−3 

(a) SNPs sorted by chromosome and position.

(b) Logistic regression procedure in PLINK toolset, based on allelic test for association with clinical stage, adjusted for age at cancer diagnosis and smoking status. SNPs selected on the basis of P < 0.01 threshold for association.

Differences in lung ADCA outcome are associated with patients' genetic profile

We used a polygenic model (8) to evaluate additive effects of these 22 SNPs in modulating individual clinical stage, after removing 81 of 917 patients with more than 30% missing genotypes from the data set. For each patient, the allele-based OR (Table 4) was attributed to the carrier status of an allele of each SNP associated with clinical stage status, on the basis of its association with the probability of carrying a stage >I lung ADCA.

The average genetic estimator was −7.9 × 10−3 ± 5.4 × 10−4 units (mean ± standard error) for patients with clinical stage I (n = 418) and 3.2 × 10−3 ± 5.3 × 10−4 for patients with higher clinical stage (n = 403; P < 2.2 × 10−16, ANOVA). The 22 SNPs explained 20.7% of the phenotypic variance in clinical staging. The genetic estimator was also statistically associated with clinical stage in the second ADCA series alone (P = 0.0006, ANOVA), although with a smaller effect than in the first series and in the whole series. To verify the robustness of the model in our series, we carried out an empirical replication using bootstrap samples (B = 2,000 resamplings) and found that the difference in the genetic estimator between stage I and stage >I patients was −11.1 × 10−3 units (95% CI = −12.7 × 10−3 to −9.7 × 10−3, Pdiff = 0.0005).
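A bootstrap check of this kind can be sketched as below. The exact resampling scheme used in the study is not described in detail, so this example makes two assumptions: patients are resampled with replacement within each stage group, and the 95% CI is a simple percentile interval. The function name and the score values are hypothetical.

```python
import random
import statistics


def bootstrap_mean_difference(group_a, group_b, n_resamples=2000, seed=0):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b).

    Each group is resampled with replacement, keeping its original size.
    Returns (observed difference, (lower, upper)) for a 95% interval.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    observed = statistics.fmean(group_a) - statistics.fmean(group_b)
    return observed, (lower, upper)


# Hypothetical genetic estimator values for a handful of patients.
stage_one = [-0.010, -0.008, -0.006, -0.009, -0.007]
stage_higher = [0.003, 0.004, 0.002, 0.005, 0.001]
observed, (ci_low, ci_high) = bootstrap_mean_difference(stage_one, stage_higher)
```

An interval that excludes zero, as reported above for the real data, indicates that the difference between the two stage groups is stable under resampling of the same series; it does not replace replication in an independent population.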

Subjects were divided into quartiles on the basis of the genetic risk score. Application of the generalized linear model to the quartile groups, with the lowest quartile as the reference, revealed a significant association between the genetic estimator and increased probability of developing a more aggressive lung ADCA (OR = 2.9, 95% CI = 1.9–4.6, P = 2.7 × 10−6 for the second quartile; OR = 6.8, 95% CI = 4.4–10.7, P < 2 × 10−16 for the third quartile; and OR = 14.5, 95% CI = 9.1–23.6, P < 2 × 10−16 for the fourth quartile; Fig. 1A).
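The quartile comparison can be illustrated with an unadjusted odds ratio and a Wald 95% CI computed from a 2 × 2 table. For a single categorical predictor, this matches what an unadjusted logistic model would give, but it is only a sketch: the function name and the patient counts are hypothetical, and the generalized linear model used in the study is not reproduced here.

```python
import math


def odds_ratio_wald(stage_high_q, stage_one_q, stage_high_ref, stage_one_ref, z=1.96):
    """Odds ratio of stage >I for one quartile versus the reference quartile.

    Arguments are patient counts: stage >I and stage I in the quartile of
    interest, then stage >I and stage I in the reference (lowest) quartile.
    Returns (OR, (lower, upper)) with a Wald interval on the log scale.
    """
    counts = (stage_high_q, stage_one_q, stage_high_ref, stage_one_ref)
    if min(counts) <= 0:
        raise ValueError("all four counts must be positive")
    odds_ratio = (stage_high_q * stage_one_ref) / (stage_one_q * stage_high_ref)
    se_log_or = math.sqrt(sum(1 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, (lower, upper)


# Hypothetical counts: 60 stage >I and 40 stage I patients in a higher quartile,
# 30 stage >I and 70 stage I patients in the lowest quartile.
quartile_or, (ci_low, ci_high) = odds_ratio_wald(60, 40, 30, 70)
```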

Figure 1.

A, genetic risk of developing a more aggressive lung ADCA (clinical stage >I) in patients grouped according to the quartiles of genetic risk score, with the lowest quartile as the reference group. Bars denote ORs. Vertical lines represent 95% CIs. B, Kaplan–Meier survival curves in lung ADCA patients grouped as in (A); follow-up is shown truncated at 60 months (P = 8.0 × 10−8, log-rank test).


Finally, Kaplan–Meier curves showed a statistically significant association between the genetic risk score, in quartiles, and overall survival (P = 8.0 × 10−8, log-rank test; Fig. 1B). Use of multivariate Cox proportional hazard models for survival (adjusted for age and smoking habit) to evaluate the association between the genetic risk score and overall survival showed that the risk of death for quartiles 3 and 4 (HR = 1.5, 95% CI = 1.1–2.0, P = 0.016; HR = 2.3, 95% CI = 1.7–3.0, P = 8.7 × 10−8, respectively) was significantly higher than that of the lowest quartile.
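The survival curves rely on the Kaplan–Meier estimator with follow-up truncated at 60 months. The sketch below shows that estimator in plain Python under the assumption that patients still under observation beyond the cutoff are treated as censored at 60 months; the study itself used the "survival" package in R, and the follow-up data in the example are invented.

```python
def kaplan_meier(times, events, cutoff=60.0):
    """Kaplan-Meier survival estimate with administrative censoring at `cutoff`.

    times: follow-up in months; events: True for death, False for censoring.
    Returns a list of (time, survival probability) at each death time.
    """
    data = []
    for t, died in zip(times, events):
        if t > cutoff:
            data.append((cutoff, False))  # truncate follow-up at the cutoff
        else:
            data.append((t, bool(died)))
    data.sort(key=lambda item: item[0])

    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up: deaths at 10 and 30 months, one patient censored at
# 20 months, and one death at 70 months that falls beyond the 60-month cutoff.
curve = kaplan_meier([10, 20, 30, 70], [True, False, True, True])
```

Curves computed this way for each quartile of the genetic risk score would then be compared with a log-rank test, which is not implemented in this sketch.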

In recent years, several GWAS have focused on genetic risk for lung cancer, but none has examined the possible genetic modulation of the most powerful prognostic factor in lung cancer patients, that is, clinical stage (1). Our present study in a patient series of the same lung cancer histotype and of the same ethnicity identified 54 SNPs putatively associated (P < 0.05) with clinical stage (Table 2) in a relatively large first series of 600 patients, and 3 SNPs that maintained their statistical association with clinical stage in the smaller replication series (Table 3). Joint analysis of the GWAS and replication series, carried out to increase the statistical power of the study and to obtain an overall unbiased estimate (11), identified 22 SNPs that showed statistical association with clinical stage at a nominal threshold of P < 0.01 (Table 4). Analysis of the additive effects of risk associated with the minor alleles of these 22 SNPs using a polygenic model (8) revealed a statistically significant association between the genetic estimator and an increased risk of higher clinical stage (Fig. 1A) and a higher risk of death (Fig. 1B), suggesting complex genetic control of clinical prognosis in lung ADCA patients.

Of the 22 candidate SNPs, 6 mapped within genes. The most significantly associated SNP in the joint analysis (rs10278557, P = 1.1 × 10−5, Table 4) maps to chromosome 7 in the intronic region of the MEOX2 gene, also known as growth arrest–specific homeobox (GAX) gene, a member of a subfamily of nonclustered, divergent, antennapedia-like homeobox-containing genes. MEOX2, a key regulator of vascular cell function, has been proposed as a candidate tumor suppressor gene in Wilms tumor and shows upregulation and aberrant methylation in lung cancer (12, 13).

The candidate genes myeloid/lymphoid or mixed-lineage leukemia 5 (trithorax homolog, Drosophila; MLL5, rs2299297), sprouty related, EVH1 domain containing 1 (SPRED1, rs10520058), and WW domain containing oxidoreductase (WWOX, rs10514440; Table 4) are also of interest. Indeed, the MLL5 gene belongs to a gene family that activates and regulates homeobox (HOX) genes that are important in oncogenesis and tumor suppression (14, 15). MLL5, a key regulator of normal hematopoiesis, is located on chromosome 7q22, a region that is frequently deleted in myeloid leukemias (16). SPRED1 negatively regulates the Ras-ERK (extracellular signal–regulated kinases) signaling pathway, cell motility, and metastasis, and its germ line loss-of-function mutations cause a neurofibromatosis 1–like syndrome (17, 18). WWOX acts as a tumor suppressor gene in different tumor types and plays a regulatory role in protein degradation, transcription, and RNA splicing (reviewed in ref. 19).

At present, it is unknown whether the observed associations between SNPs and lung cancer clinical stage underlie effects of nonsynonymous or regulatory variants in LD with these SNPs.

Empirical replication using bootstrap samples from the original data, rather than replication in independent samples, has been proposed in association studies because bootstrap samples likely share the same population structure as the original data, whereas an independent series may be characterized by a different population structure and, thus, lead to false-negative results on analysis (20). Our empirical replication using bootstrap samples confirmed the statistically significant difference between stage I and stage >I patients in their genetic estimator on the basis of 22 SNPs. However, the lack of replication in a truly independent population remains a potential weakness of the present study that may be overcome in future large studies carried out by international consortia.

Together, our results indicate for the first time that clinical staging of lung ADCA can be under genetic control, with each patient displaying a tendency toward a low or high clinical stage depending on individual genetic variations. The significant association of the 22 SNPs with lung ADCA clinical stage and survival raises the possibility that the functional products of the genes linked to these SNPs act through novel biochemical pathways associated with patient outcome, and that identification of these pathways might provide gene targets for therapies to counter disease progression. Further clarification of the role of genetic mechanisms in lung ADCA patient outcomes may hold the promise of improved therapy and disease outcome.

The funders had no role in the design and conduct of the study, in the collection, analysis, and interpretation of the data, and in the preparation, review, or approval of the manuscript.

The authors thank Harvard-Partners Center for Genetics and Genomics Genotyping Facility, Cambridge, MA, for custom genotyping by MassARRAY.

This work was funded in part by grants from Associazione and Fondazione Italiana Ricerca Cancro (AIRC and FIRC). E. Frullanti and F. Colombo were funded by the AIRC “Antonietta Andreoli” and Associazione Marta Nurizzo fellowships, respectively.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Brundage MD, Davies D, Mackillop WJ. Prognostic factors in non-small cell lung cancer: a decade of progress. Chest 2002;122:1037–57.
2. Sidransky D. Emerging molecular markers of cancer. Nat Rev Cancer 2002;2:210–9.
3. Huang YT, Heist RS, Chirieac LR, Lin X, Skaug V, Zienolddiny S, et al. Genome-wide analysis of survival in early-stage non-small-cell lung cancer. J Clin Oncol 2009;27:2660–7.
4. Manenti G, Gariboldi M, Fiorino A, Zanesi N, Pierotti MA, Dragani TA. Genetic mapping of lung cancer modifier loci specifically affecting tumor initiation and progression. Cancer Res 1997;57:4164–6.
5. Hunter KW, Broman KW, Voyer TL, Lukes L, Cozma D, Debies MT, et al. Predisposition to efficient mammary tumor metastatic progression is linked to the breast cancer metastasis suppressor gene Brms1. Cancer Res 2001;61:8866–72.
6. Sham P, Bader JS, Craig I, O'Donovan M, Owen M. DNA pooling: a tool for large-scale association studies. Nat Rev Genet 2002;3:862–71.
7. Norton N, Williams NM, Williams HJ, Spurlock G, Kirov G, Morris DW, et al. Universal, robust, highly quantitative SNP allele frequency measurement in DNA pools. Hum Genet 2002;110:471–8.
8. Galvan A, Falvella FS, Frullanti E, Spinola M, Incarbone M, Nosotti M, et al. Genome-wide association study in discordant sibships identifies multiple inherited susceptibility alleles linked to lung cancer. Carcinogenesis 2010;31:462–5.
9. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–75.
10. Efron B. Bootstrap methods: another look at the jackknife. Ann Statist 1979;7:1–26.
11. Bowden J, Dudbridge F. Unbiased estimation of odds ratios: combining genomewide association scans with replication studies. Genet Epidemiol 2009;33:406–18.
12. Ohshima J, Haruta M, Arai Y, Kasai F, Fujiwara Y, Ariga T, et al. Two candidate tumor suppressor genes, MEOX2 and SOSTDC1, identified in a 7p21 homozygous deletion region in a Wilms tumor. Genes Chromosomes Cancer 2009;48:1037–50.
13. Cortese R, Hartmann O, Berlin K, Eckhardt F. Correlative gene expression and DNA methylation profiling in lung development nominate new biomarkers in lung cancer. Int J Biochem Cell Biol 2008;40:1494–508.
14. Ansari KI, Mandal SS. Mixed lineage leukemia: roles in gene expression, hormone signaling and mRNA processing. FEBS J 2010;277:1790–804.
15. Shah N, Sukumar S. The hox genes and their roles in oncogenesis. Nat Rev Cancer 2010;10:361–71.
16. Heuser M, Yap DB, Leung M, de Algara TR, Tafech A, McKinney S, et al. Loss of MLL5 results in pleiotropic hematopoietic defects, reduced neutrophil immune function, and extreme sensitivity to DNA demethylation. Blood 2009;113:1432–43.
17. Bundschu K, Walter U, Schuh K. Getting a first clue about SPRED functions. Bioessays 2007;29:897–907.
18. Brems H, Chmara M, Sahbatou M, Denayer E, Taniguchi K, Kato R, et al. Germline loss-of-function mutations in SPRED1 cause a neurofibromatosis 1-like phenotype. Nat Genet 2007;39:1120–6.
19. Del Mare S, Salah Z, Aqeilan RI. WWOX: its genomics, partners, and functions. J Cell Biochem 2009;108:737–45.
20. Cho S, Kim K, Kim YJ, Lee JK, Cho YS, Lee JY, et al. Joint identification of multiple genetic variants via elastic-net variable selection in a genome-wide association analysis. Ann Hum Genet 2010;74:416–28.