Lung cancer in lifetime never smokers is distinct from that in smokers, but the role of separate or overlapping carcinogenic pathways has not been explored. We therefore evaluated a comprehensive panel of 11,737 single-nucleotide polymorphisms (SNP) in inflammatory-pathway genes in a discovery phase (451 lung cancer cases, 508 controls from Texas). SNPs that were significant were evaluated in a second external population (303 cases, 311 controls from the Mayo Clinic). An intronic SNP in the ACVR1B gene, rs12809597, was replicated with significance and restricted to those reporting adult exposure to environmental tobacco smoke. Another promising candidate was an SNP in NR4A1, although the replication OR did not achieve statistical significance. ACVR1B belongs to the TGFR-β superfamily, contributing to resolution of inflammation and initiation of airway remodeling. An inflammatory microenvironment (second-hand smoking, asthma, or hay fever) is necessary for risk from these gene variants to be expressed. These findings require further replication, followed by targeted resequencing, and functional validation.

Significance: Beyond passive smoking and family history of lung cancer, little is known about the etiology of lung cancer in lifetime never smokers, which accounts for about 15% of all lung cancers in the United States. Our two-stage candidate pathway approach examined a targeted panel of inflammation genes and identified novel structural variants that appear to contribute to risk in patients who report prior exposure to sidestream smoking. Cancer Discovery; 1(5): 420–9. ©2011 AACR.

This article is highlighted in the In This Issue feature, p. 367

From etiologic, molecular genetic, and biologic viewpoints, it is now fairly well accepted that lung cancer occurring in lifetime never smokers is distinct from smoking-associated lung cancer (1). It is noteworthy that the top hit from all published ever-smoking lung cancer genome-wide association studies (GWAS), the chromosome 15q25 locus encoding nicotinic acetylcholine receptor (NAChR) subunits and a proteasome subunit, has not been implicated in lung cancer risk in never smokers (2). Nevertheless, it is likely that the 2 disease entities do share some molecular features suggesting separate but overlapping pathways to lung carcinogenesis (1). Increasing evidence suggests that pathway-based approaches to identify the genetic contribution to cancer susceptibility may provide complementary information to conventional single-marker analyses.

Of intense interest in lung carcinogenesis is the inflammation pathway, because an abnormally prolonged or intense inflammatory response could create a microenvironment that promotes lung cancer development. Although tobacco-induced lung cancer is characterized by increased tissue oxidative stress and an abundant and deregulated inflammatory microenvironment (3), a similar role for inflammation in lung cancer in never smokers has not been studied in depth. We therefore evaluated a comprehensive panel of germline genetic variants in inflammatory pathway genes in risk of lung cancer in lifetime never smokers in a discovery phase of cases and controls selected from an ongoing multiracial/ethnic lung cancer case–control study that has recruited study participants from The University of Texas MD Anderson Cancer Center from 1995 onward (4). We performed a replication analysis in an independent sample of never-smoking lung cancer cases and controls from the Mayo Clinic (5). Lung cancer in never smokers accounts for 15% of all lung cancers in the United States, yet beyond passive smoking and family history of lung cancer, few other well-established genetic or nongenetic clues to its etiology are known.

For the discovery phase, we recruited 451 non–small cell lung cancer cases and 508 controls, all lifetime never smokers (Table 1). Of these subjects, about two thirds of the cases and controls (650) were included in our previously published risk model for never smokers (6). Adenocarcinoma was diagnosed in 76% of the cases. On average, the controls were 5 years younger than the cases. More than 60% of both the cases and controls were women. Self-reported environmental tobacco smoke (ETS) exposure was 83% among the discovery cases and 75% among the controls. Neither asthma nor dust exposure was significantly associated with lung cancer risk. However, a history of hay fever (OR = 0.70; P = 0.02), passive smoking exposure (OR = 1.59; P = 0.01), family history of 2 or more first-degree relatives with any cancer (OR = 2.24; P < 0.001; data not shown), and 2 or more first-degree relatives with lung cancer (OR = 3.47; P = 0.04) all achieved statistical significance.

Table 1.

Distribution of selected variables in discovery and replication populations

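| Variable | Discovery cases (n = 451) | Discovery controls (n = 508) | Discovery OR (95% CI)ᵃ | Discovery P value | Replication cases (n = 303) | Replication controls (n = 311) | Replication OR (95% CI) | Replication P value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age, mean (SD) | 61.6 (13.0) | 56.6 (13.1) | 1.03 (1.02–1.04) | <0.0001 | 62.0 (12.9) | 62.2 (13.1) | 1.03 (1.02–1.04) | <0.0001 |
| Sex: male, n (%) | 147 (32.6) | 190 (37.4) | | | 83 (27.4) | 86 (27.7) | | |
| Sex: female, n (%) | 304 (67.4) | 318 (62.6) | 0.81 (0.62–1.06) | 0.1198 | 220 (72.6) | 225 (72.4) | 0.99 (0.7–1.4) | 0.9425 |
| Asthma: no | 369 (87.0) | 442 (87.3) | | | 263 (86.8) | 278 (89.4) | | |
| Asthma: yes | 55 (13.0) | 64 (12.4) | 1.03 (0.70–1.51) | 0.8830 | 40 (13.2) | 33 (10.6) | 1.28 (0.8–2.1) | 0.3223 |
| Hay fever: no | 332 (78.3) | 363 (71.7) | | | N/A | | | |
| Hay fever: yes | 92 (21.7) | 143 (28.3) | 0.70 (0.52–0.95) | 0.022 | | | | |
| Dust: no | 345 (81.4) | 429 (84.8) | | | N/A | | | |
| Dust: yes | 79 (18.6) | 77 (15.2) | 1.28 (0.90–1.80) | 0.1657 | | | | |
| ETS: no | 57 (17.0) | 115 (24.6) | | | 99 (33.3) | 135 (43.8) | | |
| ETS: yes | 278 (83.0) | 353 (75.4) | 1.59 (1.12–2.26) | 0.0104 | 198 (66.7) | 173 (56.2) | 1.56 (1.12–2.17) | 0.0082 |
| Family history of lung cancer: 0 relatives | 348 (84.5) | 439 (87.1) | | | 213 (70.3) | 258 (83.0) | | |
| Family history of lung cancer: 1 relative | 53 (12.9) | 61 (12.1) | 1.10 (0.74–1.63) | 0.6482 | 90 (29.7) | 53 (17.0) | 2.06 (1.40–3.02) | 0.0002 |
| Family history of lung cancer: ≥2 relatives | 11 (2.7) | 4 (0.8) | 3.47 (1.09–10.98) | 0.0346 | | | | |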

The replication set (Table 1) included 303 cases and 311 controls, all lifetime never smokers and well matched on age and gender. ETS exposure was reported by 67% of the cases and 56% of the controls. Passive smoking (OR = 1.56; P = 0.008) and family history of lung cancer (OR = 2.06; P = 0.0002) were significantly associated with risk. Asthma was not a risk factor in this population. Ten cases but no controls reported a history of emphysema (OR = 10.6; P = 0.02).
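As a rough check on the ETS estimates, the odds ratios in Table 1 can be approximated from the reported counts alone. The sketch below assumes the tabulated values are crude (unadjusted) estimates with Wald confidence intervals, which this section does not state; the function name is illustrative.

```python
import math

def odds_ratio_wald(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.959964):
    """Crude odds ratio and Wald 95% CI from the four cells of a 2x2 table."""
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se = math.sqrt(1 / exp_cases + 1 / unexp_cases + 1 / exp_controls + 1 / unexp_controls)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)

# ETS counts from Table 1: exposed/unexposed cases, then exposed/unexposed controls.
discovery = odds_ratio_wald(278, 57, 353, 115)
replication = odds_ratio_wald(198, 99, 173, 135)

for label, (or_, low, high) in (("discovery", discovery), ("replication", replication)):
    print(f"{label}: OR = {or_:.2f} ({low:.2f}-{high:.2f})")
```

Under that assumption the point estimates round to the reported 1.59 and 1.56, and the interval bounds agree with the table to within rounding.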

In total, 11,737 single-nucleotide polymorphisms (SNP) were available for analysis in the discovery phase. Table 2 summarizes the subpathways, genes, and SNPs included in the customized Illumina inflammation chip, as outlined by Loza and colleagues (7). In univariate analysis, assuming an additive model, 21 SNPs were statistically significant with P values ≤0.001 and Bayesian false discovery probability (BFDP) levels ≤ 0.8 (Table 3).
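The BFDP threshold can be illustrated with Wakefield's approximate Bayes factor, which needs only the odds ratio and its confidence interval. The prior settings in this sketch (a prior null probability of 0.99 and a 95% prior belief that the true OR is no larger than 1.5) are assumptions chosen for illustration; this section does not report the values actually used.

```python
import math

Z95 = 1.959964  # normal quantile for a two-sided 95% interval

def bfdp(odds_ratio, ci_low, ci_high, prior_null=0.99, max_plausible_or=1.5):
    """Bayesian false discovery probability from an OR and its 95% CI.

    prior_null and max_plausible_or are illustrative assumptions.
    """
    # Variance of log(OR), recovered from the width of the confidence interval.
    v = ((math.log(ci_high) - math.log(ci_low)) / (2 * Z95)) ** 2
    z_sq = math.log(odds_ratio) ** 2 / v
    # Prior variance W: 95% of prior mass for OR within [1/max, max].
    w = (math.log(max_plausible_or) / Z95) ** 2
    # Approximate Bayes factor for the null relative to the alternative.
    abf = math.sqrt((v + w) / v) * math.exp(-0.5 * z_sq * w / (v + w))
    prior_odds = prior_null / (1 - prior_null)
    return abf * prior_odds / (abf * prior_odds + 1)

# rs12809597 (ACVR1B), discovery estimates from Table 3.
acvr1b_bfdp = bfdp(0.7192, 0.5892, 0.8778)
print(f"BFDP under the assumed priors: {acvr1b_bfdp:.2f}")
```

A more skeptical prior (larger `prior_null`) raises the BFDP, so any threshold such as 0.8 is only meaningful together with the priors that produced it.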

Table 2.

Summary of inflammation subpathways, genes, and SNPs on Illumina chip

| Pathwayᵃ | Genes (n) | SNPs (n) |
| --- | --- | --- |
| Adhesion-extravasation-migration | 12 | 108 |
| Apoptosis signaling | 67 | 834 |
| Complement cascade | 3 | 8 |
| Cytokine signaling | 266 | 3,139 |
| Glucocorticoid/PPAR signaling | 24 | 258 |
| Innate pathogen detection | 53 | 542 |
| Leukocyte signaling | 132 | 2,023 |
| MAPK signaling | 156 | 2,854 |
| Natural killer cell signaling | 31 | 296 |
| Phagocytosis-Ag presentation | 41 | 488 |
| PI3K/AKT signaling | 45 | 580 |
| ROS/glutathione/cytotoxic granules | 25 | 231 |
| TNF superfamily signaling | 49 | 569 |
| Total | 904 | 11,930 |

Abbreviations: Ag, antigen; AKT, MAPK, mitogen-activated protein kinase; PPAR, peroxisome proliferator-activated receptor; ROS, reactive oxygen species; SNP, single-nucleotide polymorphism.

ᵃ See ref. 7.

Table 3.

Significant SNPs in discovery set (additive model)

| CHR | SNP | BP | Minor allele | OR | L95 | U95 | P value | Location | Gene |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | rs10127728 | 171417779 | | 1.679 | 1.266 | 2.226 | 0.0003 | Flanking_3UTR | TNFSF4 |
| 1 | rs549471 | 158972243 | | 0.6511 | 0.5058 | 0.8381 | 0.0009 | Flanking_5UTR | SLAMF7 |
| 1 | rs12131065 | 67541594 | | 1.416 | 1.149 | 1.746 | 0.0011 | Flanking_5UTR | IL12RB2 |
| 1 | rs2300095 | 11188304 | | 1.377 | 1.131 | 1.676 | 0.0015 | Intron | MTOR |
| 2 | rs17488897 | 97731596 | | 0.7201 | 0.5947 | 0.8718 | 0.0008 | Flanking_3UTR | ZAP70 |
| 2 | rs1464572 | 45807414 | | 0.7392 | 0.6157 | 0.8874 | 0.0012 | Intron | PRKCE |
| 2 | rs13432276 | 46165464 | | 1.358 | 1.125 | 1.639 | 0.0015 | Intron | PRKCE |
| 5 | rs4585495 | 159717272 | | 2.147 | 1.391 | 3.316 | 0.0006 | Intron | C1QTNF2 |
| 5 | rs745749 | 179648409 | | 1.359 | 1.131 | 1.634 | 0.0011 | Intron | MAPK9 |
| 5 | rs17651965 | 149435808 | | 0.6051 | 0.4454 | 0.8221 | 0.0013 | Intron | CSF1R |
| 6 | rs350294 | 158440305 | | 1.399 | 1.152 | 1.699 | 0.0007 | Flanking_3UTR | SYNJ2 |
| 10 | rs1887327 | 6648739 | | 0.6144 | 0.4691 | 0.8047 | 0.0004 | Flanking_5UTR | PRKCQ |
| 11 | rs11819995 | 127894601 | | 1.479 | 1.186 | 1.843 | 0.0005 | Intron | ETS1 |
| 12 | rs12809597ᵃ | 50642590 | | 0.7192 | 0.5892 | 0.8778 | 0.0012 | Intron | ACVR1B |
| 12 | rs2701129ᵇ | 50715744 | | 0.6261 | 0.4751 | 0.825 | 0.0009 | Flanking_5UTR | NR4A1 |
| 13 | rs9518587 | 101309356 | | 1.504 | 1.191 | 1.9 | 0.0006 | Flanking_3UTR | FGF14 |
| 14 | rs11621263 | 24164990 | | 1.581 | 1.218 | 2.054 | 0.0006 | Flanking_3UTR | GZMB |
| 14 | rs11629129 | 24164696 | | 1.743 | 1.253 | 2.426 | 0.001 | Flanking_3UTR | GZMB |
| 14 | rs11158813 | 24154521 | | 1.721 | 1.236 | 2.397 | 0.0013 | Flanking_5UTR | GZMH |
| 17 | rs11653414 | 5355762 | | 0.5093 | 0.3409 | 0.7609 | 0.001 | Intron | NLRP1 |
| 21 | rs962859 | 33569993 | | 0.7385 | 0.6148 | 0.8871 | 0.0012 | Intron | IL10RB |

Abbreviations: BP, base pair; CHR, chromosome.

ᵃ Replicated in the Mayo data set (OR = 0.80; 0.62–1.02); P = 0.069.

ᵇ Mayo data set (OR = 0.85; 0.60–1.21); P = 0.36.

In the replication analysis of these 21 SNPs from the discovery phase, only one, rs12809597 in the ACVR1B gene, was concordant for direction with the discovery phase [discovery OR = 0.72 (0.59, 0.88); P = 0.0012] (Table 3) but was of borderline overall significance [replication OR = 0.80 (0.62, 1.02); P = 0.069]. For women specifically, the OR in the replication population was 0.67, P = 0.0097, but was not statistically significant in men, although the numbers were small. In the combined data sets, the overall OR for rs12809597 was 0.72, P = 0.0002. For women only, the overall OR was 0.72, P = 0.0013; for men, the combined OR was 0.74, P = 0.05. A second SNP in this region, rs2701129 in the 5′ UTR of NR4A1, was strongly significant in our data (OR = 0.63; P = 0.0009) but did not achieve statistical significance in the Mayo Clinic data, although the OR was in the same protective direction (OR = 0.85; P = 0.36).
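The reported P values can be approximately recovered from each odds ratio and its 95% confidence interval, which is a useful consistency check when only summary statistics are available. This is a minimal sketch assuming Wald intervals on the log-odds scale; because the replication inputs are rounded to two decimals, that value matches the published P only roughly.

```python
import math

def z_and_p_from_ci(odds_ratio, ci_low, ci_high):
    """Wald z statistic and two-sided P value from an OR and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.959964)
    z = math.log(odds_ratio) / se
    # Two-sided tail area of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# rs12809597: discovery values from Table 3, replication values from the text.
z_disc, p_disc = z_and_p_from_ci(0.7192, 0.5892, 0.8778)
z_rep, p_rep = z_and_p_from_ci(0.80, 0.62, 1.02)
print(f"discovery:   z = {z_disc:.2f}, P = {p_disc:.4f}")
print(f"replication: z = {z_rep:.2f}, P = {p_rep:.3f}")
```

Both z statistics are negative, reflecting the concordant protective direction, while only the discovery estimate clears the conventional 0.05 threshold.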

We also conducted stratified analysis by select variables including ETS exposure, family history of lung cancer, hay fever, and asthma (data not shown). Notably, the significant association between lung cancer risk and rs12809597 was evident only in those who reported ETS exposure (OR = 0.67; P = 7.8 × 10⁻⁵), compared with an OR of 0.78 (P = 0.39) in those who denied ETS exposure. In the discovery data, this ACVR1B SNP was protective in both men [OR = 0.47 (0.30–0.73); P = 0.0010] and women with ETS exposure [OR = 0.74 (0.54–1.01); P = 0.0543], although the estimate in women was of borderline significance. In the replication, this pattern was evident only in women with ETS exposure [OR = 0.60 (0.41–0.88); P = 0.009]. It is noteworthy that only 83 male cases were present in the replication set, and power is therefore limited for these subset analyses. Likewise, rs2701129 in NR4A1 was statistically significant only in ETS-exposed subjects in the discovery set [OR = 0.61 (0.43–0.87); P = 0.0068]. We also noted a stronger effect for NR4A1 (OR = 0.31; P = 0.0081) in those with asthma (the risk group for lung cancer in never smokers), compared with those who denied having asthma (OR = 0.69; P = 0.0165). Although we did not note a similar pattern in the discovery data for ACVR1B, we saw the same pattern for the ACVR1B SNP in the Mayo Clinic data for those with and without asthma (OR = 0.39; P = 0.02 vs. OR = 0.86; P = 0.27, respectively).

Also of interest is that in the discovery set, in those who denied having had hay fever (i.e., the risk group), the ORs were significantly protective for both ACVR1B (OR = 0.70; P = 0.0026) and NR4A1 (OR = 0.54; P = 0.0003). We did not have comparable data for analysis in the replication set. We previously reported (8) that, paradoxically, those with both conditions (asthma and hay fever) had a significantly elevated lung cancer risk (OR = 2.43; 95% confidence interval = 1.11–5.35). It is in this subgroup (asthma and hay fever) that we detected the greatest protective effect with NR4A1 (OR = 0.28; P = 0.04).

We hypothesized that polymorphisms in genes directly associated with ACVR1B might contribute to the risk noted for the ACVR1B SNP. Therefore, we used an in silico approach, Pathway Studio (9), to identify upstream regulators and downstream targets of ACVR1B. Direct interactions between genes (i.e., direct regulation of gene expression, protein/protein binding, or binding to the promoter region) were used to construct the network. Based on these criteria, we identified 25 upstream regulators and 39 downstream targets of the ACVR1B gene. In this study, we had genotype data for 11 upstream regulators and 16 downstream targets. Of these, none was nominally significant at P < 0.05 in additive models.

Imputation was performed to increase coverage of SNPs in the region surrounding rs12809597 in the ACVR1B gene for their association with lung cancer risk (Fig. 1). Before imputation, 30 genotyped SNPs were found, 23 of which were between 50.58 Mb and 50.74 Mb. After imputation, 156 SNPs between 50.58 Mb and 50.74 Mb were reliably imputed (r² > 0.8) with MAF > 0.01. Best-guess genotypes were used in the analysis. The most likely candidate SNP, rs1882119 (P = 1.76 × 10⁻⁴), an imputed SNP (r² = 0.9849) in this region, is in an intron of NR4A1, not ACVR1B. By contrast, rs2701129 (P = 1.96 × 10⁻⁴) was directly genotyped. Because the r² for rs12809597 (ACVR1B) and rs2701129 (NR4A1) was only 0.013, indicating very weak linkage disequilibrium between the two SNPs, we further investigated relevant SNPs in NR4A1.
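Two distinct quantities in this paragraph are both written as r²: the imputation quality used to filter SNPs, and the pairwise linkage disequilibrium between two SNPs. The sketch below illustrates each, using the squared Pearson correlation of allele counts as one common estimator of pairwise LD. The genotype vectors and the MAF passed to the filter are hypothetical, and nothing here is meant to reproduce the PLINK computation used for Figure 1.

```python
def passes_imputation_qc(imputation_r2, maf, min_r2=0.8, min_maf=0.01):
    """Apply the thresholds quoted in the text (r2 > 0.8 and MAF > 0.01)."""
    return imputation_r2 > min_r2 and maf > min_maf

def genotype_r2(x, y):
    """Squared Pearson correlation between two vectors of allele counts (0/1/2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov * cov / (var_x * var_y)

# Hypothetical allele counts for three SNPs in four subjects.
snp_a = [0, 0, 2, 2]
snp_b = [0, 0, 2, 2]   # identical to snp_a: complete LD
snp_c = [0, 2, 0, 2]   # varies independently of snp_a: no LD

print(genotype_r2(snp_a, snp_b))           # 1.0
print(genotype_r2(snp_a, snp_c))           # 0.0
print(passes_imputation_qc(0.9849, 0.05))  # True; the MAF of 0.05 is hypothetical
```

A pairwise value near zero, such as the 0.013 reported for rs12809597 and rs2701129, means one SNP carries almost no information about the other, so the two associations cannot be explained as a single shared signal.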

Figure 1.

Association of imputed and genotyped SNPs in the chromosome 12 region around ACVR1B and NR4A1 with lung cancer risk. Chromosomal position is on the x-axis and negative logarithm to the base 10 of the P values from logistic regression analysis is on the y-axis. Genotyped SNPs are plotted as solid diamonds, and imputed SNPs, as open circles. The two most significant SNPs in the region rs1882119 and chr12:50735912 are plotted in red. The overall structure of the linkage disequilibrium (LD) with SNPs in this region is reflected by estimated recombination rates from the genetic map of Hapmap in build 36 coordinates. The strength of the pairwise correlation between the surrounding markers and the most significant SNP (rs1882119) is reflected by the size of the symbols: the larger the size, the stronger the LD. LD was calculated from actual genotyped or imputed data by using PLINK. Genes in the region are annotated with location, range, and orientation by using gene annotations from the UCSC genome browser (downloaded from Broad Institute website). The original downloaded files were in build 35 positions and converted to build 36 positions (46).


In parallel with the ACVR1B analysis described earlier, we identified 170 upstream/downstream genes related to NR4A1, of which 65 genes and 568 SNPs had been included in our inflammation panel. Of these, 17 SNPs had P values < 0.01 in univariate analysis, assuming an additive model (Table 4). Five of these SNPs (in NR4A2, NR4A1, TP53, BCL2, and MAP2K2), based on P values < 0.05, remained statistically significant in models using logistic regression forward or stepwise selection procedures, controlling for age, sex, second-hand smoking exposure, and family history of lung cancer (Table 5).

Table 4.

SNPs from NR4A1 targets and upstream and downstream regulators (additive model)

| CHR | SNP | BP | OR (95% CI)ᵃ | P value | MAF | Location | Gene |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | rs566421 | 26782253 | 1.23 (1.02–1.48) | 0.029 | 0.38 | Flanking_3UTR | RPS6KA1 |
| 1 | rs10159180 | 154746948 | 0.83 (0.69–1.00) | 0.050 | 0.48 | Flanking_5UTR | MEF2D |
| 2 | rs13428968 | 156900859 | 1.30 (1.02–1.66) | 0.034 | 0.16 | Flanking_5UTR | NR4A2 |
| 4 | rs7656411 | 154847105 | 1.23 (1.00–1.50) | 0.048 | 0.26 | Flanking_3UTR | TLR2 |
| 5 | rs10482642 | 142708224 | 1.28 (1.00–1.63) | 0.047 | 0.17 | Flanking_3UTR | NR3C1 |
| 12 | rs2701129 | 50715744 | 0.63 (0.48–0.83) | 0.001 | 0.13 | Flanking_5UTR | NR4A1 |
| 12 | rs2701124 | 50734424 | 0.58 (0.41–0.82) | 0.002 | 0.08 | Coding | NR4A1 |
| 15 | rs325383 | 98072569 | 1.28 (1.05–1.57) | 0.016 | 0.29 | Flanking_3UTR | MEF2A |
| 17 | rs2078486 | 7523808 | 0.71 (0.51–1.00) | 0.050 | 0.08 | Intron | TP53 |
| 18 | rs4987856 | 58944474 | 1.44 (1.07–1.94) | 0.016 | 0.10 | 3UTR | BCL2 |
| 18 | rs1977971 | 59119174 | 1.28 (1.06–1.53) | 0.009 | 0.46 | Intron | BCL2 |
| 18 | rs11152377 | 59123426 | 0.83 (0.69–0.99) | 0.036 | 0.42 | Intron | BCL2 |
| 18 | rs1462129 | 59131851 | 1.26 (1.05–1.51) | 0.013 | 0.48 | Intron | BCL2 |
| 18 | rs1801018 | 59136859 | 0.82 (0.69–0.99) | 0.037 | 0.43 | Coding | BCL2 |
| 19 | rs8101696 | 4066452 | 0.60 (0.41–0.89) | 0.011 | 0.06 | Intron | MAP2K2 |
| 19 | rs4808100 | 17792497 | 1.24 (1.03–1.49) | 0.020 | 0.37 | Intron | INSL3 |
| 20 | rs6063022 | 35426741 | 1.33 (1.03–1.71) | 0.027 | 0.15 | Intron | SRC |

Abbreviations: CHR, chromosome; CI, confidence interval; MAF, minor allele frequency.

ᵃ Univariate analysis.

Table 5.

Stepwise logistic model including upstream and downstream regulators of NR4A1 (additive model)

| SNP | OR (95% CI)ᵃ | P value |
| --- | --- | --- |
| rs13428968 | 1.46 (1.09–1.95) | 0.0103 |
| rs2701124 | 0.52 (0.35–0.80) | 0.0024 |
| rs2078486 | 0.61 (0.41–0.93) | 0.0212 |
| rs1977971 | 1.36 (1.10–1.68) | 0.0049 |
| rs8101696 | 0.56 (0.35–0.88) | 0.0122 |

ᵃ Adjusted for age, sex, second-hand smoking, and family history of lung cancer.

Our original risk model was constructed based on 709 never smokers (330 lung cancer cases and 379 controls) (6). Of the total of 959 never smokers in this new analysis, 650 (68%) overlapped in both analyses. The published AUC for never smokers in that model was 0.57. The point estimate of the AUC for those not included in our original study (N = 309) was 0.56. The AUC statistic for the baseline model in the entire discovery dataset, incorporating the same clinical and epidemiologic variables (age, gender, family history of lung cancer, and ETS exposure) was 0.62 (data not shown). With the addition of the replicated SNP, rs12809597, the AUC increased to 0.64, P = 0.098. The comparable model for the Mayo Clinic data with addition of rs12809597 yielded an AUC of 0.60. The same analysis for the discovery data, adding in the NR4A1 SNPs and upstream and downstream regulators, yielded an AUC of 0.68 (P = 0.0005), data not shown.
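For readers less familiar with the statistic, the AUC is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the predicted risks are hypothetical and are not drawn from the study models.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve by pairwise comparison of cases and controls."""
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5  # ties contribute half a concordant pair
    return concordant / (len(case_scores) * len(control_scores))

# Hypothetical predicted risks for three cases and three controls.
cases = [0.7, 0.6, 0.4]
controls = [0.5, 0.3, 0.2]
print(round(auc(cases, controls), 3))  # 8 of 9 pairs are concordant: 0.889
```

An AUC of 0.5 corresponds to no discrimination, which is why values in the range of 0.56 to 0.68 represent only modest separation of cases from controls.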

We also summed the number of adverse alleles (ACVR1B, NR4A1, and upstream and downstream regulators) and evaluated the distribution of cases and controls across different strata to determine the cumulative risk in the discovery set (Table 6). Compared with the lowest-risk stratum (0 to 6 risk alleles), the OR increased to 2.21 (P = 0.0272) for 7 risk alleles, 3.26 (P = 5.0 × 10⁻⁴) for 8 risk alleles, and 5.28 (P = 3.9 × 10⁻⁷) for 9 or more risk alleles (Table 6). A 46% increase in risk was found for each adverse allele, and the P value for trend was 1.11 × 10⁻⁹ (Table 6). Six percent of cases and 13% of controls were in the lowest-risk stratum compared with 50% and 35% in the highest-risk stratum, respectively.

Table 6.

Genetic risk score in discovery set for ACVR1B, NR4A1, and upstream and downstream SNPs

| Adverse alleles (n) | Cases, n (%) | Controls, n (%) | OR (95% CI)ᵃ | P value |
| --- | --- | --- | --- | --- |
| 0–6 | 26 (5.9) | 64 (12.8) | Ref. | |
| 7 | 64 (14.4) | 112 (22.4) | 2.21 (1.09–4.45) | 0.0272 |
| 8 | 133 (29.9) | 150 (29.9) | 3.26 (1.69–6.32) | 5.00 × 10⁻⁴ |
| 9+ | 221 (49.8) | 175 (34.9) | 5.28 (2.78–10.05) | 3.90 × 10⁻⁷ |
| P for trend | | | 1.46 (1.30–1.65) | 1.11 × 10⁻⁹ |

ᵃ Adjusted for age, sex, environmental tobacco smoke, and lung cancer family history.
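The allele-count score can be sketched as follows. Under an additive model each subject carries 0, 1, or 2 copies of the minor allele; where the minor allele is protective (OR < 1), the major allele is the adverse one. Treating rs12809597 plus the five SNPs in Table 5 as the score components is an assumption made for illustration, because the exact SNP set is not listed in this section, and the example subject is hypothetical.

```python
# Minor-allele ORs copied from Tables 3 and 5. Using exactly these six SNPs
# as the score components is an assumption, not a statement of the authors' set.
MINOR_ALLELE_OR = {
    "rs12809597": 0.72,  # ACVR1B
    "rs13428968": 1.46,  # NR4A2
    "rs2701124": 0.52,   # NR4A1
    "rs2078486": 0.61,   # TP53
    "rs1977971": 1.36,   # BCL2
    "rs8101696": 0.56,   # MAP2K2
}

def adverse_allele_count(minor_allele_counts):
    """Sum adverse alleles given minor-allele counts (0, 1, or 2) per SNP."""
    total = 0
    for snp, odds_ratio in MINOR_ALLELE_OR.items():
        minor = minor_allele_counts[snp]
        # A protective minor allele means the major allele is the adverse one.
        total += minor if odds_ratio > 1 else 2 - minor
    return total

def risk_stratum(score):
    """Map a score to the strata used in Table 6."""
    if score <= 6:
        return "0-6"
    if score >= 9:
        return "9+"
    return str(score)

# Hypothetical subject who is homozygous for the major allele at every SNP.
subject = {snp: 0 for snp in MINOR_ALLELE_OR}
score = adverse_allele_count(subject)
print(score, risk_stratum(score))  # 8 8
```

With a per-allele OR of 1.46, a difference of k adverse alleles corresponds to a multiplicative change of 1.46 to the power k in the odds under the fitted trend model.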

In this two-stage candidate pathway analysis of inflammation gene variants, we were able to replicate one variant (rs12809597) in the Activin receptor type-1B (ACVR1B)/Activin receptor-like kinase 4 (ALK4) gene that was significantly associated with lung cancer risk in lifetime never-smoking cases. This risk was most prominent in women and in those risk subgroups that reported adult exposure to ETS, prior asthma, or no prior hay fever. Further analysis of SNPs within 1 Mb of this polymorphism suggested that another promising target was in the 5′ UTR of the Nuclear receptor subfamily 4 group A member 1 (NR4A1) gene, although the OR in the replication Mayo Clinic data did not achieve statistical significance, and the association we detected could be attributed to chance.

Inflammation is a complex host defense against biological, chemical, physical, and endogenous irritants. Innate immunity is mediated by a variety of secreted proinflammatory cytokines. The inflammation is resolved by anti-inflammatory cytokines. Chronic inflammation results from a dysfunction of these negative regulatory mechanisms (10). Although smoking (and perhaps, to a lesser extent, passive exposure) is the obvious cause of a chronic inflammatory milieu in the lung parenchyma and bronchial epithelium, other likely precipitating factors include infection, inhaled particulate exposures, and pulmonary scarring (11) that can lead to oxidative stress and an inflammatory response, even in non–tobacco-exposed subjects in whom lung cancer develops. It remains plausible, therefore, that inflammation gene polymorphisms could be important in lung cancer risk in lifetime never smokers as well.

Elevated prediagnostic C-reactive protein (CRP) levels, a systemic, but nonspecific, marker of chronic inflammation, have been associated with subsequent lung cancer risk (12) with evidence of a dose–response relation. Conversely, use of nonsteroidal anti-inflammatory drugs (NSAIDs) has been associated with decreased lung cancer risk in some (13–16), but not all, studies (17–19). Few of these studies have specifically evaluated the risk in lifetime never smokers, although in one cohort analysis (13), the strongest effect for total NSAID use was for long-term former smokers.

Activin receptor type-1B is a protein encoded by the ACVR1B gene with alternate splicing, resulting in multiple transcript variants. Our SNP of interest, rs12809597, is intronic, and no function has been reported for this SNP, although it is possible that this tagSNP may be linked to other causal SNP(s) in the gene that affect expression or function. ACVR1B, also known as ALK4, acts as a transducer of activin or activin-like ligands that are growth and differentiation factors belonging to the transforming growth factor-β (TGF-β) superfamily of signaling proteins, essential regulators of proliferation and apoptosis, and key regulators of inflammation and angiogenesis. Activins signal through a heteromeric complex of receptor serine kinases, which include at least two type I (I and IB) and two type II (II and IIB) receptors (20). Activin complexes with ACVR1B and recruits SMAD2 or SMAD3, members of the SMAD family of transcriptional coregulators. ACVR1B has been shown to be mutated in pancreatic tumors (21), and activin signaling mediates growth inhibition and cell cycle arrest in breast cancer cells (22). Moreover, differential expression of this gene has been found in the epithelial cells of a subset of smokers with lung cancer (23) and in bone marrow micrometastases from lung cancer patients (24), although the relevance of the gene deregulation in lung cancer is not entirely clear. Whole-genome microarray analysis of ACVR1B expression in large airway epithelial cells indicated some reduction in expression among normal smokers compared with nonsmokers (25), suggesting the possible impact of cigarette-smoke exposure on activin signaling. It is therefore of interest that risk from the variant was most apparent in ETS-exposed subjects. Activins have also been implicated in the etiology of fibrotic diseases (26) and are upregulated during the fibrotic response in vivo (27).

Both the TGF-β and activin signaling pathways are activated on allergen provocation in asthma and may contribute to the resolution of inflammation and initiation of airway remodeling after allergen challenge (28). Activin may also act as an inhibitor of cytokine-induced proinflammatory chemokine release from the airway epithelium. Activin-A is rapidly induced in TH2 cells on T-cell activation and may also function as a TH2 immunomodulatory cytokine (29). An enhanced TH2 immune response contributes to the induction of allergy and asthma.

We previously showed that self-reported, physician-diagnosed asthma is significantly associated with risk of lung cancer in lifetime never smokers who were a subset of this larger analysis (OR = 1.82) with evidence of a dose–response pattern for duration (P = 0.007 for trend) (8), although this pattern was not evident in our discovery data. In their meta-analysis, Santillan and colleagues (30) also found asthma to be a significant risk factor for lung cancer in never smokers. Our data also demonstrated a protective effect of prior hay fever on lung cancer risk in never smokers (6). Cockcroft and colleagues (31) suggested that patients with respiratory atopy appeared to have some degree of protection against developing malignancies of endodermal origin, attributable to enhanced immune surveillance in a stimulated immune system.

It was of special interest that the most significant odds ratio for the ACVR1B SNP was obtained in the subset of cases and controls that reported adult exposure to ETS, although these subset analyses are based on small sample sizes. No such association was evident in those who denied such exposure. It could be argued that an inflammatory microenvironment is more likely to exist in those exposed passively to tobacco smoke, and that exposure is necessary for the impact of the gene variant to be apparent.

The association with NR4A1 (also known as Nur77) is intriguing, but must be viewed with considerable caution. NR4A1 is an orphan receptor within the nuclear hormone receptor superfamily and a potent inhibitor of NF-κB activation (32). NR4A1 is overexpressed in patients with atopic dermatitis compared with healthy volunteers (33). Protective effects of the NR4A1 SNP were also largest in putative risk subgroups (asthma, no prior hay fever).

A 1% increase in the AUC (0.64) was found in an expanded clinical and epidemiologic risk model incorporating the ACVR1B SNP, and an additional 5% (0.68) when we also added the upstream and downstream targets of NR4A1. The improvements in risk prediction from incorporating these genes were statistically significant. The final AUC of 0.68 is similar to, and the incremental improvement in AUC is larger than, that obtained from a risk-prediction model of lung cancer in ever smokers in which we incorporated top lung cancer GWAS hits, the chromosome 15q nicotinic receptor gene cluster (tag SNP rs1051730 G>A), and two SNPs from the 5p15.33 region (rs2736100 and rs401681) (34). However, higher AUC values are desirable for the model to have clinical utility and for any public health impact or recommendation, especially because the incidence of lung cancer in never smokers is substantially lower than that in ever smokers.

In a parallel analysis, rs2701129 was associated with an OR of 0.78, P = 0.014, in 1,096 cases and 727 controls, all ever smokers, whom we have genotyped by using the same Illumina platform, although rs12809597 was not a risk predictor. The rs12809597 was not directly genotyped in the GWAS, and the r2 value was not sufficiently robust for imputation. The rs2701129 was genotyped in GWAS, but was not statistically significant.

Although the chemical constituents of sidestream and mainstream smoke are qualitatively the same, differences in pH, combustion temperature, and degree of dilution with air contribute to quantitative differences in their chemical composition and their emission rates. For example, nitrosamines and other carcinogens are present in greater concentrations in sidestream than in mainstream smoke (35). The ever-smoker cases for GWAS also differ from the never-smoker cases in this analysis. For example, more than 25% of our ever-smoker cases report preexisting chronic obstructive pulmonary disease (COPD) that is almost nonexistent in never smokers. One could therefore hypothesize that the pathogenic processes for smokers and never smokers are not equivalent, although certain etiologic pathways could be shared, such as the involvement of inflammation.

We acknowledge the limitations of this study and the challenge in drawing causal inferences from association analyses. Relatively small sample sizes were used for both the discovery and replication sets, and this problem is exacerbated in subset analyses. We also relied on self-reported questionnaire data for assessment of ETS exposure, raising the potential for both misclassification and recall bias. Nevertheless, for residential exposure to ETS, most studies in the past have confirmed that self-reports were generally reliable (36), and practical approaches to alternative measurement of ETS exposure decades before the onset of lung cancer have not been established. In national survey data, the accuracy of self-reported second-hand smoke exposure at work, home, or home and work ranged from 87% to 92%, although workers reporting no second-hand smoke exposure were only 28% accurate (37). Thus, underreporting of ETS exposure could occur, but overreporting is less likely.

In summary, this analysis used a candidate pathway approach to evaluate SNPs comprehensively in inflammation genes as predisposing to lung cancer risk in lifetime never smokers. We replicated an SNP in the TGF-β family in ETS-exposed patients or those with inflammatory/allergic conditions, and by using in silico analyses, we were able to identify upstream and downstream SNPs of our target SNPs that further contributed to risk. Recent progress in identification of novel SNPs, especially those generated from the 1000 Genomes Project, has identified several polymorphisms in the ACVR1B gene that could be candidates for causal variants. Those SNPs include 6 polymorphisms located in the coding region of ACVR1B: rs34488074, rs114081852, rs117020497, rs114735080, rs77643569, and rs34050429. We plan to include these SNPs in the next phase of our targeted sequencing studies.

Subject Accrual

This analysis focuses on lung cancer cases and controls who reported themselves to be lifetime never smokers (i.e., smoked <100 cigarettes over a lifetime). Cases for the discovery phase were consecutive Caucasian patients with newly diagnosed, histopathologically confirmed, and previously untreated non–small cell lung cancer with no age, gender, ethnicity, tumor histology, or disease-stage restrictions. Medical history, family history of cancer, adult environmental tobacco-exposure history, and occupational history were obtained through an interviewer-administered risk-factor questionnaire. We did not validate self-reports of passive smoking exposure. Case-exclusion criteria for the study included prior chemotherapy or radiotherapy or recent blood transfusion.

We recruited our control population from the Kelsey-Seybold Clinic, Houston's largest multidisciplinary physician practice. Potential controls were first surveyed with a short questionnaire for their willingness to participate in research studies and to provide preliminary data for matching demographic characteristics with those of cases (4). Controls were frequency matched to the cases on the basis of age (±5 years), sex, smoking status, and ethnicity. Exclusion criteria were similar to those for cases and also included any prior cancer. To date, the response rate among both the cases and controls has been approximately 75%. On receiving informed consent, we drew a 40-mL blood sample into coded, heparinized tubes from study participants. Genomic DNA was extracted from peripheral blood lymphocytes and stored at −80°C.

The replication phase was conducted among never-smoking cases and controls recruited between January 1997 and September 2008 and who were included in a published GWAS (5). These lifetime never-smoking lung cancer cases were recruited from the Mayo Clinic, and community residents who were never smokers were selected as controls and matched to the patients according to age, sex, and ethnic background. Personal interviews with structured questionnaires were used to elicit demographic, epidemiologic, and exposure data. Institutional review board approval was obtained from the MD Anderson Cancer Center, Kelsey-Seybold Foundation (Houston, Texas), and Mayo Clinic (Rochester, Minnesota).

Gene and SNP Selection

Candidate genes for the discovery phase were selected based on the following criteria. We searched the Gene Ontology database (38) and the National Center for Biotechnology Information (NCBI) PubMed (39) to identify a list of inflammation pathway–related genes. For each gene, we selected haplotype tagging SNPs (htSNP) located within 10 kb upstream of the transcriptional start site or 10 kb downstream of the transcriptional stop site, based on data from the International HapMap Project (40) release 24/Phase II. By using the ldSelect program (41) and the UCSC Golden Path Gene Sorter program (42), we further divided identified SNPs into bins based on an r2 threshold of 0.8 and minor allele frequency (MAF) greater than 0.05 in Caucasians to select tagging SNPs. We also included SNPs in the coding (synonymous SNPs, nonsynonymous SNPs) and regulatory regions (promoter, splicing site, 5′ UTR, and 3′ UTR). Functional SNPs and SNPs previously reported to be associated with cancer were also included. We also made extensive use of the inflammation pathway gene list and functionally defined subpathways outlined by Loza and colleagues (7), who suggested that variants in multiple genes in inflammation pathways are likely to cooperate in additive or synergistic ways to affect disease risk. The complete set of selected SNPs was submitted to Illumina technical support for Infinium chemistry designability, beadtype analyses, and iSelect Infinium Beadchip synthesis.
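The binning step can be illustrated with a minimal sketch. It is not the ldSelect code itself but a simplified greedy procedure under stated assumptions: pairwise r2 values are supplied precomputed, the SNP names and frequencies are hypothetical, and the tag for each bin is simply the SNP with the most partners at or above the r2 threshold.

```python
def partners(snp, remaining, r2, r2_min):
    """SNPs still unbinned whose pairwise r2 with `snp` meets the threshold."""
    return {t for t in remaining
            if t != snp and r2.get(frozenset((snp, t)), 0.0) >= r2_min}


def greedy_tag_bins(maf, r2, r2_min=0.8, maf_min=0.05):
    """Greedy LD binning sketch (illustrative only, not ldSelect itself).

    maf: dict mapping SNP name -> minor allele frequency
    r2:  dict mapping frozenset({snp_a, snp_b}) -> pairwise r2
         (pairs that are absent are treated as r2 = 0)
    Returns a list of (tag_snp, sorted_bin_members) tuples.
    """
    # Keep only SNPs above the MAF cutoff.
    remaining = {s for s, f in maf.items() if f > maf_min}
    bins = []
    while remaining:
        # Choose the SNP with the most partners; ties go to the first name.
        tag = max(sorted(remaining),
                  key=lambda s: len(partners(s, remaining, r2, r2_min)))
        members = partners(tag, remaining, r2, r2_min) | {tag}
        bins.append((tag, sorted(members)))
        remaining -= members
    return bins


# Hypothetical input: four SNPs, one of them below the MAF cutoff.
example_maf = {"snpA": 0.30, "snpB": 0.28, "snpC": 0.10, "snpD": 0.02}
example_r2 = {
    frozenset(("snpA", "snpB")): 0.92,
    frozenset(("snpA", "snpC")): 0.15,
    frozenset(("snpB", "snpC")): 0.20,
}
example_bins = greedy_tag_bins(example_maf, example_r2)
```

In this toy input, snpA and snpB fall into one bin, snpC forms its own bin, and snpD is dropped for low MAF.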

Of the total number of selected SNPs, 2.9% could not be designed because of designability score failure. An additional 12% could not be incorporated into the beadchip owing to manufacturing issues (within the norm stated by Illumina). Overall, slightly fewer than 15% of all SNPs were not designed. We did not seek surrogates for failed SNPs, because of the relatively low failure rate for designability (<3%) and constraint on the total number of beadtypes for the custom chip design.

Genotyping

In total, 19,949 SNPs were genotyped in the discovery samples by using Illumina's Infinium iSelect HD Custom Genotyping BeadChip (San Diego, CA) according to the standard 3-day protocol. Of these, 11,930 SNPs were in inflammation pathways, and the remaining SNPs were identified from ongoing GWAS for further query in separate analyses. Genotypes were autocalled by using the BeadStudio software. Any SNP with a call rate lower than 95% was excluded from further analysis (n = 203). A further 27 SNPs were removed because of a difference in genotype between the original and the duplicate sample (error rate). We also deleted 93 SNPs that were at the same chromosomal position and 89 SNPs with MAF = 0. The final data set included 19,537 SNPs, of which 11,737 SNPs were in the inflammation pathway.
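The quality-control counts above can be checked with a few lines of arithmetic. The numbers are those reported in this paragraph; the variable and category names are ours, chosen only for illustration.

```python
# SNP counts as reported in the Genotyping paragraph.
genotyped = 19_949
removed = {
    "call rate < 95%": 203,
    "discordant between original and duplicate": 27,
    "same chromosomal position": 93,
    "MAF = 0": 89,
}

# SNPs retained after all four exclusion steps.
retained = genotyped - sum(removed.values())

# Inflammation-pathway SNPs before and after quality control.
inflammation_before, inflammation_after = 11_930, 11_737
inflammation_removed = inflammation_before - inflammation_after

print(retained)              # 19537
print(inflammation_removed)  # 193
```

The four exclusions total 412 SNPs, which reconciles the 19,949 genotyped SNPs with the 19,537 in the final data set.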

For the Mayo Clinic samples, whole genome amplification (WGA) was performed before SNP genotyping. The WGA was set up in four separate reactions, each of which included 25 ng of genomic DNA and standard amplification procedures with a total reaction volume of 25 μL (REPLI-g Midi Kit, Qiagen). After WGA, the four reactions were pooled, mixed, and quantified by the picogreen method. Genotyping was performed in Dynamic Arrays (Fluidigm; South San Francisco, CA) containing integrated fluidic circuits (IFCs). Then 75 ng of the WGA-DNA was pre-amplified using 0.2X primer multiplex of the source primers. 2.3 μL of pre-amplified DNA was then loaded onto the array; 3 μL of each Applied Biosystems TaqMan genotyping assay in a 5-μL assay reaction volume was loaded onto the array. The assay was run for 40 PCR cycles under vacuum pressure. The end-point read was performed on an EP1 machine by using a CCD camera to detect VIC and FAM dyes. SNP Genotyping Analysis Software was used to autocall SNP genotype clusters with a confidence index of 95%. The specific SNPs identified from this pathway-based analysis were not included in the Mayo GWAS chip (5) and were directly genotyped for this analysis. The never-smoker GWAS with the Mayo Clinic samples had a rather limited sample size, and an additional GWAS in never smokers is under way, including our discovery set of never-smoking cases and controls.

Statistical Analyses

Pearson's χ2 test was used to assess the differences in categoric variables, and t tests were used for continuous variables in both discovery and replication data sets. All tests were 2-sided. For each SNP, Hardy-Weinberg equilibrium was assessed among controls by using a χ2 test. To assess case–control associations of SNP genotypes with lung cancer risk, we used unconditional logistic regression, implemented by using SAS/Genetics version 9.2. Single-SNP association tests were carried out by using PLINK 1.07 (43).
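As an illustration of the Hardy-Weinberg check, the sketch below computes the 1-degree-of-freedom χ2 statistic from genotype counts in controls. It is a generic textbook calculation, not the SAS/Genetics or PLINK implementation, and the counts in the example are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium (1 degree of freedom).

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control genotype counts that sit exactly on HWE proportions.
chi2_example, p_example = hwe_chi_square(25, 50, 25)
```

Counts of 25/50/25 match the expected proportions for an allele frequency of 0.5, so the statistic is 0 and the P value is 1.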

We applied the Bayesian false discovery probability test (BFDP) (44) to evaluate the chance of obtaining a false-positive association. This approach calculates the probability of no association, given the data and a specified prior on the presence of an association, and has a noteworthiness threshold that is defined in terms of the costs of false discovery and nondiscovery. Four levels of prior probability of 0.01, 0.03, 0.05, and 0.07 and odds ratios from 1.3 through 2.0 were tested; selected levels of noteworthiness for BFDP were set at 0.8 (i.e., false nondiscovery rate is 4 times as costly as false discovery). We used the most conservative prior of 0.01 to determine that the association was unlikely to represent a false-positive result.
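The calculation can be sketched as follows, based on our reading of the approximate Bayes factor described in (44). Two assumptions are made for illustration: the prior variance of the log odds ratio is set so that the chosen odds ratio lies at the upper 97.5th percentile of the prior, and the inputs in the example are made-up values, not results from this study.

```python
import math


def noteworthiness_threshold(cost_ratio):
    """Threshold R / (1 + R), where R is the cost of a false nondiscovery
    relative to the cost of a false discovery."""
    return cost_ratio / (1.0 + cost_ratio)


def bfdp(log_or, se, prior_assoc, or_upper):
    """Bayesian false discovery probability (sketch).

    log_or:      estimated log odds ratio
    se:          its standard error
    prior_assoc: prior probability that an association exists
    or_upper:    odds ratio assumed to sit at the 97.5th prior percentile
    """
    v = se ** 2                                   # sampling variance
    w = (math.log(or_upper) / 1.96) ** 2          # prior variance (assumption)
    z2 = log_or ** 2 / v                          # squared Wald statistic
    # Approximate Bayes factor for the null versus the alternative.
    abf = math.sqrt((v + w) / v) * math.exp(-0.5 * z2 * w / (v + w))
    prior_odds_null = (1.0 - prior_assoc) / prior_assoc
    return abf * prior_odds_null / (abf * prior_odds_null + 1.0)


threshold = noteworthiness_threshold(4.0)         # 4:1 cost ratio
example_bfdp = bfdp(math.log(1.5), 0.1, prior_assoc=0.01, or_upper=1.5)
is_noteworthy = example_bfdp < threshold
```

With a 4:1 cost ratio the threshold is 0.8, matching the level used here; an association is called noteworthy when its BFDP falls below that value.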

In stratified analyses, we used logistic regression to examine associations of selected SNPs with lung cancer case–control status for subgroups of subjects defined by sidestream tobacco exposure, history of hay fever, asthma, or family history of lung cancer, comparing each subgroup of cases against controls within that subgroup.
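A simplified version of the stratified comparison is sketched below. It computes an unadjusted odds ratio with a Woolf (log-based) 95% confidence interval within each stratum, whereas the analyses here used logistic regression. The stratum labels and all counts are hypothetical.

```python
import math


def odds_ratio_ci(carrier_cases, noncarrier_cases,
                  carrier_controls, noncarrier_controls, z=1.96):
    """Unadjusted odds ratio and Woolf confidence interval from a 2x2 table."""
    a, b = carrier_cases, noncarrier_cases
    c, d = carrier_controls, noncarrier_controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: (carrier cases, noncarrier cases,
#                       carrier controls, noncarrier controls)
strata = {
    "ETS exposed": (20, 10, 10, 20),
    "ETS unexposed": (15, 15, 15, 15),
}

# Cases are compared only with controls from the same stratum.
stratum_results = {name: odds_ratio_ci(*counts)
                   for name, counts in strata.items()}
```

Each stratum is analyzed separately, so an association can appear in one exposure group and be absent in the other, as in this toy example.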

We also performed a stepwise forward logistic regression analysis in which we allowed significant univariate SNPs to enter a model according to the strength of association, provided they showed association with disease (P < 0.05). SNPs were retained for analysis if they continued to show association (P < 0.05), given other SNPs in the model. Linkage disequilibrium (LD) between SNPs was calculated for cases and controls by using PLINK before all the SNPs were entered into the model. If two SNPs were in high LD (r2 ≥ 0.8), only one SNP was entered into the model. Linkage disequilibrium was visualized by using Haploview v. 4.1 (45) to summarize r2 statistics.
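The LD filter applied before model entry can be sketched as follows. Here r2 is taken as the squared Pearson correlation of genotype allele counts (0, 1, or 2 copies), which is an assumption about the estimator and may differ from the PLINK calculation used in the analysis. SNP names and genotypes are hypothetical.

```python
def genotype_r2(x, y):
    """Squared Pearson correlation between two vectors of allele counts."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0   # monomorphic SNP: correlation undefined, treat as 0
    return sxy ** 2 / (sxx * syy)


def filter_high_ld(ordered_snps, genotypes, r2_max=0.8):
    """Keep a SNP only if its r2 with every SNP already kept is below r2_max.

    ordered_snps: SNP names sorted by strength of association (strongest first)
    genotypes:    dict mapping SNP name -> list of allele counts per subject
    """
    kept = []
    for snp in ordered_snps:
        if all(genotype_r2(genotypes[snp], genotypes[k]) < r2_max
               for k in kept):
            kept.append(snp)
    return kept


# Hypothetical genotypes for six subjects; s2 duplicates s1.
example_genotypes = {
    "s1": [0, 1, 2, 1, 0, 2],
    "s2": [0, 1, 2, 1, 0, 2],
    "s3": [0, 0, 2, 2, 1, 1],
}
kept_snps = filter_high_ld(["s1", "s2", "s3"], example_genotypes)
```

Because s1 and s2 are in perfect LD in this toy data, only s1 is carried forward together with s3.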

Genotyped SNPs in the region 1 Mb from each side of the ACVR1B gene range were retrieved (46). Before imputation, we identified three A/T or C/G SNPs that were in opposite strand orientation to the strand of the 1000 Genomes Project reference data, based on comparisons of minor allele frequencies. The strands for these three SNPs were flipped before imputation. MACH version 1.016 was used for imputation, with the 1000 Genomes Project March 2010 release CEU data as the reference panel (47).

For the replication analysis, we included all SNPs that were statistically significant at P values < 0.001 and BFDP levels ≤ 0.8 with prior probability of 0.01.

For risk-model construction, we retained all epidemiologic variables that were components of our published risk-prediction model for never smokers (6). However, because the Mayo Clinic study did not have data available on prior hay fever, we elected to omit this variable from the model. For each risk model, we calculated specificity and sensitivity of the resulting logistic regression model by constructing receiver operating characteristic (ROC) curves and calculating the area under the curve (AUC) statistic to estimate the ability of the models to discriminate between patients and controls for the two populations separately and combined. Approximate 95% confidence intervals for the AUC were calculated, assuming a binegative exponential distribution, by using SAS statistical software. An AUC of 0.5 indicates chance prediction (equivalent to a coin toss), whereas a statistic of 0.7 or higher indicates good discrimination. We also constructed expanded models that included any replicated SNPs. We performed pairwise comparisons of AUCs of the baseline multiple logistic model and the expanded model including genetic data by using a contrast matrix to evaluate differences of the areas under the empirical ROC curves (48).
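The empirical AUC referred to above can be illustrated with a short sketch. It computes the proportion of case-control pairs in which the case has the higher predicted risk, counting ties as one half. The predicted risks are hypothetical, and neither the confidence interval nor the contrast-matrix comparison of correlated AUCs (48) is implemented here.

```python
def empirical_auc(case_scores, control_scores):
    """Area under the empirical ROC curve.

    Equals the fraction of (case, control) pairs in which the case has the
    higher score, with ties contributing one half.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks from a baseline model.
example_cases = [0.8, 0.6, 0.4]
example_controls = [0.7, 0.3, 0.2]
example_auc = empirical_auc(example_cases, example_controls)
```

In this toy example the case outranks the control in 7 of 9 pairs, giving an AUC of about 0.78; a model with no discrimination would give 0.5.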

The authors declare that they have no competing financial interests. None of the sponsors played a role in the study design, collection, analysis, and interpretation of the data, in the writing of this article, or in the decision to submit the manuscript for publication.

This work was supported by grants from the National Cancer Institute [CA55769 and CA127219 (M.R. Spitz); CA80127 and CA84354 (P. Yang); U19CA148127 and CA121197 (C.I. Amos); CA123235 and CA131327 (C.J. Etzel); and CA149462 (O.Y. Gorlova)]; Kelsey Seybold Research Foundation; and Mayo Foundation Fund.

1. Subramanian J, Govindan R. Lung cancer in ‘Never-smokers’: a unique entity. Oncology 2010;24:29–35.
2. Spitz MR, Amos CI, Dong Q, Lin J, Wu X. The CHRNA5-A3 region on chromosome 15q24-25.1 is a risk factor both for nicotine dependence and for lung cancer. J Natl Cancer Inst 2008;100:1552–6.
3. Walser T, Cui X, Yanagawa J, Lee JM, Heinrich E, Lee G, et al. Smoking and lung cancer: the role of inflammation. Proc Am Thorac Soc 2008;5:811–5.
4. Hudmon KS, Honn SE, Jiang H, Chamberlain RM, Xiang W, Ferry G, et al. Identifying and recruiting healthy control subjects from a managed care organization: a methodology for molecular epidemiological case-control studies of cancer. Cancer Epidemiol Biomarkers Prev 1997;6:565–71.
5. Li Y, Sheu CC, Ye Y, de Andrade M, Wang L, Chang SC, et al. Genetic variants and risk of lung cancer in never smokers: a genome-wide association study. Lancet Oncol 2010;11:321–30.
6. Spitz MR, Hong WK, Amos CI, Wu X, Schabath MD, Dong Q, et al. A risk model for prediction of lung cancer. J Natl Cancer Inst 2007;99:715–26.
7. Loza MJ, McCall CE, Li L, Isaacs WB, Xu J, Chang BL. Assembly of inflammation-related genes for pathway-focused genetic analysis. PLoS One 2007;2:e1035.
8. Gorlova OY, Zhang Y, Schabath MB, Lei L, Zhang Q, Amos CI, et al. Never smokers and lung cancer risk: a case-control study of epidemiological factors. Int J Cancer 2006;118:1798–804.
9. Nikitin A, Egorov S, Daraselia N, Mazo I. Pathway studio: the analysis and navigation of molecular networks. Bioinformatics 2003;19:2155–7.
10. Hanada T, Yoshimura A. Regulation of cytokine signaling and inflammation. Cytokine Growth Factor Rev 2002;13:413–21.
11. Engels EA. Inflammation in the development of lung cancer: epidemiological evidence. Expert Rev Anticancer Ther 2008;8:605–15.
12. Chaturvedi AK, Caporaso NE, Katki HA, Wong HL, Chatterjee N, Pine SR, et al. C-reactive protein and risk of lung cancer. J Clin Oncol 2010;28:2719–26.
13. Slatore CG, Au DH, Littman AJ, Satia JA, White E. Association of nonsteroidal anti-inflammatory drugs with lung cancer: results from a large cohort study. Cancer Epidemiol Biomarkers Prev 2009;18:1203–7.
14. Van Dyke AL, Cote ML, Prysak G, Claeys GB, Wenzlaff AS, Schwartz AG. Regular adult aspirin use decreases the risk of non-small cell lung cancer among women. Cancer Epidemiol Biomarkers Prev 2008;17:148–57.
15. Khuder SA, Herial NA, Mutgi AB, Federman DJ. Nonsteroidal antiinflammatory drug use and lung cancer: a meta-analysis. Chest 2005;127:748–54.
16. Olsen JH, Friis S, Poulsen AH, Fryzek J, Harving H, Tjønneland A, et al. Use of NSAIDs, smoking and lung cancer risk. Int J Cancer 2008;98:232–7.
17. Kelly JP, Coogan P, Strom BL, Rosenberg L. Lung cancer and regular use of aspirin and nonaspirin nonsteroidal anti-inflammatory drugs. Pharmacoepidemiol Drug Saf 2008;4:322–7.
18. Feskanich D, Bain C, Chan AT, Pandeya N, Speizer FE, Colditz GA. Aspirin and lung cancer risk in a cohort study of women: dosage, duration and latency. Br J Cancer 2007;97:1295–9.
19. Wall RJ, Shyr Y, Smalley W. Nonsteroidal anti-inflammatory drugs and lung cancer risk: a population-based case control study. J Thorac Oncol 2007;2:109–14.
20. ten Dijke P, Ichijo H, Franzen P, Schulz P, Saras J, Toyoshima H, et al. Activin receptor-like kinases: a novel subclass of cell-surface receptors with predicted serine/threonine kinase activity. Oncogene 1993;8:2879–87.
21. Su GH, Bansal R, Murphy KM, Montgomery E, Yeo CJ, Hruban RH, et al. ACVR1B (ALK4, activin receptor type 1B) gene mutations in pancreatic carcinoma. Proc Natl Acad Sci U S A 2001;98:3254–7.
22. Burette JE, Jeruss JS, Kurley SJ, Lee EJ, Woodruff TK. Activin A mediates growth inhibition and cell cycle arrest through SMADs in human breast cancer cells. Cancer Res 2005;65:7968–75.
23. Spira A, Beane JE, Shah V, Steiling K, Liu G, Schembri F, et al. Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer. Nat Med 2007;13:361–6.
24. Wrage M, Ruosaari S, Eijk PP, Kaifi JT, Hollmén J, Yekebas EF, et al. Genomic profiles associated with early micrometastasis in lung cancer: relevance of 4q deletion. Clin Cancer Res 2009;15:1566–74.
25. Carolan BJ, Heguy A, Harvey BG, Leopold PL, Ferris B, Crystal RG. Up-regulation of expression of the ubiquitin carboxyl-terminal hydrolase L1 gene in human airway epithelium of cigarette smokers. Cancer Res 2006;66:10729–40.
26. Border WA, Noble NA. Transforming growth factor beta in tissue fibrosis. N Engl J Med 1994;331:1286–92.
27. Ohga E, Matsuse T, Teramoto S, Katayama H, Nagase T, Fukuchi Y, et al. Effects of activin A on proliferation and differentiation of human lung fibroblasts. Biochem Biophys Res Commun 1996;228:391–6.
28. Kariyawasam HH, Pegorier S, Barkans J, Xanthou G, Aizen M, Ying S, et al. Activin and transforming growth factor-beta signaling pathways are activated after allergen challenge in mild asthma. J Allergy Clin Immunol 2009;124:454–62.
29. Ogawa K, Funaba M, Chen Y, Tsujimoto M. Activin A functions as a Th2 cytokine in the promotion of the alternative activation of macrophages. J Immunol 2006;177:6787–94.
30. Santillan AA, Camargo CA Jr, Colditz GA. A meta-analysis of asthma and risk of lung cancer. Cancer Causes Control 2003;14:327–34.
31. Cockcroft DW, Klein GJ, Donevan RE, Copland GM. Is there a negative correlation between malignancy and respiratory atopy? Ann Allergy 1979;43:345–7.
32. Diatchenko L, Romanov S, Malinina I, Clarke J, Tchivilev I, Li X, et al. Identification of novel mediators of NF-kappaB through genome-wide survey of monocyte adherence-induced genes. J Leukoc Biol 2005;78:1366–77.
33. Kagaya S, Hashida R, Ohkura N, Tsukada T, Sugita Y, Terakawa M, et al. NR4A orphan nuclear receptor family in peripheral blood eosinophils from patients with atopic dermatitis and apoptotic eosinophils in vitro. Int Arch Allergy Immunol 2005;137(Suppl 1):35–44.
34. Spitz MR, Amos CI, D'Amelio A Jr, Dong Q, Etzel C. Re: Discriminatory accuracy from single-nucleotide polymorphisms in models to predict breast cancer risk. J Natl Cancer Inst 2009;101:1731–2.
35. Husgafvel-Pursiainen K. Genotoxicity of environmental tobacco smoke: a review. Mutat Res 2004;567:427–45.
36. Wu AH. Exposure misclassification bias in studies of environmental tobacco smoke and lung cancer. Environ Health Perspect 1999;107(Suppl 6):873–7.
37. Arheart KL, Lee DJ, Fleming LE, LeBlanc WG, Dietz NA, McCollister KE, et al. Accuracy of self-reported smoking and secondhand smoke exposure in the US workforce: the National Health and Nutrition Examination Surveys. J Occup Environment Med 2008;50:1414–20.
38. The Gene Ontology Consortium. Gene ontology: tool for the unification of biology. (cited 2010 August 25). Available from: http://www.geneontology.org.
39. The National Center for Biotechnology Information (NCBI). (cited 2010 August 25). Available from: http://www.ncbi.nlm.nih.gov/pubmed.
40. International HapMap Consortium. The international HapMap project. (cited 2011 August 25). Available from: http://hapmap.ncbi.nlm.nih.gov/cgi-perl/gbrowse/hapmap24_B36/.
41. Documentation for ldSelect Version 1.0 Deborah A. Nickerson, Mark Rieder, Chris Carlson, Qian Yi, University of Washington. (cited 2010 August 25). Available from: http://droog.gs.washington.edu/ldSelect.html.
42. UCSC Genome Bioinformatics Genome Browser. (cited 2010 August 25). Available from: http://genome.ucsc.edu.
43. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreria MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–75.
44. Wakefield J. A Bayesian measure of the probability of false discovery in genetic epidemiology studies. Am J Hum Genet 2007;81:208–27.
45. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005;21:263–5.
46. Diabetes Genetics Initiative of Broad Institute of Harvard and MIT, Lund University, and Novartis Institutes of BioMedical Research. (cited 2009 Jan 15). Available from: http://www.broadinstitute.org/science/projects/diabetes-genetics-initiative/plotting-genome-wide-association-results.
47. Markov Chain Haplotyping (MACH) software tool for haplotype estimation and genotype imputation. Developed by Goncalo Abecasis and Yun Li. (cited 2008 April). Available from: http://www.sph.umich.edu/csg/abecasis/MACH/index.html release MACH1.016.
48. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–45.