Background:

Lung cancer risk attributable to smoking is dose dependent, yet few studies examining a polygenic risk score (PRS) by smoking interaction have included comprehensive lifetime pack-years smoked.

Methods:

We analyzed data from participants of European ancestry in the Framingham Heart Study Original (n = 454) and Offspring (n = 2,470) cohorts enrolled in 1954 and 1971, respectively, and followed through 2018. We built a PRS for lung cancer using participant genotyping data and genome-wide association study summary statistics from a recent study in the OncoArray Consortium. We used Cox proportional hazards regression models to assess risk and the interaction between pack-years smoked and genetic risk for lung cancer adjusting for European ancestry, age, sex, and education.

Results:

We observed a significant submultiplicative interaction between pack-years and PRS on lung cancer risk (P = 0.09). Thus, the relative risk associated with each additional 10 pack-years smoked decreased with increasing genetic risk (HR = 1.56 at one SD below mean PRS, HR = 1.48 at mean PRS, and HR = 1.40 at one SD above mean PRS). Similarly, lung cancer risk per SD increase in the PRS was highest among those who had never smoked (HR = 1.55) and decreased with heavier smoking (HR = 1.32 at 30 pack-years).

Conclusions:

These results suggest the presence of a submultiplicative interaction between pack-years and genetics on lung cancer risk, consistent with recent findings. Both smoking and genetics were significantly associated with lung cancer risk.

Impact:

These results underscore the contributions of genetics and smoking on lung cancer risk and highlight the negative impact of continued smoking regardless of genetic risk.

Lung cancer is the leading cause of cancer mortality in the United States (1). Cigarette smoking is the primary risk factor for lung cancer and is linked to 80% to 90% of these deaths (2). The risk of lung cancer due to smoking is dose dependent, such that those with a longer and/or heavier smoking history are at greater risk of developing lung cancer (3–6). A person's smoking history can be represented through pack-years: packs of cigarettes smoked per day (20 cigarettes per pack) multiplied by the years smoked. Importantly, genetic susceptibility to lung cancer is also well established, with prior studies providing strong support for the contribution of germline genetic variation to lung cancer (7–21). Many of the genetic loci that convey a greater risk of lung cancer are also associated with smoking behaviors (9, 10, 15, 18, 22, 23), which supports the presence of a gene-by-smoking interaction for lung cancer (19, 20, 24–27).
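The pack-years definition given above can be written as a one-line calculation. The following is a minimal sketch only; the function and parameter names are illustrative and do not come from the study.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float,
               cigarettes_per_pack: int = 20) -> float:
    """Packs smoked per day multiplied by the number of years smoked."""
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("inputs must be non-negative")
    packs_per_day = cigarettes_per_day / cigarettes_per_pack
    return packs_per_day * years_smoked


# One pack (20 cigarettes) per day for 30 years
print(pack_years(20, 30))
# Half a pack per day for 20 years
print(pack_years(10, 20))
```

Under this definition, someone who smokes one pack per day for 30 years and someone who smokes two packs per day for 15 years have the same cumulative exposure.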

Wang and colleagues recently evaluated gene–environment interactions on lung cancer risk using genotype data collected from the prospective UK Biobank (26). They observed that smoking was the leading risk factor for lung cancer incidence with an unweighted population attributable fraction of 63.73% and that smoking displayed a positive additive interaction with their polygenic risk score (PRS) to influence lung cancer risk (26). The interaction between smoking and high genetic risk possessed an unweighted population attributable fraction of 17.85% (26). However, they classified individuals with a history of smoking as heavy (≥40 pack-years) or non-heavy (<40 pack-years), genetic risk as “low,” “medium,” or “high,” and analyzed the categorical versions of these variables rather than using them continuously, which could potentially lead to biased results if the chosen cut-off points do not correspond to clinical decision points or a true difference in risk.

Although prior findings support a gene-by-smoking interaction, few have examined the dose-dependency by incorporating cumulative pack-years (26) and have instead simplified smoking history to current/former/never smoking status (19). Extending on these findings, we sought to evaluate the presence of a potential interaction between genetic risk and cumulative pack-years smoked, both examined continuously. Furthermore, many prior studies have used a case–control design without follow-up or multiple assessments of smoking status over time. To this end, we applied a PRS to the Original and Offspring cohorts of the Framingham Heart Study (FHS), with a median of greater than nine smoking assessments over five decades of follow-up, to examine the interplay between genetic risk and lifetime pack-years smoked on lung cancer risk.

Sample description

This investigation includes data from the FHS, a longitudinal cohort of community-dwelling individuals from the town of Framingham, Massachusetts, which was established in 1948 to determine the risk factors for cardiovascular disease (28). Original cohort participants presented for biennial examinations during which routine physical examinations were conducted and health-related questionnaires were completed. In 1971, children of the Original cohort participants, as well as other community-dwelling individuals, were enrolled into the Offspring cohort; follow-up examinations occur every 4 years (29). As FHS is community-based, the cohort characteristics reflect the town's residents at the time of enrollment in that the participants are almost exclusively middle-class individuals of European ancestry. The current analysis includes participants in the Original (28) and Offspring (29) cohorts who attended their fourth (1954–1958, n = 4,541) and first (1971–1975, n = 5,122) examination cycles, respectively. Included participants were free of lung cancer at baseline and possessed complete data on smoking history and genetic information (Fig. 1). Following exclusions, our analytic sample included 2,924 individuals (454 Original cohort; 2,470 Offspring cohort) who attended a total of 29,564 examinations (referred to hereafter as “person-examinations”). Notably, DNA was extracted from stored blood samples in the late 1980s and early 1990s upon receipt of written participant consent in accordance with the Belmont Report; by this point, many of the Original cohort participants had passed away and were unable to consent. As such, genetic data were obtained on fewer than 30% of Original cohort participants (30). Of the 2,470 Offspring cohort participants in our sample, 393 have one parent in the FHS study and 83 have both parents in the cohort.
Participant characteristics were assessed regularly via in-person clinic examination throughout the follow-up period: approximately every 2 years for the Original cohort (28), and every 4 years for the Offspring cohort (29). This study was approved by the Vanderbilt University Medical Center Institutional Review Board.

Figure 1.

This flow chart shows the number of individuals in each cohort who did not meet the sequential inclusion criteria and were therefore excluded from the analytic sample. The remaining 2,924 individuals (454 Original cohort, 2,470 Offspring cohort) were pooled into a single cohort for analysis. (a) The number of individuals who attended the specified exam and provided genetic data. (b) To accurately capture lifetime smoking exposure, it was essential to know smoking history prior to baseline. (c) While genetic samples were provided by these individuals, they did not provide consent for genetic analysis by non-FHS investigators. (d) Original cohort participants were seen roughly every 2 years. After 5 years without an update (effectively one missed exam plus an additional year), individuals were censored to avoid carrying values forward for an extended period without reassessment. Similarly, Offspring cohort participants were seen roughly every 4 years and were thus censored after 9 years without an update (also corresponding to a single missed exam plus an additional year).

Outcome event

FHS participants were continuously followed for diagnosis of lung cancer during the follow-up period from the baseline examination [examination cycle 4 (Original cohort); examination cycle 1 (Offspring cohort)] until the end of 2013 (Original cohort) or 2018 (Offspring cohort). Follow-up for the Original cohort concluded at the end of 2013 because fewer than 40 participants remained alive, with a mean age of 96 years. Lung cancer incidence was adjudicated following standardized protocols, including a review of medical records and pathology and laboratory reports (31). In brief, FHS participants (or their designated family member if the participant is unable to respond for themselves due to death or illness) and participants’ primary care physicians are contacted at the end of each calendar year and are asked to report any hospitalizations, new cancer diagnoses, or deaths. Study staff then review this information along with data collected during routine Study examinations (biennial for Original cohort, quadrennial for Offspring cohort) to adjudicate cancer diagnoses. If more information is needed to confirm the diagnosis or cancer location, all hospital records are obtained from the participant's physician, with consent, for review. If uncertainty remains, a surgical oncologist reviews the information to adjudicate the diagnosis.

Genetic data and quality control

For this analysis, we used existing genotyping data from the Affymetrix GeneChip Human Mapping 500K Array and the 50K Human Gene Focused Panel platforms which are included as part of the FHS SNP Health Association Resource (SHARe) data available in the database of Genotypes and Phenotypes (dbGaP; ref. 32). Biospecimens for DNA extraction were collected from FHS participants between 1971 and 2002 and genotyped on the Affymetrix 500K and MIPS 50K arrays. Genotyping data were mapped to genome build 37 (32). Depending on when an individual's DNA was collected, there is potential for immortal time bias because there are potentially decades between the baseline examination and DNA extraction. A large amount of time between baseline and genetic testing will bias our results if many people die from lung cancer prior to DNA collection. Thus, we examined the baseline characteristics of individuals excluded because of a lack of genetic information versus those included in the sample; no significant differences were observed. After assessing the possibility of immortal person-time bias but before combining phenotypic and genotypic data, we performed quality control of the genotyping data (33).

Imputed genotyping data on FHS participants were downloaded from dbGaP (phs000710.v1.p1). Genotypes were previously imputed to the 1000 Genomes using MACH (version 1.00.15) and HapMap (release 22, build 36, CEU) as the imputation backbone. From the imputed data, we excluded SNPs with an average call rate <90%, in linkage disequilibrium (r² > 0.7), or minor allele frequency <5% (34).

Principal components to identify population substructure were estimated using EIGENSTRAT and were provided by FHS (35, 36).

PRS

To build a lung cancer PRS, we used summary statistics from a previously performed genome-wide association study (GWAS) in individuals of European ancestry from the OncoArray Consortium (7). The OncoArray genotyping platform covers 533,631 SNPs that passed quality control procedures (18); of these, 517,482 SNPs passed the filtering algorithm described by McKay and colleagues (adherence to Hardy–Weinberg equilibrium and call rate >95%) and were included in their analyses of lung cancer risk (7). Genotypes were then imputed using the 1000 Genomes (phase III) as the reference panel (18). The OncoArray Consortium sample used in the GWAS for lung cancer risk contains data on 85,716 participants from 26 cohorts; FHS was not among them. Smoking data were available on 50,046 OncoArray Consortium participants; of those, 40,187 (80%) had ever smoked, and 20,833 (42%) were current smokers (7). We determined the overlap between the OncoArray and the imputed FHS genotyping data and built our PRS from the shared risk variants. We used PRSice (37), a P-value thresholding method, to develop our PRS. After linkage disequilibrium pruning, we retained 83,304 SNPs. We then constructed two PRSs: one weighted by the regression coefficient (log-odds) between the SNP and lung cancer risk, that is, β, referred to as PRSβ hereafter, and the second (PRSβ/Var(β)) was weighted by the regression coefficient divided by its estimated variance, that is, β/[Var(β)] to incorporate a measure of uncertainty.
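The two weighting schemes described above can be illustrated with a small sketch. It assumes each score is a plain weighted sum of risk-allele dosages (0 to 2 per SNP); the scoring options actually used within PRSice are not stated in the text and are not reproduced here. All names and the toy numbers are hypothetical.

```python
def polygenic_scores(dosages, betas, variances):
    """Return (PRS weighted by beta, PRS weighted by beta / Var(beta)).

    dosages   : risk-allele dosages for one person, one value per SNP
    betas     : GWAS log-odds coefficients, one per SNP
    variances : estimated variance of each coefficient, one per SNP
    """
    if not (len(dosages) == len(betas) == len(variances)):
        raise ValueError("all inputs must have one entry per SNP")
    prs_beta = sum(d * b for d, b in zip(dosages, betas))
    prs_beta_var = sum(d * b / v for d, b, v in zip(dosages, betas, variances))
    return prs_beta, prs_beta_var


# Toy example with three SNPs (values are made up for illustration)
betas = [0.10, -0.20, 0.30]
variances = [0.01, 0.04, 0.09]
prs_b, prs_bv = polygenic_scores([0, 1, 2], betas, variances)
print(round(prs_b, 3), round(prs_bv, 3))
```

Dividing each coefficient by its variance gives more weight to SNPs whose effects were estimated precisely in the GWAS, which is the sense in which the second score incorporates uncertainty.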

Ascertainment of smoking history and potential confounders

Collection and construction of smoking variables, including pack-years smoked, have been described previously (38, 39). At the baseline examination (cycle 4 for Original cohort, cycle 1 for Offspring cohort), data on current and prior smoking behaviors were collected. For current and former smokers, information was obtained on age at which the participant started smoking, usual number of cigarettes smoked per day in the past, age quit smoking (former smokers), and current number of cigarettes smoked per day. From these data, we calculated pack-years at baseline for both current and former smokers; never smokers were assigned a pack-years value of 0. Pack-years were updated at each examination in the follow-up period as described below.

At postbaseline examinations, data on current smoking status and cigarettes per day were collected, allowing us to calculate cumulative smoking exposure. For a given participant, smoking status (current, former, never) could change over time such that each participant contributed person examinations and person time to the category reflecting his or her status at each assessment. If an individual developed lung cancer, this event counted only in the group to which the individual belonged at the time of the event. The median number of examinations during which smoking was assessed was 22 for the Original cohort and 9 for the Offspring cohort.
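One way to picture the time-updating of pack-years is sketched below. This is an illustration under a stated assumption, not the study's algorithm (which is described in refs. 38 and 39): it assumes that the cigarettes per day reported at an examination applied throughout the interval since the previous examination. Function and variable names are hypothetical.

```python
def update_pack_years(baseline_pack_years, exams, cigarettes_per_pack=20):
    """Return cumulative pack-years after each follow-up examination.

    exams : list of (years_since_previous_exam, cigarettes_per_day) tuples.
            Former and never smokers report 0 cigarettes per day.
    Assumption: the reported intensity held for the whole preceding interval.
    """
    total = float(baseline_pack_years)
    history = []
    for years_elapsed, cigs_per_day in exams:
        total += (cigs_per_day / cigarettes_per_pack) * years_elapsed
        history.append(total)
    return history


# Offspring-style schedule: exams every 4 years.
# The participant smokes a pack a day, quits, then resumes at half a pack.
print(update_pack_years(10.0, [(4, 20), (4, 0), (4, 10)]))
```

In this scheme, pack-years never decrease: quitting stops accumulation but does not remove past exposure, which matches the notion of lifetime pack-years used in the analysis.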

We identified age, sex, education, genetics, population substructure, and family history as potential confounders of the association between smoking and lung cancer. Age (in years), sex (male/female), and education at the baseline examination (less than high school graduate, high school graduate, more than high school education) were available in the FHS examination data. Population substructure was identified from principal components as described above. Lung cancer family history was not available in these data and was therefore not included in models. Dynamic predictors (smoking variables and age) were updated at each examination while static predictors (PRS, population substructure, sex, education) remained constant over follow-up. An example timeline of exams, including variable updates and censoring, is displayed in Supplementary Fig. S1.

Missing covariate data

In the FHS data from in-person examinations, missingness was relatively low. In the Original cohort, 87% of person-exams had no missing data while in the Offspring cohort, 98% of person-exams had no missing data. Education level was most frequently missing in the Offspring cohort with a mere 2.1% missingness. Smoking status was not assessed during all the Original cohort exams and was missing at 13% of person-examinations. However, smoking status was collected at all Offspring examinations and was missing at <1% of person-exams in this cohort. All other variables had <5% missingness.

Missing covariate data were handled using multiple imputation by chained equations to produce five complete datasets for analysis. We imputed continuous variables using predictive mean matching (40) to produce imputed values that are clinically plausible. Categorical variables were imputed using the discriminant function with a noninformative Jeffreys prior (41). Results across imputed datasets were combined according to Rubin's rules (42).
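For readers unfamiliar with Rubin's rules, the pooling step can be sketched as follows. The pooled estimate is the mean of the per-dataset estimates, and the total variance adds the average within-imputation variance to the between-imputation variance inflated by (1 + 1/m). The input numbers below are invented for illustration and are not results from this study.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine estimates from m imputed datasets using Rubin's rules.

    estimates : per-dataset point estimates (e.g., log hazard ratios)
    variances : per-dataset squared standard errors
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching inputs from at least two imputations")
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Five imputations, as in the text; the values themselves are hypothetical
est, var = pool_rubin([0.38, 0.40, 0.39, 0.41, 0.42], [0.0016] * 5)
print(round(est, 4), round(var, 6))
```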

Statistical analysis

We calculated baseline summary statistics by smoking status. Means and SD were reported for normally distributed continuous variables, while medians along with the 25th and 75th percentile were reported for continuous variables with skewed distributions. We report counts and percentages for categorical variables. We then plotted the distribution of pack-years smoked stratified by smoking status (former/current) and calculated the Pearson correlation between each of the PRSs and continuous pack-years to assess their association. Next, we determined how many parent-child sets existed in our data in which the child and at least one parent developed lung cancer.

Using Poisson regression with an offset term equal to the natural logarithm of follow-up time, we calculated lung cancer incidence rates per 1,000 person-years stratified by smoking status; current and former smokers were further categorized by above/below 20 pack-years as this was the median pack-years in our sample over all included person-examinations as well as the eligibility threshold for initiating lung cancer screening in the United States (43).
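As a simplified illustration of this calculation: for a single stratum, an intercept-only Poisson model with a log person-time offset has the crude rate (events divided by person-years) as its maximum likelihood estimate, and a Wald interval on the log scale has standard error 1/sqrt(events). The sketch below uses that shortcut and ignores multiple imputation, so it is an approximation rather than the study's procedure; the function name is hypothetical.

```python
import math


def incidence_rate_per_1000(events, person_years, z=1.96):
    """Crude incidence rate per 1,000 person-years with a log-scale Wald CI.

    Equivalent to the fitted rate from an intercept-only Poisson model with
    offset log(person_years); requires at least one event.
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = events / person_years
    se_log = 1 / math.sqrt(events)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return tuple(1000 * value for value in (rate, lower, upper))


# Former smokers in Table 2: 48 lung cancers over 42,904 person-years
rate, lower, upper = incidence_rate_per_1000(48, 42904)
print(f"{rate:.2f} ({lower:.2f}-{upper:.2f})")
```

Applied to the former-smoker row of Table 2, this shortcut comes close to the reported rate of 1.12 (0.84–1.49) per 1,000 person-years; small differences in the interval are expected because the published values pool five imputed datasets.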

After confirming the proportional hazards assumption via the interaction of continuous pack-years and PRS (separately) with the natural logarithm of follow-up time, we fit Cox proportional hazards regression models with incident lung cancer as the outcome. Because the FHS includes related individuals, we used mixed-effects Cox proportional hazards regression in our analyses that incorporated the kinship matrix to adjust the variance accordingly. Models included continuous pack-years, smoking status, lung cancer PRS, principal component 1 [PC1 (representing European ancestry)], age, sex, and educational attainment as independent variables.

We first assessed whether the PRS modified the effect of the association between continuous pack-years and lung cancer risk using a test of heterogeneity. In the presence of heterogeneity, we assessed the lung cancer risk associated with each increase of 10 pack-years at the mean PRS and 1 SD above and below the mean (separate models for each PRS). We then examined the risk associated with a one SD increase in each PRS at 0, 10, 20, and 30 pack-years smoked. Note that assessing the risk at 0 pack-years is analogous to estimating risk among those who have never smoked. Finally, we calculated the area under the ROC curve (AUC) for the models incorporating the interactions between each PRS and pack-years smoked. Tests for heterogeneity, modeling, and calculation of the AUC were performed for both versions of the PRS.
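The conditional hazard ratios described above follow directly from a Cox model with a product term. The notation below is ours, introduced for illustration: x is pack-years, g is the PRS, σ is the SD of the PRS, z collects the remaining covariates, and the kinship random effect is omitted for brevity.

```latex
\begin{align}
\log h(t \mid x, g, \mathbf{z}) &= \log h_0(t) + \beta_1 x + \beta_2 g
      + \beta_3\, x g + \boldsymbol{\gamma}^{\top}\mathbf{z} \\
\mathrm{HR}_{\text{per 10 pack-years}}(g) &= \exp\{10\,(\beta_1 + \beta_3\, g)\} \\
\mathrm{HR}_{\text{per SD of PRS}}(x) &= \exp\{\sigma\,(\beta_2 + \beta_3\, x)\}
\end{align}
```

The test of heterogeneity corresponds to testing β₃ = 0. A negative β₃ means each hazard ratio shrinks as the other variable increases, which is what is meant by a submultiplicative interaction; setting x = 0 in the last expression gives the PRS effect among those who have never smoked.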

A two-sided P value <0.05 was considered statistically significant except for tests for heterogeneity which considered a more liberal two-sided P value <0.1 significant as our number of events was relatively low and tests for interactions are typically underpowered (44, 45). For all regression models, missing data were handled via multiple imputation by chained equation techniques. Analyses were performed in SAS 9.4 and R 4.0.1.

Data availability

Genotypic and phenotypic data for this project were obtained from dbGaP (phs000342.v20.p13: NHLBI Framingham SHARe) and are available to other researchers upon data request approval through dbGaP at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000342.v20.p13.

Sample characteristics

Our sample included 2,924 FHS participants: 1,170 who had never smoked, 520 who formerly smoked, and 1,234 currently smoking individuals. At study baseline, those who had never smoked or currently smoke had an average age of 35 years compared with formerly smoking individuals, who had a baseline age of 38 years (Table 1). Women were most likely to have never smoked, while men were most likely to currently smoke. Currently smoking individuals consumed a median of one pack per day (20 cigarettes). Individuals who had formerly smoked had a median of 11 pack-years compared with currently smoking individuals, who had a median of 13.4 pack-years. Consistently, the distribution of pack-years was more heavily right-skewed among current than former smokers (Supplementary Fig. S2). While PRS distributions were similar across smoking status (Table 1), we observed a positive correlation between the PRSs and continuous pack-years (r = 0.042, P = 0.0226 for PRSβ; r = 0.040, P = 0.0312 for PRSβ/Var(β)).

Table 1.

Baseline sample statistics by smoking status.

| Characteristic (a) | Overall (N = 2,924): N | Overall: Summary | Never smoked (N = 1,170): N | Never smoked: Summary | Formerly smoked (N = 520): N | Formerly smoked: Summary | Currently smoking (N = 1,234): N | Currently smoking: Summary |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Age, years | 2,924 | 35.4 (9.8) | 1,170 | 34.8 (10.8) | 520 | 37.7 (8.6) | 1,234 | 35.1 (9.2) |
| Sex | 2,924 | — | 1,170 | — | 520 | — | 1,234 | — |
| Male | — | 1,286 (44.0) | — | 446 (38.1) | — | 279 (53.7) | — | 561 (45.5) |
| Female | — | 1,638 (56.0) | — | 724 (61.9) | — | 241 (46.4) | — | 673 (54.5) |
| Education | 2,585 | — | 1,050 | — | 452 | — | 1,083 | — |
| Less than High School | — | 253 (9.8) | — | 99 (9.4) | — | 38 (8.4) | — | 116 (10.7) |
| High School | — | 865 (33.5) | — | 330 (31.4) | — | 144 (31.9) | — | 391 (36.1) |
| More than High School | — | 1,467 (56.8) | — | 621 (59.1) | — | 270 (59.7) | — | 576 (53.2) |
| Body mass index, kg/m² | 2,923 | 24.2 (21.9, 27.0) | 1,169 | 24.2 (21.6, 26.9) | 520 | 24.8 (22.5, 27.4) | 1,234 | 24.0 (21.8, 26.7) |
| Cigarettes per day (b) | — | — | — | — | — | — | 1,231 | 20.0 (10.0, 30.0) |
| Pack-years | — | — | — | — | 520 | 11.0 (4.0, 22.2) | 1,234 | 13.4 (5.4, 24.0) |
| Years since quitting (c) | — | — | — | — | 520 | 6.0 (3.0, 10.0) | — | — |
| PRS | — | — | — | — | — | — | — | — |
| Weighted by β | 2,924 | 0.2 (0.0, 0.4) | 1,170 | 0.2 (0.0, 0.4) | 520 | 0.2 (0.0, 0.4) | 1,234 | 0.2 (0.0, 0.4) |
| Weighted by β/Var(β) | 2,924 | 13.2 (−0.7, 28.3) | 1,170 | 12.7 (−0.3, 27.7) | 520 | 13.2 (−1.3, 29.1) | 1,234 | 13.6 (−0.8, 28.3) |

(a) Summary statistics are displayed as mean (SD) for age, systolic blood pressure, diastolic blood pressure, and total cholesterol, as median (Q1, Q3) for body mass index, cigarettes per day, pack-years, years since quitting, and PRSs, and as N (%) for categorical variables.

(b) Applies only to currently smoking individuals.

(c) Applies only to formerly smoking individuals.

Among the 476 Offspring cohort participants in our sample with at least one parent also in the sample, 393 had a single parent in the sample; both parents of the remaining 83 were in the sample. Of the 393 with one parent in the study, 4 developed lung cancer, 1 of whom had a parent diagnosed with lung cancer as well. None of the Offspring participants with both parents in the study developed lung cancer. Thus, of 476 Offspring participants with at least one parent also in the sample, there was only one parent-child set (0.2%) in which both members developed lung cancer.

PRSs

Using PRSice, we determined that the highest R² value between lung cancer status and a PRS was achieved when including the 120 SNPs with P < 6 × 10⁻⁵ (Supplementary Fig. S3). When examining the PRS distributions by incident lung cancer status, we found that a greater burden of risk alleles was present in those who developed lung cancer than in those who did not (Supplementary Fig. S4).

Effect of pack-years smoked and PRS on lung cancer risk

Among the 2,924 individuals in our sample, 86 were diagnosed with incident lung cancer. We observed that both formerly and currently smoking individuals had higher lung cancer incidence rates than individuals who never smoked, with the highest rates in current smokers (Table 2). When stratifying smokers’ incidence rates by above/below 20 pack-years, point estimates were slightly higher in formerly smoking individuals in both categories than currently smoking individuals, but confidence intervals overlapped, indicating no significant difference.

Table 2.

Lung cancer incidence by smoking status and smoking intensity.

| Smoking status | Person-examinations | Person-years | Lung cancer | Incidence rate per 1,000 PY (95% CI) |
| --- | --- | --- | --- | --- |
| Never | 10,848 | 44,532 | | 0.13 (0.06–0.30) |
| Former | 9,975 | 42,904 | 48 | 1.12 (0.84–1.49) |
| Former, < 20 PKY | 5,696 | 25,537 | | 0.26 (0.12–0.57) |
| Former, ≥ 20 PKY | 4,279 | 17,367 | 42 | 2.39 (1.76–3.24) |
| Current | 6,114 | 25,194 | 32 | 1.27 (0.90–1.80) |
| Current, < 20 PKY | 2,394 | 10,912 | | 0.09 (0.01–0.65) |
| Current, ≥ 20 PKY | 3,720 | 14,282 | 31 | 2.17 (1.53–3.09) |

Note: Cells are time-updated such that as individuals begin and quit smoking, they contribute person-time to various groups. An individual's lung cancer event only contributes to the group he or she was in at the time of diagnosis. Incidence rates and corresponding 95% confidence intervals use data from five multiple imputations. Other columns are based on the first imputation alone.

Abbreviations: CI, confidence interval; PKY, pack-years; PY, person-years.

We then assessed the presence of a PRS-by-pack-years interaction on lung cancer risk by performing tests for heterogeneity for each PRS interacting with pack-years. For both PRSs, P values were ≤0.1 (PRSβ×pack-years P = 0.09; PRSβ/Var(β)×pack-years P = 0.098), and the beta coefficient for the interaction term was negative, indicating the presence of an antagonistic (i.e., submultiplicative) interaction. Because of this heterogeneity, we analyzed the effect of each additional 10 pack-years smoked at the mean PRS, one SD below the mean PRS, and one SD above the mean PRS. Similarly, we assessed the impact of each SD increase in the PRS at 0, 10, 20, and 30 pack-years smoked.

Using PRSβ/Var(β) as an example, each additional 10 pack-years smoked was significantly associated with: a 56% increase in lung cancer risk at one SD below the mean of PRSβ/Var(β), a 48% increase in risk at the mean of PRSβ/Var(β), and a 40% increase at one SD above the mean of PRSβ/Var(β); a similar pattern was observed for PRSβ (Table 3). Continuing to use PRSβ/Var(β) as an example, we observed that each SD increase in PRSβ/Var(β) was significantly associated with: a 55% increase in the risk of lung cancer at 0 pack-years (individuals who had never smoked), a 47% increase in risk at 10 pack-years, a 39% increase in risk at 20 pack-years smoked, and a 32% increase in risk at 30 pack-years smoked; a similar trend was observed for PRSβ (Table 4). Thus, as each PRS increased, each additional 10 pack-years smoked conferred a relatively smaller increase in lung cancer risk, and as pack-years increased, each SD increase in the PRS was associated with a smaller, but still significant, increase in lung cancer risk. This is consistent with our observation of a submultiplicative interaction.
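Because a Cox model with a single product term is linear on the log-hazard scale, the conditional hazard ratios reported here should change by a constant factor for each equal step in the modifying variable. The short check below, a sketch with a hypothetical helper function, takes two reported hazard ratios and extrapolates log-linearly to the other reported values as a consistency check on the rounded estimates.

```python
import math


def hr_at(x, x0, hr0, x1, hr1):
    """Hazard ratio at x, extrapolated log-linearly from two known points."""
    slope = (math.log(hr1) - math.log(hr0)) / (x1 - x0)
    return math.exp(math.log(hr0) + slope * (x - x0))


# HR per 10 pack-years: use the values at the mean PRS (0 SD) and +1 SD
# to extrapolate to 1 SD below the mean.
print(round(hr_at(-1, 0, 1.48, 1, 1.40), 2))

# HR per SD of PRS: use the values at 0 and 10 pack-years to extrapolate
# to 20 and 30 pack-years.
print(round(hr_at(20, 0, 1.55, 10, 1.47), 2))
print(round(hr_at(30, 0, 1.55, 10, 1.47), 2))
```

The extrapolated values agree with the reported hazard ratios to two decimal places (1.56 per 10 pack-years at one SD below the mean PRS; 1.39 and 1.32 per SD of PRS at 20 and 30 pack-years), as expected when a single negative interaction coefficient drives the pattern.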

Table 3.

Effect of each additional 10 pack-years smoked at varying PRS values on lung cancer risk.

| PRS value | HR (95% CI) per 10 pack-year increase | P-value |
| --- | --- | --- |
| PRSβ (SD = 0.31) | | |
| 1 SD below mean PRS | 1.56 (1.40–1.75) | <0.0001 |
| Mean PRS | 1.48 (1.37–1.60) | <0.0001 |
| 1 SD above mean PRS | 1.40 (1.28–1.53) | <0.0001 |
| PRSβ/Var(β) (SD = 21.92) | | |
| 1 SD below mean PRS | 1.56 (1.39–1.75) | <0.0001 |
| Mean PRS | 1.48 (1.37–1.60) | <0.0001 |
| 1 SD above mean PRS | 1.40 (1.28–1.54) | <0.0001 |

Note: HRs are per 10 pack-years and are estimated from mixed-effects Cox proportional hazards regression models with incident lung cancer as the outcome and adjusting for continuous pack-years, PRS, the interaction between continuous pack-years and PRS, age, sex, current smoking status, education, and principal component 1. The variance structure accounts for familial relationships.

Abbreviations: CI, confidence interval; HR, hazard ratio; PRS, polygenic risk score; SD, standard deviation.

Table 4.

Effect of each SD increase in PRS at varying pack-year values on lung cancer risk.

| Pack-years smoked | HR (95% CI) per SD increase in PRS | P-value |
| --- | --- | --- |
| PRSβ (SD = 0.31) | | |
| 0 Pack-Years (Never Smoked) | 1.57 (1.09–2.25) | 0.0154 |
| 10 Pack-Years | 1.48 (1.08–2.03) | 0.0148 |
| 20 Pack-Years | 1.40 (1.06–1.85) | 0.0165 |
| 30 Pack-Years | 1.33 (1.04–1.70) | 0.0250 |
| PRSβ/Var(β) (SD = 21.92) | | |
| 0 Pack-Years (Never Smoked) | 1.55 (1.08–2.22) | 0.0175 |
| 10 Pack-Years | 1.47 (1.07–2.01) | 0.0165 |
| 20 Pack-Years | 1.39 (1.06–1.83) | 0.0180 |
| 30 Pack-Years | 1.32 (1.03–1.68) | 0.0263 |

Note: HRs are per SD increase in PRS and are estimated from mixed-effects Cox proportional hazards regression models with incident lung cancer as the outcome and adjusting for continuous pack-years, PRS, the interaction between continuous pack-years and PRS, age, sex, current smoking status, education, and principal component 1. The variance structure accounts for familial relationships.

Abbreviations: CI, confidence interval; PRS, polygenic risk score; SD, standard deviation.

The models which incorporated the interaction between each PRS and pack-years smoked possessed strong discrimination, with AUC values above 0.85 (AUC for model with PRSβ = 0.8624, AUC for model with PRSβ/Var(β) = 0.8616).

To our knowledge, this is the first investigation to observe a gene-by-smoking interaction in a prospective cohort with continuous pack-years over a 50-year period and adjudicated lung cancer incidence. We confirmed that pack-years smoked remains a strong risk factor for lung cancer, even when adjusting for genetic contribution. After accounting for the interaction between smoking and genetic risk, both the PRS and pack-years remained significantly associated with lung cancer incidence. Because the interaction was submultiplicative, the risk of lung cancer associated with increasing pack-years smoked was greatest among those with the lowest genetic risk and the risk associated with genetic risk was greatest among those who had never smoked; this is consistent with other recent findings (46). Thus, although there is an interaction between the genetic risk of lung cancer and pack-years smoked, it is not so strong as to indicate that there is any level of genetic risk at which smoking becomes “safe”—the long-term benefits of smoking cessation, including decreased mortality both pre- and post-cancer diagnosis, cannot be overstated (38, 47, 48).

While the presence of a gene-by-smoking interaction has been observed previously (19, 24–27, 46, 49), we are among the first to examine the interaction between a lung cancer PRS composed of hundreds of SNPs and continuous pack-years smoked (25). VanderWeele and colleagues were among the first to report a significant gene-by-smoking interaction on lung cancer, but they examined only two SNPs on chromosome 15 (24). Similarly, Zhou and colleagues observed that the 8092 C>A polymorphism in ERCC1 can modify the associations between polymorphisms rs1316298 and rs4589502 and smoking, demonstrating some of the missing heritability that exists in lung cancer susceptibility (49). More recently, in a study of 413,870 UK Biobank participants, Kachuri and colleagues observed that a PRS was associated with lung cancer risk and noted that modifiable risk factors, including smoking, appeared to be stronger risk stratifiers than the PRS, supporting the use of both the PRS and smoking history to estimate lung cancer risk (50). Dai and colleagues also constructed a PRS containing 19 SNPs for lung cancer in a prospective sample of Chinese men and women and found it to be associated with diagnosis of lung cancer (21). While they did not directly test for effect modification of this association by pack-years smoked, they did perform stratified analyses in non-smoking individuals as well as in heavy (≥30 pack-years) and non-heavy (<30 pack-years) smoking individuals, and observed synergistic effect modification such that those with high genetic risk and a ≥30 pack-year smoking history were at the greatest lung cancer risk (21).

Hung and colleagues recently constructed a weighted PRS with 128 SNPs using genetic data from the OncoArray Consortium as we did here. However, they validated their PRS in the UK Biobank to assess the PRS's utility in predicting lung cancer and identifying individuals eligible for lung cancer screening (19). On the whole, our results and theirs complement one another. Although only six of the SNPs in Hung's PRS overlapped with the 120 included in our PRS, they also observed a greater lung cancer risk with increasing PRS. The authors also observed an interaction between smoking status and their PRS (19) but did not assess the role of pack-years smoked. Similarly, Shi and colleagues observed a submultiplicative interaction between smoking and a PRS such that the association between the PRS and lung adenocarcinoma was stronger in individuals without a history of smoking than in those with a history of smoking (46), but as they did not have individual-level data, they were unable to assess the interaction between pack-years and their PRS.

The submultiplicative interaction we observed may arise because many of the genetic loci that convey greater risk of lung cancer are also associated with smoking behaviors (9, 10, 15, 18, 22, 23), so the PRS and pack-years smoked likely capture some shared variability in lung cancer incidence. This is also consistent with our observation that the PRSs and pack-years smoked are positively correlated. For example, three genes identified in GWAS as strongly associated with lung cancer, CHRNA3, CHRNB4, and CHRNA5, encode nicotinic acetylcholine receptor subunits (9, 10, 15, 18), and all reside on chromosome 15. These same loci are also associated with smoking behaviors, including smoking intensity, topography, duration of exposure, and successful cessation (9, 22, 23, 51). In addition, CYP2A6 and CYP2B6 are part of the P450 system and encode enzymes that govern nicotine metabolism and activate tobacco-specific nitrosamines (i.e., the carcinogens in cigarettes; refs. 52, 53), which influence cigarette consumption and smoking cessation (9, 54–56). Those who metabolize nicotine quickly tend to smoke more and are therefore exposed to more carcinogens. It is believed that CYP2A6 and CYP2B6 are associated with lung cancer because they increase an individual's propensity to smoke both longer and more heavily (57). These associations support the biologic plausibility of a submultiplicative gene-by-smoking interaction for lung cancer (19, 20, 24–27). Among never smokers, there is evidence that genetic mutations due to environmental exposures, like pollution and radon, may cause lung cancer (58, 59). For example, a recent investigation revealed that EGFR mutations, which are common among never smokers with lung cancer, are elevated among persons living in areas with high PM2.5 pollution (58). In addition, the chromosome 15q25 locus is known to be associated with lung cancer among non-smokers (10); this association may be modified by radon exposure (59).

Strengths and limitations

The decades-long follow-up and comprehensive smoking variables that are regularly collected in FHS allowed us to incorporate pack-years as a continuous time-varying covariate. We were therefore able to produce more precise estimates of the effect of pack-years smoked than are available from most other data sources, which highlights the strength of our findings when compared with case–control studies. Specifically, inclusion of continuous pack-years as opposed to categorical smoking status (current/former/never) or categorical smoking intensity (heavy vs. non-heavy) allowed us to demonstrate the graded contribution of genetics to lung cancer risk as smoking duration and intensity (captured via continuous pack-years) increased. Thus, we encourage researchers to collect and model smoking history using continuous pack-years whenever possible, as it allows the description of more granular associations between smoking and outcomes.
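As a minimal sketch of what a time-varying pack-years covariate involves, the following accumulates pack-years across examination intervals using the definition given in the introduction (packs per day, at 20 cigarettes per pack, multiplied by years smoked). The `Exam` record, the function name, and the rule that the intensity reported at an exam applies to the whole interval since the previous exam are assumptions made for illustration, not a description of FHS procedures.

```python
from dataclasses import dataclass

CIGARETTES_PER_PACK = 20


@dataclass
class Exam:
    """Hypothetical exam record; field names are illustrative."""
    age: float            # age at examination, in years
    cigs_per_day: float   # reported cigarettes per day (0 if not smoking)


def cumulative_pack_years(exams: list[Exam], start_age: float) -> list[float]:
    """Return running pack-years at each exam, in order of age.

    Assumes the intensity reported at an exam held over the interval
    since the previous exam (or since start_age for the first exam).
    """
    total = 0.0
    prev_age = start_age
    running = []
    for exam in sorted(exams, key=lambda e: e.age):
        years = max(exam.age - prev_age, 0.0)
        total += (exam.cigs_per_day / CIGARETTES_PER_PACK) * years
        running.append(total)
        prev_age = exam.age
    return running


# One pack/day from age 20 to 30, half a pack/day to age 40, then none.
history = [Exam(age=30, cigs_per_day=20), Exam(age=40, cigs_per_day=10),
           Exam(age=50, cigs_per_day=0)]
print(cumulative_pack_years(history, start_age=20))  # [10.0, 15.0, 15.0]
```

Each running total could then be attached to the corresponding follow-up interval, so that a participant's exposure value changes over time instead of being fixed at baseline.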

We acknowledge the limited statistical power resulting from a relatively small sample size and low number of incident lung cancer events; the existence of relatedness among participants further decreased our effective sample size and statistical power. In particular, although FHS is a study that includes family members, there is no efficient way to adjust for family history of lung cancer with these data because participants were never asked about their family history of lung cancer and only 19% of Offspring cohort participants in our sample had one or more parents also in our sample. Therefore, the confounding effect of family history could not be accounted for. Given that only 0.2% of the American population has a current or prior diagnosis of lung cancer (58), and only one parent-child set in our data both developed lung cancer, this bias is likely small. Because of sample size concerns, we decided a priori to use an alpha level of 0.1 to assess the presence of an interaction between PRS and pack-years. This threshold is higher than a standard alpha level of 0.05, and as our P values for the interaction were between 0.05 and 0.1, we present results which incorporate this interaction; this would not have been the case had we used an alpha level of 0.05. Results should be interpreted with this choice in mind. We were also unable to include important regions of the genome in our PRS, including the CYP2A6 nicotine metabolism gene, which is associated with both smoking behavior and lung cancer development, because this region is notoriously challenging to genotype and was not covered by our chip or imputation panel (59). Finally, FHS Original and Offspring cohort participants are predominantly of European ancestry, so results may not generalize to individuals of other genetic ancestries.

Conclusion

In conclusion, our results support the presence of a gene-by-smoking interaction on lung cancer risk, both reinforcing the negative effect of continued smoking on lung cancer regardless of genetic risk and supporting the notion that lung cancer risk attributable to genetics is highest among those who have never smoked. These results can inform future studies to quantify the value of incorporating genetic information into the linked processes of lung cancer screening and tobacco cessation treatment. Larger studies with both genetic data and longitudinal smoking information should investigate this further.

M.S. Duncan reports grants from NIH, National Center for Advancing Translational Sciences during the conduct of the study. M.C. Aldrich reports grants from NIH/NCI during the conduct of the study; personal fees from Guardant Health outside the submitted work. No disclosures were reported by the other authors.

The funders had no role in the design of the study; the collection, analysis, and interpretation of the data; the writing of the article; or the decision to submit the article for publication.

M.S. Duncan: Conceptualization, data curation, formal analysis, methodology, writing–original draft, writing–review and editing. H. Diaz-Zabala: Writing–original draft, writing–review and editing. J. Jaworski: Data curation, formal analysis, writing–review and editing. H.A. Tindle: Conceptualization, methodology, writing–review and editing. R.A. Greevy: Conceptualization, methodology, writing–review and editing. L. Lipworth: Conceptualization, writing–review and editing. R.J. Hung: Resources, writing–review and editing. M.S. Freiberg: Conceptualization, resources, methodology, writing–review and editing. M.C. Aldrich: Conceptualization, resources, methodology, writing–review and editing.

M.S. Duncan was supported by the National Center for Advancing Translational Sciences at NIH (KL2TR001996). M.C. Aldrich was supported by the NCI at NIH (R01CA251758 and U01CA253560). H. Diaz-Zabala acknowledges the support of the NCI at NIH through the VUMC Molecular and Genetic Epidemiology of Cancer training program (T32CA160056). H.A. Tindle acknowledges the support of the William Anderson Spickard, Jr, MD Chair in Medicine. M.S. Freiberg acknowledges the support of the Dorothy and Laurence Grossman Chair in Cardiology. The Framingham Heart Study is supported by contract no. 75N92019D00031 from the National Heart, Lung, and Blood Institute (NHLBI), NIH, and Department of Health and Human Services with additional support from other sources.

Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).

2. Centers for Disease Control and Prevention. What are the risk factors for lung cancer? 2023. Available from: https://www.cdc.gov/cancer/lung/basic_info/risk_factors.htm.
3. Law MR, Morris JK, Watt HC, Wald NJ. The dose-response relationship between cigarette consumption, biochemical markers and risk of lung cancer. Br J Cancer 1997;75:1690–3.
4. Ruano-Ravina A, Figueiras A, Montes-Martínez A, Barros-Dios JM. Dose-response relationship between tobacco and lung cancer: new findings. Eur J Cancer Prev 2003;12:257–63.
5. Ai F, Zhao J, Yang W, Wan X. Dose–response relationship between active smoking and lung cancer mortality/prevalence in the Chinese population: a meta-analysis. BMC Public Health 2023;23:747.
6. Warren GW, Cummings KM. Tobacco and lung cancer: risks, trends, and outcomes in patients with cancer. Am Soc Clin Oncol Educ Book 2013;359–64.
7. McKay JD, Hung RJ, Han Y, Zong X, Carreras-Torres R, Christiani DC, et al. Large-scale association analysis identifies new lung cancer susceptibility loci and heterogeneity in genetic susceptibility across histological subtypes. Nat Genet 2017;49:1126–32.
8. Bossé Y, Amos CI. A decade of GWAS results in lung cancer. Cancer Epidemiol Biomarkers Prev 2018;27:363–79.
9. Thorgeirsson TE, Geller F, Sulem P, Rafnar T, Wiste A, Magnusson KP, et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature 2008;452:638–42.
10. Hung RJ, McKay JD, Gaborieau V, Boffetta P, Hashibe M, Zaridze D, et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature 2008;452:633–7.
11. Wang Y, McKay JD, Rafnar T, Wang Z, Timofeeva MN, Broderick P, et al. Rare variants of large effect in BRCA2 and CHEK2 affect risk of lung cancer. Nat Genet 2014;46:736–41.
12. Timofeeva MN, Hung RJ, Rafnar T, Christiani DC, Field JK, Bickeböller H, et al. Influence of common genetic variation on lung cancer risk: meta-analysis of 14 900 cases and 29 485 controls. Hum Mol Genet 2012;21:4980–95.
13. Landi MT, Chatterjee N, Yu K, Goldin LR, Goldstein AM, Rotunno M, et al. A genome-wide association study of lung cancer identifies a region of chromosome 5p15 associated with risk for adenocarcinoma. Am J Hum Genet 2009;85:679–91.
14. Broderick P, Wang Y, Vijayakrishnan J, Matakidou A, Spitz MR, Eisen T, et al. Deciphering the impact of common genetic variation on lung cancer risk: a genome-wide association study. Cancer Res 2009;69:6633–41.
15. Wang Y, Broderick P, Webb E, Wu X, Vijayakrishnan J, Matakidou A, et al. Common 5p15.33 and 6p21.33 variants influence lung cancer risk. Nat Genet 2008;40:1407–9.
16. McKay JD, Hung RJ, Gaborieau V, Boffetta P, Chabrier A, Byrnes G, et al. Lung cancer susceptibility locus at 5p15.33. Nat Genet 2008;40:1404–6.
17. Liu P, Vikis HG, Wang D, Lu Y, Wang Y, Schwartz AG, et al. Familial aggregation of common sequence variants on 15q24–25.1 in lung cancer. J Natl Cancer Inst 2008;100:1326–30.
18. Amos CI, Wu X, Broderick P, Gorlov IP, Gu J, Eisen T, et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat Genet 2008;40:616–22.
19. Hung RJ, Warkentin MT, Brhane Y, Chatterjee N, Christiani DC, Landi MT, et al. Assessing lung cancer absolute risk trajectory based on a polygenic risk model. Cancer Res 2021;81:1607–15.
20. Jia G, Lu Y, Wen W, Long J, Liu Y, Tao R, et al. Evaluating the utility of polygenic risk scores in identifying high-risk individuals for eight common cancers. JNCI Cancer Spectr 2020;4:pkaa021.
21. Dai J, Lv J, Zhu M, Wang Y, Qin N, Ma H, et al. Identification of risk loci and a polygenic risk score for lung cancer: a large-scale prospective cohort study in Chinese populations. Lancet Respir Med 2019;7:881–91.
22. Bierut LJ, Stitzel JA, Wang JC, Hinrichs AL, Grucza RA, Xuei X, et al. Variants in nicotinic receptors and risk for nicotine dependence. Am J Psychiatry 2008;165:1163–71.
23. Saccone SF, Hinrichs AL, Saccone NL, Chase GA, Konvicka K, Madden PAF, et al. Cholinergic nicotinic receptor genes implicated in a nicotine dependence association study targeting 348 candidate genes with 3713 SNPs. Hum Mol Genet 2007;16:36–49.
24. VanderWeele TJ, Asomaning K, Tchetgen Tchetgen EJ, Han Y, Spitz MR, Shete S, et al. Genetic variants on 15q25.1, smoking, and lung cancer: an assessment of mediation and interaction. Am J Epidemiol 2012;175:1013–20.
25. Qian DC, Han Y, Byun J, Shin HR, Hung RJ, McLaughlin JR, et al. A novel pathway-based approach improves lung cancer risk prediction using germline genetic variations. Cancer Epidemiol Biomarkers Prev 2016;25:1208–15.
26. Wang X, Qian ZM, Zhang Z, Cai M, Chen L, Wu Y, et al. Population attributable fraction of lung cancer due to genetic variants, modifiable risk factors, and their interactions: a nationwide prospective cohort study. Chemosphere 2022;301:134773.
27. Zhang R, Chu M, Zhao Y, Wu C, Guo H, Shi Y, et al. A genome-wide gene–environment interaction analysis for tobacco smoke and lung cancer susceptibility. Carcinogenesis 2014;35:1528–35.
28. Dawber TR, Kannel WB, Lyell LP. An approach to longitudinal studies in a community: the Framingham study. Ann N Y Acad Sci 2006;107:539–56.
29. Kannel WB, Feinleib M, McNamara PM, Garrison RJ, Castelli WP. An investigation of coronary heart disease in families. The Framingham Offspring Study. Am J Epidemiol 1979;110:281–90.
30. Cupples LA, Heard-Costa N, Lee M, Atwood LD; Framingham Heart Study Investigators. Genetics analysis workshop 16 problem 2: the Framingham Heart Study data. BMC Proc 2009;3:S3.
31. Kreger BE, Splansky GL, Schatzkin A. The cancer experience in the Framingham Heart Study cohort. Cancer 1991;67:1–6.
32. Psaty BM, O'Donnell CJ, Gudnason V, Lunetta KL, Folsom AR, Rotter JI, et al. Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium. Circ Cardiovasc Genet 2009;2:73–80.
33. Turner S, Armstrong LL, Bradford Y, Carlson CS, Crawford DC, Crenshaw AT, et al. Quality control procedures for genome-wide association studies. Curr Protoc Hum Genet 2011;Chapter 1:Unit1.19.
34. Marees AT, de Kluiver H, Stringer S, Vorspan F, Curis E, Marie-Claire C, et al. A tutorial on conducting genome-wide association studies: quality control and statistical analysis. Int J Methods Psychiatr Res 2018;27:e1608.
35. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006;38:904–9.
36. Patterson N, Price AL, Reich D. Population structure and eigenanalysis. PLoS Genet 2006;2:e190.
37. Euesden J, Lewis CM, O'Reilly PF. PRSice: polygenic risk score software. Bioinformatics 2015;31:1466–8.
38. Tindle HA, Stevenson Duncan M, Greevy RA, Vasan RS, Kundu S, Massion PP, et al. Lifetime smoking history and risk of lung cancer: results from the Framingham Heart Study. J Natl Cancer Inst 2018;110:1201–7.
39. Duncan MS, Freiberg MS, Greevy RA Jr, Kundu S, Vasan RS, Tindle HA. Association of smoking cessation with subsequent risk of cardiovascular disease. JAMA 2019;322:642–50.
40. Little RJA. Missing-data adjustments in large surveys. J Bus Econ Stat 1988;6:287–96.
41. Schafer J. Chapter 7: Methods for categorical data. Analysis of incomplete multivariate data. 1st ed. New York (NY): Chapman and Hall; 1997. p. 239–88.
42. Rubin DB. Chapter 4: Randomization-based evaluations. Multiple imputation for nonresponse in surveys. New York (NY): John Wiley and Sons; 1987. p. 113–53.
43. US Preventive Services Task Force. Screening for lung cancer: US Preventive Services Task Force recommendation statement. JAMA 2021;325:962–70.
44. Kaufman JS, MacLehose RF. Which of these things is not like the others? Cancer 2013;119:4216–22.
45. Marshall SW. Power for tests of interaction: effect of raising the type I error rate. Epidemiol Perspect Innov 2007;4:4.
46. Shi J, Shiraishi K, Choi J, Matsuo K, Chen TY, Dai J, et al. Genome-wide association study of lung adenocarcinoma in East Asia and comparison with a European population. Nat Commun 2023;14:3043.
47. Sheikh M, Mukeriya A, Shangina O, Brennan P, Zaridze D. Postdiagnosis smoking cessation and reduced risk for lung cancer progression and mortality: a prospective cohort study. Ann Intern Med 2021;174:1232–9.
48. Wang X, Romero-Gutierrez CW, Kothari J, Shafer A, Li Y, Christiani DC. Prediagnosis smoking cessation and overall survival among patients with non–small cell lung cancer. JAMA Netw Open 2023;6:e2311966.
49. Zhou W, Liu G, Park S, Wang Z, Wain JC, Lynch TJ, et al. Gene-smoking interaction associations for the ERCC1 polymorphisms in the risk of lung cancer. Cancer Epidemiol Biomarkers Prev 2005;14:491–6.
50. Kachuri L, Graff RE, Smith-Byrne K, Meyers TJ, Rashkin SR, Ziv E, et al. Pan-cancer analysis demonstrates that integrating polygenic risk scores with modifiable risk factors improves risk prediction. Nat Commun 2020;11:6084.
51. Chen LS, Baker TB, Piper ME, Breslau N, Cannon DS, Doheny KF, et al. Interplay of genetic risk factors (CHRNA5-CHRNA3-CHRNB4) and cessation treatments in smoking cessation success. Am J Psychiatry 2012;169:735–42.
52. Chiang HC, Wang CY, Lee HL, Tsou TC. Metabolic effects of CYP2A6 and CYP2A13 on 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK)-induced gene mutation-A mammalian cell-based mutagenesis approach. Toxicol Appl Pharmacol 2011;253:145–52.
53. Alzahrani AM, Rajendran P. The multifarious link between cytochrome P450s and cancer. Oxid Med Cell Longev 2020;2020:3028387.
54. Mackillop J, Obasi E, Amlung MT, McGeary JE, Knopik VS. The role of genetics in nicotine dependence: mapping the pathways from genome to syndrome. Curr Cardiovasc Risk Rep 2010;4:446–53.
55. Liu M, Jiang Y, Wedow R, Li Y, Brazel DM, Chen F, et al. Association studies of up to 1.2 million individuals yield new insights into the genetic etiology of tobacco and alcohol use. Nat Genet 2019;51:237–44.
56. Bray MJ, Chen LS, Fox L, Hancock DB, Culverhouse RC, Hartz SM, et al. Dissecting the genetic overlap of smoking behaviors, lung cancer, and chronic obstructive pulmonary disease: a focus on nicotinic receptors and nicotine metabolizing enzyme. Genet Epidemiol 2020;44:748–58.
57. Saccone NL, Culverhouse RC, Schwantes-An TH, Cannon DS, Chen X, Cichon S, et al. Multiple independent loci at chromosome 15q25.1 affect smoking quantity: a meta-analysis and comparison with lung cancer and COPD. PLoS Genet 2010;6:e1001053.
58. Noone A, Howlader N, Krapcho M, Miller D, Brest A, Yu M, et al. SEER Cancer Statistics Review, 1975–2015, NCI, Bethesda, MD. Available from: https://seer.cancer.gov/archive/csr/1975_2015/#contents.
59. Wassenaar CA, Dong Q, Wei Q, Amos CI, Spitz MR, Tyndale RF. Relationship between CYP2A6 and CHRNA5-CHRNA3-CHRNB4 variation and smoking behaviors and lung cancer risk. J Natl Cancer Inst 2011;103:1342–6.
This open access article is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.