Abstract
Tobacco exposure causes 8 of 10 lung cancers, and identifying additional risk factors is challenging due to confounding introduced by smoking in traditional observational studies.
We used Mendelian randomization (MR) to screen 207 metabolites for their role in lung cancer predisposition using independent genome-wide association studies (GWAS) of blood metabolite levels (n = 7,824) and lung cancer risk (n = 29,266 cases/56,450 controls). A nested case–control study (656 cases and 1,296 matched controls) was subsequently performed using prediagnostic blood samples to validate the MR association using lung cancer incidence data from population-based cohorts (EPIC and NSHDS).
An MR-based scan of 207 circulating metabolites for lung cancer risk identified that blood isovalerylcarnitine (IVC) was associated with a decreased odds of lung cancer after accounting for multiple testing (log10-OR = 0.43; 95% CI, 0.29–0.63). Molar measurement of IVC in prediagnostic blood found similar results (log10-OR = 0.39; 95% CI, 0.21–0.72). Results were consistent across lung cancer subtypes.
Independent lines of evidence support an inverse association of elevated circulating IVC with lung cancer risk through a novel methodologic approach that integrates genetic and traditional epidemiology to efficiently identify novel cancer biomarkers.
Our results provide compelling evidence in favor of a protective role for a circulating metabolite, IVC, in lung cancer etiology. From the treatment of a Mendelian disease, isovaleric acidemia, we know that circulating IVC is modifiable through a restricted protein diet or glycine and L-carnitine supplementation. IVC may represent a modifiable and inversely associated biomarker for lung cancer.
Introduction
Lung cancer causes 1.8 million deaths worldwide and is the leading cause of cancer death globally (1). Tobacco causes 8 of 10 lung cancers (2), and a number of environmental and occupational exposures, such as radon (3) and radiation (4), are well-described. Nonetheless, little is known about specific biochemical modifiable risk factors for lung cancer. In many Western countries, a large proportion of lung cancer cases now occur in former or never smokers (5). Identifying additional modifiable risk factors for lung cancer beyond smoking is therefore of great interest, may identify individuals at risk, and provide other prevention targets.
Causal inference in humans can be biased by confounding, where the exposure and the outcome share a common cause. Research into lung cancer etiology is particularly challenging as many putative risk factors, including health conditions, socioeconomic factors, and biomarkers (6) strongly associate with smoking behaviors, which induces confounding in traditional epidemiological studies. Mendelian randomization (MR; ref. 7), which uses germline genetic variants as instrumental variables, is less prone to confounding because it relies upon the random segregation of alleles at meiosis and their random allocation at conception, thereby breaking association with nearly all confounding factors (8).
However, the causal interpretation of MR estimates relies on several major assumptions (Fig. 1, top). First, the genetic proxy must be robustly associated with the exposure. Second, the genetic proxy must not be associated with factors that confound the exposure–outcome association. Third, the genetic proxy must affect the outcome only via the exposure, that is, the absence of horizontal pleiotropy (6). Fourth, genetic proxies cannot increase the exposure in some subjects and decrease it in others: the effect must be consistent in the same direction or null (9). Several novel statistical methods and qualitative analyses have recently emerged to evaluate violations of these assumptions. In addition, horizontal pleiotropy can be reduced in metabolite studies by using genetic variants that influence the metabolite and are located in, or close to, genes whose roles in determining metabolite levels have been previously well described. Because hundreds of metabolite enzymatic pathways have been studied over the past century (10), a wealth of information is available to identify such genetic variants and assess potential bias due to horizontal pleiotropy (11, 12). Moreover, genetic and biological variability affecting the metabolic profiles of blood and other tissues is well documented to promote oncogenesis and cancer proliferation (13, 14). Therefore, metabolomics-based MR can help overcome a main limitation of MR when the genetic determinants of candidate metabolites act upon genes involved in metabolic pathways, while offering a rationale for biomarker discovery in cancer.
Recent large-scale genome-wide association studies (GWAS) have identified the genetic determinants of hundreds of biomarkers, such as metabolites (15). Therefore, two-sample MR, where the exposure and outcome are assessed in different studies (16), could be used to screen for the effect of these metabolites on disease risk if a large GWAS has been conducted for the disease (17). These results could then be assessed using direct measurement of the metabolite in appropriate case–control studies, providing converging evidence from different methods that are subject to different limitations and biases (18, 19).
Our objective was to identify metabolic risk factors for lung cancer using an approach that integrates Mendelian randomization with direct metabolite analysis in prediagnostic samples from large population cohorts.
Materials and Methods
Overall study design
Our goal was to identify etiologic metabolic markers of lung cancer risk using two independent but complementary designs: an exploratory two-sample MR in large GWAS, with validation for the importance of the most promising metabolites in prediagnostic blood from case–control studies nested in large population cohorts (Fig. 1). STROBE-MR (20) and STROBE (21) reporting guidelines were followed for MR and case–control studies, respectively.
Mendelian randomization
Study populations and data sources
SNP–metabolite association data were obtained from a metabolite GWAS in 7,824 subjects of European descent from two population-based cohorts using the Metabolon platform (15). SNP–lung cancer risk associations were extracted from a recent large-scale lung cancer GWAS with 29,266 cases and 56,450 controls of European descent (22). All studies received ethical approval from their respective review committees/boards and all participants provided written consent.
Statistical analysis
Of the 400 metabolites assayed in 7,824 individuals using the Metabolon platform (15), 207 circulating metabolites had SNPs associated at genome-wide significance (P < 5 × 10⁻⁸). SNPs were clumped at a linkage disequilibrium threshold of r² < 0.001. After data harmonization, 207 metabolites with 555 unique SNPs were included in analyses (Fig. 1, Table 1). Where a metabolite had only one available SNP, a Wald estimate was used to estimate the effect on lung cancer risk. Where multiple SNPs were available for a metabolite, odds ratios (OR) were estimated using a likelihood-based MR approach (ML; ref. 16). A false discovery rate (FDR) correction was applied to adjust for multiple-hypothesis testing in these primary MR analyses, which used all available instruments for the 207 metabolites investigated.
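As an illustration of the estimators named above, the following minimal Python sketch computes a single-SNP Wald ratio and applies an FDR adjustment. It is not the authors' code (the analyses were run in R with TwoSampleMR). The function names are hypothetical, the standard error is a first-order delta-method approximation that ignores uncertainty in the SNP–metabolite estimate, and the Benjamini-Hochberg procedure is an assumption because the text does not name the FDR method. The input numbers are invented for demonstration.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-SNP Wald ratio: SNP-outcome effect divided by SNP-exposure effect.

    The standard error is a first-order approximation that treats the
    SNP-exposure estimate as known without error (an assumption).
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


def odds_ratio_with_ci(log_odds, se, z=1.96):
    """Convert a log-odds estimate to an odds ratio with a 95% confidence interval."""
    return (math.exp(log_odds),
            math.exp(log_odds - z * se),
            math.exp(log_odds + z * se))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical inputs: SNP effect on log10 metabolite level and on log odds of disease.
estimate, se = wald_ratio(beta_exposure=0.1, beta_outcome=-0.05, se_outcome=0.02)
print(odds_ratio_with_ci(estimate, se))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In a screen of 207 metabolites, the adjusted values would be compared with the chosen FDR threshold (5% in this study) to decide which metabolites proceed to sensitivity analyses.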
Characteristic | Controls (n = 1,296) | Cases (n = 649) |
---|---|---|
Sex, n (%) | | |
Male | 723 (55.7) | 364 (56.1) |
Female | 573 (44.3) | 285 (43.9) |
Age at blood collection, mean (95% CI) | 56.7 (56.2–57.1) | 56.7 (56.1–57.3) |
BMI, mean (95% CI) | 26.3 (26.1–26.5) | 26.3 (26.0–26.6) |
Smoking status, n (%) | | |
Never | 357 (27.3) | 74 (11.2) |
Previous | 401 (30.6) | 163 (24.9) |
Current | 538 (41.1) | 412 (62.8) |
Cigarettes per day, mean (95% CI) | 9.1 (8.6–9.6) | 14.7 (13.9–15.4) |
Smoking duration (years), mean (95% CI) | 21.6 (20.7–22.5) | 31.1 (29.8–32.3) |
Time since quitting (years), mean (95% CI) | 7.3 (6.7–7.9) | 3.0 (2.3–3.8) |
Sensitivity analyses
For metabolites with statistically significant ML-based and FDR-adjusted effects, we ran weighted-median (23), weighted-mode (24), MR-Egger (25), and MR-RAPS (26) sensitivity analyses that can provide pleiotropy-robust estimates in the presence of bias from horizontal pleiotropy and can quantify net directional pleiotropy (using the MR-Egger intercept (25)). Heterogeneity of the SNP estimates, an indication of horizontal pleiotropy, was evaluated using Cochran's and Rücker's Q (27).
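The heterogeneity check described above can be sketched in a few lines of Python. This is a simplified stand-in, not the study's implementation: it uses a fixed-effect inverse-variance weighted (IVW) estimate rather than the likelihood-based estimator used in the primary analysis, first-order weights that ignore uncertainty in the SNP–metabolite effects, and hypothetical function names and input values. Cochran's Q is the weighted sum of squared deviations of each SNP's Wald ratio from the pooled estimate; values that are large relative to the degrees of freedom (number of SNPs minus one) suggest heterogeneity, which may indicate horizontal pleiotropy.

```python
import math


def ivw_and_cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate with Cochran's Q across SNP-specific Wald ratios.

    Returns (pooled estimate, its standard error, Q statistic, degrees of freedom).
    """
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order inverse-variance weight of each ratio: (beta_exposure / se_outcome)^2.
    weights = [(bx / sy) ** 2 for bx, sy in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    beta_ivw = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se_ivw = math.sqrt(1.0 / total_weight)
    q_stat = sum(w * (r - beta_ivw) ** 2 for w, r in zip(weights, ratios))
    return beta_ivw, se_ivw, q_stat, len(ratios) - 1


# Hypothetical summary statistics for three SNPs (not study data).
beta_x = [0.10, 0.20, 0.15]
beta_y = [-0.05, -0.12, -0.06]
se_y = [0.02, 0.03, 0.025]

beta, se, q, df = ivw_and_cochran_q(beta_x, beta_y, se_y)
print(f"IVW estimate = {beta:.3f} (SE {se:.3f}); Q = {q:.2f} on {df} df")
```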
Bidirectional MR was used to estimate potential lung cancer effects on metabolite concentrations as a sensitivity analysis to assess the correct orientation of MR estimates.
For metabolites with available cis-SNPs (within 1 Mb of a gene known to influence metabolite levels), a separate secondary MR analysis was conducted using the Wald ratio. MR analyses using cis-SNPs are less likely to be subject to pleiotropy (12) and provide a biological rationale for SNP–metabolite associations (9). Furthermore, a stringent Bayesian colocalization analysis was performed to assess confounding by linkage disequilibrium (28).
SNP associations with metabolite-pathway components and lung cancer risk factors were searched to qualitatively evaluate biological plausibility and pleiotropy, respectively, in the Phenoscanner (29), KEGG (30), OMIM (31), eQTLgen (32), and GTEx (33) databases as described in Materials and Methods.
Additional analyses stratified by histologic cancer subtypes (adenocarcinoma: 11,273 cases, 55,483 controls; squamous cell carcinoma: 7,426, 55,627; small cell carcinoma: 2,664, 21,444) and by smoking status (never: 2,355, 7,504; ever: 23,223, 16,964) were performed for metabolites that remained significant after sensitivity analyses.
Prospective nested case–control study
Study populations
Metabolites with robust evidence of an effect on lung cancer risk following all genetic analyses were subsequently measured using prediagnostic plasma samples from a prospective case–control study nested within the European Prospective Investigation into Cancer and Nutrition (EPIC; ref. 34) and the Northern Swedish Health and Disease Study (NSHDS; ref. 35).
The EPIC study is a large multicenter prospective cohort that recruited participants between 1992 and 1998 (34). For the case–control study, participants were taken from the 238,816 individuals from the centers in the Netherlands, United Kingdom, France, Germany, Spain, and Italy who donated a blood sample at study recruitment. NSHDS is an ongoing prospective cohort of the population in Västerbotten County, Sweden. At the end of follow-up for the current study sample in 2014, a total of 99,404 study participants who donated a blood sample at enrollment had been recruited. Further cohort information is provided in Materials and Methods.
All study participants gave written informed consent to participate in the study and the research was approved by the participating countries’ local ethics committees and IARC's Ethical Review Committee.
Outcome and study design
Incident lung cancer was defined based on the International Classification of Diseases for Oncology (ICD-O-2) and included all invasive cancers coded as “C34”. Cases were chosen to maximize time from blood collection to diagnosis (min: 2.5 years; 97% of cases over 5 years).
At the time of diagnosis of an index case, two cohort participants who were alive and free of cancer (excluding nonmelanoma skin cancer) were randomly selected as controls and matched (36) on study center, sex, date of blood collection (±12 months), and age at blood collection (±3 months, relaxed up to ±5 years), as shown in Fig. 1. To adjust for smoking and maximize power in smoking-stratified analyses, one control in each matched set was also matched on the index case's smoking status in five categories: never smokers, short- and long-term quitters among former smokers (<10 years and ≥10 years since quitting, respectively), and light and heavy smokers among current smokers (<15 cigarettes and ≥15 cigarettes per day, respectively). The overall sampling strategy yielded 649 cases and 1,296 matched controls.
Molar metabolite measurement
Molar metabolite concentrations in plasma samples were quantified at The Metabolomics Innovation Centre (TMIC, University of Victoria, Canada) by ultrahigh-performance liquid chromatography–tandem mass spectrometry (UPLC-MS/MS) operated in the multiple-reaction monitoring mode and expressed in nmol/L as described in Materials and Methods.
Statistical analyses
Descriptive statistics were conducted for anthropometric and lifestyle characteristics between cases and controls. Metabolite concentrations were log10-transformed to allow for direct comparison between case–control and MR estimates, which were measured on this scale. Linear regression was used to test for linear trends among controls by strata of selected characteristics (sex, age, body mass index, and smoking traits).
The primary analysis used a conditional logistic regression model to examine the statistical association of the prioritized metabolites with lung cancer risk, conditioned on the matching factors and adjusted for age, BMI, and smoking characteristics (cigarettes/day and smoking duration). Secondary analyses were repeated in subgroups according to histology and smoking status (never/ever). Additional analyses by quartile of metabolite concentration, with quartiles delimited in controls, were also conducted for lung cancer overall and for the abovementioned subgroups. Statistical analyses were performed using R [TwoSampleMR (37), Coloc (28); The R project (38)].
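The quartile analysis relies on cut points estimated in controls only and then applied to every participant. A minimal Python sketch of that step is given below, assuming log10-transformed concentrations as in the main analysis. The function names, the handling of values that equal a cut point, and the concentrations are hypothetical and are not taken from the study; the quantile interpolation rule is Python's default and may differ from the one used in R.

```python
import bisect
import math
import statistics


def control_quartile_cutpoints(control_values):
    """Three cut points (quartile boundaries) estimated from controls only."""
    return statistics.quantiles(control_values, n=4)


def assign_quartile(value, cutpoints):
    """Quartile (1 to 4) of a value; values equal to a cut point go to the upper group."""
    return bisect.bisect_right(cutpoints, value) + 1


# Hypothetical IVC concentrations in nmol/L (not study data).
controls_nmol = [40.0, 52.0, 58.0, 63.0, 70.0, 81.0, 95.0]
cases_nmol = [38.0, 55.0, 66.0, 90.0]

# Work on the log10 scale, then classify cases with the control-derived cut points.
controls_log10 = [math.log10(x) for x in controls_nmol]
cutpoints = control_quartile_cutpoints(controls_log10)
case_quartiles = [assign_quartile(math.log10(x), cutpoints) for x in cases_nmol]
print(case_quartiles)
```

Deriving the cut points from controls keeps the exposure categories representative of the source population rather than of the case series.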
Results
MR analyses
After FDR correction (5%), three metabolites were associated with lung cancer risk: arachidonate (20:4n6), 1-arachidonoylglycerophosphocholine, and isovalerylcarnitine (IVC; Supplementary Tables S1 and S2). However, only IVC remained associated with lung cancer risk after pleiotropy-robust analyses [weighted-median (23), weighted-mode (24), MR-Egger (25), and MR-RAPS (26)]. As determined a priori, IVC was therefore the only metabolite further investigated. A genetically determined increment in blood IVC concentration (log10) was associated with a reduced risk of lung cancer (ORML: 0.43; 95% CI, 0.29–0.63; NSNP = 6; Table 2). Similar results were observed for IVC and lung cancer risk when stratified by histologic subtypes (small cell carcinoma: ORML: 0.19; 95% CI, 0.07–0.50; squamous cell carcinoma: ORML: 0.39; 95% CI, 0.21–0.72; and adenocarcinoma: ORML: 0.52; 95% CI, 0.31–0.88) and by smoking status (ever: ORML: 0.43; 95% CI, 0.22–0.64; and never: ORML: 0.76; 95% CI, 0.49–1.02; Fig. 2; Supplementary Table S3).
Sensitivity analyses: MR assumption evaluation
A cis-SNP (rs9635324) was identified for IVC using the ProGeM package (11). This SNP is located 5.5 kb downstream from the isovaleryl-CoA dehydrogenase gene (IVD), which was confirmed using KEGG's (30) enzyme codes. The substrate of the IVD enzyme is the metabolite isovaleryl-CoA, whose circulating carnitine form is isovalerylcarnitine, IVC (Fig. 3). Mutations at the IVD locus render this enzyme inactive and lead to isovaleric acidemia, an autosomal recessive inborn error of leucine catabolism characterized by an accumulation of IVC in whole blood (39). Moreover, the cis-SNP allele associated with lower IVC was consistently associated with higher IVD gene expression in whole blood (eQTLgen data; refs. 32, 40) and in lung tissue from GTEx (33). Thus, impaired IVD enzyme function leads to higher blood IVC, whereas higher functional enzyme expression is associated with low IVC levels. Overall, these findings provide a clear biological rationale for the cis-SNP–metabolite association via the IVD enzyme, supporting MR assumptions (Fig. 1; ref. 9). MR analyses using only this cis-SNP supported IVC's primary MR results overall (log10 OR: 0.27; 95% CI, 0.14–0.54) and by histologic subtype (Fig. 3; Supplementary Table S3).
The colocalization analysis (Fig. 1) found an 80% posterior probability that a single signal (cis-SNP rs9635324) at the genomic locus around IVD affects both circulating IVC levels and lung cancer risk. Additional sensitivity analyses revealed no evidence of horizontal pleiotropy or heterogeneity (Supplementary Tables S4 and S5). Notably, the only association identified for the cis-SNP rs9635324 was with IVC, supporting the validity of this instrument (Supplementary Table S6). Finally, bidirectional MR analysis showed no association of lung cancer with IVC levels (Supplementary Table S7).
Prospective nested case–control study
Data from 656 cases and 1,296 matched controls were included in the analysis. The mean age at blood collection was 56 years for both controls and cases, and for cases the mean time between prediagnostic blood collection and diagnosis was 7 years (range: 2–10 years; Table 1). Among controls, IVC concentrations were higher in men than in women, in participants with higher BMI, and in participants who smoked (driven by the higher proportion of smokers among men, who have higher IVC on average than women), who smoked a greater number of cigarettes per day, and who smoked for a greater number of years (Table 2).
Characteristic | Controls (n = 1,296): N | Controls: mean isovalerylcarnitine, nmol/L (95% CI) | Cases (n = 649): N | Cases: mean isovalerylcarnitine, nmol/L (95% CI) |
---|---|---|---|---|
Sex | | | | |
Male | 723 | 73.8 (71.7–75.8) | 364 | 69.5 (66.6–72.5) |
Female | 573 | 55.7 (53.5–58.1) | 285 | 54.9 (51.7–58.3) |
Age at blood collection (years) | | | | |
<50 | 251 | 62.1 (52.8–71.4) | 128 | 66.9 (53.9–79.8) |
≥50 and <55 | 237 | 67.6 (62.7–72.5) | 113 | 67.6 (60.5–74.7) |
≥55 and <60 | 340 | 66.6 (63.3–69.9) | 170 | 64.7 (60.0–69.3) |
≥60 and <65 | 328 | 66.2 (61.3–71.2) | 164 | 60.5 (53.6–67.4) |
≥65 and <70 | 73 | 71.5 (61.4–81.6) | 39 | 58.7 (44.9–72.5) |
≥70 | 67 | 61.4 (52.8–71.4) | 35 | 45.5 (26.3–64.7) |
BMI | | | | |
≤25 | 521 | 58.9 (56.5–61.5) | 269 | 56.1 (52.6–59.6) |
>25 and ≤30 | 562 | 69.7 (68.5–76.3) | 277 | 67.9 (64.6–71.4) |
>30 | 212 | 72.4 (68.5–76.3) | 102 | 68.9 (63.3–74.6) |
Smoking status | | | | |
Never | 357 | 63.1 (60.0–66.1) | 74 | 57.7 (51.0–64.5) |
Previous | 401 | 68.9 (65.9–71.8) | 163 | 65.8 (61.2–70.4) |
Current | 538 | 65.4 (62.9–67.9) | 412 | 63.1 (60.2–65.9) |
Cigarettes per day | | | | |
0 | 357 | 62.9 (59.9–65.9) | 74 | 57.8 (51.0–64.5) |
≥1 and <5 | 110 | 61.1 (55.6–66.6) | 24 | 68.3 (56.4–80.2) |
≥5 and <10 | 224 | 64.5 (60.6–68.3) | 97 | 61.3 (55.4–67.3) |
≥10 and <15 | 167 | 65.8 (61.3–70.3) | 96 | 61.7 (55.7–67.6) |
≥15 | 333 | 70.7 (67.5–73.8) | 335 | 63.7 (60.5–66.9) |
Smoking duration (years) | | | | |
0 | 357 | 62.9 (59.9–65.9) | 74 | 57.8 (51.0–64.5) |
≥1 and <20 | 204 | 63.4 (59.4–67.4) | 47 | 64.3 (55.8–72.8) |
≥20 and <30 | 205 | 67.3 (63.2–71.4) | 112 | 55.5 (49.5–61.5) |
≥40 | 489 | 67.7 (65.1–70.3) | 395 | 66.0 (63.0–69.1) |
The primary conditional logistic regression analysis showed that a 10-fold increment in blood IVC was associated with a 48% lower risk of lung cancer (log10-OR: 0.52; 95% CI, 0.32–0.86). After adjusting for detailed smoking exposure (smoking duration and cigarettes/day) and BMI, the association between blood IVC and lung cancer risk was accentuated and resembled that of the MR analysis (log10 OR, 0.39; 95% CI, 0.21–0.72) with no difference in precision (SEminimally adjusted = 0.27 vs. SEfully adjusted = 0.24).
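Because IVC was modeled on the log10 scale, each reported odds ratio corresponds to a 10-fold increase in concentration. The short Python sketch below shows the arithmetic behind the "48% lower" statement and how such an odds ratio could be re-expressed for a smaller contrast such as a doubling. The function names are hypothetical, and the rescaling assumes the log-linear relationship implied by the regression model; it is offered as an interpretive aid and was not part of the reported analysis.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio (negative means lower odds)."""
    return (odds_ratio - 1.0) * 100.0


def rescale_odds_ratio(or_per_tenfold, fold_change):
    """Odds ratio for a `fold_change`-fold increase, given the OR per 10-fold increase.

    Assumes the log odds are linear in log10(concentration).
    """
    return or_per_tenfold ** math.log10(fold_change)


or_minimally_adjusted = 0.52  # reported OR per log10 unit of IVC
print(f"{percent_change_in_odds(or_minimally_adjusted):.0f}% change in odds per 10-fold increase")
print(f"OR per doubling of IVC: {rescale_odds_ratio(or_minimally_adjusted, 2):.2f}")
```

A 10-fold difference in IVC is much wider than the between-group differences in mean concentration reported here, so a per-doubling odds ratio may be easier to relate to the observed range.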
Stratified analysis by histologic subtypes and smoking yielded similar OR estimates to that of the primary analysis, although confidence intervals included one, indicating that these subgroup analyses may have benefitted from a larger sample size. Risk analyses by quartiles of IVC with lung cancer can be found in Supplementary Table S8.
Discussion
In this study, we integrated genetic (MR) and traditional epidemiology study designs as an efficient and novel approach to identify lung cancer biomarkers with plausible etiologic involvement. In the initial MR analyses, we tested 207 metabolites and identified IVC as associated with lung cancer risk. Subsequent direct blood measurement of IVC in prediagnostic blood samples from large prospective case–control studies independently supported an association of IVC with lung cancer risk.
Etiologic research on lung cancer is hampered by the wide-ranging impact of smoking, not only on lung cancer risk, but also on many putative risk factors. MR largely overcomes this confounding by relying upon random assignment of alleles at conception, yet it can yield biased estimates when its assumptions are violated (7). The most problematic assumption is the lack of horizontal pleiotropy. The study design we have followed helps to mitigate this bias since the enzymatic and genetic determinants of IVC have been previously described, allowing us to use only SNPs near genes encoding enzymes known to influence IVC levels directly. It is possible, but unlikely, that such cis-SNPs act on lung cancer via pathways independent of IVC. Furthermore, in Phenoscanner (29), a database with over 65 billion published SNP associations, we identified no associations between the SNPs used as proxies of IVC and smoking characteristics. We thus conclude that the observed relationship between IVC and lung cancer is independent of smoking.
We next analyzed the concentrations of IVC using prediagnostic blood samples from a case–control study nested within two large population cohorts. This analysis allowed us to carefully evaluate the epidemiologic properties of IVC and its relation to lung cancer risk using direct measurements. This analysis confirmed the inverse association between IVC and lung cancer risk, and careful adjustment for smoking characteristics further accentuated the association. Taken together, these data are consistent with a role for IVC metabolism in lung cancer etiology in humans. Nonetheless, future work should aim to replicate these findings in larger cohorts and investigate the IVC–lung cancer association among never smokers in a sample with greater power for stratified analyses.
IVC is a carnitine substrate of the enzyme isovaleryl-CoA dehydrogenase, which is involved in the degradation of leucine and fatty acids. Leucine is, in turn, an essential amino acid that is involved in metabolic regulation via the mTORC1 complex, which may influence cancer development through intracellular signals regulating cellular growth and proliferation (41). Leucine also regulates the cellular availability of glutamine, a major player in cancer proliferation and drug resistance via metabolic rewiring (42). More proximally, IVC is a selective activator of calpain, an inducer of apoptosis (43, 44); thus, lower cellular IVC levels may interfere with programmed cell death. Although there is limited epidemiologic evidence in the literature on the importance of IVC in cancer, circulating IVC has previously been inversely associated with endometrial cancer (45). Despite this evidence, the specific biological pathway from IVC to lung cancer pathophysiology remains to be elucidated.
Since cancer's first portrayal as a metabolic disease over a hundred years ago (10), a deeper understanding of the metabolic heterogeneity and adaptability of cancerous tissue (46) has yielded novel metabolism-targeted therapies (42, 47). Similarly, genetic and biological variability affecting the metabolic profiles of several tissues is well documented to impact cancer risk and proliferation (13, 14). The treatment of isovaleric acidemia shows that IVC can be modified by a restricted protein diet and by glycine and L-carnitine supplementation (39), yet this remains to be investigated in lung cancer.
Much of the published biomarker research has used MR to test existing hypotheses reported in the observational and clinical literature due to its robustness to classic epidemiology biases (48, 49). In contrast, here we demonstrate that a conservative set of MR-based decision criteria, leveraging features unique to metabolites involved in cancer biology, can be used as a primary step to generate strong statistical evidence in favor of a metabolite, or a set of metabolites, from a large panel of candidates. Given the prohibitively high cost of measuring a full panel of metabolites/proteins in an adequately powered sample, our study demonstrates an efficient approach to identifying plausible etiologic biomarkers that can readily be applied to other cancer outcomes.
Limitations
This approach, however, is not without limitations. Not all known metabolites are available on commercial panels; therefore, our study did not include all known blood metabolites. Furthermore, while cis instruments for a biomarker may generate strong MR evidence (8), their identification is nontrivial. We thus advise caution when using MR to scan for biomarkers where cis instruments are not available to corroborate MR signals from trans genetic variation. Finally, but perhaps most importantly, we provide evidence that IVC is important in the etiology of lung cancer, but this does not preclude other metabolites in the IVC pathway having biological effects on lung cancer and further studies are required to fully investigate each constituent of the pathway.
We found elevated levels of IVC inversely associated with lung cancer risk in both MR and nested case–control studies, thus providing evidence in favor of a protective role for IVC in lung cancer etiology. Further research is required to clarify the mechanisms by which IVC may influence lung cancer development. More generally, we present a methodologic approach for biomarker discovery that allows for efficient identification of biomarkers using MR, to be followed up by direct measurements in well-designed epidemiologic studies.
Authors' Disclosures
A. Cerani reports grants and other support from the Canadian Institutes of Health Research and grants and other support from Fonds de Recherche Québec, Santé, during the conduct of the study. S. Zhou reports grants from CIHR outside the submitted work. C.H. Borchers reports being Chief Scientific Officer at MRM Proteomics Inc, a spin-off company of the University of Victoria; Chief Technology Officer at Molecular You; and Chief Scientific Officer at MRM Proteomics Russia, a spin-off company of the Skolkovo Institute of Science and Technology, Moscow, Russia.
Authors' Contributions
K. Smith-Byrne: Conceptualization, formal analysis, investigation, visualization, methodology, writing–original draft, writing–review and editing. A. Cerani: Formal analysis, writing–original draft, writing–review and editing. F. Guida: Writing–review and editing. S. Zhou: Formal analysis, writing–review and editing. A. Agudo: Writing–review and editing. K. Aleksandrova: Writing–review and editing. A. Barricarte: Writing–review and editing. M. Rodríguez-Barranco: Writing–review and editing. C.H. Borchers: Writing–review and editing. I.T. Gram: Writing–review and editing. J. Han: Writing–review and editing. C.I. Amos: Writing–review and editing. R.J. Hung: Writing–review and editing. K. Grankvist: Writing–review and editing. T.H. Nøst: Writing–review and editing. L. Imaz: Writing–review and editing. M.D. Chirlaque-López: Writing–review and editing. M. Johansson: Writing–review and editing. R. Kaaks: Writing–review and editing. T. Kühn: Writing–review and editing. R.M. Martin: Writing–review and editing. J.D. McKay: Writing–review and editing. V. Pala: Writing–review and editing. H.A. Robbins: Writing–review and editing. T.M. Sandanger: Writing–review and editing. D. Schibli: Writing–review and editing. M.B. Schulze: Writing–review and editing. R.C. Travis: Writing–review and editing. P. Vineis: Writing–review and editing. E. Weiderpass: Writing–review and editing. P. Brennan: Writing–review and editing. M. Johansson: Conceptualization, data curation, supervision, funding acquisition, methodology, writing–original draft, writing–review and editing. J.B. Richards: Conceptualization, resources, data curation, supervision, writing–original draft, writing–review and editing.
Acknowledgments
Mattias Johansson and Karl Smith-Byrne were supported by grants from the US National Cancer Institute under award number U19CA203654 and Cancer Research UK (C18281/A29019). This work was also supported by the Canadian Institutes of Health Research, the Canadian Foundation for Innovation, the Fonds de Recherche Santé Québec (FRSQ), and the FRQS Clinical Research Scholarship. Metabolite GWAS studies were conducted within TwinsUK, which is funded by the Wellcome Trust, Medical Research Council, European Union, the National Institute for Health Research (NIHR)-funded BioResource, Clinical Research Facility, and Biomedical Research Centre based at Guy's and St Thomas's NHS Foundation Trust in partnership with King's College London. The INTEGRAL-ILCCO OncoArray data collection was supported by National Institute of Health under award number U19CA203654.
We thank our collaborators from the International Lung Cancer Consortium (ILCCO) Adonina Tardon, Angela Risch, Angeline Andrew, Chu Chen, David Christiani, Demetrios Albanes, Erich Wichmann, Gadi Rennert, Geoffrey Liu, Hans Brunnström, Heike Bickeböller, Hongbing Shen, Jian-Min Yuan, John K. Field, John R. McLaughlin, Kjell Grankvist, Lambertus A. Kiemeney, Loic Le Marchand, M. Dawn Teare, Maria Teresa Landi, Matthew B. Schabath, Melinda C. Aldrich, Mikael Johansson, Neil Caporaso, Olle Melander, Philip Lazarus, Richard Houlston, Sanjay S. Shete, Shan Zienolddiny, Stephen Lam, Stig E. Bojesen, Susanne Arnold, Thorunn Rafnar, Victoria Stevens, Ying Wang, and Yun-Chul Hong.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).