Abstract
Observational studies have suggested blood cell counts may act as predictors of cancer. It is not known whether these hematologic traits are causally associated with lung cancer.
Two-sample bidirectional univariable Mendelian randomization (MR) and multivariable MR (MVMR) were performed to investigate the causal association between hematologic traits and the overall risk of lung cancer and three histologic subtypes [lung adenocarcinoma, squamous cell lung cancer, and small cell lung cancer (SCLC)]. The instrumental variables for 23 hematologic traits were strictly selected from large-scale genome-wide association studies. The inverse-variance weighted method and five additional methods were used to obtain robust causal estimates.
We found evidence that genetically influenced higher hematocrit [OR, 0.845; 95% confidence interval (CI), 0.783–0.913; P = 1.68 × 10⁻⁵], hemoglobin concentration (OR, 0.868; 95% CI, 0.804–0.938; P = 3.20 × 10⁻⁴), and reticulocyte count (OR, 0.923; 95% CI, 0.872–0.976; P = 5.19 × 10⁻³) decreased lung carcinoma risk, especially in ever-smokers. MVMR further identified hematocrit as a predictor independent of smoking. Subgroup analysis showed that a higher plateletcrit level increased the risk of small cell lung carcinoma (OR, 1.288; 95% CI, 1.126–1.474; P = 2.25 × 10⁻⁴).
Genetically driven higher levels of reticulocyte count and hematocrit decreased lung cancer risk. Higher plateletcrit had an adverse effect on SCLC. Hematologic traits may act as low-cost factors for lung cancer risk stratification.
Further studies are required to elucidate the potential mechanisms underlying the dysregulation of homeostasis related to hematologic traits, such as subclinical inflammation.
Introduction
Lung cancer is the second most common cancer worldwide, with the highest mortality, accounting for 18.0% of all cancer-related deaths (1). Identifying simple predictors of lung cancer, such as smoking history, could improve the accessibility and cost-effectiveness of screening programs and contribute to early diagnosis of lung cancer. Clinical practices routinely assess blood cell count to assist in diagnosing various diseases. It is known that different blood cell components may play a role in diagnosing and treating cancer, with subtle changes occurring in the early stages of cancer. Pretreatment anemia was an independent risk and prognostic factor for survival in lung cancer patients (2). White blood cells are associated with inflammation, inducing mediators and cellular effectors as important constituents of the local environment of tumors (3). Platelets were regarded as active players in all steps of carcinogenesis including cancer cell proliferation, extravasation, and metastasis (4).
Observational studies struggle to systematically evaluate the relationship between hematologic traits and lung cancer because they are prone to biases such as confounding and reverse causation. Mendelian randomization (MR) is a genetic epidemiologic method that uses SNPs as instrumental variables (IV) for risk factors to estimate their effect on diseases with less bias (5). As inherited genetic variants are usually impervious to environmental variables, the confounding bias common in observational studies is reduced (6). MR analyses rest on three fundamental assumptions: the IVs are associated with the exposures; the IVs affect the outcome only through the exposures; and the IVs are independent of any potential confounders (6). In this study, using summary data from large-scale genome-wide association studies (GWAS), we performed a series of two-sample MR analyses to establish whether hematologic traits have a causal effect on the risk of lung cancer and its different subtypes.
Materials and Methods
Data sources
The genetic information on hematologic traits was obtained from comprehensive GWAS data published by William et al, which contains about 170,000 European-ancestry participants (8). As listed in Supplementary Table S2, 7 mature red cell-related traits, 4 immature red cell-related traits, 8 white blood cell-related traits, and 4 platelet-related traits were selected.
Summary statistics of lung cancer were obtained from a recent aggregated GWAS analysis on 29,266 patients and 56,450 controls based on the collaboration of Transdisciplinary Research of Cancer in Lung of the International Lung Cancer Consortium (TRICL-ILCCO) and the Lung Cancer Cohort Consortium (LC3; 9). MR analysis was performed in total lung carcinoma and histologic types of lung adenocarcinoma (11,273 cases and 55,483 controls), squamous cell lung cancer (7,426 cases and 55,627 controls), and small cell lung cancer (SCLC; 2,664 cases and 21,444 controls). Meanwhile, MR analysis was also performed for identified hematologic traits in ever-smokers (23,223 cases and 16,964 controls) and never-smokers (2,355 cases and 7,504 controls) to evaluate the role of smoking.
The genetic information on smoking was obtained from a cross-ancestry GWAS of 3.4 million individuals (approximately 79% European; ref. 10). Smoking initiation was used to distinguish ever-smokers from never-smokers. Cigarettes per day was also used to assess the effect of hematologic traits after adjustment for smoking. The IVs for total bilirubin level were extracted from a GWAS including 363,228 individuals (approximately 94% European) from UK Biobank (11).
All analyses were based on publicly available summary statistics. No individual patient was involved in the design of the study. As a result, ethical approval and informed consent were not required for this study.
IVs
Valid IVs were extracted with a series of selection standards. Genetic variants associated with the exposures at the genome-wide level of statistical significance (P < 5 × 10⁻⁸) were selected as instruments. F statistics were used to select strong SNPs and avoid the negative impact of weak IVs; SNPs with an F statistic > 10 were retained for MR analysis. All SNPs were clumped with r² < 0.001 and a clump window of 10,000 kb based on the European samples of the 1000 Genomes linkage disequilibrium (LD) reference panel. Furthermore, to limit the influence of confounding factors, each IV was looked up in PhenoScanner (http://www.phenoscanner.medschl.cam.ac.uk/) to identify traits that could confound the potential association between hematologic traits and lung cancer risk (12, 13). SNPs associated with lung cancer, body mass index, smoking, or diabetes were excluded. The variance explained by each instrument SNP was calculated using the following formula: 2β² × MAF × (1 − MAF) (ref. 14), where MAF represents the minor allele frequency of the SNP and β represents the effect of the effect allele on the exposure.
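The instrument-strength filter and the variance-explained formula above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the per-SNP F statistic is approximated here as (β/SE)², which is an assumption because the text does not state which F formula was used, and the SNP values are hypothetical. The formula also assumes that β is expressed in standard deviation units of the exposure.

```python
def variance_explained(beta: float, maf: float) -> float:
    """Per-SNP variance explained: 2 * beta^2 * MAF * (1 - MAF)."""
    return 2.0 * beta ** 2 * maf * (1.0 - maf)


def f_statistic(beta: float, se: float) -> float:
    """Approximate per-SNP F statistic as (beta / se)^2 (assumed formula)."""
    return (beta / se) ** 2


# Hypothetical SNPs as (beta, se, maf); illustrative values, not study data.
snps = [(0.05, 0.008, 0.30), (0.03, 0.005, 0.45), (0.02, 0.009, 0.10)]

# Keep only strong instruments (F > 10), then sum their variance explained.
strong = [s for s in snps if f_statistic(s[0], s[1]) > 10]
total_r2 = sum(variance_explained(beta, maf) for beta, _, maf in strong)
```

In this toy input the third SNP has F below 10 and is dropped, so only the first two SNPs contribute to the total variance explained.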
Statistical analysis
Univariable MR
Two-sample MR analyses were performed using the inverse-variance weighted (IVW) method based on the random-effects model. This method assumes that the IVs have no horizontal pleiotropic effects on the outcome. SNPs associated with the outcome with a P value < 5 × 10⁻⁸ were pruned to avoid an ambiguous direction between hematologic traits and lung cancer (15). The MR Pleiotropy Residual Sum and Outlier (MR-PRESSO) method was used to remove outlier SNPs to correct for horizontal pleiotropy, as implemented in the R package MR-PRESSO (16). After outliers were removed, the IVW method was performed again. Further, five additional methods, namely weighted median, penalized weighted median, MR-Egger regression, maximum likelihood, and Robust Adjusted Profile Score (RAPS), were applied as sensitivity analyses (17). The standard error of the effect estimated by the MR-Egger method is typically large, so its statistical power is lower (18). The weighted median analysis provides consistent estimates even when up to 50% of the information comes from invalid IVs, and the penalized weighted median method downweights the contribution of invalid IVs to the analysis (19). The maximum likelihood method estimates the parameters of the probability distribution by maximizing the likelihood function and has a low standard error (20). The RAPS method uses profile likelihood to model weak instrument bias and is robust to idiosyncratic pleiotropy. Horizontal pleiotropy was assessed by testing the deviation of the MR-Egger intercept from zero (21). Heterogeneity was measured by I² and the Cochran Q-derived P value and was further tested with a random-effects model (22, 23). I² > 50% was used as an indicator of possible horizontal pleiotropy. In addition, to assess potentially heterogeneous SNPs, leave-one-out analysis was performed by omitting each instrumental SNP in turn.
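The core estimators in this section can be illustrated with a short, self-contained Python sketch. It is not the TwoSampleMR implementation used in the study. It assumes first-order weights (1/SE² of the SNP-outcome effect) and a multiplicative random-effects model, and the SNP effects are noise-free toy values built so that the true causal log-OR is −0.17. The function names are hypothetical.

```python
import math


def ivw_random_effects(bx, by, se_y):
    """IVW slope through the origin, with Cochran's Q and I^2."""
    w = [1.0 / s ** 2 for s in se_y]
    sxx = sum(wi * x * x for wi, x in zip(w, bx))
    sxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    beta = sxy / sxx
    # Cochran's Q: weighted squared residuals around the IVW line.
    q = sum(wi * (y - beta * x) ** 2 for wi, x, y in zip(w, bx, by))
    df = len(bx) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    # Multiplicative random effects: inflate the SE only when Q/df > 1.
    phi = max(1.0, q / df)
    return {"beta": beta, "se": math.sqrt(phi / sxx), "Q": q, "I2": i2}


def mr_egger(bx, by, se_y):
    """Weighted regression with a free intercept; SNPs oriented so bx >= 0."""
    sign = [1.0 if x >= 0 else -1.0 for x in bx]
    x = [s * v for s, v in zip(sign, bx)]
    y = [s * v for s, v in zip(sign, by)]
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return {"slope": slope, "intercept": intercept}


# Toy data (not from the study): outcome effects are exactly -0.17 * bx.
bx = [0.10, -0.20, 0.15, 0.30]
by = [-0.17 * x for x in bx]
se_y = [0.010, 0.012, 0.011, 0.015]

ivw = ivw_random_effects(bx, by, se_y)
egger = mr_egger(bx, by, se_y)
odds_ratio = math.exp(ivw["beta"])  # log-OR converted to an OR
```

Because the toy data contain no pleiotropy or noise, the IVW and MR-Egger slopes coincide, Q and I² are zero, and the Egger intercept is zero. With real summary statistics, a nonzero intercept would point to directional pleiotropy and a large I² to heterogeneity among instruments.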
Statistical power was calculated using the mRnd power calculator (available at http://cnsgenomics.com/shiny/mRnd/), taking into account a type I error of 5% and statistical power of 0.80 (24).
Multivariable MR
For lung carcinoma, multivariable MR (MVMR) was performed (25). For the MVMR analyses, we extracted SNPs that were genome-wide significant for at least one of the hematologic traits and excluded any SNPs with a pairwise LD r² > 0.001. IVW, MR-Egger, and median-based methods were used in multivariable analyses. The MVMR-Egger method is useful for analyzing high-dimensional data in situations where the risk factors are highly related (26). Heterogeneity was tested by calculating a modified form of Cochran's Q statistic. The instrument strength for each exposure was evaluated by calculating the conditional F statistic, with a conventional threshold of 10 (16). The covariance between the effects of the genetic variants was estimated using the metaCCA package (27). MR-PRESSO was applied to evaluate horizontal pleiotropy and remove outliers (16).
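As a rough illustration of the multivariable IVW idea, the following Python sketch regresses SNP-outcome effects on the SNP effects for two exposures jointly, with the intercept fixed at zero and weights of 1/SE². It is a simplified two-exposure version written for this explanation, not the MendelianRandomization or MVMR package code. The function name, the exposure labels, and all numbers are hypothetical.

```python
def mv_ivw_two_exposures(x1, x2, by, se_y):
    """Weighted least squares of by on (x1, x2) without an intercept.

    Solves the 2x2 weighted normal equations with Cramer's rule and
    returns the direct effect of each exposure on the outcome.
    """
    w = [1.0 / s ** 2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, x1))
    s22 = sum(wi * b * b for wi, b in zip(w, x2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, x1, x2))
    s1y = sum(wi * a * y for wi, a, y in zip(w, x1, by))
    s2y = sum(wi * b * y for wi, b, y in zip(w, x2, by))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("exposure effects are collinear")
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Toy SNP effects (not study data): exposure 1 could stand for hematocrit,
# exposure 2 for a smoking trait. True direct effects are -0.18 and 0.50.
x1 = [0.10, 0.05, 0.00, 0.08, 0.02]
x2 = [0.00, 0.03, 0.06, 0.01, 0.04]
by = [-0.18 * a + 0.50 * b for a, b in zip(x1, x2)]
se_y = [0.01] * 5

b_exposure1, b_exposure2 = mv_ivw_two_exposures(x1, x2, by, se_y)
```

Each coefficient is the effect of one exposure with the other held fixed, which is the sense in which an MVMR estimate is described as independent of smoking.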
Reverse MR
To assess whether there is a reverse causal relationship between lung cancer and hematologic traits, bidirectional MR was also performed on the identified associations, with the four identified hematologic traits as outcomes and lung cancer as the risk factor. IVs for the 4 lung cancer outcomes were selected using the same method described earlier. The IVW method was applied to bidirectional MR, and associations with FDR-adjusted P < 0.05 were considered significant.
The statistical significance level was set to 0.05 after FDR correction throughout our study. Two-sample MR analyses were performed using R, version 4.1, mainly with the TwoSampleMR package (28). The MR-PRESSO package was used to test for horizontal pleiotropy and remove outliers. MVMR was also applied to lung carcinoma with the MendelianRandomization, MVMR, and metaCCA packages (25, 27).
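The FDR correction can be sketched as follows. The text does not name the procedure, so this example assumes the Benjamini-Hochberg step-up method, which is the most common choice. The P values are hypothetical and are not the study's results.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four tests (illustrative only).
raw_p = [0.01, 0.04, 0.03, 0.005]
fdr_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in fdr_p]
```

The adjusted value depends on the total number of tests, so a nominally significant P value can lose significance once all tested traits are taken into account.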
Data availability
The datasets analyzed in this study are publicly available summary statistics from the respective GWAS consortia. Further inquiries can be directed to the corresponding author.
Results
Instrumental variants
A total of 3,713 SNPs were extracted from GWAS of 23 hematologic traits. Each SNP was strong enough with F > 10 (min 29.718, mean 96.63), limiting the bias from weak IVs (Supplementary Table S3). The number of IVs for each hematologic trait ranged from 78 to 238, and the variance explained ranged from 3.65% to 23.04% for each exposure (Supplementary Table S4).
Association between hematologic traits and lung carcinoma
Univariable MR
The causal association between the 23 hematologic traits and lung carcinoma was assessed. After the removal of outliers using MR-PRESSO, 3 hematologic traits were identified to be significantly associated with lung carcinoma via the IVW method (FDR < 0.05; Supplementary Table S5). The IVW method estimated that higher hematocrit [OR, 0.845; 95% confidence interval (CI), 0.783–0.913; P = 1.68 × 10⁻⁵], hemoglobin concentration (OR, 0.868; 95% CI, 0.804–0.938; P = 3.20 × 10⁻⁴), and reticulocyte count (OR, 0.923; 95% CI, 0.872–0.976; P = 5.19 × 10⁻³) decreased the risk of lung carcinoma. Statistical power was calculated as 0.999, 0.993, and 0.935, respectively (Supplementary Table S6). The five additional methods described in Methods were also used to estimate the causal associations, and the FDR value was mostly lower than 0.05 (Supplementary Table S5; Fig. 2). The results were also consistent with those of the MR-PRESSO method. Cochran Q statistics and I² based on IVW and MR-Egger showed no evidence of heterogeneity (Supplementary Table S7). In addition, there was no significant horizontal pleiotropy according to the intercept of the MR-Egger regression (Supplementary Table S7). However, there was no causal relationship between red blood cell count and lung cancer (P = 0.862). The IVW method also estimated that the reticulocyte fraction of red cells was associated with lung cancer (P = 0.0268), which was consistent with the results of other methods, but the P value was no longer significant after FDR correction.
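A reader can check that the reported ORs, confidence intervals, and P values are mutually consistent. The Python sketch below recovers the standard error of the log-OR from the 95% CI and derives a two-sided P value, assuming a normal (Wald) approximation on the log scale. That approximation is an assumption about how the P values were obtained, and the function name is hypothetical.

```python
import math


def p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Two-sided Wald P value from an OR and its 95% CI (log scale)."""
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Estimates reported above for the three identified traits.
reported = {
    "hematocrit": (0.845, 0.783, 0.913),
    "hemoglobin concentration": (0.868, 0.804, 0.938),
    "reticulocyte count": (0.923, 0.872, 0.976),
}
recomputed = {trait: p_from_or_ci(*vals)[2] for trait, vals in reported.items()}
```

The recomputed P values fall in the same range as the reported ones; small differences are expected because the ORs and CI bounds are rounded to three decimals.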
Visual inspection of the scatter plots (Fig. 3) and leave-one-out plots (Supplementary Figs. S1–S3) revealed no potential outliers among the IVs of the 3 identified hematologic traits. Therefore, there was insufficient evidence for horizontal pleiotropy in the association between hematologic traits and lung carcinoma.
Smoking and bilirubin
Considering that tobacco consumption has been proven to increase the risk of lung carcinoma, we then explored whether the effect of hematologic traits differs in smokers. First, the effect of the 3 identified hematologic traits was assessed in ever-smokers and never-smokers. Hematocrit and hemoglobin concentration affected lung carcinoma risk in ever-smokers but not in never-smokers, while reticulocyte count demonstrated no causal association in either group (Supplementary Fig. S4A, Supplementary Table S8). No heterogeneity or horizontal pleiotropy was observed (Supplementary Table S9). We then assessed the effects of hematologic traits independent of smoking using MVMR with smoking as an adjustment. Two smoking-related traits were selected: smoking initiation, which distinguishes ever-smokers from never-smokers, and cigarettes per day, which represents the intensity of smoking. The number of SNPs ranged from 101 to 278 (Supplementary Table S10). Hematocrit still exhibited a negative association with lung cancer risk independent of smoking initiation (OR, 0.836; 95% CI, 0.749–0.932; P = 1.30 × 10⁻³) or daily cigarette consumption (OR, 0.848; 95% CI, 0.792–0.908; P = 2.14 × 10⁻⁶). The association of reticulocyte count with lung cancer was no longer significant after accounting for smoking. Furthermore, hemoglobin concentration remained associated with lung cancer after adjusting for smoking intensity, yet this association was no longer significant after adjusting for smoking initiation (Supplementary Fig. S4B, Supplementary Table S11).
Serum bilirubin has been linked to lung carcinoma, and it is also increased in hemolysis. This correlation reflects changes in hematologic characteristics and suggests a potential role of bilirubin in the causal effects of these traits. Thus, we performed MVMR with bilirubin as an adjustment using 105 to 122 SNPs (Supplementary Table S10). The causal effect of reticulocyte count was attenuated to null (P = 0.061) after adjustment for bilirubin, while the effects of hematocrit (P = 7.85 × 10⁻³) and hemoglobin concentration (P = 6.03 × 10⁻³) were robust to adjustment for bilirubin (Supplementary Fig. S4C, Supplementary Table S11).
MVMR
MVMR was further performed for the 3 traits using 202 SNPs to identify key risk factors (Supplementary Table S12). Reticulocyte count (OR, 1.649; 95% CI, 1.063–2.559; P = 0.03) was still associated with lung carcinoma using the multivariable IVW method. Although the IVW method yielded a nonsignificant P value (0.059) for hematocrit, the MR-Egger method, which is especially applicable to closely related risk factors, supported its role as an independent protective factor (Fig. 4). The P value of Cochran Q was 0.426, which indicated no obvious heterogeneity. With the phenotype covariance matrix estimated by the metaCCA method used for correction (Supplementary Table S13), the strength test indicated that the instruments of each trait were strong enough (min F = 18.4), supporting the validity of this analysis.
Subgroup analysis of lung carcinoma subtypes
We further performed subgroup analysis in three histologic subtypes of lung carcinoma. Figure 5 summarizes the significance and OR of the association estimates for each hematologic trait with each lung cancer subtype (Supplementary Table S14). MR analysis based on the IVW method found that plateletcrit was causally associated with a higher risk of small cell lung carcinoma with FDR = 0.015 (OR, 1.288; 95% CI, 1.126–1.474; P = 2.25 × 10⁻⁴). Though hematocrit was associated with squamous cell lung carcinoma (OR, 0.824; 95% CI, 0.724–0.938; P = 3.47 × 10⁻³) and reticulocyte count was associated with lung adenocarcinoma (OR, 0.904; 95% CI, 0.837–0.976; P = 9.48 × 10⁻³), the FDR values of both associations were greater than 0.05. We also used other methods to verify the result and the sensitivity analysis showed that plateletcrit was a risk factor for small cell lung carcinoma regardless of the analysis methods. The statistical power was calculated as 0.994 (Supplementary Table S15). Cochran's Q test showed that no significant heterogeneity existed and there was also no evidence of directional pleiotropy according to the MR-Egger intercept (Supplementary Table S16).
Reverse MR
We also conducted reverse MR to investigate potential reverse causation, considering that lung cancer may in turn affect hematologic traits. No reverse causal association was found between lung cancer or its subtypes and the identified traits (hematocrit, reticulocyte count, hemoglobin concentration); thus, the direction of the effects was highly robust (Supplementary Table S17). However, according to the IVW results of reverse MR analysis, lung carcinoma was causally associated with plateletcrit (β = 0.059; 95% CI, 0.026–0.092; P = 5.19 × 10⁻⁴; Supplementary Table S17). Meanwhile, Cochran's Q test in the IVW method showed that there was no significant heterogeneity in the IVs (Q pval = 0.086). The MR-Egger regression also did not find significant horizontal pleiotropy (P = 0.593). The scatter plot (Fig. 6A) and leave-one-out analysis (Fig. 6B) showed no potential outliers among the IVs.
Discussion
In this study, using the summary data of 23 blood cell traits from large-scale GWAS, we performed a two-sample MR analysis to investigate whether there is a causal association between blood cell traits and lung cancer. We found evidence that genetically influenced lower hematocrit increased lung carcinoma risk, especially in smokers, and the effect was independent of smoking and bilirubin. A higher plateletcrit level increased the risk of small cell lung carcinoma. In reverse MR, we found that lung carcinoma can increase plateletcrit.
The hematocrit provides valuable information about the oxygen-carrying capacity of the blood and the presence of anemia, which is a common condition associated with cancer. Hematocrit, as a complex hematologic indicator, reflects a diverse array of metabolic and lifestyle risk factors that may contribute to an increased susceptibility to lung cancer (29). A few observational studies regarded a low hematocrit as a predictor of a poorer prognosis of cancer (30, 31). Blood viscosity increases exponentially with increasing hematocrit (32). It is primarily dependent on hematocrit and is a determinant of peripheral vascular resistance. Endothelial cells are involved in the regulation of microvascular perfusion via their production and secretion of the potent vasodilator nitric oxide. An acute decrease in hematocrit leads to decreased shear stress and a reduced steady-state nitric oxide level, thereby shifting the microvascular tone towards vasoconstriction (33). Thus, it is not surprising that hematocrit is an independent risk factor for cardiovascular mortality (34). Nitric oxide has been shown to be both pro- and anti-carcinogenic per se (35). Meanwhile, low hematocrit is accompanied by low hemoglobin, which suppresses oxygen metabolism and thereby worsens microcirculatory disorders. Even within the normal range, hematocrit has been suggested to limit oxygen supply to tissues by affecting tissue oxygen tension (36). In proliferating tumor tissue, the formation of microvasculature lags behind the division of cancer cells, leading to chronically reduced oxygen concentration. Lower hematocrit and hemoglobin levels may exacerbate chronic hypoxia in the microenvironment. Chronic hypoxia promotes proliferation and determines the aggressiveness of lung carcinoma by driving glucose metabolic reprogramming and regulating stemness in cancer cells (37, 38).
On the other hand, dysfunctional microvasculature, associated with decreased nitric oxide, can lead to chronic pro-inflammatory conditions (39). Observational studies have found that individuals with lower hematocrit were more likely to have a higher burden of subclinical inflammation, characterized by an elevated high-sensitivity C-reactive protein (hs-CRP) level (40). Furthermore, during the inflammatory response and oxidative stress, red blood cells are diminished, leading to a reduction in the hematocrit level. Chronic hypoxia and inflammation favor the selection of more aggressive cancer stem-like cells displaying high metastatic potential (35).
Reticulocytes are nonnucleated direct precursors of red blood cells, representing a small fraction of peripheral red blood cells. Reticulocyte count is an index of bone marrow hemopoietic activity, and an elevated reticulocyte count implies a bone marrow response to either increased red blood cell destruction (hemolysis) or acute or chronic blood loss (41). In hemolysis, heme is degraded with the formation of bilirubin. Accelerated eryptosis augments plasma bilirubin, which in turn stimulates suicidal erythrocyte death (42). Bilirubin is an antioxidant, and a certain degree of increase helps resist oxidative stress. Previous observational and MR studies showed that raised bilirubin may provide protection against lung cancer, especially in individuals exposed to high levels of smoke oxidants (43–45). Our findings revealed that the causal effect of reticulocyte count was reduced to nonsignificance when bilirubin was taken into account, indicating that the effect of reticulocyte count may be attributed to bilirubin. Similarly, our findings supported the idea that reticulocyte count also acts as a protective factor. It is plausible that the protective effect of reticulocyte count is related to bilirubin levels.
Meanwhile, a higher reticulocyte count suggests a greater capacity for oxygen delivery to tissues. Improved tissue oxygenation can create an unfavorable environment for the growth and survival of cancer cells (46). Similarly, the potential involvement of chronic inflammation in the predictive value of reticulocyte count merits consideration. A higher reticulocyte count might act as an indicator of a more balanced inflammatory response. Reticulocytes play a role in modulating inflammatory processes, and an elevated count could suggest effective control of chronic inflammation (47).
Of note, our study linked plateletcrit to SCLC as well. Plateletcrit is the volume fraction of blood occupied by platelets, reflecting both the count and volume of platelets. Platelets release many mediators, such as integrins, chemokines, thromboxane, and ILs, that are involved in thrombosis, infection, and immunity (48). A previous observational study found higher plateletcrit in lung cancer patients than in healthy controls (49). Platelets contribute to the increased thrombotic risk of cancer patients and are associated with the metastasis of tumors. Platelets synthesize IL1β, driving the expression of genes supporting cell proliferation and differentiation, and the accumulation of oncogenic mutations (50). Clinical inhibition of IL1β was associated with a lower incidence of lung cancer, as well as lower overall lung cancer mortality, in patients with high hs-CRP (51). Our results supported that higher plateletcrit was particularly associated with SCLC. Serotonin, another platelet-derived mediator, may play a carcinogenic role. SCLC is a very aggressive tumor with properties of neuroendocrine cells and is strongly associated with tobacco use (52). Nicotine might affect the proliferation of small cell lung carcinoma cells by stimulating the release of serotonin with autocrine capabilities (53). More serotonin due to higher plateletcrit may amplify the mitogenic effect of serotonin on SCLC cells (54).
Our analyses were based on large-scale GWAS of 23 hematologic traits and lung cancer, with strict criteria for selecting variants. Consistent results were found across the main analysis and the sensitivity analyses. Multivariable MR analysis was applied to clarify the causal effect between related hematologic traits and lung carcinoma, and extensive sensitivity analyses were conducted in both univariable and multivariable analyses. We also conducted reverse MR and further validated that the direction of the effects was highly robust. These results, for the first time, demonstrated that genetic predisposition to hematologic traits was associated with lung cancer risk. Moreover, our findings also provide insight into the carcinogenic microenvironment that leads to lung cancer.
This study has some limitations as well. Due to the MR study design, we were not able to assess a potential nonlinear relationship between hematologic traits and the risk of lung carcinoma, although in certain conditions hematologic traits may show a U-shaped association (55). This MR was conducted using summary-level data from individuals of European descent, so the results may not be entirely applicable to other ancestries. Finally, results from summary statistics must be interpreted as population-average causal effects. Thus, more population-specific traits need to be investigated.
In this study, we found that genetically driven higher levels of reticulocyte count and hematocrit were causally associated with lower lung cancer risk in the general population. Higher plateletcrit had an adverse effect on SCLC. The causal effect may result from a combination of subclinical inflammation and multiple risk factors. This study supported hematologic traits as low-cost factors for lung cancer risk stratification. Further studies are necessary to verify the critical mediators.
Authors' Disclosures
No disclosures were reported.
Authors' Contributions
Z. Yang: Conceptualization, formal analysis, writing–original draft. H. He: Conceptualization, formal analysis, writing–original draft. G. He: Data curation, writing–review and editing. C. Zeng: Methodology. Q. Hu: Supervision, project administration.
Acknowledgments
Q. Hu was funded by the China Postdoctoral Science Foundation (2021TQ0371, 2021M703636) and the Hunan Provincial Natural Science Foundation (2022JJ40661).
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).