Abstract
Background: “Meeting-in-the-middle” (MITM) is a principle for identifying exposure biomarkers that are also predictors of disease. The MITM statistical framework was applied in a nested case–control study of hepatocellular carcinoma (HCC) within the European Prospective Investigation into Cancer and Nutrition (EPIC), where healthy lifestyle index (HLI) variables were related to targeted serum metabolites.
Methods: Lifestyle and targeted metabolomic data were available from 147 incident HCC cases and 147 matched controls. Partial least squares analysis related 7 lifestyle variables from a modified HLI to a set of 132 serum-measured metabolites and a liver function score. Mediation analysis evaluated whether metabolic profiles mediated the relationship between each lifestyle exposure and HCC risk.
Results: Exposure-related metabolic signatures were identified. In particular, the body mass index (BMI)-associated metabolic component was positively related to glutamic acid, tyrosine, PC aaC38:3, and liver function score and negatively to lysoPC aC17:0 and aC18:2. The lifetime alcohol-specific signature had negative loadings on sphingomyelins (SM C16:1, C18:1, SM(OH) C14:1, C16:1 and C22:2). Both exposures were associated with increased HCC risk, with total effects (TE) = 1.23 (95% confidence interval = 0.93–1.62) and 1.40 (1.14–1.72) for BMI and alcohol consumption, respectively. Both metabolic signatures mediated the association of BMI and lifetime alcohol consumption with HCC, with natural indirect effects equal to 1.56 (1.24–1.96) and 1.09 (1.03–1.15), respectively, accounting for a proportion mediated of 100% and 24%.
Conclusions: In a refined MITM framework, relevant metabolic signatures were identified as mediators in the relationship between lifestyle exposures and HCC risk.
Impact: The understanding of the biological basis for the relationship between modifiable exposures and cancer would pave avenues for clinical and public health interventions on metabolic mediators. Cancer Epidemiol Biomarkers Prev; 27(5); 531–40. ©2018 AACR.
This article is featured in Highlights of This Issue, p. 515
Introduction
Hepatocellular carcinoma (HCC) is the most common form of liver cancer (1) and is the second most frequent cause of cancer-related death worldwide (2). HCC mortality trends have increased in western populations, and unfavorable trends have been projected to 2020 (3). Similarly, HCC incidence rates have risen dramatically in Europe in the recent decades (4). HCC is a multifactorial disease strongly associated with lifestyle factors, most of which are modifiable (5). While hepatitis infection, including both HBV and HCV, remains its primary risk factor, other exposures, such as obesity, alcohol drinking, diabetes, smoking, physical activity, and some dietary items, have also been related to HCC risk (6–8). Understanding the underlying etiology of HCC is key to help develop disease prevention strategies, including through the identification of metabolic signatures.
Metabolic profiling is gaining traction in epidemiologic investigations of cardiovascular disease and cancer (9). Metabolomics measures a wide array of endogenous and exogenous features providing a comprehensive snapshot of metabolic status at a given time point (10). In this sense, the metabolome is a tool encompassing thousands of detectable molecules involved in various physiologic processes that can potentially elucidate underlying molecular and mechanistic pathways that may lead to pathologic outcomes (11–14).
In this scenario, the “meeting-in-the-middle” (MITM) principle (15, 16) can be used as a research strategy to identify biomarkers that are mechanistic mediators of the relationship between specific exposures and disease outcome. The statistical framework of the MITM was previously extended to multivariate modeling in a partial least squares (PLS) analysis to integrate a set of 21 lifestyle variables and 285 metabolic variables from 1H-NMR spectra in relation to HCC risk (17). In this current work, a more focused methodology was further developed building on a similar analytic structure. To this end, modifiable lifestyle exposures from a modified healthy lifestyle index (HLI; refs. 18, 19), including dietary, anthropometric (body mass index; BMI), smoking, alcohol consumption, and physical activity, and, in addition, diabetes and hepatitis infection were investigated with respect to specific metabolic signatures.
In this study, our objective was to apply an MITM approach to extract metabolic profiles of lifestyle exposures and explore their mediating role in the HCC causal pathway, using targeted metabolomic data from a case–control study nested within the European Prospective Investigation into Cancer and Nutrition (EPIC).
Materials and Methods
The study population
EPIC is a multicenter prospective study designed to investigate the link between diet, lifestyle, and environmental factors with cancer incidence and other chronic disease outcomes. Over 520,000 healthy men and women ages 25–85 were enrolled between 1992 and 2000 across 23 EPIC administrative centers in 10 European countries including Denmark, France, Germany, Greece, Italy, the Netherlands, Norway, Spain, Sweden, and the United Kingdom (20). Extensive details of the study design, rationale and recruitment methods, dietary assessment, blood collection protocols, and follow-up procedures have been previously published (20, 21). During the enrollment period (1992–1998), participants gave informed consent and completed questionnaires on diet, lifestyle, anthropometric measures, and medical history. Validated country-specific dietary questionnaires (DQ) ensuring high compliance were used to assess habitual diet over the previous 12 months (20). The ethical review boards of the participating institutions and the International Agency for Research on Cancer (IARC) approved this study. Biological samples were collected at recruitment prior to disease onset, and were available for approximately 80% of the cohort. Serum samples were stored at IARC, Lyon, France in −196°C liquid nitrogen for all countries, with the exception of samples originating from Sweden (−80°C freezers) and Denmark (−150°C nitrogen vapor). For information on how to submit an application for gaining access to EPIC data and/or biospecimens, please follow the instructions at http://epic.iarc.fr/access/index.php.
Nested case–control study design
This study focused on a nested case–control study of HCC in EPIC (22, 23) with available biological samples for the case sets identified in the period between subjects' recruitment into the cohort (1993–1998) and 2010 (23, 24). Cases of HCC (n = 147) originated from all participating EPIC centers except those in Norway and France, which were not included in this study. For each HCC case, one control (n = 147) was selected by incidence density sampling (25) from all cohort members alive and free of cancer (except for nonmelanoma skin cancer), and matched by age at blood collection (±1 year), sex, study center, time of the day at blood collection (±3 hours), fasting status at blood collection (<3, 3–6, and >6 hours); among women, the pair was additionally matched by menopausal status (pre-, peri-, and postmenopausal), and hormone replacement therapy use at time of blood collection (yes/no). All subjects were cancer-free at the time of blood collection. Information on follow-up and case ascertainment is available elsewhere (23). In brief, HCC cases were defined as first primary invasive tumors and identified through the 10th Revision of International Statistical Classification of Diseases, Injury and Causes of Death (ICD10) as C22.0 with morphology codes ICD-O-2 “8170/3” and “8180/3.” Metastatic cases and other types of primary liver cancer were excluded.
The lifestyle variables
The examined predictor variables were restricted to lifestyle factors that have been previously associated with HCC (5–7, 26–33) and that were used in the preexisting HLI (18, 19). These included BMI (continuous, kg/m²), average lifetime alcohol intake (continuous, g/day), the diet score (continuous), physical activity (continuous, metabolic equivalent of task MET-h/week), and smoking (never, ex-smokers quit > 10 years, ex-smokers quit ≤ 10 years, current smokers ≤ 15 cig/day, current smokers > 15 cig/day). Self-reported diabetes at baseline (yes/no) and hepatitis infection (yes/no) were also included as both were already established liver cancer risk factors (24, 26, 27). Hepatitis infection was assessed from biomarker measures of hepatitis B and hepatitis C viruses' (HBV, HCV) seropositivity (ARCHITECT HBsAg and anti-HCV chemiluminescent microparticle immunoassays; Abbott Diagnostics; ref. 26). The previously mentioned diet score is an a priori score proposed within EPIC based on dietary components that have been posited to affect risk of cancer (18, 19) and was detailed elsewhere (18). Briefly, the diet score combined six dietary items including cereal fiber, red and processed meats, ratio of polyunsaturated to saturated fatty acids, margarine (used as a surrogate marker for trans-fat from industrial sources), glycemic load, and fruits and vegetables. Lifetime alcohol intake was used instead of alcohol at recruitment to limit reverse causality.
These seven lifestyle variables will be herein referred to as the X-set. Missing values in the X-set were imputed through a simple EM algorithm using the covariance matrix of the data (34).
Metabolomic data
We measured concentrations of a set of known metabolites using targeted metabolite profiling in serum samples. The samples were analyzed by ultra-performance liquid chromatography (LC; 1290 Series HPLC; Agilent) coupled to a tandem mass spectrometer (MS/MS; QTrap 5500; AB Sciex) and performed at IARC (Lyon, France), using the BIOCRATES AbsoluteIDQ p180 Kit (Biocrates). This kit is used to measure more than 150 common endogenous metabolites (amino acids, biogenic amines, hexoses, acylcarnitines, lipids) including various metabolites from the intermediate metabolism, all of which are important indicators of individual metabolic status (35, 36). Details of the sample preparation and mass spectrometry analyses are provided elsewhere (23, 37). ICCs were above 0.50 in 73% and 52% of the metabolites in fasting and nonfasting samples, respectively (37). Metabolites with coefficient of variation (CV) > 20% for analytical replicates were excluded by the IARC laboratory leading to 145 metabolites being detected. Of these, metabolites with >40% of missing values (i.e., below the limit of detection/quantification or above the highest calibration standards) were excluded, resulting in a total of 132 metabolites included in this study. Measurements that were below the limit of detection were set to half that value and those below limit of quantification were set to half that limit. In addition, measurements that were above the highest concentration calibration standards were set to the highest values. Metabolite nomenclature has been described previously (38). Briefly, fatty acid side chains are labeled “Cx:y”, where x and y are the numbers of carbon atoms and double bonds, respectively. 
There were 12 acylcarnitines (abbreviated according to the fatty acid side chain), 21 amino acids, and 6 biogenic amines (labeled with their full name), 78 phosphatidylcholines (PC) of which there were 11 “LysoPC a” (PCs having one fatty acid side chain with an acyl bound), 34 “PC aa” and 33 “PC ae” [PCs having, respectively, two acyl side chains (diacyl) and one acyl and one alkyl side chains], a total of 14 sphingomyelins “SM” of which 5 had a hydroxyl group “SM(OH)” (additionally labeled according to the fatty acid side chain), and finally 1 sum of hexoses (including glucose, fructose, and galactose). PCs were separated by type of bond and number of fatty acid side chains.
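The exclusion and substitution rules described in this section can be sketched as follows. This is a minimal Python illustration, not the laboratory's pipeline: the function names and the status labels (`"ok"`, `"below_lod"`, `"below_loq"`, `"above_range"`) are hypothetical encodings chosen for the example.

```python
def impute_measurement(value, status, lod, loq, highest_standard):
    """Apply the substitution rules for one measurement.

    status is a hypothetical label: "ok", "below_lod", "below_loq",
    or "above_range" (above the highest calibration standard).
    """
    if status == "below_lod":
        return lod / 2            # half the limit of detection
    if status == "below_loq":
        return loq / 2            # half the limit of quantification
    if status == "above_range":
        return highest_standard   # capped at the highest standard
    return value


def keep_metabolite(statuses, max_missing=0.40):
    """Keep a metabolite unless more than 40% of its values are out of range."""
    missing = sum(1 for s in statuses if s != "ok")
    return missing / len(statuses) <= max_missing
```

A metabolite with exactly 40% out-of-range values is retained under this reading, because the exclusion rule is stated as ">40%".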
Liver function score
The liver function score is a composite score summarizing the number of abnormal values for six circulating liver blood biomarkers. It reflects possible underlying liver function impairment (17, 22, 23) and includes the following tests: alanine aminotransferase >55 U/L, aspartate aminotransferase >34 U/L, gamma-glutamyltransferase: men > 64 U/L and women > 36 U/L, alkaline phosphatase > 150 U/L, albumin < 35 g/L, total bilirubin > 20.5 μmol/L. These biomarkers were acquired at the same time as the metabolites from the prediagnostic blood samples collected at recruitment and the cut-off points were provided by the clinical biochemistry laboratory that conducted the analyses (Centre de Biologie République, Lyon, France) based on the assay specifications as described previously (23, 24).
The set comprising the 132 metabolites and the liver function score will be herein referred to as the M-set of metabolomics data.
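The liver function score defined above is a count of abnormal tests, so it can be illustrated in a few lines of Python. The cut-off points are those listed in the text; the function and argument names are hypothetical, and the sketch assumes all six biomarkers are available for the subject.

```python
def liver_function_score(alt, ast, ggt, alp, albumin, bilirubin, sex):
    """Count abnormal liver tests (0 to 6) using the cut-offs given in the text.

    Units: ALT, AST, GGT, ALP in U/L; albumin in g/L; bilirubin in umol/L.
    sex is "male" or "female" (only the GGT cut-off is sex-specific).
    """
    ggt_cutoff = 64 if sex == "male" else 36
    abnormal = [
        alt > 55,          # alanine aminotransferase
        ast > 34,          # aspartate aminotransferase
        ggt > ggt_cutoff,  # gamma-glutamyltransferase
        alp > 150,         # alkaline phosphatase
        albumin < 35,      # low albumin counts as abnormal
        bilirubin > 20.5,  # total bilirubin
    ]
    return sum(abnormal)
```

For instance, a woman with ALT of 60 U/L and GGT of 50 U/L and otherwise normal values scores 2, whereas a man with the same values scores 1 because of the higher GGT cut-off.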
Statistical analyses
The flowchart in Supplementary Fig. S1A recaps the different steps of the statistical analyses. Prior to PLS analysis, the sources of systematic variability within the exposure and the metabolomic data were identified and quantified through PC-PR2 analysis (39). Consequently, residuals on country for the lifestyle exposures and on country and batch for the M-set were computed in a series of univariate linear regression models and used in the following analytical steps.
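As an illustration of the residual step, note that regressing a variable on a single categorical factor (such as country, coded as indicator variables) yields residuals equal to deviations from the group mean. The Python sketch below covers only that single-factor case; the function name is hypothetical, and the two-factor adjustment used for the M-set (country and batch) would require a full least-squares fit that is not shown here.

```python
from collections import defaultdict


def residualize_on_group(values, groups):
    """Residuals of a linear regression of `values` on one categorical factor.

    With a single factor, the fitted value is the group mean, so the
    residual is the observation minus the mean of its own group.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    return [v - means[g] for v, g in zip(values, groups)]
```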
Partial least squares analyses
Each exposure variable, in turn, was related to metabolomic data through PLS analysis. PLS is a multivariate method that generalizes features of PCA with those of multiple linear regression (40, 41). Mathematical and computational details of the PLS method and its applicability within the MITM framework have been thoroughly described previously (17). In brief, PLS extracted linear combinations, referred to as PLS factors, of predictors (in this case the X-variable) and responses (the M-set), allowing a simultaneous decomposition of both sets with the aim of maximizing their covariance (40, 41). A series of individual PLS analyses was applied using each HLI variable separately as the predictor to yield exposure-specific metabolic signatures and one PLS factor was retained for each analysis. PLS scores mirroring the metabolic components of each PLS factor, the M-scores, were computed. The PLS factor loadings quantified how much each metabolite contributed to the PLS metabolic signature and variables with loading values lower than the 5th and larger than the 95th percentiles were reported to ease interpretation. In an attempt to yield even more specific metabolic signatures, sensitivity PLS analyses were computed using mutually adjusted lifestyle residuals, where each lifestyle exposure was regressed on the others, as well as country for each of the exposure variables. Country and batch residuals were used in the M-set.
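To make the single-exposure PLS step concrete, the sketch below computes a first PLS factor when there is only one predictor. In that special case the response-side weight vector is proportional to the covariance between the exposure and each metabolite, and the M-score is the corresponding weighted sum of the standardized metabolites. This is a simplified Python illustration under stated assumptions (unit-variance scaling, unit-norm loadings, no constant columns); the analyses reported here used PROC PLS in SAS, whose scaling conventions may differ, and all names below are hypothetical.

```python
import math


def center_scale(column):
    """Center a column to mean 0 and scale it to unit sample variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def first_pls_factor(x, metabolites):
    """First PLS factor for one exposure x and a dict {name: values} of responses.

    Returns (loadings, m_scores): unit-norm weights proportional to the
    covariance of each metabolite with x, and the per-subject M-score.
    """
    xs = center_scale(x)
    ys = {name: center_scale(col) for name, col in metabolites.items()}
    # Cross-products between the exposure and each standardized metabolite
    cross = {name: sum(a * b for a, b in zip(xs, col)) for name, col in ys.items()}
    norm = math.sqrt(sum(v * v for v in cross.values()))
    loadings = {name: v / norm for name, v in cross.items()}
    m_scores = [
        sum(loadings[name] * ys[name][i] for name in ys) for i in range(len(x))
    ]
    return loadings, m_scores
```

Metabolites that rise with the exposure receive positive loadings and those that fall receive negative loadings, which matches how the signatures are read in the Results.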
Mediation analyses
Mediation analysis assessed whether the metabolic profiles mediated the relation between individual lifestyle factors and HCC risk (Supplementary Fig. S1B). For each individual PLS analysis, mediating effects were computed for each extracted pair of lifestyle variable and M-score, adapting the formulae from VanderWeele and Vansteelandt (42) to accommodate continuous exposures and mediators and conditional logistic regression for our matched setting. For each examined lifestyle variable, estimates of the natural direct effect (NDE), the natural indirect effect (NIE), and the total effect (TE) were obtained, along with the effect of the corresponding M-score adjusted for its counterpart lifestyle exposure and for confounding variables and referred to as the mediator effect (ME).
The NDE and NIE were obtained from two main models: a linear mediator model and a conditional logistic outcome model. HCC being a rare outcome, direct and indirect effects were estimated taking into account the nested case–control design. This was done by fitting the mediator regression among the controls only (43). Testing showed no exposure–mediator interaction; thus, the models were written as follows:
Let x be the exposure, m the mediator, c a set of different confounders, y the HCC outcome, and j the pair number ranging among the set {1,…, n = 147}:
Thus, NDE and NIE are given as follows for a one SD increase in x and m:
95% confidence interval (CI) for NDE and NIE were computed through the following formulae:
where |$\hat{\sigma }_{11}^\theta$|, |$\hat{\sigma }_{22}^\theta$| and |$\hat{\sigma }_{11}^\beta$| are the estimated variances of the coefficients |${\hat{\theta }_1}$|, |${\hat{\theta }_2}$|, and |${\hat{\beta }_1}$|, respectively.
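A sketch of the models and estimators referred to in this passage is given below. It is written to be consistent with the symbols defined in the text and with the no-interaction, rare-outcome formulae of VanderWeele and Vansteelandt (42). The intercepts |$\beta_0$| and |$\theta_{0j}$|, the confounder coefficient vectors |$\beta_2$| and |$\theta_3$|, the 1.96 normal quantile, and the assumption of zero covariance between |${\hat{\theta }_2}$| and |${\hat{\beta }_1}$| (which come from separate models) are notational assumptions made for this illustration.

```latex
% Mediator model, fitted among controls only (linear regression)
E[m \mid x, c] = \beta_0 + \beta_1 x + \beta_2^{\top} c

% Outcome model (conditional logistic regression, matched pair j = 1, ..., 147)
\operatorname{logit} P(y = 1 \mid x, m, c, j) = \theta_{0j} + \theta_1 x + \theta_2 m + \theta_3^{\top} c

% Natural direct and indirect effects for a one-unit (here one-SD) increase
\mathrm{NDE} = \exp(\hat{\theta}_1), \qquad
\mathrm{NIE} = \exp(\hat{\theta}_2 \hat{\beta}_1)

% 95% confidence intervals (delta method)
\exp\left(\hat{\theta}_1 \pm 1.96 \sqrt{\hat{\sigma}_{11}^{\theta}}\right), \qquad
\exp\left(\hat{\theta}_2 \hat{\beta}_1 \pm 1.96
  \sqrt{\hat{\theta}_2^{2}\, \hat{\sigma}_{11}^{\beta} + \hat{\beta}_1^{2}\, \hat{\sigma}_{22}^{\theta}}\right)
```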
The total effect of X (TE) was computed from the following conditional logistic regression:
with TE given by:
Usually TE can be written as the product of NDE and NIE. However, in our setting employing conditional logistic regression, this is no longer the case because discordant pairs in the model adjusted for the mediator are not the same as the model not including the mediator (TE).
The mediator effect (ME), corresponding to the “independent” effect of the M-score adjusted for its counterpart lifestyle exposure and for confounding variables was given by:
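One way to write the total effect and the mediator effect, consistent with the descriptions above, is sketched below. The symbols |$\gamma_{0j}$|, |$\gamma_1$| and |$\gamma_2$| are hypothetical names for the coefficients of the conditional logistic model fitted without the mediator; |${\hat{\theta }_2}$| is taken to be the estimated coefficient of the M-score in the outcome model that also includes the exposure and the confounders.

```latex
% Total-effect model: conditional logistic regression without the mediator
\operatorname{logit} P(y = 1 \mid x, c, j) = \gamma_{0j} + \gamma_1 x + \gamma_2^{\top} c,
\qquad \mathrm{TE} = \exp(\hat{\gamma}_1)

% Mediator effect: M-score coefficient, adjusted for x and c
\mathrm{ME} = \exp(\hat{\theta}_2)
```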
To control for potential confounding, mediation analyses models were adjusted for the modified HLI variables except the one under scrutiny. P values for NDE and NIE were inferred from the 95% CI, whereas for the ME and TE, P values associated with Wald test for continuous exposure compared with a χ2distribution with 1 degree of freedom were produced. The false discovery rate (FDR) correction (44) was applied to mediation results stemming from the multiple PLS analyses to control for multiple testing and associated q values were presented.
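For illustration, q values of the kind described above can be computed as in the Python sketch below. It assumes the Benjamini–Hochberg step-up procedure, which the text does not name explicitly; the function name is hypothetical.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p values (q values), in the input order.

    Each q value is the smallest value of p * m / rank over all tests with
    an equal or larger p value, which enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down to the smallest
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q
```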
For each mediation analysis, the estimates for the NDE, NIE, TE, and ME were reported for an increase in the exposure as follows: an increase of 1 SD for smoking, an increase of 1 SD among the controls for BMI, physical activity, and the diet score, an increase of 1 unit (0 to 1) for diabetes and hepatitis infection, and finally an increase of 12 g/day (corresponding to one alcohol unit) for lifetime alcohol.
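The way an exposure increment enters the natural effects can be illustrated with the Python sketch below. It assumes the standard no-interaction expressions, in which the log NDE is the exposure coefficient of the outcome model times the increment and the log NIE is the product of the mediator coefficient in the outcome model, the exposure coefficient in the mediator model, and the increment, with delta-method confidence intervals. The function name, argument names, and the coefficient values in any usage are hypothetical and are not the study estimates.

```python
import math


def natural_effects(theta1, theta2, beta1,
                    var_theta1, var_theta2, var_beta1,
                    increment=1.0, z=1.96):
    """NDE and NIE (odds-ratio scale) with delta-method confidence intervals.

    theta1, theta2: outcome-model coefficients of exposure and mediator.
    beta1: mediator-model coefficient of the exposure.
    increment: exposure change of interest (e.g. 1 SD, or 12 for 12 g/day).
    Assumes zero covariance between theta2 and beta1 (separate models).
    """
    log_nde = theta1 * increment
    se_nde = abs(increment) * math.sqrt(var_theta1)
    log_nie = theta2 * beta1 * increment
    se_nie = abs(increment) * math.sqrt(
        theta2 ** 2 * var_beta1 + beta1 ** 2 * var_theta2
    )

    def with_ci(log_est, se):
        # (estimate, lower bound, upper bound) on the odds-ratio scale
        return (math.exp(log_est),
                math.exp(log_est - z * se),
                math.exp(log_est + z * se))

    return {"NDE": with_ci(log_nde, se_nde), "NIE": with_ci(log_nie, se_nie)}
```

Calling the function with `increment=12` would express the effects per 12 g/day when the exposure coefficients are estimated per g/day.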
As TE = NDE*NIE does not hold in our setting, the mediated proportion was computed using the following formula:
Indeed, the proportion mediated makes real sense only when NDE and NIE have the same direction of association and is bounded between 0% and 100%. In this case, our formula reduces to:
When NDE and NIE have opposite directions, the mediated proportion is not well-defined. For example, if |$NDE = 0.5$| and |$NIE = 2$|, so that |$TE = 1$|, it is not clear what the mediated proportion would be. In our results, NDE and NIE always had the same direction when they were both statistically significant. In our analyses for diabetes (and likewise for BMI), the NIE was significantly associated with an increased risk of HCC, whereas the NDE was not significant and had the opposite direction of association. This suggested that TE = NIE and, using our first formula above, we get the appropriate value of 100%.
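As a worked check of the same-direction case, the Python sketch below assumes that the reduced formula is the log-scale ratio log(NIE) / (log(NDE) + log(NIE)). This form is an assumption, but it reproduces the reported proportions for lifetime alcohol, diet score, and smoking when applied to the rounded NDE and NIE estimates. The function name is hypothetical, and the opposite-direction case is deliberately left undefined rather than guessed.

```python
import math


def proportion_mediated(nde, nie):
    """Proportion mediated on the log odds-ratio scale (same-direction case only)."""
    log_nde, log_nie = math.log(nde), math.log(nie)
    if log_nde * log_nie <= 0:
        # Opposite directions (or a null effect): not well-defined here
        raise ValueError("NDE and NIE must have the same direction")
    return log_nie / (log_nde + log_nie)


# Rounded NDE and NIE estimates as reported for three exposures
for name, nde, nie in [("Lifetime alcohol", 1.31, 1.09),
                       ("Diet score", 0.77, 0.85),
                       ("Smoking", 1.17, 1.22)]:
    print(name, round(100 * proportion_mediated(nde, nie)))
# prints 24, 38 and 56 for the three exposures, respectively
```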
All statistical tests were two-sided and after multiple testing correction, q values below 0.05 were considered statistically significant. Statistical analyses were performed using PROC PLS in SAS (45) for PLS analyses and the R Software (46) for linear and conditional logistic regressions and mediation analyses.
Results
Study population characteristics by case–control status are presented in Table 1. Individual PLS analyses yielded metabolic signatures for each component of the modified HLI and the top metabolites contributing to each signature are presented in Table 2. For lifetime alcohol, the signature was negatively related to SM C16:1, SM C18:1, SM(OH) C14:1, SM(OH) C16:1, and SM(OH) C22:2 and positively related to glutamic acid and PC aaC32:1. Metabolites associated with smoking included SM C16:1 and C18:1, SM(OH) C14:1 and C22:2, LysoPC aC28:1 and PC aeC30:2 with negative loadings and hexoses with positive loadings. In the sensitivity analysis, smoking was negatively associated with serine, lysine, and biogenic taurine and positively with PC aaC36:1 and aaC40:3 (Supplementary Table S1). Different diacyl and acyl–alkyl–phosphatidylcholines characterized the metabolic signature related to diet. The metabolic profile of BMI included glutamic acid, tyrosine, PC aaC38:3, the liver function score with positive loadings and glutamine, LysoPC aC17:0 and LysoPC aC18:2 with negative values. Hexoses and amino acids valine, isoleucine, and phenylalanine were positively associated with diabetes status.
Characteristicsa | Cases | Controls |
---|---|---|
N | 147 | 147 |
Sex (n) | ||
Male | 102 | 102 |
Female | 45 | 45 |
Age at blood collection (years) | 60.1 (50.7–68.8) | 60.1 (50.9–68.9) |
Height (cm) | 167.7 (152.3–180.7) | 169.3 (156.1–181.0) |
Weight (kg) | 79.8 (59.0–102.2) | 78.3 (60.6–93.0) |
BMI (kg/m2) | 28.2 (23.0–34.9) | 27.3 (22.0–32.5) |
Total energy intake (kcal/day) | 2260.8 (1381.4–3169.3) | 2276.6 (1495.1–3140.6) |
Alcohol intake at recruitment (g/day) | ||
Among consumers | 29.6 (1.08–80.76) | 16.8 (1.27–41.27) |
% of never drinkers (<0.1 g/day) | 13.6 | 6.1 |
Physical activity (Mets-hour/week) | 77.9 (18.9–150.5) | 83.3 (21.6–157.2) |
Lifetime alcohol intake (g/day)b | ||
Among consumers | 33.6 (2.13–74.42) | 19.7 (2.17–41.32) |
% of never drinkers (<0.1 g/day) | 3.1 | 4.1 |
Diet scoreb | 25.7 (16.0–33.0) | 27.4 (20.6–30.0) |
Hepatitis infectionb,c | ||
Yes | 41 | 5 |
No | 106 | 142 |
Diabetes at baselineb | ||
Yes | 19 | 10 |
No | 128 | 137 |
Smoking statusb (n) | ||
Current smokers, > 15 cigarettes/day | 25 | 23 |
Current smokers, ≤ 15 cigarettes/day | 34 | 10 |
Former smokers, quit ≤ 10 years | 17 | 25 |
Former smokers, quit >10 years | 29 | 29 |
Never | 42 | 60 |
Fasting status (n) | ||
Unknown | 17 | 16 |
In between (3–6 h) | 31 | 29 |
No (<3 h) | 59 | 61 |
Yes (>6 h) | 40 | 41 |
aValues are presented as means and 10th and 90th percentiles in parentheses for continuous variables and as frequencies for categorical variables.
bThere were respectively 42, 12, 76, 29 and 7 missing values for lifetime alcohol consumption, diet score, hepatitis, diabetes and smoking; they were imputed with an EM algorithm using the covariance matrix of the data.
cPrior to EM, there were 41 hepatitis infections (3 in controls and 38 in cases). After imputation there were 46 hepatitis infections (5 in controls and 41 in cases). Among the initial distribution of the 41 hepatitis patients, there were 26 with hepatitis C, 15 with hepatitis B with 3 subjects having both HBV and HCV.
Metabolites | Loadings | Metabolites | Loadings | Metabolites | Loadings |
---|---|---|---|---|---|
BMI | | Lifetime alcohol | | Diet score | |
Glutamine | −0.186 | Glutamic acid | 0.17 | PC aaC36:1 | −0.178 |
Glutamic acid | 0.230 | SM C16:1 | −0.171 | PC aaC38:0 | 0.195 |
Tyrosine | 0.243 | SM C18:1 | −0.167 | PC aaC38:6 | 0.23 |
LysoPC aC17:0 | −0.218 | SM(OH) C14:1 | −0.18 | PC aaC40:6 | 0.204 |
LysoPC aC18:2 | −0.236 | SM(OH) C16:1 | −0.184 | PC aaC42:2 | 0.263 |
PC aeC36:2 | −0.203 | SM(OH) C22:2 | −0.211 | PC aeC34:1 | −0.195 |
Liver function score | 0.191 | PC aaC32:1 | 0.211 | PC aeC40:6 | 0.167 |
Physical activity | | Smoking | | Hepatitis infection | |
Biogenic creatinine | −0.199 | Hexoses | 0.136 | SM C20:2 | −0.179 |
Biogenic taurine | −0.181 | SM C16:1 | −0.238 | SM(OH) C16:1 | −0.178 |
Glutamic acid | −0.212 | SM C18:1 | −0.194 | PC aaC32:2 | 0.188 |
PC aaC34:2 | −0.188 | SM(OH) C14:1 | −0.214 | PC aaC34:1 | 0.184 |
PC aeC34:2 | 0.209 | SM(OH) C22:2 | −0.182 | PC aaC34:3 | 0.18 |
PC aeC34:3 | 0.176 | LysoPC aC28:1 | −0.204 | PC aaC34:4 | 0.197 |
PC aeC36:3 | 0.193 | PC aeC30:2 | −0.264 | PC aaC36:5 | 0.189 |
Diabetes status | | | | | |
Biogenic alpha AAA | 0.236 | | | | |
Isoleucine | 0.168 | | | | |
Phenylalanine | 0.158 | | | | |
Valine | 0.211 | | | | |
Hexoses | 0.551 | | | | |
LysoPC aC16:1 | −0.145 | | | | |
Liver function score | 0.226 | | | | |
aMetabolite variables contributing to each PLS factor were selected based on extreme loading values, that is, below or above the 2.5th and 97.5th percentiles. Results from multiple PLS models performed using residuals based on country (X- and M-sets) and batch residuals (M-set only).
The association between each of the different lifestyle factors and HCC risk was strongly mediated by the PLS metabolic profiles, with the exception of physical activity and hepatitis infection (Tables 3 and 4). In particular, for both diabetes and BMI, a positive association for the NIE, equal to 5.11 (95% CI = 1.99–13.14, q value = 2.45E−03) and 1.56 (1.24–1.96, q value = 1.20E−03), respectively, was observed, together with a lack of association for the NDE, thus suggesting that the relationship between these two variables and HCC risk was fully mediated by the corresponding metabolic signatures. As for smoking, diet, and lifetime alcohol, the mediated proportions were 56%, 38%, and 24%, respectively, with NIE equal to 1.22 (1.04–1.44, q value = 2.52E−02), 0.85 (0.74–0.97, q value = 2.52E−02), and 1.09 (1.03–1.15, q value = 4.67E−03), respectively (Table 3). Notably, the NIE estimate for smoking in the sensitivity analysis was 1.98 (1.34–2.92, q value = 1.32E−04), and the relation between smoking and HCC was fully mediated by the M-score (Supplementary Table S2).
Modelsa | NDE | q value | NIE | q value | TE | q value | % Mediated |
---|---|---|---|---|---|---|---|
BMI | 0.85 (0.60–1.20) | 4.81E−01 | 1.56 (1.24–1.96) | 1.20E−03 | 1.23 (0.93–1.62) | 1.74E−01 | 100 |
Lifetime alcohol | 1.31 (1.06–1.61) | 4.20E−02 | 1.09 (1.03–1.15) | 4.67E−03 | 1.40 (1.14–1.72) | 3.50E−03 | 24 |
Diet score | 0.77 (0.54–1.11) | 3.92E−01 | 0.85 (0.74–0.97) | 2.52E−02 | 0.66 (0.47–0.92) | 3.27E−02 | 38 |
Physical activity | 0.98 (0.72–1.35) | 9.18E−01 | 0.97 (0.87–1.09) | 6.17E−01 | 0.98 (0.71–1.34) | 8.84E−01 | —b |
Smoking | 1.17 (0.77–1.77) | 5.36E−01 | 1.22 (1.04–1.44) | 2.52E−02 | 1.42 (0.99–2.03) | 1.02E−01 | 57 |
Hepatitis infection | 17.99 (5.15–62.80) | 4.11E−05 | 0.94 (0.83–1.06) | 3.48E−01 | 16.70 (4.82–57.84) | 6.24E−05 | 0 |
Diabetes | 0.46 (0.11–1.93) | 4.82E−01 | 5.11 (1.99–13.14) | 2.45E−03 | 2.45 (0.84–7.18) | 1.41E−01 | 100 |
NOTE: Values reported in bold are statistically significant.
aModels were mutually adjusted for all HLI variables. Cases and controls were matched on age at blood collection (±1 year), sex, study center, date (±2 months), and time of day at blood collection (±3 hours), fasting status at blood collection (<3/3–6/>6 hours); women were additionally matched on menopausal status (pre/peri/postmenopausal) and hormone replacement therapy. The mediator models were linear. The outcome models were computed through conditional logistic regressions. In the mediation analysis, the exposure was the original modified HLI lifestyle factor, the mediator was the associated M-score (metabolic profile), and the outcome was HCC.
bAs the associations were null for direct and indirect effects, the proportion mediated was not computed. NDE and NIE and their 95% CIs computed from formulae are detailed in Materials and Methods.
Modelsa | ME | q value |
---|---|---|
BMI | 4.04 (2.22–7.36) | 3.17E−05 |
Lifetime alcohol | 2.50 (1.57–3.97) | 2.48E−04 |
Diet score | 0.61 (0.41–0.89) | 1.54E−02 |
Physical activity | 0.90 (0.60–1.35) | 6.15E−01 |
Smoking | 3.33 (1.96–5.66) | 3.17E−05 |
Hepatitis infection | 1.22 (0.88–1.69) | 2.60E−01 |
Diabetes | 2.75 (1.59–4.78) | 5.55E−04 |
NOTE: Values reported in bold are statistically significant.
aModels were mutually adjusted for all HLI variables. Cases and controls were matched on age at blood collection (±1 year), sex, study center, date (±2 months), and time of day at blood collection (±3 hours), fasting status at blood collection (<3/3–6/>6 hours); women were additionally matched on menopausal status (pre/peri/postmenopausal) and hormone replacement therapy. The mediator models were linear. The outcome models were computed through conditional logistic regressions. In the mediation analysis, the exposure was the original modified HLI lifestyle factor, the mediator was the associated M-score (metabolic profile), and the outcome was HCC.
The metabolic profile associated with the diet score showed a strong inverse association with HCC risk, with ME (mediator effect) equal to 0.61 (0.41–0.89, q value = 1.54E−02). A positive association with HCC was observed for the lifestyle-specific metabolic signatures related to BMI, lifetime alcohol, smoking, and diabetes, with ME = 4.04 (2.22–7.36, q value = 3.17E−05), 2.50 (1.57–3.97, q value = 2.48E−04), 3.33 (1.96–5.66, q value = 3.17E−05), and 2.75 (1.59–4.78, q value = 5.55E−04), respectively (Table 4).
The TE estimates showed strong associations for lifetime alcohol (1.40; 95% CI = 1.14–1.72, q value = 3.50E−03), diet score (0.66; 0.47–0.92, q value = 3.27E−02), and hepatitis infection (16.70; 4.82–57.84, q value = 6.24E−05) with HCC risk (Table 3). The NDE were not statistically significantly associated with HCC for most of the lifestyle exposures, except for lifetime alcohol intake and hepatitis infection. With the exception of smoking and, to a lesser extent, lifetime alcohol, the PLS metabolic profiles and estimated associations were virtually unchanged in the sensitivity analysis (Supplementary Tables S1 and S2).
Discussion
In this nested case–control study, we evaluated the relationships of modifiable lifestyle exposures and other HCC risk factors in models that integrated information on metabolites according to the MITM principle (17). Metabolic signatures mediated varying proportions of the associations of BMI, lifetime alcohol use, diet score, smoking, and diabetes status with HCC risk. Our findings provide mechanistic insights into the etiology of HCC with respect to modifiable lifestyle and other HCC risk factors.
Carcinogenesis leading to HCC involves various modifications of a number of molecular pathways and its etiology is complex. It involves different risk factors that can overlap (e.g., hepatitis infection, cirrhosis, diabetes, obesity) that in turn affect a broad range of mechanisms including activation of proinflammatory pathways and induction of the oxidative stress response (47).
PLS analysis was used to relate each of the modified HLI variables, in turn, to the set of metabolites. The exposure-specific metabolic signatures were found to mediate the relation with HCC risk for BMI, lifetime alcohol consumption, smoking, diabetes, and diet, with proportions mediated of 100%, 24%, 56%, 100%, and 38%, respectively. These findings suggest that varying proportions of the total effect on HCC are exerted via the metabolic signatures, possibly through specific underlying mechanisms by which each exposure acts.
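As a rough worked example of how a proportion mediated relates to the reported effect estimates, the sketch below assumes the usual odds ratio decomposition TE = NDE × NIE and a log-scale definition of the proportion mediated, log(NIE)/log(TE). The formulae actually used are those given in Materials and Methods and may differ; the function name `decompose` is hypothetical, and the inputs are the rounded lifetime alcohol estimates (TE = 1.40, NIE = 1.09).

```python
import math


def decompose(te_or, nie_or):
    """Return (NDE odds ratio, proportion mediated on the log scale).

    Assumes TE = NDE x NIE on the odds ratio scale and te_or != 1
    (otherwise log(TE) is zero and the ratio is undefined).
    """
    nde_or = te_or / nie_or
    proportion_mediated = math.log(nie_or) / math.log(te_or)
    return nde_or, proportion_mediated


# Rounded lifetime alcohol estimates: TE = 1.40, NIE = 1.09.
nde_alcohol, pm_alcohol = decompose(1.40, 1.09)
print(f"NDE = {nde_alcohol:.2f}, proportion mediated = {pm_alcohol:.0%}")
```

With these rounded inputs the ratio is roughly one quarter, in the neighborhood of the 24% reported for lifetime alcohol; small differences are expected from rounding of the published estimates and from the exact formula used.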
Specifically, a recent IARC handbook evaluation on body fatness and obesity reported a positive relationship between BMI and risk of liver cancer (48). Our study suggests that the increase in HCC risk is entirely mediated by a BMI-specific metabolic signature characterized by phosphatidylcholines (LysoPC aC18:2, LysoPC aC17:0 and PC aeC36:2) and tyrosine. PCs are required for lipoprotein assembly and secretion; in particular, acyl-alkyl-PCs were correlated with high-density cholesterol (49, 50). An imbalance in tyrosine levels has been previously related to insulin resistance and type II diabetes (50–53). Correlation studies conducted in the EPIC-Potsdam cohort exploring the association between lifestyle factors and blood metabolite levels, acquired with the same targeted technology, showed similar findings, with serum acyl-alkyl-phosphatidylcholines (PC ae), LysoPC aC17:0, aC18:2, and PC aeC36:2 negatively associated with obesity and BMI, whereas tyrosine was positively related to BMI (35, 36, 54). Tyrosine was also related to BMI in the Framingham Offspring cohort (55), and tyrosine and glutamate were positively associated with BMI in a study including participants from the PLCO, Navy, and Shanghai studies (56). Glutamate, a neurotransmitter, has been previously implicated in appetite regulation (57). It has been hypothesized that high levels of amino acids may result from excess protein breakdown in skeletal muscle due to insulin resistance concurrent with obesity, or from higher protein turnover in individuals with high BMI (58–60).
The metabolic signature fully mediated the association between diabetes, a well-established HCC risk factor (5), and HCC. The contributing metabolites were hexoses, phenylalanine, and LysoPCs, consistent with previous studies based on targeted (50) and untargeted (61) approaches. These metabolites were further linked with insulin resistance and involved in glycolysis and gluconeogenesis, and their metabolic alterations were associated with an increased diabetes risk (50).
The metabolic signature of lifetime alcohol intake was negatively associated with sphingomyelins and positively associated with phosphatidylcholines. In the CARLA study, in both men and women, PC aaC32:1 was positively associated with alcohol consumption while PC aeC30:2, SM(OH) C14:1, SM(OH) C16:1, and SM(OH) C22:2 were inversely associated with high intakes of alcohol (62). Similar metabolite patterns were observed in a study that focused on alcohol-dependent patients (63). As ethanol has been hypothesized to induce lipogenesis in liver tissue (64), alcohol can lead to hepatic injuries causing a disruption of the metabolism of fatty acids including phospholipids (65). A study in EPIC found that alcohol intake was positively associated with plasma phospholipids (66). In particular, alcohol is associated with higher levels of short PCs and lower levels of long-chain PCs (65). Alcohol is also responsible for the activation of the enzyme acid sphingomyelinase (ASM) resulting in lower SM levels that could lead to catabolism of SMs into ceramide and PCs (63, 67), which, in turn, leads to hepatotoxicity and eventually HCC (68, 69).
The identification of specific metabolic signatures for alcohol and smoking was particularly challenging in our study, as these two factors are strongly correlated (70–72). An overlap between the smoking and alcohol-specific metabolic signatures was observed in the preliminary analysis, where four common sphingomyelins, that is, SM C16:1, SM C18:1, SM(OH) C14:1, and SM(OH) C22:2, were identified. In the sensitivity analysis, the different lifestyle exposures were mutually adjusted for prior to PLS, thus leading to a new list of metabolites associated with smoking, which included serine, SM(OH) C22:2 and PC aaC36:1, consistent with what was reported in the KORA study (73). As a result, the estimated proportion of mediation increased from 57% to 100%, resulting in a metabolic signature capturing smoking-related metabolic features that is more predictive of HCC. In the CARLA study, which also explored the relation of lifestyle exposures with serum metabolite levels, a sex-specific positive association was found between PC aaC32:1 and smoking in men and with SM C16:1 and BMI in women (62). Smoking disrupts PC levels, with low acyl-alkyl PCs and high diacyl PCs likely reflecting less lipid remodeling in membranes resulting in inflammation (74), which may in turn lead to liver injury (75). PCs play a major role in essential fatty acids and are a substrate for cell signaling (76). Cigarette smoke contains free radicals (77, 78), which promote lipid peroxidation (79) that can lead to liver diseases (80).
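This section does not state how the mutual adjustment prior to PLS was implemented. One possible reading, sketched below purely as an assumption, is to residualize the exposure of interest on a correlated exposure and then extract the first PLS component for a single response (weights proportional to the covariance between each metabolite and the adjusted exposure). The toy values, the function names `residualize` and `pls1_first_component`, and the omission of unit-variance scaling are all illustrative choices, not the authors' procedure.

```python
from statistics import mean


def center(values):
    """Subtract the mean from every element."""
    m = mean(values)
    return [v - m for v in values]


def residualize(y, x):
    """Residuals of y after simple least-squares regression on x."""
    yc, xc = center(y), center(x)
    beta = sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)
    return [a - beta * b for a, b in zip(yc, xc)]


def pls1_first_component(X, y):
    """First PLS component for one response.

    X is a list of rows (samples x metabolites); y is the exposure.
    Returns unit-norm metabolite weights and per-sample scores.
    """
    cols = [center(list(col)) for col in zip(*X)]
    yc = center(y)
    w = [sum(a * b for a, b in zip(col, yc)) for col in cols]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    scores = [sum(w[j] * cols[j][i] for j in range(len(cols)))
              for i in range(len(yc))]
    return w, scores


# Hypothetical toy data: 6 participants, 3 metabolites.
alcohol = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
smoking = [0.0, 0.5, 1.5, 1.0, 3.0, 2.5]
X = [[1.0, 5.0, 2.0], [1.2, 4.6, 2.1], [1.9, 4.1, 1.8],
     [1.5, 4.0, 2.4], [2.8, 3.2, 2.0], [2.4, 3.1, 2.2]]

smoking_adj = residualize(smoking, alcohol)  # smoking net of alcohol
weights, scores = pls1_first_component(X, smoking_adj)
print(weights)
```

Because the adjusted exposure is uncorrelated with alcohol by construction, metabolites that track both exposures contribute less to the resulting weights, which is the kind of shift in the smoking signature described above.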
One of the limitations of the study was the small sample size, as HCC is a rare malignancy. Consequently, both cases and controls were included in the PLS analysis for identification of the exposure-specific metabolic signatures. However, PLS was done agnostically with respect to case–control status. In addition, mediation analysis assumed a temporal sequence of lifestyle exposures, metabolites, and outcome (81, 82), for the NDE and NIE to have a causal interpretation. Despite the prospective design of the study, another potential limitation is reverse causality, as the liver is an organ playing a key role in many metabolic pathways (83). In our study, lifestyle exposures were assessed at baseline, at the same time as the collection of biological samples that provided metabolomics data. It is worth noting that lifestyle and metabolite concentrations reflect different exposure windows, with metabolites likely reflecting exogenous and endogenous exposures in a limited timeframe (37, 38, 84). The diet score was derived from questionnaires that covered the dietary habits of participants over the 12 months prior to baseline (20, 21). While lifetime alcohol reflected the history of exposure across adult life, other exposures such as BMI, smoking, physical activity, hepatitis infection, and diabetes status were assumed to be relatively stable over time in the middle-aged study populations in EPIC. Another key aspect of mediation analysis is what is referred to as the “cross-world assumption,” whereby NDE and NIE cannot be identified in the presence of a mediator-outcome confounder that is affected by the exposure (85). In our study, the composite liver function score, an index compiled from measures of circulating biomarkers of hepatic function indicating underlying liver impairment (22), was likely affected by lifestyle exposure, and was, in turn, likely influencing metabolite levels and HCC risk. 
The use of weighting-based estimation methods to look at joint mediators to compute randomized interventional effects has been proposed as a solution in the presence of mediator-outcome confounding (85). In this study, the liver function score was added to the list of mediators. In this way, the metabolic signatures comprised relevant information on liver function, and the link with relevant lifestyle factors was evaluated. Finally, we had a substantial number of missing values for hepatitis infection (∼25%), which were imputed by an EM algorithm, and given the small sample size it was not possible to conduct a sensitivity analysis by hepatitis type.
Findings from this comprehensive approach suggested that certain exposure-specific metabolic profiles are intermediate biomarkers on the causal pathway toward hepatocellular carcinogenesis, but replication of these findings in an independent setting and in a larger sample is warranted. Future applications include applying the same methodology to other cancer sites and their associated lifestyle risk factors to identify relevant metabolic signatures, possibly relying on Mendelian Randomization analysis to assess causality. Application to untargeted data is also planned.
This study further refined an endeavor to use high-throughput data to integrate metabolomics and lifestyle exposures with disease indicators. Metabolomics lends itself as a promising tool to identify metabolites bridging the link between exposure(s) and disease, as advocated by the MITM principle (15, 16). The framework we developed allows the identification of informative metabolic signatures, which are useful to elucidate the underlying biological mechanisms in the relationship between lifestyle exposures and cancer risk (86).
Disclosure of Potential Conflicts of Interest
T. Philip is the president of Institut Curie Cancer Center. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: N. Assi, D.C. Thomas, P. Vineis, M.J. Gunter, M. Jenab, P. Ferrari
Development of methodology: N. Assi, D.C. Thomas, P. Ferrari, V. Viallon
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): V. Chajes, M.-C. Boutron-Ruault, A. Molinuevo, T. Kühn, R.C. Travis, K. Overvad, M.J. Gunter, A. Scalbert, M. Jenab
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): N. Assi, D.C. Thomas, V. Chajes, M.J. Gunter, M. Jenab, V. Viallon
Writing, review, and/or revision of the manuscript: N. Assi, D.C. Thomas, M. Leitzman, M. Stepien, V. Chajes, P. Vineis, C. Bamia, M.-C. Boutron-Ruault, T.M. Sandanger, A. Molinuevo, H.C. Boshuizen, A. Sundkvist, T. Kühn, R.C. Travis, K. Overvad, E. Riboli, M.J. Gunter, A. Scalbert, M. Jenab, P. Ferrari, V. Viallon
Study supervision: T. Philip, P. Vineis, P. Ferrari
Acknowledgments
The coordination of EPIC is financially supported by the European Commission (DG-SANCO) and the International Agency for Research on Cancer. The national cohorts are supported by Danish Cancer Society (Denmark); Ligue Contre le Cancer, Institut Gustave Roussy, Mutuelle Générale de l'Education Nationale, Institut National de la Santé et de la Recherche Médicale (INSERM) (France); Deutsche Krebshilfe, Deutsches Krebsforschungszentrum and Federal Ministry of Education and Research (Germany); the Hellenic Health Foundation (Greece); Associazione Italiana per la Ricerca sul Cancro-AIRC-Italy and National Research Council (Italy); Dutch Ministry of Public Health, Welfare and Sports (VWS), Netherlands Cancer Registry (NKR), LK Research Funds, Dutch Prevention Funds, Dutch ZON (Zorg Onderzoek Nederland), World Cancer Research Fund (WCRF), Statistics Netherlands (The Netherlands); Nordic Centre of Excellence programme on Food, Nutrition and Health (Norway); Health Research Fund (FIS; PI13/00061 to Granada), Regional Governments of Andalucía, Asturias, Basque Country, Murcia (no. 6236) and Navarra, ISCIII RETIC (RD06/0020) (Spain); Swedish Cancer Society, Swedish Scientific Council and County Councils of Skåne and Västerbotten (Sweden); Cancer Research UK (14136 to EPIC-Norfolk; C570/A16491 and C8221/A19170 to EPIC-Oxford), Medical Research Council (1000143 to EPIC-Norfolk and MR/M012190/1 to EPIC-Oxford) (United Kingdom). N. Assi was financially supported by the Université Claude Bernard Lyon I through a doctoral fellowship awarded by the EDISS (Ecole Doctorale InterDisciplinaire Sciences Santé) doctoral school to complete her PhD work. The data on the EPIC-Hepatobiliary dataset was generated through support from the French National Cancer Institute (L'Institut National du Cancer; INCA; grant number 2009-139; principal investigator: M. Jenab). The work undertaken by D.C. Thomas reported in this publication was supported by the NIH under award number P01 CA196559. R.C. Travis is a co-principal investigator of the EPIC-Oxford cohort whose work is supported by Cancer Research UK under grant number C8221/A19170.
The authors wish to thank Dr. Joshua Sampson from the Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, for useful discussions and insightful comments on this work. The authors would like to extend their thanks to Mr. Bertrand Hémon and Ms. Carine Biessy from the International Agency for Research on Cancer for their kind help with issues related to data management.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.