Abstract
Breast cancer is a major cause of death in Western women. The role of metabolism in breast cancer etiology remains unclear. Metabolomics may help to elucidate novel biological pathways and identify new biomarkers to predict breast cancer long before symptoms appear. The aim of this study was to investigate whether untargeted metabolomic signatures from blood draws of healthy women could help to better understand and predict the long-term risk of developing breast cancer.
A nested case–control study was conducted within the SU.VI.MAX prospective cohort (13 years of follow-up) to analyze baseline plasma samples of 211 incident breast cancer cases and 211 matched controls by LC/MS. Multivariable conditional logistic regression models were computed.
A total of 3,565 ions were detected and 1,221 were retained for statistical analysis. A total of 73 ions were associated with breast cancer risk (P < 0.01; FDR ≤ 0.2). Notably, we observed that a lower plasma level of O-succinyl-homoserine (OR = 0.70, 95%CI = [0.55-0.89]) and higher plasma levels of valine/norvaline [1.45 (1.15–1.83)], glutamine/isoglutamine [1.33 (1.07–1.66)], 5-aminovaleric acid [1.46 (1.14–1.87)], phenylalanine [1.43 (1.14–1.78)], tryptophan [1.40 (1.10–1.79)], γ-glutamyl-threonine [1.39 (1.09–1.77)], ATBC [1.41 (1.10–1.79)], and pregnene-triol sulfate [1.38 (1.08–1.77)] were associated with an increased risk of developing breast cancer during follow-up.
Conclusion: Several prediagnostic plasmatic metabolites were associated with long-term breast cancer risk and suggested a role of microbiota metabolism and environmental exposure.
After confirmation in other independent cohort studies, these results could help to identify healthy women at higher risk of developing breast cancer in the subsequent decade and to propose a better understanding of the complex mechanisms involved in its etiology.
This article is featured in Highlights of This Issue, p. 1273
Introduction
Breast cancer is the most common female cancer in the world (www.dietandcancerreport.org) and ranks as the fifth cause of death from cancer overall (626,679 deaths worldwide in 2018; ref. 1). Despite some well-known risk factors such as adult height, reproductive life, and hormonal factors, and, for postmenopausal women, alcoholic drinks, body fatness, and adult weight gain (www.wcrf.org/breast-cancer-2017), the role of metabolism in breast cancer etiology remains unclear and hypothesis-driven research has not yet fully elucidated it.
Recent advances in modern explorative technologies, such as metabolomics, increase the capacity to identify additional risk factors and to better understand the mechanisms involved (2). Indeed, mass spectrometry (MS) and NMR allow the detection of a wide range of low molecular weight biochemicals in biofluids, cells, or tissues (3) and thus can provide a signature of both exogenous exposures and endogenous metabolism. Therefore, metabolomics allows the identification of biomarkers relevant to cancer etiology and the elucidation of novel biological pathways (4–6). Several cross-sectional epidemiologic studies highlighted differences in metabolomic profiles between patients with breast cancer and controls (7, 8). An NMR-based prospective study (5 years of follow-up) recently showed the benefit of metabolomic signatures of breast cancer risk (9). However, metabolomic signatures may be altered by the presence of a breast tumor itself (10), arguing for similar investigations with a long follow-up. Improving the ability to discriminate profiles at higher risk of developing breast cancer in the subsequent decades through the identification of prediagnostic biomarkers, to establish preventive approaches, remains an important focus in breast cancer research.
Two recent semiuntargeted MS studies investigated the associations between diet- (11) or body mass index (BMI)-related metabolites (12) and long-term breast cancer risk, and found promising results supporting the role of diet or BMI in postmenopausal breast cancer etiology. We previously set up a prospective nested case–control study within the SU.VI.MAX cohort to investigate whether baseline metabolomic profiles from plasma of apparently healthy women could help identify women who would develop breast cancer during the subsequent years, and better understand the etiology of this complex disease. Using an NMR metabolomics analysis, higher fasting plasma levels of valine, lysine, glutamine, and glucose and lower plasma levels of lipoproteins and glycerol-derived compounds were notably found to be associated with a higher risk of developing breast cancer (13). To increase the number of metabolites detected both in terms of sensitivity and metabolome coverage, this study applied for the first time a fully untargeted and nonoriented MS analysis, with the objective of highlighting new endogenous or exogenous metabolites associated with long-term risk of breast cancer.
Materials and Methods
Population study
This analysis was based on a case–control study nested within the SU.VI.MAX prospective cohort (clinicaltrials.gov; NCT00272428). The latter was initially designed to investigate the influence of a daily supplementation with nutritional doses of antioxidants on the incidence of cardiovascular diseases and cancers. The study design and methods have been previously detailed (14, 15). Briefly, this study is a population-based, double-blind, placebo-controlled, randomized trial enrolling a total of 13,017 participants recruited in 1994–1995. The intervention study lasted 8 years, and observational follow-up of health events was maintained until September 2007. Written informed consent was obtained from all participants. The trial was approved by the Ethics Committee for Studies with Human Subjects of Paris-Cochin Hospital (CCPPRB 706/2364) and the ‘Commission Nationale de l'Informatique et des Libertés' (CNIL 334641/907094) and was conducted according to the Declaration of Helsinki guidelines.
Baseline data collection
At inclusion, self-administered questionnaires about sociodemographic characteristics, smoking status, medication use, health status, and family history of cancer were completed by participants. During a baseline clinical examination, all participants underwent anthropometric measurements (height and weight) and a blood draw (occurring after a 12-hour fasting period) by study nurses and physicians. These 35 mL venous blood samples were collected in sodium heparin Vacutainer Tubes (Becton Dickinson). After centrifugation at +4°C, plasma aliquots were immediately prepared and stored frozen at −20°C for less than 2 days and then stored in liquid nitrogen.
Case ascertainment
Health events that occurred during follow-up (from 1994 to 2007) were self-reported by participants. Medical data were gathered through participants, physicians, and hospitals and reviewed by an independent physician expert committee. Pathologic reports were used to validate cases and extract cancer characteristics. Cancers were classified by using the International Classification of Diseases, 10th Revision, Clinical Modification (16).
Nested case–control study
This analysis initially included 215 participants with a first incident invasive breast cancer matched 1:1 with controls for baseline age (35–39 years/40–44 years/45–49 years/50–54 years/55–59 years/>60 years), menopausal status (pre-/postmenopause status at baseline and at diagnosis), BMI (<18.5 kg/m², underweight; ≥18.5–<25 kg/m², normal weight; and ≥25 kg/m², overweight/obese), arm assignment to the initial SU.VI.MAX clinical trial (placebo/supplemented), smoking status (current smokers and non/former smokers at baseline), and season of blood draw (a priori–defined periods: October–November/December–January–February/March–April–May). Control selection was based on the classical density sampling method (17); that is, every time a case was diagnosed, one control was selected from the other cohort participants with no breast cancer at that time.
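The density sampling principle described above can be illustrated with a minimal Python sketch. It is not the procedure actually run for this study: the field names (`id`, `stratum`, `dx_time`, `end_time`) are hypothetical, the matching factors are collapsed into a single `stratum` value, and controls are assumed to be drawn with replacement across risk sets.

```python
import random

def density_sample(participants, seed=0):
    """Pick one control per case from the matched risk set at diagnosis time.

    participants: list of dicts with hypothetical keys
      'id'       : participant identifier
      'stratum'  : tuple of matching factors (age class, BMI class, ...)
      'dx_time'  : years from baseline to breast cancer diagnosis, or None
      'end_time' : years from baseline to end of follow-up
    """
    rng = random.Random(seed)
    cases = sorted(
        (p for p in participants if p["dx_time"] is not None),
        key=lambda p: p["dx_time"],
    )
    pairs = []
    for case in cases:
        # Risk set: same stratum, still followed, and cancer-free at dx_time.
        risk_set = [
            p for p in participants
            if p["id"] != case["id"]
            and p["stratum"] == case["stratum"]
            and p["end_time"] >= case["dx_time"]
            and (p["dx_time"] is None or p["dx_time"] > case["dx_time"])
        ]
        if risk_set:  # a case with an empty risk set stays unmatched
            pairs.append((case["id"], rng.choice(risk_set)["id"]))
    return pairs

cohort = [
    {"id": "A", "stratum": "s1", "dx_time": 3.0, "end_time": 3.0},
    {"id": "B", "stratum": "s1", "dx_time": 8.0, "end_time": 8.0},
    {"id": "C", "stratum": "s1", "dx_time": None, "end_time": 5.0},
    {"id": "D", "stratum": "s2", "dx_time": None, "end_time": 13.0},
]
pairs = density_sample(cohort)
```

Note that under this scheme a participant diagnosed later (B in the toy cohort) remains eligible as a control for an earlier case (A), which is the defining feature of density sampling.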
Metabolomic analyses
Metabolomic profiling was conducted using ultrahigh performance liquid chromatography/mass spectrometry with the Metabolic Profiler Platform (Bruker Daltonique). Details on chemicals and reagents, biological sample preparation, metabolic profiling, raw data extraction, and metabolite identification are available in Supplementary Materials and Methods and Supplementary Fig. S1. Samples were randomized within the analytic sequence based on a Williams Latin Square strategy defined according to the main factors of the study. The stability of the analytic system was monitored using pooled samples as quality control, analyzed in the same sequence as participant samples (injected one time at the beginning of each sequence and then after each set of 12 samples). This metabolomic discovery approach allows semiquantitative measurements, representing relative ion intensities in arbitrary units. Although this is not absolute quantitation, the ion intensity of each metabolite is linearly related to its concentration in the samples.
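The quality-control injection pattern described above (one pooled sample at the start of a sequence, then one after each set of 12 participant samples) can be sketched as follows. The function name and the `"QC"` label are illustrative assumptions, and the sample order is assumed to have been randomized beforehand; the Williams Latin Square design itself is not reproduced here.

```python
def build_injection_sequence(sample_ids, qc_label="QC", block_size=12):
    """Interleave pooled QC injections with already-randomized samples."""
    sequence = [qc_label]  # one QC injection at the beginning of the sequence
    for position, sample in enumerate(sample_ids, start=1):
        sequence.append(sample)
        if position % block_size == 0:  # one QC after each full set of samples
            sequence.append(qc_label)
    return sequence

samples = [f"S{i}" for i in range(1, 31)]  # 30 hypothetical sample labels
sequence = build_injection_sequence(samples)
```

With 30 samples, the sketch yields 33 injections, three of which are pooled QC samples.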
Statistical analysis
Participants' baseline characteristics were compared between cases and controls using χ² tests for categorical variables and Student t test for continuous variables. Less than 5% of values were missing for all covariates and missing values were replaced by the mode.
Prospective associations between each ion's signal intensity and breast cancer risk were characterized using multivariable conditional logistic regression models. ORs for an increment of 1 SD for each metabolite and their 95% confidence intervals (95% CI) were computed, as well as tests for linear trend. Multivariable models were adjusted for the following baseline characteristics: age (continuous), BMI (continuous), smoking status (current, former, and nonsmokers), height (continuous), alcohol intake (continuous), physical activity (low, moderate, and intense), education level (primary, secondary, and superior), family history of breast cancer, number of children, and use of hormone replacement therapy for menopause; all these covariates constituted the reference model. Models were adjusted for matching factors only when the variable was more precise than the one used for the matching, to avoid residual confounding. The models were not additionally adjusted for menopausal status (pre-/postmenopause status at baseline and at diagnosis), season of blood draw (October–November/December–January–February/March–April–May), and arm assignment to the initial SU.VI.MAX clinical trial (yes/no) because cases and controls were already matched for these variables with the same coding. To limit the false-negative rate (type II error, i.e., the risk of not detecting real associations) in this exploratory untargeted metabolomic study, an FDR level of 0.2 was retained from the Benjamini–Hochberg procedure (ref. 18; 1,221 tests: 530 ions in electrospray (ESI)-positive mode and 691 ions in ESI-negative mode). The linearity assumption was checked by plotting the ion's intensity as a function of logit(predicted values of breast cancer risk). ORs for intensities categorized in tertiles were also computed, as well as a Spearman correlation matrix of the identified associated metabolites.
Sensitivity analysis was performed on the same models by excluding cases diagnosed during the first year of follow-up, as well as their matched controls, to prevent reverse causality bias. Exploratory stratified and heterogeneity (interaction) analyses were also performed according to menopausal status at diagnosis and according to the median follow-up duration: 5.8 years (to investigate differentially short- or long-term associations with breast cancer risk). Interaction was tested as the product of each ion (continuous/SD) by these two factors (two categories each).
A multivariate analysis using a principal component analysis (PCA) with rotated factors by orthogonal transformation was performed on ions associated with breast cancer from logistic regressions. The first four factors were retained according to the plot of the total explained variance (Cattell scree plots; ref. 19). The multivariable conditional logistic regression models were run on these factors (with the same adjustments as described above) to investigate the associations between metabolomic patterns (combinations of metabolite variations) and breast cancer risk.
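To make the "OR per 1 SD" quantity concrete, the sketch below fits a conditional logistic model for 1:1 matched pairs with a single standardized covariate. For such pairs the conditional likelihood depends only on the within-pair difference d (case minus control), with each pair contributing 1/(1 + exp(−βd)). This is a simplified illustration, not the SAS analysis used in the study: it omits all covariate adjustment, assumes the SD is the pooled SD of cases and controls, and the function name is hypothetical.

```python
import math
import statistics

def matched_pair_or(case_values, control_values, max_iter=50, tol=1e-10):
    """Unadjusted OR per 1-SD increment with a Wald 95% CI, 1:1 matched pairs."""
    # Assumption: intensities are scaled by the pooled SD of all participants.
    pooled_sd = statistics.stdev(list(case_values) + list(control_values))
    diffs = [(c - k) / pooled_sd for c, k in zip(case_values, control_values)]

    def score_and_information(beta):
        probs = [1.0 / (1.0 + math.exp(-beta * d)) for d in diffs]
        score = sum(d * (1.0 - p) for d, p in zip(diffs, probs))
        info = sum(d * d * p * (1.0 - p) for d, p in zip(diffs, probs))
        return score, info

    # Newton-Raphson on the concave conditional log-likelihood.
    # Not handled here: all-zero differences or perfectly separated pairs.
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = score_and_information(beta)
    se = 1.0 / math.sqrt(info)
    return math.exp(beta), math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)

# Toy intensities (arbitrary units) for six hypothetical matched pairs.
cases = [5.0, 6.0, 7.0, 4.0, 8.0, 6.0]
controls = [4.0, 7.0, 5.0, 5.0, 6.0, 6.0]
odds_ratio, ci_low, ci_high = matched_pair_or(cases, controls)
```

Swapping the case and control vectors returns the reciprocal OR, which is a convenient sanity check on the sign convention.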
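The scree-plot criterion relies on the share of total variance carried by each component, that is, each eigenvalue of the correlation matrix divided by the matrix trace. The sketch below computes these shares with power iteration and deflation in plain Python. It assumes a symmetric positive semidefinite input matrix, uses hypothetical function names, and omits the orthogonal rotation step applied in the study.

```python
import math

def _mat_vec(matrix, vec):
    return [sum(a * v for a, v in zip(row, vec)) for row in matrix]

def top_eigenvalues(matrix, k, n_iter=1000):
    """Largest k eigenvalues of a symmetric PSD matrix (power iteration)."""
    n = len(matrix)
    work = [list(row) for row in matrix]
    eigenvalues = []
    for _component in range(k):
        # Arbitrary non-zero start vector, normalized to unit length.
        vec = [float(i + 1) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        for _step in range(n_iter):
            nxt = _mat_vec(work, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:  # nothing left to extract
                break
            vec = [x / norm for x in nxt]
        # Rayleigh quotient of the unit vector gives the eigenvalue.
        lam = sum(v * a for v, a in zip(vec, _mat_vec(work, vec)))
        eigenvalues.append(lam)
        # Deflation: remove the extracted component before the next pass.
        for i in range(n):
            for j in range(n):
                work[i][j] -= lam * vec[i] * vec[j]
    return eigenvalues

def explained_variance_ratio(matrix, k):
    """Share of total variance (trace) carried by each of the first k components."""
    total = sum(matrix[i][i] for i in range(len(matrix)))
    return [lam / total for lam in top_eigenvalues(matrix, k)]

toy = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues 3 and 1
ratios = explained_variance_ratio(toy, 2)
```

Plotting these ratios against the component index gives the scree plot; the retained factors are those before the curve flattens.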
Analyses were performed using SAS software (v9.3) and plots were generated using R software (v3.5.2). All of the statistical tests were two-sided.
Ethical approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Results
The baseline characteristics of breast cancer cases and controls in the study population are summarized in Table 1. The cases were more likely to be taller with more frequent family history of breast cancer than controls. A total of 25% of cases were diagnosed before 3.5 years, 50% before 5.8 years, and 75% before 9.4 years. The maximum length of follow-up was 12.8 years.
Characteristic | Breast cancer cases (N = 211) | Controls (N = 211) | Pᵇ |
---|---|---|---|
Age at baseline (y) | 49.0 ± 6.0 | 48.9 ± 6.0 | 0.9 |
BMI (kg/m²) | 23.2 ± 4.0 | 23.6 ± 4.2 | 0.3 |
BMI category | | | 0.7 |
<18.5 kg/m² (underweight) | 9 (4.3) | 6 (2.8) | |
≥18.5–<25 kg/m² (normal weight) | 148 (70.1) | 151 (71.6) | |
≥25 kg/m² (overweight) | 54 (25.6) | 54 (25.6) | |
Height (cm) | 163.0 ± 6.2 | 160.9 ± 5.9 | 0.0004 |
Intervention group | | | 1 |
Placebo | 105 (49.8) | 105 (49.8) | |
Antioxidants | 106 (50.2) | 106 (50.2) | |
Smoking status | | | 1 |
Never and former | 164 (77.7) | 164 (77.7) | |
Current smoker | 47 (22.3) | 47 (22.3) | |
Physical activity | | | 0.8 |
Irregular | 67 (31.8) | 60 (28.4) | |
<1 h/d walking equivalent | 69 (32.7) | 72 (34.1) | |
≥1 h/d walking equivalent | 75 (35.5) | 79 (37.4) | |
Educational level | | | 0.3 |
Primary | 37 (17.5) | 44 (20.9) | |
Secondary | 82 (38.9) | 91 (43.1) | |
Superior | 92 (43.6) | 76 (36.0) | |
Number of biological children | 1.8 ± 1.2 | 2.0 ± 1.2 | 0.1 |
Hormonal treatment for menopause (yes) | 76 (36.0) | 75 (35.5) | 0.9 |
Menopausal status at baseline | | | 1 |
Premenopausal | 129 (61.1) | 129 (61.1) | |
Postmenopausal | 82 (38.9) | 82 (38.9) | |
Menopausal status at diagnosis | | | 1 |
Premenopausal | 78 (37.0) | 78 (37.0) | |
Postmenopausal | 133 (63.0) | 133 (63.0) | |
Family history of breast cancerᶜ (yes) | 35 (16.6) | 21 (10.0) | 0.04 |
Alcohol intake (g/d) | 11.2 ± 12.0 | 11.5 ± 13.0 | 0.8 |
Month of blood draw | | | 1 |
March–April–May | 77 (36.5) | 77 (36.5) | |
October–November | 31 (14.7) | 32 (15.2) | |
December–January–February | 103 (48.8) | 102 (48.3) | |
ᵃValues are means ± SD or n (%).
ᵇP for the comparison between breast cancer cases and controls using χ² tests for categorical variables or Student t test (from analysis of variance models) for continuous variables.
ᶜAmong first-degree female relatives.
A total of 3,565 ions were detected and 1,221 were retained for statistical analysis. From logistic regression analyses, 53 ions were positively associated with breast cancer risk and 20 ions were inversely associated (P ≤ 0.01; FDR ≤ 0.2). Results of all associated ions are shown in Supplementary Table S1. Among these ions, some were identified using available databases [in-house database, Metlin (https://metlin.scripps.edu), Human Metabolome Database (www.hmdb.ca), and Kyoto Encyclopedia of Genes and Genomes database] and analytic data (such as MS2); the associations between these ions (N = 9) and breast cancer risk are displayed in Table 2. A lower plasma level of O-succinyl-homoserine (OR = 0.70, 95% CI = [0.55–0.89]) and higher plasma levels of valine/norvaline [1.45 (1.15–1.83)], glutamine/isoglutamine [1.33 (1.07–1.66)], 5-aminovaleric acid [1.46 (1.14–1.87)], phenylalanine [1.43 (1.14–1.78)], tryptophan [1.40 (1.10–1.79)], γ-glutamyl-threonine [GGT; 1.39 (1.09–1.77)], acetyl tributyl citrate [ATBC; 1.41 (1.10–1.79)], and the putative pregnene-triol sulfate [1.38 (1.08–1.77)] were associated with an increased risk of developing breast cancer during follow-up. Linearity assessments, as well as analyses performed with ions categorized in tertiles, are provided in Supplementary Fig. S2A and S2B and Supplementary Table S2. The Spearman correlation matrix of the identified associated metabolites is provided in Supplementary Table S3. Positive correlations (P < 0.05) were found for pregnene-triol sulfate with 5-aminovaleric acid, tryptophan, and GGT; for phenylalanine with valine, tryptophan, 5-aminovaleric acid, and glutamine; for valine with 5-aminovaleric acid, tryptophan, and glutamine; for 5-aminovaleric acid with GGT; for tryptophan with glutamine; and for GGT with glutamine. Inverse correlations were found for O-succinyl-homoserine with 5-aminovaleric acid and ATBC; and for 5-aminovaleric acid with ATBC.
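The correspondence between the P and FDR thresholds reported above can be checked with a short sketch of the Benjamini–Hochberg step-up rule. Under this rule, k hypotheses among m tests are rejected at level q when the k-th smallest P value is at most (k/m) × q. The function names are hypothetical, and the final line assumes that the 73 associated ions are exactly the set rejected at FDR 0.2 among the 1,221 tests.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted P values (q-values) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def bh_cutoff(n_tests, n_rejected, fdr):
    """Largest raw P value compatible with n_rejected rejections at level fdr."""
    return n_rejected / n_tests * fdr

q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
cutoff = bh_cutoff(n_tests=1221, n_rejected=73, fdr=0.2)  # about 0.012
```

Under that assumption, the implied raw P cutoff of about 0.012 is consistent with the P ≤ 0.01 reported for the associated ions.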
Results of sensitivity, stratification, and heterogeneity analyses are shown in Supplementary Tables S4, S5, and S6. All identified ions remained associated with breast cancer when excluding cases diagnosed before 1 year of follow-up. No interaction was found between breast cancer risk and duration of follow-up. Higher plasma level of glutamine was associated with an increased risk of breast cancer only in premenopausal women (Pinteraction = 0.003; FDR = 0.2).
Name (level of confidence for identificationᵇ) | Annotation | Measured mass (m/z) | Retention time (min) | Mode | Cases/controls | Mean ± SD | OR (95% CI) | P-trend |
---|---|---|---|---|---|---|---|---|
L-phenylalanine (1) | [M+H]+ | 166.08618 | 10.11 | ESI+ | 211/211 | 227,718 ± 42,535 | 1.43 (1.14–1.78) | 0.002ᶜ |
L-valine/norvaline (1) | [M+H-CH2O2]+ | 72.08089 | 3.08 | ESI+ | 211/211 | 115,033 ± 24,435 | 1.45 (1.15–1.83) | 0.002ᶜ |
5-aminovaleric acid (1) | [M+Na]+ | 140.06816 | 2.29 | ESI+ | 211/211 | 10,219 ± 5,139 | 1.46 (1.14–1.87) | 0.003ᵈ |
O-succinyl-L-homoserine (1) | [M+Na-2H]- | 240.04755 | 2.86 | ESI− | 203/203 | 919 ± 400 | 0.70 (0.55–0.89) | 0.004ᵈ |
ATBC (acetyl tributyl citrate) (1) | [M+Na]+ | 425.21694 | 19.31 | ESI+ | 211/211 | 2,198 ± 964 | 1.41 (1.10–1.79) | 0.006ᵈ |
L-tryptophan (1) | [M+Na]+ | 227.07887 | 11.09 | ESI+ | 211/211 | 799 ± 180 | 1.40 (1.10–1.79) | 0.007ᵈ |
L-γ-glutamyl-L-threonine (1) | [M+H-H2O]+ | 231.1451 | 2.52 | ESI+ | 211/211 | 1,054 ± 601 | 1.39 (1.09–1.77) | 0.008ᵈ |
L-glutamine/L-isoglutamine (1) | [M+H-C4H8O]+ | 90.0551 | 2.32 | ESI+ | 211/211 | 13,187 ± 2,982 | 1.33 (1.07–1.66) | 0.009ᵈ |
Pregnene-triol sulfate (2) | C21H33O6S | 413.19771 | 16.11 | ESI− | 203/203 | 2,441 ± 1,254 | 1.38 (1.08–1.77) | 0.01ᵈ |
ᵃMultivariable models were adjusted for age (continuous), BMI (continuous), smoking status (current smokers, former smokers, and nonsmokers), season of blood draw (a priori–defined periods: October–November/December–January–February/March–April–May), height (continuous), alcohol intake (continuous), physical activity (irregular/<1 h/d walking equivalent/≥1 h/d walking equivalent), education level (primary/secondary/superior), family history of breast cancer (yes/no), number of children (continuous), and use of hormone replacement therapy for menopause (yes/no; reference model). Tests for linear trend were performed using the continuous variables. ORs were presented for an increment of 1 SD for each metabolite.
ᵇMetabolites were classified according to Sumner and colleagues (46) concerning the levels of confidence in the identification process: identified (level 1: confirmed by standard), putatively annotated (level 2: based upon physicochemical properties and/or spectral similarity with public/commercial spectral libraries).
ᶜFDR after Benjamini–Hochberg correction for multiple testing = 0.1.
ᵈFDR after Benjamini–Hochberg correction for multiple testing = 0.2.
All ions associated with breast cancer risk in logistic regressions were then analyzed by PCA. The first four PCA factors (metabolomic patterns) explained more than 36% of total variability. Loading values are displayed in Supplementary Table S7. The first factor was notably characterized by higher plasma levels of valine and 5-aminovaleric acid; the second by higher plasma levels of phenylalanine, valine, 5-aminovaleric acid, and tryptophan; the third by a lower plasma level of pregnene-triol sulfate; and the fourth by higher plasma levels of phenylalanine, valine, 5-aminovaleric acid, GGT, and glutamine (loading values ≥ |0.20|). Moreover, several unidentified ions also characterized these four factors. These data distinguished three metabolomic patterns (factors 1, 2, and 4) associated with an increased risk of breast cancer in multivariable conditional logistic regression models and one (factor 3) associated with a decreased risk (FDR < 0.05; Table 3).
PCA factorsᵇ | Cases/controls | OR (95% CI) | P-trend | Corrected P-trendᶜ |
---|---|---|---|---|
Factor 1 | 200/200 | 1.38 (1.11–1.73) | 0.004 | 0.004 |
Factor 2 | 200/200 | 1.47 (1.16–1.85) | 0.001 | 0.002 |
Factor 3 | 200/200 | 0.49 (0.37–0.65) | <0.0001 | <0.0001 |
Factor 4 | 200/200 | 1.49 (1.17–1.89) | 0.001 | 0.002 |
ᵃMultivariable models were adjusted for age (continuous), BMI (continuous), smoking status (current smokers, former smokers, and nonsmokers), season of blood draw (a priori–defined periods: October–November/December–January–February/March–April–May), height (continuous), alcohol intake (continuous), physical activity (irregular/<1 h/d walking equivalent/≥1 h/d walking equivalent), education level (primary/secondary/superior), family history of breast cancer (yes/no), number of children (continuous), and use of hormone replacement therapy for menopause (yes/no; reference model). Tests for linear trend were performed using the continuous variables. ORs are computed for a 1-point increment of the continuous variable.
ᵇThe first four PCA factors (metabolomic patterns) explained more than 36% of total variability. The first factor was notably characterized by higher plasma levels of valine and 5-aminovaleric acid; the second by higher plasma levels of phenylalanine, valine, 5-aminovaleric acid, and tryptophan; the third by a lower plasma level of pregnene-triol sulfate; and the fourth by higher plasma levels of phenylalanine, valine, 5-aminovaleric acid, GGT, and glutamine (loading values ≥ |0.20|). Moreover, several unidentified ions also characterized these four factors.
ᶜP-trend after Benjamini–Hochberg correction for multiple testing.
Discussion
To our knowledge, this prospective study was the first fully untargeted and nonoriented LC/MS-based analysis aiming to investigate the associations between prediagnosis metabolic profiles and long-term breast cancer risk. Untargeted metabolomics, by the detection of a large number of metabolites, allowed discriminating early phenotypes and revealing a plasma metabolic signature announcing the evolution toward breast cancer in the subsequent 13 years. This could be relevant for detecting specific plasmatic profiles of women at higher risk for this disease and improve personalized prevention. A profile (signature) of circulating levels of multiple metabolites provides insights to draw an overall picture of the different pathways involved in the etiology of this cancer (20). Even if the origin of metabolites variations (exogenous or endogenous) cannot be determined in this study, our results also illustrate that metabolomics integrates information regarding the influence of both extrinsic and intrinsic factors, such as lifestyle exposures and basal metabolism, and consequently address the complex issue of environment and health interaction.
In this study, we found that a lower plasma level of O-succinyl-homoserine and higher plasma levels of valine/norvaline, glutamine/isoglutamine, 5-aminovaleric acid, phenylalanine, tryptophan, γ-glutamyl-threonine, ATBC, and pregnene-triol sulfate were associated with an increased risk of developing breast cancer during the following decade. Three metabolomic patterns characterized by several concurrently metabolites variations were associated with an increased risk of breast cancer, as well as another metabolomic pattern associated with a decreased risk. These univariate analysis (analysis of one by one metabolite variation) and multivariate analysis (analysis of combinations of simultaneous metabolites variations) allowed getting complementary overviews of metabolic signatures discriminating phenotypes at higher risk for this disease well before breast cancer clinical onset.
We previously highlighted a higher plasma level of glutamine and valine in the future breast cancer cases with NMR analyses on the same population study (13), which was consistent with our current results. An increase in plasmatic levels of valine and three γ-glutamyl dipeptides was also recently found to be positively associated with BMI and estrogen receptor–positive breast cancer risk in postmenopausal women through a semiuntargeted MS study (12). The pregnene-triol sulfate is a steroid hormone and belongs to progestative family compounds. In line with our results, two semiuntargeted MS studies found a positive association between steroid hormones and breast cancer risk in postmenopausal women (11, 12). Four metabolomic studies reported that an increased level of phenylalanine was positively associated with hepatocellular carcinoma (HCC) risk (21–24) and three of these studies found that a higher level of glutamine was associated with a decreased risk of HCC (21–23). Concerning valine, findings were more contrasted; one of these studies showed a positive association with HCC risk (21), whereas another observed an inverse association (22). An increase in plasmatic valine level was also reported in two metabolomic studies to be positively associated with pancreatic adenoma risk (25) and inversely associated with prostate cancer risk (26). In three metabolomic studies an inverse association was found between phenylalanine (26), tryptophan (27), γ-glutamyl-histidine (28), and γ-glutamyl-phenylalanine (26, 27) and prostate cancer risk, whereas a positive association was reported between γ-glutamyl-glutamine and nonaggressive prostate cancer risk (27). The comparisons of our results with other prospective metabolomic studies should be interpreted with caution, taking into account methodologic and technological variations (as the wide range of metabolomics platforms and types of analyses), because the etiology may vary according to cancer locations.
The findings highlighted in this study provide evidence for circulating metabolites as putative risk biomarkers for breast cancer. Given that metabolites concerned by the observed variations are not specific of one metabolic pathway and as their variations could arise from several types of exposure and endogenous reactions, precaution is needed in the mechanistic interpretation of our epidemiologic findings. Even so, we propose interpretation hypotheses for some presumed pathways, summarized in Fig. 1. Among the possible interpretations, our results could suggest a perturbation of energy and amino acids metabolism, a set-up of conditions favoring cell proliferation, inflammation, and oxidative stress in women who will develop breast cancer involving environmental exposure and microbiota metabolism. Glutamine and valine (one of the branched chain amino acids; BCAA) positively correlated in our study, are two glycogenic amino acids; their plasma increase could suggest an amplified phenomenon of gluconeogenesis. BCAAs were previously suggested as markers of cardiovascular disease risk (29), insulin resistance (30), and diabetes (31). Moreover there is convincing evidence on the effect of obesity on an increased risk of postmenopausal breast cancer, and pathways related to insulin resistance is one of the number of plausible mechanisms that may explain this relationship (32). Moreover, valine and glutamine are regulators of many cell signaling pathways including activators of the mTOR pathway; one of the main cell proliferation routes (31). ATBC is an alternative plasticizer to phthalates (33) widely used in products such as food wrap, vinyl toys, and pharmaceutical excipients. Although this contaminant does not seem to accumulate in tissues (34), recent data revealed potential biological activity on tissue growth (35). 
The combined increase in plasma levels of ATBC and of a progestogen sulfate steroid may interact with PPAR signaling, which is involved in cell proliferation and inflammation, and/or with steroid signaling (36). Indeed, ATBC has been shown to influence steroid levels (35) and is known to interact with several signaling pathways as a ligand for the steroid xenobiotic receptor or for sex hormone–binding globulin, which is involved in steroid signaling (33). Phenylalanine and tryptophan are aromatic amino acids (AAA, correlated in our study; R = 0.27), and high blood levels of AAAs have been considered an inflammatory marker in patients with liver cirrhosis (37). Disruptions in AAA and BCAA levels are found in several pathologies with chronic low-grade inflammation, such as obesity and diabetes (38, 39); in our study, valine (BCAA) was highly positively correlated with AAAs (phenylalanine; R = 0.47 and tryptophan; R = 0.24). Data on the kynurenine/tryptophan ratio, unfortunately unavailable in this study, would be useful to support this hypothesis, as it reflects IDO1 gene expression, which increases in situations of chronic low-grade inflammation. 5-aminovaleric acid is a catabolic product of lysine; its increase could reflect either cellular necrosis or microbiota metabolism. In mice, higher urinary levels of 5-aminovaleric acid were associated with intestinal inflammation (40). Among the identified metabolites, the second PCA-derived factor highlighted in our study could represent an inflammatory pattern, as it was notably characterized by higher plasma levels of phenylalanine, valine, 5-aminovaleric acid, and tryptophan. Our findings could also reflect an influence of gut microbiota, because O-succinyl-homoserine is a metabolite exclusively synthesized by bacteria. 
Another metabolite synthesized by gut microbiota, 2-amino-4-cyano butanoic acid, was associated with increased breast cancer risk in this study before FDR correction only [OR = 1.28 (1.03–1.58); P = 0.02; FDR = 0.3; data not tabulated]. The variation of these two microbiota-related metabolites could be interpreted as an oxidative stress response, given their potential role in glutamate and glutathione synthesis. Indeed, human metabolism can produce cysteine (one component of glutathione) using O-succinyl-homoserine instead of homocysteine (41). 2-amino-4-cyano butanoic acid is the substrate of nitrilase, which is responsible for the de novo synthesis of glutamate in bacteria and fungi (42). Moreover, it is also involved in the synthesis of heme, which is recognized as an oxidative biocompound (43). Likewise, the increase in plasma glutamine and γ-glutamyl-threonine (positively correlated in our study) may suggest both a decrease in antioxidant defenses and an oxidative stress response (γ-glutamyl-dipeptides may indicate increased glutathione synthesis). The data available in this study do not allow firm conclusions to be drawn about these mechanistic interpretations, which need further investigation for their validation, such as complementary analyses of variations in other metabolites involved in the same pathways (but not detected with the current analytical technique), fluxomic analyses to determine the rates of metabolic reactions, or lipidomic and proteomic analyses to cover a wide range of larger molecules.
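To illustrate the distinction drawn above between a nominal P value and its FDR-adjusted counterpart, the following minimal Python sketch applies the Benjamini-Hochberg step-up adjustment to a few made-up P values. This section does not name the FDR procedure that was used, so Benjamini-Hochberg is an assumption here, and the function name and input values are hypothetical rather than taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order.

    Assumption: this is one common FDR procedure, not necessarily the one
    applied in the study. `p_values` is any list of nominal P values.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values
    # never decrease as the nominal P value increases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical nominal P values for four ions (illustrative only).
nominal = [0.001, 0.02, 0.03, 0.5]
adjusted = benjamini_hochberg(nominal)
for p, q in zip(nominal, adjusted):
    print(f"P = {p:.3f} -> adjusted = {q:.3f}")
```

With many ions tested at once, an association that is nominally significant can exceed the chosen FDR threshold after this kind of adjustment, which is the situation described for 2-amino-4-cyano butanoic acid.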
The major strengths of this study were the fully untargeted MS metabolomics approach and the prospective design. However, several limitations should be acknowledged. First, the relatively small sample size was probably sufficient for the main analyses, but it was certainly limited for secondary stratified analyses. For instance, a larger sample size could have helped to investigate more precisely the relation between the metabolites and the time before cancer development (e.g., short, medium, and long term). Second, we cannot rule out the possibility of residual or unmeasured confounding despite the large spectrum of covariates taken into account, and we cannot exclude the possibility of overfitting of the models. Third, because the sample size did not allow a validation set to be set aside, we did not compute predictive performance measures such as AUC. However, the main interest of our study lay in a better understanding of breast cancer etiology rather than in an improvement in discrimination models. Fourth, only LC/MS and not gas chromatography–mass spectrometry was applied. This may have limited our ability to detect some categories of metabolites that could have supported certain mechanistic interpretations. Moreover, as the metabolic profiles were established a decade before diagnosis, some small variations may not have been detected. Some of the metabolites highlighted in our study are at the crossroads of several metabolic pathways; thus, the mechanistic interpretations drawn from the literature should be considered with caution and need further investigation to be validated. Other interpretations could also be considered. Furthermore, this metabolomic analysis was based on a single blood draw, so we could not investigate the stability of the profiles. Nevertheless, several studies showed good stability and reproducibility of metabolomic measurements for most metabolites (44, 45). 
Finally, as is usually the case in a fully untargeted metabolomic approach, several associated ions could not be identified to date. However, these findings will be useful in the future, given the rapid development of metabolite databases, and for comparison with other future LC/MS studies.
In conclusion, this prospective study highlighted associations between baseline untargeted metabolomic profiles and subsequent long-term breast cancer risk. These findings generate mechanistic hypotheses regarding the etiology of this complex disease, which are consistent with those reported in our NMR analysis and include a possible involvement of gut microbiota activity and environmental contamination/exposure. If replicated in other independent prospective studies, and after quantification of key metabolites to set discrimination thresholds, these results might contribute to developing screening strategies to identify, before tumor onset, women at higher risk of developing breast cancer in the decade following the blood draw.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Disclaimer
The funders had no role in the design, analysis, or writing of this article.
Authors' Contributions
Conception and design: L. Lécuyer, M. Touvier
Development of methodology: L. Lécuyer, E. Pujos-Guillot, M. Touvier
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): B. Lyan, M. Petera, D. Centeno, P. Galan, S. Hercberg, E. Kesse-Guyot, M. Touvier, M. Lagree
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): L. Lécuyer, C. Dalle, B. Lyan, A. Demidem, A. Rossary, M.-P. Vasson, M. Petera, D. Centeno, M. Deschasaux, V. Partula, B. Srour, P. Latino-Martel, N. Druesne-Pecollo, S. Durand, E. Pujos-Guillot, M. Touvier
Writing, review, and/or revision of the manuscript: L. Lécuyer, C. Dalle, B. Lyan, A. Demidem, A. Rossary, M.-P. Vasson, M. Petera, T. Ferreira, D. Centeno, P. Galan, S. Hercberg, M. Deschasaux, V. Partula, P. Latino-Martel, E. Kesse-Guyot, N. Druesne-Pecollo, S. Durand, E. Pujos-Guillot, M. Touvier
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): L. Lécuyer, E. Pujos-Guillot, M. Touvier
Study supervision: L. Lécuyer, A. Demidem, P. Galan, S. Hercberg, M. Touvier
Acknowledgments
The authors thank Younes Esseddik, Frédéric Coffinieres, Thi Hong Van Duong, Paul Flanzy, Régis Gatibelza, Jagatjit Mohinder, and Maithyly Sivapalan (computer scientists), Rachida Mehroug and Frédérique Ferrat (logistic assistants), Nathalie Arnault, Véronique Gourlet, PhD, Fabien Szabo, PhD, Julien Allegre, and Laurent Bourhis (data-manager/statisticians), and Cédric Agaesse (dietitian) for their technical contribution to the SU.VI.MAX study. We also thank Nathalie Druesne-Pecollo, PhD (operational coordination), as well as all participants of the SU.VI.MAX study. The authors thank Dan Chaltiel for his help in R programming. This work was conducted in the framework of the French network for Nutrition And Cancer Research (NACRe network), www.inra.fr/nacre and received the NACRe Partnership Label. Metabolomic analysis was performed within the metaboHUB French infrastructure (ANR-INBS-0010). This work was supported by the French National Cancer Institute (grant no. INCa_8085 for the project, and PhD grant no. INCa_11323 for L. Lécuyer), the Federative Institute for Biomedical Research IFRB Paris 13, and the Cancéropôle Ile-de-France/Région Ile de France.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.