Background:

Diet has been recognized as a modifiable risk factor for breast cancer. Highlighting predictive diet-related biomarkers would be of great public health relevance to identify at-risk subjects. The aim of this exploratory study was to select diet-related metabolites discriminating women at higher risk of breast cancer using untargeted metabolomics.

Methods:

Baseline plasma samples of 200 incident breast cancer cases and matched controls, from a nested case–control study within the Supplémentation en Vitamines et Minéraux Antioxydants (SU.VI.MAX) cohort, were analyzed by untargeted LC-MS. Diet-related metabolites were identified by partial correlation with dietary exposures, and best predictors of breast cancer risk were then selected by Elastic Net penalized regression. The selection stability was assessed using bootstrap resampling.

Results:

595 ions were selected as candidate diet–related metabolites. Fourteen of them were selected by Elastic Net regression as breast cancer risk discriminant ions. A lower level of piperine (a compound from pepper) and higher levels of acetyltributylcitrate (an alternative plasticizer to phthalates), pregnene-triol sulfate (a steroid sulfate), and 2-amino-4-cyano butanoic acid (a metabolite linked to microbiota metabolism) were observed in plasma from women who subsequently developed breast cancer. This metabolomic signature was related to several dietary exposures such as a “Western” dietary pattern and higher alcohol and coffee intakes.

Conclusions:

Our study suggested a diet-related plasma metabolic signature involving exogenous compounds, steroid metabolites, and microbiota-related compounds associated with long-term breast cancer risk; this signature should be confirmed in large-scale independent studies.

Impact:

These results could help to identify healthy women at higher risk of breast cancer and improve the understanding of the relationship between nutrition and health.

Cancer is a multifactorial disease (1). Although a large proportion of cancers are explained by intrinsic factors, between 30% and 50% of all cancer cases are estimated to be preventable (2). In France in 2015, over 30% of diagnosed incident breast cancers, the most frequent cancer among women worldwide (3), were attributable to nutrition-related factors, including alcohol, physical inactivity (which contributes to excess adiposity), and weight status (4). Beyond these three factors, a protective role of other dietary factors in breast carcinogenesis has been suggested for nonstarchy vegetables [estrogen receptor–negative (ER) breast cancer; ref. 3], dairy products (among premenopausal women; ref. 3), foods containing carotenoids (3), foods high in calcium (3), or dietary fiber (5). In contrast, an increased risk has been suggested for some types of lipids, such as plasma levels of trans-fatty acids produced by industrial processing (ER breast cancer; ref. 6) and saturated fatty acid intake (7). Diet impacts both the endogenous metabolome and the food metabolome (i.e., the metabolites derived from ingested foods and their subsequent metabolism in the human body; ref. 8). Identifying a set of metabolites that are both (i) influenced by diet, and therefore modifiable through dietary interventions, and (ii) related to increased or decreased breast cancer risk would open new perspectives for prevention and potentially provide insights into the underlying mechanisms.

High-throughput technologies allow the exploration of thousands of molecules resulting from complex system-wide biological interactions. An in-depth untargeted investigation that is not hypothesis-driven holds promise for the discovery of new biomarkers. Compared with approaches focused on a restricted number of a priori–selected biomarkers, untargeted metabolomics can highlight combinations of metabolites (potentially acting with synergistic and antagonistic effects) associated with disease risk, which could improve the sensitivity and specificity of predictive models (9). Applied to the nutrition field, this approach may highlight breast cancer–related metabolomic signatures that combine exogenous metabolites resulting directly from dietary exposure with endogenous or microbial metabolites, and that therefore reflect the impact of diet on the metabolism of the host and its microbiota. So far, to our knowledge, only one previous study has used nutritional semiuntargeted metabolomics to examine diet-related serum metabolite associations with long-term breast cancer risk (10). That study highlighted 22 diet-related metabolites, mainly related to alcohol, vitamin E, and animal fat intakes, associated with breast cancer risk. However, it was conducted in postmenopausal women only, so its results cannot be generalized to premenopausal women. Moreover, unknown compounds were not considered in the statistical analyses, limiting the potential discovery of new biomarkers.

The aim of our study was to select a small subset of diet-related metabolites that best predicted long-term breast cancer risk, by applying an up-to-date statistical method particularly suited to large-scale omics data (Elastic Net regression) to untargeted mass spectrometry metabolomic data.

Study population

This study involved participants from the Supplémentation en Vitamines et Minéraux Antioxydants (SU.VI.MAX) prospective cohort (clinicaltrials.gov; NCT00272428), which initially aimed to investigate the effect of a daily antioxidant supplementation in nutritional doses on the incidence of cardiovascular diseases and cancers. This population-based, double-blinded, placebo-controlled, randomized trial was conducted over 8 years, and observational follow-up of health events was subsequently maintained for 5 years (13 years of follow-up in total). The study design and methods have been previously detailed (11, 12). A total of 13,017 participants were recruited in 1994–1995 and were invited to provide their written informed consent. The trial was approved by the Ethics Committee for Studies with Human Subjects of Paris-Cochin Hospital (CCPPRB 706/2364) and the “Commission Nationale de l'Informatique et des Libertés” (CNIL 334641/907094), and was conducted according to the Declaration of Helsinki guidelines. This work focused on a nested case–control study including SU.VI.MAX participants with a first incident invasive breast cancer diagnosed after baseline (N = 215), matched with controls (1:1 ratio, using the density sampling method; ref. 13) on the following baseline characteristics: age, menopausal status, body mass index (BMI), intervention group of the trial, smoking status, and season of blood draw. Details about the nested case–control study are presented in Supplementary Methods and the flowchart is presented in Fig. 1.

Figure 1.

Participant flowchart, selection of the SU.VI.MAX study.

Baseline data collection and case ascertainment

Baseline data collection and case ascertainment are described in Supplementary Methods. Briefly, at enrollment, participants were invited to complete self-administered questionnaires about sociodemographic characteristics, smoking status, medication use, health status, and family history of cancer, and underwent anthropometric measurements as well as a fasting blood draw. During the trial phase, participants were asked to complete computerized 24-hour dietary records every 2 months, including >990 food items (14). Dietary habits at baseline were estimated by averaging intakes from all dietary records collected during the first 2 years of participation in the SU.VI.MAX study. Self-reported health events were reviewed by a physician expert committee. Pathological reports were used to validate cases, and cancers were classified using the International Chronic Diseases Classification, 10th Revision, Clinical Modification (15).

Metabolomic analyses

Untargeted (rather than targeted) metabolomics was performed to discover new biomarkers and new potential metabolites associated with breast cancer risk. Analyses were performed on plasma samples (N = 430) following a slightly modified version of the procedure described by Pereira and colleagues (16). Profiling was conducted using high-performance liquid chromatography/mass spectrometry with the Metabolic profiler platform (Bruker Daltonique). To monitor the stability of the analytical system, at the beginning of each sequence a blank sample was injected three times to equilibrate the column, followed by a QC sample (pooled participants' plasma samples). One QC sample was then injected after each set of 12 participants' plasma samples. All samples (blank, QC, and plasma samples) were analyzed using the same analytical method. Details on chemicals and reagents, biological sample preparation, metabolic profiling, raw data extraction, quality controls, and metabolite identification are available in Supplementary Methods and Supplementary Table S1. Data were processed on the Galaxy web-based platform Workflow4Metabolomics (https://workflow4metabolomics.org/), first using the XCMS module for peak detection, followed by quality checks and signal drift correction, to yield a data matrix containing variables (retention times, masses) and peak intensities corrected for batch effects. After a quick review of all chromatograms, some individuals were excluded because of problems during sample preparation, multiple ions with null intensity, or contamination visible on chromatograms (a few serum collecting tubes were contaminated with a PEG-like material), leaving 200 case and 200 control samples for both positive and negative mode analyses. Highly correlated ions (correlation threshold at 0.9) from the same metabolite within the same retention time cluster were removed using the metabolite correlation analysis (MCA) Galaxy module. After these processing steps, 528 and 690 ions (detected in positive and negative ionization modes, respectively) from the plasma metabolome remained in the datasets for statistical analyses. This metabolomic discovery approach allows semiquantitative measurements, representing relative ion intensities (no absolute concentration). However, linearity between ion intensities and their concentration levels was previously validated and the matrix effect was previously studied (16). In particular, Pearson correlation was checked between ion intensities and sample dilutions.
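The injection order described above (three blanks, an opening QC, then one QC after each set of 12 plasma samples) can be sketched as follows. This is a minimal illustration, not the authors' acquisition software: the function name, the labels, and the 24-sample sequence length are hypothetical, and the handling of an incomplete final set is an assumption.

```python
def build_injection_sequence(n_samples, n_blanks=3, qc_every=12):
    """Return injection labels for one analytical sequence.

    Assumed layout: n_blanks blank injections, one opening QC, then the
    plasma samples with one QC after every complete set of qc_every samples.
    """
    sequence = ["blank"] * n_blanks + ["QC"]
    for i in range(1, n_samples + 1):
        sequence.append(f"sample_{i}")
        if i % qc_every == 0:
            sequence.append("QC")  # QC closes each complete set
    return sequence


# Hypothetical sequence holding 24 plasma samples
seq = build_injection_sequence(24)
print(seq[:5], "...", len(seq), "injections")
```

Regular QC injections of this kind are what make the later signal drift and batch effect corrections possible, since the pooled QC should give the same signal throughout a sequence.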

Statistical analysis

Participants' baseline characteristics were compared between cases and controls using conditional logistic regressions.

To cover different aspects of diet, dietary exposure was assessed using four complementary methods. First, two dietary scores were computed: the mPNNS-GS (modified “Program National Nutrition Santé-Guideline Score”; ref. 17), reflecting adherence to the 2001 French nutritional recommendations and including 12 components, eight referring to food-serving recommendations and four referring to moderation in consumption; and the DQI-I (Diet Quality Index-International; ref. 18), including four components (variety, adequacy, moderation, and overall balance) with adapted cutoff values corresponding to French recommendations (19). For both scores, higher values represent a higher dietary quality. Then, we computed the average daily intake of 74 specific food groups (g/day) by considering several parameters such as the level of 24-hour recalls, the food groups of official French dietary guidelines and the current knowledge about nutritional biomarkers of dietary consumption. Finally, a principal component analysis (PCA) with varimax rotated factors by orthogonal transformation was performed. This PCA highlighted two dietary patterns representing a “Western diet” (mainly characterized by higher intakes of alcohol, bread, processed and red meat, animal fat, cheese, and potatoes) and a “Healthy diet” (with higher intakes of fruits, vegetables, whole grain, yoghurt, and vegetable oil). Details of the computation of these dietary exposures are presented in Supplementary Methods and Supplementary Tables S2 and S3. Two statistical analyses were performed independently: correlation analysis was used to select ions related to diet (first step), whereas Elastic Net regression was used to select breast cancer risk–related ions (second step). In this broad exploratory approach, dietary factors (either new or already suggested) were considered together in the same analysis, and the metabolites that were the best predictors of breast cancer risk were selected.

Identification of ions potentially associated with diet

First, we estimated correlations between the 1,218 ions (both positive and negative modes) and dietary exposures using partial Spearman correlations adjusted for potential confounding factors specifically associated with diet, that is, age, BMI, menopausal status, smoking status, season at time of blood draw, number of 24-hour dietary records, and mean daily energy intake during the 2 years following the blood draw. Given that ions were tested individually in the models, no normalization was previously performed. In this exploratory study, the aim of this first step was to collect as many candidate ions potentially associated with diet as possible; thus, all ions associated with diet at the threshold of P < 0.05 and with an absolute correlation |r| > 0.15 were selected for further analysis. Benjamini–Hochberg (BH) correction (20) was only tested before this selection (on 1,218 ions).
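As an illustration of this first step, a partial Spearman correlation can be obtained by rank-transforming every variable, regressing the ranks of the ion and of the dietary exposure on the ranks of the confounders, and correlating the residuals. The sketch below uses only the Python standard library; the function names are hypothetical, it assumes numerically coded confounders, and it does not reproduce the authors' software or compute P values (the P value is passed in to the selection rule).

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def residuals(y, covariates):
    """Residuals of y after least-squares regression on intercept + covariates."""
    X = [[1.0] + [cov[i] for cov in covariates] for i in range(len(y))]
    p = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    beta = solve(xtx, xty)
    return [yi - sum(bj * xj for bj, xj in zip(beta, row)) for row, yi in zip(X, y)]


def pearson(u, v):
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den


def partial_spearman(x, y, covariates):
    """Spearman correlation of x and y adjusted for the listed covariates."""
    rx, ry = average_ranks(x), average_ranks(y)
    rcov = [average_ranks(c) for c in covariates]
    return pearson(residuals(rx, rcov), residuals(ry, rcov))


def is_candidate(r, p_value, r_min=0.15, alpha=0.05):
    """Selection rule from the text: P < 0.05 and |r| > 0.15."""
    return p_value < alpha and abs(r) > r_min


# Toy data (not study data): ion intensity, dietary exposure, one confounder
ion = [2, 1, 4, 3, 6, 5]
diet = [1, 3, 2, 5, 4, 6]
age = [1, 2, 3, 4, 5, 6]
print(partial_spearman(ion, diet, []), partial_spearman(ion, diet, [age]))
```

In the toy data the unadjusted correlation is positive while the adjusted one is strongly negative, which shows why adjustment for confounders can change which ions pass the |r| > 0.15 filter.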

Subset selection of diet-related breast cancer risk discriminant ions

Among the ions potentially associated with diet highlighted in the first step, the best subset of predictors of breast cancer risk was selected using penalized logistic regression with optimized α and λ parameters (no statistical test was performed in this procedure). As metabolomic data are highly correlated, we used the Elastic Net method (21) implemented in the R glmnet package. This method allows variable selection by forcing the coefficients of the less predictive variables to be exactly zero. Our model considered all diet-related ions simultaneously, and the following confounding factors specifically associated with breast cancer risk were included as unpenalized explanatory variables: age, BMI, season, menopausal status, smoking status, height, physical activity, education level, alcohol intake, use of hormone replacement therapy for menopause, number of children and family history of breast cancer at blood draw (baseline), and intervention group of the initial SU.VI.MAX trial after dietary data collection. Because all diet-related ions were considered simultaneously in this second step, their intensities were first unit-variance scaled. The Elastic Net penalization parameters α and λ (α = 0 implies no variable selection; α = 1 is equivalent to LASSO regression; λ defines the strength of regularization) were optimized using 5-fold cross-validation repeated 100 times, to account for the variance of cross-validation. Details on these steps are given in the legend of Fig. 2.
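For reference, the penalized logistic regression described above can be written as the following objective, in the parameterization documented for the glmnet package. The notation is ours, not the authors': $y_i$ is case status, $x_i$ the vector of scaled ions and confounders for participant $i$, and $v_j$ a per-variable penalty factor, assumed here to be 0 for the confounders (unpenalized) and 1 for the ions.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{N}\sum_{i=1}^{N}\Big[\, y_i\,\eta_i - \log\!\big(1 + e^{\eta_i}\big) \Big]
  \;+\; \lambda \sum_{j=1}^{p} v_j \Big[\, \alpha\,\lvert\beta_j\rvert
  + \tfrac{1-\alpha}{2}\,\beta_j^{2} \Big],
\qquad \eta_i = \beta_0 + x_i^{\top}\beta
```

With $\alpha = 1$ only the absolute-value term remains (LASSO), and with $\alpha = 0$ only the squared term remains (ridge, no coefficient set exactly to zero), which matches the roles of α and λ given in the text.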

Figure 2.

Selection frequency (%) of ions selected in ≥40% of 1,000 bootstraps via Elastic Net regression. The Elastic Net regression model considered all diet-related ions simultaneously, and the following confounding factors were included as unpenalized explanatory variables: age (continuous), BMI (continuous), season (a priori–defined periods: October–November/December–January–February/March–April–May), menopausal status (pre/postmenopause status), smoking status (current, former, and nonsmokers), height (continuous), physical activity (low, moderate, intense), education level (primary, secondary, superior), alcohol intake (continuous), use of hormone replacement therapy for menopause, number of children and family history of breast cancer at blood draw (baseline), and intervention group of the initial SU.VI.MAX trial (placebo/supplemented). The Elastic Net parameters α and λ were optimized using 5-fold cross-validation (assigning both members of each case/control pair to the same fold) repeated 100 times, to account for the variance of cross-validation. After α had been optimized over a grid of 0.5 to 0.9 (α = 0 implies no variable selection; α = 1 is equivalent to LASSO regression) to get enough sparsity and still control for multicollinearity, λ (defining the strength of regularization) was also optimized by minimizing the mean deviance. Over the cross-validations of the Elastic Net regression models, the most frequent optimal α value was 0.5 and the corresponding average optimum λ value was 0.09. To determine the stability of ions selected on the original dataset, we applied a bootstrap resampling Elastic Net method (22) by repeating the selection process 1,000 times on resampled data and recording the percentage of times each ion was selected (using the optimized parameter α = 0.5).
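The fold assignment mentioned in this legend, where both members of a matched case/control pair fall in the same cross-validation fold, can be sketched as below. The function name, the seed handling, and the data layout (case followed by its control) are hypothetical choices for illustration only.

```python
import random


def paired_folds(pair_ids, n_folds=5, seed=0):
    """Return one fold index per sample; samples sharing a pair ID share a fold."""
    rng = random.Random(seed)
    unique_pairs = sorted(set(pair_ids))
    rng.shuffle(unique_pairs)
    # Deal the shuffled pairs out to folds in turn so fold sizes stay balanced
    fold_of_pair = {pair: k % n_folds for k, pair in enumerate(unique_pairs)}
    return [fold_of_pair[pair] for pair in pair_ids]


# Hypothetical layout: 200 matched pairs, stored as case then control
pair_ids = [pair for pair in range(200) for _ in range(2)]

# 100 repetitions of the 5-fold split, each with a different shuffle
repeats = [paired_folds(pair_ids, n_folds=5, seed=s) for s in range(100)]
print(len(repeats), "fold assignments of", len(repeats[0]), "samples")
```

Each of these fold vectors could then be passed to the cross-validation routine of choice, so that a case is never used to predict its own matched control.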

The Spearman correlation matrix of the selected metabolites was computed. After Elastic Net selection, biological plausibility and plots of (ion intensity) × (dietary exposure) were checked. To determine the stability of the selected ions, we applied a bootstrap resampling Elastic Net method (22) by repeating the selection process 1,000 times on resampled data and recording the percentage of times each ion was selected. The direction of the associations (OR) for the selected ions was estimated using logistic regression with all the selected ions in the same model, adjusted for the confounding factors cited above. Although P values are generated by this model, they cannot be used for inference given the prior Elastic Net–based selection. Complementary analyses were performed using logistic regression models for each selected ion, adjusted for the same confounding factors, to check the associations between each individual ion and breast cancer risk.
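The bootstrap stability assessment can be illustrated with a generic sketch: resample participants with replacement, rerun the selection on each resample, and report how often each variable is retained. Here `select` stands in for the Elastic Net selection step and is replaced by a deliberately trivial toy rule; all names and data are hypothetical, and resampling individual rows (rather than matched pairs) is an assumption, since the text does not specify it.

```python
import random
from collections import Counter


def selection_frequency(rows, select, n_boot=1000, seed=0):
    """Percentage of bootstrap resamples in which each variable is selected."""
    rng = random.Random(seed)
    counts = Counter()
    n = len(rows)
    for _ in range(n_boot):
        # Resample n rows with replacement
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        counts.update(set(select(sample)))
    return {name: 100.0 * c / n_boot for name, c in counts.items()}


def toy_select(sample, threshold=0.5):
    """Stand-in selector: keep variables whose mean exceeds a threshold."""
    names = sample[0].keys()
    return [v for v in names if sum(r[v] for r in sample) / len(sample) > threshold]


# Toy data (not study data): one always-high, one always-low, one borderline ion
rows = [{"ion_a": 1.0, "ion_b": 0.0, "ion_c": float(i % 2)} for i in range(40)]
freq = selection_frequency(rows, toy_select, n_boot=200)
print(freq)
```

A variable with a high selection frequency is retained regardless of which participants happen to be drawn, which is the sense in which frequencies such as ≥40% or >70% are read as stability indicators.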

A flow chart of the different statistical steps is shown in Fig. 3.

Figure 3.

Flowchart of statistical analyses and summary of results.

Because of the untargeted approach, only ions of interest were annotated according to the procedure described in Supplementary Methods. As proposed by Sumner and colleagues (23), the metabolites were classified according to levels of confidence in the identification process: identified (level 1), putatively annotated (level 2), putatively characterized compound classes (level 3), and unknown compound (level 4).

Analyses were performed using SAS (v9.3, Cary, NC) and R (v3.5.2) software.

Baseline characteristics of breast cancer cases and controls in the study population are summarized in Table 1. Among the 1,218 detected ions, 595 were selected as candidate diet-related ions and were considered in the penalized logistic regression analyses (associations between these ions and dietary exposure and FDR values are presented in Supplementary Table S4, N = 1,085).

Table 1.

Baseline characteristics of breast cancer cases and controls^a, SU.VI.MAX cohort, France (1994–2007).

| Characteristic | Breast cancer cases (N = 200) | Controls (N = 200) | P^b |
|---|---|---|---|
| Age at baseline (y) | 48.8 ± 5.8 | 48.7 ± 5.9 | 0.4 |
| BMI (kg/m²) | 23.1 ± 3.9 | 23.5 ± 4.2 | 0.08 |
| | | | Not applicable |
|    <18.5 kg/m² (underweight) | 8 (4) | 6 (3) | |
|    ≥18.5–<25 kg/m² (normal weight) | 143 (71.5) | 145 (72.5) | |
|    ≥25 kg/m² (overweight) | 49 (24.5) | 49 (24.5) | |
| Height (cm) | 162.9 ± 6.1 | 160.9 ± 5.9 | 0.001 |
| Intervention group | | | Not applicable |
|    Placebo | 99 (49.5) | 99 (49.5) | |
|    Antioxidants | 101 (50.5) | 101 (50.5) | |
| Smoking status | | | Not applicable |
|    Never and former | 159 (79.5) | 159 (79.5) | |
|    Current smoker | 41 (20.5) | 41 (20.5) | |
| Physical activity | | | 0.8 |
|    Irregular | 64 (32) | 58 (29) | |
|    <1 h/d walking equivalent | 63 (31.5) | 67 (33.5) | |
|    ≥1 h/d walking equivalent | 73 (36.5) | 75 (37.5) | |
| Educational level | | | 0.1 |
|    Primary | 35 (17.5) | 43 (21.5) | |
|    Secondary | 75 (37.5) | 86 (43) | |
|    Superior | 90 (45) | 71 (35.5) | |
| Number of biological children | 1.9 ± 1.2 | 2 ± 1.2 | 0.3 |
| Hormonal treatment for menopause (yes) | 69 (34.5) | 70 (35) | 0.9 |
| Menopausal status at baseline | | | Not applicable |
|    Premenopausal | 127 (63.5) | 127 (63.5) | |
|    Postmenopausal | 73 (36.5) | 73 (36.5) | |
| Menopausal status at diagnosis | | | Not applicable |
|    Premenopausal | 77 (38.5) | 77 (38.5) | |
|    Postmenopausal | 123 (61.5) | 123 (61.5) | |
| Family history of breast cancer^c (yes) | 34 (17) | 21 (10.5) | 0.05 |
| Alcohol intake (g/day) | 10.8 ± 11.2 | 11.6 ± 13.3 | 0.5 |
| Month of blood draw | | | Not applicable |
|    March–April–May | 74 (37) | 74 (37) | |
|    October–November | 30 (15) | 31 (15.5) | |
|    December–January–February | 96 (48) | 95 (47.5) | |

Abbreviation: BMI, body mass index.

^a Values are means ± SDs or n (%).

^b P value for the comparison between breast cancer cases and controls using conditional logistic regression. Not applicable for matching factors except for more precise variables (e.g., age and BMI).

^c Among first-degree female relatives.

Fourteen ions were selected by the Elastic Net penalized regression. Among these, two were identified with a high level of confidence in annotation: piperine (M286T989) and acetyltributylcitrate (ATBC; M425T1158); two were putatively annotated: 2-amino-cyano-butanoic acid (M153T116) and pregnene-triol sulfate (M413T967); and the others were unknown compounds (M192T181, M265T186, M335T864, M364T125, M97T134, M166T144, M201T1091, M415T1344, M475T122, and M587T121). Identification details of the 14 ions selected by Elastic Net are presented in Supplementary Methods. The Spearman correlation matrix of these ions is provided in Supplementary Table S5. 2-amino-cyano-butanoic acid was highly correlated (P < 0.0001) with 3 unknown compounds, including 2 NaCOOH adducts.

In particular, piperine was positively associated with alcoholic drinks intake (r = 0.19) and with a “Western” dietary pattern (r = 0.22); ATBC was positively associated with coffee intake (r = 0.17); 2-amino-cyano-butanoic acid was negatively associated with cake and biscuit intake (r = −0.16); and pregnene-triol sulfate was positively associated with alcoholic drinks intake (r = 0.17). The unknown compounds were associated with several dietary exposures such as pasta and cereals, salty products, processed meat, tomatoes, citrus fruit, and pressed cooked cheese intakes (see Supplementary Table S4). Table 2 displays the associations between the 14 selected ions and breast cancer risk from adjusted logistic regression, including the 14 ions all together or one ion at a time. Lower levels of piperine and 6 unknown compounds (M335T864, M364T125, M166T144, M415T1344, M475T122, and M587T121) and higher levels of 2-amino-4-cyano butanoic acid, ATBC, pregnene-triol sulfate, and 4 unknown compounds (M192T181, M265T186, M97T134, and M201T1091) were found in plasma from women who subsequently developed breast cancer during follow-up.

Table 2.

Direction of the variations of the 14 selected ions from adjusted logistic regression models^a, SU.VI.MAX cohort, France (1994–2007).

| Ions (annotation^b) | Mass (m/z)/retention time (min) | Mode of detection | All selected ions together: OR (95% CI) | P^c | Selected ions one by one: OR (95% CI) | P^c |
|---|---|---|---|---|---|---|
| M153T116 (level 2: 2-amino-cyano-butanoic acid) | 153.0162/1.93 | ESI Positive | 1.23 (0.93–1.62) | 0.1 | 1.08 (1.03–1.14) | 0.002 |
| M192T181 (level 4: molecular formula; C9H6O4N) | 192.0305/3.01 | ESI Positive | 1.37 (1.05–1.79) | 0.02 | 1.08 (1.03–1.13) | 0.003 |
| M265T186 (level 4: unknown) | 265.3932/3.1 | ESI Positive | 1.28 (0.93–1.76) | 0.1 | 1.09 (1.04–1.14) | 0.0008 |
| M286T989 (level 1: piperine) | 286.143/16.48 | ESI Positive | 0.76 (0.58–0.99) | 0.04 | 0.94 (0.89–0.99) | 0.01 |
| M335T864 (level 4: unknown) | 335.2404/14.4 | ESI Positive | 0.77 (0.61–0.99) | 0.04 | 0.94 (0.89–0.98) | 0.009 |
| M364T125 (level 4: NaCOOH adduct) | 363.9292/2.09 | ESI Positive | 0.74 (0.58–0.94) | 0.01 | 0.93 (0.89–0.98) | 0.007 |
| M425T1158 (level 1: ATBC) | 425.2172/19.31 | ESI Positive | 1.39 (1.08–1.79) | 0.01 | 1.07 (1.02–1.12) | 0.006 |
| M97T134 (level 4: NaCOOH adduct) | 96.9218/2.24 | ESI Positive | 1.23 (0.94–1.60) | 0.1 | 1.10 (1.04–1.15) | 0.0003 |
| M166T144 (level 4: unknown) | 166.0387/2.4 | ESI Negative | 0.80 (0.62–1.03) | 0.09 | 0.94 (0.89–0.98) | 0.01 |
| M201T1091 (level 4: unknown) | 201.149/18.18 | ESI Negative | 1.21 (0.94–1.57) | 0.1 | 1.10 (1.04–1.15) | 0.0003 |
| M413T967 (level 2: pregnene-triol sulfate) | 413.2002/16.11 | ESI Negative | 1.40 (1.08–1.84) | 0.01 | 1.08 (1.03–1.14) | 0.004 |
| M415T1344 (level 4: unknown) | 415.2078/22.4 | ESI Negative | 0.83 (0.64–1.07) | 0.2 | 0.92 (0.87–0.96) | 0.0006 |
| M475T122 (level 4: unknown) | 474.7265/2.04 | ESI Negative | 0.89 (0.70–1.14) | 0.4 | 0.93 (0.88–0.97) | 0.003 |
| M587T121 (level 4: unknown) | 586.6522/2.02 | ESI Negative | 0.69 (0.54–0.89) | 0.004 | 0.93 (0.89–0.98) | 0.005 |

Abbreviation: CI, confidence interval.

^a The principal logistic regression model considered all 14 selected ions at the same time. Logistic regression models from complementary analyses considered the selected ions separately (one model for each of the 14 ions). All these models were adjusted for age (continuous), BMI (continuous), season (a priori–defined periods: October–November/December–January–February/March–April–May), menopausal status (pre/postmenopause status), smoking status (current, former, and nonsmokers), height (continuous), physical activity (low, moderate, intense), education level (primary, secondary, superior), alcohol intake (continuous), use of hormone replacement therapy for menopause, number of children and family history of breast cancer at blood draw (baseline), and intervention group of the initial SU.VI.MAX trial (placebo/supplemented). Tests for linear trend were performed using the continuous variables. ORs are presented for a 1 SD increase of the continuous variable (semiquantification).

^b Levels of confidence for every identification were given according to Sumner and colleagues (23): level 1, formally identified compound (confirmed with analysis of authentic standard); level 2, putatively identified compound (based upon spectral similarity with public/commercial spectral libraries or reference compound in the literature and/or physicochemical properties); level 3, putatively characterized compound classes; and level 4, unknown compound. Among the unknown compounds, two ions (but not all of them) were NaCOOH adducts. The parent ions of the two adduct ions resulting from Elastic Net regression were not detected because data were acquired in positive and negative ion modes with a scan range from 50 to 1,000 mass-to-charge ratio (m/z). To observe the parent ions, it would be necessary to acquire data with a lower mass range, but this is not achievable with this kind of instrument.

^c Although P values are generated by this model, they cannot be used for inference given the prior Elastic Net–based selection.
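As a reading aid for Table 2, the odds ratio per 1 SD increase and its confidence interval relate to the logistic regression coefficient as follows. This assumes the usual Wald-type interval, which the text does not state explicitly; $\hat{\beta}_j$ and $\widehat{\mathrm{SE}}$ denote the fitted coefficient of the unit-variance scaled ion $j$ and its standard error.

```latex
\mathrm{OR}_j = \exp\!\big(\hat{\beta}_j\big),
\qquad
95\%\ \mathrm{CI} = \Big[\exp\!\big(\hat{\beta}_j - 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_j)\big),\;
                         \exp\!\big(\hat{\beta}_j + 1.96\,\widehat{\mathrm{SE}}(\hat{\beta}_j)\big)\Big]
```

An OR below 1 therefore corresponds to a negative coefficient (lower ion intensity in future cases) and an OR above 1 to a positive coefficient.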

Concerning the stability of the 14 ions selected on the original data, their selection frequencies were ≥50% over the 1,000 bootstraps, except for 2-amino-cyano-butanoic acid (27%). They were even >70% for piperine (M286T989), ATBC (M425T1158), pregnene-triol sulfate (M413T967), and 8 unknown compounds (see Fig. 2, which presents the ions with a selection frequency ≥40% over 1,000 bootstraps). Further investigation was carried out to annotate ions with selection frequencies >70% that were not selected on the original dataset (i.e., M114T115, M498T1040, M230T171, M423T120, M454T1064, M196T953, M485T1437, M491T120, and M581T123); however, none of them could be identified. The associations of these ions with dietary exposures are available in Supplementary Table S4 and with breast cancer risk (from adjusted logistic regression) in Supplementary Table S6.

A summary of the results is shown in Fig. 3.

This exploratory study used untargeted metabolomics coupled with multivariable penalized regression to screen for a limited set of ions that were potentially associated with various dietary exposures and that maximized breast cancer risk discrimination.

Comparison with the literature remains difficult because of the large diversity of study designs, types of biofluid used, and statistical analyses performed. Nevertheless, the magnitude of the dietary associations highlighted in our study seems comparable with that of similar studies based on Food Frequency Questionnaires (FFQ), such as Playdon and colleagues (10), and seems relatively low for dietary patterns and nutritional scores, as in Playdon and colleagues (24). However, some associations with individual foods such as citrus and coffee appear much stronger than those with other dietary exposures and could correspond to metabolites derived directly from the consumption of these foods. For instance, several correlations between ions and coffee intake were above 0.4, including one at 0.54, which seems higher than in some FFQ-based studies such as that of Guertin and colleagues (25), despite their use of serum, which is more concentrated in metabolites than plasma (26). Few studies have investigated the associations between diet-related metabolomic signatures and cancer risk. They observed associations between metabolites and, on the one hand, nutritional exposures (in particular coffee, alcohol, fiber, vitamin E, and fried food intakes, BMI, and physical activity) and, on the other hand, risk of hepatocellular carcinoma (HCC; 27–29), colorectal cancer (30, 31), and, in only two studies, breast cancer (10, 32). In particular, most of these studies found metabolites related to alcohol intake that were associated with breast cancer (10) or HCC (27–29) risk. In our study, among the 14 selected ions, 3 were positively associated with alcohol intake (M286T989, M335T864, M413T967), including piperine and pregnene-triol sulfate. Moreover, the ion M230T171, frequently selected during the bootstrap resampling but not selected on the original dataset, was also associated with alcoholic drink intake. 
Compared with the literature, some associations were newly identified in this study (such as the positive associations of ATBC with coffee intake and breast cancer risk), whereas several others were not replicated. These differences across studies are probably explained by differences in analytical techniques, study design, and statistical approaches, as well as by study populations with heterogeneous underlying diets and by differing cancer sites.

In this study, piperine, an exogenous active alkaloid with no endogenous origin reported so far, was highlighted as a potential predictor of breast cancer risk (inverse association), with high stability across penalized models (selection frequencies >90% over the 1,000 bootstraps). Piperine is contained in black and long pepper (33) and is used as a feed additive in animal feed, in particular for poultry (34). Human exposure to piperine via this latter source has been estimated at 0.93 μg/metabolic body weight per day (34). Several animal or cellular studies have suggested a promising spectrum of properties for piperine, such as antioxidant (35, 36), anti-inflammatory (33, 35, 36), immunomodulatory, bioavailability- and absorption-promoting for many active molecules (33, 37), antiasthmatic, anticonvulsant, antimutagenic, antimycobacterial, and anticancer activities (ref. 33; chemopreventive properties, including inhibition of angiogenesis and increased cell apoptosis), especially in breast cancer models (38–40). In our study, piperine was in particular positively associated with alcohol intake and with a “Western” dietary pattern. These associations are probably explained by the fact that several foods containing piperine, either as a feed additive or via pepper (e.g., processed meat, sauces, industrial cheese, poultry), are either part of a “Western” diet or consumed together with alcoholic drinks. The association with alcohol intake could also be explained by the higher solubility of alkaloids in ethanol (41). Consistent with our findings, Playdon and colleagues (10) found a positive association between piperine and liquor intake, together with a decreased risk of breast cancer. A direct association between piperine and wine intake was also found in the blood of female twins (42). However, the origin of circulating piperine is not restricted to Western-type foods and can also result from adding black pepper to food (e.g., for salad or fish seasoning). 
Unfortunately, the level of detail of the SU.VI.MAX dietary questionnaire did not allow us to estimate pepper intake.

ATBC is an alternative plasticizer to phthalates (43) that is commonly used in polyvinyl resins and permitted as a food additive and food contact substance (44). Migration of ATBC from food packaging material into food has been observed for cheese, wrapped cake, microwaved soup, and microwaved peanut-containing cookies (44, 45), and its leaching rate from medical equipment was found to be 10 times faster than that of the potent endocrine disruptor di-2-ethylhexyl phthalate (44, 46, 47). In our study, we found a positive association between plasma ATBC and coffee intake. This association should be confirmed in independent human observational studies and in vivo or in vitro animal intervention studies; however, it may reflect contaminant migration from plastic cups into coffee. Milk added to coffee may facilitate this migration, because one study suggested that ATBC is prone to migrate into protein-containing liquids, such as aqueous skim milk solution (48). Its increase in plasma from women who subsequently developed breast cancer could also come from other exposures that we were unable to detect in this study. Recent studies found potential biological activity of ATBC on tissue growth (49) and a potential disruption of ovarian function in female mice exposed to ATBC at a low dosage imitating human exposure (50).

Currently, alcohol intake is the only dietary risk factor for breast cancer established with strong evidence (3). Some underlying mechanisms have already been described; however, others remain to be elucidated (3). In our study, pregnene-triol sulfate, a steroid sulfate hormone belonging to the progestin family, was positively associated with both alcohol intake and breast cancer risk. Consistent with our results, a recent metabolomic study found positive correlations between alcohol intake and several serum steroids, notably pregnene-diol sulfate, which were associated with an increased breast cancer risk (10). Moreover, increased levels of sex steroids seem strongly associated with risk of postmenopausal breast cancer (51). Alcohol intake may increase circulating levels of steroid hormones, which could affect susceptibility to transformation or promote cancer growth (52). An interventional study in postmenopausal women showed that alcohol consumption increased the serum level of dehydroepiandrosterone (DHEA) sulfate (53), a precursor to androstenedione, testosterone, and, ultimately, estrone and estradiol. Furthermore, the pregnene-triol sulfate found in our study may derive from 17-hydroxy-pregnenolone, which can produce DHEA by losing its side chain. Other factors could influence the level of sex steroids, such as BMI and lactation (3). Furthermore, it has been shown that ATBC strongly activates the human and rat Steroid and Xenobiotic Receptor (SXR) and may alter the metabolism of endogenous steroid hormones (43).

In our study, an increased level of 2-amino-4-cyano butanoic acid was found in plasma from women who subsequently developed breast cancer during follow-up. This metabolite, also called alpha-amino-gamma-cyanobutanoic acid, is a non-proteinogenic alpha-amino acid and an aliphatic nitrile, consisting of 2-aminobutanoic acid substituted at position 4 by a cyano group. It may derive from butyrate (54), a short-chain fatty acid synthesized through the fermentation of fibers by colon bacteria (55). Butyrate has recently received growing attention for its beneficial effects on intestinal homeostasis and energy metabolism. With its anti-inflammatory properties, butyrate improves intestinal barrier function and mucosal immunity (56). In our study, we found an inverse association between plasma 2-amino-4-cyano butanoic acid and cake and biscuit intake. To our knowledge, no direct association between plasma 2-amino-4-cyano butanoic acid and dietary exposure has been established to date. The production and effects of butyrate appear to be related to diet, including the type of dietary fibers and fat consumed, respectively (57). Butyrate is present in various types of foods, including vanilla, oats, peanut, and several fruits (58). Because no data on plasma butyrate levels were available in our study, we cannot draw conclusions about a possible link between the observed variations and a disturbance of butyrate metabolism and therefore of the microbiota. Our results need to be replicated in other independent observational studies, and intervention studies on animal models would provide a better understanding of the origin of its variations and of their relation to food. Its association with the risk of breast cancer and a potential causal relationship could be investigated through cellular mechanistic studies.

Unfortunately, despite additional analyses (including different fragmentation experiments using ultra high-resolution MS, fraction collection, preconcentration of the biological samples, H/D exchange and, in some cases, additional analyses such as GC/MS), the other metabolites selected by Elastic Net regression for breast cancer risk analyses could not be identified, mainly because of limited signal intensities, the lack of commercial standards, and the incompleteness of the available databases. However, sharing putatively annotated compounds or unknowns within the scientific community could be of great interest. Indeed, when no commercial standard is available, it could be relevant, if several studies and consortia share the same hypothesis, to synthesize the compound or isolate it from a biological medium. When signal abundance is too low, unknowns could be elucidated in the future with new instruments, as MS technologies become increasingly sensitive.

The major strengths of this study pertain to the very sensitive untargeted MS metabolomic analysis, the prospective design, and the choice of Elastic Net penalized regression, a method suited to high-dimensional correlated data. Penalized regression techniques control the variance of estimates (which increases in the presence of many predictors or multicollinearity) by shrinking coefficients toward zero (59). Some penalized regression techniques, such as the Least Absolute Shrinkage and Selection Operator (LASSO; ref. 59) and Elastic Net (21), simultaneously perform automatic variable selection by shrinking the coefficients of irrelevant predictors to exactly zero. Elastic Net tends to select important variables better when high correlations are present, as in metabolomic data. Several studies have applied these methods in multiple fields (such as youth violence, genetics, and cardiovascular disease risk; refs. 22, 60–62); however, to our knowledge, no previous study focusing on nutritional metabolomics and cancer prevention has used penalized techniques. Nevertheless, several limitations of this study should be acknowledged. First, despite the use of cross-validation, iterations, and bootstraps, these results need to be validated in an independent study sample. Unbiased predictive performance could not be investigated in this study because no independent dataset was available. Investigating the performance gain obtained by adding a diet-related metabolic signature to a model including already known breast cancer risk factors would be useful to improve the discrimination of women at higher risk of breast cancer. However, to have a positive impact in terms of prevention, this signature must be modifiable through a change of diet. This issue, as well as several others (e.g., replication, quantification), should be investigated before considering an application in public health. 
Second, associations may have been missed because of a lack of power, self-reporting bias, or the analytical protocol (LC-MS), which did not allow all categories of metabolites to be detected. Indeed, although our analytical method was optimized to detect as many metabolites as possible, the huge chemical diversity of compounds in blood makes it impossible to cover all metabolites using a single analytical method, even with an untargeted approach. Thus, some metabolites of interest may not have been detected under our analytical conditions. However, UPLC-MS is one of the most sensitive methodologies available. Complementary GC-MS analyses could be useful to detect additional types of metabolites. Moreover, a larger sample size would have allowed stratification of the population according to several parameters, such as time prior to cancer diagnosis and menopausal status. Third, the possibility of residual or unmeasured confounding cannot be ruled out in this observational study. However, many potential confounders were accounted for. Finally, our study was based on a single blood draw, which limits the investigation of the stability of metabolomic profiles over time. Nevertheless, several studies have shown good reproducibility of metabolomic measurements for most metabolites (63, 64). Several blood draws during follow-up would allow a finer detection of metabolic pathways that are disrupted during carcinogenesis.
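The way LASSO and Elastic Net set some coefficients to exactly zero, as described among the strengths above, can be illustrated with the closed-form single-predictor update for penalized least squares with a standardized predictor. This is a simplified sketch under that assumption and not the models fitted in this study; the parameter names `lam` (overall penalty strength) and `alpha` (mixing between the L1 and L2 penalties) are illustrative.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma and return exactly 0.0 when |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_update(z, lam, alpha):
    """Single-coordinate Elastic Net solution for a standardized predictor.

    z     : unpenalized least-squares coefficient for this predictor
    lam   : overall penalty strength (>= 0)
    alpha : mixing parameter; 1.0 gives the LASSO, 0.0 gives ridge regression
    """
    # The L1 part (lam * alpha) can zero the coefficient out;
    # the L2 part (lam * (1 - alpha)) only rescales it.
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))
```

With `alpha = 1.0` (LASSO), a weak coefficient such as `z = 0.3` is set to zero when `lam = 0.5`, whereas with `alpha = 0.0` (ridge) it is only shrunk, which is why ridge regression alone does not perform variable selection.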

In conclusion, this exploratory prospective study identified a plasma diet–related metabolic signature of long-term breast cancer risk involving exogenous, steroid, and microbiota-related metabolites. The hypotheses highlighted in this study should be further investigated in future large-scale independent studies. In the future, such signatures could help to better understand the relationship between nutrition and breast cancer etiology and to identify key metabolites associated both with modifiable nutritional behavior and with breast cancer risk.

No potential conflicts of interest were disclosed.

The funders had no role in the design, analysis, or writing of this article.

Conception and design: L. Lécuyer, P. Micheau, P. Galan, S. Hercberg, C. Manach, M. Touvier

Development of methodology: L. Lécuyer, P. Micheau, C. Samieri, P. Ferrari, M. Touvier

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): P. Micheau, A. Rossary, A. Demidem, M. Petera, M. Lagree, D. Centeno, P. Galan, S. Hercberg, E. Kesse-Guyot, S. Durand, M. Touvier

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): L. Lécuyer, C. Dalle, S. Lefevre-Arbogast, P. Micheau, B. Lyan, C. Samieri, N. Assi, V. Viallon, M. Deschasaux, V. Partula, B. Srour, E. Kesse-Guyot, N. Druesne-Pecollo, M.-P. Vasson, S. Durand, E. Pujos-Guillot, C. Manach, M. Touvier

Writing, review, and/or revision of the manuscript: L. Lécuyer, S. Lefevre-Arbogast, P. Micheau, A. Rossary, A. Demidem, M. Petera, P. Galan, C. Samieri, N. Assi, V. Viallon, M. Deschasaux, V. Partula, P. Latino-Martel, E. Kesse-Guyot, N. Druesne-Pecollo, S. Durand, E. Pujos-Guillot, C. Manach, M. Touvier

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): L. Lécuyer, P. Micheau, E. Kesse-Guyot, E. Pujos-Guillot, M. Touvier

Study supervision: P. Micheau, P. Galan, S. Hercberg, M. Touvier

Other (mass spectrometry analysis and metabolites identification): B. Lyan

The authors thank Younes Esseddik, Frédéric Coffinieres, Thi Hong Van Duong, Paul Flanzy, Régis Gatibelza, Jagatjit Mohinder, and Maithyly Sivapalan (computer scientists); Rachida Mehroug and Frédérique Ferrat (logistic assistants); Nathalie Arnault, Véronique Gourlet, PhD, Fabien Szabo, PhD, Julien Allegre, and Laurent Bourhis (data-manager/statisticians); and Cédric Agaesse (dietitian) for their technical contribution to the SU.VI.MAX study. We also thank Nathalie Druesne-Pecollo, PhD (operational coordination) as well as all participants of the SU.VI.MAX study. This work was conducted in the framework of the French network for Nutrition And Cancer Research (NACRe network), www.inra.fr/nacre, and received the NACRe Partnership Label. Metabolomic analysis was performed within the metaboHUB French infrastructure (ANR-INBS-0010). This work was supported by the French National Cancer Institute (grant number INCa_8085 for the project, and PhD grant number INCa_11323, to L. Lecuyer); the Federative Institute for Biomedical Research IFRB Paris 13; and the Cancéropôle Ile-de-France/Région Ile de France (PhD grant for M. Deschasaux).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Wu S, Zhu W, Thompson P, Hannun YA. Evaluating intrinsic and non-intrinsic cancer risk factors. Nat Commun 2018;9:3490.
2. World Health Organisation. Cancer prevention; 2019. Available from: http://www.who.int/cancer/prevention/en/.
3. World Cancer Research Fund/American Institute for Cancer Research. Continuous update project expert report 2018. Diet, nutrition, physical activity, and breast cancer. Available from: dietandcancerreport.org.
4. Shield KD, Freisling H, Boutron-Ruault M-C, Touvier M, Marant Micallef C, Jenab M, et al. New cancer cases attributable to diet among adults ages 30–84 years in France in 2015. Br J Nutr 2018;120:1171–80.
5. Reynolds A, Mann J, Cummings J, Winter N, Mete E, Te Morenga L. Carbohydrate quality and human health: a series of systematic reviews and meta-analyses. Lancet 2019;393:434–45.
6. Chajès V, Assi N, Biessy C, Ferrari P, Rinaldi S, Slimani N, et al. A prospective evaluation of plasma phospholipid fatty acids and breast cancer risk in the EPIC study. Ann Oncol 2017;28:2836–42.
7. Sellem L, Srour B, Guéraud F, Pierre F, Kesse-Guyot E, Fiolet T, et al. Saturated, mono- and polyunsaturated fatty acid intake and cancer risk: results from the French prospective cohort NutriNet-Santé. Eur J Nutr 2018;58:1515–27.
8. Scalbert A, Brennan L, Manach C, Andres-Lacueva C, Dragsted LO, Draper J, et al. The food metabolome: a window over dietary exposure. Am J Clin Nutr 2014;99:1286–308.
9. Davis CD, Milner JA. Biomarkers for diet and cancer prevention research: potentials and challenges. Acta Pharmacol Sin 2007;28:1262–73.
10. Playdon MC, Ziegler RG, Sampson JN, Stolzenberg-Solomon R, Thompson HJ, Irwin ML, et al. Nutritional metabolomics and breast cancer risk in a prospective study. Am J Clin Nutr 2017;106:637–49.
11. Hercberg S, Galan P, Preziosi P, Bertrais S, Mennen L, Malvy D, et al. The SU.VI.MAX study: a randomized, placebo-controlled trial of the health effects of antioxidant vitamins and minerals. Arch Intern Med 2004;164:2335–42.
12. Hercberg S, Preziosi P, Briancon S, Galan P, Triol I, Malvy D, et al. A primary prevention trial using nutritional doses of antioxidant vitamins and minerals in cardiovascular diseases and cancers in a general population: the SU.VI.MAX study–design, methods, and participant characteristics. SUpplementation en VItamines et Mineraux AntioXydants. Control Clin Trials 1998;19:336–51.
13. Vandenbroucke JP, Pearce N. Case-control studies: basic concepts. Int J Epidemiol 2012;41:1480–9.
14. Le Moullec N, Deheeger M, Preziosi P, Montero P, Valeix P, Rolland-Cachera MF, et al. Validation du manuel photos utilisé pour l'enquête alimentaire de l'étude SU.VI.MAX (validation of the food portion size booklet used in the SU.VI.MAX study). Cah Nutr Diététique 1996;31:158–64.
15. World Health Organization. ICD-10, International Classification of Diseases and Related Health Problems. 10th revision. Geneva, Switzerland: World Health Organization; 1993.
16. Pereira F, Martin JF, Joly C, Sébédio JL, Pujos-Guillot E. Development and validation of a UPLC/MS method for a nutritional metabolomic study of human plasma. Metabolomics 2010;6:207–18.
17. Estaquio C, Kesse-Guyot E, Deschamps V, Bertrais S, Dauchet L, Galan P, et al. Adherence to the French Programme National Nutrition Sante Guideline Score is associated with better nutrient intake and nutritional status. J Am Diet Assoc 2009;109:1031–41.
18. Kim S, Haines PS, Siega-Riz AM, Popkin BM. The Diet Quality Index-International (DQI-I) provides an effective tool for cross-national comparison of diet quality as illustrated by China and the United States. J Nutr 2003;133:3476–84.
19. Martin A. The “apports nutritionnels conseillés (ANC)” for the French population. Reproduction Nutrition Development, EDP Sciences 2001;41:119–28. Available from: https://hal.archives-ouvertes.fr/hal-00900366/document.
20. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Roy Stat Soc B 1995;57:289–300.
21. Zou H, Hastie T. Regularization and variable selection via the elastic net. J R Stat Soc Ser B Stat Methodol 2005;67:301–20.
22. Bunea F, She Y, Ombao H, Gongvatana A, Devlin K, Cohen R. Penalized least squares regression methods and applications to neuroimaging. Neuroimage 2011;55:1519–27.
23. Sumner LW, Amberg A, Barrett D, Beale MH, Beger R, Daykin CA, et al. Proposed minimum reporting standards for chemical analysis: Chemical Analysis Working Group (CAWG) Metabolomics Standards Initiative (MSI). Metabolomics 2007;3:211–21.
24. Playdon MC, Moore SC, Derkach A, Reedy J, Subar AF, Sampson JN, et al. Identifying biomarkers of dietary patterns by using metabolomics. Am J Clin Nutr 2017;105:450–65.
25. Guertin KA, Moore SC, Sampson JN, Huang WY, Xiao Q, Stolzenberg-Solomon RZ, et al. Metabolomics in nutritional epidemiology: identifying metabolites associated with diet and quantifying their potential to uncover diet-disease relations in populations. Am J Clin Nutr 2014;100:208–17.
26. Yu Z, Kastenmüller G, He Y, Belcredi P, Möller G, Prehn C, et al. Differences between human plasma and serum metabolite profiles. PLoS One 2011;6:e21230.
27. Assi N. A statistical framework to model the meeting-in-the-middle principle using metabolomic data: application to hepatocellular carcinoma in the EPIC study. Mutagenesis 2015;30:743–53.
28. Assi N, Thomas DC, Leitzman M, Stepien M, Chajes V, Philip T, et al. Are metabolic signatures mediating the relationship between lifestyle factors and hepatocellular carcinoma risk? Results from a nested case–control study in EPIC. Cancer Epidemiol Biomarkers Prev 2018;27:531–40.
29. Assi N, Gunter MJ, Thomas DC, Leitzmann M, Stepien M, Chajès V, et al. Metabolic signature of healthy lifestyle and its relation with risk of hepatocellular carcinoma in a large European cohort. Am J Clin Nutr 2018;108:117–26.
30. Chadeau-Hyam M, Athersuch TJ, Keun HC, De Iorio M, Ebbels TMD, Jenab M, et al. Meeting-in-the-middle using metabolic profiling - a strategy for the identification of intermediate biomarkers in cohort studies. Biomarkers 2011;16:83–8.
31. Guertin KA, Loftfield E, Boca SM, Sampson JN, Moore SC, Xiao Q, et al. Serum biomarkers of habitual coffee consumption may provide insight into the mechanism underlying the association between coffee consumption and colorectal cancer. Am J Clin Nutr 2015;101:1000–11.
32. Moore SC, Playdon MC, Sampson JN, Hoover RN, Trabert B, Matthews CE, et al. A metabolomics analysis of body mass index and postmenopausal breast cancer risk. J Natl Cancer Inst 2018;110:djx244.
33. Zakerali T, Shahbazi S. Rational druggability investigation toward selection of lead molecules: impact of the commonly used spices on inflammatory diseases. Assay Drug Dev Technol 2018;16:397–407.
34. EFSA, Panel on Additives and Products or Substances used in Animal Feed. Safety and efficacy of pyridine and pyrrole derivatives belonging to chemical group 28 when used as flavourings for all animal species. EFSA J 2016;14:1–19.
35. Diwan V, Poudyal H, Brown L. Piperine attenuates cardiovascular, liver and metabolic changes in high carbohydrate, high fat-fed rats. Cell Biochem Biophys 2013;67:297–304.
36. Rather RA, Bhagat M. Cancer chemoprevention and piperine: molecular mechanisms and therapeutic opportunities. Front Cell Dev Biol 2018;6:10.
37. Ajazuddin, Alexander A, Qureshi A, Kumari L, Vaishnav P, Sharma M, et al. Role of herbal bioactives as a potential bioavailability enhancer for active pharmaceutical ingredients. Fitoterapia 2014;97:1–14.
38. Greenshields AL, Doucette CD, Sutton KM, Madera L, Annan H, Yaffe PB, et al. Piperine inhibits the growth and motility of triple-negative breast cancer cells. Cancer Lett 2015;357:129–40.
39. Do MT, Kim HG, Choi JH, Khanal T, Park BH, Tran TP, et al. Antitumor efficacy of piperine in the treatment of human HER2-overexpressing breast cancer cells. Food Chem 2013;141:2591–9.
40. Doucette CD, Hilchie AL, Liwski R, Hoskin DW. Piperine, a dietary phytochemical, inhibits angiogenesis. J Nutr Biochem 2013;24:231–9.
41. Chemical Book. Alkaloids; 2019. Available from: https://www.chemicalbook.com/ProductCatalog_EN/2322.htm.
42. Pallister T, Jennings A, Mohney RP, Yarand D, Mangino M, Cassidy A, et al. Characterizing blood metabolomics profiles associated with self-reported food intakes in female twins. PLoS One 2016;11:e0158568.
43. Takeshita A, Igarashi-Migitaka J, Nishiyama K, Takahashi H, Takeuchi Y, Koibuchi N. Acetyl tributyl citrate, the most widely used phthalate substitute plasticizer, induces cytochrome p450 3a through steroid and xenobiotic receptor. Toxicol Sci 2011;123:460–70.
44. United States Consumer Product Safety Commission. Review of exposure and toxicity data for phthalate substitutes; 2010. Available from: https://www.cpsc.gov/s3fs-public/phthalsub.pdf.
45. Sheftel VO. Indirect food additives and polymers: migration and toxicology; 2000. Available from: https://nls.ldls.org.uk/welcome.html?ark:/81055/vdc_100056086042.0x000001.
46. Welle F, Wolz G, Franz R. Migration of plasticizers from PVC tubes into enteral feeding solutions. Pharma International 2005;33:17–21.
47. Testai E, Scientific Committee SCENIHR, Hartemann P, Rastogi SC, Bernauer U, Piersma A, et al. The safety of medical devices containing DEHP plasticized PVC or other plasticizers on neonates and other groups possibly at risk (2015 update). Regul Toxicol Pharmacol 2016;76:209–10.
48. Nara K, Nishiyama K, Natsugari H, Takeshita A, Takahashi H. Leaching of the plasticizer, acetyl tributyl citrate: (ATBC) from plastic kitchen wrap. J Health Sci 2009;55:281–4.
49. Rasmussen LM, Sen N, Vera JC, Liu X, Craig ZR. Effects of in vitro exposure to dibutyl phthalate, mono-butyl phthalate, and acetyl tributyl citrate on ovarian antral follicle growth and viability. Biol Reprod 2017;96:1105–17.
50. Rasmussen LM, Sen N, Liu X, Craig ZR. Effects of oral exposure to the phthalate substitute acetyl tributyl citrate on female reproduction in mice. J Appl Toxicol 2017;37:668–75.
51. Key T, Appleby P, Barnes I, Reeves G, Endogenous Hormones and Breast Cancer Collaborative Group. Endogenous sex hormones and breast cancer in postmenopausal women: reanalysis of nine prospective studies. J Natl Cancer Inst 2002;94:606–16.
52. Singletary KW, Gapstur SM. Alcohol and breast cancer: review of epidemiologic and experimental evidence and potential mechanisms. JAMA 2001;286:2143–51.
53. Dorgan JF, Baer DJ, Albert PS, Judd JT, Brown ED, Corle DK, et al. Serum hormones and the alcohol-breast cancer association in postmenopausal women. J Natl Cancer Inst 2001;93:710–5.
54. PubChem. 2-Amino-4-cyanobutanoic acid; 2019. Available from: https://pubchem.ncbi.nlm.nih.gov/compound/440770.
55. Bourassa MW, Alim I, Bultman SJ, Ratan RR. Butyrate, neuroepigenetics and the gut microbiome: can a high fiber diet improve brain health? Neurosci Lett 2016;625:56–63.
56. Liu H, Wang J, He T, Becker S, Zhang G, Li D, et al. Butyrate: a double-edged sword for health? Adv Nutr 2018;9:21–9.
57. Lupton JR. Microbial degradation products influence colon cancer risk: the butyrate controversy. J Nutr 2004;134:479–82.
58. Api AM, Belmonte F, Belsito D, Botelho D, Bruze M, Burton GA, et al. RIFM fragrance ingredient safety assessment, butyric acid, CAS Registry Number 107-92-6. Food Chem Toxicol 2019;127(Suppl 1):S81–S9.
59. Tibshirani R. Regression shrinkage and selection via the lasso. J Roy Stat Soc B 1996;58:267–88.
60. Goldstick JE, Carter PM, Walton MA, Dahlberg LL, Sumner SA, Zimmerman MA, et al. Development of the SaFETy score: a clinical screening tool for predicting future firearm violence risk. Ann Intern Med 2017;166:707–14.
61. Frost HR, Amos CI. Gene set selection via LASSO penalized regression (SLPR). Nucleic Acids Res 2017;45:e114.
62. Stegemann C, Pechlaner R, Willeit P, Langley SR, Mangino M, Mayr U, et al. Lipidomics profiling and risk of cardiovascular disease in the prospective population-based Bruneck study. Circulation 2014;129:1821–31.
63. Carayol M, Licaj I, Achaintre D, Sacerdote C, Vineis P, Key TJ, et al. Reliability of serum metabolites over a two-year period: a targeted metabolomic approach in fasting and non-fasting samples from EPIC. PLoS One 2015;10:e0135437.
64. Floegel A, Drogan D, Wang-Sattler R, Prehn C, Illig T, Adamski J, et al. Reliability of serum metabolite concentrations over a 4-month period using a targeted metabolomic approach. PLoS One 2011;6:e21103.