An increasing number of cancer epidemiology studies use metabolomics assays. This scoping review characterizes trends in the literature in terms of study design, population characteristics, and metabolomics approaches and identifies opportunities for future growth and improvement. We searched PubMed/MEDLINE, Embase, Scopus, and Web of Science: Core Collection databases and included research articles that used metabolomics to primarily study cancer, contained a minimum of 100 cases in each main analysis stratum, used an epidemiologic study design, and were published in English from 1998 to June 2021. A total of 2,048 articles were screened, of which 314 full texts were further assessed resulting in 77 included articles. The most well-studied cancers were colorectal (19.5%), prostate (19.5%), and breast (19.5%). Most studies used a nested case–control design to estimate associations between individual metabolites and cancer risk and a liquid chromatography–tandem mass spectrometry untargeted or semi-targeted approach to measure metabolites in blood. Studies were geographically diverse, including countries in Asia, Europe, and North America; 27.3% of studies reported on participant race, the majority reporting White participants. Most studies (70.2%) included fewer than 300 cancer cases in their main analysis. This scoping review identified key areas for improvement, including needs for standardized race and ethnicity reporting, more diverse study populations, and larger studies.

Metabolomics is an “omics” approach focused on the large-scale analysis of the metabolome, the set of metabolites within a biological system (1, 2). The emergence of the field can be traced to 1998, when the term “metabolome” was first introduced by Oliver and colleagues (3). Metabolomics has been shown to be a powerful tool for studying human health and biology. It can be applied to estimate disease risk, elucidate biological mechanisms, and identify biomarkers for disease diagnosis and prognosis. Two main analytic approaches are used in these studies: untargeted and semi-targeted profiling. Untargeted studies aim to detect as many metabolites as possible using a global approach, where there is no a priori metabolite information leading to data acquisition (4). Semi-targeted studies profile hundreds of metabolites whose identity is defined from a range of chemical classes and metabolic pathways before experimentation (4). Additionally, there are targeted analyses, which quantify a smaller number of predefined metabolites that are related in function and class (4). To detect metabolites in a sample, commonly used metabolomic platforms include mass spectrometry (MS)–based and nuclear magnetic resonance (NMR)–based techniques. MS-based platforms have the advantages of broader metabolite coverage and higher sensitivity compared with NMR, but they are destructive to the sample, and technical reproducibility is variable (5). In contrast, NMR-based platforms provide detailed structural information on fewer metabolites and are nondestructive and fully quantitative (5).

There has been a significant investment in the field of metabolomics from the National Institutes of Health (NIH), which has catalyzed its use in epidemiology to study human diseases, including cancer (6). In 2012, the NIH Common Fund Metabolomics Program was established to address key challenges in the field, including the need for improved rigor and reproducibility, and to establish an infrastructure for metabolomics in the United States (7). Following this initial investment, the international COnsortium of METabolomics Studies (COMETS) was established in 2014 to develop methods to harmonize data, develop data analysis tools specific to these studies, and build a network of epidemiologists in the field (6, 8, 9).

Metabolomic epidemiology, as defined by Lasky-Su and colleagues (10), is “the field of scientific enquiry involving the systematic use of epidemiological methods and principles to study population-based variation in the human metabolome as it associates with health-related outcomes or exposures.” Case–control, cross-sectional, prospective cohort, and nested study designs are common types of metabolomic epidemiology studies (10–12). Case–control and cross-sectional studies allow researchers to glean potential metabolomic differences by comparing individuals by disease or exposure status (12). Case–control studies also involve biospecimen collection at the time of diagnosis, allowing for stronger metabolite–disease associations (11). Studies that use prospective sampling allow the assessment of temporal relationships (13). Biospecimens that are typically collected in epidemiologic studies are suitable for metabolomics analysis (10, 14). However, metabolites are known to be affected by preanalytical conditions such as biospecimen collection, processing, and storage conditions; therefore, experimental processes should be consistently applied across all biospecimens (15).

To date, comprehensive reviews in the field of metabolomic epidemiology are lacking. Therefore, we launched a scoping review to systematically map the field of population-based cancer metabolomics research, characterizing trends in the literature in terms of study design, population characteristics, and metabolomics approaches, and to identify opportunities for future growth and improvement.

Protocol and registration

We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) extension for Scoping Reviews to write the protocol and final reporting of this review (16). The Materials and Methods section outlines the methodological protocol used for the study.

Eligibility criteria

We used the following inclusion criteria for the study selection process: (i) the article used metabolomics to study cancer; (ii) cancer was the focus of the article; (iii) the article contained at least 100 cases in each analysis stratification of the main analysis; (iv) the article used an epidemiologic study design; and (v) the article was published in English from 1998 to June 2021. We also included studies based in an international setting to capture studies investigating international cohorts.

For this review, our definition of metabolomics included untargeted and semi-targeted metabolic profiling, assessment of two or more Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways (17), and lipidomics. Although lipids are represented under a single KEGG pathway, we included lipidomics in our definition because of the large number and wide diversity of metabolites in this biochemical class. We excluded targeted analyses from our definition of metabolomics. Articles must have had a minimum of 100 cases in each analysis stratification of the main analysis to meet our definition of a population-based epidemiology study (2). If the study included training and validation sets, then the discovery set was considered the main analysis. We limited the review to English-language articles because our review team did not have translation capabilities, and to articles published from 1998 onward because the scientific term “metabolome” was first introduced in 1998 (3).
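The eligibility criteria above can be read as a single yes/no rule applied to each article. The following is a minimal sketch of that rule, not the screening tool used in the review (screening was done by human reviewers in Covidence). The `Article` fields and the `is_eligible` function are hypothetical names introduced for illustration, and the exact date bounds are taken from the search limits reported in the search description (January 1, 1998 to June 1, 2021).

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    """Hypothetical record of the facts a screener would judge for one article."""
    uses_metabolomics: bool       # untargeted/semi-targeted profiling, >=2 KEGG pathways, or lipidomics
    cancer_is_focus: bool         # cancer is the primary subject of the article
    min_cases_per_stratum: int    # smallest case count across strata of the main analysis
    epidemiologic_design: bool    # uses an epidemiologic study design
    language: str
    published: date


def is_eligible(article: Article) -> bool:
    """Return True only if every inclusion criterion is met."""
    return (
        article.uses_metabolomics
        and article.cancer_is_focus
        and article.min_cases_per_stratum >= 100
        and article.epidemiologic_design
        and article.language == "English"
        and date(1998, 1, 1) <= article.published <= date(2021, 6, 1)
    )


# Illustrative (made-up) articles, not records from the review.
candidate = Article(True, True, 120, True, "English", date(2019, 3, 15))
too_small = Article(True, True, 99, True, "English", date(2019, 3, 15))
print(is_eligible(candidate), is_eligible(too_small))
```

Writing the criteria as a conjunction makes explicit that failing any one criterion, such as having 99 cases in a main-analysis stratum, is enough to exclude an article.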

Information sources and search

Search strategies were developed using keywords and controlled vocabularies (i.e., MeSH, Emtree) for metabolomics, cancer, and epidemiologic study designs. Four databases were searched by a biomedical librarian (AAL): PubMed/MEDLINE (United States National Library of Medicine), Embase (Elsevier), Scopus (Elsevier), and Web of Science: Core Collection (Clarivate Analytics). All searches were conducted in April and June 2021. The searches were limited by publication date (January 1, 1998–June 1, 2021) and language (English). Animal studies and specific article types not reporting data (e.g., retractions, corrigenda, errata, reviews, editorials, commentaries, letters, conference abstracts/proceedings, and meta-analyses) were excluded through the search strategy. All database results were exported to EndNote X9.3.3 (Clarivate Analytics), and the find duplicates feature was used to identify unique articles.

Selection of sources of evidence

We performed a pilot of the study selection process (both levels) with four authors (ZF, EL, CTY, and KAZ) on a random sample of records in Covidence (Veritas Health Innovation), and the eligibility criteria were revised as necessary.

All unique records were imported by AAL in XML format into Covidence for study screening. At both the title and abstract level and the full-text level, each article was screened independently by two authors. First, three authors (ZF, CTY, and KAZ) independently screened the titles and abstracts of all unique records using the predefined eligibility criteria. Next, the same three authors independently screened the full text of those articles included after title and abstract screening against our eligibility criteria. For both levels of study selection, EL resolved disagreements between authors during the screening through an independent review.

Data charting process and data items

Data charting was performed in Qualtrics (Qualtrics XM) using a form that was tested with all participating authors prior to its use. Three authors (ZF, CTY, and KAZ) performed the data charting, and each article was charted independently by two authors. The list of charted data items, including any assumptions and definitions used, is described below. Following data charting, SAS 9.4 (SAS Institute) was used to identify discrepancies in the extracted data. Discordances in the data collected were resolved by EL and through group discussion when necessary.
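To illustrate what identifying discrepancies between two independent charters involves, here is a minimal sketch in Python. It is not the procedure used in the review, which was carried out in SAS 9.4; the function name `find_discordances`, the field names, and the example records are all hypothetical.

```python
def find_discordances(chart_a: dict, chart_b: dict) -> dict:
    """Return {item: (value_a, value_b)} for every data item on which two charters disagree.

    An item recorded by only one charter is reported with None for the other.
    """
    items = sorted(set(chart_a) | set(chart_b))
    return {
        item: (chart_a.get(item), chart_b.get(item))
        for item in items
        if chart_a.get(item) != chart_b.get(item)
    }


# Hypothetical charting forms for one article, completed by two reviewers.
reviewer_1 = {"study_design": "Nested case-control study",
              "approach": "Untargeted",
              "biospecimen": "Serum"}
reviewer_2 = {"study_design": "Nested case-control study",
              "approach": "Semi-targeted",
              "biospecimen": "Serum"}

print(find_discordances(reviewer_1, reviewer_2))
# {'approach': ('Untargeted', 'Semi-targeted')}
```

Only the flagged items would then need to go to a third reviewer or group discussion, which keeps the resolution step focused on genuine disagreements.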

We extracted the following data items from each article, when available:

  • Publication year

  • Author name

  • Epidemiologic study design

  • Cancer type

  • Population characteristics (i.e., cohort name, sample size, age range of study participants, and participant race and ethnicity)

  • Study setting (i.e., country(ies) from where the study population was chosen)

  • Primary metabolomics-specific research aim

  • Analytic platform used to perform metabolomics

  • Metabolomics approach (i.e., untargeted or semi-targeted)

  • Biospecimen type

  • Whether the study compared race/ethnic groups in some way

All data items were presented as closed-ended questions with prespecified responses, except for author name, publication year, cohort name, and sample size. When authors selected “other” during extraction, they were required to provide an additional brief explanation of the data item. When authors selected “yes” for whether a study performed comparisons by race or ethnicity, they were required to provide a brief explanation of the comparisons and relevant findings. Data items that were not available in the article were recorded as missing.

We categorized epidemiologic study designs as follows, based on standard epidemiology definitions (18, 19):

  • Case–control study

  • Case–cohort study

  • Case-only study

  • Case-series study

  • Cohort study

  • Cross-sectional study

  • Intervention trial

  • Nested case–control study

  • Other: Any study design not fitting into the categories above

Studies that did not investigate a pediatric cancer or specify participant age range but did report a mean age >18 years were categorized as adult studies. We used the five race categories outlined by the United States Office of Management and Budget (OMB) for our collection of study population race data: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White (20). When studies reported race data that did not fit the OMB-outlined categories, the authors recorded the race data as “other.” For extraction of study population ethnicity data, we used the categories “Hispanic or Latino” and “Not Hispanic or Latino,” also as outlined by OMB standards (20). The authors made no assumptions about the race and ethnicity of study participants and recorded race and ethnicity data items as directly communicated by the paper. Following data extraction and discordance resolution, CTY reviewed the articles recorded as reporting race data to determine if the papers reported disaggregated subgroup information. ZF and CTY also charted cancer case, diseased control, and healthy control numbers post hoc. KAZ resolved discordances between ZF and CTY.

We used the following prespecified research aim categories when extracting data about an article's primary objective for using metabolomics:

  • Cancer risk estimation using incident cases

  • Cancer risk estimation using prevalent cases

  • Risk of recurrence or death among cancer survivors

  • Treatment intervention with an outcome of cancer

  • Lifestyle intervention with an outcome of cancer

  • Cancer progression/natural history

  • Biomarker of exposure and cancer risk estimation using incident cases

  • Biomarker of exposure and cancer risk estimation using prevalent cases

  • Other

We defined “cancer risk estimation using prevalent cases” as studies that estimated associations of metabolites with cancer prevalence. Studies that were recorded as “other” were further reviewed following data charting by CTY and KAZ to identify additional research aim categories. The additional aim categories included biomarkers of disease diagnosis, biomarkers of survival, biomarkers of disease diagnosis and survival, disease differentiation: cases vs. controls (compares cases vs. controls to examine whether disease is present), disease differentiation: tumor vs. non-tumor (compares tumor vs. non-tumor tissue to determine if disease is present), association study: prognosis/recurrence, and descriptive study: progression/survival. To be considered a biomarker study, the analysis needed to include a receiver operating characteristic curve; otherwise, the study was considered either an association study or a descriptive study.

We categorized the metabolomics approaches used as:

  • Untargeted: Metabolomics study applied for wide detection of metabolites in a sample, ranging from 100s to 1,000s of metabolites (4). These studies lack knowledge of their metabolite targets prior to experimentation and are not quantitative (4).

  • Semi-targeted: Metabolomics study applied to profile 100s of metabolites in a sample (4). These studies use predefined metabolite targets that are chosen from several metabolic pathways and chemical classes of biological interest which cover a wide range of metabolism (4).

Following data extraction and discordance resolution, ZF and CTY independently reviewed the articles recorded as performing semi-targeted metabolomics to further extract the metabolites targeted in these studies. CTY then categorized the metabolites according to the eight super-pathways profiled and defined by Metabolon, Inc.: amino acids, carbohydrates, cofactors and vitamins, energy, lipids, nucleotides, peptides, and xenobiotics (21, 22). Metabolites that did not fall under a Metabolon super-pathway were categorized as “other.”

Synthesis of results

We collected descriptive statistics from the included studies, summarized them in a narrative format, and presented the charted data items visually using pie charts, bar graphs, and maps. CTY performed analyses in RStudio Version 1.3.1093 (RStudio) and Excel Version 2108 (Microsoft). All figures were created using Excel Version 2108 (Microsoft).

Selection of sources of evidence

The database searches identified 4,414 articles and 2,366 duplicates were removed prior to screening. The titles and abstracts of 2,048 articles were screened, of which 1,734 were excluded. The full text of 314 articles was then assessed for eligibility, of which 237 were excluded and 77 articles were included in the review. Figure 1 displays the flow of information throughout the review and the reasons for exclusion at full-text screening.

Figure 1.

PRISMA flow diagram for selecting sources of evidence. The flow diagram shows the process used to select sources of evidence to be included in the scoping review examining population-based cancer metabolomics research.
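As a quick consistency check, the counts reported for each stage of the selection process can be reconciled with simple subtraction. The sketch below uses only the numbers stated in the text; the function name `prisma_flow` and the dictionary keys are illustrative choices, not part of the review's methods.

```python
def prisma_flow(identified: int, duplicates: int,
                excluded_at_screening: int, excluded_at_full_text: int) -> dict:
    """Derive the number of records remaining after each selection stage."""
    screened = identified - duplicates
    full_text_assessed = screened - excluded_at_screening
    included = full_text_assessed - excluded_at_full_text
    return {
        "screened": screened,
        "full_text_assessed": full_text_assessed,
        "included": included,
    }


# Counts as reported: 4,414 identified, 2,366 duplicates,
# 1,734 excluded at title/abstract screening, 237 excluded at full text.
flow = prisma_flow(4414, 2366, 1734, 237)
print(flow)
# {'screened': 2048, 'full_text_assessed': 314, 'included': 77}
```

The derived values match the 2,048 screened records, 314 full texts assessed, and 77 included articles reported above.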

Key characteristics of included studies

For each study included in this review, Table 1 presents the key characteristics and data collected including study design, cancer type, study setting, sample size, age range, race and Hispanic ethnicity groups reported, metabolomics-specific research aim, metabolomics approach, metabolomics analytic platform, and biospecimen.

Table 1.

Study-, population-, and metabolomics-related characteristics of population-based cancer metabolomics studies.

Column groups, left to right: Study, Population, Metabolomics.

| Author year | Study design | Cancer type | Study setting | Cancer cases/diseased controls/healthy controls | Age range | Reported race | Reported Hispanic ethnicity | Aim | Approach | Analytic platform | Biospecimen |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Adams 2019 (79) | Nested case–control study | Prostate | United Kingdom | 2,291/0/2,661 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | NMR | Serum |
| Assi 2015 (80) | Nested case–control study | Liver and intrahepatic bile duct | United Kingdom, Denmark, Germany, Greece, Italy, Netherlands, Spain, Sweden | 114/0/222 | 18+ | NR | NR | Biomarker of exposure and cancer risk estimation using incident cases | Untargeted | NMR | Serum |
| Assi 2018 (81) | Nested case–control study | Liver and intrahepatic bile duct | United Kingdom, Denmark, France, Germany, Greece, Italy, Netherlands, Norway, Spain, Sweden | 147/0/147 | 18+ | NR | NR | Biomarker of exposure and cancer risk estimation using incident cases | Semi-targeted | LC-MS | Serum |
| Assi 2018 (82) | Nested case–control study | Liver and intrahepatic bile duct | United Kingdom, Denmark, Germany, Greece, Italy, Netherlands, Spain, Sweden | 147/0/147 | 18+ | NR | NR | Biomarker of exposure and cancer risk estimation using incident cases | Semi-targeted | LC-MS | Serum |
| Battini 2017 (40) | Case-only study | Pancreas | France | 106/0/0 | NR | NR | NR | Biomarkers of survival | Semi-targeted | NMR | Tissue |
| Björkblom 2016 (83) | Nested case–control study | Brain and other nervous system | Norway | 110/0/110 | 18+ | NR | NR | Cancer risk estimation using incident cases | Untargeted | GC-MS | Serum |
| Bro 2015 (84) | Nested case–control study | Breast | Denmark | 838/0/838 | 18+ | NR | NR | Cancer risk estimation using incident cases | Untargeted | NMR | Plasma |
| Bruzzone 2020 (48) | Cross-sectional study | Prostate | Spain | 453/202/0 | 18+ | NR | NR | Disease differentiation: cases vs. controls | Untargeted | NMR | Urine |
| Budczies 2012 (36) | Cross-sectional study | Breast | Germany | 184/0/42 | 18+ | NR | NR | Biomarkers of disease diagnosis and survival | Untargeted | GC-MS | Tissue |
| Cross 2014 (85) | Nested case–control study | Colorectal | United States | 255/0/254 | 18+ | Black or African American, White, Other^a | NR | Biomarker of exposure and cancer risk estimation using incident cases | Semi-targeted | GC-MS, LC-MS | Serum |
| Cross 2014 (86) | Nested case–control study | Colorectal | United States | 254/0/254 | 18+ | Black or African American, White, Other^a | NR | Cancer risk estimation using incident cases | Semi-targeted | GC-MS, LC-MS | Serum |
| Deng 2019 (46) | Case–control study | Colorectal | United States, Canada | 171/0/171 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Urine |
| Dickerman 2020 (87) | Nested case–control study | Prostate | United States | 212/0/212 | 18+ | White | NR | Biomarker of exposure and cancer risk estimation using incident cases | Semi-targeted | LC-MS | Plasma |
| Farshidfar 2016 (88) | Nested case–control study | Colorectal | Canada | 222/0/156 | 18+ | NR | NR | Biomarkers of disease diagnosis | Untargeted | GC-MS | Serum |
| Fest 2019 (89) | Other^b | Pancreas | Estonia, Finland, Netherlands, Norway | 389/0/946 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | NMR | Serum |
| Gaudet 2012 (90) | Case–control study | Uterus | Poland | 250/0/250 | 18+ | NR | NR | Cancer risk estimation using prevalent cases | Semi-targeted | GC-MS | Serum |
| Geijsen 2019 (91) | Case–control study | Colorectal | Austria, Germany | 180/0/153 | 18+ | White | NR | Cancer risk estimation using prevalent cases | Untargeted | LC-MS | Plasma |
| Guertin 2015 (92) | Nested case–control study | Colorectal | United States | 251/0/247 | 18+ | White | NR | Biomarker of exposure and cancer risk estimation using incident cases | Semi-targeted | GC-MS, LC-MS | Serum |
| Hadi 2017 (93) | Case–control study | Breast | Pakistan | 152/0/155 | 18+ | NR | NR | Disease differentiation: cases vs. controls | Untargeted | GC-MS | Serum |
| Hakimi 2016 (38) | Case-only study | Kidney and renal pelvis | United States | 138/0/0 | 18+ | Asian, Black or African American, White, Other^a | NR | Descriptive study: progression/survival | Semi-targeted | GC-MS, LC-MS | Tissue |
| Han 2020 (39) | Case-only study | Liver and intrahepatic bile duct | China | 156/0/0 | 18+ | NR | NR | Biomarkers of survival | Untargeted | LC-MS | Tissue |
| Hao 2020 (94) | Case-only study | Lung and bronchus | NR | 439/0/0 | 18+ | NR | NR | Risk of recurrence or death among cancer survivors | Untargeted | LC-MS, NMR | Serum |
| Hasim 2012 (47) | Case–control study | Esophagus | China | 108/0/40 | 18+ | NR | NR | Disease differentiation: cases vs. controls | Untargeted | NMR | Plasma, Urine |
| His 2019 (95) | Nested case–control study | Breast | United Kingdom, Denmark, France, Germany, Greece, Italy, Netherlands, Norway, Spain, Sweden | 1,624/0/1,624 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Plasma |
| Huang 2016 (29) | Nested case–control study | Prostate | United States | 380/0/380 | 18+ | Black or African American, White, Other^a | Not Hispanic or Latino | Cancer risk estimation using incident cases | Semi-targeted | GC-MS, LC-MS | Serum |
| Huang 2016 (32) | Case–control study | Breast | United States | 106/0/61 | 18+ | American Indian or Alaska Native, Asian, Black or African American, White, Other^a | Hispanic or Latino | Biomarkers of disease diagnosis | Untargeted | GC-MS, LC-MS | Plasma, Serum |
| Huang 2019 (96) | Nested case–control study | Prostate | Finland | 523/0/523 | 18+ | White | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Serum |
| Huang 2020 (97) | Case–control study | Lung and bronchus | China | 200/0/200 | 18+ | NR | NR | Biomarkers of disease diagnosis | Untargeted | Other^c | Serum |
| Huang 2020 (98) | Nested case–control study | Liver and intrahepatic bile duct | Taiwan | 109/0/107 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | NMR | Plasma |
| Jobard 2021 (99) | Nested case–control study | Breast | France | 791/0/791 | 18+ | NR | NR | Cancer risk estimation using incident cases | Untargeted | NMR | Plasma |
| Kaji 2020 (42) | Case-only study | Stomach | Japan | 140/0/0 | 18+ | NR | NR | Risk of recurrence or death among cancer survivors | Untargeted | CE-MS | Tissue |
| Kliemann 2021 (100) | Nested case–control study | Colorectal, uterus | United Kingdom, Denmark, France, Germany, Italy, Netherlands, Norway, Spain, Sweden | 1,058^d/0/1,071 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Plasma, Serum |
| Kühn 2016 (101) | Case–cohort study | Breast, colorectal, prostate | Germany | 835^e/0/774 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS, Other^f | Plasma |
| Lécuyer 2018 (102) | Nested case–control study | Breast | France | 206/0/396 | 18+ | NR | NR | Cancer risk estimation using incident cases | Untargeted | NMR | Plasma |
| Lécuyer 2019 (103) | Nested case–control study | Breast | France | 211/0/211 | 18+ | NR | NR | Cancer risk estimation using incident cases | Untargeted | LC-MS | Plasma |
| Lécuyer 2020 (104) | Nested case–control study | Breast | France | 200/0/200 | 18+ | NR | NR | Biomarker of exposure and cancer risk estimation using incident cases | Untargeted | LC-MS | Plasma |
| Lécuyer 2021 (105) | Nested case–control study | Prostate | France | 171/0/171 | 18+ | NR | NR | Cancer risk estimation using incident cases | Untargeted | NMR | Plasma |
| Li 2016 (106) | Case–control study | Prostate | China | 560/0/500 | 18+ | NR | NR | Biomarkers of disease diagnosis | Untargeted | LC-MS | Plasma |
| Li 2019 (107) | Cross-sectional study | Colorectal | China | 120/120/120 | 18+ | NR | NR | Biomarkers of disease diagnosis | Untargeted | LC-MS | Serum |
| Li 2021 (108) | Case–control study | Colorectal | China | 144/110/74 | 18+ | NR | NR | Biomarkers of disease diagnosis | Untargeted | Other^g | Plasma |
| Liang 2016 (109) | Case–control study | Liver and intrahepatic bile duct, other^h | China | 176^i/0/85 | 18+ | NR | NR | Disease differentiation: cases vs. controls | Untargeted | LC-MS | Serum |
| Liang 2017 (49) | Case–control study | Prostate | China | 236/0/233 | 18+ | NR | NR | Cancer risk estimation using prevalent cases | Untargeted | LC-MS | Urine |
| Loftfield 2020 (110) | Nested case–control study | Liver and intrahepatic bile duct | Finland | 221/242/463 | 18+ | NR | NR | Biomarker of exposure and cancer risk estimation using incident cases | Untargeted | LC-MS | Serum |
| Louis 2016 (111) | Case–control study | Lung and bronchus | Belgium | 233/0/226 | 18+ | NR | NR | Cancer risk estimation using prevalent cases | Semi-targeted | NMR | Plasma |
| Mamtimin 2011 (112) | Case–control study | Breast, cervix uteri, colorectal, esophagus, liver and intrahepatic bile duct, lung and bronchus, stomach, other^j | China | 170^k/0/50 | 18+ | NR | NR | Disease differentiation: cases vs. controls | Untargeted | NMR | Plasma |
| Mathé 2014 (30) | Case–control study | Lung and bronchus | United States | 469/0/536 | 18+ | Black or African American, White | NR | Biomarkers of disease diagnosis and survival | Untargeted | LC-MS | Urine |
| McCullough 2021 (113) | Nested case–control study | Colorectal | United States | 517/0/517 | 18+ | Black or African American, White, Other^l | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Plasma |
| Meller 2016 (41) | Case-only study | Prostate | Germany | 106/0/0 | 18+ | NR | NR | Association study: prognosis/recurrence | Untargeted | GC-MS, LC-MS | Tissue |
| Men 2020 (45) | Case–control study | Breast | China | 106/0/38 | 18+ | NR | NR | Disease differentiation: cases vs. controls | Untargeted | NMR | Urine |
| Mondul 2015 (25) | Nested case–control study | Prostate | Finland | 200/0/200 | 18+ | White^m | NR | Cancer risk estimation using incident cases | Semi-targeted | GC-MS, LC-MS | Serum |
| Moore 2018 (33) | Nested case–control study | Breast | United States | 621/0/621 | 18+ | White, Other^n | Not Hispanic or Latino | Biomarker of exposure and cancer risk estimation using incident cases | Semi-targeted | LC-MS | Serum |
| Moore 2021 (34) | Nested case–control study | Breast | United States | 782/0/782 | 18+ | White, Other^l | Not Hispanic or Latino | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Serum |
| Ose 2021 (114) | Case-only study | Colorectal | Austria, Germany, Netherlands | 440/0/0 | 18+ | NR | NR | Risk of recurrence or death among cancer survivors | Semi-targeted | LC-MS | Plasma |
| Petrick 2019 (23) | Case–control study | Leukemia | United States | 332/0/324 | 0–18 | NR | Hispanic or Latino, Not Hispanic or Latino | Disease differentiation: cases vs. controls | Untargeted | LC-MS | Other^o |
| Piyarathna 2018 (35) | Case-only study | Bladder | United States, Germany | 165/0/0 | 18+ | NR | NR | Cancer progression/natural history | Untargeted | LC-MS | Tissue |
| Röhnisch 2020 (115) | Nested case–control study | Prostate | Sweden | 777/0/777 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS, NMR | Plasma |
| Ros-Mazurczyk 2017 (116) | Case–control study | Lung and bronchus | Poland | 100/0/300 | 18+ | NR | NR | Cancer risk estimation using prevalent cases | Untargeted | LC-MS | Serum |
| Schmidt 2017 (117) | Nested case–control study | Prostate | United Kingdom, Germany, Greece, Italy, Netherlands, Spain | 1,077/0/1,077 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Plasma |
| Schmidt 2020 (118) | Nested case–control study | Prostate | United Kingdom, Germany, Greece, Italy, Netherlands, Spain | 3,057/0/3,057 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Plasma |
| Seow 2019 (26) | Nested case–control study | Lung and bronchus | China | 275/0/289 | 18+ | Asian^p | NR | Cancer risk estimation using incident cases | Untargeted | LC-MS, NMR | Urine |
| Shu 2018 (119) | Nested case–control study | Pancreas | China | 226/0/226 | 18+ | NR | NR | Cancer risk estimation using incident cases | Untargeted | GC-MS, LC-MS | Plasma |
| Shu 2018 (27) | Nested case–control study | Colorectal | China | 245/0/245 | 18+ | Asian^p | NR | Cancer risk estimation using incident cases | Untargeted | GC-MS, LC-MS | Plasma |
| Stepien 2021 (120) | Nested case–control study | Liver and intrahepatic bile duct | United Kingdom, Denmark, France, Germany, Greece, Italy, Netherlands, Norway, Spain, Sweden | 129/0/129 | 18+ | NR | NR | Cancer risk estimation using incident cases | Untargeted | LC-MS | Serum |
| Stolzenberg-Solomon 2020 (121) | Nested case–control study | Pancreas | United States, Finland | 479/0/479 | 18+ | Asian, Black or African American, White, Other^a | NR | Cancer risk estimation using incident cases | Semi-targeted | GC-MS, LC-MS | Serum |
| Su 2019 (122) | Nested case–control study | Liver and intrahepatic bile duct | China | 134/0/134 | 18+ | NR | NR | Biomarkers of disease diagnosis | Semi-targeted | LC-MS | Serum |
| Sun 2019 (37) | Case-only study | Esophagus | China | 256/0/0 | NR | NR | NR | Disease differentiation: tumor vs. non-tumor tissue | Untargeted | Other^q | Tissue |
| Vanhove 2018 (123) | Case–control study | Lung and bronchus | Belgium | 269/108/347 | 18+ | NR | NR | Biomarkers of disease diagnosis | Semi-targeted | NMR | Plasma |
| Wang 2016 (43) | Case-only study | Stomach | China | 125/0/54 | 18+ | NR | NR | Biomarkers of disease diagnosis | Untargeted | NMR | Tissue |
| Wang 2021 (124) | Case–cohort study | Prostate | United States | 241/0/347 | 18+ | White, Other^r | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Plasma |
| Wei 2021 (125) | Cross-sectional study | Esophagus | China | 207/0/257 | 18+ | NR | NR | Biomarker of exposure and cancer risk estimation using prevalent cases | Untargeted | LC-MS | Serum |
| Wikoff 2015 (126) | Nested case–control study | Lung and bronchus | United States | 100/0/199 | 18+ | Black or African American, White, Other^a | NR | Cancer risk estimation using incident cases | Untargeted | LC-MS | Serum |
| Wilson 2013 (24) | Case-only study | Brain and other nervous system | United Kingdom | 115/0/0 | 0–18 | NR | NR | Risk of recurrence or death among cancer survivors | Semi-targeted | Other^s | Tissue |
| Yamakawa 2017 (44) | Cohort study | Stomach | Japan | 103/0/0 | 18+ | NR | NR | Risk of recurrence or death among cancer survivors | Untargeted | CE-MS | Tissue |
| Yi 2014 (28) | Cross-sectional study | Oral cavity and pharynx | China | 100/0/100 | 18+ | Asian^t | NR | Disease differentiation: cases vs. controls | Untargeted | GC-MS | Serum |
| Zeleznik 2020 (127) | Nested case–control study | Ovary | United States | 252/0/252 | 18+ | NR | NR | Cancer risk estimation using incident cases | Semi-targeted | LC-MS | Plasma |
| Zhang 2020 (128) | Cross-sectional study | Colorectal | China | 539/73/0 | 18+ | NR | NR | Biomarkers of disease diagnosis | Untargeted | LC-MS | Plasma |
| Zhao 2019 (31) | Case–control study | Breast | United States | 134/0/57 | 18+ | Black or African American | Hispanic or Latino, Not Hispanic or Latino | Cancer risk estimation using prevalent cases | Semi-targeted | GC-MS, LC-MS | Plasma |

Abbreviation: NR, not reported.

a, Other race groups not specified.

b, Meta-analysis of multiple nested case–control studies.

c, Ferric particle-assisted laser desorption/ionization mass spectrometry.

d, Colorectal cancer = 423, endometrial cancer = 635.

e, Breast cancer = 362, colorectal cancer = 163, prostate cancer = 310.

f, Flow injection analysis-tandem mass spectrometry.

g, Metal-organic framework/platform hybrid-assisted laser desorption/ionization mass spectrometry.

h, Extrahepatic cholangiocarcinoma.

i, Extrahepatic cholangiocarcinoma = 90, intrahepatic cholangiocarcinoma = 86.

j, Leucoma and other not specified.

k, Female breast cancer = 13, cervix cancer = 17, esophageal cancer = 26, hepatoma = 8, leucoma = 10, lung = 26, rectal = 7, stomach = 23, other not specified = 40.

l, Other race groups not specified and unknown.

m, Finnish.

n, Other race groups not specified and missing.

o, Neonatal blood spots.

p, Chinese.

q, Airflow-assisted desorption electrospray ionization mass spectrometry imaging.

r, Non-White unspecified.

s, ¹H magnetic resonance spectroscopy.

t, Han Chinese.

In general, the number of cancer epidemiology studies employing metabolomics has increased over time from 2011 through June 2021 (Supplementary Fig. S1).

Figure 2 shows the breakdown of study design types used in metabolomic epidemiology studies of cancer. The most common design was the nested case–control study (n = 37, 48.1%), followed by case–control studies (n = 19, 24.7%) and case-only studies (n = 11, 14.3%).

Figure 2.

Pie chart displays the breakdown of study design types used in metabolomic epidemiology studies of cancer. *Other includes meta-analysis of multiple nested case–control studies.

The primary cancer types investigated were colorectal (n = 15, 19.5%), prostate (n = 15, 19.5%), and female breast (n = 15, 19.5%; Fig. 3). The studies included in this review examined 19 different cancer types.

Figure 3.

Bar graph depicts cancer types studied in the metabolomic epidemiology literature. *Other includes extrahepatic cholangiocarcinoma, leucoma, and other not specified.

Most studies investigated adult-onset cancer (n = 73, 94.8%; Table 1). Two studies (2.6%) included subjects under 18 years of age to study pediatric cancers, namely, acute lymphoblastic leukemia (23) and pediatric brain tumors (24). Two studies (2.6%) did not report participant age information; however, these studies focused on adult-onset cancers (pancreas and esophagus).

The geographic distribution of studies is displayed in Supplementary Fig. S2. Most studies were conducted in European countries, including Germany (n = 14, 18.2%), France (n = 10, 13.0%), the Netherlands (n = 10, 13.0%), and the United Kingdom (n = 10, 13.0%). Studies were also conducted in the United States (n = 19, 24.7%) and China (n = 19, 24.7%). One study (1.3%) was missing study setting data.

Only 27.3% (n = 21) of studies reported on race (Table 1; Supplementary Fig. S3). In studies reporting race, 17 reported White participants, 11 reported other race groups (i.e., other race groups not specified, unknown, missing, and non-White unspecified), 10 reported Black or African American participants, 6 reported Asian participants, and 1 reported American Indian or Alaska Native participants (Supplementary Fig. S3). Of the 17 studies reporting White participants, only the study by Mondul and colleagues (25) specified that participants were Finnish. Of the six studies reporting Asian participants, three studies specified that participants were Chinese (26–28). Yi and colleagues (28) provided the distinction that all participants were Han Chinese. Three studies compared race or ethnicity groups (29–31). Mathé and colleagues (30) investigated urine metabolite predictors of lung cancer among Black and White participants. Stratified analyses by self-reported race revealed cortisol sulfate to have the strongest association with survival in Black participants. Huang and colleagues (29) performed serum metabolomics in a nested case–control design to examine prostate cancer risk in a sample population that included non-Hispanic White men, Black men, and men of other races. Their analysis by race did not reveal substantial differences in serum metabolites of overall prostate cancer risk between race groups, although sample sizes for non-White men were small. Zhao and colleagues (31) looked at plasma metabolites and breast cancer risk in Hispanic and non-Hispanic Black women. The authors did not find differences in metabolite profiles between Black and Hispanic women among the controls but acknowledged that they did not have adequate power to detect a difference. Five additional studies reported Hispanic ethnicity of their participants but did not make comparisons between ethnic groups (23, 29, 32–34).

All studies included in this review had a minimum of 100 cancer cases in each analysis stratification of the main analysis. Most studies (n = 54, 70.1%) included fewer than 300 cancer cases and 6.5% (n = 5) included over 1,000 cancer cases (Supplementary Fig. S4).

Cancer types that included on average greater than 300 cases across all studies were prostate (n = 706; 15 studies), breast (n = 451; 14 studies), uterus (n = 443; 2 studies), and leukemia (n = 332; 1 study; Supplementary Fig. S5).
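As an illustrative check of how such an average can be obtained, the following minimal Python sketch takes the prostate cancer case counts listed in Table 1 and computes their simple mean. It assumes that the reported figure is an unweighted mean across studies and that the multi-cancer study by Kühn 2016 contributes only its prostate cases (310; table footnote e). The variable and function names are ours and are not part of the original analysis.

```python
from statistics import mean

# Prostate cancer case counts per study, transcribed from Table 1.
# Kühn 2016 contributes only its prostate cases (310; table footnote e).
prostate_cases = {
    "Adams 2019": 2291,
    "Bruzzone 2020": 453,
    "Dickerman 2020": 212,
    "Huang 2016": 380,
    "Huang 2019": 523,
    "Kühn 2016": 310,
    "Lécuyer 2021": 171,
    "Li 2016": 560,
    "Liang 2017": 236,
    "Meller 2016": 106,
    "Mondul 2015": 200,
    "Röhnisch 2020": 777,
    "Schmidt 2017": 1077,
    "Schmidt 2020": 3057,
    "Wang 2021": 241,
}


def average_cases(case_counts):
    """Return the number of studies and the unweighted mean number of cases."""
    return len(case_counts), mean(case_counts.values())


n_studies, avg = average_cases(prostate_cases)
print(f"{n_studies} studies, mean cases = {avg:.0f}")
```

Under these assumptions the sketch gives 15 studies with a mean of about 706 cases, in line with the prostate value reported above.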

Figure 4 displays the metabolomics-specific aims pursued by eligible studies. The majority of studies used metabolomics to estimate cancer risk (n = 46, 59.8%). Of these studies, 30 estimated cancer risk using incident cases, 6 estimated cancer risk using prevalent cases, 9 examined biomarkers of exposure and cancer risk using incident cases, and 1 examined biomarkers of exposure and cancer risk using prevalent cases. Other studies used metabolomics primarily to identify biomarkers of disease diagnosis (n = 10, 13.0%) and to differentiate between cases and controls (n = 8, 10.4%).

Figure 4.

Pie chart shows the breakdown of metabolomics-specific aims of included metabolomic epidemiology studies of cancer. (a) Cancer risk estimation using incident cases; (b) cancer risk estimation using prevalent cases; (c) biomarker of exposure and cancer risk estimation using incident cases; (d) biomarker of exposure and cancer risk estimation using prevalent cases; (e) biomarkers of disease diagnosis; (f) biomarkers of survival; (g) biomarkers of disease diagnosis and survival; (h) disease differentiation: cases vs. controls; (i) disease differentiation: tumor vs. non-tumor tissue; (j) risk of recurrence or death among cancer survivors; (k) cancer progression/natural history; (l) association study: prognosis/recurrence; (m) descriptive study: progression/survival.

Serum (n = 31, 40.3%) and plasma (n = 30, 39.0%) were the most used biospecimens (Table 1). Eleven studies (14.3%) performed metabolomics on tissue to study cancers of the bladder (35), brain and other nervous system (24), breast (36), esophagus (37), kidney and renal pelvis (38), liver and intrahepatic bile duct (39), pancreas (40), prostate (41), and stomach (refs. 42–44; Table 1). Seven studies (9.1%) used urine samples to study breast (45), colorectal (46), esophageal (47), lung and bronchus (26, 30), and prostate cancers (refs. 48, 49; Table 1). Petrick and colleagues (23) used neonatal blood spots to study acute lymphoblastic leukemia.

The proportions of untargeted and semi-targeted studies were similar, 55.8% (n = 43) and 44.2% (n = 34), respectively (Table 1). Supplementary Table S1 displays the metabolite super-pathways assessed in the 34 semi-targeted studies. All studies examined metabolites belonging to the lipid super-pathway. Following lipids, the second most covered super-pathway of metabolites was amino acids.

The most frequently used analytic platforms were liquid chromatography–tandem mass spectrometry (LC-MS; n = 49, 63.6%), NMR spectroscopy (n = 19, 24.7%), and gas chromatography–tandem mass spectrometry (GC-MS; n = 18, 23.4%; Table 1).

Synthesis of results

The number of cancer epidemiology studies using metabolomics assays has increased over time from 2011 through June 2021. Colorectal, prostate, and female breast cancers were the most well-studied cancers. Studies were geographically diverse, but few studies reported on race or ethnicity, and those that did reported a majority of White participants. Most studies included fewer than 300 cancer cases in their main analysis. Most studies enrolled adult participants, used a nested case–control design to estimate associations between metabolites and cancer risk, and used an untargeted or semi-targeted LC-MS approach to measure metabolites in either serum or plasma. Most studies measured lipids and amino acids.

Summary of evidence

We identified 77 population-based cancer metabolomics research studies, published between January 1998 and June 2021, that investigated 19 different cancer types. Although studies were conducted across diverse geographic settings, our findings indicate a need for standardized reporting of race and ethnicity of study participants, as well as more racially/ethnically diverse study populations. Our findings also indicate trends in metabolomic epidemiology for prospective designs to assess cancer risk, blood-based biospecimens, mass spectrometry–based platforms, and untargeted metabolomics. Studies that used a semi-targeted strategy largely covered lipid and amino acid super-pathways. Furthermore, sample sizes of included metabolomic epidemiology studies of cancer emphasize the need for larger studies in the field.

Most epidemiologic studies have used metabolomics to gain insight into cancer etiology. Accordingly, they used a prospective case–control design nested in large cohorts that collected blood or other biospecimens at study baseline.

Colorectal, prostate, and female breast cancers were the most studied cancers and are also among the most common cancers worldwide (50), which enables researchers to more easily recruit or ascertain cases in epidemiologic studies. Etiologic studies of rare cancers are more challenging because low incidence rates limit case recruitment, and prospective studies require very large sample sizes and extended follow-up to ascertain sufficient case numbers for analysis. Although rare cancers individually have low incidence rates, they collectively account for 20% to 25% of all oncology patients (51). Patients with rare cancers often have relatively worse outcomes for reasons including delayed diagnoses, less effective standard treatments, reduced access to clinical expertise, and gaps in funding for preclinical and clinical research (51). Metabolomics studies investigating rare cancers may improve their diagnosis, prognosis, and treatment.

Most studies performed metabolomics on serum or plasma as many mature cohort studies collected baseline blood. However, a broader range of biospecimen types (e.g., stool, tumor tissue) collected in ongoing and future cohorts may further increase discovery potential in metabolomic epidemiology studies.

Mass spectrometry–based analytic platforms, specifically LC-MS, were most often used by metabolomic epidemiology studies investigating cancer. Mass spectrometry–based metabolomics has historically been used in cancer research for early detection and screening applications (2). Within the subset of studies that used LC-MS in a semi-targeted approach (n = 26), the majority conducted their metabolomic profiling through Metabolon, Inc. and Biocrates Life Sciences (n = 22; Supplementary Fig. S6). Collaborations with fee-for-service industry partners provide an opportunity for epidemiologists and researchers without their own metabolomics laboratories to pursue metabolomics experiments (52). These platforms are most likely capturing the metabolites that align with the hypotheses of these studies. Studies that performed semi-targeted metabolomics primarily covered lipid and amino acid metabolism. The wide coverage of lipids is likely because lipids are chemically well studied. Furthermore, alterations in lipid and amino acid metabolism have been identified as hallmarks of cancer (2). Lipids assume functions in a variety of cancer processes, such as angiogenesis, cell growth, proliferation, survival, migration, invasion, and metastasis (53–55). Amino acid metabolism supports cancer cell growth and proliferation in several ways, including providing materials for macromolecule (e.g., protein, lipid, and nucleic acid) synthesis and supplying alternative energy sources through anaplerotic reactions that feed the citric acid cycle in times of glucose deficiency (2, 56, 57). Considering that lipid and amino acid metabolism are integral to cancer metabolism, studying the metabolites that fall under these pathways may provide valuable insights that lead to improvements in cancer diagnosis, prognosis, and therapy development.

Two key gaps in the population-based cancer metabolomics literature were identified in this scoping review: (i) the need for standardized race and ethnicity reporting and (ii) the need for larger studies.

In this review, we found that a minority of studies reported on study participant race and ethnicity. Across the studies that did report these data, few reported subgroup information, and there was variation in where in the manuscript (e.g., methods, results, supplementary data) race and ethnicity information was reported. These findings demonstrate a need in the field to outline reporting standards for participant race and ethnicity information. Lin and Kelsey (58) explain that race and ethnicity data have potential utility in epidemiologic research to elucidate disease etiology, reveal the roles and interactions of genes and environment in disease, identify subgroups experiencing unequal care, assess population-specific conceptualizations of disease factors for developing tailored interventions, and study within-group biological variations. Clear reporting of participant race, ethnicity, and other sociodemographic information can aid researchers in determining study generalizability and uncovering health disparities and inequities (59), which is important when considering cancers with known racial and ethnic disparities, such as higher incidence of triple-negative breast cancer in Black women and liver cancer in Asian individuals. Although we did observe more consistent reporting of study setting, explicit reporting of race and ethnicity is needed to prevent readers from inaccurately treating geographic setting as a proxy for race and ethnicity. The international medical and scientific publishing community has outlined recommendations for race and ethnicity reporting in the scientific literature (59–62). Initiatives from within the metabolomic epidemiology community could draft and adopt guidance specific to the field.

Additionally, diverse representation is needed in metabolomic epidemiology studies. Lack of diversity has been well documented in genomics, and the Eurocentric focus in that field has negative consequences, including lack of generalizability as well as poor replicability and accuracy of results in other populations (63, 64). Metabolomics researchers can learn from genomics and prioritize diverse study representation in order to reflect diversity more accurately within and across populations. In mature cancer cohorts, case numbers for rare cancers and some minority populations can be limited. As we saw in the present review, most studies examined sample sizes of fewer than 300 cancer cases.

Sample size affects statistical power, that is, the ability to detect a statistically significant association (65). Depending on the variability in metabolite levels that arises from between-subject, within-subject, and technical sources, a larger sample size, with a sufficient number of cases, may be required (66, 67). High between-subject variability is desirable (i.e., large effect size), whereas high within-subject and technical variability decrease study power (66, 67). Sampson and colleagues (67) reported that to detect associations between metabolites and disease in a case–control design, population-based metabolomics studies require large sample sizes of 1,000 or more subjects, assuming 1:1 matching and a single measurement; however, they also demonstrated that incorporating serial measurements improved statistical power by reducing within-subject variability. Although some epidemiologic studies, particularly cohort studies, have collected and stored serial blood samples, few metabolomics analyses nested in these larger studies have incorporated them. In fact, only two studies reviewed included serial samples (Supplementary Table S2). Power calculations performed by Nicholson and colleagues (68) estimate that sample sizes of a few thousand are sufficient for identifying disease-predictive metabolite levels. To detect the likely moderate to weak associations between an individual metabolite and cancer risk, large sample sizes are needed (67). The issue of multiple comparisons further exacerbates sample size requirements: because epidemiologic studies using an untargeted or semi-targeted metabolomics approach analyze hundreds of metabolites or thousands of metabolite features in relation to a disease outcome, multiple testing correction is needed to limit false positives (11). Although focusing on metabolites within a single biological pathway could preserve study power, the tradeoff would be the omission of valuable data (11).
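The interplay of sample size, measurement variability, serial sampling, and multiple testing described above can be illustrated with a rough calculation. The Python sketch below is a normal-approximation illustration only and is not the method used by Sampson and colleagues or Nicholson and colleagues. It assumes a two-sided z-test on a standardized metabolite with 1:1 cases and controls, a Bonferroni correction, attenuation of the observed effect by the square root of measurement reliability (the between-subject share of total variance), and the Spearman-Brown formula for the reliability of the mean of serial measurements. The effect size (0.2 SD), reliability (0.5), and number of metabolites (1,000) are hypothetical values chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def reliability_of_mean(icc, k):
    """Spearman-Brown: reliability of the mean of k serial measurements."""
    return k * icc / (1 + (k - 1) * icc)


def approx_power(d_true, n_cases, n_tests=1, alpha=0.05, icc=1.0, k=1):
    """Approximate power of a two-sided z-test with 1:1 cases and controls.

    d_true  : hypothetical case-control difference, in SD units of the
              true (long-term) metabolite level
    n_tests : number of metabolites tested (Bonferroni correction)
    icc     : share of measured variance that is between-subject
    k       : number of serial measurements averaged per participant
    """
    # Measurement error shrinks the observable standardized difference.
    d_obs = d_true * sqrt(reliability_of_mean(icc, k))
    # Noncentrality for two groups of equal size n_cases.
    ncp = d_obs * sqrt(n_cases / 2)
    # Bonferroni-adjusted two-sided critical value.
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / n_tests / 2)
    return _STD_NORMAL.cdf(ncp - z_crit) + _STD_NORMAL.cdf(-ncp - z_crit)


for n in (300, 1000, 3000):
    for k in (1, 2):
        p = approx_power(0.2, n, n_tests=1000, icc=0.5, k=k)
        print(f"cases={n:5d}  serial samples={k}  power={p:.2f}")
```

Under these assumed inputs, power rises with the number of cases and with a second serial sample, which is the qualitative pattern the cited power analyses describe; the specific numbers depend entirely on the hypothetical inputs.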

The upstream solution of establishing larger, more diverse cohorts is important. The NIH's All of Us Research Program and the National Cancer Institute's Connect for Cancer Prevention Study are examples of newly established prospective cohorts that are focused on recruiting large sample sizes (1,000,000+ and 200,000 participants, respectively) and participants from diverse backgrounds and geographic locations that have been historically underrepresented in cancer research (69–71). Although this effort is necessary, such large cohorts cost tens of millions of dollars to set up and take decades to mature (72, 73). A potentially more tractable recommendation, then, is to design pooling and consortia efforts. Collaborative consortia offer the opportunity for researchers to achieve larger sample sizes and increased statistical power through pooling of metabolomics data across studies (74). The COnsortium of METabolomics Studies (COMETS) is an example of such a consortium where researchers are currently developing methods to harmonize existing data and assaying new samples across multiple cohorts (6, 8, 75, 76). COMETS comprises prospective cohorts with blood metabolomics data, acquired using NMR or MS, on 100 or more participants who are followed longitudinally for disease outcomes (6, 8). With over 70 cohorts from Asia, Europe, North America, and South America to date, COMETS can support large population-based studies with increasing geographic and demographic diversity (6, 8).

The first successful pooling project based on COMETS performed pooled analyses on data from more than 32,000 participants (from 16 and 17 population-based studies) from the United States, Europe, and Asia to examine associations of metabolites (i.e., circulating trimethylamine N-oxide and choline and its related metabolites) with cardiometabolic biomarkers and with dietary and nondietary factors (77, 78). For pooling studies that harmonize existing data across multiple cohorts, funding needs only to cover the analytic work, which minimizes cost. COMETS Analytics was developed to support consortia-based analyses through a standardized, federated approach for meta-analysis of metabolomics data (75, 76).
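To make the idea of pooling summary results across cohorts concrete, the following Python sketch shows a textbook inverse-variance fixed-effect meta-analysis, in which each cohort shares only an effect estimate and its standard error for a given metabolite. This is a generic illustration and is not a description of how COMETS Analytics is implemented; the function name and the per-cohort values are hypothetical.

```python
from math import sqrt


def pool_fixed_effect(estimates):
    """Inverse-variance fixed-effect pooling of (beta, standard_error) pairs.

    Each cohort is weighted by 1 / SE^2, so more precise cohorts
    contribute more to the pooled estimate.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se


# Hypothetical per-cohort log odds ratios and standard errors for one metabolite.
cohort_results = [(0.10, 0.10), (0.30, 0.10), (0.20, 0.05)]
pooled_beta, pooled_se = pool_fixed_effect(cohort_results)
print(f"pooled log OR = {pooled_beta:.3f}, SE = {pooled_se:.3f}")
```

The pooled standard error is smaller than that of any single cohort, which is the gain in precision that motivates consortium-based analyses. A random-effects model would be an alternative when between-cohort heterogeneity is expected.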

For pooling projects that assay new samples, funding must also account for common samples for quality control (QC). COMETS has developed a reference set of 30 samples (serum, EDTA plasma, or heparinized plasma) and three pooled QC samples to facilitate metabolite level comparisons and promote pooling analyses (6, 8). Incorporating serial samples, when available, also has the potential to reduce variability in metabolite measurements and improve statistical power.
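
One way to see why serial samples can help is the Spearman-Brown formula, which gives the reliability of the mean of several replicate measurements from the reliability of a single one. The sketch below assumes a classical measurement-error model in which each sample carries independent within-person error of equal variance; the single-sample intraclass correlation (ICC) of 0.5 is a hypothetical value, not one reported in this review.

```python
def reliability_of_mean(icc, n_samples):
    """Reliability of the mean of n_samples replicate measures (Spearman-Brown formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return n_samples * icc / (1.0 + (n_samples - 1) * icc)


# Hypothetical single-sample ICC for one metabolite (assumed value).
single_sample_icc = 0.5

for n in (1, 2, 3):
    print(n, round(reliability_of_mean(single_sample_icc, n), 3))
```

Under these assumptions, averaging three serial samples raises the reliability from 0.5 to 0.75, which reduces the attenuation of metabolite-disease associations caused by within-person variability.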

An additional downstream solution to address issues of sample size and statistical power would be to follow up promising leads from untargeted and semi-targeted metabolomics studies with targeted analyses for further validation of findings. Targeted analyses can also be performed on their own in cohort studies if the hypothesis warrants it. Targeted studies are hypothesis driven, do not contend with issues of multiple comparisons, and yield quantified measurements, which makes them less vulnerable to instrument variation.
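
The multiple-comparison burden mentioned above can be illustrated with a Bonferroni correction, used here only as a simple stand-in for whatever correction a given study applies. The metabolite counts (1,000 for an untargeted panel, 5 for a targeted panel) and the function name `bonferroni_threshold` are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical panel sizes (assumed values, not taken from the reviewed studies).
untargeted_threshold = bonferroni_threshold(0.05, 1000)
targeted_threshold = bonferroni_threshold(0.05, 5)

print(f"untargeted per-test threshold: {untargeted_threshold:.1e}")
print(f"targeted per-test threshold:   {targeted_threshold:.1e}")
```

In this sketch the per-test threshold for the small targeted panel is 200 times less stringent than for the untargeted panel, so a given sample size retains more power to confirm a true association.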

Limitations

This review has several limitations. Because we lacked translation capability, we included only studies published in English. We also used race and ethnicity categories outlined by a United States federal organization for our data extraction. Consequently, our results may not fully capture the racial and ethnic groups or the terminology used in other countries. Additionally, our results cover the published literature only up to June 2021, although they reflect trends in metabolomics and cancer epidemiology over more than two decades. Although grading study design and the appropriate use of statistical methods was beyond the scope of this review, we noted a general need for more standardized methods and reporting of results, which would increase not only transparency but also the ability to compare and reproduce findings.

Conclusions

This paper presents a comprehensive scoping review of the literature at the intersection of metabolomics, cancer, and epidemiology. The application of metabolomics in an epidemiologic context to study cancer is increasingly feasible and prevalent, with utility in risk estimation, biomarker identification, disease differentiation, and diagnosis and prognosis efforts. As the field continues to advance, researchers should focus on recruiting diverse participants, reporting race and ethnicity clearly, and conducting well-powered studies that incorporate more individuals or longitudinal metabolomics data.

No disclosures were reported.

This work was supported by the National Cancer Institute Division of Cancer Control and Population Sciences and Intramural Research Program.

The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.

Note: Supplementary data for this article are available at Cancer Epidemiology, Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).

1. Fiehn O. Metabolomics: the link between genotypes and phenotypes. Plant Mol Biol 2002;48:155–71.
2. Liesenfeld DB, Habermann N, Owen RW, Scalbert A, Ulrich CM. Review of mass spectrometry-based metabolomics in cancer research. Cancer Epidemiol Biomarkers Prev 2013;22:2182–201.
3. Oliver SG, Winson MK, Kell DB, Baganz F. Systematic functional analysis of the yeast genome. Trends Biotechnol 1998;16:373–8.
4. Dunn WB, Broadhurst DI, Atherton HJ, Goodacre R, Griffin JL. Systems level studies of mammalian metabolomes: the roles of mass spectrometry and nuclear magnetic resonance spectroscopy. Chem Soc Rev 2011;40:387–426.
5. Liu X, Locasale JW. Metabolomics: a primer. Trends Biochem Sci 2017;42:274–84.
6. Zanetti KA. Building infrastructure at the National Cancer Institute to support metabolomic analyses in epidemiological studies. Metabolomics 2021;17:46.
7. National Institutes of Health Common Fund. Metabolomics: public health relevance. Available from: <https://commonfund.nih.gov/Metabolomics/PublicHealthRelevance>.
8. Yu B, Zanetti KA, Temprosa M, Albanes D, Appel N, Barrera CB, et al. The consortium of metabolomics studies (COMETS): metabolomics in 47 prospective cohort studies. Am J Epidemiol 2019;188:991–1012.
9. National Cancer Institute. March 10. COnsortium of METabolomics Studies (COMETS). Available from: <https://epi.grants.cancer.gov/comets/>.
10. Lasky-Su J, Kelly RS, Wheelock CE, Broadhurst D. A strategy for advancing for population-based scientific discovery using the metabolome: the establishment of the metabolomics society metabolomic epidemiology task group. Metabolomics 2021;17:45.
11. Haznadar M, Maruvada P, Mette E, Milner J, Moore SC, Nicastro HL, et al. Navigating the road ahead: addressing challenges for use of metabolomics in epidemiology studies. Metabolomics 2014;10:176–8.
12. van Roekel EH, Loftfield E, Kelly RS, Zeleznik OA, Zanetti KA. Metabolomics in epidemiologic research: challenges and opportunities for early-career epidemiologists. Metabolomics 2019;15:9.
13. Zanetti KA, Mette E, Maruvada P, Milner J, Moore SC, Nicastro HL, et al. The future of metabolomic profiling in population-based research. J Anal Bioanal Tech 2014;5:1–4.
14. Tzoulaki I, Ebbels TM, Valdes A, Elliott P, Ioannidis JP. Design and analysis of metabolomics studies in epidemiologic research: a primer on -omic technologies. Am J Epidemiol 2014;180:129–39.
15. Stevens VL, Hoover E, Wang Y, Zanetti KA. Pre-analytical factors that affect metabolite stability in human urine, plasma, and serum: a review. Metabolites 2019;9:156.
16. Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA extension for scoping reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med 2018;169:467–73.
17. Kanehisa M. Post-genome informatics. Oxford, UK: Oxford University Press; 2000. 147 p.
18. Rothman K, Greenland S, Lash TL. Modern epidemiology. Philadelphia, PA: Lippincott Williams & Wilkins; 2008.
19. Dekkers OM, Egger M, Altman DG, Vandenbroucke JP. Distinguishing case series from cohort studies. Ann Intern Med 2012;156(1 Pt 1):37–40.
20. Office of Management and Budget. 1997 February 9. Revisions to the standards for the classification of federal data on race and ethnicity. Available from: <https://obamawhitehouse.archives.gov/omb/fedreg_1997standards>.
21. Germain A, Ruppert D, Levine SM, Hanson MR. Prospective biomarkers from plasma metabolomics of myalgic encephalomyelitis/chronic fatigue syndrome implicate redox imbalance in disease symptomatology. Metabolites 2018;8:90.
22. Metabolon. April 20. Available from: <https://www.metabolon.com/>.
23. Petrick LM, Schiffman C, Edmands WMB, Yano Y, Perttula K, Whitehead T, et al. Metabolomics of neonatal blood spots reveal distinct phenotypes of pediatric acute lymphoblastic leukemia and potential effects of early-life nutrition. Cancer Lett 2019;452:71–8.
24. Wilson M, Cummins CL, Macpherson L, Sun Y, Natarajan K, Grundy RG, et al. Magnetic resonance spectroscopy metabolite profiles predict survival in paediatric brain tumours. Eur J Cancer 2013;49:457–64.
25. Mondul AM, Moore SC, Weinstein SJ, Karoly ED, Sampson JN, Albanes D. Metabolomic analysis of prostate cancer risk in a prospective cohort: The alpha-tocopherol, beta-carotene cancer prevention (ATBC) study. Int J Cancer 2015;137:2124–32.
26. Seow WJ, Shu XO, Nicholson JK, Holmes E, Walker DI, Hu W, et al. Association of untargeted urinary metabolomics and lung cancer risk among never-smoking women in China. JAMA Netw Open 2019;2:e1911970.
27. Shu X, Xiang YB, Rothman N, Yu D, Li HL, Yang G, et al. Prospective study of blood metabolites associated with colorectal cancer risk. Int J Cancer 2018;143:527–34.
28. Yi LZ, Song CC, Hu ZY, Yang LF, Xiao LB, Yi B, et al. A metabolic discrimination model for nasopharyngeal carcinoma and its potential role in the therapeutic evaluation of radiotherapy. Metabolomics 2014;10:697–708.
29. Huang J, Mondul AM, Weinstein SJ, Koutros S, Derkach A, Karoly E, et al. Serum metabolomic profiling of prostate cancer risk in the prostate, lung, colorectal, and ovarian cancer screening trial. Br J Cancer 2016;115:1087–95.
30. Mathe EA, Patterson AD, Haznadar M, Manna SK, Krausz KW, Bowman ED, et al. Noninvasive urinary metabolomic profiling identifies diagnostic and prognostic markers in lung cancer. Cancer Res 2014;74:3259–70.
31. Zhao H, Shen J, Moore SC, Ye YQ, Wu XF, Esteva FJ, et al. Breast cancer risk in relation to plasma metabolites among Hispanic and African American women. Breast Cancer Res Tr 2019;176:697-.
32. Huang S, Chong N, Lewis NE, Jia W, Xie G, Garmire LX. Novel personalized pathway-based metabolomics models reveal key metabolic pathways for breast cancer diagnosis. Genome Med 2016;8:34.
33. Moore SC, Playdon MC, Sampson JN, Hoover RN, Trabert B, Matthews CE, et al. A metabolomics analysis of body mass index and postmenopausal breast cancer risk. J Natl Cancer Inst 2018;110:588–97.
34. Moore SC, Mazzilli KM, Sampson JN, Matthews CE, Carter BD, Playdon MC, et al. A metabolomics analysis of postmenopausal breast cancer risk in the cancer prevention study II. Metabolites 2021;11:95.
35. Piyarathna DWB, Rajendiran TM, Putluri V, Vantaku V, Soni T, von Rundstedt FC, et al. Distinct lipidomic landscapes associated with clinical stages of urothelial cancer of the bladder. Eur Urol Focus 2018;4:907–15.
36. Budczies J, Denkert C, Muller BM, Brockmoller SF, Klauschen F, Gyorffy B, et al. Remodeling of central metabolism in invasive breast cancer compared to normal breast tissue: a GC-TOFMS based metabolomics study. BMC Genomics 2012;13:334.
37. Sun C, Li T, Song X, Huang L, Zang Q, Xu J, et al. Spatially resolved metabolomics to discover tumor-associated metabolic alterations. Proc Natl Acad Sci U S A 2019;116:52–7.
38. Hakimi AA, Reznik E, Lee CH, Creighton CJ, Brannon AR, Luna A, et al. An integrated metabolic atlas of clear cell renal cell carcinoma. Cancer Cell 2016;29:104–16.
39. Han J, Han ML, Xing H, Li ZL, Yuan DY, Wu H, et al. Tissue and serum metabolomic phenotyping for diagnosis and prognosis of hepatocellular carcinoma. Int J Cancer 2020;146:1741–53.
40. Battini S, Faitot F, Imperiale A, Cicek AE, Heimburger C, Averous G, et al. Metabolomics approaches in pancreatic adenocarcinoma: tumor metabolism profiling predicts clinical outcome of patients. BMC Med 2017;15:56.
41. Meller S, Meyer HA, Bethan B, Dietrich D, Maldonado SG, Lein M, et al. Integration of tissue metabolomics, transcriptomics and immunohistochemistry reveals ERG- and Gleason score-specific metabolomic alterations in prostate cancer. Oncotarget 2016;7:1421–38.
42. Kaji S, Irino T, Kusuhara M, Makuuchi R, Yamakawa Y, Tokunaga M, et al. Metabolomic profiling of gastric cancer tissues identified potential biomarkers for predicting peritoneal recurrence. Gastric Cancer 2020;23:874–83.
43. Wang H, Zhang H, Deng P, Liu C, Li D, Jie H, et al. Tissue metabolic profiling of human gastric cancer assessed by (1)H NMR. BMC Cancer 2016;16:371.
44. Yamakawa Y, Kusuhara M, Terashima M, Kinugasa Y, Sugino T, Abe M, et al. CD44 variant 9 expression as a predictor for gastric cancer recurrence: immunohistochemical and metabolomic analysis of surgically resected tissues. Biomedical Research-Tokyo 2017;38:41–52.
45. Men Y, Li L, Zhang F, Kong X, Zhang W, Hao C, et al. Evaluation of heavy metals and metabolites in the urine of patients with breast cancer. Oncol Lett 2020;19:1331–7.
46. Deng L, Ismond K, Liu Z, Constable J, Wang H, Alatise OI, et al. Urinary metabolomics to identify a unique biomarker panel for detecting colorectal cancer: a multicenter study. Cancer Epidemiol Biomarkers Prev 2019;28:1283–91.
47. Hasim A, Ma H, Mamtimin B, Abudula A, Niyaz M, Zhang L-w, et al. Revealing the metabonomic variation of EC using 1H-NMR spectroscopy and its association with the clinicopathological characteristics. Mol Biol Rep 2012;39:8955–64.
48. Bruzzone C, Loizaga-Iriarte A, Sanchez-Mosquera P, Gil-Redondo R, Astobiza I, Diercks T, et al. H-1 NMR-based urine metabolomics reveals signs of enhanced carbon and nitrogen recycling in prostate cancer. J Proteome Res 2020;19:2419–28.
49. Liang Q, Liu H, Xie LX, Li X, Zhang AH. High-throughput metabolomics enables biomarker discovery in prostate cancer. RSC Adv 2017;7:2587–93.
50. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021;71:209–49.
51. Komatsubara KM, Carvajal RD. The promise and challenges of rare cancer research. Lancet Oncol 2016;17:136–8.
52. Pinu FR, Goldansaz SA, Jaine J. Translational metabolomics: current challenges and future opportunities. Metabolites 2019;9:108.
53. Beger RD. A review of applications of metabolomics in cancer. Metabolites 2013;3:552–74.
54. Bian X, Liu R, Meng Y, Xing D, Xu D, Lu Z. Lipid metabolism and cancer. J Exp Med 2021;218:e20201606.
55. Fernandis AZ, Wenk MR. Lipid-based biomarkers for cancer. J Chromatogr B Analyt Technol Biomed Life Sci 2009;877:2830–5.
56. Lieu EL, Nguyen T, Rhyne S, Kim J. Amino acids in cancer. Exp Mol Med 2020;52:15–30.
57. Wei Z, Liu X, Cheng C, Yu W, Yi P. Metabolism of amino acids in cancer. Front Cell Dev Biol 2020;8:603837.
58. Lin SS, Kelsey JL. Use of race and ethnicity in epidemiologic research: concepts, methodological issues, and suggestions for research. Epidemiol Rev 2000;22:187–202.
59. Flanagin A, Frey T, Christiansen SL, AMA Manual of Style Committee. Updated guidance on the reporting of race and ethnicity in medical and science journals. JAMA 2021;326:621–7.
60. Bokor-Billmann T, Langan EA, Billmann F. The reporting of race and/or ethnicity in the medical literature: a retrospective bibliometric analysis confirmed room for improvement. J Clin Epidemiol 2020;119:1–6.
61. Frey T, Young R. Correct and preferred usage. In: Christiansen S, Iverson C, Flanagin A, editors. AMA manual of style: a guide for authors and editors. 11th ed. USA: Oxford University Press; 2020.
62. International Committee of Medical Journal Editors. Recommendations for the conduct, reporting, editing, and publication of scholarly work in medical journals. Available from: <https://www.icmje.org/recommendations/>.
63. Popejoy AB, Fullerton SM. Genomics is failing on diversity. Nature 2016;538:161–4.
64. Sirugo G, Williams SM, Tishkoff SA. The missing diversity in human genetic studies. Cell 2019;177:1080.
65. Hajian-Tilaki K. Sample size estimation in epidemiologic studies. Caspian J Intern Med 2011;2:289–98.
66. Playdon MC, Joshi AD, Tabung FK, Cheng S, Henglin M, Kim A, et al. Metabolomics analytics workflow for epidemiological research: perspectives from the consortium of metabolomics studies (COMETS). Metabolites 2019;9:145.
67. Sampson JN, Boca SM, Shu XO, Stolzenberg-Solomon RZ, Matthews CE, Hsing AW, et al. Metabolomics in epidemiology: sources of variability in metabolite measurements and implications. Cancer Epidemiol Biomarkers Prev 2013;22:631–40.
68. Nicholson G, Rantalainen M, Maher AD, Li JV, Malmodin D, Ahmadi KR, et al. Human metabolic profiles are stably controlled by genetic and environmental variation. Mol Syst Biol 2011;7:525.
69. National Institutes of Health. May 9. All of us research program overview. Available from: <https://allofus.nih.gov/about/program-overview>.
70. National Cancer Institute Division of Cancer Epidemiology and Genetics. May 9. Connect for cancer prevention study. Available from: <https://www.cancer.gov/connect-prevention-study/>.
71. All of Us Research Program Investigators, Denny JC, Rutter JL, Goldstein DB, Philippakis A, Smoller JW, et al. The "all of us" research program. N Engl J Med 2019;381:668–76.
72. Hense HW. When size matters. Int J Epidemiol 2011;40:5–7.
73. Manolio TA, Collins R. Enhancing the feasibility of large cohort studies. JAMA 2010;304:2290–1.
74. Viallon V, His M, Rinaldi S, Breeur M, Gicquiau A, Hemon B, et al. A new pipeline for the normalization and pooling of metabolomics data. Metabolites 2021;11:631.
75. Lacey JV Jr., Benbow JL. Invited commentary: standards, inputs, and outputs-strategies for improving data-sharing and consortia-based epidemiologic research. Am J Epidemiol 2022;191:159–62.
76. Temprosa M, Moore SC, Zanetti KA, Appel N, Ruggieri D, Mazzilli KM, et al. COMETS analytics: an online tool for analyzing and meta-analyzing metabolomics data in large research consortia. Am J Epidemiol 2022;191:147–58.
77. Yang JJ, Shu XO, Herrington DM, Moore SC, Meyer KA, Ose J, et al. Circulating trimethylamine N-oxide in association with diet and cardiometabolic biomarkers: an international pooled analysis. Am J Clin Nutr 2021;113:1145–56.
78. Pan XF, Yang JJ, Shu XO, Moore SC, Palmer ND, Guasch-Ferre M, et al. Associations of circulating choline and its related metabolites with cardiometabolic biomarkers: an international pooled analysis. Am J Clin Nutr 2021;114:893–906.
79. Adams CD, Richmond R, Ferreira DLS, Spiller W, Tan V, Zheng J, et al. Circulating metabolic biomarkers of screen-detected prostate cancer in the ProtecT study. Cancer Epidemiol Biomarkers Prev 2019;28:208–16.
80. Assi N, Fages A, Vineis P, Chadeau-Hyam M, Stepien M, Duarte-Salles T, et al. A statistical framework to model the meeting-in-the-middle principle using metabolomic data: application to hepatocellular carcinoma in the EPIC study. Mutagenesis 2015;30:743–53.
81. Assi N, Gunter MJ, Thomas DC, Leitzmann M, Stepien M, Chajes V, et al. Metabolic signature of healthy lifestyle and its relation with risk of hepatocellular carcinoma in a large European cohort. Am J Clin Nutr 2018;108:117–26.
82. Assi N, Thomas DC, Leitzmann M, Stepien M, Chajes V, Philip T, et al. Are metabolic signatures mediating the relationship between lifestyle factors and hepatocellular carcinoma risk? Results from a nested case-control study in EPIC. Cancer Epidemiol Biomarkers Prev 2018;27:531–40.
83. Bjorkblom B, Wibom C, Jonsson P, Moren L, Andersson U, Johannesen TB, et al. Metabolomic screening of pre-diagnostic serum samples identifies association between alpha- and gamma-tocopherols and glioblastoma risk. Oncotarget 2016;7:37043–53.
84. Bro R, Kamstrup-Nielsen MH, Engelsen SB, Savorani F, Rasmussen MA, Hansen L, et al. Forecasting individual breast cancer risk using plasma metabolomics and biocontours. Metabolomics 2015;11:1376–80.
85. Cross AJ, Boca S, Freedman ND, Caporaso NE, Huang WY, Sinha R, et al. Metabolites of tobacco smoking and colorectal cancer risk. Carcinogenesis 2014;35:1516–22.
86. Cross AJ, Moore SC, Boca S, Huang W-Y, Xiong X, Stolzenberg-Solomon R, et al. A prospective study of serum metabolites and colorectal cancer risk. Cancer 2014;120:3049–57.
87. Dickerman BA, Ebot EM, Healy BC, Wilson KM, Eliassen AH, Ascherio A, et al. A metabolomics analysis of adiposity and advanced prostate cancer risk in the health professionals follow-up study. Metabolites 2020;10:99.
88. Farshidfar F, Weljie AM, Kopciuk KA, Hilsden R, McGregor SE, Buie WD, et al. A validated metabolomic signature for colorectal cancer: exploration of the clinical value of metabolomics. Br J Cancer 2016;115:848–57.
89. Fest J, Vijfhuizen LS, Goeman JJ, Veth O, Joensuu A, Perola M, et al. Search for early pancreatic cancer blood biomarkers in five European prospective population biobanks using metabolomics. Endocrinology 2019;160:1731–42.
90. Gaudet MM, Falk RT, Stevens RD, Gunter MJ, Bain JR, Pfeiffer RM, et al. Analysis of serum metabolic profiles in women with endometrial cancer and controls in a population-based case-control study. J Clin Endocrinol Metab 2012;97:3216–23.
91. Geijsen AJMR, Brezina S, Keski-Rahkonen P, Baierl A, Bachleitner-Hofmann T, Bergmann MM, et al. Plasma metabolites associated with colorectal cancer: a discovery-replication strategy. Int J Cancer 2019;145:1221–31.
92. Guertin KA, Loftfield E, Boca SM, Sampson JN, Moore SC, Xiao Q, et al. Serum biomarkers of habitual coffee consumption may provide insight into the mechanism underlying the association between coffee consumption and colorectal cancer. Am J Clin Nutr 2015;101:1000–11.
93. Hadi NI, Jamal Q, Iqbal A, Shaikh F, Somroo S, Musharraf SG. Serum metabolomic profiles for breast cancer diagnosis, grading and staging by gas chromatography-mass spectrometry. Sci Rep 2017;7:1715.
94. Hao D, Sengupta A, Ding KY, Ubeydullah ER, Krishnaiah S, Leighl NB, et al. Metabolites as prognostic markers for metastatic Non-Small Cell Lung Cancer (NSCLC) patients treated with first-line platinum-doublet chemotherapy. Cancers 2020;12:1926.
95. His M, Viallon V, Dossus L, Gicquiau A, Achaintre D, Scalbert A, et al. Prospective analysis of circulating metabolites and breast cancer in EPIC. BMC Med 2019;17:178.
96. Huang J, Mondul AM, Weinstein SJ, Derkach A, Moore SC, Sampson JN, et al. Prospective serum metabolomic profiling of lethal prostate cancer. Int J Cancer 2019;145:3231–43.
97. Huang L, Wang L, Hu X, Chen S, Tao Y, Su H, et al. Machine learning of serum metabolic patterns encodes early-stage lung adenocarcinoma. Nat Commun 2020;11:3556.
98. Huang BY, Tsai MR, Hsu JK, Lin CY, Lin CL, Hu JT, et al. Longitudinal change of metabolite profile and its relation to multiple risk factors for the risk of developing hepatitis B-related hepatocellular carcinoma. Mol Carcinogen 2020;59:1269–79.
99. Jobard E, Dossus L, Baglietto L, Fornili M, Lecuyer L, Mancini FR, et al. Investigation of circulating metabolites associated with breast cancer risk by untargeted metabolomics: a case-control study nested within the French E3N cohort. Br J Cancer 2021;124:1734–43.
100. Kliemann N, Viallon V, Murphy N, Beeken RJ, Rothwell JA, Rinaldi S, et al. Metabolic signatures of greater body size and their associations with risk of colorectal and endometrial cancers in the European prospective investigation into cancer and nutrition. BMC Med 2021;19:101.
101. Kuhn T, Floegel A, Sookthai D, Johnson T, Rolle-Kampczyk U, Otto W, et al. Higher plasma levels of lysophosphatidylcholine 18:0 are related to a lower risk of common cancers in a prospective metabolomics study. BMC Med 2016;14:13.
102. Lecuyer L, Bala AV, Deschasaux M, Bouchemal N, Triba MN, Vasson MP, et al. NMR metabolomic signatures reveal predictive plasma metabolites associated with long-term risk of developing breast cancer. Int J Epidemiol 2018;47:484–94.
103. Lecuyer L, Dalle C, Lyan B, Demidem A, Rossary A, Vasson MP, et al. Plasma metabolomic signatures associated with long-term breast cancer risk in the SU.VI.MAX prospective cohort. Cancer Epidemiol Biomarkers Prev 2019;28:1300–7.
104. Lecuyer L, Dalle C, Lefevre-Arbogast S, Micheau P, Lyan B, Rossary A, et al. Diet-related metabolomic signature of long-term breast cancer risk using penalized regression: an exploratory study in the SU.VI.MAX cohort. Cancer Epidemiol Biomarkers Prev 2020;29:396–405.
105. Lecuyer L, Bala AV, Demidem A, Rossary A, Bouchemal N, Triba MN, et al. NMR metabolomic profiles associated with long-term risk of prostate cancer. Metabolomics 2021;17:32.
106. Li YF, Qiu S, Zhang AH. High-throughput metabolomics to identify metabolites to serve as diagnostic biomarkers of prostate cancer. Anal Methods-Uk 2016;8:3284–90.
107. Li JK, Li J, Wang H, Qi LW, Zhu YM, Lai MD. Tyrosine and glutamine-leucine are metabolic markers of early-stage colorectal cancers. Gastroenterology 2019;157:257–9.
108. Li X, Kulkarni AS, Liu X, Gao WQ, Huang L, Hu Z, et al. Metal-organic framework hybrids aid metabolic profiling for colorectal cancer. Small Methods 2021;5:e2001001.
109. Liang Q, Liu H, Zhang T, Jiang Y, Xing H, Zhang H. Serum metabolomics uncovering specific metabolite signatures of intra- and extrahepatic cholangiocarcinoma. Mol Biosyst 2016;12:334–40.
110. Loftfield E, Rothwell JA, Sinha R, Keski-Rahkonen P, Robinot N, Albanes D, et al. Prospective investigation of serum metabolites, coffee drinking, liver cancer incidence, and liver disease mortality. J Natl Cancer Inst 2020;112:286–94.
111. Louis E, Adriaensens P, Guedens W, Bigirumurame T, Baeten K, Vanhove K, et al. Detection of lung cancer through metabolic changes measured in blood plasma. J Thorac Oncol 2016;11:516–23.
112. Mamtimin B, Upur H, Hao FH, Matsidik A, Rahim R. Plasma metabonomic analysis with (1)H nuclear magnetic resonance revealing the relationship of different tumors and the disease homology theory of traditional Uyghur medicine. Chin J Integr Med 2011;17:111–5.
113. McCullough ML, Hodge RA, Campbell PT, Stevens VL, Wang Y. Pre-diagnostic circulating metabolites and colorectal cancer risk in the cancer prevention study-II nutrition cohort. Metabolites 2021;11:156.
114. Ose J, Gigic B, Brezina S, Lin TD, Baierl A, Geijsen AJMR, et al. Targeted plasma metabolic profiles and risk of recurrence in stage II and III colorectal cancer patients: results from an international cohort consortium. Metabolites 2021;11:129.
115. Rohnisch HE, Kyro C, Olsen A, Thysell E, Hallmans G, Moazzami AA. Identification of metabolites associated with prostate cancer risk: a nested case-control study with long follow-up in the Northern Sweden health and disease study. BMC Med 2020;18:187.
116. Ros-Mazurczyk M, Jelonek K, Marczyk M, Binczyk F, Pietrowska M, Polanska J, et al. Serum lipid profile discriminates patients with early lung cancer from healthy controls. Lung Cancer 2017;112:69–74.
117. Schmidt JA, Fensom GK, Rinaldi S, Scalbert A, Appleby PN, Achaintre D, et al. Pre-diagnostic metabolite concentrations and prostate cancer risk in 1077 cases and 1077 matched controls in the European Prospective Investigation into Cancer and Nutrition. BMC Med 2017;15:122.
118. Schmidt JA, Fensom GK, Rinaldi S, Scalbert A, Appleby PN, Achaintre D, et al. Patterns in metabolite profile are associated with risk of more aggressive prostate cancer: a prospective study of 3,057 matched case–control sets from EPIC. Int J Cancer 2020;146:720–30.
119. Shu X, Zheng W, Yu D, Li HL, Lan Q, Yang G, et al. Prospective metabolomics study identifies potential novel blood metabolites associated with pancreatic cancer risk. Int J Cancer 2018;143:2161–7.
120. Stepien M, Keski-Rahkonen P, Kiss A, Robinot N, Duarte-Salles T, Murphy N, et al. Metabolic perturbations prior to hepatocellular carcinoma diagnosis: findings from a prospective observational cohort study. Int J Cancer 2021;148:609–25.
121. Stolzenberg-Solomon R, Derkach A, Moore S, Weinstein SJ, Albanes D, Sampson J. Associations between metabolites and pancreatic cancer risk in a large prospective epidemiological study. Gut 2020;69:2008–15.
122. Su BZ, Luo P, Yang Z, Yu P, Li ZF, Yin PY, et al. A novel analysis method for biomarker identification based on horizontal relationship: identifying potential biomarkers from large-scale hepatocellular carcinoma metabolomics data. Anal Bioanal Chem 2019;411:6377–86.
123. Vanhove K, Giesen P, Owokotomo OE, Mesotten L, Louis E, Shkedy Z, et al. The plasma glutamate concentration as a complementary tool to differentiate benign PET-positive lung lesions from lung cancer. BMC Cancer 2018;18:868.
124. Wang Y, Jacobs EJ, Carter BD, Gapstur SM, Stevens VL. Plasma metabolomic profiles and risk of advanced and fatal prostate cancer. Eur Urol Oncol 2021;4:56–65.
125. Wei MK, Zhao LH, Lv JL, Li X, Zhou GS, Fan BB, et al. The mediation effect of serum metabolites on the relationship between long-term smoking exposure and esophageal squamous cell carcinoma. BMC Cancer 2021;21:415.
126. Wikoff WR, Hanash S, DeFelice B, Miyamoto S, Barnett M, Zhao Y, et al. Diacetylspermine is a novel prediagnostic serum biomarker for non-small-cell lung cancer and has additive performance with pro-surfactant protein B. J Clin Oncol 2015;33:3880–6.
127. Zeleznik OA, Clish CB, Kraft P, Avila-Pacheco J, Eliassen AH, Tworoger SS. Circulating lysophosphatidylcholines, phosphatidylcholines, ceramides, and sphingomyelins and ovarian cancer risk: a 23-year prospective study. J Natl Cancer Inst 2020;112:628–36.
128. Zhang F, Li CB, Deng K, Wang ZZ, Zhao WW, Yang K, et al. Metabolic phenotyping to monitor chronic enteritis canceration. Metabolomics 2020;16:29.
This open access article is distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.