Abstract
Population-based screening programs are credited with earlier colorectal cancer diagnoses and treatment initiation, which reduce mortality rates and improve patient health outcomes. However, recommended screening methods are unsatisfactory as they are invasive, are resource intensive, suffer from low uptake, or have poor diagnostic performance. Our goal was to identify a urine metabolomic-based biomarker panel for the detection of colorectal cancer that has the potential for global population-based screening.
Prospective urine samples were collected from study participants. Based upon colonoscopy and histopathology results, 342 participants (colorectal cancer, 171; healthy controls, 171) from two study sites (Canada, United States) were included in the analyses. Targeted liquid chromatography-mass spectrometry (LC-MS) was performed to quantify 140 highly valuable metabolites in each urine sample. Potential biomarkers for colorectal cancer were identified by comparing the metabolomic profiles from colorectal cancer versus controls. Multiple models were constructed leading to a good separation of colorectal cancer from controls.
A panel of 17 metabolites was identified as possible biomarkers for colorectal cancer. Using only two of the selected metabolites, namely diacetylspermine and kynurenine, a predictor for detecting colorectal cancer was developed with an AUC of 0.864, a specificity of 80.0%, and a sensitivity of 80.0%.
We present a potentially “universal” metabolomic biomarker panel for colorectal cancer independent of cohort clinical features based on a North American population. Further research is needed to confirm the utility of the profile in a prospective, population-based colorectal cancer screening trial.
A urinary metabolomic biomarker panel was identified for colorectal cancer with the potential of clinical application.
Introduction
Colorectal cancer is the third most commonly diagnosed malignancy and the fourth leading cause of cancer-related deaths in the world. On the basis of 2018 estimates, the annual number of new colorectal cancer cases is projected to increase by 72% to 3.1 million by 2040, while annual deaths are projected to increase by 82% to 1.5 million (https://gco.iarc.fr/tomorrow). Deaths due to colorectal cancer are largely preventable through regular screening and early detection using fecal-based tests and colonoscopy (1). To be effective, population-based screening must be programmatic rather than opportunistic to ensure a high rate of compliance (2). Such programs have been instituted nationally or regionally within many countries in Europe (e.g., UK, Ireland, Germany, France), the United States, Japan, and Australia, as reviewed by Navarro and colleagues (3).
The most commonly used population-based screening modalities are the fecal immunochemical test (FIT) and colonoscopy (4). FIT detects hidden blood in stool that occurs mostly in the later stages of cancer and has low sensitivity for detecting the precursors to colorectal cancer, adenomatous polyps (9). A new fecal DNA test detects DNA mutations in addition to hidden blood in stool with improved sensitivity (5), but it is costly and only available in a few countries. To date, fecal-based tests are limited to colorectal cancer detection not prevention, and have low adherence rates due to the need for stool collection and manipulation (6–10). Colonoscopy has a superior sensitivity and specificity to noninvasive screening tests, but is costly in terms of direct and indirect health care dollars, has a higher risk of procedural-related complications, and, like fecal-based tests, has low rates of screening compliance (11).
To increase screening compliance rates, programs have largely focused on colorectal cancer education and sending reminders to eligible participants (12, 13). An alternative approach for improving colorectal cancer screening rates is to use a biosample other than stool (14). A blood-based screening test has been shown to have higher patient uptake than FIT (15), but its cost-effectiveness is debatable for population-based screening (16). Urine is commonly used for many clinical tests, can be readily collected, and is more acceptable to patients (17, 18). Recently, putative biomarkers of colorectal cancer were identified in urine in the forms of volatile organic compounds (19), modified cytosine nucleosides (20), and polyamines (21, 22). As well, we have reported a urine-based screening test specific for colorectal adenomatous polyps (23, 24) developed in a Canadian population and its subsequent validation in a homogenous Asian cohort to demonstrate its clinical relevance transcending both diet and ethnicity (25).
In the current multicenter study, we investigated the potential utility of urine-based metabolomics for detecting colorectal cancer by analyzing metabolites in urine samples from colonoscopy- and histopathology-confirmed cases of colorectal cancer and healthy controls (i.e., polyp- and colorectal cancer-free). Our findings highlight the predictive potential of urinary metabolites for colorectal cancer, and we discuss the clinical relevance of a proposed screening test.
Materials and Methods
Study participants and sample collection
Adult patients with newly diagnosed colorectal cancer (based on preoperative imaging, colonoscopies, and pathology reports of biopsies) were eligible for study inclusion provided they had not received colorectal cancer–related treatment. Canadian recruitment (October 2008–2010) was conducted at four tertiary hospitals in the Edmonton region (Grey Nuns Hospital, Misericordia Hospital, University of Alberta Hospital, and the Royal Alexandra Hospital) and included patients from across the prairie provinces (i.e., CRC-CAD cohort). American patients were recruited (February–July 2018) from the Memorial Sloan Kettering Cancer Center (MSKCC) in New York City, New York (i.e., CRC-MSKCC cohort).
Patients diagnosed with colorectal cancer provided a urine sample prior to any operation, chemotherapy, radiation, or other cancer-related treatment. Clinical features, such as age, gender, and smoking status, were also collected at this time. Each urine sample was transferred to labeled 1 mL tubes (5×) and frozen at −80°C within 1 hour of collection. Frozen urine was shipped on dry ice in a standard insulated Styrofoam shipper and immediately transferred to a −80°C freezer upon arrival at the University of Alberta in Edmonton, Alberta. Pathology reports were reviewed to abstract cancer stage.
The healthy controls were selected from a previous population-based study (n = 1,000) called Stop COlorectal cancer through Prevention and Education (SCOPE; refs. 23, 26–28). SCOPE was a regional colon cancer screening program (Edmonton, Alberta, Canada) in which over 1,000 urine samples were collected from April 2008 to October 2009. Study participants (40–74 years of age) of average or increased risk for colorectal cancer were recruited. On the day of entry, participants provided informed written consent and a midstream urine sample, and completed a demographic survey. Urine was aliquoted and frozen at −80°C within 1 hour of collection. Colonoscopy, performed 2–6 weeks after urine collection, confirmed that the selected individuals were normal based upon endoscopy findings and pathology reports. Urine samples from the healthy controls were matched 1:1 to the colorectal cancer cases based on gender. The study design is shown in Supplementary Fig. S1 (Supporting Information).
Ethics approval was obtained from the Health Research Ethics Boards at the University of Alberta (Pro0000514 and Pro00074045) and MSKCC (IRB catalog nos. 06-107 and 15-209).
Metabolite analysis
Targeted liquid chromatography-mass spectrometry (LC-MS) was performed to quantify urinary metabolites in each sample using the LC-MS kit TMIC00UJ designed and prepared by The Metabolomics Innovation Centre (TMIC) at the University of Alberta in Edmonton, Alberta. Calibration solutions (Cal 1–Cal 7), isotopically labeled standard mix, quality control solutions (QC 1–QC 3), LC-MS methods, and standard operating procedures were provided by TMIC. The TMIC00UJ kit was a combination of three assays to identify 140 unique urinary metabolites (see Supplementary Table S1) indexed by the Human Metabolome Database (www.hmdb.ca). The phenyl isothiocyanate (PITC) assay quantified 47 biologic amines in the LC mode while 75 lipids were semiquantified in the flow injection analysis (FIA) mode. The organic acid assay quantified 17 compounds while ascorbic acid was quantified independently.
The TMIC00UJ kit components were run on an API4000 Qtrap tandem mass spectrometry instrument (AB Sciex) coupled with a Waters UPLC system (Waters Limited). Urine samples were thawed on ice, vortexed, then centrifuged at 13,000 × g. Each plate contained 82 unique urine samples as well as 1 solvent blank solution, 3 matrix solutions, 7 calibration solutions (Cal 1–Cal 7), and 3 quality control (QC) samples. PBS (1×, pH 7.4) was used as the matrix solution. Metabolite quantification was achieved using the AB Sciex Analyst software, version 1.6.2. During quantification, each metabolite was identified using the internal standard and compared against the established calibration curve. The lower limits of detection (LLOD) were calculated as three times the value of the matrix solutions. The upper limit of detection was not reached for any metabolite.
Statistical analysis
Data preprocessing was performed using code written in R, version 3.4.3. Metabolites that were lower than the LLOD or not detected in more than half of the urine samples were removed from the initial list of 140 metabolites. For the remaining metabolites, any sample concentration below the LLOD was replaced with half the value of the LLOD. Statistical analyses were conducted with MetaboAnalyst, version 4.0 for the web (29). Metabolite concentrations were normalized against creatinine, log-transformed, and auto-scaled. Potential biomarkers for colorectal cancer were identified (30) by comparing the metabolomic profiles of the colorectal cancer and control groups using both fold-change analyses and Student t tests. One-way ANOVA was performed on the independent sample groups (i.e., CRC-CAD, CRC-MSKCC, and control) to identify statistically significant metabolite differences (31–33). Metabolites with concentration changes in the same direction for both the CRC-CAD and CRC-MSKCC groups were considered consistent colorectal cancer markers. Furthermore, multivariate models were constructed using principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and sparse PLS-DA (sPLS-DA; ref. 34). Finally, predictors were built using logistic regression with selected biomarkers. A leave-out approach was used to evaluate the models. The 171 controls were randomized into 121 controls for training and 50 controls for testing, balanced for age, gender, and smoking status. A total of 121 CRC-CAD and 121 controls were used as the training set to build a model, and 50 CRC-MSKCC and 50 controls were used as the testing set to validate the model.
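The preprocessing steps described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' R code or MetaboAnalyst itself. The function name `preprocess`, the toy data, and the use of the natural logarithm are assumptions made for the example; for a plain log transform the base does not affect the auto-scaled values, because a change of base is a constant factor that the division by the SD removes.

```python
import math
from statistics import mean, stdev


def preprocess(conc, llod, creatinine):
    """Filter, impute, normalize, log-transform, and auto-scale metabolite data.

    conc       : dict of metabolite name -> list of concentrations, one per
                 sample (None marks "not detected")
    llod       : dict of metabolite name -> lower limit of detection (LLOD)
    creatinine : list of creatinine concentrations, one per sample
    """
    n_samples = len(creatinine)
    processed = {}
    for name, values in conc.items():
        limit = llod[name]
        below = [v is None or v < limit for v in values]
        # Remove metabolites below the LLOD (or undetected) in more than
        # half of the samples.
        if sum(below) > n_samples / 2:
            continue
        # Replace remaining below-LLOD values with half the LLOD.
        imputed = [limit / 2 if b else v for v, b in zip(values, below)]
        # Normalize against creatinine, then log-transform.
        logged = [math.log(v / c) for v, c in zip(imputed, creatinine)]
        # Auto-scale: mean-center and divide by the sample SD
        # (assumes the metabolite is not constant across samples).
        mu, sd = mean(logged), stdev(logged)
        processed[name] = [(x - mu) / sd for x in logged]
    return processed


# Hypothetical toy data: 4 samples, 2 metabolites (not study data).
toy_conc = {"A": [4.0, 1.0, 16.0, 0.2], "B": [None, 0.1, None, 5.0]}
toy_llod = {"A": 1.0, "B": 1.0}
toy_creatinine = [1.0, 2.0, 1.0, 2.0]
scaled = preprocess(toy_conc, toy_llod, toy_creatinine)
```

In the toy data, metabolite "B" is dropped because three of its four values fall below the LLOD, while the single low value of "A" is imputed as half the LLOD before normalization.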
All code for statistical analyses was also written in R, version 3.4.3 (https://www.R-project.org). The glmnet package was used for logistic regression (35). ROC (12) curves were generated and reported using the ROCR package.
Results
Patient characteristics
In Canada, a total of 161 participants were enrolled, of whom 40 were excluded due to missing clinical information. A total of 50 samples were collected from patients at MSKCC and used for this study. The 171 colorectal cancer samples were matched with 171 urine samples from colonoscopy-confirmed healthy controls. See Table 1 for a summary of clinical characteristics for the participants. Statistical comparisons were made between the control group and the colorectal cancer group. The P value for gender was 0.63, indicating no significant difference in gender between colorectal cancer cases and controls. The P value for smoking was 0.02, with more current smokers in the colorectal cancer group. The P value for age was 2.83 × 10⁻¹³, indicating a significant difference in age: the mean age in the colorectal cancer group was approximately 7 years older than in the control group.
Characteristic | Controls | CRC cases: CRC-All | CRC cases: CRC-CAD | CRC cases: CRC-MSKCC
---|---|---|---|---
Mean age, years (SD) | 58.9 (5.6) | 66.4 (11.5) | 67.4 (10.9) | 63.8 (12.5)
Gender, n (%) | | | |
Male | 100 (58.5%) | 89 (52.0%) | 65 (53.7%) | 24 (48.0%)
Female | 71 (41.5%) | 82 (48.0%) | 56 (46.3%) | 26 (52.0%)
Smoking, n (%) | | | |
Current | 12 (7.0%) | 29 (17.0%) | 24 (19.8%) | 5 (10.0%)
Prior | 66 (38.6%) | 56 (32.7%) | 38 (31.4%) | 18 (36.0%)
Never | 87 (50.9%) | 86 (50.3%) | 59 (48.8%) | 27 (54.0%)
By stage, n (%) | | | |
0 | – | 3 (1.8%) | 3 (2.5%) | 0 (0.0%)
I | – | 30 (17.5%) | 16 (13.2%) | 14 (28.0%)
II | – | 50 (29.2%) | 30 (24.8%) | 20 (40.0%)
III | – | 57 (33.3%) | 51 (42.1%) | 6 (12.0%)
IV | – | 31 (18.1%) | 21 (17.4%) | 10 (20.0%)
Total, n | 171 | 171 | 121 | 50
Metabolite analysis
A total of 140 metabolites were quantified in each urine sample by three LC-MS assays. In the PITC assay, a total of 47 biologic amines were quantified in LC mode and a total of 75 lipids were semiquantified in the FIA mode. In the organic acid assay, a total of 17 valuable organic acids were quantified. Ascorbic acid was quantified using a specific assay. For each assay, a total of 382 samples, including both the colorectal cancer and control samples, were randomized and analyzed using 5 plates in 96-well plate format. For each plate, a set of calibration curves was generated and used for quantification. The coefficient of determination (R²) of the linear regression for the calibration curve of each metabolite was >0.99 for all plates. For each plate, the LLODs were calculated as three times the values of the matrix solutions, and the average of the LLODs from the 5 plates is reported in Supplementary Table S1 and was used for later analysis. Metabolite concentrations lower than the LLOD were considered unreliable and classified as missing values. A total of 46 metabolite features (including methyl-histidine, propionic acid, isobutyric acid, and 43 lipids) were removed because >50% of their values were missing (Supplementary Table S1). Three QC samples at different concentration levels were included in each 96-well plate to assess the coefficient of variation (CV%) across the 5 different plates. The CV% of QC samples for each metabolite was calculated as the SD divided by the average. Notably, the CV% for each metabolite across plates was <15%, indicating a robust analytic method.
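The plate-level calculations described above (plate count, LLOD, and CV%) can be illustrated with a short Python sketch. The function names and the toy readings are hypothetical. Taking the mean of the three matrix solutions on a plate and using the sample SD (n − 1 denominator) for the CV% are assumptions, since the text does not specify either choice. The figure of 82 urine samples per plate comes from the plate layout given in Materials and Methods.

```python
import math
from statistics import mean, stdev

# Plate layout from Materials and Methods: 82 urine samples + 1 solvent blank
# + 3 matrix solutions + 7 calibrators + 3 QC samples fill a 96-well plate.
WELLS_PER_PLATE = 82 + 1 + 3 + 7 + 3
PLATES_NEEDED = math.ceil(382 / 82)  # 382 samples at 82 samples per plate


def plate_llod(matrix_readings):
    """LLOD for one plate: three times the matrix (blank) value.

    Averaging the plate's matrix solutions is an assumption of this sketch.
    """
    return 3 * mean(matrix_readings)


def reported_llod(matrix_readings_by_plate):
    """Average of the per-plate LLODs, as reported for later analysis."""
    return mean(plate_llod(readings) for readings in matrix_readings_by_plate)


def cv_percent(qc_values):
    """Coefficient of variation (%): SD divided by the average, times 100."""
    return 100 * stdev(qc_values) / mean(qc_values)


# Hypothetical readings for one metabolite (not study data).
example_llod = reported_llod([[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
example_cv = cv_percent([9.0, 10.0, 11.0])
```

With these toy numbers, the QC readings give a CV% of 10, which would pass the <15% criterion used in the study.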
Potential biomarkers for colorectal cancer
Potential biomarkers for colorectal cancer were identified by comparing the metabolomic profiles of colorectal cancer cases versus controls using both fold change (FC) analyses and t tests. A total of 17 metabolites were identified by volcano plot with a threshold for FC of either >2 or <0.5 and P < 0.05 (Table 2). Results from the one-way ANOVA analyses for the three study groups identified consistent markers for colorectal cancer. For each of the 17 metabolites, the concentration change in each colorectal cancer group (i.e., CRC-CAD or CRC-MSKCC) compared with the control group was analyzed. Diacetylspermine (Fig. 1A), proline, kynurenine, and glucose were upregulated in both colorectal cancer groups compared with controls and classified as consistent biomarkers. Although they were identified as potential markers according to the volcano plot for colorectal cancer cases versus controls, the concentrations of 3-(3-hydroxyphenyl)-3-hydroxypropanoic acid (HPHPA, Fig. 1B), beta-hydroxybutyric acid, 3,4-dihydroxyphenylalanine (DOPA), 4-hydroxyproline, aminoadipic acid, putrescine, indole acetic acid, hippuric acid, citric acid, and sarcosine did not significantly change when CRC-CAD was compared with controls. Similarly, there were no significant changes in the concentrations of tetradecenoyl carnitine (C14:1), aspartic acid (Fig. 1C), and sarcosine when CRC-MSKCC was compared with controls. When compared against the control group, the concentration of butyric acid (Fig. 1D) increased in CRC-CAD and decreased in CRC-MSKCC. The concentration changes of these 13 metabolites were dependent on the cohort rather than colorectal cancer status, and they were discarded from further analyses (Table 2).
Metabolite | HMDB ID | FC | P | Change relative to controls: CRC-CAD | Change relative to controls: CRC-MSKCC | Consistent biomarker
---|---|---|---|---|---|---
1. Diacetylspermine | HMDB02172 | 10.75 | 3.61E–31 | + | + | Yes
2. Proline | HMDB00162 | 2.53 | 4.04E–31 | + | + | Yes
3. C14:1 | HMDB62588 | 3.20 | 3.19E–22 | + | NC | No
4. Kynurenine | HMDB00684 | 3.50 | 6.53E–16 | + | + | Yes
5. Glucose | HMDB00122 | 3.06 | 1.90E–15 | + | + | Yes
6. HPHPA | HMDB02643 | 0.33 | 9.44E–11 | NC | − | No
7. Aspartic acid | HMDB00191 | 0.32 | 5.73E–10 | − | NC | No
8. Beta-hydroxybutyric acid | HMDB00357 | 17.56 | 2.55E–09 | NC | + | No
9. DOPA | HMDB00181 | 14.63 | 5.57E–09 | NC | + | No
10. 4-Hydroxyproline | HMDB00725 | 2.53 | 1.31E–08 | NC | + | No
11. Aminoadipic acid | HMDB00510 | 0.47 | 2.70E–08 | NC | − | No
12. Putrescine | HMDB01414 | 3.78 | 1.36E–05 | NC | + | No
13. Indole acetic acid | HMDB00197 | 0.21 | 2.06E–04 | NC | − | No
14. Hippuric acid | HMDB00714 | 0.39 | 4.42E–04 | NC | − | No
15. Citric acid | HMDB00094 | 3.07 | 1.18E–03 | NC | + | No
16. Sarcosine | HMDB00271 | 14.68 | 1.82E–03 | NC | NC | No
17. Butyric acid | HMDB00039 | 0.19 | 9.72E–03 | + | − | No
NOTE: “+” indicates a significant metabolite concentration increase; “−” indicates a significant metabolite concentration decrease; “NC” means that the metabolite concentration was not significantly changed.
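The two selection rules applied in Table 2 (the volcano plot thresholds, then the requirement of a significant change in the same direction in both cohorts) can be restated as a small Python sketch. The FC, P, and direction values are copied from Table 2; the function names are hypothetical, and the code only re-applies the stated rules to the published values rather than reproducing the underlying t tests or ANOVA.

```python
# Rows copied from Table 2: (metabolite, FC, P, CRC-CAD, CRC-MSKCC).
# "+" = significant increase, "-" = significant decrease, "NC" = no change.
TABLE2 = [
    ("Diacetylspermine", 10.75, 3.61e-31, "+", "+"),
    ("Proline", 2.53, 4.04e-31, "+", "+"),
    ("C14:1", 3.20, 3.19e-22, "+", "NC"),
    ("Kynurenine", 3.50, 6.53e-16, "+", "+"),
    ("Glucose", 3.06, 1.90e-15, "+", "+"),
    ("HPHPA", 0.33, 9.44e-11, "NC", "-"),
    ("Aspartic acid", 0.32, 5.73e-10, "-", "NC"),
    ("Beta-hydroxybutyric acid", 17.56, 2.55e-09, "NC", "+"),
    ("DOPA", 14.63, 5.57e-09, "NC", "+"),
    ("4-Hydroxyproline", 2.53, 1.31e-08, "NC", "+"),
    ("Aminoadipic acid", 0.47, 2.70e-08, "NC", "-"),
    ("Putrescine", 3.78, 1.36e-05, "NC", "+"),
    ("Indole acetic acid", 0.21, 2.06e-04, "NC", "-"),
    ("Hippuric acid", 0.39, 4.42e-04, "NC", "-"),
    ("Citric acid", 3.07, 1.18e-03, "NC", "+"),
    ("Sarcosine", 14.68, 1.82e-03, "NC", "NC"),
    ("Butyric acid", 0.19, 9.72e-03, "+", "-"),
]


def passes_volcano(fc, p, fc_cut=2.0, p_cut=0.05):
    """Volcano plot rule: FC > 2 or FC < 0.5, and P < 0.05."""
    return (fc > fc_cut or fc < 1 / fc_cut) and p < p_cut


def is_consistent(cad, mskcc):
    """Consistent marker: significant change in the same direction in both cohorts."""
    return cad == mskcc and cad in {"+", "-"}


selected = [row for row in TABLE2 if passes_volcano(row[1], row[2])]
consistent = [name for name, _, _, cad, mskcc in selected if is_consistent(cad, mskcc)]
discarded = [name for name, _, _, cad, mskcc in selected if not is_consistent(cad, mskcc)]
```

All 17 rows satisfy the volcano thresholds, and applying the consistency rule leaves the four markers flagged "Yes" in Table 2 and discards the other 13.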
Prediction models
To construct an effective diagnostic model for colorectal cancer, we conducted multivariate analysis using MetaboAnalyst. Among the PCA, PLS-DA, and sPLS-DA model options, sPLS-DA provided the best separation between the groups with the fewest metabolites. Figure 2A shows the separation plot from sPLS-DA with component 1 and component 2. The classification error rate was 11.4%. The metabolites selected by the sPLS-DA model for component 1 and component 2, with their loading values, are shown in Fig. 2B and C. Notably, diacetylspermine, proline, kynurenine, and glucose were among the top six selected features based on loading values for component 1. This supports their selection as consistent markers.
Finally, logistic regression models were constructed in R with selected metabolites. We used a leave-out approach to build and evaluate the models, as it is the most rigorous. A total of 121 CRC-CAD and 121 controls were used as the training set to build each model, and 50 CRC-MSKCC and 50 controls were used as the testing set to validate it. The first model (I) used the 17 metabolites listed in Table 2, selected according to the volcano plot of colorectal cancer versus control. This model had an AUC of 0.967 for the training set and 0.868 for the testing set (Fig. 3IA and B). At a specificity of 80%, the model's sensitivities were 99.2% for the training set and 74.0% for the testing set (Table 3). The second model (II) was limited to the four metabolites (i.e., proline, diacetylspermine, kynurenine, and glucose) identified as robust colorectal cancer biomarkers from the ANOVA analysis. This model had an AUC of 0.903 for the training set and 0.873 for the testing set (Fig. 3IIA and B), with a training sensitivity of 82.6% and a testing sensitivity of 72.0% at a specificity of 80% (Table 3). The last logistic regression model (III) incorporated only diacetylspermine and kynurenine. Proline and glucose were excluded due to their potential association with diet (36), a feature that was not controlled during the 24 hours prior to urine sample collection. With an AUC of 0.864 on the training set and 0.851 on the testing set (Fig. 3IIIA and B), model III had the smallest drop in AUC from training to testing among the three models, which supports the robustness of the selected biomarkers. At a specificity of 80%, model III's sensitivities were 80.0% for the training set and 74.0% for the testing set (Table 3).
Logistic regression model | Features | AUC: Train | AUC: Test | AUC: Delta (Train − Test) | Sensitivity at 80% specificity: Train | Sensitivity at 80% specificity: Test | Sensitivity at 80% specificity: Delta (Train − Test)
---|---|---|---|---|---|---|---
I | Proline, diacetylspermine, C14:1, kynurenine, glucose, aspartic acid, glutamate, beta-hydroxybutyric acid, HPHPA, DOPA, c4-OH-proline, putrescine, indole acetic acid, citric acid, hippuric acid, sarcosine, and butyric acid | 0.967 | 0.868 | 0.099 | 99.2% | 74.0% | 25.2%
II | Proline, diacetylspermine, kynurenine, and glucose | 0.903 | 0.873 | 0.030 | 82.6% | 72.0% | 10.6%
III | Diacetylspermine and kynurenine | 0.864 | 0.851 | 0.013 | 80.0% | 74.0% | 6.0%
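The two performance measures reported in Table 3 can be illustrated with a short Python sketch. It is not the ROCR implementation used in the study: the AUC is computed here with the rank-based (Mann-Whitney) estimate, and the sensitivity at 80% specificity is read off at the lowest observed score threshold that reaches the target specificity, both of which are assumptions of this sketch. The function names and scores are hypothetical toy values, not study data.

```python
def auc(case_scores, control_scores):
    """AUC as the probability that a case scores higher than a control.

    Ties count as one half (Mann-Whitney estimate).
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_at_specificity(case_scores, control_scores, target=0.80):
    """Return (threshold, sensitivity, specificity) at the target specificity.

    A sample is called positive when its score is >= threshold. Thresholds
    are scanned in ascending order, so the first one that reaches the target
    specificity also gives the highest sensitivity.
    """
    for threshold in sorted(set(case_scores) | set(control_scores)):
        specificity = sum(c < threshold for c in control_scores) / len(control_scores)
        if specificity >= target:
            sensitivity = sum(c >= threshold for c in case_scores) / len(case_scores)
            return threshold, sensitivity, specificity
    # No observed score reaches the target: call every sample negative.
    return float("inf"), 0.0, 1.0


# Hypothetical predicted probabilities for 5 cases and 5 controls.
toy_cases = [0.35, 0.5, 0.7, 0.8, 0.9]
toy_controls = [0.1, 0.2, 0.3, 0.4, 0.6]
toy_auc = auc(toy_cases, toy_controls)
toy_threshold, toy_sens, toy_spec = sensitivity_at_specificity(toy_cases, toy_controls)
```

Running the same two functions separately on the training and testing scores of a model would give the Train and Test columns, and their difference gives the Delta columns.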
Discussion
We have identified a discrete subset of common urinary metabolites that, when used in combination, may serve as potential biomarkers for colorectal cancer, based upon modeling to separate colorectal cancer and control samples. An sPLS-DA model with two components was built with a classification error rate of 11.4%. For the logistic regression models, the training AUC ranged from 0.864 to 0.967 and the testing AUC from 0.851 to 0.873 (Table 3), highlighting the predictive power of urinary metabolomics for colorectal cancer screening. However, given the sample size (n = 342), one needs to be cautious about error due to overfitting the model. To guard against this, further analyses were performed by building a model that only used biomarkers that were consistent across the cohorts. Finally, a metabolomic predictor for colorectal cancer was built with two metabolites: diacetylspermine and kynurenine. At its optimal cut-off value of 0.498, the predictor's specificity and sensitivity values were 90.6% and 74.3%, respectively.
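The way a two-metabolite logistic predictor turns measurements into a call at the reported cut-off can be sketched as follows. The fitted coefficients are not given in the text, so they are left as function parameters and the values in the example are placeholders, not the study's model. Calling a sample positive when the predicted probability is greater than or equal to the cut-off (rather than strictly greater) is also an assumption.

```python
import math


def crc_probability(diacetylspermine, kynurenine, intercept, b_das, b_kyn):
    """Logistic model: probability of colorectal cancer from two metabolites.

    Inputs are assumed to be preprocessed concentrations (creatinine-normalized,
    log-transformed, and auto-scaled). The coefficients must come from a fitted
    model; none are reported in the text.
    """
    z = intercept + b_das * diacetylspermine + b_kyn * kynurenine
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, cutoff=0.498):
    """Positive call when the probability reaches the reported optimal cut-off."""
    return probability >= cutoff


# Placeholder coefficients chosen only to exercise the functions.
p_high = crc_probability(1.0, 1.0, intercept=0.0, b_das=1.0, b_kyn=1.0)
p_low = crc_probability(-1.0, 0.0, intercept=0.0, b_das=1.0, b_kyn=1.0)
calls = [classify(p_high), classify(p_low)]
```

Moving the cut-off trades sensitivity against specificity, which is why the values at the optimal cut-off of 0.498 differ from the sensitivities reported at a fixed specificity of 80%.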
The mechanisms by which diacetylspermine and kynurenine act as colorectal cancer markers still need to be investigated. Here, we plotted the trend of their changes from control, to stage 0, to stage I, to stage II, to stage III, to stage IV in Fig. 4. For both diacetylspermine and kynurenine, the biggest change was observed from control to stage 0, supporting the use of these two markers for early screening, although only three stage 0 cases were available (Table 1). There was a continuous increase in diacetylspermine as the cancer progressed. The final metabolites, diacetylspermine and kynurenine, have been associated with cancer detection in the past. For instance, increased urinary kynurenine concentrations were first identified in patients with different malignancies by Spacek in 1955 (37). Urine samples were collected without dietary modifications, and the kynurenine levels increased from 1- to 7-fold in patients with colorectal cancer. Several teams have identified diacetylspermine's presence in urine in association with hepatocellular carcinoma (sensitivity of 65.5%, specificity versus cirrhosis of 76.0%; ref. 38), breast and colorectal cancers (sensitivity of 60.2% and 75.8%, respectively; ref. 39), pancreatobiliary cancer (sensitivity of 75%; ref. 40), and non–small cell lung cancer recurrence following resection (sensitivity of 62.2%; ref. 41). In spite of its utility, urinary diacetylspermine was unable to discriminate between patients with and without bladder cancer (42). Enrichment of proline has been identified as a biomarker for colorectal cancer based upon serum, tissue, urine, exhaled breath, and plasma (43). Urinary glucose is typically reported at reduced concentrations in samples from patients with cancer compared with healthy controls, whereas we report increased levels in both colorectal cancer groups (43).
Although our approach to diagnosing colorectal cancer is novel and promising, there were several limitations to this study. Smoking and age are known contributors to colorectal cancer (44). As such, we tried to match controls and colorectal cancer cases based upon smoking status (current, prior, never) and age; however, this was not possible because smoking rates and ages were higher than expected in the two colorectal cancer groups. This may have biased the selection of metabolites. Because urinary metabolites are waste products, the upstream metabolic role of either diacetylspermine or kynurenine in cancer pathogenesis is unclear. Knowing more about the metabolic cycles and degradation pathways involved in colorectal cancer will be helpful to identify additional biomarkers. The specificity of the metabolic profile must also be evaluated by comparing with samples from patients with other cancer types. Although promising results were obtained, the metabolomic profile cannot yet be considered definitive and needs to be tested in a clinical setting, ideally within a pragmatic study, to make the findings relevant and generalizable. Testing the predictive performance of the metabolite profile against other cancers is especially relevant as diacetylspermine has been included in many noncolorectal cancer panels. In addition, it may be beneficial to make a more comprehensive metabolomic assessment. This could be done using additional analytic assays, such as gas chromatography–mass spectrometry, which will enable the detection of more metabolites (45). A more comprehensive metabolomic profile may improve diagnostic accuracy and could provide a better understanding of the underlying metabolic processes associated with colorectal cancer.
We intentionally did not have patients follow a controlled diet or fast before providing a urine sample. Appreciating the diurnal changes in urinary metabolite concentrations (46), all collections were completed during daytime business hours. Dietary controls place unreasonable burdens on patients, and we believed that they would decrease the value of this or any urinary biomarker panel intended for use as a screening tool for colorectal cancer. Furthermore, it is highly probable that differences in the intestinal microbiota between healthy individuals and those with colorectal cancer impact urinary metabolites more so than diet (47). A limitation of any large multicenter study is the need to handle, ship, and store the biosamples over time. To minimize metabolite degradation, all specimens were handled similarly regardless of collection date, and aliquoting prior to the first freeze at −80°C prevented exposure to multiple freeze–thaw cycles (48, 49).
In conclusion, this metabolomic-based predictor for colorectal cancer has potential clinical application for population-based colorectal cancer screening using urine, a preferred biosample that is readily available, straightforward to collect as part of any physician's clinic visit, and acceptable to patients in most cultures. Further supporting the use of urine is the availability of collection, handling, shipping, and storage protocols, many of which have been instituted by major biobanks and repositories. A 2018 systematic review of 16 urinary metabolomic studies in colorectal cancer listed metabolites independently reported three or more times (47), none of which were the same as those we reported. Although this is the largest multicenter urine-based metabolomics study conducted to date (43, 47), there were insufficient samples at each cancer stage to analyze them independently or in sequence to understand the disease trajectory. Larger datasets supported by comprehensive clinicodemographic characteristics will be valuable to discern the discrete shifts in metabolites associated with real-time changes in cellular metabolism associated with disease. This will also facilitate external validation of putative biomarker panels such as that reported herein.
Disclosure of Potential Conflicts of Interest
L. Deng is a Senior Scientist at Metabolomic Technologies, Inc. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: L. Deng, H. Wang, O.I. Alatise, M.R. Weiser, T.P. Kingham, D. Chang
Development of methodology: L. Deng, H. Wang, O.I. Alatise, D. Chang
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): L. Deng, J. Constable, H. Wang, O.I. Alatise, T.P. Kingham
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): L. Deng, K. Ismond, Z. Liu, O.I. Alatise, M.R. Weiser
Writing, review, and/or revision of the manuscript: L. Deng, K. Ismond, J. Constable, H. Wang, O.I. Alatise, M.R. Weiser, T.P. Kingham
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): L. Deng, K. Ismond, J. Constable, O.I. Alatise
Study supervision: L. Deng, O.I. Alatise, M.R. Weiser, D. Chang
Acknowledgments
We would like to express our deep gratitude to Dr. Richard N. Fedorak, who contributed to this project and passed away on Nov. 8, 2018. This work was funded, in part, by the National Institute of Biomedical Imaging and Bioengineering (NIBIB), NIH (K. Ismond, O.I. Alatise, and T.P. Kingham are supported by grant number UG3EB024965), and Mitacs (IT10425, to Z. Liu).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.