Abstract
Metabolites are the end products of cellular regulatory processes, and their levels can be regarded as the ultimate response of biological systems to genetic or environmental changes. We have used a metabolite profiling approach to test the hypothesis that quantitative signatures of primary metabolites can be used to characterize molecular changes in ovarian tumor tissues. Sixty-six invasive ovarian carcinomas and nine borderline tumors of the ovary were analyzed by gas chromatography/time-of-flight mass spectrometry (GC-TOF MS) using a novel contamination-free injector system. After automated mass spectral deconvolution, 291 metabolites were detected, of which 114 (39.1%) were annotated as known compounds. By t test statistics with P < 0.01, 51 metabolites were significantly different between borderline tumors and carcinomas, with a false discovery rate of 7.8%, estimated with repeated permutation analysis. Principal component analysis (PCA) revealed four principal components that were significantly different between both groups, with the highest significance found for the second component (P = 0.00000009). PCA as well as additional supervised predictive models allowed a separation of 88% of the borderline tumors from the carcinomas. Our study shows for the first time that large-scale metabolic profiling using GC-TOF MS is suitable for analysis of fresh frozen human tumor samples, and that there is a consistent and significant change in primary metabolism of ovarian tumors, which can be detected using multivariate statistical approaches. We conclude that metabolomics is a promising high-throughput, automated approach in addition to functional genomics and proteomics for analyses of molecular changes in malignant tumors. (Cancer Res 2006; 66(22): 10795-804)
Introduction
Metabolite levels can be regarded as the amplified output of biological systems in response to genetic or environmental changes. However, only in recent years have technologies been developed that allow comprehensive and quantitative investigation of a multitude of different metabolites, an approach called “metabolomics” in analogy to the terms “transcriptomics” and “proteomics” (1–3). Griffin et al. (4) have suggested metabolic profiling as a promising tool for analysis of the malignant phenotype, drug target evaluation, and tumor diagnosis. Advances in instrumentation and computation have enabled new strategies for separation and identification of a large number of individual metabolites. To this end, comprehensive analysis by gas chromatography coupled with mass spectrometry (MS) has been used as a “gold standard” for studying primary metabolism, specifically in plant metabolomics (5, 6). Up to 1,000 individual metabolites could be retrieved from plant tissues using time-of-flight (TOF) MS together with deconvolution software that identifies individual compounds based on detection of model ions, even where the mass spectra of two or more compounds overlap (7). Metabolite profiling has been used in clinical research to study metabolic diseases (8, 9); however, surprisingly few reports (10, 11) have studied comprehensive metabolic responses for the characterization of tumor types. The combination of gas chromatography-TOF (GC-TOF)–based analysis with automated deconvolution techniques has thus far not been used for analysis of human tumor samples, but only for mouse tissues in obesity-related research (12).
Ovarian carcinoma is the fifth most common cancer in women. In the United States, an estimated number of 20,180 new cases of ovarian carcinoma and an estimated number of 15,310 deaths are expected for 2006 (13). It has long been recognized that besides invasive ovarian carcinoma, a different type of lesion exists in the ovary, the so-called borderline tumor (review ref. 14). Borderline tumors represent 5% to 10% of ovarian epithelial tumors. They are characterized by complex atypical proliferations of neoplastic and slightly atypical epithelium (15). In contrast to invasive carcinomas, borderline tumors lack a destructive stromal invasion. The different biological background of borderline tumors and invasive ovarian carcinomas translates into different clinical behavior of both groups of tumors. Although ovarian carcinomas are rapidly progressing tumors with a high recurrence rate and a 5-year survival rate as low as 30% to 40% (13, 14), borderline tumors have an indolent clinical course with recurrences in a minority of patients and almost no tumor-associated deaths. We used these two types of lesions as a model and evaluated the hypothesis that GC-TOF–based metabolite profiling may serve as a tool to assess the molecular changes in tumor tissue and to detect metabolic patterns associated with different biological tumor entities.
Materials and Methods
Study population and histopathologic examination. For metabolic profiling, 75 patients with ovarian lesions who were diagnosed at the Institute of Pathology, Charité Hospital, Berlin, Germany, were included in the study. The tissue specimens included 66 primary invasive ovarian carcinomas and 9 borderline tumors. Tissue was dissected by a senior pathologist in the operating room and was immediately frozen in liquid nitrogen and stored at −80°C. Additional H&E sections were prepared for histopathologic evaluation (16).
Metabolic profiling. Fresh weight (5 mg) of frozen biopsy tissue was prepared under standard operating procedure SOP 2003-2. Tissue was homogenized in 2-mL Eppendorf tubes for 30 seconds at 25 s⁻¹ using 3-mm inner diameter metal balls in a ball mill (Retsch, Germany). Extraction was carried out using 1 mL of a one-phasic mixture of chloroform/methanol/water (2:5:2, v/v/v) at −20°C for 5 minutes (17). After centrifugation, the supernatant was concentrated to complete dryness in a speedvac concentrator. The dried metabolic extract was derivatized in two steps. First, carbonyl functions were protected by methoximation using 20 μL of a 40 mg/mL solution of methoxyamine hydrochloride in pyridine at 28°C for 90 minutes. Afterwards, acidic protons (e.g., of hydroxyl, amine, sulfhydryl, and carboxyl groups) were exchanged for trimethylsilyl groups using 180 μL N-methyl-N-trimethylsilyltrifluoroacetamide (Macherey-Nagel, Dueren, Germany) at 37°C for 30 minutes to increase the volatility of polar metabolites; 1.5 μL of this solution was injected into an automatic liner exchange system with direct thermodesorption unit (DTD; ATAS GL, Zoetermeer, the Netherlands). For every sample, a fresh liner and microvial were used to avoid sample carryover and cross-contamination. The sample was introduced at 40°C using a programmable temperature vaporization OPTIC3 injector (ATAS GL) and heated to 290°C with a 4°C/min ramp, using the variables shown in Supplementary Table S1.
An Agilent 6890 gas chromatography oven (Hewlett-Packard, Atlanta, GA) was coupled to a Pegasus III TOF mass spectrometer from Leco (St. Joseph, MI). An MDN-35 fused silica capillary column of 30-m length, 0.32-mm inner diameter, and 0.25-μm film thickness was used for separation. For the liner deactivation procedure, the initial oven temperature was set to 85°C with an instant ramp of 50°C/min and a target temperature of 320°C with a hold time of 3 minutes. For the analysis, the gas chromatography oven was set to 85°C for 210 seconds, followed by a ramp of 15°C/min. The target temperature was 360°C with a duration of 2 minutes. The transfer line temperature was set to 250°C. Mass spectra were acquired with a scan range of 83 to 500 m/z and an acquisition rate of 20 spectra per second. The ionization mode was electron impact at 70 eV. The temperature for the ion source was set to 250°C. Chromatogram acquisition, data handling, automated peak deconvolution, library search, and retention index calculation were done by the Leco ChromaTOF software (v1.61).
Handling of missing values and normalization. To minimize the number of missing values, only metabolites that were consistently detected in at least 80% of samples were included in the statistical analyses. All known artifact peaks, such as internal standards, column bleed, plasticizers, or reagent peaks, were excluded from the result sheets. All metabolite data were normalized relative to the sum of all known metabolites in each sample and were log transformed. We replaced missing data with arithmetic means whenever required by the statistical methods.
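The filtering, normalization, and imputation steps described above can be sketched as follows. This is a minimal illustration and not the code used in the study: the function names, the dictionary-based data layout, the base-10 logarithm, and the choice to sum only over known metabolites that passed the detection filter are all assumptions made for the example.

```python
import math

def filter_and_normalize(samples, known, min_presence=0.8):
    """Filter metabolites by detection rate, then sum-normalize and log-transform.

    samples: list of dicts mapping metabolite name -> raw intensity
             (a metabolite that was not detected is simply absent).
    known:   set of annotated metabolite names used for the normalization sum.
    The log base (10) is an assumption; the text only says "log transformed".
    """
    n = len(samples)
    names = {m for s in samples for m in s}
    # Keep only metabolites detected in at least `min_presence` of the samples.
    kept = sorted(m for m in names
                  if sum(m in s for s in samples) / n >= min_presence)
    normalized = []
    for s in samples:
        # Per-sample sum over the known metabolites that passed the filter.
        total = sum(s[m] for m in kept if m in known and m in s)
        normalized.append({m: math.log10(s[m] / total) if m in s else None
                           for m in kept})
    return kept, normalized

def impute_means(normalized, kept):
    """Replace missing values (None) by the arithmetic mean of that metabolite."""
    means = {}
    for m in kept:
        present = [row[m] for row in normalized if row[m] is not None]
        means[m] = sum(present) / len(present)
    return [{m: row[m] if row[m] is not None else means[m] for m in kept}
            for row in normalized]
```

Keeping imputation as a separate function mirrors the text: univariate tests can run on the unimputed output, while methods that need a complete matrix call `impute_means` first.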
Statistical approach for detection of metabolic changes. Most of the data mining was done using the statistical language R, a programming and visualization environment that is especially useful for the analysis of high-density molecular data (18). Univariate analyses were carried out without replacement of missing data. Alterations between borderline tumors and invasive carcinomas were investigated by thresholds on the fold change and Welch's t test P values. The results of different selection procedures were validated by 1,000 random permutations of the tumors. The false discovery rate for the generated metabolite list was defined as the ratio nexp/nobs, where nobs is the number of metabolites observed to be significantly different between borderline tumors and carcinomas, and nexp is the number of metabolites expected to be significant by chance according to the permutation distribution. A P value was calculated to assess the significance of finding nobs or more modified metabolite concentrations.
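The permutation scheme can be illustrated with a short sketch. It is a simplification under stated assumptions rather than the authors' R code: metabolites are selected by a cutoff on the absolute Welch t statistic instead of a P value threshold (the Python standard library has no t distribution), and the function names, the boolean group labels, and the row-per-metabolite layout are hypothetical.

```python
import math
import random
import statistics

def welch_t(x, y):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = math.sqrt(statistics.variance(x) / len(x) +
                   statistics.variance(y) / len(y))
    if se == 0:
        return 0.0  # degenerate case: no variation in either group
    return (statistics.mean(x) - statistics.mean(y)) / se

def count_significant(data, labels, t_cut):
    """Count metabolites whose |t| reaches the cutoff.

    data:   one row per metabolite, one value per tumor.
    labels: one boolean per tumor (True = carcinoma, False = borderline).
    """
    n = 0
    for row in data:
        a = [v for v, is_carc in zip(row, labels) if is_carc]
        b = [v for v, is_carc in zip(row, labels) if not is_carc]
        if abs(welch_t(a, b)) >= t_cut:
            n += 1
    return n

def permutation_fdr(data, labels, t_cut, n_perm=1000, seed=0):
    """Estimate n_exp, the false discovery rate n_exp/n_obs, and a permutation P."""
    rng = random.Random(seed)
    n_obs = count_significant(data, labels, t_cut)
    shuffled = list(labels)
    counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reassignment of tumors to groups
        counts.append(count_significant(data, shuffled, t_cut))
    n_exp = statistics.mean(counts)
    fdr = n_exp / n_obs if n_obs else float("nan")
    # Fraction of permutations yielding n_obs or more selected metabolites.
    p_value = sum(c >= n_obs for c in counts) / n_perm
    return n_obs, n_exp, fdr, p_value
```

Shuffling the group labels keeps the group sizes fixed, so each permuted data set has the same imbalance between carcinomas and borderline tumors as the real one.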
Principal component analysis. In principal component analysis (PCA), the original set of metabolites is reduced to a new set of principal components that retain the variance-covariance structure of the data but use fewer dimensions of data space. PCA was done using the Statistica Data Miner (StatSoft, Inc., Tulsa, OK). We used the normalized data matrix as an input for PCA; this data matrix was centered about the means and scaled by the SDs. Missing values were substituted by the mean. The case designation as borderline tumor or ovarian carcinoma was used as a grouping variable for visualization of data. Furthermore, the correlation of the individual components with the grouping variable was assessed by t test analysis.
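For readers unfamiliar with the preprocessing, the sketch below shows centering and SD scaling followed by extraction of a single leading component. It is illustrative only: the study used Statistica, whereas this example uses plain power iteration on the correlation matrix, and the function names and the arbitrary starting vector are assumptions. Power iteration may fail to converge when the starting vector happens to be orthogonal to the leading eigenvector.

```python
import math
import statistics

def standardize(matrix):
    """Center each column about its mean and scale it by its SD.

    matrix: one row per tumor, one column per metabolite, no missing values.
    """
    cols = list(zip(*matrix))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)]
            for row in matrix]

def leading_component(z, n_iter=500):
    """Return loadings, eigenvalue, and scores of the first principal component."""
    n, p = len(z), len(z[0])
    # Correlation matrix of the standardized data.
    corr = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]
    # Arbitrary non-uniform starting vector (an assumption of this sketch).
    v = [float(i + 1) for i in range(p)]
    for _ in range(n_iter):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    # Scores: projection of each tumor onto the component.
    scores = [sum(a * b for a, b in zip(row, v)) for row in z]
    return v, eigenvalue, scores
```

The per-tumor scores returned here are the quantities that would be compared between borderline tumors and carcinomas by a t test, one component at a time.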
Classification. Four classification methods were checked for their ability to identify predictive signatures that are capable of distinguishing between borderline tumors and invasive carcinomas. For Fisher's linear discriminant analysis (LDA) and linear support vector machines (SVM), we made use of the corresponding functions in the R packages MASS and e1071. For nearest mean classification (NMC) and nearest centroid classification (NCC) as described in refs. 19 and 20, we have used our own implementations. All four classifiers share the property of linearity and can be visualized as a split of the space of the metabolite concentrations by a hyperplane, where all points on one side of the hyperplane are classified as carcinomas, and the others are classified as borderline tumors. The classifiers were validated in a leave-one-out approach. As a data matrix without missing values is needed by two of the four classification methods (LDA and SVM), the training data were completed by the method described above. However, model components corresponding to missing values of the test sample would be useless information for its classification. Therefore, the predictive models were fitted to restricted training data sets that contain only the metabolites with present values in the test sample. Furthermore, we have checked whether feature selection before classification leads to an improvement of the classification results: Welch's t statistic was employed to select the 2, 3, …, 291 most significantly altered metabolites from the training data. The corresponding classification rates for invasive carcinomas and for borderline tumors as well as the overall classification rate were recorded and plotted against the number of selected metabolites.
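A nearest mean classifier with leave-one-out validation can be sketched in a few lines. The authors' own implementation is not given in the text, so everything below is an assumption made for illustration: the function names, the use of squared Euclidean distance, the representation of missing values as `None`, and the rule that a metabolite is skipped when a class has no training value for it. The restriction of the model to metabolites present in the test sample follows the description above.

```python
import math

def nearest_mean_predict(train, train_labels, test_row):
    """Assign test_row to the class whose mean profile is closest.

    Only metabolites with a present (non-None) value in test_row are used.
    """
    features = [i for i, v in enumerate(test_row) if v is not None]
    classes = sorted(set(train_labels))
    means = {}
    for cls in classes:
        rows = [r for r, lab in zip(train, train_labels) if lab == cls]
        means[cls] = []
        for f in features:
            vals = [r[f] for r in rows if r[f] is not None]
            means[cls].append(sum(vals) / len(vals) if vals else None)
    # Use only metabolites for which every class has a training mean.
    usable = [k for k in range(len(features))
              if all(means[c][k] is not None for c in classes)]
    best_cls, best_dist = None, math.inf
    for cls in classes:
        dist = sum((test_row[features[k]] - means[cls][k]) ** 2 for k in usable)
        if dist < best_dist:
            best_cls, best_dist = cls, dist
    return best_cls

def leave_one_out(data, labels):
    """Return the per-class prediction rate from leave-one-out validation."""
    hits = {c: 0 for c in set(labels)}
    totals = {c: 0 for c in set(labels)}
    for i, (row, lab) in enumerate(zip(data, labels)):
        train = data[:i] + data[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        totals[lab] += 1
        if nearest_mean_predict(train, train_labels, row) == lab:
            hits[lab] += 1
    return {c: hits[c] / totals[c] for c in totals}
```

Reporting the rate per class matters with a 66-to-9 class imbalance, because an overall accuracy alone could hide poor recognition of the borderline tumors.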
Finally and going beyond the leave-one-out approach, we have applied a multiple random validation strategy (19) to check the robustness of the classification results with regard to different choices of the training set. To this end, we have drawn 2,000 random training-test splits for each possible size (2n = 6, 8, …, 16) of a balanced training set containing an equal number of borderline tumors and invasive carcinomas. The resulting distributions of average prediction rates (average of prediction rate for borderline tumors and prediction rate for invasive carcinomas) were visualized in box plots.
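The multiple random validation strategy relies on two building blocks: drawing a balanced training set and computing the class-averaged prediction rate on the remaining samples. The sketch below illustrates both under assumptions of our own (function names, label strings, and index-based splits are hypothetical); it does not reproduce the procedure of ref. 19 in detail.

```python
import random

def balanced_split(labels, n_per_class, rng):
    """Draw n_per_class samples of each class for training; the rest is the test set.

    Returns two lists of sample indices (train, test).
    """
    by_class = {}
    for i, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(i)
    train = []
    for cls in sorted(by_class):
        train += rng.sample(by_class[cls], n_per_class)
    chosen = set(train)
    test = [i for i in range(len(labels)) if i not in chosen]
    return sorted(train), test

def average_prediction_rate(true_labels, predicted):
    """Mean of the per-class prediction rates (each class weighted equally)."""
    rates = []
    for cls in sorted(set(true_labels)):
        pairs = [(t, p) for t, p in zip(true_labels, predicted) if t == cls]
        rates.append(sum(t == p for t, p in pairs) / len(pairs))
    return sum(rates) / len(rates)
```

With 66 carcinomas and 9 borderline tumors, a training set of size 2n = 16 leaves 59 test samples of which only one is a borderline tumor, so the borderline prediction rate in that setting can only be 0% or 100% per split.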
Results
GC-TOF–based analysis of ovarian tumor samples. We investigated tissue samples from 66 patients with invasive ovarian carcinomas and 9 patients with borderline tumors of the ovary. The clinicopathologic data are shown in Supplementary Table S2. Using the GC-TOF, a total of 291 individual metabolites were consistently detected in at least 80% of the tissue samples. Several of these metabolites had severely overlapping peaks that were deconvoluted by the selective ion traces using a software-based approach (Fig. 1A and B). Compound identification was done by comparison of mass spectra and retention indices with those obtained with commercially available reference compounds. Only in a very few clear cases were identifications based on mass spectral similarity to entries in the commercial library purchased from the U.S. National Institute of Standards and Technology. One such case is shown in Fig. 1B (pentadecanoic acid). Even low abundant aromatic compounds like tyrosine and hypoxanthine (Fig. 1B) were clearly identifiable at only 3- to 4.5-second retention time difference from the major abundant compound myo-inositol (Fig. 1B). It is important to note that the validation of the analytic method included compounds that are prone to oxidation, such as ascorbate (the first peak eluting in Fig. 1B), which would have remained undetectable if the tissue or the extract had been subject to oxidizing conditions during sample preparation. Using this approach, we were able to identify 114 (39.1%) of the 291 metabolites. A list of the identified compounds, which supports the comprehensiveness of the method, is given as Supplementary Material (Supplementary Table S3).
A large variety of different compound classes was positively identified, ranging from simple hydroxyacids and amino acids to sugars and sugar alcohols, but it also comprised readily oxidizable compounds, such as ascorbate and cysteine, and compounds with high metabolic turnover, such as organic phosphates. Certain differences between ovarian carcinoma and borderline tumors can already be depicted by visual inspection (e.g., for glycerol-α-phosphate that was found at lower abundance relative to inositol in borderline tumors; Supplementary Fig. S1). Furthermore, use of the automated liner exchange system described in Materials and Methods enabled a contamination-free analysis of free fatty acids, such as arachidonic acid, oleate, and monoglycerides.
Detection of differential metabolites by t test–based statistics. In a first approach, Welch's t test was employed to detect metabolites that are significantly different between ovarian carcinomas and borderline tumors. Using a threshold P < 0.01, 51 metabolites were detected that differed between the two groups (Fig. 2; Table 1). We have used the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (21) to connect these metabolic changes to different pathways and key enzymes. As shown in Table 1, differences were detected in the area of purine and pyrimidine metabolism, glycerolipid metabolism, and energy metabolism. A number of interesting features in differential regulation of ovarian carcinoma and borderline tumors can be revealed. First, it was apparent that no monoglycerolipids, sugars, or cholesterol derivatives were elevated or decreased in borderline tumors. Some statistically significant biomarkers may be regarded as nonspecific, such as the generic stress responders proline and tocopherol. However, a range of further metabolites was found in relatively lower concentrations in borderline tumors compared with ovarian carcinoma that can be interpreted in a functional way. Specifically, marked differential regulation of creatinine, lactate, glucose-1-phosphate, and the tricarboxylic acid cycle (TCA) intermediates fumarate and malate points to changes in energy metabolism, which supports the notion that malignant tumors have higher metabolic turnover rates and thus higher demands in energy supply. However, another interpretation might be even more viable. The involvement of TCA cycle intermediates also supports the cycle's role in determining relative fluxes of amino acid and lipid metabolism through pyruvate dehydrogenase, citrate synthase, and transaminase reactions departing from α-ketoglutarate and oxaloacetate. In correspondence to this view, levels of a number of amino acids were found to be higher in ovarian carcinoma compared with borderline tumors.
Interestingly, specifically amino acids were elevated in ovarian carcinoma that serve as nitrogen donors, such as glutamine, glutamate, and asparagine, but also cysteine, glycine, and threonine that are regarded as important building blocks in protein biosynthesis. Altogether, the changes in energy metabolism, lower levels in selected free fatty acids but higher levels in proteinogenic amino acids, purines, pyrimidines, and membrane lipid precursors all support the notion that higher rates of cell divisions in ovarian carcinoma are reflected in altered levels of primary metabolites.
Metabolite | Fold change (ovarian cancer vs borderline tumor) | P | Pathway | Enzymes involved (selected) |
---|---|---|---|---|
Glycerolphosphate alpha | 5.3 | 0.0058 | Glycerolipid metabolism | |
Uracil | 4.2 | 6.20e−06 | Pyrimidine metabolism | 1.3.1.2 Dihydropyrimidine dehydrogenase; 2.4.2.4 thymidine phosphorylase |
Hypoxanthine | 3.8 | 0.0033 | Purine metabolism | 2.4.2.4 Thymidine phosphorylase |
Pyrazine-2,5-bishydroxy | 3.4 | 0.0033 | | |
Inositol-2-phosphate | 3.3 | 0.0042 | Glycerolipid metabolism | |
Phosphoric acid | 3.2 | 0.0029 | Glycerolipid metabolism | |
Glutamic acid | 3 | 0.00018 | Amino acid metabolism C/N balance | |
Glycine | 3 | 4.30e−08 | Gly/Ser/Thr metabolism | |
Malic acid | 2.8 | 0.0028 | TCA | 1.1.1.37 Malate dehydrogenase; 1.1.3.3 malate oxidase |
γ-Aminobutyric acid | 2.7 | 0.0042 | Amino acid metabolism C/N balance | 1.2.1.3 Aldehyde dehydrogenase (NAD+); 2.6.1.19 GABA transaminase; 4.1.1.15 glutamate decarboxylase |
α-Tocopherol 2 | 2.5 | 0.0083 | (No biosynthesis in humans) | |
Glucose-1-phosphate degr. | 2.5 | 0.0025 | Glycerolipid metabolism | |
Proline | 2.3 | 3.80e−07 | Amino acid metabolism | |
Fumaric acid | 2.2 | 0.0011 | TCA | |
Creatine | 2.2 | 0.00069 | Creatine P breakdown | |
Cysteine | 2 | 0.006 | Amino acid metabolism | |
Butyric acid 2-hydroxy | 2 | 0.0051 | Propanoate metabolism | 1.1.1.27 Lactate dehydrogenase |
Glycine minor | 2 | 0.00041 | Gly/Ser/Thr metabolism | |
Glutamine | 2 | 0.0083 | Amino acid metabolism C/N balance | 3.5.1.2 Glutaminase |
Threonine | 1.9 | 0.00011 | Gly/Ser/Thr metabolism | |
Asparagine | 1.6 | 0.00093 | Amino acid metabolism | |
Nonadecanoic acid | −1.2 | 0.0014 | Free fatty acid | |
Stearic acid | −1.4 | 0.0096 | Free fatty acid | |
Heptadecanoic acid | −1.4 | 0.00073 | Free fatty acid | |
Benzoic acid | −1.6 | 0.00011 | Phenylalanine metabolism | |
Lactic acid | −2.2 | 0.0017 | Propanoate, glycolysis, and pyruvate metabolism | 1.1.1.27 Lactate dehydrogenase |
Permutation analysis. The large number of variables measured simultaneously in -omics studies gives rise to a massive multiple testing situation with an accumulating risk of false-positive detections when one proceeds from metabolite to metabolite. We have addressed these issues in a framework of repeated sample permutations. First, the number of metabolites selected by a threshold P < 0.01 was found to be <51 in all 1,000 shuffled data sets, which is strong evidence for the presence of a biological signal in the data. Furthermore, the false discovery rate was estimated as 7.8%, suggesting that approximately four metabolites in the set of 51 metabolites were detected by chance and that the majority of detected metabolite changes are due to biological differences.
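As a worked check of the figure of approximately four chance detections, the false discovery rate definition from Materials and Methods can be rearranged for the expected number of false positives. The symbols follow the nexp and nobs of the text; the numbers are those reported above.

```latex
\mathrm{FDR} = \frac{n_{\mathrm{exp}}}{n_{\mathrm{obs}}}
\quad\Longrightarrow\quad
n_{\mathrm{exp}} = \mathrm{FDR} \times n_{\mathrm{obs}}
= 0.078 \times 51 \approx 3.98 \approx 4
```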
Cluster analysis. To visualize the differences in metabolite signatures between borderline tumors and ovarian carcinomas, we used a hierarchical clustering algorithm based on the Pearson correlation coefficient and the average linkage method and did simultaneous clustering of metabolites and tumor samples. Unsupervised clustering using all metabolites did not result in a separation between borderline tumors and ovarian carcinomas (data not shown). Therefore, we investigated whether cluster analysis could be used for visualization and interpretation of the metabolite signatures detected by t test statistics. When we used a Welch t test threshold of P = 0.01 as a filter for subsequent clustering analysis, a separation of most of the tumors was possible (Fig. 3). Grouping of metabolites generally was observed in a manner discussed above; that is, compounds that were biochemically related were generally found to cluster together, such as the pairs malate/fumarate, glutamine/glutamate, uracil/hypoxanthine, and stearate/heptadecanoate. This clustering supports the interpretation that levels of primary metabolites reflect the general metabolic turnover rates that are altered in invasive ovarian carcinoma. Interestingly, the color coding in the cluster analysis suggests that lactate, which was found to be differently regulated in the Welch t test, might be one of the few false-positive discoveries: only a very few patients had elevated levels, enough to cross the significance threshold but not enough to regard lactate as a valid biomarker. In this way, cluster analysis helps in investigating putative biomarkers and in reducing the risk of overinterpretation of findings.
PCA. PCA is used for data reduction in multidimensional data matrices and has been described for analysis of plant metabolomic data (22) as well as cDNA microarray data (23). We have used PCA to identify principal components in the data matrix of all metabolites. The differences of individual principal components between borderline tumors and carcinomas were compared with the t test. We found that four of the principal components (principal components 2, 8, 27, and 30) were significantly different between both groups (Fig. 4A-D). Interestingly, principal component 2, which is the second most important component, showed a very strong difference between both groups (P = 0.00000009). As shown in Fig. 4E, the three-dimensional scatter plot using the information from components 2, 8, and 27 allowed a separation of borderline tumors and ovarian carcinomas, with only one borderline tumor located in the ovarian carcinoma group. These results suggest that there is an underlying structure in the complex metabolic data set, and that statistical approaches are able to detect elements of this underlying structure.
Construction of classification models. Four supervised methods were employed to construct predictive models that are capable of distinguishing between invasive carcinomas and borderline tumors based on metabolomic profiling: LDA, SVM, NMC, and NCC (19, 20, 24, 25). The corresponding four classifiers were built from the data of all 291 measured metabolites and validated in a leave-one-out cross-validation. Prediction accuracies were calculated separately for invasive carcinomas, for borderline tumors, and for all lesions (Supplementary Table S4). The best results were obtained with the NMC and the NCC method that both yielded prediction accuracies of 87.9% and 88.9% for the invasive carcinomas and the borderline tumors, respectively.
Additionally, we have checked the performance of a two-step classification scheme consisting of a feature selection step followed by the construction of a predictive model with LDA, SVM, NMC, or NCC (Supplementary Fig. S2A). To this end, the 2, 3, …, 291 most significantly altered metabolites were used for model building, and the corresponding classification accuracies were plotted against the number of features (Fig. 5A-D). Again, the best and most stable results were obtained with nearest-mean classification. Furthermore, we observed that the feature selection step was not essential for good classification rates, as one of the best results was obtained with NMC or NCC without it (i.e., with a classifier built from all 291 measured metabolites). On the other hand, good classification results could also be obtained with classifiers built from a very small number of metabolites (such as 5, 10, or 20).
Finally, we have checked the robustness of the classification results with regard to different choices of the training data (Supplementary Fig. S2B). In a repeated loop, a balanced training set was randomly drawn from the 75 metabolite profiles; a classifier was fitted; and the classifier was evaluated on a test set that contained all remaining profiles. For each size of training set, we obtained a distribution of average prediction rates that was visualized in a box plot (Supplementary Fig. S3A-D). The prediction rates turned out to significantly exceed the baseline of 50% over a wide range of the training set size. A lack of significance was only observed for the largest size of training set (2n = 16). However, in that case, the test sets contain only a single borderline tumor, and the lack of significance for the average prediction rates arises from large variations in the prediction rates for borderline tumors (data not shown).
Discussion
In this study, we investigated metabolomic profiles of different types of ovarian tumors, borderline tumors, and invasive carcinomas. We were able to validate significant differences in metabolite levels and to detect metabolite signatures that separate about 90% of borderline tumors from the carcinomas and vice versa. We have shown that the metabolite signatures are capable of predicting the status (borderline tumor or invasive carcinoma) of a previously unknown test tumor, and that the classification results are robust against different choices of the training set.
Our study shows for the first time that large-scale metabolic profiling using GC-TOF is suitable for analysis of fresh frozen human tumor samples. Using the KEGG database, we have linked the metabolic changes to some putative key enzymes that play an important role in the corresponding pathways. Some of these enzymes that act in pyrimidine metabolism (e.g., dihydropyrimidine dehydrogenase and thymidine phosphorylase) have already been shown to play a prognostic role in ovarian cancer (26). Tanner et al. have shown that aldehyde dehydrogenase is increased in low-stage ovarian carcinomas compared with high-stage tumors (27). Nicholson-Guthrie et al. have described increased γ-aminobutyric acid levels in ovarian cancer patients (28).
In recent years, several studies have investigated metabolic changes in different types of malignant tumors. These studies have mainly used nuclear magnetic resonance (NMR) spectroscopy to detect changes in metabolic patterns associated with apoptosis (29–31) or response to hypoxia (32) in cell cultures. Sitter et al. have used high-resolution magic angle spinning in combination with PCA to investigate tissue samples of eight patients with cervical cancer and found higher levels of cholines and amino acid residues and lower levels of glucose in the malignant tissue (33). Gribbestadt et al. compared malignant and nonmalignant breast tissue using 1H NMR spectroscopy and found low levels of glucose and high levels of choline in tumor tissue (34).
These NMR-based methods (35, 36) are interesting as well, as it is possible to get high-resolution spectra from intact tissue samples that are fully available for additional standardized histopathologic analysis (37). A clinically viable solution might eventually reach out to apply both GC-TOF and liquid chromatography-MS (LC-MS) to identify metabolic biomarkers in malignant tumors by large-scale metabolomics and to use NMR-based methods to measure these biomarkers in the clinical routine, potentially by in vivo imaging.
In this article, we have presented results from a novel and very specific variant of GC-TOF MS: the conjunction with automated liner exchange and direct thermodesorption injection (DTD). Two advantages of this technique were expected: (a) avoidance of cross-contamination and (b) more gentle introduction of compounds onto the gas chromatograph, which enables nondestructive analysis of thermolabile compounds. Because each sample was analyzed with a fresh and previously unused microvial and liner, no cross-contamination could occur due to matrix depositions. Experiments on blood plasma injections have validated the viability of this technique to quantify endogenous levels of free fatty acids and monoglycerides (data not shown). We were, therefore, confident that liner exchange and DTD would enable the detection of significant changes in lipid metabolism and individual fatty acids, and indeed, such changes were observed in the comparison of borderline tumors and ovarian carcinomas. Thus far, this is the first report on metabolite profiling by GC-TOF in which changes in lipid metabolism have been found, a result that might otherwise have been prevented by the widespread use of classic split/splitless injectors. Furthermore, such classic injectors introduce compounds onto the gas chromatography column at injector temperatures of 230°C to 250°C, which is sufficient to decompose more thermolabile compounds like organic phosphates. DTD injection, however, slowly ramped the temperature up to the boiling point of each compound, which supposedly leads to less destruction and improved quantitation capabilities.
Quality controls have been used primarily for the technique itself (i.e., instrument sensitivity, reagent blank controls, and method blank controls). Around 25% of all injections were done for quality control; however, the error rate of the extraction technique used here is unknown because the quantity of tissue biopsies was not large enough to do in-depth validation studies of the extraction technique. Nevertheless, this extraction technique has been used successfully before for mouse tissues (38). Method error rates for mouse and plant tissues as well as blood plasma were found in the range of 5% to 25% relative SD, depending on the chemical structure of the metabolites. Further limitations of GC-TOF analysis are that bisphosphates and trisphosphates cannot be analyzed because of their thermolability and that large compounds >550 Da are usually not volatile enough to be detected. Such compound classes, including, for example, complex membrane lipids, need to be analyzed by LC-MS or direct infusion electrospray MS.
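The relative SD quoted above is the sample SD of replicate measurements expressed as a percentage of their mean. The following minimal sketch illustrates the calculation; the function name and the replicate intensities are hypothetical values chosen for illustration and are not data from this study.

```python
import numpy as np

def relative_sd_percent(values):
    """Relative SD in percent: sample SD (n - 1 denominator) divided by the mean."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / values.mean()

# Hypothetical peak intensities of one metabolite in three replicate extractions
# (illustrative numbers only, not measurements from this study).
replicates = [90.0, 100.0, 110.0]
rsd = relative_sd_percent(replicates)
print(f"relative SD: {rsd:.1f}%")
```

A value computed in this way for each metabolite would be compared against the 5% to 25% range reported for other sample types.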
Despite these limitations, the use of GC-TOF has several advantages for cancer diagnostics. First, only a small amount of tissue is needed. In this study, 5-mg fresh tissue sections were used, but a recent study has shown that as few as 5,000 pooled single plant cells would be sufficient to quantify 66 identified metabolites (39). Moreover, gas chromatography–based MS is a very mature technique that incurs limited costs for each measurement and results in highly standardized chromatograms and mass spectra, which provide a convenient and robust route to compound identification and quantification in laboratory routines. Specifically, the high level of automation would allow larger standardized studies involving a number of institutes, thereby greatly improving the statistical power of the eventual assessments. In addition to the studies of tumor tissue, investigations of biomarkers may also be done in serum samples. These investigations would be particularly interesting as a complement to the serum proteomic studies that have already been done in ovarian cancer patients (40, 41).
Our investigation has several limitations. To have a large cohort for our analysis, we combined different groups of ovarian tumors and borderline tumors. To refine the analysis in further studies, it may be interesting to analyze changes in more defined cohorts. For example, serous carcinomas of different histologic grades might reveal specific metabolic patterns when compared with serous borderline tumors. If possible, cases of the so-called micropapillary serous tumors may be included. These investigations are particularly interesting in light of new data suggesting that an adenoma-borderline tumor-carcinoma sequence exists for a subgroup of low-grade serous ovarian tumors (42–44). Investigations of metabolic patterns in the transition between borderline tumors and serous G1 carcinomas may provide interesting new information on the proposed adenoma-borderline tumor-carcinoma sequence. However, these investigations are limited by the rarity of these ovarian lesions. Only multi-institutional projects with combined resources would be able to conduct these studies. In our investigations, the differences detected between borderline tumors and invasive carcinomas reflect differences in the tumor tissue, which contains a mixture of epithelial tumor cells and stromal cells, including inflammatory cells and blood vessels. Therefore, the metabolic patterns described in our analysis are not limited to the epithelial cells but reflect metabolic patterns in this complex tumor microecosystem. Because it has been shown that stromal changes play an important role in the progression of malignant tumors (45, 46), this approach may even have advantages compared with studies that are limited to epithelial cells.
A statistical-methodologic difficulty arises from the composition of the patient cohort, with a characteristic overabundance of invasive carcinomas (n = 66) and a relatively small number of borderline tumors (n = 9). We addressed possible biases induced by this imbalance by reporting either the specific prediction rates for borderline tumors and for invasive carcinomas, or the average of the specific prediction rates. However, the imbalanced composition of the patient cohort is expected to be responsible for the tendency to obtain lower prediction rates for borderline tumors than for carcinomas. The imbalance problem was best tackled by the nearest-mean and the nearest-centroid classifiers, which led to the highest and most stable prediction rates (Supplementary Fig. S2C and D). This is in line with the good performance of simple classifiers on gene expression data, as reported by Dudoit et al. (47).
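The reason for reporting class-specific prediction rates can be illustrated with a short sketch. It assumes only the cohort sizes given above (9 borderline tumors, 66 carcinomas); the function names, the label coding (0 = borderline, 1 = carcinoma), and the trivial "always carcinoma" predictor are hypothetical and serve illustration only. The nearest-centroid function is a generic Euclidean-distance version and is not claimed to reproduce the exact classifiers used in this study.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, X_test):
    """Assign each test sample to the class whose training mean is closest (Euclidean)."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Pairwise distances, shape (n_test, n_classes)
    dists = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

def class_specific_rates(y_true, y_pred):
    """Fraction of correctly predicted samples, computed separately for each class."""
    return {c.item(): float(np.mean(y_pred[y_true == c] == c))
            for c in np.unique(y_true)}

def average_specific_rate(y_true, y_pred):
    """Unweighted mean of the class-specific prediction rates."""
    rates = class_specific_rates(y_true, y_pred)
    return sum(rates.values()) / len(rates)

# Cohort labels with the sizes stated in the text: 0 = borderline, 1 = carcinoma.
y_true = np.array([0] * 9 + [1] * 66)

# Hypothetical trivial predictor that calls every sample a carcinoma.
y_trivial = np.ones_like(y_true)

overall = float(np.mean(y_trivial == y_true))      # 66 of 75 samples correct
rates = class_specific_rates(y_true, y_trivial)    # per-class rates
balanced = average_specific_rate(y_true, y_trivial)
print(overall, rates, balanced)
```

In this sketch, the trivial predictor classifies 66 of 75 samples correctly overall although it recognizes no borderline tumor, whereas the average of the specific prediction rates is only 0.5. This is why an overall rate alone would be misleading for a cohort of this composition.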
Thus, our results show that metabolic signatures as well as individual metabolites can be detected from fresh-frozen tumor tissue of ovarian tumors. Therefore, metabolomics is a promising approach in addition to functional genomics and proteomics for analyses of changes in malignant tumors. Metabolomic analysis will be relevant for tumor biology in different types of investigations. First, similar to functional genomics, it is a promising method for classification of different tumor types and for comparison of malignant tumors with their corresponding normal tissue. Second, analysis of metabolic changes may be used to develop classifiers for therapy response prediction and may lead to the identification of new prognostic markers. Third, metabolomics may be an interesting method to investigate the changes associated with administration of new molecular targeted therapeutics (48).
To obtain a complete picture of the changes during malignant tumor progression, it will be interesting for further studies to integrate data from all levels of “-omics” into a single data matrix that gives a holistic view of the changes associated with malignant transformation. This data matrix would contain information on changes in mRNA levels measured with cDNA microarrays as well as proteomic and metabolomic data (49) and might be used as a basis for modeling of signal transduction pathways coupled to metabolic circuits (50). Such analyses may be expected to link metabolic changes in biochemical pathways to the enzymes involved, and subsequently to the genetic alterations, leading altogether to a better understanding of the molecular changes relevant for malignant transformation. However, it should be pointed out that these approaches are still at a very early stage and will require more experience with the acquisition and analysis of metabolic data as well as with their integration with other -omics data.
Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).
Acknowledgments
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank Ines Koch for her excellent technical assistance and Olaf Kießlich for his help with the statistical analysis.