Metabolites are the end products of cellular regulatory processes, and their levels can be regarded as the ultimate response of biological systems to genetic or environmental changes. We have used a metabolite profiling approach to test the hypothesis that quantitative signatures of primary metabolites can be used to characterize molecular changes in ovarian tumor tissues. Sixty-six invasive ovarian carcinomas and nine borderline tumors of the ovary were analyzed by gas chromatography/time-of-flight mass spectrometry (GC-TOF MS) using a novel contamination-free injector system. After automated mass spectral deconvolution, 291 metabolites were detected, of which 114 (39.1%) were annotated as known compounds. By t test statistics with P < 0.01, 51 metabolites were significantly different between borderline tumors and carcinomas, with a false discovery rate of 7.8%, estimated with repeated permutation analysis. Principal component analysis (PCA) revealed four principal components that were significantly different between both groups, with the highest significance found for the second component (P = 0.00000009). PCA as well as additional supervised predictive models allowed a separation of 88% of the borderline tumors from the carcinomas. Our study shows for the first time that large-scale metabolic profiling using GC-TOF MS is suitable for analysis of fresh frozen human tumor samples, and that there is a consistent and significant change in primary metabolism of ovarian tumors, which can be detected using multivariate statistical approaches. We conclude that metabolomics is a promising high-throughput, automated approach in addition to functional genomics and proteomics for analyses of molecular changes in malignant tumors. (Cancer Res 2006; 66(22): 10795-804)

Metabolite levels can be regarded as the amplified output of biological systems in response to genetic or environmental changes. However, only in recent years have technologies been developed that allow comprehensive and quantitative investigation of a multitude of different metabolites, an approach called “metabolomics” in analogy to the terms “transcriptomics” and “proteomics” (1–3). Griffin et al. (4) have suggested metabolic profiling as a promising tool for analysis of the malignant phenotype, drug target evaluation, and tumor diagnosis. Advances in instrumentation and computation have enabled new strategies for separation and identification of a large number of individual metabolites. To this end, comprehensive analysis by gas chromatography coupled with mass spectrometry (MS) has been used as a “gold standard” for studying primary metabolism, specifically in plant metabolomics (5, 6). Up to 1,000 individual metabolites could be retrieved from plant tissues using time-of-flight (TOF) MS together with deconvolution software that identifies individual compounds based on detection of model ions, even in cases where the mass spectra of two or more compounds overlap (7). Metabolite profiling has been used in clinical research to study metabolic diseases (8, 9); however, surprisingly few reports (10, 11) have studied comprehensive metabolic responses for the characterization of tumor types. The combination of gas chromatography-TOF (GC-TOF)–based analysis with automated deconvolution techniques has thus far not been used for analysis of human tumor samples, but only for mouse tissues in obesity-related research (12).

Ovarian carcinoma is the fifth most common cancer in women. In the United States, an estimated 20,180 new cases of ovarian carcinoma and an estimated 15,310 deaths are expected for 2006 (13). It has long been recognized that besides invasive ovarian carcinoma, a different type of lesion exists in the ovary, the so-called borderline tumor (reviewed in ref. 14). Borderline tumors represent 5% to 10% of ovarian epithelial tumors. They are characterized by complex atypical proliferations of neoplastic and slightly atypical epithelium (15). In contrast to invasive carcinomas, borderline tumors lack destructive stromal invasion. The different biological background of borderline tumors and invasive ovarian carcinomas translates into different clinical behavior of the two groups of tumors. Whereas ovarian carcinomas are rapidly progressing tumors with a high recurrence rate and a 5-year survival rate as low as 30% to 40% (13, 14), borderline tumors have an indolent clinical course with recurrences in a minority of patients and almost no tumor-associated deaths. We used these two types of lesions as a model and evaluated the hypothesis that GC-TOF–based metabolite profiling may serve as a tool to assess the molecular changes in tumor tissue and to detect metabolic patterns associated with different biological tumor entities.

Study population and histopathologic examination. For metabolic profiling, 75 patients with ovarian lesions who were diagnosed at the Institute of Pathology, Charité Hospital, Berlin, Germany were included in the study. The tissue specimens included 66 primary invasive ovarian carcinomas and 9 borderline tumors. Tissue was dissected by a senior pathologist in the operating room, immediately frozen in liquid nitrogen, and stored at −80°C. Additional H&E sections were prepared for histopathologic evaluation (16).

Metabolic profiling. Fresh weight (5 mg) of frozen biopsy tissue was prepared under standard operating procedure SOP 2003-2. Tissue was homogenized in 2-mL Eppendorf tubes for 30 seconds at 25 s⁻¹ using 3-mm inner diameter metal balls in a ball mill (Retsch, Germany). Extraction was carried out using 1 mL of a one-phase mixture of chloroform/methanol/water (2:5:2, v/v/v) at −20°C for 5 minutes (17). After centrifugation, the supernatant was concentrated to complete dryness in a speedvac concentrator. The dried metabolic extract was derivatized in two steps. First, carbonyl functions were protected by methoximation using 20 μL of a 40 mg/mL solution of methoxyamine hydrochloride in pyridine at 28°C for 90 minutes. Afterwards, acidic protons (e.g., of hydroxyl, amine, sulfhydryl, and carboxyl groups) were exchanged for trimethylsilyl groups using 180 μL N-methyl-N-trimethylsilyltrifluoroacetamide (Macherey-Nagel, Dueren, Germany) at 37°C for 30 minutes to increase the volatility of polar metabolites. Of this solution, 1.5 μL was injected into an automatic liner exchange system with direct thermodesorption unit (DTD; ATAS GL, Zoetermeer, the Netherlands). For every sample, a fresh liner and microvial were used to avoid sample carryover and cross-contamination. The sample was introduced at 40°C using a programmable temperature vaporization OPTIC3 injector (ATAS GL) and heated to 290°C with a 4°C/min ramp using the variables shown in Supplementary Table S1.

An Agilent 6890 gas chromatography oven (Hewlett-Packard, Atlanta, GA) was coupled to a Pegasus III TOF mass spectrometer from Leco (St. Joseph, MI). An MDN-35 fused silica capillary column of 30-m length, 0.32-mm inner diameter, and 0.25-μm film thickness was used for separation. For the liner deactivation procedure, the initial oven temperature was set to 85°C with an instant ramp of 50°C/min and a target temperature of 320°C with a hold time of 3 minutes. For the analysis, the gas chromatography oven was held at 85°C for 210 seconds, followed by a ramp of 15°C/min. The target temperature was 360°C, held for 2 minutes. The transfer line temperature was set to 250°C. Mass spectra were acquired with a scan range of 83 to 500 m/z and an acquisition rate of 20 spectra per second. The ionization mode was electron impact at 70 eV. The temperature of the ion source was set to 250°C. Chromatogram acquisition, data handling, automated peak deconvolution, library search, and retention index calculation were done with the Leco ChromaTOF software (v1.61).

Handling of missing values and normalization. To minimize the number of missing values, only metabolites that were consistently detected in at least 80% of samples were included in the statistical analyses. All known artifact peaks, such as internal standards, column bleed, plasticizers, or reagent peaks, were excluded from the result sheets. All metabolite data were normalized relative to the sum of all known metabolites in each sample and were log transformed. Missing data were replaced with arithmetic means whenever required by the statistical methods.
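
The filtering, normalization, and imputation steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the dictionary-based data layout, the use of the natural logarithm, and the assumption that every sample contains strictly positive intensities for at least one known metabolite are choices made here for illustration only.

```python
import math

def filter_and_normalize(samples, known, min_fraction=0.8):
    """samples: list of dicts mapping metabolite name -> raw intensity
    (a missing value is an absent key or None).
    known: set of metabolite names annotated as known compounds."""
    n = len(samples)
    names = sorted({m for s in samples for m in s})
    # Keep only metabolites detected in at least min_fraction of the samples.
    kept = [m for m in names
            if sum(1 for s in samples if s.get(m) is not None) >= min_fraction * n]
    normalized = []
    for s in samples:
        # Assumption: the normalization sum runs over all known metabolites
        # that are present in this sample.
        total_known = sum(v for m, v in s.items() if m in known and v is not None)
        row = {}
        for m in kept:
            v = s.get(m)
            # Assumption: natural log; the source does not state the base.
            row[m] = None if v is None else math.log(v / total_known)
        normalized.append(row)
    return kept, normalized

def impute_means(rows, names):
    """Replace missing values by the arithmetic mean of the present values."""
    means = {}
    for m in names:
        present = [r[m] for r in rows if r[m] is not None]
        means[m] = sum(present) / len(present)
    return [{m: (r[m] if r[m] is not None else means[m]) for m in names}
            for r in rows]
```

Keeping imputation in a separate function mirrors the text: univariate tests can work on the rows that still contain missing values, whereas methods that need a complete matrix call `impute_means` first.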

Statistical approach for detection of metabolic changes. Most of the data mining was done using the statistical language R, a programming and visualization environment that is especially useful for the analysis of high-density molecular data (18). Univariate analyses were carried out without replacement of missing data. Alterations between borderline tumors and invasive carcinomas were investigated by thresholds on the fold change and on Welch's t test P values. The results of the different selection procedures were validated by 1,000 random permutations of the tumors. The false discovery rate for the generated metabolite list was defined as the ratio n_exp/n_obs, where n_obs is the number of metabolites observed to differ significantly between borderline tumors and carcinomas and n_exp is the number of metabolites expected to be significant by chance, estimated from the permutation distribution. A P value was calculated to assess the significance of finding n_obs or more altered metabolite concentrations.
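
The permutation scheme can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' R code: the function names are hypothetical, the data are assumed complete (no missing values) with non-constant values in each group, and a cutoff on the absolute Welch t statistic stands in for the P < 0.01 threshold so that the example needs only the standard library.

```python
import random
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic for two samples with unequal variances."""
    se2 = variance(x) / len(x) + variance(y) / len(y)
    return (mean(x) - mean(y)) / se2 ** 0.5

def count_significant(data, labels, t_cut):
    """Number of metabolites with |t| >= t_cut between label 1 and label 0.
    data holds one list of values per metabolite, ordered like labels."""
    count = 0
    for values in data:
        g1 = [v for v, l in zip(values, labels) if l == 1]
        g0 = [v for v, l in zip(values, labels) if l == 0]
        if abs(welch_t(g1, g0)) >= t_cut:
            count += 1
    return count

def permutation_fdr(data, labels, t_cut, n_perm=1000, seed=0):
    """Return (n_obs, n_exp, FDR = n_exp / n_obs, permutation P value)."""
    rng = random.Random(seed)
    n_obs = count_significant(data, labels, t_cut)
    shuffled = list(labels)
    perm_counts = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reassignment of the tumor labels
        perm_counts.append(count_significant(data, shuffled, t_cut))
    n_exp = mean(perm_counts)  # expected number of chance findings
    fdr = n_exp / n_obs if n_obs else float("nan")
    # Fraction of permutations yielding n_obs or more selected metabolites.
    p_value = sum(c >= n_obs for c in perm_counts) / n_perm
    return n_obs, n_exp, fdr, p_value
```

With the values reported in this study, n_obs = 51 and a false discovery rate of 7.8% correspond to n_exp ≈ 4.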

Principal component analysis. In principal component analysis (PCA), the original set of metabolites is reduced to a new set of principal components that retain the variance-covariance structure of the data but use fewer dimensions of data space. PCA was done using the Statistica Data Miner (StatSoft, Inc., Tulsa, OK). We used the normalized data matrix as an input for PCA; this data matrix was centered about the means and scaled by the SDs. Missing values were substituted by the mean. The case designation as borderline tumor or ovarian carcinoma was used as a grouping variable for visualization of data. Furthermore, the correlation of the individual components with the grouping variable was assessed by t test analysis.
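
To make the centering, scaling, and projection steps concrete, the following Python sketch autoscales a small data matrix and extracts the leading principal component. It is only an illustration: the study used Statistica, whereas this sketch swaps in plain power iteration, computes just the first component (the analysis in this study examines many components), and uses hypothetical function names. It assumes a complete matrix with no constant columns.

```python
from statistics import mean, stdev

def autoscale(rows):
    """Center each column on its mean and scale it by its SD."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def first_component(z, n_iter=200):
    """Leading principal axis and sample scores of autoscaled data z,
    obtained by power iteration on z^T z."""
    p = len(z[0])
    # Asymmetric start vector; assumed not orthogonal to the leading axis.
    w = [1.0 / (j + 1) for j in range(p)]
    for _ in range(n_iter):
        scores = [sum(a * b for a, b in zip(r, w)) for r in z]            # z w
        w = [sum(r[j] * s for r, s in zip(z, scores)) for j in range(p)]  # z^T (z w)
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    scores = [sum(a * b for a, b in zip(r, w)) for r in z]
    return w, scores
```

The returned scores play the role of one principal component per sample; comparing such scores between the two tumor groups with a t test corresponds to the last step described above.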

Classification. Four classification methods were evaluated for their ability to identify predictive signatures that are capable of distinguishing between borderline tumors and invasive carcinomas. For Fisher's linear discriminant analysis (LDA) and linear support vector machines (SVM), we made use of the corresponding functions in the R packages MASS and e1071. For nearest mean classification (NMC) and nearest centroid classification (NCC) as described in refs. 19 and 20, we used our own implementations. All four classifiers share the property of linearity and can be visualized as a split of the space of the metabolite concentrations by a hyperplane, where all points on one side of the hyperplane are classified as carcinomas, and the others are classified as borderline tumors. The classifiers were validated in a leave-one-out approach. As a data matrix without missing values is needed by two of the four classification methods (LDA and SVM), the training data were completed by the method described above. However, model components corresponding to missing values of the test sample would be useless for its classification. Therefore, the predictive models were fitted to restricted training data sets that contain only the metabolites with present values in the test sample. Furthermore, we checked whether feature selection before classification improves the classification results: Welch's t statistic was used to select the 2, 3, …, 291 most significantly altered metabolites from the training data. The corresponding classification rates for invasive carcinomas and for borderline tumors as well as the overall classification rate were recorded and plotted against the number of selected metabolites.
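
As a hedged illustration of leave-one-out validation with a linear classifier, the sketch below implements a generic nearest-mean rule with Euclidean distance. It is not the authors' implementation of NMC or NCC: the function names are hypothetical, the data are assumed complete, and the restriction to metabolites present in the test sample as well as the feature selection step are omitted.

```python
def class_means(X, y):
    """Mean profile of each class in the training data."""
    means = {}
    for label in set(y):
        rows = [x for x, l in zip(X, y) if l == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def predict(means, x):
    """Assign x to the class whose mean profile is closest."""
    def sqdist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda label: sqdist(means[label]))

def leave_one_out(X, y):
    """Per-class prediction rates from leave-one-out cross-validation.
    X is a list of profiles, y a list of class labels."""
    hits = {label: 0 for label in set(y)}
    for i in range(len(X)):
        # Fit on all samples except sample i, then classify sample i.
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if predict(class_means(train_X, train_y), X[i]) == y[i]:
            hits[y[i]] += 1
    return {label: hits[label] / y.count(label) for label in hits}
```

Reporting the rate per class rather than a single overall accuracy matters here because the two groups are very unequal in size (66 carcinomas versus 9 borderline tumors).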

Finally and going beyond the leave-one-out approach, we have applied a multiple random validation strategy (19) to check the robustness of the classification results with regard to different choices of the training set. To this end, we have drawn 2,000 random training-test splits for each possible size (2n = 6, 8, …, 16) of a balanced training set containing an equal number of borderline tumors and invasive carcinomas. The resulting distributions of average prediction rates (average of prediction rate for borderline tumors and prediction rate for invasive carcinomas) were visualized in box plots.
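
The random validation strategy can be sketched as follows. The helper names are hypothetical, and the classifier itself is left out; the sketch only shows how a balanced training set might be drawn and how the average prediction rate described above is formed from the per-class rates.

```python
import random

def balanced_split(labels, n_per_class, rng):
    """Indices of a balanced training set (n_per_class samples from each
    class) and of the test set holding all remaining samples."""
    train = []
    for label in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == label]
        train.extend(rng.sample(idx, n_per_class))
    train_set = set(train)
    test = [i for i in range(len(labels)) if i not in train_set]
    return sorted(train), test

def average_prediction_rate(y_true, y_pred):
    """Mean of the per-class prediction rates, so that each class
    contributes equally regardless of its size."""
    rates = []
    for label in sorted(set(y_true)):
        idx = [i for i, l in enumerate(y_true) if l == label]
        rates.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(rates) / len(rates)
```

For the largest training size (2n = 16), a call such as `balanced_split(labels, 8, random.Random(1))` on 9 borderline tumors and 66 carcinomas leaves a single borderline tumor in the test set, which is why the prediction rate for that class can only be 0% or 100% in each split.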

GC-TOF–based analysis of ovarian tumor samples. We investigated tissue samples from 66 patients with invasive ovarian carcinomas and 9 patients with borderline tumors of the ovary. The clinicopathologic data are shown in Supplementary Table S2. Using GC-TOF, a total of 291 individual metabolites were consistently detected in at least 80% of the tissue samples. Several of these metabolites had severely overlapping peaks that were deconvoluted by selective ion traces using a software-based approach (Fig. 1A and B). Compound identification was done by comparison of mass spectra and retention indices with those obtained with commercially available reference compounds. Only in a very few clear cases were identifications based on mass spectral similarity to entries in the commercial library purchased from the U.S. National Institute of Standards and Technology. One such case is given in Fig. 1B (pentadecanoic acid). Even low abundant aromatic compounds like tyrosine and hypoxanthine (Fig. 1B) were clearly identifiable at a retention time difference of only 3 to 4.5 seconds from the highly abundant compound myo-inositol (Fig. 1B). It is important to note that the validation of the analytic method included compounds that are prone to oxidation, such as ascorbate (the first peak eluting in Fig. 1B), which would remain undetectable if the tissue or the extract had been subjected to oxidizing conditions during sample preparation. Using this approach, we were able to identify 114 (39.1%) of the 291 metabolites. A list of the identified compounds, which supports the comprehensiveness of the method, is given as Supplementary Material (Supplementary Table S3).
A large variety of compound classes was positively identified, ranging from simple hydroxyacids and amino acids to sugars and sugar alcohols. The identified compounds also included readily oxidizable compounds, such as ascorbate and cysteine, and compounds with high metabolic turnover, such as organic phosphates. Certain differences between ovarian carcinomas and borderline tumors are already apparent on visual inspection (e.g., glycerol-α-phosphate was found at lower abundance relative to inositol in borderline tumors; Supplementary Fig. S1). Furthermore, use of the automated liner exchange system described in Materials and Methods enabled a contamination-free analysis of free fatty acids, such as arachidonic acid and oleate, and of monoglycerides.

Figure 1.

Metabolic profiling by GC-TOF, showing the capability to resolve hundreds of compounds within 25-minute runs. A, base peak intensity GC-MS chromatogram of an extract from an ovarian carcinoma. B, magnification of a 10-second window of the base peak chromatogram around the abundant compound myo-inositol. Three ion traces at m/z 117, 218, and 265 are selected to show separation and mass spectral deconvolution of identified compounds. Pentadecanoic acid was identified via mass spectral match in a National Institute of Standards and Technology library search. Retention index (RI) and peak purity (a unitless score for quality of mass spectral deconvolution, with scores <1 indicating well-resolved mass spectra) are given. Tyrosine, hypoxanthine, and myo-inositol were identified by matching both retention index and mass spectrum using a custom-built library. Quality of mass spectral match is indicated as scores up to 1,000, with scores >800 indicating high-confidence matches.

Detection of differential metabolites by t test–based statistics. In a first approach, Welch's t test was used to detect metabolites that differ significantly between ovarian carcinomas and borderline tumors. Using a threshold of P < 0.01, 51 metabolites were found to differ between the two groups (Fig. 2; Table 1). We used the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (21) to connect these metabolic changes to different pathways and key enzymes. As shown in Table 1, differences were detected in purine and pyrimidine metabolism, glycerolipid metabolism, and energy metabolism. A number of interesting features in the differential regulation of ovarian carcinomas and borderline tumors emerged. First, no monoglycerolipids, sugars, or cholesterol derivatives were elevated or decreased in borderline tumors. Some statistically significant biomarkers may be regarded as nonspecific, such as the generic stress responders proline and tocopherol. However, a range of further metabolites that can be interpreted functionally was found at lower concentrations in borderline tumors than in ovarian carcinomas. Specifically, marked differential regulation of creatinine, lactate, glucose-1-phosphate, and the tricarboxylic acid (TCA) cycle intermediates fumarate and malate points to changes in energy metabolism, which supports the notion that malignant tumors have higher metabolic turnover rates and thus higher demands in energy supply. However, another interpretation might be even more viable. The involvement of TCA cycle intermediates also supports the cycle's role in determining relative fluxes of amino acid and lipid metabolism through pyruvate dehydrogenase, citrate synthase, and transaminase reactions departing from α-ketoglutarate and oxaloacetate. In accordance with this view, levels of a number of amino acids were higher in ovarian carcinomas than in borderline tumors. Interestingly, the amino acids elevated in ovarian carcinomas were specifically those that serve as nitrogen donors, such as glutamine, glutamate, and asparagine, but also cysteine, glycine, and threonine, which are regarded as important building blocks in protein biosynthesis. Altogether, the changes in energy metabolism, the lower levels of selected free fatty acids, and the higher levels of proteinogenic amino acids, purines, pyrimidines, and membrane lipid precursors all support the notion that higher rates of cell division in ovarian carcinomas are reflected in altered levels of primary metabolites.

Figure 2.

Significant metabolite differences between invasive ovarian carcinomas and ovarian borderline tumors. Alterations in the mean metabolite content are expressed as fold change between the two groups. Only metabolite differences with P < 0.01 are shown. Fifty-one significant alterations in metabolite content were found. Permutation analysis was used to estimate the false discovery rate as 7.8%.

Table 1.

Differences in known metabolites between borderline tumors and invasive ovarian carcinomas

| Metabolite | Fold change (ovarian cancer vs borderline tumor) | P | Pathway | Enzymes involved (selected) |
| --- | --- | --- | --- | --- |
| Glycerolphosphate alpha | 5.3 | 0.0058 | Glycerolipid metabolism | |
| Uracil | 4.2 | 6.20e−06 | Pyrimidine metabolism | 1.3.1.2 Dihydropyrimidine dehydrogenase; 2.4.2.4 thymidine phosphorylase |
| Hypoxanthine | 3.8 | 0.0033 | Purine metabolism | 2.4.2.4 Thymidine phosphorylase |
| Pyrazine-2,5-bishydroxy | 3.4 | 0.0033 | | |
| Inositol-2-phosphate | 3.3 | 0.0042 | Glycerolipid metabolism | |
| Phosphoric acid | 3.2 | 0.0029 | Glycerolipid metabolism | |
| Glutamic acid | | 0.00018 | Amino acid metabolism C/N balance | |
| Glycine | | 4.30e−08 | Gly/Ser/Thr metabolism | |
| Malic acid | 2.8 | 0.0028 | TCA | 1.1.1.37 Malate dehydrogenase; 1.1.3.3 malate oxidase |
| γ-Aminobutyric acid | 2.7 | 0.0042 | Amino acid metabolism C/N balance | 1.2.1.3 Aldehyde dehydrogenase (NAD+); 2.6.1.19 GABA transaminase; 4.1.1.15 glutamate decarboxylase |
| α-Tocopherol 2 | 2.5 | 0.0083 | (No biosynthesis in humans) | |
| Glucose-1-phosphate degr. | 2.5 | 0.0025 | Glycerolipid metabolism | |
| Proline | 2.3 | 3.80e−07 | Amino acid metabolism | |
| Fumaric acid | 2.2 | 0.0011 | TCA | |
| Creatine | 2.2 | 0.00069 | Creatine P breakdown | |
| Cysteine | | 0.006 | Amino acid metabolism | |
| Butyric acid 2-hydroxy | | 0.0051 | Propanoate metabolism | 1.1.1.27 Lactate dehydrogenase |
| Glycine minor | | 0.00041 | Gly/Ser/Thr metabolism | |
| Glutamine | | 0.0083 | Amino acid metabolism C/N balance | 3.5.1.2 Glutaminase |
| Threonine | 1.9 | 0.00011 | Gly/Ser/Thr metabolism | |
| Asparagine | 1.6 | 0.00093 | Amino acid metabolism | |
| Nonadecanoic acid | −1.2 | 0.0014 | Free fatty acid | |
| Stearic acid | −1.4 | 0.0096 | Free fatty acid | |
| Heptadecanoic acid | −1.4 | 0.00073 | Free fatty acid | |
| Benzoic acid | −1.6 | 0.00011 | Phenylalanine metabolism | |
| Lactic acid | −2.2 | 0.0017 | Propanoate, glycolysis, pyruvate metabolism | 1.1.1.27 Lactate dehydrogenase |

Permutation analysis. The large number of variables measured simultaneously in -omics studies gives rise to a massive multiple testing situation, with an accumulating risk of false-positive detections as one proceeds from metabolite to metabolite. We addressed this issue in a framework of repeated sample permutations. First, the number of metabolites selected by a threshold of P < 0.01 was found to be <51 in all 1,000 shuffled data sets, which is strong evidence for the presence of a biological signal in the data. Furthermore, the false discovery rate was estimated as 7.8%, suggesting that approximately four metabolites in the set of 51 were detected by chance and that the majority of detected metabolite changes are due to biological differences.

Cluster analysis. To visualize the differences in metabolite signatures between borderline tumors and ovarian carcinomas, we used a hierarchical clustering algorithm based on the Pearson correlation coefficient and the average linkage method and did simultaneous clustering of metabolites and tumor samples. Unsupervised clustering using all metabolites did not result in a separation between borderline tumors and ovarian carcinomas (data not shown). Therefore, we investigated whether cluster analysis could be used for visualization and interpretation of the metabolite signatures detected by t test statistics. When we used a Welch t test threshold of P < 0.01 as a filter for subsequent clustering analysis, a separation of most of the tumors was possible (Fig. 3). Metabolites generally grouped in the manner discussed above; that is, compounds that were biochemically related were generally found to cluster together, such as the pairs malate/fumarate, glutamine/glutamate, uracil/hypoxanthine, and stearate/heptadecanoate. This clustering supports the interpretation that levels of primary metabolites reflect the general metabolic turnover rates that are altered in invasive ovarian carcinoma. Interestingly, the color coding in the cluster analysis suggests that lactate, which was found to be differentially regulated in the Welch t test, might be one of the few false-positive discoveries: only a very few patients had elevated levels, enough to cross the significance threshold but not enough to regard lactate as a valid biomarker. In this way, cluster analysis helps in investigating putative biomarkers and reduces the risk of overinterpreting findings.
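
Hierarchical clustering with a correlation-based distance and average linkage can be sketched as below. This is an assumption-laden illustration rather than the software used in the study: the function names are hypothetical, the distance is taken as 1 minus the Pearson correlation, only one dimension (a list of profiles) is clustered, and the naive search over cluster pairs is adequate only for small inputs.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering with distance 1 - r and average linkage.
    Returns lists of profile indices, one list per cluster."""
    n = len(profiles)
    d = [[1.0 - pearson(profiles[i], profiles[j]) for j in range(n)]
         for i in range(n)]
    clusters = [[i] for i in range(n)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                dist = mean(d[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return [sorted(c) for c in clusters]
```

Because the distance depends only on correlation, two metabolites with very different absolute levels but parallel changes across samples end up close together, which is consistent with biochemically related pairs clustering as described above.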

Figure 3.

Supervised clustering. Hierarchical clustering was done for better visualization of results of the t test and led to separation of borderline tumors and carcinomas by clustering for 51 significantly (P < 0.01) regulated metabolites. Two groups of tumors are distinguished. Group 1 contains most of the borderline tumors (as well as two carcinomas), whereas group 2 contains most of the carcinomas as well as one borderline tumor. blt, borderline tumor; ovca, ovarian carcinoma.

PCA. PCA is used for data reduction in multidimensional data matrices and has been described for analysis of plant metabolomic data (22) as well as cDNA microarray data (23). We used PCA to identify principal components in the data matrix of all metabolites. The differences of individual principal components between borderline tumors and carcinomas were compared with the t test. We found that four of the principal components (principal components 2, 8, 27, and 30) were significantly different between the two groups (Fig. 4A-D). Interestingly, principal component 2, which is the second most important component, showed a very strong difference between the two groups (P = 0.00000009). As shown in Fig. 4E, the three-dimensional scatter plot using the information from components 2, 8, and 27 allowed a separation of borderline tumors and ovarian carcinomas, with only one borderline tumor located in the ovarian carcinoma group. These results suggest that there is an underlying structure in the complex metabolic data set, and that statistical approaches are able to detect elements of this underlying structure.

Figure 4.

PCA. The data matrix from the metabolomic analysis was used to detect principal components. The relevance of each of these components was tested using the t test. Four of the components (components 2, 8, 27, and 30) were found to be significantly different between borderline tumors and ovarian carcinomas (A-D). The vector space constructed by three of these components in a three-dimensional scatterplot allowed a separation of borderline tumors (filled dots) and ovarian carcinomas (open dots). Only one of the borderline tumors was located in the carcinoma group (E).

Construction of classification models. Four supervised methods were employed to construct predictive models that are capable of distinguishing between invasive carcinomas and borderline tumors based on metabolomic profiling: LDA, SVM, NMC, and NCC (19, 20, 24, 25). The corresponding four classifiers were built from the data of all 291 measured metabolites and validated in a leave-one-out cross-validation. Prediction accuracies were calculated separately for invasive carcinomas, for borderline tumors, and for all lesions (Supplementary Table S4). The best results were obtained with the NMC and the NCC method that both yielded prediction accuracies of 87.9% and 88.9% for the invasive carcinomas and the borderline tumors, respectively.

Additionally, we checked the performance of a two-step classification scheme consisting of a feature selection step followed by the construction of a predictive model with LDA, SVM, NMC, or NCC (Supplementary Fig. S2A). To this end, the 2, 3, …, 291 most significantly altered metabolites were used for model building, and the corresponding classification accuracies were plotted against the number of features (Fig. 5A-D). Again, the best and most stable results were obtained with nearest-mean classification. Furthermore, we observed that the feature selection step was not essential for good classification rates, as one of the best results was obtained with NMC or NCC without it (i.e., with a classifier built from all 291 measured metabolites). On the other hand, good classification results could also be obtained with classifiers built from a very small number of metabolites (such as 5, 10, or 20).

Figure 5.

Performance of four classification algorithms as a function of the number of metabolites in the classifier. Prediction rates (% cases) are shown for LDA (A), linear SVM (B), NMC (C), and NCC (D). Prediction rates are reported separately for borderline tumors and invasive carcinomas.

Finally, we checked the robustness of the classification results with regard to different choices of the training data (Supplementary Fig. S2B). In a repeated loop, a balanced training set was randomly drawn from the 75 metabolite profiles; a classifier was fitted; and the classifier was evaluated on a test set that contained all remaining profiles. For each size of training set, we obtained a distribution of average prediction rates that was visualized in a box plot (Supplementary Fig. S3A-D). The prediction rates significantly exceeded the baseline of 50% over a wide range of training set sizes. A lack of significance was only observed for the largest size of training set (2n = 16). However, in that case, the test sets contain only a single borderline tumor, and the lack of significance for the average prediction rates arises from large variations in the prediction rates for borderline tumors (data not shown).

In this study, we investigated metabolomic profiles of two types of ovarian tumors: borderline tumors and invasive carcinomas. We were able to validate significant differences in metabolite levels and to detect metabolite signatures that separate about 90% of borderline tumors from the carcinomas and vice versa. We have shown that the metabolite signatures are capable of predicting the status (borderline tumor or invasive carcinoma) of a previously unknown test tumor, and that the classification results are robust against different choices of the training set.

Our study shows for the first time that large-scale metabolic profiling using GC-TOF is suitable for analysis of fresh frozen human tumor samples. Using the KEGG database, we have linked the metabolic changes to some putative key enzymes that play an important role in the corresponding pathways. Some of these enzymes that act in pyrimidine metabolism (e.g., dihydropyrimidine dehydrogenase and thymidine phosphorylase) have already been shown to play a prognostic role in ovarian cancer (26). Tanner et al. have shown that aldehyde dehydrogenase is increased in low-stage ovarian carcinomas compared with high-stage tumors (27). Nicholson-Guthrie et al. have described increased γ-aminobutyric acid levels in ovarian cancer patients (28).

In recent years, several studies have investigated metabolic changes in different types of malignant tumors. These studies have mainly used nuclear magnetic resonance (NMR) spectroscopy to detect changes in metabolic patterns associated with apoptosis (29–31) or response to hypoxia (32) in cell cultures. Sitter et al. used high-resolution magic angle spinning in combination with PCA to investigate tissue samples of eight patients with cervical cancer and found higher levels of cholines and amino acid residues and lower levels of glucose in the malignant tissue (33). Gribbestad et al. compared malignant and nonmalignant breast tissue using 1H NMR spectroscopy and found low levels of glucose and high levels of choline in tumor tissue (34).

These NMR-based methods (35, 36) are of interest as well because they yield high-resolution spectra from intact tissue samples, which remain fully available for additional standardized histopathologic analysis (37). A clinically viable solution might eventually combine both approaches: GC-TOF and liquid chromatography-MS (LC-MS) to identify metabolic biomarkers in malignant tumors by large-scale metabolomics, and NMR-based methods to measure these biomarkers in the clinical routine, potentially by in vivo imaging.

In this article, we have presented results from a novel and very specific variant of GC-TOF MS: its combination with automated liner exchange and direct thermodesorption (DTD) injection. Two advantages of this technique were expected: (a) avoidance of cross-contamination and (b) gentler introduction of compounds onto the gas chromatograph, which enables nondestructive analysis of thermolabile compounds. Because each sample was analyzed with a fresh, previously unused microvial and liner, no cross-contamination could occur due to matrix deposits. Experiments on blood plasma injections have validated the viability of this technique for quantifying endogenous levels of free fatty acids and monoglycerides (data not shown). We were, therefore, confident that liner exchange and DTD would enable the detection of significant changes in lipid metabolism and individual fatty acids, and indeed, such changes were observed in the comparison of borderline tumors and ovarian carcinomas. Thus far, this is the first report on metabolite profiling by GC-TOF in which changes in lipid metabolism have been found, a result that the widespread use of classic split/splitless injectors may otherwise have prevented. Furthermore, such classic injectors introduce compounds onto the gas chromatography column at injector temperatures of 230°C to 250°C, which is sufficient to decompose more thermolabile compounds such as organic phosphates. DTD injection, in contrast, slowly ramps the temperature up to the boiling point of each compound, which presumably leads to less destruction and improved quantitation.

Quality controls have been used primarily for the technique itself (i.e., instrument sensitivity, reagent blank controls, and method blank controls). Around 25% of all injections were done for quality control; however, the error rate for the extraction technique used here is unknown because the quantity of tissue biopsies was not large enough to do in-depth validation studies of the extraction technique. Nevertheless, this extraction technique was successfully used before for mouse tissues (38). Method error rates for mouse and plant tissues as well as blood plasma were found in the range of 5% to 25% relative SD, depending on the chemical structure of the metabolites. Further limitations of GC-TOF analysis are that bisphosphates and trisphosphates cannot be analyzed because of their thermolability. Likewise, large compounds >550 Da are usually too involatile to be detected by GC-TOF. Such compound classes, including, for example, complex membrane lipids, need to be analyzed by LC-MS or direct infusion electrospray MS.
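For readers less familiar with the relative SD figures quoted above, the following short Python sketch shows how such a method error rate is typically computed from replicate measurements. The replicate values are invented for illustration and do not come from this study.

```python
from statistics import mean, stdev

def relative_sd_percent(values):
    """Relative standard deviation (coefficient of variation) in percent."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical peak intensities of one metabolite in five replicate extractions.
replicates = [100.0, 110.0, 90.0, 105.0, 95.0]
rsd = relative_sd_percent(replicates)
print(f"relative SD = {rsd:.1f}%")  # prints "relative SD = 7.9%"
```

A value of this size would fall within the 5% to 25% range reported for mouse tissue, plant tissue, and blood plasma.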

Despite these limitations, the use of GC-TOF has several advantages for cancer diagnostics. First, only a small amount of tissue is needed. In this study, 5-mg fresh tissue sections were used, but a recent study has shown that as few as 5,000 pooled single plant cells would be sufficient to quantify 66 identified metabolites (39). Moreover, gas chromatography–based MS is a very mature technique that incurs low costs per measurement and yields highly standardized chromatograms and mass spectra, which provides a convenient and robust route to compound identification and quantification in laboratory routines. Specifically, the high level of automation would allow larger standardized studies involving several institutes, thereby greatly improving the statistical power of the eventual assessments. In addition to the studies of tumor tissue, investigations of biomarkers may also be done in serum samples. These investigations would be a particularly interesting complement to the serum proteomic studies that have already been done in ovarian cancer patients (40, 41).

Our investigation has several limitations. To have a large cohort for our analysis, we combined different groups of ovarian carcinomas and borderline tumors. To refine the analysis in further studies, it may be interesting to analyze changes in more defined cohorts. For example, serous carcinomas of different histologic grades might reveal specific metabolic patterns when compared with serous borderline tumors. If possible, cases of the so-called micropapillary serous tumors may be included. These investigations are particularly interesting in the light of new data suggesting that an adenoma-borderline tumor-carcinoma sequence exists for a subgroup of low-grade serous ovarian tumors (42–44). Investigations of metabolic patterns in the transition between borderline tumors and serous G1 carcinomas may provide interesting new information on the proposed adenoma-borderline tumor-carcinoma sequence. However, these investigations are limited by the rarity of these ovarian lesions. Only multi-institutional projects with combined resources would be able to conduct these studies. In our investigations, the differences detected between borderline tumors and invasive carcinomas reflect differences in tumor tissue that contains a mixture of epithelial tumor cells and stromal cells, including inflammatory cells and blood vessels. Therefore, the metabolic patterns described in our analysis are not limited to the epithelial cells but reflect metabolic patterns in this complex tumor microecosystem. Because it has been shown that stromal changes play an important role in the progression of malignant tumors (45, 46), this approach may even have advantages compared with studies that are limited to epithelial cells.

A statistical-methodologic difficulty arises from the composition of the patient cohort, with a characteristic overabundance of invasive carcinomas (n = 66) and a relatively small number of borderline tumors (n = 9). We addressed possible biases induced by this imbalance by reporting either the specific prediction rates for borderline tumors and for invasive carcinomas or the average of the specific prediction rates. However, the imbalanced composition of the patient cohort is expected to be responsible for the tendency to obtain lower prediction rates for borderline tumors than for carcinomas. The imbalance problem was best tackled by the nearest-mean and the nearest-centroid classifiers, which led to the highest and most stable prediction rates (Supplementary Fig. S2C and D). This is in line with the good performance of simple classifiers on gene expression data, as reported by Dudoit and Fridlyand (47).
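The reason for reporting class-specific prediction rates rather than overall accuracy can be seen in a small worked example. The Python sketch below uses the group sizes of this cohort but an entirely hypothetical, uninformative classifier that labels every tumor as a carcinoma; the function name is likewise an assumption.

```python
def class_specific_rates(y_true, y_pred):
    """Fraction of correctly predicted cases, computed separately for each true class."""
    rates = {}
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        rates[label] = sum(y_pred[i] == label for i in idx) / len(idx)
    return rates

# Cohort composition from the text; the predictions are a hypothetical worst case.
y_true = ["borderline"] * 9 + ["carcinoma"] * 66
y_pred = ["carcinoma"] * 75  # classifier that ignores the data entirely

rates = class_specific_rates(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
average = sum(rates.values()) / len(rates)

print(f"overall accuracy:           {overall:.2f}")  # 0.88
print(f"borderline prediction rate: {rates['borderline']:.2f}")  # 0.00
print(f"carcinoma prediction rate:  {rates['carcinoma']:.2f}")  # 1.00
print(f"average of class rates:     {average:.2f}")  # 0.50
```

Overall accuracy rewards the uninformative classifier because carcinomas dominate the cohort, whereas the average of the class-specific rates stays at the 50% baseline. This is why the average of the specific rates is the more suitable summary for an imbalanced cohort.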

Thus, our results show that metabolic signatures as well as individual metabolites can be detected from fresh-frozen tumor tissue of ovarian tumors. Therefore, metabolomics is a promising approach in addition to functional genomics and proteomics for analyses of changes in malignant tumors. Metabolomic analysis will be relevant for tumor biology in different types of investigations. First, similar to functional genomics, it is a promising method for classification of different tumor types and for comparison of malignant tumors with their corresponding normal tissue. Second, analysis of metabolic changes may be used to develop classifiers for therapy response prediction and may lead to the identification of new prognostic markers. Third, metabolomics may be an interesting method to investigate the changes associated with administration of new molecular targeted therapeutics (48).

To get a complete picture of the changes during malignant tumor progression, it will be interesting for further studies to integrate data from all levels of “-omics” into a single data matrix that gives a holistic view of the changes associated with malignant transformation. This data matrix would contain information on changes in mRNA levels measured with cDNA microarrays as well as proteomic and metabolomic data (49) and might be used as a basis for modeling of signal transduction pathways coupled to metabolic circuits (50). Such analyses could link metabolic changes in biochemical pathways to the enzymes involved, and subsequently to the genetic alterations, leading altogether to a better understanding of the molecular changes relevant for malignant transformation. However, it should be pointed out that these approaches are still at a very early stage and will require more experience with the acquisition and analysis of metabolic data as well as with their integration with other -omics data.

Note: Supplementary data for this article are available at Cancer Research Online (http://cancerres.aacrjournals.org/).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

We thank Ines Koch for her excellent technical assistance and Olaf Kießlich for his help with the statistical analysis.

1. Fiehn O. Metabolomics: the link between genotypes and phenotypes. Plant Mol Biol 2002;48:155–71.
2. Bino RJ, Hall RD, Fiehn O, et al. Trends Plant Sci 2004;9:418–25.
3. Oliver SG. Functional genomics: lessons from yeast. Philos Trans R Soc Lond B Biol Sci 2002;357:17–23.
4. Griffin JL, Shockcor JP. Metabolic profiles of cancer cells. Nat Rev Cancer 2004;4:551–61.
5. Catchpole GS, Beckmann M, Enot DP, et al. Hierarchical metabolomics demonstrates substantial compositional similarity between genetically modified and conventional potato crops. Proc Natl Acad Sci U S A 2005;102:14458–62.
6. Fiehn O, Kopka J, Dormann P, et al. Nat Biotechnol 2000;18:1157–61.
7. Halket JM, Przyborowska A, Stein SE, et al. Deconvolution gas chromatography/mass spectrometry of urinary organic acids-potential for pattern recognition and automated identification of metabolic disorders. Rapid Commun Mass Spectrom 1999;13:279–84.
8. Tanaka K, Hine DG, West-Dull A, Lynn TB. Gas-chromatographic method of analysis of urinary organic acids. I. Retention indices of 155 metabolically important compounds. Clin Chem 1980;26:1839–46.
9. Tanaka K, West-Dull A, Hine DG, Lynn TB, Lowe T. Gas-chromatographic method of analysis of urinary organic acids. II. Description of the procedure, and its application to diagnosis of patients with organic acidurias. Clin Chem 1980;26:1847–53.
10. Odunsi K, Wollman RM, Ambrosone CB, et al. Detection of epithelial ovarian cancer using H-1-NMR-based metabonomics. Int J Cancer 2005;113:782–8.
11. Ippolito JE, Xu J, Jain S, et al. An integrated functional genomics and metabolomics approach for defining poor prognosis in human neuroendocrine cancers. Proc Natl Acad Sci U S A 2005;102:9901–6.
12. Welthagen W, Shellie R, Ristow M, Spranger J, Zimmermann R, Fiehn O. Comprehensive two dimensional gas chromatography - time of flight mass spectrometry, GCxGC-TOF for high resolution metabolomics: biomarker discovery on spleen tissue extracts of obese NZO compared to lean C57BL/6 mice. Metabolomics 2005;1:57–65.
13. Jemal A, Siegel R, Ward E, et al. Cancer statistics, 2006. CA Cancer J Clin 2006;56:106–30.
14. Silverberg SG, Bell DA, Kurman RJ, et al. Borderline ovarian tumors: key points and workshop summary. Hum Pathol 2004;35:910–7.
15. Hart WR. Borderline epithelial tumors of the ovary. Mod Pathol 2005;18 Suppl 2:S33–50.
16. Shimizu Y, Kamoi S, Amada S, et al. Toward the development of a universal grading system for ovarian epithelial carcinoma. Gynecol Oncol 1998;70:2–12.
17. Weckwerth W, Wenzel K, Fiehn O. Process for the integrated extraction, identification and quantification of metabolites, proteins and RNA to reveal their co-regulation in biochemical networks. Proteomics 2004;4:78–83.
18. R Development Core Team. R: A language and environment for statistical computing. Vienna (Austria): R Foundation for Statistical Computing; 2005. ISBN 3-900051-07-0. Available from: http://www.R-project.org.
19. Wessels LF, Reinders MJ, Hart AA, et al. A protocol for building and evaluating predictors of disease state based on microarray data. Bioinformatics 2005;21:3755–62.
20. Michiels S, Koscielny S, Hill C. Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet 2005;365:488–92.
21. Kyoto Encyclopedia of Genes and Genomes (KEGG). Available from: http://www.genome.ad.jp/kegg/.
22. Scholz M, Gatzek S, Sterling A, Fiehn O, Selbig J. Metabolite fingerprinting: detecting biological features by independent component analysis. Bioinformatics 2004;20:2447–54.
23. Schwartz DR, Kardia SL, Shedden KA, et al. Gene expression in ovarian cancer reflects both morphology and biological behavior, distinguishing clear cell from other poor-prognosis ovarian carcinomas. Cancer Res 2002;62:4722–9.
24. Fan RE, Chen PH, Lin CJ. Working set selection using the second order information for training SVM. J Machine Learn Res 2005;6:1889–918.
25. Venables WN, Ripley BD. Modern Applied Statistics with S. 4th ed. New York (NY): Springer; 2002.
26. Fujiwaki R, Hata K, Nakayama K, et al. Gene expression for dihydropyrimidine dehydrogenase and thymidine phosphorylase influences outcome in epithelial ovarian cancer. J Clin Oncol 2000;18:3946–51.
27. Tanner B, Hengstler JG, Dietrich B, et al. Glutathione, glutathione S-transferase α and pi, and aldehyde dehydrogenase content in relationship to drug resistance in ovarian cancer. Gynecol Oncol 1997;65:54–62.
28. Nicholson-Guthrie CS, Guthrie GD, Sutton GP, Baenziger JC. Urine GABA levels in ovarian cancer patients: elevated GABA in malignancy. Cancer Lett 2001;162:27–30.
29. Griffin JL, Lehtimaki KK, Valonen PK, et al. Assignment of 1H nuclear magnetic resonance visible polyunsaturated fatty acids in BT4C gliomas undergoing ganciclovir-thymidine kinase gene therapy-induced programmed cell death. Cancer Res 2003;63:3195–201.
30. Hakumaki JM, Poptani H, Puumalainen AM, et al. Quantitative 1H nuclear magnetic resonance diffusion spectroscopy of BT4C rat glioma during thymidine kinase-mediated gene therapy in vivo: identification of apoptotic response. Cancer Res 1998;58:3791–9.
31. Williams SN, Anthony ML, Brindle KM. Induction of apoptosis in two mammalian cell lines results in increased levels of fructose-1,6-bisphosphate and CDP-choline as determined by 31P MRS. Magn Reson Med 1998;40:411–20.
32. Griffiths JR, McSheehy PM, Robinson SP, et al. Metabolic changes detected by in vivo magnetic resonance studies of HEPA-1 wild-type tumors and tumors deficient in hypoxia-inducible factor-1β (HIF-1β): evidence of an anabolic role for the HIF-1 pathway. Cancer Res 2002;62:688–95.
33. Sitter B, Bathen T, Hagen B, Arentz C, Skjeldestad FE, Gribbestad IS. Cervical cancer tissue characterized by high-resolution magic angle spinning MR spectroscopy. MAGMA 2004;16:174–81.
34. Gribbestad IS, Sitter B, Lundgren S, Krane J, Axelson D. Metabolite composition in breast tumors examined by proton nuclear magnetic resonance spectroscopy. Anticancer Res 1999;19:1737–46.
35. Florian CL, Preece NE, Bhakoo KK, Williams SR, Noble M. Characteristic metabolic profiles revealed by 1H NMR spectroscopy for three types of human brain and nervous system tumours. NMR Biomed 1995;8:253–64.
36. Florian CL, Preece NE, Bhakoo KK, Williams SR, Noble MD. Cell type-specific fingerprinting of meningioma and meningeal cells by proton nuclear magnetic resonance spectroscopy. Cancer Res 1995;55:420–7.
37. Howe FA, Barton SJ, Cudlip SA, et al. Metabolic profiles of human brain tumors using quantitative in vivo 1H magnetic resonance spectroscopy. Magn Reson Med 2003;49:223–32.
38. Shellie RA, Welthagen W, Zrostlikova J, et al. Statistical methods for comparing comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry results: metabolomic analysis of mouse tissue extracts. J Chromatogr A 2005;1086:83–90.
39. Schad M, Mungur R, Fiehn O, Kehr J. Metabolic profiling of laser microdissected vascular bundles of Arabidopsis thaliana. Plant Methods 2005;1:2.
40. Petricoin EF, Ardekani AM, Hitt BA, et al. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 2002;359:572–7.
41. Calvo KR, Liotta LA, Petricoin EF. Clinical proteomics: from biomarker discovery and cell signaling profiles to individualized personal therapy. Biosci Rep 2005;25:107–25.
42. Shih IeM, Kurman RJ. Ovarian tumorigenesis: a proposed model based on morphological and molecular genetic analysis. Am J Pathol 2004;164:1511–8.
43. Singer G, Stohr R, Cope L, et al. Patterns of p53 mutations separate ovarian serous borderline tumors and low- and high-grade carcinomas and provide support for a new model of ovarian carcinogenesis: a mutational analysis with immunohistochemical correlation. Am J Surg Pathol 2005;29:218–24.
44. Meinhold-Heerlein I, Bauerschlag D, Hilpert F, et al. Molecular and prognostic distinction between serous ovarian carcinomas of varying grade and malignant potential. Oncogene 2005;24:1053–65.
45. De Wever O, Mareel M. Role of tissue stroma in cancer cell invasion. J Pathol 2003;200:429–47.
46. Liotta LA, Kohn EC. The microenvironment of the tumour-host interface. Nature 2001;411:375–9.
47. Dudoit S, Fridlyand J. Classification in microarray experiments. In: Speed TP, editor. Statistical analysis of gene expression microarray data. Chapter 3. Boca Raton (FL): Chapman & Hall/CRC; 2003. p. 93–158.
48. Lim HK, Stellingweif S, Sisenwine S, Chan KW. Rapid drug metabolite profiling using fast liquid chromatography, automated multiple-stage mass spectrometry and receptor-binding. J Chromatogr A 1999;831:227–41.
49. Hirai MY, Yano M, Goodenowe DB, et al. Integration of transcriptomics and metabolomics for understanding of global responses to nutritional stresses in Arabidopsis thaliana. Proc Natl Acad Sci U S A 2004;101:10205–10.
50. Oksman-Caldentey KM, Inze D, Oresic M. Connecting genes to metabolites by a systems biology approach. Proc Natl Acad Sci U S A 2004;101:9949–50.