Clinical laboratories accumulate large amounts of data that may provide clues for predicting various disease states. However, most work has focused on finding a single marker for diagnosis, and single markers often perform poorly. The reason is that prediction typically requires combinations of multiple markers, which complicates the analysis. In bioinformatics, support vector machines (SVMs) have become popular for classification and feature extraction because they can handle multi-dimensional data and pinpoint the specific features, or combinations of features, that are key to the classification problem at hand. Using preoperative clinical laboratory data from 254 surgically treated, histologically confirmed pancreatic cancer patients, we attempted to classify the extent of cancer growth by (1) finding the most pertinent markers, or features, for classification, (2) training the SVM classifier using the selected features, and (3) using the discrimination scores returned by the SVM to draw a line between the classes such that predictions can be made directly from the clinical laboratory data. We tested various classifications, such as poorly versus well differentiated cancer, and as a result of step (1), we found that previously suspected markers always ranked highly: CA19-9, elastase I, amylase, and fibrinogen. After performing step (2), we were able to establish an equation for each classification using the corresponding features, such that new clinical laboratory data can be automatically classified as belonging to one of the trained classes. For example, given the CA19-9, elastase I, and amylase values of a new patient, a result of the equation > 0 predicts well differentiated cancer, whereas a result < 0 indicates poorly differentiated cancer. Using these equations, we could classify our current data with almost 90% accuracy.
Previous work has shown that SVMs are useful in prediction, but this is the first work to present a mathematical model that can directly use clinical laboratory data for diagnosis, indicating its strong potential as an indicator for cancer growth prediction based on multiple markers.
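
The classification equation described above can be illustrated with a minimal sketch, assuming it has the form of a linear decision function (a weighted sum of marker values plus a bias term) whose sign selects the class. The abstract does not give the fitted equation, so the class name `LinearRule`, the feature keys, and every weight and bias value below are hypothetical placeholders chosen only for easy arithmetic; they are not results from this study and carry no clinical meaning.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class LinearRule:
    """Hypothetical linear decision rule: score = bias + sum(weight * value)."""

    weights: Mapping[str, float]  # one weight per selected marker
    bias: float

    def score(self, labs: Mapping[str, float]) -> float:
        # Weighted sum over the selected markers only; extra keys in
        # `labs` are ignored, and a missing marker raises KeyError.
        return self.bias + sum(w * labs[name] for name, w in self.weights.items())

    def predict(self, labs: Mapping[str, float]) -> str:
        # Sign convention from the abstract: > 0 means well differentiated,
        # < 0 means poorly differentiated. A score of exactly 0 is not
        # covered by the abstract, so it is reported as indeterminate here.
        s = self.score(labs)
        if s > 0:
            return "well differentiated"
        if s < 0:
            return "poorly differentiated"
        return "indeterminate"


# PLACEHOLDER coefficients for illustration only (assumption, not study output).
EXAMPLE_RULE = LinearRule(
    weights={"ca19_9": -0.5, "elastase_1": 0.25, "amylase": 0.125},
    bias=1.0,
)

if __name__ == "__main__":
    # Made-up, unitless marker values for a hypothetical new patient.
    patient = {"ca19_9": 2.0, "elastase_1": 4.0, "amylase": 8.0}
    print(EXAMPLE_RULE.score(patient), EXAMPLE_RULE.predict(patient))
```

In practice the weights and bias would come from the trained SVM in step (2), and the marker values would need the same preprocessing (for example scaling) that was applied during training.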

Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 101st Annual Meeting of the American Association for Cancer Research; 2010 Apr 17-21; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2010;70(8 Suppl):Abstract nr 3765.