Abstract
Intraductal papillary mucinous neoplasm (IPMN) is a precursor of pancreatic ductal adenocarcinoma. Low-grade dysplasia has a relatively good prognosis, whereas high-grade dysplasia and IPMN invasive carcinoma require surgical intervention. However, diagnostic distinction is difficult. We aimed to identify biomarkers in peripheral blood for accurate discrimination.
Sera were obtained from 302 patients with IPMNs and 88 healthy donors. For protein biomarkers, serum samples were analyzed on microarrays made of 2,977 antibodies. A support vector machine (SVM) algorithm was applied to define classifiers, which were validated on a separate sample set. For microRNA biomarkers, a PCR-based screen was performed for discovery. Biomarker candidates confirmed by quantitative PCR were used to train SVM classifiers, followed by validation in a different sample set. Finally, a combined SVM classifier was established entirely independent of the earlier analyses, again using different samples for training and validation.
Panels of 26 proteins or seven microRNAs could distinguish high- and low-risk IPMN with AUC values of 0.95 and 0.94, respectively. Upon combination, a panel of five proteins and three miRNAs yielded an AUC of 0.97. These values were substantially better than the AUC of 0.82 obtained in the same patient cohort by using the guideline criteria for discrimination. In addition, accurate discrimination was achieved between other patient subgroups.
Protein and microRNA biomarkers in blood allow precise diagnosis and risk stratification of IPMN cases, which should improve patient management and thus the prognosis of IPMN patients.
IPMN usually develops from benign forms (low risk) to a premalignant stage and finally IPMN-associated carcinoma (high-risk). Patients with low-risk IPMN are kept under close surveillance, whereas the others have the tumor surgically resected. Currently, distinction between the two risk groups is difficult to achieve. Late intervention, however, results in poor prognosis. Inversely, if surgery is performed when not required, it poses an unnecessary risk to patient health. The definition of a highly accurate diagnostic panel of five proteins and three miRNAs in the serum of IPMN patients allows non-invasive, blood-based discrimination of the risk groups. In a cohort that embodies the actual representation of IPMN patients, for whom such a distinction is required, the biomarker signature markedly outperformed the discrimination accuracy obtained following the current clinical guidelines. The established process could substantially affect clinical disease management and thereby considerably improve patient prognosis.
Introduction
Pancreatic ductal adenocarcinoma (PDAC) is one of the most aggressive malignancies (1) and is projected to become the second most frequent cause of cancer-related death in the Western world (2). There are two types of PDAC precursor lesions, namely pancreatic intraepithelial neoplasias (PanIN) and mucinous cysts. The latter are subdivided further into intraductal papillary mucinous neoplasms (IPMN) and mucinous cystic neoplasms (MCN). IPMNs in particular have a high potential to develop into malignant tumors with a poor prognosis and are therefore of high clinical relevance. They are frequently discovered incidentally through cross-sectional imaging (3, 4). Diagnostically, three IPMN grades are described: low-grade dysplasia, high-grade dysplasia (also called carcinoma in situ), and IPMN with an associated carcinoma (5). On the basis of clinical evidence, IPMNs with low-grade dysplasia are considered benign with a low risk of malignant progression. Patients with low-grade IPMN have a 5-year disease-specific survival rate of 97%, as compared with 84% and 39% for patients with high-grade and invasive IPMN, respectively (6). Low-grade cases are kept under close surveillance to monitor further development of the disease. High-grade dysplasia is a premalignant form, and IPMNs with an associated carcinoma have already become malignant. Because of the high risk, these IPMN patients require surgical tumor resection (7, 8). Distinguishing high-risk from low-risk IPMN patients accurately is critical to give patients the appropriate treatment (9). If the intervention is late, a highly malignant tumor could develop, which results in poor prognosis. If surgery is performed when not required, it poses an unnecessary risk to the patient's health.
Several indications for surgery exist, including cytology positive for malignancy or high-grade dysplasia, tumor-related jaundice, enhancing mural nodules, main pancreatic duct dilatation, tumor growth rate, cyst diameter, increased carbohydrate antigen 19–9 (CA19–9) blood concentration, new onset of diabetes mellitus, and diagnosis of pancreatitis (10, 11). Despite this large number of diagnostic features, however, diagnosis is not sufficiently accurate. Imaging does not solve the problem either; resected IPMN samples frequently reveal a grade of malignancy different from the one predicted prior to resection (12).
Blood-based diagnostics of PDAC has been developing during the last few years. Blood is attractive for biomarker detection because of its stability and accessibility. Also, diagnosis would be minimally invasive, reducing the burden to the patients and cost. In addition, samples could be easily collected repeatedly from the same patient, allowing longitudinal monitoring. A large number of blood biomarkers or biomarker signatures for the diagnosis of PDAC has been published (for a comprehensive review see ref. 13). However, not much progress has been made toward actual translation to clinical routine. Even the utility of serum markers that are already in use in clinical diagnostics, such as CA19–9, is limited by a relatively high degree of false results.
There have been far fewer studies on blood-based detection of IPMN or the discrimination of IPMN grades than on PDAC, and most of them lack validation (14). In addition, performance is insufficient for application. Serum levels of CA19–9 are associated with malignant IPMN, but the diagnostic power is low (11, 15). Reported AUC values range from 0.62 to 0.78. Another serum marker, carcinoembryonic antigen (CEA), exhibits little utility in patient management. Blood type was also proposed as a predictor of high-grade dysplasia but was of rather limited accuracy (16). At the RNA level, miRNA miR-4539 isolated from extracellular vesicles was reported to discriminate IPMN patients from healthy individuals with an AUC of 0.72; miR-6132 could distinguish benign IPMN and IPMN-derived carcinoma with an AUC of 0.77 (17). Five other miRNAs were found to correlate with high-grade dysplasia or malignant carcinoma (AUC = 0.73; ref. 18) and a signature of 8 long noncoding RNAs yielded an AUC of 0.77 (19). With respect to proteins, MIC-1/GDF15 was considered, but proved ineffective (20). THBS2 also showed very low accuracy with an AUC of 0.65 (21). Detection of anti-p53 antibody in combination with serum concentrations of CEA and CA19–9 could discriminate high-grade dysplasia and malignant IPMN from low-grade IPMN with a sensitivity of just 38.4% and specificity of 81.6% (22). The relatively strong anti-p53 antibody marker, however, could be detected in only 8.2% of the samples. Thus, for most patients, specificity was actually substantially lower than 81.6%. Recently, variations in the abundance of six serum proteins were described that distinguished low- and high-risk IPMNs with an accuracy of 83% (23). However, the analysis was based on a small sample number and not validated in an independent sample set.
We aimed at substantially better discrimination by performing a much more comprehensive study, analyzing a large number of serum samples and studying variations in two molecule classes, namely proteins and miRNAs. The samples were collected from patients with low-grade or high-grade dysplasia as well as invasive carcinoma. Serum from healthy blood donors was analyzed for reference. Besides producing separate protein and miRNA diagnostic signatures, we independently generated a classifier that combined both molecule classes, as several studies suggested that combining diagnostic features of different molecule classes could yield better robustness (13). The main objective of our study was the discrimination between low- and high-risk patients, because this is most relevant clinically. An accurate biomarker signature in serum could substantially simplify risk stratification and thus the decision whether to perform tumor resection.
Materials and Methods
Serum samples
Serum samples from 302 IPMN patients and 88 healthy donors were obtained from the Pancobank biorepository of the European Pancreas Center at the Surgery Department of University Hospital Heidelberg. Written informed consent had been obtained from all blood donors. Ethical approval was given by the Ethics Committee of Heidelberg University (ethics votes 159/2002 and 708/2019). Work was done in compliance with the provisions of the Declaration of Helsinki. All patients underwent IPMN resection because of the suspicion of malignancy or other indications for surgery. Only sera for which pathologic diagnosis of the resected material confirmed IPMN with a specific grade of dysplasia were included. For further sample information, see Supplementary Table S1.
Protein profiling
Microarrays containing 2,977 antibodies targeting 2,286 proteins were produced as described in detail earlier (24); a list of the antibodies’ target proteins is given in Supplementary Table S2. The antibodies were spotted onto epoxysilane-coated slides (Schott Nexterion) using contact printing (MicroGrid-2; BioRobotics). Positional marker molecules as well as negative and positive controls were included. The slides were stored in dry and dark conditions at 4°C until use. Microarray analyses were performed as described in detail elsewhere (25). The protein concentration of the serum samples was adjusted to 4 mg/mL. Proteins were labeled with the fluorescent dyes DY649 or DY549 (Dyomics), respectively, at a molar dye/protein ratio of 7.5 at 4°C for 2 hours. Unreacted dye was quenched by the addition of 10% glycine. As a common reference sample for normalization, we utilized the pooled cellular protein lysates of 24 PDAC cell lines; it was analyzed simultaneously to each serum sample labeled with the other respective dye. Incubation was at 40 μg/mL protein at 4°C overnight. After washing and drying, a PowerScanner system (Tecan) was used for image capture at constant laser power and photo-multiplier tube gain. The images were analyzed with the software package GenePix Pro 6.0 (Molecular Devices), generating numerical values of signal intensities.
miRNA profiling
Total RNA was purified from 100 μL serum with the miRNeasy Serum/Plasma Kit (Qiagen). cDNA was synthesized with the miRCURY LNA RT Kit and directly analyzed with the miRCURY LNA SYBR Green PCR Kit and the miRCURY LNA Human Serum/Plasma Focus miRNA PCR Panel (Qiagen) according to the manufacturer's protocols. The panel determines the abundance of 179 miRNAs (Supplementary Table S3). The cycle threshold (Ct) values were transformed to miRNA copy numbers by means of standard curves generated with the spike-in control cel-miR-39–3p.
qPCR
miRNA biomarker candidates were further studied by qPCR assays on more serum samples. The miRCURY LNA RT Kit was used for reverse transcription of RNA purified with the miRNeasy Serum/Plasma Kit. For each reaction, 5 μL of 2xmiRCURY SYBR Green Master mix and 1 μL PCR primer were added to 4 μL of cDNA, which had been diluted 1:20. After an initial heat activation for 2 minutes at 95°C, there were 45 PCR cycles at 95°C for 10 seconds and 56°C for 1 minute on a LightCycler 480 (Roche Diagnostics) according to the protocol of the miRCURY LNA miRNA SYBR Green PCR Kit. Simultaneously, a standard curve based on cel-miR-39–3p was produced to transform qPCR data into miRNA copy numbers.
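The conversion of Ct values into copy numbers via a spike-in standard curve can be illustrated with a minimal sketch. It assumes a linear relationship between Ct and log10 of the copy number; the dilution series, the Ct values, and the function names `fit_standard_curve` and `ct_to_copies` are hypothetical and are not taken from the study.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def ct_to_copies(ct, slope, intercept):
    """Invert the standard curve to estimate the copy number for a Ct value."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical dilution series of the spike-in control (illustrative values only).
spike_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
spike_ct = [33.2, 29.9, 26.6, 23.3, 20.0]

slope, intercept = fit_standard_curve(spike_copies, spike_ct)
estimated = ct_to_copies(26.6, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, copies={estimated:.3g}")
```

A lower Ct corresponds to a higher copy number, so the fitted slope is negative; in this made-up series each tenfold dilution shifts the Ct by 3.3 cycles.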
Statistical analyses
Protein biomarkers
Analyses were conducted with the R software (version 4.0.4; RRID: SCR_001905). Data normalization and preprocessing were performed with the “limma” package (v. 3.50; ref. 26). The function “backgroundCorrect” was used for background correction (27). Variances within and between microarrays were corrected by applying the functions “normalizeWithinArrays,” “normalizeBetweenArrays,” and “removeBatchEffect” (28). For quality control, the function “boxplot” and cluster dendrograms from the “cluster” package (v. 2.1.3) were used (29). The samples were randomly divided into a discovery and a validation cohort. In the discovery cohort, an empirical Bayes test from the “limma” package was used to compare the data of two groups, such as healthy controls versus IPMN samples. To evaluate the diagnostic power of individual proteins, univariate logistic regression combined with ROC analysis (30) was performed; proteins with a P value <0.05 were considered biomarker candidates. For feature selection to enhance predictive accuracy, least absolute shrinkage and selection operator (LASSO) regression based on the R package “glmnet” (v. 4.1–3; ref. 31) and recursive feature elimination (RFE) with 5-fold cross-validation from the R package “caret” (v. 6.0–91) were applied (32). Identified protein biomarker candidates were used for the construction and training of a support vector machine (SVM) classifier for two-group classifications using the R package “e1071” (v. 1.7–9). During the training process, the three parameters cost (ranging from 0 to 10), gamma (0 to 1), and epsilon (0 to 1) of the SVM classifier were optimized by an exhaustive search approach. The classifier was then applied with fixed parameters to the validation cohort data to assess its diagnostic performance.
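The exhaustive search over the three SVM parameters can be sketched as a plain grid search. The study itself used R; the Python sketch below only illustrates the principle. The grid values, the helper `grid_search`, and the scoring function `toy_score` (a stand-in for a cross-validated performance measure of a trained classifier) are assumptions made for illustration.

```python
import itertools

def grid_search(score_fn, grid):
    """Evaluate score_fn on every parameter combination and return the best one."""
    names = sorted(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical grid within the stated ranges (cost up to 10, gamma and epsilon up to 1).
grid = {
    "cost": [0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0],
    "gamma": [0.01, 0.05, 0.1, 0.5, 1.0],
    "epsilon": [0.1, 0.3, 0.5, 1.0],
}

def toy_score(cost, gamma, epsilon):
    # Stand-in for a cross-validated score; a real run would train and
    # evaluate an SVM classifier for each parameter combination.
    return -((cost - 0.2) ** 2 + (gamma - 0.05) ** 2 + (epsilon - 0.3) ** 2)

best_params, best_score = grid_search(toy_score, grid)
print(best_params)
```

Because every combination is evaluated, the cost grows with the product of the grid sizes (here 7 × 5 × 4 = 140 evaluations), which remains manageable for three parameters.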
Values for sensitivity, specificity and the AUC were calculated using the R package “pROC” (30).
miRNA biomarkers
For identifying miRNA biomarkers in serum, candidates were identified first by studying abundance differences in a discovery sample cohort detected by the miRCURY LNA PCR System with the R package “limma” (26). Next, qPCR was conducted for these miRNA candidates using serum samples of a training cohort. Differentially abundant miRNAs with P values <0.05 in two-tailed Student t test were used to define miRNA panels by RFE with 5-fold cross-validation. These panels were then applied to construct and train SVM classifiers. The SVM classifiers with fixed parameters were applied to the data generated from the validation sample cohort to assess the diagnostic performance. AUC values were calculated using the R package “pROC” (30).
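For orientation, the performance measures reported for the classifiers can be computed from labels and classifier scores as sketched below. The AUC is calculated here in its rank-based form (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one), which is equivalent to the area under the empirical ROC curve. The labels, scores, and the threshold of 0.45 are made-up illustration values, and the function names are hypothetical.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions (1 = positive)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(labels, scores):
    """Rank-based AUC; tied positive/negative pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = high-risk, 0 = low-risk.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
predictions = [1 if s >= 0.45 else 0 for s in scores]

sens, spec = sensitivity_specificity(labels, predictions)
print(f"AUC={auc(labels, scores):.3f}, Sen={sens:.3f}, Spe={spec:.3f}")
```

Sensitivity and specificity depend on the chosen decision threshold, whereas the AUC summarizes performance across all thresholds.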
Combining protein and miRNA biomarkers
Samples for which both protein and qPCR miRNA profiling data were available were randomly separated into a training and a validation cohort. An empirical Bayes test and Student t test were used to confirm significant changes of protein and miRNA abundance, respectively. After RFE with 5-fold cross-validation, we used the combined miRNA and protein data from the training cohort to construct and optimize an SVM classifier. It was then applied with fixed parameters to the data from the validation samples to confirm its prediction and classification potential and to determine the diagnostic value of the combined biomarker panel.
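The random separation into cohorts and the 5-fold partitioning used during feature elimination can be sketched as follows. This is a minimal illustration, not the procedure implemented in the study: the sample identifiers, the seed, and the helper names are hypothetical, and the cohort sizes of 118 and 57 are the ones reported in the Results for the combined classifier.

```python
import random

def split_train_validation(sample_ids, n_train, seed=0):
    """Randomly split sample IDs into a training and a validation cohort."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

def k_fold_indices(n_samples, k=5, seed=0):
    """Partition the indices 0..n_samples-1 into k disjoint folds of near-equal size."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]

# Hypothetical sample identifiers; 175 samples split into 118 and 57.
sample_ids = [f"S{i:03d}" for i in range(175)]
train, validation = split_train_validation(sample_ids, n_train=118)
folds = k_fold_indices(len(train), k=5)

# In each cross-validation round, one fold is held out and the rest is used for fitting.
for held_out in folds:
    fit_indices = [i for fold in folds if fold is not held_out for i in fold]
    assert len(fit_indices) + len(held_out) == len(train)
```

Keeping the validation cohort out of both feature selection and parameter optimization is what allows its AUC to serve as an independent performance estimate.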
Data availability
The data generated in this study are available within the article and its supplementary data files. For additional data requests, please contact the corresponding author.
Results
Clinical information about IPMN patients
Serum samples representing 302 IPMN patients and 88 healthy donors (Supplementary Table S1) were obtained from the European Pancreas Center at University Hospital Heidelberg. All patients underwent IPMN resection because of the suspicion of malignancy or other indications for surgery and represent a typical cohort. Blood had been collected prior to tumor resection. Only samples of patients were included, for whom pathologic analysis of the tissue confirmed IPMN with a specific grade of dysplasia. Demographic characteristics, preoperative symptoms, radiologic findings, laboratory data, and pathologic factors were collected. Preoperative cross-sectional imaging was reviewed for radiographic IPMN features, including cyst localization and size, presence of a solid component, thickened/enhancing cyst wall, lymphadenopathy, and main pancreatic duct dilatation. If multiple IPMN lesions were detected, the largest cyst size was recorded. IPMNs were classified as main-duct, branch-duct, or mixed-type based on preoperative imaging and postoperative histology according to international consensus guidelines (15). Preoperative serum levels of CA19–9 and bilirubin as well as the presence of symptoms for obstructive jaundice and pancreatitis were documented. Following surgical resection, an expert pancreatic pathologist assessed the specimens. The highest degree of dysplasia identified in each specimen was recorded as the grade of dysplasia. The study was done in accordance with the Standards for Reporting Diagnostic Accuracy Studies (33).
Protein signatures
For the identification of protein biomarkers, microarrays made of 2,977 antibodies were incubated with serum to acquire protein abundance information. The serum samples were divided randomly into a discovery and a validation cohort. The former consisted of 55 samples from healthy donors, 130 from patients with low-risk and 89 from patients with high-risk IPMN; the overall design of the analysis is shown in Fig. 1. The normalized data from the microarray experiments are accessible in Supplementary Table S4. A relatively large number of proteins exhibited significant abundance variations in at least one of the pairwise comparisons of patient groups (Table 1). However, no protein was sufficiently informative on its own. Because of the molecular heterogeneity of pancreatic cancers, it can be expected that a biomarker panel will be required for an appropriately robust assay. Therefore, least absolute shrinkage and selection operator (LASSO) regression and recursive feature elimination (RFE) were applied for further selection. The resulting biomarkers were used to train support vector machine (SVM) classifiers. The SVM algorithm places each sample in a high-dimensional space based on the levels of the biomarker candidates in that sample; each axis in the space represents one protein. The SVM classifier draws a hyperplane, a decision boundary that best separates the samples into two classes. During the training process, a combination of three parameters (cost, gamma, epsilon) was optimized to avoid over- and underfitting. The classifier with fixed parameter settings was subsequently assessed on data from separate validation samples from 33 healthy donors, 46 low-risk and 37 high-risk IPMN patients.
Figure 1. Performance of serum protein panels to distinguish IPMN patients from healthy individuals or discriminate between different malignancy risks. A, Workflow of the process for the identification of serum protein biomarker classifiers. B, Performance results are presented as ROC curves and corresponding AUC values as determined in the discovery and validation cohorts, respectively. C, For some typical proteins, the difference is shown as boxplots, indicating median, first and third quartile as well as maximum and minimum scores.
Table 1. List of serum-protein classifiers.
Diagnostic objective | Biomarkers after differential abundance analysis | Biomarkers after univariate logistic regression | Biomarkers after LASSO regression | Biomarkers after RFE with 5-fold cross-validation | SVM classifier parameters | Protein biomarker panel | Discovery cohort AUC | Discovery cohort Sen | Discovery cohort Spe | Validation cohort AUC | Validation cohort Sen | Validation cohort Spe |
---|---|---|---|---|---|---|---|---|---|---|---|---|
High-risk IPMN vs. healthy | 841 | 728 | 43 | 6 | Cost: 9.0; Gamma: 0.01; Epsilon: 0.2 | CD72, DOK5, ENO1, POU2F2, SUPT6H + NPY4R | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Low-risk IPMN vs. healthy | 928 | 798 | 23 | 7 | Cost: 0.2; Gamma: 0.1; Epsilon: 0.1 | CD72, DOK5, ENO1, POU2F2, SUPT6H + NPY4R + MMP26 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
IPMN vs. healthy | 870 | 749 | 34 | 6 | Cost: 2.0; Gamma: 0.2; Epsilon: 0.1 | CD72, DOK5, ENO1, POU2F2, SUPT6H + DNAJC17 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
High-risk IPMN vs. low-risk IPMN | 371 | 362 | 97 | 26 | Cost: 0.2; Gamma: 0.05; Epsilon: 0.3 | ABT1, BRAF, CCT4, EEF1A1, ETFDH, FGR, GSTM4, IL18, IMPDH1, IPO13, KLHL23, L1CAM, LRRC20, NCOR1, NETO2, OSR2, PHAX, PIM1, PVALB, RPH3AL, SERGEF, SYVN1, TECPR1, TJP2, TMEM161A, WDR54 | 0.98 | 0.99 | 0.90 | 0.95 | 0.84 | 0.96 |
High-risk IPMN vs. low-risk IPMN | 371 | 362 | 97 | 17 | Cost: 0.1; Gamma: 0.06; Epsilon: 0.08 | ABT1, EEF1A1, ETFDH, FGR, GSTM4, IPO13, KLHL23, L1CAM, LRRC20, NCOR1, NETO2, PHAX, PIM1, RPH3AL, SYVN1, TMEM161A, WDR54 | 0.96 | 0.94 | 0.89 | 0.91 | 0.84 | 0.85 |
Malignant IPMN vs. healthy | 778 | 678 | 43 | 4 | Cost: 1.0; Gamma: 0.05; Epsilon: 0.8 | CD72, DOK5, SUPT6H + DNAJC17 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Abbreviations: Sen, sensitivity; Spe, specificity; vs., versus.
The resulting biomarker panels exhibited very high accuracy (Fig. 1). For the discrimination of IPMN and healthy individuals, a classifier was constructed from six proteins. It perfectly distinguished sera from IPMN patients and healthy individuals within the discovery cohort. Absolute accuracy was also achieved upon validation (Table 1). Five of the six biomarker molecules—CD72, DOK5, ENO1, POU2F2, and SUPT6H—are also part of the 7-protein signature that discriminated low-risk cases from healthy individuals and the 6-protein panel that separated high-risk patients from healthy individuals; in all these comparisons, the AUC value was 1.00 (Table 1).
Discrimination between patients with high- and low-risk IPMN was more difficult to achieve. A large group of 26 proteins was required (Table 1; Fig. 1), yielding the best performance with an AUC value of 0.98 in the discovery cohort. This value was slightly reduced to 0.95 upon validation. While discrimination power is very high with a specificity of 96%, the panel lacks some sensitivity at 84%. This is nevertheless substantially better than the performance of CA19–9, for example. Problematic in terms of translation to application is the large number of proteins. We therefore tried to define a smaller panel, setting an AUC threshold at 0.90. The smallest possible protein panel was made of 17 proteins achieving an AUC of 0.91 (Table 1; Fig. 1).
We also looked at distinguishing either high-risk IPMN from a combination of low-risk IPMN cases and healthy individuals or low-risk patients from high-risk IPMN and healthy donors. Again, relatively large marker panels were required in most cases (Supplementary Table S5; Supplementary Fig. S1). The smallest panel overall consisted of four proteins, which discriminated samples from patients with malignant IPMN and those from healthy people with an accuracy of 100% (Table 1). These results suggest that fewer biomarkers are required the wider the pathologic distance between two patient groups.
miRNA signatures
We also investigated the serum samples’ miRNA content for its diagnostic utility. The overall workflow of defining miRNA biomarkers is displayed in Fig. 2. First a commercially available PCR-array assay that investigates the levels of 179 miRNAs was performed on a discovery cohort of 10 samples from healthy donors, 20 low-risk and 20 high-risk IPMN cases. We chose to focus on this subset of 179 miRNAs, because they represent molecules that are relatively abundant in serum and therefore likely to enable robust measurements. Identified biomarker candidates were then studied by qPCR in 24 healthy, 43 low-risk and 51 high-risk IPMN samples. The results were used to train SVM classifiers. For validation of their diagnostic performance, 16 normal, 19 low-risk, and 22 high-risk IPMN samples were analyzed at fixed classifier parameters. The normalized miRNA copy numbers resulting from all three analysis phases are shown in Supplementary Table S6. As before, diagnostic performance is presented as ROC curves based on the qPCR results generated on the training and validation sample cohorts (Fig. 2), respectively.
Figure 2. Performance of serum miRNA panels to distinguish IPMN patients from healthy individuals or discriminate between different malignancy risks. A, Workflow of the process for the identification of serum miRNA biomarker classifiers. B, Performance results are presented as ROC curves and corresponding AUC values as determined in the training and validation cohorts, respectively. C, For some typical miRNA molecules, the variations in their abundance level across IPMN grades are shown as boxplots, indicating median, first and third quartile as well as maximum and minimum scores. A, adenoma; B, borderline; C, carcinoma in situ; I, invasive carcinoma.
In terms of the size of the biomarker panels, miRNA-based discrimination performed better than protein-based discrimination. Although the achieved accuracy was similar, the number of molecules required was much smaller. As seen for the proteins, a common set of biomarkers formed the core of the miRNA panels, in this case three miRNAs: miR-122–5p, miR-125b-5p, and miR-146a-5p (Table 2). On their own, they allowed accurate discrimination of high-risk IPMN from healthy individuals. Adding two more miRNAs, miR-365a-3p and miR-375, also separated low-risk IPMN from healthy individuals. Finally, including a sixth and a seventh molecule (miR-155–5p and miR-205–5p) made it possible to distinguish between high- and low-risk IPMN with very high accuracy; the AUC upon validation was 0.94 (Table 2). Further comparisons of patient groups also yielded high AUC values (Supplementary Table S7; Supplementary Fig. S2). None of them required a panel larger than seven biomarkers.
Table 2. List of serum-miRNA classifiers.
Diagnostic objective | Biomarker panel size | SVM classifier parameters | miRNA biomarker panel | Training cohort AUC | Training cohort Sen | Training cohort Spe | Validation cohort AUC | Validation cohort Sen | Validation cohort Spe |
---|---|---|---|---|---|---|---|---|---|
High-risk IPMN vs. healthy | 3 | Cost: 0.2; Gamma: 0.2; Epsilon: 0.6 | miR-122–5p, miR-125b-5p, miR-146a-5p | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
Low-risk IPMN vs. healthy | 5 | Cost: 9.0; Gamma: 0.9; Epsilon: 0.9 | miR-122–5p, miR-125b-5p, miR-146a-5p + miR-365a-3p, miR-375 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
IPMN vs. healthy | 6 | Cost: 4.0; Gamma: 0.9; Epsilon: 0.9 | miR-122–5p, miR-125b-5p, miR-146a-5p + miR-365a-3p, miR-375 + miR-130a-3p | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
High-risk IPMN vs. low-risk IPMN | 7 | Cost: 0.7; Gamma: 0.6; Epsilon: 0.5 | miR-122–5p, miR-125b-5p, miR-146a-5p + miR-365a-3p, miR-375 + miR-155–5p, miR-205–5p | 0.95 | 0.94 | 0.84 | 0.94 | 0.86 | 0.95 |
Abbreviations: Sen, sensitivity; Spe, specificity; vs., versus.
Combination panel of serum proteins and miRNAs
Although IPMN cases could be distinguished from healthy individuals with an AUC of 1.00, perfect discrimination was not achieved between high- and low-risk patients. Even accumulating all differentially abundant proteins or miRNAs did not result in an AUC value better than the values shown in Tables 1 and 2. Therefore, we looked into improving performance by generating a classifier that integrates differentially abundant proteins and miRNAs. Marker selection started from scratch. Serum samples were randomly separated into a training cohort of 118 and a validation cohort of 57 samples. Identification of differentially abundant molecules and subsequent RFE with 5-fold cross-validation on the data originating from the training cohort formed the basis for training and optimizing a classifier, which was then applied to the validation samples. A biomarker panel of five proteins (EEF1A1, RPH3AL, NCOR1, L1CAM, TMEM161A) and three miRNAs (miR-146a-5p, miR-155–5p, miR-375) was found that could distinguish high-risk from low-risk IPMN cases with AUC values of 0.99 in the training cohort and 0.97 upon validation (Fig. 3). For the results of separating other patient groups, see Supplementary Table S8.
Diagnostic performance of a combined panel of protein and miRNA biomarkers to discriminate high-risk from low-risk IPMN. The results are presented as ROC curves and corresponding AUC values as determined in the training and validation cohorts, respectively.
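The workflow described for the combined panel (random split into 118 training and 57 validation samples, RFE with 5-fold cross-validation on the training data only, then application of the trained classifier to the validation samples) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it uses scikit-learn, a linear-kernel SVM (which RFE needs for feature ranking), AUC as the selection criterion, and a hypothetical function name `select_and_validate`. The preceding filter for differentially abundant molecules is omitted.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def select_and_validate(X, y, n_validation=57, seed=0):
    """Select features by RFE on the training cohort, then validate.

    X: samples x candidate markers (proteins and miRNAs combined).
    y: 0 for low-risk, 1 for high-risk IPMN.
    Returns the boolean feature mask and the training/validation AUCs.
    """
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=n_validation, stratify=y, random_state=seed
    )

    # Scaling is fitted on the training cohort only to avoid leakage.
    scaler = StandardScaler().fit(X_tr)
    X_tr_s, X_val_s = scaler.transform(X_tr), scaler.transform(X_val)

    # Recursive feature elimination with 5-fold cross-validation.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    selector = RFECV(SVC(kernel="linear"), step=1, cv=cv, scoring="roc_auc")
    selector.fit(X_tr_s, y_tr)
    mask = selector.support_

    # Final classifier trained on the selected markers only.
    clf = SVC(kernel="linear").fit(X_tr_s[:, mask], y_tr)
    auc_train = roc_auc_score(y_tr, clf.decision_function(X_tr_s[:, mask]))
    auc_val = roc_auc_score(y_val, clf.decision_function(X_val_s[:, mask]))
    return mask, auc_train, auc_val
```

Keeping feature selection inside the training cohort is the point of the design: the validation samples influence neither the marker choice nor the classifier parameters, so the validation AUC remains an independent estimate.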
Comparison with current clinical discrimination criteria
For the IPMN patients, information was available on preoperative CA19–9 levels, diagnosis of pancreatitis, cyst diameter, thickened/enhancing cyst wall, detection of a solid component, main pancreatic duct dilatation, lymphadenopathy, and obstructive jaundice (Supplementary Table S9), which are diagnostically relevant parameters according to international IPMN guidelines (10, 15, 34). To train an SVM classifier on these parameters, we used the same approach that had been applied to the protein and miRNA data. The training cohort consisted of 145 sera (92 low-risk, 53 high-risk), whereas validation was done in 65 samples (40 low-risk, 25 high-risk). Upon validation, an AUC value of 0.81 was achieved (Fig. 4). Because this was higher than expected, we checked further to rule out potential overfitting. Besides re-running the analysis independently several times with different training and validation sets, we also reversed the analysis, using the 65 validation sera for an entirely independent creation of an SVM classifier and the 145 samples originally used for training for validation. All analyses produced results very similar to the initial finding; in the reverse analysis, validation produced an AUC value of 0.82 (Fig. 4). Furthermore, we imputed missing values with the “mice” R package. Training and validation cohorts were randomly selected four times. The AUC values in the respective validation cohorts ranged from 0.80 to 0.82 (Supplementary Fig. S3). The results document that the blood-based analysis is substantially more accurate than the criteria recommended in the guidelines, with an AUC of 0.97 compared with 0.82.
Diagnostic performance of clinical parameters to discriminate high-risk from low-risk IPMN. The eight patient characteristics of preoperative CA19–9 levels, diagnosis of pancreatitis, cyst diameter, thickened/enhancing cyst wall, detection of a solid component, main pancreatic duct dilatation, lymphadenopathy, and obstructive jaundice were used for the training of an SVM classifier. The training and validation cohorts consisted of 145 and 65 samples, respectively. A, The discriminative performance of the classifier is shown as ROC curves and corresponding AUC values. B, To check for overfitting, the reverse analysis was also performed. The 65 samples were used for independent training, whereas validation was done on the 145 samples. Again, the resulting ROC curves and corresponding AUC values are shown.
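The overfitting check described above, in which the roles of the training and validation cohorts are swapped, can be illustrated with a short sketch. It is an approximation under stated assumptions: scikit-learn's `IterativeImputer` (a chained-equations imputer) stands in for the “mice” R package, the SVM uses an RBF kernel with default parameters, and the function names `fit_and_score` and `forward_and_reverse_auc` are hypothetical.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fit_and_score(X_fit, y_fit, X_eval, y_eval, seed=0):
    """Train on one cohort and return the AUC on the other."""
    # Imputation and scaling are fitted on the training cohort only.
    model = make_pipeline(
        IterativeImputer(random_state=seed),
        StandardScaler(),
        SVC(kernel="rbf"),
    )
    model.fit(X_fit, y_fit)
    return roc_auc_score(y_eval, model.decision_function(X_eval))


def forward_and_reverse_auc(X, y, n_validation=65, seed=0):
    """Return (forward AUC, reverse AUC) for one random split.

    Forward: train on the larger cohort, validate on the smaller one.
    Reverse: train on the smaller cohort, validate on the larger one.
    X may contain NaN for missing clinical parameters.
    """
    X_a, X_b, y_a, y_b = train_test_split(
        X, y, test_size=n_validation, stratify=y, random_state=seed
    )
    forward = fit_and_score(X_a, y_a, X_b, y_b, seed)
    reverse = fit_and_score(X_b, y_b, X_a, y_a, seed)
    return forward, reverse
```

Calling `forward_and_reverse_auc` with several different `seed` values would correspond to the repeated random selection of cohorts; closely matching forward and reverse AUC values across seeds argue against a result that depends on one particular split.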
Discussion
In this study, protein and miRNA biomarker panels were identified that discriminate different IPMN progression stages with high accuracy through SVM classifiers built from serum proteins, circulating miRNAs, or a mixture of both. This includes, in particular, the discrimination of healthy individuals from IPMN patients and of high-risk from low-risk IPMN cases. The former is an important finding toward detection of IPMN; how early during tumor development this might work is a matter for further research. The latter was the key objective of our study; it allows risk stratification once IPMN has been diagnosed and should help improve clinical decision-making. At low risk of developing a malignant tumor, an operation could be avoided, keeping patients under close surveillance instead. At high risk, tumor resection would be performed, improving the patients’ prognosis substantially. The detection of high risk may even be possible earlier in tumor pathogenesis, because the test could easily be repeated frequently during the course of the disease. As a third result of clinical potential, we found biomarkers that effectively distinguished between healthy individuals and patients with malignant IPMN. They may become useful for the detection of early-stage asymptomatic PDAC.
The analyses were performed on a sample cohort collected at a single medical institution. An advantage of this is that effects of differences in sample handling should be minimal. Consequently, the detected variations are likely to result from biological differences between IPMN grades rather than from biases caused by serum processing. For application in a wider clinical setting, however, the performance of the defined biomarker panels has to be re-evaluated in a prospective multicenter study to account for such potential handling effects, although common standard procedures were applied. Furthermore, the parameters of the SVM classifier probably require refinement before it can be applied routinely at many different sites for the diagnosis of individual patients. The more pathologically confirmed samples the classifier is based on, the better its performance can be fine-tuned and optimized, thereby continuously improving accuracy and robustness, although eventually only by subtle increments. Because IPMN cases are relatively rare and the required time span is long, performing such a study will be a challenge. For eventual clinical testing, qPCR-based assays could be used for miRNA biomarkers, whereas multiplex ELISAs could be applied for protein biomarkers.
Even so, the number of analyzed samples was large enough to warrant a solid evaluation. We are confident that the identified marker panels are robust. This is supported by the repeated and thus reproducible identification of particular marker molecules in different, independent analyses. In all comparisons, the AUC value achieved with the validation cohort was identical to or slightly lower than the one calculated for the training cohort. In none of them was the difference significant, however. This suggests that no major overfitting or underfitting occurred during classifier calculation and optimization. This is further supported by the fact that repetition with changed training and validation sets or reversal of the analysis did not make a difference. The quality of the identified biomarkers is indicated by the fact that CA19–9 did not contribute to any of the protein classifiers, although it was also assayed on the antibody microarray. Although CA19–9 exhibited differential abundance, other factors were found to be superior in diagnostic performance.
For our initial screening for miRNA biomarker candidates, we used the miRCURY LNA miRNA PCR system. The miRNAs studied are present in serum at high concentrations. This choice was made on purpose to focus on molecules that could act as robust biomarkers. These miRNAs are likely to be detectable, and even relatively small changes in abundance may be large enough in absolute value to be significant. This focus on abundant miRNAs may come at the cost of excluding informative miRNA molecules that are present in serum at only low concentrations but may nevertheless be specific. However, such molecules could be particularly prone to lacking sensitivity, thereby negatively affecting overall performance.
Thirty-seven of the 67 protein biomarkers identified overall in the pairwise comparisons (Table 1; Supplementary Table S1) exhibited correlation with IPMN progression. Of these molecules, only VEGFA had been reported in connection with cystic pancreatic tumors. It was found at elevated concentrations in the cyst fluid of serous cystic neoplasms and could distinguish them from other cystic lesions including IPMN (35). Eighteen proteins (ANXA1, AQP1, CAPN2, CD59, EEF1A1, ELMO2, ENO1, GSTM4, HMMR, IGF2BP3, L1CAM, NCOR1, NETO2, PRTN3, TJP2, TSLP, TTK, and VEGFA) have been shown to be involved in promoting PDAC proliferation or invasion (36–53). Their correlation with IPMN progression suggests that they may be factors that are critical for tumor functioning. However, meaningful conclusions on functional aspects are difficult to draw, because most functional information about proteins is based on cellular studies. Variations in serum may have different causes or consequences and represent the status of not just the tumor but the entire organism. Therefore, potential explanations are rather speculative.
Recently, extracellular vesicle-associated MUC5AC was reported as a possible marker for invasive IPMN (54); however, that study did not assess non-invasive high-grade samples. Unfortunately, the antibody microarray used in our study did not include MUC5AC, although it assayed 2,286 proteins. Therefore, we were not able to analyze its performance in our study. Besides the analysis of blood, there has been interest in developing protein biomarkers from pancreatic cyst fluid for IPMN classification. The protein IL1B was found to be upregulated in both serum samples and pancreatic cyst fluid from high-risk IPMN compared with low-risk IPMN (55). Although IL1B was more abundant in serum samples of high-risk than of low-risk IPMN patients in our data, it was not included in our SVM classifier for IPMN risk stratification because of its comparatively small discriminative power.
Overall, 10 miRNA molecules were part of the biomarker panels that discriminated best between the different IPMN stages (Table 2; Supplementary Table S2). Five of them have already been studied in IPMN. In our analysis, miR-125b-5p showed negative correlation with IPMN progression, consistent with changes observed in pancreatic cyst fluid (56). Decreases in serum levels of miR-130a-3p from benign to malignant IPMN match findings of such a reduction in IPMN tissues (57). Variations in miR-375 levels were observed in tissues of serous cystadenoma, mucinous cystic lesions, and PDAC (58). Levels of miR-146a-5p and miR-155–5p were lower in IPMN sera than in samples from healthy individuals; the opposite has been found in tissues (59–61). Such asymmetric distribution has been observed in cancer before (e.g., 62) and suggests an actively regulated secretion of miRNA molecules from cells. Four of the other five molecules (miR-122–5p, miR-205–5p, miR-326, and miR-365a-3p) are known to be associated with PDAC (63–66). No such link has been reported for miR-154–5p, but it has been found to act as a tumor suppressor in several tumor entities (67). As for proteins, a functional interpretation of variations in miRNA levels in serum is rather speculative. Moreover, any such interpretation should take the entire picture of all changes into account rather than focus on individual molecules.
In conclusion, a small set of serum-based protein and miRNA biomarkers was defined that permits discrimination between patients with high- and low-risk IPMN. This minimally invasive and accurate process could contribute substantially to patient management and improve the prognosis of patients with cystic pancreatic tumors.
Authors' Disclosures
S. Roth reports grants from German Cancer Aid during the conduct of the study. No disclosures were reported by the other authors.
Authors' Contributions
C. Zhang: Conceptualization, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. F.N. Al-Shaheri: Formal analysis, validation, investigation, methodology, writing–review and editing. M.S.S. Alhamdani: Software, investigation, methodology. A.S. Bauer: Data curation, software, formal analysis. J.D. Hoheisel: Conceptualization, formal analysis, supervision, investigation, methodology, project administration, writing–review and editing. M. Schenk: Investigation. U. Hinz: Data curation, investigation. P. Goedecke: Data curation, investigation. K. Al-Halabi: Data curation, formal analysis, investigation. M.W. Büchler: Resources, supervision, project administration, writing–review and editing. N.A. Giese: Resources, data curation, formal analysis. T. Hackert: Conceptualization, resources, formal analysis, supervision, validation, project administration, writing–review and editing. S. Roth: Conceptualization, resources, data curation, formal analysis, supervision, validation, project administration, writing–review and editing.
Acknowledgments
Chaoyang Zhang was supported by a scholarship of the China Scholarship Council.
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).