Purpose:

To determine whether a multianalyte liquid biopsy can improve the detection and staging of pancreatic ductal adenocarcinoma (PDAC).

Experimental Design:

We analyzed plasma from 204 subjects (71 healthy, 44 non-PDAC pancreatic disease, and 89 PDAC) for the following biomarkers: tumor-associated extracellular vesicle miRNA and mRNA isolated on a nanomagnetic platform that we developed and measured by next-generation sequencing or qPCR, circulating cell-free DNA (ccfDNA) concentration measured by qPCR, ccfDNA KRAS G12D/V/R mutations detected by droplet digital PCR, and CA19-9 measured by electrochemiluminescence immunoassay. We applied machine learning to training sets and subsequently evaluated model performance in independent, user-blinded test sets.

Results:

To identify patients with PDAC versus those without, we generated a classification model using a training set of 47 subjects (20 PDAC and 27 noncancer). When applied to a blinded test set (N = 136), the model achieved an AUC of 0.95 and accuracy of 92%, superior to the best individual biomarker, CA19-9 (89%). We next used a cohort of 20 patients with PDAC to train our model for disease staging and applied it to a blinded test set of 25 patients clinically staged by imaging as metastasis-free, including 9 subsequently determined to have had occult metastasis. Our workflow achieved significantly higher accuracy for disease staging (84%) than imaging alone (accuracy = 64%; P < 0.05).

Conclusions:

Algorithmically combining blood-based biomarkers may improve PDAC diagnostic accuracy and preoperative identification of nonmetastatic patients best suited for surgery, although larger validation studies are necessary.

Translational Relevance

Pancreatic ductal adenocarcinoma (PDAC) is a highly lethal disease, partly because most cases are not diagnosed until disease is widespread. There is, therefore, an urgent need for sensitive, noninvasive diagnostics. However, even for patients with pathologically confirmed PDAC, standard-of-care imaging can have low sensitivity to detect early metastatic disease. This complicates disease staging and therapy selection, including curative-intent surgery. Here we describe a multianalyte liquid biopsy to better detect and stage PDAC from a single blood sample. This approach was able to distinguish patients with PDAC from those without. Moreover, among patients with PDAC, the model could improve detection of occult metastatic disease that was imaging negative at baseline and only discovered intraoperatively or by subsequent imaging within 4 months of baseline blood draw. Although a larger validation study is needed, this test may improve early disease detection and, when performed in addition to diagnostic imaging, patient selection for curative-intent surgery.

Pancreatic ductal adenocarcinoma (PDAC) is the third leading cause of cancer-related death in the United States, with an overall 5-year survival of 9% (1). Diagnosis and staging currently rely on endoscopic ultrasound-guided biopsy, CT, and MRI (2). Most patients are diagnosed at an advanced stage, and sufficiently sensitive and specific screening tests for early disease remain elusive. While curative-intent surgery remains an option for patients whose disease is confined to the pancreas, distinguishing these patients from those with metastases, who are unlikely to benefit from surgery, remains challenging due to the presence of occult metastases not detectable by standard-of-care imaging (3–5).

To address these challenges, several blood-based liquid biopsy biomarkers have been developed but show low sensitivity for detection of early-stage disease (6–8). Carbohydrate antigen 19-9 (CA19-9), a long-standing PDAC-associated biomarker, is clinically utilized to monitor response to therapy but its role in screening or determining surgical resectability is unclear (9). More recently, several liquid biopsy biomarkers have shown potential for the diagnosis and staging of PDAC. Patients with PDAC with detectable circulating tumor cells (CTC) had significantly reduced progression-free and overall survival (10, 11), although CTCs are often undetectable in early-stage disease. Circulating cell-free DNA (ccfDNA) concentration has been shown to correlate with disease burden (12, 13); KRAS mutations in ccfDNA have been detectable at various stages of disease although at lower rates in early-stage disease (14); soluble protein biomarkers have demonstrated diagnostic value (15), and tumor-associated extracellular vesicles (EV) have generated enthusiasm for their potential to improve diagnosis of the disease (7, 15–17).

In our previous work, we showed that by enriching tumor-associated EVs from plasma using an immunomagnetic nanofluidic chip and analyzing their RNA cargo, we could identify transcriptional signatures that accurately distinguish patients with metastatic PDAC from healthy controls in clinical cohorts (18). However, although we have demonstrated promising results for detection of early-stage disease in a murine model of pancreatic cancer (KPCY; ref. 18), we have not yet demonstrated the accuracy of our approach for detection of early disease in human patients. Work from other groups has shown that the performance of a biomarker can be improved by combining it with different types of circulating biomarkers, such as ccfDNA and soluble proteins (19, 20). Here we build on previous work and describe a multianalyte panel that algorithmically combines tumor-associated EV mRNA and miRNA, ccfDNA concentration and KRAS mutation detection, and CA19-9 using machine learning. Using training sets of samples from patients, disease controls, and healthy individuals as well as independent, blinded test sets, we first apply the approach to distinguish cancer from noncancer patient samples. We then retrain the model for disease staging and the detection of metastatic disease for patients with PDAC originally staged by standard-of-care imaging.

Patients and sample collection and processing

Whole blood was collected at baseline (therapy-naïve) from 204 total patients at the Hospital of the University of Pennsylvania (Philadelphia, PA) under IRB Protocol #822028 after obtaining written informed consent. The study was conducted in accordance with the Declaration of Helsinki. Among the 89 patients with PDAC, 58 were clinically staged on the basis of baseline imaging as having local disease only (M0), including 37 resectable patients and 21 patients with locally advanced disease. The remaining 31 patients had evidence of metastatic disease on baseline imaging (M1; Table 1). For the staging analysis, retrospective chart review was conducted to determine whether 34 patients originally staged by imaging as metastasis-free (M0) and resectable might have harbored metastatic disease below the level of detection for standard-of-care imaging. Ten patients were categorized as having had occult metastases, including 4 with metastases detected intraoperatively and 6 with very early recurrence, here defined as within 4 months of baseline blood draw (Supplementary Fig. S1). Time to metastasis was defined with respect to the date of baseline blood draw, censoring patients based on the date of last follow-up. Imaging data and clinical staging were obtained by chart abstraction. The 115 subjects serving as noncancer controls included 44 patients with noncancer pancreatic diseases such as intraductal papillary mucinous neoplasm (IPMN) and pancreatitis, as well as 71 healthy individuals enrolled at the time of routine screening procedures such as endoscopy. Patients with an active malignancy at the time of blood draw were excluded from the control cohorts. All noncancer control patients were followed for a minimum of 4 months to verify that no patient received a PDAC diagnosis subsequent to blood draw. Venous blood was collected in K2EDTA vacutainers (Becton Dickinson) or Streck cfDNA BCT (Streck) and processed to plasma as described previously (18). 
K2EDTA and Streck cfDNA whole blood was processed within 3 or 24 hours after blood draw, respectively. Plasma was aliquoted and stored at −80°C for future use. All subjects had sufficient total plasma from a single blood draw such that all assays described below could be performed. In addition to the 204 samples for which results are reported, a batch of 10 additional samples was processed but yielded results for 4 biomarkers that differed significantly from those of the training set. Remeasurement was not possible because the plasma samples had been exhausted. This batch was excluded from the blinded test set before being classified using machine learning (21). This highlights a limitation of machine learning–based approaches: the model can only be trusted when the test data are consistent with the data on which the model was trained. Automated outlier analysis to avoid machine-learning errors caused by spurious data is an active area of research. The study was designed and conducted in accordance with the REporting recommendations for tumor MARKer prognostic studies (REMARK) guidelines (22).

Table 1.

Clinical characteristics of study population.

Discovery set (N = 29)
Age range (median) | Gender | Non-PDAC pathology | TNM stage | Clinical stage
Healthy controls 62.0–69.4 (64.7) n = 4 Male  
  n = 3 Female  
Disease control 43.4–72.0 (70.7) n = 4 Male n = 4 Pancreatitis  
  n = 1 Female n = 1 Biliary stricture  
M0 60.5–71.5 (68.3) n = 3 Male  n = 2 cT2cN0M0 n = 2 IB 
  n = 1 Female  n = 1 cT3cN0M0 n = 1 IIA 
    n = 1 cTxN0M0 n = 1 X 
M1 51.0–67.0 (64.0) n = 7 Male  n = 4 cT3N1M1 n = 13 IV 
  n = 6 Female  n = 1 cT4N1M1  
    n = 8 cTxNxM1  
Training set (N = 47)a
Age range (median) | Gender | Non-PDAC pathology | TNM stage | Clinical stage
Healthy controls 45.6–75.3 (62.3) n = 8 Male  
  n = 7 Female  
Disease control 43.4–82.2 (65.0) n = 10 Male n = 9 Pancreatitis  
  n = 2 Female n = 3 IPMN  
M0 54.4–77.7 (65.2) n = 3 Male  n = 3 cT1cN0M0 n = 3 IA 
  n = 6 Female  n = 1 cT1cNxM0 n = 5 IB 
    n = 5 cT2N0M0 n = 1 X 
M1 51.0–81.5 (67.5) n = 5 Male  n = 1 cT1cN0M1 n = 11 IV 
  n = 6 Female  n = 2 cT2N0M1  
    n = 1 cT2N1M1  
    n = 1 cT2N0M1  
    n = 1 cT3N1M1  
    n = 2 cT4N1M1  
    n = 1 cTxN0M1  
    n = 2 cTxN1M1  
Test set (N = 136)
Age range (median) | Gender | Non-PDAC pathology | TNM stage | Clinical stage
Healthy controls 41.6–85.8 (66.0) n = 20 Male  
  n = 29 Female  
Disease control 19.9–83.1 (63.7) n = 13 Male n = 3 IPMN  
  n = 17 Female n = 12 Pancreatitis  
   n = 2 Biliary stricture  
   n = 1 Benign neurofibroma  
   n = 11 Pancreatic cyst  
   n = 1 Pancreatic duct dilation  
M0 50.5–85.4 (66.5) n = 26 Male  n = 3 cT1cN0M0  
  n = 19 Female  n = 16 cT2N0M0  
    n = 4 cT2N1M0 n = 3 IA 
    n = 7 cT3N0M0 n = 15 IB 
    n = 4 cT3N1M0 n = 7 IIA 
    n = 3 cT4N0M0 n = 9 IIB 
    n = 1 cT4N0M1 n = 8 III 
    n = 5 cT4N1M0 n = 3 IV 
    n = 2 cT4N1M1  
M1 48.6–71.2 (62.5) n = 7 Male  n = 1 cT0N0M1 n = 12 IV 
  n = 5 Female  n = 1 cT2N0M1  
    n = 1 cT2N1M1  
    n = 2 cT3N0M1  
    n = 1 cT3NxM1  
    n = 2 cT4N0M1  
    n = 4 cT4N1M1  

Note: Designation of M0 versus M1 is based on baseline imaging.

a: 8 patients are included in both the discovery and training sets.

Tumor-derived EV miRNA and mRNA isolation by track etched magnetic nanopore device

EVs from each patient's K2EDTA-collected plasma (1.5 mL) were magnetically labeled using biotinylated antibodies and anti-biotin ultrapure 50-nm-diameter nanoparticles (Miltenyi Biotec). Antibodies used in this study included anti-human CD326 (EpCAM; BioLegend), anti-human CD104 (Thermo Fisher Scientific), anti-human c-Met Monoclonal (Thermo Fisher Scientific), anti-human CD44v6 antibody (Thermo Fisher Scientific), and anti-human TSPAN8 (Miltenyi Biotec). These surface markers have been shown previously to enrich pancreatic tumor-associated EVs from plasma (18, 23). These five biotinylated antibodies (1.25 μL each) were pipetted into the human plasma samples and incubated for 20 minutes at room temperature on a shaking mixer. Subsequently, anti-biotin magnetic nanoparticles (20 μL; Miltenyi Biotec) were added to the samples and incubated for another 20 minutes at room temperature on the shaking mixer. Next, the plasma samples were loaded into the reservoir of the track etched magnetic nanopore (TENPO) device which was connected to a programmable syringe pump (Braintree Scientific) to provide the negative pressure driving the sample through the device.

Details on the design and fabrication of TENPO have been reported previously (18). Briefly, a permanent magnet (NdFeB disc magnet, d = 1.5 inches, h = 0.75 inches; K&J Magnetics) was placed beneath the TENPO device to magnetize TENPO's paramagnetic Ni80Fe20 film and the superparamagnetic nanoparticles used to label the EVs. While samples were pulled through the device, EVs that were labeled with a sufficient number of magnetic nanoparticles were captured at the edges of the chip's nanopores, while background EVs flowed through and were discarded. The positively selected EVs were subsequently lysed on the chip by directly loading QIAzol lysis reagent (700 μL; Qiagen), incubating for 3 minutes, and collecting the lysate. The RNA was then extracted from this lysate off-chip (ExoRNeasy Serum/Plasma Kit, Qiagen). The EV miRNAs and mRNAs were eluted and stored at −80°C or immediately processed for further analysis.

EV miRNA sequencing and candidate discovery

A discovery cohort of 29 samples (Table 1; Supplementary Fig. S2) was analyzed by next-generation sequencing to identify miRNAs in the enriched tumor-associated EVs that might be differentially expressed among patient cohorts. The QIAseq miRNA Library Kit (Qiagen) was used to make a library from isolated EV miRNA. A BioAnalyzer was used to quantify RNA prior to sequencing. The library was sequenced using a HiSeq 2500 Kit (Illumina, Next-Generation Sequencing Core, University of Pennsylvania, Philadelphia, PA). A modified version of the UPenn SCAP-T RNA-Seq expression pipeline (Fisher, S A., “Safisher/Ngs.” GitHub, 2017) was used for expression quantification by aligning to the hg38 genome. The minimum fragment length allowed past the TRIM module was adjusted to 16 bases for miRNA analysis. The number of allowed mismatches was capped at one and unannotated splices were prohibited. Expression counts were normalized by DESeq2 (24) and quantified using VERSE (25), using Gencode 25 and UCSC mm10 gene annotations, combined with MirBase v21 annotations for 3p and 5p miRNA.

Selection of EV RNA panel

To identify potential EV miRNA candidates for PDAC diagnosis, we first applied the feature selection algorithm least absolute shrinkage and selection operator (LASSO) to the EV miRNA sequencing results to find the most informative miRNAs (Supplementary Fig. S2A). The resulting eight miRNA candidates were: hsa.miR.103b, hsa.miR.23a.3p, hsa.miR.432.5p, hsa.miR.409.3p, hsa.miR.224.5p, hsa.miR.1299, hsa.miR.4782.5p, and hsa.miR.4772.3p (Supplementary Fig. S2B). We next validated the miRNA candidates by qPCR and identified three miRNAs (hsa.miR.4772.3p, hsa.miR.4782.5p, and hsa.miR.432.5p) with Cq ≥ 40, which were considered insufficiently abundant and were therefore excluded from further analysis (Supplementary Fig. S2C). The remaining five miRNAs were measured by qPCR within the training set (N = 47) and were compared with the EV miRNA sequencing data (Supplementary Fig. S2D) within each patient subset (noncancer and PDAC). The qPCR and sequencing data corresponded well with one another (R² = 0.6; Supplementary Fig. S2D). We also included six EV mRNAs (CD63, CK18, GAPDH, H3F3A, KRAS, and ODC1), which had previously been used to distinguish patients with stage IV PDAC from healthy controls (18), to form a panel of 11 potential EV RNA biomarkers. These 11 EV RNA biomarkers, combined with CA19-9, ccfDNA concentration (qPCR for ALU), and ctDNA (KRAS mutation allele fraction), formed the final 14 biomarker candidates for later classification. The workflow of multianalyte panel generation is shown in Supplementary Fig. S3.

EV miRNA and mRNA qPCR

The miScript SYBR Green PCR Kit (Qiagen) and miScript primers (Qiagen) were used to quantify EV miRNAs. A master mix containing miScript SYBR Green, miScript primer, universal primer, and RNase-free water was prepared at a 5:1:1:2 ratio. 9 μL of the master mix was added to each well of a 384-well plate, followed by 1 μL of cDNA. 40 cycles were run with a default setting using CFX384 Touch Real-Time PCR machine (Bio-Rad). The SsoAdvanced Universal SYBR Green Supermix (Bio-Rad) and primers (Integrated DNA Technologies) were used for EV mRNA quantification. The SYBR Green supermix, primers, and RNase-free water were combined at a 5:0.5:3.5 ratio for the master mix. 9 μL of the master mix was added to each well, followed by 1 μL of cDNA. 40 cycles were run with a default setting using CFX384 Touch Real-Time PCR machine (Bio-Rad). Duplicates were performed for each sample. The melting curves for the amplified DNA were manually validated before subsequent analysis.
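The two master-mix ratios above both sum to 9 parts, so each 9 μL aliquot maps directly onto per-well component volumes. The following is a minimal sketch of that arithmetic; the function name and the `overage` parameter (extra volume to cover pipetting loss) are illustrative assumptions and are not part of the reported protocol.

```python
def master_mix_volumes(ratio, volume_per_well_ul, n_wells, overage=0.0):
    """Split a master mix into component volumes (in uL) for n_wells reactions.

    ratio: mapping of component name -> parts, e.g. 5:1:1:2.
    overage: hypothetical fractional excess for pipetting loss (assumption).
    """
    total_parts = sum(ratio.values())
    total_volume = volume_per_well_ul * n_wells * (1.0 + overage)
    return {name: total_volume * parts / total_parts for name, parts in ratio.items()}


# Ratios as stated in the text; 9 uL of master mix is dispensed per well.
mirna_ratio = {
    "miScript SYBR Green": 5,
    "miScript primer": 1,
    "universal primer": 1,
    "RNase-free water": 2,
}
mrna_ratio = {"SYBR Green supermix": 5, "primers": 0.5, "RNase-free water": 3.5}

mirna_per_well = master_mix_volumes(mirna_ratio, 9.0, n_wells=1)
mrna_per_well = master_mix_volumes(mrna_ratio, 9.0, n_wells=1)
```

Scaling to a plate only requires changing `n_wells`; for example, duplicates of 47 samples for one target would use `n_wells=94` before any overage.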

ccfDNA extraction and concentration

ccfDNA was isolated from K2EDTA- or Streck-collected plasma. If necessary to ensure a consistent input volume across all samples, the volume was adjusted with PBS and the measured ccfDNA concentration was corrected for original input. Extraction was performed using the QIAamp Circulating Nucleic Acid Kit (Qiagen, catalog no. 55114) with two modifications to the manufacturer's protocol. First, incubation of the buffer-lysate solution was increased to 1 hour at 60°C. Second, the final elution was carried out twice with 30 μL of Buffer AVE for a total of 60 μL. The extracted ccfDNA from 1 mL of plasma was used for downstream assays with extracted ccfDNA stored at 4°C for short-term use or at −20°C for long-term storage. The concentration of extracted ccfDNA was quantified by qPCR for a 115 bp amplicon of the ALU repetitive element (26). Briefly, qPCR was carried out on 1 μL of extracted ccfDNA, in quadruplicate, using Power SYBR Green PCR Master Mix (Applied Biosystems, catalog no. 4367659) according to the manufacturer's instructions on a ViiA 7 Real-Time PCR System (Applied Biosystems). Results were normalized to a standard curve of reference DNA (Promega, catalog no. PAG3041) using QuantStudio Real-Time PCR Software (Applied Biosystems).
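Standard-curve quantification of this kind is commonly done by fitting Cq against the log of the reference DNA quantity and inverting the fit for unknowns. The sketch below illustrates that idea together with a correction for PBS-adjusted input volume; it is not the QuantStudio software's implementation, and the function names, the standard values, and the form of the volume correction are assumptions made for illustration.

```python
import math


def fit_standard_curve(quantities, cqs):
    """Least-squares fit of Cq = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cqs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cqs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate quantity from a measured Cq."""
    return 10 ** ((cq - intercept) / slope)


def correct_for_input(measured_conc, plasma_volume_ml, total_volume_ml):
    """Scale back to the original plasma when PBS was added to reach total_volume_ml."""
    return measured_conc * total_volume_ml / plasma_volume_ml


# Hypothetical 10-fold dilution series of reference DNA (arbitrary units).
standards = [10.0, 1.0, 0.1, 0.01]
standard_cqs = [12.0, 15.32, 18.64, 21.96]
slope, intercept = fit_standard_curve(standards, standard_cqs)

# Replicate Cq values for one sample are averaged before inversion.
sample_cqs = [15.30, 15.34, 15.31, 15.33]
sample_conc = quantity_from_cq(sum(sample_cqs) / len(sample_cqs), slope, intercept)

# Example: 0.8 mL of plasma brought to 1.0 mL with PBS before extraction.
corrected = correct_for_input(sample_conc, plasma_volume_ml=0.8, total_volume_ml=1.0)
```

A slope near −3.32 corresponds to a doubling of product per cycle, which is a convenient sanity check on the dilution series.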

Preamplification droplet digital PCR for detection of circulating KRAS G12D/V/R mutations

Preamplification PCR of the KRAS G12 locus was performed using 15 μL of ccfDNA eluate in a 50 μL reaction. Preamplified material was diluted 1:4 with TE buffer and stored at 4°C for short-term use or at −20°C for long-term storage. Multiplex droplet digital PCR (ddPCR) to detect KRAS G12D/V/R/WT or duplex ddPCR (KRAS G12D/WT, G12V/WT, or G12R/WT) was prepared as a 30 μL reaction mix containing 2× TaqMan Genotyping Master Mix, 1× droplet stabilizer, and 200 nmol/L primers (Table 2), probes at 50 nmol/L (multiplex G12R only) or 100 nmol/L (multiplex G12D and WT, both probes in duplex assays), and 10 μL of diluted preamplification reaction. Multiplex ddPCR for KRAS G12D/V/R/WT was initially used to identify positive samples; these findings were verified and quantified by testing with the duplex assay specific to the identified variant. Then, 25 μL of each reaction mix was loaded onto the RainDrop Source instrument (RainDance Technologies, Inc.) for droplet production. Mutant allele fraction was calculated as the mutant allele copy number divided by the total (wild-type + mutant) copy number. Samples that failed to meet mutant copy number thresholds or that had a mutant allele fraction <0.01% were considered undetectable and assigned a value of 0.001%. Among the samples with a detectable KRAS mutation, the allele fraction was analyzed as a continuous variable, with values ranging from 0.01% to 39.08% (median, 0.405%).
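The allele-fraction rule described above can be written as a short function. This is a sketch under stated assumptions: the 0.01% floor and the 0.001% assigned value come from the text, whereas the minimum mutant copy number (`min_mutant_copies`) is a hypothetical placeholder because the actual threshold is not given here.

```python
UNDETECTABLE_VALUE_PCT = 0.001   # value assigned to undetectable samples (from the text)
MIN_ALLELE_FRACTION_PCT = 0.01   # allele-fraction floor for detection (from the text)


def mutant_allele_fraction_pct(mutant_copies, wildtype_copies, min_mutant_copies=3):
    """Return the KRAS mutant allele fraction in percent.

    min_mutant_copies is a hypothetical copy-number threshold (assumption);
    samples below either threshold are assigned UNDETECTABLE_VALUE_PCT.
    """
    total = mutant_copies + wildtype_copies
    if total == 0 or mutant_copies < min_mutant_copies:
        return UNDETECTABLE_VALUE_PCT
    fraction_pct = 100.0 * mutant_copies / total
    if fraction_pct < MIN_ALLELE_FRACTION_PCT:
        return UNDETECTABLE_VALUE_PCT
    return fraction_pct


detectable = mutant_allele_fraction_pct(50, 9950)          # 50 of 10,000 copies
too_few_copies = mutant_allele_fraction_pct(1, 100000)     # below copy threshold
below_floor = mutant_allele_fraction_pct(5, 1000000)       # fraction under 0.01%
```

Assigning a small nonzero value to undetectable samples keeps the feature usable as a continuous variable, for example after a log transform.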

Table 2.

Primers and probes for KRAS mutation analysis.

Primer/probe | Sequence
KRAS G12 forward primer | AGGCCTGCTGAAAATGACTGAATAT
KRAS G12 reverse primer | GCTGTATCGTCAAGGCACTCTT
KRAS WT-VIC probe | VIC-TTGGAGCTGGTGGCGT-MGBNFQ
KRAS G12D-FAM probe | FAM-TGGAGCTGATGGCGT-MGBNFQ
KRAS G12R-FAM probe | FAM-TTGGAGCTCGTGGCGT-MGBNFQ
KRAS G12V-FAM probe | FAM-GAGCTGTTGGCGT-MGBNFQ

CA19-9 measurement

The Hospital of the University of Pennsylvania Clinical Immunology Laboratory was provided a 200 μL aliquot of K2EDTA plasma that had been banked at −80°C. CA19-9 was measured as a research assay by electrochemiluminescence immunoassay using the Elecsys CA19-9 Immunoassay on a cobas e601 platform (Roche), per the manufacturer's instructions. The resulting CA19-9 values ranged from 0 to 793,700 U/mL (median, 18.165 U/mL).

Machine-learning data analysis

Our machine learning–based development of a PDAC diagnostic includes a feature selection step, a training step, and a validation step using a blinded test set. To mitigate the effects of overfitting, the blinded test sets were kept separate from, and completely independent of, the data used to discover features or to train the model. First, to select the features used in our model, we performed feature selection using LASSO on the 14 biomarker candidates in the training set, in which each subject is labeled with the true state (e.g., PDAC vs. noncancer). Using these identified features, we then trained a classifier model. During the development of this model, we evaluated its performance using cross-validation within the training set. Finally, this machine-learning model was evaluated by classifying subjects in a separate, user-blinded test set.
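To illustrate why LASSO acts as a feature selector, the sketch below minimizes the standard LASSO objective by cyclic coordinate descent on a toy data set: the L1 penalty shrinks the informative coefficient and sets the uninformative one exactly to zero. This is not the Matlab routine used in the study; the function names, the penalty value, and the toy data are assumptions, and the columns of `X` and `y` are assumed to be mean-centered and non-constant.

```python
def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    Assumes mean-centered, non-constant columns (no intercept term).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Prediction from all features except j (partial residual).
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n)
    return w


# Toy data: feature 0 determines y, feature 1 is unrelated to y.
X = [[-2.0, 1.0], [-1.0, -2.0], [0.0, 2.0], [1.0, -2.0], [2.0, 1.0]]
y = [-6.0, -3.0, 0.0, 3.0, 6.0]
weights = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, w_j in enumerate(weights) if w_j != 0.0]
```

Features with nonzero weights (here only feature 0) are the ones carried forward; increasing `lam` selects fewer features.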

The following additional steps were taken to mitigate the effects of overfitting in the development and evaluation of our machine-learning model. Instead of using a single machine-learning algorithm, we used an ensemble of classifier models (including K-Nearest Neighbors, SVM, linear discriminant analysis, logistic regression, and Naive Bayes) and averaged their results. With model averaging, overfitting by any single algorithm can be mitigated, because each model overfits the data differently and these errors tend to average out, providing a more accurate model than any single method alone (27). We additionally applied a bootstrapping method to randomly select multiple subgroups of the training set to train the ensemble model, and thus mitigate the effects of outlier data in the training set. Most importantly, the model was evaluated using an independent, blinded data set only once, avoiding the possibility of the model overfitting the test set. The classifier model was implemented in Python, and LASSO was carried out in Matlab 2017a. The 95% confidence intervals (CI) for sensitivity, specificity, and accuracy of PDAC diagnosis and occult metastasis detection were calculated on the basis of the binomial proportion confidence interval. The McNemar asymptotic test (Matlab 2017a) was used to test concordance between our panel and CA19-9 for PDAC diagnosis and between our panel and imaging for occult metastasis detection (28). The Mann–Whitney test was used to evaluate the statistical significance of differences in individual biomarker profiles between two groups.
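The combination of model averaging and bootstrapping described above can be sketched in a few lines. This is a simplified illustration, not the study's implementation: only two base learners are used (a k-nearest-neighbors vote and a nearest-centroid rule standing in for the other listed classifiers), and the function names, the number of bootstrap resamples, and the toy data are assumptions.

```python
import random


def knn_predict(train_x, train_y, x, k=3):
    """Fraction of the k nearest training points (squared Euclidean) labeled 1."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_x, train_y)
    )
    nearest = dists[:k]
    return sum(label for _, label in nearest) / len(nearest)


def centroid_predict(train_x, train_y, x):
    """1.0 if x is closer to the class-1 centroid than to the class-0 centroid."""
    def centroid(label):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(c, x))

    return 1.0 if sq_dist(centroid(1)) < sq_dist(centroid(0)) else 0.0


def bagged_ensemble_score(train_x, train_y, x, n_boot=25, seed=0):
    """Average both learners' outputs over n_boot bootstrap resamples."""
    rng = random.Random(seed)
    n = len(train_x)
    scores = []
    while len(scores) < 2 * n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bx = [train_x[i] for i in idx]
        by = [train_y[i] for i in idx]
        if len(set(by)) < 2:  # skip resamples that lost one class entirely
            continue
        scores.append(knn_predict(bx, by, x))
        scores.append(centroid_predict(bx, by, x))
    return sum(scores) / len(scores)


# Toy two-marker training set: label 1 = PDAC-like, label 0 = noncancer-like.
train_x = [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0], [0.1, 0.2],
           [5.0, 5.1], [4.9, 5.0], [5.2, 4.8], [5.1, 5.2]]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
score_pdac_like = bagged_ensemble_score(train_x, train_y, [5.0, 5.0])
score_noncancer_like = bagged_ensemble_score(train_x, train_y, [0.0, 0.0])
```

A subject would be called positive when the averaged score exceeds a chosen cutoff such as 0.5; because each resample and each learner errs differently, the average is less sensitive to any single outlier or model.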

Biomarker panel development

We constructed a biomarker panel including multiple blood-based analytes with the aim of improving sensitivity and specificity of disease diagnosis and staging (Fig. 1). We included previously reported tumor-associated markers such as ccfDNA concentration and ccfDNA-based detection of the KRAS G12D, V, and R mutations present in about 90% of PDAC tumors (29). CA19-9 is a routinely ordered laboratory test for PDAC monitoring and thus could readily be applied in the setting of disease detection. Although we previously showed that a panel of EV miRNAs could detect PDAC in a transgenic mouse model, we wanted to determine which miRNAs would be optimal for analyzing human samples. To do this, we isolated EVs and their miRNA cargo from the plasma of a discovery cohort of 29 patients (Table 1; Supplementary Fig. S1), including 7 healthy controls, 5 disease controls (1 nonmalignant biliary stricture and 4 pancreatitis), and 17 patients with PDAC of various disease stages. Next-generation sequencing was performed on extracted EV miRNA and we applied the LASSO feature selection algorithm to the results to identify the most informative miRNAs (Supplementary Fig. S2A and S2B). Among the 8 most informative, only 5 were selected to move forward based on their abundance as detected by qPCR (Cq < 40; Supplementary Fig. S2C). To validate qPCR-based detection of the 5 miRNAs, matched samples were run by qPCR and the results compared with sequencing results, resulting in a correlation coefficient of R² = 0.6. We then added six EV mRNA candidates (CD63, CK18, GAPDH, H3F3A, KRAS, and ODC1) which we had previously used to distinguish patients with metastatic PDAC from healthy controls (18). Altogether, including ccfDNA concentration, circulating mutant KRAS allele fraction, and CA19-9 concentration, we analyzed a total of 14 biomarker candidates for each subject.

Figure 1.

Combining multiple circulating biomarkers to diagnose and stage PDAC. Our biomarker panel consists of the mRNA and miRNA cargo of tumor-derived EVs enriched from plasma, circulating CA19-9, circulating cell-free DNA concentration (as determined by qPCR to detect the ALU repeat element), and circulating mutant KRAS allele fraction. This multiplex panel is combined algorithmically using machine learning. The system is trained using supervised learning on a cohort of 47 patients including 15 healthy individuals, 12 noncancer disease controls, and 20 with various stages of PDAC. Finally, the developed classifiers are evaluated using an independent, blinded test set of 136 individuals to quantify performance.

Using this panel of 14 biomarkers, we trained our machine-learning model with a set of 15 healthy controls, 12 disease controls (3 IPMN and 9 pancreatitis), and 20 patients with PDAC of various stages (Fig. 2A; Table 1). The best individual marker at distinguishing patients with PDAC from noncancer controls was CA19-9 (Fig. 2C; Supplementary Fig. S4), which also showed the highest fold-change between the PDAC and non-PDAC cohorts among the 14 biomarker candidates (Fig. 2B). CA19-9 achieved an accuracy of A = (TP + TN)/total = 84% (95% CI, 82%–85%), where TP is the number of true positives and TN is the number of true negatives, using the clinical threshold of 36 U/mL (30–32). The best performing individual EV mRNA marker was CK18 (A = 66%; 95% CI, 58%–73%), which was also shown to be a predictive marker in our previous study on EV mRNA biomarkers (18). The best performing EV miRNA marker was miR.409 (A = 59%; 95% CI, 55%–63%), a marker that has been associated with pancreatic oncogenesis (33, 34). The accuracy of ccfDNA concentration was A = 62% (95% CI, 52%–73%), and that of circulating mutant KRAS allele fraction was A = 66%.
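The accuracy definition A = (TP + TN)/total applied to a single thresholded marker can be sketched as follows. The 36 U/mL threshold is the clinical value cited above; the function name, the choice of a strict "greater than" comparison, and the example values are assumptions for illustration only and are not study data.

```python
CA19_9_THRESHOLD_U_PER_ML = 36.0  # clinical threshold cited in the text


def accuracy_at_threshold(values, has_pdac, threshold=CA19_9_THRESHOLD_U_PER_ML):
    """Accuracy A = (TP + TN) / total, calling values above threshold PDAC.

    Whether the boundary value itself counts as positive is an assumption.
    """
    tp = sum(1 for v, y in zip(values, has_pdac) if v > threshold and y)
    tn = sum(1 for v, y in zip(values, has_pdac) if v <= threshold and not y)
    return (tp + tn) / len(values)


# Hypothetical CA19-9 values (U/mL) with true disease status.
ca19_9_values = [12.0, 250.0, 40.0, 8.5, 30.0, 1200.0]
true_pdac = [False, True, False, False, True, True]
ca19_9_accuracy = accuracy_at_threshold(ca19_9_values, true_pdac)
```

In this toy example there are 2 true positives and 2 true negatives among 6 subjects, with one false positive (40 U/mL, noncancer) and one false negative (30 U/mL, PDAC).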

Figure 2.

Development of the biomarker panel using the training set. A, Heatmap shows values for the 14 circulating biomarkers from each patient in the training set, which included 15 healthy controls, 12 disease controls, and 20 patients with PDAC. B, Fold changes of all biomarkers are plotted comparing PDAC versus noncancer patients. Error bars, SD. ΔCq is calculated as Cq,PDAC − Cq,NC. C, Accuracy of each individual biomarker for PDAC diagnosis. Clinical threshold of 36 U/mL was used for CA19-9. Other biomarkers' thresholds were determined by linear discriminant analysis. Error bars are SE from bootstrapping 10 times from the training set. D, A colormap shows the Pearson correlation coefficient (R) between each circulating biomarker. The inset colormap shows the average Pearson correlation coefficient among EV-miRNAs (by averaging R from all possible EV-miRNA pairs), EV-mRNAs (by averaging R from all possible EV-mRNA pairs) with the CA19-9, ccfDNA concentration, and KRAS mutation detection in ccfDNA (designated ctDNA, for circulating tumor DNA, in the figure).

For a predictive panel, each biomarker needs predictive power of its own, and the constituent biomarkers should not correlate strongly with one another, so that each carries some unique information about the state of the patient. We calculated pairwise correlation coefficients (R) between biomarkers and found that individual biomarkers were generally not well correlated with one another, with the exception of CA19-9 and circulating mutant KRAS allele fraction (|R| = 0.73; Fig. 2D); the biomarkers were therefore suitable to be combined in a panel. More specifically, CA19-9 did not correlate with either ccfDNA concentration or EV RNAs (|R| < 0.4). Moreover, ccfDNA concentration did not correlate with EV RNAs (|R| < 0.5) and was weakly correlated with circulating mutant KRAS allele fraction (|R| = 0.55). Tumor-derived EV miRNAs weakly correlated with one another (average |R| = 0.65) but not with other biomarkers (|R| < 0.40). Tumor-derived EV mRNAs weakly correlated with one another (average |R| = 0.66) but not with other biomarkers (|R| < 0.40). Interestingly, EV-CK18, in addition to having the greatest accuracy of any individual EV mRNA biomarker, was also not strongly correlated with any other measured biomarker (|R| < 0.55).
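The pairwise correlation screen described above can be illustrated with a minimal sketch. The marker names, the values, and the 0.7 redundancy cutoff are hypothetical placeholders chosen for illustration; they are not data or thresholds from this study.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical measurements for three markers across five subjects.
markers = {
    "marker_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "marker_B": [2.1, 3.9, 6.2, 7.8, 10.1],  # tracks marker_A closely
    "marker_C": [3.0, 1.0, 4.0, 1.0, 5.0],   # largely independent
}

# |R| for every unordered pair of markers.
abs_r = {
    (a, b): abs(pearson_r(markers[a], markers[b]))
    for a, b in combinations(markers, 2)
}

REDUNDANCY_CUTOFF = 0.7  # illustrative cutoff, not taken from the study
redundant_pairs = [pair for pair, r in abs_r.items() if r >= REDUNDANCY_CUTOFF]

for pair, r in abs_r.items():
    print(pair, round(r, 2))
print("redundant pairs:", redundant_pairs)
```

In this toy data only the marker_A/marker_B pair exceeds the cutoff, so those two markers would contribute largely overlapping information to a panel.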

Distinguishing patients with PDAC from noncancer controls

We next sought to identify the optimal panel of biomarkers from the 14 discussed above to distinguish patients with PDAC from noncancer controls. To achieve this, we applied LASSO to our training set (Figs. 2A and 3A) and determined that the best performing panel (AUC = 0.93), as measured using 10-fold cross-validation, included 5 diverse biomarkers: EV-CK18 mRNA, EV-CD63 mRNA, EV-miR.409, ccfDNA concentration, and CA19-9 (Fig. 3A–C). Next, we addressed the question of whether we had included enough subjects to properly train our model by generating a learning curve (Fig. 3D). We found that the model's performance plateaued beyond 25 patients, indicating that our training set of 47 subjects was sufficient for the patient population in this study.
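To illustrate how an L1 (LASSO) penalty performs feature selection, the following is a minimal sketch of coordinate-descent LASSO on a toy two-marker data set. It is not the study's implementation: the solver, the squared-error loss, the penalty value, and the marker values are all assumptions made for illustration, and the cross-validation used in the study to choose the panel is omitted.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if |value| <= penalty."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Minimize (1/2n)*||y - Xw||^2 + lam*||w||_1 for centered columns (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                fit_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
            col_norm = sum(X[i][j] ** 2 for i in range(n))
            w[j] = soft_threshold(rho / n, lam) / (col_norm / n)
    return w


# Hypothetical centered data: marker_A separates the classes, marker_B does not.
marker_a = [-1.2, -0.8, -1.1, -0.9, 0.9, 1.1, 0.8, 1.2]
marker_b = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, -1.0, 1.0]
labels = [-1, -1, -1, -1, 1, 1, 1, 1]  # -1 = noncancer, +1 = PDAC (toy coding)

X = [list(row) for row in zip(marker_a, marker_b)]
names = ["marker_A", "marker_B"]

weights = lasso_coordinate_descent(X, labels, lam=0.1)
selected = [name for name, wt in zip(names, weights) if wt != 0.0]
print(dict(zip(names, weights)), selected)
```

The penalty drives the coefficient of the uninformative marker exactly to zero, so only the informative marker remains in the selected set. In practice the penalty strength would be tuned by cross-validation, as described in the text.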

Figure 3.

Applying the biomarker panel to distinguish PDAC from noncancer. A, A summary of the patient cohort used to train our platform to classify PDAC versus non-PDAC. B, We selected the panel using LASSO. The best performing panel was selected on the basis of its AUC using 10-fold cross-validation within the training set repeated 5 times. Error bars, SE. C, The resulting PDAC versus non-PDAC (PDAC-NC) panel consists of 5 biomarkers. D, A learning curve generated by bootstrapping 10 times within the training set. Error bars, SE. E, A summary of the independent patient cohort used to evaluate the classification of PDAC-NC in a blinded study. F, The confusion matrix on the blinded test set showing that 75 of 79 noncancer samples (95%) and 50 of 57 PDAC samples (88%) were correctly identified. NPV, negative predictive value; PPV, positive predictive value; TNR, true negative rate; TPR, true positive rate. G, Receiver operating characteristic (ROC) curve comparison between the PDAC-NC panel and the best individual biomarker CA19-9, plus a control experiment of unselected biomarkers, where the training set was used to generate a model without using feature selection. H, Comparison of accuracy of our PDAC-NC panel and the best individual biomarkers, plus the same control experiments described above. Error bars are SE from bootstrapping 10 times.

To further evaluate our approach, we applied our 5-marker panel to an independent blinded test set of 136 subjects (Fig. 3E) and achieved an accuracy of A = 92% (95% CI, 86%–96%), with sensitivity of 88% (95% CI, 76%–95%) and specificity of 95% (95% CI, 88%–99%; Fig. 3F), and an AUC of 0.95 (Fig. 3G). Compared with CA19-9 alone, our panel had higher accuracy, although the added benefit did not reach statistical significance (P = 0.103, McNemar test; Fig. 3H). To confirm that this performance is specific to the set of biomarkers that we had selected, we compared results with a control experiment in which we randomly chose sets of 5 biomarkers (AUC = 0.62). Our model's performance was significantly better than that obtained with randomly selected features (P < 0.01, McNemar test). Taken together, these results suggest that a multianalyte panel can accurately detect PDAC.
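The reported accuracy, sensitivity, and specificity follow directly from the confusion-matrix counts in Fig. 3F (75 of 79 noncancer and 50 of 57 PDAC samples correctly identified). The sketch below recomputes them and adds an exact binomial confidence interval. The interval method is an assumption: this section does not state how the 95% CIs were obtained, and the exact (Clopper-Pearson) interval is used here only as one common choice.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_root(f, lo=0.0, hi=1.0, steps=100):
    """Root of a function that is positive at lo and decreasing on [lo, hi]."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    lower = 0.0 if k == 0 else bisect_root(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    upper = 1.0 if k == n else bisect_root(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


# Counts from the blinded test set (Fig. 3F).
tn, fp = 75, 4  # noncancer: 75 of 79 correct
tp, fn = 50, 7  # PDAC: 50 of 57 correct

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)

sens_lo, sens_hi = clopper_pearson(tp, tp + fn)
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} "
      f"specificity={specificity:.3f} PPV={ppv:.3f} NPV={npv:.3f}")
print(f"sensitivity 95% CI (exact): {sens_lo:.3f} to {sens_hi:.3f}")
```

Rounded to whole percentages, the point estimates reproduce the values in the text (92%, 88%, and 95%).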

Distinguishing metastatic from nonmetastatic PDAC

Imaging is a widely used but imperfect technique for detecting metastases and determining whether a PDAC patient's disease is sufficiently localized for consideration of curative-intent surgery. We hypothesized that we could train our model to identify a biomarker panel that, in conjunction with imaging, could better stage patients with PDAC by distinguishing metastatic from nonmetastatic disease. To train the model, we selected 20 patients with PDAC originally staged by imaging, which included 9 resectable patients with no detectable metastasis (M0), and 11 patients with metastasis (M1; Fig. 4A). Because some patients originally identified as M0 may have had occult metastases below the level of imaging detection, we conducted chart review and retrospectively restratified the M0 patients into two groups: (i) M0s: those with no evidence of metastatic disease intraoperatively or within 4 months of follow-up and (ii) Occult metastases: those who had metastases detected intraoperatively or had metastatic recurrence within 4 months of blood draw. We performed a sensitivity analysis of time-to-distant failure among our patient cohort (Supplementary Fig. S5) to select the cutoff of 4 months, a time that is far shorter than the median recurrence-free, relapse-free, or metastasis-free survivals reported in both experimental and control arms in large randomized trials (35–37). This stratification resulted in the training set of 8 M0 and 12 M1 (11 with imaging-confirmed metastases and one with occult metastases; Fig. 4A). Using LASSO, a biomarker panel of 4 markers, including EV-miR.1299, EV-GAPDH, circulating mutant KRAS allele fraction, and CA19-9 was selected as having the highest accuracy (A = 91%; Fig. 4B and C). A learning curve using 8-fold cross-validation showed that the curve plateaued by 15 subjects, indicating that the 20 subjects in our training set were sufficient for this study (Fig. 4D).
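The restratification rule described above can be written out as a small sketch. The class and field names are hypothetical, and the 3-month timing assigned to the single occult-metastasis patient is a placeholder (the text says only that metastatic outgrowth occurred less than 4 months from blood draw). The strict "less than 4 months" comparison follows the wording of the Figure 4 legend.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

OCCULT_WINDOW_MONTHS = 4  # cutoff stated in the text


@dataclass
class Patient:
    imaging_stage: str                                    # "M0" or "M1" at baseline imaging
    intraop_metastasis: bool = False                      # metastasis found at surgery
    months_to_distant_metastasis: Optional[float] = None  # from blood draw; None = none observed


def restratify(patient):
    """Return 'M1_imaging', 'M1_occult', or 'M0' following the chart-review rule."""
    if patient.imaging_stage == "M1":
        return "M1_imaging"
    if patient.intraop_metastasis:
        return "M1_occult"
    months = patient.months_to_distant_metastasis
    if months is not None and months < OCCULT_WINDOW_MONTHS:
        return "M1_occult"
    return "M0"


# Training cohort rebuilt from the counts in the text: 11 M1 by imaging and
# 9 M0 by imaging, one of whom had metastatic outgrowth within the window.
cohort = (
    [Patient("M1") for _ in range(11)]
    + [Patient("M0", months_to_distant_metastasis=3.0)]  # hypothetical timing
    + [Patient("M0") for _ in range(8)]
)

groups = Counter(restratify(p) for p in cohort)
n_m1 = groups["M1_imaging"] + groups["M1_occult"]
print(dict(groups), "M0:", groups["M0"], "M1:", n_m1)
```

Applied to these counts, the rule yields the 8 M0 and 12 M1 training labels reported in the text.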

Figure 4.

Retraining the model to distinguish metastatic from nonmetastatic PDAC. A, Patient cohort used to train our platform to classify occult or imaging-confirmed metastatic patients from nonmetastatic patients with PDAC. Dotted line indicates one patient with PDAC who was originally determined by imaging to be M0 but was subsequently determined to have harbored occult metastases due to metastatic outgrowth less than 4 months from blood draw, hence was considered as occult metastases. B, We selected the panel using LASSO. The best performing panel was selected on the basis of its AUC using 8-fold cross-validation within the training set and repeated 10 times. The inset shows the comparison of the accuracy between our panel (red) and the clinical diagnosis (gray). Error bars are SE from bootstrapping 10 repeats. C, The panel for metastatic PDAC detection consists of 4 biomarkers. D, Learning curve of metastatic PDAC detection generated by bootstrapping N = 10 times within the training set. Error bars, SE. E, Proposed clinical workflow to combine liquid biopsy with imaging for a test set of 37 patients with PDAC, including 9 patients who were determined to have a time to metastases of <4 months. Baseline imaging was used to classify patients as either metastatic (M1; N = 12, top arm) or no detectable metastases (M0imaging; N = 25, bottom arm). For the 25 M0imaging patients, the liquid biopsy panel was then performed, resulting in two patient classifications, those called by the model as M1 (occult metastases, top arm) or those called as M0 (M0LB, bottom arm). LB, liquid biopsy; ML, machine learning. F, Shown are the confusion matrices for the 25 M0imaging patients with PDAC by imaging alone (bottom) and our method combining liquid biopsy with imaging (top). Our panel achieved accuracy = 84% with 78% sensitivity and 88% specificity. NPV, negative predictive value; PPV, positive predictive value; TNR, true negative rate; TPR, true positive rate. G, Receiver operating characteristic (ROC) curve analysis on N = 25 M0imaging patients with PDAC in the blinded test set. Inset shows the accuracy comparison between imaging only (gray, accuracy = 64%), control experiment using unselected biomarkers (yellow, accuracy = 48%), and liquid biopsy (red, accuracy = 84%) panel. Error bars are SE from bootstrapping 10 repeats.

To further evaluate our panel's ability to identify occult metastatic disease, we applied our approach to an independent blinded test set of 37 subjects with PDAC as part of a clinical workflow starting with standard-of-care diagnostic imaging and followed by liquid biopsy (Fig. 4E). Twelve of 37 patients were identified by imaging alone as having metastases, were classified as M1, and had no further evaluation. The remaining 25 patients were determined by baseline imaging to be resectable, with no detectable metastases (M0imaging). Upon retrospective chart review, 16 of 25 had no evidence of metastases within 4 months. Nine of 25 patients were determined to have had occult metastases, including 4 whose surgery was aborted due to intraoperative detection of metastatic disease and another 5 who completed surgery but had distant metastases detected on imaging within 4 months of their baseline blood draw. Our liquid biopsy workflow correctly identified 7 of 9 patients as having occult metastatic disease, and 14 of 16 patients as being metastasis-free (Fig. 4E). Thus, by comparing the liquid biopsy prediction with the true state of the patients, we found that our test detected distant metastasis with an accuracy of A = 84% (95% CI, 64%–95%), sensitivity of 78% (95% CI, 40%–97%), specificity of 88% (95% CI, 62%–98%), and AUC = 0.85, which compares favorably with the accuracy of imaging alone (A = 64%; P < 0.05, McNemar test) among the 25 patients originally identified as M0 by imaging (Fig. 4G). We also ran a control experiment to confirm that the performance is specific to the biomarkers identified from our training set. In the control experiment, we randomly selected biomarkers, and the resulting model had an AUC of 0.53 and an accuracy of 48% for metastatic disease detection. Our model's performance was also significantly better than that of the control experiment (P < 0.01, McNemar test). Taken together, these results suggest that a multianalyte panel outperforms conventional imaging for metastatic disease detection.
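The two-step workflow (imaging first, then liquid biopsy for patients without detectable metastases) and the resulting accuracies can be tallied with a short sketch. The per-patient list is reconstructed from the counts reported above; the function and variable names are hypothetical.

```python
def staged_call(imaging_call, liquid_biopsy_call=None):
    """Imaging-positive patients are M1; otherwise the liquid biopsy call decides."""
    if imaging_call == "M1":
        return "M1"
    return liquid_biopsy_call


# (true state, liquid biopsy call) for the 25 patients staged M0 by imaging,
# rebuilt from the reported counts: 7 of 9 occult metastases detected and
# 14 of 16 metastasis-free patients correctly called.
m0_imaging_cohort = (
    [("M1", "M1")] * 7
    + [("M1", "M0")] * 2
    + [("M0", "M0")] * 14
    + [("M0", "M1")] * 2
)
n = len(m0_imaging_cohort)

# Imaging alone calls every one of these patients M0.
imaging_accuracy = sum(truth == "M0" for truth, _ in m0_imaging_cohort) / n

combined_calls = [(truth, staged_call("M0", lb)) for truth, lb in m0_imaging_cohort]
combined_accuracy = sum(truth == call for truth, call in combined_calls) / n
sensitivity = (sum(t == "M1" and c == "M1" for t, c in combined_calls)
               / sum(t == "M1" for t, _ in combined_calls))
specificity = (sum(t == "M0" and c == "M0" for t, c in combined_calls)
               / sum(t == "M0" for t, _ in combined_calls))

print(f"imaging alone: {imaging_accuracy:.2f}")
print(f"imaging + liquid biopsy: {combined_accuracy:.2f} "
      f"(sensitivity {sensitivity:.2f}, specificity {specificity:.2f})")
```

Imaging alone is correct only for the 16 of 25 patients who were truly metastasis-free (64%), whereas the combined workflow is correct for 21 of 25 (84%).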

In this study, we applied a multianalyte liquid biopsy approach to clinical baseline blood samples obtained from patients with PDAC of all stages, as well as healthy and disease controls. We demonstrate that this platform can accurately identify patients with PDAC (A = 92%) and, for patients with pathologically confirmed PDAC, improve the detection of occult metastases that are not initially detected by standard-of-care imaging but are found intraoperatively or shortly after surgery (A = 84%). Surgical resection remains the only curative therapy for PDAC (3), but is limited to patients without detectable metastases. At the time of diagnosis, only about 15%–20% of patients with PDAC will be deemed candidates for surgical resection based on imaging and clinical status (1, 3). Even in this subgroup, intraoperative detection of metastases, prompting the surgery to be aborted, or rapid emergence of distant metastases within months of surgery can still occur (1, 3, 38–40). Patients with recurrent disease demonstrate survival similar to that of patients with de novo metastatic disease (41), which calls into question the benefit of surgery in that setting. This yields two important clinical problems that our approach addresses: (i) detecting disease at an early enough stage for surgery to be feasible, and (ii) once PDAC is diagnosed, accurately determining which patients would or would not benefit from surgery.

Our work differs most significantly from previous work in the following respects: (i) it combines a diverse set of noninvasive markers; (ii) our panel can not only diagnose PDAC but also improve staging accuracy; and (iii) it uses machine-learning approaches that are resilient against overfitting and can continue to be trained and improved in future studies. To construct our multianalyte panel, we combined the marker CA19-9, which is routinely ordered as a clinical blood test for patients with PDAC, with existing liquid biopsy approaches for measuring ccfDNA concentration (12, 26), ccfDNA allele fraction of mutant KRAS (14, 29), and mRNA and miRNA isolated from tumor-associated EVs. We and others have shown that the mRNA and miRNA cargo of tumor-derived EVs can be readily detected in preclinical and clinical samples (39). In the present work, we additionally demonstrate that EV transcriptional profiling provides orthogonal diagnostic information, which provides the rationale for adding EV-based measures to protein- and DNA-based markers.

In our work, and in other studies, multianalyte panels have demonstrated several advantages compared with single markers (17, 19). Individual EV biomarkers have previously demonstrated promising results for PDAC (39, 40, 42, 43), but faced challenges when applied to patient cohorts in different institutions (44). Melo and colleagues reported that GPC1+ exosomes were informative for distinguishing patients with PDAC from healthy and disease controls with an AUC = 1 (40). However, independent studies reported markedly different performance of GPC1+ EVs for PDAC diagnosis (42, 44). CTCs have shown promise (10, 11, 45), but detecting CTCs in early-stage PDAC is challenging because of their low concentration. Recent publications have also shown a benefit of combining multiple biomarkers for PDAC diagnosis; however, biomarkers in most publications tend to come from a single category, for example, from EV cargo nucleic acids including miRNAs (46, 47), mRNAs (18), DNAs (48), or from EV surface protein profiling (16). Few studies have combined biomarkers from different categories: Cohen and colleagues combined CA19-9 with circulating tumor DNA and plasma proteins (20), and Madhavan and colleagues combined EV cargo proteins and miRNAs (23), but both focused on PDAC diagnosis only. Assays that identify signatures across multiple biomarkers have the potential to be more robust for diverse patient populations and are less dependent on any single reagent than single-marker assays. An additional concern could be the complexity of conducting multiple tests. To address this, we have used widely available commercial platforms for ccfDNA analysis. Potential drawbacks to a multianalyte panel could be the requirement for multiple blood draws, a large blood volume, or multiple collection types. However, our panel can be performed using only 3 mL of appropriately processed, EDTA-preserved plasma, less than the typical yield from a standard 10 mL blood collection tube. While the magnetic nanofluidic-based approach used to isolate tumor-derived EVs is not yet commercially available, it is high-throughput, robust, and inexpensive to manufacture and thus well suited to eventual clinical adaptation.

Here, we demonstrate proof-of-concept for a liquid biopsy–based multianalyte panel, using a baseline blood sample. However, our study has several limitations that offer opportunities for future work. For the occult metastasis cohort, we included those who had metastases detected intraoperatively or had recurrence within 4 months of baseline blood draw, a cutoff determined by a sensitivity analysis of time-to-distant failure (Supplementary Fig. S5). While 4 months is far shorter than the median recurrence-free, relapse-free, or metastasis-free survivals reported in experimental and control arms of large randomized trials (35–37), this cutoff should be reexamined in a larger cohort. In addition, the biomarkers in our panel are largely tumor-derived, and the model would likely benefit from the addition of tumor-extrinsic factors, including the immune compartment. Work is underway to address these limitations in a larger cohort of patients with PDAC that includes an external validation cohort, and to extend the model's utility to detection and staging of other solid tumors.

M.H. O'Hara is an employee/paid consultant for Natera. B.W. Katona is an employee/paid consultant for Exact Sciences. D. Issadore holds ownership interest (including patents) in Chip Diagnostics. No potential conflicts of interest were disclosed by the other authors.

Conception and design: Z. Yang, M.J. LaRiviere, J. Ko, J.E. Till, B.Z. Stanger, D. Issadore, E.L. Carpenter

Development of methodology: Z. Yang, M.J. LaRiviere, J. Ko, J.E. Till, N. Bhagwat, D. Issadore, E.L. Carpenter

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): Z. Yang, M.J. LaRiviere, J.E. Till, T. Christensen, S.S. Yee, T.A. Black, K. Tien, A. Lin, D. Herman, M.H. O'Hara, C.M. Vollmer, B.W. Katona, E.L. Carpenter

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Z. Yang, J.E. Till, A. Lin, C.M. Vollmer, B.Z. Stanger, D. Issadore, E.L. Carpenter

Writing, review, and/or revision of the manuscript: Z. Yang, M.J. LaRiviere, J. Ko, J.E. Till, T. Christensen, S.S. Yee, T.A. Black, K. Tien, A. Lin, H. Shen, D. Herman, M.H. O'Hara, C.M. Vollmer, B.W. Katona, B.Z. Stanger, D. Issadore, E.L. Carpenter

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Z. Yang, J.E. Till, T. Christensen, S.S. Yee, T.A. Black, K. Tien, H. Shen, A. Adallah, D. Issadore

Study supervision: Z. Yang, M.H. O'Hara, D. Issadore, E.L. Carpenter

D. Issadore was supported by the Pennsylvania Department of Health Commonwealth Universal Research Enhancement Program, the National Institutes of Health (R21MH118170), the American Cancer Society - CEOs against Cancer - CA Division Research Scholar Grant (RSG-15-227-01-CSM), and the Congressionally Directed Medical Research Programs (CDMRP; W81XWH-19-2-0002). E.L. Carpenter was supported by the Pancreatic Cancer Action Network Translational Research Award, the University of Pennsylvania Pancreatic Cancer Research Center, and the Abramson Cancer Center Pancreatic Translational Center of Excellence. We gratefully acknowledge the patients, their families, and the clinicians who provided their care. We would also like to thank Janae Romeo, Colleen Redlinger, and the clinical research coordinator team.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. American Cancer Society. Key statistics for pancreatic cancer; 2019. Available from: https://www.cancer.org/cancer/pancreatic-cancer/about/key-statistics.html.
2. Hidalgo M. Pancreatic cancer. N Engl J Med 2010;362:1605–17.
3. Ryan DP, Hong TS, Bardeesy N. Pancreatic adenocarcinoma. N Engl J Med 2014;371:1039–49.
4. Wolff RA, Varadhachary GR, Evans DB. Adjuvant therapy for adenocarcinoma of the pancreas: analysis of reported trials and recommendations for future progress. Ann Surg Oncol 2008;15:2773.
5. Wolff RA. Adjuvant or neoadjuvant therapy in the treatment in pancreatic malignancies: where are we. Surg Clin 2018;98:95–111.
6. Crowley E, Di Nicolantonio F, Loupakis F, Bardelli A. Liquid biopsy: monitoring cancer-genetics in the blood. Nat Rev Clin Oncol 2013;10:472.
7. Bettegowda C, Sausen M, Leary RJ, Kinde I, Wang Y, Agrawal N, et al. Detection of circulating tumor DNA in early- and late-stage human malignancies. Sci Transl Med 2014;6:224ra24.
8. Li J, Zhu J, Hassan MM, Evans DB, Abbruzzese JL, Li D. K-ras mutation and p16 and preproenkephalin promoter hypermethylation in plasma DNA of pancreatic cancer patients: in relation to cigarette smoking. Pancreas 2007;34:55–62.
9. Bergquist JR, Puig CA, Shubert CR, Groeschl RT, Habermann EB, Kendrick ML, et al. Carbohydrate antigen 19-9 elevation in anatomically resectable, early stage pancreatic cancer is independently associated with decreased overall survival and an indication for neoadjuvant therapy: a national cancer database study. J Am Coll Surg 2016;223:52–65.
10. Effenberger KE, Schroeder C, Hanssen A, Wolter S, Eulenburg C, Tachezy M, et al. Improved risk stratification by circulating tumor cell counts in pancreatic cancer. Clin Cancer Res 2018;24:2844–50.
11. Okubo K, Uenosono Y, Arigami T, Mataki Y, Matsushita D, Yanagita S, et al. Clinical impact of circulating tumor cells and therapy response in pancreatic cancer. Eur J Surg Oncol 2017;43:1050–5.
12. Benesova L, Belsanova B, Suchanek S, Kopeckova M, Minarikova P, Lipska L, et al. Mutation-based detection and monitoring of cell-free tumor DNA in peripheral blood of cancer patients. Anal Biochem 2013;433:227–34.
13. Da Silva Filho BF, Gurgel AP, Neto, de Azevedo DA, de Freitas AC, Silva Neto Jda C, et al. Circulating cell-free DNA in serum as a biomarker of colorectal cancer. J Clin Pathol 2013;66:775–8.
14. Thierry AR, Mouliere F, El Messaoudi S, Mollevi C, Lopez-Crapez E, Rolet F, et al. Clinical validation of the detection of KRAS and BRAF mutations from circulating tumor DNA. Nat Med 2014;20:430–5.
15. Kim J, Bamlet WR, Oberg AL, Chaffee KG, Donahue G, Cao XJ, et al. Detection of early pancreatic ductal adenocarcinoma with thrombospondin-2 & CA19-9 blood markers. Sci Transl Med 2017;9:pii: eaah5583.
16. Yang KS, Im H, Hong S, Pergolini I, Del Castillo AF, Wang R, et al. Multiparametric plasma EV profiling facilitates diagnosis of pancreatic malignancy. Sci Transl Med 2017;9:pii: eaal3226.
17. Kinde I, Wu J, Papadopoulos N, Kinzler KW, Vogelstein B. Detection and quantification of rare mutations with massively parallel sequencing. Proc Natl Acad Sci U S A 2011;108:9530–5.
18. Ko J, Bhagwat N, Yee SS, Ortiz N, Sahmoud A, Black T, et al. Combining machine learning and nanofluidic technology to diagnose pancreatic cancer using exosomes. ACS Nano 2017;11:11182–93.
19. Cohen JD, Li L, Wang Y, Thoburn C, Afsari B, Danilova L, et al. Detection and localization of surgically resectable cancers with a multi-analyte blood test. Science 2018;359:926–30.
20. Cohen JD, Javed AA, Thoburn C, Wong F, Tie J, Gibbs P, et al. Combined circulating tumor DNA and protein biomarker-based liquid biopsy for the earlier detection of pancreatic cancers. Proc Natl Acad Sci U S A 2017;114:10202–7.
21. Laurikkala J, Juhola M, Kentala E, Lavrac N, Miksch S, Kavsek B. Informal identification of outliers in medical data. Fifth Int Work Intell data Anal Med Pharmacol 2000;20–4.
22. McShane LM, Altman DG, Sauerbrei W, Taube SE, Gion M, Clark GM. REporting recommendations for tumor MARKer prognostic studies (REMARK). Breast Cancer Res Treat 2006;100:229–35.
23. Madhavan B, Yue S, Galli U, Rana S, Gross W, Müller M, et al. Combined evaluation of a panel of protein and miRNA serum-exosome biomarkers for pancreatic cancer diagnosis increases sensitivity and specificity. Int J Cancer 2015;136:2616–27.
24. Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014;15:550.
25. Zhu Q, Fisher SA, Shallcross J, Kim J. VERSE: a versatile and efficient RNA-Seq read counting tool. Preprint from bioRxiv, 2016.
26. Fawzy A, Sweify KM, El-Fayoumy HM, Nofal N. Quantitative analysis of plasma cell-free DNA and its DNA integrity in patients with metastatic prostate cancer using ALU sequence. J Egypt Natl Canc Inst 2016;28:235–42.
27. Statnikov A, Aliferis CF, Tsamardinos I, Hardin D, Levy S. A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis. Bioinformatics 2004;21:631–43.
28. Fagerland MW, Lydersen S, Laake P. The McNemar test for binary matched-pairs data: Mid-p and asymptotic are better than exact conditional. BMC Med Res Methodol 2013;13:91.
29. Waters AM, Der CJ. KRAS: the critical driver and therapeutic target for pancreatic cancer. Cold Spring Harb Perspect Med 2018;8:pii: a031435.
30. Ritts RE Jr, del Villano BC, Go VLW, Herberman RB, Klug TL, Zurawski VR. Initial clinical evaluation of an immunoradiometric assay for CA 19-9 using the NCI serum bank. Int J Cancer 1984;33:339–45.
31. Farini R, Fabris C, Bonvicini P, Piccoli A, Del Favero G, Venturini R, et al. CA 19-9 in the differential diagnosis between pancreatic cancer and chronic pancreatitis. Eur J Cancer Clin Oncol 1985;21:429–32.
32. Safi F, Roscher R, Beger HG. The clinical relevance of the tumor marker CA 19-9 in the diagnosing and monitoring of pancreatic carcinoma. Bull Cancer 1990;77:83–91.
33. Drakaki A, Iliopoulos D. MicroRNA-gene signaling pathways in pancreatic cancer. Biomed J 2013;36:200–8.
34. Bloomston M, Frankel WL, Petrocca F, Volinia S, Alder H, Hagan JP, et al. MicroRNA expression patterns to differentiate pancreatic adenocarcinoma from normal pancreas and chronic pancreatitis. JAMA 2007;297:1901–8.
35. Neoptolemos JP, Stocken DD, Friess H, Bassi C, Dunn JA, Hickey H, et al. A randomized trial of chemoradiotherapy and chemotherapy after resection of pancreatic cancer. N Engl J Med 2004;350:1200–10.
36. Neoptolemos JP, Moore MJ, Cox TF, Valle JW, Palmer DH, McDonald AC, et al. Effect of adjuvant chemotherapy with fluorouracil plus folinic acid or gemcitabine vs. observation on survival in patients with resected periampullary adenocarcinoma: the ESPAC-3 periampullary cancer randomized trial. JAMA 2012;308:147–56.
37. Conroy T, Hammel P, Hebbar M, Ben Abdelghani M, Wei AC, Raoul J-L, et al. FOLFIRINOX or gemcitabine as adjuvant therapy for pancreatic cancer. N Engl J Med 2018;379:2395–406.
38. Sefrioui D, Blanchard F, Toure E, Basile P, Beaussire L, Dolfus C, et al. Diagnostic value of CA19.9, circulating tumour DNA and circulating tumour cells in patients with solid pancreatic tumours. Br J Cancer 2017;117:1017.
39. Ko J, Carpenter E, Issadore D. Detection and isolation of circulating exosomes and microvesicles for cancer monitoring and diagnostics using micro-/nano-based devices. Analyst 2016;141:450–60.
40. Melo SA, Luecke LB, Kahlert C, Fernandez AF, Gammon ST, Kaye J, et al. Glypican-1 identifies cancer exosomes and detects early pancreatic cancer. Nature 2015;523:177–82.
41. Gbolahan OB, Tong Y, Sehdev A, O'Neil B, Shahda S. Overall survival of patients with recurrent pancreatic cancer treated with systemic therapy: a retrospective study. BMC Cancer 2019;19:1–9.
42. Li TD, Zhang R, Chen H, Huang ZP, Ye X, Wang H, et al. An ultrasensitive polydopamine bi-functionalized SERS immunoassay for exosome-based diagnosis and classification of pancreatic cancer. Chem Sci 2018;9:5372–82.
43. Liang K, Liu F, Fan J, Sun D, Liu C, Lyon CJ, et al. Nanoplasmonic quantification of tumour-derived extracellular vesicles in plasma microsamples for diagnosis and treatment monitoring. Nat Biomed Eng 2017;1:pii: 0021.
44. Lucien F, Lac V, Billadeau DD, Borgida A, Gallinger S, Leong HS. Glypican-1 and glycoprotein 2 bearing extracellular vesicles do not discern pancreatic cancer from benign pancreatic diseases. Oncotarget 2019;10:1045.
45. Poruk KE, Valero V, Saunders T, Blackford AL, Griffin JF, Poling J, et al. Circulating tumor cell phenotype predicts recurrence and survival in pancreatic adenocarcinoma. Ann Surg 2016;264:1073–81.
46. Lai X, Wang M, McElyea SD, Sherman S, House M, Korc M. A microRNA signature in circulating exosomes is superior to exosomal glypican-1 levels for diagnosing pancreatic cancer. Cancer Lett 2017;393:86–93.
47. Goto T, Fujiya M, Konishi H, Sasajima J, Fujibayashi S, Hayashi A, et al. An elevated expression of serum exosomal microRNA-191, -21, -451a of pancreatic neoplasm is considered to be efficient diagnostic marker. BMC Cancer 2018;18:116.
48. Yang S, Che SPY, Kurywchak P, Tavormina JL, Gansmo LB, Correa de Sampaio P, et al. Detection of mutant KRAS and TP53 DNA in circulating exosomes from healthy individuals and patients with pancreatic cancer. Cancer Biol Ther 2017;18:158–65.

Supplementary data