Abstract
The primary origin of neuroendocrine tumor metastases can be difficult to determine by histopathology alone, but is critical for therapeutic decision making. DNA methylation–based profiling is now routinely used in the diagnostic workup of brain tumors. This has been enabled by the availability of cost-efficient array-based platforms. We have extended these efforts to augment histopathologic diagnosis in neuroendocrine tumors.
Methylation data was compiled for 69 small intestinal, pulmonary, and pancreatic neuroendocrine tumors. These data were used to build a ridge regression calibrated random forest classification algorithm (neuroendocrine neoplasm identifier, NEN-ID). The model was validated during 3 × 3 nested cross-validation and tested in a local and an external cohort (n = 198 cases).
NEN-ID predicted the origin of tumor samples with high accuracy (>95%). In addition, the diagnostic approach was determined to be robust across a range of possible confounding experimental parameters, such as tumor purity and array quality. A software infrastructure and online user interface were built to make the model available to the scientific community.
This DNA methylation–based prediction model can be used in the workup for patients with neuroendocrine tumors of unknown primary. To facilitate validation and clinical implementation, we provide a user-friendly, publicly available web-based version of NEN-ID.
One of 9 patients with neuroendocrine tumors presents with metastases of unknown primary. The value of IHC panels to determine neuroendocrine tumor origin in these cases is limited. In this article, we investigate the use of DNA methylation as a biomarker of origin, an approach which has been successfully applied in other cancer types. Using widely available array-based methylation profiling, we provide evidence that our one-stop test is superior to traditional IHC techniques in determining small intestinal, pulmonary, or pancreatic neuroendocrine tumor origin. We systematically validated our results through cross-validation, both in local and external test cohorts. The classifier presented here (neuroendocrine neoplasm identifier) has already been successfully applied in the diagnostic workup of patients with neuroendocrine tumors of unknown primary and is available as an open-source web application. This platform offers full analysis, classification, and clinical report generation for neuroendocrine tumor methylation data, facilitating implementation and validation in clinical practice.
Introduction
Neuroendocrine tumors have an incidence of about 6 per 100,000 per year and regularly present with metastases at diagnosis (1). In about 1 in 9 patients with a neuroendocrine tumor, the primary tumor is unknown (2). As lymphoid, hepatic, and pulmonary metastases can originate from any organ site in the body, the diagnostic workup can be challenging. Furthermore, discerning pulmonary metastases from primary pulmonary neuroendocrine tumors is difficult, as these can appear morphologically and immunohistochemically similar. Also, for patients with hereditary neuroendocrine tumor syndromes, who are at risk of developing synchronous primary neuroendocrine tumors, the primary origin of metastases is often unclear. It is crucial to know the primary origin because it determines prognosis, relevance of prognostic markers, treatment options, and eligibility for clinical trials (1, 3, 4). Guidelines recommend that IHC panels should be performed on the metastatic tumor tissue, which can give clues about the primary site of origin, and guide or confirm imaging results and additional diagnostic techniques (5). However, diagnostic sensitivity and specificity vary greatly for the IHC panels, and the primary tumor is often not visible on imaging. A reliable one-stop test to accurately determine primary origin is needed.
DNA methylation is important in regulating gene expression and the distribution of DNA methylation across the genome is highly tissue and tumor specific (6). Previously, whole-genome methylation profiling has successfully been applied to classify brain tumors and identify origin of carcinomas of unknown primary (7–9). This method has not yet been explored for neuroendocrine tumors. Compared with the traditional IHC panels, genome methylation–based prediction models take advantage of combining many thousands of measurements at once and lack interobserver bias, which improves the accuracy. To test the feasibility of this approach for neuroendocrine tumors, we developed a methylation profiling–based prediction model and determined the accuracy in predicting the most common primary sites of origin for neuroendocrine tumor metastases, which are the small intestine, lung, and pancreas. The Illumina methylation platforms (450k and EPIC) were chosen to assess whole-genome methylation, as these arrays are widely available and have reproducible results (8). To make validation and clinical implementation attainable, we subsequently developed the neuroendocrine neoplasm identifier (NEN-ID) web application to make our analysis pipeline available to the research and clinical neuroendocrine tumor community via an online interface.
Materials and Methods
Methylation cohorts
The University Medical Center Utrecht (UMCU) Biobank Research Ethics Committee approved the use of archival material for this study. The methylation training cohort was compiled of publicly available methylation Illumina 450k data of small intestinal, pulmonary, and pancreatic neuroendocrine tumors [Gene Expression Omnibus (GEO) accession numbers GSE73832, GSE118133, and GSE117852, respectively; refs. 10–12; Supplementary Fig. S1]. For small intestinal neuroendocrine tumors, supplementary data and data deposited in the GEO repository were matched by age, sex, grade, tissue type, and origin. Only primary formalin-fixed, paraffin-embedded neuroendocrine tumor samples of certain ileal, terminal ileal, or ileocecal origin were selected (n = 19) to ensure a homogenous training population (10). Neuroendocrine tumor origins with <5 cases were excluded. All pancreatic neuroendocrine tumors (n = 32) and pulmonary (n = 18) were included in the further analysis (11, 12). As a “local test cohort,” the UMCU pathology archive was screened for surgical specimens of primary neuroendocrine tumors and neuroendocrine tumor metastases of small intestinal, pulmonary, pancreatic, and unknown primary origin. Cases of unknown primary were defined as unknown primary tumor location after routine clinical workup, including imaging. Additional small intestinal, pulmonary, and pancreatic neuroendocrine tumors were retrieved from GEO repository and the European Genome-phenome Archive (EGA) and combined to an “external test cohort” (accession numbers GSE73832, GSE53051, GSE134089, and EGAD00010001720; refs. 10, 13–15; Supplementary Fig. S1). For the simulated nontumor contamination, publicly available data of purified lymphocytes, normal pancreas, and lung were downloaded [GEO accession number GSE103541; ref. 16 and The Cancer Genome Atlas (TCGA)]. Methylation data of rare neuroendocrine tumors were retrieved from previous studies (GSE73832 and GSE134089) and data of non-neuroendocrine tumor and normal samples were retrieved from TCGA database.
Sample methylation array processing
For the local test cohort, consisting of cases from our own institution (n = 26), DNA extraction, bisulfite conversion, and array processing were performed in-house (core facility). Unstained sections (4 μm) were freshly cut for DNA extraction. Tumor tissue was macrodissected if normal tissue was present in the slide. DNA was extracted after proteinase K digestion using the Maxwell RSC 48 Automated Nucleic Acid Extraction Platform (Promega). Further steps were performed according to the protocols provided by the DNA Methylation BeadChip manufacturer (Illumina). Briefly, bisulfite conversion was performed using the EZ DNA Methylation Kit (Zymo Research). Formalin fixation–induced DNA damage was restored using the FFPE Restore Kit (Illumina). Whole-genome amplification on the bisulfite converted and restored DNA was performed, followed by fragmentation, precipitation, and hybridization to CpG Site–specific Beads (Illumina). After clearing unhybridized and nonspecifically hybridized DNA, BeadChips were extended, stained, coated, and scanned using the iScan Software (Illumina). Raw data were exported as IDAT files for further processing.
IHC
Unstained sections (4 μm) of the local cases were cleared and deparaffinized. Slides were processed using the BenchMark ULTRA Autostainer (Ventana Medical Systems) with 24 minutes cell conditioning one antigen retrieval at 100°C. Slides were incubated with CDX2 (Clone EPR2764Y, catalog no., ILM2353-C1, 1:200; Immunologic) and TTF1 (Clone SPT24, RRID: AB_442138, 1:60; Leica Biosystems) at 36°C for 32 minutes, and visualized with Optiview DAB (Ventana Medical Systems). Islet-1 IHC was performed manually. Briefly, peroxidase activity was blocked with 0.6% H2O2 in methanol. Slides were cooked in 10 mmol/L citrate (pH 6) buffer for 20 minutes. After applying Pierce Protein-Free Blocking Buffer (Thermo Fisher Scientific), slides were incubated with Islet-1 antibody (Clone EP283, catalog no., 431R-15, 1:400, Cell Marque) in Normal Antibody Diluent (Immunologic) at 20°C for 60 minutes. The signal was amplified using the BrightVision+ Poly- HRP-Anti Mouse/Rabbit IgG Biotin-free Kit (Immunologic), and detected with Bright-DAB (Immunologic) for 8 minutes. Slides were counterstained with hematoxylin. Scoring was performed in consensus by a neuroendocrine tumor researcher (W.M. Hackeng) and an experienced neuroendocrine tumor pathologist (L.A.A. Brosens).
Data analysis
All analyses were performed using R version 3.6.3 (17). The following sections provide a nonexhaustive overview of the most important analytic steps. The full code used to produce figures and results presented in the article is available for inspection on Github (whackeng/NEN-ID). Methylation data have been deposited at the EGA, which is hosted by the European Bioinformatics Institute (EBI) and the Centre for Genomic Regulation (CRG), under accession number EGAS00001004878.
Preprocessing
The ChAMP package was used to load and process raw data (18). Failed probes with a detection P value of more than 0.01 were removed. Probes were filtered if they fulfilled any of the following criteria: beadcount < 3 in at least 5% of samples, non-CpG targets, or location on the X/Y chromosomes. Furthermore, all SNP containing and multi-hit probes were removed. Ileal neuroendocrine tumors were not filtered for beadcount, as raw data were not available for these cases. Samples were normalized using the BMIQ method, as this approach was used for the preprocessed small intestinal neuroendocrine tumor samples (10). In addition, sample quality was assessed by visual inspection of the sample density plots and failed CpG fraction.
Probe selection and unsupervised analysis
Training datasets were randomly downsampled three times, stratified by tumor origin. For each downsampled dataset, probes were ordered by the mean absolute SD. The 5,000 most variable probes were selected by the minimal rank of the mean absolute SD over all downsampled datasets. Unsupervised hierarchical clustering was performed on the most informative probes using the Euclidean distance. The t-distributed stochastic neighbor embedding (t-SNE) visualization was made with the Rtsne package on the most variable probes of the entire training cohort (19).
Random forest classifier training and ridge logistic regression calibration
Data were filtered for CpG probes present in both the 450k array and the EPIC array. The 5,000 most variable probes selected during triplicate downsampling were used as input variables. The random forest model was built using the randomForest package with 500 binary classification trees and default parameters. A downsampling strategy was used to balance classes present in the training cohort. For each decision tree, Breiman random forest algorithm creates a randomly bootstrapped dataset with replacement (20). The class of each sample in the training set is predicted by the aggregated votes (bootstrap aggregation/bagging) of all random forest trees for which the specific sample was omitted from the bootstrapped training set [out-of-bag (OOB) votes], and for which the sample can be considered “unknown.” The internal OOB votes from the random forest model were used to optimize the lambda parameter through a 3-fold cross-validation using glmnet. The loss measured during cross-validation was deviance (default). A multinomial ridge penalized regression model was built using the optimized Lambda.1se parameter.
3 × 3 nested cross-validation
The entire training cohort was randomly split in 3-folds stratified for neuroendocrine tumor origin. During the outer loop 3-fold cross-validation, each fold is held out once to be used as test set, while the others are used as training set to fit a random forest model. Each fold is used once for testing. During the inner loop cross-validation, the lambda parameter is optimized on the OOB votes of the training fold. A ridge logistic regression model is fit on all the OOB votes using the optimized Lambda.1se parameter. The test fold held out of the outer cross-validation loop is then tested using the raw random forest model, after which raw scores are calibrated using the logistic ridge regression model. The whole process of randomly splitting the entire training cohort into 3-folds and 3 × 3 nested cross-validation is repeated three times. Output model metrics included overall majority vote– and probability-based model accuracy, multiclass AUC (21), and log loss, and are presented as descriptive summary statistics. Furthermore, raw and calibrated scores were collected and used to calculate density plots. To plot the output of this three-class problem, calibrated output of new cases was visualized using a ternary plot with ggtern (22). Tolerance ellipses were drawn on the calibrated OOB scores using the Mahalanobis distance and isometric log ratio transformation by implementation of DrawMahal (23) in ggtern.
Failed probe imputation
The effect of randomly imputing a percentage of probes on accuracy was determined from 5% to 95% random imputation with steps of 5% on the complete set of beta values (around 400k) in the local test cohort. Each time, for a specified percentage of probes, the beta values were replaced with a random beta value selected from other probes from the same case, thus following a normal beta distribution. Imputed cases were tested with the raw and calibrated random forest model. This process was repeated 30 times. Mean and SDs of accuracy, multiclass AUC, and log loss were plotted over incrementally increasing percentages of randomly imputed probes.
Simulated normal cell contamination
Normalized data of sorted lymphocytes, normal pancreas, and normal lung tissue were used to test the effect of nontumor contamination on the model accuracy. First, mean beta values were calculated for B cells, CD4 cells, CD8 cells, granulocytes and monocytes, normal pancreas, and normal lung. Next, a weighted average of the mean beta values of CD4 cells (*0.55), B cells (*0.35), and CD8 cells (*0.1) was calculated to simulate the average lymph node beta values. Percentages were based on the literature (24–27). To simulate contamination, a weighted average of mean normal cell population beta values was calculated with the beta values of each specific tumor. The percentage of the contamination determined the weight, which ranged from 5% to 95% with steps of 5%. The local test cohort was used to test the effect of normal cell contamination. Similar to the random probe imputation, model metrics were plotted over incrementally increasing contamination percentages.
Results
Neuroendocrine tumors of different origin show distinct methylomes
We assembled a cohort of 69 neuroendocrine tumors, including 19 small intestinal, 18 pulmonary, and 32 pancreatic neuroendocrine tumors, which had been assayed on the microarray-based Illumina HumanMethylation450 Platform (refs. 10–12; Table 1; Supplementary Fig. S1), further referred to as “entire training cohort.” Characteristics of the cohort can be found in Supplementary Table S1. In addition, we generated 850k EPIC methylation array data for 26 neuroendocrine tumor samples from 23 patients from our own pathology archive (4 small intestinal, 10 pulmonary, 9 pancreatic neuroendocrine tumors, and 3 neuroendocrine tumors of unknown primary) as a “local test cohort.” For pancreatic neuroendocrine tumor patients 5 and 6, primary tumors and matched metachronous liver metastasis (+3 years, patient 5) and two metachronous liver metastases (+3 and +6 years, patient 6) were included (Supplementary Data S1). Dimensionality reduction through t-SNE on the most variable probes of the entire training cohort showed clustering of samples in the entire training cohort and our local test cohort mostly corresponding with the three primary sites of origins (Fig. 1A). Few public data cases and one local sample clustered separately from the main origin clusters (Fig. 1A, unsupervised hierarchical clustering of the entire training cohort; Supplementary Fig. S2). These cases represent the “lung carcinoid type 2” (LC2) subtype of pulmonary neuroendocrine tumors, which carry MEN1 mutations as presented by Laddha and colleagues (12). As the local test cohort was from a single institution, and contained all three different sites of origin, the clustering with the anatomic locations in the entire training cohort suggests that the main source of variability in the methylation data is indeed origin of the tumor, rather than laboratory variance.
Samples used for training or testing.
. | Entire training cohort . | Local test cohort and external test cohort . | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
. | Primary ileal NET . | Primary pulmonary NET . | Primary pancreatic NET . | Pancreatic NET metastases . | Primary small intestinal NET . | Small intestinal NET metastases . | Primary pulmonary NET . | Pulmonary NET metastases . | Primary pancreatic NET . | Pancreatic NET metastases . | Metastases of unknown primary . |
Karpathakis and colleagues (10) | 19 | 16 | 21 | ||||||||
Laddha and colleagues (12) | 18 | ||||||||||
Chan and colleagues (11) | 23 | 9 | |||||||||
Local test cohort | 1 | 3 | 9 | 1 | 6 | 3 | 3 | ||||
Timp and colleagues (13) | 5 | ||||||||||
Tirosh and colleagues (14) | 22 | 50 | 2 | ||||||||
Alcala and colleagues (15) | 56 | ||||||||||
Subtotal | 19 | 18 | 32 | 63 | 66 | 64 | 5 | ||||
Total | 69 | 198 |
. | Entire training cohort . | Local test cohort and external test cohort . | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
. | Primary ileal NET . | Primary pulmonary NET . | Primary pancreatic NET . | Pancreatic NET metastases . | Primary small intestinal NET . | Small intestinal NET metastases . | Primary pulmonary NET . | Pulmonary NET metastases . | Primary pancreatic NET . | Pancreatic NET metastases . | Metastases of unknown primary . |
Karpathakis and colleagues (10) | 19 | 16 | 21 | ||||||||
Laddha and colleagues (12) | 18 | ||||||||||
Chan and colleagues (11) | 23 | 9 | |||||||||
Local test cohort | 1 | 3 | 9 | 1 | 6 | 3 | 3 | ||||
Timp and colleagues (13) | 5 | ||||||||||
Tirosh and colleagues (14) | 22 | 50 | 2 | ||||||||
Alcala and colleagues (15) | 56 | ||||||||||
Subtotal | 19 | 18 | 32 | 63 | 66 | 64 | 5 | ||||
Total | 69 | 198 |
Abbreviation: NET, neuroendocrine tumor.
Overview of performed analyses and local test cohort results. A, Unsupervised 2D t-SNE visualization of most variable probes in the entire training cohort, merged with the local test cohort. B, Workflow of the model validation, with a 3 × 3 nested cross-validation scheme with three repeats for random forest building and nested lambda optimization. Random forest and calibrated ridge regression output were tested on the internal test sets (black fold) and a test set of external cases (local test cohort). A number of model metrics, such as classification accuracy, AUC, and log loss were investigated. C, NEN-ID classifier scores in ternary plot for the cases in the local test cohort, resulting in perfect classification accuracy. Tolerance ellipses are drawn for the calibrated OOB scores for the model. D, The effect of erroneous data was tested in the local test cohort by random probe imputation for a percentage of probes. Similarly, the effect of normal cells (e.g., lymphocytes and normal lung) contaminating the tumor samples was simulated by mixing beta values of sorted or mixed normal cell populations with tumor samples.
Overview of performed analyses and local test cohort results. A, Unsupervised 2D t-SNE visualization of most variable probes in the entire training cohort, merged with the local test cohort. B, Workflow of the model validation, with a 3 × 3 nested cross-validation scheme with three repeats for random forest building and nested lambda optimization. Random forest and calibrated ridge regression output were tested on the internal test sets (black fold) and a test set of external cases (local test cohort). A number of model metrics, such as classification accuracy, AUC, and log loss were investigated. C, NEN-ID classifier scores in ternary plot for the cases in the local test cohort, resulting in perfect classification accuracy. Tolerance ellipses are drawn for the calibrated OOB scores for the model. D, The effect of erroneous data was tested in the local test cohort by random probe imputation for a percentage of probes. Similarly, the effect of normal cells (e.g., lymphocytes and normal lung) contaminating the tumor samples was simulated by mixing beta values of sorted or mixed normal cell populations with tumor samples.
Random forest models with calibration robustly predict neuroendocrine tumor origin
On the basis of the observations in our unsupervised analysis, we trained a random forest model to distinguish between the different origins. This machine learning technique builds an ensemble of decision trees and aggregates the class votes across the individual predictions into a final score. Data of the 5,000 most variable probes were used as input to train the model (Supplementary Data S2). Only probes present in both the Illumina 450k and EPIC array were used to make the model compatible for both platforms. On the basis of the OOB error, which is an internal random forest validation measure evaluated on held-out samples, the model showed a high accuracy of 97%. To improve classification accuracy and make better comparison of class scores possible, a ridge penalized logistic regression calibration model was fitted to the random forest scores (28). Logistic regression was used to improve class estimates by taking all random forest class scores into account to predict probability. The ridge penalty (lambda) was selected during cross-validation, to shrink the regression coefficients and prevent overfitting of the calibration model.
To test the generalizability of this model, a 3 × 3 nested cross-validation was performed with three repeats (Fig. 1B), as described previously (8, 28). Briefly, nine different randomly selected (sub)training and test cohorts were selected from the entire training cohort and used to build and test calibrated random forest models. In addition, our local test cohort (n = 23 cases, with known origin) was included as separate test set for every training partition during cross-validation. Each different training fold, therefore, consisted of approximately 46 cases (on average by random selection), while each combined test fold/local test cohort consisted of around 45 cases. The mean OOB accuracy during cross-validation was 95%. Median cross-validation lambda penalty for the ridge logistic regression was 0.087 (range, 0.064–0.113). Within the test folds and local test cohort during cross-validation, the mean predictive accuracy of the corresponding calibrated random forest models was 96.1% and 99.5%, respectively. Mean multiclass AUC ranged between 0.9924 and 1 (Table 2). Ridge logistic regression calibration improved accuracy in the local test cohort. Also, it caused a decrease in log loss values, meaning the correct class scores rose and incorrect class scores dropped. We concluded by the near perfect performance of these predictive models in 3 × 3 cross-validation that a random forest model built on this data is likely to perform well on unseen data.
Model metrics of 3 × 3 nested internal and external cross-validation.
Samples . | Model . | Mean accuracy (range) . | Mean AUC (range) . | Mean log loss (range) . |
---|---|---|---|---|
Internal training | RF (OOB) | 0.9541 (0.9348–0.9787) | 0.9964 (0.9937–0.9993) | 0.2950 (0.2405–0.3522) |
Internal test fold | Raw RF | 0.9661 (0.9130–1.0000) | 0.9989 (0.9949–1.0000) | 0.2830 (0.2315–0.3414) |
Internal test fold | RF + LR | 0.9613 (0.9130–1.0000) | 0.9989 (0.9924–1.0000) | 0.1977 (0.1601–0.2586) |
Local test cohort | Raw RF | 0.9517 (0.9130–0.9565) | 0.9986 (0.9963–1.0000) | 0.4240 (0.4083–0.4459) |
Local test cohort | RF + LR | 0.9952 (0.9565–1.0000) | 0.9998 (0.9981–1.0000) | 0.2867 (0.2508–0.3313) |
Samples . | Model . | Mean accuracy (range) . | Mean AUC (range) . | Mean log loss (range) . |
---|---|---|---|---|
Internal training | RF (OOB) | 0.9541 (0.9348–0.9787) | 0.9964 (0.9937–0.9993) | 0.2950 (0.2405–0.3522) |
Internal test fold | Raw RF | 0.9661 (0.9130–1.0000) | 0.9989 (0.9949–1.0000) | 0.2830 (0.2315–0.3414) |
Internal test fold | RF + LR | 0.9613 (0.9130–1.0000) | 0.9989 (0.9924–1.0000) | 0.1977 (0.1601–0.2586) |
Local test cohort | Raw RF | 0.9517 (0.9130–0.9565) | 0.9986 (0.9963–1.0000) | 0.4240 (0.4083–0.4459) |
Local test cohort | RF + LR | 0.9952 (0.9565–1.0000) | 0.9998 (0.9981–1.0000) | 0.2867 (0.2508–0.3313) |
Abbreviations: AUC, multiclass AUC as defined by Hand and Till (21); LR, ridge logistic regression model; OOB, 1 − (OOB error %); RF, random forest model.
To determine range of class scores to expect when classifying unknown data in the future, we kept track of the scores assigned to training cohort data test folds during cross-validation (Supplementary Fig. S3A; Supplementary Data S3). The true class score distribution was left skewed, and around one-fourth neuroendocrine tumors received true class scores within the left tail of the distribution (scores ≤ 80%), indicating a broad range of expected true class scores. As expected, similar results were obtained from the calibrated OOB scores taken from the random forest model trained on the entire training cohort without cross-validation (Supplementary Fig. S3B).
NEN-ID performance in the local test cohort
We now tested the accuracy of the model trained on the entire training cohort (NEN-ID classifier) in our local test cohort. The random forest ridge regression calibrated output was 100% accurate in predicting origin of the 23 cases with a known origin (Fig. 1C; Table 3; Supplementary Fig. S3C; Supplementary Data S3 and S4). Of note, one pulmonary neuroendocrine tumor from our local test cohort (Lung_NEN_sporadic5), which initially clustered with the LC2 subtype in the middle of the t-SNE (Fig. 1A) and showed chromosome 11 loss (locus MEN1), was correctly classified as pulmonary neuroendocrine tumor. Another case (Lung_NEN_sporadic3), with ectopic symptomatic ACTH production, was also correctly classified (Supplementary Data S1 and S4). In comparison, an IHC panel used to determine origin (TTF1, CDX2, and Islet-1) could only correctly classify 65%–74% cases (15 or 17/23), depending on the guidelines used for interpretation (refs. 29, 30; Table 3). The NEN-ID classifier was especially useful for triple-negative cases or cases with more than one positive IHC marker. CDX2-positive pancreatic neuroendocrine tumors showed the lowest scores, but were nonetheless correctly classified. As morphology can also be used to determine origin (31, 32), we determined neuroendocrine tumor growth patterns for the local test cohort. Although all small intestinal neuroendocrine tumors showed nested growth patterns, this pattern was also seen in other neuroendocrine tumors. Spindle-like cells were specific for lung origin, but were only seen in few cases (Table 3; Supplementary Data S1).
IHC, morphology, and NEN-ID classification of local test cohort.
Name . | Primary/metastasis . | Origin . | CDX2 . | TTF1 . | Islet1 . | IHC panel 1 . | IHC panel 2 . | Growth patterna . | NEN-ID class . | Class score . |
---|---|---|---|---|---|---|---|---|---|---|
Ileum_met_spor_NET | M | SI | − | − | − | ? | ? | A+C | SI | 73% |
Ileum_prim_spor_NET | P | SI | + | − | − | SI | SI | A | SI | 88% |
Ileum_met_spor_NET2 | M | SI | + | − | − | SI | SI | A | SI | 85% |
Ileum_met_spor_NET3 | M | SI | + | − | − | SI | SI | A | SI | 76% |
Lung_NEN_sporadic1 | P | Lung | − | + | − | Lung | Lung | D | Lung | 80% |
Lung_NEN_sporadic2 | P | Lung | − | + | − | Lung | Lung | D | Lung | 90% |
Lung_NEN_sporadic3 | P | Lung | − | + | − | Lung | Lung | B+C | Lung | 76% |
Lung_NEN_sporadic4 | P | Lung | + 10%–20% | + | + 10%–20% | ? | ? | D | Lung | 87% |
Lung_NEN_sporadic5 | P | Lung | − | − | − | ? | ? | B | Lung | 82% |
Lung_NEN_sporadic6 | P | Lung | − | + | − | Lung | Lung | A+B | Lung | 94% |
Lung_met_NEN_sporadic1 | M | Lung | − | + | + | ? | Lung | D | Lung | 89% |
Lung_NEN_sporadic7 | P | Lung | − | − | − | ? | ? | B+C | Lung | 92% |
Lung_NEN_sporadic8 | P | Lung | − | + | − | Lung | Lung | C+D | Lung | 94% |
Lung_NEN_sporadic9 | P | Lung | − | + | + | ? | Lung | A | Lung | 85% |
1LCasePan | P | Pancreas | − | − | + | Pancreas | Pancreas | B | Pancreas | 93% |
2LCasePan | P | Pancreas | − | <1% + | + | Pancreas | Pancreas | B+C | Pancreas | 65% |
4InCasePan | P | Pancreas | − | − | + | Pancreas | Pancreas | A+C | Pancreas | 95% |
3LCasePan | P | Pancreas | − | − | + | Pancreas | Pancreas | B | Pancreas | 93% |
5MCasePan | M | Pancreas | + | − | + | ? | SI | B | Pancreas | 42% |
6M2CasePan | M | Pancreas | − | − | + | Pancreas | Pancreas | D | Pancreas | 92% |
prim-met5In | P | Pancreas | + | − | + | ? | SI | B | Pancreas | 51% |
prim-met6M1 | M | Pancreas | − | − | + | Pancreas | Pancreas | B | Pancreas | 91% |
prim-met6In | M | Pancreas | − | − | + | Pancreas | Pancreas | B | Pancreas | 81% |
Lung_NEN_MEN1 | P/M | Unknown | − | + | + | ? | Lung | D | Lung | 63% |
NET-UP | M | Unknown | + | − | − | SI | SI | A | SI | 82% |
NET-UP2 | M | Unknown | + | − | − | SI | SI | A | SI | 73% |
Name . | Primary/metastasis . | Origin . | CDX2 . | TTF1 . | Islet1 . | IHC panel 1 . | IHC panel 2 . | Growth patterna . | NEN-ID class . | Class score . |
---|---|---|---|---|---|---|---|---|---|---|
Ileum_met_spor_NET | M | SI | − | − | − | ? | ? | A+C | SI | 73% |
Ileum_prim_spor_NET | P | SI | + | − | − | SI | SI | A | SI | 88% |
Ileum_met_spor_NET2 | M | SI | + | − | − | SI | SI | A | SI | 85% |
Ileum_met_spor_NET3 | M | SI | + | − | − | SI | SI | A | SI | 76% |
Lung_NEN_sporadic1 | P | Lung | − | + | − | Lung | Lung | D | Lung | 80% |
Lung_NEN_sporadic2 | P | Lung | − | + | − | Lung | Lung | D | Lung | 90% |
Lung_NEN_sporadic3 | P | Lung | − | + | − | Lung | Lung | B+C | Lung | 76% |
Lung_NEN_sporadic4 | P | Lung | + 10%–20% | + | + 10%–20% | ? | ? | D | Lung | 87% |
Lung_NEN_sporadic5 | P | Lung | − | − | − | ? | ? | B | Lung | 82% |
Lung_NEN_sporadic6 | P | Lung | − | + | − | Lung | Lung | A+B | Lung | 94% |
Lung_met_NEN_sporadic1 | M | Lung | − | + | + | ? | Lung | D | Lung | 89% |
Lung_NEN_sporadic7 | P | Lung | − | − | − | ? | ? | B+C | Lung | 92% |
Lung_NEN_sporadic8 | P | Lung | − | + | − | Lung | Lung | C+D | Lung | 94% |
Lung_NEN_sporadic9 | P | Lung | − | + | + | ? | Lung | A | Lung | 85% |
1LCasePan | P | Pancreas | − | − | + | Pancreas | Pancreas | B | Pancreas | 93% |
2LCasePan | P | Pancreas | − | <1% + | + | Pancreas | Pancreas | B+C | Pancreas | 65% |
4InCasePan | P | Pancreas | − | − | + | Pancreas | Pancreas | A+C | Pancreas | 95% |
3LCasePan | P | Pancreas | − | − | + | Pancreas | Pancreas | B | Pancreas | 93% |
5MCasePan | M | Pancreas | + | − | + | ? | SI | B | Pancreas | 42% |
6M2CasePan | M | Pancreas | − | − | + | Pancreas | Pancreas | D | Pancreas | 92% |
prim-met5In | P | Pancreas | + | − | + | ? | SI | B | Pancreas | 51% |
prim-met6M1 | M | Pancreas | − | − | + | Pancreas | Pancreas | B | Pancreas | 91% |
prim-met6In | M | Pancreas | − | − | + | Pancreas | Pancreas | B | Pancreas | 81% |
Lung_NEN_MEN1 | P/M | Unknown | − | + | + | ? | Lung | D | Lung | 63% |
NET-UP | M | Unknown | + | − | − | SI | SI | A | SI | 82% |
NET-UP2 | M | Unknown | + | − | − | SI | SI | A | SI | 73% |
Note: IHC panel 1: CDX2+TTF1−Islet1− for small intestinal origin, CDX2−TTF1+Islet1− for pulmonary origin, and CDX2−TTF1−Islet1+ for pancreatic origin. IHC panel 2: CDX2+TTF1− for small intestinal origin, CDX2−/TTF1+ for pulmonary origin, and in case of CDX2−/TTF1−, Islet-1+ for pancreatic origin.
Abbreviations: M, metastasis; P, primary tumor; SI, small intestine.
aGrowth patterns as described by Soga and Tazawa (31): A, nodular nests and peripheral cords; B, trabeculae or ribbon-like structures; C, tubular, acinar or rosette-like structures; and D, lower or atypical differentiations.
The three neuroendocrine tumors of unknown primary from the local test cohort are highlighted in Fig. 2. Two neuroendocrine tumors of unknown primary were mesenterial lymph node and liver metastases from patients with a sporadic neuroendocrine tumor. Samples were CDX2 positive and showed a nested growth pattern, suggesting small intestinal origin, but this could also be seen in some pancreatic neuroendocrine tumors. The third case was a G2/3 neuroendocrine tumor resected from left upper lung lobe from an MEN1 patient (33), for which doubt could exist about primary pulmonary origin. The case was TTF1 and Islet-1 positive, and had a PIK3CA mutation. As TTF1 can be positive in pancreatic neuroendocrine tumors (especially high-grade neuroendocrine carcinomas) and mTOR pathway mutations are more commonly seen in pancreatic neuroendocrine tumors (15, 34, 35), it was uncertain whether the lung lesion was a primary pulmonary neuroendocrine tumor or a pulmonary pancreatic neuroendocrine tumor metastasis. The NEN-ID classifier readily confirmed small intestinal origin in the first two cases and classified the MEN1 neuroendocrine tumor as primary pulmonary neuroendocrine tumor. These predictions improved the current practice diagnostic workup by either confirming or narrowing down the differential diagnosis to a single site of origin. In addition, copy-number profiling, which can be derived from DNA methylation data, showed various clinically relevant gains and losses.
Highlighted neuroendocrine tumors (NET) of unknown primary from local test cohort. A, Current practice in the UMCU (Utrecht, the Netherlands) to determine neuroendocrine tumor origin. Three confirmed neuroendocrine tumor cases (morphology and positive chromogranin A/synaptophysin) of different grades for which the origin was unknown or uncertain. Neuroendocrine tumor origin cannot be determined by studying only morphology (hematoxylin and eosin slides). Additional IHC staining (CDX2, TTF1, Islet-1, and not shown) and clinicopathologic information can make one or another origin more likely. B, NEN-ID classifier readily classifies cases. Probability scores fall within class tolerance ellipses. In addition, methylation-based copy-number plots (not shown) can be used to further unravel tumorigenic mechanisms and provide prognostic information.
Highlighted neuroendocrine tumors (NET) of unknown primary from local test cohort. A, Current practice in the UMCU (Utrecht, the Netherlands) to determine neuroendocrine tumor origin. Three confirmed neuroendocrine tumor cases (morphology and positive chromogranin A/synaptophysin) of different grades for which the origin was unknown or uncertain. Neuroendocrine tumor origin cannot be determined by studying only morphology (hematoxylin and eosin slides). Additional IHC staining (CDX2, TTF1, Islet-1, and not shown) and clinicopathologic information can make one or another origin more likely. B, NEN-ID classifier readily classifies cases. Probability scores fall within class tolerance ellipses. In addition, methylation-based copy-number plots (not shown) can be used to further unravel tumorigenic mechanisms and provide prognostic information.
NEN-ID performance in external data and overall diagnostic accuracy
To learn more about the predictive behavior of NEN-ID classifier, we compiled an “external test cohort” with all other publicly available 450k/EPIC methylation data of well-differentiated neuroendocrine tumors of small intestinal, pulmonary, and pancreatic origin (10, 12–15). The external test cohort consisted of 59 sporadic small intestinal neuroendocrine tumors (primary tumors and metastases), 56 pulmonary neuroendocrine tumors, 55 pancreatic neuroendocrine tumors (sporadic, VHL, and MEN1 syndrome associated), and two neuroendocrine tumors of unknown primary (Table 1; Supplementary Fig. S1). Predictive accuracy in the external test cohort was 96% (Supplementary Figs. S3D and S4; Supplementary Data S3 and S4). The presence of hereditary pancreatic neuroendocrine tumor syndromes did not seem to affect accuracy, as 97% tumors (33 of 10 VHL + 24 MEN1 tumors) were correctly classified. To calculate overall sensitivity and specificity per origin, the local and external test cohorts were combined (N = 193; Table 4). Overall diagnostic accuracy was 96%. Importantly, 27 of 28 metastatic samples in the combined test cohort were correctly classified, showing that methylation profiles in metastases remain highly specific for their origin. In addition, 2 patients in our local test cohort had matched primary and metastatic tumor samples. Highly correlated beta values and random forest classifier scores also suggested similar methylation in these cases (Supplementary Fig. S5).
Sensitivity and specificity in the combined local and external test cohort.
. | NEN-ID predicted small intestinal . | NEN-ID predicted pulmonary . | NEN-ID predicted pancreatic . | Sensitivity . | Specificity . |
---|---|---|---|---|---|
Small intestinal origin | 59 | 0 | 4 | 94% (59/63) | 99% (130/131) |
Pulmonary origin | 0 | 65 | 1 | 98% (65/66) | 99% (127/128) |
Pancreatic origin | 1 | 1 | 62 | 97% (62/64) | 96% (129/134) |
. | NEN-ID predicted small intestinal . | NEN-ID predicted pulmonary . | NEN-ID predicted pancreatic . | Sensitivity . | Specificity . |
---|---|---|---|---|---|
Small intestinal origin | 59 | 0 | 4 | 94% (59/63) | 99% (130/131) |
Pulmonary origin | 0 | 65 | 1 | 98% (65/66) | 99% (127/128) |
Pancreatic origin | 1 | 1 | 62 | 97% (62/64) | 96% (129/134) |
Finally, we tested the behavior of the NEN-ID classifier on samples which were not part of the training cohort, including 28 neuroendocrine tumors of other origins (stomach, duodenum, colon, and appendix), 320 non-neuroendocrine tumor samples, and 180 normal samples from TCGA database. A random forest model is built so that it will always produce three-class votes adding up to 100%. However, we observed the scores of the highest scoring class were significantly lower in the neuroendocrine tumors of other origins, TCGA tumor/normal cases compared with neuroendocrine tumor cases in the combined local and external test cohort (Supplementary Fig. S6). Most non-neuroendocrine tumor cases were classified as pancreatic neuroendocrine tumor, although with lower probability scores, which might be a reflection of the methylation heterogeneity observed between pancreatic samples (Fig. 1A and C; Supplementary Fig. S7), causing the ridge regression model to favor the pancreatic class in case raw random forest scores are equal for all classes.
Erroneous data do not affect accuracy
All probes used by the random forest model must be present for the random forest model to work. Consequently, failed probes are imputed, which can introduce noise. We therefore tested the effect of failed probes on the NEN-ID model metrics by imputing random beta values for increasing numbers of randomly selected measurement points (Fig. 1D). A percentage of probes was randomly imputed in the local test cohort. Surprisingly, up to 50% of all probes could be replaced with random beta values without drastic effect on prediction accuracy (Supplementary Fig. S8A). With an increasing percentage of imputed probes, an increase of log loss was seen for the raw and calibrated random forest scores.
Simulated normal cell contamination does not affect accuracy
Similar to noise created by failed probes, a mix of tumor cells and nontumor cells could obscure the methylation signature as recognized by the random forest model. In the clinical setting, biopsies and cytology specimens of neuroendocrine tumor metastases of unknown primary are often taken from lymph nodes, liver, or lung. Contamination with lymphocytes or organ tissue could potentially harm predictive accuracy more than randomly imputed data. Using publicly available data of tissues and cell types, we determined whether a pure population or a mix of normal cells might affect predictive accuracy (Fig. 1D). Reassuringly, accuracy remained perfect up to 25% simulated contamination and high (>90% correct) up to 40% contamination (Supplementary Fig. S8B).
The random forest model is available for further validation
As validation with more cases from the scientific community is necessary, we set out to develop a tool that can be used without any experience in programming. The Shiny web application, Methedrine, was created on the basis of CrystalMeth package (Vignette and source code Github cgeisenberger/crystalmeth) to provide a user-friendly web-based interface for data analysis. After uploading raw Illumina methylation array output (IDAT) files, Methedrine preprocesses data, analyzes tumor purity by the Purify package (36), plots copy numbers using the Conumee package (37), and predicts class for all uploaded samples. Optionally, output for three-class problems can be plotted in a ternary plot (Fig. 1C). An example of the website output is provided in Supplementary Fig. S9, showing a pancreatic neuroendocrine tumor from our own institution with chromosome 11 loss (note copy-number plot). The tool is compatible with both 450k and EPIC (850k) Illumina arrays, with a maximum of 10 tumor samples per run, and is available through www.nen-id.com or www.nen-id.net.
Discussion
Whole-genome methylation data–based predictive models are a trending topic in molecular tumor classification research and are revolutionizing cancer diagnostics. Here, we demonstrate, as a proof of concept, that a calibrated methylation-based random forest model can be used to accurately distinguish small intestinal, pulmonary, and pancreatic origin of primary and metastatic neuroendocrine tumors. This application of a methylation-based model to predict origin has the potential to improve current diagnostic workup of neuroendocrine tumors of unknown primary. As the necessity of basic programming experience to use the predictive model may delay further application, we present the easy-to-use web application, Methedrine, to analyze and classify new samples online, removing any barrier for further validation and clinical implementation.
In the diagnostic workup for patients with a neuroendocrine tumor metastasis of unknown primary, internal medicine specialists, (nuclear) radiologists, and pathologists work together to determine the site of origin. However, if the primary site remains unknown based on imaging, the pathologist can try to make an educated guess on the basis of tissue morphology and presence of markers, like CDX2, TTF1, Islet-1, PAX-6/8, PDX1, SATB2, ATRX, DAXX, and ALT (29, 30, 38–41). Depending on the panel, and the antibodies used in the panel, sensitivity and specificity range from 40% to 90%. Overall diagnostic accuracy is reported to be up to 80% (29, 30), which is similar to the diagnostic accuracy we observed in our local test cohort using a panel of CDX2, TTF1, and Islet-1. In comparison, our calibrated random forest prediction model shows superior classification accuracy. During repeated 3-fold cross-validation, our predictive model showed near perfect sensitivity and specificity, as demonstrated by multiclass ROC AUCs > 0.99. Accuracy of the final NEN-ID classifier for predicting small intestinal, pulmonary, and pancreatic origin was 96% (187/193) on the combined local and external test cohort. Importantly, 27 of 28 metastases present in this combined test cohort were correctly classified, showing that methylation profiles remain origin specific in the metastatic setting.
For a diagnostic test to be clinically applicable, it should meet high-quality standards, be easy to perform and interpret, and work on any type/quality input material/data. It has been previously shown that Illumina's methylation arrays are technically robust and have high interlaboratory and interarray correlations (8). As we provide a website for data analysis, there is no interobserver variability. Furthermore, Illumina's methylation array can be performed on formalin-fixed biopsy specimens, making it useful in the diagnostic setting. Importantly, we demonstrated that the NEN-ID classifier accuracy is not significantly affected by simulated low-quality samples. For up to 50% of simulated failed probes (imperfect data quality) and up to 25% of simulated contamination (imperfect sample quality), the diagnostic accuracy remained highly accurate. In our experience, the number of failed probes stayed well below 1%, and as neuroendocrine tumors are highly cellular and grow in solid/nested clumps, selecting or scraping >80% tumor cells is within reach, also when growing in lymph nodes. This is illustrated by the fact that most tumors in the training cohort had low percentages of failed probes (Github whackeng/NEN-ID) and showed pathologist or data-driven estimates of tumor purity well above 80% (11, 12). Although the true effect of normal cell contamination has to be determined in a nonsimulated setting, this result indicates a certain degree of robustness of the predictive model for samples of low tumor purity.
While we show that the NEN-ID classifier is highly accurate in determining origin, a limitation is its restriction to predicting classes only present in the training cohort, which are well-differentiated neuroendocrine tumors of small intestinal, pulmonary, or pancreatic origin. Sufficient data for neuroendocrine tumors of other origins were not available, and including classes with few samples would have significantly decreased our training cohort size, as downsampling was used to create equal classes during probe selection and model building. The NEN-ID model is able to generate prediction scores on non-neuroendocrine tumor samples (like colon adenocarcinoma or normal skin); however, prediction scores tend to be low (Supplementary Fig. S6). This emphasizes the key role of the pathologist in confirming neuroendocrine tumor differentiation with neuroendocrine and keratin markers beforehand. The fact that the model is not trained to recognize neuroendocrine tumors from other origins is also a challenge. However, studies show that neuroendocrine tumor metastases of unknown primary are, when the primary site is ultimately detected, most likely (>90%) of small intestinal, pancreatic, and pulmonary origin (42–44). Indeed, these three neuroendocrine tumor subtypes have the highest proportion of distant metastases at presentation and are the most common, making them most relevant when studying neuroendocrine tumors of unknown primary (1). Furthermore, studies using IHC panels to predict neuroendocrine tumor origin are often similarly limited to a selection of neuroendocrine tumor origins (29, 30, 39–41). Nevertheless, when using the current NEN-ID model, the possibility of other origins should be considered. In the future, this tool needs to be expanded by including more classes, including rare origins and subtypes of neuroendocrine tumors.
A strong point of our study is that the results are unbiased and reproducible. Many studies reporting the diagnostic value of transcription factors for certain origins (e.g., TTF1) do not report blinding for origin, and use different antibodies and staining protocols. Furthermore, as this is a noncommercial effort, there are no restrictions in using and building on the presented results by the scientific community, and there is no need for ordering specific arrays if the methylation array is available. The relative cost of Illumina's methylation array is less than commercial tests like CancerTYPE ID, but more expensive than IHC (43). However, it combines improved diagnostic accuracy with a wealth of additional information, including copy-number profiling and single-gene methylation (e.g., CDKN2A methylation).
In conclusion, this study shows how combining publicly available data can result in a meaningful predictive model that predicts small intestinal, pulmonary, and pancreatic neuroendocrine tumor origin with high accuracy. We present the complete source code on Github for others to investigate, reproduce, and extend. As illustrated by three cases from our institution, the NEN-ID classifier can already be complementary in the clinical diagnostic workup of neuroendocrine tumor cases of unknown primary origin, for example, when pancreatic and small intestinal origin needs to be distinguished for diagnostic or therapeutic purposes. However, prospective studies need to be performed to determine the true added effect of this predictive model value for neuroendocrine tumors of unknown primary. The web application enables straightforward clinical validation and implementation. By sharing, improving, and comparing data with the scientific community, we hope to fully harvest the potential of methylation-based prediction models, and ultimately enable prediction in all neuroendocrine neoplasms with high accuracy.
Authors' Disclosures
No disclosures were reported.
Authors' Contributions
W.M. Hackeng: Conceptualization, data curation, formal analysis, validation, investigation, visualization, methodology, writing-original draft, writing-review and editing. K.M.A. Dreijerink: Resources, supervision, writing-review and editing. W.W.J. de Leng: Resources, supervision, writing-review and editing. F.H.M. Morsink: Resources, investigation, methodology, writing-review and editing. G.D. Valk: Resources, investigation, writing-review and editing. M.R. Vriens: Resources, supervision, writing-review and editing. G.J.A. Offerhaus: Resources, supervision, writing-review and editing. C. Geisenberger: Software, formal analysis, supervision, visualization, methodology, writing-review and editing. L.A.A. Brosens: Conceptualization, resources, supervision, funding acquisition, project administration, writing-review and editing.
Acknowledgments
We thank Aaron Isaacs for his introduction into methylation data analysis. We thank Paul van Diest for his generous support and Erwin van der Biezen for supervising the processing of samples. We would like to thank all research teams for sharing their data, including the lungNENomics project and the Rare Cancers Genomics initiative. The results here are, in part, based upon data generated by TCGA Research Network: https://www.cancer.gov/tcga. This work was supported by the Dutch Digestive Foundation/Maag Lever Darm Stichting (grant no., CDG 14-020 to L.A.A. Brosens and W.M. Hackeng).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
References
Supplementary data
Cohort information
NEN-ID probe importance