Background:

The role of methylation in pancreatic cancer risk remains unclear. We integrated genome and methylome data to identify CpG sites (CpGs) whose genetically predicted methylation is associated with pancreatic cancer risk. We also studied gene expression to better understand the identified associations.

Methods:

Using genetic data and white blood cell methylation data from 1,595 subjects of European descent, we built genetic models to predict DNA methylation levels. After internal and external validation, we applied prediction models with satisfactory performance to the genetic data of 8,280 pancreatic cancer cases and 6,728 controls of European ancestry to investigate the associations of predicted methylation with pancreatic cancer risk. For associated CpGs, we compared their measured levels in pancreatic tumor versus benign tissue.

Results:

We identified 45 CpGs at nine loci showing an association with pancreatic cancer risk, including 15 CpGs showing an association independent of previously identified risk variants. We observed significant correlations between predicted methylation of 16 of the 45 CpGs and predicted expression of eight adjacent genes, of which six genes showed associations with pancreatic cancer risk. Of the 45 CpGs, we were able to compare measured methylation of 16 in pancreatic tumor versus benign pancreatic tissue; six of these showed differential methylation.

Conclusions:

We identified methylation biomarker candidates associated with pancreatic cancer using genetic instruments and provided additional insights into the role of methylation in regulating gene expression in pancreatic cancer development.

Impact:

A comprehensive study using genetic instruments identifies 45 CpG sites at nine genomic loci for pancreatic cancer risk.

This article is featured in Highlights of This Issue, p. 1979

As the most fatal of all major cancers, pancreatic cancer is the third leading cause of cancer death in the United States, with an overall 5-year survival rate of only 9% (1). Furthermore, unlike other common cancers, mortality from pancreatic cancer is expected to continue to increase, and the disease may become the second leading cause of cancer death before 2030 (2). One of the major reasons for the lethality of this disease is that most patients are diagnosed late because symptoms in earlier stages are nonspecific. To date, no effective screening tests are available for pancreatic cancer. Serum CA 19–9 is the only validated biomarker in clinical use, either for pancreatic cancer diagnosis in symptomatic patients or for prognostic surveillance in predicting tumor stage or overall survival. However, this biomarker alone cannot serve as an effective screening tool given its unsatisfactory sensitivity (75.5%) and specificity (77.6%), as well as its low positive predictive value (0.5%–0.9%; ref. 3). There is an urgent need to identify additional biomarkers for improved risk assessment of pancreatic cancer.

DNA methylation, an important epigenetic modification that regulates gene expression, has been shown to be potentially related to pancreatic cancer. A number of studies evaluating DNA methylation levels in blood or pancreas tissue have identified multiple candidate DNA methylation markers for pancreatic cancer, including methylation at VHL, MYF3, TMS, GPC3, SRBC, HYAL2, ADAMTS1, BNC1, SERPINB5, and B3GALT5 (4–8). However, many of these earlier studies had small sample sizes and investigated only a few CpG sites (CpGs), resulting in insufficient statistical power and limited scope for identifying discriminant DNA methylation markers. More importantly, it is difficult to establish causality with the conventional study designs used in previous studies.

It has been increasingly recognized that one potential strategy for reducing several of these limitations is to evaluate the associations of interest using genetic instruments. The genetically determined proportion of DNA methylation levels should be less susceptible to the biases of conventional study designs, given the random assortment of alleles from parents to offspring during gamete formation. Studies have suggested that a large portion of CpGs are highly heritable, and multiple associations have been identified between genetic variants and DNA methylation levels of CpGs (9–12). In a large study with sufficient power, many of the genetic variants associated with DNA methylation are likely to serve as strong instrumental variables for assessing the association between DNA methylation and pancreatic cancer risk. In this study, we employed this novel strategy to identify DNA methylation biomarker candidates associated with pancreatic cancer risk.

Besides identifying promising biomarkers, the findings of such a study may also help to better understand the etiology of pancreatic cancer. So far, genome-wide association studies (GWAS) have identified 20 independent common susceptibility loci for pancreatic cancer in individuals of European ancestry; however, together these variants explain only a small proportion of the total risk (13–18). Recent work estimated the heritability of pancreatic cancer to be 21.2%, and a large proportion of this heritability remains unexplained (19). Recently, two large transcriptome-wide association studies (TWAS) of pancreatic cancer identified 31 candidate susceptibility genes whose genetically predicted expression was associated with pancreatic cancer risk (20). The present study is a complementary effort focusing on DNA methylation, the findings of which may contribute to additional understanding of pancreatic cancer genetics. CpGs identified in this way may influence pancreatic cancer risk either through regulating expression of pancreatic cancer susceptibility genes or through other mechanisms. In this work, we therefore also studied gene expression to characterize whether some of the associated CpGs may influence pancreatic cancer risk through regulating expression of their target genes.

To our knowledge, this is the first large study to evaluate the association between genetically predicted DNA methylation and pancreatic cancer risk, using data from 8,280 cases and 6,728 controls of European descent from the Pancreatic Cancer Cohort Consortium (PanScan) and the Pancreatic Cancer Case-Control Consortium (PanC4). For the identified DNA methylation biomarker candidates, we further compared their directly measured levels in pancreatic tumor tissue specimens (n = 18) versus benign pancreatic tissue specimens (n = 18).

The overall study design is shown in Fig. 1. First, we developed genetic prediction models for DNA methylation levels by leveraging data from the Framingham Heart Study (FHS). After external validation, we selected DNA methylation models with satisfactory prediction performance for assessing associations of genetically predicted methylation levels with pancreatic cancer risk, using data from the PanScan/PanC4 consortia, which involve 8,280 cases and 6,728 controls. For CpGs showing an association with pancreatic cancer risk, we assessed correlations between their predicted methylation and predicted expression of adjacent genes (PanScan/PanC4) to identify potential target genes of these CpGs. For the identified candidate target genes, we further evaluated associations of their genetically predicted expression with pancreatic cancer risk. For the associated CpGs, we also compared their directly measured levels in pancreatic tumor tissue versus benign pancreatic tissue. Additional description of relevant studies is included in the Supplementary Materials and Methods.

Figure 1.

Study design flow chart. PanScan, Pancreatic Cancer Cohort Consortium; PanC4, Pancreatic Cancer Case-Control Consortium.


DNA methylation prediction models

Genetic data and white blood cell DNA methylation data from 1,595 unrelated subjects in the FHS Offspring Cohort were used to build genetic prediction models of methylation. Details of the datasets and data quality control (QC) have been described elsewhere (21–23). The genetic data were imputed to the Haplotype Reference Consortium reference panel. We retained SNPs with high imputation quality (R2 ≥ 0.8) and minor allele frequency (MAF) ≥5% that were included in HapMap Phase 2 and were not strand ambiguous. The R package “minfi” was used for QC and normalization of the DNA methylation data (24). For the methylation level at each CpG site, a prediction model was built using the elastic net method (α = 0.50) with in-cis SNPs (flanking a 2 Mb window), with adjustment for age, sex, six cell type composition variables, and the top 10 genetic principal components (PC). Ten-fold cross-validation was used to choose the penalty parameter lambda and to validate the models internally (25). Performance of the established prediction models was also examined externally using data from the Women's Health Initiative (WHI; N = 883), which were downloaded from dbGaP (accession nos. phs001335, phs000675, and phs000315). Imputation and QC followed the same methods described for the FHS data, and DNA methylation data were processed following a similar procedure. We calculated the predicted DNA methylation for each CpG site using the models established with FHS data and then compared the predicted methylation with the measured levels using Spearman correlation. DNA methylation prediction models with both internal and external performance R2 ≥ 0.01 (correlation between predicted and measured DNA methylation level > 0.1) were used for downstream association analyses. This is one of the standard criteria used in TWAS for gene expression (26–28), the heritability of which is in a similar range to that of DNA methylation in blood (29, 30). Importantly, we aimed to capture the genetically regulated component of DNA methylation levels, so the model performance R2 is not expected to be high for every CpG. Indeed, the upper limit of this R2 should be the heritability of each CpG. We further excluded CpGs with SNPs within their probes on the Illumina 450K BeadChip because of potential bias in the measurement of DNA methylation levels at such CpGs (31).
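The external validation step described above (Spearman correlation between predicted and measured methylation, followed by the R2 ≥ 0.01 filter) can be sketched as follows. This is a minimal illustration rather than the code used in the study; the function names, and the choice to treat external R2 as the squared Spearman correlation with a positive sign required, are assumptions made here.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(predicted, measured):
    """Spearman correlation: Pearson correlation computed on ranks."""
    return pearson(ranks(predicted), ranks(measured))


def passes_validation(internal_r2, external_rho, min_r2=0.01):
    """Keep a CpG model only if it performs in both cohorts (assumed rule)."""
    return (internal_r2 >= min_r2
            and external_rho > 0
            and external_rho ** 2 >= min_r2)
```

For example, a model with internal R2 of 0.05 and an external correlation of 0.3 would be retained, whereas one with an external correlation of −0.3 would not.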

Evaluation of the association between genetically predicted DNA methylation levels and pancreatic cancer risk

To evaluate associations of predicted DNA methylation levels with pancreatic cancer risk, we used data from GWAS conducted in PanScan and PanC4. Detailed information on these consortia has been described elsewhere (13–18). For the current analyses, the genetic and covariate data were accessed from dbGaP (dbGaP Study Accession: phs000206.v5.p3 and phs000648.v1.p1). We performed subject- and SNP-level QC based on guidelines recommended by the consortia (17). Briefly, in the PanC4 dataset, we excluded subjects who were related to each other, had a missing call rate ≥2%, or had missing information on the covariates age and sex; we excluded SNPs with a missing call rate ≥2%, positional duplicates, more than two discordant calls in study duplicates, more than one Mendelian error in HapMap control trios, Hardy–Weinberg equilibrium (HWE) P < 1 × 10⁻⁴, sex difference in allele frequency > 0.2 for autosomes/XY in subjects of European ancestry, sex difference in heterozygosity > 0.3 for autosomes/XY in subjects of European ancestry, or MAF < 0.005. In the PanScan datasets, we excluded subjects with sex discordance, subjects related to each other, and subjects with a call rate <94%; we further excluded SNPs with a call rate <94% or HWE P < 1 × 10⁻⁷. In our analyses we only retained subjects with genetic ancestry of Europeans evaluated using pancreatic cancer analysis. The genotype data from all sources were imputed together to the Haplotype Reference Consortium reference panel (r1.1 2016; ref. 32), using Minimac3 for imputation and SHAPEIT for prephasing (33, 34), on the Michigan Imputation Server (https://imputationserver.sph.umich.edu). Only imputed data with an imputation quality of at least 0.3 were retained in the association analyses. The final dataset included 8,280 cases and 6,728 controls.
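The SNP-level exclusion criteria listed above for the PanC4 dataset can be expressed as a single filter. The sketch below only restates those thresholds; the record type and its field names are hypothetical and do not correspond to any particular QC tool or file format.

```python
from dataclasses import dataclass


@dataclass
class SnpQC:
    """Per-SNP QC metrics (hypothetical field names, for illustration only)."""
    missing_call_rate: float
    is_positional_duplicate: bool
    discordant_calls: int          # discordant calls in study duplicates
    mendelian_errors: int          # errors in HapMap control trios
    hwe_p: float                   # Hardy-Weinberg equilibrium P value
    sex_diff_allele_freq: float
    sex_diff_heterozygosity: float
    maf: float


def passes_panc4_snp_qc(snp: SnpQC) -> bool:
    """Return True if the SNP meets none of the stated exclusion criteria."""
    excluded = (
        snp.missing_call_rate >= 0.02
        or snp.is_positional_duplicate
        or snp.discordant_calls > 2
        or snp.mendelian_errors > 1
        or snp.hwe_p < 1e-4
        or snp.sex_diff_allele_freq > 0.2
        or snp.sex_diff_heterozygosity > 0.3
        or snp.maf < 0.005
    )
    return not excluded
```

Writing the criteria as one boolean expression keeps each threshold next to the quantity it applies to, which makes it easy to compare against the text.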

The S-PrediXcan method (35) was used to evaluate the associations between genetically predicted DNA methylation levels and pancreatic cancer risk, using summary statistics of SNP-pancreatic cancer associations generated with adjustment for age, sex, and top PCs. The Z-score for the association between the predicted DNA methylation level at each CpG and pancreatic cancer risk was estimated using the formula:

$$Z_m \approx \sum_{s \in \mathrm{Model}_m} w_{sm} \, \frac{\hat{\sigma}_s}{\hat{\sigma}_m} \, \frac{\hat{\beta}_s}{\mathrm{se}(\hat{\beta}_s)}$$

Here, $w_{sm}$ represents the weight of SNP s on the methylation level of CpG m. $\hat{\beta}_s$ and $\mathrm{se}(\hat{\beta}_s)$ refer to the GWAS-estimated effect size and SE of SNP s on pancreatic cancer risk, respectively. $\hat{\sigma}_s$ and $\hat{\sigma}_m$ are the estimated variances of SNP s and the predicted methylation level at CpG m, respectively. For this study, the correlations between predicting SNPs were estimated on the basis of data from subjects of European descent in the 1000 Genomes Project Phase 3. Considering that a large number of CpGs may have correlated DNA methylation and predicted methylation levels, an FDR-adjusted P value < 0.05 was used to determine significant associations. For the associated CpGs, GCTA-COJO analyses were conducted to examine whether the observed associations were independent of previously identified risk variants of pancreatic cancer (36). Briefly, for each SNP included in the prediction models of the identified CpGs, we used GCTA-COJO to estimate the modified $\hat{\beta}_s$ and $\mathrm{se}(\hat{\beta}_s)$ conditioning on nearby GWAS-identified pancreatic cancer risk SNPs. We then repeated the S-PrediXcan analysis using the modified values of $\hat{\beta}_s$ and $\mathrm{se}(\hat{\beta}_s)$ to assess the associations between genetically predicted DNA methylation levels and pancreatic cancer risk after adjusting for previously reported GWAS risk SNPs. To decrease the possibility of false positive findings, only associated CpGs for which a large proportion of predicting SNPs (>50%) in the corresponding models were used in the association analyses are reported here. We further performed analyses using individual-level genetic data for these CpGs and examined whether the identified significant associations were consistent across study phases (PanScan I, II, III; PanScan I, II; PanC4 and PanScan I, II; and PanC4), especially for PanScan III, which included only cases.
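The Z-score calculation described above can be illustrated with a short sketch. It is not the S-PrediXcan software itself: the function and argument names are chosen here for illustration, and the code assumes that the scaling terms for SNP s and CpG m enter as standard deviations, with the spread of the predicted methylation level derived from the model weights and a reference-panel SNP correlation matrix.

```python
import math


def spredixcan_z(weights, gwas_beta, gwas_se, snp_sd, ld_corr):
    """Summary-statistic Z-score for one CpG prediction model (sketch).

    weights   : model weight of each predicting SNP
    gwas_beta : GWAS effect size of each SNP on disease risk
    gwas_se   : standard error of each GWAS effect size
    snp_sd    : reference-panel standard deviation of each SNP (assumption)
    ld_corr   : reference-panel correlation matrix between the SNPs
    """
    n = len(weights)
    # Variance of predicted methylation: w' * Cov * w,
    # with Cov[s][t] = corr[s][t] * sd[s] * sd[t].
    var_m = sum(
        weights[s] * weights[t] * ld_corr[s][t] * snp_sd[s] * snp_sd[t]
        for s in range(n)
        for t in range(n)
    )
    sd_m = math.sqrt(var_m)
    # Weighted sum of per-SNP GWAS Z-scores, rescaled by sd_s / sd_m.
    return sum(
        weights[s] * (snp_sd[s] / sd_m) * (gwas_beta[s] / gwas_se[s])
        for s in range(n)
    )


def two_sided_p(z):
    """Two-sided P value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2))
```

With a single predicting SNP, the score reduces to the GWAS Z-score of that SNP, signed by the model weight, which is a convenient sanity check on any implementation.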

Potential target genes of associated CpGs

The identified CpGs associated with pancreatic cancer risk were annotated with ANNOVAR (29). To determine potential target genes of these CpGs, we assessed whether genetically predicted DNA methylation levels of these CpGs were significantly correlated with genetically predicted expression of their adjacent genes in 8,280 cases and 6,728 controls of European ancestry included in PanScan I–III and PanC4. We estimated genetically predicted gene expression using prediction models built with data from the Genotype-Tissue Expression (GTEx) project focusing on blood tissue (N = 338). Only gene expression prediction models with R2 ≥ 0.01 were used for the analyses. For genes showing a correlation (P < 0.05), we further assessed whether their genetically predicted expression was significantly associated with pancreatic cancer risk. Finally, we assessed the consistency of the direction of identified associations in the DNA methylation-gene expression-pancreatic cancer risk pathway.

Directly measured levels of associated CpGs in pancreatic tumor tissue specimens versus benign pancreatic tissue specimens

Reduced representation bisulfite sequencing (RRBS) was performed on DNA extracted from 18 pancreatic tumor tissue specimens and 18 benign pancreatic tissue specimens, as described previously (37). Sequencing was performed using the Illumina HiSeq 2000 in the Mayo Clinic Medical Genome Facility. SAAP-RRBS was used for sequence alignment and methylation extraction (38). We compared the DNA methylation levels of the associated CpGs in pancreatic tumor tissue specimens versus benign pancreatic tissue specimens. For this exploratory analysis, P < 0.05 was used to determine significant differences.

Data Availability

The datasets used for the analyses described in this manuscript were obtained from dbGaP at http://www.ncbi.nlm.nih.gov/sites/entrez?db=gap through dbGaP accession phs000206.v5.p3 and phs000648.v1.p1 for PanScan/PanC4 data, phs000342 and phs000724 for FHS, phs000315, phs000675, and phs001335 for WHI, and phs000424.v6.p1 for GTEx.

DNA methylation prediction models

Using data from the FHS, we were able to establish DNA methylation prediction models for a total of 223,959 CpGs, of which 70,269 showed a prediction performance (R2) ≥ 0.01 in both internal and external validation. Among them, 62,994 CpGs had no SNPs within their probes. The prediction models for these 62,994 CpGs showed similar performance in external and internal validation (Supplementary Fig. S1). The correlation coefficient between R2 in FHS and WHI was 0.95.

Associations between genetically predicted DNA methylation and pancreatic cancer risk

Of the 62,994 CpGs examined, 45 at nine genomic loci showed significant associations with pancreatic cancer risk for their genetically predicted methylation levels after FDR adjustment (Supplementary Fig. S2). Fifteen of the 45 CpGs were located >500 kb away from any risk variant reported in previous GWAS of pancreatic cancer. Positive associations between predicted DNA methylation level and pancreatic cancer risk were observed for cg02871659, cg18279742, cg01554064, cg04520704, cg19586165, cg16557858, cg02944084, and cg20930114; in contrast, inverse associations were identified for cg24483576, cg24520381, cg22833065, cg17288560, cg19439043, cg03013999, and cg15445000. After conditioning on previously identified pancreatic cancer risk variants, associations for all of these 15 CpGs at five novel loci remained largely unchanged (Table 1), suggesting that the identified associations represent novel associations independent of previously identified risk SNPs. On the other hand, for the other 30 identified CpGs located at four known pancreatic cancer risk loci, their associations with pancreatic cancer risk were all significantly attenuated after conditioning on adjacent risk SNPs (Table 2), suggesting that the identified associations may be influenced by the risk SNPs. On the basis of subgroup analyses, the associations of the identified 45 CpGs tended to be robust across different subsets (PanScan I, II, and III; PanScan I and II; PanC4 and PanScan I, II; and PanC4; Supplementary Table S1).

Table 1.

CpG sites whose genetically predicted DNA methylation was independently associated with pancreatic cancer risk after adjustment for previously identified risk SNPs.

| CpG site | Chr | Position (build37) | Number of SNPs used for prediction | Classification | R2ᵃ | OR (95% CI)ᵇ | P valueᶜ | P value after FDR | Risk SNP adjusted for | P value after adjusting for risk SNP |
|---|---|---|---|---|---|---|---|---|---|---|
| cg20930114 | | 110372285 | 15 | Exonic | 0.02 | 1.94 (1.44–2.61) | 1.28 × 10⁻⁵ | 0.019 | rs1486134 | 1.28 × 10⁻⁵ |
| cg01554064 | | 106855171 | 27 | Upstream | 0.20 | 1.22 (1.12–1.32) | 1.75 × 10⁻⁶ | 0.003 | rs505922 | 1.77 × 10⁻⁶ |
| cg02871659 | 16 | 2014063 | | Intronic | 0.32 | 1.18 (1.09–1.28) | 3.34 × 10⁻⁵ | 0.045 | rs7190458 | 3.41 × 10⁻⁵ |
| cg18279742 | 16 | 2015703 | 46 | Upstream/downstream | 0.21 | 1.20 (1.10–1.30) | 2.89 × 10⁻⁵ | 0.040 | rs7190458 | 2.94 × 10⁻⁵ |
| cg15445000 | 17 | 37608096 | 50 | Upstream | 0.28 | 0.85 (0.80–0.91) | 2.42 × 10⁻⁶ | 0.005 | rs4795218 | 1.16 × 10⁻⁶ |
| cg03013999 | 17 | 37608204 | 21 | Upstream | 0.18 | 0.81 (0.74–0.89) | 4.02 × 10⁻⁶ | 0.007 | rs4795218 | 1.63 × 10⁻⁶ |
| cg19439043 | 17 | 37719913 | 27 | Intergenic | 0.04 | 0.64 (0.53–0.76) | 6.76 × 10⁻⁷ | 0.002 | rs4795218 | 2.51 × 10⁻⁷ |
| cg17288560 | 17 | 37720009 | 18 | Intergenic | 0.05 | 0.62 (0.52–0.75) | 3.41 × 10⁻⁷ | 0.001 | rs4795218 | 1.35 × 10⁻⁷ |
| cg24520381 | 17 | 37784694 | 20 | Intronic | 0.02 | 0.54 (0.43–0.69) | 3.71 × 10⁻⁷ | 0.001 | rs4795218 | 1.10 × 10⁻⁷ |
| cg24483576 | 17 | 37792770 | 13 | UTR3 | 0.03 | 0.51 (0.38–0.68) | 7.31 × 10⁻⁶ | 0.012 | rs4795218 | 4.23 × 10⁻⁶ |
| cg19586165 | 17 | 37814072 | 10 | Exonic | 0.08 | 1.38 (1.19–1.59) | 1.26 × 10⁻⁵ | 0.019 | rs4795218 | 2.86 × 10⁻⁶ |
| cg02944084 | 17 | 37827057 | 22 | Downstream | 0.03 | 1.81 (1.44–2.29) | 5.82 × 10⁻⁷ | 0.001 | rs4795218 | 1.47 × 10⁻⁷ |
| cg16557858 | 17 | 37879740 | 23 | Intronic | 0.06 | 1.47 (1.25–1.74) | 4.98 × 10⁻⁶ | 0.009 | rs4795218 | 1.23 × 10⁻⁶ |
| cg22833065 | 17 | 38095691 | 14 | Intergenic | 0.03 | 0.59 (0.46–0.76) | 3.14 × 10⁻⁵ | 0.043 | rs4795218 | 1.86 × 10⁻⁵ |
| cg04520704 | 22 | 18325160 | 18 | Intronic | 0.08 | 1.36 (1.18–1.57) | 2.63 × 10⁻⁵ | 0.038 | rs16986825 | 2.65 × 10⁻⁵ |

ᵃ R2: model prediction performance (R2) derived using FHS data.

ᵇ OR and CI per one standard deviation increase in genetically predicted DNA methylation.

ᶜ P value: derived from association analyses of 8,280 cases and 6,728 controls; FDR-adjusted P ≤ 0.05 considered statistically significant.
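As a reading aid for the table above, the reported OR and 95% CI of a row can be converted back into an approximate Z-score and two-sided P value. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale (log OR ± 1.96 × SE); this is a common convention and an assumption here, not a statement of how the tabulated values were computed, and the function names are illustrative.

```python
import math


def z_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate Z-score implied by an OR and its 95% CI (assumed Wald CI)."""
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return log_or / se


def two_sided_p(z):
    """Two-sided P value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


# Row cg01554064 of Table 1: OR 1.22 (95% CI, 1.12-1.32).
z = z_from_or_ci(1.22, 1.12, 1.32)
p = two_sided_p(z)
```

Because the tabulated OR and CI are rounded to two decimals, the recovered P value should only be expected to match the reported one in order of magnitude.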

Table 2.

CpG sites whose genetically predicted DNA methylation was associated with pancreatic cancer risk and whose associations are potentially influenced by previously identified risk SNPs.

| CpG site | Chr | Position (build37) | Number of SNPs used for prediction | Classification | R2ᵃ | OR (95% CI)ᵇ | P valueᶜ | P value after FDR | Risk SNP adjusted for | P value after adjusting for risk SNP |
|---|---|---|---|---|---|---|---|---|---|---|
| cg10015974 | | 199827580 | 87 | Intergenic | 0.13 | 0.80 (0.73–0.87) | 1.28 × 10⁻⁷ | 3.84 × 10⁻⁴ | rs16986825; rs3790844 | 0.02 |
| cg10098523 | | 200002343 | 40 | Intronic | 0.22 | 0.83 (0.78–0.90) | 1.29 × 10⁻⁶ | 2.73 × 10⁻³ | rs16986825; rs3790844 | 0.52 |
| cg07926895 | | 200005833 | 24 | Intronic | 0.03 | 0.61 (0.49–0.77) | 1.89 × 10⁻⁵ | 2.77 × 10⁻² | rs16986825; rs3790844 | 0.32 |
| cg17804356 | | 200009927 | | Intronic | 0.01 | 3.38 (2.12–5.39) | 2.81 × 10⁻⁷ | 8.05 × 10⁻⁴ | rs16986825; rs3790844 | 0.32 |
| cg07507801 | | 1291235 | | Intronic | 0.03 | 2.29 (1.66–3.16) | 5.14 × 10⁻⁷ | 1.30 × 10⁻³ | rs2736098; rs35226131; rs401681 | 0.13 |
| cg07380026 | | 1296007 | 14 | Upstream | 0.01 | 4.52 (2.97–6.90) | 2.39 × 10⁻¹² | 1.67 × 10⁻⁸ | rs2736098; rs35226131; rs401681 | 4.55 × 10⁻³ |
| cg26603275 | | 1298965 | 10 | Intergenic | 0.04 | 2.24 (1.75–2.87) | 1.11 × 10⁻¹⁰ | 6.36 × 10⁻⁷ | rs2736098; rs35226131; rs401681 | 0.05 |
| cg11624060 | | 1316038 | 25 | Intergenic | 0.18 | 1.28 (1.17–1.40) | 2.49 × 10⁻⁸ | 9.23 × 10⁻⁵ | rs2736098; rs35226131; rs401681 | 0.93 |
| cg26209169 | | 1316264 | 22 | Intergenic | 0.24 | 1.24 (1.15–1.34) | 2.19 × 10⁻⁸ | 8.62 × 10⁻⁵ | rs2736098; rs35226131; rs401681 | 0.83 |
| cg10441424 | | 1316636 | 16 | Intergenic | 0.01 | 2.08 (1.52–2.86) | 5.82 × 10⁻⁶ | 1.02 × 10⁻² | rs2736098; rs35226131; rs401681 | 0.65 |
| cg07493874 | | 1342172 | 11 | Intronic | 0.15 | 0.69 (0.61–0.77) | 8.91 × 10⁻¹¹ | 5.61 × 10⁻⁷ | rs2736098; rs35226131; rs401681 | 0.93 |
| cg19915256 | | 1345677 | 11 | Upstream | 0.02 | 2.85 (2.00–4.04) | 5.16 × 10⁻⁹ | 2.32 × 10⁻⁵ | rs2736098; rs35226131; rs401681 | 0.52 |
| cg27028750 | | 1349422 | 20 | Intergenic | 0.25 | 0.79 (0.74–0.85) | 6.59 × 10⁻¹⁰ | 3.46 × 10⁻⁶ | rs2736098; rs35226131; rs401681 | 0.43 |
| cg03474926 | | 136023407 | 24 | Intronic | 0.01 | 2.72 (1.90–3.89) | 5.18 × 10⁻⁸ | 1.72 × 10⁻⁴ | rs505922 | 0.36 |
| cg01169778 | | 136038690 | 14 | Intronic | 0.04 | 1.98 (1.46–2.68) | 1.04 × 10⁻⁵ | 1.62 × 10⁻² | rs505922 | 0.13 |
| cg14653977 | | 136038692 | 20 | Intronic | 0.03 | 4.27 (3.09–5.89) | 1.12 × 10⁻¹⁸ | 3.53 × 10⁻¹⁴ | rs505922 | 0.08 |
| cg13531387 | | 136078657 | 13 | Intergenic | 0.11 | 0.34 (0.25–0.45) | 3.16 × 10⁻¹³ | 2.49 × 10⁻⁹ | rs505922 | 0.75 |
| cg00878953 | | 136129875 | 36 | Downstream | 0.15 | 0.65 (0.54–0.79) | 6.83 × 10⁻⁶ | 1.16 × 10⁻² | rs505922 | 0.42 |
| cg11879188 | | 136149908 | 36 | Intronic | 0.5 | 2.28 (1.84–2.83) | 4.84 × 10⁻¹⁴ | 4.36 × 10⁻¹⁰ | rs505922 | 0.89 |
| cg21160290 | | 136149941 | 43 | Intronic | 0.71 | 1.99 (1.69–2.34) | 8.87 × 10⁻¹⁷ | 1.12 × 10⁻¹² | rs505922 | 0.76 |
| cg22535403 | | 136150032 | 44 | Intronic | 0.69 | 2.29 (1.89–2.77) | 4.63 × 10⁻¹⁷ | 7.29 × 10⁻¹³ | rs505922 | 0.59 |
| cg24267699 | | 136151359 | 13 | Upstream | 0.59 | 2.50 (2.07–3.02) | 1.33 × 10⁻²¹ | 8.38 × 10⁻¹⁷ | rs505922 | 0.01 |
| cg06818865 | | 136151958 | 10 | Intergenic | 0.3 | 1.84 (1.52–2.24) | 8.47 × 10⁻¹⁰ | 4.10 × 10⁻⁶ | rs505922 | 0.16 |
| cg13660174 | | 136238392 | 19 | Intronic | 0.07 | 1.64 (1.34–2.00) | 1.30 × 10⁻⁶ | 2.73 × 10⁻³ | rs505922 | 0.29 |
| cg13568213 | | 136387235 | 16 | Intronic | 0.03 | 7.05 (3.43–14.48) | 1.08 × 10⁻⁷ | 3.40 × 10⁻⁴ | rs505922 | 0.17 |
| cg21101465 | 13 | 28493404 | 20 | Upstream | 0.04 | 0.61 (0.49–0.76) | 9.94 × 10⁻⁶ | 1.61 × 10⁻² | rs9581943 | 0.06 |
| cg11853320 | 13 | 28493913 | 52 | Upstream | 0.08 | 0.69 (0.61–0.79) | 3.88 × 10⁻⁸ | 1.36 × 10⁻⁴ | rs9581943 | 0.46 |
| cg26793256 | 13 | 28494004 | 55 | Upstream | 0.06 | 0.72 (0.62–0.82) | 1.56 × 10⁻⁶ | 3.17 × 10⁻³ | rs9581943 | 0.16 |
| cg04633225 | 13 | 28494161 | 22 | Upstream | 0.02 | 0.45 (0.34–0.59) | 1.09 × 10⁻⁸ | 4.58 × 10⁻⁵ | rs9581943 | 0.06 |
| cg11213248 | 13 | 28534648 | | Intergenic | 0.22 | 0.81 (0.75–0.88) | 1.16 × 10⁻⁶ | 2.61 × 10⁻³ | rs9581943 | 2.00 × 10⁻⁴ |

ᵃ R2: model prediction performance (R2) derived using FHS data.

ᵇ OR and CI per one standard deviation increase in genetically predicted DNA methylation.

ᶜ P value: derived from association analyses of 8,280 cases and 6,728 controls; FDR-adjusted P ≤ 0.05 considered statistically significant.

Candidate target genes of associated CpGs

For the 45 CpGs associated with pancreatic cancer risk, ANNOVAR annotation suggested 32 adjacent genes. For nine of them (RPS2, STARD3, GBGT1, ABO, SURF6, ERBB2, ORMDL3, SNHG9, SOWAHC), we were able to build blood tissue gene expression prediction models with R2 ≥ 0.01. We then assessed Spearman rank correlations between genetically predicted DNA methylation and genetically predicted gene expression for 17 CpG site-gene pairs (Supplementary Table S2). For all genes except STARD3, we observed significant (P < 0.001) correlations (Supplementary Table S2).

Associations of predicted expression of candidate target genes with pancreatic cancer risk

Of these eight genes showing significant correlations, six further showed a significant association with pancreatic cancer risk for their genetically predicted expression levels, namely, ABO (P = 6.72 × 10⁻¹²), RPS2 (P = 3.48 × 10⁻⁵), SURF6 (P = 8.47 × 10⁻³), ORMDL3 (P = 2.58 × 10⁻⁴), SNHG9 (P = 1.15 × 10⁻²), and SOWAHC (P = 8.30 × 10⁻⁴). Overall, a total of 12 CpGs with six genes showed significant associations in each pair of the relationships in the DNA methylation-gene expression-pancreatic cancer risk pathway. Encouragingly, all these associations showed consistent directions. Taking the CpG site cg24267699, located upstream of ABO, as an example, its genetically predicted DNA methylation showed a positive association with pancreatic cancer risk (OR = 2.50; P = 1.33 × 10⁻²¹). Meanwhile, we observed an inverse correlation between the genetically predicted DNA methylation level of cg24267699 and predicted expression of ABO (correlation coefficient = −0.62; P < 0.001), as well as an inverse association between predicted expression of ABO and pancreatic cancer risk (OR = 0.89, P = 6.72 × 10⁻¹²; Table 3; Supplementary Tables S2 and S3; Supplementary Fig. S3). Consistent three-way associations were also observed for CpGs and five other genes (RPS2, SURF6, ORMDL3, SNHG9, and SOWAHC), which have not been previously reported as pancreatic cancer susceptibility genes in GWAS or TWAS.
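The notion of a consistent direction across the three relationships can be made concrete with a small check: the sign of the methylation-risk association should equal the product of the signs of the methylation-expression correlation and the expression-risk association. The sketch below is an illustrative reading of that criterion with a hypothetical function name, not the procedure used in the analyses.

```python
import math


def consistent_direction(or_methylation, corr_meth_expr, or_expression):
    """Check sign consistency along methylation -> expression -> risk.

    or_methylation : OR for predicted methylation and cancer risk
    corr_meth_expr : correlation of predicted methylation with expression
    or_expression  : OR for predicted expression and cancer risk
    """
    def sign(x):
        return (x > 0) - (x < 0)

    direct = sign(math.log(or_methylation))
    via_expression = sign(corr_meth_expr) * sign(math.log(or_expression))
    return direct != 0 and direct == via_expression


# cg24267699 and ABO, using the values quoted in the text above:
# OR = 2.50, correlation = -0.62, expression OR = 0.89.
abo_consistent = consistent_direction(2.50, -0.62, 0.89)
```

For cg24267699, higher predicted methylation corresponds to lower predicted ABO expression, and lower ABO expression corresponds to higher risk, so the indirect path agrees with the positive direct association.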

Table 3.

Associations showing consistent direction of effect for predicted DNA methylation-predicted gene expression-pancreatic cancer risk pathway.

| CpG site | Chr | Position | Associated gene | Classification | OR (DNA methylation and pancreatic cancer risk) | P value (DNA methylation and pancreatic cancer risk) | Correlation coefficient (DNA methylation and gene expression) | Correlation P value | OR (gene expression and pancreatic cancer risk) | P value (gene expression and pancreatic cancer risk) |
|---|---|---|---|---|---|---|---|---|---|---|
| cg20930114 | | 110372285 | SOWAHC | Exonic | 1.94 | 1.28 × 10⁻⁵ | −0.516 | <0.001 | 0.64 | 8.30 × 10⁻⁴ |
| cg00878953 | | 136129875 | ABO | Downstream | 0.65 | 6.83 × 10⁻⁶ | 0.420 | <0.001 | 0.49 | 6.72 × 10⁻¹² |
| cg11879188 | | 136149908 | | Intronic | 2.28 | 4.84 × 10⁻¹⁴ | −0.350 | <0.001 | | |
| cg21160290 | | 136149941 | | Intronic | 1.99 | 8.87 × 10⁻¹⁷ | −0.344 | <0.001 | | |
| cg22535403 | | 136150032 | | Intronic | 2.29 | 4.63 × 10⁻¹⁷ | −0.369 | <0.001 | | |
| cg24267699 | | 136151359 | | Upstream | 2.50 | 1.33 × 10⁻²¹ | −0.620 | <0.001 | | |
| cg06818865ᵃ | | 136151958 | | Intergenic | 1.84 | 8.47 × 10⁻¹⁰ | −0.423 | <0.001 | | |
| cg06818865ᵃ | | 136151958 | SURF6 | Intergenic | 1.84 | 8.47 × 10⁻¹⁰ | −0.323 | <0.001 | 0.91 | 8.47 × 10⁻³ |
| cg02871659 | 16 | 2014063 | RPS2 | Intronic | 1.18 | 3.34 × 10⁻⁵ | −0.742 | <0.001 | 0.64 | 3.48 × 10⁻⁵ |
| cg18279742 | 16 | 2015703 | | Upstream | 1.20 | 2.89 × 10⁻⁵ | −0.739 | <0.001 | | |
| cg18279742 | 16 | 2015703 | SNHG9 | Downstream | 1.20 | 2.89 × 10⁻⁵ | 0.305 | <0.001 | 1.10 | 1.15 × 10⁻² |
| cg22833065 | 17 | 38095691 | ORMDL3 | Downstream | 0.59 | 3.14 × 10⁻⁵ | −0.831 | <0.001 | 1.15 | 2.58 × 10⁻⁴ |

ᵃThe same CpG site was annotated to two different genes.

Directly measured levels of associated CpGs in pancreatic tumor tissue versus benign pancreatic tissue

Of the 45 CpGs, 16 were directly captured by reduced representation bisulfite sequencing (RRBS) of 18 pancreatic tumor tissue specimens and 18 benign pancreatic tissue specimens. For two of them (cg04520704 and cg04633225), the significance of the difference in levels between tumor and benign tissues could not be determined. Among the others, six showed significantly different levels in pancreatic tumor tissue versus benign pancreatic tissue (Table 4). Encouragingly, the effect directions for all six were consistent with the findings from analyses using genetic instruments (Table 4).

Table 4.

CpGs showing consistent direction of effect for directly measured levels in pancreas tumor versus benign tissues and genetically predicted levels in blood of pancreatic cancer cases versus controls.

| CpG site | Chr | Position | Direction of association between genetically predicted levels and pancreatic cancer risk | Average level in benign pancreatic tissue | Standard deviation of levels in benign pancreatic tissue | Average level in pancreatic tumor tissue | Standard deviation of levels in pancreatic tumor tissue | P value comparing levels in pancreas tumor versus benign tissue |
|---|---|---|---|---|---|---|---|---|
| cg17804356 | | 200009927 | + | 0.02 | 0.04 | 0.12 | 0.15 | <0.0004 |
| cg20930114 | | 110372285 | + | 0.005 | 0.02 | 0.04 | 0.05 | 0.0004 |
| cg07380026 | | 1296007 | + | 0.24 | 0.18 | 0.54 | 0.20 | <0.0004 |
| cg01169778 | | 136038690 | + | 0.23 | 0.11 | 0.46 | 0.29 | 0.01 |
| cg22535403 | | 136150032 | + | 0.35 | 0.21 | 0.48 | 0.28 | 0.05 |
| cg21101465 | 13 | 28493404 | − | 0.36 | 0.19 | 0.27 | 0.22 | 0.02 |

To our knowledge, this is the first large-scale study to evaluate the relationship between genetically predicted DNA methylation levels and pancreatic cancer risk. We identified 45 CpGs whose predicted DNA methylation levels were significantly associated with pancreatic cancer risk at FDR < 0.05, including 15 CpGs located at five novel loci that have not been reported in previous GWAS. For the remaining 30 CpGs located at four known pancreatic cancer risk loci, the observed associations were substantially attenuated after adjusting for GWAS-identified risk SNPs, implying that the associations may be at least partly due to the reported risk SNPs. We found consistent directions of association in the DNA methylation-gene expression-pancreatic cancer risk pathway for 12 CpGs with six genes. Our findings were further supported by evidence of differential DNA methylation at six CpGs, based on their directly measured levels in pancreatic tumor versus benign tissue. Our study identified novel methylation biomarker candidates for pancreatic cancer and provided new information for understanding the etiology of pancreatic cancer, a highly lethal malignancy.
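The FDR threshold mentioned above can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure, which is a common choice but is not confirmed by this section, and the P values are hypothetical rather than results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical P values for five CpGs (not results from the study).
p = [0.001, 0.045, 0.02, 0.20, 0.008]
q = benjamini_hochberg(p)
significant = [qi < 0.05 for qi in q]
```

With these inputs, the CpG with P = 0.045 is no longer significant after adjustment, which shows how controlling the FDR is stricter than applying a nominal 0.05 cutoff to each test.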

Of the 45 associated CpGs, we were able to assess correlations between genetically predicted DNA methylation and gene expression levels for 17 CpGs with nine adjacent genes. Among the examined correlations, all were statistically significant except for the one between cg19586165 and STARD3. One possible explanation for this nonsignificant correlation is that STARD3, the gene most proximal to cg19586165, might not be the actual target gene. Additional strategies beyond simple statistical correlation are needed to verify its actual target gene. Of the eight linked genes correlated with predicted DNA methylation of the identified CpGs, six (ABO, RPS2, SURF6, ORMDL3, SNHG9, and SOWAHC) demonstrated significant associations with pancreatic cancer risk for their predicted expression. Among them, the ABO blood group gene, located at 9q34.2, has already been implicated as a potential target gene of pancreatic cancer risk SNPs in previous GWAS and TWAS (17, 20). Genotype-inferred non-O blood type has consistently been associated with an increased risk of pancreatic cancer compared with blood type O, which may be partly explained by differential expression of blood group antigens or by alterations in the systemic inflammatory state (39). SURF6 has previously been suggested as a potential pancreatic cancer biomarker, as indicated by a study comparing its expression level in malignant pancreatic cells with that in normal pancreatic duct cells or human papillomavirus-immortalized pancreatic duct epithelial cells (40). A higher expression level of SNHG9, a noncoding RNA, has been identified as a novel prognostic marker for pancreatic cancer (41). To the best of our knowledge, our study is the first to implicate a potential link between this gene and pancreatic cancer risk. Further functional studies are needed to better understand the potential regulatory effects of the identified CpGs on expression of the genes, and the link between expression of the genes and pancreatic cancer.

In this study, we systematically assessed relationships between genetically predicted DNA methylation in blood, genetically predicted expression of putative target genes in blood, and pancreatic cancer risk. For our analyses using genetic instruments, we used data generated from white blood cells rather than pancreatic tissue for several reasons. First, it is very challenging to acquire a large sample of pancreatic tissue from healthy subjects without pancreatic cancer. Information from pancreatic tumor-adjacent normal tissue would be less desirable because of the potential influence of somatic alterations on DNA methylation. Furthermore, biomarkers identified using white blood cell samples may have greater translational and practical utility for future risk assessment of pancreatic cancer than biomarkers in pancreas tissue, as it is impractical to obtain pancreas tissue from healthy subjects. We also acknowledge that, compared with pancreas specimens, blood samples may not be ideal for pinpointing the underlying etiology of pancreatic cancer development, given possible tissue-specific DNA methylation patterns. However, it is also worth noting that high concordance of the genetically regulated component of DNA methylation across several tissue types has been reported for a large number of CpGs (10, 42). In this study, we compared the directly measured levels of a proportion of the associated CpGs in pancreatic tumor tissue versus benign pancreatic tissue. For this comparison, the overall DNA methylation levels, influenced by both genetic and nongenetic factors, were assessed; this differs from the analyses using genetic instruments, in which only the genetically regulated components of DNA methylation levels were evaluated. Although the sample size is relatively small (18 vs. 18), we were still able to observe significant differences for six of the limited number of associated CpGs that were captured by RRBS. Unlike The Cancer Genome Atlas (TCGA) study, in which methylation data are available only for pancreatic tumor and tumor-adjacent normal tissue from patients with pancreatic cancer, our comparison used histologically normal pancreas tissue from subjects without pancreatic cancer as the control group, which represents a better design than datasets such as TCGA.

Our study has several strengths. First, we used datasets with relatively large sample sizes for both methylation prediction model building (N = 1,595) and the main association analyses for pancreatic cancer risk (8,280 cases and 6,728 controls), which enabled a well-powered assessment of the DNA methylation-pancreatic cancer risk associations. Second, our study design, which uses genetic instruments to predict DNA methylation, reduces several biases that are common in traditional epidemiological studies, such as residual confounding and reverse causality. In addition, by integrating multi-omics data on DNA methylation and gene expression from various resources, we were able to further verify our findings by examining the consistency of associations in the DNA methylation-gene expression-pancreatic cancer risk pathway for the significant CpGs, which may further contribute to the etiologic understanding of pancreatic cancer. The performance of our models was externally validated in an independent WHI dataset genotyped on a different platform (Illumina, vs. Affymetrix in the FHS dataset), supporting the utility of our prediction models across platforms. Finally, besides evidence from analyses using genetic instruments, we found additional evidence for some of the identified CpGs using their directly measured levels in pancreatic tissue, further supporting the relevance of the identified CpGs to pancreatic cancer. Although the sample size for this analysis is relatively small, our comparison of tissue samples from pancreatic cancer cases and non-pancreatic cancer controls overcomes a potential limitation of many other studies (e.g., The Cancer Genome Atlas) that compare tumor samples with tumor-adjacent normal tissue samples from cases.
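The idea of a genetic instrument for DNA methylation can be sketched as follows. The sketch assumes that each CpG's prediction model is linear in SNP allele dosages, which this section does not describe, and the SNP identifiers, weights, and intercept are hypothetical placeholders.

```python
def predict_methylation(dosages, weights, intercept=0.0):
    """Weighted sum of allele dosages (0 to 2) for the SNPs in one CpG's model."""
    missing = set(weights) - set(dosages)
    if missing:
        raise KeyError(f"missing genotype dosages for: {sorted(missing)}")
    return intercept + sum(w * dosages[snp] for snp, w in weights.items())

# Hypothetical SNP weights for one CpG; rsA, rsB, and rsC are placeholder IDs.
weights = {"rsA": 0.12, "rsB": -0.05, "rsC": 0.30}
# One subject's imputed dosages; SNPs outside the model (rsD) are ignored.
subject = {"rsA": 2.0, "rsB": 1.0, "rsC": 0.0, "rsD": 1.0}
predicted = predict_methylation(subject, weights, intercept=0.40)
```

Because the predicted value depends only on germline genotypes, it cannot be altered by the disease or by lifestyle exposures, which is the basis for the reduced reverse causality and confounding noted above.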

Several potential limitations need to be acknowledged for appropriate interpretation of our findings. First, the associations identified in this study do not necessarily imply a causal role of the CpGs in pancreatic cancer. As with TWAS, although our findings will be useful for prioritizing candidate DNA methylation biomarkers, false-positive findings could exist among the identified associations (43). There are several potential reasons for this, such as correlated DNA methylation across individuals, correlated predicted DNA methylation, and shared variants (43). In our study, multiple identified CpGs are located at the same loci. Future functional investigation will better characterize whether the identified CpGs play a causal role in pancreatic tumorigenesis. Second, during DNA methylation genetic prediction model building, because of a lack of data, we were not able to adjust for additional variables, including established pancreatic cancer risk factors such as smoking, alcohol drinking, body mass index, and diabetes status. Future work developing DNA methylation genetic prediction models that adjust for these additional variables is warranted to validate our findings. Third, although we were able to show that a proportion of the pancreatic cancer-associated CpGs we identified had differential levels in pancreatic tumor versus benign tissue, further work directly comparing DNA methylation levels of these CpGs in prediagnostic blood of pancreatic cancer cases and controls is warranted to further validate our findings. Fourth, the PanScan III data on dbGaP contained data only for cases and not for controls. To improve statistical power, we included the PanScan III cases in the analyses. Previous work suggested that imputing datasets genotyped on different platforms before merging could generate slightly more SNPs than imputing after combining the datasets (44). In this study, we merged genotyped data across cases and controls of PanScan I, II, and III along with PanC4 and then imputed the data together. Although incorporating case-only data from PanScan III could be of potential concern, we carefully compared the association results in different subgroups (Supplementary Table S1), and the estimates were quite robust, suggesting that this issue is of limited concern and that our design is appropriate. Finally, in this study, we evaluated ANNOVAR-annotated genes as candidate target genes of associated CpG sites for correlation analysis. Given the recognized chromatin interactions and long-range regulation of gene expression in the human genome, it is possible that for some CpGs the target genes are not the nearest genes. Further work is warranted to better characterize potential target genes of our identified CpGs using approaches beyond simple statistical correlation analyses.

In summary, in a large-scale study, we identified 45 CpGs whose genetically predicted DNA methylation showed significant associations with pancreatic cancer risk, including 15 at five novel loci showing an association independent of known risk variants. We further observed consistent directions of association in the DNA methylation-gene expression-pancreatic cancer risk pathway. We found differential DNA methylation at six of the identified CpGs based on their measured levels in pancreatic tumor versus benign tissue. The pancreatic cancer risk-associated CpGs identified in this study could be investigated in future studies with direct measurement of circulating DNA methylation levels to examine their potential utility in pancreatic cancer risk assessment.

The funding organization had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or decision to submit the manuscript for publication.

J.B. Kisiel reports grants from Exact Sciences during the conduct of the study; also has a patent 10704107 issued and licensed to Exact Sciences. D.W. Mahoney reports a patent for 10704107 Detecting Gastrointestinal Neoplasms with royalties paid from Exact Sciences; and Mayo Clinic and Exact Sciences own intellectual property under which D.W. Mahoney is listed as an inventor and may receive royalties through a contracted services agreement between Mayo Clinic and Exact Sciences. W.R. Taylor reports other support from Exact Sciences during the conduct of the study; other support from Exact Sciences outside the submitted work; also has a patent 10704107 issued, licensed, and with royalties paid from Exact Sciences. X.-O. Shu reports grants from NCI during the conduct of the study. L. Wu reports grants from NIH during the conduct of the study. No disclosures were reported by the other authors.

J. Zhu: Data curation, formal analysis, investigation, methodology, writing–original draft. Y. Yang: Data curation, formal analysis, methodology, writing–original draft. J.B. Kisiel: Data curation, funding acquisition, validation, writing–review and editing. D.W. Mahoney: Validation, investigation, writing–review and editing. D.S. Michaud: Investigation, writing–review and editing. X. Guo: Data curation, writing–review and editing. W.R. Taylor: Formal analysis, writing–review and editing. X.-O. Shu: Writing–review and editing. X. Shu: Data curation, writing–review and editing. D. Liu: Writing–review and editing. B. Li: Writing–review and editing. R. Tao: Writing–review and editing. Q. Cai: Writing–review and editing. W. Zheng: Writing–review and editing. J. Long: Conceptualization, resources, supervision, investigation, methodology, writing–review and editing. L. Wu: Conceptualization, resources, data curation, formal analysis, supervision, funding acquisition, validation, investigation, methodology, writing–original draft, project administration.

The authors would like to thank all of the individuals for their participation in the parent studies of PanScan/PanC4 consortia and all the researchers, clinicians, technicians, and administrative staff for their contribution to the studies. L. Wu is supported by the University of Hawaii Cancer Center, and NCI R00 CA218892. D. Liu is partially supported by the Harbin Medical University Cancer Hospital. Data on CpG positions in the independent case and control tissues was funded in part by Exact Sciences (Madison, WI). The PanScan study was funded in whole or in part with federal funds from the NCI, US NIH under contract number HHSN261200800001E. Additional support was received from NIH/NCI K07 CA140790, the American Society of Clinical Oncology Conquer Cancer Foundation, the Howard Hughes Medical Institute, the Lustgarten Foundation, the Robert T. and Judith B. Hale Fund for Pancreatic Cancer Research and Promises for Purple. A full list of acknowledgments for each participating study is provided in the Supplementary Note of the manuscript with PubMed ID: 25086665. For the PanC4 GWAS study, the patients and controls were derived from the following PANC4 studies: Johns Hopkins National Familial Pancreas Tumor Registry, Mayo Clinic Biospecimen Resource for Pancreas Research, Ontario Pancreas Cancer Study (OPCS), Yale University, MD Anderson Case Control Study, Queensland Pancreatic Cancer Study, University of California San Francisco Molecular Epidemiology of Pancreatic Cancer Study, International Agency of Cancer Research and Memorial Sloan Kettering Cancer Center. This work is supported by NCI R01CA154823. Genotyping services were provided by the Center for Inherited Disease Research (CIDR). CIDR is fully funded through a federal contract from the NIH to Johns Hopkins University, contract number HHSN2682011000111.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA A Cancer J Clin 2020;70:7–30.
2. Rahib L, Smith BD, Aizenberg R, Rosenzweig AB, Fleshman JM, Matrisian LM. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res 2014;74:2913–21.
3. Loosen SH, Neumann UP, Trautwein C, Roderburg C, Luedde T. Current and future biomarkers for pancreatic adenocarcinoma. Tumour Biol 2017;39:1010428317692231.
4. Eissa MAL, Lerner L, Abdelfatah E, Shankar N, Canner JK, Hasan NM, et al. Promoter methylation of ADAMTS1 and BNC1 as potential biomarkers for early detection of pancreatic cancer in blood. Clin Epigenetics 2019;11:59.
5. Aronica A, Avagliano L, Caretti A, Tosi D, Bulfamante GP, Trinchera M. Unexpected distribution of CA19.9 and other type 1 chain Lewis antigens in normal and cancer tissues of colon and pancreas: importance of the detection method and role of glycosyltransferase regulation. Biochim Biophys Acta Gen Subj 2017;1861:3210–20.
6. Schott S, Yang R, Stöcker S, Canzian F, Giese N, Bugert P, et al. HYAL2 methylation in peripheral blood as a potential marker for the detection of pancreatic cancer: a case control study. Oncotarget 2017;8:67614–25.
7. Mardin WA, Ntalos D, Mees ST, Spieker T, Senninger N, Haier J, et al. SERPINB5 promoter hypomethylation differentiates pancreatic ductal adenocarcinoma from pancreatitis. Pancreas 2016;45:743–7.
8. Melson J, Li Y, Cassinotti E, Melnikov A, Boni L, Ai J, et al. Commonality and differences of methylation signatures in the plasma of patients with pancreatic cancer and colorectal cancer. Int J Cancer 2014;134:2656–62.
9. Bell JT, Pai AA, Pickrell JK, Gaffney DJ, Pique-Regi R, Degner JF, et al. DNA methylation patterns associate with genetic and gene expression variation in HapMap cell lines. Genome Biol 2011;12:R10.
10. Hannon E, Weedon M, Bray N, O'Donovan M, Mill J. Pleiotropic effects of trait-associated genetic variation on DNA methylation: utility for refining GWAS loci. Am J Hum Genet 2017;100:954–9.
11. Grundberg E, Meduri E, Sandling JK, Hedman ÅK, Keildson S, Buil A, et al. Global analysis of DNA methylation variation in adipose tissue from twins reveals links to disease-associated variants in distal regulatory elements. Am J Hum Genet 2013;93:876–90.
12. McRae AF, Powell JE, Henders AK, Bowdler L, Hemani G, Shah S, et al. Contribution of genetic variation to transgenerational inheritance of DNA methylation. Genome Biol 2014;15:R73.
13. Zhang M, Wang Z, Obazee O, Jia J, Childs EJ, Hoskins J, et al. Three new pancreatic cancer susceptibility signals identified on chromosomes 1q32.1, 5p15.33 and 8q24.21. Oncotarget 2016;7:66328–43.
14. Wolpin BM, Rizzato C, Kraft P, Kooperberg C, Petersen GM, Wang Z, et al. Genome-wide association study identifies multiple susceptibility loci for pancreatic cancer. Nat Genet 2014;46:994–1000.
15. Petersen GM, Amundadottir L, Fuchs CS, Kraft P, Stolzenberg-Solomon RZ, Jacobs KB, et al. A genome-wide association study identifies pancreatic cancer susceptibility loci on chromosomes 13q22.1, 1q32.1 and 5p15.33. Nat Genet 2010;42:224–8.
16. Amundadottir L, Kraft P, Stolzenberg-Solomon RZ, Fuchs CS, Petersen GM, Arslan AA, et al. Genome-wide association study identifies variants in the ABO locus associated with susceptibility to pancreatic cancer. Nat Genet 2009;41:986–90.
17. Klein AP, Wolpin BM, Risch HA, Stolzenberg-Solomon RZ, Mocci E, Zhang M, et al. Genome-wide meta-analysis identifies five new susceptibility loci for pancreatic cancer. Nat Commun 2018;9:556.
18. Childs EJ, Mocci E, Campa D, Bracci PM, Gallinger S, Goggins M, et al. Common variation at 2p13.3, 3q29, 7p13 and 17q25.1 associated with susceptibility to pancreatic cancer. Nat Genet 2015;47:911–6.
19. Chen F, Childs EJ, Mocci E, Bracci P, Gallinger S, Li D, et al. Analysis of heritability and genetic architecture of pancreatic cancer: a PanC4 study. Cancer Epidemiol Biomarkers Prev 2019;28:1238–45.
20. Zhong J, Jermusyk A, Wu L, Hoskins JW, Collins I, Mocci E, et al. A transcriptome-wide association study (TWAS) identifies novel candidate susceptibility genes for pancreatic cancer. J Natl Cancer Inst 2020;112:1003–12.
21. Yang Y, Wu L, Shu X, Lu Y, Shu XO, Cai Q, et al. Genetic data from nearly 63,000 women of European descent predicts DNA methylation biomarkers and epithelial ovarian cancer risk. Cancer Res 2019;79:505–17.
22. Yang Y, Wu L, Shu XO, Cai Q, Shu X, Li B, et al. Genetically predicted levels of DNA methylation biomarkers and breast cancer risk: data from 228,951 women of European descent. J Natl Cancer Inst 2019;112:395–404.
23. Wu L, Yang Y, Guo X, Shu XO, Cai Q, Shu X, et al. An integrative multi-omics analysis to identify candidate DNA methylation biomarkers related to prostate cancer risk. Nat Commun 2020;11:3905.
24. Aryee MJ, Jaffe AE, Corrada-Bravo H, Ladd-Acosta C, Feinberg AP, Hansen KD, et al. Minfi: a flexible and comprehensive bioconductor package for the analysis of Infinium DNA methylation microarrays. Bioinformatics 2014;30:1363–9.
25. Wu L, Shi W, Long J, Guo X, Michailidou K, Beesley J, et al. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat Genet 2018;50:968–78.
26. Wu L, Wang J, Cai Q, Cavazos TB, Emami NC, Long J, et al. Identification of novel susceptibility loci and genes for prostate cancer risk: a transcriptome-wide association study in over 140,000 European descendants. Cancer Res 2019;79:3192–204.
27. Wu L, Shi W, Long J, Guo X, Michailidou K, Beesley J, et al. A transcriptome-wide association study of 229,000 women identifies new candidate susceptibility genes for breast cancer. Nat Genet 2018;50:968–78.
28. Gamazon ER, Wheeler HE, Shah KP, Mozaffari SV, Aquino-Michaels K, Carroll RJ, et al. A gene-based association method for mapping traits using reference transcriptome data. Nat Genet 2015;47:1091–8.
29. Huan T, Joehanes R, Song C, Peng F, Guo Y, Mendelson M, et al. Genome-wide identification of DNA methylation QTLs in whole blood highlights pathways for cardiovascular disease. Nat Commun 2019;10:4267.
30. Wheeler HE, Shah KP, Brenner J, Garcia T, Aquino-Michaels K, Cox NJ, et al. Survey of the heritability and sparse architecture of gene expression traits across human tissues. PLoS Genet 2016;12:e1006423.
31. McRae AF, Marioni RE, Shah S, Yang J, Powell JE, Harris SE, et al. Identification of 55,000 replicated DNA Methylation QTL. Sci Rep 2018;8:17605.
32. The Haplotype Reference Consortium. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet 2016;48:1279–83.
33. Delaneau O, Marchini J, Zagury JF. A linear complexity phasing method for thousands of genomes. Nat Methods 2012;9:179–81.
34. Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet 2009;5:e1000529.
35. Barbeira AN, Dickinson SP, Bonazzola R, Zheng J, Wheeler HE, Torres JM, et al. Exploring the phenotypic consequences of tissue specific gene expression variation inferred from GWAS summary statistics. Nat Commun 2018;9:1825.
36. Yang J, Ferreira T, Morris AP, Medland SE, Madden PAF, Heath AC, et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat Genet 2012;44:369–75.
37. Kisiel JB, Raimondo M, Taylor WR, Yab TC, Mahoney DW, Sun Z, et al. New DNA methylation markers for pancreatic cancer: discovery, tissue validation, and pilot testing in pancreatic juice. Clin Cancer Res 2015;21:4473–81.
38. Sun Z, Baheti S, Middha S, Kanwar R, Zhang Y, Li X, et al. SAAP-RRBS: streamlined analysis and annotation pipeline for reduced representation bisulfite sequencing. Bioinformatics 2012;28:2180–1.
39. Wolpin BM, Chan AT, Hartge P, Chanock SJ, Kraft P, Hunter DJ, et al. ABO blood group and the risk of pancreatic cancer. JNCI J Natl Cancer Inst 2009;101:424–31.
40. Jones S, Zhang X, Parsons DW, Lin JCH, Leary RJ, Angenendt P, et al. Core signaling pathways in human pancreatic cancers revealed by global genomic analyses. Science 2008;321:1801–6.
41. Zhang B, Li C, Sun Z. Long non-coding RNA LINC00346, LINC00578, LINC00673, LINC00671, LINC00261, and SNHG9 are novel prognostic markers for pancreatic cancer. Am J Transl Res 2018;10:2648–58.
42. Stueve TR, Li WQ, Shi J, Marconett CN, Zhang T, Yang C, et al. Epigenome-wide analysis of DNA methylation in lung tissue shows concordance with blood studies and identifies tobacco smoke-inducible enhancers. Hum Mol Genet 2017;26:3014–27.
43. Wainberg M, Sinnott-Armstrong N, Mancuso N, Barbeira AN, Knowles DA, Golan D, et al. Opportunities and challenges for transcriptome-wide association studies. Nat Genet 2019;51:592–9.
44. van Iperen EPA, Hovingh GK, Asselbergs FW, Zwinderman AH. Extending the use of GWAS data by combining data from different genetic platforms. PLoS One 2017;12:e0172082.