Genome-Wide Association and Two-Sample Mendelian Randomization Analyses of Plasma Ghrelin and Gastrointestinal Cancer Risk

Abstract Background: Observational studies have suggested that the gut hormone ghrelin is an early marker of future risk of developing gastrointestinal cancer. However, whether ghrelin is a causal risk factor remains unclear. We conducted a genome-wide association study (GWAS) of plasma ghrelin and used Mendelian randomization (MR) to investigate the possible causal association between ghrelin and gastrointestinal cancer risk. Methods: Genetic variants associated with plasma ghrelin were identified in a GWAS comprising 10,742 Swedish adults in the discovery (N = 6,259) and replication (N = 4,483) cohorts. The association between ghrelin and gastrointestinal cancer was examined through a two-sample MR analysis using the identified genetic variants as instruments and GWAS data from the UK Biobank, FinnGen, and a colorectal cancer consortium. Results: GWAS found associations between multiple genetic variants within ±200 kb of the GHRL gene and plasma ghrelin. A two-sample MR analysis revealed that genetically predicted higher plasma ghrelin levels were associated with a lower risk of gastrointestinal cancer in UK Biobank and in a meta-analysis of the UK Biobank and FinnGen studies. The combined OR per approximate doubling of genetically predicted plasma ghrelin was 0.91 (95% confidence interval, 0.85–0.99; P = 0.02). Colocalization analysis revealed limited evidence of shared causal variants for plasma ghrelin and gastrointestinal cancer at the GHRL locus (posterior probability H4 = 24.5%); however, this analysis was likely underpowered. Conclusions: Our study provides evidence in support of a possible causal association between higher plasma ghrelin levels and a reduced risk of gastrointestinal cancer. Impact: Elevated plasma ghrelin levels might reduce the risk of gastrointestinal cancer.


Introduction
Ghrelin is a 28-amino acid peptide produced by the enteroendocrine cells of the gastrointestinal tract, particularly in the stomach (1–3). In addition to its growth hormone-releasing activity (1), ghrelin regulates appetite, energy homeostasis, gastric acid secretion, and gut motility (4–6). Accumulating evidence further indicates that ghrelin modulates cell proliferation and apoptosis and may affect the development of gastrointestinal cancer (3, 5, 7, 8). Several observational studies have found that low circulating levels of ghrelin are associated with an increased risk of esophageal (9–11), stomach (10, 12, 13), and colorectal cancers (in the years approaching diagnosis; ref. 14). In contrast, other observational studies reported that low ghrelin levels were associated with a reduced risk of esophageal cancer (13) but were not clearly associated with colorectal cancer (15, 16). Given the observational design of the available studies on ghrelin levels and gastrointestinal cancer, it remains unknown whether the reported associations are causal or driven by biases inherent in observational studies, such as confounding and reverse causation.
Mendelian randomization (MR) is an increasingly used study design that can improve causal inference in observational studies by leveraging genetic variants that are strongly associated with an exposure (e.g., ghrelin) as instruments to estimate the causal effect of the exposure on an outcome. Because genetic variants are fixed and alleles are randomly allocated at conception, MR studies are less susceptible than classical observational studies to reverse causation bias and confounding from self-selected behaviors and environmental factors.
First, we conducted a genome-wide association study (GWAS) to identify genetic variants associated with ghrelin. Second, we used the identified genetic variants in a two-sample MR framework to investigate the association between lifelong higher ghrelin levels and the risk of gastrointestinal cancer.

Participants in GWAS analysis
The GWAS was based on data from the Swedish Infrastructure for Medical Population-Based Life-Course and Environmental Research (SIMPLER; https://www.simpler4health.se/), comprising two large longitudinal cohorts and a biobank. In addition, SIMPLER encompasses two clinical subcohorts, each with participants born between 1920 and 1952 from neighboring Swedish counties. Participants in the clinical subcohorts were randomly chosen from two larger cohorts (i.e., the Swedish Mammography Cohort and the Cohort of Swedish Men) and invited to participate in a health examination. Participants from the clinical subcohorts who had accurate GWAS data, were of European descent according to the GWAS results, and had protein data were eligible for inclusion in the present GWAS, which encompassed a discovery cohort of 6,259 women and men who resided in Västmanland County and provided a blood sample between 2010 and 2019, and a replication cohort of 4,483 women who lived in Uppsala County and provided a blood sample between 2003 and 2009.

Ghrelin measurement
Blood samples were collected in the morning after 12 hours of overnight fasting. After 5 to 10 minutes of storage at room temperature, the samples were centrifuged at room temperature at 310 × g for 11 minutes, after which the buffy coat was extracted. The samples were immediately centrifuged at 1,615 × g for 11 minutes at 4 °C. Subsequently, the plasma samples were aliquoted into multiple tubes and stored at −80 °C until analysis. The samples were light-protected from the time of blood collection and sample preparation until freezing.
Total plasma ghrelin concentration (UniProt Q9UBU3) was measured using a high-throughput multiplex immunoassay (Olink Proseek Multiplex CVD II; Olink Bioscience, Uppsala, Sweden), which returns normalized protein expression (NPX) values on a log2 scale standardized for each analysis plate. Olink's proximity extension assay technology uses pairs of antibodies equipped with DNA reporter molecules to produce DNA amplicons, which are subsequently quantified using the Fluidigm BioMark HD real-time polymerase chain reaction platform (17, 18). As only correctly matched antibody pairs produce a signal, the proximity extension assay technique has an accuracy advantage over conventional multiplex immunoassays. Olink NPX Manager software was used for data analysis. A one-unit increment in NPX corresponds to an approximate doubling of measured plasma ghrelin levels. The within- and between-run precision coefficients of variation were 9% and 16%, respectively. More details on the protein analyses have been reported previously (19).
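Because NPX is expressed on a log2 scale, a difference of d NPX units corresponds to a fold change of roughly 2^d in the measured relative protein level. The following minimal Python sketch illustrates that arithmetic only; the function name is hypothetical and is not part of the Olink software.

```python
def npx_difference_to_fold_change(delta_npx: float) -> float:
    """Fold change in relative protein level implied by an NPX difference.

    NPX is a relative log2-scale unit, so the conversion is 2 ** delta.
    The result is approximate because NPX is not an absolute concentration.
    """
    return 2.0 ** delta_npx


# One NPX unit higher is about a doubling; one unit lower is about a halving.
doubling = npx_difference_to_fold_change(1.0)
halving = npx_difference_to_fold_change(-1.0)
```

This is why the MR estimates in this study are reported per approximate doubling of plasma ghrelin.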

Genotyping and GWAS analysis
Details of genotyping and GWAS quality control for the discovery and replication cohorts have been previously described (19). The genetic dataset comprised approximately 7.8 million DNA markers. The GWAS analysis was conducted using linear regression in SNPTEST (20), assuming an additive genetic model and with adjustment for age, sex, and five genetic principal components. Genetic variants associated with plasma ghrelin at P < 5 × 10⁻⁸ in the discovery cohort were tested in the replication cohort. SNPs associated with plasma ghrelin in the same direction (P < 0.05) in the replication cohort were considered replicated.
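The per-variant test described above can be illustrated with a small Python sketch. It is not SNPTEST and does not reproduce the study's pipeline: it fits an ordinary least squares model of the phenotype on allele dosage (additive coding, 0 to 2) plus covariates, and uses a normal approximation for the P value. The simulated cohort, its allele frequency, and the seven placeholder covariate columns (standing in for age, sex, and five principal components) are assumptions made purely for illustration.

```python
import math

import numpy as np


def snp_association(dosage, phenotype, covariates):
    """Additive-model linear regression for one SNP.

    Returns (beta, standard error, two-sided P) for the dosage term,
    adjusted for the supplied covariates. P uses a normal approximation.
    """
    n = phenotype.shape[0]
    design = np.column_stack([np.ones(n), dosage, covariates])
    coef, *_ = np.linalg.lstsq(design, phenotype, rcond=None)
    resid = phenotype - design @ coef
    dof = n - design.shape[1]
    sigma2 = float(resid @ resid) / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    beta = float(coef[1])
    se = float(np.sqrt(cov[1, 1]))
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p_value


def simulate_cohort(n=5000, maf=0.05, true_beta=1.0, seed=0):
    """Hypothetical toy data: dosage, phenotype, and 7 placeholder covariates."""
    rng = np.random.default_rng(seed)
    dosage = rng.binomial(2, maf, size=n).astype(float)
    covariates = rng.normal(size=(n, 7))
    phenotype = true_beta * dosage + 0.1 * covariates[:, 0] + rng.normal(size=n)
    return dosage, phenotype, covariates


dosage, phenotype, covariates = simulate_cohort()
beta, se, p_value = snp_association(dosage, phenotype, covariates)
genome_wide_significant = p_value < 5e-8  # threshold used in the discovery cohort
```

In a real GWAS this regression is repeated for every marker, which is why the stringent genome-wide threshold is applied.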

Two-sample MR analysis
The SNPs identified in the GWAS were used as instrumental variables for plasma ghrelin levels. Two sets of genetic instruments were used in this study. For the first genetic instrument, we selected SNPs in low linkage disequilibrium (LD; R² < 0.1) and within ±200 kb of the GHRL gene. SNPs with low LD were identified by clumping (based on Europeans from the 1000 Genomes reference panel) implemented using the TwoSampleMR package in R (21). We applied the multiplicative random-effects inverse-variance weighted method and adjusted for correlations between SNPs (22). The correlation matrix was obtained from 367,643 unrelated adults of European descent in the UK Biobank. As a second genetic instrument, we selected the cis-SNP with the strongest association (lowest P value) with plasma ghrelin, and the MR estimate was computed by dividing the beta coefficient for the SNP-outcome association by the beta coefficient for the SNP-ghrelin association.
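Both estimators described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the R code used in the study: the single-SNP estimate is the ratio of coefficients with a first-order standard error that ignores uncertainty in the SNP-ghrelin estimate, and the multi-SNP estimate is a generalized weighted regression through the origin that accounts for SNP correlations, with the standard error inflated by an overdispersion factor floored at 1 (a common convention; whether the software used here applies the floor is not stated in the text). Function and argument names are hypothetical.

```python
import numpy as np


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-SNP MR estimate: SNP-outcome beta divided by SNP-exposure beta."""
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)  # first-order approximation
    return estimate, se


def ivw_correlated(beta_exposure, beta_outcome, se_outcome, rho):
    """Inverse-variance weighted MR estimate allowing for correlated SNPs.

    rho is the SNP-SNP correlation matrix. The SE is scaled by the residual
    overdispersion (multiplicative random effects), never below 1.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    se = np.asarray(se_outcome, dtype=float)
    omega = np.outer(se, se) * np.asarray(rho, dtype=float)
    omega_inv = np.linalg.inv(omega)
    denom = bx @ omega_inv @ bx
    estimate = (bx @ omega_inv @ by) / denom
    resid = by - estimate * bx
    phi = max(float(resid @ omega_inv @ resid) / (len(bx) - 1), 0.0)
    se_est = np.sqrt(1.0 / denom) * max(1.0, np.sqrt(phi))
    return float(estimate), float(se_est)
```

Exponentiating an estimate gives the odds ratio per one-unit higher genetically predicted NPX, that is, per approximate doubling of plasma ghrelin.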
We used the two-sample MR design to examine the associations of plasma ghrelin proxied by the four-SNP and single-SNP instruments with any gastrointestinal cancer (primary outcome) and specific cancers in the gastrointestinal tract (ancillary outcomes) using outcome data from the UK Biobank (as described previously; ref. 23) as well as publicly available summary genetic data from the FinnGen study (release R8; refs. 24, 25). The association estimates were adjusted for age, sex, and 10 genetic principal components. The outcome classification is provided in Supplementary Table S1 and the number of cases for each outcome is shown in Supplementary Table S2. In the UK Biobank, there were 11,952 individuals diagnosed with gastrointestinal cancer, and in FinnGen, there were 9,822 such cases. For colorectal cancer, we additionally used summary-level data from a meta-analysis of 16 GWASs, comprising 73,673 cases and 86,854 controls of European ancestries (26). Studies included in the colorectal cancer meta-analysis dataset used in the present study are presented in Supplementary Table S3.
An online tool was used to calculate the statistical power of the MR analysis (27). MR associations with P < 0.05 were regarded as statistically significant. MR analyses were performed using the MendelianRandomization (28) and TwoSampleMR (21) packages in R. Meta-analysis of results from the UK Biobank and FinnGen studies was conducted using the metan command in Stata (College Station, Texas), and heterogeneity between the two studies was quantified using the I² statistic (29).
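As an illustration of the pooling step, the following Python sketch combines study-specific log odds ratios by inverse-variance weighting and computes I² from Cochran's Q. It assumes a fixed-effect model, which the text does not specify, and it is not the Stata metan implementation; the function name and the example inputs are hypothetical.

```python
import math


def fixed_effect_meta(estimates, standard_errors):
    """Inverse-variance weighted pooling of log odds ratios.

    Returns (pooled estimate, pooled SE, I-squared in percent).
    I-squared = max(0, (Q - df) / Q) * 100, with Q being Cochran's Q.
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, pooled_se, i_squared


# Hypothetical inputs: two studies with log odds ratios and standard errors.
pooled, pooled_se, i_squared = fixed_effect_meta([-0.12, -0.05], [0.05, 0.06])
pooled_or = math.exp(pooled)
```

With only two studies, I² is imprecise, so a value of 0% should be read as absence of detectable heterogeneity rather than proof of homogeneity.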

Colocalization analysis
Colocalization analysis, using the coloc package in R (30), was conducted as a sensitivity analysis to evaluate whether plasma ghrelin and gastrointestinal cancer share the same causal genetic variant at the GHRL locus (±200 kb window around the GHRL gene). Such an analysis can indicate whether the phenotypes are influenced by different causal genetic variants that are in LD, indicative of horizontal pleiotropy (i.e., when a genetic variant affects the outcome through a pathway that does not involve the studied exposure) and violation of the exclusion restriction assumption (31). H4 > 50% was considered supportive of colocalization of the two phenotypes.
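To make the H0 to H4 posterior probabilities concrete, the sketch below follows the general approach of approximate-Bayes-factor colocalization under a single-causal-variant assumption: per-SNP Wakefield approximate Bayes factors for each trait are combined with prior probabilities into posteriors for no association (H0), association with trait 1 only (H1), trait 2 only (H2), two distinct causal variants (H3), and one shared causal variant (H4). It is a from-scratch Python illustration, not the coloc R package. The priors (1e-4, 1e-4, 1e-5) and the prior effect-size standard deviations passed to log_abf are assumptions, since the text does not report the settings used, and at least two SNPs are required.

```python
import numpy as np


def log_abf(beta, se, prior_sd):
    """Per-SNP log approximate Bayes factor (Wakefield) from beta and SE."""
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (np.log(1.0 - r) + r * z ** 2)


def logsumexp(x):
    """Numerically stable log(sum(exp(x)))."""
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))


def logdiffexp(a, b):
    """log(exp(a) - exp(b)), assuming a > b."""
    return a + np.log1p(-np.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for two traits at one locus."""
    both = labf1 + labf2
    l_h0 = 0.0
    l_h1 = np.log(p1) + logsumexp(labf1)
    l_h2 = np.log(p2) + logsumexp(labf2)
    # All SNP pairs minus the pairs where both traits share the same SNP.
    l_h3 = np.log(p1) + np.log(p2) + logdiffexp(
        logsumexp(labf1) + logsumexp(labf2), logsumexp(both)
    )
    l_h4 = np.log(p12) + logsumexp(both)
    logs = np.array([l_h0, l_h1, l_h2, l_h3, l_h4])
    return np.exp(logs - logsumexp(logs))
```

A low H4 together with a low H3, as can occur when the outcome signal at the locus is weak, mainly reflects limited power rather than evidence against a shared variant.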

Ethics approval and consent to participate
The studies included in this analysis were approved by a relevant ethical review authority, and the participants provided written informed consent. The Swedish Ethical Review Authority approved the analyses for this study. All methods were performed in accordance with the relevant guidelines and regulations.

Data availability
Summary statistics for the genetic variants used in this study are shown in Table 1. Data from the UK Biobank are accessible upon application (https://www.ukbiobank.ac.uk/). Data from the FinnGen study are publicly available (https://finngen.gitbook.io/documentation/). Colorectal cancer summary-level data were obtained from a colorectal cancer GWAS consortium (26).

GWAS
The mean (± standard deviation) age of participants in the discovery cohort (34.7% women) and replication cohort (100% women) was 73.8 (5.3) and 67.1 (6.8) years, respectively, and the corresponding body mass index was 26.6 (4.0) kg/m² and 25.9 (4.3) kg/m². In the discovery cohort, 28 SNPs around the GHRL locus (within ±200 kb of the gene) on chromosome 3 were associated with plasma ghrelin at P < 5 × 10⁻⁸ (Table 1). These SNPs were all associated with plasma ghrelin in the same direction in the replication cohort (P < 0.001) as well as in both cohorts combined (P < 9 × 10⁻¹¹; Table 1).

Selection and performance of the two genetic instruments
The first genetic instrument was based on four SNPs in low LD and within ±200 kb of the GHRL gene. The instrument explained 4.6% of the variance in plasma ghrelin levels in the UK Biobank study when accounting for the correlations among SNPs. The inclusion of more SNPs [in modest to high LD (R² > 0.1)] as instrumental variables did not increase the explained phenotypic variance when accounting for the correlations. Power was high (≥99% to detect a significant OR of 0.8 or 1.2) in MR analysis of any gastrointestinal cancer and colorectal cancer, but low in MR analysis of other specific gastrointestinal cancers (Supplementary Table S2). The strongest SNP (smallest P value), rs34911341 in GHRL, was used as a secondary genetic instrument.

Two-sample MR analysis
Higher plasma ghrelin levels proxied by the four-variant genetic instrument were associated with a statistically significant reduction in the risk of gastrointestinal cancer in the UK Biobank and in a meta-analysis of the two studies; the association was inverse but non-significant in FinnGen (Fig. 1). The OR per approximate doubling of genetically predicted plasma ghrelin was 0.91 [95% confidence interval (CI), 0.85–0.99; P = 0.02] in the meta-analysis, without evidence of heterogeneity between studies (I² = 0%). No significant association was observed between the genetically predicted plasma ghrelin levels and any specific gastrointestinal cancer (Fig. 1).
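The reported odds ratio, confidence interval, and P value can be cross-checked on the log-odds scale. The short Python sketch below recovers the standard error from the interval width and derives a two-sided P value under a normal approximation; because the published figures are rounded, the result (about 0.015) only approximates the reported P = 0.02. The helper names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.959964):
    """Standard error of a log odds ratio recovered from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)


def two_sided_p(odds_ratio, se):
    """Two-sided P value for a log odds ratio under a normal approximation."""
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Meta-analysis estimate for the four-variant instrument, as reported.
se = se_from_ci(0.85, 0.99)
p_value = two_sided_p(0.91, se)
```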
The single genetic variant instrument was associated with any gastrointestinal cancer in the UK Biobank but not in FinnGen or in the meta-analysis of both studies (Supplementary Fig. S1). The OR per approximate doubling of genetically predicted plasma ghrelin was 0.89 (95% CI, 0.78–0.98; P = 0.02) in the UK Biobank and 0.99 (95% CI, 0.93–1.05; P = 0.62) in the meta-analysis.

Colocalization analysis
Colocalization analysis provided limited evidence of shared causal variants of plasma ghrelin and gastrointestinal cancer in the UK Biobank at the GHRL locus (posterior probability H4 = 24.5%; Supplementary Table S4 and Supplementary Fig. S2). There was little evidence to suggest the presence of distinct causal variants (posterior probability H3 = 2.3%).

Table 1 footnotes:
* Beta coefficients represent the change in plasma ghrelin levels per additional effect allele. The Olink NPX Manager software was used for data analysis, and a one-unit higher NPX represents an approximate doubling of the measured plasma ghrelin levels.
a The strongest (smallest P value) cis-SNP associated with plasma ghrelin was used as a secondary genetic instrument in two-sample MR analyses.
b SNPs in low linkage disequilibrium were used as the primary genetic instrument in two-sample MR analyses. These SNPs were selected using clumping in the TwoSampleMR package in R with the European population as the reference population.

Discussion
This GWAS identified 28 genetic variants that are strongly associated with plasma ghrelin levels. The genetic variants were within ±200 kb of the GHRL gene, which encodes preproghrelin, which is posttranslationally processed into different peptides, including ghrelin (4, 5, 32). Our two-sample MR analysis revealed an inverse association between plasma ghrelin proxied by the primary (four-SNP) genetic instrument and the risk of gastrointestinal cancer. The association was less robust when a single genetic variant instrument was used, with an inverse association found only in the UK Biobank. There is limited evidence for shared causal variants of plasma ghrelin and gastrointestinal cancer risk at the GHRL locus.
Our MR analyses focusing on gastrointestinal cancer were motivated by several previous observational studies that reported inverse associations between the total circulating levels of ghrelin and the risk of gastrointestinal cancers, including esophageal squamous cell carcinoma (9, 10), esophageal adenocarcinoma (11), stomach cancer (10, 12, 13), and colorectal cancer (in the years approaching diagnosis only; ref. 14). Our main MR analysis confirmed a significant inverse association between plasma ghrelin and the composite outcome of gastrointestinal cancer in the UK Biobank study and in a meta-analysis of both studies. The analyses of genetically predicted plasma ghrelin levels in relation to specific gastrointestinal cancers were underpowered, but all associations were in the inverse direction in the UK Biobank and meta-analysis. Colocalization analysis was underpowered, and neither supported nor disproved the existence of shared genetic variants at the GHRL locus. The reason for the lack of an association between genetically predicted plasma ghrelin levels and the risk of gastrointestinal cancer in FinnGen is unclear. However, it should be noted that the minor allele frequency (MAF) of the top hit ghrelin-associated SNP (rs34911341) was four times higher in FinnGen (MAF = 0.024) than in the Swedish (MAF = 0.005) and British (MAF = 0.006) populations included in this study as well as in the Icelandic population (MAF = 0.008; ref. 33). This difference may explain the discrepancy in the results.
Although most in vitro studies have shown that ghrelin promotes tumor development, there are also data showing inhibition of cancer growth and increased apoptosis (3, 5–8). Most studies have been conducted on acyl ghrelin, which is a ghrelin isoform that binds to the growth hormone secretagogue receptor and stimulates growth hormone release (1, 8). The ghrelin gene can produce bioactive peptides other than ghrelin, primarily des-acyl ghrelin and obestatin, which are generated via alternative splicing or posttranslational modifications (32). Although no receptors have been identified for des-acyl ghrelin and obestatin, these peptides have been proven to be active, may either support or antagonize the effect of acyl ghrelin, and may have independent activities (32, 34).
The SNP with the strongest association with plasma ghrelin in the present GWAS was also the strongest cis-SNP associated with plasma ghrelin (UniProt Q9UBU3) measured with Olink in the UK Biobank (35) and with SomaScan in an Icelandic cohort (33). However, the direction of association of this SNP with plasma ghrelin differed between the methods. The T allele of rs34911341 was positively associated with plasma ghrelin in the UK Biobank (ref. 35; beta coefficients of 1.39 and 1.30 in the discovery and replication samples, respectively; very similar to the estimate in the current GWAS) but negatively associated with plasma ghrelin in the Icelandic study (33). This difference may be related to the fact that the Olink method utilizes two antibodies for the same protein that must simultaneously bind to the protein to provide a signal. In the case of ghrelin, this binding is complicated because the preprotein consists of both ghrelin and obestatin, which act in opposite directions. In general, Olink's method is considered to have a more reliable protein target specificity and a higher number of phenotypic associations than SomaScan (36).
A strength of this study is the MR design, which diminishes bias due to confounding and reverse causation. Furthermore, the use of a relatively strong primary genetic instrument for the exposure and a large number of cases for the composite outcome of gastrointestinal cancer provided high statistical power in the main MR analysis. Nonetheless, the power was low in the analyses of specific gastrointestinal cancers, except for colorectal cancer, and in the colocalization analysis. Thus, further MR analyses of the association between ghrelin levels and specific gastrointestinal cancers based on large-scale genetic consortia data are warranted. A limitation of this MR study and of previous observational studies on ghrelin and cancer is the inability to separate the effects of acyl ghrelin, des-acyl ghrelin, and obestatin. Another shortcoming is that we were unable to examine the association of plasma ghrelin with the histopathologic subtypes of esophageal cancer and molecular subtypes of colorectal cancer. Finally, the study populations comprised individuals of European ancestry, which limits the transferability of our results to non-European populations.
In conclusion, this GWAS identified associations between multiple genetic variants at the GHRL locus and plasma ghrelin levels. Our MR analysis provided suggestive evidence in support of a possible causal association between higher plasma ghrelin levels and a reduced risk of gastrointestinal cancer. Further research is warranted to establish the causal role of ghrelin in gastrointestinal cancer prevention.

Figure 1. Associations of plasma ghrelin with gastrointestinal cancer risk in two-sample MR analysis using the primary instrument with four genetic variants. CRC, colorectal cancer. ORs are scaled per approximate doubling of genetically predicted plasma ghrelin. There was evidence of modest heterogeneity in the meta-analysis of colorectal cancer (I² = 19%), but no heterogeneity between estimates in meta-analyses of the other cancers (I² = 0%).

Table 1. SNPs associated with plasma ghrelin in GWAS analysis of the discovery and replication cohorts and in both cohorts combined.