Abstract
Background: Differential DNA methylation as measured in blood is a promising marker of bladder cancer susceptibility. However, previous studies have exclusively used postdiagnostic blood samples, meaning that observed associations may be markers of disease rather than susceptibility.
Methods: Genome-wide methylation was measured in prediagnostic blood samples, using the Illumina Infinium HumanMethylation450 Bead Array, among 440 bladder cancer cases with the transitional cell carcinoma (TCC) subtype and 440 matched cancer-free controls from the Women's Health Initiative cohort. After normalization and probe filtering, we used conditional logistic regression models to test for associations between methylation measurements at 361,184 CpG sites and bladder cancer risk.
Results: Increased methylation at cg22748573, located in a CpG island within the 5′-UTR/first exon of the CITED4 gene, was associated with an 82% decreased risk of bladder cancer after adjusting for race/ethnicity, smoking status, pack-years of smoking, and leukocyte cell profile and accounting for multiple testing (OR = 0.18, q-value = 0.05). The result was robust to sensitivity analyses accounting for time between enrollment and diagnosis, race, tumor subtype, and secondhand smoke exposure.
Conclusions: Although results need to be confirmed in additional prospective studies, differential methylation in CITED4, as measured in blood, is a promising marker of bladder cancer susceptibility.
Impact: Identification of biomarkers of bladder cancer susceptibility in easily accessible tissues may allow targeting of screening efforts so as to improve bladder cancer prognosis. This is particularly important among women, who tend to have poorer bladder cancer outcomes than men. Cancer Epidemiol Biomarkers Prev; 27(6); 689–95. ©2018 AACR.
This article is featured in Highlights of This Issue, p. 617
Introduction
According to 2016 estimates, urinary bladder cancer is the 12th most frequently diagnosed cancer among women in the United States and is responsible for approximately 4,570 deaths among women each year (1). The incidence of bladder cancer increases with age (2), and it is most frequently diagnosed between the ages of 55 and 84 (SEER 18 data; 2009–2013). The most common histology of bladder cancer is transitional cell carcinoma (TCC), which originates in the urothelial cells of the inner lining of the bladder (3) and accounts for 94% of bladder cancer cases (4). On the basis of prognostic characteristics, TCC can be further divided into non–muscle-invasive (NMIBC) and muscle-invasive (MIBC) subtypes, which represent 75% and 25% of all newly diagnosed disease, respectively (5). Although patients with NMIBC are at high risk of recurrence, patients with MIBC have relatively poor survival rates.
Perhaps because early bladder cancer symptoms resemble those of urinary tract infections, which can delay clinical evaluation, women are 21% more likely than men to be diagnosed with advanced bladder cancer (6, 7). Once diagnosed, the risk of death from bladder cancer is approximately 1.5 times greater for women than for men (8). As such, there is great need for screening indicators that would aid in the early diagnosis of bladder cancer, particularly among women.
Differential DNA methylation is a particularly promising biomarker of cancer susceptibility because of its well-established links to human cancer (9, 10). Altered methylation is known both to silence tumor suppressor genes and to activate oncogenes (10), and changes in DNA methylation are common and early events in bladder carcinogenesis (11). For example, Wolff and colleagues found 526 hypermethylated loci in invasive urothelial tumors as compared with urothelial tissue from cancer-free participants, and noninvasive and invasive urothelial tumors shared 117 differentially hypermethylated loci, indicating these changes are present during the early development of bladder cancer (12).
Only one previous study has evaluated the association of genome-wide methylation, as measured in blood, with bladder cancer risk. Although it identified 9 strongly associated loci, the study considered only a limited number of total loci (∼27 k), included only 112 cases and 118 controls, and used postdiagnostic blood samples collected an average of 1 year after diagnosis (13). As such, DNA methylation levels may have been influenced by the presence of cancer or its treatment (4).
To our knowledge, ours is the first study of genome-wide DNA methylation, as measured in prediagnostic blood, and bladder cancer risk. In addition to a much larger sample size, we utilized the Illumina Infinium HumanMethylation450 Bead Array to capture a greater number of CpG sites across the genome. Our study focuses entirely on women, who are an understudied population, are more likely to experience adverse bladder cancer outcomes, and would particularly benefit from biomarkers of bladder cancer susceptibility.
Materials and Methods
Study setting
This study was approved by the Institutional Review Board of the Fred Hutchinson Cancer Research Center (Seattle, WA). The study was nested in the Women's Health Initiative (WHI), a prospective cohort of 161,808 postmenopausal women from across the United States, recruited from 1993 to 1998, who were between the ages of 50 and 79 at enrollment (14). There are two arms of the WHI, the clinical trials study (CT) and the observational study (OS), with 68,132 and 93,676 participants, respectively. The CT involved concurrent randomized controlled trials of hormone therapy, dietary modification, and calcium/vitamin D and ended in 2005. Those not able or willing to participate in the clinical trials were asked to participate in the OS. After 2005, WHI participants were invited to enroll in the WHI Extension Studies, which tracked health outcomes for another 10 years.
Study subjects
Bladder cancer cases were identified through annual medical questionnaires and confirmed by blinded physician adjudicators using standard criteria based on pathology, cytology, and operative reports and on hospital discharge information. As of September 2012, 618 WHI participants were diagnosed with bladder cancer. We restricted the analysis to the 584 participants diagnosed with the TCC subtype of bladder cancer. We excluded cases lacking eligible controls, cases diagnosed with any cancer before baseline, and cases that did not have a sufficient amount of baseline DNA. We also removed the only American Indian case and her matched control to allow for accurate estimation of the race/ethnicity covariate. Selected cases were followed for a median of 7.22 years until bladder cancer diagnosis. Cancer-free controls were matched to cases on year of enrollment, age at enrollment (±2 years), follow-up time greater than or equal to their matched case, trial component, and DNA extraction method (5-Prime, phenol, Bioserve, or PurGene).
Once the methylation data were generated, we checked genetic distance among our study samples and discovered a genetically identical case–control pair. This issue was confined to the single case–control pair, as quality control duplicates located on the same assay plate demonstrated high levels of genetic concordance with their replicates on other plates. The problematic case–control pair was excluded from all analyses. Our final sample size included 440 case–control pairs, where 228 cases were OS participants and 212 were CT participants. Of the 440 cases, 105 were classified as having MIBC.
Data and biospecimen collection
During initial screening for the WHI, basic demographic information was collected, and eligible women were invited for a clinic visit. Staff collected physical measurements and blood specimens at baseline. Blood samples were taken after at least 12 hours of fasting, divided into aliquots, centrifuged, and stored at −70°C as buffy coats. Self-administered questionnaires were also collected at baseline to gather information related to risk factors for various health outcomes of interest, including detailed data on smoking. Participants were asked to report whether they had ever smoked at least 100 cigarettes in their lifetime, age at smoking initiation, cigarettes per day, and years of smoking, as well as age at quitting smoking (if applicable), from which smoking status (current, former, never) and pack-years were computed. For OS participants, information on secondhand smoke exposure was captured at baseline by asking the participant whether she lived with an inside-smoker as a child, lived with an inside-smoker as an adult, and whether she worked in a place where people smoked. Secondhand smoke exposure data were not collected for CT participants.
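As an illustration of how smoking covariates can be derived from questionnaire items of this kind, the sketch below uses the conventional definition of a pack-year (20 cigarettes per day for one year). The function names, argument names, and the status rule are hypothetical; the exact WHI derivation is not described here and may differ.

```python
def pack_years(cigarettes_per_day, years_smoked):
    """Conventional pack-years: packs (20 cigarettes) per day times years smoked."""
    return cigarettes_per_day / 20.0 * years_smoked


def smoking_status(ever_100_cigarettes, has_quit):
    """Assumed rule: 'never' means fewer than 100 lifetime cigarettes."""
    if not ever_100_cigarettes:
        return "never"
    return "former" if has_quit else "current"


# Hypothetical participant: 10 cigarettes per day for 30 years, then quit
print(pack_years(10, 30))          # 15.0
print(smoking_status(True, True))  # former
```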
DNA methylation array
We used the Illumina Infinium HumanMethylation450 Bead Array to interrogate methylation status at 485,577 CpG sites. The array covers 99% of RefSeq genes, with an average of 17 CpG sites per gene distributed across various regions, including the first exon, the 3′ and 5′ untranslated regions, the gene body, and regions close to transcription start sites. Within or near a gene, CpGs can also occur in CpG islands, which are regions with a high density of CpG sites. A locus within 2 kb of a CpG island is in the shore of that island, and a locus within 2 kb of a CpG shore is in the shelf of the island (15).
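The island, shore, and shelf definitions can be expressed as a simple distance rule. The sketch below is illustrative only: the function name is hypothetical, it considers a single island rather than the nearest of many, and the treatment of loci lying exactly on a boundary is an assumption. It does not reproduce the Illumina annotation pipeline.

```python
def relation_to_island(position, island_start, island_end):
    """Classify a CpG position (bp) relative to one CpG island on the same chromosome."""
    if island_start <= position <= island_end:
        return "Island"
    # Distance in bp to the nearest island boundary
    if position < island_start:
        distance = island_start - position
    else:
        distance = position - island_end
    if distance <= 2000:
        return "Shore"   # within 2 kb of the island
    if distance <= 4000:
        return "Shelf"   # within 2 kb of the shore
    return "OpenSea"


# cg22748573 (chr1:41327924) against the chr1:41326949-41328285 island
print(relation_to_island(41327924, 41326949, 41328285))  # Island
```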
DNA was previously extracted from buffy coats using 5-Prime, phenol, Bioserve, or PurGene methods and stored at −70°C. DNA was quantified by Picogreen (Invitrogen). The EZ DNA Methylation Kit (Zymo Research) was used to treat 500 ng of DNA with sodium bisulfite, which converts unmethylated cytosine to uracil. Converted DNA was stored for no more than one month at −80°C prior to being assayed. Following standard Infinium HD Methylation protocols (Illumina), 4 μL of converted DNA was denatured and neutralized to prepare it for amplification. The denatured DNA was isothermally amplified at 37°C, followed by endpoint enzymatic fragmentation and isopropanol precipitation. The DNA was then resuspended in Illumina hybridization buffer. Twelve samples were applied to each Illumina beadchip and were kept separated with an IntelliHyb seal. The prepared beadchip was then incubated in a hybridization oven at 48°C for 16 to 24 hours with rocking. Unhybridized and nonspecifically hybridized DNA was washed away. The beadchip underwent staining and extension in capillary flow-through chambers on a Freedom EVO liquid handling robot (Tecan Trading AG), after which beadchips were scanned using the Illumina iScan+. Twenty-one blind duplicate sample pairs were randomly placed among study samples for quality control assessment.
We used the M-value to measure methylation at each CpG site to reduce heteroscedasticity, as compared with the β-value (16). The M-value is the base 2 logit of the β-value, where the β-value is the ratio of the methylated signal over the total signal and is interpreted as the percent of methylation at a specific site (17).
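The relationship between the two scales can be written as M = log2(β / (1 − β)). The following minimal sketch, with hypothetical function names, converts between them. It assumes β lies strictly between 0 and 1 and omits the small offset that some software adds to the signal intensities to stabilize the ratio.

```python
import math


def beta_to_m(beta):
    """Base 2 logit of the beta-value; requires 0 < beta < 1."""
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m):
    """Inverse transform: recover the beta-value from an M-value."""
    ratio = 2.0 ** m  # methylated-to-unmethylated signal ratio
    return ratio / (1.0 + ratio)


print(beta_to_m(0.5))                 # 0.0 (half-methylated site)
print(m_to_beta(beta_to_m(0.952)))    # round trip back to roughly 0.952
```

Because the M-value is unbounded, sites near 0% or 100% methylation are spread out on this scale rather than compressed toward the limits of the β-value range.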
Methylation data processing
After reading in the raw image files and checking for failed samples, we performed normalization in two steps. First, we corrected the probe intensity levels for background fluorescence, to avoid potential dye-bias, using the normal-exponential convolution with out-of-band probes (noob) method (18). Next, we performed functional normalization to adjust the marginal distribution of methylation levels using summarized information from the control and out-of-band probes (19). This method effectively isolates unwanted variation, including batch effects, and performs well in studies of cancer (19). After normalization, we sequentially excluded probes that had detection P values greater than or equal to 0.01 in at least 10% of samples (707 probes), probes with a beadcount less than 3 in at least 10% of samples (215 probes), SNP-related probes (87,706 probes), cross-reactive probes (24,939 probes), probes located on the sex chromosomes (9,578 probes), and non-CpG sites (1,183 probes). Because SNP-related probes may impact probe binding (20), we removed probes with any SNP present at the CpG interrogation or single-nucleotide extension site as identified by the Illumina annotation, and probes within 10 base pairs of a common SNP (minor allele frequency > 1%) based on 1000 Genomes data from “Illumina450ProbeVariants.db” (21). Previously identified cross-reactive probes were also excluded (20, 22). We performed these steps using the “minfi” and “wateRmelon” packages within R (version 3.4.0; refs. 23, 24).
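As a bookkeeping check on the sequential exclusions, the short sketch below subtracts the reported probe counts from the 485,577 probes on the array. The dictionary labels are shorthand introduced for illustration. The listed exclusions leave 65 more probes than the 361,184 sites analyzed; this matches the number of SNP probes on the array, but attributing the gap to those probes is an assumption, not something stated in the text.

```python
total_probes = 485_577  # probes interrogated by the array

# Probes removed at each sequential filtering step (counts from the text)
excluded = {
    "detection P >= 0.01 in >= 10% of samples": 707,
    "bead count < 3 in >= 10% of samples": 215,
    "SNP-related": 87_706,
    "cross-reactive": 24_939,
    "sex chromosomes": 9_578,
    "non-CpG": 1_183,
}

remaining = total_probes - sum(excluded.values())
analysed = 361_184  # CpG sites tested for association

print(remaining)             # 361249
print(remaining - analysed)  # 65
```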
Imputation of missing data
We used multivariate imputation by chained equations (MICE) to impute values for missing covariate data (25). Using default imputation model specifications, we generated 5 datasets (10 iterations/dataset) with complete covariate information. As the first step in generating each dataset, we multiply imputed missing values for the pack-years and race/ethnicity covariates. Our predictor set for the initial phase of imputation included these variables as well as: age at baseline, WHI arm (CT, OS), ever smoking status, cell type composition, case/control status, and methylation levels at 123 CpG sites. These sites were identified by including the top 100 sites associated with current smoking and the top 100 sites correlated with pack-years among current smokers based on a large study by Ambatipudi and colleagues (26). Although there were 155 unique sites from these lists, we only used the 123 loci that remained in our dataset after applying our processing pipeline. As the second step, within each dataset with imputed values for pack-years and race/ethnicity, we added smoking status at baseline to the predictor set and then singly imputed missing values for current or former smoking status among smokers (pack-years > 0) and assumed never smoking status among nonsmokers (pack-years = 0).
Statistical analysis
We used conditional logistic regression models to assess the association between bladder cancer case–control status and each continuous M-value after adjusting for potential confounders, including race/ethnicity (Asian/Pacific Islander, black/African American, Hispanic/Latino, non-Hispanic white, other), smoking status (never, former, current), pack-years of smoking (continuous), as well as the estimated proportion of CD4+ T cells (continuous), CD8+ T cells (continuous), natural killer cells (continuous), granulocytes (continuous), B cells (continuous), and monocytes (continuous). The relative contributions of the six major white blood cell types were reconstructed using the Houseman and colleagues' method (27). To account for multiple testing, we calculated q-values by adjusting P values using the false discovery rate (FDR). Associations with bladder cancer at a q-value less than or equal to 0.05 were deemed statistically significant.
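The text does not name the specific FDR procedure. Assuming the widely used Benjamini-Hochberg step-up adjustment, q-values can be computed as in the following sketch (the function name is hypothetical). Under that assumption, the q-value of the smallest P value is at most that P value multiplied by the number of tests.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, so each q-value is the minimum of
    # p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues


# Toy example with four tests
print(bh_qvalues([0.01, 0.04, 0.03, 0.005]))

# Upper bound for the top-ranked site: P = 1.5e-7 across 361,184 tests
print(round(1.5e-7 * 361_184, 3))  # 0.054
```

With the rounded P value, the bound of about 0.054 is in line with the q-value of 0.05 reported for cg22748573.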
Because we performed multiple imputation, the conditional logistic regression analysis for each site was repeated in all of the imputed datasets. To determine the combined parameter estimates and their variance, the results from each of the five complete data analyses were pooled using the approach of Rubin (28) as implemented in the “mice” R package (25).
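Rubin's rules combine the per-dataset estimates by averaging them and adding the between-imputation variance, inflated by a factor of (1 + 1/m), to the average within-imputation variance. The sketch below illustrates the calculation for a single coefficient. The function name and the input values are hypothetical, and the degrees-of-freedom calculation used for inference is omitted.

```python
import math


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled = sum(estimates) / m                 # pooled point estimate
    within = sum(variances) / m                 # average within-imputation variance
    between = sum((e - pooled) ** 2 for e in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between  # total variance
    return pooled, total


# Hypothetical log odds ratios and squared standard errors from five datasets
log_ors = [-1.70, -1.73, -1.69, -1.72, -1.71]
variances = [0.115, 0.116, 0.114, 0.115, 0.115]

pooled, total_var = pool_rubin(log_ors, variances)
print(math.exp(pooled))      # pooled odds ratio
print(math.sqrt(total_var))  # pooled standard error of the log odds ratio
```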
Sensitivity analyses
To address the potential influence of early, undiagnosed cancer at the time of blood draw on DNA methylation, we performed a sensitivity analysis restricted to cases diagnosed at least 3 years after enrollment and their matched controls. To address any possible residual confounding by race/ethnicity, we also completed an analysis restricting to case–control pairs that had matching races/ethnicities. We also performed an analysis restricted to the MIBC bladder cancer subtype to explore the impact of clinically heterogeneous bladder cancer subtypes on our primary results. To address the potential impact of secondhand smoke exposure, we restricted to OS participants and additionally adjusted for secondhand smoke exposure at home as a child (yes, no), secondhand smoke exposure at home as an adult (yes, no), and secondhand smoke exposure at work (yes, no). Given the importance of smoking as a potential confounder, we also completed our analyses restricting to never smokers. In addition, we completed unmatched, genome-wide analyses with the M-value of each CpG site as the outcome to check for associations with smoking status (current, never), pack-years of smoking, and secondhand smoke exposure. The absence of associations among our top hits with these smoking variables would help address concerns that our primary findings were driven by residual confounding from incomplete adjustment for tobacco smoke exposure.
Results
Genome-wide analysis of DNA methylation
On the basis of the 65 SNP probes included on the Illumina array, the genetic distance between the 21 blind duplicate pairs in our study ranged from 0.005 to 0.026, verifying that these samples were genetically identical. The Pearson correlation coefficients between the methylation measurements (β-values) across all 361,184 loci were at least 0.995 for each duplicate pair, indicating suitable precision of the assay.
The distribution of selected variables is presented by case–control status in Table 1. Compared with controls, cases were more likely to be white, to be past or current smokers, and to have more pack-years of smoking. Although OS cases were as likely as controls to have been exposed to secondhand smoke at home as children and at work, they were slightly more likely than controls to have been exposed to secondhand smoke at home as adults. Inferred cell-type composition was similar across cases and controls.
Table 1. Distribution of relevant demographic and clinical characteristics among bladder cancer cases and controls nested within the WHI
Characteristic | Cases | Controls |
---|---|---|
Age (mean, SD) | 65.12 (7.04) | 65.12 (7.03) |
Race (n, %) | ||
Asian/Pacific Islander | 4 (1%) | 12 (3%) |
Black/African American | 25 (6%) | 45 (10%) |
Hispanic/Latino | 8 (2%) | 16 (3%) |
Non-Hispanic white | 400 (91%) | 360 (82%) |
Other | 2 (<1%) | 7 (2%) |
Missing | 1 (<1%) | 0 (0%) |
Smoking (n, %) | ||
Never smoked | 156 (35%) | 236 (54%) |
Past smoker | 218 (50%) | 178 (40%) |
Current smoker | 56 (13%) | 19 (4%) |
Missing | 10 (2%) | 7 (2%) |
Pack-years (n, %) | ||
Never smoker | 156 (36%) | 236 (54%) |
<5 | 36 (8%) | 68 (15%) |
5–<20 | 63 (14%) | 46 (10%) |
≥20 | 162 (37%) | 78 (18%) |
Missing | 23 (5%) | 12 (3%) |
Child secondhand smoke exposure at home (n, %)a | ||
No | 80 (18%) | 78 (18%) |
Yes | 141 (32%) | 145 (33%) |
Do not know | 4 (1%) | 4 (1%) |
Missing | 215 (49%) | 213 (48%) |
Adult secondhand smoke exposure at home (n, %)a | ||
No | 48 (11%) | 70 (16%) |
Yes | 178 (40%) | 155 (35%) |
Missing | 214 (49%) | 215 (49%) |
Secondhand smoke exposure at work (n, %)a | ||
No | 46 (10%) | 54 (12%) |
Yes | 180 (41%) | 173 (39%) |
Missing | 214 (49%) | 213 (49%) |
Cell-type composition (mean, SD)b | ||
B cells | 0.06 (0.03) | 0.06 (0.03) |
CD8 T cells | 0.09 (0.05) | 0.09 (0.04) |
CD4 T cells | 0.19 (0.08) | 0.17 (0.07) |
Natural killer cells | 0.09 (0.05) | 0.09 (0.04) |
Granulocytes | 0.50 (0.13) | 0.51 (0.13) |
Monocytes | 0.11 (0.03) | 0.11 (0.03) |
Tumor behavior (n, %) | ||
Non–muscle-invasive bladder cancer | 335 (76%) | N/A |
Muscle-invasive bladder cancer | 105 (24%) | N/A |
aInformation about exposure to secondhand smoke was only available for participants in the Observational Study arm, so the 212 case–control pairs from the clinical trials arm were missing these data.
bCell-type composition was calculated using the Houseman method (27).
As shown in Table 1, 35 participants were missing pack-years of smoking data and 17 of those missing pack-year data were also missing smoking status data. A single participant was missing race/ethnicity data. The imputed datasets are summarized in Supplementary Table S1.
Results for the 10 CpG sites with the lowest q-values in association with bladder cancer are summarized in Table 2, whereas results for all 361,184 CpG loci are provided in Supplementary Table S2. These results are also summarized as a volcano plot in Fig. 1, with the top association highlighted. After adjusting for multiple comparisons, increased methylation at cg22748573 was statistically significantly associated with a reduced risk of bladder cancer (OR = 0.18 per unit increase in M-value, q-value = 0.05). cg22748573 is located within the chr1:41326949-41328285 CpG island in the 5′-UTR of the CBP/p300-interacting transactivator with Glu/Asp-rich carboxy-terminal domain 4 (CITED4) gene, and it had an average methylation level of 6% in controls. Although not statistically significant, increased methylation at cg20010635 was strongly associated with risk of bladder cancer (OR = 8.42 per unit increase in M-value, q-value = 0.40). The cg20010635 CpG site is located in the body of the calmodulin-binding transcription activator 1 (CAMTA1) gene and had an average methylation of 95% among controls.
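To make the scale of the top association concrete, the sketch below converts the OR into a percent change in odds and shows what a one-unit increase in M-value means on the β-value scale. A one-unit increase doubles the ratio of methylated to unmethylated signal. Applying this shift to the control mean is a simplification for illustration, and the variable names are hypothetical.

```python
or_per_unit = 0.18  # OR per one-unit increase in M-value (Table 2)
mean_beta = 0.058   # average beta-value at cg22748573 in controls (Table 2)

# Percent reduction in the odds of bladder cancer per unit increase in M-value
percent_reduction = round((1.0 - or_per_unit) * 100.0)

# One M-value unit doubles the methylated-to-unmethylated ratio
ratio = mean_beta / (1.0 - mean_beta)
beta_after_one_unit = 2.0 * ratio / (1.0 + 2.0 * ratio)

print(percent_reduction)              # 82
print(round(beta_after_one_unit, 2))  # 0.11
```

Under these assumptions, the reported OR corresponds to a shift in methylation from roughly 6% to roughly 11% at this site.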
Table 2. Top 10 CpG sites in tests of association between methylation level and bladder cancer risk in prediagnostic blood from the Women's Health Initiative
Illumina ID | Chra | Position | Gene | Gene groupb | Relation to CpG islandc | Average β-value in controls | OR (95% CI)d | P | Q-valuee |
---|---|---|---|---|---|---|---|---|---|
cg22748573 | chr1 | 41327924 | CITED4 | 5′UTR;1stExon | Island | 0.058 | 0.18 (0.09–0.34) | 1.5 × 10⁻⁷ | 0.05 |
cg20010635 | chr1 | 7811409 | CAMTA1 | Body | OpenSea | 0.952 | 8.42 (3.48–20.33) | 2.2 × 10⁻⁶ | 0.40 |
cg25955565 | chr14 | 71863756 | SNORD56B | TSS1500 | OpenSea | 0.945 | 9.29 (3.59–24.08) | 4.5 × 10⁻⁶ | 0.41 |
cg26492847 | chr11 | 96063104 | MAML2 | Body | OpenSea | 0.929 | 7.24 (3.01–17.38) | 9.5 × 10⁻⁶ | 0.41 |
cg06414161 | chr16 | 53534040 | AKTIP | Body | Shelf | 0.931 | 8.93 (3.36–23.72) | 1.1 × 10⁻⁵ | 0.41 |
cg27627854 | chr11 | 108158382 | ATM | Body;1stExon | OpenSea | 0.905 | 4.19 (2.21–7.98) | 1.2 × 10⁻⁵ | 0.41 |
cg02393449 | chr10 | 70743010 | DDX21 | 3′UTR | OpenSea | 0.929 | 6.91 (2.89–16.53) | 1.4 × 10⁻⁵ | 0.41 |
cg26695157 | chr12 | 109985283 | | | OpenSea | 0.917 | 3.71 (2.05–6.72) | 1.5 × 10⁻⁵ | 0.41 |
cg17803089 | chr6 | 35419793 | FANCE | TSS1500 | Shore | 0.036 | 0.18 (0.08–0.39) | 1.8 × 10⁻⁵ | 0.41 |
cg26729242 | chr10 | 96980135 | C10orf129 | Body | OpenSea | 0.906 | 3.00 (1.81–4.96) | 1.9 × 10⁻⁵ | 0.41 |
aChr, chromosome.
bFunctional region of gene as indicated in v1.2 Illumina annotation: TSS1500 = 200–1,500 bases upstream of the transcription start site; 5′UTR = Within the 5 prime untranslated region; 1stExon = First segment of gene coding for peptide sequence; Body = Between the ATG and stop codon; 3′UTR = Between the stop codon and poly A signal; multiple listings indicate a locus in a region with multiple splice variants.
cPosition relative to CpG island as indicated in v1.2 Illumina annotation: Island = Within CpG island (CG content > 50%, Obs/Exp CpG ratio > 0.60, and length > 200 bp); OpenSea = Non-island region; Shore = 0–2 kb flanking CpG Island; Shelf = 2–4 kb flanking CpG Island.
d95% confidence interval (CI) for OR estimate.
eQ-values represent P values adjusted for multiple testing using the FDR method.
Figure 1 is a volcano plot of results from the analysis of 361,184 CpG sites in association with risk of bladder cancer in the Women's Health Initiative. The figure plots statistical significance versus effect size for the association of CpG sites with bladder cancer. The locus with the most statistically significant association (cg22748573) is highlighted.
Sensitivity analyses
Most of the sensitivity analyses yielded ORs for cg22748573 that were similar to or slightly stronger than the result from the main analysis, suggesting little bias in the OR estimate. However, as expected, statistical significance was reduced due to the decreased sample size in each of these analyses. When examining the potential impact of undiagnosed disease, we excluded 74 case–control pairs where the case was diagnosed with bladder cancer within 3 years of study entry, and cg22748573 remained the most significant finding (OR = 0.15, q-value = 0.06). In the race/ethnicity sensitivity analysis, we included only the 337 case–control pairs with matching race/ethnicity, and cg22748573 was no longer statistically significantly associated with bladder cancer; however, cg22748573 still had the third lowest P value (OR = 0.19, q-value = 0.47). With very small numbers in various non-white race/ethnic groups, the MIBC analysis had to be restricted to 96 case–control pairs with white or black race/ethnicity for our statistical models to converge. Although the analysis produced null results, the cg22748573 locus in CITED4 was the second most significant locus and the magnitude of the association was much stronger than in the overall TCC analysis (OR = 0.03, q-value = 0.71).
After restricting to OS participants and additionally imputing the secondhand smoke exposure variables for 8 subjects, the sensitivity analysis for secondhand smoke exposure was also null, but had cg22748573 as the 11th most significant locus (OR = 0.13, q-value = 0.87). With only 85 case–control pairs in the analysis restricted to never smokers, cg22748573 was only the 530th most significant locus (OR = 0.08, q-value = 0.93), but the magnitude of the association with bladder cancer was stronger than the result from our primary analysis. In our analyses of associations between genome-wide DNA methylation and the various tobacco smoke exposure variables, no statistically significant associations were observed with any of the differentially methylated CpG sites highlighted in Table 2 (unpublished observations).
Discussion
We identified a differentially methylated locus in prediagnostic blood that was statistically significantly associated with an 82% reduction in bladder cancer risk. The cg22748573 locus resides in a CpG island within the putative promoter region of CITED4. Although methylation is often tissue specific, methylation levels are more likely to be conserved across various tissues in CpG islands. Price and colleagues observed that probes differentially methylated between tissues were depleted in Illumina-annotated islands (29), and Eckhardt and colleagues reported that only 13% of tissue-specific differentially methylated regions in the 5′-UTR were located within CpG islands (30). Although speculative, this suggests that the methylation status of cg22748573 in blood may reflect that of normal bladder tissue.
There is evidence that CITED4 hypermethylation is associated with better cancer outcomes (31), which is unexpected, as CITED genes have been identified as tumor suppressors (32–34), and methylation of CpG loci in promoter regions is typically associated with transcriptional silencing (35–37). However, based on data from subjects in The Cancer Genome Atlas with both Illumina Infinium HumanMethylation450 Bead Array and Illumina HiSeq RNA-Seq data (38), we found that methylation at cg22748573 was not significantly associated with CITED4 expression in normal bladder tissue (Spearman correlation = −0.22, P = 0.31). Therefore, the importance of the methylation change at cg22748573 may not be its impact on CITED4 expression. Instead, this methylation change may be a marker of decreased susceptibility to the development of bladder cancer through other mechanisms. As an example, hypermethylation of cg22748573 has been closely associated with deletion of 1p and 19q in oligodendrogliomas (31), where tumors with 1p/19q tumor deletions often lack TP53 mutations (39) and have better patient prognoses (31, 40).
Tews and colleagues observed CITED4 hypermethylation in glioma patients to be associated with longer recurrence-free and overall survival, but hypermethylation was not observed in matched leukocyte DNA (31). However, no comparisons between normal brain tissue and blood were made. Given the genomic instability of tumors, one would expect the correlation between methylation levels in tissue and in blood to be lower for tumor than for normal tissue. Even if blood does not reflect the methylation state of CITED4 in bladder tissue, differential methylation of CITED4, a gene that has been shown to play a central role in blood cell differentiation in mice (41), may reflect systemic changes due to factors like inflammation or environmental exposures that are associated with bladder cancer. In this context, differential methylation of CITED4 could still be a valuable marker of susceptibility to bladder cancer.
Although not statistically significant at the genome-wide level, the second-most significant association was one of increased bladder cancer risk with increased methylation at the cg20010635 locus, occurring in the gene body of CAMTA1. CAMTA1 has been shown to act as an oncogene when fused with WWTR1 in epithelioid hemangioendothelioma (42). Because CAMTA1 is usually only expressed in the brain (42), its role in bladder carcinogenesis is uncertain.
Although not among the top 10 associations highlighted in Table 2, we also observed a particularly strong association between increased methylation at cg13413384 and risk of bladder cancer (OR = 20.30 per unit increase in M-value, q-value = 0.44). This locus is found in the body of the retinoid X receptor-α (RXRA) gene, an oncogene found to be significantly altered in some bladder cancers (43). In our study, the observed methylation change occurred in the RXRA gene body, where increased methylation levels have a well-supported association with higher expression levels (44–47).
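To help interpret an odds ratio expressed per unit increase in M-value, the sketch below converts beta-values to M-values and rescales a per-unit odds ratio to a smaller methylation change. It assumes the commonly used definition M = log2(beta / (1 − beta)), without an offset, and assumes that the M-value enters the logistic model linearly, so that the odds ratio for a change of d units is the per-unit odds ratio raised to the power d. The function names and the beta-values are hypothetical; only the odds ratio of 20.30 is taken from the text.

```python
import math


def beta_to_m(beta: float) -> float:
    """Convert a methylation beta-value (0 < beta < 1) to an M-value."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def odds_ratio_for_change(or_per_unit: float, delta_m: float) -> float:
    """Rescale a per-unit odds ratio to a change of delta_m M-value units.

    Assumes a log-linear effect of the M-value on the odds.
    """
    return or_per_unit ** delta_m


OR_PER_UNIT = 20.30  # reported for cg13413384 per unit increase in M-value

# Hypothetical beta-values for two individuals at the same locus.
delta_m = beta_to_m(0.55) - beta_to_m(0.50)
or_for_delta = odds_ratio_for_change(OR_PER_UNIT, delta_m)
```

A full unit of M-value corresponds to a doubling of the ratio of methylated to unmethylated signal, so differences observed between individuals at a single locus may be much smaller than one unit, and the corresponding odds ratio is closer to 1 than the per-unit estimate.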
Marsit and colleagues conducted the first study of locus-specific genome-wide DNA methylation and bladder cancer in blood (13). This study identified the top 9 loci associated with bladder cancer among 112 cases and 118 controls. Although CITED4 was not among their top hits, the study by Marsit and colleagues was comparatively underpowered and utilized postdiagnostic blood samples.
To our knowledge, ours is the first study to investigate the potential link between genome-wide DNA methylation in blood and future risk of bladder cancer. Because our study was nested in the WHI, we benefited from a relatively large sample size with detailed baseline information on potential confounders.
Even with prediagnostic blood samples, it is possible that methylation measurements in blood reflected early tumor development (48). However, our sensitivity analysis restricted to cases diagnosed at least 3 years after enrollment suggests that the association observed with cg22748573 is not driven by the presence of early cancer. Sensitivity analyses also suggested that residual confounding due to inadequate control of exposure to tobacco smoke did not drive our primary result.
Our results suggest that subtypes of TCC may share some methylation susceptibility markers; however, we were underpowered to specifically examine genome-wide associations between DNA methylation and MIBC. We will explore pooling data with future studies of DNA methylation and bladder cancer to increase power to evaluate subtype-specific effects.
Ideally, prediagnostic DNA methylation would be measured in the tissue of interest. Accessing bladder tissue, however, involves invasive procedures. Although urine samples can provide epithelial cells from the bladder, they also include epithelial cells from the rest of the urinary tract. The contribution of cells from the various tissues in the urinary tract can be highly variable (49), and urine samples were only collected from three of the clinical sites in the WHI.
Overall, we observed differential methylation of cg22748573 in prediagnostic blood to be strongly and significantly associated with bladder cancer risk among postmenopausal women. If this association is confirmed in additional prospective studies, molecular studies should be conducted to determine the exact role of this locus in bladder carcinogenesis. Future prospective studies should involve both men and women and should evaluate the contribution of methylation measurements at this locus to risk prediction models that account for smoking history and germline genetic polymorphisms previously associated with bladder cancer risk.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: K.M. Jordahl, K.T. Kelsey, E. White, P. Bhatti
Development of methodology: K.M. Jordahl, T.W. Randolph, E. White, P. Bhatti
Acquisition of data: K.M. Jordahl, X. Song, C.L. Sather, L.F. Tinker, P. Bhatti
Analysis and interpretation of data: K.M. Jordahl, T.W. Randolph, X. Song, C.L. Sather, L.F. Tinker, A.I. Phipps, K.T. Kelsey, E.W. White, P. Bhatti
Writing, review, and/or revision of the manuscript: K.M. Jordahl, T.W. Randolph, X. Song, C.L. Sather, L.F. Tinker, A.I. Phipps, K.T. Kelsey, E.W. White, P. Bhatti
Administrative, technical, or material support: K.M. Jordahl, X. Song, C.L. Sather, L.F. Tinker, P. Bhatti
Study Supervision: K.M. Jordahl, T.W. Randolph, L.F. Tinker, A.I. Phipps, E. White, P. Bhatti
Acknowledgments
Research reported in this publication was supported by the American Cancer Society under award number 125299-RSG-13-100-01-CCE. K.M. Jordahl was partially supported by the NCI of the NIH under award number R25 CA094880.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.