Abstract
Background: Single nucleotide polymorphisms (SNP) in microRNA-related genes have been associated with epithelial ovarian cancer (EOC) risk in two reports, yet associated alleles may be inconsistent across studies.
Methods: We conducted a pooled analysis of previously identified SNPs by combining genotype data from 3,973 invasive EOC cases and 3,276 controls from the Ovarian Cancer Association Consortium. We also conducted imputation to obtain dense coverage of genes and comparable genotype data for all studies. In total, 226 SNPs within 15 kb of 4 miRNA biogenesis genes (DDX20, DROSHA, GEMIN4, and XPO5) and 23 SNPs located within putative miRNA binding sites of 6 genes (CAV1, COL18A1, E2F2, IL1R1, KRAS, and UGT2A3) were genotyped or imputed and analyzed in the entire dataset.
Results: After adjustment for European ancestry, no overall association was observed between any of the analyzed SNPs and EOC risk.
Conclusions: Common variants in these evaluated genes do not seem to be strongly associated with EOC risk.
Impact: This analysis suggests earlier associations between EOC risk and SNPs in these genes may have been chance findings, possibly confounded by population admixture. To more adequately evaluate the relationship between genetic variants and cancer risk, large sample sizes are needed, adjustment for population stratification should be carried out, and use of imputed SNP data should be considered. Cancer Epidemiol Biomarkers Prev; 20(8); 1793–7. ©2011 AACR.
Introduction
MicroRNAs (miRNA) are short, noncoding RNAs that regulate translation (1). Single nucleotide polymorphisms (SNP) in precursor and mature miRNAs, their processing machinery, or in miRNA binding sites of target genes have been implicated in cancer risk (2). Liang and colleagues (3) analyzed 238 SNPs from 8 miRNA processing genes and 138 genes containing potential miRNA binding sites in 339 epithelial ovarian cancer (EOC) cases and 349 controls self-reported to be Caucasian and identified associations between EOC risk and 13 SNPs from 4 processing genes (DDX20, DROSHA/RNASEN, GEMIN4, and XPO5) and 7 binding site genes (ATG4A, CAV1, COL18A1, E2F2, IL1R1, KRAS, and UGT2A3). We (4) genotyped 318 SNPs in 18 miRNA processing genes in 2,172 EOC cases and 3,052 controls of European ancestry, and identified 6 SNPs from 4 genes (DROSHA, FMR1, LIN28, and LIN28B) as significantly associated with EOC risk. Here, we conducted a pooled analysis of variants reported as risk associated by Liang and colleagues (3) in 3,973 cases and 3,276 controls from the International Ovarian Cancer Association Consortium (OCAC; ref. 5). We imputed SNPs to expand coverage of genes and regions, totaling 249 SNPs from 10 of the 11 highlighted genes (3).
Materials and Methods
Participating OCAC studies were from North America (US-CAN), the United Kingdom (UK), and Poland (POL). Study characteristics have been reported (4) and are summarized in Table 1. Briefly, cases had pathologically confirmed primary invasive EOC. Controls had at least 1 ovary intact when interviewed. All studies collected data on disease status, self-reported ethnicity, and histologic subtype. Subjects with less than 80% European ancestry were excluded (4), and the first 2 principal components (PC) representing European ancestry (9) were estimated from all SNPs with call rates above 99% by using the Golden Helix SVS PCA function, which is algorithmically equivalent to EigenSTRAT. The protocol was approved by the institutional review board at each site, and all participants provided written informed consent. Pooled data included 3,973 cases (51% serous) and 3,276 controls.
Study name (abbreviation) | Study population | Genotyping platform | Study type | Number of cases^a | Number of controls^a |
---|---|---|---|---|---|
North America (US-CAN) | | | | | |
Mayo Clinic Ovarian Cancer Study (MAY) | Upper Midwest | Illumina 610K | Clinic based | 359 | 520 |
North Carolina Ovarian Cancer Study (NCO) | North Carolina | Illumina 610K | Population based | 494 | 654 |
Tampa Bay Ovarian Cancer Study (TBO) | Tampa | Illumina 610K | Population based | 227 | 169 |
Familial Ovarian Tumor Study (TOR) | Ontario, Canada | Illumina 610K | Population based | 734 | 524 |
New England Case-Control Study of Ovarian Cancer (NEC) | New England | Illumina 317K, 370K | Population based | 133^b | 142 |
US-CAN subtotal | | | | 1,947 | 2,009 |
United Kingdom (UK) | | | | | |
SEARCH (SEA) | England | Illumina 610K | Population based | 1,118 | – |
United Kingdom Ovarian Cancer Population Study (UKO) | England | Illumina 610K | Population based | 506 | – |
Cancer Research UK Familial Ovarian Cancer Register (FOCR) | England | Illumina 610K | Familial Cancer Register | 44 | – |
Royal Marsden Hospital Study (RMH) | England | Illumina 610K | Hospital based | 146 | – |
UK 58 Birth Cohort (58 BC) | England, Wales, Scotland | Illumina 550K | Cohort | – | 712 |
UK subtotal | | | | 1,814 | 712 |
Poland (POL) | | | | | |
Polish Ovarian Cancer Study (POL) | Warsaw and Lodz, Poland | Illumina 660w | Population based | 212 | 555 |
Overall total | | | | 3,973 | 3,276 |
^a Totals represent the number of non-Hispanic white Europeans passing genotyping quality control criteria and meeting study site-specific inclusion/exclusion criteria.
^b Cases from NEC that were evaluated as part of this investigation represent postmenopausal advanced papillary serous carcinomas; 26 of these cases were ascertained as part of a hospital-based pre-operative study.
SNP genotyping and quality control have been described (4, 6). SNP imputation was carried out within studies (US-CAN, UK, and POL) with MACH version 1.0.16 by using CEU phased data from HapMap release 22 (genome build 36). We imputed data for 186 SNPs that span 15 kb upstream and downstream of each miRNA processing gene or reside in a putative miRNA binding site in the 3′ untranslated region (UTR) of target genes as predicted by SNPInfo (7) and/or PolymiRTS (8); the remaining 63 SNPs were directly genotyped.
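As an illustration of what imputed genotype data look like in practice, the sketch below converts per-subject posterior genotype probabilities into an expected count of minor alleles (a "dosage"). The function name and the ordering of the probability triple are assumptions made for this example; they do not describe MACH's file formats, and which allele a given imputation output counts should be checked against that tool's documentation.

```python
def expected_minor_allele_count(genotype_probs):
    """Expected number of minor alleles from imputed genotype probabilities.

    genotype_probs is assumed (for this sketch) to be ordered as
    (P(major/major), P(major/minor), P(minor/minor)) and to sum to 1.
    """
    p_hom_major, p_het, p_hom_minor = genotype_probs
    total = p_hom_major + p_het + p_hom_minor
    if abs(total - 1.0) > 1e-6:
        raise ValueError("genotype probabilities must sum to 1")
    # A heterozygote carries 1 minor allele, a minor homozygote carries 2.
    return p_het + 2.0 * p_hom_minor


# A confidently imputed heterozygote and an uncertain call (hypothetical values).
confident = expected_minor_allele_count((0.0, 1.0, 0.0))
uncertain = expected_minor_allele_count((0.1, 0.7, 0.2))
```

A directly genotyped SNP is the special case in which one of the three probabilities equals 1, so genotyped and imputed SNPs can be placed on the same 0 to 2 scale.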
Study-specific ORs and 95% CIs were estimated using unconditional logistic regression. Log-additive genetic models were fit for each SNP, modeling the number of copies of the minor allele. For imputed SNPs, we used expected counts of minor alleles obtained from MACH. Study-specific estimates were adjusted for age at diagnosis/interview (US-CAN and POL), component study sites (US-CAN), and the first 2 PCs (US-CAN, UK, and POL). Allele frequencies were similar across studies, suggesting low genetic heterogeneity between populations and supporting the pooling of data. Pooled estimates were adjusted for (i) study (US-CAN, UK, and POL) and (ii) study and the first 2 PCs. We used PLINK for statistical analysis (10).
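The log-additive model described above can be sketched as follows. This is a minimal illustration in plain Python, not the PLINK procedure used in the analysis: it fits an unconditional logistic regression by Newton-Raphson, treats the minor allele count (or imputed dosage) as a continuous predictor, and reports the per-allele OR with a Wald 95% CI. The function names, the covariate layout, and the toy data are all assumptions made for this example.

```python
import math


def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def _score_and_information(X, y, beta):
    """Gradient of the log-likelihood and the Fisher information at beta."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
        for a in range(k):
            grad[a] += (yi - p) * xi[a]
            for c in range(k):
                info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
    return grad, info


def per_allele_or(dosage, status, covariates=None, iters=50):
    """Per-allele OR and Wald 95% CI; status is 1 for cases, 0 for controls."""
    covariates = covariates or [[] for _ in dosage]
    # Design matrix columns: intercept, allele dosage, then adjustment covariates.
    X = [[1.0, d] + list(c) for d, c in zip(dosage, covariates)]
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad, info = _score_and_information(X, status, beta)
        step = _solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    _, info = _score_and_information(X, status, beta)
    unit = [0.0] * len(beta)
    unit[1] = 1.0
    se = math.sqrt(_solve(info, unit)[1])  # variance of the dosage coefficient
    z = 1.959964
    return (math.exp(beta[1]),
            math.exp(beta[1] - z * se),
            math.exp(beta[1] + z * se))


# Toy data (hypothetical): fractional dosages plus one study indicator covariate.
toy_dosage = [0.0, 0.1, 1.0, 1.9, 2.0, 0.0, 1.0, 0.9, 2.0, 0.2, 1.1, 0.0]
toy_status = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
toy_study = [[0], [0], [0], [0], [0], [0], [1], [1], [1], [1], [1], [1]]
toy_or, toy_lo, toy_hi = per_allele_or(toy_dosage, toy_status, toy_study)
```

Adjusting the pooled estimate for study, or for study and the first 2 PCs, corresponds to passing additional columns in `covariates`; the dosage coefficient is then interpreted conditional on those terms.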
Results
Two hundred twenty-six SNPs were evaluated within or near miRNA processing genes DDX20 (n = 17), DROSHA (n = 179), GEMIN4 (n = 11), and XPO5 (n = 19). Table 2 displays association results for the 6 processing SNPs (or their tagSNPs) identified by Liang and colleagues (3); none were risk associated. Of all other miRNA processing SNPs evaluated, only 3 DROSHA SNPs were associated with risk (P < 0.05) when accounting for study site only, but none retained statistical significance after further adjustment for ancestry (see Supplementary Table S1).
Gene (locus) | SNP (maj/min allele^a) | Location (putative miRs)^b | OR (95% CI) reported by Liang and colleagues (ref. 3) | MAF^c | Pooled OR (95% CI), adjusted for study^d | P | Pooled OR (95% CI), adjusted for study and ancestry^e | P |
---|---|---|---|---|---|---|---|---|
miRNA processing | | | | | | | | |
DDX20 (1p21.1-p13.2) | rs197414 (C/A)^f | Missense | 0.69 (0.48,0.99) | 0.13 | 1.02 (0.92,1.12) | 0.70 | 1.04 (0.94,1.15) | 0.49 |
DROSHA (5p13.3) | rs9292427 (C/T)^g | Intron | 0.71 (0.51,0.99) | 0.46 | 1.01 (0.95,1.08) | 0.72 | 1.01 (0.94,1.08) | 0.79 |
GEMIN4 (17p13) | rs2740349 (A/C)^h | Exon 1, ns | 0.70 (0.51,0.96) | 0.18 | 0.99 (0.92,1.09) | 0.97 | 1.02 (0.93,1.11) | 0.71 |
GEMIN4 (17p13) | rs2740351 (T/C)^i | Flanks 5′UTR | 0.71 (0.57,0.87) | 0.45 | 0.98 (0.91,1.04) | 0.46 | 1.00 (0.94,1.07) | 0.98 |
GEMIN4 (17p13) | rs7813 (T/G)^i | Exon 1, ns | 0.71 (0.57,0.88) | 0.46 | 0.97 (0.91,1.04) | 0.38 | 1.00 (0.93,1.07) | 0.91 |
XPO5 (6p21.1) | rs2257082 (C/A) | Exon 1, ss | 0.73 (0.54,0.99) | 0.27 | 0.99 (0.92,1.07) | 0.87 | 1.00 (0.93,1.08) | 0.95 |
miRNA binding sites | | | | | | | | |
CAV1 (7q31.1) | rs9920 (G/A) | 3′UTR (miR-630) | 1.50 (1.04,2.17) | 0.10 | 1.13 (1.10,1.26) | 0.03 | 1.06 (0.95,1.19) | 0.29 |
COL18A1 (21q22.3) | rs7499 (G/A) | 3′UTR (miR-594) | 1.47 (1.07,2.02) | 0.42^c | 0.98 (0.92,1.05) | 0.57 | 0.98 (0.92,1.05) | 0.50 |
E2F2 (1p36) | rs2075993 (A/C)^j | 3′UTR (miR-663, 486-3p) | 1.24 (1.00,1.54) | 0.48 | 1.01 (0.95,1.08) | 0.67 | 1.01 (0.94,1.08) | 0.87 |
IL1R1 (2q12) | rs3917328 (C/T) | 3′UTR (miR-335, 31) | 1.65 (1.03,2.64) | 0.05^c | 1.06 (0.91,1.23) | 0.49 | 1.00 (0.86,1.17) | 0.99 |
KRAS (12p12.1) | rs13096 (A/G)^k | 3′UTR (miR-1244) | 1.26 (1.01,1.57) | 0.45 | 1.00 (0.94,1.07) | 0.94 | 0.99 (0.93,1.06) | 0.85 |
UGT2A3 (4q13.2) | rs17147016 (T/A)^h | 3′UTR (miR-224, 1279) | 1.47 (1.08,2.01) | 0.19^c | 1.02 (0.93,1.11) | 0.70 | 1.01 (0.93,1.10) | 0.88 |
NOTE: All P values are 2-sided.
Abbreviations: US-CAN, United States-Canada; maj, major; min, minor; miR, miRNA; ns, nonsynonymous SNP; ss, synonymous SNP; MAF, minor allele frequency among all controls.
^a The major allele represents the most frequently occurring allele and serves as the reference allele during modeling.
^b SNP location derived from Illumina annotation files, HapMap2 data (http://hapmap.ncbi.nlm.nih.gov/), and dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/). SNPinfo (http://snpinfo.niehs.nih.gov/) and the PolymiRTS database (http://compbio.uthsc.edu/miRSNP/) were used to predict miRNAs whose binding activity may be altered because of the SNP location.
^c Genotype data were imputed for all participants with MACH version 1.0.16 by using phased data from HapMap release 22 (genome build 36) derived from individuals with European ancestry (CEU).
^d Pooled OR and 95% CI estimated by using a log-additive model adjusted for study (US-CAN, UK, and POL).
^e Pooled OR and 95% CI estimated by using a log-additive model adjusted for study and the first 2 PCs representing European ancestry.
^f DDX20 rs197414 is in linkage disequilibrium (LD; r² = 0.90) with rs197383 identified by Liang and colleagues.
^g DROSHA rs9292427 is in LD (r² = 0.98) with rs4867329 identified by Liang and colleagues.
^h SNP deviates from Hardy–Weinberg Equilibrium (HWE) among all controls, with PHWE values of 0.020 for rs607613, 0.040 for rs615435, 0.013 for rs2740349, 0.004 for rs3732133, and 0.034 for rs17147016.
^i GEMIN4 SNP pair in LD (r² = 1).
^j E2F2 SNP pair in LD (r² = 0.97).
^k KRAS rs13096 is in LD (r² = 1) with rs10771184 identified by Liang and colleagues.
There were 23 SNPs predicted to disrupt miRNA binding within 6 of the 7 candidate genes (3). We did not evaluate SNPs within ATG4A because neither genotype nor imputed data were available for SNPs within the 3′ UTR. Table 2 shows results from the 6 binding site SNPs (or their tagSNPs) identified by Liang and colleagues (3). To minimize redundancy because of tagSNPs, results from 21 of the 23 binding site SNPs evaluated are displayed in Supplementary Table S1. Only 1 previously identified binding site SNP, CAV1 rs9920 (3), and 2 imputed CAV1 SNPs (rs1049314 and rs8713) were associated with risk in the pooled, study site-adjusted analysis (Table 2; Supplementary Table S1). However, none of these CAV1 SNPs were risk associated after further adjustment for ancestry.
Study-specific estimates were generally similar across studies, and results did not change appreciably when considering a dominant genetic model or serous-only histology (data not shown).
Discussion
We did not detect consistent associations between the majority of previously identified polymorphisms (3) and EOC risk. Although we did identify associations between EOC risk and 3 SNPs flanking the 3′UTR of DROSHA and 3 SNPs in miRNA binding sites of CAV1, none retained statistical significance after controlling for European ancestry. Consistent with a recent large-scale study (11), but not with smaller studies (3, 12), we did not identify associations between EOC risk and SNPs in miRNA binding sites of KRAS.
Several explanations exist for not replicating the findings presented by Liang and colleagues (3). First, our analysis suggests their results may have been confounded by population admixture, underscoring the importance of estimating population stratification rather than relying on self-reported ancestry in genetic association studies. Second, because of their relatively small sample size (3), chance is an alternative explanation for their findings. Our pooled sample had at least 90% statistical power to detect an SNP with a minor allele frequency of 0.09 and a log-additive OR of 1.2. This analysis highlights the importance of having large studies and/or combining genotype data from multiple studies to increase statistical power to detect true associations, and shows the utility of adjusting for population stratification and of imputation.
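To give a sense of how a power figure of this kind can be approximated, the sketch below applies a standard normal approximation for the per-allele log OR in a case-control sample. It is not the authors' calculation: it assumes Hardy–Weinberg proportions, a 2-sided α of 0.05, no covariates, and an information term evaluated under the null, and the function name is hypothetical. With the pooled sample sizes it should give a value close to, though not necessarily equal to, the reported figure.

```python
import math
from statistics import NormalDist


def approx_power(n_cases, n_controls, maf, odds_ratio, alpha=0.05):
    """Approximate power of a 2-sided Wald test for a log-additive SNP effect."""
    n = n_cases + n_controls
    case_fraction = n_cases / n
    # Variance of the allele count under Hardy-Weinberg proportions.
    genotype_variance = 2.0 * maf * (1.0 - maf)
    # Approximate Fisher information for the per-allele log OR.
    information = n * case_fraction * (1.0 - case_fraction) * genotype_variance
    noncentrality = abs(math.log(odds_ratio)) * math.sqrt(information)
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    # Probability that the test statistic falls in either rejection region.
    return nd.cdf(noncentrality - z_crit) + nd.cdf(-noncentrality - z_crit)


# Pooled sample sizes and effect size taken from the text above.
power = approx_power(3973, 3276, maf=0.09, odds_ratio=1.2)
print(f"approximate power: {power:.2f}")
```

Because the information term scales with both total sample size and 2 × MAF × (1 − MAF), the same function shows why a study of a few hundred cases and controls has limited power for modest per-allele effects.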
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Acknowledgments
We thank all individuals who participated in this research along with all researchers, clinicians, and staff who have contributed to the participating studies.
Grant Support
The genotyping, bioinformatic, and biostatistical data analysis for MAY, NCO, and TOR was supported by R01-CA-114343 and R01-CA114343-S1. The MAY study is supported by R01-CA-122443 and P50-CA-136393 and funding from the Mayo Foundation. The NCO study is supported by R01-CA-76016. The TBO study is supported by R01-CA-106414, the American Cancer Society (CRTG-00-196-01-CCE), and the Advanced Cancer Detection Center Grant, Department of Defense (DAMD-17-98-1-8659). The TOR study is supported by grants from the Canadian Cancer Society and the NIH (R01-CA-63682 and R01-CA-63678). The Mayo Clinic Genotyping Shared Resource is supported by the National Cancer Institute (P30-CA-15083). The NEC study is supported by grants CA-54419 and P50 CA105009. The POL study was supported by the Intramural Research Program of the NIH, National Cancer Institute, Division of Cancer Epidemiology and Genetics, and the Center for Cancer Research. The SEA study is funded by a program grant from Cancer Research UK. The UKO study is supported by funding from Cancer Research UK, the Eve Appeal, and the OAK Foundation; some of this work was undertaken at UCLH/UCL who received some funding from the Department of Health's NIHR Biomedical Research Centre funding scheme. UK genotyping and data analysis were supported by a project grant from Cancer Research UK. UK studies also make use of data generated by the Wellcome Trust Case-Control consortium. A list of investigators who contributed to the generation of data is available at http://www.wtccc.org.uk.