Abstract
We evaluated the generalizability of a single nucleotide polymorphism (SNP), rs2046210 (A/G allele), associated with breast cancer risk that was initially identified at 6q25.1 in a genome-wide association study conducted among Chinese women. In a pooled analysis of more than 31,000 women of East-Asian, European, and African ancestry, we found a positive association for rs2046210 and breast cancer risk in Chinese women [ORs (95% CI) = 1.30 (1.22–1.38) and 1.64 (1.50–1.80) for the AG and AA genotypes, respectively, P for trend = 1.54 × 10⁻³⁰], Japanese women [ORs (95% CI) = 1.31 (1.13–1.52) and 1.37 (1.06–1.76), P for trend = 2.51 × 10⁻⁴], and European-ancestry American women [ORs (95% CI) = 1.07 (0.99–1.16) and 1.18 (1.04–1.34), P for trend = 0.0069]. No association with this SNP, however, was observed in African American women [ORs (95% CI) = 0.81 (0.63–1.06) and 0.85 (0.65–1.11) for the AG and AA genotypes, respectively, P for trend = 0.4027]. In vitro functional genomic studies identified a putative functional variant, rs6913578. This SNP is 1,440 bp downstream of rs2046210 and is in high linkage disequilibrium with rs2046210 in Chinese (r² = 0.91) and European-ancestry (r² = 0.83) populations, but not in Africans (r² = 0.57). SNP rs6913578 was found to be associated with breast cancer risk in Chinese and European-ancestry American women. After adjusting for rs2046210, the association of rs6913578 with breast cancer risk in African Americans approached borderline significance. Results from this large consortium study confirmed the association of rs2046210 with breast cancer risk among women of Chinese, Japanese, and European ancestry. This association may be explained in part by a putatively functional variant (rs6913578) identified in the region. Cancer Res; 71(4); 1344–55. ©2011 AACR.
Introduction
Breast cancer, one of the most common malignancies among women worldwide, is a complex polygenic disorder for which genetic factors play a significant role in disease etiology (1, 2). We recently identified a novel genetic susceptibility locus at 6q25.1 for breast cancer risk in a genome-wide association study (GWAS) conducted among Chinese women living in Shanghai (3). A nearly 60% elevated risk for breast cancer was found among women homozygous for the variant A allele in rs2046210, a single nucleotide polymorphism (SNP) located approximately 29 kb upstream of the ESR1 gene. It has yet to be determined whether this SNP is associated with breast cancer risk in other populations. Investigation of the association in other racial and ethnic groups is needed to determine the generalizability of this finding and to identify causal variants for the association. In this article, we report a pooled analysis of the association between rs2046210 and breast cancer risk in a consortium of 14 studies including more than 31,000 women of East-Asian, European, and African ancestry. We also conducted functional genomic studies to identify possible causal variants at this locus.
Materials and Methods
Study population
Fourteen studies contributing a total of 17,188 breast cancer cases and 14,660 controls participated in this consortium. Detailed descriptions of participating studies are included in the Supplement. Briefly, the consortium included 18,414 Chinese women from 7 studies conducted in Shanghai [n = 10,373; Shanghai Breast Cancer Study (SBCS)-I (3, 4), SBCS-II (3), and Shanghai Breast Cancer Survival Study (SBCSS)/Shanghai Endometrial Cancer Study (SECS; ref. 3)], Tianjin [n = 3,115; Tianjin Study (5)], Nanjing [n = 2,084; Nanjing Study (6, 7)], Taiwan [n = 2,014; Taiwan Study (8, 9)], and Hong Kong [n = 828; Hong Kong Study (10)]; 3,142 Japanese women from 3 studies conducted in Nagoya [n = 1,288; Hospital-based Epidemiologic Research Program at Aichi Cancer Center (HERPACC-II; ref. 11)], Hawaii [n = 1,048; Multiethnic Cohort Study (MEC; refs. 12, 13)], and Nagano [n = 806; Nagano Breast Cancer Study (14)]; 8,258 European-ancestry Americans from 3 studies conducted in WI/MA/NH [n = 3,266; Collaborative Breast Cancer Study (CBCS; refs. 15, 16)], TN [n = 3,060; Nashville Breast Health Study (NBHS; ref. 3)], and NY [n = 1,932; Long Island Breast Cancer Study Project (LIBCSP; ref. 17)]; and 2,034 African Americans from 2 studies conducted in 12 southern U.S. states [n = 1,568; Southern Community Cohort Study (SCCS; ref. 18)] and TN [n = 466; NBHS] (Table 1).
Table 1. Characteristics of studies participating in the breast cancer consortium
Study (reference) | Ethnicity | Study designᵃ | Study period | Nᵇ | Age (mean) | Menopause (%) | ER (+)ᶜ (%) |
---|---|---|---|---|---|---|---|
SBCS-Iᵈ (3, 4) | Chinese | Population | 1996–1998 | 1,105/1,213 | 47.5/47.3 | 32.7/36.3 | 63.8 |
SBCS-IIᵈ (3) | Chinese | Population | 2002–2005 | 1,915/1,836 | 50.9/51.7ᵉ | 43.6/49.4ᵉ | 63.6 |
SBCSSᵈ/SECSᵈ (3) | Chinese | Population | 2002–2006 | 3,405/899 | 55.0/54.9 | 53.7/61.7ᵉ | 64.5 |
Tianjin (5) | Chinese | Hospital | 2004–2008 | 1,532/1,583 | 51.7/51.9 | 51.7/55.4ᶠ | 44.3 |
Nanjing (6, 7) | Chinese | Hospital | 2004–2008 | 1,050/1,034 | 51.6/52.0 | 53.7/55.2 | 55.4 |
Taiwan (8, 9) | Chinese | Hospital | 2004–2007 | 1,001/1,013 | 51.6/47.4ᵉ | 52.7/39.8ᵉ | 65.9 |
Hong Kong (10) | Chinese | Hospital | 2003–2004 | 407/421 | 45.5/45.4 | 52.2/41.4ᵉ | 72.0 |
Nagoya, Japanᵈ (11) | Japanese | Hospital | 2000–2005 | 644/644 | 51.4/51.1 | 48.5/48.5 | 72.8 |
MECᵈ (12, 13) | Japanese | Population | 1993–2008 | 541/507 | 65.1/60.3ᵉ | 86.4/83.3 | 86.2 |
Nagano, Japanᵈ (14) | Japanese | Hospital | 2001–2005 | 403/403 | 53.7/53.9 | 54.6/65.0ᵉ | 74.8 |
NBHS-Whiteᵈ (3) | European | Population | 2001–2008 | 1,592/1,468 | 54.8/52.2ᵉ | 65.5/58.9ᵉ | 74.9 |
CBCSᵈ (15, 16) | European | Population | 1998–2001 | 1,828/1,438 | 53.7/53.4 | 59.0/61.0 | NAᵍ |
LIBCSPᵈ (17) | European | Population | 1996–1997 | 953/979 | 58.8/56.7ᵉ | 67.6/66.5 | 76.7 |
CGEMS | European | Population | NAᵍ | 1,145/1,142 | NAᵍ | NAᵍ | NAᵍ |
SCCS-Blackᵈ (18) | African | Population | 2002–2008 | 522/1,046 | 48.1/56.6ᵉ | 59.5/76.7ᵉ | NAᵍ |
NBHS-Black (3) | African | Population | 2001–2008 | 290/176 | 54.5/52.1ᶠ | 70.7/61.9 | NAᵍ |
ᵃ All studies except the MEC and SCCS used a case–control design with either a population-based or hospital-based approach.
ᵇ Cases/controls.
ᶜ Among cases with ER data.
ᵈ SBCS-I: Shanghai Breast Cancer Study-I; SBCS-II: Shanghai Breast Cancer Study-II; SBCSS: Shanghai Breast Cancer Survival Study; SECS: Shanghai Endometrial Cancer Study; Nagoya, Japan: Hospital-based Epidemiologic Research Program at Aichi Cancer Center; Nagano, Japan: Nagano Breast Cancer Study; MEC: Multiethnic Cohort Study; LIBCSP: Long Island Breast Cancer Study Project; CBCS: Collaborative Breast Cancer Study; NBHS: Nashville Breast Health Study; SCCS: Southern Community Cohort Study.
ᵉ Significant at α = 0.01 level (t test for continuous variables, Chi-square test for categorical variables).
ᶠ Significant at α = 0.05 level (t test for continuous variables, Chi-square test for categorical variables).
ᵍ Data not available.
Genotyping
Genotyping assays were done at 6 different centers. The genotyping assay protocol was developed and validated at the Vanderbilt Molecular Epidemiology Laboratory, and TaqMan genotyping assay reagents were provided to investigators of the Tianjin study (Tianjin Cancer Institute and Hospital), Nanjing study (Nanjing Medical University), LIBCSP (Columbia University), MEC (University of Southern California), and Nagano Breast Cancer study (Japan National Cancer Center), who conducted the genotyping assays at their own laboratories. Samples from the other 8 studies were genotyped at Vanderbilt using TaqMan and Affymetrix SNP arrays or at Proactive Genomics using the iPlex Sequenom MassArray platform. The Shanghai study samples were genotyped with Affymetrix Genome-Wide Human SNP Array 5.0 or 6.0 (Stage 1 of the initial GWAS) and Sequenom (Stages 2 and 3) as described previously (3). The SCCS samples were genotyped with Sequenom. All other samples were genotyped with the TaqMan assay.
Genotyping quality controls
Quality control (QC) procedures for samples from the Shanghai studies have been described previously (3). The consistency rate was 99.7% based on 2,572 comparisons with blinded QC samples and 99.2% based on 1,751 comparisons with HapMap DNA samples. For the SCCS samples genotyped with the Sequenom platform, 2 negative controls, 2 blinded duplicates, and 2 samples from the HapMap project were included in each 96-well plate. The QC consistency rate was 100% for blinded duplicates and 100% for the HapMap samples comparing genotyping data obtained from the current study with data obtained from the HapMap project. For TaqMan genotyping assays conducted at the Vanderbilt Molecular Epidemiology Laboratory, 2 negative controls and 2 blinded duplicates were included in each 96-well plate, along with 30 unrelated European and 45 Chinese samples from the HapMap project for QC purposes. The consistency rate was 98.8% for the blinded duplicates and 100% for the HapMap samples comparing genotyping data obtained from the current study with data obtained from the HapMap project. Each of the non-Vanderbilt laboratories was asked to genotype a trial plate containing DNA from 46 unrelated European-ancestry and 70 Chinese-ancestry samples before the main study genotyping was conducted. The consistency rate across all centers for these trial samples was 100% compared with genotypes previously determined at Vanderbilt. In addition, replicate samples comprising 3% to 7% of all study samples were dispersed among the genotyping plates at all centers. The genotype distribution for rs2046210 was in Hardy–Weinberg equilibrium among controls for all participating studies, with the exception of control samples from the Taiwan study (P = 0.003). The genotype distributions for rs6929137, rs3734804, rs6913578, and rs7763637 were in Hardy–Weinberg equilibrium among controls for all participating studies.
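The Hardy–Weinberg check described above can be illustrated with a short sketch. The exact test used in the study is not stated, so the code below assumes a standard chi-square goodness-of-fit test with 1 degree of freedom; the function name `hwe_chi_square` is hypothetical, and the input counts are the Taiwan control genotypes for rs2046210 reported in Table 2.

```python
import math

def hwe_chi_square(n_gg, n_ag, n_aa):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Assumption: a Pearson chi-square test; the study does not state which
    HWE test was applied.
    """
    n = n_gg + n_ag + n_aa
    p_a = (2 * n_aa + n_ag) / (2 * n)  # frequency of the A allele
    p_g = 1.0 - p_a
    expected = {"GG": n * p_g ** 2, "AG": 2 * n * p_a * p_g, "AA": n * p_a ** 2}
    observed = {"GG": n_gg, "AG": n_ag, "AA": n_aa}
    chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)
    # Upper-tail probability of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Taiwan controls (GG, AG, AA) from Table 2
chi2, p_value = hwe_chi_square(384, 514, 115)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

Under these assumptions the Taiwan controls give a P value of roughly 0.003, in line with the departure from equilibrium reported for that study.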
Imputation
To evaluate the association of breast cancer risk with SNPs that were not directly genotyped in the initial GWA scan, we imputed the genotypes of these SNPs using the program MACH (19). MACH determines the probability distribution of missing genotypes conditional on a set of known haplotypes, while simultaneously estimating the fine-scale recombination map. For the Shanghai studies, the imputation was based on 660,118 autosomal SNPs genotyped using Affymetrix Genome-Wide Human SNP Array 6.0 with a minor allele frequency (MAF) greater than 1% that passed the QC check and using phased CHB/JPT data from HapMap Phase II (release 22). For the National Cancer Institute Cancer Genetic Markers of Susceptibility (CGEMS) study (20), genotypes were imputed on the basis of 513,602 autosomal SNPs genotyped using Illumina HumanHap550 BeadChip with a MAF greater than 1% and phased CEU data from HapMap Phase II (release 22). Logistic regression was used to estimate the association of imputed SNPs of interest with breast cancer risk, taking into account the degree of uncertainty of genotype imputation.
Plasmid constructs and luciferase assays
DNA fragments carrying the minor alleles of study SNPs were amplified by PCR and cloned upstream of the luciferase gene in a reporter vector, pGL3 promoter or pGL3 basic (Promega). The major alleles were generated by using a QuikChange Site-Directed Mutagenesis Kit (Stratagene). Details on PCR primers and site-specific mutagenesis oligonucleotides are provided in the Supplement. All DNA constructs were verified by sequencing analysis. Enhancer and promoter activities were determined by transient transfection followed by an in vitro luciferase assay in HEK293 cells. Transfection was done with FuGene 6 Transfection Reagent (Roche Diagnostics) in triplicate for each of the constructs. Briefly, 2 × 10⁵ cells were seeded in 24-well plates and cotransfected with pGL4.73, a Renilla-expressing vector, which served as a reference for transfection efficiency. Thirty-six to 48 hours later, the cells were lysed with Passive Lysis Buffer and luminescence (relative light units) was measured using the Dual-Luciferase Assay System (Promega). Regulatory activity was measured as the ratio of firefly luciferase activity to Renilla luciferase activity, and the means from at least 3 independent experiments are presented.
Electrophoretic mobility shift assay
Biotin-labeled, double-stranded oligonucleotide probes (details in Supplement) containing either the major or minor allele sequence were synthesized. The probes were incubated with nuclear protein extracts from HEK293 and MCF7 cells, in the presence or absence of competitors (that is, unlabeled probes). Protein-DNA complexes were resolved by polyacrylamide gel electrophoresis and detected using a LightShift Chemiluminescent EMSA kit (Pierce Biotechnology).
Statistical analysis
Individual data were obtained from each study for a pooled analysis. Case–control differences in selected demographic characteristics and major risk factors were evaluated using t tests (for continuous variables) and Chi-square tests (for categorical variables). Associations between SNPs and breast cancer risk were determined using odds ratios (OR) and 95% CIs derived from logistic regression models. ORs were estimated for heterozygotes and homozygotes for the variant allele compared with homozygotes for the common allele. ORs were also estimated for the variant allele on the basis of a log-additive model and adjusted for age, study site, and ethnicity, when appropriate. Adjusting for nongenetic risk factors, including age at first live birth, age at menarche, age at menopause, body mass index, participation in exercise, family history of breast cancer and history of benign breast diseases, did not alter the observed association, and thus only age-adjusted and study site-adjusted results are presented. Heterogeneity across studies and between ethnicities was assessed with likelihood ratio tests. Stratified analyses by ethnicity, menopausal status, and estrogen receptor (ER) status were carried out.
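As a rough illustration of how an OR and its 95% CI are formed, the sketch below computes an unadjusted OR with a Wald interval from the pooled Chinese genotype counts reported in Table 2. It is not the age- and site-adjusted logistic regression used in the analysis, so the estimates are only expected to approximate the adjusted values; the function name `crude_odds_ratio` is an assumption made for illustration.

```python
import math

def crude_odds_ratio(case_exposed, control_exposed, case_ref, control_ref, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI from a 2 x 2 table.

    Assumption: no covariate adjustment, unlike the logistic models
    described in the text.
    """
    odds_ratio = (case_exposed * control_ref) / (control_exposed * case_ref)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / control_exposed + 1 / case_ref + 1 / control_ref
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Pooled Chinese counts from Table 2 (cases/controls): GG 3,460/3,321;
# AG 5,127/3,803; AA 1,828/1,075. GG is the reference genotype.
for label, cases, controls in [("AG", 5127, 3803), ("AA", 1828, 1075)]:
    or_, lo, hi = crude_odds_ratio(cases, controls, 3460, 3321)
    print(f"{label} vs GG: OR = {or_:.2f} (95% CI: {lo:.2f}-{hi:.2f})")
```

The crude estimates (approximately 1.29 for AG and 1.63 for AA) are close to the adjusted ORs of 1.30 and 1.64 reported for Chinese women.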
Results
The distributions of age and menopausal status for participating studies are shown in Table 1. Higher risk of breast cancer was consistently observed for all known major breast cancer risk factors, including a family history of breast cancer, a prior history of benign breast disease, physical inactivity, early onset of menarche, late onset of menopause, and late age at first live birth (data not shown). Except for the CBCS and SCCS, data on ER status were available from all studies.
Generalizability of the association of rs2046210 with breast cancer risk
Table 2 presents associations between rs2046210 genotypes and breast cancer risk by study site and ethnicity. The variant A allele, which was the minor allele in all groups except African Americans, was associated with increased breast cancer risk in all Chinese studies. Pooled analyses of samples from all studies conducted among Chinese women (SBCS-I, SBCS-II, SBCSS/SECS, Tianjin, Nanjing, Taiwan, and Hong Kong) produced ORs of 1.30 (95% CI: 1.22–1.38) and 1.64 (95% CI: 1.50–1.80) for the AG and AA genotypes, respectively (P for trend = 1.54 × 10⁻³⁰). After excluding from the analysis the Shanghai data from which the original association was derived, the association with breast cancer was stronger; ORs were 1.26 (95% CI: 1.14–1.39) and 1.77 (95% CI: 1.55–2.02), respectively, for the AG and AA genotypes (P for trend = 2.82 × 10⁻¹⁷). SNP rs2046210 was also associated with increased breast cancer risk in all 3 studies conducted among Japanese women (Nagoya, MEC, and Nagano), with pooled ORs of 1.31 (95% CI: 1.13–1.52) and 1.37 (95% CI: 1.06–1.76) for the AG and AA genotypes, respectively (P for trend = 2.51 × 10⁻⁴). The homogeneity test for results between the Chinese and Japanese studies was not statistically significant (P = 0.42); therefore, all studies conducted among Chinese and Japanese women were combined into an “East-Asians” group for subsequent pooled analyses.
Table 2. Association of SNP rs2046210 with breast cancer risk by study site and ethnicity
Study/group | Frequency of the A allele (%) | GG: Nᵃ | GG: OR (95% CI) | AG: Nᵃ | AG: OR (95% CI) | AA: Nᵃ | AA: OR (95% CI) | P for trend |
---|---|---|---|---|---|---|---|---|
By study siteᵇ | | | | | | | | |
Shanghai | 41.9/36.4 | 2,144/1,611 | 1.00ᵈ | 3,183/1,802 | 1.33 (1.22–1.45) | 1,098/535 | 1.54 (1.37–1.74) | 4.67 × 10⁻¹⁵ |
Tianjin | 42.1/35.5 | 512/655 | 1.00 | 750/732 | 1.31 (1.12–1.53) | 270/196 | 1.76 (1.42–2.19) | 9.60 × 10⁻⁸ |
Nanjing | 43.2/36.8 | 341/415 | 1.00 | 510/477 | 1.30 (1.08–1.57) | 199/142 | 1.71 (1.32–2.21) | 2.66 × 10⁻⁵ |
Taiwan | 42.0/36.7 | 334/384 | 1.00 | 494/514 | 1.11 (0.91–1.34) | 173/115 | 1.73 (1.31–2.28) | 5.02 × 10⁻⁴ |
Hong Kong | 45.0/36.4 | 129/256 | 1.00 | 190/278 | 1.36 (1.02–1.80) | 88/87 | 2.01 (1.40–2.89) | 1.63 × 10⁻⁴ |
Nagoya, Japan | 34.0/26.9 | 273/349 | 1.00 | 295/236 | 1.60 (1.27–2.02) | 69/54 | 1.63 (1.11–2.41) | 1.28 × 10⁻⁴ |
MEC-Japanese | 28.7/26.3 | 280/277 | 1.00 | 211/193 | 1.08 (0.84–1.40) | 50/37 | 1.34 (0.85–2.11) | 0.2251 |
Nagano, Japan | 30.3/28.0 | 195/214 | 1.00 | 172/152 | 1.24 (0.93–1.66) | 36/37 | 1.07 (0.65–1.76) | 0.3310 |
NBHS-White | 37.6/34.4 | 613/618 | 1.00 | 761/691 | 1.11 (0.95–1.29) | 218/159 | 1.38 (1.10–1.75) | 0.0077 |
CBCS-White | 37.3/36.6 | 706/567 | 1.00 | 882/690 | 1.03 (0.89–1.19) | 240/181 | 1.07 (0.85–1.33) | 0.5684 |
LIBCSP-White | 37.4/36.8 | 370/391 | 1.00 | 454/455 | 1.05 (0.87–1.28) | 129/133 | 1.02 (0.77–1.36) | 0.7309 |
CGEMS-White | 36.9/34.5 | 446/478 | 1.00 | 554/541 | 1.10 (0.92–1.31) | 145/123 | 1.26 (0.96–1.66) | 0.0838 |
SCCS-Black | 62.6/62.1 | 74/149 | 1.00 | 242/494 | 0.99 (0.72–1.35) | 206/403 | 1.03 (0.74–1.43) | 0.7848 |
NBHS-Black | 57.2/61.4 | 62/26 | 1.00 | 124/84 | 0.62 (0.36–1.06) | 104/66 | 0.66 (0.38–1.15) | 0.2332 |
By ethnic groupᶜ | | | | | | | | |
Chinese | 42.2/36.3 | 3,460/3,321 | 1.00 | 5,127/3,803 | 1.30 (1.22–1.38) | 1,828/1,075 | 1.64 (1.50–1.80) | 1.54 × 10⁻³⁰ |
Chinese (excl. Shanghai) | 42.7/36.2 | 1,316/1,710 | 1.00 | 1,944/2,001 | 1.26 (1.14–1.39) | 730/540 | 1.77 (1.55–2.02) | 2.82 × 10⁻¹⁷ |
Japanese | 31.3/27.0 | 748/840 | 1.00 | 678/581 | 1.31 (1.13–1.52) | 155/128 | 1.37 (1.06–1.76) | 2.51 × 10⁻⁴ |
East-Asians | 40.7/34.8 | 4,208/4,161 | 1.00 | 5,805/4,384 | 1.30 (1.22–1.38) | 1,983/1,203 | 1.61 (1.48–1.76) | 2.47 × 10⁻³³ |
European ancestry | 37.3/35.5 | 2,135/2,054 | 1.00 | 2,651/2,377 | 1.07 (0.99–1.16) | 732/596 | 1.18 (1.04–1.34) | 0.0069 |
African Americans | 60.7/62.0 | 136/175 | 1.00 | 366/578 | 0.81 (0.63–1.06) | 310/469 | 0.85 (0.65–1.11) | 0.4027 |
All womenᶜ | 40.6/38.0 | 6,479/6,469 | 1.00 | 8,822/7,605 | 1.20 (1.15–1.26) | 3,025/2,482 | 1.41 (1.32–1.50) | 3.64 × 10⁻²⁷ |
ᵃ Cases/controls.
ᵇ Adjusted for age and study site.
ᶜ Adjusted for age, study site, and ethnicity.
ᵈ Reference group.
ᵉ P values derived from homogeneity tests were 0.7615 for 7 studies conducted among Chinese women, 0.4222 between Chinese and Japanese studies, 0.3529 across 4 studies conducted among women of European ancestry, and 3.18 × 10⁻⁶ between East-Asian and European-ancestry women.
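The allele frequencies in Table 2 can be recomputed from the genotype counts in the same table. The sketch below does this for the Shanghai and African American rows; the helper name `a_allele_frequency` is hypothetical, and reading the paired frequencies as cases/controls is an assumption that the recomputed values appear to support.

```python
def a_allele_frequency(n_gg, n_ag, n_aa):
    """Frequency of the A allele from genotype counts.

    Each AA homozygote carries two A alleles and each AG heterozygote one,
    out of two alleles per woman.
    """
    total_alleles = 2 * (n_gg + n_ag + n_aa)
    return (n_ag + 2 * n_aa) / total_alleles

# Genotype counts (GG, AG, AA) taken from Table 2
groups = {
    "Shanghai cases": (2144, 3183, 1098),
    "Shanghai controls": (1611, 1802, 535),
    "African American cases": (136, 366, 310),
    "African American controls": (175, 578, 469),
}
for name, counts in groups.items():
    print(f"{name}: {100 * a_allele_frequency(*counts):.1f}%")
```

The recomputed frequencies round to the tabulated values (41.9/36.4 for Shanghai and 60.7/62.0 for African Americans).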
Among women of European ancestry, a positive association between the A allele of the rs2046210 variant and breast cancer risk was found in all 3 studies (NBHS, CBCS, and LIBCSP) with directly genotyped data, although the trend test was statistically significant only in the NBHS (Table 2). SNP rs2046210 was not directly genotyped in the CGEMS study. Genotype data for this SNP among 1,145 breast cancer cases and 1,142 controls were imputed (MACH score = 1.00). An association with breast cancer risk was found with ORs of 1.10 (95% CI: 0.92–1.31) and 1.26 (95% CI: 0.96–1.66) for the AG and AA genotypes, respectively, which is consistent with the data from the 3 studies conducted among women of European ancestry included in the current analysis. In pooled analyses of all samples (5,518 cases/5,027 controls) from women of European ancestry (NBHS-White, CBCS, LIBCSP, and CGEMS), ORs were 1.07 (95% CI: 0.99–1.16) and 1.18 (95% CI: 1.04–1.34) for the AG and AA genotypes, respectively (P for trend = 0.0069; Table 2).
SNP rs2046210 was not associated with breast cancer risk among African Americans (Table 2). In the SCCS analysis of prevalent breast cancer cases, the case and control distributions of alleles were nearly identical, whereas among NBHS African Americans the ORs for the AG and AA genotypes were below 1.0. In pooled analyses of African American samples (812 cases/1,222 controls) from the 2 studies (SCCS and NBHS-Black), ORs were 0.81 (95% CI: 0.63–1.06) and 0.85 (95% CI: 0.65–1.11) for the AG and AA genotypes, respectively (P for trend = 0.40). The sample size for African Americans included in this study, however, was small, and the frequency of the A allele among African American controls (62.0%) was considerably higher than that among East-Asian (34.8%) and European-ancestry (35.5%) controls. Figure 1 presents a forest plot summarizing the results of these studies. We also conducted analyses stratified by menopausal and ER status and found that the association with rs2046210 was more evident for ER(−) breast cancer than for ER(+) breast cancer (P = 0.0004) in East-Asian women but not in women of European ancestry (Table 3).
Figure 1. ORs (95% CI) per risk allele for breast cancer by study site and ethnicity. The size of the boxes is proportional to the sample size of each study. The width of the diamonds represents the confidence intervals of the combined ORs derived from meta-analyses.
Table 3. Association of SNP rs2046210 with breast cancer risk by ethnicity, menopausal status, and ER statusᵃ
Genotype | East-Asians: Cases | East-Asians: Controls | East-Asians: OR (95% CI) | European-ancestry Americansᵇ: Cases | European-ancestry Americansᵇ: Controls | European-ancestry Americansᵇ: OR (95% CI) |
---|---|---|---|---|---|---|
All women | | | | | | |
GG | 4,208 | 4,161 | 1.00 (reference) | 1,689 | 1,576 | 1.00 (reference) |
AG | 5,805 | 4,384 | 1.30 (1.22–1.38) | 2,097 | 1,836 | 1.06 (0.97–1.17) |
AA | 1,983 | 1,203 | 1.61 (1.48–1.76) | 587 | 473 | 1.16 (1.01–1.33) |
Per A allele | | | 1.28 (1.23–1.33) | | | 1.07 (1.01–1.14) |
P for trend | | | 2.47 × 10⁻³³ | | | 0.0321 |
Premenopausal women | | | | | | |
GG | 2,040 | 1,867 | 1.00 (reference) | 600 | 585 | 1.00 (reference) |
AG | 2,827 | 2,075 | 1.23 (1.13–1.34) | 727 | 651 | 1.08 (0.93–1.27) |
AA | 956 | 572 | 1.50 (1.33–1.69) | 198 | 176 | 1.10 (0.87–1.39) |
Per A allele | | | 1.23 (1.16–1.30) | | | 1.16 (0.95–1.18) |
P for trend | | | 4.35 × 10⁻¹² | | | 0.2998 |
Postmenopausal women | | | | | | |
GG | 2,127 | 2,126 | 1.00 (reference) | 1,031 | 908 | 1.00 (reference) |
AG | 2,910 | 2,121 | 1.36 (1.25–1.48) | 1,249 | 1,081 | 1.02 (0.90–1.15) |
AA | 1,002 | 577 | 1.71 (1.52–1.93) | 359 | 272 | 1.17 (0.97–1.40) |
Per A allele | | | 1.32 (1.25–1.40) | | | 1.06 (0.98–1.16) |
P for trend | | | 4.48 × 10⁻²² | | | 0.1562 |
P for interaction with menopause | | | 0.0850 | | | 0.9816 |
ER (+)ᶜ | | | | | | |
GG | 2,418 | 4,161 | 1.00 (reference) | 425 | 1,009 | 1.00 (reference) |
AG | 3,142 | 4,384 | 1.24 (1.16–1.33) | 522 | 1,146 | 1.08 (0.93–1.26) |
AA | 1,043 | 1,203 | 1.52 (1.37–1.68) | 133 | 292 | 1.07 (0.85–1.36) |
Per A allele | | | 1.23 (1.18–1.29) | | | 1.05 (0.94–1.17) |
P for trend | | | 1.06 × 10⁻¹⁸ | | | 0.3847 |
ER (−)ᶜ | | | | | | |
GG | 1,295 | 4,161 | 1.00 (reference) | 138 | 1,009 | 1.00 (reference) |
AG | 1,930 | 4,384 | 1.36 (1.25–1.48) | 167 | 1,146 | 1.07 (0.84–1.37) |
AA | 695 | 1,203 | 1.77 (1.58–1.98) | 42 | 292 | 1.05 (0.73–1.52) |
Per A allele | | | 1.34 (1.27–1.41) | | | 1.04 (0.88–1.23) |
P for trend | | | 3.64 × 10⁻²⁵ | | | 0.6745 |
P for association with ER statusᵈ | | | 0.0004 | | | 0.9883 |
ᵃ Adjusted for age and study site.
ᵇ Includes NBHS-White, CBCS, and LIBCSP.
ᶜ No ER information was available in the CBCS, and thus this study was not included in the analysis.
ᵈ Derived from the Chi-squared test to examine the association between ER status and rs2046210 genotypes in the case group only.
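The case-only test of ER status against genotype can be sketched as a Pearson chi-square test on a 2 × 3 table. The code below applies it to the East-Asian case counts in Table 3; the function name `chi_square_2x3` is hypothetical, and the assumption is that the reported P value comes from an unadjusted Pearson test with 2 degrees of freedom.

```python
import math

def chi_square_2x3(row_a, row_b):
    """Pearson chi-square test for a 2 x 3 contingency table (2 df).

    Assumption: an unadjusted test; with 2 df the upper-tail probability
    has the closed form exp(-chi2 / 2).
    """
    if len(row_a) != 3 or len(row_b) != 3:
        raise ValueError("each row must contain exactly 3 genotype counts")
    total_a, total_b = sum(row_a), sum(row_b)
    grand_total = total_a + total_b
    chi2 = 0.0
    for obs_a, obs_b in zip(row_a, row_b):
        column_total = obs_a + obs_b
        exp_a = total_a * column_total / grand_total
        exp_b = total_b * column_total / grand_total
        chi2 += (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value

# East-Asian cases from Table 3, ordered GG, AG, AA
er_positive = [2418, 3142, 1043]
er_negative = [1295, 1930, 695]
chi2, p_value = chi_square_2x3(er_positive, er_negative)
print(f"chi2 = {chi2:.2f}, P = {p_value:.5f}")
```

With these counts the P value is on the order of 0.0004, consistent with the value reported for East-Asian women.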
Functional genomic studies of the chr 6q25.1 locus
There are 2 nonsynonymous SNPs (rs6929137 and rs3734804) in the C6orf97 gene, which is in the 6q25.1 locus. These 2 SNPs are in strong linkage disequilibrium (LD) with rs2046210 (r² = 0.91 in Chinese, 0.87 in Europeans, and 0.001 in Africans for rs6929137; r² = 0.91 in Chinese, 0.56 in Europeans, and 0.42 in Africans for rs3734804). In an attempt to identify SNPs that may be more strongly associated with breast cancer risk in women of European ancestry than the originally reported SNP (rs2046210), we genotyped these 2 SNPs in 1,592 European-ancestry American cases and 1,468 controls from the NBHS (NBHS-White). The variant alleles of the 2 SNPs were also associated with breast cancer risk [per variant allele OR = 1.11 (95% CI: 0.99–1.24) for rs6929137 and 1.12 (95% CI: 1.01–1.24) for rs3734804]. The associations, however, were not stronger than that of the initially reported SNP rs2046210 in the NBHS-White group (OR per variant allele = 1.15, 95% CI: 1.04–1.28). Because these 2 SNPs are not included in the Affymetrix Genome-Wide Human SNP Array 6.0, we imputed their genotypes in the Shanghai samples. These SNPs again showed a significant association with breast cancer risk [2,073 cases and 2,084 controls; ORs per variant allele, 1.26 (95% CI: 1.15–1.39) for rs6929137 and 1.27 (95% CI: 1.16–1.39) for rs3734804]. The associations with these 2 SNPs were slightly stronger than that with the initially reported SNP rs2046210 identified in the GWAS [OR per variant allele = 1.25 (95% CI: 1.14–1.36)]. However, rs6929137 was not associated with breast cancer risk in African Americans (21). Thus, further evaluations of these 2 SNPs were not conducted.
To evaluate whether SNP rs2046210 has any regulatory function, we conducted luciferase reporter assays. The reporter construct containing the major G allele and the construct containing the minor A allele produced similar levels of luciferase activity. The results of a search for transcription factor binding sites [TFBS; the “TFBS Conserved” track of the UCSC Genome Browser (22)] showed that rs2046210 does not alter putative transcription factor binding.
To identify potential causal SNPs, we conducted a series of heterologous promoter and enhancer assays, focusing on the 36-kb region between the C6orf97 and ESR1 genes. We divided the 36-kb region (chromosome 6:151,983,304–152,019,420) into 4 parts and used long-range PCR to amplify 4 DNA fragments [a, 8.6 kb (harboring the rs2046210 polymorphic site); b, 9.2 kb; c, 9.3 kb; and d, 8.9 kb], then cloned the 4 fragments into both pGL3 basic and pGL3 promoter vectors (Fig. 2A). The templates for PCR were DNA carrying the minor or major alleles of SNP rs2046210. The rs2046210A construct carried the minor allele of rs2046210 and the minor alleles of other SNPs in close proximity and strong LD with rs2046210, whereas the rs2046210G construct carried the major allele of rs2046210 and the major alleles of those SNPs. Luciferase activity derived from the fragment “a” construct in the pGL3 promoter vector with rs2046210A was significantly different from that of the fragment “a” construct with rs2046210G (data not shown). To refine the location of potential causal SNPs, we subdivided fragment “a” into 3 smaller DNA fragments (e, 2.2 kb; f, 4.1 kb; and g, 2.3 kb) containing either rs2046210A or rs2046210G, cloned them into the pGL3 promoter vector, and carried out luciferase assays. Luciferase activity derived from fragment “g” with rs2046210A was significantly different from that of fragment “g” with rs2046210G (data not shown). Six SNPs, including rs2046210, in fragment “g” were associated with breast cancer risk in Stage 1 of the initial SBCS GWAS. After excluding SNPs that showed no evidence of alteration of putative transcription factor binding in the database search, we found 3 candidate SNPs (rs7740686, rs7763637, and rs6913578) in this region (Fig. 2A). We then generated major allele constructs for each of these 3 SNPs by site-directed mutagenesis, using the 2.3-kb fragment “g” with the rs2046210A construct as the template.
Luciferase activity was significantly higher in constructs harboring the major alleles of rs6913578 (rs6913578-A in Fig. 2B) or rs7763637 (rs7763637-G in Fig. 2B) compared with the corresponding minor alleles (Minor Alleles in Fig. 2B).
Figure 2. In vitro functional characterization of SNP rs2046210 and other potential functional SNPs at 6q25.1. A, diagram of cloning strategy. A 36-kb region (chromosome 6:151,983,304–152,019,420) between the C6orf97 and ESR1 genes was divided into 4 DNA fragments (a–d), which were separately cloned into pGL3 basic and pGL3 promoter vectors. The 8.6-kb “a” fragment was further divided into 3 DNA fragments (e, f, and g) and subcloned into a pGL3 promoter vector. The “g” fragment harbored 4 SNPs (rs7740686, rs2046210, rs7763637, and rs6913578). B, luciferase reporter activity assays: HEK293 cells were transiently transfected with pGL3 promoter/luciferase reporter constructs containing the 2.3-kb “g” fragment. 1. Minor alleles construct: contained the minor alleles for all 4 SNPs (rs7740686-T, rs2046210-A, rs7763637-A, and rs6913578-C); 2. Major alleles construct: contained the major alleles for all 4 SNPs (rs7740686-A, rs2046210-G, rs7763637-G, and rs6913578-A); 3. rs7740686-A construct: contained the rs7740686 major allele A and the minor alleles for the other 3 SNPs; 4. rs2046210-G construct: contained the rs2046210 major allele G and the minor alleles for the other 3 SNPs; 5. rs7763637-G construct: contained the rs7763637 major allele G and the minor alleles for the other 3 SNPs; 6. rs6913578-A construct: contained the rs6913578 major allele A and the minor alleles for the other 3 SNPs. Relative luciferase activity is shown as the mean ± SD of 3 experiments conducted in triplicate (relative to the Minor alleles construct). Statistical analysis was conducted by using Student's t test to compare the minor and major alleles (*, P < 0.01 when compared with the minor alleles, n = 9). C, EMSA. Nuclear protein extracts from MCF-7 (top) and HEK293 (bottom) cells were incubated with biotin-labeled probes corresponding to the reference allele (lanes 1–5) or the risk allele (lanes 6–10) of rs6913578 in the absence or presence of competitors. Lanes 1 and 6, no nuclear extracts; lanes 2 and 7, unlabeled competitor in 200-fold molar excess; lanes 3 and 8 (5 mmol/L MgCl2), lanes 4 and 9 (2.5 mmol/L MgCl2), and lanes 5 and 10 (1.25 mmol/L MgCl2), no competitor. I: free biotin-labeled probes. II: specific DNA-protein complex bands.
To investigate whether the DNA sequences containing rs6913578 or rs7763637 interact with nuclear proteins and, if so, whether these SNPs alter protein-DNA interactions, we conducted electrophoretic mobility shift assays (EMSA). We found that the minor allele (C) of rs6913578 significantly altered the intensity of the DNA-protein complex (II) in both HEK293 and MCF-7 cells (Fig. 2C), whereas there was no detectable interaction of rs7763637 with nuclear proteins (data not shown).
Evaluation of putative functional variants with breast cancer risk
SNP rs6913578 is located 1,440 bp downstream of rs2046210. SNP rs2046210 is in strong LD with rs6913578 and rs7763637 in Chinese populations (r2 = 0.91 and 0.901, respectively) and European-ancestry populations (r2 = 0.83 and 0.87, respectively), but not in African populations (r2 = 0.57 for both). Both rs6913578 and rs7763637 were associated with breast cancer risk in Chinese women and European-ancestry Americans, and the associations were stronger than that of rs2046210 in European-ancestry Americans (Table 4). The positive associations of these SNPs with breast cancer risk diminished after adjusting for rs2046210 (Table 4), which is not surprising given their high LD with rs2046210. These 2 SNPs, rs6913578 and rs7763637, showed weak associations with breast cancer risk in African Americans (Table 4). After adjusting for rs2046210, however, the associations in African Americans approached borderline significance (P for trend = 0.096 and 0.077, respectively).
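The loss of precision after mutual adjustment can be illustrated with the standard two-predictor variance inflation factor, 1/(1 − r2). The sketch below applies it to the LD values quoted above. This is an approximation and not part of the original analysis: it treats the LD r2 as the squared correlation between the two genotype codings, it borrows a linear-regression result for a logistic model, and the function name `variance_inflation` is hypothetical.

```python
import math


def variance_inflation(r2):
    """Approximate variance inflation for a coefficient when a second
    predictor with squared correlation r2 is added to the model."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must be in [0, 1)")
    return 1.0 / (1.0 - r2)


# LD (r2) between rs6913578 and rs2046210, as quoted in the text above.
ld_with_rs2046210 = {
    "Chinese": 0.91,
    "European-ancestry": 0.83,
    "African": 0.57,
}

for population, r2 in ld_with_rs2046210.items():
    vif = variance_inflation(r2)
    # The standard error (and CI width on the log scale) grows by sqrt(VIF).
    print(f"{population}: r2 = {r2:.2f}, VIF ~ {vif:.1f}, SE inflation ~ {math.sqrt(vif):.2f}x")
```

Under these assumptions, the variance inflation is several-fold larger in Chinese and European-ancestry populations than in African populations. This fits the pattern in which the lower LD in African Americans leaves more scope to separate the effects of the 2 SNPs.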
Table 4. Association of SNPs rs6913578 and rs7763637 with breast cancer risk
SNP | Study^a | Allele^b | Genotyped or imputed | Adjusted SNP | No. of cases | No. of controls | Frequency (%)^c | Heterozygous OR (95% CI)^d | Homozygous OR (95% CI)^d | P for trend
---|---|---|---|---|---|---|---|---|---|---
rs6913578 | Chinese | C/A | Imputed | None | 2,069 | 2,080 | 39.6/34.4 | 1.26 (1.10–1.44) | 1.54 (1.27–1.87) | 1.44 × 10−6
rs6913578 | Chinese | C/A | Imputed | rs2046210 | 2,064 | 2,077 | 39.6/34.4 | 1.15 (0.85–1.55) | 1.27 (0.72–2.26) | 0.3852
rs6913578 | European-ancestry Americans | C/A | Imputed + genotyped | None | 2,691 | 2,571 | 33.7/31.4 | 1.06 (0.95–1.19) | 1.31 (1.08–1.60) | 0.0128
rs6913578 | European-ancestry Americans | C/A | Imputed + genotyped | rs2046210 | 2,653 | 2,532 | 33.6/31.3 | 0.89 (0.68–1.17) | 0.92 (0.57–1.47) | 0.5349
rs6913578 | African Americans | C/A | Genotyped | None | 799 | 1,736 | 47.9/47.5 | 1.04 (0.84–1.28) | 1.13 (0.88–1.45) | 0.3363
rs6913578 | African Americans | C/A | Genotyped | rs2046210 | 795 | 1,727 | 48.1/47.5 | 1.16 (0.89–1.49) | 1.38 (0.94–2.03) | 0.0964
rs7763637 | SBCS-GWAS | A/G | Genotyped | None | 1,927 | 1,936 | 39.5/33.9 | 1.28 (1.11–1.46) | 1.62 (1.33–1.98) | 3.49 × 10−7
rs7763637 | SBCS-GWAS | A/G | Genotyped | rs2046210 | 1,924 | 1,933 | 39.6/33.9 | 1.26 (0.92–1.72) | 1.58 (0.87–2.86) | 0.1294
rs7763637 | European-ancestry Americans | A/G | Imputed + genotyped | None | 2,693 | 2,571 | 34.8/32.5 | 1.08 (0.96–1.21) | 1.29 (1.07–1.55) | 0.0104
rs7763637 | European-ancestry Americans | A/G | Imputed + genotyped | rs2046210 | 2,654 | 2,528 | 34.7/32.3 | 0.94 (0.71–1.24) | 0.85 (0.53–1.36) | 0.4950
rs7763637 | African Americans | A/G | Genotyped | None | 790 | 1,749 | 47.8/47.8 | 1.01 (0.82–1.25) | 1.12 (0.87–1.43) | 0.3974
rs7763637 | African Americans | A/G | Genotyped | rs2046210 | 786 | 1,741 | 48.0/47.8 | 1.15 (0.89–1.49) | 1.42 (0.97–2.09) | 0.0768
^a Chinese: SBCS-GWAS; European-ancestry Americans: NBHS-White and CGEMS; African Americans: SCCS and NBHS-black.
^b Risk allele/reference allele.
^c Test allele frequency in cases/controls.
^d Adjusted for age.
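As a rough check on the unadjusted rows of Table 4, a crude allelic OR can be computed directly from the allele frequencies in cases and controls. The sketch below is illustrative only. It assumes the tabulated frequency refers to the risk allele, it ignores the age adjustment used in the study, and the function name `allelic_odds_ratio` is hypothetical, so its output is not expected to match the genotype-specific ORs in the table.

```python
def allelic_odds_ratio(freq_cases_pct, freq_controls_pct):
    """Crude allelic OR from allele frequencies (in percent) in cases and controls."""
    p_cases = freq_cases_pct / 100.0
    p_controls = freq_controls_pct / 100.0
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls


# rs6913578 allele frequencies (cases, controls) from the unadjusted rows of Table 4.
rs6913578_freqs = {
    "Chinese": (39.6, 34.4),
    "European-ancestry Americans": (33.7, 31.4),
    "African Americans": (47.9, 47.5),
}

for population, (cases, controls) in rs6913578_freqs.items():
    crude_or = allelic_odds_ratio(cases, controls)
    print(f"{population}: crude allelic OR ~ {crude_or:.2f}")
```

The crude estimates follow the same ordering as the reported trend tests: largest in Chinese women, intermediate in European-ancestry Americans, and close to 1 in African Americans.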
Discussion
In this pooled analysis of 17,188 cases and 14,660 controls, we confirmed the association of rs2046210 at 6q25.1 with breast cancer risk among women of Chinese, Japanese, and European ancestry. In vitro functional genomic studies identified a putatively functional variant, rs6913578, a SNP 1,440 bp downstream of rs2046210, which is in high LD with rs2046210 in Chinese and European-ancestry populations, but not in Africans. SNP rs6913578 had a stronger association with breast cancer risk in European-ancestry Americans than rs2046210, the SNP originally associated with breast cancer risk in a GWAS conducted in a Chinese population. In African Americans, the association of rs6913578 with breast cancer risk approached borderline significance after adjusting for rs2046210.
Genes located in the 1-Mb region centered on rs2046210 include PLEKHG1, MTHFD1L, AKAP12, ZBTB2, RMND1, C6orf211, C6orf97, ESR1, C6orf98, SYNE1, and NANOGP11. SNP rs2046210 is located 29 kb upstream of the first untranslated region of the ESR1 gene, 180 kb upstream of the transcription start site of its first exon (3, 23), and 6 kb downstream of the C6orf97 gene. Because of its relative proximity to the ESR1 gene and the biological function of ER-α, it is possible that SNP rs2046210, or SNPs in LD with it, may alter ESR1 gene expression and thereby affect susceptibility to breast cancer. A search of predicted TFBS using the “TFBS Conserved” track of the UCSC Genome Browser (22) indicated that no predicted TFBS overlaps this SNP. We further scanned for noncoding RNA (Evofold) and miRNA/snoRNA/scaRNA (sno/miRNA) in this region by using the UCSC Genome Browser and found that this SNP is not in the coding region for any noncoding RNA or miRNA/snoRNA/scaRNA. Our functional genomic analyses also provided no support for the potential functionality of rs2046210.
Our in vitro functional genomic experiments indicated that the potential functional SNPs may be located in a 2.3-kb region. Specifically, SNPs rs6913578, which is 1,440 bp downstream of rs2046210, and rs7763637, which is 947 bp downstream of rs2046210, altered luciferase reporter activity. These results suggest that these 2 common SNPs may influence DNA binding protein interactions and affect the expression of neighboring genes. We conducted EMSA to examine this hypothesis and confirmed that the C allele of rs6913578 significantly altered DNA-nuclear protein interaction. Thus, it is possible that nuclear protein(s) bind selectively and differentially to specific alleles of the rs6913578 polymorphic site, resulting in modification of the transcription of neighboring genes. However, there has been no confirmation to date that the putative transcription factors or their associated proteins are involved in the regulation of ESR1, C6orf97, or nearby genes. Interestingly, both rs6913578 and rs7763637 were associated with breast cancer risk among Chinese women and European-ancestry Americans (and the associations were stronger than that of rs2046210 in European-ancestry Americans), but not among African American women. However, after adjusting for rs2046210, the associations of rs6913578 and rs7763637 with breast cancer risk in African Americans approached borderline significance. Our data highlight the importance of conducting genetic association studies in populations with different LD structures to identify potential causal genetic variants for breast cancer and other complex diseases. Further studies will be required to determine causal SNPs related to breast cancer risk at the 6q25.1 locus.
In a recent study, Stacey and colleagues (24) reported an association of rs9397435 at the 6q25.1 locus with breast cancer risk in European, Chinese, and African populations. This SNP is located 2,854 bp downstream of rs2046210 and 1,414 bp downstream of rs6913578 and is only weakly correlated with rs2046210 in European (r2 = 0.087) and African (r2 = 0.039) populations. The risk allele frequency of this SNP in Asians is approximately 32%, comparable to that of rs2046210, but it is very low, only about 6.3%, among European and African populations. Very recently, Turnbull and colleagues (25) evaluated the association of the 6q25.1 locus with breast cancer risk in a GWAS conducted among 3,659 European-ancestry cases and 4,897 similar controls and identified SNP rs3757318 (MAF = 7%) as having the most significant association with breast cancer risk. This SNP is located approximately 200 kb upstream of ESR1 in an intron of the C6orf97 gene and 34,253 bp upstream of rs2046210. SNP rs3757318 is only weakly correlated with rs2046210 in Europeans (r2 = 0.088), whereas the correlation is stronger in Chinese populations (r2 = 0.48). Similarly, SNP rs3757318 is only weakly correlated with rs6913578 in European (r2 = 0.038) and Chinese populations (r2 = 0.181). Using imputed data from our GWAS, we showed that rs3757318 was associated with breast cancer risk with a per variant allele OR of 1.21 (P for trend = 5.4 × 10−4), an association that is not as strong as that of rs2046210. It is unclear, however, whether this SNP is functional or is related to breast cancer risk in women of African ancestry.
Results reported to date have clearly shown that GWAS findings cannot be applied uniformly across all ethnic groups. Several SNPs identified in GWAS conducted among women of European ancestry could not be replicated in Asian-ancestry women (26–30). In our study, we have shown that the strength of the association with rs2046210 varies considerably across ethnic groups. This is not surprising given that most, if not all, SNPs identified in GWAS are tagging SNPs, and there exist considerable differences in genetic architecture across ethnic groups. Fine-mapping studies are needed to identify additional genetic risk variants and/or causal variants for breast cancer.
Major strengths of our study are its large sample size and its ability to evaluate the consistency of the findings across multiple studies conducted in different locations and in populations with different ethnic ancestry. In addition, we conducted functional genomic studies of this locus to identify possible functional variants. Ancestry informative markers, however, were not adjusted for in this study. In addition, our in vitro functional experiments were conducted only on a 36-kb region and then were narrowed down to 4 common polymorphisms in a 2.3-kb region. It is possible that other functional SNPs, both common and rare, exist at this locus.
In summary, results from this large consortium study confirmed the association of rs2046210 with breast cancer risk among Chinese women, Japanese women, and European-ancestry Americans. SNP rs6913578 may be a functional SNP responsible for the observed association with breast cancer risk of SNPs at the 6q25.1 locus. Additional fine-scale mapping studies are needed to identify causal variants at this locus.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Acknowledgments
The authors thank study participants and research staff for their contributions and commitment to this project.
Grant Support
This research was supported by US NIH grant R01CA124558. The genotyping assays conducted at Vanderbilt University were done at the Survey and Biospecimen Core, which is supported in part by the Vanderbilt-Ingram Cancer Center (P30CA68485). Participating studies (Principal Investigator, grant support) of the consortium are as follows: the Shanghai Breast Cancer Study (W. Zheng, R01CA64277), the Nashville Breast Health Study (W. Zheng, R01CA100374), the Shanghai Breast Cancer Survival Study (X.O. Shu, R01CA118229), the Shanghai Endometrial Cancer Study (X.O. Shu, R01CA92585, contributed only controls to the consortium), the Tianjin Study (K. Chen, the National Natural Science Foundation of China Grant No. 30771844), the Nanjing Study (H. Shen, IRT0631, China), the Taiwan Biobank Study (C.-Y. Shen, DOH97–01), the Hong Kong Study (U.S. Khoo, Research Grant Council, Hong Kong SAR, China, HKU 7520/05M and 76730M), the Nagoya study (K. Tajima, Grants-in-Aid for Scientific Research on Priority Areas (17015052) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan; H. Tanaka, Grants-in-Aid for the Third Term Comprehensive Ten-Year Strategy for Cancer Control from the Ministry of Health, Labor and Welfare of Japan, H20–002), the Multiethnic Cohort Study (B. E. Henderson, CA63464; L. Kolonel, CA54281; and C.A. Haiman, CA132839), the Nagano Breast Cancer Study [S. Tsugane, Grants-in-Aid for the Third Term Comprehensive Ten-Year Strategy for Cancer Control from the Ministry of Health, Labor and Welfare of Japan, and for Scientific Research on Priority Areas (17015049) from the Ministry of Education, Culture, Sports, Science, and Technology of Japan], the Collaborative Breast Cancer Study including Massachusetts (K.M. Egan, R01CA47305), Wisconsin (P.A. Newcomb, R01 CA47147) and New Hampshire (L. Titus-Ernstoff, R01CA69664) centers, the Long Island Breast Cancer Study Project (M.D. Gammon, U01CA/ES66572; R.M. Santella, P30ES009089; J.A. Swenberg, P30ES010126), and the Southern Community Cohort Study (W.J. Blot, R01CA092447). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.