Abstract
GATA-binding protein 3 (GATA3) is a transcription factor and a putative tumor suppressor that is highly expressed in normal breast luminal epithelium and estrogen receptor α (ER)–positive breast tumors. We hypothesized that common genetic variation in GATA3 could influence breast carcinogenesis. Four tag single-nucleotide polymorphisms (SNP) in GATA3 and its 3′ flanking gene FLJ45983 were genotyped in two case-control studies in Norway and Poland (2,726 cases and 3,420 controls). Analyses of pooled data suggested a reduced risk of breast cancer associated with two intronic variants in GATA3 in linkage disequilibrium (rs3802604 in intron 3 and rs570613 in intron 4). Odds ratios (95% confidence intervals) for rs570613 heterozygotes and rare homozygotes versus common homozygotes were 0.85 (0.75-0.95) and 0.82 (0.69-0.96), respectively (Ptrend = 0.004). Stronger associations were observed for subjects with ER-negative tumors than for those with ER-positive tumors (Pheterogeneity = 0.01 for rs3802604; Pheterogeneity = 0.09 for rs570613). Although no individual SNPs were associated with ER-positive tumors, two haplotypes (GGTC in 2% of controls and AATT in 7% of controls) showed significant and consistent associations with increased risk for these tumors when compared with the common haplotype (GATT in 46% of controls): 1.71 (1.27-2.32) and 1.26 (1.03-1.54), respectively. In summary, data from two independent study populations showed two intronic variants in GATA3 associated with overall decreases in breast cancer risk and suggested heterogeneity of these associations by ER status. These differential associations are consistent with markedly different levels of GATA3 protein by ER status. Additional epidemiologic studies are needed to clarify these intriguing relationships. (Cancer Epidemiol Biomarkers Prev 2007;16(11):2269–75)
Introduction
GATA-binding protein 3 (GATA3) is a transcription factor that is highly expressed in normal breast luminal epithelium and in the luminal A tumor subtype that has been defined in studies of gene expression patterns (1-5). GATA3 expression correlates strongly with expression of estrogen receptor α (ER; refs. 6, 7), as well as with other genes thought to be important in breast luminal epithelial cell biology, including LIV1, RERG, and TFF3 (8). Luminal A tumors are associated with better prognosis than other tumor subtypes (luminal B, basal-like, and HER2+ subtypes; refs. 1, 2), the latter of which show little or no expression of GATA3. Building upon previous reports, including a meta-analysis of microarray studies of GATA3 transcripts (7), Mehra et al. (9) have shown that GATA3 protein expression in tumors predicted prognosis. These authors also confirmed that GATA3 levels are lower in tumors that are ER-negative and in those with high histologic grade. Parikh et al. have also shown that among ER-positive tumors, GATA3 protein expression seems to predict tumor estrogen responsiveness (10).
The mechanisms underlying the role of GATA3 in estrogen response and breast cancer prognosis are not clear. For example, GATA3 mRNA is down-regulated in normal breast xenograft samples after estradiol treatment (11), but estradiol-treated MCF-7 cells do not show altered expression of GATA3 mRNA (6). The discovery of somatic mutations in GATA3 in some ER-positive breast cancers, coupled with the observation that GATA3-transduced cells have greatly decreased proliferation rates in vitro, suggests that GATA3 may act as a tumor suppressor (12). We therefore hypothesized that common genetic variation could alter expression and/or function of GATA3 and thus play an etiologic role in breast cancer. To address this hypothesis, we carried out a comprehensive evaluation of common variation in GATA3 and evaluated its relation with breast cancer risk in two independent study populations from Norway and Poland.
Materials and Methods
Study Populations
Norwegian Breast Cancer Study. Breast cancer patients (n = 731) were enrolled in accordance with local institutional review board guidelines and comprised four previously described case series.
(a) Breast cancer patients sequentially enrolled at Ullevål University Hospital from 1990 to 1994, representing the breast cancer population during this period (13, 14). The mean age was 64 years (range, 28-92 years). Blood samples were collected in 1994 to 1996 from 119 patients who were still alive (∼80%), living in the Oslo area, and consented to give blood (∼70% of eligible women). Time from diagnosis to blood collection was 0 to 6 years.
(b) Breast cancer patients admitted to the Norwegian Radium Hospital in 1972 to 1991 (15). The mean age at diagnosis was 57 years (range, 27-94 years). Blood was drawn between 1987 and 1991 (n = 224), either during follow-up visits, relapse, or at diagnosis.
(c) Breast cancer patients diagnosed at the Norwegian Radium Hospital and treated with radiotherapy in 1975 to 1986. The mean age at diagnosis was 59 years (range, 26-75 years). Blood samples were collected in 1996 from a subset of patients alive at that time (21%) who were part of a treatment evaluation and agreed (82%) to provide a blood sample (n = 263; ref. 16).
(d) Breast cancer patients with stage I and stage II disease enrolled in the Oslo micrometastases study between 1995 and 1998 who had blood samples collected at the time of diagnosis (n = 125; refs. 17, 18). The average age at diagnosis was 56 years (range, 29-82 years). Blood was drawn from all patients included in the study just before primary surgery for their breast cancer.
The majority of control subjects (n = 1,015) were women with a negative mammogram from the Tromsø Mammography and Breast Cancer Study conducted in 2001 and 2002. About 70% of the women, 55 to 71 years of age, residing in the municipality of Tromsø in Norway and attending the Norwegian Breast Cancer Screening Program at the University Hospital of North Norway agreed to participate (19). The study was approved by the National Data Inspection Board and the Regional Committee for Medical Research Ethics. In addition, we included healthy women (n = 109), 55 to 72 years of age, participating in the Norwegian Breast Cancer Screening Program in Bergen in 1999, with two negative mammograms during a 2-year period (20). From this latter group, women using oral hormone replacement therapy or with a history of diabetes or other endocrine disorders were excluded. The mean (range) ages for cases (n = 731) and controls (n = 1,124) in the Norwegian study were 56 (26-93) and 62 (55-72) years, respectively.
Polish Breast Cancer Study. A population-based case-control study was conducted among women residing in two Polish cities, Warsaw and Lodz (21). Eligible cases were women ages 20 to 74 years who were newly diagnosed with either histologically or cytologically confirmed in situ or invasive breast cancer in 2000 to 2003. About 90% of cases were identified through a rapid identification system in participating hospitals, and the remainder through cancer registries, to ensure complete case ascertainment. Controls with no history of breast cancer were randomly selected from population lists during the case ascertainment period and were frequency matched to cases by city and age in 5-year categories. Institutional Review Board approval was obtained from all participating institutions, and signed informed consent was obtained from all respondents.
A total of 2,386 cases (79% of eligible cases) and 2,502 controls (69% of eligible controls) provided a personal interview on known and suspected risk factors. Blood samples for DNA extraction were obtained from 1,995 cases (84% of participating cases) and 2,296 controls (94% of participating controls). Most cases (94%) were diagnosed with invasive tumors. The mean (range) age was 56 (27-74) for cases and 56 (24-75) years for controls.
Genotyping. A resequence analysis of all exons, including the 5′ and 3′ untranslated regions and evolutionarily conserved intronic regions in GATA3, was done in 94 healthy Norwegian women and 102 individuals in the SNP500Cancer panel (23). Single-nucleotide polymorphisms (SNP) were selected using the haplotype-tagging program of Stram et al. (22) with r² of >0.80 and minor allele frequency of >0.05 and genotyped in the Norwegian and Polish studies. SNP selection included the closely positioned upstream neighboring gene FLJ45983 because of linkage disequilibrium observed with SNPs in GATA3, suggesting that variation in this gene might be involved in regulatory functions affecting GATA3. The function of this gene is unknown. Genotype analyses were done on blood DNA at the Core Genotyping Facility of the Division of Cancer Epidemiology and Genetics, National Cancer Institute for three SNPs in GATA3 (rs3802604, IVS4+1468G>A; rs570613, IVS4+401T>C; rs422628, IVS4-27C>T) and one SNP in the 3′ flanking gene FLJ45983 (rs1149901, Ex1-425G>A). Description and methods for each genotype assay can be found at http://snp500cancer.nci.nih.gov (23). Duplicated DNA pairs from 95 subjects in the Polish study showed >99% concordance for all assays except the one in intron 3 of GATA3 (rs3802604), which showed 98% concordance. Completion was ≥95% for all assays in the Polish study (5% of samples fell into the NTC cluster, whereas ≤1% were undetermined calls), and the percentage of completion was similar for cases and controls. Completion was ≥98% for all assays in the Norwegian study, except for rs570613, which had a completion of 92%. We did not observe significant departures from Hardy-Weinberg equilibrium for any of the SNPs evaluated in the Norwegian or Polish control populations.
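As a rough illustration of the Hardy-Weinberg check mentioned above, the following sketch applies a one-degree-of-freedom Pearson chi-square test to the rs570613 genotype counts among Norwegian controls given in Table 1. The function name `hwe_chi_square` is an assumption made for this example, and the text does not say which test was actually used, so this shows one common choice rather than the authors' procedure.

```python
import math


def hwe_chi_square(n_common_hom, n_het, n_rare_hom):
    """Pearson chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = n_common_hom + n_het + n_rare_hom
    p = (2 * n_common_hom + n_het) / (2 * n)  # common-allele frequency
    q = 1.0 - p                               # rare-allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_common_hom, n_het, n_rare_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square variable with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# rs570613 genotype counts (TT, CT, CC) among Norwegian controls, from Table 1
chi2, p_value = hwe_chi_square(320, 568, 211)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")
```

For these counts the statistic is roughly 2.1, below the 3.84 critical value for P = 0.05, which agrees with the reported absence of significant departures from equilibrium.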
Statistical Analyses. Odds ratios (OR) and their 95% confidence intervals (95% CI) were derived from unconditional logistic regression models adjusting for age in 5-year categories. The association between genotypes and breast cancer risk was tested using a trend test. Differences in mean age at diagnosis in cases across different genotypes were tested using a t test. Age-specific estimates of ORs (95% CI) for genotype-disease associations were obtained from logistic regression analyses with interaction terms between genotypes (assuming a linear trend) and age categories. The test for interaction between genotypes and age was done by including an interaction term in a logistic regression model, considering both variables as continuous. We evaluated heterogeneity in the genotype ORs by hormone receptor status in logistic regression models among cases, with receptor status as the outcome variable and genotypes as explanatory variables, adjusting for age and study. Case-control analyses were also done to estimate associations between genotypes and different tumor types. Heterogeneity of estimated ORs by study was tested by introducing an interaction term for genotype and study. Estimates obtained using pooled data from both studies were adjusted for study.
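To make the OR calculation concrete, the sketch below computes a crude (unadjusted) odds ratio with a Wald 95% CI from a 2 × 2 table, using the pooled rs570613 counts in Table 1 for heterozygotes (CT) versus common homozygotes (TT). It is only an approximation of the published analysis: the reported estimates come from logistic regression adjusted for age and study, which this example does not reproduce, and the function name `crude_odds_ratio` is hypothetical.

```python
import math


def crude_odds_ratio(case_exposed, control_exposed, case_ref, control_ref, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval (illustrative helper)."""
    odds_ratio = (case_exposed * control_ref) / (control_exposed * case_ref)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(
        1 / case_exposed + 1 / control_exposed + 1 / case_ref + 1 / control_ref
    )
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, ci_low, ci_high


# Pooled rs570613 counts from Table 1 (Norwegian + Polish study):
# CT: 335 + 893 cases and 568 + 1,075 controls
# TT: 256 + 724 cases and 320 + 776 controls
or_ct, ci_low, ci_high = crude_odds_ratio(335 + 893, 568 + 1075, 256 + 724, 320 + 776)
print(f"crude OR (CT vs TT) = {or_ct:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

The crude estimate comes out near 0.84 with an interval of roughly 0.75 to 0.94, close to the adjusted pooled value of 0.85 (0.75-0.95) in Table 1.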
Pairwise linkage disequilibrium was estimated between SNPs based on D′ and r² values using Haploview. Block structure was determined using genotype data from the control population and the solid spine of linkage disequilibrium option (D′ threshold > 0.80). Haplotype frequencies within each block, ORs, and their 95% CIs were estimated using HaploStats (version 1.2.1; ref. 24). A global score statistic, adjusted for the matching factors age (in 5-year categories) and study site (Lodz or Warsaw), was used to evaluate the overall difference in haplotype frequencies between cases and controls. Phylogenetic trees [neighbor-joining (ref. 25), nucleotide p distance] were constructed using MEGA 3.1 (26) to assess nucleotide similarity of different haplotypes.
Results
We observed similar linkage disequilibrium patterns between SNPs in the control populations of the two studies, although linkage disequilibrium between the SNP in the 3′ neighboring gene FLJ45983 and SNPs in GATA3 was weaker in the Polish than in the Norwegian study (Fig. 1). There was low correlation between SNPs, with the exception of intron 3 (rs3802604) and intron 4 (rs570613) of GATA3 (r² = 0.67 and 0.65 in the Norwegian and Polish populations, respectively). The allele frequencies among controls in the two study populations were similar: 0.26 and 0.28 in the Norwegian and Polish populations, respectively, for rs1149901; 0.40 and 0.37 for rs3802604; 0.45 and 0.40 for rs570613; and 0.26 and 0.24 for rs422628. None of these differences were statistically significant.
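The pairwise measures D′ and r² can be illustrated with a short calculation. The sketch below collapses the four-SNP haplotype frequencies among pooled controls (Table 3) onto rs3802604 and rs570613 and computes both statistics. It rests on several assumptions: the frequencies are rounded, rare haplotypes are omitted (so the values are renormalized), pooled rather than study-specific controls are used, and the function name `pairwise_ld` is hypothetical. It is not the Haploview computation itself.

```python
def pairwise_ld(f11, f12, f21, f22):
    """D' and r^2 from two-locus haplotype frequencies (illustrative helper).

    f11: allele 1 at both loci; f12: allele 1 then allele 2; and so on.
    """
    total = f11 + f12 + f21 + f22
    f11, f12, f21, f22 = (f / total for f in (f11, f12, f21, f22))  # renormalize
    p_a = f11 + f12            # frequency of allele 1 at the first locus
    p_b = f11 + f21            # frequency of allele 1 at the second locus
    d = f11 - p_a * p_b        # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Haplotype frequencies among pooled controls, from Table 3
# (SNP order: rs1149901, rs3802604, rs570613, rs422628)
control_haplotypes = {
    "GATT": 0.46, "GATC": 0.02, "GACT": 0.06, "GGTC": 0.02, "GGCT": 0.14,
    "GGCC": 0.01, "AGCC": 0.18, "AATT": 0.07, "AGCT": 0.02,
}

# Collapse onto positions 2 and 3 (rs3802604 and rs570613)
two_locus = {}
for haplotype, freq in control_haplotypes.items():
    key = haplotype[1:3]
    two_locus[key] = two_locus.get(key, 0.0) + freq

d_prime, r2 = pairwise_ld(two_locus["AT"], two_locus["AC"], two_locus["GT"], two_locus["GC"])
print(f"D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

With these rounded inputs the sketch gives D′ near 0.91 and r² near 0.69, in the neighborhood of the study-specific r² values of 0.67 and 0.65 reported above; exact agreement is not expected given the rounding and pooling.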
An SNP in intron 4 of GATA3 (rs570613) was associated with a significantly decreased risk of breast cancer in the Norwegian study (Table 1). Data from the Polish study were consistent with a reduction in risk; however, the association was weaker and not statistically significant. Pooled analyses adjusting for study and age showed a significant reduction in risk (Ptrend = 0.004), with no significant evidence for study heterogeneity (Table 1). Similar associations were observed for a linked SNP in intron 3 (rs3802604; D′/r² = 0.86/0.65 in Polish controls), although they were not statistically significant. Neither the SNP in exon 1 of FLJ45983 (rs1149901) nor the other SNP in intron 4 of GATA3 (rs422628) was significantly associated with breast cancer risk in the Norwegian or the Polish studies. We observed no significant differences in genotype frequencies among the different types of controls or cases in the Norwegian study (data not shown).
Table 1

| Gene / SNP | Genotype | Norwegian cases | Norwegian controls | Norwegian OR* (95% CI) | Polish cases | Polish controls | Polish OR* (95% CI) | Pheterogeneity | Pooled OR* (95% CI) |
|---|---|---|---|---|---|---|---|---|---|
| FLJ45983 | | | | | | | | | |
| rs1149901 (Ex1-425G>A) | GG | 385 | 609 | 1.00 | 960 | 1,113 | 1.00 | | 1.00 |
| | AG | 273 | 410 | 1.14 (0.86-1.50) | 785 | 903 | 1.01 (0.88-1.15) | | 1.03 (0.92-1.15) |
| | AA | 35 | 85 | 0.55 (0.29-1.04) | 151 | 158 | 1.11 (0.88-1.42) | 0.08 | 0.99 (0.80-1.22) |
| | Ptrend | | | 0.51 | | | 0.52 | | 0.81 |
| GATA3 | | | | | | | | | |
| rs3802604 (IVS4+1468G>A) | AA | 276 | 387 | 1.00 | 777 | 859 | 1.00 | | 1.00 |
| | AG | 320 | 536 | 0.82 (0.61-1.11) | 870 | 1,029 | 0.93 (0.82-1.06) | | 0.91 (0.81-1.03) |
| | GG | 85 | 175 | 0.78 (0.51-1.17) | 242 | 277 | 0.96 (0.79-1.17) | 0.14 | 0.90 (0.76-1.06) |
| | Ptrend | | | 0.16 | | | 0.46 | | 0.11 |
| rs570613 (IVS4+401T>C) | TT | 256 | 320 | 1.00 | 724 | 776 | 1.00 | | 1.00 |
| | CT | 335 | 568 | 0.72 (0.53-0.97) | 893 | 1,075 | 0.89 (0.78-1.02) | | 0.85 (0.75-0.95) |
| | CC | 104 | 211 | 0.62 (0.41-0.93) | 278 | 325 | 0.91 (0.75-1.10) | | 0.82 (0.69-0.96) |
| | Ptrend | | | 0.01 | | | 0.19 | 0.07 | 0.004 |
| rs422628 (IVS4-27C>T) | TT | 364 | 576 | 1.00 | 1,104 | 1,261 | 1.00 | | 1.00 |
| | CT | 239 | 396 | 1.09 (0.81-1.46) | 692 | 809 | 0.97 (0.85-1.11) | | 0.98 (0.88-1.10) |
| | CC | 30 | 77 | 0.71 (0.38-1.33) | 105 | 109 | 1.10 (0.83-1.45) | | 0.97 (0.77-1.24) |
| | Ptrend | | | 0.72 | | | 0.91 | 0.10 | 0.74 |
*Adjusted for age and, in pooled analyses, for study.
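The trend tests summarized in Table 1 can be approximated with a Cochran-Armitage test on the raw genotype counts. The sketch below applies it to the Polish rs570613 counts, scoring genotypes 0, 1, and 2 by number of variant alleles. This is an unadjusted test chosen for illustration (the published P values come from age-adjusted logistic regression models), and the function name `cochran_armitage_trend` is an assumption of this example.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage chi-square test for trend (1 df), unadjusted."""
    n_cases = sum(cases)
    n_controls = sum(controls)
    n_total = n_cases + n_controls
    col_totals = [r + c for r, c in zip(cases, controls)]
    sum_rs = sum(s * r for s, r in zip(scores, cases))
    sum_ns = sum(s * n for s, n in zip(scores, col_totals))
    sum_ns2 = sum(s * s * n for s, n in zip(scores, col_totals))
    numerator = n_total * (n_total * sum_rs - n_cases * sum_ns) ** 2
    denominator = n_cases * n_controls * (n_total * sum_ns2 - sum_ns ** 2)
    chi2 = numerator / denominator
    # Upper tail of a chi-square variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Polish study, rs570613 (TT, CT, CC), counts from Table 1
trend_chi2, trend_p = cochran_armitage_trend(cases=(724, 893, 278), controls=(776, 1075, 325))
print(f"chi2 = {trend_chi2:.2f}, Ptrend = {trend_p:.2f}")
```

For these counts the statistic is about 1.7 with a P value near 0.19, in line with the Ptrend reported for the Polish study in Table 1.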
We obtained information on ER and progesterone receptor status from diagnostic hospitals for 448 cases (58% of total) in Norway and 1,464 cases (73% of total) in Poland. The numbers of ER-negative cases were 178 in Norway and 505 in Poland. Case-only analyses showed that genotypes with the variant allele for the intron 3 SNP (rs3802604) in GATA3 were less common for cases with ER-negative tumors than ER-positive tumors in both populations (Supplementary Table S1). Although the association was only significant in the Polish study, there was no evidence of study heterogeneity, and the pooled ORs (95% CIs) for the association between genotypes and ER status among cases were 0.86 (0.70-1.06) and 0.65 (0.47-0.90) for heterozygous and homozygous variants compared with common homozygous (Ptrend = 0.01).
Pooled case-control analyses stratified by ER status showed a reduced risk for ER-negative tumors associated with the intron 3 polymorphism [pooled OR (95% CI), 0.86 (0.72-1.04) for heterozygotes and 0.72 (0.54-0.96) for homozygous variants; Ptrend = 0.02] and no association with ER-positive tumors (Ptrend = 0.80), a pattern that was consistent between the two study populations (Table 2). Similar associations were observed for the intron 4 SNP (rs570613) in GATA3; however, differences by ER status were not statistically significant.
Table 2

| Gene / SNP | Genotype | Controls | ER-positive cases | ER-positive OR* (95% CI) | ER-positive Ptrend | ER-negative cases | ER-negative OR* (95% CI) | ER-negative Ptrend | Pheterogeneity† |
|---|---|---|---|---|---|---|---|---|---|
| Norwegian study | | | | | | | | | |
| FLJ45983 | | | | | | | | | |
| rs1149901 (Ex1-425G>A) | GG | 609 | 141 | 1.00 | | 101 | 1.00 | | |
| | AG | 410 | 108 | 1.04 (0.71-1.54) | | 68 | 1.18 (0.72-1.93) | | |
| | AA | 85 | 11 | 0.54 (0.22-1.30) | 0.42 | 7 | 0.18 (0.02-1.36) | 0.43 | 0.42 |
| GATA3 | | | | | | | | | |
| rs3802604 (IVS4+1468G>A) | AA | 387 | 98 | 1.00 | | 73 | 1.00 | | |
| | AG | 536 | 126 | 0.86 (0.57-1.29) | | 85 | 0.77 (0.46-1.29) | | |
| | GG | 175 | 34 | 0.73 (0.40-1.31) | 0.26 | 18 | 0.58 (0.26-1.29) | 0.14 | 0.21 |
| rs570613 (IVS4+401T>C) | TT | 320 | 96 | 1.00 | | 68 | 1.00 | | |
| | CT | 568 | 128 | 0.69 (0.46-1.04) | | 85 | 0.77 (0.45-1.32) | | |
| | CC | 211 | 39 | 0.52 (0.29-0.92) | 0.02 | 23 | 0.64 (0.31-1.34) | 0.21 | 0.44 |
| rs422628 (IVS4-27C>T) | TT | 576 | 137 | 1.00 | | 92 | 1.00 | | |
| | CT | 396 | 89 | 0.91 (0.59-1.40) | | 60 | 1.12 (0.68-1.87) | | |
| | CC | 77 | 13 | 1.13 (0.54-2.36) | 0.99 | 8 | 0.21 (0.03-1.55) | 0.43 | 0.75 |
| Polish study | | | | | | | | | |
| FLJ45983 | | | | | | | | | |
| rs1149901 (Ex1-425G>A) | GG | 1,113 | 445 | 1.00 | | 255 | 1.00 | | |
| | AG | 903 | 405 | 1.12 (0.96-1.32) | | 183 | 0.88 (0.72-1.09) | | |
| | AA | 158 | 70 | 1.11 (0.82-1.50) | 0.20 | 41 | 1.13 (0.78-1.64) | 0.81 | 0.24 |
| GATA3 | | | | | | | | | |
| rs3802604 (IVS4+1468G>A) | AA | 859 | 360 | 1.00 | | 212 | 1.00 | | |
| | AG | 1,029 | 420 | 0.97 (0.82-1.15) | | 217 | 0.85 (0.69-1.05) | | |
| | GG | 277 | 131 | 1.13 (0.89-1.44) | 0.50 | 52 | 0.76 (0.55-1.06) | 0.05 | 0.02 |
| rs570613 (IVS4+401T>C) | TT | 776 | 340 | 1.00 | | 192 | 1.00 | | |
| | CT | 1,075 | 442 | 0.94 (0.79-1.11) | | 219 | 0.82 (0.66-1.02) | | |
| | CC | 325 | 139 | 0.98 (0.77-1.24) | 0.70 | 65 | 0.81 (0.59-1.10) | 0.08 | 0.15 |
| rs422628 (IVS4-27C>T) | TT | 1,261 | 525 | 1.00 | | 288 | 1.00 | | |
| | CT | 809 | 346 | 1.03 (0.87-1.21) | | 160 | 0.87 (0.70-1.07) | | |
| | CC | 109 | 53 | 1.17 (0.83-1.65) | 0.44 | 31 | 1.25 (0.82-1.89) | 0.76 | 0.75 |
| Pooled data | | | | | | | | | |
| FLJ45983 | | | | | | | | | |
| rs1149901 (Ex1-425G>A) | GG | 1,722 | 586 | 1.00 | | 356 | 1.00 | | |
| | AG | 1,313 | 513 | 1.13 (0.98-1.31) | | 251 | 0.93 (0.78-1.12) | | |
| | AA | 243 | 81 | 1.01 (0.77-1.33) | 0.28 | 48 | 1.02 (0.73-1.43) | 0.71 | 0.13 |
| GATA3 | | | | | | | | | |
| rs3802604 (IVS4+1468G>A) | AA | 1,246 | 458 | 1.00 | | 285 | 1.00 | | |
| | AG | 1,565 | 546 | 0.97 (0.84-1.13) | | 302 | 0.86 (0.72-1.04) | | |
| | GG | 452 | 165 | 1.05 (0.85-1.31) | 0.80 | 70 | 0.72 (0.54-0.96) | 0.02 | 0.01 |
| rs570613 (IVS4+401T>C) | TT | 1,096 | 436 | 1.00 | | 260 | 1.00 | | |
| | CT | 1,643 | 570 | 0.90 (0.78-1.05) | | 304 | 0.80 (0.66-0.97) | | |
| | CC | 536 | 178 | 0.89 (0.72-1.10) | 0.20 | 88 | 0.71 (0.54-0.94) | 0.006 | 0.09 |
| rs422628 (IVS4-27C>T) | TT | 1,837 | 662 | 1.00 | | 380 | 1.00 | | |
| | CT | 1,205 | 435 | 1.01 (0.88-1.17) | | 220 | 0.90 (0.75-1.09) | | |
| | CC | 186 | 66 | 1.12 (0.83-1.51) | 0.56 | 39 | 1.08 (0.75-1.58) | 0.67 | 0.30 |
*Adjusted for age and, in pooled analyses, for study.
†Based on case-only comparisons between genotype frequencies among ER-positive and ER-negative tumors (see Supplementary Table S1 for details).
Genotypes were not significantly associated with progesterone receptor status or progesterone receptor in combination with ER in analyses restricted to cases (Supplementary Tables S2 and S3). These analyses suggest that the genotype associations with breast cancer risk are not modified by progesterone receptor status. Data from the Polish study suggested the presence of interactions between the SNPs evaluated and age (Supplementary Table S4). Age-specific estimates showed the associations with reduced breast cancer risk to be limited to older women. Similarly, analyses by menopausal status in this study showed the inverse associations with risk for the three GATA3 SNPs to be limited to postmenopausal women (67.2% of controls). Specifically, the per allele ORs (95% CI) for premenopausal and postmenopausal women were, respectively, 1.14 (0.95-1.36) and 0.91 (0.81-1.01) for rs3802604 (Pinteraction = 0.034), 1.11 (0.93-1.33) and 0.88 (0.79-0.98) for rs570613 (Pinteraction = 0.026), and 1.24 (1.02-1.51) and 0.91 (0.81-1.04) for rs422628 (Pinteraction = 0.010).
These analyses could not be done in the Norwegian study because controls were 55 years or older. Case-only analyses in both study populations showed that the SNPs evaluated were significantly associated with age at diagnosis in the Polish study; however, associations were weaker and not statistically significant in the Norwegian population (data not shown). The associations with age at diagnosis in the Polish study remained significant after adjusting for ER status of the tumors (data not shown).
We observed nine haplotypes with a frequency >1% in the control populations (Table 3; Supplementary Table S5 for analyses by study). The most common haplotype (GATT) carried common alleles for all SNPs and was present in 46% of the pooled control population. Analyses of pooled data showed four haplotypes carrying both of the variants in introns 3 and 4 (rs3802604 and rs570613) that were individually associated with decreased breast cancer risk (GGCT in 14% of controls, GGCC in 1% of controls, AGCC in 18% of controls, and AGCT in 2% of controls). Compared with the common haplotype GATT, only two of these haplotypes (GGCT and AGCC) were associated with reductions in the risk for ER-negative tumors [0.71 (0.58-0.88) and 0.85 (0.72-1.01), respectively; Table 3]. We observed two haplotypes carrying only one of the two variants (GACT in 6% of controls and GGTC in 2% of controls), and neither was associated with a significant reduction in risk of ER-negative tumors. Interestingly, the haplotype carrying the variant in intron 3 (GGTC) and a haplotype with a variant in exon 1 of FLJ45983 (AATT in 7% of controls) were associated with significant increases in risk for ER-positive tumors [1.71 (1.27-2.32) and 1.26 (1.03-1.54), respectively; Table 3]. This increase in risk was consistently found in the Norwegian and Polish populations (Supplementary Table S5).
Table 3

| Haplotype*: rs1149901 | rs3802604 | rs570613 | rs422628 | Controls‡ | All cases‡ | Overall OR‡ (95% CI) | Overall P | ER-positive cases‡ | ER-positive OR§ (95% CI) | ER-positive P | ER-negative cases‡ | ER-negative OR§ (95% CI) | ER-negative P | Pheterogeneity† |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| G | A | T | T | 0.46 | 0.47 | 1.00 | | 0.45 | 1.00 | | 0.50 | 1.00 | | |
| . | . | . | C | 0.02 | 0.02 | 0.96 (0.72-1.30) | 0.81 | 0.02 | 0.88 (0.59-1.32) | 0.53 | 0.03 | 1.07 (0.69-1.67) | 0.76 | 0.59 |
| . | . | C | . | 0.06 | 0.06 | 0.89 (0.75-1.05) | 0.17 | 0.06 | 0.95 (0.76-1.19) | 0.67 | 0.06 | 0.89 (0.68-1.17) | 0.42 | 0.64 |
| . | G | . | C | 0.02 | 0.03 | 1.32 (1.01-1.71) | 0.04 | 0.03 | 1.71 (1.27-2.32) | 0.0005 | 0.02 | 1.07 (0.69-1.67) | 0.76 | 0.02 |
| . | G | C | . | 0.14 | 0.13 | 0.90 (0.79-1.01) | 0.08 | 0.14 | 1.01 (0.86-1.17) | 0.93 | 0.11 | 0.71 (0.58-0.88) | 0.002 | 0.002 |
| . | G | C | C | 0.01 | 0.01 | 0.90 (0.61-1.33) | 0.61 | 0.01 | 0.85 (0.50-1.43) | 0.54 | 0.02 | 1.06 (0.58-1.91) | 0.86 | 0.58 |
| A | G | C | C | 0.18 | 0.17 | 0.91 (0.82-1.01) | 0.07 | 0.17 | 0.97 (0.84-1.11) | 0.66 | 0.16 | 0.85 (0.72-1.01) | 0.07 | 0.10 |
| A | . | . | . | 0.07 | 0.08 | 1.13 (0.97-1.33) | 0.13 | 0.09 | 1.26 (1.03-1.54) | 0.02 | 0.08 | 1.00 (0.77-1.30) | 0.99 | 0.07 |
| A | G | C | . | 0.02 | 0.02 | 0.88 (0.61-1.28) | 0.51 | 0.01 | 0.90 (0.56-1.45) | 0.66 | 0.02 | 0.87 (0.49-1.56) | 0.65 | 0.94 |
| Rare haplotypes | | | | | | 1.04 (0.75-1.46) | 0.80 | | 1.29 (0.87-1.93) | 0.21 | | 0.59 (0.30-1.17) | 0.13 | 0.04 |
| P (global test) | | | | | | | 0.04 | | | 0.008 | | | 0.09 | |
* SNPs are in the same order as in Table 1. Nucleotide changes individually associated with reduced breast cancer risk are bolded. Haplotypes are sorted by similarity of nucleotide sequences according to phylogenetic trees (26). A dot indicates the same nucleotide as in the reference haplotype (first row).
† P value for a test of heterogeneity of haplotype ORs by ER status.
‡ Haplotype frequencies.
§ Adjusted by age, in addition to study in pooled analyses.
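For a rough sense of how a heterogeneity P value like those in the last column could arise, the following Python sketch compares two odds ratios on the log scale, recovering each standard error from its reported 95% CI. It is an illustration only and not the test used in this study: the function names are hypothetical, and the z-test assumes independent estimates, which does not hold here because the ER-positive and ER-negative analyses share the same controls. Its result should therefore not be expected to match the tabulated Pheterogeneity.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR), recovered from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def heterogeneity_z_test(or_a, ci_a, or_b, ci_b):
    """Two-sided z-test for a difference between two log odds ratios.

    Assumption (not from the paper): the two estimates are independent.
    """
    diff = math.log(or_a) - math.log(or_b)
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    z = diff / se
    # Two-sided normal tail probability: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# ORs and CIs taken from the ". G C ." haplotype row of the table:
# ER-positive 1.01 (0.86-1.17), ER-negative 0.71 (0.58-0.88).
z, p = heterogeneity_z_test(1.01, (0.86, 1.17), 0.71, (0.58, 0.88))
print(f"z = {z:.2f}, approximate two-sided P = {p:.4f}")
```

Because shared controls induce correlation between the two estimates, a case-only or polytomous regression comparison would be the more appropriate formal test; the sketch only shows the general logic.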
Discussion
This comprehensive evaluation of common genetic variation of GATA3 in two independent breast cancer studies in Norway and Poland showed consistent evidence for a differential association between common variation in GATA3 and specific tumor types defined by ER status. In particular, two SNPs in strong linkage disequilibrium located in intron 3 (rs3802604) and intron 4 (rs570613) of GATA3 and the two most common haplotypes carrying both of these variant alleles were associated with a decreased risk of ER-negative breast cancer. Two other haplotypes were associated with an increased risk of ER-positive tumors.
No previously published studies have evaluated the association between GATA3 polymorphisms and breast cancer risk. However, two tag SNPs in this report (rs570613 and rs3802604) were included in a genome-wide scan done in a total of 1,200 breast cancer cases and 1,200 controls from the Nurses' Health Study (NHS) under the Cancer Genetic Markers of Susceptibility project. The NHS data were consistent with a reduced risk of breast cancer associated with the homozygous variant genotype for rs570613 [OR (95% CI), 1.01 (0.86-1.22) for CT versus TT and 0.78 (0.60-1.00) for CC versus TT; calculated from genotype frequencies found at http://cgems.cancer.gov/data/], as found in our two study populations. However, the rs3802604 SNP was not associated with risk in the Nurses' Health Study (P2df adjusted = 0.72). None of the other tag SNPs in GATA3 genotyped in the Nurses' Health Study/Cancer Genetic Markers of Susceptibility project showed significant associations with breast cancer risk (P2df adjusted > 0.18).
The biology of GATA3 is complex and encompasses functions in stem cells during differentiation and in terminally differentiated cells (27). Kaufman et al. have shown by targeted disruption of GATA3 in mice that GATA3 is required for proper hair follicle stem cell function (28). In the immune system, GATA3 is required for multiple developmental decisions (29). Thus, if GATA3 signaling differs by developmental stage and cell type, then GATA3 SNPs might be expected to have different effects on the risk of different tumor subtypes that arise from different cell lineages. Heterogeneity in GATA3 signaling across different developmental lineages may underlie the etiologic heterogeneity with respect to ER status observed in this study.
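Genotype-specific ORs such as those quoted for the NHS data are typically obtained by comparing each variant genotype with the common homozygote in a 2 × 2 layout. The Python sketch below shows an unadjusted OR with a Wald 95% CI. The function name is hypothetical and the genotype counts are invented purely for illustration; they are not the NHS counts, and the sketch does not include the age and study adjustment used in the present analyses.

```python
import math


def odds_ratio_wald(case_var, case_ref, ctrl_var, ctrl_ref, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI for a variant genotype
    versus the reference genotype, from four cell counts."""
    or_ = (case_var * ctrl_ref) / (case_ref * ctrl_var)
    se = math.sqrt(1 / case_var + 1 / case_ref + 1 / ctrl_var + 1 / ctrl_ref)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper


# HYPOTHETICAL counts (cases, controls) per genotype; not real study data.
counts = {"TT": (200, 200), "CT": (220, 200), "CC": (60, 80)}

ref_cases, ref_ctrls = counts["TT"]
results = {}
for genotype in ("CT", "CC"):
    cases, ctrls = counts[genotype]
    results[genotype] = odds_ratio_wald(cases, ref_cases, ctrls, ref_ctrls)
    or_, lo, hi = results[genotype]
    print(f"{genotype} vs TT: OR = {or_:.2f} ({lo:.2f}-{hi:.2f})")
```

In practice, adjusted estimates would come from logistic regression with covariates rather than from raw cell counts.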
A differential association of GATA3 variants by ER status is consistent with previous observations about different roles of GATA3 in ER-positive and ER-negative breast tumor subtypes. GATA3 protein expression varies according to ER status (1, 2), predicts hormone therapy responsiveness (10), and predicts outcomes in ER-positive patients (7, 9). In human breast tissue, GATA3 and ER expression are highly correlated, with both proteins highly expressed in the cells that line the ducts (luminal epithelial cells). In addition, GATA3 and ER are known to transcriptionally regulate a number of genes in common (6, 30, 31), so ER may influence GATA3 and vice versa.
This study represents the first report of an association between GATA3 polymorphisms and breast cancer risk. The SNP in intron 3 (rs3802604) is found in a highly conserved region of GATA3. Conservation is often high in regions of DNA with important functional consequences, but functional data have not been reported for any of the GATA3 SNPs we investigated. The two intronic SNPs associated with risk were in linkage disequilibrium (D′ ∼ 0.90, r² ∼ 0.65), and thus, it is unclear whether one or both SNPs are responsible for the observed associations. Neither of the two haplotypes with only one of these SNPs was significantly associated with ER-negative tumors, suggesting that both variants might be needed; however, this could also be due to low power to detect significant associations for the less common haplotypes. It is also possible that one or both are in linkage disequilibrium with a protective allele not measured in our study. The observed associations between two haplotypes [one carrying the variant for the intron 3 (rs3802604) SNP] and increased risk of ER-positive tumors were unexpected based on individual SNP analyses and require confirmation in other study populations.
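The linkage disequilibrium measures quoted above (D′ and r²) can be computed from haplotype and allele frequencies at two biallelic loci. The Python sketch below uses the standard definitions. The function name is hypothetical, and the example input rests on an assumption that Table 1 would need to confirm: that the second and third positions of the haplotypes in the table correspond to rs3802604 and rs570613. The frequencies are the rounded control values from that table, with rare haplotypes omitted, so the output is only approximate.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying both variant alleles
    p_a, p_b: variant allele frequencies at locus 1 and locus 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # D is scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Rounded control frequencies from the haplotype table (ASSUMED positions 2 and 3):
# haplotypes with G at position 2 and C at position 3: 0.14 + 0.01 + 0.18 + 0.02
p_both = 0.14 + 0.01 + 0.18 + 0.02
p_g = p_both + 0.02  # plus the haplotype with G at position 2 only
p_c = p_both + 0.06  # plus the haplotype with C at position 3 only

d_prime, r2 = ld_stats(p_both, p_g, p_c)
print(f"D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

With these rounded inputs the sketch gives D′ near 0.9 and r² near 0.7, in the same range as the reported values; it is not a reproduction of the authors' calculation.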
The agreement between two large and independent study populations in this report and in NHS data online suggests validity of findings because potential biases that could explain associations are unlikely to be the same in both populations. The rate of participation in the Polish study is among the highest for population-based studies with collection of biological specimens. Although we cannot rule out selection bias, associations with most established risk factors for breast cancer were of expected direction and magnitude (21), indicating that selection bias is unlikely to be important. We had less knowledge regarding participation rates and distribution of breast cancer risk factors among the breast cancer cases in the Norwegian study; however, the findings were generally consistent between the study populations, particularly after stratification by ER tumor status. Both study populations were of homogeneous ethnic background, thus reducing the possibility of bias due to population stratification. Blood samples from most cases in the Norwegian study were collected some time after diagnosis, which raises the possibility of survival bias if the genotypes are related to survival.
Results from this report, together with the evidence that GATA3 plays a role as a tumor suppressor (12), suggest that common variation in GATA3 differentially affects the risk for developing breast cancer depending on the ER status of the tumors. Further epidemiologic studies aimed at clarifying this relation are warranted.
Grant support: NIH National Cancer Institute Breast Cancer Specialized Program of Research Excellence grants P50-CA58223 and R01-CA-101227-01 (C.M. Perou) and UNC Lineberger Cancer Control Education Program grant R25 CA57726 (M.A. Troester). The Polish Breast Cancer Study was supported by the Intramural Research Program of NIH National Cancer Institute Division of Cancer Epidemiology and Genetics and the Center for Cancer Research. The Tromsø Mammography and Breast Cancer Study was conducted in collaboration with Department of Clinical Research and the Department of Radiology, Center for Breast Imaging, University Hospital of North Norway; Norwegian Women and Cancer Study, University of Tromsø; and Cancer Registry of Norway. The study was supported by the Norwegian Cancer Society, Aakre Foundation, and Norwegian Women's Public Health Association. This work was also supported by the European Union's EU FP6 grant 502983, Research Council of Norway grant 155218/300, and Norwegian Cancer Society grant D 99061.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Note: Supplementary data for this article are available at Cancer Epidemiology Biomarkers and Prevention Online (http://cebp.aacrjournals.org/). M. Garcia-Closas and M.A. Troester contributed equally to the drafting of the manuscript.
Acknowledgments
We thank Drs. Neonila Szeszenia-Dabrowska (Nofer Institute of Occupational Medicine), Witold Zatonski (M. Sklodowska-Curie Institute of Oncology and Cancer Center), and Alicja Bardin-Mikolajczak (M. Sklodowska-Curie Institute of Oncology and Cancer Center) for their contribution to the design of the study and field work of the Polish Breast Cancer Study, Douglas Richesson (DCEG, National Cancer Institute) for his assistance on statistical analyses, Anita Soni (Westat) for her work on study management for the Polish Breast Cancer Study, Pei Chao (IMS) for her work on data and sample management, and physicians, nurses, interviewers, and study participants for their efforts during field work.