Association studies on susceptibility to breast cancer using single nucleotide polymorphisms (SNP) in the progesterone receptor (PGR) gene have been previously published, but the results have been inconclusive. We used a comprehensive SNP-tagging approach to search for low-penetrance susceptibility alleles in a study of up to 4,647 cases and 4,564 controls, in a two-stage study design. We identified seven tagging SNPs using genotype data from the National Institute of Environmental Health Sciences (NIEHS) Environmental Genome Project and typed these, and an additional three SNPs, in 2,345 breast cancer cases and 2,284 controls (set 1). Three SNPs showed no evidence for association and were not studied further, whereas seven SNPs (rs11571171, rs7116336, rs660149, rs10895068, rs500760, rs566351, and rs1042838) exhibited significant associations at P < 0.1 using either a heterogeneity or trend test and progressed to be genotyped in set 2. After both stages, only one SNP was significantly associated with an increased risk of breast cancer: the PGR-12 (rs1042838) V660L valine to leucine polymorphism [VL heterozygotes (odds ratio, 1.13; 95% confidence interval, 1.03-1.24) and the LL homozygotes (odds ratio, 1.30; 95% confidence interval, 0.98-1.73), Phet = 0.008, Ptrend = 0.002]. Similar estimates were obtained in a combined analysis of our data with those from three other published studies. We conclude that the 660L allele may be associated with a moderately increased risk of breast cancer, but that other common SNPs in the PGR gene are unlikely to be associated with a substantial risk of breast cancer. (Cancer Epidemiol Biomarkers Prev 2006;15(4):675–82)

Progesterone is a key steroid sex hormone in the orchestration of female sexual development and reproductive activity (1). Together with estrogen, it is important in the establishment and maintenance of pregnancy, pubescent mammary and epithelial development, and, through the stimulation of prolactin production (2), mammary ductal branching and lobuloalveolar differentiation during pregnancy (3).

The physiological actions of progesterone are mediated by the progesterone receptor (PGR), and many studies on the gene, located at 11q23, and its translated protein have investigated their possible roles in tumorigenesis. The progesterone ligand binds to its steroid hormone receptor (4), which then dimerizes. This complex works as a transcription factor, controlling the expression of downstream genes involved in mammary cell growth and differentiation. In addition, synthetic progestins, as well as the steroid hormones themselves, have been seen to lead to increased transcription of downstream oncogenic targets, such as c-myc and c-fos (5, 6).

Loss of heterozygosity at 11q22-qter has been frequently seen in cervical, ovarian, and breast cancers and has been associated with higher-grade tumors and a more aggressive disease course (7-9). This indicates the existence of a tumor suppressor gene within this region, for which PGR is a good candidate. Similar correlations have been seen between tumor invasiveness and low levels of hormone (10, 11) or receptor (12-14). The majority of breast cancers stain positively for both PGR and estrogen receptor; receptor positivity is predictive of response to tamoxifen and overall survival. More recently, microarray studies have found PGR-negative tissue to have high levels of transcripts of genes associated with cell proliferation (15).

The PGR gene is transcribed from two alternative promoters and translated into two different zinc-finger proteins, PR-A and PR-B. These differ by a 164-amino-acid NH2-terminal region present only in PR-B and known as the “B upstream segment” (16). PR-B is a potent transcriptional activator and contributes to the proliferative effects of estrogen, whereas PR-A, the shorter isoform, is necessary to oppose the effects of both PR-B and the estrogen receptor (17, 18).

The promoter region polymorphism +331 G>A has been reported to increase expression of the PR-B isoform and has been postulated to predispose women to breast cancer through increasing PR-B-dependent stimulation of mammary cell proliferation (19), although a more recent study failed to find any association between this polymorphism and breast cancer risk in postmenopausal women (20). Studies on endometrial cancer alone (21) and in combination with clear cell ovarian cancers (22) have also been inconclusive. Other work has associated the rare allele of this polymorphism with an increased likelihood of multiple failed embryo implantations during in vitro fertilization treatment (23).

The other commonly studied PGR polymorphic variants are PROGINS and V660L. These polymorphisms are in perfect linkage disequilibrium with one another (RP2=1.0), so their effects cannot be distinguished using genetic epidemiology. The PROGINS polymorphism consists of a 306 bp Alu insertion in the G intron of the PGR gene, which always occurs with the L allele of the V660L polymorphism. The insert-carrying allele exhibits higher mRNA stability and is transcribed to a more stable and transcriptionally active protein (24). The PROGINS insertion allele has been reported as inversely correlated with risk of breast cancer (25, 26), ovarian cancer (27), and endometriosis (28) in some populations, whereas in other studies, no association has been reported (29-31).

The V660L polymorphism results from a G>T substitution in exon 4 of the PGR gene. In the published studies undertaken to date, no significant association has been found between this polymorphism and risk of breast cancer (32-34). One study of ovarian cancer (35) also failed to find an association, whereas another has shown an association of the L allele with an increased risk (36). V660L has also been reported, along with S344T and H770L, to be associated with an increased likelihood of repeated miscarriage (37), suggesting that the resultant PGR protein does not function optimally. However, none of these effects can be attributed to PROGINS or to V660L specifically, as the two polymorphisms are in perfect linkage disequilibrium.

To evaluate whether there are common breast cancer susceptibility alleles in PGR, we have conducted a large case-control association study. We have used a comprehensive SNP tagging approach to identify and test SNPs that can evaluate the effect of all common SNPs in PGR.

Breast Cancer Case-Control Series

Cases were drawn from the SEARCH (Breast) Study, an ongoing population-based study with cases ascertained through the East Anglian Cancer Registry. All women diagnosed with invasive breast cancer under the age of 55 years between January 1, 1991, and June 30, 1996, and who were alive at the start of the study (prevalent cases), as well as women aged <70 years who were diagnosed from 1996 onward (incident cases), were eligible for inclusion. Approximately 65% of eligible patients have enrolled in the study. Women taking part in the study were asked to provide a 20 mL blood sample for DNA analysis and to complete a comprehensive epidemiological questionnaire. Eligible patients who did not take part in the study were similar to participants, except that, as might be expected, the proportion of clinical stage III/IV cases was somewhat higher in nonparticipants (Supplementary Table S1). Controls were randomly selected from the Norfolk component of the European Prospective Investigation of Cancer (EPIC; ref. 38). EPIC is a prospective study of diet and cancer being carried out in nine European countries. The EPIC-Norfolk cohort comprises 25,000 individuals resident in Norfolk, East Anglia, the same region from which the cases have been recruited. Controls are not matched to cases but are broadly similar in age, being aged 42 to 81 years. Ethical approval was obtained from the Eastern Multicentre Research Ethics Committee and informed consent was obtained from each patient.

To maximize efficiency, we used a two-stage study design (39-41) in which SNPs that are evidently not associated with breast cancer risk are dropped at the end of set 1. The staged approach substantially reduces genotyping costs without significantly affecting statistical power — a comparison is shown in Supplementary Table S2. We carried out genotyping on an initial subset (set 1) of the first 2,345 enrolled cases with invasive cancer and 2,284 EPIC controls. The geographical and ethnic background of cases and controls was very similar, with over 98% being of Anglo-Saxon ancestry. The cases were aged 25 to 73 years at diagnosis (mean, 50.2; SD, 7.8). The controls were aged 44 to 81 years at blood collection and 3 to 5 years after enrollment (mean, 65.2; SD, 7.6). It has been possible to determine menopausal status from the questionnaire data for 2,034 cases (87%) and of these, 1,292 were premenopausal and 742 were postmenopausal at diagnosis.

SNPs that exhibited a difference in genotype distribution between cases and controls that reached a predefined threshold of P < 0.1, using either a 2 degree of freedom (df) heterogeneity test (Phet) or a trend test (Ptrend), were further evaluated in a second subset of 2,302 cases from SEARCH and 2,280 controls from EPIC-Norfolk (set 2). All selection criteria were as for set 1. The set 2 cases at diagnosis were aged 23 to 70 years (mean, 53.3; SD, 9.3) and the controls were aged 43 to 81 years (mean, 62.3; SD, 8.6). Menopausal status has been determined for 1,930 set 2 cases (84%) and, of these, 1,563 were premenopausal and 367 were postmenopausal at diagnosis.

As there was no evidence for heterogeneity between set 1 and set 2, it was possible to combine the data for the two series.

Haplotype Block Definition and Tagging

Our principal hypothesis was that there are one or more SNPs in PGR that are associated with an increased or decreased risk of breast cancer. Thus, the aim of the SNP tagging approach was to identify a set of SNPs (stSNP) that efficiently tags all the known SNPs and is also expected to tag any unknown SNPs in the gene. The best measure of the extent to which one SNP tags another SNP is the pairwise correlation coefficient RP2 because the loss in power incurred by using a marker SNP in place of a true causal SNP is directly related to this value. We aimed to define a set of tagging SNPs such that all known common SNPs (minor allele frequency >0.05) had an estimated RP2 of >0.8 with at least one tagging SNP. However, some SNPs are poorly correlated with other single SNPs but may be efficiently tagged by multiple SNPs, thus reducing the number of tagging SNPs needed. As an alternative, we therefore aimed for the correlation between each SNP and a group of tagging SNPs (RS2) to be >0.8. We used the University of Washington NIEHS Environmental Genome Project SNPs Program PDR90 resequencing data to identify tagging SNPs.
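The relationship between RP2 and power can be illustrated with a minimal Python sketch. It assumes the usual definition of the pairwise squared correlation from haplotype and allele frequencies, and the standard approximation that the sample size must grow by a factor of 1/RP2 to keep the same power when a tag SNP is typed instead of the causal SNP. The function names and the example frequencies are hypothetical and are not taken from the study.

```python
def pairwise_r2(p_ab, p_a, p_b):
    """Squared correlation between alleles at two biallelic SNPs.

    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2.
    p_ab:     frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def sample_size_inflation(r2):
    """Approximate factor by which the sample must grow when a tag SNP
    with squared correlation r2 is typed in place of the causal SNP."""
    return 1.0 / r2


# Hypothetical frequencies, chosen only for illustration.
r2 = pairwise_r2(p_ab=0.18, p_a=0.20, p_b=0.25)
print(round(r2, 2), round(sample_size_inflation(0.8), 2))
```

Under this approximation, a tag at the 0.8 threshold used here corresponds to needing roughly 25% more subjects than direct typing of the causal variant.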

Two hundred forty-five SNPs were identified in the PGR gene, of which 81 were biallelic SNPs or insertion/deletion polymorphisms of <7 bp, with a minor allele frequency >5%.

The Graphical Overview of Linkage Disequilibrium package (42) was used to create a graphical summary of pairwise linkage disequilibrium patterns for the 81 eligible variants and, hence, to identify haplotype blocks (Fig. 1A).

Tagging SNPs were selected using the TagSNPs program (43). This program uses the partition-ligation expectation-maximization algorithm to estimate haplotype frequencies based on the full set of 81 SNPs. An RS2 value was obtained between every measured SNP and every possible set of stSNPs, where RS2 is the expected squared correlation between the observed genotype at the SNP and the genotype predicted on the basis of only the set of stSNPs. The optimal set of stSNPs was taken to be the smallest set that gave a minimum RS2 of >0.8.
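The idea of choosing a small set of SNPs that covers all the others can be sketched in Python. This is not the TagSNPs algorithm: it is a simplified greedy stand-in that works only with pairwise squared correlations, whereas TagSNPs evaluates the multi-marker RS2 from estimated haplotype frequencies. The function name and the toy correlation matrix are hypothetical.

```python
def greedy_pairwise_tags(r2, threshold=0.8):
    """Greedy pairwise tagging (a simplification, not TagSNPs itself).

    r2: square symmetric matrix (list of lists) of pairwise squared
        correlations between SNPs.
    Returns indices of tag SNPs such that every SNP is either a tag or
    has r2 > threshold with at least one tag.
    """
    n = len(r2)
    untagged = set(range(n))
    tags = []

    def covered_by(i):
        # SNPs not yet tagged that SNP i would tag (including itself).
        return {j for j in untagged if j == i or r2[i][j] > threshold}

    while untagged:
        # Pick the SNP that tags the largest number of untagged SNPs.
        best = max(range(n), key=lambda i: len(covered_by(i)))
        tags.append(best)
        untagged -= covered_by(best)
    return tags


# Toy matrix: SNP 0 tags SNPs 1 and 2; SNP 3 is a singleton.
toy_r2 = [
    [1.00, 0.90, 0.85, 0.10],
    [0.90, 1.00, 0.70, 0.05],
    [0.85, 0.70, 1.00, 0.20],
    [0.10, 0.05, 0.20, 1.00],
]
tags = greedy_pairwise_tags(toy_r2)
```

A greedy search is not guaranteed to find the smallest possible set, which is one reason an exhaustive or haplotype-based method may be preferred for a single gene.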

Figure 1.

A. Graphical Overview of Linkage Disequilibrium output—pairwise D′ values and RP2 for the 81 PGR SNPs with minor allele frequency >0.05 in the Environmental Genome Project data set, illustrating the single block of linkage disequilibrium. B. TagSNPs output for the eight stSNPs chosen in the Environmental Genome Project data set. 1, common/ancestral allele; 2, rare allele.


Using this design, and assuming a minimum RS2 of 0.8, this study had >85% power to detect, at a significance level of P < 0.0001, any dominant susceptibility allele with a frequency of ≥5% conferring a relative risk of at least 1.4, or a recessive allele with frequency ≥10% conferring a relative risk of at least 2.

Taqman Genotyping

Genotyping was done by 5′ nuclease assays (Taqman) using the ABI PRISM 7900HT Sequence Detection System according to the instructions of the manufacturer. Primers and probes were supplied directly by Applied Biosystems (Warrington, United Kingdom) as either Assays-by-Design or Assays-on-Demand (PGR-07, PGR-09, and PGR-11 only), the details of which, along with the reaction conditions, are shown in Supplementary Table S3. All assays were carried out in 384-well plate format, with each plate including negative controls (with no DNA) and positive controls duplicated on a separate quality control plate. Assays for which fewer than 98% of the duplicated samples gave identical genotypes were discarded. Failed genotypes were not repeated.

Statistical Methods

Deviation of genotype frequencies in controls from the Hardy-Weinberg equilibrium was assessed by a χ2 test with 1 df. The primary tests of association were univariate analyses for each of the stSNPs. Genotype frequencies in cases and controls were compared using a 2 df, χ2 test for heterogeneity (Phet) and a 1 df Cochran-Armitage χ2 test for trend in risk by allele dose (Ptrend). Genotype-specific risks were estimated as odds ratios (OR) using standard cross-product ratios, with confidence intervals (CI) calculated using the variance of the log (OR), estimated by the standard Taylor expansion.
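The tests described above can be reproduced from genotype counts with a short Python sketch. It is a minimal illustration using only the standard library, not the authors' software; the function names are hypothetical, and the χ2 P values use the closed forms that hold for 1 and 2 df. The example counts are the PGR-12 set 1 genotypes and the PGR-04 control genotypes from Table 1.

```python
import math


def p_chi2_1df(x):
    """Upper-tail P value of a chi-square statistic with 1 df."""
    return math.erfc(math.sqrt(x / 2))


def p_chi2_2df(x):
    """Upper-tail P value of a chi-square statistic with 2 df."""
    return math.exp(-x / 2)


def odds_ratio_ci(ref_controls, ref_cases, controls, cases, z=1.96):
    """Cross-product OR versus the reference genotype, with a CI from
    the Taylor-series (Woolf) variance of log(OR)."""
    odds_ratio = (cases * ref_controls) / (ref_cases * controls)
    se = math.sqrt(1 / cases + 1 / ref_controls + 1 / ref_cases + 1 / controls)
    return odds_ratio, odds_ratio * math.exp(-z * se), odds_ratio * math.exp(z * se)


def heterogeneity_test(controls, cases):
    """Pearson chi-square (2 df) on a 2 x 3 table of genotype counts."""
    n_controls, n_cases = sum(controls), sum(cases)
    total = n_controls + n_cases
    stat = 0.0
    for c, k in zip(controls, cases):
        for observed, row_total in ((c, n_controls), (k, n_cases)):
            expected = (c + k) * row_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, p_chi2_2df(stat)


def trend_test(controls, cases, scores=(0, 1, 2)):
    """Cochran-Armitage trend test (1 df) with allele-dose scores."""
    n_cases = sum(cases)
    total = n_cases + sum(controls)
    cols = [c + k for c, k in zip(controls, cases)]
    s_cases = sum(s * k for s, k in zip(scores, cases))
    s_all = sum(s * n for s, n in zip(scores, cols))
    s2_all = sum(s * s * n for s, n in zip(scores, cols))
    stat = (total * (total * s_cases - n_cases * s_all) ** 2
            / (n_cases * (total - n_cases) * (total * s2_all - s_all ** 2)))
    return stat, p_chi2_1df(stat)


def hwe_test(n_common_hom, n_het, n_rare_hom):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium."""
    observed = (n_common_hom, n_het, n_rare_hom)
    n = sum(observed)
    q = (n_het + 2 * n_rare_hom) / (2 * n)  # rare allele frequency
    p = 1 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, p_chi2_1df(stat)


# PGR-12, set 1 (Table 1): genotype counts in the order GG, GT, TT.
pgr12_controls, pgr12_cases = (1461, 513, 39), (1302, 517, 42)
or_gt = odds_ratio_ci(1461, 1302, 513, 517)
p_het = heterogeneity_test(pgr12_controls, pgr12_cases)[1]
p_trend = trend_test(pgr12_controls, pgr12_cases)[1]
# PGR-04 controls (Table 1): AA, AT, TT.
p_hwe = hwe_test(1908, 339, 25)[1]
```

With these inputs the sketch should agree, to the rounding shown in Table 1, with the reported heterozygote OR and CI for PGR-12 and with the reported P values.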

Likelihood ratio tests to compare models of recessive, codominant, and dominant modes of SNP action were done using binary logistic regression to assess the log likelihood of each model compared with a general model.
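A likelihood ratio comparison of this kind can be sketched in Python without a regression library. The sketch relies on the fact that a logistic model with only indicator covariates reproduces the observed case proportion in each group, so its maximized log likelihood has a closed form. It covers only the dominant and recessive models against the general (three-genotype) model; the codominant model needs an iterative fit and is omitted. Function names are hypothetical, and the example counts are the PGR-05 genotypes from Table 2.

```python
import math


def binomial_loglik(groups):
    """Maximized log likelihood when each group has its own case probability.

    groups: list of (controls, cases) pairs.
    """
    loglik = 0.0
    for controls, cases in groups:
        n = controls + cases
        if cases:
            loglik += cases * math.log(cases / n)
        if controls:
            loglik += controls * math.log(controls / n)
    return loglik


def lrt_vs_general(counts, model):
    """Likelihood ratio test (1 df) of a restricted model against the
    general model.

    counts: [(controls, cases)] for common homozygotes, heterozygotes,
            and rare homozygotes, in that order.
    model:  "dominant" pools heterozygotes with rare homozygotes;
            "recessive" pools heterozygotes with common homozygotes.
    """
    (c0, k0), (c1, k1), (c2, k2) = counts
    if model == "dominant":
        restricted = [(c0, k0), (c1 + c2, k1 + k2)]
    elif model == "recessive":
        restricted = [(c0 + c1, k0 + k1), (c2, k2)]
    else:
        raise ValueError("model must be 'dominant' or 'recessive'")
    stat = max(2 * (binomial_loglik(counts) - binomial_loglik(restricted)), 0.0)
    return stat, math.erfc(math.sqrt(stat / 2))


# PGR-05, set 1 and set 2 combined (Table 2): GG, GC, CC as (controls, cases).
pgr05 = [(2353, 2421), (1855, 1719), (334, 327)]
dominant_stat, dominant_p = lrt_vs_general(pgr05, "dominant")
recessive_stat, recessive_p = lrt_vs_general(pgr05, "recessive")
```

A small statistic means the restricted model fits about as well as the general model; a large one means the restriction is rejected.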

Tests for interaction between genotype and menopausal status were carried out in a case only design using a χ2 test with 2 df. Under the assumption that genotype is not related to exposure (menopausal status), this provides a more powerful test of interaction than a full case-control analysis.

We compared the common haplotype frequencies (>0.05) in cases and controls using the haploscore program (44), implemented in S-plus. Haploscore computes score statistics (and hence significance levels) to test for associations between individual haplotypes and disease status, along with a global score test of association.

For the V660L polymorphism, we pooled our results with those from other published studies for the same SNP. A Mantel-Haenszel test was used to evaluate the difference in genotype frequencies between cases and controls, stratified by study. Genotype-specific ORs were estimated using logistic regression, with an appropriate test for heterogeneity between studies.
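As an illustration of stratifying by study, the following Python sketch computes a Mantel-Haenszel pooled odds ratio for VL heterozygotes relative to VV homozygotes. It is a standard stratified estimator shown for orientation only; the ORs reported in this paper were estimated by logistic regression, so small differences are possible. The function name is hypothetical, and the counts are those listed for the three studies in Table 4.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio across strata.

    strata: iterable of tuples
        (exposed_cases, exposed_controls, unexposed_cases, unexposed_controls)
    """
    numerator = denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# VL (exposed) versus VV (unexposed) counts from Table 4.
vl_vs_vv = [
    (1176, 1312, 2925, 3700),  # All UK
    (387, 222, 1018, 552),     # Spurdle et al. (32)
    (252, 363, 1400, 2025),    # Pearce et al. (36)
]
pooled_or = mantel_haenszel_or(vl_vs_vv)
print(round(pooled_or, 2))
```

Each study contributes in proportion to its size, so the large UK series dominates the pooled estimate.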

PupaSNP Finder

PupaSNP (putative phenotypic alterations caused by SNPs) is a web-based tool used as a means of identifying potential phenotypic effects of SNPs at the level of transcription (45). The program uses submitted gene sequences or chromosomal coordinates to retrieve a list of SNPs that could affect conserved regions, such as intron/exon boundaries, exon splicing enhancers, and transcription factor binding sites. The SNP location data are based on the Ensembl genome browser map.

Defining Tagging SNPs

Two hundred forty-five variants were identified from the NIEHS Environmental Genome Project resequencing data. Of these, 81 were suitable for further evaluation. The Graphical Overview of Linkage Disequilibrium plot of D′ with these data showed a single block of linkage disequilibrium with no evidence for recombination hotspots (Fig. 1A). The TagSNPs program (43) was run on the set of 81 SNPs spanning the whole PGR gene region, and a set of eight stSNPs, which tagged the diversity of all the other common PGR SNPs, was chosen for further study (Fig. 1B). One of the tagging SNPs (PGR-01) could not be made into a successful Taqman assay.

An additional SNP in the promoter region, +331G>A (PGR-06), was also selected for analysis. Although its rare allele frequency in the NIEHS PDR90 sample population was lower than our 5% threshold, this polymorphism was included because there were previous reports of its association with an increased risk of both endometrial and breast cancer (19, 21). We also typed two additional SNPs that had been selected based on their genomic positions and rare allele frequencies before the availability of the NIEHS Environmental Genome Project resequencing data. These two SNPs were not present in the NIEHS data. Thus, a total of 10 SNPs were investigated in set 1.

Genotyping Set 1

The results of the set 1 genotyping are summarized in Table 1. There was no evidence for deviation of the genotype frequencies from Hardy-Weinberg equilibrium in controls, apart from PGR-04, where there was some evidence of an excess of rare homozygotes (P = 0.03 for deviation from Hardy-Weinberg equilibrium). Re-evaluation of the genotyping raw data showed nothing abnormal about the assay or genotype calls, and this seems likely to have been a chance finding. Seven of the SNPs (PGR-03, PGR-04, PGR-05, PGR-07, PGR-10, PGR-11, and PGR-12) exhibited possible evidence for an association at P < 0.1 using either Phet or Ptrend and, therefore, fitted our criteria for further evaluation.

Table 1.

Genotype frequencies and risks for the 10 SNPs genotyped in set 1

| Polymorphism | dbSNP reference | Rare allele frequency | Genotype | Controls | Cases | OR (95% CI) | P (genotype frequency) | P (trend) | P (HWE) |
|---|---|---|---|---|---|---|---|---|---|
| PGR-06 | rs10895068 | 0.06 | GG | 2,002 | 1,929 | 1.00* | 0.9 | 1.0 | 0.6 |
| | | | GA | 260 | 253 | 1.01 (0.84-1.21) | | | |
| | | | AA | | | 0.74 (0.23-2.34) | | | |
| PGR-09 | rs506487 | 0.35 | CC | 922 | 872 | 1.00* | 0.5 | 1.0 | 0.6 |
| | | | CT | 1,010 | 903 | 0.95 (0.83-1.08) | | | |
| | | | TT | 262 | 260 | 1.05 (0.86-1.28) | | | |
| PGR-11 | rs566351 | 0.37 | CC | 870 | 849 | 1.00* | 0.3 | 0.1 | 0.7 |
| | | | CT | 1,015 | 908 | 0.92 (0.80-1.04) | | | |
| | | | TT | 306 | 264 | 0.88 (0.73-1.07) | | | |
| PGR-03 | rs11571171 | 0.32 | TT | 1,035 | 1,049 | 1.00* | 0.2 | 0.1 | 1.0 |
| | | | TC | 990 | 907 | 0.90 (0.80-1.02) | | | |
| | | | CC | 237 | 215 | 0.90 (0.73-1.10) | | | |
| PGR-08 | rs578938 | 0.32 | AA | 660 | 651 | 1.00* | 0.4 | 0.3 | 0.4 |
| | | | AG | 603 | 587 | 0.99 (0.84-1.15) | | | |
| | | | GG | 152 | 125 | 0.83 (0.64-1.08) | | | |
| PGR-04 | rs7116336 | 0.09 | AA | 1,908 | 1,801 | 1.00* | 0.3 | 0.1 | 0.03 |
| | | | AT | 339 | 356 | 1.11 (0.95-1.31) | | | |
| | | | TT | 25 | 30 | 1.27 (0.74-2.17) | | | |
| PGR-05 | rs660149 | 0.29 | GG | 1,148 | 1,188 | 1.00* | 0.03 | 0.02 | 0.7 |
| | | | GC | 940 | 830 | 0.85 (0.75-0.97) | | | |
| | | | CC | 184 | 166 | 0.87 (0.70-1.09) | | | |
| PGR-12 | rs1042838 | 0.14 | GG | 1,461 | 1,302 | 1.00* | 0.2 | 0.07 | 0.4 |
| | | | GT | 513 | 517 | 1.13 (0.98-1.30) | | | |
| | | | TT | 39 | 42 | 1.21 (0.78-1.88) | | | |
| PGR-07 | rs492457 | 0.27 | AA | 1,181 | 1,153 | 1.00* | 0.2 | 0.1 | 0.9 |
| | | | AG | 851 | 754 | 0.91 (0.80-1.03) | | | |
| | | | GG | 155 | 132 | 0.87 (0.68-1.12) | | | |
| PGR-10 | rs500760 | 0.22 | AA | 1,311 | 1,182 | 1.00* | 0.3 | 0.1 | 0.4 |
| | | | AG | 766 | 763 | 1.10 (0.97-1.25) | | | |
| | | | GG | 102 | 100 | 1.10 (0.82-1.46) | | | |

Abbreviation: HWE, Hardy-Weinberg equilibrium.

*Reference group.

There was no evidence of a difference in haplotype frequencies between cases and controls (χ2 = 13.5, 11 df, P = 0.26) using the global score test of haploscore (44). We used the haploscore and the TagSNPs (43) programs to determine the haplotype arrangements of all 10 SNPs in our set 1 subjects (Fig. 2). Two SNPs, PGR-05 and PGR-07, were found to be in perfect linkage disequilibrium (RP2 = 1.0) with one another in the East Anglian population sample despite having tagged different haplotypes in the NIEHS Environmental Genome Project sample set. Thus, the redundant PGR-07 SNP was omitted from further investigation. We also selected PGR-06 (+331G>A) for evaluation in set 2 as it has been associated with breast cancer in other studies.

Figure 2.

The common haplotypes in set 1. 1, common/ancestral allele; 2, rare allele. Adjacent to each haplotype is the frequency in set 1 and below, the haplotype frequency in the Environmental Genome Project data (in italics). Frequencies have been derived from TagSNPs; however, the tree structure is assumed. n/a, we are unable to distinguish between these haplotypes in the Environmental Genome Project data as PGR-06 was not included in the tagging SNP selection process (minor allele frequency <0.05). The P values for test of difference in haplotype frequency between cases and controls, calculated using haploscore (in bold). A global test for difference in frequency of all 12 haplotypes gave χ2 = 13.5, 11 df, P = 0.26.


Genotyping Set 2

The results for the seven SNPs genotyped in both stages are presented in Table 2. At the end of both stages, only one SNP, PGR-12 (V660L), showed a significant association with breast cancer risk. Relative to the common VV homozygote, the VL heterozygotes had an OR for developing breast cancer of 1.10 (95% CI, 1.00-1.21) and the LL rare homozygotes had an OR of 1.24 (95% CI, 0.93-1.65), with Phet = 0.07 and Ptrend = 0.02.

Table 2.

Genotype frequencies and risks for the seven SNPs genotyped in the complete two-stage study

| Polymorphism | Genotype | Controls (set 1 and set 2) | Cases (set 1 and set 2) | OR (95% CI) | P (genotype frequency) | P (trend) |
|---|---|---|---|---|---|---|
| PGR-06 | GG | 4,005 | 3,960 | 1.00* | 0.8 | 0.6 |
| | GA | 529 | 506 | 0.97 (0.85-1.10) | | |
| | AA | 14 | 12 | 0.87 (0.40-1.88) | | |
| PGR-11 | CC | 1,779 | 1,754 | 1.00* | 0.6 | 0.4 |
| | CT | 2,056 | 1,941 | 0.96 (0.87-1.05) | | |
| | TT | 584 | 557 | 0.97 (0.85-1.11) | | |
| PGR-03 | TT | 2,119 | 2,153 | 1.00* | 0.2 | 0.2 |
| | TC | 1,962 | 1,847 | 0.93 (0.85-1.01) | | |
| | CC | 453 | 447 | 0.97 (0.84-1.12) | | |
| PGR-04 | AA | 3,759 | 3,558 | 1.00* | 0.3 | 0.2 |
| | AT | 735 | 755 | 1.09 (0.97-1.21) | | |
| | TT | 52 | 53 | 1.08 (0.73-1.58) | | |
| PGR-05 | GG | 2,353 | 2,421 | 1.00* | 0.06 | 0.07 |
| | GC | 1,855 | 1,719 | 0.90 (0.83-0.98) | | |
| | CC | 334 | 327 | 0.95 (0.81-1.12) | | |
| PGR-12 | GG | 3,070 | 2,878 | 1.00* | 0.07 | 0.02 |
| | GT | 1,119 | 1,154 | 1.10 (1.00-1.21) | | |
| | TT | 88 | 102 | 1.24 (0.93-1.65) | | |
| PGR-10 | AA | 2,662 | 2,503 | 1.00* | 0.2 | 0.09 |
| | AG | 1,572 | 1,612 | 1.09 (1.00-1.19) | | |
| | GG | 213 | 212 | 1.06 (0.87-1.29) | | |

*Reference group.

Due to the ongoing nature of the SEARCH sample collection, we had accrued an additional 69 cases and 834 controls on completion of both stages, which we then also genotyped for the V660L polymorphism. The resultant “All UK” data (including the additional samples) showed somewhat stronger evidence of an association (ORVL/VV, 1.13; 95% CI, 1.03-1.24; ORLL/VV, 1.30; 95% CI, 0.98-1.73; Phet = 0.008; Ptrend = 0.002; Table 3). There was no difference in genotype frequencies between prevalent and incident cases for V660L (P = 0.78; data not shown) and there is no association between V660L genotype and survival after diagnosis (P = 0.63; data not shown).

Table 3.

All UK data (set 1 + set 2 + additional samples) for PGR-12 (V660L)

| Polymorphism | Genotype | Controls (All UK) | Cases (All UK) | OR (95% CI) | P (genotype frequency) | P (trend) |
|---|---|---|---|---|---|---|
| PGR-12 | GG | 3,700 | 2,925 | 1.00* | 0.008 | 0.002 |
| | GT | 1,312 | 1,176 | 1.13 (1.03-1.24) | | |
| | TT | 99 | 102 | 1.30 (0.98-1.73) | | |

*Reference group.

For the other six SNPs that progressed to set 2, the final genotype distributions did not show statistically significant differences. The most significant of the remaining SNPs was PGR-05 (Ptrend = 0.07, Phet = 0.06), for which there was some suggestion of a lower risk associated with the rare C allele (Table 2).

A Meta-analysis of V660L

We did a combined analysis of the genotype frequencies associated with V660L using our own data and those from three published studies (refs. 32, 33, 36; Table 4). Since De Vivo et al. (33) analyzed carriers of the L allele as a single genotype, a full analysis of genotype-specific risks was only possible in the other three studies. The pattern of risks in the SEARCH, Spurdle et al. (32), and Pearce et al. (36) studies seemed somewhat different in that the SEARCH and Spurdle et al. studies showed evidence of a positive association between the L allele and breast cancer, whereas the Pearce et al. study did not. However, there was no significant evidence of heterogeneity in the estimated ORs between studies (homogeneity test χ2 = 4.81, 4 df). The combined analysis based on these three studies provided evidence for increased risks of breast cancer associated with the VL and LL genotypes (ORVL versus VV, 1.08; 95% CI, 1.00-1.16; ORLL versus VV, 1.17; 95% CI, 0.93-1.47), Phet = 0.17, Ptrend = 0.05. The ORs were indicative of a codominant (allele dosage) model with an estimated OR per L allele carried of 1.08 (95% CI, 1.01-1.15). If the De Vivo et al. (33) study is also included, the estimated OR associated with VL and LL genotypes combined was 1.09 (95% CI, 1.02-1.17), P = 0.009.

Table 4.

Genotype frequencies and risks for V660L in all subgroups of samples meta-analyzed by logistic regression

| PGR-12/V660L | Genotype | Controls | Cases | OR (95% CI) | P | Test of homogeneity of ORs |
|---|---|---|---|---|---|---|
| All UK | VV | 3,700 | 2,925 | 1.00* | | |
| | VL | 1,312 | 1,176 | 1.13 (1.03-1.24) | 0.005 | |
| | LL | 99 | 102 | 1.30 (0.98-1.73) | 0.05 | |
| | L dose | | | 1.15 (1.06-1.24) | 0.001 | |
| Spurdle et al. study (32) | VV | 552 | 1,018 | 1.00* | | |
| | VL | 222 | 387 | 0.95 (0.78-1.15) | 0.6 | |
| | LL | 19 | 47 | 1.34 (0.78-2.30) | 0.3 | |
| | L dose | | | 1.01 (0.86-1.19) | 0.9 | |
| Pearce et al. study (36) | VV | 2,025 | 1,400 | 1.00* | | |
| | VL | 363 | 252 | 1.00 (0.84-1.20) | 1.0 | |
| | LL | 37 | 15 | 0.59 (0.32-1.07) | 0.08 | |
| | L dose | | | 0.94 (0.80-1.09) | 0.4 | |
| All UK, (32, 36) | VL | | | 1.08 (1.00-1.16) | 0.05 | 0.14 |
| | LL | | | 1.17 (0.93-1.47) | 0.17 | 0.04 |
| | L dose | | | 1.08 (1.01-1.15) | 0.02 | 0.05 |
| De Vivo et al. study (33) | VV | 1,186 | 869 | 1.00* | | |
| | VL + LL | 474 | 383 | 1.10 (0.94-1.30) | 0.2 | |
| All UK, (32, 33, 36), carrier risk | VV | | | 1.00* | | |
| | VL + LL | | | 1.09 (1.02-1.17) | 0.01 | 0.2 |

*Reference group.

Several polymorphisms in the PGR gene have been previously examined for association with susceptibility to multiple cancers. However, these have generally been chosen in an ad hoc manner. In this study, by using a set of stSNPs, we have been able to interrogate the whole gene in an attempt to formally evaluate potential associations between all common variants or haplotypes and breast cancer risk. We genotyped 10 SNPs that define the 12 common haplotypes in PGR. Of these SNPs, only one showed an association with breast cancer that was significant at the 5% level, the PGR-12 (V660L) G>T polymorphism in exon 4. Four other published studies have examined the effects of V660L on breast cancer risk and, of the three with available data, two indicate an association of the L allele with increased risk (refs. 32, 33, 34, 36; Table 4). A combined analysis of the available studies suggests a codominant effect of the L allele, with no significant heterogeneity between studies. However, it should be emphasized that the estimated ORs are moderate in size and that, despite the size of the combined data set (10,648 cases and 7,915 controls), the level of significance (Ptrend = 0.002 in our data, Ptrend = 0.027 in the combined data) is such that the association could still be attributable to chance. Thus, further evaluation in a larger case-control series will be required to confirm or refute this finding. Contrary to the report by De Vivo et al. (33), there was no significant difference in genotype distribution between premenopausal and postmenopausal cases in our study (Supplementary Table S4), suggesting that this variant has a similar effect on risk in both groups.

Because of the danger of false positives due to multiple testing, we consider it unwise to attempt interaction and subgroup analyses until the main genetic effect is fully established (46) and so we have generally avoided this. It will be interesting to see if there are stronger subgroup associations in future studies.

In addition to V660L, we found some weak evidence that the PGR-05 polymorphism, an intronic polymorphism just upstream of V660L, is also associated with breast cancer risk. This polymorphism has a suggestive dominant protective effect in our SEARCH breast cancer cases (heterozygote risk OR, 0.90; 95% CI, 0.83-0.98; rare homozygote risk OR, 0.95; 95% CI, 0.81-1.12), but a larger sample size is needed to confirm this and, consequently, this polymorphism is worthy of further investigation in other populations. PGR-10 also showed some weak evidence of an association, but this may be explained by the fact that the rare PGR-10 allele is also present on the 660L haplotype. The P value for the test of difference in haplotype frequency between cases and controls, calculated using haploscore, was not significant.

The promoter region SNP, +331G>A (PGR-06), did not exhibit any significant differences in genotype distribution in set 1 or set 2 individually or when combined. A previous, smaller report had indicated that the rare allele was associated with a reduced risk in premenopausal women and an increased risk in the postmenopausal group (19). However, our data indicated no significant difference in genotype distribution between premenopausal and postmenopausal cases (Supplementary Table S2).

We have attempted a comprehensive SNP tagging study of the PGR gene. How certain can we be that we have evaluated all the common PGR SNPs and haplotypes? A cross-comparison of all 81 suitable SNPs identified in the NIEHS Environmental Genome Project analysis with our set of stSNPs showed 56 to be tagged on a pairwise basis with RP2 > 0.80, and a further 24 tagged by a multivariate RS2 > 0.79. The remaining SNP was the singleton SNP that failed assay design. This SNP, which has a minor allele frequency of 0.12 and is a nonsynonymous C>T polymorphism 1.8 kb upstream of the 5′ untranslated region, is unlikely to be functional and, given that no other SNPs define the same haplotype, it is improbable that a real association has been missed by not typing it. By further analysis of the Environmental Genome Project PDR90 data set, we could exclude 28 subjects who clearly carried African-specific alleles (P.D.P. Pharoah, personal communication), and there remained 73 suitable SNPs. The same total haplotype and SNP tagging was found in this “PDR62” data set with our chosen stSNPs. Our confidence in the adequacy of the tagging is reinforced by the fact that the two SNPs originally examined in a previous study, and here genotyped in addition to the tagging set, were both in perfect linkage disequilibrium with a member of the tagging set in our set 1 genotyping: PGR-07 and PGR-08 with PGR-05 and PGR-03, respectively. Thus, they contributed no additional information. Using the Environmental Genome Project data, the gene was treated as a single block of linkage disequilibrium for the purposes of SNP selection (Fig. 1); however, different data sets and different SNP search methods may lead to different block structures; for example, Pearce et al. (36) identified SNPs over a wider genomic region and treated their data as four linkage disequilibrium blocks.
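The pairwise tagging criterion referred to above can be illustrated with the standard squared-correlation measure of linkage disequilibrium between two biallelic SNPs. The sketch below assumes that phased haplotype frequencies are already known, which is a simplification: the tagging in this study was done from unphased genotype data with dedicated software. The function names and all frequencies are hypothetical.

```python
def pairwise_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying the minor allele at both SNPs
    p_a, p_b: minor allele frequencies at the first and second SNP
    """
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


def is_tagged(r2, threshold=0.8):
    """Apply a pairwise tagging threshold (0.8 here, mirroring the text)."""
    return r2 > threshold


# Hypothetical frequencies (NOT taken from the study data).
perfect = pairwise_r2(p_ab=0.2, p_a=0.2, p_b=0.2)    # alleles always co-occur
partial = pairwise_r2(p_ab=0.15, p_a=0.2, p_b=0.25)  # incomplete correlation

print(f"perfect LD: r2={perfect:.2f}, tagged={is_tagged(perfect)}")
print(f"partial LD: r2={partial:.2f}, tagged={is_tagged(partial)}")
```

Two SNPs in perfect linkage disequilibrium, such as those described above as contributing no additional information, have an r² of 1 under this measure.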

What is the maximum estimated disease risk associated with any of the common SNPs we have excluded from association with breast cancer? For all the SNPs studied, the maximum upper 95% CI for any OR was 1.21 for a heterozygote and 1.88 for a rare homozygote (Tables 2-4). Based on these upper confidence limits, the allele frequencies of the tagging SNPs and assuming an RP2 of 0.8, the maximum OR associated with any SNP is unlikely to be >1.3 in heterozygotes and 2.8 in homozygotes.

What could explain the association we are seeing with the V660L polymorphism? It is possible that the rare L allele could affect splicing. The PupaSNP web tool (45) indicates that the presence of the rare allele of V660L may lead to the loss of a cis-acting, DNA-binding SF2-type splicing enhancer site, defining the intron/exon boundary.

This could lead to abnormal RNA splicing and exon skipping. In vitro functional assays will be necessary to determine the exact mode of action of variants at this residue.

Alternatively, the effect could be steric. Investigation into the conserved domain structure of the PGR protein shows this V660L polymorphism to be in the hinge region between the central zinc finger DNA-binding domain and the HOLI ligand-binding domain of the three-dimensional structure. It is possible that this nonsynonymous change from valine to leucine, a structurally similar residue differing from the former by an extra methyl group, will cause sufficient steric interference to upset the tertiary structure between the progesterone-binding domain and the DNA-binding region. A subtle change in this structure may affect the manner in which the homodimerized hormone/receptor complex binds and controls transcription from the response elements of certain downstream genes involved in mammary cell growth.

Another explanation is that the effects we are seeing with V660L are due to another polymorphism in strong linkage disequilibrium and hence carried on the same haplotypes. The PROGINS Alu insertion is in perfect linkage disequilibrium with the leucine allele of V660L. Its associated increased expression of the PR-B isoform, which is more transcriptionally active, may lead to PR-B-dependent stimulation of mammary cell growth. Further analysis of the NIEHS individual genotype data identified 16 SNPs to be in perfect linkage disequilibrium with V660L (Rp2 = 1). Of these, however, only one, the nonsynonymous polymorphism S344T, is a likely functional mutation. It would have a similar steric effect to that proposed for V660L, but would affect the progesterone-binding region of the PGR protein, possibly causing suboptimal ligand binding.

It is also possible that we could be seeing a true haplotype effect due to the combined effects of multiple variants. Both Leu660 and Thr344 add an extra methyl group to the tertiary protein structure, but neither is predicted to have a particularly dramatic effect on its own. However, because these two methyl-adding alleles, as well as the PROGINS Alu insertion, are inherited together on the same haplotype, they may have a much greater effect on PGR function in combination than alone.

In summary, we have found evidence that the haplotype carrying the 660L allele is associated with a small but significant increase in the risk of breast cancer, and we showed that other PGR haplotypes are unlikely to be associated with a measurably different risk of the disease. Further epidemiologic studies are required to confirm the risk associated with V660L and to determine if the risks in certain subgroups of carriers are sufficiently large to warrant cancer-preventative intervention. Other approaches would be needed to evaluate the functional basis of this association.

Grant support: Cancer Research UK.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Supplementary data for this article are available at Cancer Epidemiology Biomarkers & Prevention Online (http://cebp.aacrjournals.org/).

Note: Current address for L. Tee: Birmingham University, Department of Medical and Molecular Genetics, Division of Paediatrics and Child Health, Birmingham Women's Hospital, Birmingham B15 2TG, United Kingdom. B.A.J. Ponder is a Cancer Research UK Gibb Fellow, P.D.P. Pharoah is a Cancer Research UK Senior Clinical Research Fellow, and D.F. Easton is a Cancer Research UK Principal Research Fellow.

We thank the SEARCH (Breast) Study team (Oluseun Ajai, Patricia Harrington, Hannah Munday, Barbara Perkins, Karen Redman and Mitul Shah) and the EPIC management team (Sheila Bingham, Nicholas Day, Kay-Tee Khaw, and Nick Wareham).

1. Lydon JP, DeMayo FJ, Funk CR, et al. Mice lacking progesterone receptor exhibit pleiotropic reproductive abnormalities. Genes Dev 1995;9:2266–78.
2. Westberg L, Ho HP, Baghaei F, et al. Polymorphisms in oestrogen and progesterone receptor genes: possible influence on prolactin levels in women. Clin Endocrinol (Oxf) 2004;61:216–23.
3. Cheon YP, Li Q, Xu X, et al. A genomic approach to identify novel progesterone receptor regulated pathways in the uterus during implantation. Mol Endocrinol 2002;16:2853–71.
4. Williams SP, Sigler PB. Atomic structure of progesterone complexed with its receptor. Nature 1998;393:392–6.
5. Musgrove EA, Lee CS, Sutherland RL. Progestins both stimulate and inhibit breast cancer cell cycle progression while increasing expression of transforming growth factor α, epidermal growth factor receptor, c-fos, and c-myc genes. Mol Cell Biol 1991;11:5032–43.
6. Deming SL, Nass SJ, Dickson RB, et al. C-myc amplification in breast cancer: a meta-analysis of its occurrence and prognostic relevance. Br J Cancer 2000;83:1688–95.
7. Gabra H, Taylor L, Cohen BB, et al. Chromosome 11 allele imbalance and clinicopathological correlates in ovarian tumours. Br J Cancer 1995;72:367–75.
8. Winqvist R, Hampton GM, Mannermaa A, et al. Loss of heterozygosity for chromosome 11 in primary human breast tumors is associated with poor survival after metastasis. Cancer Res 1995;55:2660–4.
9. Hampton GM, Mannermaa A, Winqvist R, et al. Loss of heterozygosity in sporadic human breast carcinoma: a common region between 11q22 and 11q23.3. Cancer Res 1994;54:4586–9.
10. Micheli A, Muti P, Secreto G, et al. Endogenous sex hormones and subsequent breast cancer in premenopausal women. Int J Cancer 2004;112:312–8.
11. Chappell PE, Lydon JP, Conneely OM, et al. Endocrine defects in mice carrying a null mutation for the progesterone receptor gene. Endocrinology 1997;138:4147–52.
12. Stierer M, Rosen H, Weber R, et al. Immunohistochemical and biochemical measurement of estrogen and progesterone receptors in primary breast cancer. Correlation of histopathology and prognostic factors. Ann Surg 1993;218:13–21.
13. Mohsin SK, Weiss H, Havighurst T, et al. Progesterone receptor by immunohistochemistry and clinical outcome in breast cancer: a validation study. Mod Pathol 2004;17:1545–54.
14. Bernoux A, de Cremoux P, Laine-Bidron C, et al. Estrogen receptor negative and progesterone receptor positive primary breast cancer: pathological characteristics and clinical outcome. Institut Curie Breast Cancer Study Group. Breast Cancer Res Treat 1998;49:219–25.
15. Nagai MA, Da Ros N, Neto MM, et al. Gene expression profiles in breast tumors regarding the presence or absence of estrogen and progesterone receptors. Int J Cancer 2004;111:892–9.
16. Sartorius CA, Melville MY, Hovland AR, et al. A third transactivation function (AF3) of human progesterone receptors located in the unique N-terminal segment of the B-isoform. Mol Endocrinol 1994;8:1347–60.
17. Conneely OM, Jericevic BM, Lydon JP. Progesterone receptors in mammary gland development and tumorigenesis. J Mammary Gland Biol Neoplasia 2003;8:205–14.
18. Mulac-Jericevic B, Lydon JP, DeMayo FJ, et al. Defective mammary gland morphogenesis in mice lacking the progesterone receptor B isoform. Proc Natl Acad Sci U S A 2003;100:9744–9.
19. De Vivo I, Hankinson SE, Colditz GA, et al. A functional polymorphism in the progesterone receptor gene is associated with an increase in breast cancer risk. Cancer Res 2003;63:5236–38.
20. Feigelson HS, Rodriguez C, Jacobs EJ, et al. No association between the progesterone receptor gene +331G>A polymorphism and breast cancer. Cancer Epidemiol Biomarkers Prev 2004;13:1084–5.
21. De Vivo I, Huggins GS, Hankinson SE, et al. A functional polymorphism in the promoter of the progesterone receptor gene associated with endometrial cancer risk. Proc Natl Acad Sci U S A 2002;99:12263–8.
22. Berchuck A, Schildkraut JM, Wenham RM, et al. Progesterone receptor promoter +331A polymorphism is associated with a reduced risk of endometrioid and clear cell ovarian cancers. Cancer Epidemiol Biomarkers Prev 2004;13:2141–7.
23. Cramer DW, Hornstein MD, McShane P, et al. Human progesterone receptor polymorphisms and implantation failure during in vitro fertilization. Am J Obstet Gynecol 2003;189:1085–92.
24. McKenna NJ, Kieback DG, Carney DN, et al. A germline TaqI restriction fragment length polymorphism in the progesterone receptor gene in ovarian carcinoma. Br J Cancer 1995;71:451–5.
25. Wang-Gohrke S, Chang-Claude J, Becher H, et al. Progesterone receptor gene polymorphism is associated with decreased risk for breast cancer by age 50. Cancer Res 2000;60:2348–50.
26. Dunning AM, Healey CS, Pharoah PD, et al. A systematic review of genetic polymorphisms and breast cancer risk. Cancer Epidemiol Biomarkers Prev 1999;8:843–54.
27. Rowe SM, Coughlan SJ, McKenna NJ, et al. Ovarian carcinoma-associated TaqI restriction fragment length polymorphism in intron G of the progesterone receptor gene is due to an Alu sequence insertion. Cancer Res 1995;55:2743–5.
28. Lattuada D, Somigliana E, Vigano P, et al. Genetics of endometriosis: a role for the progesterone receptor gene polymorphism PROGINS? Clin Endocrinol (Oxf) 2004;61:190–4.
29. Manolitsas TP, Englefield P, Eccles DM, et al. No association of a 306-bp insertion polymorphism in the progesterone receptor gene with ovarian and breast cancer. Br J Cancer 1997;75:1398–9.
30. Tong D, Fabjani G, Heinze G, et al. Analysis of the human progesterone receptor gene polymorphism progins in Austrian ovarian carcinoma patients. Int J Cancer 2001;95:394–7.
31. Fabjani G, Tong D, Czerwenka K, et al. Human progesterone receptor gene polymorphism PROGINS and risk for breast cancer in Austrian women. Breast Cancer Res Treat 2002;72:131–7.
32. Spurdle AB, Hopper JL, Chen X, et al. The progesterone receptor exon 4 Val660Leu G/T polymorphism and risk of breast cancer in Australian women. Cancer Epidemiol Biomarkers Prev 2002;11:439–43.
33. De Vivo I, Hankinson SE, Colditz GA, et al. The progesterone receptor Val660→Leu polymorphism and breast cancer risk. Breast Cancer Res 2004;6:R636–9.
34. Gold B, Kalush F, Bergeron J, et al. Estrogen receptor genotypes and haplotypes associated with breast cancer risk. Cancer Res 2004;64:8891–900.
35. Spurdle AB, Webb PM, Purdie DM, et al. No significant association between progesterone receptor exon 4 Val660Leu G/T polymorphism and risk of ovarian cancer. Carcinogenesis 2001;22:717–21.
36. Pearce CL, Hirschhorn JN, Wu AH, et al. Clarifying the PROGINS allele association in ovarian and breast cancer risk: a haplotype-based analysis. J Natl Cancer Inst 2005;97:51–9.
37. Schweikert A, Rau T, Berkholz A, et al. Association of progesterone receptor polymorphism with recurrent abortions. Eur J Obstet Gynecol Reprod Biol 2004;113:67–72.
38. Day N, Oakes S, Luben R, et al. EPIC-Norfolk: study design and characteristics of the cohort. European Prospective Investigation of Cancer. Br J Cancer 1999;80 Suppl 1:95–103.
39. Satagopan JM, Verbel DA, Venkatraman ES, et al. Two-stage designs for gene-disease association studies. Biometrics 2002;58:163–70.
40. Satagopan JM, Elston RC. Optimal two-stage genotyping in population-based association studies. Genet Epidemiol 2003;25:149–57.
41. Satagopan JM, Venkatraman ES, Begg CB. Two-stage designs for gene-disease association studies with sample size constraints. Biometrics 2004;60:589–97.
42. Abecasis GR, Cookson WO. GOLD--graphical overview of linkage disequilibrium. Bioinformatics 2000;16:182–3.
43. Stram DO, Haiman CA, Hirschhorn JN, et al. Choosing haplotype-tagging SNPS based on unphased genotype data using a preliminary sample of unrelated subjects with an example from the Multiethnic Cohort Study. Hum Hered 2003;55:27–36.
44. Schaid DJ, Rowland CM, Tines DE, et al. Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet 2002;70:425–34.
45. Conde L, Vaquerizas JM, Santoyo J, et al. PupaSNP Finder: a web tool for finding SNPs with putative effect at transcriptional level. Nucleic Acids Res 2004;32(Web Server issue):W242–8.
46. Pharoah PD, Dunning AM, Ponder BA, et al. The reliable identification of disease-gene associations. Cancer Epidemiol Biomarkers Prev 2005;14:1362.
