Abstract
Background: Family history is an established risk factor for breast cancer. Although some important genetic factors have been identified, the extent to which familial risk can be attributed to genetic factors versus common environment remains unclear.
Methods: We estimated the familial concordance and heritability of breast cancer among 21,054 monozygotic and 30,939 dizygotic female twin pairs from the Nordic Twin Study of Cancer, the largest twin study of cancer in the world. We accounted for left-censoring and right-censoring, as well as the competing risk of death.
Results: From 1943 through 2010, 3,933 twins were diagnosed with breast cancer. The cumulative lifetime incidence of breast cancer taking competing risk of death into account was 8.1% for both zygosities, although the cumulative risk for twins whose co-twins had breast cancer was 28% among monozygotic and 20% among dizygotic twins. The heritability of liability to breast cancer was 31% [95% confidence interval (CI), 10%–51%] and the common environmental component was 16% (95% CI, 10%–32%). For premenopausal breast cancer these estimates were 27% and 12%, respectively, and for postmenopausal breast cancer 22% and 16%, respectively. The relative contributions of genetic and environmental factors were constant between ages 50 and 96. Our results are compatible with the Peto–Mack hypothesis.
Conclusion: Our findings indicate that familial factors explain almost half of the variation in liability to develop breast cancer, and results were similar for pre- and postmenopausal breast cancer.
Impact: We estimate the heritability of breast cancer, taking into account sources of bias that have been ignored until now. Cancer Epidemiol Biomarkers Prev; 25(1); 145–50. ©2015 AACR.
Introduction
Although much is known about the causes of breast cancer, the role of genetic factors remains incompletely understood (1, 2). In 2000, Lichtenstein and colleagues reported that 27% of the variation underlying breast cancer liability in a Nordic twin cohort could be explained by genetic factors (3). Today, a number of specific genes have been identified that account for approximately 30% of the familial risk (4). The largest specific gene effects are associated with BRCA1/2, which account for about 16% of the total familial risk (5).
Traditional twin studies to date have largely ignored issues of censoring and competing risk of death in estimating the genetic contribution to risk (3, 6). This can lead to severe bias in the estimates for incidence, risk concordance, and heritability. In our analysis of the Nordic Twin Study of Cancer (NorTwinCan) data, we handle these issues by utilizing recently developed statistical methods (7). The purpose of this study is to conduct a refined investigation of the familial risk of breast cancer using the world's largest twin study of cancer. Our data have been expanded beyond those analyzed by Lichtenstein and colleagues (3) through the addition of the Norwegian twin cohorts and the inclusion of 14 to 16 additional years of follow-up for the Danish, Finnish, and Swedish twin cohorts. Besides estimating the familial risk of breast cancer in monozygotic and dizygotic twin pairs by determining the cumulative incidence and casewise concordance, we also estimate the heritability of risk and of liability for breast cancer, and examine age differences in the familial risk of developing breast cancer. Peto and Mack hypothesized a pattern of constant incidence over age for co-twins of twins diagnosed with breast cancer (1), which we investigate. Additionally, we estimate heritability for pre- and postmenopausal cancer separately. These results provide a useful reference for genome-wide association studies (GWAS) as well as for further studies on the etiology of breast cancer.
Materials and Methods
The twin cohorts
This study is based on the population-based NorTwinCan database, consisting of the Danish (8), Finnish (9), Norwegian (10), and Swedish (11, 12) twin registries combined with data from the national cancer and mortality registries. In this database each twin has a personal identity code, which includes information on sex and birth date, and enables combination of twin, cancer and mortality information. Characteristics of the four twin cohorts are summarized in Table 1. Zygosity in the twin registries was determined by validated questionnaires, which are known to classify more than 95% of pairs of twins correctly (13). We restrict genetic analyses in the present study to same-sexed female twin pairs with known zygosity (9–13) as opposite-sex twin pairs require different methods and have been studied specifically in a recent study (14). The ethical committees for each country approved the study.
Characteristic | Denmark | Finland | Norway | Sweden | Total |
---|---|---|---|---|---|
Birth cohorts | 1870–2004 | 1880–1957 | 1895–1979 | 1886–2008 | |
Cancer registration since | 1943 | 1953 | 1953 | 1958 | |
Initiation of follow-up | 1943 | 1975 | 1964 | 1961 | |
End of follow-up | 2009-12-31 | 2010-12-31 | 2008-12-31 | 2009-12-31 | |
N female twins | 33,339 | 12,507 | 12,824 | 45,869 | 104,539 |
N MZ/DZ complete pairs | 6,235/10,430 | 2,026/4,179 | 2,935/3,439 | 9,859/12,892 | 21,055/30,940 |
N MZ/DZ complete uncensored pairs | 1,251/2,332 | 403/792 | 195/232 | 1,781/3,238 | 3,630/6,594 |
N breast cancer cases | 1,229 | 635 | 358 | 1,711 | 3,933 |
N breast cancer in complete pairs | 1,229 | 630 | 352 | 1,700 | 3,911 |
N breast cancer in complete uncensored pairs | 695 | 221 | 91 | 907 | 1,914 |
N concordant uncensored MZ/DZ pairs | 44/48 | 15/23 | 13/5 | 52/65 | 124/141 |
N discordant uncensored MZ/DZ pairs | 170/341 | 39/106 | 18/37 | 243/430 | 470/914 |
N unaffected uncensored MZ/DZ pairs | 1,037/1,943 | 349/663 | 164/190 | 1,486/2,743 | 3,036/5,539 |
Statistical analyses
We followed the approaches used by Scheike and colleagues (15, 16), which extend classical methods of twin data analysis to correct for censoring and competing risk of death during follow-up when studying breast cancer risk in a twin, given breast cancer in the co-twin. We have previously described these methods in a study of prostate cancer (17). If neither censoring nor competing risk of death were present, our results would agree with those obtained from the standard quantitative genetic models of twin data (18, 19). Our modeling distinguished three possible outcomes. A twin could (i) be diagnosed with breast cancer, (ii) die before the end of follow-up without being diagnosed with breast cancer, or (iii) be censored, either by being lost to follow-up, mostly due to emigration (<2%), or by surviving without breast cancer until the end of follow-up. Times of entry and end of follow-up were defined for each country (Table 1). Both twins in a pair are followed for the same period and can hence be assumed to be censored at the same time, except if one twin emigrated, in which case we artificially censored the co-twin.
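The three outcomes above define a competing-risks setting. As a minimal sketch of how a nonparametric cumulative incidence can be computed in such a setting (an Aalen–Johansen-type estimator for a single cause), consider the following. The function name, the outcome coding (0 = censored, 1 = breast cancer, 2 = death without breast cancer), and the toy data are assumptions for illustration; delayed entry and the pairing of twins are ignored, so this is not the estimation procedure used in the study.

```python
from collections import Counter


def cumulative_incidence(times, causes, cause_of_interest=1):
    """Aalen-Johansen-type cumulative incidence for one cause.

    times  : age at exit for each subject
    causes : 0 = censored, 1 = breast cancer, 2 = death without breast cancer
             (hypothetical coding chosen for this sketch)
    Returns a list of (time, cumulative incidence) at each event time.
    Delayed entry (left truncation) is ignored here.
    """
    counts = Counter(zip(times, causes))
    at_risk = len(times)
    event_free = 1.0  # probability of no event of any cause just before t
    cif = 0.0
    curve = []
    for t in sorted(set(times)):
        d_interest = counts[(t, cause_of_interest)]
        d_all = sum(n for (tt, c), n in counts.items() if tt == t and c != 0)
        n_exit = sum(n for (tt, _), n in counts.items() if tt == t)
        if d_all:
            # Increment: event-free probability times cause-specific hazard.
            cif += event_free * d_interest / at_risk
            event_free *= 1.0 - d_all / at_risk
            curve.append((t, cif))
        at_risk -= n_exit
    return curve


# Toy data: five women with exit ages and outcomes.
toy_curve = cumulative_incidence([50, 60, 60, 70, 80], [1, 2, 0, 1, 0])
for age, risk in toy_curve:
    print(f"age {age}: cumulative incidence {risk:.2f}")
```

Treating deaths as censored observations instead (a Kaplan–Meier-style analysis) would overstate the cumulative incidence, because women who have died can no longer be diagnosed.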
We estimated the casewise concordances in monozygotic (MZ) and dizygotic (DZ) pairs, both overall and by age. This provides a measure of the risk of developing breast cancer conditional on the co-twin being diagnosed with breast cancer. As we have almost full ascertainment in this study, the casewise concordance will equal the probandwise concordance used in some other studies. Moreover, we calculated the multilocus index as a measure of nonadditive effects of multiple risk loci (20, 21). To investigate premenopausal cancer, we considered as cases only those with a diagnosis before age 50, whereas to investigate postmenopausal cancer, we considered as cases only those with a diagnosis at or after age 53, treating diagnosis at an earlier age as a competing risk. These ages were chosen because the median age of menopause has been estimated to be around 51 years, with some variation (22).
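For orientation, the classical casewise (probandwise) concordance, which ignores censoring and competing risks, can be computed directly from pair counts as 2C / (2C + D), where C is the number of concordant pairs and D the number of discordant pairs. The sketch below applies this naive formula to the complete uncensored pairs in Table 1 (Total column). The function name is an assumption, and this is not the adjusted estimator used in the study.

```python
def casewise_concordance(concordant_pairs, discordant_pairs):
    """Naive casewise concordance: P(twin affected | co-twin affected).

    Each concordant pair contributes two affected twins whose co-twin is
    affected; each discordant pair contributes one affected twin whose
    co-twin is unaffected.
    """
    affected_in_concordant = 2 * concordant_pairs
    return affected_in_concordant / (affected_in_concordant + discordant_pairs)


# Complete uncensored pairs, Total column of Table 1.
mz = casewise_concordance(124, 470)
dz = casewise_concordance(141, 914)
print(f"MZ: {mz:.3f}, DZ: {dz:.3f}")  # MZ: 0.345, DZ: 0.236
```

These naive values differ from the censoring-adjusted lifetime concordances reported in Table 2, which illustrates why the choice of estimator matters when follow-up is incomplete.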
The biometric ACE model was used to estimate additive genetic (A), common environment (C), and unique individual effects (E) explaining the variation in liability of breast cancer (18, 19). MZ twins are genetically identical at the sequence level; therefore, A effects are perfectly correlated in MZ pairs. In contrast, DZ twins are genetically as similar as siblings, which corresponds to a correlation of 0.5 for A effects. Common environmental effects (C) are assumed to be equal and to correlate 1.0 in all pairs regardless of zygosity. Hence, a higher concordance between MZ twins compared with DZ twins indicates a genetic effect. If the C effects that influence variance in the liability to develop breast cancer are more correlated among members of MZ than DZ pairs, this would lead to an overestimation of A, and if they were less correlated among members of MZ than DZ pairs, this would lead to an underestimation of A. The E component is assumed to be independent within twin pairs. In addition, using the Akaike information criterion, we checked whether alternative models that included a dominant genetic effect (D) instead of the additive genetic or common environment effect improved the fit of the model. In doing so, we took into account that models containing a D component without an A component would be biologically implausible, and hence mainly checked the fit of an ADE model.
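The correlation structure described above implies liability correlations of a² + c² within MZ pairs and 0.5·a² + c² within DZ pairs. As an illustrative sketch, the code below maps variance components to these implied correlations and back using the classical moment relations (often called Falconer's formulas). The function names are assumptions, and the study itself fits the model by likelihood methods with adjustment for censoring, not by these closed-form expressions.

```python
def implied_correlations(a2, c2):
    """Within-pair liability correlations implied by ACE components."""
    r_mz = a2 + c2        # A and C both correlate 1.0 in MZ pairs
    r_dz = 0.5 * a2 + c2  # A correlates 0.5, C correlates 1.0 in DZ pairs
    return r_mz, r_dz


def ace_from_correlations(r_mz, r_dz):
    """Invert the relations above (classical moment estimator)."""
    a2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return a2, c2, e2


# Example with components of 31% (A) and 16% (C).
r_mz, r_dz = implied_correlations(0.31, 0.16)
a2, c2, e2 = ace_from_correlations(r_mz, r_dz)
print(f"r_MZ={r_mz:.3f}, r_DZ={r_dz:.3f}, A={a2:.2f}, C={c2:.2f}, E={e2:.2f}")
```

The inversion makes the identifying assumption visible: A is driven entirely by the excess of the MZ over the DZ correlation, so any violation of the equal-environment assumption shifts variance between A and C.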
We used a liability threshold model to estimate the variance components, as well as the cumulative incidence and casewise concordance, both overall and by age. We tested for equal cumulative risk of cancer between MZ and DZ twins, and compared this risk with risk estimates calculated by the nonparametric Aalen–Johansen estimator (23). For these estimates we used the maximum age in the data set (slightly above 100 years) as the endpoint of the lifetime risk. We investigated if country or birth year should be added as covariates. As the age at diagnosis was used as the time scale in the time to event models, it was automatically taken into account in the nonparametric baseline hazard.
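To make the liability threshold idea concrete, the sketch below assumes that the liabilities of the two twins follow a standard bivariate normal distribution with correlation r and that a twin is affected when her liability exceeds a threshold set by the lifetime prevalence. Under these simplifying assumptions (no age dependence, no censoring, no competing risks), it computes the implied casewise concordance by numerical integration. The function name and the integration settings are illustrative choices, not part of the study's method.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def concordance_under_liability_model(prevalence, r, steps=4000, span=10.0):
    """P(twin affected | co-twin affected) for bivariate-normal liabilities.

    prevalence : lifetime risk, which fixes the liability threshold
    r          : within-pair liability correlation, strictly inside (-1, 1)
    steps      : number of Simpson intervals (must be even)
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    threshold = STD_NORMAL.inv_cdf(1.0 - prevalence)
    scale = math.sqrt(1.0 - r * r)

    def integrand(x):
        # Density of the co-twin's liability x, times the probability that
        # the twin's liability exceeds the threshold given x.
        return STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf((threshold - r * x) / scale))

    # Simpson's rule over [threshold, threshold + span]; the tail beyond
    # that range is negligible for a standard normal density.
    h = span / steps
    total = integrand(threshold) + integrand(threshold + span)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(threshold + i * h)
    both_affected = total * h / 3.0
    return both_affected / prevalence


for label, corr in [("independent", 0.0), ("r = 0.315", 0.315), ("r = 0.47", 0.47)]:
    print(label, round(concordance_under_liability_model(0.081, corr), 3))
```

With r = 0 the concordance equals the prevalence, and it rises with r. The values need not match the estimates in the paper, which come from a model that additionally handles age, censoring, and the competing risk of death.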
Furthermore, we investigated the pattern of breast cancer incidence in patients' relatives suggested by Peto and Mack (1) by plotting the cumulative hazard of breast cancer diagnosis stratified by age of onset in the co-twin, taking censoring and the competing risk of death into account.
Results
The NorTwinCan cohort comprises 104,539 women from same-sexed twin pairs of known zygosity of whom 3,933 were diagnosed with breast cancer through 2010 at the latest (Table 1). Restricting to complete pairs, there were 3,911 cases of breast cancer diagnosed among 42,110 MZ and 61,880 DZ twins, i.e., among 51,995 pairs. Of these 124 MZ and 141 DZ pairs were concordant for breast cancer.
The cumulative incidence of breast cancer over the life span was 8.1% and did not differ between MZ and DZ twins. Table 2 presents the lifetime risk of disease, casewise concordances for disease risk by zygosity, and the genetic and common environmental variance components underlying variation in breast cancer liability. These results reveal a considerably increased risk of breast cancer among women whose co-twin had breast cancer. This familial effect is substantially greater among MZ than DZ pairs.
Age at diagnosis | Lifetime risk (95% CI) | Casewise concordance, MZ (95% CI) | Casewise concordance, DZ (95% CI) | Common environment c² (95% CI) | Heritability h² (95% CI) |
---|---|---|---|---|---|
Any age | 8.1% (7.8%–8.5%) | 28% (23%–33%) | 20% (17%–24%) | 16% (10%–32%) | 31% (10%–52%) |
<50 years | 1.5% (1.2%–1.7%) | 10% (5%–17%) | 6% (3%–10%) | 12% (0%–39%) | 27% (0%–62%) |
≥53 years | 7.2% (6.6%–7.8%) | 21% (16%–27%) | 16% (13%–20%) | 16% (0%–34%) | 22% (0%–46%) |
Figure 1 shows the casewise concordance of breast cancer. At every age, breast cancer risk among DZ twins whose co-twin had breast cancer was higher than the overall cumulative incidence. Moreover, the breast cancer risk for an MZ twin whose co-twin was already diagnosed was 1.5 times higher than the corresponding concordance risk for DZ pairs. The relative recurrence risk was higher in MZ than in DZ pairs (Table 3). The multilocus index provided no clear indication of genetic heterogeneity for breast cancer, with estimates not significantly different from 2. The relative recurrence risk for both MZ and DZ twins was higher at younger ages, whereas the multilocus index was relatively stable by age.
Age (years) | Relative recurrence risk for MZ (95% CI) | Relative recurrence risk for DZ (95% CI) | Multilocus index (95% CI) |
---|---|---|---|
All | 2.16 (1.77–2.55) | 1.70 (1.41–1.99) | 1.64 (0.78–2.50) |
<50 | 5.91 (1.75–10.07) | 3.51 (1.02–6.00) | 1.96 (−0.59–4.51) |
50–60 | 4.93 (3.36–6.50) | 2.77 (1.83–3.71) | 2.21 (0.74–3.68) |
60–70 | 2.98 (2.27–3.69) | 2.24 (1.73–2.75) | 1.60 (0.74–2.46) |
70–80 | 2.50 (2.01–2.99) | 1.80 (1.45–2.15) | 1.87 (0.87–2.87) |
80–90 | 2.15 (1.76–2.54) | 1.67 (1.38–1.96) | 1.71 (0.77–2.65) |
90+ | 2.04 (1.65–2.43) | 1.68 (1.39–1.97) | 1.53 (0.65–2.41) |
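The text does not state the formula for the multilocus index. A common definition, assumed here, is the ratio of excess relative recurrence risks, (λ_MZ − 1) / (λ_DZ − 1), for which a value of 2 corresponds to purely additive effects. The sketch below applies this assumed definition to the point estimates in the table; the function name and row labels are illustrative.

```python
def multilocus_index(rrr_mz, rrr_dz):
    """Ratio of excess relative recurrence risks (assumed definition)."""
    return (rrr_mz - 1.0) / (rrr_dz - 1.0)


# (relative recurrence risk MZ, relative recurrence risk DZ, reported index)
table_rows = {
    "All": (2.16, 1.70, 1.64),
    "under 50": (5.91, 3.51, 1.96),
    "50-60": (4.93, 2.77, 2.21),
    "60-70": (2.98, 2.24, 1.60),
    "70-80": (2.50, 1.80, 1.87),
    "80-90": (2.15, 1.67, 1.71),
    "90+": (2.04, 1.68, 1.53),
}

for age_group, (mz, dz, reported) in table_rows.items():
    computed = multilocus_index(mz, dz)
    print(f"{age_group:>8}: computed {computed:.2f}, reported {reported:.2f}")
```

Under this assumption the computed values agree with the tabulated index to within the rounding of the published relative recurrence risks.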
Fitting the biometric models adjusting for censoring and competing risk, the ACE model fitted best; neither the A nor the C could be dropped without significantly worsening the fit. The ACE model indicated about 31% heritability and 16% common environment effect. Adding the year of birth or country as covariate did not change the ACE-heritability estimates.
Figure 2 shows the magnitude of the additive genetic and common environmental variance components of breast cancer liability by age at diagnosis derived from the biometric ACE modeling. It appears that the size of those components does not differ between 60 and almost 100 years of age.
Among concordant twin pairs, there were slight and nonsignificant zygosity differences in the time between the diagnosis of breast cancer in the first and second twin. The mean difference was 11.8 years (SE, 0.80) for concordant MZ pairs and 11.7 years (SE, 0.83) for concordant DZ pairs (P value, 0.92). Figure 3 shows the cumulative hazard of breast cancer dependent on the co-twin's age at diagnosis, stratified by zygosity. The figure shows that a twin tends to be diagnosed later if her co-twin was diagnosed at a higher age, for both MZ and DZ twins. Although this pattern is present for MZ as well as DZ twins, it is much stronger for MZ, with an early diagnosis in the co-twin being associated with earlier and higher hazard for the twin. The Peto–Mack hypothesis (1) would predict parallel curves for higher ages in Fig. 3, a pattern that our data do not contradict.
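The comparison of the mean time between diagnoses in concordant MZ and DZ pairs can be checked roughly from the reported means and standard errors. The text does not say which test produced the reported P value, so the sketch below assumes a two-sided z-test on the difference of two independent means; the function name is illustrative, and the inputs are the rounded published values.

```python
import math
from statistics import NormalDist


def two_sample_z_pvalue(mean1, se1, mean2, se2):
    """Two-sided P value for a difference of two independent means."""
    z = (mean1 - mean2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Mean years between diagnoses: MZ 11.8 (SE 0.80), DZ 11.7 (SE 0.83).
p_value = two_sample_z_pvalue(11.8, 0.80, 11.7, 0.83)
print(f"approximate P value: {p_value:.2f}")
```

Under this assumption the result is close to the reported value, which is what one would expect from a difference of 0.1 years against a combined standard error above 1 year.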
We investigated postmenopausal breast cancer separately and observed casewise concordances of 21% in MZ and 16% in DZ pairs. The common environment component was estimated to be 16% and the heritability 22% (Table 2). For premenopausal breast cancer, we observed a concordance of 10% in MZ and 6% in DZ pairs, resulting in a common environment estimate of 12% and a heritability of 27% (Table 2). Hence, premenopausal breast cancer seems to have a slightly higher genetic contribution and a slightly lower common environment contribution than postmenopausal breast cancer.
We also looked into metachronous bilateral breast cancer as described by Hartmann and colleagues (28), defined as two diagnoses of breast cancer in the same woman at least 3 months apart. In the NorTwinCan cohort there were five MZ twins with bilateral breast cancer, four of these concordant with a co-twin who had only one breast cancer diagnosis. There were three DZ twins with bilateral breast cancer, all from pairs discordant for any breast cancer.
Finally, the NorTwinCan cohort contains 98,841 men with known zygosity from same-sex twin pairs. Of those 17 were diagnosed with breast cancer during follow-up, consisting of 7 MZ and 10 DZ twins. None of those were concordant for breast cancer with their co-twin, and we have not conducted further analyses on the data from the male twins.
Discussion
Results from this large, population-based and virtually complete cohort of Nordic twins provide evidence that genetic differences between women explain a substantial portion of the variation (31%) in liability to develop breast cancer. Moreover, among DZ twins who are as genetically similar as siblings, the lifetime probability of developing breast cancer if the co-twin had cancer is around 20%, which is twice the lifetime risk in the general population. Our study also provides new insights regarding the importance of genetic effects across age of diagnosis. Specifically, the relative contribution of genetic factors was similar across the age groups, although the effect of common environment drops slightly in the older group.
There are several strengths of the current study, which extends the work by Lichtenstein and colleagues (3). Our study includes more than twice the number of pairs and breast cancer events, one additional country, new birth cohorts, and more than 10 years of additional follow-up. We accounted for varying follow-up time, problems of censoring, and competing causes of death (3). We estimated the genetic component of variation in liability to develop breast cancer to be 31% [95% confidence interval (CI), 10%–52%], which is similar to that reported by Lichtenstein and colleagues of 27% (95% CI, 4%–41%; ref. 3). However, our study provides novel insight into the variation across age for measures of risk and liability of breast cancer diagnosis. Namely, although the heritability estimates are similar, our incorporation of competing risk of death and censoring revealed that the lifetime cumulative incidence of breast cancer is lower (8.1%) than the 75-year incidences of 13% for MZ and 9% for DZ twins reported earlier (3).
Our analyses assume that the probability of screening among co-twins is independent of zygosity. If an MZ co-twin of a diagnosed twin were more likely to be screened than a DZ co-twin, the genetic component might be inflated due to overdiagnosis (29), but in that case we would have observed a higher incidence of breast cancer in MZ than in DZ twins. MZ twins had only a 0.3% higher incidence than DZ twins in all countries, and differential screening therefore seems unlikely. The lack of difference in incidence between MZ and DZ twins is consistent with the assumption that the causes (genetic and environmental) of breast cancer do not differ by twin type. Other uncontrolled cofactors, such as education or health behavior, could also influence the results if they differed by zygosity, but earlier studies indicate no large differences between MZ and DZ twins (30–33).
It has been reported previously that casewise concordance in DZ twins may increase with age (1). As expected, we detect a similar pattern in our data; however, taking prevalence by age into account, the relative recurrence risk is greatest among younger women and decreases slightly with age in MZ and DZ pairs, whereas the corresponding multilocus index appears to be stable across age.
Moreover, our investigation of age at diagnosis in relation to the co-twin's age at diagnosis shows an association for both MZ and DZ twins, with an earlier diagnosis in the co-twin being associated with an earlier diagnosis and higher incidence in the twin. Consistent with heritable risks, this pattern is much stronger in MZ than in DZ twins. Our results are consistent with the hypothesis of constant hazard at higher ages by Peto and Mack (1); however, they predicted the dependency on the co-twin's age at diagnosis only for DZ twins, whereas we find evidence of it for MZ twins as well.
Twin studies provide context for GWAS, which have identified multiple risk loci for breast cancer incidence (4). The multilocus index, being less than two and stable across age, suggests an additive effect of multiple genes. The index in breast cancer is lower than that observed in prostate cancer (17). Our estimates of total heritability make it possible to determine the extent to which breast cancer variability is explained by risk loci known today. The concept of missing heritability has been proposed to describe the discrepancy between the variance in cancer associated with identified genetic loci and total heritability (34, 35). As only 30% of the familial risk can be explained by known genes (4), a large amount of the heritable risk found in this study remains unexplained by known genetic risk factors. A recent study on height and body mass index indicates that much of the missing heritability in those cases can be explained by a large number of small genetic effects not detected as significant in current GWAS (36); a similar pattern could explain the missing heritability of breast cancer.
In summary, heritability does not provide an estimate directly translatable to public health policy (37). It does, however, provide insight into individual differences in susceptibility to develop breast cancer, framing results from GWAS and missing heritability. Estimates of the shared risk of breast cancer between MZ twins, who have identical genomes, provide an upper limit of the potential for genotyping and whole-genome sequencing to classify individuals' risk (35, 37, 38). Finally, casewise concordance among DZ twins can help us to assess the cancer risk of first degree relatives in families affected by breast cancer.
Disclosure of Potential Conflicts of Interest
J.R. Harris is a consultant at National Institute on Aging, NIH. No potential conflicts of interest were disclosed by the other authors.
Authors' Contributions
Conception and design: L.A. Mucci, J.R. Harris, H.-O. Adami, K. Christensen, N.V. Holm, E. Pukkala, J. Kaprio, J.B. Hjelmborg
Development of methodology: S. Möller, T. Scheike, K. Holst, E. Pukkala, J.B. Hjelmborg
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): J.R. Harris, K. Czene, K. Christensen, N.V. Holm, E. Pukkala, A. Skytthe, J. Kaprio
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): S. Möller, T. Scheike, K. Holst, U. Halekoh, H.-O. Adami, N.V. Holm, E. Pukkala, J. Kaprio, J.B. Hjelmborg
Writing, review, and/or revision of the manuscript: S. Möller, L.A. Mucci, J.R. Harris, T. Scheike, K. Holst, U. Halekoh, H.-O. Adami, K. Czene, K. Christensen, N.V. Holm, E. Pukkala, A. Skytthe, J. Kaprio, J.B. Hjelmborg
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): K. Czene, K. Christensen, N.V. Holm, A. Skytthe, J.B. Hjelmborg
Study supervision: L.A. Mucci, J.B. Hjelmborg
Acknowledgments
The authors are thankful to The Danish Twin Registry for hosting and managing the joint Nordic twin data.
Grant Support
This work was supported by funding from the Ellison Foundation to Harvard School of Public Health (PI: L.A. Mucci and H.-O. Adami) and the Nordic Cancer Union (PI: J. Kaprio). The Finnish Twin Cohort was supported by the Academy of Finland (grants # 213506, 129680, 265240, and 263278), U.S. BioSHaRE-EU, grant agreement HEALTH-F4-2010-261433. L.A. Mucci is a Prostate Cancer Foundation Young Investigator and H.-O. Adami has a Distinguished Professor Award at Karolinska Institutet (Dnr: 2368/10-221). The Danish Twin Cohort was supported by the Odense University Hospital AgeCare program (Academy of Geriatric Cancer Research). The Ministry for Higher Education financially supports the Swedish Twin Registry.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.