Abstract
The recent release of version 2.0-8 of the BayesMendel package contains an updated BRCAPRO risk prediction model, which includes revised modeling of contralateral breast cancer (CBC) penetrance, provisions for pedigrees of mixed ethnicity and an adjustment for mastectomies among family members. We estimated penetrance functions for CBC by a combination of parametric survival modeling of literature data and deconvolution of SEER9 data. We then validated the resulting updated model of CBC in BRCAPRO by comparing it with the previous release (BayesMendel 2.0-7), using pedigrees from the Cancer Genetics Network (CGN) Model Validation Study. Version 2.0-8 of BRCAPRO discriminates BRCA1/BRCA2 carriers from noncarriers with similar accuracy compared with the previous version (increase in AUC, 0.0043), is slightly more precise in terms of the root-mean-square error (decrease in RMSE, 0.0108), and it significantly improves calibration (ratio of observed to expected events of 0.9765 in version 2.0-8, compared with 0.8910 in version 2.0-7). We recommend that the new version be used in clinical counseling, particularly in settings where families with CBC are common. Cancer Epidemiol Biomarkers Prev; 23(8); 1689–95. ©2014 AACR.
Introduction
The BRCA1 and BRCA2 genes help explain about 5% to 10% of breast cancer cases (1, 2). About 12% of women in the general population will develop breast cancer sometime during their lives (3). In contrast, according to recent estimates, 55% to 65% of women who inherit a harmful BRCA1 mutation and around 45% of women who inherit a harmful BRCA2 mutation will develop breast cancer by age 70 years (4, 5). Models to predict carrier status are commonly used clinically. Among these, BRCAPRO is the most commonly used, and consistently ranks among the best performing in validation studies (6). BRCAPRO is continually improved based upon current literature (7–11) and has been incorporated into software used by clinicians, researchers, and software developers (12–15). The most recent version of BRCAPRO, included in the BayesMendel 2.0-8 package (16), provides updated estimates for contralateral breast cancer (CBC) penetrance, can handle pedigrees that include multiple ethnicities, and adjusts for mastectomies among any of the relatives. CBC occurrence was previously estimated via the simplifying assumption that the first and second diagnoses were independent. Families with CBC had been previously identified as one of the subgroups in which predictions provided by BRCAPRO are less accurate (17). Recent literature provides further evidence of a strong dependence between first and second diagnoses (18–20) as well as explicit estimates of the cumulative incidence of CBC among carriers (21). Before distributing the upgraded model, we validated it using data from the Cancer Genetics Network (CGN) Model Validation Study (6).
Materials and Methods
Validation data
The CGN Model Validation Study comprises pedigrees collected in eight high-risk counseling clinics that well represent the populations within which the BRCAPRO model is more commonly applied. The study is described in detail elsewhere (6). For reproducibility, we created an R package that reads in the raw data provided by the eight centers, performs any necessary preprocessing, and reproduces the BRCAPRO risks evaluated independently at the sites using independent software. Across the eight sites, 2,089 pedigrees were provided, of which 2,038 ran without error through both versions of BRCAPRO and were, thus, eligible for a head-to-head model comparison. Because we do not have separate information on ethnicity of individual family members, nor information on mastectomies for these individuals, we focus on improvements resulting from the updated CBC penetrances. We compare the ability of the models to (i) discriminate carriers from noncarriers using the area under the ROC curve (AUC; refs. 22, 23), (ii) increase precision of estimated carrier probabilities using the root-mean-square error of prediction (for RMSE see refs. 22, 23), and (iii) estimate the overall number of observed carriers using the observed-to-expected ratio (OE; refs. 22, 23). For each metric, we evaluate the estimates, their 95% bias-corrected accelerated (BCa) bootstrap confidence intervals (CI), the estimates of the differences between the two versions of BRCAPRO, and their corresponding 95% BCa bootstrap CIs. We assessed overall performance among the 2,038 pedigrees and considered the following two subgroups: 322 families with a CBC diagnosis in any of the relatives (subgroup 1); 155 families in which the proband is diagnosed with CBC (subgroup 2). We used the R package ROCR (24) to estimate the AUC and MSE.
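As an illustration of the three figures of merit named above, the following minimal Python sketch computes AUC, RMSE, and the OE ratio from predicted carrier probabilities and observed carrier status. It is not the R/ROCR code used in the study; the function names and the toy data are hypothetical and serve only to make the definitions concrete.

```python
import math

def auc(probs, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    # Count carrier/noncarrier pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rmse(probs, labels):
    """Root-mean-square error between predicted probability and 0/1 status."""
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs))

def oe_ratio(probs, labels):
    """Observed number of carriers divided by the expected number."""
    return sum(labels) / sum(probs)

# Hypothetical toy data: five probands, three of them carriers.
probs = [0.9, 0.6, 0.4, 0.2, 0.1]
labels = [1, 1, 0, 1, 0]

print(auc(probs, labels), rmse(probs, labels), oe_ratio(probs, labels))
```

An OE ratio below 1 means the model expects more carriers than are observed, which is the direction of miscalibration reported for version 2.0-7.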
Statistical modeling
We estimated the penetrances of CBC for carriers of mutations in either one of the BRCA genes or both, and for noncarriers, as functions of the time elapsed since the first breast cancer diagnosis. For consistency with earlier versions of BRCAPRO, we used a discrete time variable t from 0 to 110, interpreted as the difference between the ages at the two breast cancer diagnoses, in years. A value of t = 0 indicates that the first unilateral and second contralateral diagnoses occurred within a year of each other. We use the term unilateral only for the first diagnosis; the second is always called contralateral. We estimated penetrances for BRCA1 and BRCA2 carriers from published cumulative risk estimates (21). To estimate noncarrier penetrance, we first estimated the general population penetrance from the SEER9 database and subtracted the penetrance attributable to carriers.
Penetrance of BRCA carriers
Graeser and colleagues (21) estimated the cumulative risk of CBC following a first unilateral breast cancer diagnosis in relatives of BRCA1 or BRCA2 carriers. This was a retrospective, multicenter study based on the German Consortium for Hereditary Breast and Ovarian Cancer, ranging from 1996 to 2008, and including a total of 1,520 families with a deleterious germline mutation in either BRCA1 or BRCA2. Exclusion criteria from the study were: (i) relatives testing negative for the known mutation in the family; (ii) synchronous bilateral or noninvasive breast cancer; (iii) insufficient information about age at cancer events or bilateral mastectomy. The final study cohort comprised 2,020 women with unilateral breast cancer (978 index patients and 1,042 relatives); the group of index patients was not used to estimate risk of CBC following first diagnosis because they were already selectively chosen to be DNA tested. The authors provided the cumulative risk of CBC at 5, 10, 15, and 25 years following the original diagnosis. In relatives of BRCA1 carriers, age of unilateral breast cancer diagnosis was associated with risk of CBC, and estimates were provided for women diagnosed with unilateral breast cancer before age 40, between 40 and 49, and older. There does not seem to be a pronounced difference in cumulative risk between women diagnosed between 40 and 49 years old and women diagnosed later.
We used maximum-likelihood estimation, assuming an underlying Weibull distribution, to fit these cumulative risk estimates. For BRCA1, we fit two separate models: one for women diagnosed with unilateral breast cancer before age 40 and a second model for the two older age categories combined, using their averaged cumulative risks as input. For BRCA2, differences among cumulative risks for any of the age categories failed to achieve significance (21); we, thus, fit a single cumulative risk model to all age groups combined.
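To make the idea of fitting a Weibull curve to a handful of published cumulative risks concrete, the sketch below uses a simpler substitute for the maximum-likelihood fit described above: ordinary least squares on the complementary log-log scale, where a Weibull cumulative risk R(t) = 1 − exp(−(t/scale)^shape) becomes a straight line in log t. The shape and scale values and the resulting risks are hypothetical and are not the estimates from ref. (21).

```python
import math

def weibull_cum_risk(t, shape, scale):
    """Weibull cumulative risk R(t) = 1 - exp(-(t / scale) ** shape)."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def fit_weibull_cloglog(times, cum_risks):
    """Least-squares fit of log(-log(1 - R)) = shape * log(t) - shape * log(scale)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - r)) for r in cum_risks]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    shape = slope
    scale = math.exp(-intercept / slope)
    return shape, scale

# Follow-up times at which cumulative risks are reported (years).
times = [5, 10, 15, 25]
# Hypothetical risks generated from a known curve so the fit can be checked.
true_shape, true_scale = 1.2, 40.0
risks = [weibull_cum_risk(t, true_shape, true_scale) for t in times]

shape, scale = fit_weibull_cloglog(times, risks)
print(shape, scale)
```

With real published risks the four points will not lie exactly on a Weibull curve, so the least-squares and maximum-likelihood fits would generally differ somewhat.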
BRCAPRO needs to cover a broad range of combinations of ages of first and second diagnoses, including some that are so far apart as to have never been observed in ref. (21) or in SEER. To cover these unlikely cases, we extrapolated the CBC penetrance curves beyond the 25-year follow-up covered by (21). To this end, we made the assumption that the penetrance follows an exponential decay starting at year 26 after the first unilateral diagnosis, and ending with a penetrance of essentially zero at year 110.
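The extrapolation step can be sketched as follows. The text does not state the decay rate or what "essentially zero" means numerically, so the sketch assumes a hypothetical floor value at year 110 and derives the rate from it; the flat input curve is likewise hypothetical.

```python
import math

def extend_with_exponential_tail(density, last_year=110, floor=1e-8):
    """Extend a yearly penetrance list (index = years since first diagnosis)
    with an exponential decay that reaches `floor` at `last_year`.

    `floor` is an assumed stand-in for "essentially zero"."""
    start = len(density) - 1          # last year covered by the fitted curve
    f_start = density[start]
    if f_start <= floor:
        raise ValueError("last fitted value must exceed the floor")
    # Rate chosen so that f_start * exp(-rate * (last_year - start)) == floor.
    rate = math.log(f_start / floor) / (last_year - start)
    tail = [f_start * math.exp(-rate * (t - start))
            for t in range(start + 1, last_year + 1)]
    return list(density) + tail

# Hypothetical flat yearly penetrance for years 0..25 after first diagnosis.
observed = [0.01] * 26
full = extend_with_exponential_tail(observed)
print(len(full), full[26], full[110])
```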
We also need to cover the unlikely scenario of carriers of both a BRCA1 and BRCA2 mutation. In keeping with earlier versions of BRCAPRO, we assume that the age of onset for these individuals is distributed as the minimum of the corresponding random variables for carriers of a single mutation. The cumulative risk |$R_{{\rm 12}} (t)$| for carriers of both mutations is, thus, obtained from the BRCA1 cumulative risk |$R_{\rm 1} (t)$|, and the BRCA2 cumulative risk |$R_{\rm 2} (t)$| as
$$R_{{\rm 12}} (t) = 1 - \left[ {1 - R_{\rm 1} (t)} \right]\left[ {1 - R_{\rm 2} (t)} \right]. \qquad (1)$$
Penetrance of the general population
From the SEER9 database, covering diagnoses from 1973 to 2006, we extracted patients who experienced invasive CBC. Repeated diagnoses, bilateral diagnoses, and diagnoses with unknown laterality were removed for a total available sample of 457,304 women ages 11 to 108 at unilateral breast cancer diagnosis. We split the general population into two different age groups, containing, respectively, 29,659 women ages <40 and 427,645 women ages ≥40 at their first breast cancer diagnosis. We modeled the time to CBC from unilateral breast cancer diagnosis using a Weibull parametric survival curve and derived the cumulative risk. The maximum time to CBC was 34 years after unilateral breast cancer diagnosis.
Penetrance for noncarriers
We estimated the cumulative risk for noncarriers |$R_{{\rm noncar}} (t)$| from the cumulative risk of carriers |$R_{{\rm car}} (t)$| and the cumulative risk for the general population |$R_{{\rm pop}} (t)$|. Let |$\theta _{\rm 1} (t), \,\theta _{\rm 2} (t)$| be the allele frequencies of deleterious mutations in the BRCA1 and BRCA2 genes, respectively. Also let |$\theta _{{\rm 12}} (t)$| be the allele frequency of a mutation in both BRCA1 and BRCA2. We can express |$R_{{\rm pop}} (t)$| as a mixture of |$R_{{\rm car}} (t)$| and |$R_{{\rm noncar}} (t)$|
$$R_{{\rm pop}} (t) = \pi _{{\rm car}} R_{{\rm car}} (t) + \pi _{{\rm noncar}} R_{{\rm noncar}} (t), \qquad (2)$$
where |$\pi _{{\rm car}} \, = \,\theta _{\rm 1} {\rm +}\theta _{\rm 2} - \theta _{{\rm 12}} $| and |$\pi _{{\rm noncar}\,} = \,{\rm 1} - \pi _{{\rm car}} $| are, respectively, the probabilities of being a carrier and a noncarrier in the general population. The term |$R_{{\rm car}} (t)$| in equation (2) can be written as a linear combination:
$$R_{{\rm car}} (t) = \frac{{(\theta _{\rm 1} - \theta _{{\rm 12}} )R_{\rm 1} (t) + (\theta _{\rm 2} - \theta _{{\rm 12}} )R_{\rm 2} (t) + \theta _{{\rm 12}} R_{{\rm 12}} (t)}}{{\pi _{{\rm car}} }}, \qquad (3)$$
where |$R_{\rm 1} (t), \,R_{\rm 2} (t), \,R_{{\rm 12}} (t)$| are defined as above. By solving equation (2) for |$R_{{\rm noncar}} (t)$|, we obtained the cumulative risk for noncarriers. As before, for years 35 to 110 following unilateral breast cancer diagnosis, we assumed that the penetrance function follows an exponential decay to essentially zero.
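The deconvolution described in this subsection can be sketched in a few lines of Python. Two points are assumptions rather than statements from the text: the double-carrier risk treats the two single-mutation onset times as independent, and the carrier mixture weights single-gene-only carriers by θ1 − θ12 and θ2 − θ12, which is one reading consistent with the stated π_car. All numerical values below are hypothetical.

```python
def double_carrier_risk(r1, r2):
    """Cumulative risk when onset is the minimum of two independent onset
    times (independence is an assumption of this sketch)."""
    return 1.0 - (1.0 - r1) * (1.0 - r2)

def carrier_risk(r1, r2, theta1, theta2, theta12):
    """Average cumulative risk among carriers of any mutation.
    The weights are an assumed reading of the linear combination."""
    pi_car = theta1 + theta2 - theta12
    r12 = double_carrier_risk(r1, r2)
    return ((theta1 - theta12) * r1
            + (theta2 - theta12) * r2
            + theta12 * r12) / pi_car

def noncarrier_risk(r_pop, r1, r2, theta1, theta2, theta12):
    """Solve R_pop = pi_car * R_car + pi_noncar * R_noncar for R_noncar."""
    pi_car = theta1 + theta2 - theta12
    pi_noncar = 1.0 - pi_car
    r_car = carrier_risk(r1, r2, theta1, theta2, theta12)
    return (r_pop - pi_car * r_car) / pi_noncar

# Hypothetical inputs at a single time point t.
theta1, theta2 = 0.0012, 0.0014
theta12 = theta1 * theta2
r1, r2, r_pop = 0.30, 0.20, 0.05

print(noncarrier_risk(r_pop, r1, r2, theta1, theta2, theta12))
```

Because carriers are rare in the general population, the noncarrier curve stays very close to the population curve; the correction matters mainly for internal consistency of the model.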
Further adjustments
We applied some minor additional adjustments to the curves. Because the first CBC cumulative risk estimates in ref. (21) were 5 years after unilateral breast cancer diagnosis, we assumed the cumulative risks |$R_{\rm 1} (t)$| and |$R_{\rm 2} (t)$| for BRCA carriers to be linear between years 1 and 5; moreover, we set |$R_{\rm 1} ({\rm 0})$| equal to |$R_{\rm 1} (1)$| and |$R_{\rm 2} (0)$| equal to |$R_{\rm 2} (1)$|, respectively, for BRCA1 and BRCA2 mutation carriers. This removed a singularity at time t = 0 given by the Weibull parametric model, which would have made the estimated probability of a contemporaneous diagnosis of CBC greater than 1. We also removed a singularity at time t = 0 for the general and noncarrier population penetrance curves, assuming a linear cumulative risk between times t = 0 and t = 1. Figure 1 shows the final penetrance density functions that have been included in the current implementation of BRCAPRO 2.0-8.
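A rough sketch of the early-year adjustment for carriers follows. The text does not say where the straight line starts, so the sketch assumes it runs from zero at t = 0 to the fitted value at year 5, after which R(0) is set equal to R(1); the conversion from cumulative risk to yearly penetrance by differencing is also an assumption about the discretization. The input values are hypothetical.

```python
def adjust_early_years(cum_risk):
    """Return a copy of a cumulative risk list (index = years, length >= 6)
    with years 1..4 replaced by a straight line ending at R(5), and R(0)
    set equal to R(1). The line through the origin is an assumption."""
    adjusted = list(cum_risk)
    r5 = cum_risk[5]
    for t in range(1, 5):
        adjusted[t] = r5 * t / 5.0
    adjusted[0] = adjusted[1]
    return adjusted

def yearly_penetrance(cum_risk):
    """Discrete penetrance: mass at t = 0, then year-to-year increments."""
    return [cum_risk[0]] + [cum_risk[t] - cum_risk[t - 1]
                            for t in range(1, len(cum_risk))]

# Hypothetical fitted curve that rises too steeply near t = 0.
raw = [0.0, 0.09, 0.095, 0.098, 0.099, 0.10, 0.12]
adjusted = adjust_early_years(raw)
density = yearly_penetrance(adjusted)
print(adjusted)
print(density)
```

After the adjustment the yearly penetrances are nonnegative and sum to the final cumulative risk, so no single year carries an implausibly large probability mass.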
Results
Performance of BRCAPRO 2.0-8
As expected, only probands in subgroup 1 have a modified risk of being a BRCA carrier in BRCAPRO 2.0-8 compared with 2.0-7. Figure 2 provides an overall comparison. For the vast majority of families with CBC, the carrier probability is reduced in the new version. This is because, generally, two positively correlated diagnoses provide less evidence toward increased risk than would two independent diagnoses. A large number of families, highly enriched for noncarriers, moves from high to low risk by the typical definitions of risk used clinically (e.g., 5% or 10%). Figure 3 further breaks down CBC families depending on whether the proband or a relative is affected with CBC (Fig. 3A), and depending on the time interval between the two diagnoses (Fig. 3B). The carrier risk decreased more markedly if the CBC occurred in the proband and/or if fewer years passed between the unilateral and contralateral breast cancer diagnoses. Although the estimated carrier risk is lower in the revised model in most families with CBC, exceptions occur when at least 12 years passed between diagnoses.
The two versions of BRCAPRO discriminate similarly well between carriers and noncarriers overall (difference in AUC between release 2.0-8 and release 2.0-7 = 0.0043), in subgroup 1 (difference in AUC = 0.0002) and in subgroup 2 (difference in AUC = 0.0068); see Table 1 for the BCa 95% CIs. The new version has increased precision as measured by a statistically significant decrease in the RMSE of 0.0108 (95% CI, −0.0154 to −0.0067; see also Table 1). As expected, this trend in the RMSE is driven by families in subgroup 1, which show a statistically significant decrease in the RMSE of 0.0551 (95% CI, −0.0761 to −0.0347), and in subgroup 2, which show a statistically significant decrease in the RMSE of 0.0633 (95% CI, −0.0984 to −0.0306).
Metric | Overall | CBC in family | CBC proband |
---|---|---|---|
AUC 2.0-7 | 0.7884 (0.7613, 0.8126) | 0.7785 (0.7177, 0.8295) | 0.7479 (0.6533, 0.8220) |
AUC 2.0-8 | 0.7927 (0.7661, 0.8169) | 0.7788 (0.7185, 0.8311) | 0.7547 (0.6623, 0.8274) |
|${\rm \Delta}_{{\rm AUC}} $| | 0.0043 (0.0005, 0.0082) | 0.0002 (−0.0189, 0.0194) | 0.0068 (−0.0016, 0.0312) |
OE 2.0-7 | 0.8910 (0.8239, 0.9608) | 0.5515 (0.4700, 0.6360) | 0.5837 (0.4680, 0.7051) |
OE 2.0-8 | 0.9765 (0.9036, 1.0519) | 0.7296 (0.6198, 0.8402) | 0.8018 (0.6445, 0.9750) |
|${\rm \Delta}_{{\rm OE}} $| | 0.0855 (0.0722, 0.1021) | 0.1781 (0.1441, 0.2186) | 0.2181 (0.1634, 0.2894) |
RMSE 2.0-7 | 0.3854 (0.3698, 0.4020) | 0.4998 (0.4652, 0.5355) | 0.5314 (0.4789, 0.5848) |
RMSE 2.0-8 | 0.3745 (0.3587, 0.3910) | 0.4448 (0.4090, 0.4805) | 0.4682 (0.4162, 0.5240) |
|${\rm \Delta}_{{\rm RMSE}} $| | −0.0108 (−0.0154, −0.0067) | −0.0551 (−0.0761, −0.0347) | −0.0633 (−0.0984, −0.0306) |
NOTE: Rows labeled by Δ contain the difference in the figure of merit between the two versions (BRCAPRO 2.0-8 minus BRCAPRO 2.0-7), with corresponding 95% CIs. The differences ΔOE and ΔRMSE are statistically significant.
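The CIs for the between-version differences rest on resampling pedigrees and recomputing both versions' metrics on the same resample. The sketch below illustrates this paired resampling for the RMSE difference using a plain percentile bootstrap, which is a simpler substitute for the BCa intervals reported in the table. The function names, the number of resamples, and the toy predictions are hypothetical.

```python
import math
import random

def rmse(probs, labels):
    """Root-mean-square error between predicted probability and 0/1 status."""
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs))

def bootstrap_rmse_difference(old, new, labels, n_boot=2000, seed=0):
    """Percentile bootstrap CI for RMSE(new) - RMSE(old), resampling pedigrees
    with replacement and scoring both versions on the same resample."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        diffs.append(rmse([new[i] for i in idx], y)
                     - rmse([old[i] for i in idx], y))
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    point = rmse(new, labels) - rmse(old, labels)
    return point, lower, upper

# Hypothetical predictions: the "new" version is 0.1 closer to the truth
# for every pedigree.
labels = [1, 0, 0, 1, 0, 0, 1, 0]
old = [0.6, 0.5, 0.3, 0.7, 0.4, 0.2, 0.5, 0.6]
new = [0.7, 0.4, 0.2, 0.8, 0.3, 0.1, 0.6, 0.5]

point, lower, upper = bootstrap_rmse_difference(old, new, labels)
print(point, lower, upper)
```

An interval that lies entirely below zero corresponds to what the note calls a statistically significant decrease in RMSE.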
The calibration of BRCAPRO improves in version 2.0-8. The new OE is 0.98, a statistically significant increase of 0.09 with respect to version 2.0-7, and is closer to the target value of 1; when this metric is considered separately for the two genes, the OE is 1.04 for BRCA1 carrier status and 0.89 for BRCA2 carrier status. Calibration also improves in both subgroups 1 and 2 (0.73 for version 2.0-8 from 0.55 for version 2.0-7, and 0.80 from 0.58, respectively). In BRCAPRO 2.0-8, the 5-year risk of a CBC diagnosis for probands with a unilateral breast cancer diagnosis is lower than in the previous version. The biggest impacts upon the 5-year risk of CBC occur when the first diagnosis is before age 40, and in the presence of CBC in the family (Fig. 4). For unaffected probands, the risk of a breast cancer diagnosis within 5 years remains the same if there is no family history of CBC, but decreases otherwise (Fig. 5).
Discussion
We updated BRCAPRO and improved the way in which it handles CBC. These changes mitigate a previous problem with overestimation of carrier risk in families with CBC, and lead to significantly improved model calibration, as well as a significant improvement in estimation accuracy, on the order of 20 or more percentage points, in these families. On the basis of this evaluation, we suggest that the new version should be used in clinical counseling, particularly in settings in which families with CBC are common and in which affected probands are counseled about their contralateral recurrence risk.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: E. Mazzola, G. Parmigiani
Development of methodology: E. Mazzola, G. Parmigiani
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): E. Mazzola, J. Chipman, S.-C. Cheng
Writing, review, and/or revision of the manuscript: E. Mazzola, J. Chipman, S.-C. Cheng, G. Parmigiani
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): J. Chipman
Study supervision: G. Parmigiani
Grant Support
This work has been supported by NIH/NCI awards 5R21CA177233-02 (to G. Parmigiani, PI) and 5P30CA006516-49 (to E. Benz, PI).