Background:

Risk prediction models may be useful for precision breast cancer screening. We aimed to evaluate the performance of breast cancer risk models developed in European-ancestry studies in a Korean population.

Methods:

We compared discrimination and calibration of three multivariable risk models in a cohort of 77,457 women from the Korean Cancer Prevention Study (KCPS)-II. The first incorporated U.S. breast cancer incidence and mortality rates, U.S. risk factor distributions, and RR estimates from European-ancestry studies. The second recalibrated the first by using Korean incidence and mortality rates and Korean risk factor distributions, while retaining the European-ancestry RR estimates. Finally, we derived a Korea-specific model incorporating the RR estimates from KCPS.

Results:

The U.S. European-ancestry breast cancer risk model was well calibrated among Korean women <50 years [expected/observed = 1.124 (0.989, 1.278)] but markedly overestimated the risk for those ≥50 years [E/O = 2.472 (2.005, 3.049)]. Recalibrating absolute risk estimates using Korean breast cancer rates and risk distributions markedly improved the calibration in women ≥50 [E/O = 1.018 (0.825, 1.255)]. The model incorporating Korean-based RRs had similar but not clearly improved performance relative to the recalibrated model.

Conclusions:

The poor performance of the U.S. European-ancestry breast cancer risk model among older Korean women highlights the importance of tailoring absolute risk models to specific populations. Recalibrating the model using Korean incidence and mortality rates and risk factor distributions greatly improved performance.

Impact:

The data will provide valuable information to plan and evaluate actions against breast cancer focused on primary prevention and early detection in Korean women.

Breast cancer is the most common nonskin cancer in women worldwide. Although Korea has had a lower incidence compared with Western countries, the incidence of breast cancer is rapidly increasing. It is now the second leading cancer in Korean women (1) after thyroid cancer, with 21,402 new cases diagnosed in 2014. This increasing trend likely reflects changes in reproductive factors in Korean women such as early menarche, late menopause, and having fewer children at an older age (2), all of which are secondary to the rapid development and westernization of Korea. However, the age distribution of breast cancer incidence in Korea is still markedly different from that in the Western countries, with a peak at 45–49 years and a higher proportion of premenopausal women (3–5).

Mammography and other screening modalities can reduce morbidity and mortality of breast cancer (6, 7). In Korea, mammography is recommended every 2 years for women ages 40 or older. Breast cancer susceptibility, however, largely depends on multiple risk factors, and it is crucial to identify high-risk women who may benefit from aggressive screening strategies. Thus, building risk prediction models and improving their predictive value is an important step toward targeted screening and prevention.

A number of breast cancer risk models have been developed in European-ancestry populations (8). These models use information on reproductive factors, family history of disease, mammographic density, and measured genetic factors to estimate a woman's absolute risk of disease. Recent work has focused on developing and validating a “synthetic” multivariable risk model that can include a comprehensive set of risk factors (9, 10). The absolute risks from this model are well calibrated across U.S. and European cohorts, but they are unlikely to provide accurate risk estimates for Korean women without further modification. It is unknown whether recalibrated risk estimates using Korean population incidence rates but retaining relative risk estimates from European-based studies will perform well among Korean women.

This study aims to evaluate the discrimination and calibration of three breast cancer risk models among Korean women: a model based on risk factor RRs estimated from European-ancestry studies, U.S. incidence and mortality rates, and the distribution of risk factors among U.S. non-Hispanic white women; a model using European-ancestry RR estimates but Korean incidence and mortality rates and Korean risk factor distributions; and a model that combines Korean incidence and mortality rates and Korean risk factor distributions with Korean RR estimates. Although the last model, in principle, should perform best, in practice, if the risk factor RRs are similar across European-ancestry and Korean populations, the recalibrated model using RRs estimated among Europeans may perform well, especially if the sample sizes used to estimate the Korean RRs are relatively small.

We calculate absolute risk estimates and evaluate their performance using data from the Korean Cancer Prevention Study-II (KCPS-II) Biobank and the Individualized Coherent Absolute Risk Estimation (iCARE) software. iCARE was designed to build and validate risk prediction models for a population by combining information on RR estimates, age-specific incidence/mortality rates, and risk factor distributions from multiple data sources (11).

### Study population for discrimination and calibration analyses

We used the KCPS-II Biobank to evaluate the discrimination and calibration of breast cancer absolute risk models. The KCPS-II includes 78,282 women who undertook routine health assessments at health promotion centers between 2004 and 2013. The study design and recruitment have been described in detail previously (12). All participants gave written informed consent before participation. The Institutional Review Board of Yonsei University approved this study protocol (IRB approval number 4-2011-0277). Exclusion criteria included no information on height and weight, history of breast cancer, and age at entry below 20 or above 80 years. The final analytic sample included 77,457 women (Supplementary Fig. S1; Supplementary Table S1).

### Data collection

All participants were asked to complete a structured questionnaire to collect the following details: age at menarche, age at menopause, parity, age at first birth, oral contraceptive (OC) use (never, ever), hormone replacement therapy (HRT) use, alcohol intake, history of benign breast disease (BBD), and family history of breast cancer. Height and weight were measured while participants wore light clothing. Body mass index (BMI) was calculated as the weight (kg) divided by the height squared (m²).

### Follow-up for breast cancer

The principal outcome was incidence of breast cancer (ICD-10 code C50). Because all participants have a unique identification number assigned at birth, allowing linkage with the national cancer registry and hospital admission records, the follow-up was almost 100% complete. Cancer diagnoses are based on histologic type, resulting in high accuracy.

### Study population for RR estimation

We used the KCPS to independently estimate the relative risks for breast cancer risk factors (Supplementary Table S2). The KCPS is a 1.3-million-member prospective cohort study, designed to assess risk factors for mortality, incidence, and hospital admission from cancer, with a follow-up of 25 years (13). The KCPS cohort includes 443,627 women ages 20–80 years who received health insurance from the Korean Medical Insurance Corporation and who had biennial medical evaluations between 1992 and 1995. Risk factors were collected in the same way as in the KCPS-II. Because women in the KCPS were not asked about history of BBD, we defined history of BBD based on ICD-10 code D24. In the KCPS cohort, an incident breast cancer was coded on the basis of a hospital admission for a cancer diagnosis.

### Statistical analysis

We evaluated the performance of a recently published breast cancer absolute risk model (9) in the KCPS-II Biobank. We compared the performance of three models: (i) the U.S.-based European-ancestry model, using incidence, mortality, and risk factor distributions among U.S. non-Hispanic white women and European-ancestry RRs (USEA); (ii) a recalibrated model, using Korean incidence, mortality, and risk factor distributions but European-ancestry RRs (KREA); and (iii) a fully Korean-based model using Korean incidence, mortality, and risk factor distributions and RR estimates from the KCPS (KRKR).

The models include data on reproductive, anthropometric, behavioral, and clinical risk factors: age at menarche (≤10, 11, 12, 13, 14, 15, ≥16 years), age at menopause (<40, 40–44, 45–49, 50–54, ≥55 years), parity (0, 1, 2, ≥3 births), age at first birth (<20, 20–24, 25–29, ≥30 years), OC use (never, ever), HRT use (never, ever), BMI (<18.5, 18.5–24.9, 25.0–29.9, ≥30.0 kg/m²), height (cm/10), alcohol intake (0, 1–4, 5–14, 15–24, 25–34, 35–44, ≥45 g/day), history of BBD (no, yes), and family history of breast cancer (no, yes).

Due to the effect of estrogen, postmenopausal women have a greater risk of developing breast cancer than premenopausal women. Moreover, several factors such as obesity and HRT use have been linked to a higher risk of breast cancer only for postmenopausal women (14, 15). Therefore, we assessed the breast cancer risk models separately for women younger than 50 years and women ages 50 or older.

For the USEA and KREA models, we used literature-based RRs of the risk factors (9). For the KRKR model, the RR estimates were obtained from multivariable Cox regression models based on a Korean cohort (KCPS). Supplementary Table S2 provides detailed descriptions of RR estimates included in the models and population distribution.

iCARE uses average age-specific incidence rates to calibrate the predicted risks (16, 17). We used information on age-specific breast cancer incidence rates (Supplementary Fig. S2) and mortality rates from population-based registries in the United States and Korea: the 2008–2012 U.S. Surveillance, Epidemiology, and End Results data and the 2010 Korea National Statistical Office data, respectively.

To obtain information on risk factor distributions, iCARE uses an additional individual-level reference dataset of risk factors representing each population (11). The reference datasets were 2010 National Health and Nutrition Examination Survey (NHANES) for the U.S.-based model and 2010–2012 Korean NHANES (KNHANES) for the recalibrated model and the Korean-based model. To account for missing data in KNHANES for continuous factors, we performed conditional mean imputation, using MICE to draw multiple samples of missing factors conditional on observed data (m = 10), then averaging factor values over the samples. For two unmeasured factors, history of BBD and family history of breast cancer, in the KNHANES, we used single random draw imputation based on the prevalence of the corresponding factors from the validation cohort. Using the imputation and simulation described above, we created a complete dataset of KNHANES with no missing information on risk factors.
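The two imputation steps described above can be illustrated with a minimal sketch. It is not the authors' code: a single-predictor stochastic regression stands in for MICE (which iterates over several variables and also draws the regression parameters), and the function names, the use of `None` for missing values, and the seeds are assumptions made for illustration.

```python
import random
import statistics


def conditional_mean_impute(x, y, m=10, seed=0):
    """Fill missing y (None) with the mean of m stochastic regression draws.

    Simplified stand-in for MICE: regress y on one observed predictor x,
    draw m imputations (prediction plus residual noise), then average them.
    """
    rng = random.Random(seed)
    xs = [xi for xi, yi in zip(x, y) if yi is not None]
    ys = [yi for yi in y if yi is not None]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((xi - mx) ** 2 for xi in xs)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(xs, ys)]
    resid_sd = (sum(r * r for r in resid) / (len(resid) - 2)) ** 0.5

    filled = []
    for xi, yi in zip(x, y):
        if yi is None:
            draws = [intercept + slope * xi + rng.gauss(0.0, resid_sd)
                     for _ in range(m)]
            yi = statistics.fmean(draws)  # average over the m draws
        filled.append(yi)
    return filled


def single_draw_binary(n, prevalence, seed=0):
    """One random draw of an unmeasured yes/no factor at a given prevalence."""
    rng = random.Random(seed)
    return [1 if rng.random() < prevalence else 0 for _ in range(n)]
```

Averaging the draws gives a single completed dataset, which matches the "complete dataset with no missing information" described in the text, at the cost of understating the variability of the imputed values.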

We evaluated model performance in terms of discrimination and calibration. For risk discrimination, we assessed the area under the receiver operating characteristic curve (AUC). For calibration, the KCPS-II Biobank participants were categorized into deciles of 5-year absolute risk predicted by the iCARE literature-based (iCARE-Lit) model. The predicted and observed incidence in each decile was compared using the expected-to-observed ratio (E/O) and the Hosmer-Lemeshow χ² test. Furthermore, we estimated cumulative and 10-year absolute risk using the current probability method (16) in the Korean-based model.
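As an illustration of the two performance measures, the sketch below computes an E/O ratio with a confidence interval and a rank-based AUC. The interval formula (a log-scale Poisson approximation, E/O × exp(±1.96/√O)) is an assumption; the source does not state how its intervals were obtained. The function names are hypothetical.

```python
import math


def eo_ratio_ci(expected, observed, z=1.96):
    """Expected/observed ratio with an approximate 95% CI.

    Assumes the observed case count is Poisson, so that the standard error
    of log(E/O) is roughly 1/sqrt(observed).
    """
    ratio = expected / observed
    half_width = z / math.sqrt(observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)


def auc(scores, labels):
    """Probability that a random case has a higher predicted risk than a
    random noncase (ties count one half). Quadratic time; fine for a sketch.
    """
    cases = [s for s, lab in zip(scores, labels) if lab == 1]
    noncases = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in noncases)
    return wins / (len(cases) * len(noncases))
```

In practice `expected` would be the sum of predicted 5-year risks over the group. Under this assumption, an E/O of 1.124 with 233 observed cases gives an interval of about 0.989 to 1.278, in line with the values reported for the U.S.-based model among women under 50.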

The absolute risk of developing breast cancer for a woman of age a over the age interval from a to a + s is given by formula (A).

Formula (A) holds under the assumption that the risk factors Z act multiplicatively on the baseline hazard function $\lambda_0(t)$. It accounts for competing risks due to mortality from other causes through the age-specific mortality rate function m(t).
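For reference, a standard form of the absolute risk under these assumptions is sketched below. It is written from the symbols defined in the text (baseline hazard $\lambda_0$, risk factors $Z$, competing mortality $m$), with $\beta$ denoting the log-relative risks; it should be read as a reconstruction of the usual competing-risk expression rather than a verbatim copy of the authors' display.

$$
P(a, s \mid Z) \;=\; \int_{a}^{a+s} \lambda_0(t)\, e^{\beta^{\top} Z}\,
\exp\!\left\{ -\int_{a}^{t} \left[ \lambda_0(u)\, e^{\beta^{\top} Z} + m(u) \right] du \right\} dt
\tag{A}
$$

The inner exponential is the probability of reaching age $t$ free of both breast cancer and death from other causes, and the outer integrand is the breast cancer hazard at age $t$ for a woman with risk factors $Z$.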

Cumulative risk is evaluated as absolute risk between age 20 years and a specific age. The 10-year risk is evaluated as absolute risk over the next 10 years for a woman who has attained a specific age without developing breast cancer. To use this method, we estimated the multivariable RR of each woman in the KCPS-II based on her risk factors Z, the log-relative risks β estimated in the KCPS, the age-specific mortality rates of breast cancer in Korea, and the risk factor distribution in KNHANES. iCARE uses the log-relative risks, the risk factor distribution, and the population average age-specific incidence rates to calculate the baseline hazard; it then calculates absolute risk for each subject using formula (A).
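The calculation described above can be sketched numerically. This is a simplified illustration, not the iCARE implementation: hazards are treated as constant within each year of age, and the baseline hazard is approximated as the population incidence divided by the mean RR in the reference population (iCARE's own calibration is more detailed). All rates and RR values below are hypothetical.

```python
import math


def relative_risk(betas, z):
    """Individual RR under the multiplicative model: exp(sum of beta_k * z_k)."""
    return math.exp(sum(b * zk for b, zk in zip(betas, z)))


def absolute_risk(age, span, incidence, mortality, rr, mean_rr):
    """Risk of breast cancer between `age` and `age + span`.

    incidence, mortality: dicts mapping age (years) to annual rates.
    rr: the woman's relative risk; mean_rr: average RR in the reference
    population, used to approximate the baseline hazard.
    """
    surv = 1.0   # probability of being alive and disease-free so far
    risk = 0.0
    for t in range(age, age + span):
        hazard = incidence[t] / mean_rr * rr      # individual hazard
        total = hazard + mortality[t]             # plus competing mortality
        if total > 0.0:
            # Probability the first event in this year is breast cancer.
            risk += surv * (hazard / total) * (1.0 - math.exp(-total))
        surv *= math.exp(-total)
    return risk


# Hypothetical flat rates, for illustration only (not Korean registry data).
incidence = {t: 0.001 for t in range(20, 90)}
mortality = {t: 0.005 for t in range(20, 90)}
ten_year = absolute_risk(45, 10, incidence, mortality, rr=1.5, mean_rr=1.2)
cumulative = absolute_risk(20, 60, incidence, mortality, rr=1.5, mean_rr=1.2)
```

Here `ten_year` corresponds to the 10-year risk for a woman who has reached age 45, and `cumulative` to the cumulative risk from age 20 to age 80, as defined in the text.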

All statistical tests were two-sided at a significance level of 0.05 and calculated using SAS version 9.4 software (SAS Institute, Cary, NC) for descriptive statistics and relative risks. Absolute risks were evaluated with R 3.5.0 software using the iCARE package 1.0.0.

### Baseline risk factors

A total of 680 breast cancer cases were diagnosed during follow-up in the KCPS-II Biobank (322 cases were diagnosed within 5 years). Baseline risk factor distributions stratified at age 50 are displayed in Supplementary Table S1. Compared with women ages 50 years or older, women younger than 50 years tended to have earlier age at menarche, fewer births, and later age at first birth, and were more likely to drink alcohol.

### Population distribution and RRs

The population distributions and RRs of breast cancer risk factors differed between the U.S. non-Hispanic white and Korean populations (Supplementary Table S2). U.S. non-Hispanic white women tended to have earlier age at menarche, later age at menopause, and earlier age at first birth than Korean women. The proportions of women who used OC or HRT were markedly higher among U.S. non-Hispanic white women than among Korean women. U.S. non-Hispanic white women, on average, had a higher BMI than Korean women.

The relative risks did not differ greatly between European-ancestry and Korean women for most risk factors: the RR comparing the highest versus the lowest risk category among Korean women was generally 0.40–1.90 times that of European-ancestry women. One notable exception was BBD, which had a larger effect on breast cancer among Korean women than European-ancestry women (RR = 5.05 vs. 1.68). This may reflect the narrower definition of BBD in the KCPS, focusing on women with a history of benign neoplasm of unspecified breast. In addition, the inverse association between BMI and breast cancer risk seen among premenopausal European-ancestry women was not seen among Korean women.

### Risk projections

Figure 1 shows cumulative and 10-year risks of breast cancer among Korean women between age 20 and 80 years by percentiles of absolute risk estimated in the KRKR model. The cumulative risk at 80 years for women in the 95th percentile of risk was 7.56%, while the average cumulative risk was 2.06%. The 10-year risk of breast cancer for women in the 95th percentile of risk peaked at 2.61% at age 45. The 10-year risks increased from age 20 to 45 and decreased thereafter.

Figure 1.

Cumulative and 10-year breast cancer risk for Korean women, stratified by risk percentiles in the KCPS-II Biobank estimated in the Korean-based model. Cumulative risk is evaluated as absolute risk between age 20 years and a specific age shown on the x-axis. The 10-year risk is evaluated as absolute risk over the next 10 years for a woman who has attained a specific age (shown on the x-axis) without developing breast cancer.


### Predictive capacities

The AUCs for women ages <50 years and ≥50 years for the USEA model in the KCPS-II were 71.8% [95% confidence interval (CI), 68.8–74.8] and 57.1% (95% CI, 51.2–62.9), respectively, showing better ability to distinguish cases from noncases among younger women (Table 1). The USEA model was well calibrated among Korean women ages <50 years [E/O (95% CI) = 1.12 (0.99–1.28); Fig. 2] but it overestimated the risk for those ages ≥50 years [E/O (95% CI) = 2.47 (2.01–3.05); Fig. 3]. Recalibrating absolute risk estimates using Korean age-specific incidence rates and risk distributions markedly improved the calibration in women ages ≥50 years [E/O (95% CI) = 1.02 (0.83–1.26)]. Recalibrating using the Korean age-specific incidence rates while keeping a U.S. risk factor reference distribution underestimated risk in the KCPS-II [E/O (95% CI) = 0.74 (0.65–0.84)]; recalibrating using a Korean risk factor reference distribution while keeping U.S. incidence rates overestimated risk [E/O (95% CI) = 1.36 (1.19–1.54); Supplementary Table S3]. In addition, incorporating Korean-based RR estimates also improved model calibration [<50 years E/O (95% CI) = 0.96 (0.85–1.09); ≥50 years E/O (95% CI) = 0.94 (0.76–1.16)]. In discrimination, however, the AUC decreased slightly relative to the recalibrated model among women ages <50 years [AUC (95% CI) = 69.7% (66.7–72.6)] and those ages ≥50 years [AUC (95% CI) = 58.4% (52.9–63.8)]. For all models, miscalibration was most evident in the extreme risk deciles (Supplementary Table S4; Supplementary Fig. S3).

Table 1.

Discrimination and calibration for the breast cancer risk prediction models validated using the Korean Cancer Prevention Study-II Biobank.

| Age group | Model | AUC, % (95% CI) | E/O ratio (95% CI) |
|---|---|---|---|
| <50 years of age (233 cases, 57,206 noncases) | U.S.-based European-ancestry | 71.8 (68.8–74.8) | 1.124 (0.989–1.278) |
| | Recalibrated | 70.7 (67.7–73.7) | 0.894 (0.787–1.017) |
| | Korean-based | 69.7 (66.7–72.6) | 0.960 (0.845–1.091) |
| ≥50 years of age (87 cases, 18,680 noncases) | U.S.-based European-ancestry | 57.1 (51.2–62.9) | 2.472 (2.005–3.049) |
| | Recalibrated | 61.5 (56.2–66.9) | 1.018 (0.825–1.255) |
| | Korean-based | 58.4 (52.9–63.8) | 0.941 (0.763–1.161) |

Note: The AUCs reported in Table 1 are defined on the basis of predicted absolute risk and incorporate the variation due to age.

(A) The U.S.-based European-ancestry model, using incidence, mortality, and risk factor distributions among U.S. non-Hispanic white women and European-ancestry relative risk (RR) estimates; (B) a recalibrated model, using Korean incidence, mortality, and risk factor distributions but European-ancestry RR estimates; and (C) a fully Korean-based model using Korean incidence, mortality, and risk factor distributions and RR estimates from the Korean Cancer Prevention Study.

Abbreviations: CI, confidence interval; E, expected 5-year absolute risk; O, observed 5-year incidence.

Figure 2.

Absolute risk calibration of breast cancer risk prediction models in the KCPS-II among women less than 50 years of age. The risk categories are based on absolute risk. KCPS-II, Korean Cancer Prevention Study-II Biobank; HL, Hosmer-Lemeshow test statistic.

Figure 3.

Absolute risk calibration of breast cancer risk prediction models in the KCPS-II among women 50 years of age or older. The risk categories are based on absolute risk. KCPS-II, Korean Cancer Prevention Study-II Biobank; HL, Hosmer-Lemeshow test statistic.


In this study, we evaluated the performance of the breast cancer risk models originally developed for U.S. women to predict the 5-year breast cancer risk in a Korean population, directly and after recalibration to account for Korean age-specific incidence rates, risk factor distributions, and relative risks. To the best of our knowledge, our study is the first to assess these breast cancer risk models in an East Asian population.

The discrimination of the unrecalibrated USEA model and the two recalibrated models was similar for women <50 years (AUCs between 69.7% and 71.8%), but the recalibrated models performed better for women ≥50 years (AUC of 57.1% vs. 58.4% and 61.5%). The differences in AUCs indicate that the recalibration is changing the rank ordering of women according to their predicted risk. This reordering occurs because iCARE uses the distribution of risk factors in the population not only to define the baseline incidence rate, but also to estimate risks for women who are missing data on one or more risk factors. Our results suggest that using a reference distribution that better matches the target population can improve discrimination. In the case of the KCPS-II Biobank, missing data on factors that have very different distributions in the United States and Korea (e.g., age at menarche, parity, age at first birth, and alcohol intake) likely account for the improvement in discrimination between the USEA and recalibrated models.
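The reordering argument can be made concrete with a deliberately simplified sketch. It assumes that a woman with a missing factor is assigned the reference-population average RR for that factor; this illustrates the idea only and is not iCARE's actual procedure, which conditions on the observed factors. The function name, RR values, and prevalences are hypothetical.

```python
def rr_with_missing(observed_rr, factor_rrs, reference_prevalence):
    """RR for a woman whose status on one factor is unknown.

    The unknown factor contributes its average RR over the reference
    distribution (categories and weights in `reference_prevalence`).
    """
    avg = sum(reference_prevalence[c] * factor_rrs[c] for c in factor_rrs)
    return observed_rr * avg


# Hypothetical RR for ever vs. never use of a factor, and two hypothetical
# reference distributions with very different prevalences.
factor_rrs = {"never": 1.0, "ever": 1.3}
high_prevalence_ref = {"never": 0.70, "ever": 0.30}
low_prevalence_ref = {"never": 0.95, "ever": 0.05}

rr_high = rr_with_missing(1.0, factor_rrs, high_prevalence_ref)
rr_low = rr_with_missing(1.0, factor_rrs, low_prevalence_ref)
```

Women with complete data keep the same RR under either reference, while women with missing data are shifted by a reference-dependent amount, so the ranking of the two groups relative to each other can change.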

The USEA breast cancer risk model was well calibrated among Korean women <50 years but overestimated the risk for those ≥50 years. Further recalibrations of the model showed appreciably improved calibration, especially among older women. This underscores the general importance of recalibrating absolute risk models to reflect the age-specific incidence rates, distribution of risk factors, and relative risks in the target population.

Consistent with previous reports, we found lower breast cancer incidence in the KCPS-II Biobank relative to incidence among U.S. non-Hispanic whites (2, 18). Relative to U.S. non-Hispanic whites, women in the KCPS-II Biobank had a lower prevalence of risk factors such as early age at menarche, OC or HRT use, and high BMI.

The RRs for most of the risk factor categories differed modestly between Korean and European-ancestry women; the RR comparing extreme risk factor categories among Korean women was generally between 0.40 and 1.90 times that among European-ancestry women. The largest exception was BBD, for which the RR was about three times as large in Korean as in European-ancestry women (5.05 vs. 1.68). Moreover, Korean women ≥50 years had a larger effect of BBD than those <50, whereas the U.S. women had a similar effect between the age categories. This may reflect our definition of BBD in the KCPS, where the RRs were estimated: women with any ICD-10 code of D24 (“benign neoplasm of unspecified breast”) at baseline were considered to have a history of BBD. This is a more restrictive definition than was used in the European-ancestry studies (19; “atypical hyperplasia of the breast”) and may define a smaller, more homogeneous group of women at higher risk of breast cancer. We chose to use ICD-10 D24 to define BBD because we believe that this insurance claims code is more accurate than other codes capturing more heterogeneous forms of BBD.

Risk factor distributions in our study were consistent with distributions of primary risk factors for breast cancer observed in previous Korean studies: early menarche, late menopause (20, 21), later and fewer births (2), taller height (22), obesity (23), history of BBD (24), alcohol intake (25), family history (20), and OC use (26). A systematic review reported that HRT had no significant effect on breast cancer in Korean women (27).

The literature-based absolute risk model for European-ancestry women that we assessed here has recently been validated in the European-ancestry U.S. and UK populations, showing good calibration (10, 28). Risk prediction models used in one country need to be carefully considered before they are adopted and incorporated into guidelines of other countries. These considerations need to account for different disease epidemiology across populations. Indeed, when we applied the original iCARE model, which uses the incidence rates of breast cancer and competing all-cause mortality rates among U.S. non-Hispanic white women, the 5-year absolute risk was overpredicted among Korean women older than 50 (E/O = 2.472; Table 1). This might be due to a variation in age-specific breast cancer incidence between U.S. non-Hispanic whites and Koreans. Consistent with previous findings, we found that the incidence rate of breast cancer in Korean women increased up to age 50 and decreased thereafter; whereas the incidence rate in U.S. non-Hispanic whites rose with age (3–5).

The recalibrated KREA model, which applied Korean incidence and mortality data and the RR estimates from European-ancestry studies, showed markedly improved calibration among women older than 50 years (E/O = 1.02; Table 1). When the RR estimates from the Korean population were further incorporated, the E/O ratio remained close to 1 (E/O = 0.94), although the AUC decreased somewhat. The unexpected decrease in model discrimination may be chance fluctuation, or it could reflect relatively imprecise estimates of the Korean RRs from the KCPS. This highlights the importance of considering the bias-variance tradeoff when developing risk models for specific target populations. Estimates of RRs from the target population will be unbiased, but if the available sample sizes are small, those estimates may be highly variable and the resulting risk model may have poor out-of-sample performance. If RR estimates from large samples from a nontarget population are available, they may have relatively good performance in the target population, given their improved precision, provided the true RRs in the target and nontarget population are not too different. In this specific case, considering the large size of the KCPS and the small, likely chance differences in AUCs between the recalibrated models with European-ancestry and Korean RRs, we believe the fully recalibrated KRKR model using Korean RRs is most appropriate for Korean women.

The striking age incidence curve of breast cancer in Korea, rising into the mid-40s and then declining, has been consistently observed over the last few decades (18). Korea is experiencing an aging society, and there is a strong generational cohort effect in breast cancer occurrence in Korean women. It has been reported that reproductive factors such as early age at menarche, late age at menopause, delayed first pregnancy, and changes in breast feeding patterns are associated with the cohort effect of breast cancer incidence among Korean women (18). Another reason for the peak among middle-aged women may be the rapid increase in breast cancer screening in that age group, that is, higher screening rates among women in their 40s and 50s, which is compatible with the age-incidence curve findings (29). This may also be responsible for the larger effect of BBD observed among Korean women ≥50 years than those <50 in our study. According to previous projections, the breast cancer incidence in Korea will increase up to 100 per 100,000 women in the future and the incidence curve by age will be similar to the current curve observed in Western women (2).

Several breast cancer risk assessment tools have been proposed in Korea. Previous case-control studies attempted to identify high-risk groups using a breast cancer probability model with relevant risk factors (30–32). A model developed from a prospective cohort study in Korea with an 8-year follow-up was internally validated in the same source population (33). However, the model did not differentiate between premenopausal and postmenopausal women and included only three risk factors (age, age at menarche, and lactation). A more recent study developed a Korean risk prediction model for breast cancer by modifying the Breast Cancer Risk Assessment Tool (BCRAT) and validated it in two Korean cohorts, showing a better validity than that in the original BCRAT (34). Similar to our study, the study calculated the risks separately for two age groups (<50 and ≥50 years old) and included several reproductive factors and modifiable lifestyle habits such as OC use and BMI. However, the study could not assess model calibration by different levels of risk due to a small number of breast cancer cases.

Matsuno and colleagues developed the Asian American Breast Cancer Study (AABCS) model using ethnicity-specific data to estimate absolute risks for Asian and Pacific Islander American women and found that, for Chinese and Filipino women, projections of absolute risk were lower in the AABCS model compared with the BCRAT that uses data from white women (35). However, because the AABCS model is designed for American women, it may not be generalizable to women in Asian countries who have historically had lower breast cancer risk than Asian women in the United States or European countries (36).

The limitations of our study include few breast cancer cases in the validation cohort, especially for women who are older than 50 years. After a few more years of follow-up, we expect to obtain a larger number of events as the women in the cohort become older. We also acknowledge that the RR model might differ between populations because the distributions of breast cancer subtypes differ. For example, a recent study found higher proportions of estrogen receptor–positive breast cancer at a younger age among Asian women compared with non-Hispanic white women, which was not considered in this study (37). Another limitation was the use of simulated data for unmeasured risk factors in the KNHANES, which may provide different results from using actual data. Estimates of the Korea-specific RRs from the KCPS may be inaccurate for some risk factors due to a high proportion of missing data. Finally, the risk models we have adapted here do not include several important risk factors, which may have led to diminished discriminatory accuracy. Of particular importance here, the model does not include history of breastfeeding, which has been found to be the strongest protective factor in Korean women (38), whereas in European-ancestry women, the protective effect is relatively small (39, 40). Model fit might also be improved by tailoring categories for available data to the Korean population, for example, using Asian-specific World Health Organization cutoffs for overweight and obesity (41). The relative risk models also do not include interactions among the risk factors, for example, between BMI and hormone therapy (9, 42–44). Further research should consider more comprehensive models including history of breastfeeding and other risk factors, such as breast density, bone mineral density, and genetic markers and biomarkers. One advantage of the iCARE package is that it allows incorporation of a polygenic risk score derived from single-nucleotide polymorphisms. In our future research, we plan to evaluate whether and how genetic information improves the performance of the breast cancer risk prediction models in the Korean population.

We included modifiable risk factors such as parity, age at first birth, OC use, HRT use, BMI, and alcohol intake in the breast cancer risk models, allowing policy makers to quantify risk reduction after risk factor modification and encouraging the general population to modify behaviors. Moreover, we evaluated model calibration stratified by levels of risk, which can be useful for risk-based prevention and screening by classifying subjects at the extremes of risk. The success of recalibration of existing breast cancer risk models in this Korean cohort suggests that recalibration could be of great value for assessment of breast cancer risk in other countries.

In conclusion, although the original USEA breast cancer risk model, which uses incidence rates and risk factor distributions from U.S. non-Hispanic women and European-ancestry relative risks, showed relatively good discrimination and calibration among Korean women younger than 50 years, it had lower discrimination and poor calibration among Korean women older than 50 years. Recalibrated models using Korean breast cancer incidence rates and RRs had good discrimination and improved calibration. The data from this study will provide valuable information to plan and evaluate actions against breast cancer focused on primary prevention and early detection in Korean women. Future work to improve model discrimination should incorporate additional risk factors, including history of breastfeeding, genetic risk markers, and breast density.

No potential conflicts of interest were disclosed.

Conception and design: Y.H. Jee, C. Gao, S.H. Jee, P. Kraft

Development of methodology: Y.H. Jee, C. Gao, S.H. Jee, P. Kraft

Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): S.H. Jee

Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): Y.H. Jee, C. Gao, J. Kim, S. Park, S.H. Jee, P. Kraft

Writing, review, and/or revision of the manuscript: Y.H. Jee, C. Gao, J. Kim, S. Park, S.H. Jee, P. Kraft

Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): Y.H. Jee, S.H. Jee, P. Kraft

Study supervision: S.H. Jee, P. Kraft

This work was supported by a grant from the NCI (P30CA00651654; to P. Kraft).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Kweon SS. Updates on cancer epidemiology in Korea, 2018. Chonnam Med J 2018;54:90–100.
2. Park SK, Kim Y, Kang D, Jung EJ, Yoo KY. Risk factors and control strategies for the rapidly rising rate of breast cancer in Korea. J Breast Cancer 2011;14:79–87.
3. Leong SP, Shen ZZ, Liu TJ, Agarwal G, Tajima T, Paik NS, et al. Is breast cancer the same disease in Asian and Western countries? World J Surg 2010;34:2308–24.
4. Ko BS, Noh WC, Kang SS, Park BW, Kang EY, Paik NS, et al. Changing patterns in the clinical characteristics of Korean breast cancer from 1996-2010 using an online nationwide breast cancer database. J Breast Cancer 2012;15:393–400.
5. Korean Breast Cancer Society. Early screening of breast cancer in Korea. J Korean Breast Cancer Soc 2002;5:225–34.
6. Ray KM, Joe BN, Freimanis RI, Sickles EA, Hendrick RE. Screening mammography in women 40–49 years old: current evidence. AJR Am J Roentgenol 2018;210:264–70.
7. Puschel K, G, Soto G, Gonzalez K, Martinez J, Holte S, et al. Strategies for increasing mammography screening in primary care in Chile: results of a randomized clinical trial. Cancer Epidemiol Biomarkers Prev 2010;19:2254–61.
8. Cintolo-Gonzalez JA, Braun D, Blackford AL, Mazzola E, Acar A, Plichta JK, et al. Breast cancer risk models: a comprehensive overview of existing models, validation, and clinical applications. Breast Cancer Res Treat 2017;164:263–84.
9. Garcia-Closas M, Gunsoy NB, Chatterjee N. Combined associations of genetic and environmental risk factors: implications for prevention of breast cancer. J Natl Cancer Inst 2014;106. pii: dju305.
10. Choudhury PP, Wilcox AN, Brook MN, Zhang Y, Ahearn T, Orr N, et al. Comparative validation of breast cancer risk prediction models and projections for future risk stratification. J Natl Cancer Inst 2020;112:278–85.
11. Choudhury PP, Maas P, Wilcox A, Wheeler W, Brook M, Check D, et al. iCARE: R package to build, validate and apply absolute risk models. PLoS One 2020;15:e0228198.
12. Jee YH, Emberson J, Jung KJ, Lee SJ, Lee S, Back JH, et al. Cohort profile: The Korean Cancer Prevention Study-II (KCPS-II) biobank. Int J Epidemiol 2018;47:385–6f.
13. Jee SH, Sull JW, Park J, Lee S-Y, Ohrr H, Guallar E, et al. Body-mass index and mortality in Korean men and women. N Engl J Med 2006;355:779–87.
14. Neuhouser ML, Aragaki AK, Prentice RL, Manson JE, Chlebowski R, Carty CL, et al. Overweight, obesity, and postmenopausal invasive breast cancer risk: a secondary analysis of the women's health initiative randomized clinical trials. JAMA Oncol 2015;1:611–21.
15. Beral V. Breast cancer and hormone-replacement therapy in the million women study. Lancet 2003;362:419–27.
16. Gail MH, Brinton LA, Byar DP, Corle DK, Green SB, Schairer C, et al. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst 1989;81:1879–86.
17. Tyrer J, Duffy SW, Cuzick J. A breast cancer prediction model incorporating familial and personal risk factors. Stat Med 2004;23:1111–30.
18. Choi Y, Kim Y, Park SK, Shin HR, Yoo KY. Age-period-cohort analysis of female breast cancer mortality in Korea. Cancer Res Treat 2016;48:11–9.
19. Hartmann LC, Sellers TA, Frost MH, Lingle WL, Degnim AC, Ghosh K, et al. Benign breast disease and the risk of breast cancer. N Engl J Med 2005;353:229–37.
20. Yoo KY, Kang D, Park SK, Kim SU, Kim SU, Shin A, et al. Epidemiology of breast cancer in Korea: occurrence, high-risk groups, and prevention. J Korean Med Sci 2002;17:1–6.
21. Shin A, Song YM, Yoo KY, Sung J. Menstrual factors and cancer risk among Korean women. Int J Epidemiol 2011;40:1261–8.
22. Choi YJ, Lee DH, Han KD, Yoon H, Shin CM, Park YS, et al. Adult height in relation to risk of cancer in a cohort of 22,809,722 Korean adults. Br J Cancer 2019;120:668–74.
23. Jung D, Lee S-M. BMI and breast cancer in Korean women: a meta-analysis. Asian Nurs Res (Korean Soc Nurs Sci) 2009;3:31–40.
24. Park SK, Yoo KY, Kang D, Kim SU, Lee SY, Im HJ, et al. A case-control study on risk factors of benign breast disorders in Korea. Epidemiol Health 2000;22:11–9.
25. Choi YJ, Myung SK, Lee JH. Light alcohol drinking and risk of cancer: a meta-analysis of cohort studies. Cancer Res Treat 2018;50:474–87.
26. Choi B-R, Kwon M-H, Bang M-R. Oral contraceptive use and breast cancer in Korean women. Korean J Health Serv Manag 2014;8:221–9.
27. Bae JM, Kim EH. Hormone replacement therapy and risk of breast cancer in Korean women: a quantitative systematic review. J Prev Med Public Health 2015;48:225–30.
28. Gao C, Choudhury PP, Maas P, Tamimi R, Eliassen H, Chatterjee N, et al. Validation of breast cancer risk prediction model using nurses health and nurse health II studies [abstract]. In: Proceedings of the Special Conference: Improving Cancer Risk Prediction for Prevention and Early Detection; Nov 16–19, 2016. AACR; 2017. Abstract nr A05.
29. Lee JH, Yim SH, Won YJ, Jung KW, Son BH, Lee HD, et al. Population-based breast cancer statistics in Korea during 1993–2002: incidence, mortality, and survival. J Korean Med Sci 2007;22:S11–6.
30. Park SK, Yoo KY, Kang DH, Ahn SH, Noh DY, Choe KJ. The estimation of breast cancer disease-probability by difference of individual susceptibility. Cancer Res Treat 2003;35:35–51.
31. Lee CY, Ko IS, Kim HS, Lee WH, Chang SB, Min JS, et al. Development and validation study of the breast cancer risk appraisal for Korean women. Nurs Health Sci 2004;6:201–7.
32. Lee EO, Ahn SH, You C, Lee DS, Han W, Choe KJ, et al. Determining the main risk factors and high-risk groups of breast cancer using a predictive model for breast cancer risk assessment in South Korea. Cancer Nurs 2004;27:400–6.
33. Jee SH, Song JW, Nam CM. Development of the individualized health risk appraisal model of breast cancer risk in Korean women. Epidemiol Health 2004;26:50–8.
34. Park B, Ma SH, Shin A, Chang MC, Choi JY, Kim S, et al. Korean risk assessment model for breast cancer risk prediction. PLoS One 2013;8:e76736.
35. Matsuno RK, Costantino JP, Ziegler RG, Anderson GL, Li H, Pee D, et al. Projecting individualized absolute invasive breast cancer risk in Asian and Pacific Islander American women. J Natl Cancer Inst 2011;103:951–61.
36. Stanford JL, Herrinton LJ, Schwartz SM, Weiss NS. Breast cancer incidence in Asian migrants to the United States and their descendants. Epidemiology 1995;6:181–3.
37. Lin CH, Yap YS, Lee KH, Im SA, Naito Y, Yeo W, et al. Contrasting epidemiology and clinicopathology of female breast cancer in Asians vs the US population. J Natl Cancer Inst 2019;111:1298–306.
38. Kim Y, Choi JY, Lee KM, Park SK, Ahn SH, Noh DY, et al. Dose-dependent protective effect of breast-feeding against breast cancer among ever-lactated women in Korea. Eur J Cancer Prev 2007;16:124–9.
39. Key TJ, Verkasalo PK, Banks E. Epidemiology of breast cancer. Lancet Oncol 2001;2:133–40.
40. Kvåle G, Heuch I. Menstrual factors and breast cancer risk. Cancer 1988;62:1625–31.
41. WHO Expert Consultation. Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies. Lancet 2004;363:157–63.
42. Sandvei MS, Vatten LJ, Bjelland EK, Eskild A, Hofvind S, Ursin G, et al. Menopausal hormone therapy and breast cancer risk: effect modification by body mass through life. Eur J Epidemiol 2019;34:267–78.
43. Chlebowski RT, Anderson GL, Aragaki AK, Prentice R. Breast cancer and menopausal hormone therapy by race/ethnicity and body mass index. J Natl Cancer Inst 2016;108. pii: djv327.
44. Hou N, Hong S, Wang W, OI, Dignam JJ, Huo D. Hormone replacement therapy and breast cancer: heterogeneous risks by race, weight, and breast density. J Natl Cancer Inst 2013;105:1365–72.