Abstract
Mammographic percentage of volumetric density is an important risk factor for breast cancer. Epidemiologic studies historically used film images, often limited to craniocaudal (CC) views, to estimate area-based breast density. More recent studies using digital mammography images typically use the density averaged between craniocaudal (CC) and mediolateral oblique (MLO) views for 5- and 10-year risk prediction. The performance of using either view alone versus both views has not been well investigated. We use 3,804 full-field digital mammograms from the Joanne Knight Breast Health Cohort (294 incident cases and 657 controls) to quantify the association between volumetric percentage of density extracted from either or both mammography views and breast cancer risk, and to assess the 5- and 10-year breast cancer risk prediction performance. Our results show that percent volumetric density from the CC view, the MLO view, and the average of the two retain essentially the same association with breast cancer risk. The 5- and 10-year risk predictions also show similar accuracy. Thus, one view is sufficient to assess association and predict future risk of breast cancer over a 5- or 10-year interval.
Expanding use of digital mammography and repeated screening provides opportunities for risk assessment. To use these images for risk estimates and guide risk management in real time requires efficient processing. Evaluating the contribution of different views to prediction performance can guide future applications for risk management in routine care.
Introduction
Mammographic breast density is well established as a risk factor for breast cancer. The classic studies used film mammograms and identified density as a risk factor. In 2005, Boyd wrote a classic article making the case for density as an intermediate phenotype (1). Since then it has been used as an endpoint for prevention trials (2), consistent with it being accepted as an intermediate phenotype. Although assessment of breast density in the film-screen era was qualitative and based on the radiologist's subjective evaluation of the presence of dense glandular tissue, widespread use of digital mammography created the potential for quantitative assessment of breast density. Older methods were area-based, using two-dimensional (2D) film images; these have advanced to volumetric assessment with the advent of digital images. The association of breast density with breast cancer risk is consistent across a range of methods used to estimate density as percent density (amount of dense glandular tissue compared with total breast tissue; refs. 3, 4). More specifically, the percentage of volumetric density (often %VD) and its association with breast cancer risk have been widely studied in the literature (3, 5–7). Volumetric density is an established risk factor and is incorporated into existing risk assessment models, including Tyrer–Cuzick version 8 (8). Furthermore, a continuous measure of breast density shows low concordance with the 77-SNP polygenic breast cancer risk score (Spearman r = 0.024) and also with a summary breast cancer–risk factor score derived from questionnaire risk factor data (Spearman r = 0.054; ref. 9). As risk models are implemented in broader clinical use, breast density is commonly included (10, 11).
However, debate in the epidemiologic literature still asks if the view used to assess breast density changes the magnitude of association or model performance for risk prediction (12). We note that approaches to process images and automatically remove pectoral muscles are described as more accurate for craniocaudal (CC) views than those for mediolateral oblique (MLO) mammograms (12). Previous meta-analysis of largely film-based studies suggests that the CC view shows a stronger association with future breast cancer risk than the MLO (13). With the advent of digital mammography, easier access to images has facilitated extension from consideration of density to define or mask lesions to now use the CC and MLO views for 5- and 10-year risk prediction (14, 15) consistent with NCCN guidelines (16) and providing sufficient interval for intervention to reduce risk (16).
To quantify the benefits of multiple views of breast density for predicting risk of future breast cancer, as opposed to diagnosis, we draw on prospective data from the Joanne Knight Breast Health Cohort (JKBHC; ref. 17). We focus on the longer-term 5- or 10-year risk in this article, in contrast with diagnosis of breast lesions, where more views (including tomosynthesis) may generate more precise identification of suspicious lesions (18). Few studies have investigated this issue thoroughly in the context of digital mammograms (4, 19, 20). We aim to investigate the association and risk prediction of CC, MLO, and the average of both in this study using 3,804 full-field digital mammograms from the JKBHC.
Materials and Methods
Study population
The JKBHC comprises over 10,000 women ages 30 to 84 undergoing repeated mammography screening at Siteman Cancer Center and followed since 2010 (17). All women had a baseline mammogram at entry and completed a risk factor questionnaire. Mammograms are obtained using the same technology (Hologic). Women were excluded from the cohort if they had a history of cancer at baseline (other than nonmelanoma skin cancer) or had breast implants. Follow-up through October 2020 was maintained through record linkages to electronic health records and pathology registries. Eighty percent of participants had a medical center visit (mammography and other health visits) within the past 2 years. All analyses performed in this study use the nested case–control cohort within JKBHC, in which each pathology-confirmed breast cancer case was matched to two controls sampled from the prospective cohort based on age at entry to the cohort and year of enrollment, as previously described (17, 21). We identified 347 cases and 694 controls. After linkage to screening mammogram files, we excluded 8 women with breast implants and those whose screening mammograms could not be retrieved, retaining 294 cases diagnosed through October 2020 and 657 controls. Supplementary Fig. S1 shows the cohort selection and case identification. All women had 4 mammogram images, and both the CC and MLO views are used in this study, giving 3,804 images in total. We compared the baseline characteristics of cases who did not have mammogram images retrieved (53) against those cases who did (294) and observed comparable risk factor distribution, age (56.6 vs. 57.5), and breast density categories defined by BI-RADS clinical reports (48.8% vs. 49.0%).
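The 2:1 matching described above can be illustrated with a minimal sketch. It assumes exact matching on integer age at entry and enrollment year and simple random sampling without replacement; the field names (`id`, `age_at_entry`, `enroll_year`) and the function name are hypothetical, and the cohort's actual sampling procedure (described in refs. 17, 21) may differ, for example by using age bands or risk-set sampling.

```python
import random
from collections import defaultdict


def sample_matched_controls(cases, candidates, n_controls=2, seed=0):
    """Draw n_controls candidates per case sharing (age_at_entry, enroll_year).

    Illustrative only: exact matching and the record layout are assumptions,
    not the cohort's documented procedure.
    """
    rng = random.Random(seed)
    # Group candidate controls by the two matching variables.
    pool = defaultdict(list)
    for woman in candidates:
        pool[(woman["age_at_entry"], woman["enroll_year"])].append(woman)

    matched = {}
    for case in cases:
        available = pool[(case["age_at_entry"], case["enroll_year"])]
        if len(available) < n_controls:
            raise ValueError(f"not enough controls for case {case['id']}")
        chosen = rng.sample(available, n_controls)
        for control in chosen:
            available.remove(control)  # each control is used at most once
        matched[case["id"]] = [control["id"] for control in chosen]
    return matched
```

With 347 cases this scheme would yield the 694 controls reported above, before exclusions for implants and unretrieved images.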
Ethical approval for this prospective nested case–control cohort study was obtained from the Washington University in St Louis institutional review board. Participants provided informed written consent under the US Common Rule.
Breast cancer–risk factor collection
Women self-reported breast cancer–risk factors on entry to the cohort. These are drawn from established and validated measures (22). The questionnaire at entry assessed height, current weight, parity, age at first birth, menses ceased (yes/no), age at menopause (natural or with surgical removal of uterus, with removal of ovaries or without removal of ovaries), age at hysterectomy, family history of breast cancer (mother and/or sister), personal history of benign breast biopsy, and race.
Volumetric mammographic breast density assessment
As previously described (21), the volumetric percentage of breast density (VPD) within each digital mammogram is estimated with an automated pixel-thresholding algorithm developed and implemented at Washington University on processed images that directly takes in the full digital mammograms. The skin around the breast in CC and MLO views is removed and the pectoral muscle in MLO views is automatically removed using the boundary detection algorithm before estimating the dense volume. The VPD is then estimated using the volume of dense glandular tissue divided by the total breast volume that normalizes the difference in breast size across women and is consistent with other density estimation methods in the literature (23, 24). The correlation between VPD generated from our algorithm with the commercially available and widely used Volpara 4th edition (Volpara Solutions, Matakina Technology Limited, Wellington, New Zealand), is 0.81 based on an out-of-sample study with 375 women recruited from the mammography service at Washington University with mean age 47 (sd = 4.8; Supplementary Fig. S2; refs. 25, 26). Volpara reports the percentage of volumetric density that corresponds to BI-RADS as: A, <3.5%, Fatty; B, 3.5% to <7.5%, Scattered; C, 7.5% to <15.5%, Heterogeneously dense; D, 15.5% or more, Extremely dense. We refer to the CC view VPD as CC-VPD, MLO view VPD as MLO-VPD, and the average of the two as Full-VPD from here on.
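As a minimal sketch of the quantities defined above, the snippet below computes VPD as dense volume over total breast volume, averages the two views into Full-VPD, and maps a percentage to the Volpara BI-RADS bands quoted in the text. The function names are hypothetical, and applying Volpara's cut points to VPD from a different algorithm is an assumption made only for illustration.

```python
from bisect import bisect_right

# Volpara cut points (percent) as quoted in the text:
# A <3.5, B 3.5 to <7.5, C 7.5 to <15.5, D 15.5 or more.
CUTS = [3.5, 7.5, 15.5]
LABELS = [
    "A (Fatty)",
    "B (Scattered)",
    "C (Heterogeneously dense)",
    "D (Extremely dense)",
]


def vpd(dense_volume, total_volume):
    """Volumetric percent density: dense volume / total breast volume, in percent."""
    if total_volume <= 0:
        raise ValueError("total_volume must be positive")
    return 100.0 * dense_volume / total_volume


def full_vpd(cc_vpd, mlo_vpd):
    """Average of the CC-view and MLO-view estimates (Full-VPD in the text)."""
    return (cc_vpd + mlo_vpd) / 2.0


def birads_category(vpd_percent):
    """Map a VPD percentage to the Volpara-style BI-RADS band."""
    # bisect_right places a value equal to a cut point in the higher band,
    # matching the "3.5% to <7.5%" style of the quoted ranges.
    return LABELS[bisect_right(CUTS, vpd_percent)]
```

For example, a breast with 50 units of dense tissue in a total volume of 1,000 units has a VPD of 5%, which falls in band B.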
Statistical analysis
We first assessed the Pearson correlation between the VPDs using different views. To estimate the 5- and 10-year breast cancer risk, we adopted the Cox proportional hazards model adjusting for baseline age in single years, BMI (continuous), family history of breast cancer (mother and/or sister), personal history of biopsy-confirmed benign breast disease, parity (1+ vs. 0), and menopausal status (menses ceased; yes/no). This approach is consistent with routine practice, as the log-likelihood for a conditional logistic regression model is statistically proven to be equivalent to the log-likelihood from a Cox model (27–29). We fitted three Cox regressions, using CC-VPD, MLO-VPD, and Full-VPD in turn on their original percent scale, to investigate the model performance using different views.
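For reference, the standard form of the Cox model with the covariates listed above can be written as follows. The subscripts and abbreviations (FH for family history, BBD for benign breast disease) are notation introduced here for illustration, and VPD stands for whichever of CC-VPD, MLO-VPD, or Full-VPD enters a given model.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\bigl(
  \beta_{\text{age}}\,\text{age} + \beta_{\text{BMI}}\,\text{BMI}
  + \beta_{\text{FH}}\,\text{FH} + \beta_{\text{BBD}}\,\text{BBD}
  + \beta_{\text{par}}\,\text{parity} + \beta_{\text{men}}\,\text{menopause}
  + \beta_{\text{VPD}}\,\text{VPD} \bigr),
\qquad
\text{HR}_{\text{VPD}} = e^{\beta_{\text{VPD}}}
```

Here $h_0(t)$ is the unspecified baseline hazard, and $\text{HR}_{\text{VPD}}$ is the hazard ratio per one percentage point increase in VPD, holding the other covariates fixed.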
The proportional hazards assumption was formally checked using statistical tests and graphical diagnostics based on the scaled Schoenfeld residuals. The linearity assumption was checked using the Martingale residuals. We report the hazard ratios (HR), the 95% confidence intervals (CI), and P values under all three Cox regressions. The 5- and 10-year prediction accuracy is assessed by the AUC using Uno's estimator (30), which accounts for censoring, under 10-fold cross-validation.
Data availability
All datasets were accessed and used under IRB-approved protocols and are available from the corresponding author upon reasonable request. The density estimation pipeline developed at Washington University is available upon request.
Results
The participant characteristics by case and control status are presented in Table 1. Among controls, the mean age was 56.6, mean BMI was 27.4 kg/m2, 81.1% were white and 12.9% were black, 82.8% of women reported having 1 or more children, and 68.8% were postmenopausal.
Risk factors | Cases (n = 294) | | Controls (n = 657) | |
---|---|---|---|---|
 | Mean (sd) | Median (Range) | Mean (sd) | Median (Range) |
Age, y | 56.56 (8.75) | 57.19 (35.38–76.80) | 56.62 (8.67) | 56.68 (35.27–76.84) |
BMI (kg/m2) | 29.25 (6.37) | 28.45 (17.47–54.54) | 27.38 (6.19) | 25.82 (14.14–50.96) |
CC-VPD (%) | 6.31 (3.46) | 5.59 (1.82–19.16) | 6.19 (3.89) | 5.22 (1.66–24.46) |
MLO-VPD (%) | 6.78 (2.81) | 6.41 (2.01–18.53) | 6.52 (3.19) | 5.77 (1.91–22.54) |
Full-VPD (%) | 6.48 (2.78) | 6.20 (1.92–16.44) | 6.28 (3.23) | 5.45 (1.87–20.27) |
 | n (%) | | n (%) | |
Family history of breast cancer | 92 (31.30%) | | 156 (23.74%) | |
Race | | | | |
White | 232 (78.91%) | | 533 (81.13%) | |
Black | 57 (19.39%) | | 85 (12.93%) | |
Others | 5 (1.70%) | | 39 (5.94%) | |
Biopsy confirmed benign breast disease | 88 (29.93%) | | 179 (27.25%) | |
Parity (1+) | 232 (78.91%) | | 544 (82.80%) | |
Postmenopausal | 199 (67.69%) | | 452 (68.80%) | |
Diagnosis (years since entry mammogram) | | | | |
0 ≤ 5 | 104 (35.4%) | | | |
5 ≤ 10 | 176 (60.0%) | | | |
10+ | 14 (4.76%) | | — | |
Note: Continuous covariates are reported with mean and standard deviation (sd); binary covariates are reported by the number of positive responses and their corresponding percentage.
As shown in Fig. 1, the CC-VPD and MLO-VPD retain a significant positive linear relationship with each other (Pearson correlation of 0.70). The correlation between Full-VPD averaged between both views and CC-VPD is 0.93, and for MLO-VPD is 0.90. The correlation between BMI and CC-VPD is −0.48 and between age and CC-VPD is −0.22 where similar correlations are seen under both MLO- and Full-VPD. Additional correlation plots by case and control status are included in Supplementary Fig. S3.
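The high correlation between Full-VPD and each single view follows largely from the averaging itself. The sketch below uses the standard identity for the correlation between a variable and its average with another variable. Plugging in the control-group standard deviations from Table 1 together with the overall correlation of 0.70 is an approximation made for illustration, since the reported correlations come from the full sample; the function name is hypothetical.

```python
import math


def corr_with_average(sd_x, sd_y, r):
    """Pearson correlation between X and (X + Y) / 2.

    Uses cov(X, (X+Y)/2) = (sd_x**2 + r*sd_x*sd_y) / 2 and
    sd((X+Y)/2) = sqrt(sd_x**2 + sd_y**2 + 2*r*sd_x*sd_y) / 2.
    """
    return (sd_x + r * sd_y) / math.sqrt(sd_x**2 + sd_y**2 + 2 * r * sd_x * sd_y)


# Control-group SDs from Table 1 (CC 3.89, MLO 3.19) with the overall r = 0.70.
cc_vs_full = corr_with_average(3.89, 3.19, 0.70)
mlo_vs_full = corr_with_average(3.19, 3.89, 0.70)
print(round(cc_vs_full, 2), round(mlo_vs_full, 2))
```

Under these assumptions the identity gives roughly 0.94 for CC-VPD and 0.91 for MLO-VPD, close to the observed 0.93 and 0.90, so the averaged measure carries little information beyond either view alone.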
In Table 2, we summarize the associations between baseline risk factors and breast cancer risk using CC-, MLO-, and Full-VPD. We observe that each year increase in baseline age is associated with an increase of 2.3% in breast cancer risk and each unit increase in BMI (kg/m2) with a 5.0% increase in breast cancer risk in all three models. Women with a family history of breast cancer are at significantly higher risk of breast cancer than those without. In all three models, the VPD measure was statistically significant, and an increase in VPD is positively associated with breast cancer risk. Subgroup analyses by menopausal status are shown in Supplementary Tables S1–S6. Results were largely unchanged in postmenopausal women, though family history had a somewhat stronger association in premenopausal women.
Risk factorsᵃ | Measure of mammographic breast density used in breast cancer incidence model | | | | | |
---|---|---|---|---|---|---|
 | CC-VPD | | MLO-VPD | | Full-VPD | |
Model | HR (95% CI) | P | HR (95% CI) | P | HR (95% CI) | P |
Age | 1.02 (1.00–1.05) | 0.04 | 1.02 (1.00–1.05) | 0.07 | 1.02 (1.00–1.05) | 0.04 |
BMI (kg/m2) | 1.05 (1.03–1.08) | <0.01 | 1.05 (1.03–1.07) | <0.01 | 1.05 (1.03–1.07) | <0.01 |
Family history (yes vs. no) | 1.43 (1.11–1.83) | <0.01 | 1.44 (1.12–1.84) | <0.01 | 1.42 (1.11–1.83) | <0.01 |
Biopsy confirmed benign breast disease | 1.01 (0.78–1.32) | 0.92 | 1.04 (0.80–1.35) | 0.76 | 1.02 (0.78–1.32) | 0.91 |
Parity (1+ vs. 0) | 0.96 (0.72–1.28) | 0.78 | 1.01 (0.76–1.35) | 0.92 | 0.99 (0.74–1.31) | 0.92 |
Menopause (post vs. pre) | 0.67 (0.45–1.01) | 0.05 | 0.68 (0.46–1.03) | 0.07 | 0.68 (0.45–1.02) | 0.06 |
CC-VPD (per 1% increase) | 1.05 (1.01–1.09) | 0.01 | — | — | — | — |
MLO-VPD (per 1% increase) | — | — | 1.05 (1.01–1.09) | 0.02 | — | — |
Full-VPD (per 1% increase) | — | — | — | — | 1.06 (1.02–1.11)ᵃ | <0.01 |
Note: Findings from 294 women diagnosed with breast cancer and 657 controls from a cohort of 10,481 women.
ᵃRisk factors entered as continuous in the model except for family history, history of benign breast biopsy, parity, and menopause, which are binary.
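To read the per-unit hazard ratios in Table 2 on other scales, a small sketch may help. It assumes the log-linear (multiplicative) effect implied by the Cox model and works from the rounded values printed in the table, so the results are approximate; the function names are hypothetical.

```python
import math


def rescale_hr(hr_per_unit, units):
    """Hazard ratio for an increase of `units`, assuming a log-linear effect."""
    return math.exp(units * math.log(hr_per_unit))


def percent_change(hr):
    """Percent change in hazard implied by a hazard ratio."""
    return (hr - 1.0) * 100.0


# CC-VPD: HR 1.05 per one percentage point (rounded value from Table 2).
hr_5_points = rescale_hr(1.05, 5)
print(round(hr_5_points, 2), round(percent_change(hr_5_points), 1))
```

Under these assumptions, a woman whose CC-VPD is five percentage points higher has a hazard about 1.28 times that of an otherwise similar woman, or roughly 28% higher.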
Fig. 2 shows the 5- and 10-year breast cancer prediction performance using the three VPD measures. To avoid over-optimism, the reported AUCs are based on Uno's integrated 5-year AUC estimated by averaging over a 10-fold cross-validation. As a benchmark, the conventional Full-VPD achieved an AUC of 0.636 (sd = 0.023). When we separately estimated the 5-year AUC using CC-VPD and MLO-VPD, we observed AUCs of 0.641 (sd = 0.024) and 0.622 (sd = 0.025), respectively. For 10-year risk, we observed an AUC of 0.591 (sd = 0.062) for CC-VPD, 0.595 (sd = 0.057) for MLO-VPD, and 0.605 (sd = 0.058) for Full-VPD. Because these AUCs are not statistically different from each other at either 5 or 10 years, using one view (either CC- or MLO-VPD) can be as effective as using both. The linearity assumption checked by the Martingale residuals was deemed reasonable in this study (Supplementary Fig. S4), and the proportional hazards assumption was satisfactory under the models with CC, MLO, and the average of both (Supplementary Figs. S5–S7).
Discussion
In this study, which used data from 3,804 full-field digital mammography scans obtained from the JKBHC, we demonstrated that density from the CC view, the MLO view, and the average of both views exhibits comparable associations with breast cancer risk. In addition, we found that using only one view (either CC or MLO) performs just as effectively as combining both views when predicting the risk of breast cancer over a 5- or 10-year period. A prior meta-analysis of 13 case–control studies using older film-based mammography showed that absolute dense area and the percentage of dense area in the CC view had stronger associations with breast cancer (percentage of dense area summary OR 1.59 per SD; 95% CI, 1.46–1.69) than the MLO view measures (percentage of dense area summary OR 1.40 per SD; 95% CI, 1.28–1.54). Twelve of the 13 studies used Cumulus to estimate measures of breast density (13).
Combined cross-sectional data from 22 countries show that measures and changes of breast density with age are consistent across a diverse set of women worldwide, suggesting that breast density is an intrinsic biologic feature of women and that change across the life course is an inherent biologic feature (31). With the goal of incorporating breast density into risk prediction models (10, 32, 33) to improve risk management (16), efficient automated measures have become increasingly important. Machine-derived measures of breast density remove non-constant reader-specific differences (4). Although machine-derived approaches have become standard in clinical practice, their value for prediction of future risk is less well studied. Others have compared the performance of automated measures against clinical BI-RADS classification, showing comparable association with breast cancer (34–36). A number of studies compare density estimates from different machines/technologies and show no important variation (4, 7, 19). Furthermore, density estimated with the approach reported here is related to breast cancer risk in prospective data, and over up to 10 years density declines more slowly in the breast that develops cancer than in women who remain free from breast cancer (21).
We consider strengths and limitations of this study. Our population is diverse and all measures are consistent with routine screening services. Images are all obtained from a single system and were processed over a week for this comparison within the images. On the other hand, performance over 10 or more years for risk prediction may vary in ways not yet detected. We note that VPD is estimated from 2D mammography. Thus, the estimated VPD is a surrogate for a true three-dimensional measurement. This approach is consistent with Volpara and Libra for estimating the VPD (37). Nevertheless, VPD estimated this way routinely shows strong association with breast cancer risk (3).
As others evaluate the implementation of digital image assessment for risk into routine clinical practice to develop risk classification that can guide risk management and maximize population health (10, 38), issues of efficiency become paramount for total population coverage (10). Further research is needed to determine optimal screening frequency and risk management strategies and how best to use images in this context.
Conclusion
We show that VPD from the CC view, the MLO view, and the average of the two retain similar associations with breast cancer risk, and that using any single mammography view performs as well as combining CC and MLO views when predicting future risk of breast cancer over 5- and 10-year intervals.
Authors' Disclosures
R.M. Tamimi reports grants from NIH/NCI during the conduct of the study. G.A. Colditz reports a patent for ADAPT: Automated Volumetric Mammographic Density Assessment by Pixel Thresholding for Individual Breasts pending. S. Jiang reports a patent for ADAPT: Automated Volumetric Mammographic Density Assessment by Pixel Thresholding for Individual Breasts pending. No disclosures were reported by the other authors.
Authors' Contributions
S. Chen: Data curation, software, formal analysis, writing–original draft, writing–review and editing. R.M. Tamimi: Investigation, writing–review and editing. G.A. Colditz: Resources, data curation, supervision, funding acquisition, investigation, methodology, project administration, writing–review and editing. S. Jiang: Conceptualization, resources, data curation, software, formal analysis, supervision, funding acquisition, validation, investigation, methodology, project administration, writing–review and editing.
Acknowledgments
This work was supported by Breast Cancer Research Foundation grant number (BCRF 21–028; to G.A. Colditz), and in part by NCI (R37 CA256810; to S. Jiang).
The publication costs of this article were defrayed in part by the payment of publication fees. Therefore, and solely to indicate this fact, this article is hereby marked “advertisement” in accordance with 18 USC section 1734.
Note: Supplementary data for this article are available at Cancer Prevention Research Online (http://cancerprevres.aacrjournals.org/).