Mammographic density is a strong risk factor for breast cancer and is reported clinically as part of Breast Imaging Reporting and Data System (BI-RADS) results issued by radiologists. Automated assessment of density is needed that can be used for both full-field digital mammography (FFDM) and digital breast tomosynthesis (DBT), as both types of exams are acquired in standard clinical practice. We trained a deep learning model to automate the estimation of BI-RADS density using a prospective Washington University clinic-based cohort of 9,714 women entering the cohort in 2013, with follow-up through October 31, 2020. The cohort included 27% non-Hispanic Black women. The trained algorithm was assessed in an external validation cohort of 18,360 women (42% non-Hispanic Black) screened at Emory from January 1, 2013, and followed up through December 31, 2020. Our model-estimated BI-RADS density demonstrated substantial agreement with the density assessed by radiologists. In the external validation, agreement with radiologists was 81% for category B and 77% for category C in FFDM, and 83% for category B and 74% for category C in DBT; the boundary between these two categories is what separates women with dense breasts from those without. We obtained a Cohen’s κ of 0.72 (95% confidence interval, 0.71–0.73) in FFDM and 0.71 (95% confidence interval, 0.69–0.73) in DBT. We provided a consistent and fully automated BI-RADS estimation for both FFDM and DBT using a deep learning model. The software can be easily implemented anywhere for clinical use and risk prediction.

Prevention Relevance: The proposed model can reduce interobserver variability in BI-RADS density assessment, thereby providing more standard and consistent density assessment for use in decisions about supplemental screening and risk assessment.

In the era of precision prevention and tailored screening, there is an increasing emphasis on getting the right prevention to the right women at the right time and tailoring screening examination protocols to women based on individual risk. Underpinning this approach is the need for accurate risk assessments that can be generated and delivered in real time in the clinic (1). Strong evidence shows that adding mammographic density to breast cancer risk prediction models improves their performance (2, 3). A systematic review identified seven studies (out of 11) that showed a significant increase in the AUC when mammographic breast density was added to the prediction model. The increase in the AUC ranged from 0.03 to 0.14. Thus, the major breast cancer risk models now also include a measure of breast density (4), which is considered an intermediate marker of risk as well as a surrogate endpoint in prevention trials (5, 6).

There is a long record of epidemiologic investigations using mammographic density estimated from films and, more recently, from digital images (7–11). Additional research has focused on texture and other features beyond breast density (12–15). Current clinical mammographic density assessment relies heavily on subjective radiologist assessment, as described in the fifth edition of the Breast Imaging Reporting and Data System (BI-RADS), rather than on quantitative volumetric analysis (16). However, interobserver variability among radiologists is inevitable and occurs even with the same radiologist from year to year. This results in inconsistent and potentially less accurate recommendations for supplemental screening examinations, such as MRI or ultrasound, and changes in the calculated risk assessment (17–20).

Accurate, efficient, and consistent processing of mammograms in real time to guide subsequent clinical decisions is, therefore, a priority (1). National mandatory reporting of breast density to women will be required by the U.S. FDA beginning September 10, 2024, which will further drive the clinical need for accurate and consistent density assessment.

There exist previous works on automated density estimation, mostly based on full-field digital mammography (FFDM) images (21, 22). However, automated mammographic density assessment models have moved from using FFDM to digital breast tomosynthesis (DBT) in the United States. DBT was approved by the FDA for all women in 2011 (23). It improves the cancer detection rate on screening (24) and has demonstrated usefulness in both screening and diagnostic settings (25). The uptake of tomosynthesis for breast screening has varied over time based on insurance status and other population characteristics (23, 26). It was approved by Medicare in 2015, but through 2017, women from underrepresented race/ethnic groups, having lower education and income, and from rural residences had been slower to access this technology (26).

Both FFDM and DBT are used in clinical practice, and more than 80% of mammography screening procedures now use DBT (26). It is therefore imperative that mammographic density assessment can be automated using both FFDM and DBT images. We note that there exist fully automated tools, e.g., Volpara and Quantra, which can perform mammographic density estimation for both FFDM and DBT. However, both Volpara and Quantra rely on raw mammogram data or “for processing” images (21), which are typically not stored longer than a month in most clinics and research institutions as they are not used for image interpretation. In this study, we provide a deep learning algorithm that can be directly applied to processed or “for presentation” FFDM and DBT images that are used for image interpretation in clinical practice. We draw on two routine screening services (27, 28) with a mixture of FFDM and DBT studies to develop and externally validate the deep learning model to automate density assessment as defined in the fifth edition of the BI-RADS Atlas (29).

Analytic data set

The Joanne Knight Breast Health Cohort at Washington University (WashU cohort) is used as the source for training data in this study (30). This cohort of women was recruited from November 2008 to April 2012 through an American College of Radiology–accredited and –designated comprehensive breast imaging center providing routine breast screening in St. Louis and includes 10,481 women free from breast cancer, 27% of whom are non-Hispanic (NH) Black women (30). The age at entry ranges from 23.1 to 93.3 years, and 61% of the women are postmenopausal. Eligibility criteria included consent for follow-up and attending a routine screening visit. We excluded 389 women whose entry examination led to the diagnosis of breast cancer and 121 women whose retrieved images did not contain all four standard views. In 2015, the breast health service transitioned to all screening mammograms being DBT. All mammograms were uniformly processed using Hologic machines. For this analysis, we restricted the FFDM mammograms to those acquired past 2013 to ensure that the recorded BI-RADS density used the fifth edition definitions and identified 9,714 women. From this cohort, we identified women free from cancer at their first available DBT examination past 2015 (4,736 women). On entry to the cohort, women self-reported breast cancer risk factors using established and validated measures (31).

External validation cohort

The external validation data set is drawn from the Emory Breast Imaging Dataset (EMBED) with 116,902 patients with up to 8 years of mammograms (28). The public-access cohort represents a 20% random sample from the full EMBED with de-identified mammograms of 22,383 diverse women (42% NH Black) undergoing screening or diagnostic mammograms from January 2013 through December 2020. The age at entry ranges from 20.2 to 89. Similar to the WashU cohort, we excluded women who had diagnostic images (n = 2,734) and images that did not contain all four standard views (n = 1,289), leaving a cohort of 18,360 women. Approximately 35.9% (n = 6,586) of the women underwent DBT, of whom 58.5% (n = 3,855) have both FFDM and DBT at different breast screening visits included in the EMBED. The data included age, race, and time from the initial digital screening mammogram to breast cancer diagnosis. Mammograms were obtained using Hologic machines (92%), GE HealthCare (6%), and Fujifilm (2%). All BI-RADS density values recorded in Emory use the fifth edition.
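
The cohort counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this section and in Table 1; the variable names are illustrative, and the FFDM derivation rests on the stated assumption that every woman without a DBT examination has an FFDM examination.

```python
# Counts quoted in the text for the public-access EMBED sample.
sampled_women = 22_383
excluded_diagnostic = 2_734       # women who had diagnostic images
excluded_missing_views = 1_289    # images lacking all four standard views

validation_women = sampled_women - excluded_diagnostic - excluded_missing_views

dbt_women = 6_586                 # women who underwent DBT
both_modalities = 3_855           # women with FFDM and DBT at different visits

# Assumption (not stated in the text): women without DBT all have FFDM, so the
# FFDM group is the non-DBT women plus those imaged with both modalities.
ffdm_women = validation_women - dbt_women + both_modalities

dbt_share = round(100 * dbt_women / validation_women, 1)
both_share = round(100 * both_modalities / dbt_women, 1)

print(validation_women, ffdm_women, dbt_share, both_share)
```

Under that assumption the derived FFDM count equals the n = 15,629 shown for the Emory FFDM column of Table 1, and the two percentages reproduce the 35.9% and 58.5% quoted above.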

Training the density assessment model

Model training in FFDM

We transformed Digital Imaging and Communications in Medicine (DICOM) files from presentation view into 16-bit PNG files using the pydicom and PIL tools. In the training dataset, the images are all processed by Hologic. In the external dataset, the images are processed on Hologic (92%), GE (6%), and Fujifilm (2%) machines. Our model takes mediolateral oblique and craniocaudal images as input. For FFDM images, these are the standard four views of mammograms. All mammograms have been resized to 1,664 × 2,048 pixels in this analysis. All mammograms are de-meaned (centered) and normalized in a column-wise fashion. The mean and SD are saved from the training dataset and subsequently applied to the external validation data.
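
As a rough illustration of the normalization step, the sketch below fits per-column statistics on training images and reuses them on new images. The text does not define "column-wise" precisely, so pooling each pixel column over all rows of all training images is an assumption here, as are the function names. Images are nested lists to keep the example dependency-free; real 1,664 × 2,048 images would normally be handled with an array library.

```python
from statistics import mean, pstdev


def fit_column_stats(train_images):
    """Per-column mean and SD, pooled over all rows of all training images.

    Assumption: this is one plausible reading of "column-wise" normalization.
    """
    n_cols = len(train_images[0][0])
    means, sds = [], []
    for c in range(n_cols):
        values = [row[c] for image in train_images for row in image]
        means.append(mean(values))
        sds.append(pstdev(values) or 1.0)  # avoid dividing by zero on constant columns
    return means, sds


def normalize(image, means, sds):
    """Center and scale an image with statistics saved from the training set."""
    return [[(px - m) / s for px, m, s in zip(row, means, sds)] for row in image]


# Toy 2 x 2 "images" with made-up pixel values.
train = [[[0, 10], [2, 10]], [[4, 10], [6, 10]]]
means, sds = fit_column_stats(train)
external = normalize([[3, 12]], means, sds)
print(external)
```

The point of the design is that the external validation images never contribute to the statistics: they are only transformed with the mean and SD saved from training.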

Each view of the mammogram is independently encoded by using ResNet-18 with a global max pooling layer to compress the image representation to a 512-dimensional vector (32). The model was trained using the Adam optimizer with a learning rate of 10⁻⁴ and a weight decay of 10⁻⁵. Given the four views for each woman, we end up with a 2,048-dimensional vector that summarizes all information embedded in the mammograms. Results are reported for the epoch that had the lowest cross-entropy loss on the validation set.
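
The dimensions in this paragraph can be traced with a small dependency-free sketch. It is not the authors' implementation: the convolutional encoder is replaced by dummy feature maps, the view labels are hypothetical, and combining the four 512-dimensional vectors by concatenation is an assumption inferred from 4 × 512 = 2,048.

```python
def global_max_pool(feature_maps):
    """Reduce each channel's 2-D feature map to its single largest activation."""
    return [max(max(row) for row in channel) for channel in feature_maps]


def fuse_views(view_feature_maps):
    """Concatenate the pooled vectors of all views in a fixed order (assumed)."""
    fused = []
    for feature_maps in view_feature_maps:
        fused.extend(global_max_pool(feature_maps))
    return fused


CHANNELS = 512                                # width of the pooled vector per view
VIEWS = ("L-CC", "L-MLO", "R-CC", "R-MLO")    # hypothetical labels for the four views

# Dummy encoder output: 512 channels, each a 2 x 2 feature map.
dummy_output = [[[0.0, 1.0], [2.0, 3.0]] for _ in range(CHANNELS)]

fused = fuse_views([dummy_output for _ in VIEWS])
print(len(fused))  # 2048
```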

Transfer learning for synthetic DBT

Similar preprocessing procedures have been applied to synthetic DBT images. Specifically, we utilized synthesized DBT images that are automatically generated from a series of raw 2D projections. All synthesized DBT images have been resized to 1,664 × 2,048 pixels in this analysis. All synthesized DBT images are de-meaned (centered) and normalized in a column-wise fashion. The mean and SD are saved from the training dataset and subsequently applied to the external validation data.

Transfer learning is a powerful technique in machine learning that involves leveraging a pretrained model on a new, but related, task (33). In this study, we freeze the weights of the model previously trained on FFDM images and replace its final classification layer with a new layer trained on synthetic DBT images. The model is then trained with a low learning rate of 10⁻⁵ to adapt the weights learned from FFDM images to synthetic DBT images. Fine-tuning allows the model to retain the learned features from the FFDM screening while adjusting to the specific characteristics of synthetic DBT images.

Density output

The output from our model is a continuous measure that is converted into BI-RADS categories. The BI-RADS categories are determined from three model-defined cut-off points. Our model-defined intervals are (0, 1.5) for BI-RADS A, (1.501, 2.3) for BI-RADS B, (2.301, 3.3) for BI-RADS C, and (3.301, ∞) for BI-RADS D. These cut-offs are the same for FFDM and synthetic DBT. For illustration, we show in Supplementary Fig. S1 the distribution of the continuous density measures estimated from FFDM using our proposed model in the external validation cohort.
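
The conversion from the continuous measure to a category can be sketched as a lookup against the three cut-offs. The function names are illustrative, and treating each upper bound as inclusive (so that a score of exactly 1.5 is category A) is an assumption, since the published intervals leave a small gap between 1.5 and 1.501.

```python
from bisect import bisect_left

CUTOFFS = (1.5, 2.3, 3.3)            # upper bounds of categories A, B, and C
CATEGORIES = ("A", "B", "C", "D")


def to_birads(score):
    """Map a continuous density score to a BI-RADS category.

    Assumption: upper bounds are inclusive, e.g. 1.5 -> A and 1.501 -> B.
    """
    return CATEGORIES[bisect_left(CUTOFFS, score)]


def is_dense(category):
    """Dense breasts correspond to BI-RADS C or D."""
    return category in ("C", "D")


for score in (0.9, 1.5, 1.501, 2.3, 2.301, 3.3, 3.301):
    category = to_birads(score)
    print(score, category, is_dense(category))
```

Because the same cut-offs apply to both modalities, a single mapping like this would serve FFDM and synthetic DBT outputs alike.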

Statistical analysis

We developed two deep learning models using the WashU cohort to estimate BI-RADS density at each of the examinations (34). Our model takes all four views of processed or “for presentation” mammograms (craniocaudal and mediolateral oblique views) as input. The first model was trained using only FFDM mammograms. The second model was fine-tuned from the FFDM model to accommodate synthetic DBT images that are generated from the series of raw projections. This model for synthetic DBT uses a transfer learning approach that transfers knowledge from the pretrained FFDM model to the new synthetic DBT task.

Testing and validation

The WashU dataset was randomly split, with 20% of women in testing, 15% in validation, and the rest for training. To assess the classification performance of the proposed algorithm within the WashU cohort, a confusion matrix was generated for the 20% of women in the testing set. We emphasize that the Emory cohort is only used for testing. Therefore, all women in the Emory cohort were scored with the model trained in the WashU cohort, without any retraining, to record the model performance. Model performance is reported using a confusion matrix that compares the absolute counts of radiologist-scored BI-RADS density versus BI-RADS density estimated via the proposed method. The concordance between radiologists’ BI-RADS scores and the BI-RADS density estimated by the proposed method is measured using Cohen’s κ.
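
A split at the level of women, rather than images, could look like the sketch below. The seed, the rounding rule, and the function name are assumptions; the text specifies only the 20%/15%/65% proportions.

```python
import random


def split_by_woman(woman_ids, test_frac=0.20, val_frac=0.15, seed=0):
    """Randomly assign each woman to exactly one of test, validation, or train."""
    ids = list(woman_ids)
    random.Random(seed).shuffle(ids)   # seeded for reproducibility (assumed)
    n_test = round(test_frac * len(ids))
    n_val = round(val_frac * len(ids))
    return {
        "test": ids[:n_test],
        "validation": ids[n_test:n_test + n_val],
        "train": ids[n_test + n_val:],
    }


# Placeholder identifiers standing in for the 9,714 women with FFDM.
splits = split_by_woman(range(9_714))
print({name: len(ids) for name, ids in splits.items()})
```

With 9,714 women, a 20% test fraction rounds to 1,943, the same n reported for the WashU panel of Fig. 1. Splitting by woman keeps all four views of one examination in the same partition.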

Misclassification error is reported with a confusion matrix that compares the absolute counts of radiologist-scored BI-RADS density versus BI-RADS density estimated via the proposed method. We present the confusion matrix by BI-RADS A, B, C, and D, as well as by nondense (A/B) versus dense (C/D).
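
The two views of the confusion matrix can be derived from one another, as in this sketch. The counts are made up for illustration and are not the study's results. Reading the per-category agreement as a row-wise percentage (the share of each radiologist-assigned category that the model reproduces) is an assumption about how the reported percentages were computed.

```python
def per_category_agreement(matrix):
    """Fraction of each radiologist category (row) that the model matches."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def collapse_to_dense(matrix):
    """Merge a 4 x 4 A/B/C/D matrix into 2 x 2 nondense (A/B) vs. dense (C/D)."""
    group = (0, 0, 1, 1)   # A, B -> nondense; C, D -> dense
    collapsed = [[0, 0], [0, 0]]
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            collapsed[group[i]][group[j]] += count
    return collapsed


# Hypothetical counts (rows: radiologist A-D, columns: model A-D).
toy = [
    [8, 2, 0, 0],
    [1, 16, 3, 0],
    [0, 4, 15, 1],
    [0, 0, 2, 8],
]

print(per_category_agreement(toy))
binary = collapse_to_dense(toy)
print(binary, per_category_agreement(binary))
```

Disagreements between A and B, or between C and D, disappear after collapsing, which is why the two-category agreement is typically higher than the four-category agreement.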

Additionally, we evaluate the concordance between radiologists’ assessments using BI-RADS (fifth edition) and the BI-RADS density estimated by the proposed method. Cohen’s κ is a measure used to quantify interrater agreement (35). If the raters are in complete agreement, then κ = 1; if agreement is no better than expected by chance, then κ = 0. Performance for both the misclassification error and interrater agreement is reported separately for FFDM and synthetic DBT.
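
For reference, Cohen’s κ compares the observed agreement p_o with the agreement p_e expected by chance from the marginal totals: κ = (p_o − p_e) / (1 − p_e). The sketch below computes the unweighted statistic from a confusion matrix of counts. Whether the study used an unweighted or a weighted κ is not stated, so the unweighted form is an assumption, and the counts are illustrative only.

```python
def cohens_kappa(matrix):
    """Unweighted Cohen's kappa from a square confusion matrix of counts."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


toy = [[20, 5], [10, 15]]                       # hypothetical counts
print(round(cohens_kappa(toy), 2))              # 0.4
print(cohens_kappa([[5, 0], [0, 5]]))           # complete agreement -> 1.0
print(cohens_kappa([[5, 5], [5, 5]]))           # chance-level agreement -> 0.0
```

In the first example the raters agree on 70% of cases while 50% agreement is expected by chance, giving κ = 0.2 / 0.5 = 0.4.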

This prospective cohort study was supported by WashU. Ethical approval was obtained from the Institutional Review Board of WashU in St. Louis. Written informed consent was obtained for study participation, and the study was conducted in accordance with the Declaration of Helsinki. The Emory cohort de-identified data were shared following Institutional Review Board approval.

Data availability

Development data mammogram images at WashU are available with data use agreement. Requests to access the data should be directed to the corresponding authors. External validation data from Emory are publicly available at https://github.com/Emory-HITI/EMBED_Open_Data.

Cohort characteristics

In the WashU cohort, breast cancer risk factors were assessed at entry to the cohort for the women in this prospective study (Table 1). There was no important difference between FFDM and synthetic DBT distribution in the radiologist-assigned qualitative breast density [fifth edition BI-RADS A/B categories (“not dense”) vs. BI-RADS C/D categories (“dense”)]. The cohort included 26% NH Black and 70% NH White women.

Table 1.

Baseline patient characteristics by case status of WashU mammography screening cohort and the external validation cohort EMBED.

| Characteristic | WashU derivation cohort, FFDM (n = 9,714) | WashU derivation cohort, DBT (n = 4,736) | Emory validation cohort, FFDM (n = 15,629) | Emory validation cohort, DBT (n = 6,586) |
| --- | --- | --- | --- | --- |
| Age (years), mean (SD) | 55.7 (10.0) | 54.6 (8.7) | 55.6 (12.2) | 57.8 (11.6) |
| BI-RADS, number (%) | | | | |
| A | 999 (10.3%) | 454 (9.6%) | 1,647 (10.6%) | 720 (10.9%) |
| B | 4,926 (50.8%) | 2,323 (49.0%) | 6,488 (41.5%) | 2,628 (39.9%) |
| C | 3,350 (34.5%) | 1,705 (36.1%) | 6,565 (42.0%) | 2,825 (42.9%) |
| D | 422 (4.4%) | 239 (5.0%) | 926 (5.9%) | 413 (6.3%) |
| NR | 0 (0%) | 15 (0.3%) | 0 (0%) | 0 (0%) |
| Race, number (%) | | | | |
| White | 6,768 (69.8%) | 3,321 (70.1%) | 6,413 (41.1%) | 2,560 (38.9%) |
| Black | 2,549 (26.3%) | 1,251 (26.4%) | 6,584 (42.1%) | 3,031 (46.0%) |
| Asian | 83 (0.9%) | 36 (0.8%) | 968 (6.2%) | 465 (7.1%) |
| Others | 88 (0.9%) | 34 (0.7%) | 268 (1.7%) | 62 (0.9%) |
| NR | 209 (2.1%) | 94 (2.0%) | 1,393 (8.9%) | 468 (7.1%) |
| Breast cancer cases, number (%) | 469 (4.8%) | 105 (2.2%) | 408 (2.6%) | 133 (2.0%) |

Abbreviation: NR, not reported.

Comparable BI-RADS distribution and ethnic diversity in the Emory external validation cohort are reported in Table 1. The cohort included 42% NH Black women. There was no important difference between FFDM and synthetic DBT distribution in the radiologist-assigned qualitative breast density [fifth edition BI-RADS A/B categories (“not dense”) vs. BI-RADS C/D categories (“dense”)].

Model performance in FFDM

We show the estimated misclassification counts against radiologists’ reading in a confusion matrix in Fig. 1 using FFDM. The model was first evaluated in an internal validation set composed of a random 20% sample of the WashU cohort that was left out of the training data. The BI-RADS classification predicted by our proposed model exhibits close agreement with the radiologists’ scoring. The model agrees with the radiologists’ score 84% of the time for women with nondense (BI-RADS A/B) breasts and 91% of the time for women with dense (BI-RADS C/D) breasts. The confusion matrix separated into the four BI-RADS categories is also displayed in Fig. 1. This resulted in a Cohen’s κ of 0.74 [95% confidence interval (CI), 0.73–0.75] for the interrater agreement using the four categories in the WashU cohort.

Figure 1.

Comparison of density estimation (BI-RADS density, fifth edition) by the deep learning model and by radiologists’ reading of FFDM. Left, WashU (n = 1,943). Right, Emory (n = 15,629). The internal training data were excluded from the results represented in WashU.

Similarly, when evaluating performance in the Emory external validation cohort, the proposed model–estimated BI-RADS density agrees with the radiologists’ score 87% of the time for women with nondense (BI-RADS A/B) breasts and 84% of the time for women with dense (BI-RADS C/D) breasts. The confusion matrix separated into the four BI-RADS categories is also displayed in Fig. 1. Importantly, categories B and C had high agreement with radiologists (B 81% and C 77%) in the external validation cohort. This resulted in a Cohen’s κ of 0.72 (95% CI, 0.71–0.73) for the interrater agreement using the four categories in the external validation cohort.

Model performance in synthetic DBT

We further show the estimated misclassification counts in a confusion matrix in Fig. 2 when using the synthetic DBT images. Performance in this case is similar to that with FFDM. In the WashU internal validation cohort, the model agrees with the radiologists’ score 84% of the time for women with nondense (BI-RADS A/B) breasts and 90% of the time for women with dense (BI-RADS C/D) breasts. The separate results for the four categories are displayed in Fig. 2, resulting in a Cohen’s κ of 0.74 (95% CI, 0.73–0.75).

Figure 2.

Comparison of density estimation (BI-RADS density, fifth edition) by the deep learning model and by radiologists’ reading of DBT using synthetic DBT. Left, WashU (n = 945). Right, Emory (n = 6,586). The internal training data were excluded from the results represented in WashU.

When evaluated in the external cohort, the proposed model–estimated BI-RADS density agrees with the radiologists’ score 90% of the time for women with nondense (BI-RADS A/B) breasts and 80% of the time for women with dense (BI-RADS C/D) breasts. Importantly, in the four-category setting, categories B and C had high agreement with radiologists (B 83% and C 74%) in the external validation cohort. The confusion matrix separated into the four BI-RADS categories is also displayed in Fig. 2, resulting in a Cohen’s κ of 0.71 (95% CI, 0.69–0.73).

Subgroup analysis

In this study, we report results in our external validation dataset for the subset of women who were diagnosed with breast cancer in their follow-up since baseline. Supplementary Fig. S2 shows the results obtained by FFDM, and Supplementary Fig. S3 demonstrates the results obtained by synthetic DBT.

With the widespread use of DBT in the United States, it is imperative that automated density measures are readily available. In this study, we report an automated tool to assess mammographic density in both FFDM and synthetic DBT mammography. The deep learning algorithm calibrates well with the radiologist BI-RADS rating in both the internal validation and the independent external validation cohort, which is racially diverse with 42% NH Black women. There is strong agreement between the deep learning algorithm and radiologists when comparing dense versus nondense in the external validation. When looking at the four categories of density, the Cohen’s κ for interrater agreement in the external validation cohort was 0.72 in FFDM and 0.71 in synthetic DBT, which is higher than values reported in other studies. This study also extends beyond previous works, in which participants were largely limited to NH White women.

Most automated breast density estimation tools focus on density estimation using FFDM images. Given the uptake of DBT in the United States achieving more than 80% coverage in screening mammography exams as of 2021 (26), the proposed deep learning algorithm is timely and among the first that can accommodate synthetic DBT images.

The performance of the tool differs between data sets. We do not know the number of radiologists at Emory or their distribution across community settings that include two community hospitals, one large inner-city hospital, and one private academic hospital. This contrasts with WashU, where breast radiologists worked and read images in one location. Thus, variation between data sets may reflect usual practice-to-practice variation (36). Although these screening centers have ongoing quality improvement/quality assurance programs in place as required by the FDA, we do not have access to these internal performance data (37).

Our proposed algorithm has advantages over existing mammographic density estimation tools that function for both FFDM and DBT. For instance, Volpara and Quantra both require raw or “for processing” images as input; however, most institutions do not store those for more than a month, meaning that exams cannot be reprocessed later (21). We overcome that burden in this study by using processed or “for presentation” FFDM and synthetic DBT images that are permanently archived. This algorithm could tie into routine breast imaging services and deliver output of density and BI-RADS category to the reading radiologist as an aid to classifying density, which is now a reportable feature in the United States (37). This is similar to other programs in use and aims to reduce variability among providers over time (38).

In the external validation, we obtained a Cohen’s κ of 0.72 (95% CI, 0.71–0.73) for FFDM and 0.71 (95% CI, 0.69–0.73) for synthetic DBT. When comparing interrater agreements of FFDM with Volpara v1.5.0 and Quantra v2.0 in previous studies (21), Volpara reported a Cohen’s κ of 0.57 (95% CI, 0.55–0.59) and Quantra reported a κ of 0.46 (95% CI, 0.44–0.47). Additionally, the outputted BI-RADS density is achieved by grouping a continuous measure of density estimated from the proposed algorithm. Such a continuous measure may be more sensitive when studying changes in density over time (3).

There are limitations to the study. Given the clinician-to-clinician variability in reporting BI-RADS density, a real-time assessment with multiple radiologists reading the same set of images would be more robust. This study has been largely limited to images generated on Hologic machines. Broader evaluation on other manufacturers is warranted. Experience using transfer learning to adapt to the DBT setting reassures us that this is not a major issue moving forward.

Despite these limitations, this study has several strengths. The external mammography data are drawn from diverse external validation data sources, including mammography screening in an urban Atlanta clinical service. The model can use “for presentation” images from either FFDM or synthetic DBT, which broadens access. These real-world data add to the generalizability of the model’s validation.

In conclusion, we provide a consistent and fully automated BI-RADS estimation method for both FFDM and synthetic DBT using the same deep learning model. The software can be easily implemented in all clinical practices.

S. Jiang reports grants from the NCI during the conduct of the study; in addition, S. Jiang has a patent pending. G.A. Colditz reports grants from the NCI during the conduct of the study; in addition, G.A. Colditz has a patent pending. No disclosures were reported by the other authors.

S. Jiang: Conceptualization, resources, software, formal analysis, validation, investigation, visualization, methodology, writing–original draft, writing–review and editing. D.L. Bennett: Investigation, writing–review and editing. S. Chen: Data curation, writing–review and editing. A.T. Toriola: Investigation, writing–review and editing. G.A. Colditz: Conceptualization, resources, data curation, validation, investigation, methodology, project administration, writing–review and editing.

This work was supported by the NCI (R01CA246592 awarded to A.T. Toriola, R37CA256810 awarded to S. Jiang, and P30CA091842 awarded to T.J. Eberlein).

Note: Supplementary data for this article are available at Cancer Prevention Research Online (http://cancerprevres.aacrjournals.org/).

1. Pashayan N, Antoniou AC, Ivanus U, Esserman LJ, Easton DF, French D, et al. Personalized early detection and prevention of breast cancer: ENVISION consensus statement. Nat Rev Clin Oncol 2020;17:687–705.
2. Vilmun BM, Vejborg I, Lynge E, Lillholm M, Nielsen M, Nielsen MB, et al. Impact of adding breast density to breast cancer risk models: a systematic review. Eur J Radiol 2020;127:109019.
3. Jiang S, Bennett DL, Rosner BA, Colditz GA. Longitudinal analysis of change in mammographic density in each breast and its association with breast cancer risk. JAMA Oncol 2023;9:808–14.
4. Brentnall AR, Harkness EF, Astley SM, Donnelly LS, Stavrinos P, Sampson S, et al. Mammographic density adds accuracy to both the Tyrer-Cuzick and Gail breast cancer risk models in a prospective UK screening cohort. Breast Cancer Res 2015;17:147.
5. Boyd NF, Guo H, Martin LJ, Sun L, Stone J, Fishell E, et al. Mammographic density and the risk and detection of breast cancer. N Engl J Med 2007;356:227–36.
6. Boyd N, Martin L, Chavez S, Gunasekara A, Salleh A, Melnichouk O, et al. Breast-tissue composition and other risk factors for breast cancer in young women: a cross-sectional study. Lancet Oncol 2009;10:569–80.
7. Pettersson A, Graff RE, Ursin G, Santos Silva ID, McCormack V, Baglietto L, et al. Mammographic density phenotypes and risk of breast cancer: a meta-analysis. J Natl Cancer Inst 2014;106:dju078.
8. Burton A, Maskarinec G, Perez-Gomez B, Vachon C, Miao H, Lajous M, et al. Mammographic density and ageing: a collaborative pooled analysis of cross-sectional data from 22 countries worldwide. PLoS Med 2017;14:e1002335.
9. Ward SV, Burton A, Tamimi RM, Pereira A, Garmendia ML, Pollan M, et al. The association of age at menarche and adult height with mammographic density in the International Consortium of Mammographic Density. Breast Cancer Res 2022;24:49.
10. Haas CB, Chen H, Harrison T, Fan S, Gago-Dominguez M, Castelao JE, et al. Disentangling the relationships of body mass index and circulating sex hormone concentrations in mammographic density using Mendelian randomization. Breast Cancer Res Treat 2024;206:295–305.
11. McCormack VA, Perry NM, Vinnicombe SJ, Dos Santos Silva I. Changes and tracking of mammographic density in relation to Pike's model of breast tissue aging: a UK longitudinal study. Int J Cancer 2010;127:452–61.
12. Nguyen TL, Schmidt DF, Makalic E, Maskarinec G, Li S, Dite GS, et al. Novel mammogram-based measures improve breast cancer risk prediction beyond an established mammographic density measure. Int J Cancer 2021;148:2193–202.
13. Hopper JL, Nguyen TL, Schmidt DF, Makalic E, Song Y-M, Sung J, et al. Going beyond conventional mammographic density to discover novel mammogram-based predictors of breast cancer risk. J Clin Med 2020;9:627.
14. McCormack VA, dos Santos Silva I. Breast density and parenchymal patterns as markers of breast cancer risk: a meta-analysis. Cancer Epidemiol Biomarkers Prev 2006;15:1159–69.
15. Anandarajah A, Chen Y, Colditz GA, Hardi A, Stoll C, Jiang S. Studies of parenchymal texture added to mammographic breast density and risk of breast cancer: a systematic review of the methods used in the literature. Breast Cancer Res 2022;24:101.
16. O’Driscoll J, Burke A, Mooney T, Phelan N, Baldelli P, Smith A, et al. A scoping review of programme specific mammographic breast density related guidelines and practices within breast screening programmes. Eur J Radiol Open 2023;11:100510.
17. Grimm LJ, Anderson AL, Baker JA, Johnson KS, Walsh R, Yoon SC, et al. Interobserver variability between breast imagers using the fifth edition of the BI-RADS MRI Lexicon. AJR Am J Roentgenol 2015;204:1120–4.
18. Ooms EA, Zonderland HM, Eijkemans MJC, Kriege M, Mahdavian Delavary B, Burger CW, et al. Mammography: interobserver variability in breast density assessment. Breast 2007;16:568–76.
19. Redondo A, Comas M, Macià F, Ferrer F, Murta-Nascimento C, Maristany MT, et al. Inter- and intraradiologist variability in the BI-RADS assessment and breast density categories for screening mammograms. Br J Radiol 2012;85:1465–70.
20. Berg WA, Campassi C, Langenberg P, Sexton MJ. Breast imaging reporting and data system: inter- and intraobserver variability in feature analysis and final assessment. AJR Am J Roentgenol 2000;174:1769–77.
21. Brandt KR, Scott CG, Ma L, Mahmoudzadeh AP, Jensen MR, Whaley DH, et al. Comparison of clinical and automated breast density measurements: implications for risk prediction and supplemental screening. Radiology 2016;279:710–9.
22. Matthews TP, Singh S, Mombourquette B, Su J, Shah MP, Pedemonte S, et al. A multisite study of a breast density deep learning model for full-field digital mammography and synthetic mammography. Radiol Artif Intell 2021;3:e200015.
23. Clark CR, Tosteson TD, Tosteson ANA, Onega T, Weiss JE, Harris KA, et al. Diffusion of digital breast tomosynthesis among women in primary care: associations with insurance type. Cancer Med 2017;6:1102–7.
24. Libesman S, Zackrisson S, Hofvind S, Seidler AL, Bernardi D, Lång K, et al. An individual participant data meta-analysis of breast cancer detection and recall rates for digital breast tomosynthesis versus digital mammography population screening. Clin Breast Cancer 2022;22:e647–54.
25. Expert Panel on Breast Imaging; Niell BL, Jochelson MS, Amir T, Brown A, Adamson M, Baron P, et al. ACR appropriateness Criteria® female breast cancer screening: 2023 update. J Am Coll Radiol 2024;21:S126–43.
26. Lee CI, Zhu W, Onega T, Henderson LM, Kerlikowske K, Sprague BL, et al. Comparative access to and use of digital breast tomosynthesis screening by women's race/ethnicity and socioeconomic status. JAMA Netw Open 2021;4:e2037546.
27. Moore JX, Han Y, Appleton C, Colditz G, Toriola AT. Determinants of mammographic breast density by race among a large screening population. JNCI Cancer Spectr 2020;4:pkaa010.
28. Jeong JJ, Vey BL, Bhimireddy A, Kim T, Santos T, Correa R, et al. The EMory BrEast imaging Dataset (EMBED): a racially diverse, granular dataset of 3.4 million screening and diagnostic mammographic images. Radiol Artif Intell 2023;5:e220047.
29. Spak DA, Plaxco JS, Santiago L, Dryden MJ, Dogan BE. BI-RADS® fifth edition: a summary of changes. Diagn Interv Imaging 2017;98:179–90.
30. Colditz GA, Bennett DL, Tappenden J, Beers C, Ackermann N, Wu N, et al. Joanne Knight breast health cohort at Siteman Cancer Center. Cancer Causes Control 2022;33:623–9.
31. Colditz GA, Hankinson SE. The Nurses' Health Study: lifestyle and health among women. Nat Rev Cancer 2005;5:388–96.
32. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition. Volume 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR): The Computer Society; 2016 Jun 27–30; Las Vegas, NV. IEEE; 2016. p. 770–8.
33. Pan SJ, Yang Q. A survey on transfer learning. IEEE Trans Knowledge Data Eng 2009;22:1345–59.
34. Jiang S, Xie Y, Colditz GA. Functional ensemble survival tree: dynamic prediction of Alzheimer's disease progression accommodating multiple time-varying covariates. J R Stat Soc C Appl Stat 2021;70:66–79.
35. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
36. Lee CI, Abraham L, Miglioretti DL, Onega T, Kerlikowske K, Lee JM, et al. National performance benchmarks for screening digital breast tomosynthesis: update from the breast cancer surveillance consortium. Radiology 2023;307:e222499.
37. Food and Drug Administration. Important information: final rule to amend the mammography quality standards act (MQSA); 2024 [updated 2024 Sep 10; cited 2023 Mar 10]. Available from: https://www.fda.gov/radiation-emitting-products/mammography-quality-standards-act-mqsa-and-mqsa-program/important-information-final-rule-amend-mammography-quality-standards-act-mqsa.
38. Destounis S, Arieno A, Morgan R, Roberts C, Chan A. Qualitative versus quantitative mammographic breast density assessment: applications for the US and abroad. Diagnostics (Basel) 2017;7:30.
This open access article is distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) license.