Abstract
Breast density is a strong risk factor for breast cancer; however, no standard assessment method exists. An automated breast density method was modified and compared with a semi-automated, user-assisted thresholding method (Cumulus method) and the Breast Imaging Reporting and Data System four-category tissue composition measure for their ability to predict future breast cancer risk. The three estimation methods were evaluated in a matched breast cancer case-control (n = 372 and n = 713, respectively) study at the Mayo Clinic using digitized film mammograms. Mammograms from the craniocaudal view of the noncancerous breast were acquired on average 7 years before diagnosis. Two controls with no previous history of breast cancer from the screening practice were matched to each case on age, number of previous screening mammograms, final screening exam date, menopausal status at this date, interval between earliest and latest available mammograms, and residence. Both Pearson linear correlation (R) and Spearman rank correlation (r) coefficients were used for comparing the three methods as appropriate. Conditional logistic regression was used to estimate the risk for breast cancer (odds ratios and 95% confidence intervals) associated with the quartiles of percent breast density (automated breast density method, Cumulus method) or Breast Imaging Reporting and Data System categories. The area under the receiver operator characteristic curve was estimated and used to compare the discriminatory capabilities of each approach. The continuous measures (automated breast density method and Cumulus method) were highly correlated with each other (R = 0.70) but less with Breast Imaging Reporting and Data System (r = 0.49 for automated breast density method and r = 0.57 for Cumulus method). 
Risk estimates associated with the lowest to highest quartiles of automated breast density method were greater in magnitude [odds ratios: 1.0 (reference), 2.3, 3.0, 5.2; P trend < 0.001] than the corresponding quartiles for the Cumulus method [odds ratios: 1.0 (reference), 1.7, 2.1, and 3.8; P trend < 0.001] and Breast Imaging Reporting and Data System [odds ratios: 1.0 (reference), 1.6, 1.5, 2.6; P trend < 0.001] method. However, all methods similarly discriminated between case and control status; areas under the receiver operator characteristic curve were 0.64, 0.63, and 0.61 for automated breast density method, Cumulus method, and Breast Imaging Reporting and Data System, respectively. The automated breast density method is a viable option for quantitatively assessing breast density from digitized film mammograms. (Cancer Epidemiol Biomarkers Prev 2008;17(11):3090–7)
Introduction
Breast density is a significant breast cancer risk factor (1). Various methods have been investigated for measuring breast density, but to date, no accepted measurement standards exist (2). The Cumulus user-assisted method based on soft-copy display intensity thresholding (3-5) is considered the de facto standard. It and other similar techniques (6) generate quantitative measures that have been shown repeatedly to correlate with breast cancer risk (1). However, subjective estimates, including the American College of Radiology Breast Imaging Reporting and Data System four-category tissue composition description and subjective classifications of breast density, have also shown consistent associations with breast cancer risk (1). Current density estimation methods show strong associations with risk (1), but they remain time intensive, require operator training (7), and may not be comparable across studies. Thus, an automated quantitative metric for assessing breast density that does not require operator interaction is needed. We use the term “automated” to describe the method of breast density estimation without regard to the digitization or image formation process. This includes the automated labeling of pixel(s) within a given image and the method used for summarizing the labeled pixels to provide a density estimate.
Current computer-based methods for density estimation can be loosely categorized into two groups: (a) those that compensate for acquisition influences resulting from the interpatient variation in the X-ray exposure, target-filter combination, compression height, and X-ray generation voltage (8-13), and (b) those that operate on the image data without considering the acquisition influences (3-6, 14). The acquisition compensation techniques are under various stages of development and testing for film (8, 10, 13) and full-field digital mammography applications (9, 11, 12). Whether considering the acquisition or not, film-based approaches require a digitizing step before their application in contrast to those techniques intended for full-field digital mammography applications. Few acquisition-based approaches have been replicated or validated using breast cancer as an endpoint. A recent study comparing the Standard Mammogram Format volumetric method with the Cumulus method for its ability to discriminate cancer status found Cumulus method to be more strongly associated with risk (15). Other nonacquisition-based approaches that rely on some form of quantitative summary feature (that is, skewness, kurtosis, etc.) derived from the digitized film data resulted in weaker risk associations than the Cumulus user-assisted method measure (14, 16).
Because breast density is an important risk factor whose magnitude of association is stronger than most established breast cancer risk factors (17), it may improve risk prediction models for breast cancer (18-20). However, breast density has not been routinely incorporated into clinical risk prediction, perhaps because of the lack of standardized estimates and the technical expertise required to generate such estimates in the clinical setting. An estimate of density that could be automatically applied to digitized film and digital data from full-field digital mammography would help alleviate these barriers.
Although most mammography clinics in the United States continue to use film mammography, full-field digital mammography is increasing as the screening standard (21) because of its superior sensitivity in dense breasts (22). In the interim, although full-field digital mammography and film coexist, assessing breast density from digitized film will continue to be relevant because of the large film archives with serial data and well-annotated clinical outcomes and the ability to translate clinically relevant findings identified from film to digital mammography.
In this work, a previously described automated breast density method developed for digitized film applications (23) was modified and compared with the continuous Cumulus method measure and the radiologists' classification of Breast Imaging Reporting and Data System tissue composition. The correlation of the three estimates and their ability to discriminate breast cancer were evaluated on digitized films within a breast cancer case-control study.
Materials and Methods
Study Population and Image Data
The case-control study used for this work was described in detail previously (24). Eligible participants provided research authorization for medical record studies at the Mayo Clinic. Informed consent for this case-control study was waived. All patient data were retrospectively obtained through medical records, clinical databases, or mammogram films; no patients were contacted for this study. The Mayo Clinic institutional review board approved the study protocol.
Breast cancer cases (n = 372) diagnosed from 1997 to 2001 were ascertained from the Mayo Clinic mammography screening practice in Rochester, Minnesota. Eligible cases were at least 50 years old at diagnosis, had at least 2 previous screening mammograms done at the Mayo Clinic 2 years before diagnosis (or corresponding exam date), and lived within a 120-mile radius. Invasive (n = 300) and in situ (n = 72) breast cancers were included. Multiple mammograms before diagnosis were retrospectively collected to investigate changes over time in mammographic density and breast cancer risk (for another study) as well as to establish a population of women having routine mammograms. Two controls from the screening practice with no previous history of breast cancer were matched to each case on age, final screening exam date, menopausal status at final exam date, time between earliest and latest mammograms, number of previous screening mammograms, and residence. Because mammograms are only retained for 10 years at the Mayo Clinic, the earliest mammogram available during this period was defined as the baseline image. The baseline mammogram available for cases and controls was used for the analyses. Medical records provided weight, height, and postmenopausal hormone therapy use for all serial mammogram dates; the measures closest to the earliest mammogram dates were used (24). Weight and height were available within 1 week of the baseline mammogram for 85% and 68% of participants, respectively. Postmenopausal hormone therapy information (ever or never use) was available for 84% of participants. A mammography database, which is based on self-reported information gathered at each mammogram (including menopause status, family history in first-degree relatives, age at first birth, number of births, previous biopsies, and previous breast cancer), provided all remaining patient information.
Mammograms were digitized with a Kodak Lumiscan 75 scanner with a 12-bit grayscale pixel depth. The pixel size was 0.130 × 0.130 mm² for films measuring 18 × 24 cm² and 24 × 30 cm². All four views, left and right mediolateral oblique view and left and right craniocaudal view, were digitized for each woman. The film background in each image was manually blacked out to protect patient privacy and increase data compression rates when archiving the data. For the mediolateral oblique and craniocaudal views, an edge detection program automatically segmented the breast tissue from the background image, and the delineation was manually checked by the reader for accuracy. The mediolateral oblique view additionally required the manual removal of the pectoral muscle. This process was the same for the estimation of breast density by the Cumulus method and automated breast density method measures. In practice, the craniocaudal views are preferable for automated processing because they do not require removal of the pectoral muscle (25). However, we included the modified mediolateral oblique views as another assessment of the automated breast density method statistical decision process.
The three measures for estimation of breast density are described below.
User-Assisted Thresholding Measure. The semi-automated, user-assisted display method (Cumulus user-assisted method) is a user-guided process that requires manual segmentation and intensity thresholding to estimate breast density (3-5). The Cumulus measure will be referred to as PD in this report. The operator set two thresholds: one delineated the breast from the background (done automatically as defined above) and the other set the threshold between dense and nondense pixels (3). The application computed the total area and dense area and then calculated PD by dividing the dense area by the total area, which was a unit-less ratio or proportion. Batch files were created for cases and controls with randomly assigned views and sides within person (26). A single technician analyzed all images for consistency. This technician repeatedly showed high intraclass correlation (above 0.90) while reading >500 duplicate images across varying time frames.
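The arithmetic behind PD can be illustrated with a minimal sketch. It assumes the two thresholds have already been chosen (in Cumulus the operator sets them interactively) and that the off-breast background is zero. The function name `percent_density`, its parameters, and the toy image are hypothetical and are not part of the Cumulus software. The ratio is multiplied by 100 here so that it matches the percentages reported in the tables.

```python
def percent_density(image, breast_threshold, dense_threshold):
    """Return dense area / total breast area, expressed as a percentage.

    image: 2-D list of pixel intensities (rows of numbers).
    breast_threshold: pixels above this value count as breast tissue.
    dense_threshold: breast pixels at or above this value count as dense.
    Both thresholds are inputs; this sketch does not choose them.
    """
    breast_pixels = 0
    dense_pixels = 0
    for row in image:
        for value in row:
            if value > breast_threshold:
                breast_pixels += 1
                if value >= dense_threshold:
                    dense_pixels += 1
    if breast_pixels == 0:
        raise ValueError("no breast pixels found above breast_threshold")
    return 100.0 * dense_pixels / breast_pixels


# Toy 4 x 4 image: zeros are blacked-out background (illustrative values only).
toy_image = [
    [0, 0, 10, 12],
    [0, 20, 50, 60],
    [0, 15, 55, 70],
    [0, 0, 11, 13],
]
pd_value = percent_density(toy_image, breast_threshold=0, dense_threshold=50)
```

In this toy image, 10 pixels are labeled as breast and 4 of them as dense, so `pd_value` is 40.0.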
The Automated (Automated Breast Density Method) Approach. The automated breast density method (23) is an extension of earlier work that analyzed digitized mammograms after applying digital filtering (27). This work showed a strong correspondence between areas of increased breast density in the raw image (bright image areas) and areas of increased pixel variance within a special type of high-pass filtered representation (23) of the raw image. In the original developmental work (23, 27), a deconvolution operation defined the special high-pass filter application, which is a prewhitening filter. A prewhitening operation removed the Fourier spectral form of the raw image, leaving a zero mean noise field with little spatial correlation. The prewhitening operation was applied by first estimating the Fourier spectral form for a given raw image and then constructing a filter to remove the form. In the work presented here, a wavelet high-pass filter was used in place of the deconvolution operation because it was faster, did not require estimating the raw image spectral form, and produced similar breast density results in the preliminary (training) analysis. The wavelet filter methods used here were discussed in detail previously (28, 29). The automated breast density method operated by applying a statistical test within the filtered image by scanning a small search window (defined below) across the image. Within each window positioned at a given location, a statistical test based on the filtered image variance was applied to detect regions corresponding to brighter regions in the raw image. The statistical test was based on χ² analysis, which followed from the spatial statistical qualities of the filtered image representation. The automated breast density method produced an area-based measure (a ratio) similar to PD, which was summarized and normalized in the same way once a given image was binary labeled by the outcomes of the statistical tests.
The automated breast density method measure will be referred to as PDA in this report.
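The idea of a windowed variance test can be illustrated with a simplified, single-stage sketch. This is not the published two-stage algorithm (23). It assumes a zero-mean filtered image, non-overlapping 4 × 4 windows, a known reference noise variance, and a caller-supplied χ² critical value. The names `label_windows`, `dense_fraction`, `reference_variance`, and `critical_value` are hypothetical.

```python
def label_windows(filtered, reference_variance, critical_value, box=4):
    """Label each box x box window as dense (True) or fatty (False).

    For each window, the statistic (n - 1) * s^2 / reference_variance is
    compared with critical_value, where s^2 is the sample variance of the
    n filtered pixels in the window. Under a Gaussian noise assumption this
    statistic follows a chi-square distribution with n - 1 degrees of freedom.
    """
    rows, cols = len(filtered), len(filtered[0])
    labels = []
    for r0 in range(0, rows - box + 1, box):
        label_row = []
        for c0 in range(0, cols - box + 1, box):
            values = [filtered[r][c]
                      for r in range(r0, r0 + box)
                      for c in range(c0, c0 + box)]
            n = len(values)
            mean = sum(values) / n
            sample_var = sum((v - mean) ** 2 for v in values) / (n - 1)
            statistic = (n - 1) * sample_var / reference_variance
            label_row.append(statistic > critical_value)
        labels.append(label_row)
    return labels


def dense_fraction(labels):
    """Proportion of windows labeled dense (an area-based ratio)."""
    flat = [flag for row in labels for flag in row]
    return sum(flat) / len(flat)


# Toy filtered image: a low-variance window next to a high-variance window.
low = [1, -1, 1, -1]
high = [5, -5, 5, -5]
toy_filtered = [low + high for _ in range(4)]

# 25.0 is roughly the upper 5% point of chi-square with 15 degrees of freedom.
toy_labels = label_windows(toy_filtered, reference_variance=1.0, critical_value=25.0)
toy_fraction = dense_fraction(toy_labels)
```

Here the left window yields a statistic of about 16 and the right window 400, so only the right window is labeled dense and `toy_fraction` is 0.5. A real implementation would also restrict the windows to the breast field of view.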
Briefly, we first converted the image data from the Lumiscan 75 scanner digitizer to the digital representation from the DBA digitizer (DBA Systems) used in the earlier work (23, 27, 28). Both systems used digitized film data, but the Lumiscan 75 scanner is linear in its optical density pixel value relation whereas the DBA is exponential. This involved linearly transforming the Lumiscan 75 scanner data to optical density units and then converting to the older representation using the relation: original representation = 29891 exp (-2.36 optical density). Next, the breast region field of view was automatically located relative to the off-breast background. This was done by taking all pixels with pixel values >0 because the off-breast background had already been blacked out as described above. Finally, the automated breast density method was applied to the breast field of view of the filtered image. This involved automatically dividing the filtered image into a grid of 4 × 4 pixel boxes (the box is the search window) and then applying an automated statistical decision to each grid location so that all pixels within a given grid were either labeled as fatty tissue or dense tissue (in the first stage). A second iteration was done to refine the labeling of fat and dense tissue, resulting in the density-labeled automated breast density method output image and overall proportion of density (in the second stage). The two-stage detection scheme was discussed in considerable detail in the original report (23).
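The digitizer conversion described above can be sketched as two steps. The exponential relation is taken from the text. The linear calibration from Lumiscan 75 pixel values to optical density is not given in this report, so the `slope` and `intercept` values below are placeholders chosen only for illustration, and the function names are hypothetical.

```python
import math


def lumiscan_to_optical_density(pixel_value, slope, intercept):
    """Hypothetical linear calibration: OD = slope * pixel_value + intercept.

    The true calibration constants are not stated in the text.
    """
    return slope * pixel_value + intercept


def optical_density_to_dba(optical_density):
    """Relation quoted in the text: 29891 * exp(-2.36 * optical density)."""
    return 29891.0 * math.exp(-2.36 * optical_density)


# Placeholder calibration values, NOT the scanner's actual constants.
example_od = lumiscan_to_optical_density(2000, slope=-0.001, intercept=4.095)
example_dba = optical_density_to_dba(example_od)
```

Because the relation is a decaying exponential, an optical density of 0 maps to 29891 and larger optical densities map to progressively smaller values.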
To establish parameters for the automated breast density method algorithm, a set of mammogram images on healthy women (training data) was obtained from a similar Caucasian population. The automated breast density method has two adjustable detection parameters corresponding to each detection stage. These values were estimated via correlation analysis with the Cumulus user-assisted method for the images in the training set. The parameters were fixed when a maximum correlation was achieved.
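One way to picture this parameter selection is a grid search that keeps the parameter pair giving the highest Pearson correlation with the Cumulus values on the training images. The sketch below is an assumption about the search procedure, which the text does not describe in detail. The function `fit_detection_parameters`, the candidate grids, and the `estimate_pda` callable are hypothetical stand-ins.

```python
import math
from itertools import product


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def fit_detection_parameters(images, cumulus_pd, estimate_pda, grid1, grid2):
    """Return (best_r, p1, p2) maximizing correlation with the Cumulus PD.

    estimate_pda(image, p1, p2) is a caller-supplied density estimator that
    takes the two detection-stage parameters.
    """
    best = None
    for p1, p2 in product(grid1, grid2):
        pda = [estimate_pda(image, p1, p2) for image in images]
        r = pearson(pda, cumulus_pd)
        if best is None or r > best[0]:
            best = (r, p1, p2)
    return best
```

Once the best pair is found on the training data, it would be held fixed for all study images, as described above.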
Breast Imaging Reporting and Data System Tissue Composition Descriptions. The Breast Imaging Reporting and Data System four-category tissue composition assessment has been part of standard clinical practice at the Mayo Clinic since 1992. Mayo Clinic attending radiologists classified Breast Imaging Reporting and Data System tissue composition into one of four categories as defined in the Breast Imaging Reporting and Data System lexicon (American College of Radiology, third edition): (a) the breast is almost entirely fat; (b) there are scattered fibroglandular densities; (c) the breast tissue is heterogeneously dense, which may lower the sensitivity of mammography; and (d) the breast is extremely dense, which could obscure a lesion on mammography. These ratings convey the relative possibility that a lesion may be obscured in mammography. All four mammogram views (craniocaudal and mediolateral oblique for ipsilateral and contralateral sides) contribute to the assessment of Breast Imaging Reporting and Data System tissue composition. In our study, we used the Breast Imaging Reporting and Data System estimates that experienced radiologists assessed in the clinical setting. These radiologists did not systematically assess Breast Imaging Reporting and Data System composition for this study, but this rating has shown adequate interobserver validity (30). The Breast Imaging Reporting and Data System estimate used in current clinical practice includes quantifying the percentage of breast density into four categories in conjunction with the previous descriptors. However, due to the retrospective data collection, the Breast Imaging Reporting and Data System used in our analyses followed the older convention that did not include breast density percentages.
Statistical Analysis
Previous results from this case-control study illustrated the association of PD (Cumulus method) with breast cancer was invariant to the mammogram view or (for the cases) whether the view was based on the cancerous or noncancerous breast (24). Thus, we restricted our evaluation in this report to the noncancerous breast and present results for the craniocaudal and mediolateral oblique views.
Summaries of the distribution of matching variables were presented as means and standard deviations (SDs) or counts and percentages. Both PD and PDA were approximately normally distributed in this population. The correlation of PDA with PD was estimated using the Pearson R for the entire range of breast density values. The correlations between mediolateral oblique and craniocaudal view for PD and PDA were also calculated using this method. The correlation of PD and PDA with the Breast Imaging Reporting and Data System classification was estimated using the Spearman r because of the discrete nature of the Breast Imaging Reporting and Data System measure. Linear regression methods and corresponding R² values evaluated the apparent curvilinear relationship between PD and PDA. Conditional logistic regression examined the associations of the three methods of breast density estimation with breast cancer. PD and PDA were examined both as a categorical measure based on quartiles of their distributions among controls and as a continuous measure reflecting an increase of one SD. For Breast Imaging Reporting and Data System, the four-category classification was used. Odds ratios and 95% confidence intervals were estimated for these measures of breast density. All models were adjusted for age and body mass index, which was calculated as weight (kilograms) divided by the square of height (square meters).
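The two codings of PD and PDA can be sketched as follows. The quartile cut points come from the control distribution, as stated above. The text does not say which SD defines the per-SD scaling, so using the SD among controls is an assumption. The function names and toy values are hypothetical, and the conditional logistic regression itself is not shown.

```python
import statistics


def control_quartile_cutpoints(control_values):
    """Three cut points (25th, 50th, 75th percentiles) among controls."""
    return statistics.quantiles(control_values, n=4)


def quartile_category(value, cutpoints):
    """Return 1 to 4: the quartile category given control-based cut points."""
    return 1 + sum(value > c for c in cutpoints)


def per_sd(value, control_values):
    """Scale a density value so one unit equals one SD among controls."""
    return value / statistics.stdev(control_values)


# Toy control densities (percent); illustrative values only.
toy_controls = [10, 20, 30, 40, 50, 60, 70]
toy_cutpoints = control_quartile_cutpoints(toy_controls)
toy_categories = [quartile_category(v, toy_cutpoints) for v in (15, 35, 45, 65)]
```

With these toy controls, the cut points are 20, 40, and 60, and the four example values fall in categories 1, 2, 3, and 4. Cases are then classified with the same control-based cut points.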
The strength of association for the three methods with breast cancer was summarized with a modified C-statistic, also known as area under the receiver operator characteristic curve. C-statistics reflect how often a model correctly identifies the case in a random case-control pair and range from 0.5 (random chance) to 1.0 (perfect prediction). To take advantage of the case-control matching, only pairs occurring within matched sets of cases and controls were used in the calculation of the C-statistic (31).
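A matched-set concordance calculation of this kind can be sketched as below. It is an illustration of the general idea and may differ from the exact modified C-statistic of ref. 31. Counting ties as half-concordant is a common convention and is an assumption here. The scores stand for model-predicted risk, and the function name and toy data are hypothetical.

```python
def matched_c_statistic(matched_sets):
    """Proportion of within-set case-control pairs ranked correctly.

    matched_sets: list of (case_score, [control_scores]) tuples, one per
    matched set. Only pairs within the same set are compared.
    """
    concordant = 0.0
    total = 0
    for case_score, control_scores in matched_sets:
        for control_score in control_scores:
            total += 1
            if case_score > control_score:
                concordant += 1.0
            elif case_score == control_score:
                concordant += 0.5  # tie counted as half (assumed convention)
    if total == 0:
        raise ValueError("no case-control pairs")
    return concordant / total


# Toy matched sets: each case with its one or two matched controls.
toy_sets = [(0.8, [0.3, 0.5]), (0.4, [0.6, 0.4]), (0.7, [0.2])]
toy_c = matched_c_statistic(toy_sets)
```

In the toy data, three of five pairs are concordant and one is tied, giving a C-statistic of 0.7.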
Results
The average time interval between the earliest mammogram and the diagnosis of cancer (or corresponding exam date) was 7.1 ± 1.5 years for cases and 7.0 ± 1.5 years for controls; these intervals were >5 years for >90% of the participants (Table 1). As described previously, the cases and controls were closely matched as shown by the similarities of age, number of screening mammograms, interval between mammograms, menopausal status, and residence (Table 1; ref. 24). Figure 1 illustrates craniocaudal mammograms from three women in the Lumiscan 75 scanner representation used for PD (Fig. 1A) compared with the converted digital DBA representation used for PDA (Fig. 1B). The corresponding PD (Fig. 1A) and PDA (Fig. 1B) values are provided in the figure legends. The converted images seem to have more contrast than the Lumiscan 75 scanner representation. The distinct difference between the representations is exemplified by the degree of bright tissue near the breast-background border region in the Lumiscan 75 scanner representation.
Table 1. Description of matching variables and PD (Cumulus method) for cases and controls
Characteristic | Case n | Case mean (SD) or % | Control n | Control mean (SD) or % | P
---|---|---|---|---|---
Matching variables | | | | |
Age at earliest mammogram (y) | 372 | 61.33 (10.36) | 713 | 61.06 (10.03) | 0.70
Interval between early and late mammogram (y) | 372 | 7.1 (1.5) | 713 | 7.0 (1.5) | 0.53
No. of screening mammograms | 372 | 5.01 (1.45) | 713 | 5.18 (1.79) | 0.11
Residence (% Olmsted County) | 183 | 49.59% | 365 | 51.19% | 0.49
Postmenopausal at earliest mammogram | 312 | 84.32% | 582 | 82.44% | 0.44
PD* | | | | |
Ipsilateral side | | | | |
Mediolateral oblique view | 366 | 28.42 (14.43) | 705 | 24.48 (13.77) | <0.001
Craniocaudal view | 366 | 30.55 (14.09) | 701 | 26.63 (14.74) | <0.001
Contralateral side | | | | |
Mediolateral oblique view | 364 | 27.95 (14.22) | 707 | 24.38 (13.54) | <0.001
Craniocaudal view | 363 | 30.76 (14.46) | 703 | 26.62 (14.56) | <0.001
*The case and control counts in the PD rows differ from those in the top portion of the table because only cases and controls with both Cumulus method and automated breast density method measures were used in these analyses.
Figure 1. A. Craniocaudal mammogram images displayed in the Lumiscan 75 scanner representation (top) and the corresponding Cumulus method density (PD)-labeled images (bottom) for three women; PD values from left to right are 13%, 56%, and 28%. B. Craniocaudal mammogram images displayed in the converted digital (DBA) representation (top) and the corresponding automated breast density method density (PDA)-detected images (bottom) for three women; PDA values from left to right are 18%, 24%, and 21%.
For the craniocaudal views, the correlation of PD-PDA was R = 0.70 (Fig. 2); for PDA–Breast Imaging Reporting and Data System, r = 0.49 (Table 2); and for PD–Breast Imaging Reporting and Data System, r = 0.57 (Table 2). Figure 2 illustrates a slight curvilinear association between the PDA and PD. The inclusion of a quadratic term entered in the linear regression model improved the model fit, increasing R² from 0.45 to 0.53 for the craniocaudal view (mediolateral oblique view R², 0.42 to 0.50). The range of density was substantially reduced for the automated breast density method (6-32%) compared with the Cumulus method (0-80%). Consequently, the variance of the automated breast density method is reduced as well (Table 3). The correlation of PD between the craniocaudal and mediolateral oblique views was higher (R = 0.90) than the between-view correlation for PDA (R = 0.78).
Figure 2. The correlation between Cumulus method (PD) and automated breast density method (PDA) for craniocaudal mammogram view among 703 controls.
Table 2. Mean PDA (automated breast density method) and PD (Cumulus method) by Breast Imaging Reporting and Data System category for controls only
BI-RADS category | Controls (n) | PDA, CC view, mean (SD) | PDA, MLO view, mean (SD) | PD, CC view, mean (SD) | PD, MLO view, mean (SD)
---|---|---|---|---|---
1 | 254 | 16.6 (5.1) | 16.4 (4.5) | 16.8 (11.0) | 15.8 (10.2)
2 | 78 | 17.5 (5.0) | 17.8 (4.2) | 21.3 (11.1) | 18.7 (9.3)
3 | 182 | 21.1 (4.5) | 20.8 (4.3) | 31.7 (12.0) | 29.0 (11.6)
4 | 198 | 23.0 (4.6) | 22.2 (4.6) | 36.2 (13.5) | 33.1 (12.9)
All categories | | 19.6 (5.5) | 19.3 (5.1) | 26.5 (14.6) | 24.3 (13.6)
r* | | 0.49 | 0.48 | 0.57 | 0.56
NOTE: Contralateral side. 707 controls were used for the Breast Imaging Reporting and Data System, PD, and PDA mediolateral oblique analyses, and 703 controls for craniocaudal analyses.
Abbreviations: BI-RADS, Breast Imaging Reporting and Data System; CC, craniocaudal; MLO, mediolateral oblique.
*Spearman correlation coefficient.
Table 3. Association of breast cancer risk factors and breast density assessed by three methods
Variable | PDA, mean ± SD | PD, mean ± SD | BI-RADS 4 (%)
---|---|---|---
Age (y) | | |
<50 | 23.0 ± 4.6 | 35.9 ± 13.8 | 45.2
50-59 | 20.3 ± 5.8 | 28.7 ± 14.9 | 36.3
60-69 | 18.5 ± 5.3 | 23.3 ± 13.7 | 19.8
70+ | 17.7 ± 4.9 | 21.9 ± 12.0 | 15.1
BMI quartiles | | |
17.1-23.5 | 21.5 ± 4.7 | 35.5 ± 14.9 | 42.6
23.6-26.1 | 20.0 ± 5.2 | 27.7 ± 12.2 | 28.6
26.2-29.9 | 19.0 ± 5.8 | 24.1 ± 13.2 | 20.1
30.0-53.7 | 17.9 ± 5.6 | 19.3 ± 12.7 | 18.7
Menopausal status | | |
Premenopausal | 22.9 ± 4.6 | 35.4 ± 13.8 | 47.6
Postmenopausal | 18.9 ± 5.4 | 24.8 ± 14.0 | 23.6
Postmenopausal hormones | | |
Never | 19.2 ± 5.5 | 25.4 ± 14.7 | 24.4
Ever | 20.8 ± 5.4 | 30.4 ± 13.8 | 35.5
Unknown | 17.8 ± 4.7 | 21.5 ± 12.3 | 28.6
First-degree family history breast cancer | | |
None | 19.7 ± 5.5 | 27.1 ± 14.7 | 28.4
Positive | 18.9 ± 5.5 | 23.0 ± 12.9 | 23.0
Parity | | |
Nulliparous | 20.7 ± 5.0 | 32.4 ± 13.8 | 32.2
1-2 children | 20.0 ± 5.6 | 28.4 ± 14.9 | 32.6
3+ children | 19.1 ± 5.6 | 24.3 ± 14.1 | 24.0
NOTE: Controls only. PD and PDA assessed from craniocaudal view from contralateral side of 703 controls. Breast Imaging Reporting and Data System estimated using both craniocaudal and mediolateral oblique view.
Abbreviation: BMI, body mass index.
All three measures showed expected associations with established breast cancer risk factors, including inverse associations with age, body mass index, parity, postmenopausal status, and never postmenopausal hormone therapy use, although the parity associations were more pronounced for PD compared with PDA or Breast Imaging Reporting and Data System (Table 3). Table 4 shows associations between these measures with breast cancer. Positive associations between breast density and breast cancer were found with all measures and both views. The risk associations for all three density estimates were comparable. PDA had higher odds ratios and wider confidence intervals than the PD and Breast Imaging Reporting and Data System classification; however, the areas under the receiver operator characteristic curve for all methods were virtually identical (Table 4). In addition, the areas under the receiver operator characteristic curve for the continuous PD and PDA measures (per one SD) were similar to those from the models that evaluated quartiles of PD and PDA (Table 4). There was no evidence of a quadratic association for either the continuous PD or PDA (data not shown).
Table 4. Association of breast density with risk for breast cancer for the Breast Imaging Reporting and Data System, Cumulus method, and automated breast density method
Category | BI-RADS case/control | BI-RADS OR (95% CI) | BI-RADS AUC | View | Quartile of PD or PDA*,† | PD case/control | PD OR (95% CI) | PD AUC | PDA case/control | PDA OR (95% CI) | PDA AUC
---|---|---|---|---|---|---|---|---|---|---|---
1 | 92/254 | 1.00 (reference) | 0.61 | CC | 1 | 56/175 | 1.00 (reference) | 0.63 | 41/178 | 1.00 (reference) | 0.64
2 | 43/78 | 1.58 (1.01-2.48) | | | 2 | 87/176 | 1.69 (1.13-2.53) | | 81/172 | 2.31 (1.47-3.64) |
3 | 86/182 | 1.49 (1.04-2.14) | | | 3 | 90/176 | 2.10 (1.37-3.22) | | 100/178 | 3.02 (1.94-4.70) |
4 | 148/198 | 2.61 (1.82-3.75) | | | 4 | 130/176 | 3.81 (2.42-5.99) | | 141/175 | 5.18 (3.26-8.20) |
 | | | | | Continuous‡ | | 1.67 (1.42-1.97) | 0.64 | | 1.79 (1.53-2.10) | 0.64
 | | | | MLO | 1 | 57/175 | 1.00 (reference) | 0.60 | 36/175 | 1.00 (reference) | 0.64
 | | | | | 2 | 98/178 | 1.83 (1.23-2.74) | | 75/179 | 2.53 (1.56-4.10) |
 | | | | | 3 | 96/175 | 2.00 (1.32-3.05) | | 114/179 | 4.48 (2.76-7.27) |
 | | | | | 4 | 113/179 | 2.82 (1.81-4.40) | | 139/174 | 6.14 (3.74-10.08) |
 | | | | | Continuous‡ | | 1.56 (1.33-1.83) | 0.63 | | 1.89 (1.61-2.23) | 0.64
Category . | BI-RADS . | . | AUC . | View . | Quartile PD or PDA*,† . | PD . | . | . | PDA . | . | . | |||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
. | Case/control . | OR (95%) . | . | . | . | Case/control . | OR (95%) . | AUC . | Case/control . | OR (95%) . | AUC . | |||||
1 | 92/254 | 1.00 (reference) | 0.61 | CC | 1 | 56/175 | 1.00 (reference) | 0.63 | 41/178 | 1.00 (reference) | 0.64 | |||||
2 | 43/78 | 1.58 (1.01-2.48) | 2 | 87/176 | 1.69 (1.13-2.53) | 81/172 | 2.31 (1.47-3.64) | |||||||||
3 | 86/182 | 1.49 (1.04-2.14) | 3 | 90/176 | 2.10 (1.37-3.22) | 100/178 | 3.02 (1.94-4.70) | |||||||||
4 | 148/198 | 2.61 (1.82-3.75) | 4 | 130/176 | 3.81 (2.42-5.99) | 141/175 | 5.18 (3.26-8.20) | |||||||||
Continuous‡ | 1.67 (1.42-1.97) | 0.64 | 1.79 (1.53-2.10) | 0.64 | ||||||||||||
MLO | 1 | 57/175 | 1.00 (reference) | 0.60 | 36/175 | 1.00 (reference) | 0.64 | |||||||||
2 | 98/178 | 1.83 (1.23-2.74) | 75/179 | 2.53 (1.56-4.10) | ||||||||||||
3 | 96/175 | 2.00 (1.32-3.05) | 114/179 | 4.48 (2.76-7.27) | ||||||||||||
4 | 113/179 | 2.82 (1.81-4.40) | 139/174 | 6.14 (3.74-10.08) | ||||||||||||
Continuous‡ | 1.56 (1.33-1.83) | 0.63 | 1.89 (1.61-2.23) | 0.64 |
NOTE: Contralateral Side. All models were adjusted for age and body mass index. For all models, P values from tests for trend were <0.001. Numbers of cases and controls are the same for the analyses of PD and PDA done within craniocaudal (363 cases, 703 controls) and mediolateral oblique (364 cases, 707 controls) views. Numbers of cases and controls for Breast Imaging Reporting and Data System analyses were 369 cases and 712 controls.
Abbreviations: OR, odds ratio; AUC, area under the receiver operator characteristic curve.
*Cut points for quartiles of Cumulus method PD for craniocaudal view are 15.8, 25.8, and 35.6, and for mediolateral oblique view are 14.5, 23.7, and 32.7.
†Cut points for quartiles for automated breast density method PDA for craniocaudal view are 15.4, 20.1, and 23.9, and for mediolateral oblique view are 15.7, 19.4, and 23.0.
‡Continuous PD and PDA assessments correspond to a one SD increase. SDs are 14.6 and 13.5 for PD from craniocaudal and mediolateral oblique views and 5.5 and 5.1 for PDA from craniocaudal and mediolateral oblique views.
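Because the continuous odds ratios in Table 4 are expressed per one SD, and the SDs of PD and PDA differ, the per-SD estimates are not on a common per-unit scale. The following is a minimal sketch of how a per-SD odds ratio can be re-expressed per a chosen number of percentage points, assuming the log-linear relation implied by a logistic model with a continuous covariate. The function name `or_per_unit` is hypothetical and is not part of the study's analysis.

```python
import math


def or_per_unit(or_per_sd: float, sd: float, units: float = 1.0) -> float:
    """Rescale an odds ratio reported per one SD to an odds ratio per `units`.

    Assumes the log odds are linear in the measure, so the coefficient
    per unit is log(OR per SD) / SD.
    """
    if or_per_sd <= 0 or sd <= 0:
        raise ValueError("odds ratio and SD must be positive")
    beta_per_unit = math.log(or_per_sd) / sd
    return math.exp(beta_per_unit * units)


# Craniocaudal-view values taken from Table 4 and its footnotes.
pd_cc_per_10 = or_per_unit(1.67, 14.6, units=10)   # Cumulus method PD
pda_cc_per_10 = or_per_unit(1.79, 5.5, units=10)   # automated method PDA

print(f"PD,  per 10 percentage points: {pd_cc_per_10:.2f}")
print(f"PDA, per 10 percentage points: {pda_cc_per_10:.2f}")
```

Under this assumption the craniocaudal estimates come to roughly 1.4 for PD and 2.9 for PDA per 10 percentage points. The gap largely reflects the compressed PDA scale rather than a stronger association, which is why the per-SD form is the more comparable one.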
Discussion
The results of the current study show that the automated breast density method is a viable option for estimating breast density. PDA is correlated with the PD and Breast Imaging Reporting and Data System estimates, and the risk estimates for quartiles of PDA are comparable with reported estimates from a variety of studies (1). In addition, PDA discriminates between case and control status as well as PD or Breast Imaging Reporting and Data System does.
The incorporation of breast density into risk assessment models could provide a woman with an improved estimate of her absolute breast cancer risk. Accurate knowledge of individual risk results in more informed clinical decisions about interventions and screening frequency and can provide a reference of where a woman lies with respect to average risk (32). Data show that clinical radiologists' Breast Imaging Reporting and Data System categorization, together with age and ethnicity, provides as much information as the existing Gail model for breast cancer risk prediction (18), underscoring its relative importance as a marker of risk. Two recent studies evaluated the addition of breast density to risk prediction models and found some improvement of fit (19, 20). To facilitate the incorporation of breast density into clinical practice and decision making, the automated breast density method could be added to existing computer-assisted diagnostic systems (33) currently used in clinical practice to detect suspicious areas on mammograms. This would allow for a quantitative breast density measure that is automatically and objectively generated, guaranteeing reproducible and comparable measurements across sites. In addition, the automated breast density method would result in greater time and cost savings in comparison with user-assisted or experienced observer interpretation methods.
In addition to improving absolute risk prediction, the automated breast density method may also be the most appropriate measure for assessing change in breast density. Change in breast density has been used as an intermediate biomarker for efficacy of interventions, such as tamoxifen or gonadotropin-releasing hormone agonist (34, 35). When assessing the amount of change in density, a quantitative measure is preferable because the degree of change that can be detected is limited by the resolution of the measure. We (36) and others (35, 37, 38) have shown that breast density changes that occur in postmenopausal women are small in magnitude and could be missed if assessed solely by the Breast Imaging Reporting and Data System categorical measure. Thus, for the assessment of serial breast density changes, especially in postmenopausal women, a quantitative estimate such as the automated breast density method or Cumulus method is preferred. Furthermore, the automated breast density method may be preferable for comparability of breast density change across studies and institutions because it is independent of the operator, whereas the Cumulus method requires thresholds that are subject to interreader variation.
The automated breast density method is currently an experimental software approach that requires further development, testing, and validation in other study populations. The film data used for this study, including the training data used for initialization of the automated breast density method parameters and the case-control study data, were collected by one center and digitized with a single digitizer (Lumiscan 75 scanner). The automated breast density method requires modification to accept image data acquired from various mammogram detectors (digitized film or full-field digital mammography acquisition). Preliminary work (39) shows that when applying the automated breast density method to full-field digital mammography data, the density estimate correlates with a calibrated full-field digital mammography breast density measure (9) when applied to the same data set. Thus, it is possible that the automated breast density method can be applied to full-field digital mammography data to discriminate case and control status.
Although the automated breast density method and Cumulus method measures are correlated, the range of density was substantially narrower for the automated breast density method (6-32%) when compared with the Cumulus method (0-80%). This difference in scale is partly due to the different data representations of the Lumiscan 75 scanner and the DBA digitizer outputs. The Lumiscan 75 scanner raw image pixel range is compressed relative to the DBA representation. This difference, together with the way that the parameters of the automated breast density method–PDA were selected to maximize its linear correlation with Cumulus method–PD, results in different scales for the two breast density quantitative estimates. However, these scale differences do not represent a serious obstacle because of the near-linear relation. It would be relatively easy to either expand one scale or compress the other to obtain breast density scores that are similarly scaled. We have retained the simple estimates from each to make comparisons between the two scales similar to the way the Celsius and Fahrenheit temperature scales are compared.
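The rescaling described above amounts to a linear map between two scales. The following is a minimal sketch, assuming paired measurements on both scales are available and that an ordinary least-squares line is an acceptable way to estimate the map. The function names are hypothetical, and because no paired PD and PDA values are listed here, the demonstration uses the Celsius and Fahrenheit analogy from the text instead of density data.

```python
def fit_linear_map(x, y):
    """Least-squares slope and intercept such that y is approximately slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rescale(values, slope, intercept):
    """Map values from the source scale onto the target scale."""
    return [slope * v + intercept for v in values]


# Illustration with the temperature analogy (not study data).
celsius = [0.0, 10.0, 20.0, 30.0]
fahrenheit = [32.0, 50.0, 68.0, 86.0]
slope, intercept = fit_linear_map(celsius, fahrenheit)
converted = rescale([100.0], slope, intercept)
print(slope, intercept, converted)
```

With paired PDA and PD values in place of the temperatures, the same two functions would expand the narrower PDA scale onto the PD scale. Such a map changes only the units, so rank-based results such as quartile membership are unaffected as long as the slope is positive.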
On the other hand, there is little evidence indicating that either the PD or PDA scales or the corresponding breast density representations are optimal or preferable. For example, both representations follow from binary labeling. This form of labeling implies that a pixel labeled as 100% dense breast tissue within the central uniformly compressed region of the breast carries as much weight as a pixel labeled similarly in the region near the breast-background border that is not of uniform thickness. As another example, in the extreme case, a breast labeled as near 100% breast density would imply a breast volume composed of nearly 100% dense breast tissue, which is unlikely.
The automated breast density method requires a consistent method for segmenting the breast tissue from the digital image. The work presented here represents an ideal application because each image background was set to zero before the automated breast density method was applied, which made segmentation of the breast tissue trivial. This does not represent a hindrance to its application on other data sets because many fully automated routines also segment the breast tissue from the background (40-42).
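To illustrate why a zeroed background makes segmentation trivial, and how a percent density follows from binary labeling, the following is a minimal sketch on a toy image. It assumes that every nonzero pixel belongs to the breast and, as a stand-in, labels dense tissue with a single intensity threshold in the spirit of user-assisted thresholding. It does not reproduce the labeling step of the automated breast density method, and the function names and pixel values are hypothetical.

```python
def breast_mask(image):
    """Boolean mask of breast pixels, assuming the background was set to zero."""
    return [[pixel > 0 for pixel in row] for row in image]


def percent_density(image, dense_threshold):
    """Percent of breast pixels labeled dense under a simple binary threshold.

    Every breast pixel counts equally, regardless of where it lies in the
    breast. Breast pixels with a value of exactly zero would be mistaken
    for background.
    """
    mask = breast_mask(image)
    breast_pixels = 0
    dense_pixels = 0
    for row, mask_row in zip(image, mask):
        for pixel, in_breast in zip(row, mask_row):
            if in_breast:
                breast_pixels += 1
                if pixel >= dense_threshold:
                    dense_pixels += 1
    if breast_pixels == 0:
        raise ValueError("no breast pixels found")
    return 100.0 * dense_pixels / breast_pixels


# Toy 3 x 4 image: zeros are background, other values are arbitrary intensities.
toy_image = [
    [0, 0, 40, 60],
    [0, 30, 80, 90],
    [0, 20, 50, 0],
]
toy_pd = percent_density(toy_image, dense_threshold=60)
print(f"{toy_pd:.1f}% dense")
```

In the toy image, seven pixels are nonzero and three of them meet the threshold, so the sketch reports about 43% density.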
The generalizability of the automated breast density method is currently limited to film mammography, which comprises ∼85% of mammography units in the United States (21). Developing methods for film analysis may be questioned in light of the emerging full-field digital mammography technology, which may provide many benefits because of its digital form (43). However, film mammograms are still the mainstream for screening, and the associated data archives that span several years are capable of supporting clinical studies. This is not generally the case for full-field digital mammography data. On the other hand, full-field digital mammography data may lend themselves to acquisition standardization (or calibration) more easily than film, but the techniques for this purpose are currently works in progress (9, 11, 12) that warrant validation in terms of their short-term and long-term ability to produce a measure that correlates with risk. Various manufacturers of full-field digital mammography systems have raw (inverted data scales) and display data representations. Some full-field digital mammography systems allow easy access to the raw data, whereas other systems do not. Likewise, some centers discard the raw data because of storage limitations, which limits the analyses that can be done on the full-field digital mammography images. At this time, both forms of mammography (film and the various full-field digital mammography systems) exist simultaneously, and both can assist in improving risk estimations.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Grant support: Department of Defense (grant DAMD 17-00-1-0331) and National Cancer Institute (grants R01 CA97396 and P50 CA116201).
Acknowledgments
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
We thank the women within our mammography practice who provided research authorization for medical record studies.