Abstract
Background: Mammographic density is a strong risk factor for breast cancer.
Methods: We present a novel approach to enhance area density measures that takes advantage of the relative density of the pectoral muscle that appears in lateral mammographic views. We hypothesized that the grey scale of film mammograms is normalized to volume breast density but not pectoral density and thus pectoral density becomes an independent marker of volumetric density.
Results: In a Swedish case–control study (1,286 breast cancer cases and 1,391 control subjects, ages 50–75 years), the mean intensity of the pectoral muscle (MIP) was strongly associated with breast cancer risk [per SD: OR = 0.82; 95% confidence interval (CI), 0.75–0.88; P = 6 × 10⁻⁷] after adjusting for a validated computer-assisted measure of percent density (PD), Cumulus. The area under the curve (AUC) increased from 0.600 to 0.618 when PD with the pectoral muscle as reference was used instead of a standard area-based PD measure. In a subset of women with genetic data, MIP was associated with rs10995190, a genetic variant known to be associated with mammographic density and breast cancer risk. We replicated the association between MIP and rs10995190 in an additional cohort of 2,655 breast cancer cases (combined P = 0.0002).
Conclusions: MIP is a marker of volumetric density that can be used to complement area PD in mammographic density studies and breast cancer risk assessment.
Impact: Inclusion of MIP in risk models should be considered for studies using area PD from analog films. Cancer Epidemiol Biomarkers Prev; 23(7); 1314–23. ©2014 AACR.
Introduction
Mammographic density, which is often quantified as the ratio of fibroglandular tissue area to total breast area in mammograms and termed percent density (PD), has been confirmed by many studies to be one of the strongest risk factors for breast cancer in women (1, 2). The fibroglandular (dense) tissue is visualized in mammograms as the bright area, in contrast to the darker regions of fatty, radiolucent tissue. For film mammograms, a widely used, semiautomated approach for measuring density is the software tool Cumulus (3). Cumulus is a simple tool that allows a trained reader to define the total breast area and dense breast area using 2 histogram thresholds. In recent years, a number of other algorithms for measuring PD have also been developed (4–7). However, these tools effectively operate on the same principles. Although film mammography screening has largely been replaced by full-field digital mammography, film mammograms remain one of the largest resources for breast density studies because breast cancer is a latent disease and many cancer outcomes are evidenced by film mammograms dating back 15 to 20 years. Thus, the analysis of film mammograms will be critically important for advancing our understanding of breast cancer for the foreseeable future.
Several studies have shown that categorical scoring of mammographic density and area mammographic density measures can contribute to breast cancer risk prediction (8–10). Yet, other studies have shown that improved risk prediction can be obtained using additional aspects of mammographic images, such as texture-based measures (11) and measuring mammographic density on a volumetric scale (12, 13).
One way to investigate whether existing approaches to PD measurement miss relevant risk information is to assess the importance of acquisition parameters (many of which are steered by the mammographic automatic exposure control, AEC, system) for breast cancer risk, independently of the measured density. For film images, however, these parameters are often not stored or embedded in the image itself, as is the case for the images collected in the studies included in this article. Olson and colleagues (14) nevertheless manually extracted a selection of image acquisition parameters from film images and evaluated their association with breast cancer. They found no evidence that X-ray tube potential, X-ray tube current-time, breast thickness, or compression force were associated with breast cancer status after adjusting for PD, or that they confound the association between PD and breast cancer.
In this article, we describe a novel approach to enhance area density measures that takes advantage of the relative density of the pectoral muscle that is visible in lateral mammographic views.
Materials and Methods
On scrutinizing the digitized film images in our studies, we detected a visible fluctuation in illuminance across film mammograms, which was apparent even among images with similar PD measurements. Some selected examples are shown in Fig. 1. The images in the bottom row of Fig. 1 are dim (weak illuminance) compared with the images in the top row (strong illuminance). Each column comprises 2 mammograms with similar Cumulus PD readings.
Fig. 1. Variability in illuminance observed in film mammograms retrieved with similar PD readings. The rows show images with similar illuminance, whereas the columns show images with similar area PD.
Because the exposure conditions of the breast are decided by a sensor placed under the film and in the middle of the breast tissue, the image illuminance is governed by the attenuation of the breast thickness and volumetric density, but illuminance is not associated with pectoral density. The sensor is not placed in the pectoral region. Thus, the image pixels in the pectoral muscle may be an independent monitor of exposure conditions related to the volumetric density. In short, we hypothesized that the grey scale of film mammograms is normalized to volume breast density but not pectoral density and thus pectoral density becomes an independent marker of volumetric density. Specifically, we believed that the images in the bottom row of Fig. 1, relative to those in the top row, have high attenuation due to dense tissue thickness located toward the center of the breast, so that these images, with decreased illuminance, will have an excess breast cancer risk which is not captured by area-based PD.
To capture pectoral density, we use the mean pixel intensity in the pectoral muscle (MIP) appearing in mediolateral oblique (MLO) images and propose its use in enhancing area-based measures of mammographic density. Following the reasoning above, we expect MIP to be negatively associated with breast cancer risk. We first assessed the association of MIP with breast cancer status, adjusting for an existing area-based measure of PD, in a population-based case–control study of postmenopausal breast cancer. Owing to the (small) possibility that our results could be biased because of systematic differences in machine models/measurements between breast cancer cases and healthy controls, we subsequently assessed the association of MIP with a genetic variant established as being associated with both mammographic density and breast cancer risk. We did this first in a subset of the case–control data, including only those women for whom genetic data were available. We subsequently replicated the association in an additional cohort of cases. Finally, we combined an existing area-based PD measure with MIP to obtain a measure of PD with the pectoral muscle as reference and assessed its performance in case–control classification. The causal relationships that we propose between the key entities studied in this article are described in Fig. 2, in which observed quantities are contained in boxes and unobserved quantities are in ovals.
Fig. 2. Proposed causal relationships between entities pertaining to mammographic density. The directions of the arrows depict the directions of influence.
Study populations
CAHRES (CAncer and Hormone REplacement Study) is a population-based breast cancer case–control cohort which was initiated in the mid-1990s and includes approximately 6,000 women (∼3,000 cases and 3,000 controls) from which area-based PD measurements of approximately 3,000 mammograms are available (4, 15). Body mass index (BMI) was recorded at entrance to the study, whereas age was assessed according to date of mammography (15). In this article, we include data from 2,677 women on whom complete data were available with respect to age, BMI, and area-based PD. Of these women, 1,235 had available information on the genetic variant studied in this article. Images were selected as those which were closest to (but before) date of diagnosis (for cases) or closest to date of questionnaire (for controls). The contralateral image was selected for cases, whereas for controls, side (left/right) was selected at random.
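The image-selection rule for cases can be sketched in a few lines. This is only an illustration: the function name `select_case_image` and the example dates are hypothetical, and the sketch assumes that "before" means strictly earlier than the date of diagnosis.

```python
from datetime import date


def select_case_image(image_dates, diagnosis_date):
    """Return the latest mammogram date strictly before diagnosis, or None.

    Assumption (not stated in the text): an image taken on the day of
    diagnosis does not count as "before" diagnosis.
    """
    earlier = [d for d in image_dates if d < diagnosis_date]
    return max(earlier) if earlier else None


# Hypothetical mammogram dates for one case
dates = [date(1992, 5, 1), date(1994, 2, 28), date(1995, 1, 10)]
chosen = select_case_image(dates, date(1994, 9, 1))
```

For controls, the same idea would apply with the questionnaire date as the reference point, although the text does not restrict control images to those taken before that date.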
Libro-1 is a large cohort study of breast cancer cases, which comprises women in the Stockholm–Gotland region who were diagnosed with breast cancer between 2001 and 2008. The overall aim of the study is to identify prognostic factors for breast cancer. We included 2,655 women with complete data on age, BMI, area-based PD, and genotype (rs10995190). Contralateral images, closest to (before) diagnosis, were selected. All participants provided written informed consent, and the study was approved by the ethical review board at Karolinska Institutet (Stockholm, Sweden).
Established approaches to PD measurement used in this study
In both studies, the film mammograms were digitized using an Array 2905HD Laser Film Digitizer (Array Corporation Europe). Optical densities from 0 to 4.7 were converted to image grey scale values from 0 to 4,095 (i.e., a 12-bit dynamic range where 0 represents the darkest pixel value and 4,095 represents the brightest pixel value). Within each study, all mammograms were digitized in a single batch (in random order) by the same person. The method used to measure area breast density, however, differed by study. For the CAHRES mammograms, Cumulus was used, whereas an ImageJ macro-plugin (4) was used to analyze the Libro-1 mammograms. The reason for this difference is that a trained reader was not available to read the Libro-1 images.
Cumulus is the most widely used method for measurement of mammographic density in analog images and is a semiautomatic tool. The software has been widely validated on film mammograms and uses a simple thresholding mechanism. The first task for a Cumulus user is to manually trace and remove the pectoral muscle. The reader subsequently uses sliders to perform global (breast region) and then local (dense region) interactive thresholding. A trained reader carried out the Cumulus measurements for the 2,677 CAHRES images included here, blinded to case–control status and to tumor characteristics.
The ImageJ macro-plugin used to measure area density in Libro-1 images has been described fully by Li and colleagues (4). In brief, the macro-plugin calculates features derived from area measurements, intensity measurements, shape descriptors, and other statistical metrics under different threshold methods. The algorithm was trained against Cumulus measurements and is viewed as mimicking Cumulus.
Genetic data
Genotype data for a common SNP, rs10995190, in the gene ZNF365, confirmed to be associated with both mammographic density (P = 9 × 10⁻¹⁰; ref. 16) and breast cancer risk (P = 1 × 10⁻³⁶; ref. 17), were available from both CAHRES and Libro-1 studies. The SNP was included on a custom Illumina iSelect genotyping array (iCOGS) designed for replication and fine mapping of common and rare variants with relevance to breast, ovary, and prostate cancer, which has been used for both studies (17).
Mean intensity of the pectoral muscle
Two measures of MIP were defined, one from an automatic algorithm and the other from the manually outlined (Cumulus) pectoral muscle. The automatic method was used on all study images (CAHRES and Libro-1). Automatic detection of the pectoral muscle (from which the mean pixel intensity was measured) from mammographic images uses a segmentation method described generically (18), which has been extended successfully to segment regions of interest in mammographic images, as described in detail (Cheddad and colleagues, unpublished observations). Because segmentation of the pectoral muscle is fully automated in this approach, mal-segmentation (e.g., missing the pectoral muscle's true boundary) of the pectoral muscle could occur. For a subset of the CAHRES images (1,725 of the 2,677 images), it was, however, also possible to retrieve the locations (x, y coordinates) of the manually (Cumulus-reader) outlined pectoral muscles, from which we derived alternative “manual” MIP measurements. We compared automatic and manual MIP measurements to validate the robustness of the automatic approach. The automatic approach is used throughout the article except when explicitly stated otherwise.
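Once a pectoral-muscle region is available, whether from the automatic segmentation or from the Cumulus outline, MIP reduces to a mean over the pixels inside that region. The sketch below is a minimal illustration and not the authors' implementation: the function name, the nested-list image, and the 0/1 mask are hypothetical, and dividing by 4,095 to bring MIP onto a 0 to 1 scale is an assumption suggested by the MIP values reported in the tables.

```python
def mean_pectoral_intensity(image, mask, max_grey=4095):
    """Mean grey value of the pixels inside the pectoral mask, scaled to [0, 1].

    image: rows of 12-bit grey values (0 = darkest, 4095 = brightest).
    mask:  rows of 0/1 flags, 1 marking pectoral-muscle pixels.
    Scaling by max_grey is an assumption, not a detail given in the text.
    """
    values = [
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, inside in zip(image_row, mask_row)
        if inside
    ]
    if not values:
        raise ValueError("pectoral mask is empty")
    return sum(values) / (len(values) * max_grey)


# Toy 3 x 3 image with the pectoral muscle in the upper-left corner
image = [[4095, 3276, 100],
         [3276, 2457, 50],
         [200, 100, 0]]
mask = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]
mip = mean_pectoral_intensity(image, mask)
```

Running the same function on an automatic mask and a manual mask for the same image gives the two MIP variants that are compared in the validation analysis.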
Statistical methods
For the case–control study of postmenopausal breast cancer, CAHRES, we first evaluated the association of MIP with case–control status using unconditional logistic regression (case–control status as dependent variable and MIP as the independent variable), adjusting for PD, age, and BMI. PD was square-root–transformed and MIP was transformed using a Box–Cox transformation (power = 4.9) before analysis. We subsequently repeated the analysis, additionally adjusting for a number of other potential confounders, namely, hormone replacement therapy (HRT), parity, and age at first birth. In a further analysis, we also adjusted for county of residence and date of mammogram. We fitted linear regression models with MIP as outcome variable and included a variety of variables as covariates to learn about the mechanisms behind our measure of pectoral muscle density. We then carried out regression analyses based on the subset of images for which the manual measure of MIP, based on manual identification of the pectoral muscle, was available (using both measures of MIP, separately). For this subset, we also evaluated the Pearson coefficient of correlation between the 2 MIP measures.
To evaluate the association of MIP with genotypes of the SNP rs10995190, we fitted linear regression models using PD and MIP as outcome variables and carried out Wald tests. Regression models were first fitted to the CAHRES and Libro-1 data sets separately. Age, BMI, and rs10995190 genotype (coded 0/1/2, treated as a continuous variable) were included as covariates in initial analyses, and other covariates were included in additional analyses. We also analyzed data from the 2 studies jointly and additionally included study indicator (CAHRES/Libro-1) as a covariate. For MIP as an outcome variable, we repeated the analysis additionally adjusting for the square root of PD as a covariate.
Finally, we evaluated a measure of mammographic density, calculated as a weighted sum of $PD^{0.5}$ and $MIP^{4.9}$, in case–control classification (using CAHRES images). The measure represents an enhancement of PD from using the pectoral muscle, which we call PD with the pectoral muscle as reference (PDP). To enable a fair comparison of PDP with PD, the weights of PDP are derived from the results of the genetic association analysis. We performed a grid search to find the optimal value of Δ in
$$PDP = PD^{0.5} + \Delta \times MIP^{4.9},$$
with the optimal value of Δ being defined as that which resulted in the strongest association with rs10995190, measured in terms of a Wald test statistic assessing the association between PDP and rs10995190 in a regression analysis (with PDP as outcome and adjusted for age and BMI). We then used the optimal value of Δ (= −13.7) to evaluate the performance of PDP in classification of cases and controls using CAHRES. We used only Libro-1 images to estimate Δ so that separate sets of images were used for “training” and “testing.” We compared PDP and PD by assessing the quality of different logistic regression models for breast cancer status, first, in terms of the Akaike information criterion (AIC) and, second, in terms of area under the receiver operating characteristic curve (AUC) using the DeLong test for 2 correlated receiver operating characteristic (ROC) curves.
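As a rough illustration of how PDP combines the two measures, the sketch below assumes that the weighted sum takes the form PD^0.5 + Δ × MIP^4.9, with PD in percent and MIP on a 0 to 1 scale. The function name `pdp` and the example values are hypothetical; only Δ = −13.7 and the two powers come from the text.

```python
import math

DELTA = -13.7  # weight reported in the text, estimated from Libro-1 images


def pdp(pd_percent, mip, delta=DELTA):
    """PD with the pectoral muscle as reference (assumed functional form).

    Assumes PDP = sqrt(PD) + delta * MIP**4.9, which is an interpretation
    of the "weighted sum" described in the text.
    """
    return math.sqrt(pd_percent) + delta * mip ** 4.9


# Two hypothetical images with the same area PD but different pectoral brightness
dim_image = pdp(16.0, 0.75)
bright_image = pdp(16.0, 0.85)
```

Because Δ is negative, the image with the dimmer pectoral muscle receives the higher PDP at equal area PD, which matches the direction of the hypothesis that decreased illuminance signals additional volumetric density.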
Results
Key characteristics of cases and controls included in our association analysis of breast cancer risk are described in Table 1. MIP was found to be significantly associated (P = 6 × 10⁻⁷) with breast cancer status after adjusting for age, BMI, and PD (Table 2). The SD of MIP (transformed) was 0.105 (mean = 0.336) which, in combination with the regression coefficient of −1.932, means that a decrease of 1 SD in MIP is estimated to be associated with a 23% increase in the odds of having breast cancer. On the basis of the same model, an increase of 1 SD in PD (transformed) was estimated to be associated with a 40% increase in odds. Fitting the same model but with PD and MIP untransformed yielded a similar result (coefficient for PD = 0.201, P = 9 × 10⁻¹¹; coefficient for MIP = −3.389, P = 2 × 10⁻⁶). The association between MIP and case–control status remained strongly significant after additionally adjusting for HRT use, parity, and age at first birth, even though the sample size was reduced because of missingness on these variables (Table 2). We also retrieved information on county of residence (there are 21 counties in Sweden) and year of mammogram and additionally adjusted for these factors. The association between MIP and case–control status remained strongly significant (regression coefficient = −2.29, P = 8 × 10⁻⁶). No evidence of an interaction effect between MIP and PD was found (data not shown).
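The per-SD odds ratio follows directly from the logistic regression coefficient and the SD quoted above, since a change of 1 SD multiplies the odds by exp(coefficient × SD). The short sketch below reproduces that arithmetic; the function name is hypothetical and the inputs are the rounded values from the text.

```python
import math


def odds_ratio_per_sd(coefficient, sd):
    """Odds ratio for a 1-SD increase in a covariate of a logistic model."""
    return math.exp(coefficient * sd)


# Coefficient and SD of transformed MIP as reported in the text
or_mip_increase = odds_ratio_per_sd(-1.932, 0.105)  # about 0.82 per 1-SD increase
or_mip_decrease = 1 / or_mip_increase               # about 1.22 per 1-SD decrease
```

An odds ratio of about 0.82 per SD increase in MIP agrees with the value given in the abstract and corresponds to odds roughly 22% to 23% higher per SD decrease.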
Table 1. Key characteristics of individuals included in the breast cancer case–control study (CAHRES)
Characteristic | Cases | Controls | P |
---|---|---|---|
Number^a | 1,286 | 1,391 | |
HRT use | | | |
Never | 861 (68) | 1,074 (79) | 1 × 10⁻¹⁰ |
Past | 102 (8) | 47 (3) | |
Current | 301 (24) | 245 (18) | |
Parity and age at first birth | | | |
Nulliparous | 172 (13) | 143 (10) | 3 × 10⁻⁷ |
Parity ≤ 2 and age at first birth ≤ 25 | 381 (30) | 378 (27) | |
Parity ≤ 2 and age at first birth > 25 | 403 (31) | 377 (27) | |
Parity > 2 and age at first birth ≤ 25 | 239 (19) | 387 (28) | |
Parity > 2 and age at first birth > 25 | 84 (7) | 103 (8) | |
Age, y | 62.9 (6.5) | 63.9 (6.4) | 5 × 10⁻⁵ |
BMI | 25.3 (3.7) | 25.1 (3.9) | 5 × 10⁻⁵ |
PD | 18.7 (15.0) | 14.8 (13.5) | 1 × 10⁻¹² |
$\sqrt{PD}$ | 3.97 (1.72) | 3.45 (1.69) | 1 × 10⁻¹⁴ |
MIP | 0.786 (0.060) | 0.798 (0.055) | 6 × 10⁻⁸ |
$MIP^{4.9}$ | 0.324 (0.107) | 0.346 (0.102) | 1 × 10⁻⁷ |
Date of mammograms^b | 28/02/1994 (27/06/1993, 22/09/1994) | 19/09/1994 (28/10/1993, 07/04/1995) | |
NOTE: Mean (SD) or n (%) values are presented except where stated.
^a All women included in the analysis were postmenopausal.
^b Median dates (dd/mm/yyyy) are presented with lower and upper quartiles in parentheses.
Table 2. Effect estimates and P values (Wald tests) from fitting logistic regression models for breast cancer status in CAHRES women
Covariate^a | Estimate | SE | P |
---|---|---|---|
With partial adjustment (n = 2,677) | | | |
Age | −0.015 | 0.006 | 0.016 |
BMI | 0.054 | 0.011 | 2 × 10⁻⁶ |
PD | 0.194 | 0.026 | 1 × 10⁻¹³ |
MIP | −1.932 | 0.387 | 6 × 10⁻⁷ |
With full adjustment (n = 2,328) | | | |
Age | −0.010 | 0.007 | 0.155 |
BMI | 0.057 | 0.012 | 4 × 10⁻⁶ |
HRT use | | | |
Never (reference) | | | |
Past | 0.753 | 0.204 | 2 × 10⁻⁴ |
Current | 0.279 | 0.108 | 0.010 |
Parity and age at first birth | | | |
Nulliparous (reference) | | | |
Parity ≤ 2 and age at first birth ≤ 25 | −0.186 | 0.149 | 2 × 10⁻⁵ |
Parity ≤ 2 and age at first birth > 25 | −0.104 | 0.148 | 0.045 |
Parity > 2 and age at first birth ≤ 25 | −0.665 | 0.155 | 2 × 10⁻⁴ |
Parity > 2 and age at first birth > 25 | −0.403 | 0.201 | 0.010 |
PD | 0.158 | 0.029 | 3 × 10⁻⁸ |
MIP | −1.974 | 0.417 | 2 × 10⁻⁶ |
^a PD square-root–transformed, MIP transformed to the power 4.9.
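The Wald P values in the table can be approximated from the tabulated estimates and standard errors. The sketch below assumes the usual large-sample form of the test, in which estimate/SE is compared with a standard normal distribution; the function name is hypothetical, and small discrepancies from the published P values are expected because the table entries are rounded.

```python
import math


def wald_p_value(estimate, se):
    """Two-sided P value for a Wald z test (normal approximation assumed)."""
    z = estimate / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


# MIP and PD rows of the partially adjusted model
p_mip = wald_p_value(-1.932, 0.387)
p_pd = wald_p_value(0.194, 0.026)
```

With these inputs the MIP row yields a P value close to the tabulated 6 × 10⁻⁷, and the PD row yields a value on the order of 10⁻¹³.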
Using the same data set, we also studied the relationships between MIP, mammographic density, age, and BMI. MIP was observed to be negatively associated with PD (r = −0.14, P = 2 × 10⁻⁵). This association remained strongly significant after adjusting for age and BMI (P = 1 × 10⁻⁶). We subsequently separated PD into its 2 constituent components, dense area and breast area, and fitted regression models using MIP as an outcome variable (Table 3). From fitting a model with dense area and breast area as covariates, we found that MIP is strongly inversely associated with dense area but is not associated with breast area (after adjusting for dense area). We then fitted a model with age, BMI, and dense area included as covariates and found that the coefficient for age was negative, whereas for BMI, it was positive (Table 3). We note that dense area was strongly, negatively associated with BMI (r = −0.16, P = 4 × 10⁻¹⁷).
Table 3. Effect estimates and P values (Wald tests) from fitting 2 linear regression models for MIP in CAHRES women
Covariate | Estimate | SE | P |
---|---|---|---|
Model 1 covariates | | | |
Dense area | −0.060 | 0.015 | 3 × 10⁻⁵ |
Breast area | 0.020 | 0.016 | 0.222 |
Model 2 covariates | | | |
Age | −0.001 | 3 × 10⁻⁴ | 2 × 10⁻⁴ |
BMI | 0.002 | 0.001 | 0.001 |
Dense area | −0.059 | 0.015 | 8 × 10⁻⁵ |
For the subset of 1,725 individuals for whom it was possible to retrieve the manually defined pectoral muscle segmentation (i.e., from the Cumulus software), logistic regression models were fitted with the automatic and manual MIP variables (separately) as covariates so that the performance of the 2 variables could be compared. P values from Wald tests assessing the association with case–control status were identical to 3 decimal places for the 2 measures of MIP (P = 0.003 in both cases). The 2 measures were correlated with a Pearson coefficient of r = 0.90. Taken together, these results indicate that the automatic extraction of MIP is robust.
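The agreement between the two MIP measures is summarized by the Pearson coefficient, which can be computed as sketched below. The function name and the five paired values are hypothetical and serve only to show the calculation; they are not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical automatic and manual MIP values for the same five images
automatic_mip = [0.78, 0.81, 0.74, 0.83, 0.79]
manual_mip = [0.77, 0.82, 0.75, 0.81, 0.80]
r = pearson_r(automatic_mip, manual_mip)
```

A coefficient near 1 indicates that the two segmentations rank images almost identically, although it does not by itself rule out a constant offset between the two measures.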
We next evaluated the association between MIP and rs10995190 in CAHRES women for whom genotype data were available. Key characteristics of the dataset are described in Table 4. The P value from a Wald test of association between MIP and rs10995190, adjusting for age and BMI, was 0.0220 (Table 5). The association of the genetic variant with the square root of PD was also evaluated (P value for association with rs10995190 was 0.0482). We then evaluated the association between MIP and rs10995190, adjusting for PD (P = 0.0344). After additionally adjusting for HRT use, parity, and age at first birth, the association was still significant even though the sample size was reduced because of missing values on these variables. For replication of our results, we carried out the same analyses in the cohort of breast cancer cases, Libro-1. Key characteristics of the dataset are described in Table 4. P values corresponding to those reported above for CAHRES were 0.0006, 0.0005, and 0.0081 (Table 5). In a joint analysis (i.e., using all analog images from CAHRES and Libro-1 and including study indicator as an additional covariate), a P value of 0.0002 was obtained (2 × 10⁻⁵, unadjusted for PD) when testing for association between MIP and rs10995190, adjusting for age, BMI, and PD (Table 5). According to this model, each copy of the rare allele was associated with an increased MIP (an increase of 0.011 on the transformed MIP variable). From fitting a model with PD as outcome variable, we estimated that each copy of the rare allele was associated with a decreased PD (a decrease of 0.165 on the square-root scale).
Table 4. Key characteristics of individuals included in the genetic association analyses
Characteristic | CAHRES, mean (SD) or n (%) | Libro-1, mean (SD) or n (%) |
---|---|---|
Number | 1,235 | 2,655 |
Postmenopausal | | |
No | 0 (0) | 558 (21) |
Yes | 1,235 (100) | 2,097 (79) |
HRT use | | |
Never | 828 (67) | 1,169 (51) |
Past | 94 (8) | 665 (29) |
Current | 313 (25) | 449 (20) |
Parity and age at first birth | | |
Nulliparous | 131 (11) | 421 (16) |
Parity ≤ 2 and age at first birth ≤ 25 | 352 (29) | 761 (29) |
Parity ≤ 2 and age at first birth > 25 | 368 (30) | 839 (32) |
Parity > 2 and age at first birth ≤ 25 | 283 (23) | 393 (15) |
Parity > 2 and age at first birth > 25 | 101 (8) | 210 (8) |
Age, y | 62.78 (6.28) | 64.1 (8.9) |
BMI | 25.2 (3.9) | 25.3 (4.2) |
PD | 17.1 (14.6) | 32.1 (15.5) |
$\sqrt{PD}$ | 3.75 (1.74) | 5.49 (1.39) |
MIP | 0.795 (0.057) | 0.756 (0.069) |
$MIP^{4.9}$ | 0.339 (0.106) | 0.272 (0.098) |
Table 5. Summary of SNP (rs10995190) association with proposed novel measures
Outcome^a | Adjustment | Number | Estimate | SE | P |
---|---|---|---|---|---|
CAHRES | | | | | |
MIP | b | 1,235 | 0.013 | 0.006 | 0.0220 |
PD | b | 1,235 | −0.175 | 0.088 | 0.0482 |
MIP | c | 1,235 | 0.012 | 0.005 | 0.0344 |
MIP | d | 1,111 | 0.014 | 0.006 | 0.0211 |
Libro-1 | | | | | |
MIP | b | 2,655 | 0.013 | 0.004 | 0.0006 |
PD | b | 2,655 | −0.175 | 0.050 | 0.0005 |
MIP | c | 2,655 | 0.009 | 0.004 | 0.0081 |
MIP | d | 2,058 | 0.011 | 0.004 | 0.0086 |
Combined^e | | | | | |
MIP | b | 3,890 | 0.014 | 0.003 | 2 × 10⁻⁵ |
PD | b | 3,890 | −0.165 | 0.045 | 0.0002 |
MIP | c | 3,890 | 0.011 | 0.003 | 0.0002 |
MIP | d | 3,169 | 0.013 | 0.003 | 0.0003 |
^a PD square-root–transformed and MIP transformed to the power 4.9.
^b Adjusted for age and BMI.
^c Adjusted for age, BMI, and PD.
^d Adjusted for age, BMI, menopausal status, HRT use, parity, age at first birth, and PD.
^e Combined analysis additionally adjusted for study (CAHRES/Libro-1).
Finally, we assessed the performance of our PDP measure (PD with reference to the pectoral muscle) in case–control classification (see Statistical methods). In terms of AIC values, the model with PDP (row 3, Table 6) outperformed that with only PD (row 2, Table 6). An increase of 1 SD in PDP was estimated to be associated with a 49% increase in the odds of having breast cancer. The point estimates of AUC from the model with age, BMI, and PD and from the model with age, BMI, and PDP were 0.600 and 0.618, respectively. To compare these AUCs, we used the DeLong test for 2 correlated ROC curves and obtained a P value of 0.014.
Table 6. Model fit and case–control discrimination performance based on PD estimated with the pectoral muscle as reference
Model | Log likelihood | AIC | AUC (95% CI) |
---|---|---|---|
Age + BMI | −1801.1 | 3,608.2 | 0.550 (0.528–0.572) |
Age + BMI + PD | −1772.3 | 3,552.6 | 0.600 (0.578–0.621) |
Age + BMI + PDP | −1760.5 | 3,529.0 | 0.618 (0.597–0.639) |
NOTE: PD square-root–transformed.
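The AIC column can be recovered from the log likelihoods using AIC = 2k − 2 log L, where k is the number of estimated parameters. The sketch below assumes that k counts the intercept along with one coefficient per covariate, which is an assumption that happens to reproduce the tabulated values; the function and dictionary names are hypothetical.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood


# (log likelihood, number of parameters including the intercept)
models = {
    "Age + BMI": (-1801.1, 3),
    "Age + BMI + PD": (-1772.3, 4),
    "Age + BMI + PDP": (-1760.5, 4),
}
aics = {name: aic(ll, k) for name, (ll, k) in models.items()}
best_model = min(aics, key=aics.get)
```

Because the PD and PDP models have the same number of parameters, the difference in their AIC values comes entirely from the difference in log likelihood, with the lower AIC favoring the PDP model.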
Discussion
This article proposes the use of a novel measure for MLO view mammographic images, the MIP, which can be used to enhance existing measures of mammographic density. Results presented here demonstrate that MIP is associated with breast cancer risk and with a mammographic density SNP, rs10995190 (ZNF365), after adjusting for a standard measure of area PD. By calculating AUC values, we have also shown that using a weighted sum of PD and MIP offers improvements over using PD alone for discriminating between breast cancer cases and controls. Both AUCs may be overestimated, as the risk prediction models are constructed and evaluated on the same dataset. However, as PDP is defined externally (“trained” from the genetic association analysis of Libro-1), the comparison of the 2 models is a fair one. We estimated that an increase of 1 SD in PD is associated with a 40% increase in the odds of having breast cancer, whereas an increase of 1 SD in PDP was estimated to be associated with a 49% increase in the odds of having breast cancer. In the genetic association analysis, MIP, on its own, was found to be more significantly associated with rs10995190 than was PD alone. Inclusion of the analysis of association between MIP and the genetic variant is a strength of our study because case–control association analysis of mammographic images is theoretically susceptible to bias if there are differences in the mammography machines used between cases and controls. That said, we are not aware of any important differences in mammography machines or in the digitization of images between cases and controls in CAHRES.
If the associations of MIP with breast cancer risk and the mammographic density genetic variant represent effects of unmeasured and relevant association of PD, then this means that previous studies based on PD can have underestimated the value of mammographic density in breast cancer risk prediction. The associations reported here therefore need to be validated on additional datasets. More research is also needed to find an optimal approach for combining area-based PD and MIP.
To our knowledge, the association of MIP with breast cancer risk has not been previously studied. Previous studies of mammographic density and risk/genetic association have either been based on cranio-caudal (CC) views or, in studies of MLO images, the pectoral muscle has been removed before analysis.
The across-image variability in illuminance, captured by MIP, is likely to relate to the AEC system's calibration mechanism, which detects the amount of radiation reaching a detector through a central region in the breast to provide an optimal film exposure with the midpoint of the film's linear response range fixed on the average density. The mechanism limits the exposure while producing a useful image and is dependent upon the breast composition, breast size, and sensor placement(s) (14). This means that the variation in illuminance of the images is related to variation in the amount of density throughout the thickness of the breast (volumetric density). The observed negative association between MIP and breast cancer risk is consistent with our hypothesis that images with decreased illuminance (high attenuation) indicate an excess breast cancer risk, which is not captured by measures of area-based PD. The fact that MIP is not independent of area-based PD given BMI and age, in our data, is also consistent with our explanation of why MIP is associated with breast cancer risk and a mammographic density genetic variant. The fact that MIP is associated with PD through its strong (negative) association with dense area but is not (after adjustment for dense area) associated with breast area lends even further credibility to our hypothesis that it is the thickness of the dense tissue that drives the illuminance of the images. The observed negative association between age and MIP is likely to result from the fact that the volume of dense tissue decreases with age. We postulate that BMI is positively associated with MIP because of the strong negative association between volume of dense tissue/dense area and BMI. Following this line of argument, we would expect that MIP is positively associated with compressed breast thickness (breast thickness was not recorded for the images studied here).
The results reported here could therefore be viewed as being inconsistent with those reported by Olson and colleagues (14), concerning the importance of the acquisition/machine calibration parameters for discriminating between breast cancer cases and controls independently of area-based density. We note, however, that the study of Olson and colleagues (14) was likely to be underpowered, as it included only 254 breast cancer cases.
There are other possible explanations of why MIP is associated with breast cancer risk. It is possible that the association between MIP and breast cancer risk/mammographic density SNP is, at least partly, due to MIP capturing aspects of mammographic density other than a “residual” amount of density. We believe, however, that such an effect is, at most, negligible because of the simple nature of the MIP measure. Another possibility is that MIP simply “corrects” an area-based measurement of PD, which is distorted by variation in illuminance. Given the magnitude of the observed association, we find this also unlikely; furthermore, the brightness of the image should not influence PD measurement by a trained Cumulus reader. A final possible explanation is that pectoral muscle density is a direct risk factor for breast cancer, as it is for fracture risk, another estrogen and age-related outcome (22). We consider this unconvincing, particularly given that MIP is associated with a mammographic density genetic variant.
Several researchers (4, 11, 23–29) have found textural/statistical features of film mammographic images to aid discrimination between breast cancer cases and controls. Häberle and colleagues (23) noted, however, that variability in imaging techniques and in standardization of the digitization process may have led to results that are highly specific to their study. We are currently studying the relationships between our novel measure and standard statistical/textural features of images and assessing how much of the association between textural/statistical features and breast cancer status/genotype is explained by the optical effect we report here. Results will be described in a separate article.
In summary, the results presented in this article lead us to suggest that our novel measure of mammographic density will offer improvements to breast cancer risk prediction models using area density and aid in genetic association analyses based on digitized film images.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
Authors' Contributions
Conception and design: A. Cheddad, K. Czene, P. Hall, K. Humphreys
Development of methodology: A. Cheddad, J.A. Shepherd, K. Humphreys
Acquisition of data (provided animals, acquired and managed patients, provided facilities, etc.): K. Czene, J. Li, P. Hall
Analysis and interpretation of data (e.g., statistical analysis, biostatistics, computational analysis): A. Cheddad, J. Li, K. Humphreys
Writing, review, and/or revision of the manuscript: A. Cheddad, K. Czene, J.A. Shepherd, J. Li, P. Hall, K. Humphreys
Administrative, technical, or material support (i.e., reporting or organizing data, constructing databases): P. Hall
Study supervision: P. Hall, K. Humphreys
Acknowledgments
The authors thank Mikael Eriksson for data collection and management and Louise Eriksson for Cumulus PD measurement in CAHRES mammographic images. They also thank the anonymous reviewers for helpful suggestions.
Grant Support
The authors acknowledge support from the Swedish Research Council (grant number 521-2011-3205), the Swedish Cancer Society (contract number 11 0600), and the Swedish E-Science Research Centre (K. Humphreys, A. Cheddad). The Libro-1 study is supported by the Cancer Risk Prediction Center (CRisP; www.crispcenter.org), a Linneus Centre (Contract ID 70867902) financed by the Swedish Research Council (K. Czene). J. Li is supported by the 2nd Joint Council Office (JCO) Career Development Grant (13302EG065).
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.