Background: Recent guidelines suggest that chemoprevention with tamoxifen may be appropriate for women who have a 5-year risk of breast cancer greater than 1.66% calculated using the Gail model.

Objectives: To determine whether nipple aspirate fluid (NAF) cytology combined with the Gail model provides breast cancer risk assessment that is superior to either method alone.

Methods: Prospective observational cohort of 6,904 asymptomatic women. Breast cancer cases were identified through follow-up with the women and linkage to cancer registries. We used proportional hazards modeling to recalculate the coefficients for the predictor variables used in the Gail model. NAF cytology was added to create a second model. The two models were compared using the concordance statistic (c-statistic).

Results: During 14.6 years of follow-up, 400 women were diagnosed with breast cancer. There were 940 (14%) women with hyperplasia and 109 (1.6%) women with atypical hyperplasia found in NAF. Adding NAF cytology results to the Gail model significantly improved the model fit (P < 0.0001). The c-statistic for the Gail model was 0.62, indicating only modest discriminatory accuracy. Adding NAF cytology to the model increased the c-statistic to 0.64. NAF cytology results had the largest effect on discriminatory accuracy among women in the upper third of Gail model risk. The relative incidence for the highest quintile of risk score compared with the lowest quintile was 7.2 for the Gail model and 8.0 for the model including NAF cytology.

Conclusion: NAF cytology has the potential to improve prediction models of breast cancer incidence, particularly for high-risk women.

The Gail model is a multivariable statistical model that uses age, age at menarche, age at first live birth, family history of breast cancer, and number of breast biopsies to estimate breast cancer risk among individuals without a prior history of breast cancer (1). It was modified for the Breast Cancer Prevention Trial using Surveillance, Epidemiology, and End-Results data to update the underlying incidence rates and allow for different underlying rates based on race (2). Breast cancer risk estimation is recommended by the U.S. Preventive Services Task Force for all women considering chemoprophylaxis for breast cancer (3). The Gail model has been shown to accurately estimate the proportion of women who will develop breast cancer when used in large groups (2, 4, 5). However, it performs poorly at discriminating between individual women who will and will not develop breast cancer (5). Given the close balance between the risk and benefits of tamoxifen for most women considering chemoprophylaxis, discovering new strategies to improve the identification of women at very high risk for developing breast cancer is clinically important. Adding information from biological measurements to the risk model may improve prediction of the near-term risk of breast cancer.

Nipple aspiration is a minimally invasive procedure originally developed as a form of Papanicolaou test for breast cancer. Prospective cohort studies have shown that cytology information from cells obtained from nipple aspiration predicts breast cancer incidence independent of traditional risk factors (6, 7). The objective of this study was to determine whether NAF cytology combined with Gail model risk assessment provides superior prognostic information to the Gail model alone.

Design and Study Cohort

We followed 8,338 women who participated in studies of nipple aspirate fluid (NAF) from 1972 to 1991 to determine their breast cancer status. This analysis is limited to the 6,904 women with complete follow-up who were free of breast cancer at study entry and were not diagnosed with breast cancer within 6 months of nipple aspiration. Written informed consent was obtained from each participant. The Committee on Human Research of the University of California, San Francisco approved this study.

We studied two groups of women. Women in the first group (n = 3,633) were volunteers recruited from 1972 to 1980 from three sources: the University of California, San Francisco (UCSF) outpatient clinics (35%), the Merritt Hospital (Oakland, California) site of the Breast Cancer Detection and Diagnosis Project (59%), and several small community-based screening programs (6%). Women in the second group (n = 3,271) were volunteers recruited from 1981 to 1991 at UCSF hospitals and clinics or were UCSF employees. Over the 20-year recruitment period, participants completed an evolving series of baseline questionnaires that assessed standard breast cancer risk factors such as age, family history of breast cancer, parity, ethnicity, demographic factors, reproductive and menstrual history, and history of breast diseases and procedures.

Nipple Aspiration

We used the method of Sartorius (6) to obtain breast fluid by nipple aspiration from women in the cohort. The nipple was first cleaned with a detergent. A small plastic cup attached to a 10 mL syringe was placed over the nipple. While the participant gently compressed the breast with both hands, the plunger was retracted to the 5 to 6 mL mark. If fluid did not appear at the nipple surface within 5 seconds, the plunger was withdrawn to the 10 mL mark and held for an additional 10 to 15 seconds. Up to three attempts were made on each breast. If no fluid appeared after these attempts, the participant was considered a non-yielder. Nipple aspiration was not attempted in women with retracted nipples. If fluid appeared, it was collected in capillary tubes and processed for cytology (8). Each breast fluid specimen was classified according to the most severe epithelial change observed: normal, mild hyperplasia, moderate hyperplasia, or atypical hyperplasia. For this report, mild and moderate hyperplasia were combined into a single category of hyperplasia. We classified participants according to the following categories: nipple aspiration attempted and fluid not obtained; fluid obtained but not satisfactory for cytologic diagnosis; normal cytology; epithelial hyperplasia without atypia; and epithelial atypia.

Ascertainment and Validation of Breast Cancer Cases

Prospective follow-up methods for the cohort were presented in detail elsewhere (7, 9). Breast cancer status was initially ascertained through self-reports or next-of-kin reports if the participant was deceased. We identified cases by linking to the Northern California Cancer Center, the California Cancer Registry, and death certificates from the California Department of Health Services Center for Health Statistics Death Certificate Master Files.

Statistical Analysis

Data for risk factors were categorized according to the methods used for the Gail model. All missing data were coded according to the approach of the FORTRAN program used by the National Cancer Institute Risk Disk (BCPT.FOR, May 12, 2000). Specifically, missing values were categorized as 0 for the number of first-degree relatives with breast cancer, as ≥14 for age at menarche, as <20 for age at first birth, and as 0 for the number of breast biopsies.

We used Cox proportional hazards regression to compare the distributions of time from study entry to breast cancer development. All models were adjusted for age at enrollment, ethnicity (White, Black, Asian, Latina), and year of entry into the study. We included a term for year of study entry in all models to adjust for any cohort effect due to the extended period of enrollment. Age was coded as a continuous variable. Ethnicity was coded using indicator variables with White as the reference group. The initial model included the risk factors used in the Gail model, including the interaction terms for age and number of biopsies and for age at first live birth and family history (1, 2). Proportional hazards modeling was used to recalculate the coefficients for the predictor variables used in the Gail model. NAF cytology was added to create a second model.

We calculated a risk score for each woman for both models by summing the products of each model coefficient and the woman's value for the corresponding variable, including year of entry into the study. The two models were compared using the concordance statistic (c-statistic; ref. 10) and by comparing the incidence of breast cancer by quintiles of the risk score. We also calculated the incidence of breast cancer by nipple aspirate cytology results within tertiles of the Gail model risk score. For this analysis, we used tertiles rather than quintiles and combined atypia with hyperplasia to have sufficient numbers of events in each subgroup to give meaningful results.
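The missing-data coding and the risk score described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' code: the field names, the helpers `code_missing` and `risk_score`, and the coefficient values are hypothetical. Only the default categories for missing values are taken from the text.

```python
# Default categories for missing Gail-model inputs, as described in the text.
# Field names are hypothetical labels chosen for this sketch.
MISSING_DEFAULTS = {
    "relatives": "0",       # first-degree relatives with breast cancer
    "menarche": ">=14",     # age at menarche (reference category in Table 1)
    "first_birth": "<20",   # age at first birth
    "biopsies": "0",        # number of breast biopsies
}


def code_missing(record):
    """Return a copy of record with missing (None or absent) fields set to defaults."""
    coded = dict(record)
    for field, default in MISSING_DEFAULTS.items():
        if coded.get(field) is None:
            coded[field] = default
    return coded


def risk_score(covariates, coefficients):
    """Sum of coefficient * covariate value over all model variables."""
    return sum(coefficients[name] * value for name, value in covariates.items())


# Hypothetical coefficients (log relative hazards); NOT the fitted values.
coefs = {"menarche_12_13": 0.10, "biopsy": 0.70, "age": 0.03}

# One hypothetical woman: indicator variables plus age in years.
woman = {"menarche_12_13": 1, "biopsy": 1, "age": 45}

coded = code_missing({"relatives": None, "menarche": "12-13"})
score = risk_score(woman, coefs)
```

A higher score corresponds to a higher modeled hazard; in the analysis described here, the scores are used only to rank women (for the c-statistic and for quintile or tertile grouping), not as absolute risks.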

During 14.6 years of follow-up, 400 women were diagnosed with breast cancer. Table 1 shows the baseline characteristics of the women in this cohort. The women were young, with a median age at enrollment of 43 years. The participants were predominantly Caucasian (71%), but there were substantial numbers of Blacks (11%) and Asians (11%). A substantial number of the women had hyperplasia (14%) in the NAF, but only 2% had atypia.

Table 1.

Baseline characteristics of NAF cohort

| Risk factor | Overall, N (%) | No breast cancer, N (%) | Breast cancer, N (%) |
|---|---|---|---|
| Age groups, y | | | |
| 18-34 | 1,652 (24) | 1,613 (25) | 39 (10) |
| 35-44 | 2,109 (31) | 2,003 (31) | 106 (26) |
| 45-54 | 1,815 (26) | 1,670 (25) | 145 (36) |
| 55+ | 1,328 (19) | 1,218 (19) | 110 (28) |
| Ethnicity | | | |
| White | 4,921 (71) | 4,618 (71) | 303 (76) |
| Black | 744 (11) | 706 (11) | 38 (9) |
| Asian | 762 (11) | 719 (11) | 43 (11) |
| Latina | 324 (5) | 314 (5) | 10 (3) |
| Other | 139 (2) | 133 (2) | 6 (1) |
| Missing | 14 (0.2) | 14 (0.2) | |
| First-degree relatives with breast cancer | | | |
| 0 | 5,886 (85) | 5,575 (85) | 311 (78) |
| 1 | 769 (11) | 703 (11) | 66 (16) |
| ≥2 | 53 (1) | 44 (1) | 9 (2) |
| Missing | 196 (3) | 182 (3) | 14 (4) |
| Age at menarche | | | |
| ≥14 | 1,610 (23) | 1,526 (23) | 84 (21) |
| 12-13 | 3,626 (52) | 3,414 (53) | 212 (53) |
| <12 | 1,488 (22) | 1,398 (21) | 90 (23) |
| Missing | 180 (3) | 166 (3) | 14 (3) |
| Age at first birth | | | |
| No full-term birth | 2,480 (36) | 2,359 (36) | 121 (30) |
| <20 | 550 (8) | 531 (8) | 19 (5) |
| 20-24 | 1,533 (22) | 1,437 (22) | 96 (24) |
| 25-29 | 1,169 (17) | 1,086 (17) | 83 (21) |
| ≥30 | 730 (11) | 681 (11) | 49 (12) |
| Missing | 442 (6) | 410 (6) | 32 (8) |
| Number of breast biopsies | | | |
| 0 | 4,929 (71) | 4,686 (72) | 243 (61) |
| ≥1 | 1,767 (26) | 1,623 (25) | 144 (36) |
| Missing | 208 (3) | 195 (3) | 13 (3) |
| Cytology | | | |
| No breast fluid | 2,775 (40) | 2,671 (41) | 104 (26) |
| Unsatisfactory | 453 (6) | 419 (7) | 34 (9) |
| Normal | 2,627 (38) | 2,454 (38) | 173 (43) |
| Hyperplasia | 940 (14) | 863 (13) | 77 (19) |
| Atypia | 109 (2) | 97 (1) | 12 (3) |

The coefficients for predictors included in the Gail model calculated using data from this cohort were similar to those reported in the original Gail model (Table 2). Adding NAF cytology results to the Gail model significantly improved the model fit (P < 0.0001) without any significant effect on the coefficients for variables used in the Gail model (Table 2). There was no significant interaction of NAF cytology results with age and the results were similar when limiting the analyses to 5- and 10-year follow-up. The c-statistic for the Gail model was 0.62 indicating modest discriminatory accuracy. Adding NAF cytology to the model only increased the c-statistic to 0.64 (P = 0.006). The receiver operating characteristic curves for prediction of breast cancer are shown in Fig. 1. The area under the curve (equivalent to the c-statistic) for the combined model is modestly greater than for the Gail model alone.
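For readers unfamiliar with the c-statistic, the following Python sketch shows one common way to compute it for a binary outcome: the proportion of all case/non-case pairs in which the case has the higher risk score, with ties counted as one half. This is an illustrative assumption about the calculation, not the authors' code; the function name `c_statistic` and the toy data are hypothetical, and the sketch ignores censoring and follow-up time.

```python
def c_statistic(scores, events):
    """Concordance for a binary outcome.

    Fraction of (case, non-case) pairs in which the case has the higher
    score; tied scores contribute 0.5. Equivalent to the area under the
    ROC curve.
    """
    cases = [s for s, e in zip(scores, events) if e]
    noncases = [s for s, e in zip(scores, events) if not e]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score in cases:
        for noncase_score in noncases:
            if case_score > noncase_score:
                concordant += 1.0
            elif case_score == noncase_score:
                concordant += 0.5
    return concordant / (len(cases) * len(noncases))


# Toy data (hypothetical): five women, two of whom developed breast cancer.
scores = [0.2, 0.4, 0.4, 0.7, 0.9]
events = [0, 0, 1, 0, 1]
auc = c_statistic(scores, events)  # 4.5 concordant pairs out of 6
```

A value of 0.5 corresponds to chance and 1.0 to perfect separation, which is why values of 0.62 and 0.64 are described as modest.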

Table 2.

Comparison of original Gail model risk factor relative risks for breast cancer with those calculated using the NAF cohort

| Risk factor | No. first-degree relatives | Gail, RR | Model 1, RR (95% CI) | Model 2, RR (95% CI) |
|---|---|---|---|---|
| Age at menarche | | | | |
| ≥14 | | 1.00 | (reference) | (reference) |
| 12-13 | | 1.10 | 1.11 (0.96-1.28) | 1.09 (0.95-1.27) |
| <12 | | 1.21 | 1.23 (0.92-1.64) | 1.20 (0.89-1.60) |
| Age <50 years | | | | |
| No previous biopsy | | 1.00 | 1.00 (reference) | 1.00 (reference) |
| Previous biopsy | | 1.70 | 2.04 (1.55-2.69) | 2.01 (1.52-2.64) |
| >1 previous biopsy | | 2.88 | | |
| Age ≥50 years | | | | |
| No previous biopsy | | 1.00 | 1.00 (reference) | 1.00 (reference) |
| Previous biopsy | | 1.27 | 1.71 (1.24-2.34) | 1.69 (1.23-2.32) |
| >1 previous biopsy | | 1.62 | | |
| Age at first birth | | | | |
| <20 | 0 | 1.00 | 1.00 (reference) | 1.00 (reference) |
| | 1 | 2.61 | 1.80 (1.17-2.75) | 1.77 (1.16-2.70) |
| | 2+ | 6.80 | 3.23 (1.38-7.55) | 3.14 (1.35-7.31) |
| 20-24 | 0 | 1.24 | 1.15 (1.01-1.30) | 1.14 (1.01-1.30) |
| | 1 | 2.68 | 1.85 (1.35-2.52) | 1.83 (1.34-2.48) |
| | 2+ | 5.78 | 2.97 (1.71-5.15) | 2.92 (1.69-5.05) |
| 25-29 | 0 | 1.55 | 1.32 (1.02-1.70) | 1.31 (1.01-1.68) |
| | 1 | 2.76 | 1.90 (1.37-2.61) | 1.88 (1.37-2.59) |
| | 2+ | 4.91 | 2.73 (1.64-4.52) | 2.71 (1.63-4.49) |
| 30+ | 0 | 1.93 | 1.51 (1.04-2.21) | 1.49 (1.02-2.18) |
| | 1 | 2.83 | 1.95 (1.24-3.05) | 1.94 (1.24-3.03) |
| | 2+ | 4.17 | 2.50 (1.17-5.36) | 2.52 (1.18-5.38) |
| Cytology | | | | |
| No breast fluid | | | | 1.00 (reference) |
| Unsatisfactory | | | | 1.17 (0.78-1.77) |
| Normal | | | | 1.46 (1.11-1.91) |
| Hyperplasia | | | | 2.22 (1.63-3.03) |
| Atypia | | | | 2.28 (1.24-4.22) |

NOTE: All models are additionally adjusted for age, year of entry into the cohort, and ethnicity. Gail, relative risks as reported in original Gail model. Model 1, Gail model fitted to this data set; c-statistic 0.62. Model 2, Gail model plus NAF cytology; c-statistic 0.64.

Abbreviations: RR, relative risk; 95% CI, 95% confidence interval.

Figure 1.

Receiver operating characteristic (ROC) curves for predicting breast cancer: Gail model versus Gail plus NAF cytology. The ROC curves for the Gail model alone (solid line) and for the Gail model plus the NAF cytology results (broken line). Areas under the curves are 0.62 for the Gail model alone and 0.64 for the Gail model plus NAF. Straight line: ROC curve expected by chance alone.


Figure 2 shows the incidence of breast cancer stratified by NAF cytology within tertiles of the Gail model risk score. Both variables were strongly associated with breast cancer incidence (P < 0.0001). Although the P value for the interaction between the two variables was not significant (P = 0.16), there was some variability in the relative risks for the NAF cytology results by tertile of Gail risk (Table 3). Women in the highest third of the Gail model risk score had the greatest range of breast cancer incidence by NAF cytology results and a larger increase in c-statistic with the addition of NAF results (0.57 to 0.61). The incidence for women in the third tertile was 10.3/1,000 woman-years among women with atypia or hyperplasia compared with 5.3/1,000 woman-years among women who did not yield fluid. In contrast, the breast cancer incidence for women in the lowest tertile with atypia or hyperplasia (2.2/1,000 woman-years) was only slightly higher than that for women who did not yield fluid (0.8/1,000 woman-years).

Figure 2.

Breast cancer incidence by nipple aspirate cytology within tertiles of Gail model risk.

Table 3.

Breast cancer incidence by nipple aspirate cytology within tertiles of Gail risk score

| NAF cytology | 1st tertile: events/women | 1st tertile: RR (95% CI) | 1st tertile: P | 2nd tertile: events/women | 2nd tertile: RR (95% CI) | 2nd tertile: P | 3rd tertile: events/women | 3rd tertile: RR (95% CI) | 3rd tertile: P |
|---|---|---|---|---|---|---|---|---|---|
| No breast fluid | 9/842 | 1.0 (reference) | — | 30/843 | 1.0 (reference) | — | 65/1,090 | 1.0 (reference) | — |
| Unsatisfactory | 7/140 | 2.6 (0.97-7.0) | 0.058 | 11/138 | 1.2 (0.6-2.4) | 0.653 | 16/175 | 0.9 (0.5-1.6) | 0.723 |
| Normal | 39/954 | 2.4 (1.1-4.9) | 0.022 | 70/948 | 1.3 (0.8-2.0) | 0.232 | 64/725 | 1.1 (0.9-1.9) | 0.516 |
| Hyperplasia* | 13/365 | 2.8 (1.2-6.5) | 0.019 | 35/371 | 2.1 (1.3-3.4) | 0.004 | 41/313 | 1.8 (1.2-2.7) | 0.002 |
*Including atypical hyperplasia.

Table 4 presents the average incidence of breast cancer for women stratified by quintiles of predicted risk. When the Gail model was used to predict risk, only 32% of the cases of breast cancer occurred in women in the highest quintile of risk (20% would be expected by chance alone). The relative incidence for the highest risk quintile compared with the lowest quintile was 7.2. When NAF cytology was added to the model, 33% of the cases were in the highest quintile and the relative incidence increased to 8.0.
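The share of cases falling in the top quintile can be checked directly from the case counts in Table 4. The short Python sketch below does only that arithmetic; the variable and function names are hypothetical. The reported relative incidences (7.2 and 8.0) are not recomputed here, because the rounded incidence rates in Table 4 do not reproduce them exactly.

```python
# Breast cancer cases by quintile of predicted risk (1st to 5th), from Table 4.
model1_cases = [29, 73, 84, 85, 129]   # Gail model fitted to this cohort
model2_cases = [28, 62, 77, 101, 132]  # Gail model plus NAF cytology

CHANCE_SHARE = 1 / 5  # share expected in any one quintile by chance alone


def top_quintile_share(cases_by_quintile):
    """Fraction of all cases that occurred in the highest-risk quintile."""
    return cases_by_quintile[-1] / sum(cases_by_quintile)


share1 = top_quintile_share(model1_cases)  # 129 / 400
share2 = top_quintile_share(model2_cases)  # 132 / 400

print(f"Model 1: {share1:.0%} of cases in top quintile (chance: {CHANCE_SHARE:.0%})")
print(f"Model 2: {share2:.0%} of cases in top quintile (chance: {CHANCE_SHARE:.0%})")
```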

Table 4.

Breast cancer incidence by quintile of predicted risk

| Quintile | Model 1: breast cancer, N (%) | Model 1: incidence per 1,000 woman-years | Model 2: breast cancer, N (%) | Model 2: incidence per 1,000 woman-years |
|---|---|---|---|---|
| 1st | 29 (7) | 1.2 | 28 (7) | 1.2 |
| 2nd | 73 (18) | 3.3 | 62 (16) | 2.9 |
| 3rd | 84 (21) | 4.2 | 77 (19) | 3.8 |
| 4th | 85 (21) | 4.6 | 101 (25) | 5.4 |
| 5th | 129 (32) | 7.8 | 132 (33) | 8.0 |
| RR, 5th to 1st quintile (95% CI) | 7.2 (4.5-11.1) | | 8.0 (5.3-11.9) | |

Adding NAF cytology results to the predictor variables used to calculate the Gail risk for women modestly improved the discriminatory accuracy of the model (from a c-statistic of 0.62 to 0.64). Clinically, the test information may be most useful for women at highest absolute risk by the Gail model because modest differences in relative risk are amplified. In this cohort, the incidence of breast cancer by NAF cytology ranged from 5.3 to 10.3 per 1,000 woman-years (non-yielder to hyperplasia/atypia) for women in the highest tertile of Gail risk. NAF cytology may be more informative in this population because women with multiple risk factors for breast cancer are more likely to produce NAF (11).
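The point about amplification can be made concrete with the crude incidence rates reported in the Results. The Python sketch below, with hypothetical names, contrasts the ratio and the absolute difference in incidence between women with hyperplasia or atypia and non-yielders in the highest and lowest tertiles of Gail risk. These are unadjusted figures, so the ratios differ from the model-based relative risks in Table 3.

```python
# Crude incidence per 1,000 woman-years, as reported in the Results.
incidence = {
    "highest tertile": {"no_fluid": 5.3, "hyperplasia_or_atypia": 10.3},
    "lowest tertile": {"no_fluid": 0.8, "hyperplasia_or_atypia": 2.2},
}


def contrast(rates):
    """Return (rate ratio, absolute rate difference per 1,000 woman-years)."""
    ratio = rates["hyperplasia_or_atypia"] / rates["no_fluid"]
    difference = rates["hyperplasia_or_atypia"] - rates["no_fluid"]
    return round(ratio, 2), round(difference, 1)


for tertile, rates in incidence.items():
    ratio, difference = contrast(rates)
    print(f"{tertile}: ratio {ratio}, difference {difference} per 1,000 woman-years")
```

The crude ratio is no larger in the highest tertile than in the lowest, but the absolute difference is several times greater (5.0 versus 1.4 per 1,000 woman-years), which is the quantity that matters most when weighing chemoprevention.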

We preserved the categorization used in prior studies of NAF, but in this analysis hyperplasia and atypia had similar predictive power and could be categorized together without changing the study results. This may reflect the relative paucity of patients with atypia in our sample. In the other studies using biopsy specimens, the prevalence of atypia was much higher (12-15) although the largest study (16) had a prevalence of only 3% in 9,494 surgical biopsy specimens.

The composition of the cohort limits the strength of our conclusions in several ways. First, the 20-year period over which the cohort was assembled occurred during a time of changing incidence patterns for breast cancer (17). We adjusted for this by including year of entry into each model, but ideally cohort studies enroll participants over a short period of time to minimize cohort effects. Furthermore, some of the data used by the Gail model to calculate 5-year risk of invasive breast cancer were limited in this data set. We did not have data on how many prior biopsies had been done, nor did we know whether the pathology showed atypical hyperplasia. However, there were very few missing data. Most variables needed to calculate the Gail risk had less than 3% missing data and these were coded according to the method used by the National Cancer Institute Gail Risk Calculator.

The Gail model was originally developed using logistic regression in a nested case-control design limited to 5 years of follow-up (1). Our cohort had longer follow-up and used proportional hazards modeling, but limiting the analysis to a 5-year follow-up period or using logistic regression produced similar estimates for the coefficients. By recalculating the coefficients for the Gail model risk factors, we optimized the predictive ability of the model in this data set. The fact that the c-statistic for the Gail model in this data set (0.62) was higher than that calculated for the Nurses' Health Study (ref. 5; 0.58) suggests that there was no significant bias against the Gail model in our analyses. Because our model was developed and validated using the same data set, our estimates for the c-statistic are likely to be overly optimistic.

Another potential weakness of this study is the relatively young age of the women. Only 19% of the women were over 55 and nearly one in four were younger than 35, the age cutoff used in the development of the Gail model. However, younger women are more likely to benefit from NAF examination. A risk-benefit analysis of tamoxifen use based on data from the Breast Cancer Prevention Trial (18) reported that tamoxifen was overall most beneficial in younger women because they were at much lower risk for the adverse effects of tamoxifen (stroke, venous thromboembolic disease, and uterine cancer) and had a longer life expectancy. Prior work has shown that young women with risk factors for breast cancer are more likely to produce NAF (11, 19). Thus, as has been suggested by others (20), NAF may be most useful in helping premenopausal women with elevated Gail risk decide whether to use chemoprophylaxis.

However, even our model including NAF had modest discriminatory accuracy. Rockhill et al. (21) recently evaluated the discriminatory accuracy of the most sophisticated log-incidence model developed by Graham and Colditz (22, 23) based on ideas proposed by Pike et al. (24, 25) using prospective data from the Nurses' Health Study. The complete model incorporated 18 risk factors including those of the Gail model, alcohol intake, use of hormone therapy, height, and body mass index. Even this complex and sophisticated model was only modestly accurate at identifying which women would be at highest risk of developing breast cancer (c-statistic 0.63). A common feature of all of the models proposed to date is the lack of data measuring the biological state of the women at the time of risk assessment. Proposed biomarkers such as NAF cytology, breast density, bone mineral density, and serum hormone levels may enhance the accuracy of new risk models, although most do not seem to be strong enough risk factors to have a dramatic effect on discriminatory accuracy. Novel approaches, such as proteomic analysis of serum or NAF, may be needed to achieve sufficient discriminatory accuracy to appropriately target chemopreventive therapy.

Our results support the hypothesis that atypia or hyperplasia on NAF cytology can modify the estimated risk of breast cancer obtained from the Gail model, particularly for patients with higher Gail risk. NAF cytology has the potential to improve prediction models of breast cancer incidence. However, these results must be calibrated to national incidence data and validated in an independent study population before they can be incorporated into clinical practice.

Grant support: Public Health Service, National Cancer Institute (grants P01CA13556 and R01CA47228); California Breast Cancer Research Program (grant RB-0248); NeoMatrix, Irvine, California; and NIH/NIAMS Building Interdisciplinary Research Careers in Women's Health Faculty Scholar award.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

1. Gail MH, Brinton LA, Byar DP, et al. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst 1989;81:1879–86.
2. Costantino JP, Gail MH, Pee D, et al. Validation studies for models projecting the risk of invasive and total breast cancer incidence. J Natl Cancer Inst 1999;91:1541–8.
3. Chemoprevention of breast cancer: recommendations and rationale. Ann Intern Med 2002;137:56–8.
4. Spiegelman D, Colditz GA, Hunter D, Hertzmark E. Validation of the Gail et al. model for predicting individual breast cancer risk. J Natl Cancer Inst 1994;86:600–7.
5. Rockhill B, Spiegelman D, Byrne C, Hunter DJ, Colditz GA. Validation of the Gail et al. model of breast cancer risk prediction and implications for chemoprevention. J Natl Cancer Inst 2001;93:358–66.
6. Sartorius OW, Smith HS, Morris P, Benedict D, Friesen L. Cytologic evaluation of breast fluid in the detection of breast disease. J Natl Cancer Inst 1977;59:1073–80.
7. Wrensch MR, Petrakis NL, Miike R, et al. Breast cancer risk in women with abnormal cytology in nipple aspirates of breast fluid. J Natl Cancer Inst 2001;93:1791–8.
8. King EB, Chew KL, Petrakis NL, Ernster VL. Nipple aspirate cytology for the study of breast cancer precursors. J Natl Cancer Inst 1983;71:1115–21.
9. Wrensch MR, Petrakis NL, King EB, et al. Breast cancer incidence in women with abnormal cytology in nipple aspirates of breast fluid. Am J Epidemiol 1992;135:130–41.
10. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–45.
11. Petrakis NL, Ernster VL, Sacks ST, et al. Epidemiology of breast fluid secretion: association with breast cancer risk factors and cerumen type. J Natl Cancer Inst 1981;67:277–84.
12. Krieger N, Hiatt RA. Risk of breast cancer after benign breast diseases. Variation by histologic type, degree of atypia, age at biopsy, and length of follow-up. Am J Epidemiol 1992;135:619–31.
13. McDivitt RW, Stevens JA, Lee NC, Wingo PA, Rubin GL, Gersell D. Histologic types of benign breast disease and the risk for breast cancer. The Cancer and Steroid Hormone Study Group. Cancer 1992;69:1408–14.
14. Byrne C, Connolly JL, Colditz GA, Schnitt SJ. Biopsy confirmed benign breast disease, postmenopausal use of exogenous female hormones, and breast carcinoma risk. Cancer 2000;89:2046–52.
15. Fabian CJ, Kimler BF, Zalles CM, et al. Short-term breast cancer prediction by random periareolar fine-needle aspiration cytology and the Gail risk model. J Natl Cancer Inst 2000;92:1217–27.
16. Dupont WD, Page DL, Parl FF, et al. Estrogen replacement therapy in women with a history of proliferative breast disease. Cancer 1999;85:1277–83.
17. Nasseri K. Secular trends in the incidence of female breast cancer in the United States, 1973-1998. Breast J 2004;10:129–35.
18. Gail MH, Costantino JP, Bryant J, et al. Weighing the risks and benefits of tamoxifen treatment for preventing breast cancer. J Natl Cancer Inst 1999;91:1829–46.
19. Petrakis NL, Mason L, Lee R, Sugimoto B, Pawson S, Catchpool F. Association of race, age, menopausal status, and cerumen type with breast fluid secretion in nonlactating women, as determined by nipple aspiration. J Natl Cancer Inst 1975;54:829–34.
20. Vogel VG, Costantino JP, Wickerham DL, Cronin WM. Re: tamoxifen for prevention of breast cancer: report of the National Surgical Adjuvant Breast and Bowel Project P-1 Study. J Natl Cancer Inst 2002;94:1504.
21. Rockhill B, Byrne C, Rosner B, Louie MM, Colditz G. Breast cancer risk prediction with a log-incidence model: evaluation of accuracy. J Clin Epidemiol 2003;56:856–61.
22. Colditz GA, Rosner B. Cumulative risk of breast cancer to age 70 years according to risk factor status: data from the Nurses' Health Study. Am J Epidemiol 2000;152:950–64.
23. Rosner B, Colditz GA. Nurses' Health Study: log-incidence mathematical model of breast cancer incidence. J Natl Cancer Inst 1996;88:359–64.
24. Pike MC, Krailo MD, Henderson BE, Casagrande JT, Hoel DG. ‘Hormonal’ risk factors, ‘breast tissue age’ and the age-incidence of breast cancer. Nature 1983;303:767–70.
25. Pike MC, Spicer DV, Dahmoush L, Press MF. Estrogens, progestogens, normal breast cell proliferation, and breast cancer risk. Epidemiol Rev 1993;15:17–35.