Purpose: Identification of biologically and clinically distinct breast cancer subtypes could improve prognostic assessment of primary tumors. The characteristics of “molecular” breast cancer subtypes suggest that routinely assessed histopathologic features in combination with limited biomarkers may provide an informative classification for routine use.

Experimental Design: Hierarchical cluster analysis based on components of histopathologic grade (tubule formation, nuclear pleomorphism, and mitotic score), expression of ER, cytokeratin 5/6, and HER2 amplification identified four breast cancer subgroups in a cohort of 270 cases. Cluster subgroup membership was compared with observed and Adjuvant! Online predicted 10-year survival. Survival characteristics were confirmed in an independent cohort of 300 cases assigned to cluster subgroups using a decision tree model.

Results: Four distinct breast cancer cluster subgroups (A-D) were identified that were analogous to molecular tumor types and showed a significant association with survival in both the original and validation cohorts (P < 0.001). There was a striking difference between survival for patients in cluster subgroups A and B with ER+ breast cancer (P < 0.001). Outcome for all tumor types was well estimated by Adjuvant! Online, with the exception of cluster B ER+ cancers where Adjuvant! Online was too optimistic.

Conclusions: Breast cancer subclassification based on readily accessible pathologic features could improve prognostic assessment of ER+ breast cancer.

Translational Relevance

We describe a novel prognostic subclassification of invasive breast cancer based on routinely accessible pathologic features. In particular, this scheme identified a subgroup of women with estrogen receptor–positive (ER+) cancer whose survival may be overestimated by current prognostication tools. This practical approach could be applied to alert clinicians to a high-risk category of ER+ disease.

There are many clinical and biological features of breast cancer that show a consistent association with survival. Consequently, elements of stage, grade, and a variety of biomarkers are routinely used for prognostication and treatment planning. However, these patterns of association may be strongly influenced by very distinct breast cancer phenotypes at extreme ends of the prognostic spectrum. In this regard, there is an important difference between continuous prognostic variables, such as tumor size (1, 2), that show a true correlation with survival and more restricted characteristics (e.g., HER2 amplification), which are essentially attributes of a tumor category that has a distinctive outcome (3).

A further implication of the large number of prognostic features of breast cancer is that cases with the same prognosis have several features in common (4). This is especially true of biological features to the extent that unsupervised clustering of gene expression profiles can delineate several “molecular subtypes” of breast cancer, including luminal A, luminal B, ERBB2+, and basal-like subtypes, which are associated with clinical course (5). Moreover, the potential for such a subclassification based on aggregated biological characteristics to provide novel insight is shown by the clinical correlates already identified for basal-like tumors such as an increased frequency in BRCA1 mutation carriers (6).

In contemporary oncology practice, there remains an urgent need to refine the prognostic and predictive assessment of breast cancer. This is principally due to the difficulty of identifying patients with early breast cancer who are likely to benefit from adjuvant chemotherapy. Contributing to this challenge, the time-dependent tumor characteristics of size and lymph node status are generally less developed in early breast cancer and may be less declarative of risk. In this circumstance, assessment of primary tumor biology assumes particular importance as a clinical indicator and there is great interest in the potential for molecular profiling to make a major contribution in this area.

In the context of current focus on molecular profiling of breast cancer, little emphasis has been given to the ability of histopathologic features to define and detect informative tumor subtypes. This potential is suggested by the relationship between the molecular subtypes and histopathologic grade (7). Furthermore, as grade includes a score for tubule formation, nuclear aberration, and proliferation, it encompasses many of the key characteristics of cancer cell differentiation and growth that are influential in molecular classifiers. It is therefore possible that these routinely evaluated histopathologic features of breast cancer, in combination with some key biomarkers, may resolve clinically important tumor subtypes, with the obvious advantage of accessibility in a routine clinical setting.

The aim of this study was to use hierarchical cluster analysis to identify an informative breast cancer subclassification from a combination of routinely assessed histopathologic features and a limited number of routinely measured or accessible biomarkers. Candidate biomarkers were chosen to reflect differences between the molecular subtypes of breast cancer and included estrogen receptor (ER), progesterone receptor (PR), HER2 amplification, the “luminal” marker cytokeratin (CK) 18 (8), and basal markers CK5/6 and CK14 (9). The value of this subclassification was examined by comparison with an existing breast cancer prognosis modeling tool, Adjuvant! Online, and validated in an independent breast cancer cohort.

Patients. Patients belonged to a consecutive series of 748 women with primary invasive breast cancer, treated with postoperative radiotherapy at St. Vincent's Hospital Sydney between 1984 and 1995. The final study cohort consisted of 370 cases from whom a tumor tissue block could be obtained. The study was conducted with institutional Human Research Ethics Committee approval.

The majority of patients (364 of 370, 98.4%) were treated with breast-conserving surgery. Data on systemic adjuvant therapy were available for 364 (98.4%) patients: 185 (50.8%) received no adjuvant treatment, 61 (16.8%) received chemotherapy only (cyclophosphamide, methotrexate, and 5-fluorouracil–like regimen), 115 (31.6%) received endocrine therapy only (tamoxifen), and 3 (0.8%) received both chemotherapy and endocrine therapy.

Ten-year predicted overall and breast cancer–specific survival were determined using Adjuvant! Online version 8.0 (Adjuvant! Inc.). Patient age, tumor size, number of positive lymph nodes, grade, ER status, and type of systemic adjuvant therapy were entered for each patient. Patients with unknown tumor size or nodal status (n = 39) were excluded from this analysis. The default comorbidity assumption of “minor health problems” was used. Adjuvant! Online predictions were adjusted for the type of adjuvant therapy and patients with unknown adjuvant therapy status were assumed to have had no treatment (n = 6).

Information on date of death and cancer as the cause of death was obtained from the New South Wales Central Cancer Registry, except in six cases where follow-up data were obtained from the patient's medical records. In the absence of date of death, patients in the Registry were assumed to be alive at December 31, 2005. The period of follow-up was defined as time from surgery to last date of follow-up for censored cases or death for complete observation. The median follow-up period was 157 mo (range, 4-263 mo). A total of 138 deaths were observed with 96 deaths due to cancer. For the 232 censored (surviving) patients, the median follow-up period was 173 mo (range, 58-263 mo). Cancer-specific survival considered only death due to cancer as an event, with death due to other causes being censored at the time of occurrence.

Pathology review and immunohistochemistry. A H&E-stained section was prepared from each case and reviewed by one of two experienced observers (A.M.H./N.J.H.). Histopathologic grade was determined according to the modified method of Elston and Ellis (10) with the three component scores for tubule formation, nuclear pleomorphism, and mitotic count separately recorded (range for each score, 1-3). Data on tumor size and lymph node status were obtained from original pathology reports.

Tissue microarrays were constructed with three 0.6-mm cores from each case. Immunohistochemical analysis of ER and PR was done using a Ventana XT automated immunostainer (Ventana Medical Systems). Mouse monoclonal antibodies were used for detection of ER (clone 6F11, prediluted; Ventana Medical Systems) and PR (clone 16, prediluted; Ventana Medical Systems). Manual immunohistochemical staining was done to detect CK18 (mouse monoclonal anti-human CK18 clone CY-90, 1:400; Sigma), CK14 (mouse monoclonal anti-human CK14 clone LL002, 1:20; Novocastra), and CK5/6 (mouse monoclonal clone D5/16B4, 1:100; Zymed).

There was some loss of tissue cores in the process of staining but all evaluable cores were allocated a semiquantitative score for ER, PR, CK18, CK14, and CK5/6 based on overall intensity and proportion of tumor cells stained (0, +/−, +, ++, +++). For each marker, median scores from up to three tumor cores per case were used to derive the final score category; cases with final scores of + to +++ were designated positive and 0 or +/− were designated negative.
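
The scoring rule above can be sketched as follows. This is a minimal illustration, not the authors' code: the numeric encoding of the categories, the function name, and the handling of a median that falls between two categories (possible when only two cores survive) are assumptions.

```python
from statistics import median

# Assumed ordinal encoding of the semiquantitative categories.
SCORE_LEVELS = {"0": 0, "+/-": 1, "+": 2, "++": 3, "+++": 4}
POSITIVE_THRESHOLD = SCORE_LEVELS["+"]  # "+" to "+++" are designated positive


def case_status(core_scores):
    """Derive a case-level status from up to three core scores.

    Lost cores are passed as None and ignored. Returns "positive",
    "negative", or None when no core was evaluable.
    """
    evaluable = [SCORE_LEVELS[s] for s in core_scores if s is not None]
    if not evaluable:
        return None
    # Assumption: a median between two categories is compared as-is,
    # so a half-step below "+" counts as negative.
    return "positive" if median(evaluable) >= POSITIVE_THRESHOLD else "negative"
```

For example, cores scored `+`, `++`, and `0` have a median of `+`, so the case is designated positive.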

HER2 fluorescence in situ hybridization. Fluorescence in situ hybridization (FISH) for HER2 gene amplification was done on tissue microarray sections using the PathVysion HER2 DNA Dual Probe kit (Abbott) in accordance with the manufacturer's protocol, except for the substitution of sodium isothiocyanate pretreatment with 30-min incubation in DAKO Target Retrieval Solution (DAKO Australia Pty. Ltd.) at 90°C. HER2 amplification was determined by calculating the ratio of red (HER2 gene specific) to green (chromosome 17 centromeric) signals in 20 cancer cell nuclei. Ratios of <2, 2 to 4, and >4 were classed as no amplification, low amplification, and high amplification respectively, in accordance with scoring criteria in place at the time. Low and high-level amplification were designated positive.
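
The ratio thresholds above can be written as a short helper. It is a sketch only: the function names are hypothetical, and it assumes the ratio is computed from signal totals pooled over the 20 nuclei.

```python
def her2_fish_category(her2_signals, cep17_signals):
    """Classify HER2 amplification from pooled signal counts.

    her2_signals: total red (HER2) signals counted in the scored nuclei.
    cep17_signals: total green (chromosome 17 centromere) signals.
    """
    if cep17_signals <= 0:
        raise ValueError("centromeric signal count must be positive")
    ratio = her2_signals / cep17_signals
    if ratio < 2:
        return "no amplification"
    if ratio <= 4:
        return "low amplification"
    return "high amplification"


def her2_fish_positive(her2_signals, cep17_signals):
    """Low- and high-level amplification are both designated positive."""
    return her2_fish_category(her2_signals, cep17_signals) != "no amplification"
```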

Validation cohort. Data from an independent cohort of 300 invasive breast cancers were provided from the University of Nottingham (Nottingham, United Kingdom). These were cases diagnosed between 1987 and 1992 extracted from a larger cohort of 1,944 consecutive primary breast cancer patients presenting between 1986 and 1998 that were entered into the Nottingham Tenovus Primary Breast Carcinoma Series (11). In this validation group, a modified H-score method was used to report immunohistochemical results (11). Data were recoded as positive based on H-scores ≥1 for ER and CK5/6. Cases were designated HER2+ based on a positive FISH result if available (n = 102) or else a HER2 H-score >200.
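
The recoding of the validation data can be illustrated as below. The function names and the use of `None` for a missing FISH result or H-score are assumptions made for this sketch.

```python
def marker_positive(h_score):
    """ER or CK5/6: recoded as positive when the H-score is at least 1."""
    return h_score >= 1


def her2_positive(fish_positive=None, h_score=None):
    """HER2 status: the FISH result takes precedence when available;
    otherwise an H-score above 200 is treated as positive."""
    if fish_positive is not None:
        return bool(fish_positive)
    if h_score is None:
        return None  # neither measurement available
    return h_score > 200
```

Under this rule a FISH-negative case stays negative even if its H-score exceeds 200.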

Data analysis. Statistical analyses were done using Statistical Package for the Social Sciences for Windows version 15 (SPSS, Inc.) and S-PLUS version 6.2 (Insightful Corp.). Two-tailed tests with a significance level of 5% were used throughout. Pearson χ2 tests, or Fisher's exact as appropriate, were used to test for associations between cluster subgroup and clinicopathologic features. Kaplan-Meier survival curves were used to illustrate the survival distributions and log-rank tests used to compare them. Hazard ratios (HR) and their 95% confidence intervals (95% CI) estimated using Cox proportional hazards models were used to quantify the degree of association between survival and possible risk factors. Multiple Cox proportional hazards models with stepwise elimination were used to identify independent predictors of survival.
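
For readers unfamiliar with the Kaplan-Meier estimator, a minimal pure-Python sketch follows. It is not the SPSS or S-PLUS procedure used in the study, and the follow-up times in the usage note are invented toy values. For cancer-specific survival, the event flag would be 1 only for death due to cancer, with deaths from other causes entered as censored (0).

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up in months; events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored cases both leave the risk set
    return curve
```

With toy follow-up times `[5, 8, 8, 12, 20]` and event flags `[1, 0, 1, 1, 0]`, the estimate steps down at months 5, 8, and 12.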

For hierarchical cluster analysis, raw score data from the components of histopathologic grade (nuclear pleomorphism, tubule formation, and mitotic count; scores 1, 2, and 3), ER, CK5/6 (scores 0, +, ++, and +++), and HER2 FISH [scores 0 (no amplification), 1 (low-level amplification), and 2 (high-level amplification)] were ranked in ascending order. For tied values, the mean rank was used. All ranked data were then normalized and standardized to z-scores before hierarchical cluster analysis. Clustering was done using the squared Euclidean distance metric. The average linkage (between groups) algorithm was used. Only cases with no missing values were used in this analysis (n = 270).
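
The preprocessing and clustering steps described above can be sketched in plain Python. This is an illustrative reimplementation, not the SPSS routine used by the authors. The function names are hypothetical, the z-score uses the sample standard deviation (an assumption), and the naive agglomeration loop is suitable only for small inputs.

```python
from statistics import mean, stdev


def mean_ranks(values):
    """Rank in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def z_scores(values):
    """Standardize to mean 0 and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s if s else 0.0 for v in values]


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def prepare(feature_columns):
    """Rank and standardize each feature, then return one row per case."""
    standardized = [z_scores(mean_ranks(col)) for col in feature_columns]
    return [list(row) for row in zip(*standardized)]


def average_linkage(rows, n_clusters):
    """Merge the pair of clusters with the smallest mean pairwise
    squared Euclidean distance until n_clusters remain."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = mean(squared_euclidean(rows[i], rows[j])
                         for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters
```

Calling `average_linkage(prepare(columns), 4)` on the six feature columns would correspond to the four-cluster cut reported in the Results, subject to the assumptions stated above.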

A regression tree model was developed using histopathologic grade features and dichotomized ER, CK5/6, and HER2 FISH scores from the original cohort to assign cases to cluster subgroups.

Patient and tumor characteristics. The cohort consisted of 370 patients with primary breast cancer. Clinicopathologic features are summarized in Table 1.

Table 1.

Patient and tumor characteristics

n (%)
Age at diagnosis, y (n = 370)  
    <36 23 (6.2) 
    36-50 117 (31.6) 
    51-65 144 (38.9) 
    66-75 66 (17.8) 
    >75 20 (5.4) 
Histologic subtype (n = 369)  
    Ductal, NOS 322 (87.3) 
    Lobular, NOS 17 (4.6) 
    Special type 30 (8.1) 
Invasive tumor size, mm (n = 359)  
    0-10 62 (17.3) 
    11-20 160 (44.6) 
    21-30 96 (26.7) 
    31-50 38 (10.6) 
    >50 3 (0.8) 
Lymph node status, no. positive nodes (n = 341)  
    0 205 (60.1) 
    1-3 88 (25.8) 
    4-9 30 (8.8) 
    >10 18 (5.3) 
Grade (n = 353)  
    1 101 (28.6) 
    2 128 (36.3) 
    3 124 (35.1) 
ER status (n = 295)  
    Positive 191 (64.7) 
    Negative 104 (35.3) 
PR status (n = 298)  
    Positive 168 (56.4) 
    Negative 130 (43.6) 
HER2 status* (n = 299)  
    Negative 256 (85.6) 
    Positive 43 (14.4) 
CK18 status (n = 302)  
    Positive 293 (97.0) 
    Negative 9 (3.0) 
CK5/6 status (n = 296)  
    Negative 258 (87.2) 
    Positive 38 (12.8) 
CK14 status (n = 283)  
    Negative 238 (84.1) 
    Positive 45 (15.9) 

Abbreviation: NOS, not otherwise specified.

*HER2 status determined by FISH.

Hierarchical cluster analysis to identify breast cancer subgroups. For each case, scores reflecting the following features were compiled: the individual components of histopathologic grade (tubule formation, nuclear pleomorphism, and mitotic score), HER2 amplification, and expression of ER, PR, CK18, CK5/6, and CK14. Various combinations of these features were then tested by hierarchical cluster analysis for their ability to distribute cases into distinct subgroups of comparable size (data not shown).

Clustering based on scores for the three components of grade, HER2, ER, and CK5/6 is illustrated in Fig. 1A. At the level of four cluster separation, this divided 270 cases with complete data into subgroups of 135, 75, 35, and 25 tumors that were designated clusters A, B, C, and D, respectively. Mean scores for the histopathologic and biomarker features of each cluster are shown in Supplementary Fig. S1 and other clinicopathologic covariates are summarized in Table 2.

Fig. 1.

A, dendrogram illustrating hierarchical cluster analysis of 270 invasive breast cancers. B to D, Kaplan-Meier cancer-specific survival analysis of histopathologic grade (P < 0.001; B), cluster subgroup (P < 0.001; C), and ER+ tumors in clusters A and B (P < 0.001; D).

Table 2.

Clinical and histopathologic features of invasive breast cancers in the four cluster subgroups

Columns: characteristic; total, n (%); cluster A, n (%); cluster B, n (%); cluster C, n (%); cluster D, n (%); P
Age at diagnosis, y (n = 270)       
    <36 19 (7.0) 7 (5.2) 5 (6.7) 3 (8.6) 4 (16.0) 0.005 
    36-50 82 (30.4) 30 (22.2) 26 (34.7) 19 (54.3) 7 (28.0)  
    51-65 109 (40.4) 64 (47.4) 27 (36.0) 8 (22.9) 10 (40.0)  
    66-75 48 (17.8) 27 (20.0) 16 (21.3) 2 (5.7) 3 (12.0)  
    >75 12 (4.4) 7 (5.2) 1 (1.3) 3 (8.6) 1 (4.0)  
Histologic subtype (n = 270)       
    Ductal, NOS 244 (90.4) 118 (87.4) 71 (94.7) 33 (94.3) 22 (88.0) 0.071 
    Lobular, NOS 10 (3.7) 10 (7.4) 0 (0.0) 0 (0.0) 0 (0.0)  
    Special type 16 (5.9) 7 (5.2) 4 (5.3) 2 (5.7) 3 (12.0)  
Invasive tumor size, mm (n = 264)       
    0-10 42 (15.9) 34 (25.8) 5 (6.8) 2 (5.7) 1 (4.2) <0.001 
    11-20 119 (45.1) 64 (48.5) 27 (37.0) 18 (51.4) 10 (41.7)  
    >20 103 (39.0) 34 (25.8) 41 (56.2) 15 (42.9) 13 (54.2)  
Lymph node status, no. positive nodes (n = 250)       
    0 144 (57.6) 73 (59.8) 37 (52.9) 21 (60.0) 13 (56.5) 0.523 
    1-3 68 (27.2) 36 (29.5) 19 (27.1) 4 (11.4) 9 (39.1)  
    >3 38 (15.2) 13 (10.7) 14 (20.0) 10 (28.6) 1 (4.3)  
Grade (n = 270)       
    1 70 (25.9) 66 (48.9) 1 (1.3) 3 (8.6) 0 (0.0) <0.001 
    2 100 (37.0) 69 (51.1) 15 (20.0) 13 (37.1) 3 (12.0)  
    3 100 (37.0) 0 (0.0) 59 (78.7) 19 (54.3) 22 (88.0)  
ER status (n = 270)       
    Positive 174 (64.4) 114 (84.4) 42 (56.0) 18 (51.4) 0 (0.0) <0.001 
    Negative 96 (35.6) 21 (15.6) 33 (44.0) 17 (48.6) 25 (100.0)  
PR status (n = 269)       
    Positive 154 (57.2) 101 (75.4) 37 (49.3) 15 (42.9) 1 (4.0) <0.001 
    Negative 115 (42.8) 33 (24.6) 38 (50.7) 20 (57.1) 24 (96.0)  
HER2 status* (n = 270)       
    Negative 229 (84.8) 135 (100.0) 75 (100.0) 0 (0.0) 19 (76.0) <0.001 
    Positive 41 (15.2) 0 (0.0) 0 (0.0) 35 (100.0) 6 (24.0)  
CK18 status (n = 267)       
    Positive 261 (97.8) 134 (99.3) 72 (96.0) 34 (100.0) 21 (91.3) 0.047 
    Negative 6 (2.2) 1 (0.7) 3 (4.0) 0 (0.0) 2 (8.7)  
CK5/6 status (n = 270)       
    Negative 239 (88.5) 130 (96.3) 74 (98.7) 35 (100.0) 0 (0.0) <0.001 
    Positive 31 (11.5) 5 (3.7) 1 (1.3) 0 (0.0) 25 (100.0)  
CK14 status (n = 253)       
    Negative 214 (84.6) 115 (91.3) 60 (85.7) 27 (79.4) 12 (52.2) <0.001 
    Positive 39 (15.4) 11 (8.7) 10 (14.3) 7 (20.6) 11 (47.7)  
*HER2 status determined by FISH.

According to this scheme, breast cancers included in cluster A were characterized by low grade and 84.4% were ER+. Cluster B cases were relatively higher grade with 56% ER+. The principal distinguishing feature of cluster C cancers was HER2 positivity, whereas cluster D cases had the highest proportion of grade 3 cases (88%) and were distinctly CK5/6+ (Supplementary Fig. S1; Table 2).

Age at diagnosis was significantly different among the cluster subgroups (P = 0.005), with clusters C and D having a larger proportion of younger women (Table 2). Tumor size was also significantly different between the cluster subgroups (P < 0.001), with larger tumors present in clusters B, C, and D compared with cluster A. Lymph node status was not significantly associated with cluster subgroups (P = 0.52; Table 2).

Association between cluster subgroups and survival. In univariable Cox regression analyses, tumor size, lymph node status, grade, and cluster subgroup were significantly associated with cancer-specific survival (Table 3A). Multiple Cox regression analysis with stepwise elimination identified cluster subgroup and lymph node status as the only independent predictors (Table 3B).
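
As a reading aid for the hazard ratios in Table 3, the sketch below shows how a 95% CI relates to the standard error of the log hazard ratio. It assumes the reported intervals are Wald intervals, exp(log HR ± 1.96 × SE), which the source does not state explicitly. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def log_hr_se(lower, upper):
    """Standard error of log(HR) implied by a Wald 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def wald_ci(hr, se):
    """Rebuild the 95% CI from a hazard ratio and its log-scale SE."""
    half_width = Z_95 * se
    return hr * math.exp(-half_width), hr * math.exp(half_width)


# Cluster B versus cluster A, univariable analysis (Table 3A).
se_b = log_hr_se(2.099, 6.544)
ci_b = wald_ci(3.706, se_b)
```

Under this assumption the hazard ratio is the geometric mean of its confidence limits, which holds for the cluster B, C, and D rows of Table 3A to the reported precision.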

Table 3.

HRs, 95% CIs, and P values for univariable Cox regression models of cancer-specific survival (A) and independent predictors of cancer-specific survival identified by multiple Cox regression analysis (B)

Columns: characteristic; A. univariable analysis (HR, 95% CI lower, 95% CI upper, P); B. multivariable analysis (HR, 95% CI lower, 95% CI upper, P)
Age at diagnosis, y (n = 370)         
    ≤55 vs >55 0.9 0.602 1.345 0.608     
Histologic subtype (n = 369)    0.064     
    Ductal, NOS         
    Lobular, NOS 1.489 0.651 3.405 0.346     
    Special type 0.221 0.054 0.898 0.035     
Invasive tumor size, mm (n = 359)    <0.001     
    0-10         
    11-20 2.425 1.02 5.766 0.045     
    >20 4.935 2.12 11.486 <0.001     
Lymph node status, no. positive nodes (n = 341)    <0.001    <0.001 
    0         
    1-3 1.567 0.931 2.637 0.091 1.681 0.912 3.1 0.096 
    >3 4.587 2.801 7.511 <0.001 4.746 2.678 8.411 <0.001 
Grade (n = 353)    <0.001     
    1         
    2 2.899 1.425 5.898 0.003     
    3 5.372 2.729 10.576 <0.001     
ER status (n = 295)         
    Negative vs positive 0.661 0.424 1.032 0.069     
PR status (n = 298)         
    Negative vs positive 0.7 0.449 1.091 0.115     
HER2 status* (n = 299)         
    Negative vs positive 1.347 0.756 2.401 0.312     
CK18 status (n = 302)         
    Negative vs positive 1.323 0.325 5.385 0.696     
CK5/6 status (n = 296)         
    Negative vs positive 1.203 0.636 2.277 0.569     
CK14 status (n = 283)         
    Negative vs positive 1.538 0.884 2.675 0.128     
Cluster subgroup (n = 270)    <0.001    <0.001 
    A         
    B 3.706 2.099 6.544 <0.001 3.397 1.842 6.265 <0.001 
    C 3.117 1.539 6.313 0.002 3.059 1.453 6.438 0.003 
    D 3.179 1.437 7.033 0.004 3.767 1.589 8.928 0.003 
*HER2 status determined by FISH.

The relationship of grade and cluster subgroup membership with cancer-specific survival is shown in Fig. 1B and C, respectively. Whereas grade delineated good (grade 1), intermediate (grade 2), and poor (grade 3) prognostic subgroups of breast cancer, the cluster subgrouping identified two subgroups in relation to survival: one with good prognosis (cluster A) and the other with relatively poor prognosis (clusters B, C, and D).

Cluster membership identifies poor-prognosis ER+ breast cancer. Of particular note was the poor cancer-specific survival associated with cluster B tumors, which were HER2− though frequently ER+. To examine this further, the cancer-specific survival of ER+ cases in clusters A and B was compared (Fig. 1D). There was a striking difference in survival between the two groups, with ER+ cancers in cluster B having a significantly worse outcome (P < 0.001).

Observed versus predicted survival for breast cancer cluster subgroups. Ten-year observed and Adjuvant! Online predicted survival rates for 331 patients are summarized in Table 4. The predicted 10-year breast cancer–specific survival rate for all patients was 80.4%, which was not significantly different from the observed cancer-specific survival estimate of 76.2% (P > 0.05; Table 4A). Furthermore, the predicted 10-year survival was within confidence intervals of observed survival for all subgroups defined by age, tumor size, lymph node status, grade, and ER (Table 4B).

Table 4.

Comparison of observed cancer-specific survival and Adjuvant! Online predicted breast cancer–specific survival

Columns: subgroup; frequency (n); observed 10-y survival,* % (95% CI); Adjuvant! predicted 10-y survival (%)
A. All Patients 331 76.2 (71.4-81.0) 80.4 
B. Age at diagnosis (y) 331   
        <36 22 68.2 (48.4-88.0) 75.4 
        36-50 106 78.2 (70.2-86.2) 81.8 
        51-65 129 74.8 (67.0-82.6) 78.4 
        66-75 56 79.6 (68.0-91.2) 84.7 
        >75 18 72.8 (48.8-96.8) 79.1 
    Invasive tumor size (mm) 331   
        0-10 56 92.5 (85.3-99.7) 94.3 
        11-20 146 81.6 (75.0-88.2) 85.2 
        >20 129 62.8 (54.0-71.6) 68.9 
    No. positive lymph nodes 331   
        0 198 84.1 (78.7-89.5) 88.0 
        1-3 85 76.8 (67.4-86.2) 78.8 
        4-9 30 50.0 (31.8-68.2) 58.5 
        >10 18 31.3 (8.7-53.9) 40.8 
    Grade 320   
        1 90 91.0 (84.8-97.2) 93.1 
        2 121 76.3 (68.3-84.3) 81.9 
        3 109 61.4 (52.0-70.8) 68.0 
    ER status 264   
        Positive 171 78.1 (71.7-84.5) 84.2 
        Negative 93 70.1 (60.5-79.7) 70.9 
C. Cluster subgroup 245   
        A 120 87.1 (80.9-93.3) 87.6 
        B 68 61.6 (49.6-73.6) 71.2 
        C 35 67.4 (51.2-83.6) 72.1 
        D 22 63.0 (42.2-83.8) 68.6 
D. B1 (cluster B ER+) 39 54.0 (37.6-70.4) 74.7 
        B2 (cluster B ER−) 29 71.4 (54.4-88.4) 66.4 
*Observed 10-y survival and 95% CI from life tables.

P < 0.05 for the difference between observed and predicted survival (part D, cluster B ER+).

Similarly, predicted and observed survival estimates were concordant for each of the four cluster subgroups (Table 4C). However, stratification of cluster B according to ER status revealed a statistically significant difference for ER+ cases who had a predicted survival rate of 74.7% compared with 54% observed (P < 0.05; Table 4D). There were no differences between predicted and observed survival estimates in other cluster subgroups stratified by ER status (data not shown).
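
The comparison in Table 4C and D can be checked with a few lines of code. The values below are copied from Table 4. Treating "predicted value falls outside the observed 95% CI" as the criterion for discordance is an assumption made for this sketch, because the exact test behind the reported P value is not described.

```python
# (subgroup, observed 95% CI lower, observed 95% CI upper, Adjuvant! predicted)
TABLE_4_CLUSTERS = [
    ("Cluster A", 80.9, 93.3, 87.6),
    ("Cluster B", 49.6, 73.6, 71.2),
    ("Cluster C", 51.2, 83.6, 72.1),
    ("Cluster D", 42.2, 83.8, 68.6),
    ("Cluster B, ER+", 37.6, 70.4, 74.7),
    ("Cluster B, ER-", 54.4, 88.4, 66.4),
]


def outside_observed_ci(rows):
    """Return subgroups whose predicted 10-y survival lies outside the
    95% CI of the observed 10-y survival."""
    return [label for label, low, high, predicted in rows
            if not low <= predicted <= high]


discordant = outside_observed_ci(TABLE_4_CLUSTERS)
```

On these values only the cluster B ER+ subgroup is flagged, with a predicted rate of 74.7% above the upper observed limit of 70.4%.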

Cluster subgroups show a consistent association with survival in an independent cohort. A decision tree model based on the mitotic score (a component of histopathologic grade; ref. 10), HER2, and CK5/6 (Fig. 2A) correctly assigned 268 of 270 cases (99.3%) in the original cohort to the subgroups defined by hierarchical clustering. Using this decision tree, 300 breast cancers from an independent cohort were assigned to clusters A (n = 98), B (n = 107), C (n = 25), and D (n = 70). In this independent cohort, there were statistically significant differences in breast cancer–specific survival and overall survival (data not shown) according to histologic grade (P < 0.001; Fig. 2B). Survival also differed significantly according to cluster subgroup (P < 0.001; Fig. 2C), with the best survival seen for cluster A cases, similar outcomes for clusters B and D, and particularly poor survival associated with cancers in cluster C. Consistent with findings in the original cohort, there was a statistically significant difference in survival between ER+ breast cancers in clusters A and B (P < 0.001; Fig. 2D).

Fig. 2.

A, decision tree for assignment of individual breast tumors into four cluster subgroups. B to D, application of decision tree to an independent cohort of 300 breast cancers. Kaplan-Meier breast cancer–specific survival analysis of the independent cohort based on histopathologic grade (P < 0.001; B), cluster subgroup (P < 0.001; C), and ER+ tumors in clusters A and B (P < 0.001; D).

In this study, four breast cancer subgroups were identified by hierarchical cluster analysis based on components of histopathologic grade, ER, and CK5/6 expression and HER2 amplification. These cluster subgroups showed a relationship with survival that was also observed in an independent validation cohort. Of particular importance was the finding that cluster subgroup B ER+ breast cancers were associated with especially poor survival. Moreover, and unlike other subgroups examined, survival of individuals with this cancer type was significantly overestimated by Adjuvant! Online. Although this analysis must be regarded as exploratory in view of the relatively small size of the patient cohorts, it suggests that the accuracy of current breast cancer prognostication could be improved by incorporation of a biologically based cancer subclassification, and this is especially true for a subgroup of ER+ tumors.

Features of individual cluster subgroups identified in this study were strongly concordant with the four major molecular subtypes defined by gene expression profiling (5, 6, 8). Specifically, cluster A had features of the luminal A molecular subtype, being grade 1 or 2 with frequent expression of ER. Cluster B comprised cancers that were not uncommonly (56%) ER+ but generally higher grade and likely represented the luminal B subtype. Cluster C contained tumors that were uniformly HER2+ and hence analogous to the ERBB2+ molecular expression subtype, whereas cluster D lesions showed features consistent with the described basal-like molecular subtype, including expression of the basal CK marker CK5/6, failure of ER expression, and high tumor grade. Furthermore, relative survival characteristics of the four cluster subgroups were highly consistent with those reported for the molecular subtypes, with cluster A/luminal A cancers showing better survival than the other subcategories (5, 6). In this regard, performance of the histopathology-based classifier is especially impressive because it assigned every breast cancer to a cluster subgroup, in contrast to the study by Sorlie et al. (6) delineating molecular subtypes that left up to 36% of cases unclassified.

The description of molecular breast cancer subtypes has had a major effect on translational breast cancer research, and there have been several attempts to estimate this classification in clinical cohorts using a limited panel of biomarkers. For example, triple-negative (ER−PR−HER2−) breast cancer has been used as an approximation of basal-like tumors, and other combinations of ER, PR, and HER2 have been used to estimate luminal A, luminal B, and ERBB2+ subtypes (12). The limitation of this approach is that the essential multidimensional nature of global gene expression profiling (8) is not captured by a few individual markers. In this study, we used an alternative strategy: rebuilding a classifier based not only on individual biomarkers but also on histopathology. The significance of this is that, like global gene expression profiling, histopathologic features can reflect broad biological themes, such as proliferation, without reliance on a single measure. Consequently, both our cluster subgroup classification and molecular profiling studies (6, 13) categorize a proportion of ER− breast cancers as luminal and HER2-amplified cases as “basal-like” because of a pervasive similarity to other cancers in these categories. These results show the continuing advantages of traditional histopathology in the molecular era of breast cancer assessment.

In this study, cluster subgroups were defined by an integrated analysis of grade, ER, CK5/6, and HER2 and showed a range of clinicopathologic correlates. However, a simple decision tree incorporating only the mitotic score, HER2, and CK5/6 score was sufficient to accurately predict cluster subgroup membership. Importantly, the survival characteristics of individual subgroups were largely maintained when the decision tree was applied to a validation cohort of 300 breast cancers. Compared with molecular profiling, there are obvious advantages in this approach, which relies on features that are either routinely documented or, in the case of CK5/6 expression, readily accessible in a routine diagnostic setting. However, it does not avoid the principal criticism of histopathologic grading, which is the potential for interobserver variability. In this regard, it should be noted that, unlike grade that incorporates scores for three histopathologic features, the only decisive feature in the cluster subgroup subclassification is a particularly low mitotic count (score 1). Studies using a particularly rigorous approach to mitotic count scoring have reported that high levels of concordance between observers can be achieved (14, 15). Although it is unlikely that these reports are directly applicable to routine practice, the fact that our cluster subgroup subclassification was robust in an independently reviewed validation cohort indicates consistency in the mitotic score results across two studies.
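The relationship between the mitotic score and overall grade can be made concrete with a short sketch. It assumes the standard Elston and Ellis cutoffs for the summed score (3 to 5 = grade 1, 6 or 7 = grade 2, 8 or 9 = grade 3; ref. 10), which are not restated in this article, and the function name `nottingham_grade` is chosen here for illustration.

```python
from itertools import product


def nottingham_grade(tubules: int, pleomorphism: int, mitoses: int) -> int:
    """Combine three component scores (each 1 to 3) into grade 1, 2, or 3."""
    for score in (tubules, pleomorphism, mitoses):
        if score not in (1, 2, 3):
            raise ValueError("each component score must be 1, 2, or 3")
    total = tubules + pleomorphism + mitoses
    if total <= 5:
        return 1
    if total <= 7:
        return 2
    return 3


# Grades that remain possible when the mitotic score is fixed at 1.
grades_with_low_mitoses = {
    nottingham_grade(t, p, 1) for t, p in product((1, 2, 3), repeat=2)
}

if __name__ == "__main__":
    print(sorted(grades_with_low_mitoses))
```

With a mitotic score of 1, the summed score cannot exceed 7, so such a tumor can be grade 1 or 2 but never grade 3. This is consistent with the grade 1 or 2 profile described for cluster A, although the sketch says nothing about how often each grade occurs.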

As a way of identifying a good prognostic subgroup of breast cancers, the cluster subgrouping was less accurate than histopathologic grade because cluster A cancers had a slightly worse outcome than the grade 1 category in both the original and validation cohorts. However, the ability of the cluster subgrouping to specify ER+ breast cancers with unexpectedly poor prognosis offers a major advantage. In particular, it has the potential to contribute to the currently problematic identification of patients with ER+ cancer who may benefit from adjuvant chemotherapy (16). In this regard, our finding of poor-prognosis ER+ breast cancers in cluster B is consistent with several molecular profiling studies that have specifically addressed this issue. For example, analogous to the luminal B designation, Oh et al. (17) and Loi et al. (18) reported gene expression analyses that identified a subgroup of ER+ breast cancers with relatively poor prognosis that was distinguished by low ER and high proliferation expression patterns. Moreover, ER+ cancers at high risk of recurrence have been identified using a commercially available 21-gene expression assay that produces a recurrence score that is weighted for proliferation (19). High-risk cases identified using this assay have been shown to benefit from adjuvant chemotherapy (20), further highlighting the potential clinical value of identifying poor-prognosis ER+ cases.

The Adjuvant! Online predicted breast cancer–specific survival rates were concordant with observed cancer-specific survival for all patient subgroups examined, with the exception of cluster B ER+ cases, for which observed survival was almost 20% lower than predicted. Adjuvant! Online includes ER status as a crucial variable to predict the efficacy of systemic adjuvant hormonal therapy, with a minor role in estimating overall prognosis (21). In the current study, patients were not uniformly treated and Adjuvant! Online survival predictions were adjusted for the treatment received. Taking these factors into account, the overoptimistic survival prediction obtained for cluster B ER+ cases may reflect the fact that ER expression does not confer an equivalent survival advantage on all breast cancer patients but is influenced by broader biological context. Alternatively, the data may reflect reduced benefit of adjuvant tamoxifen for ER+ cancers in cluster B compared with cluster A.

In this study, we have used a novel approach to identify biologically and clinically distinct breast cancer subgroups using standard histopathology data. In particular, our analysis identified a subgroup of ER+ breast cancers with unexpectedly poor prognosis. The identification of these cases using a simple approach that can be routinely applied has potential to improve management for a proportion of individuals with breast cancer.

No potential conflicts of interest were disclosed.

Grant support: University of Sydney Cancer Research Fund (R.L. Balleine, L.R. Webster, and C.L. Clarke). R.L. Balleine is a Cancer Institute New South Wales Fellow and C.L. Clarke is a Principal Research Fellow of the National Health and Medical Research Council of Australia.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Note: Supplementary data for this article are available at Clinical Cancer Research Online (http://clincancerres.aacrjournals.org/).

We thank Michael Bilous (Institute of Clinical Pathology and Medical Research Westmead) and John Boyages, Nicholas Wilcken, Greg Heard, and Upali Jayasinghe (New South Wales Breast Cancer Institute) for helpful discussions and Deborah Packham (St. Vincents Clinical School, University of New South Wales) for constructing the tissue microarrays used in this study.

1. Michaelson JS, Silverstein M, Wyatt J, et al. Predicting the survival of patients with breast carcinoma using tumor size. Cancer 2002;95:713–23.
2. Carter CL, Allen C, Henson DE. Relation of tumor size, lymph node status, and survival in 24,740 breast cancer cases. Cancer 1989;63:181–7.
3. Slamon DJ, Clark GM, Wong SG, Levin WJ, Ullrich A, McGuire WL. Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene. Science 1987;235:177–82.
4. Massague J. Sorting out breast-cancer gene signatures. N Engl J Med 2007;356:294–7.
5. Sorlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A 2001;98:10869–74.
6. Sorlie T, Tibshirani R, Parker J, et al. Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A 2003;100:8418–23.
7. Hu Z, Fan C, Oh DS, et al. The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics 2006;7:96.
8. Perou CM, Sorlie T, Eisen MB, et al. Molecular portraits of human breast tumours. Nature 2000;406:747–52.
9. Lakhani SR, Reis-Filho JS, Fulford L, et al. Prediction of BRCA1 status in patients with breast cancer using estrogen receptor and basal phenotype. Clin Cancer Res 2005;11:5175–80.
10. Elston CW, Ellis IO. Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up. Histopathology 1991;19:403–10.
11. Abd El-Rehim DM, Ball G, Pinder SE, et al. High-throughput protein expression analysis using tissue microarray technology of a large well-characterised series identifies biologically distinct classes of breast cancer confirming recent cDNA expression analyses. Int J Cancer 2005;116:340–50.
12. Nguyen PL, Taghian AG, Katz MS, et al. Breast cancer subtype approximated by estrogen receptor, progesterone receptor, and HER-2 is associated with local and distant recurrence after breast-conserving therapy. J Clin Oncol 2008;26:2373–8.
13. Rouzier R, Perou CM, Symmans WF, et al. Breast cancer molecular subtypes respond differently to preoperative chemotherapy. Clin Cancer Res 2005;11:5678–85.
14. Medri L, Volpi A, Nanni O, et al. Prognostic relevance of mitotic activity in patients with node-negative breast cancer. Mod Pathol 2003;16:1067–75.
15. van Diest PJ, Baak JPA, Matze-Cok P, et al. Reproducibility of mitosis counting in 2,469 breast cancer specimens: results from the multicenter morphometric mammary carcinoma project. Hum Pathol 1992;23:603–7.
16. Henry NL, Hayes DF. Can biology trump anatomy? Do all node-positive patients with breast cancer need chemotherapy? J Clin Oncol 2007;25:2501–3.
17. Oh DS, Troester MA, Usary J, et al. Estrogen-regulated genes predict survival in hormone receptor-positive breast cancers. J Clin Oncol 2006;24:1656–64.
18. Loi S, Haibe-Kains B, Desmedt C, et al. Definition of clinically distinct molecular subtypes in estrogen receptor-positive breast carcinomas through genomic grade. J Clin Oncol 2007;25:1239–46.
19. Paik S, Shak S, Tang G, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004;351:2817–26.
20. Paik S, Tang G, Shak S, et al. Gene expression and benefit of chemotherapy in women with node-negative, estrogen receptor-positive breast cancer. J Clin Oncol 2006;24:3726–34.
21. Ravdin PM, Siminoff LA, Davis GJ, et al. Computer program to assist in making decisions about adjuvant therapy for women with early breast cancer. J Clin Oncol 2001;19:980–91.
